Welcome everyone to this post. This is my first in a long time and also my first since I decided to take my blog from distill to quarto.
Today, we are going to create a complete NBA squad poster using a combination of the R and Bash languages.
This kind of language mixing is made possible by the ability of the Quarto engine to run blocks of code from multiple languages using the Jupyter engine.
Since we are using Bash alongside R, an easy solution would have been to use system to execute the Bash code.
system(command, ...) invokes the OS command specified by command.
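As a minimal sketch of what system does, assuming a Unix-like shell is available (the commands below are illustrative and not part of the poster workflow):

```r
# Run a shell command from R and capture its standard output.
# intern = TRUE returns the output lines as a character vector.
out <- system("echo 'hello from the shell'", intern = TRUE)
print(out)

# Without intern, system() returns the exit status of the command
# (0 conventionally means success).
status <- system("exit 0")
print(status)
```

Capturing the exit status is a cheap way to notice that an external command failed before moving on to the next step.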
Retrieve API Data
The first part of any data viz is about data and its wrangling. In our case, we won't need to wrangle it because the data comes already cleaned from an API. Of course, we could have scraped the team data from the official NBA site or directly from the ESPN site, but this alternative would have taken time. Fortunately, ESPN offers developers various API endpoints. There are many resources on the web describing them and their uses. Here are two GitHub gists on the topic:
List of NFL API Endpoints | https://gist.github.com/nntrn/ee26cb2a0716de0947a0a4e9a157bc1c |
ESPN’s hidden API endpoints | https://gist.github.com/akeaswaran/b48b02f1c94f873c6655e7129910fc3b |
The first link is limited to NFL endpoints but can be adapted for other sports leagues.
For example, if we want to retrieve the list of all NBA athletes, the API endpoint is: http://sports.core.api.espn.com/v2/sports/basketball/leagues/nba/athletes/4277869?lang=en&region=us.
If you open the link in your favorite web browser, you will see a preview of the JSON response.
If you expand the items
element, you will have an array of all the 265 athletes.
Similarly, if you want to retrieve the links for all 30 NBA teams, there is a specific API endpoint for that: https://sports.core.api.espn.com/v2/sports/basketball/leagues/nba/teams.
An API endpoint returning more information about teams is also available: http://site.api.espn.com/apis/site/v2/sports/basketball/nba/teams.
We will use this one and dig deeper into this information in the next section.
Team Information
For team information, we will use an API endpoint that takes the team number as an argument. As there are 30 teams in the National Basketball Association, getting a team's information only requires making the request with the appropriate team id. So let's dive into our data-collecting process for the team with id 1. In the end, we will just have to wrap the whole process in a function to automate the workflow. As the result of an API request is JSON, we will use the rjson package, and particularly its fromJSON function. fromJSON takes several arguments, including:
- json_str: a JSON object to convert
- file: the name of a file to read the json_str from
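To make the difference between the two arguments concrete, here is a small sketch that parses the same JSON once from a string and once from a file. The JSON text is a made-up, heavily shortened stand-in for a real ESPN response, and it assumes the rjson package is installed.

```r
# Hypothetical, shortened JSON; the real team response has many more fields.
json_text <- '{"id": "1", "displayName": "Atlanta Hawks"}'

# json_str: parse JSON that is already held in an R string
from_string <- rjson::fromJSON(json_str = json_text)

# file: parse JSON read from a file (a temporary file here; the post passes a URL)
tmp <- tempfile(fileext = ".json")
writeLines(json_text, tmp)
from_file <- rjson::fromJSON(file = tmp)

# Both calls give the same named list
print(identical(from_string, from_file))
print(from_string$displayName)
```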
team_num <- 1
team <- rjson::fromJSON(file = glue::glue("https://sports.core.api.espn.com/v2/sports/basketball/leagues/nba/seasons/2022/teams/{team_num}"))
As you can see, the returned result is a named list of information about the Atlanta Hawks
organization.
To store that information in different variables, there are several possibilities. For example, you can retrieve the element displayName of the named list with any of the following commands:
- team$displayName
- team[['displayName']]
- or using the pluck function from the purrr package: pluck(team, 'displayName').
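The three access styles can be compared on a small hand-written list. This is only a sketch: team_mock is a hypothetical stand-in for the real response, and its URL is a placeholder.

```r
library(purrr)

# Hypothetical stand-in for the list returned by fromJSON()
team_mock <- list(
  displayName = "Atlanta Hawks",
  abbreviation = "ATL",
  logos = list(
    list(href = "https://example.com/atl.png", width = 500)
  )
)

# Three equivalent ways to read a top-level element
by_dollar <- team_mock$displayName
by_bracket <- team_mock[["displayName"]]
by_pluck <- pluck(team_mock, "displayName")

# pluck() can chain names and positions to reach nested values
logo <- pluck(team_mock, "logos", 1, "href")

# When a level is missing, pluck() returns the default instead of failing
missing_value <- pluck(team_mock, "venue", "fullName", .default = NA_character_)
```

The last call is the main practical reason to prefer pluck for API responses, where optional fields are sometimes absent.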
There are likely many other ways, but these three are the ones I know. Let's retrieve the other team information.
team_name <- pluck(team, "displayName")
team_abbreviation <- pluck(team, "abbreviation") |> tolower()
logo_link <- pluck(team, "logos", 1, "href")
logo_file <- paste0(team_abbreviation, ".png")
logo_file_path <- here::here(team_folder, logo_file)
Since we have the team logo link and our ultimate goal is to create a poster for all 30 teams, let's create a directory for each of them. That directory will be the warehouse to store the team logo, the players' headshots, etc.
# Team folder
team_folder <- here::here("Graphics", "nba", team_num)
fs::dir_create(team_folder)
The next step is to download the team logo.
download.file(url = logo_link, destfile = logo_file_path)
Teams athletes information
We have the different information we want about the team. We created the directory that will hold the team logo and the players' headshots for the future poster, and we downloaded the logo. Let's now collect the team players' data. Just as the team number gives access to team information, it also gives access to the players' bio links through another API endpoint.
team_athletes <- rjson::fromJSON(file = glue::glue("https://sports.core.api.espn.com/v2/sports/basketball/leagues/nba/seasons/2022/teams/1/athletes"))
As you can see, the items element of the returned list is an array of named lists, each holding a single $ref element.
That element contains another API endpoint pointing to the player's detailed information:
- name
- date of birth
- salary
- …
- and, most importantly, his headshot link
So let's retrieve an array of all the team players' information API endpoints using the map_chr function from the purrr package.
players_urls <- map_chr(team_athletes$items, ~ pluck(., '$ref'))
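For readers who want to see the extraction without calling the API, here is a sketch on mock data. It uses base R's vapply as a swapped-in equivalent of map_chr; items_mock and its URLs are hypothetical.

```r
# Hypothetical stand-in for team_athletes$items:
# each item is a list with a single "$ref" field.
items_mock <- list(
  list(`$ref` = "https://example.com/athletes/1"),
  list(`$ref` = "https://example.com/athletes/2")
)

# "$ref" is not a syntactic R name, so it has to be accessed with [["$ref"]]
# (or with backticks). vapply() also checks that every item yields exactly
# one string, much like map_chr() does.
players_urls_mock <- vapply(
  items_mock,
  function(item) item[["$ref"]],
  character(1)
)
print(players_urls_mock)
```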
Well, we have the list of all API endpoint links for team players. Let’s continue our “individualization” process by collecting the information for a player. We will generalize the process by wrapping it in a function.
get_player_headshot <- function(player_bio_url) {
  player_profil <- rjson::fromJSON(file = player_bio_url)
  player_profil |>
    enframe() |>
    pivot_wider(
      names_from = name,
      values_from = value
    ) |>
    select(id, contains('Name'), any_of("headshot")) |>
    unnest_longer(-any_of("headshot")) |>
    unnest_wider(any_of("headshot"))
}
Now that we have seen the result for one player API endpoint, let's retrieve the data for the complete squad using another purrr function, map_dfr.
players_headshots <- map_dfr(players_urls, get_player_headshot, .id = "indice")
indice | id | firstName | lastName | fullName | displayName | shortName | href | alt |
---|---|---|---|---|---|---|---|---|
1 | 3037789 | Bogdan | Bogdanovic | Bogdan Bogdanovic | Bogdan Bogdanovic | B. Bogdanovic | https://a.espncdn.com/i/headshots/nba/players/full/3037789.png | Bogdan Bogdanovic |
2 | 3102529 | Clint | Capela | Clint Capela | Clint Capela | C. Capela | https://a.espncdn.com/i/headshots/nba/players/full/3102529.png | Clint Capela |
3 | 3908845 | John | Collins | John Collins | John Collins | J. Collins | https://a.espncdn.com/i/headshots/nba/players/full/3908845.png | John Collins |
4 | 4257 | Derrick | Favors | Derrick Favors | Derrick Favors | D. Favors | https://a.espncdn.com/i/headshots/nba/players/full/4257.png | Derrick Favors |
5 | 4065656 | Trent | Forrest | Trent Forrest | Trent Forrest | T. Forrest | https://a.espncdn.com/i/headshots/nba/players/full/4065656.png | Trent Forrest |
6 | 4432585 | AJ | Griffin | AJ Griffin | AJ Griffin | A. Griffin | https://a.espncdn.com/i/headshots/nba/players/full/4432585.png | AJ Griffin |
7 | 3922230 | Aaron | Holiday | Aaron Holiday | Aaron Holiday | A. Holiday | https://a.espncdn.com/i/headshots/nba/players/full/3922230.png | Aaron Holiday |
8 | 2284101 | Justin | Holiday | Justin Holiday | Justin Holiday | J. Holiday | https://a.espncdn.com/i/headshots/nba/players/full/2284101.png | Justin Holiday |
9 | 4065732 | De'Andre | Hunter | De'Andre Hunter | De'Andre Hunter | D. Hunter | https://a.espncdn.com/i/headshots/nba/players/full/4065732.png | De'Andre Hunter |
10 | 4701230 | Jalen | Johnson | Jalen Johnson | Jalen Johnson | J. Johnson | https://a.espncdn.com/i/headshots/nba/players/full/4701230.png | Jalen Johnson |
11 | 2579294 | Frank | Kaminsky | Frank Kaminsky | Frank Kaminsky | F. Kaminsky | https://a.espncdn.com/i/headshots/nba/players/full/2579294.png | Frank Kaminsky |
12 | 4578893 | Vit | Krejci | Vit Krejci | Vit Krejci | V. Krejci | https://a.espncdn.com/i/headshots/nba/players/full/4578893.png | Vit Krejci |
13 | 4397179 | Tyrese | Martin | Tyrese Martin | Tyrese Martin | T. Martin | https://a.espncdn.com/i/headshots/nba/players/full/4397179.png | Tyrese Martin |
14 | 3907497 | Dejounte | Murray | Dejounte Murray | Dejounte Murray | D. Murray | https://a.espncdn.com/i/headshots/nba/players/full/3907497.png | Dejounte Murray |
15 | 4431680 | Onyeka | Okongwu | Onyeka Okongwu | Onyeka Okongwu | O. Okongwu | https://a.espncdn.com/i/headshots/nba/players/full/4431680.png | Onyeka Okongwu |
16 | 4592304 | Donovan | Williams | Donovan Williams | Donovan Williams | D. Williams | NA | NA |
17 | 4277905 | Trae | Young | Trae Young | Trae Young | T. Young | https://a.espncdn.com/i/headshots/nba/players/full/4277905.png | Trae Young |
Players Headshots
As we now have a dataframe with the players' information, including their names and headshot links, we can download these headshots to the intended directory and build an important element for our poster creation: the headshot labels.
players_headshots |>
  select(fullName, href) |>
  pmap_chr(function(fullName, href) {
    if (!is.na(href)) {
      destfile <- here::here(team_folder, fs::path_file(href))
      download.file(url = href, destfile = destfile)
      glue::glue("-label \"{fullName}\" \"{destfile}\"")
    } else {
      # Players without a headshot are dropped later
      NA_character_
    }
  }) -> players_montage
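The label-building step can be tried without downloading anything. The sketch below uses base R only; players_mock and team_folder_mock are hypothetical, and shQuote is swapped in for the hand-written escaped quotes because it also protects apostrophes such as the one in De'Andre.

```r
# Hypothetical data: two rows shaped like players_headshots (fullName, href)
players_mock <- data.frame(
  fullName = c("De'Andre Hunter", "Donovan Williams"),
  href = c("https://a.espncdn.com/i/headshots/nba/players/full/4065732.png", NA),
  stringsAsFactors = FALSE
)
team_folder_mock <- file.path("Graphics", "nba", "1")

# Keep only the players that have a headshot link
with_photo <- players_mock[!is.na(players_mock$href), ]

# Build one "-label <name> <file>" fragment per remaining player;
# shQuote() protects spaces and apostrophes for the shell
destfiles <- file.path(team_folder_mock, basename(with_photo$href))
labels <- paste("-label", shQuote(with_photo$fullName), shQuote(destfiles))
print(labels)
```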
Team Poster montage
Headshots montage
Once we have the team logo, the player headshots downloaded, and the labels set, we are ready for the next and main step of this post: the team poster. Of course, as a ggplot aficionado, I could have produced the wanted result with geom_image combined with facet_wrap, managing the headshot labels by customizing the strip texts. I have done a similar thing in the past, but the montage command is a more natural and fluid process for this kind of task.
To be able to use the montage command, be sure that you have ImageMagick installed on your computer. You can find a complete guide on how to install it according to your operating system here. Discussing all the options of the montage command would take more than a blog post, and probably a manual. There are many resources available on the web. Here is a handy one I have found. The options we will need are:
- label: to set the label displayed with each image (here, the player's full name)
- font: to render text with this font
- pointsize: for the font point size
- background: for the color outside the drawn frame
- fill: for the fill color of text labels and titles
- mode: for the montage mode; Concatenate joins images together without any extra space
- tile: to set the grid layout so that all images appear in a single image
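As a rough illustration of how these options fit together, the sketch below assembles a montage call as an argument vector and prints it instead of running it. The label, file paths, and output name are hypothetical; the option values are the ones used in this post.

```r
# Hypothetical inputs; in the post these come from the previous steps
label_args <- c("-label", shQuote("Trae Young"), shQuote("Graphics/nba/1/4277905.png"))
output_file <- shQuote("Graphics/nba/1/atl_montage.png")

montage_args <- c(
  label_args,
  "-font", shQuote("GothamNarrow-Medium"),  # must be a font ImageMagick can find
  "-pointsize", "60",
  "-background", shQuote("#111111"),
  "-fill", "white",
  "-mode", "Concatenate",
  "-tile", "4x5+1+1",
  output_file
)

# Inspect the full command line before running anything
montage_cmd <- paste(c("montage", montage_args), collapse = " ")
cat(montage_cmd, "\n")

# Once the image files exist and ImageMagick is installed, the same vector
# can be passed to system2():
# system2("montage", args = montage_args)
```

Printing the command first makes quoting mistakes visible before ImageMagick reports them.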
Let’s build our montage
players_montage <- players_montage[!is.na(players_montage)] |> paste0(collapse = " ")

# Escape otherwise trouble with space
montage_file <- glue::glue("\"{here::here(team_folder,paste0(team_abbreviation,'_montage.png'))}\"")

# Montage
system(command = glue::glue('montage {players_montage} -font "GothamNarrow-Medium" -pointsize 60 -background "#111111" -fill white -mode Concatenate -tile 4x5+1+1 {montage_file}'))
# Annotation with title
command <- glue::glue('./title_montage.sh "{team_folder}" "{team_name}" {team_abbreviation}')
system(command)
Well, the result starts to look like the desired one, except for some points. We didn’t insert the team logo, and it would be pretty cool to add the team name, which we previously stored in a variable.
Team logo and name inserting
To insert the team logo and name using the possibilities ImageMagick gives us, I decided to put the whole process in a Bash script. The job can also be done using the R package magick, but as this post is about R and Bash, I chose to stick with that approach.
In our script, we need 3 characteristics for each team whose poster we want to generate. You can see them as function parameters. We have:
team_folder
team_name
team_abbreviation
In most programming languages, the parameters of a function call are separated by commas. In a Bash script invocation, the arguments are separated by spaces.
Example :
./my_script.sh arg1 arg2 arg3
In the script, you can access arg1 with $1, arg2 with $2, and so on. In my final montage script, I basically use the convert command. The first step is to expand the initial plot, because I want extra space to display the team name.
convert -background '#111111' "$1"/$3_montage.png -gravity southeast -splice 20x20 -gravity northwest -splice 150x150 -fill white -pointsize 100 -font 'GothamNarrow-Bold' -annotate +300+20 "$2" "$1"/$3_montage.png
The second step is to resize the team logo to have a dimension that satisfies my desire.
Again, feel free to adapt the resizing percentage to your liking.
convert -resize 25% "$1"/$3.png "$1"/$3_cropped.png
Once the team logo has been resized to our liking, we place it on the headshots montage, at an offset from the top-left corner.
convert "$1"/$3_montage.png "$1"/$3_cropped.png -gravity northwest -geometry +150+10 -composite "$1"/$3_montage.png
The complete Bash script is as follows:
title_montage.sh
#!/bin/bash
#$1 : team_folder
#$2 : team_name
#$3 : team abbreviation used for the logo
# Expand the initial plot with gravity and splice
convert -background '#111111' "$1"/$3_montage.png -gravity southeast -splice 20x20 -gravity northwest -splice 150x150 -fill white -pointsize 100 -font 'GothamNarrow-Bold' -annotate +300+20 "$2" "$1"/$3_montage.png
# Crop franchise logo
convert -resize 25% "$1"/$3.png "$1"/$3_cropped.png
# Add franchise logo to main plot
convert "$1"/$3_montage.png "$1"/$3_cropped.png -gravity northwest -geometry +150+10 -composite "$1"/$3_montage.png
# Remove cropped logo
rm "$1"/$3_cropped.png
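The positional-argument convention the script relies on can be checked from R with a throwaway script. This is only a sketch: it assumes bash is on the PATH, and the temporary script is hypothetical, standing in for title_montage.sh. It also shows why a team name containing a space has to be quoted.

```r
# Write a tiny script that echoes its first three positional arguments
script <- tempfile(fileext = ".sh")
writeLines(c(
  "#!/bin/bash",
  'echo "folder=$1"',
  'echo "name=$2"',
  'echo "abbr=$3"'
), script)

args <- c("Graphics/nba/1", "Atlanta Hawks", "atl")

# Quoted arguments: "Atlanta Hawks" arrives as a single argument ($2)
out_quoted <- system2("bash", args = c(shQuote(script), shQuote(args)), stdout = TRUE)
print(out_quoted)

# Unquoted arguments: the shell splits the name on the space,
# so $2 and $3 no longer hold what we intended
out_unquoted <- system2("bash", args = c(shQuote(script), args), stdout = TRUE)
print(out_unquoted)
```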
Well, we finally arrive at the result we wanted from the beginning. But remember, our goal was to be able to automate the entire poster generation process for all teams.
Parallelize overall teams posters
The first step is to wrap all the processes in a big function that has the team_id
as parameter.
Process Function
library(tidyverse)

# For one player
get_player_headshot <- function(player_bio_url) {
  player_profil <- rjson::fromJSON(file = player_bio_url)
  player_profil |>
    enframe() |>
    pivot_wider(
      names_from = name,
      values_from = value
    ) |>
    select(id, contains('Name'), any_of("headshot")) |>
    unnest_longer(-any_of("headshot")) |>
    unnest_wider(any_of("headshot"))
}

generate_team_trombinoscope <- function(team_num) {
  # Team folder
  team_folder <- here::here("Graphics", "nba", team_num)
  fs::dir_create(team_folder)

  # Team Informations
  team <- rjson::fromJSON(file = glue::glue("https://sports.core.api.espn.com/v2/sports/basketball/leagues/nba/seasons/2022/teams/{team_num}"))
  team_name <- pluck(team, "displayName")
  team_abbreviation <- pluck(team, "abbreviation") |> tolower()
  logo_link <- pluck(team, "logos", 1, "href")
  logo_file <- paste0(team_abbreviation, ".png")
  logo_file_path <- here::here(team_folder, logo_file)

  # Download the team logo
  download.file(url = logo_link, destfile = logo_file_path)

  # Team athletes
  team_athletes <- rjson::fromJSON(file = glue::glue("https://sports.core.api.espn.com/v2/sports/basketball/leagues/nba/seasons/2022/teams/{team_num}/athletes"))
  players_urls <- map_chr(team_athletes$items, ~ pluck(., '$ref'))
  players_headshots <- map_dfr(players_urls, get_player_headshot, .id = "indice")

  # Players headshots: download each one and build its montage label
  players_headshots |>
    select(fullName, href) |>
    pmap_chr(function(fullName, href) {
      if (!is.na(href)) {
        destfile <- here::here(team_folder, fs::path_file(href))
        download.file(url = href, destfile = destfile)
        glue::glue("-label \"{fullName}\" \"{destfile}\"")
      } else {
        NA_character_
      }
    }) -> players_montage

  players_montage <- players_montage[!is.na(players_montage)] |> paste0(collapse = " ")

  # Escape otherwise trouble with space
  montage_file <- glue::glue("\"{here::here(team_folder,paste0(team_abbreviation,'_montage.png'))}\"")

  # Montage
  system(command = glue::glue('montage {players_montage} -font "GothamNarrow-Medium" -pointsize 60 -background "#111111" -fill white -mode Concatenate -tile 4x5+1+1 {montage_file}'))

  # Annotation with title
  command <- glue::glue('./title_montage.sh "{team_folder}" "{team_name}" {team_abbreviation}')
  system(command)
}
Once we have our super function, the process is simply to call that function with the 30 NBA team ids, again using a function from the purrr package, but this time walk, as we do not expect a returned value.
purrr::walk(1:30, generate_team_trombinoscope)
Well, you can stop there, as you have the final posters we wanted. But this last chunk of code can be sped up using another amazing package, furrr.
This package is your ally for multi-processing tasks. Generally, functions in the purrr package apply a function to each element of a list or atomic vector sequentially, regardless of how many CPU cores you have. Thanks to furrr, you can remedy that by using multiple sessions. The advantage is that we speed up the task by 3-4 times.
library(furrr)
plan(multisession, workers = 4) # != plan(sequential)
furrr::future_walk(1:30, generate_team_trombinoscope)
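To get a feel for the effect without hitting the API, the sketch below runs a toy task across two background sessions. slow_task is a hypothetical stand-in for generate_team_trombinoscope, and the worker count is arbitrary; the actual gain depends on your machine and on how much of the work is network-bound.

```r
library(furrr)

# Hypothetical stand-in for the real function: it only waits a little
slow_task <- function(team_num) {
  Sys.sleep(0.5)
  team_num
}

plan(multisession, workers = 2)
timing <- system.time(
  results <- future_map_int(1:4, slow_task)
)
plan(sequential)  # release the background R sessions when done

print(results)
print(timing[["elapsed"]])
```

Switching back to plan(sequential) at the end avoids leaving idle worker sessions behind.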
Possibilities of improvements
Of course, perfection doesn't come from the human world. Just look at the Milwaukee Bucks (team_id = 15) poster, and you will understand why this post is no exception. If nothing goes wrong on your side, you are lucky: the font you chose probably left enough room for the labels. In my case, the labels of the Antetokounmpo brothers overlap. The problem can be solved by decreasing the font point size.
Also, something I didn't like is that for a montage of n headshots, when n is not a multiple of 4 (the number of columns in the 4 x 5 montage tile parameter), the last row of headshots is filled from the left. For aesthetic reasons, I think those headshots should be centered. That is a real improvement you could make to these posters.
30 teams Posters
Voilà, it is the end. I hope you enjoyed this post as much as I enjoyed writing it. Please let me know in the comments if there is anything you think I should improve on, as I plan to publish at least one article per month.
Citation
@online{issabida2022,
author = {Abdoul ISSA BIDA},
title = {NBA {Posters}},
date = {2022-09-26},
url = {https://www.abdoulblog.com},
langid = {en}
}