class: title-slide, center, bottom

# Plot Twist

## 10 Bake Offs, 11 Ways

### Alison Hill · RStudio

#### R-Ladies Sydney Meetup · 2019-10-04

---
name: hello
class: inverse, right, bottom

<img style="border-radius: 50%;" src="https://github.com/apreshill.png" width="150px"/>

# Find me at...

[@apreshill](http://twitter.com/apreshill)
[@apreshill](http://github.com/apreshill)
[alison.rbind.io](https://alison.rbind.io)
[alison@rstudio.com](mailto:alison@rstudio.com)

---
class: center, middle, inverse

# Inspired by:

## [Flowing Data: One Dataset Visualized 25 Ways](https://flowingdata.com/2017/01/24/one-dataset-visualized-25-ways/)

<img src="images/flowing-data-inspo.png" width="50%" style="display: block; margin: auto;" />

---
class: center, middle, inverse

## Disclaimers

--

I am a data visualization .whisper[practitioner].

--

I offer what I hope are well-reasoned opinions here, but obviously .whisper[YMMV].

--

I do not claim that all of the following are .whisper[good] nor .whisper[publication-worthy] visualizations (for those viewing these slides without narrative)

---
class:middle, inverse, center

## My messages for today

---
class:middle, inverse, center

## tidiness `\(\neq\)` .shout[godliness]

---
class:middle, inverse, center

## tidiness = .whisper[nimbleness]

---
class:middle, inverse, center

## tidy `\(\neq\)` .shout[done]

---
class:middle, inverse, center

<img src="images/tidyverse_wrangle.png" width="50%" style="display: block; margin: auto;" />

---
class: center, middle, inverse

## But also...

Don't be afraid to chop out those `ggplot2` defaults!

---
class: center, middle, inverse

## But also...

It's all in the details.

---

# Packages first

I'll use all of the following:

```r
library(tidyverse)
library(bakeoff) # data + colors!
```

---

# Data second

```r
series10 <- tribble(
  ~series, ~episode, ~viewers_7day,
  10,      1,        9.62,
  10,      2,        9.38,
  10,      3,        8.72,
  10,      4,        8.73
)

ratings_ten <- ratings %>%
  full_join(series10) %>%
  mutate(series = as.factor(series))
```

---

# Glimpse

```
Observations: 88
Variables: 8
$ series               <fct> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3…
$ episode              <dbl> 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 7, 8, 1…
$ uk_airdate           <date> 2010-08-17, 2010-08-24, 2010-08-31, 2010-0…
*$ viewers_7day         <dbl> 2.24, 3.00, 3.00, 2.60, 3.03, 2.75, 3.10, 3…
$ viewers_28day        <dbl> 7, 3, 2, 4, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1…
$ network_rank         <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ channels_rank        <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ bbc_iplayer_requests <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
```

---
class: center, middle, inverse

# 🍰

## Recipe 1: Continuous Bar Chart

---

## Recipe 1: Continuous Bar Chart

<img src="index_files/figure-html/episodebar-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 1: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geom]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/episodebar-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 1: Code for Bar Chart

```r
# create continuous episode count
plot_off1 <- ratings_ten %>%
*  mutate(ep_id = row_number()) %>%
  select(ep_id, viewers_7day, series, episode)

# create coordinates for labels
series_labels <- plot_off1 %>%
  group_by(series) %>%
  summarize(y_position = median(viewers_7day) + 1,
            x_position = mean(ep_id))

# make the plot
*ggplot(plot_off1, aes(x = ep_id, y = viewers_7day, fill = series)) +
*  geom_col(alpha = .9) +
  ggtitle("Series 8 was a Big Setback in Viewers",
          subtitle = "7-Day Viewers across All Series/Episodes") +
  geom_text(data = series_labels, aes(label = series,
                                      x = x_position,
                                      y = y_position)) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.x = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank()) +
  scale_fill_manual(values = bakeoff_cols, guide = "none") +
  # ep_id is numeric, so the x scale is continuous
  scale_x_continuous(expand = c(0, 0))
```

---
class: center, middle, inverse

# 🍰

## What is going on with Series 8?

> *"The eighth series of The Great British Bake Off began on 29 August 2017, with this being the first of The Great British Bake Off to be broadcast on Channel 4, after the production company Love Productions moved the show. It is the first series for new hosts Noel Fielding and Sandi Toksvig, and new judge Prue Leith." -- <a href="https://en.wikipedia.org/wiki/The_Great_British_Bake_Off_(series_8)">Wikipedia</a>*

---
class: center, middle, inverse

## Read:

--

## No Mary Berry, no Mel, no Sue

---
class: center, middle, inverse

# 🍰

## Recipe 2: Lollipop Plot

---

## Recipe 2: Lollipop Plot

<img src="index_files/figure-html/lolli-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 2: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[3 geoms]?
1. What .whisper[variable] is .shout[facet wrapped]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/lolli-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 2: Code for Lollipop Plot

```r
plot_off2 <- ratings_ten %>%
*  group_by(series) %>%
*  mutate(series_avg = mean(viewers_7day, na.rm = TRUE),
*         diff_avg = viewers_7day - series_avg) %>%
  filter(max(episode) == 10) %>%
  mutate(episode = as.factor(episode)) %>%
  select(episode, viewers_7day, series, diff_avg, series_avg)

*ggplot(plot_off2, aes(x = as.factor(episode), y = viewers_7day, color = diff_avg)) +
*  geom_hline(aes(yintercept = series_avg), alpha = .5) +
*  geom_point() +
*  geom_segment(aes(xend = episode, yend = series_avg)) +
*  facet_wrap(~series) +
  scale_color_viridis_c(option = "plasma", begin = 0, end = .8, guide = "none") +
  ggtitle("Great British Bake Off Finales Get the Most Viewers",
          subtitle = "Way Higher than Series Average (for Series with 10 episodes)")
```

---
class: center, middle, inverse

# 🍰

## Recipe 3: Series Line Plot

---

## Recipe 3: Series Line Plot

<img src="index_files/figure-html/serieslines-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 3: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geom]?
1. What .whisper[variable] is .shout[grouped]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/serieslines-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 3: Code for Series Line Plot

```r
line_labels <- ratings_ten %>%
  group_by(series) %>%
  filter(episode == max(episode)) %>%
  select(series, x_position = episode, y_position = viewers_7day)

*ggplot(ratings_ten, aes(x = as.factor(episode),
*                        y = viewers_7day,
*                        color = as.factor(series),
*                        group = series)) +
*  geom_line() +
  scale_color_manual(values = bakeoff_cols, guide = "none") +
  labs(color = "Series", x = "Episode") +
  geom_text(data = line_labels, aes(label = series,
                                    x = x_position + .25,
                                    y = y_position))
```

---
class: center, middle, inverse

# 🍰

## Recipe 4: Facetted Line Plot

---

## Recipe 4: Facetted Line Plot

<img src="index_files/figure-html/facetlines-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 4: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geom]?
1. What .whisper[variable] is .shout[facetted]?
1. What .whisper[variable] is .shout[grouped]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/facetlines-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 4: Code for Facetted Line Plot

```r
*ggplot(ratings_ten, aes(x = as.factor(episode),
*                        y = viewers_7day,
*                        color = fct_reorder2(series, episode, viewers_7day),
*                        group = series)) +
*  facet_wrap(~series) +
*  geom_line(lwd = 2) +
  scale_color_manual(values = bakeoff_cols, guide = "none") +
  labs(color = "Series", x = "Episode")
```

---
class: center, middle, inverse

# 🍰

## Recipe 5: First vs. Last

---

## Recipe 5: First vs. Last

<img src="index_files/figure-html/firstlastline-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 5: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geoms]?
1. What .whisper[variable] is .shout[grouped]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/firstlastline-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 5: Code for First vs. Last

```r
# some wrangling here
# assumption: complete_series is not defined on these slides; series 10 was
# still airing (4 episodes in the data), so series 1-9 are treated as complete
complete_series <- as.character(1:9)

plot_off5 <- ratings_ten %>%
  filter(series %in% complete_series) %>%
  select(series, episode, viewers_7day) %>%
*  group_by(series) %>%
*  filter(episode == 1 | episode == max(episode)) %>%
*  mutate(episode = recode(episode, `1` = "first", .default = "last")) %>%
  ungroup()

# code for plot
*ggplot(plot_off5, aes(x = series,
*                      y = viewers_7day,
*                      color = fct_reorder2(episode, series, viewers_7day),
*                      group = episode)) +
*  geom_point() +
*  geom_line() +
  scale_color_manual(values = bakeoff_cols) +
  ggtitle("Great British Bake Off Finales Get More Viewers than Premieres") +
  labs(color = "Episode")
```

---
class: center, middle, inverse

# 🍰

## What is going on with the Series 8 finale?

---
class: middle, center, inverse

## A [tweet](https://twitter.com/PrueLeith/status/925329937644564480) heard 'round the world

<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">I am so sorry to the fans of the show for my mistake this morning, I am in a different time zone and mortified by my error <a href="https://twitter.com/hashtag/GBBO?src=hash&ref_src=twsrc%5Etfw">#GBBO</a>.</p>— Prue Leith (@PrueLeith) <a href="https://twitter.com/PrueLeith/status/925329937644564480?ref_src=twsrc%5Etfw">October 31, 2017</a></blockquote>

---
class: center, middle, inverse

# 🍰

## Recipe 6: Dumbbell Plot

---

## Recipe 6: Dumbbell Plot

<img src="index_files/figure-html/dumbbell-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 6: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geoms]?
1. What .whisper[variable] is .shout[grouped]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/dumbbell-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 6: Code for Dumbbell Plot

```r
*ggplot(plot_off5, aes(x = viewers_7day,
*                      y = fct_rev(series),
*                      color = episode,
*                      group = series)) +
*  geom_line(size = .75) +
*  geom_point(size = 2.5) +
  scale_color_manual(values = bakeoff_cols) +
  labs(y = "Series", x = "Viewers (millions)", color = "Episode") +
  ggtitle("Great British Bake Off Finales Get More Viewers than Premieres")
```

---
class: center, middle, inverse

# 🍰

## Recipe 7: Slope Graph

---

## Recipe 7: Slope Graph

<img src="index_files/figure-html/slope-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 7: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geoms]?
1. What .whisper[variable] is .shout[grouped]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/slope-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 7: Code for Slope Graph

```r
slope_labels <- plot_off5 %>%
  filter(episode == "last") %>%
  select(series, x_position = episode, y_position = viewers_7day)

*ggplot(plot_off5, aes(x = episode,
*                      y = viewers_7day,
*                      color = series,
*                      group = series)) +
*  geom_point() +
*  geom_line() +
  scale_color_manual(values = bakeoff_cols, guide = "none") +
  geom_text(data = slope_labels, aes(label = series,
                                     x = x_position,
                                     y = y_position),
            nudge_x = .1) +
  theme(panel.grid = element_blank(),
        axis.line = element_line(color = "gray"))
```

---
class: center, middle, inverse

# 🍰

## Recipe 8: Finale "Bumps"

---

## Recipe 8: Finale "Bumps"

<img src="index_files/figure-html/bump-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 8: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geom]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/bump-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 8: Code for Finale "Bumps"

```r
# some more serious wrangling here
plot_off8 <- ratings_ten %>%
  filter(series %in% complete_series) %>%
  select(series, episode, viewers_7day) %>%
  group_by(series) %>%
  filter(episode == 1 | episode == max(episode)) %>%
*  mutate(episode = recode(episode, `1` = "first", .default = "last")) %>%
*  spread(episode, viewers_7day) %>%
*  mutate(finale_bump = last - first)

# plot
*ggplot(plot_off8, aes(x = fct_rev(series),
*                      y = finale_bump)) +
*  geom_col(fill = bakeoff_cols("baltic"), alpha = .7) +
*  coord_flip() +
  labs(x = "Series", y = "Difference in Viewers for Finale from Premiere (millions)") +
  ggtitle("Finale 'Bumps' were Smallest for Series 1 and 8",
          subtitle = "Finale 7-day Viewers Relative to Premiere")
```

---
class: center, middle, inverse

# 🍰

## Recipe 9: % Change Bar Chart

---

## Recipe 9: % Change Bar Chart

<img src="index_files/figure-html/changebar-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 9: Questions

.left-column[
1. Which .whisper[dataset]?
1. Which .whisper[geom]?
1. What .whisper[variable] is mapped on the .shout[x-axis]?
1. What .whisper[variable] is mapped on the .shout[y-axis]?
1. What .whisper[variable] is mapped to .shout[color]?
]

.right-column[
<img src="index_files/figure-html/changebar-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## Recipe 9: Code for % Bar

```r
# wrangling to calculate percent change
plot_off9 <- ratings_ten %>%
  filter(series %in% complete_series) %>%
  select(series, episode, viewers_7day) %>%
  group_by(series) %>%
  filter(episode == 1 | episode == max(episode)) %>%
  ungroup() %>%
  mutate(episode = recode(episode, `1` = "first", .default = "last")) %>%
*  spread(episode, viewers_7day) %>%
*  mutate(pct_change = (last - first) / first)

# plot
*ggplot(plot_off9, aes(x = fct_rev(series),
*                      y = pct_change)) +
*  geom_col(fill = bakeoff_cols("baltic"), alpha = .5) +
  geom_hline(aes(yintercept = median(pct_change, na.rm = TRUE)),
             color = bakeoff_cols("berry"), lwd = 2) +
  labs(x = "Series", y = "% Increase in Viewers, First to Last Episode") +
  ggtitle("Series 8 had a 6% Increase in Viewers from Premiere to Finale",
          subtitle = "The Lowest Across All Series (Line is the Median)") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip()
```

---
class: center, middle, inverse

# 🎂

## Recipe 10: Lollipop Plot, % Change

---

## Recipe 10: Lollipop Plot, % Change

<img src="index_files/figure-html/lollipercent-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 10: Code for % Lollipop Plot

```r
# plot
*ggplot(plot_off9, aes(x = fct_rev(series),
*                      y = pct_change)) +
*  geom_point(color = bakeoff_cols("bluesapphire"), size = 2) +
*  geom_segment(aes(xend = fct_rev(series), yend = 0),
               color = bakeoff_cols("bluesapphire")) +
  geom_text(aes(label = scales::percent(pct_change)), hjust = -.25) +
  labs(x = "Series", y = "% Change in Viewers from First to Last Episode") +
  ggtitle("Percent Increase in Viewers was the Smallest for Series 8",
          subtitle = "Finale 7-day Viewers Relative to Premiere") +
  scale_y_continuous(labels = scales::percent, limits = c(0, .85)) +
  coord_flip()
```

---
class: center, middle, inverse

---
class: center, middle, inverse

# 🎂

## Recipe 11: Scatterplot

---

## Recipe 11: Scatterplot

<img src="index_files/figure-html/scatter-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 11: Code for Scatterplot

```r
*ggplot(plot_off8, aes(x = first, y = last)) +
*  geom_point() +
*  geom_smooth(se = FALSE, color = '#EBBFDD') +
*  geom_abline(slope = 1, intercept = 0, color = "gray", alpha = .5) +
  geom_text(aes(label = series), hjust = -1) +
  labs(x = "Premiere Episode 7-day Viewers (millions)",
       y = "Finale Episode 7-day Viewers (millions)") +
  coord_equal(ratio = 1)
```

---
class: center, middle, inverse

# 🎂

## Recipe 11.1: Pop-Out Scatterplot

---

## Recipe 11.1: Pop-Out Scatterplot

<img src="index_files/figure-html/lollipop-1.png" width="50%" style="display: block; margin: auto;" />

---

## Recipe 11.1: Code for Pop-Out Scatterplot

```r
ggplot(plot_off8, aes(x = first, y = last)) +
  geom_abline(slope = 1, intercept = 0, color = "gray", alpha = .5) +
  geom_smooth(se = FALSE, color = "#11B2E8") +
  geom_point(data = filter(plot_off8, series %in% c(1:7, 9))) +
  geom_point(data = filter(plot_off8, series == 8), colour = "#CF2154") +
  geom_text(data = filter(plot_off8, series %in% c(1:7, 9)),
            aes(label = series), hjust = -1) +
  geom_text(data = filter(plot_off8, series == 8),
            aes(label = series), hjust = -1, colour = "#CF2154") +
  labs(x = "Premiere Episode 7-day Viewers (millions)",
       y = "Finale Episode 7-day Viewers (millions)")
```

---
class:inverse, middle, center