Introducing trendyy

A tidy wrapper for gtrendsR

trendyy is a package for querying Google Trends. It is built around Philippe Massicotte’s package gtrendsR, which accesses this data wonderfully.

The inspiration for this package was to provide a tidy interface to the trends data.

Getting Started

Installation

You can install trendyy from CRAN using install.packages("trendyy").

Usage

Use trendy() to search Google Trends. The only mandatory argument is search_terms, a character vector with the terms of interest. Note that Google Trends can compare at most five terms in a single query. Thus, if your search_terms vector is longer than 5, trendy() will search each term individually. This removes the direct comparative advantage that Google Trends gives you.
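If you have more than five terms and still want some direct comparisons, one option is to split the vector into groups of at most five and query each group separately. The sketch below shows only the splitting step in base R; chunk_terms() and the placeholder terms are my own illustrative names, not part of trendyy.

```r
# Hypothetical search terms; there are more than five of them.
terms <- c("term a", "term b", "term c", "term d", "term e", "term f", "term g")

# Split a vector into consecutive groups of at most `size` elements.
chunk_terms <- function(x, size = 5) {
  split(x, ceiling(seq_along(x) / size))
}

term_groups <- chunk_terms(terms)

# Number of terms in each group
lengths(term_groups)

# Each group could then be queried on its own (needs internet access):
# results <- lapply(term_groups, trendyy::trendy)
```

Keep in mind that terms are only compared with each other inside a single query, so values from different groups should not be compared directly.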

Additional arguments

  • from: The beginning date of the query in "YYYY-MM-DD" format.
  • to: The end date of the query in "YYYY-MM-DD" format.
  • ...: any additional arguments to be passed to gtrendsR::gtrends(). It can be useful to indicate the geography of interest; see gtrendsR::countries for a list of possible geographies.
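As a rough sketch of passing an extra argument through ..., the wrapper below restricts a query to the United States. geo is an argument of gtrendsR::gtrends(); the wrapper name us_trends and the date range are only illustrative, and calling the function requires an internet connection.

```r
# Hypothetical helper: query Google Trends for the given terms,
# limited to the United States over a fixed date range.
us_trends <- function(terms) {
  trendyy::trendy(
    terms,
    from = "2019-01-01",  # start date, "YYYY-MM-DD"
    to = "2019-05-25",    # end date, "YYYY-MM-DD"
    geo = "US"            # forwarded to gtrendsR::gtrends()
  )
}

# Example call (not run here because it hits the network):
# us_trends(c("Joe Biden", "Bernie Sanders"))
```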

Accessor Functions

  • get_interest(): Retrieve interest over time
  • get_interest_city(): Retrieve interest by city
  • get_interest_country(): Retrieve interest by country
  • get_interest_dma(): Retrieve interest by DMA
  • get_interest_region(): Retrieve interest by region
  • get_related_queries(): Retrieve related queries
  • get_related_topics(): Retrieve related topics
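Each accessor takes the object returned by trendy(). As a small sketch, the helper below gathers two of the geographic tables into one named list. collect_geography() is a hypothetical convenience function of mine, not something trendyy provides.

```r
# Hypothetical helper: pull the regional and city-level tables
# out of a trendy object and return them as a named list.
collect_geography <- function(trends) {
  list(
    region = trendyy::get_interest_region(trends),
    city = trendyy::get_interest_city(trends)
  )
}

# Example call (assumes `trends` was created with trendy()):
# geography <- collect_geography(trends)
# geography$region
```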

Example

My interest in this stems from the relatively pervasive use of Google Trends in political analysis, so I will compare the top five polling candidates in the 2020 Democratic Primary. As of May 22nd, they were Joe Biden, Kamala Harris, Beto O’Rourke, Bernie Sanders, and Elizabeth Warren.

First, I will create a vector of my desired search terms. Second, I will pass that vector to trendy(), specifying a query date range from the start of 2019 until today (May 25th, 2019).

library(trendyy)

candidates <- c("Joe Biden", "Kamala Harris", "Beto O'Rourke", "Bernie Sanders", "Elizabeth Warren")

# Query Google Trends from the start of 2019 through today
candidate_trends <- trendy(candidates, from = "2019-01-01", to = Sys.Date())

Now that we have a trendy object, we can print it out to get a summary of the trends.

candidate_trends
## ~Trendy results~
## 
## Search Terms: Joe Biden, Kamala Harris, Beto O'Rourke, Bernie Sanders, Elizabeth Warren
## 
## (>^.^)> ~~~~~~~~~~~~~~~~~~~~ summary ~~~~~~~~~~~~~~~~~~~~ <(^.^<)
## # A tibble: 5 x 5
##   keyword          max_hits min_hits from       to        
##   <chr>               <dbl>    <dbl> <date>     <date>    
## 1 Bernie Sanders         93        2 2019-01-01 2019-05-23
## 2 Beto O'Rourke           5        1 2019-01-01 2019-05-23
## 3 Elizabeth Warren       34        1 2019-01-01 2019-05-23
## 4 Joe Biden              84        1 2019-01-01 2019-05-23
## 5 Kamala Harris         100        1 2019-01-01 2019-05-23

To retrieve the trend data, use get_interest(). Note that the result is a tibble, so it works well with dplyr.

get_interest(candidate_trends)
## # A tibble: 715 x 7
##    date                 hits geo   time          keyword  gprop category   
##    <dttm>              <int> <chr> <chr>         <chr>    <chr> <chr>      
##  1 2019-01-01 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  2 2019-01-02 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  3 2019-01-03 00:00:00     3 world 2019-01-01 2… Joe Bid… web   All catego…
##  4 2019-01-04 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
##  5 2019-01-05 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
##  6 2019-01-06 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
##  7 2019-01-07 00:00:00     5 world 2019-01-01 2… Joe Bid… web   All catego…
##  8 2019-01-08 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
##  9 2019-01-09 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
## 10 2019-01-10 00:00:00     2 world 2019-01-01 2… Joe Bid… web   All catego…
## # … with 705 more rows
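Because the result is a tibble, the usual dplyr verbs apply. The sketch below summarises interest per keyword; to keep it runnable without a network call it uses a tiny made-up table that only mimics the date, hits, and keyword columns (the real date column is a date-time rather than a Date).

```r
library(dplyr)

# Made-up stand-in for get_interest() output; the values are not real data.
toy_interest <- tibble::tibble(
  date = as.Date("2019-01-01") + rep(0:2, times = 2),
  hits = c(3L, 5L, 4L, 10L, 20L, 30L),
  keyword = rep(c("term a", "term b"), each = 3)
)

# Average interest and the date of peak interest for each keyword
interest_summary <- toy_interest %>%
  group_by(keyword) %>%
  summarise(
    mean_hits = mean(hits),
    peak_date = date[which.max(hits)]
  )

interest_summary
```

With real data, the same pipeline should work by starting from get_interest(candidate_trends) instead of toy_interest.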

Plotting Interest

library(dplyr)
library(ggplot2)

# Plot relative search interest over time, one line per candidate
candidate_trends %>% 
  get_interest() %>% 
  ggplot(aes(date, hits, color = keyword)) +
  geom_line() +
  geom_point(alpha = .2) +
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(x = "", 
       y = "Relative Search Popularity",
       title = "Google Search Popularity")

It is also possible to view the related search queries for a given set of keywords using get_related_queries().

candidate_trends %>% 
  get_related_queries() %>% 
  group_by(keyword) %>% 
  sample_n(2)
## # A tibble: 10 x 5
## # Groups:   keyword [5]
##    subject  related_queries value                  keyword      category   
##    <chr>    <chr>           <chr>                  <chr>        <chr>      
##  1 17       top             bernie sanders running Bernie Sand… All catego…
##  2 Breakout rising          karl lagerfeld         Bernie Sand… All catego…
##  3 +60%     rising          beto orourke net worth Beto ORourke All catego…
##  4 +400%    rising          beto orourke running … Beto ORourke All catego…
##  5 Breakout rising          elizabeth warren tax … Elizabeth W… All catego…
##  6 Breakout rising          elizabeth warren live… Elizabeth W… All catego…
##  7 Breakout rising          joe biden kissing      Joe Biden    All catego…
##  8 Breakout rising          joe biden lucy flores  Joe Biden    All catego…
##  9 Breakout rising          kamala harris nephew   Kamala Harr… All catego…
## 10 Breakout rising          jussie smollett and k… Kamala Harr… All catego…
Josiah Parry
Social Data Scientist