Dynamic variable assignment in R

Evan C. Mascitti

2020-11-17

Assigning multiple objects at once

This section demonstrates how to assign each element of a list as a new object.

Generally speaking, I think this should be discouraged as most tasks can be accomplished inside a data frame or list (or the tidyverse versions: a tibble or nested list). Keeping everything in one place ensures the data lives together through an entire operation and prevents one from making mistakes (see Jenny Bryan’s tutorial, “Thinking inside the box: you can do that inside a data frame?!”).

Why you would want to do this, sometimes

With the above disclaimer in mind, here is one case where it helps: referring to computed values in an R Markdown document using inline R expressions. Let’s say you have already created a long summary of data, for example the high and low points of the stock market indices over the past 100 years. It’s a pain to type a long expression that requires filtering or indexing into a data frame; you just want to refer to the value of one “thing” by its name. Here it is useful to have each element of the list available as a named object. For example, typing

“Black Monday was a precipitous crash due to computer-generated trades. Just a few months before, in late 1986, the S&P 500 hit a then-record high of $`r sp1986hi`.”

is a lot simpler and less prone to mistakes than:

“Black Monday was a precipitous crash due to computer-generated trades. Just a few months before, in late 1986, the S&P 500 hit a then-record high of $`r stocks_history %>% filter(year == 1986, type == "hi") %>% map_dbl(1)`.”

Minimal example

The list elements must be named (otherwise how could you assign them to objects?). Here I create a named list, where the name of each element is a lowercase letter and each element contains a vector of random numbers. All the elements have different lengths, so this data set cannot be stored as a data frame.

library(tidyverse)
set.seed(20)
q <- list(a = sample(10, 8, replace = FALSE),
          b = sample(10, 5, replace = FALSE),
          c = sample(10, 6, replace = FALSE),
          d = sample(10, 2, replace = FALSE))
set.seed(NULL)

Let’s verify the contents of the list.

str(q)
## List of 4
##  $ a: int [1:8] 6 8 2 1 9 5 10 4
##  $ b: int [1:5] 9 3 5 1 8
##  $ c: int [1:6] 8 2 7 4 1 5
##  $ d: int [1:2] 1 9

To assign each element of q as a new object, simply call the function list2env(), whose first argument is (not surprisingly) the list whose elements you will be assigning as new objects. In this case, specify the envir argument as the global environment. If you don’t want the objects to clutter your global environment, you could assign them to an alternative environment that you create on the fly with new.env(), but that’s not the point here.

q %>% 
  list2env(envir = .GlobalEnv)
## <environment: R_GlobalEnv>
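If you would rather keep the global workspace clean, the same call works with a fresh environment. Here is a minimal sketch; the toy list demo and the environment name scratch are illustrative choices, not part of the example above:

```r
# Toy list standing in for `q` (hypothetical, so this block runs on its own)
demo <- list(a = 1:3, b = c("x", "y"))

# `scratch` is a throwaway environment created with new.env()
scratch <- new.env()
list2env(demo, envir = scratch)

ls(scratch)                 # names of the objects now living in `scratch`
get("b", envir = scratch)   # retrieve one object by name
scratch$a                   # `$` also works on environments
```

Nothing is added to the global environment except demo and scratch themselves.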

We can verify that the objects have been assigned by noting their presence in the RStudio Environment pane (or by calling ls()).

If we call a couple of the new objects, their contents are printed in the console:

a
## [1]  6  8  2  1  9  5 10  4
d
## [1] 1 9

These are simple objects; they are just atomic vectors, but you could definitely do the same thing with more complicated lists.
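Returning to the R Markdown motivation, a summary table can be turned into named objects in much the same way. The sketch below is only an illustration: the data frame stocks_summary, its column names, and its values are placeholders I made up, not real index data.

```r
# Hypothetical summary table; values are placeholders, not real index data
stocks_summary <- data.frame(
  year  = c(1986, 1986, 1987, 1987),
  type  = c("hi", "lo", "hi", "lo"),
  value = c(101, 102, 103, 104)
)

# Build one name per row, e.g. "sp1986hi"
value_names <- paste0("sp", stocks_summary$year, stocks_summary$type)

# Named list with one element per summary value
value_list <- setNames(as.list(stocks_summary$value), value_names)

# Assign into a dedicated environment rather than the global workspace
inline_vals <- list2env(value_list, envir = new.env())

inline_vals$sp1986hi
```

Inline code can then refer to inline_vals$sp1986hi. Targeting the global environment instead should make the bare name sp1986hi available, as in the motivating example.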

list2env() has a much sloppier cousin called list2DF(), which converts a list to a data frame (which, of course, is itself a special type of list…). Unfortunately, in the R version used here, this function recycles vectors without any complaint and without regard to the lengths involved. So list2DF() will “work” even if the list elements are not of the same length: it simply extends each shorter element with as many “extra” entries as it needs to fill out the data frame (check ?list2DF for the behavior in your R version). For example, if we call this on our list q from above:

list2DF(q)
##    a b c d
## 1  6 9 8 1
## 2  8 3 2 9
## 3  2 5 7 1
## 4  1 1 4 9
## 5  9 8 1 1
## 6  5 9 5 9
## 7 10 3 8 1
## 8  4 5 2 9

The shorter elements are just filled out to match the length of a, which has 8 elements. 😱 😱 😱

There is no direct equivalent for this function in the tidyverse. You could try to coerce the list to a tbl_df with tibble::as_tibble(), which fails:

as_tibble(q)
## Error: Tibble columns must have compatible sizes.
## * Size 2: Column `d`.
## * Size 5: Column `b`.
## * Size 6: Column `c`.
## * Size 8: Column `a`.
## i Only values of size one are recycled.

The equivalent coercion in base R, as.data.frame(), also refuses to recycle:

as.data.frame(q)
## Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 8, 5, 6, 2

Strict limits on vector recycling are a key principle of the tidyverse. In this particular case, though, there’s not much difference between base R and tibble: both as.data.frame() and as_tibble() refuse to recycle.
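If silent recycling is a concern, one option is to check the element lengths yourself, or to reshape the list into a long format that needs no recycling at all. This is a sketch of that idea using base R’s lengths() and stack(); the toy list ragged is a made-up stand-in for q:

```r
# Toy list with elements of unequal length (hypothetical stand-in for `q`)
ragged <- list(a = 1:4, b = 1:2, c = 1:3)

# Guard: TRUE only when every element has the same length
same_length <- length(unique(lengths(ragged))) == 1
same_length

# Long format keeps every value without recycling:
# one row per value, with `ind` recording the source element
long <- stack(ragged)
nrow(long)   # equals sum(lengths(ragged))
```

A check like same_length could be placed in front of any list-to-data-frame conversion to fail early instead of recycling.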

“Gathering” an arbitrary number of objects with mget()

The base function mget() is nearly the inverse of assigning a list to an environment: it collects any objects specified by the user and compiles them into a named list. Since it takes a character vector of object names, you can leverage a call to ls() to construct a character vector of all existing objects in a local or global environment.

Here’s an example using the objects we created above: both the individual numeric vectors (a, b, c, and d) and the list object q:

whole_list <- mget(ls())

str(whole_list)
## List of 9
##  $ a       : int [1:8] 6 8 2 1 9 5 10 4
##  $ b       : int [1:5] 9 3 5 1 8
##  $ c       : int [1:6] 8 2 7 4 1 5
##  $ d       : int [1:2] 1 9
##  $ inf_col : chr "#A57251"
##  $ psu_navy: chr "#041E42"
##  $ q       :List of 4
##   ..$ a: int [1:8] 6 8 2 1 9 5 10 4
##   ..$ b: int [1:5] 9 3 5 1 8
##   ..$ c: int [1:6] 8 2 7 4 1 5
##   ..$ d: int [1:2] 1 9
##  $ ref_grey: chr "#737373"
##  $ turf_col: chr "#425B44"

The object whole_list now contains all the other objects which previously existed in the environment (here that includes a few color strings, such as psu_navy, that were already defined in my session). This makes it easy to scoop up all the existing objects in a given environment and stash them in a list while preserving their names. I use this inside functions when I want to export all defined objects, without relying on manual assignment.
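The function use case might look something like the sketch below. The helper summarise_vec() is a hypothetical example written for illustration, not code from a real project:

```r
# Hypothetical helper: computes a few intermediate values and returns
# every object defined in the function's environment as a named list
summarise_vec <- function(x) {
  doubled <- x * 2
  total <- sum(x)
  mget(ls())   # collects `doubled`, `total`, and the argument `x`
}

res <- summarise_vec(c(1, 2, 3))
names(res)
res$total
```

Because ls() lists everything in the function’s environment, the arguments come along too. Drop them afterwards if they are not wanted.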

Summary and words of caution

These are some simple and powerful functions. They can make things simpler and less error-prone by replacing copy-paste and manual input with dynamically generated variables. They should be used carefully, because it is easy to pollute an environment or cause unintentional side effects. mget() and list2env() might be best suited for local environments rather than the global workspace.
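As one illustration of local use, list2env() can unpack a list into a function’s own environment, so the new names disappear when the function returns. The function describe_range() and its lo and hi elements are made up for this sketch:

```r
# Hypothetical function: unpacks a named list into its own local environment
describe_range <- function(params) {
  # environment() is this function's own frame, so nothing leaks out
  list2env(params, envir = environment())
  paste("range:", lo, "to", hi)
}

describe_range(list(lo = 1, hi = 10))
```

The objects lo and hi exist only while describe_range() is running, so the caller’s workspace is left untouched.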