Applying R to Lifestyle and Brain Health Research
University of Wisconsin - Madison
September 9, 2026
Functions include three components: arguments, body, and environment.
- formals() lists the arguments that control how you call the function.
- body() is the code inside the function.
- environment() determines how the function finds the values associated with names.

Unlike the formals and body, the environment is specified implicitly based on where you define the function. The function’s environment always exists.
Functions may also have attributes such as srcref, short for source reference. It points to the source code used to create the function and is used for printing because, unlike body(), it retains code comments and other formatting.
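As a minimal sketch of the three components, the hypothetical function bmi below (an illustration, not part of the course material) can be taken apart with formals(), body(), and environment().

```r
# Hypothetical example function: body mass index from weight and height.
bmi <- function(weight_kg, height_m) {
  weight_kg / height_m^2
}

arg_names <- names(formals(bmi))  # names of the arguments
fn_body   <- body(bmi)            # the code inside the function
fn_env    <- environment(bmi)     # the environment where bmi was defined

arg_names
fn_body
fn_env

# srcref is only stored when options(keep.source = TRUE), which is the
# default in interactive sessions; otherwise this returns NULL.
attr(bmi, "srcref")
```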
Primitive functions exist in the base package and call C code directly. These functions have either type builtin or special. In this case, the functions exist primarily in C so their formals, body, and environment are NULL.
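A short check of these claims, using sum and `[` as two example primitives:

```r
# sum is a builtin primitive; `[` is a special primitive.
sum_type     <- typeof(sum)
bracket_type <- typeof(`[`)

# Primitives live in C, so all three components are NULL.
sum_parts <- list(
  formals     = formals(sum),
  body        = body(sum),
  environment = environment(sum)
)

sum_type
bracket_type
str(sum_parts)
```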
R functions are objects in their own right, so there is no special syntax for defining and naming a function. You simply create a function object using function and bind it to a name with <-.
Nearly all functions are bound to a name, but sometimes an anonymous function or a list of functions is preferred.
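The sketch below shows the three cases side by side. The names add_years and funs are made up for illustration.

```r
# Create a function object and bind it to a name.
add_years <- function(age, years = 1) age + years
add_years(30)

# An anonymous function: created and used without ever being named.
squared <- sapply(1:3, function(x) x^2)
squared

# A list of functions: each element is a function object.
funs <- list(
  half   = function(x) x / 2,
  double = function(x) x * 2
)
funs$double(10)
```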
A function is usually called by placing the arguments inside parentheses next to the function name. In some instances, the arguments are contained within a data structure and it is easier to use do.call and pass a list containing the function arguments.
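For example, when the arguments already sit in a list, do.call gives the same result as the direct call:

```r
# Arguments stored in a data structure (here, a list).
args <- list(1:10, na.rm = TRUE)

direct      <- mean(1:10, na.rm = TRUE)
via_do_call <- do.call(mean, args)

direct
via_do_call
```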
Base R provides two ways to compose multiple function calls. You can either save the intermediate results as variables or nest the function calls.
The magrittr package provides a third option using the binary operator %>%, which is called a pipe and pronounced “and then”. Base R 4.1.0 added a comparable native pipe, |>; however, it lacks some of the advanced features of %>%, such as using the placeholder . in multiple locations.
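The three styles can be compared on a small vector. This sketch uses only base R and assumes R 4.1.0 or later for the native pipe; with magrittr loaded, %>% could be swapped in for |> in the last example.

```r
x <- c(1, 4, 9, 16)

# 1. Save intermediate results as variables.
roots <- sqrt(x)
total_intermediate <- sum(roots)

# 2. Nest the function calls (read from the inside out).
total_nested <- sum(sqrt(x))

# 3. Pipe: take x, and then take square roots, and then sum.
total_piped <- x |> sqrt() |> sum()

c(total_intermediate, total_nested, total_piped)
```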
R looks up the values of names based on how a function is defined, not how it is called. Lexical is a technical term from computer science meaning that the scoping rules use a parse-time structure rather than a run-time structure. Lexical scoping in R follows four primary rules: (1) name masking, (2) functions versus variables, (3) a fresh start, and (4) dynamic lookup.
Names defined inside a function mask names defined outside of it.
R looks one level up if a name is not defined inside of a function.
The same rules apply if a function is defined inside another function. R first looks inside the current function, then in the place where that function was defined, and so on, all the way up to the global environment and finally the loaded packages.
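A small sketch of name masking. All function and variable names here are illustrative.

```r
x <- 10
y <- 20

# Both names are defined inside, so they mask the outer x and y.
mask_both <- function() {
  x <- 1
  y <- 2
  c(x, y)
}

# x is not defined inside, so R looks one level up and finds x <- 10.
mask_one <- function() {
  y <- 1
  c(x, y)
}

# Nested functions: z is local to inner(), y comes from outer(),
# and x comes from the environment where outer() was defined.
outer <- function() {
  y <- 2
  inner <- function() {
    z <- 3
    c(x, y, z)
  }
  inner()
}

mask_both()
mask_one()
outer()
```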
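Functions are ordinary objects, so the name masking rules apply to them as well. A function and a non-function object can share a name only if they reside in different environments; when the name is used in a function call, R ignores non-function objects while looking it up.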
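The following sketch gives the same name to a function and a number in different environments. The names are hypothetical.

```r
add_hundred <- function(x) x + 100

call_it <- function() {
  # A non-function object with the same name, in the function's environment.
  add_hundred <- 10
  # In call position R skips the number and finds the function one level up;
  # in argument position it finds the local number.
  add_hundred(add_hundred)
}

result <- call_it()
result
```

Reusing a name this way is confusing to read and is best avoided in real code.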
A new environment is created each time a function is called to execute the function.
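Because each call gets its own environment, a function does not remember what happened the last time it ran. A minimal sketch with a hypothetical counter function:

```r
counter <- function() {
  # inherits = FALSE restricts the check to this call's own environment.
  if (!exists("a", inherits = FALSE)) {
    a <- 1
  } else {
    a <- a + 1
  }
  a
}

first_call  <- counter()
second_call <- counter()

# Both calls return the same value because `a` never survives a call.
c(first_call, second_call)
```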
Values are searched for when the function is run. Thus, values can differ based on the objects outside of the function’s environment.
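In this sketch the same function returns different results because the outside value of x changes between calls:

```r
# x is not defined inside the function, so it is looked up at run time.
add_one <- function() x + 1

x <- 15
first_result <- add_one()

x <- 20
second_result <- add_one()

c(first_result, second_result)
```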
R relies on lexical scoping to find everything, including functions such as + and mean. Problems caused by dynamic lookup are not detected when you create the function, and an error may never be raised, depending on the variables defined in the user’s global environment.
codetools::findGlobals() lists a function’s external dependencies. Setting the function’s environment to emptyenv() removes access to every one of them, so you would then need to add the objects reported by findGlobals() to an environment manually before calling the function.
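A sketch of that workflow, assuming the codetools package (normally shipped with R as a recommended package) is installed. The function g is hypothetical.

```r
g <- function() x + 1

# List the external dependencies: the `+` function and the variable x.
globals <- codetools::findGlobals(g)
globals

# With an empty environment, not even `+` can be found.
environment(g) <- emptyenv()
failed <- tryCatch(g(), error = function(e) e)
conditionMessage(failed)

# Supply the dependencies manually in an otherwise empty environment.
deps <- new.env(parent = emptyenv())
deps$`+` <- base::`+`
deps$x <- 1
environment(g) <- deps
fixed <- g()
fixed
```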
Function arguments are lazily evaluated, meaning that they are only evaluated if accessed. This allows you to include potentially expensive computations in function arguments that are only evaluated if needed.
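Lazy evaluation is powered by a data structure called a promise. A promise has 3 components:
- an expression, which gives rise to the delayed computation
- an environment where the expression should be evaluated
- a value, which is computed and cached the first time the promise is accessed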
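Both properties can be seen in a short sketch: an unused argument is never evaluated, and a used argument is evaluated only once. The function names and the evaluations counter are made up for this example.

```r
# The argument is never accessed, so stop() is never run.
ignore_arg <- function(x) 10
lazy_result <- ignore_arg(stop("this is never evaluated"))
lazy_result

# Count how often the argument expression is actually evaluated.
evaluations <- 0
slow_double <- function(x) {
  evaluations <<- evaluations + 1
  x * 2
}

# x is accessed twice, but the promise caches its value after the first use.
use_twice <- function(x) c(x, x)
cached <- use_twice(slow_double(20))

cached
evaluations
```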
Default values can be defined in terms of other arguments or even variables defined later in the function. Many base R functions use this strategy, but it makes the code harder to read and the return value harder to predict.
The evaluation environment is slightly different for default and user-supplied arguments. Default arguments are evaluated inside the function, whereas user-supplied arguments are evaluated in the calling environment.
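Two small sketches illustrate these points. The functions are hypothetical and written only to show where defaults are evaluated.

```r
# Defaults may refer to other arguments (y) or to variables that are only
# created later in the body (z), because they are evaluated lazily.
with_defaults <- function(x = 1, y = x * 2, z = a + b) {
  a <- 10
  b <- 100
  c(x, y, z)
}
defaults <- with_defaults()
defaults

# ls() as a default is evaluated inside the function;
# ls() supplied by the caller is evaluated in the calling environment.
list_names <- function(x = ls()) {
  a <- 1
  x
}
inside  <- list_names()
outside <- list_names(ls())

inside
outside
```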
We can determine if an argument’s value comes from the user or from a default using the missing function. The sample function uses this technique.
base::sample takes a sample of the specified size from the elements of x, with or without replacement. It calls missing(size) to check whether size was supplied and, if not, falls back to the length of x.
An alternative way to write sample is to set size = NULL in the function arguments to indicate that it is not required. This simpler version of sample checks for NULL instead:
sample <- function(x, size = NULL, replace = FALSE, prob = NULL) {
  if (is.null(size)) {
    size <- length(x)
  }
  # Or use the `%||%` operator (base R 4.4.0 and later), which returns the
  # left side if it is not NULL and the right side otherwise.
  # size <- size %||% length(x)
  x[sample.int(length(x), size, replace, prob)]
}

A special argument, ... (pronounced dot-dot-dot), lets a function take any number of additional arguments. These arguments can either be used inside the function or passed to another function.
It can be useful to capture the arguments in a list with list(...) or to pass them along to a different function. lapply works this way: it forwards any additional arguments to the function it applies.
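A brief sketch of both uses of ... follows. The functions describe, forward, and collect are hypothetical; lapply and mean are base R.

```r
# Pass ... along to another function.
describe <- function(age, weight) list(age = age, weight = weight)
forward  <- function(id, ...) describe(...)
forwarded <- forward(id = 1, age = 30, weight = 70)
str(forwarded)

# Capture ... in a list.
collect <- function(...) list(...)
collected <- collect(a = 1, b = 2)
str(collected)

# lapply forwards na.rm = TRUE to mean() for every element.
scores <- list(c(1, 3, NA), c(4, NA, 6))
means <- lapply(scores, mean, na.rm = TRUE)
str(means)
```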
S3 generic functions can use ... to allow their methods to take additional arguments. Without it, a generic like print would have to declare every argument needed by any of the object types it displays.
The downsides of using ... are that more documentation may be needed to help users understand which arguments they can pass and where those arguments go. Additionally, misspelled arguments go unnoticed because they are absorbed by ... and do not raise an error.
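The misspelling problem is easy to reproduce with base::sum, which accepts ...:

```r
# Correct spelling: the NA is removed before summing.
correct <- sum(1, 2, NA, na.rm = TRUE)

# Misspelled argument: na_rm = TRUE is absorbed by ... and treated as one
# more value to add, so the NA is kept and no error is raised.
typo <- sum(1, 2, NA, na_rm = TRUE)

correct
typo
```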
Functions exit by returning a value or throwing an error. Returns can be implicit where the last evaluated expression is returned or explicit by calling return.
Most functions return visibly, but the invisible function prevents automatic printing of the last value. The most common function that returns invisibly is <-.
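A short sketch of implicit, explicit, and invisible returns. The function names are illustrative.

```r
# Implicit return: the value of the last evaluated expression.
implicit <- function(x) {
  if (x < 10) 0 else 10
}

# Explicit return: leave the function early with return().
explicit <- function(x) {
  if (x < 10) {
    return(0)
  }
  10
}

# Invisible return: the value exists but is not printed automatically.
quiet <- function() invisible(1)
quiet()                                  # prints nothing
is_visible <- withVisible(quiet())$visible
is_visible

# <- returns its value invisibly, which is why assignments can be chained.
a <- b <- 2
```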
Errors occur when a function cannot complete its assigned task. The function calls stop() to halt execution and signal the error.
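As an illustration, the hypothetical validator below stops on invalid input. tryCatch is used here only so the error message can be inspected without halting the script.

```r
check_age <- function(age) {
  if (!is.numeric(age) || age < 0) {
    stop("age must be a non-negative number")
  }
  age
}

valid <- check_age(42)

# Capture the error message instead of stopping the script.
msg <- tryCatch(check_age(-5), error = function(e) conditionMessage(e))

valid
msg
```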
A function may need to make a temporary change to the global state. However, cleaning up those changes may be problematic if there is an error. Using on.exit ensures those changes are undone and the global state restored.
Set add = TRUE so that a new on.exit call does not overwrite earlier exit handlers. If the handlers must run in a specific order, use after = TRUE (the default) to run a handler after the existing ones or after = FALSE to run it before them.
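The sketch below restores a global option with on.exit and then records the order in which two handlers run. The names format_pi_short, exit_log, and ordered_exit are made up for this example.

```r
# Temporarily change a global option and restore it on exit,
# even if the function fails part-way through.
format_pi_short <- function() {
  old <- options(digits = 3)
  on.exit(options(old), add = TRUE)
  format(pi)
}

digits_before <- getOption("digits")
short_pi <- format_pi_short()
digits_after <- getOption("digits")

# Record the order in which exit handlers run.
exit_log <- new.env()
exit_log$steps <- character()

ordered_exit <- function() {
  on.exit(exit_log$steps <- c(exit_log$steps, "first registered"), add = TRUE)
  # after = FALSE puts this handler in front of the existing one.
  on.exit(exit_log$steps <- c(exit_log$steps, "second registered"),
          add = TRUE, after = FALSE)
  invisible(NULL)
}
ordered_exit()

short_pi
exit_log$steps
```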
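There are four varieties of functions in R:
- prefix: the function name comes before its arguments, like foofy(a, b, c)
- infix: the function name comes between its arguments, like x + y
- replacement: functions that replace values by assignment, like names(df) <- c("a", "b", "c")
- special: functions like [[, if, and for

An interesting property of R is that all function varieties can be rewritten in prefix form.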
Knowing the name of a non-prefix function allows you to override its behavior.
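A minimal sketch of both ideas: calling an infix function by its backticked name, and overriding it. The override is wrapped in local() so it applies only inside that block; redefining operators is shown for illustration and is rarely a good idea in real code.

```r
x <- 1
y <- 2

# Infix and prefix forms are the same call.
same_result <- identical(x + y, `+`(x, y))

# Knowing the name lets you pass the operator to other functions.
added <- sapply(1:3, `+`, 10)

# Override `+` locally so that it pastes strings together.
pasted <- local({
  `+` <- function(a, b) paste(a, b)
  "brain" + "health"
})

same_result
added
pasted
```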
Prefix functions are the most common form in R. Their arguments can be specified in three ways:
- by position, like help(mean)
- by partial matching, like help(top = mean)
- by name, like help(topic = mean)

get_structure <- function(age_years, weight_kg, amyloid_level) {
  str(list(age = age_years, weight = weight_kg, amyloid = amyloid_level))
}
# By Position
get_structure(30, 70, 19)
#> List of 3
#> $ age : num 30
#> $ weight : num 70
#> $ amyloid: num 19
# By name (full name or partial-matching)
get_structure(weight = 70, 19, age = 30)
#> List of 3
#> $ age : num 30
#> $ weight : num 70
#> $ amyloid: num 19
# Raises error as a matches both age_years and amyloid_level
get_structure(30, 175, a = 19)
#> Error in get_structure(30, 175, a = 19) :
#>   argument 3 matches multiple formal arguments

Infix functions take two arguments, with the function name placed between them.
R has many built-in infix operators, including:
:, ::, :::, $, @, ^, *, /, +, -, >, >=, <, <=, ==, !=, !, &, &&, |, ||, ~, <-, and <<-
You can also create your own infix functions that start and end with %. Base R uses this pattern to define %%, %*%, %/%, %in%, %o%, and %x%.
Names of infix functions are more flexible than those of regular functions: they can contain any sequence of characters except %. Special characters need to be escaped in the string used to define the function, but not when you call it. Infix operators are composed from left to right.
`% %` <- function(a, b) paste(a, b)
"another" % % "new" % % "string"
#> [1] "another new string"
`%/\\%` <- function(a, b) paste(a, b)
"and" %/\% "one" %/\% "with" %/\% "special" %/\% "characters"
#> [1] "and one with special characters"
`%-%` <- function(a, b) paste0("(", a, " %-% ", b, ")")
"a" %-% "b" %-% "c"
#> [1] "((a %-% b) %-% c)"

Replacement functions act like they modify their arguments in place and have the special name xxx<-, with arguments x and value. If additional arguments are needed, place them between x and value.
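As a sketch, two hypothetical replacement functions are defined below: `second<-` with only x and value, and `modify<-` with an extra position argument placed between them. Each returns the modified object.

```r
# Replace the second element of a vector.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

# An additional argument goes between x and value.
`modify<-` <- function(x, position, value) {
  x[position] <- value
  x
}

v <- 1:10
second(v) <- 5L     # called as an assignment
modify(v, 1) <- 10  # position = 1, value = 10
v
```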
Combining replacement with other functions requires more complex translations.
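For example, names(x)[2] <- "two" combines the replacement function `names<-` with the subsetting replacement `[<-`. The sketch below writes out a simplified version of that translation by hand (R's actual translation also uses a temporary variable) and checks that both forms agree.

```r
x <- c(a = 1, b = 2, c = 3)
y <- c(a = 1, b = 2, c = 3)

# The usual form.
names(x)[2] <- "two"

# Roughly what R does: modify a copy of the names, then assign them back.
y <- `names<-`(y, `[<-`(names(y), 2, "two"))

names(x)
names(y)
```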
There are many language features in R that are written in special ways but also have prefix forms.
| Special | Prefix |
|---|---|
| (x) | `(`(x) |
| {x} | `{`(x) |
| x[i] | `[`(x, i) |
| x[[i]] | `[[`(x, i) |
| if (cond) true | `if`(cond, true) |
| if (cond) true else false | `if`(cond, true, false) |
| for(var in seq) action | `for`(var, seq, action) |
| while(cond) action | `while`(cond, action) |
| repeat expr | `repeat`(expr) |
| next | `next`() |
| break | `break`() |
| function(arg1, arg2) {body} | `function`(alist(arg1, arg2), body, env) |
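A few rows of the table can be tried directly. The sketch below calls the special forms by their backticked names; the variable names are illustrative.

```r
participant <- list(id = 1, group = "control")

# x[[i]] in prefix form.
group <- `[[`(participant, 2)

# if (cond) true else false in prefix form.
size_label <- `if`(length(participant) > 1, "many", "one")

# (x) in prefix form.
three <- `(`(1 + 2)

# for (var in seq) action in prefix form.
total <- 0
`for`(i, 1:4, total <- total + i)

group
size_label
three
total
```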
R for Lifestyle and Brain Health (R-LAB)