Object Oriented Programming

Applying R to Lifestyle and Brain Health Research

Brian C. Helsel, PhD

University of Kansas Medical Center

October 14, 2026

Introduction

Object-Oriented Programming (OOP) focuses on objects (like data frames or models). The same function can work differently depending on the object type — this is called polymorphism. It means you can use one function name for many kinds of input, and R figures out the right behavior.

Polymorphism

Polymorphism is what allows summary to produce different outputs for numeric and factor variables.

class(ggplot2::diamonds$carat)
#> [1] "numeric"

summary(ggplot2::diamonds$carat)
#>   Min.   1st Qu. Median   Mean   3rd Qu.  Max.
#>  0.2000  0.4000  0.7000  0.7979  1.0400  5.0100

class(ggplot2::diamonds$cut)
#> [1] "ordered" "factor"

summary(ggplot2::diamonds$cut)
#>  Fair      Good Very Good   Premium     Ideal
#>  1610      4906     12082     13791     21551

An OOP system makes it possible for any developer to extend the interface by adding implementations for new types of input. In OOP, the type of an object is its class, and an implementation for a specific class is a method. There are two main paradigms for OOP, which differ in how methods and classes are related:

  • Encapsulated OOP: Methods belong to objects or classes (e.g., object.method(arg1, arg2))
  • Functional OOP: Methods belong to generic functions (e.g., generic(object, arg1, arg2))
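The two call styles can be contrasted in a minimal sketch. The circle class, the area() generic, and the Circle generator are hypothetical names chosen for illustration. The encapsulated version uses RC (setRefClass() from the methods package) only because it ships with base R.

```r
library(methods)

# Functional OOP (S3): the method belongs to the generic area()
area <- function(shape, ...) UseMethod("area")
area.circle <- function(shape, ...) pi * shape$r^2

circ_s3 <- structure(list(r = 2), class = "circle")
area_functional <- area(circ_s3)        # generic(object)

# Encapsulated OOP (RC): the method belongs to the object
Circle <- setRefClass(
  "Circle",
  fields = list(r = "numeric"),
  methods = list(area = function() pi * r^2)
)

circ_rc <- Circle$new(r = 2)
area_encapsulated <- circ_rc$area()     # object$method()

# Both styles compute the same value
all.equal(area_functional, area_encapsulated)
```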

Object-Oriented Programming in R

Base R provides three OOP systems: S3, S4, and RC (reference classes). A number of other OOP systems are available from CRAN packages, including R6, R.oo, and proto.

The sloop package provides tools to help you interactively explore and understand object-oriented programming in R. For example, sloop::otype() makes it easy to identify which OOP system an object uses.

sloop::otype(mtcars)

#> [1] "S3"

sloop::otype(stats4::mle(function(x = 1) (x - 2)^2))
#> [1] "S4"

Base Types

While everything in R is an object, not everything is object-oriented. Base objects in R come from S, and were developed before anyone thought that S needed an OOP system.

Base and object-oriented objects can be distinguished using sloop::otype. The main difference between base and object-oriented objects is the "class" attribute.

sloop::otype(1:10)
#> [1] "base"

attr(1:10, "class")
#> NULL

sloop::otype(mtcars)
#> [1] "S3"

attr(mtcars, "class")
#> [1] "data.frame"

You can use sloop::s3_class to return the classes that the S3 and S4 systems will use to choose their methods.

x <- matrix(1:4, nrow = 2)

sloop::s3_class(x)
#> [1] "matrix" "integer" "numeric"

S3 Basics

An S3 object is a base type with at least a class attribute.

It behaves differently when passed to a generic function.
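As a small illustration, a factor is an integer vector with levels and class attributes. The sketch below uses only base R; the example values are arbitrary.

```r
f <- factor(c("low", "high", "low"))

typeof(f)        # the base type underneath the S3 object
attributes(f)    # the levels and class attributes

# The generic print() dispatches on the class attribute ...
print(f)

# ... while stripping the class reveals the underlying integer codes
print(unclass(f))
```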

Identifying Generic Functions

You can use sloop::ftype to identify generic functions.

sloop::ftype(print)
#> [1] "S3" "generic"

sloop::ftype(str)
#> [1] "S3" "generic"

The generic function defines the interface and finds the right method for the class using method dispatch. You can use sloop::s3_dispatch to see the process of method dispatch.

sloop::s3_dispatch(print(factor(letters)))
#> => print.factor
#>  * print.default

S3 methods are always named generic.class(), but you should never call a method directly; instead, rely on the generic function to find it for you.

You can use sloop::s3_get_method to see the source code of S3 methods, which are often not exported.

sloop::s3_get_method(weighted.mean.Date)

#> function (x, w, ...)
#> .Date(weighted.mean(unclass(x), w, ...))
#> <bytecode: 0x14e0872b0>
#> <environment: namespace:stats>

Classes

S3 has no formal definition of a class. Instead, to make an object an instance of a class, you set the class attribute with structure() or class<-(). You can view a class with class() and check whether an object is an instance of a class with inherits(x, "classname").

# Create and assign class in a single step
x <- structure(list(), class = "myclass")

# Create and then assign class
x <- list()
class(x) <- "myclass"

class(x)
#> [1] "myclass"

inherits(x, "myclass")
#> [1] TRUE

Making Your Own Classes

It is helpful to provide a constructor, validator, and helper function when creating your own classes. This makes it easy for others to create objects of your class.

  • Constructor: Efficiently creates new objects with the correct structure (e.g., new_myclass())
  • Validator: Performs more computationally expensive checks to ensure the object has correct values (e.g., validate_myclass())
  • Helper: Provides a convenient way for others to create objects of your class (e.g., myclass())
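The three roles can be sketched together for a small hypothetical class. The percent class and its functions below are illustrative names, not part of any package, and the 0 to 1 range check is an assumed rule for the example.

```r
# Constructor: cheap type check only
new_percent <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "percent")
}

# Validator: more expensive checks on the values
validate_percent <- function(x) {
  values <- unclass(x)
  if (any(is.na(values) | values < 0 | values > 1)) {
    stop("All `x` values must be non-missing and between 0 and 1", call. = FALSE)
  }
  x
}

# Helper: user-friendly entry point that coerces input, then validates
percent <- function(x = double()) {
  x <- as.double(x)
  validate_percent(new_percent(x))
}

p <- percent(c(0, 0.25, 1))
class(p)
```

Keeping the range check out of the constructor means internal code that already trusts its values can call new_percent() directly and skip the extra work.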

Constructors

There is no built-in way to ensure that all objects of a class have the same structure (e.g., same base type and attributes). A constructor can help enforce a consistent structure and should follow three principles:

  • Be called new_myclass()
  • Have one argument for the base object and one for each attribute
  • Check the type of the base object and the types of each attribute

new_Date <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "Date")
}

new_Date(c(-1, 0, 1))
#> [1] "1969-12-31" "1970-01-01" "1970-01-02"

Validators

More complicated classes require additional checks for validity. A constructor only checks that types are correct, making it possible to create malformed objects.

new_factor <- function(x = integer(), levels = character()) {
  stopifnot(is.integer(x))
  stopifnot(is.character(levels))
  structure(x, levels = levels, class = "factor")
}

new_factor(1:5, "a")
#> Error in `as.character.factor()`: malformed factor

It is better to add checks to a validator rather than including them in the constructor. This allows you to create new objects quickly when you know the values are correct, and re-use the validation checks in other places.

validate_factor <- function(x) {
  values <- unclass(x)
  levels <- attr(x, "levels")
  if (!all(!is.na(values) & values > 0)) {
    stop(
      "All `x` values must be non-missing and greater than zero",
      call. = FALSE
    )
  }
  if (length(levels) < max(values)) {
    stop(
      "There must be at least as many `levels` as possible values in `x`",
      call. = FALSE
    )
  }
  x
}

validate_factor(new_factor(1:5, "a"))
#> Error: There must be at least as many `levels` as possible values in `x`

validate_factor(new_factor(0:1, "a"))
#> Error: All `x` values must be non-missing and greater than zero

The validator function is primarily called for its side-effects (i.e., throwing an error if the object is not valid). It is useful for validation methods to visibly return the original input.

Helpers

A helper makes constructing objects of your class simple if it always:

  • Has the same name as the class (e.g., myclass())
  • Finishes by calling the constructor and validator
  • Creates error messages that are helpful to the end user
  • Has a thoughtfully crafted user interface with carefully chosen default values and conversions

At times, the helper only needs to coerce its inputs to the desired type. For example, the new_difftime constructor below is strict and violates the usual convention that an integer vector can be substituted for a double. A helper function can be created to coerce the input.

# Constructor
new_difftime <- function(x = double(), units = "secs") {
  stopifnot(is.double(x))
  units <- match.arg(units, c("secs", "mins", "hours", "days", "weeks"))
  structure(x, class = "difftime", units = units)
}

new_difftime(1:10)
#> Error in `new_difftime()`: is.double(x) is not TRUE

# Helper
difftime <- function(x = double(), units = "secs") {
  x <- as.double(x)
  new_difftime(x, units = units)
}

difftime(1:10)
#> Time differences in secs
#> [1] 1 2 3 4 5 6 7 8 9 10

Complex objects are often easiest to represent as strings. For example, you can create factors from a character vector, and a helper function can set the levels from the unique values.

factor <- function(x = character(), levels = unique(x)) {
  ind <- match(x, levels)
  validate_factor(new_factor(ind, levels))
}

factor(c("a", "a", "b"))
#> [1] a a b
#> Levels: a b

Some complex objects are best built from simple parts. For example, a datetime is naturally created from year, month, and day, which makes construction easier for users.

POSIXct <- function(
  year = integer(),
  month = integer(),
  day = integer(),
  hour = 0L,
  minute = 0L,
  sec = 0,
  tzone = ""
) {
  ISOdatetime(year, month, day, hour, minute, sec, tz = tzone)
}

POSIXct(2025, 9, 10, 9, tzone = "America/Chicago")
#> [1] "2025-09-10 09:00:00 CDT"

Generics and Methods

An S3 generic chooses the right method for an object’s class (method dispatch). This is done by UseMethod(), which takes the generic’s name and, optionally, the argument to dispatch on. If the second argument is left out (the usual case), it dispatches based on the first argument. Most S3 generics are nothing more than a call to UseMethod().

mean
#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x12f6fb108>
#> <environment: namespace:base>

# Creating your own S3 generic is simple
myNewGeneric <- function(x) {
  UseMethod("myNewGeneric")
}
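A generic is only useful once it has methods. The sketch below adds a default method and one class-specific method to a hypothetical generic called describe(); the name and the returned strings are illustrative.

```r
# A hypothetical generic that describes its input
describe <- function(x, ...) {
  UseMethod("describe")
}

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
  paste("An object of type", typeof(x))
}

# Method for data frames
describe.data.frame <- function(x, ...) {
  paste("A data frame with", nrow(x), "rows and", ncol(x), "columns")
}

describe(mtcars)   # dispatches to describe.data.frame
describe(1:3)      # falls back to describe.default
```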

Method Dispatch

UseMethod() creates a vector of method names and looks for each potential method in turn. We can see this with sloop::s3_dispatch.

x <- Sys.Date()

sloop::s3_dispatch(print(x))

#> => print.Date
#>  * print.default

  • => indicates the method that is called
  • * indicates a method that is defined, but not called

The default class is a special fallback, not a real class, that provides a method when no other match is found. While basic method dispatch is simple, it becomes more complex when inheritance, base types, internal generics, and group generics are involved.

x <- matrix(1:10, nrow = 2)

sloop::s3_dispatch(mean(x))
#>    mean.matrix
#>    mean.integer
#>    mean.numeric
#> => mean.default

sloop::s3_dispatch(sum(x))
#>    sum.matrix
#>    sum.integer
#>    sum.numeric
#>    sum.default
#>    Summary.matrix
#>    Summary.integer
#>    Summary.numeric
#>    Summary.default
#> => sum (internal)

Finding Methods

We can use sloop::s3_dispatch to find the specific method for a single call. Finding all the possible methods defined for a generic or associated with a class can be done with sloop::s3_methods_generic and sloop::s3_methods_class.

sloop::s3_methods_generic("mean")
#> 1 mean    Date       TRUE    base
#> 2 mean    default    TRUE    base
#> 3 mean    difftime   TRUE    base
#> 4 mean    POSIXct    TRUE    base
#> 5 mean    POSIXlt    TRUE    base
#> 6 mean    quosure    FALSE   registered S3method
#> 7 mean    vctrs_vctr FALSE   registered S3method

sloop::s3_methods_class("ordered")
#> 1 as.data.frame ordered TRUE    base
#> 2 Ops           ordered TRUE    base
#> 3 relevel       ordered FALSE   registered S3method
#> 4 Summary       ordered TRUE    base

Creating Methods

When writing a new method, watch out for two common pitfalls:

  • Ownership: Only define a method if you own the generic or the class. While R lets you define methods on anything, it’s best practice to work with the original author to avoid conflicts.
  • Arguments: A method must use the same arguments as its generic. The only exception is when the generic uses ..., in which case the method can include extra arguments.
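The argument rule can be illustrated with a hypothetical generic, rescale(), whose signature includes the dots. Because of that, its numeric method is free to add an extra `to` argument. All names here are assumptions made for the example.

```r
# The generic includes `...`, so methods may add their own arguments
rescale <- function(x, ...) {
  UseMethod("rescale")
}

# `to` is an extra argument that only this method understands
rescale.numeric <- function(x, to = c(0, 1), ...) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1]) * (to[2] - to[1]) + to[1]
}

rescale(c(10, 15, 20))
rescale(c(10, 15, 20), to = c(0, 100))
```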

Object Styles

Record-style objects (e.g., date-times), data frames, and scalar objects (e.g., linear models) are examples of object styles in which the length of the underlying data structure does not equal the number of observations.

Record style objects use a list of equal-length vectors to represent individual components of the object. The best example is POSIXlt which is a list of 11 date-time components.

x <- as.POSIXlt(ISOdatetime(2020, 1, 1, 0, 0, 1:3))
x
#> [1] "2020-01-01 00:00:01 CST" "2020-01-01 00:00:02 CST" "2020-01-01 00:00:03 CST"

length(x)
#> [1] 3

length(unclass(x))
#> [1] 11

Data frames are two dimensional and the number of observations is the number of rows, not the length.

x <- data.frame(x = 1:100, y = 1:100)

length(x)
#> [1] 2

nrow(x)
#> [1] 100

Scalar objects use a list to represent a single object. For example, an lm object is a list of length 12 even though it represents one model.

mod <- lm(mpg ~ wt, data = mtcars)

length(mod)
#> [1] 12

Inheritance

S3 classes can share behavior through a mechanism called inheritance which is powered by three ideas:

The class can be a character vector.

class(ordered("x"))
#> [1] "ordered" "factor"

class(Sys.time())
#> [1] "POSIXct" "POSIXt"

R looks for a method in subsequent classes if a method is not found for the class in the first element of the vector.

sloop::s3_dispatch(print(ordered("x")))
#>    print.ordered
#> => print.factor
#>  * print.default

sloop::s3_dispatch(print(Sys.time()))
#> => print.POSIXct
#>    print.POSIXt
#>  * print.default

A method can delegate work by calling NextMethod(). sloop::s3_dispatch reports delegation with ->.

sloop::s3_dispatch(ordered("x")[1])
#>    [.ordered
#> => [.factor
#>    [.default
#> -> [ (internal)

Ordered is a subclass of factor because it appears before factor in the class vector; likewise, factor is a superclass of ordered. S3 doesn’t enforce rules on subclasses and superclasses, but it’s good practice to keep base types and attributes consistent.
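The fallback and delegation behavior can be reproduced with a pair of hypothetical classes. The child and parent classes and the greet() generic below are made up for illustration.

```r
greet <- function(x, ...) UseMethod("greet")

# Only the superclass has a method at first
greet.parent <- function(x, ...) "hello from parent"

obj <- structure(list(), class = c("child", "parent"))
fallback <- greet(obj)      # no greet.child yet, so greet.parent runs

# The subclass method adds its own behavior, then delegates upward
greet.child <- function(x, ...) {
  paste("hello from child;", NextMethod())
}
delegated <- greet(obj)

fallback
delegated
inherits(obj, "parent")
```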

NextMethod

NextMethod tells R to keep looking for the next method up the class hierarchy and run it.

We can create an example with [, the most common use case, by adding a secret class that hides its output when printed.

# Add a new constructor
new_secret <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "secret")
}

# Add a print method for the secret class
print.secret <- function(x, ...) {
  print(strrep("x", nchar(x)))
  invisible(x)
}

x <- new_secret(c(15, 1, 456))
x
#> [1] "xx"  "x"   "xxx"

sloop::s3_dispatch(x[1])
#>    [.secret
#>    [.default
#> => [ (internal)

x[1]
#> [1] 15

This works, but the default [ method does not preserve the class, so the output is no longer hidden. Providing a [.secret method would solve this problem. However, the naive approach would cause an infinite loop.

# Naive approach - infinite loop
`[.secret` <- function(x, i) {
  new_secret(x[i])
}

# Inefficient way because it creates a copy of x
`[.secret` <- function(x, i) {
  x <- unclass(x)
  new_secret(x[i])
}

x[1]
#> [1] "xx"

# Best way using NextMethod()
`[.secret` <- function(x, i) {
  new_secret(NextMethod())
}

x[1]
#> [1] "xx"

# Shows [.secret is called but work is delegated to the internal method
sloop::s3_dispatch(x[1])
#> => [.secret
#>    [.default
#> -> [ (internal)

Allowing Subclassing

If you want to allow subclasses of your class, the parent constructor needs ... and class arguments. The subclass constructor can then simply call the parent constructor with additional arguments as needed.

new_secret <- function(x, ..., class = character()) {
  stopifnot(is.double(x))
  structure(x, ..., class = c(class, "secret"))
}

# Create a new supersecret class that hides the number of characters
new_supersecret <- function(x) {
  new_secret(x, class = "supersecret")
}

print.supersecret <- function(x, ...) {
  print(rep("xxxxx", length(x)))
  invisible(x)
}

x2 <- new_supersecret(c(15, 1, 456))
x2
#> [1] "xxxxx" "xxxxx" "xxxxx"

Using the constructor inside methods breaks inheritance (i.e., the result always has the same class, even for subclasses). The vctrs::vec_restore() function restores the original class after operations like subsetting.

# Ignores supersecret class
`[.secret` <- function(x, ...) {
  new_secret(NextMethod())
}

x2[1:3]
#> [1] "xx"  "x"  "xxx"

`[.secret` <- function(x, ...) {
  vctrs::vec_restore(NextMethod(), x)
}

x2[1:3]
#> [1] "xxxxx" "xxxxx" "xxxxx"

If you build your class with vctrs, this behavior comes automatically; you only need a custom [ method for special cases. Explore the vctrs package for tools that help create and work with vector-like objects in R.

R6 Basics

The R6 OOP system has two special properties:

  1. It uses the encapsulated OOP paradigm, meaning that methods belong to objects (not generics) and are called with object$method()

  2. Objects are mutable which means they have reference semantics and are modified in place
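The difference between copy-on-modify and reference semantics can be sketched without R6 by comparing a list with an environment, which is the base R structure that behaves by reference. The function and variable names below are hypothetical.

```r
# Copy-on-modify: the function changes its own copy of the list
bump_list <- function(l) {
  l$n <- l$n + 1
  invisible(l)
}

# Reference semantics: the function changes the caller's environment
bump_env <- function(e) {
  e$n <- e$n + 1
  invisible(e)
}

counter_list <- list(n = 0)
counter_env <- new.env()
counter_env$n <- 0

bump_list(counter_list)
bump_env(counter_env)

counter_list$n   # unchanged, because only a copy was modified
counter_env$n    # updated in place
```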

Classes and Methods

R6 classes are created with R6::R6Class(), which has two important arguments:

  • classname: Improves error messages and makes it possible to use R6 objects with S3 generics
  • public: Provides a list of methods (functions) and fields (anything other than functions) that make up the public interface of the object.

Use UpperCamelCase for R6 classes and snake_case for methods and fields. Always assign the result of R6Class to a variable with the same name as the class. Methods access the fields and methods of the current object via self$.

Accumulator <- R6::R6Class(
  classname = "Accumulator",
  public = list(sum = 0, add = function(x = 1) {
    self$sum <- self$sum + x
    invisible(self)
  })
)

Accumulator

#> <Accumulator> object generator
#>   Public:
#>     sum: 0
#>     add: function (x = 1)
#>     clone: function (deep = FALSE)
#>   Parent env: <environment: R_GlobalEnv>
#>   Locked objects: TRUE
#>   Locked class: FALSE
#>   Portable: TRUE

You can construct a new object from the class by calling the new() method and then call methods and access fields with $.

x <- Accumulator$new()

# Access method and add 4 to a sum of 0
x$add(4)

# Access the field to get the sum
x$sum
#> [1] 4

The $add() method is primarily called for its side-effect of updating sum. Side-effect methods should always return self invisibly. This returns the current object and makes method chaining possible.

x$add(10)$add(10)$sum
#> [1] 24

Important Methods

The $initialize() and $print() methods should be defined for most classes because they make your class easier to use. The $initialize() method overrides the default behavior of $new(), and $print() overrides the default printing behavior.

Person <- R6::R6Class(
  classname = "Person",
  public = list(
    name = NULL,
    age = NA,
    initialize = function(name, age = NA) {
      self$name <- name
      self$age <- age
    },
    print = function(...) {
      cat("Person: \n")
      cat("  Name: ", self$name, "\n", sep = "")
      cat("  Age: ", self$age, "\n", sep = "")
      invisible(self)
    }
  )
)

Brian <- Person$new("Brian", age = 36)
Brian
#> Person:
#>   Name: Brian
#>   Age: 36

Adding Methods After Creation

The fields and methods of an R6 class can be modified after creation. This is useful when exploring interactively or breaking up a class with many functions into smaller pieces. You can use $set() to add new elements to an existing class.

Accumulator <- R6::R6Class(classname = "Accumulator")
Accumulator$set("public", "sum", 0)
Accumulator$set("public", "add", function(x = 1) {
  self$sum <- self$sum + x
  invisible(self)
})

Inheritance

Behavior from an existing class can be used by providing the class object to the inherit argument. Below, $add() overrides the superclass implementation; however, using super$ will delegate to the superclass implementation (like using NextMethod).

AccumulatorChatty <- R6::R6Class(
  classname = "AccumulatorChatty",
  inherit = Accumulator,
  public = list(
    add = function(x = 1) {
      cat("Adding ", x, "\n", sep = "")
      super$add(x = x)
    }
  )
)

x2 <- AccumulatorChatty$new()
x2$add(10)$add(1)$sum
#> Adding 10
#> Adding 1
#> [1] 11

Introspection

Every R6 object has an S3 class that reflects its hierarchy of R6 classes. It includes the base R6 class, which provides common behavior such as print.R6. You can list all methods and fields of an R6 object with names().

class(x2)
#> [1] "AccumulatorChatty" "Accumulator" "R6"

names(x2)
#> [1] ".__enclos_env__" "sum" "clone" "add"

Controlling Access

R6Class has two other arguments that work similarly to public:

  • private: Allows you to create fields and methods that are only available from within the class
  • active: Allows you to use accessor functions to define dynamic (i.e., active) fields

Private

The private argument to R6Class works the same way as the public argument. It accepts a named list of methods (functions) and fields. Fields and methods are available within the class by using private$ instead of self$.

Person <- R6::R6Class(
  classname = "Person",
  public = list(
    initialize = function(name, age = NA) {
      private$name <- name
      private$age <- age
    },
    print = function(...) {
      cat("Person: \n")
      cat("  Name: ", private$name, "\n", sep = "")
      cat("  Age: ", private$age, "\n", sep = "")
    }
  ),
  private = list(
    age = NA,
    name = NULL
  )
)

Brian <- Person$new("Brian", age = 36)
Brian
#> Person:
#>   Name: Brian
#>   Age: 36

Brian$name
#> NULL

Active

Active fields allow you to define components that look like fields, but are defined with functions. They are implemented using active bindings and each active binding is a function that takes a single argument (i.e., value). If the argument is missing(), the value is being retrieved, otherwise it is being modified.

Random <- R6::R6Class(
  classname = "Random",
  active = list(
    random = function(value) {
      if (missing(value)) {
        runif(1)
      } else {
        stop("Can't set `$random`", call. = FALSE)
      }
    }
  )
)

x <- Random$new()

set.seed(123)
x$random
#> [1] 0.2875775

set.seed(456)
x$random
#> [1] 0.0895516
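The same read-versus-write pattern can be sketched with base R's makeActiveBinding(), which attaches a function to a name in an environment. The environment `e`, the binding name `doubled`, and the variable `stored` are hypothetical.

```r
e <- new.env()
stored <- 1

# One function handles both reads (no argument) and writes (one argument)
makeActiveBinding(
  "doubled",
  function(value) {
    if (missing(value)) {
      stored * 2
    } else {
      stored <<- value
    }
  },
  e
)

e$doubled        # read: returns stored * 2
e$doubled <- 5   # write: updates `stored`
e$doubled        # read again: reflects the new value
```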

Active fields are particularly useful with private fields because they allow you to implement components that look like fields from the outside but provide additional checks. For example, we can create a read-only age field and ensure that name is a character vector of length one.

Person <- R6::R6Class(
  classname = "Person",
  private = list(
    .age = NA,
    .name = NULL
  ),
  active = list(
    age = function(value) {
      if (missing(value)) {
        private$.age
      } else {
        stop("`$age` is read only", call. = FALSE)
      }
    },
    name = function(value) {
      if (missing(value)) {
        private$.name
      } else {
        stopifnot(is.character(value), length(value) == 1)
        private$.name <- value
        self
      }
    }
  ),
  public = list(
    initialize = function(name, age = NA) {
      private$.name <- name
      private$.age <- age
    }
  )
)

Brian <- Person$new("Brian", age = 36)
Brian$name
#> [1] "Brian"

Brian$name <- 10
#> Error: is.character(value) is not TRUE

Brian$age <- 35
#> Error: `$age` is read only

Reference Semantics

Unlike most other R objects, R6 objects have reference semantics. This means that objects are not copied when modified. If you want a copy, you need to use $clone(); for recursive cloning of nested R6 objects, use $clone(deep = TRUE).

y1 <- Accumulator$new()

y2 <- y1$clone()

y1$add(10)

c(y1 = y1$sum, y2 = y2$sum)
#> y1 y2
#> 10  0

There are some less obvious consequences of reference semantics:

  • The $finalize() method should be used to clean up the resources created by the initializer
  • An R6 object used as the default value of a field is shared across all instances of the class

Because R6 objects are not copied on modification, each object is deleted only once, so it makes sense to think about when it is finalized. We can use $finalize() as a complement to $initialize(): it plays a role similar to on.exit() in a function and cleans up any resources created by the initializer.

TemporaryFile <- R6::R6Class(
  classname = "TemporaryFile",
  public = list(
    path = NULL,
    initialize = function() {
      self$path <- tempfile()
    }
  ),
  private = list(
    finalize = function() {
      message("Cleaning up ", self$path)
      unlink(self$path)
    }
  )
)

tf <- TemporaryFile$new()
rm(tf)
gc()
#> Cleaning up /var/folders/_j/6wfn_8q16dn9f0lq313z__hn_4vjtn/T//Rtmpgqvd1t/file3a385e47c3dc

If you use an instance of an R6 class as the default value of a field, it will be shared across all instances of the object. For example, we want to create a temporary database every time we call TemporaryDatabase$new(), but the code below always uses the same path.

TemporaryDatabase <- R6::R6Class(
  classname = "TemporaryDatabase",
  public = list(
    con = NULL,
    file = TemporaryFile$new(),
    initialize = function() {
      self$con <- DBI::dbConnect(RSQLite::SQLite(), dbname = self$file$path)
    }
  ),
  private = list(
    finalize = function() {
      DBI::dbDisconnect(self$con)
    }
  )
)

db_a <- TemporaryDatabase$new()
db_b <- TemporaryDatabase$new()

db_a$file$path == db_b$file$path
#> [1] TRUE

In the example above, TemporaryFile$new() is only called once, when TemporaryDatabase is defined. We can move the call into $initialize() to create a new file each time.

TemporaryDatabase <- R6::R6Class(
  classname = "TemporaryDatabase",
  public = list(
    con = NULL,
    file = NULL,
    initialize = function() {
      self$file <- TemporaryFile$new()
      self$con <- DBI::dbConnect(RSQLite::SQLite(), dbname = self$file$path)
    }
  ),
  private = list(
    finalize = function() {
      DBI::dbDisconnect(self$con)
    }
  )
)

db_a <- TemporaryDatabase$new()
db_b <- TemporaryDatabase$new()

db_a$file$path == db_b$file$path
#> [1] FALSE

S4 Basics

S4 provides a more formal approach to functional OOP. The underlying ideas are similar to S3, but the implementation is stricter and uses specialized functions for creating classes (setClass()), generics (setGeneric()), and methods (setMethod()). S4 also provides multiple inheritance (i.e., a class can have multiple parents; see Advanced R, Section 15.5.2) and multiple dispatch (i.e., method dispatch can use the class of multiple arguments; see Advanced R, Section 15.5.3).

Defining Classes and Setting Generics

You can define an S4 class using the methods package by calling setClass() with the class name and a definition of its slots. A slot is a named component of the object that is accessed using the @ operator. Once the class is defined, you can use new() to construct new objects.

# The methods package is always available when running R interactively,
# but calling library(methods) lets readers know you are using S4
library(methods)

setClass(
  "Person",
  slots = c(
    name = "character",
    age = "numeric"
  )
)

Brian <- new("Person", name = "Brian Helsel", age = 36)

# Check class with "is"
is(Brian)
#> [1] "Person"

# Access slots with "@" or "slot"
# @ is equivalent to $
# slot is equivalent to [[

Brian@name
#> [1] "Brian Helsel"

slot(Brian, "age")
#> [1] 36

Accessor functions should be used to allow you to safely set and get slot values.

setGeneric("age", function(x) standardGeneric("age"))
setGeneric("age<-", function(x, value) standardGeneric("age<-"))

setMethod("age", "Person", function(x) x@age)
setMethod("age<-", "Person", function(x, value) {
  x@age <- value
  x
})

age(Brian) <- 37
age(Brian)
#> [1] 37

You can use sloop functions to identify S4 objects and generics.

sloop::otype(Brian)
#> [1] "S4"

sloop::ftype(age)
#> [1] "S4" "generic"

Exploring More Arguments for setClass

S4 classes are defined by calling setClass() with three main arguments:

  • Class: The class name, using UpperCamelCase
  • slots: A named character vector describing the names and classes of the fields (use "ANY" to allow a slot to accept objects of any type)
  • prototype: A list of default values for each slot

setClass(
  "Person",
  slots = c(
    name = "character",
    age = "numeric"
  ),
  prototype = list(
    name = NA_character_,
    age = NA_real_
  )
)

Brian <- new("Person", name = "Brian Helsel")
str(Brian)
#> Formal class 'Person' [package ".GlobalEnv"] with 2 slots
#>   ..@ name: chr "Brian Helsel"
#>   ..@ age : num NA

Inheritance

The setClass function also has a contains argument to specify a class (or classes) to inherit slots and behavior from.

setClass(
  "Employee",
  contains = "Person",
  slots = c(
    boss = "Person"
  ),
  prototype = list(
    boss = new("Person")
  )
)

str(new("Employee"))
#> Formal class 'Employee' [package ".GlobalEnv"] with 3 slots
#>   ..@ boss:Formal class 'Person' [package ".GlobalEnv"] with 2 slots
#>   .. .. ..@ name: chr NA
#>   .. .. ..@ age : num NA
#>   ..@ name: chr NA
#>   ..@ age : num NA

Introspection

Determining what class an object inherits from or testing if an object inherits from a specific class can be done using is().

is(new("Person"))
#> [1] "Person"

is(new("Employee"))
#> [1] "Employee" "Person"

is(Brian, "Person")
#> [1] TRUE

When calling setClass(), you are registering a class definition in a hidden global variable. Class definition and object construction both happen at run time, so setClass() should be used with the same care as any other state-modifying function: redefining a class after objects have already been created from it can leave those objects invalid.
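The risk of redefining a class can be demonstrated with a short sketch. The Visit class and its slots are hypothetical names used only for this demonstration.

```r
library(methods)

# Hypothetical class used only for this demonstration
setClass("Visit", slots = c(score = "numeric"))
v <- new("Visit", score = 10)

# Redefining the class does not update objects that already exist
setClass("Visit", slots = c(rating = "numeric"))

# `v` was built from the old definition, so the new slot is missing
has_new_slot <- tryCatch(
  {
    v@rating
    TRUE
  },
  error = function(e) FALSE
)
has_new_slot
```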

Helper

The new() function is a low-level constructor suitable for use by the developer, but user-facing classes should be paired with a helper which always:

  • Has the same name as the class
  • Uses a thoughtfully crafted user interface with carefully chosen defaults and conversions
  • Finishes by calling methods::new()

Person <- function(name, age = NA) {
  age <- as.double(age)
  new("Person", name = name, age = age)
}

str(Person("Brian"))
#> Formal class 'Person' [package ".GlobalEnv"] with 2 slots
#>   ..@ name: chr "Brian"
#>   ..@ age : num NA

Validator

The constructor automatically checks that the slots have the correct classes, but you will need to implement more complicated checks. We can do this with the setValidity function.

setValidity(Class = "Person", method = function(object) {
  if (length(object@name) != length(object@age)) {
    "@name and @age must be the same length"
  } else {
    TRUE
  }
})

Person("Brian", age = c(36, 37))
#> Error in `validObject()`:
#> invalid class “Person” object: @name and @age must be the same length

# Check the validity with validObject
validObject(Person("Brian", age = 36))
#> [1] TRUE

Generics and Methods

The role of a generic is to perform method dispatch (i.e., find the right implementation for the defined classes). We can use setGeneric with a function that calls standardGeneric to create a new S4 generic.

setGeneric("myGeneric", function(x) standardGeneric("myGeneric"))

It is best practice to use lowerCamelCase for S4 generics and to avoid using {} in the function, as it triggers a special case that is more computationally expensive.

The signature argument allows you to control the arguments that are used for method dispatch. If signature is not provided, all arguments except for ... are used. At times, it is useful to remove arguments from dispatch. This lets you require that methods provide certain arguments (e.g., verbose = TRUE) while ensuring that those arguments are not involved in dispatch.

setGeneric(
  "myGeneric",
  function(x, ..., verbose = TRUE) standardGeneric("myGeneric"),
  signature = "x"
)

We can add methods with setMethod(), which takes three important arguments: the name of the generic, the signature (often just the name of a class), and the method itself. The signature can include multiple arguments.

setMethod("myGeneric", "Person", function(x, ..., verbose = TRUE) {
  # Add your method implementation here...
})
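A signature with more than one argument gives multiple dispatch. The sketch below uses a hypothetical generic, combine(), that selects a method from the classes of both x and y.

```r
library(methods)

# Hypothetical generic that dispatches on both `x` and `y`
setGeneric("combine", function(x, y) standardGeneric("combine"))

# Method chosen when both arguments are numeric
setMethod("combine", signature("numeric", "numeric"), function(x, y) {
  x + y
})

# Method chosen when the first argument is character and the second numeric
setMethod("combine", signature("character", "numeric"), function(x, y) {
  paste(x, y)
})

combine(1, 2)
combine("visit", 2)
```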

You can list all the methods that belong to a generic with methods("generic"), list the methods associated with a class with methods(class = "class"), and find the implementation of a specific method with selectMethod("generic", "class").
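A few related introspection functions from the methods package can be tried on a small example. The Sample class and label() generic below are hypothetical and exist only for this sketch.

```r
library(methods)

# Hypothetical class and generic used only for this demonstration
setClass("Sample", slots = c(id = "character"))
setGeneric("label", function(x) standardGeneric("label"))
setMethod("label", "Sample", function(x) paste("Sample", x@id))

isGeneric("label")                # is `label` an S4 generic?
existsMethod("label", "Sample")   # is a method defined for exactly this class?
showMethods("label")              # print the methods attached to the generic

# Retrieve the implementation and call it directly
impl <- selectMethod("label", "Sample")
impl(new("Sample", id = "A01"))
```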

Show Method

The most commonly defined S4 method is show(), which controls how an object appears when it is printed. To define a method for an existing generic, first determine its arguments with args().

args(getGeneric("show"))
#> function (object)
#> NULL

setMethod("show", "Person", function(object) {
  cat(
    is(object)[[1]],
    "\n",
    "  Name: ",
    object@name,
    "\n",
    "  Age: ",
    object@age,
    "\n",
    sep = ""
  )
})

Person("Brian", 35)
#> Person
#>   Name: Brian
#>   Age: 35

Accessors

Slots should be considered an internal implementation detail, and all user-accessible slots should be accompanied by a pair of accessors.

setGeneric("name", function(x) standardGeneric("name"))
setMethod("name", "Person", function(x) x@name)

name(Brian)
#> [1] "Brian Helsel"

If the slot is writeable, you should provide a setter function and always include validObject() to prevent the user from creating invalid objects.

setGeneric("name<-", function(x, value) standardGeneric("name<-"))
setMethod("name<-", "Person", function(x, value) {
  x@name <- value
  validObject(x)
  x
})

name(Brian) <- "Brian C. Helsel"
name(Brian)
#> [1] "Brian C. Helsel"

name(Brian) <- letters
#> Error in `validObject()`:
#> invalid class “Person” object: @name and @age must be the same length

S4 and S3 Classes and Generics

When writing S4 code, you will often need to interact with existing S3 classes and generics. When using setClass(), you can include S4 classes, S3 classes, or the implicit class of a base type. To use an S3 class, you must first register it with setOldClass(), but it is usually better to provide a full S4 definition with slots and a prototype.

setClass(
  "factor",
  contains = "integer",
  slots = c(levels = "character"),
  prototype = structure(integer(), levels = character())
)

setOldClass("factor", S4Class = "factor")
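For comparison, the simplest form of registration passes only the class name. The sketch below registers a hypothetical S3 class, reading, and then uses it as a slot type in a hypothetical S4 class, Session.

```r
library(methods)

# Hypothetical S3 class, defined with a constructor
new_reading <- function(x = double()) {
  stopifnot(is.double(x))
  structure(x, class = "reading")
}

# Register the S3 class by name so that S4 recognizes it
setOldClass("reading")

# The registered class can now be used as a slot type
setClass("Session", slots = c(values = "reading"))

s <- new("Session", values = new_reading(c(1.5, 2.5)))
inherits(s@values, "reading")
```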

If an S4 object inherits from an S3 class or a base type, it will have a special virtual slot called .Data containing the underlying base type or S3 object.

RangedNumeric <- setClass(
  "RangedNumeric",
  contains = "numeric",
  slots = c(min = "numeric", max = "numeric"),
  prototype = structure(numeric(), min = NA_real_, max = NA_real_)
)

rn <- RangedNumeric(1:10, min = 1, max = 10)
str(rn)
#> Formal class 'RangedNumeric' [package ".GlobalEnv"] with 3 slots
#>   ..@ .Data: int [1:10] 1 2 3 4 5 6 7 8 9 10
#>   ..@ min  : num 1
#>   ..@ max  : num 10

It is also possible to convert an existing S3 generic to an S4 generic.

selectMethod("mean", "ANY")
#> Error in `getGeneric()`: no generic function found for ‘mean’

setGeneric("mean")

selectMethod("mean", "ANY")
#> Method Definition (Class "derivedDefaultMethod"):
#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x11e0ad228>
#> <environment: namespace:base>

#> Signatures:
#>         x
#> target  "ANY"
#> defined "ANY"

Trade-Offs

Read Advanced R Chapter 16 for a full description of the trade-offs between S3, S4, and R6.

| System | Strengths | Weaknesses | Best For |
|--------|-----------|------------|----------|
| S3 | Simple, lightweight, flexible | Informal, no checks, inconsistent | Quick extensions, prototyping |
| S4 | Formal definitions, strong validation, supports inheritance | Verbose, harder to learn | Large projects, safety-critical code |
| R6 | Encapsulation, mutable objects, familiar to OO programmers | Less functional, mutable state harder to debug | Simulations, external APIs, stateful objects |