If you print things in R it probably hasn’t escaped your attention that print() is capable of playing many roles.
> print("some text")
[1] "some text"
> print(my_data_frame)
  a     b    c
1 1   red    a
2 2  blue  bcb
3 3 green <NA>
> print(my_ggplot)
So what’s going on here? Is print some kind of amazing function which knows about how every data structure imaginable in R needs to be displayed?
Of course not.
What print actually is in this case is a whole heap of different functions (actually we term them methods in this scenario, but to get started you can think of them simply as lots of different functions).
S3 methods
At the heart of any S3 method is a generic function. This is the first point of entry when you invoke the named function.
In the case of print, we can inspect its code by simply typing print without any parentheses at the console prompt:
> print
function (x, ...) 
UseMethod("print")
This is the simple one-line function definition which sits at the heart of any multi-purpose S3 function.
In essence it tells the R dispatcher to look for a function whose name is “print” followed by a dot, followed by the name of the object’s class. If no such function exists, it falls back to print.default.
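Here is a minimal sketch of that lookup, using a made-up generic called describe. It is not part of base R; the name and its two methods are invented purely for illustration.

```r
# Hypothetical generic, for illustration only: same shape as print()
describe <- function(x, ...) {
    UseMethod("describe")
}

# Method chosen when class(x) includes "data.frame"
describe.data.frame <- function(x, ...) {
    cat("A data frame with", nrow(x), "rows\n")
    invisible(x)
}

# Fallback used when no class-specific method is found
describe.default <- function(x, ...) {
    cat("Something of class", class(x)[1], "\n")
    invisible(x)
}

describe(data.frame(a = 1:3))   # dispatches to describe.data.frame
describe(1:3)                   # no describe.integer, so describe.default runs
```

The generic itself never inspects the object; all of the class-specific behaviour lives in the methods.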
In the case of a data frame, for example, there is a function called print.data.frame – again we can test for that, and inspect its code, by typing its name at the console prompt:
> print.data.frame
function (x, ..., digits = NULL, quote = FALSE, right = TRUE, 
    row.names = TRUE) 
{
    n <- length(row.names(x))
    if (length(x) == 0L) {
        cat(sprintf(ngettext(n, "data frame with 0 columns and %d row", 
            "data frame with 0 columns and %d rows"), n), "\n", 
            sep = "")
    }
    else if (n == 0L) {
        print.default(names(x), quote = FALSE)
        cat(gettext("<0 rows> (or 0-length row.names)\n"))
    }
    else {
        m <- as.matrix(format.data.frame(x, digits = digits, 
            na.encode = FALSE))
        if (!isTRUE(row.names)) 
            dimnames(m)[[1L]] <- if (identical(row.names, FALSE)) 
                rep.int("", n)
            else row.names
        print(m, ..., quote = quote, right = right)
    }
    invisible(x)
}
We could, if we actually wanted to, explicitly call the method by name. With a data frame for example, we can test that by observing that the following two commands print the same thing:
print(my_data_frame)
print.data.frame(my_data_frame)
We could also fool it by calling the default print function (the go-to function which only knows how to print the simplest objects like lists and vectors) with an explicit function call:
print.default(my_data_frame)
Notice that it tries to print your data frame much like it would a list. Which isn’t a bad guess, just not the best.
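If you’d rather check that than eyeball it, here is a rough sketch that compares the printed text. The data frame is my own stand-in for my_data_frame, rebuilt from the output shown at the top of the post, so treat its contents as an assumption.

```r
# Assumed stand-in for my_data_frame, rebuilt from the output shown earlier
my_data_frame <- data.frame(a = 1:3,
                            b = c("red", "blue", "green"),
                            c = c("a", "bcb", NA))

# capture.output() collects what would have been printed, one string per line
via_generic <- capture.output(print(my_data_frame))
via_method  <- capture.output(print.data.frame(my_data_frame))
via_default <- capture.output(print.default(my_data_frame))

identical(via_generic, via_method)    # the same method ran both times
identical(via_generic, via_default)   # print.default shows the underlying list
```

The first comparison should come back TRUE and the second FALSE.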
So how many different print functions are there?
To list the methods available for a given function we can call the methods() function, passing it the name of the generic as a character string.
methods("print")
On my setup that lists 276 different functions, whose names all begin with “print.”.
Note that many have an asterisk by their names. This tells us that they’re not exported from the package that defines them, so they’re not available to call directly by name in the manner we demonstrated above.
It’s a shame. I really wanted to see what happens when I try to print.ggplot a data frame!
But it’s not really important. In most cases we won’t ever need to invoke these functions by calling anything other than print().
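For the curious, a registered method can still be fetched with getS3method() from the utils package. The sketch below uses print.lm from the stats package as the example, on the assumption that it is one of the starred entries on your setup (as far as I know it is registered but not exported).

```r
# How many print methods this session can see; the count varies with the
# R version and with which packages are loaded
length(methods("print"))

# getS3method() looks a method up in the S3 registry, so it can return
# methods that are registered but not exported from their package
print_lm <- getS3method("print", "lm")
is.function(print_lm)

# Once retrieved it is an ordinary function and can be called directly
fit <- lm(dist ~ speed, data = cars)
print_lm(fit)
```

That said, going through print() is almost always the better choice, since it picks the right method for you.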
Enough of this, let’s set up a reason to build our own bespoke print method.
Let’s make triangles!
What do we need to define a triangle? Just 3 numbers really – the lengths of its 3 sides.
But that’s just a vector isn’t it? Yes. But like a vector plus. In our case let’s call a triangle a vector with a class of triangle.
> tri <- c(3,4,5)
> class(tri) <- "triangle"
> print(tri)
[1] 3 4 5
attr(,"class")
[1] "triangle"
It’s been clever enough to invoke the default print function, which really just prints a vector and the values of any attributes.
But instead let’s create our own triangle printing function…
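> print.triangle <- function(x, ...) {
+     cat("Ooh a triangle\n")
+     print(c(x))
+     # all.equal() tolerates floating-point rounding in non-integer sides
+     if (isTRUE(all.equal(max(x)^2, sum(x^2)/2)))
+         cat("Oh wow, and it's a right-angled triangle!\n")
+     invisible(x)
+ }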
…and invoke it with our new triangle…
> print(tri)
Ooh a triangle
[1] 3 4 5
Oh wow, and it's a right-angled triangle!
Note that our new print function also contains its own print statement:
print(c(x))
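This coerces our triangle to a plain numeric vector by dropping its class attribute, so the inner print dispatches to the default print function, which prints out those 3 values between our extra lines of text. Had we called print(x) there instead, it would have dispatched straight back to print.triangle and recursed forever.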
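A quick check of what c() does to the class, as a small sketch you can paste at the console. The comparison with unclass() is my own addition; it is simply the more explicit way of saying the same thing.

```r
tri <- c(3, 4, 5)
class(tri) <- "triangle"

class(tri)        # "triangle"
class(c(tri))     # "numeric": c() has dropped the class attribute

# unclass() is the more explicit spelling of the same idea
identical(c(tri), unclass(tri))
```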
We can see more of that if we encapsulate tri as part of a more complex list() object:
print( list("blue", tri, my_data_frame) )
What you can infer from that is that when you print a list, the default function writes out each list element’s number (or name) in turn, then recursively calls more print statements to print the individual list elements.
Default list print knows when and how to call our funky new triangle print. How cool is that!
If your list elements are in turn made of lists, then it’ll get very recursive indeed – but you can easily see that the print function itself doesn’t really need to be especially clever – in a way it kind of just keeps printing smaller and smaller objects ’til it runs out of data.
Simples!
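To watch the recursion happen, here is a small sketch with a triangle tucked inside a list inside a list. The class and a cut-down print method are repeated so the snippet runs on its own; the list contents and the element names label and shape are made up for the example.

```r
# Same class as in the walkthrough, with a simplified print method
tri <- c(3, 4, 5)
class(tri) <- "triangle"

print.triangle <- function(x, ...) {
    cat("Ooh a triangle\n")
    print(c(x))
    invisible(x)
}

# A list holding a list holding a triangle
nested <- list("blue", list(label = "inner", shape = tri))
print(nested)
```

The triangle’s own lines should turn up in the middle of the list output, under the header for the shape element.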
Further reading
Hadley Wickham goes into much, much more detail about S3 at http://adv-r.had.co.nz/S3.html – that, and the rest of that book, is an essential read!
Further exploration
I’ve added a little package of triangle-related functions which expand on this theme to GitHub at https://github.com/JerBoon/triangle/.
The package can be installed directly in R by executing
install.packages("https://github.com/JerBoon/triangle/archive/v0.1.0.tar.gz", repos=NULL, type="source")