Starting from scratch – building a package in R

For the first time, I am going to share something more closely related to my master's thesis. When I started the thesis, I did not know how to use R. To learn it, I turned to DataCamp, a platform of interactive courses where you can start from scratch and build your skills step by step. My favorite course so far is called “Writing Functions in R”. During the course, you are told:

If you need to copy something three times or more – build a function.

To a rookie this sounds complicated, but in fact it is really simple AND it will save you so much trouble later. Defining a function lets you produce code that is easy to read, easy to reuse and easy to share with colleagues. A function is defined with the following syntax:

my_function <- function(arguments) {
  # function body: the code that runs on every call
}

my_function can be any valid variable name; however, it is a good idea to avoid names already used elsewhere in R. Arguments (also called formals) can be any R objects that my_function needs in order to run, for example numbers or data frames. An argument may or may not have a default value. If it does not, a value must be provided when the function is called. The function body is the code between the curly braces { } and it is run every time the function is called. Preferably, the function body should be short and a function should do just one thing. If a large function cannot be avoided, it can often be built from a series of small functions. An example of a simple function could be:

cookies <- function(who, number=10){
  print(paste(who, "ate", number, "cookies", sep = " "))
}

The cookies function has two arguments. The number argument defaults to 10, so the user does not necessarily need to provide a value. The who argument, on the other hand, has no default, so a name must be provided. I had some cookies BUT I only had nine, so I had better change the number argument:

cookies(who="Julie", number=9)
[1] "Julie ate 9 cookies"
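To make the difference between the two arguments concrete, here is a small sketch of what happens when each one is left out. It repeats the function so that it runs on its own; the tryCatch() wrapper is only there so the error does not stop the script.

```r
# Same function as above, repeated so the snippet runs on its own.
cookies <- function(who, number = 10) {
  print(paste(who, "ate", number, "cookies", sep = " "))
}

# Leaving out `number` falls back to the default of 10.
cookies(who = "Julie")
# [1] "Julie ate 10 cookies"

# Leaving out `who` raises an error, because it has no default.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(cookies(), error = function(e) conditionMessage(e))
print(result)

# formals() lists the arguments and their defaults; body() shows the code.
names(formals(cookies))
formals(cookies)$number
body(cookies)
```

formals() and body() are base R functions, so nothing extra needs to be installed to inspect a function this way.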

So, now I have defined a function to keep track of my cookie consumption. What if I want to share this with the rest of Albertsen Lab? I could forward a script for them to save locally. No no, I will build a personal R package. This might seem like overkill for the cookie function, but imagine a more complex function. In my search for helpful tools for calculating correlations, I have come across several functions and sets of functions with no documentation. Without any help pages, it is nearly impossible to piece together which arguments to use, how and when. So, now I will build a bare minimum package to help me share my function with the group, strongly inspired by Not So Standard Deviations. For more information, check out the excellent book “R Packages” by Hadley Wickham.

First, you will need the following packages:

install.packages("devtools")
library("devtools")
install.packages("roxygen2")
library("roxygen2")

After this we need to create a package directory:

create("cookies") #create package

A package called cookies has now been created in the current working directory (you can change the working directory beforehand with: setwd("my_directory")).

It is a good idea to update the DESCRIPTION file, so that it contains the relevant information about the package (cookies) and the author (me). The next step is to add the cookie function to the package. For this I save a script containing the function in the cookies/R folder. If you want to add more functions to your package, you can either create a new file for each function (recommended) or define the functions sequentially in one file.
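To show what these two pieces look like on disk, here is a rough sketch that writes a minimal DESCRIPTION file and a function file by hand with base R, in a temporary directory. This is not what create() does internally, only an illustration of the layout. All field values (title, author, e-mail, license) are placeholders I made up, and the function file has no documentation yet.

```r
# Work in a temporary directory so nothing on disk is overwritten.
pkg_dir <- file.path(tempdir(), "cookies")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# Minimal DESCRIPTION fields; every value here is a placeholder.
description <- data.frame(
  Package = "cookies",
  Title = "Keep Track of Cookie Consumption",
  Version = "0.0.1",
  Author = "Your Name",
  Maintainer = "Your Name <you@example.com>",
  Description = "Records who ate how many cookies.",
  License = "GPL-3",
  stringsAsFactors = FALSE
)
write.dcf(description, file = file.path(pkg_dir, "DESCRIPTION"))

# One function per file in the R/ folder.
writeLines(
  c(
    "cookies <- function(who, number = 10) {",
    "  print(paste(who, \"ate\", number, \"cookies\", sep = \" \"))",
    "}"
  ),
  file.path(pkg_dir, "R", "cookies.R")
)

# Inspect the result: the files created and the package name read back.
pkg_files <- list.files(pkg_dir, recursive = TRUE)
print(pkg_files)
pkg_name <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Package")
print(pkg_name)
```

In practice you would simply open the DESCRIPTION file that create() generated and edit the fields in a text editor.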

Now comes the important part – documentation. Good documentation is key if you want other people to benefit from your work. It can be written easily using the roxygen2 package, also by Hadley Wickham. roxygen2 uses a custom syntax: text starting with #' is compiled into the correct format for R documentation when processed. Make a new R script with the following code and save it as cookies.R in the folder cookies/R:

#' Cookies
#'
#' This function will allow you to keep track of who ate your cookies.
#'
#' @param who Who ate the cookies? (Requires input)
#' @param number How many cookies have been eaten? (Default: 10)
#' @keywords cookies
#' @export
cookies <- function(who, number = 10) {
  print(paste(who, "ate", number, "cookies", sep = " "))
}
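If you want the help page to say a bit more, roxygen2 also has the tags @return and @examples. The block below is an optional variation on the documentation above, not something the bare minimum package needs. The wording of the extra lines is my own; the function body is unchanged. It relies on the fact that print() returns its argument invisibly, so the function already gives back the message as a character string.

```r
#' Cookies
#'
#' This function will allow you to keep track of who ate your cookies.
#'
#' @param who Who ate the cookies? (Requires input)
#' @param number How many cookies have been eaten? (Default: 10)
#' @return The printed message, returned invisibly as a character string.
#' @examples
#' cookies(who = "Julie", number = 9)
#' @keywords cookies
#' @export
cookies <- function(who, number = 10) {
  print(paste(who, "ate", number, "cookies", sep = " "))
}

# The return value can be stored, because print() hands back its input.
msg <- cookies(who = "Julie", number = 9)
```

The code under @examples ends up in the Examples section of the help page, so it is worth keeping it short and runnable.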

Once the script is saved, roxygen2 can be used to create all the needed files and documentation:

roxygenise("cookies/")

Lastly the package needs to be installed:

install_local("cookies")

You can now access your awesome functions by loading your brand new package:

library("cookies")

Now you have built a real R package! If you type ?cookies in the console, a help page will actually pop up.
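For reference, the steps from documenting to loading can be collected in one short script. This is only a sketch: it assumes that devtools and roxygen2 are installed and that the cookies folder sits in the current working directory, and it checks both before doing anything. The variable name ready is my own.

```r
# Check the assumptions first: both packages installed, package folder present.
ready <- requireNamespace("devtools", quietly = TRUE) &&
  requireNamespace("roxygen2", quietly = TRUE) &&
  dir.exists("cookies")

if (ready) {
  # Turn the #' comments into help files (man/) and update NAMESPACE.
  roxygen2::roxygenise("cookies")

  # Install the package from the local folder.
  devtools::install_local("cookies")

  # Load the package and try the function.
  library("cookies")
  cookies(who = "Julie", number = 9)
} else {
  message("Install devtools and roxygen2, and run this next to the cookies folder.")
}
```

Whenever you change the function or its documentation, rerun the roxygenise and install steps so the installed package is up to date.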

Finally, you can upload your package to github.com (Guide). This will allow others to try out your package, point out issues and suggest changes. Downloading and installing directly from GitHub is easy using install_github() from the devtools package. Try it out by typing this:

devtools::install_github("julieklessner/cookies")

It really can be this easy! So next time you copy something three times or more, or really want to share your work, consider defining a function and building your own personal package with associated documentation.

Julie K.T. Pedersen

MSc student. Developing new bioinformatic tools for time-series and network analysis.