2 Setting up

2.1 Open RStudio

The first thing we need to do is open RStudio. Do this now. If you currently have a project open close it by clicking File > Close project.

2.2 Naming your package

Before we create our package we need to give it a name. Package names can only consist of letters, numbers and dots (.) and must start with a letter. While all of these are allowed it is generally best to stick to just lowercase letters. Having a mix of lower and upper case letters can be hard for users to remember (is it RColorBrewer or Rcolorbrewer or rcolorbrewer?). Believe it or not choosing a name can be one of the hardest parts of making a package! There is a balance between choosing a name that is unique enough that it is easy to find (and doesn’t already exist) and choosing something that makes it obvious what the package does. Acronyms or abbreviations are one option that often works well. It can be tricky to change the name of a package later so it is worth spending some time thinking about it before you start.

Checking availability

If there is even a small chance that your package might be used by other people it is worth checking that a package with your name doesn’t already exist. A handy tool for doing this is the available package. This package will check common package repositories for your name as well as things like Urban Dictionary to make sure your name doesn’t have some meanings you weren’t aware of!

At the end of this workshop we want you to have a personal package that you can continue to add to and use so we suggest choosing a name that is specific to you. Something like your initials, a nickname or a username would be good options. For the example code we are going to use mypkg and you could use that for the workshop if you want to.

2.3 Creating your package

To create a template for our package we will use the usethis::create_package() function. All it needs is a path to the directory where we want to create the package. For the example we put it on the desktop but you should put it somewhere more sensible.

You will see some information printed to the console, something like (where USER is your username):

✔ Creating 'C:/Users/USER/Desktop/mypkg/'
✔ Setting active project to 'C:/Users/USER/Desktop/mypkg'
✔ Creating 'R/'
✔ Writing 'DESCRIPTION'
Package: mypkg
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R (parsed):
    * First Last <first.last@example.com> [aut, cre] (<https://orcid.org/YOUR-ORCID-ID>)
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true
✔ Writing 'NAMESPACE'
✔ Writing 'mypkg.Rproj'
✔ Adding '.Rproj.user' to '.gitignore'
✔ Adding '^mypkg\\.Rproj$', '^\\.Rproj\\.user$' to '.Rbuildignore'
✔ Opening 'C:/Users/USER/Desktop/mypkg/' in new RStudio session
✔ Setting active project to '<no active project>'

You will see something similar whenever we run a usethis command. Green ticks indicate that a step has been completed correctly. If you ever see a red dot that means that there is something usethis can’t do for you and you will need to follow some instructions to do it manually. At the end a new RStudio window with your package should open. In this window you should see the following files:

  • DESCRIPTION - The metadata file for your package. We will fill this in next and it will be updated as we develop our package.
  • NAMESPACE - This file describes the functions in our package. Traditionally this has been a tricky file to get right but the modern development tools mean that we shouldn’t need to edit it manually. If you open it you will see a message telling you not to.
  • R/ - This is the directory that will hold all our R code.

These files are the minimal amount that is required for a package but we will create other files as we go along. Some other useful files have also been created by usethis.

  • .gitignore - This is useful if you use git for version control.
  • .Rbuildignore - This file is used to mark files that are in the directory but aren’t really part of the package and shouldn’t be included when we build it. Most of the time you won’t need to worry about this as usethis will edit it for you.
  • mypkg.Rproj - The RStudio project file. Again you don’t need to worry about this.

2.4 Filling in the DESCRIPTION

The DESCRIPTION file is one of the most important parts of a package. It contains all the metadata about the package, things like what the package is called, what version it is, a description, who the authors are, what other packages it depends on etc. Open the DESCRIPTION file and you should see something like this (with your package name).

Package: mypkg
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R: 
    person(given = "First",
           family = "Last",
           role = c("aut", "cre"),
           email = "first.last@example.com",
           comment = c(ORCID = "YOUR-ORCID-ID"))
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true

2.4.1 Title and description

The package name is already set correctly but most of the other fields need to be updated. First let’s update the title and description. The title should be a single line in Title Case that explains what your package is. The description is a paragraph which goes into a bit more detail. For example you could write something like this:

Package: mypkg
Title: My Personal Package
Version: 0.0.0.9000
Authors@R: 
    person(given = "First",
           family = "Last",
           role = c("aut", "cre"),
           email = "first.last@example.com",
           comment = c(ORCID = "YOUR-ORCID-ID"))
Description: This is my personal package. It contains some handy functions that
    I find useful for my projects.
License: What license it uses
Encoding: UTF-8
LazyData: true

2.4.2 Authors

The next thing we will update is the field. There are a couple of ways to define the author for a package but is the most flexible. The example shows us how to define an author. You can see that the example person has been assigned the author (“aut”) and creator (“cre”) roles. There must be at least one author and one creator for every package (they can be the same person) and the creator must have an email address. There are many possible roles (including woodcutter (“wdc”) and lyricist (“lyr”)) but the most important ones are:

  • cre: the creator or maintainer of the package, the person who should be contacted with there are problems
  • aut: authors, people who have made significant contributions to the package
  • ctb: contributors, people who have made smaller contributions
  • cph: copyright holder, useful if this is someone other than the creator (such as their employer)

Adding an ORCID

If you have an ORCID you can add it as a comment as shown in the example. Although not an official field this is recognised in various places (including CRAN) and is recommended if you want to get academic credit for your package (or have a common name that could be confused with other package authors).

Update the author information with your details. If you need to add another author simply concatenate them using c() like you would with a normal vector.

Package: mypkg
Title: My Personal Package
Version: 0.0.0.9000
Authors@R: c(
    person(given = "Package",
           family = "Creator",
           role = c("aut", "cre"),
           email = "package.creator@mypkg.com"),
    person(given = "Package",
           family = "Contributor",
           role = c("ctb"),
           email = "package.contributor@mypkg.com")
    )
Description: This is my personal package. It contains some handy functions that
    I find useful for my projects.
License: What license it uses
Encoding: UTF-8
LazyData: true

2.4.3 License

The last thing we will update now is the software license. The describes how our code can be used and without one people must assume that it can’t be used at all! It is good to be as open and free as you can with your license to make sure your code is as useful to the community as possible. For this example we will use the MIT license which basically says the code can be used for any purpose and doesn’t come with any warranties. There are templates for some of the most common licenses included in usethis.

This will update the license field.

Package: mypkg
Title: My Personal Package
Version: 0.0.0.9000
Authors@R: c(
    person(given = "Package",
           family = "Creator",
           role = c("aut", "cre"),
           email = "package.creator@mypkg.com"),
    person(given = "Package",
           family = "Contributor",
           role = c("ctb"),
           email = "package.contributor@mypkg.com")
    )
Description: This is my personal package. It contains some handy functions that
    I find useful for my projects.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true

It will also also create two new files, LICENSE.md which contains the text of the MIT license (it’s very short if you want to give it a read) and LICENSE which simply contains:

YEAR: 2019
COPYRIGHT HOLDER: Your Name

There are various other licenses you can use but make sure you choose one designed for software not other kinds of content. For example the Creative Commons licenses are great for writing or images but aren’t designed for code. For more information about different licenses and what they cover have a look at http://choosealicense.com/ or https://tldrlegal.com/. For a good discussion about why it is important to declare a license read this blog post by Jeff Attwood http://blog.codinghorror.com/pick-a-license-any-license/.