
What is R?

In this first unit, we will learn what R is.

https://www.r-project.org/

R is a:

  • Free
  • Open-source
  • Modular
  • Statistical
  • Programming Language

What Should You Know About R?

  • It is open-source, so anyone can contribute to it – and many people do.
  • It can do (practically) anything!
  • It requires more knowledge than other statistical software
    • Knowledge of scripting/coding
    • Understanding of statistics
  • There is always more than one way to do things in R!
    • Choose the method that makes best sense to you.
  • Google is your friend!
    • People who use R are CONSTANTLY googling how to do things.
    • Even things they ‘should already know’ or ‘used to know’
  • It’s too big to fit the whole thing into your brain!
    • You will NEVER master R – so don’t worry about it.
    • Just learn how to make it do what you need it to do.

Why Should You Learn R?

  • It’s free!
    • So you can use it for the rest of your life
  • Make concrete contributions to science
    • Anybody can hypothesize and brainstorm experiments, but somebody has to do the science. And that somebody could be you!
  • Clarity of thought
    • Learning to code helps you think more precisely and algorithmically
  • Employability
    • If you can use R to answer questions about data, people will want to hire you.
  • True education
    • Many classes you take tell you what other scientists have learned. Learning R is a skill that teaches you how to be a scientist.

How Should You Learn R?

This tutorial is a good place to start.

You’ll never learn R just by reading! You’ve got to try it yourself. When you see a box like this one, do what it says. That’s when the real learning happens. And for those of you taking my class, be sure to do the numbered prompts as you will be graded on those.

Should I use AI to help me code?

AI can be a useful resource, but you should be cautious about relying too heavily on AI for the following reasons:

  • You don’t know enough yet to give AI good prompts. It will often not understand what you want it to do because YOU don’t yet understand what you want it to do.
  • There is always more than one way to do things in R, and AI often defaults to a more complex, less intuitive, or obscure way of solving the problem.
  • Even if you get AI to generate code for you, you might not understand that code or be able to change or fix it.

When faced with a problem, do some thinking first and try your own code. Then if your code isn’t working, you can ask AI what the problem is.

What is R Studio?

https://www.rstudio.com/products/rstudio/

R Studio is a free tool that makes R easier and more intuitive to use. It divides the screen into 4 quadrants:

  • Top Left: The code and text that you are typing appears here.
  • Bottom Left: The output of your code appears here in the Console.
  • Top Right:
    • Variables and data sets you have imported into R are shown in the Environment tab.
    • The History tab shows commands you’ve run recently.
  • Bottom Right:
    • The Files tab shows you everything in the project directory.
    • The Plots tab shows the most recent graph you generated in R
    • The Packages tab lets you install and load R packages
    • The Help tab shows you help documentation about commands and packages

Let’s explore R Studio!

Open up RStudio and explore the interface.

  1. Download and install both R and R Studio.
  2. Open R Studio

Using the Console

  1. Use the console to do some simple math, such as 2 + 2 or 867 - 5309. Just type the expression and hit Enter. (Imaginary) bonus points if you can figure out how to do division and multiplication.
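
As a rough illustration of what the Console shows, here are the two sums from the prompt above with their output written as comments. The [1] at the start of each result is R's label for the first item of the output, not part of the answer.

```r
# Type each line at the Console prompt (>) and press Enter.
2 + 2        # the Console shows: [1] 4
867 - 5309   # the Console shows: [1] -4442

# Anything after a # is a comment. R ignores it, so notes like these are safe to type.
```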

Other Windows in R Studio

  1. Explore 2 of the tabs in the two windows on the right of R Studio and describe the function of each tab. For example, the Environment, History, Files, Plots, Packages, or Help tabs.

Basic Commands

Now let’s take a few tentative steps in exploring R.

Step 1. Create a New Script:

There are two ways to open a new script in R:

  • Through the Menu: File -> New File -> R Script
  • Keyboard Shortcut: (CTRL + Shift + N)

  1. Demonstrate two ways to open a new R Script. A document called Untitled1 should open in the top left quadrant of RStudio.

Step 2. Type a command into the script.

  1. Inside the script, type these commands:

2 + 2

round(3.14159)

print("Hello World")

rep("hello!",100)

hist(x=rnorm(10000))
Step 3. Run the command.

There are 3 ways to run a command:

  1. Put the cursor on the line you want to run (anywhere). Then hit the Run button at the top of the script. Or use the keyboard shortcut (CTRL + ENTER). This will run only 1 line of code.
  2. Use the mouse to select the code you want to run. This can be part of a line or many lines together. Then hit the Run button at the top of the script. Or use the keyboard shortcut (CTRL + ENTER). This will run the selected section of code.
  3. CTRL + ALT + R will run the entire script.
  1. Test out 3 different ways to run the commands you typed.
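
For reference, here is a sketch of what each of the five commands from Step 2 should do when it runs. The comments are added notes, not part of the original commands. Your histogram will look slightly different each time, because rnorm() draws new random numbers on every run.

```r
2 + 2                    # arithmetic: the Console shows [1] 4

round(3.14159)           # rounds to the nearest whole number: [1] 3

print("Hello World")     # prints the text: [1] "Hello World"

rep("hello!", 100)       # repeats "hello!" 100 times

hist(x = rnorm(10000))   # draws a histogram of 10,000 random numbers from a normal (bell-curve) distribution
```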
Step 4: Repeat

Wasn’t that fun! Let’s do it again!

  1. Inside the script, write and run a command that divides 1600 by 4
  2. Inside the script, write and run a command that prints the words “R Rocks!” in the console

Making Mistakes

When you are writing R code, there are 2 kinds of mistakes you can make:

  • Interpretable mistake: You typed a command that R understands, but it doesn’t really do what you intended. This will make your code act in ways you didn’t expect, and will probably give rise to sentient machines that want to destroy all humans.
  • Uninterpretable mistake: You mis-typed a command. You will see an error message of some kind. This is by far the most common - R demands that you be very precise.

Most error messages in R are completely unhelpful. When you get one, go back and check your recent code for typos.
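
An interpretable mistake is harder to spot, because R raises no error at all. The lines below are made-up illustrations: each pair shows a command that runs cleanly but gives a different answer from the one that was intended, followed by the version that matches the intent.

```r
# Intended: add 2 and 2, then multiply the total by 10 (expecting 40).
2 + 2 * 10      # R multiplies first, so this gives 22
(2 + 2) * 10    # parentheses force the addition first: 40

# Intended: round 3.14159 to two decimal places (expecting 3.14).
round(3.14159)      # with no second argument, R rounds to a whole number: 3
round(3.14159, 2)   # asking for 2 digits gives 3.14
```

No error message appears for the first line of each pair, so the only way to catch this kind of mistake is to check that the output is what you expected.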

  1. To see what R does when you make a mistake, copy the commands below into your script and try to run the script
2 x 2 # R doesn't know what the x means

round(3.14159 # Oops, forgot a closing parenthesis. R will wait for you to complete the command

print(Hello World) # Forgot the "" around "Hello World" to make it a string. R thinks it's a variable or function, and doesn't know what to do with it.

rep("hello!" 100) # Should have put a comma in between "hello!" and 100

hist(x=rnorm(10000) # Can you figure out the problem here?
     
print(Oops!)

max(1, 2, 3 4, 5)

Rep("hello!", 100)

print(max(10:20)
  1. Are these interpretable or uninterpretable mistakes? How do you know?

  2. What should you do when you get an error message in R?

  3. Now fix these commands and run the script again.

Let’s Get Some Help!

R help

Let’s get help to find out more about these functions.

  1. Run the following lines of code (notice the ? in front of each command name). Briefly describe what information you get from these commands. Where does it appear?
?print
?rep
?hist

Sections of R Documentation

When you use the ? before a command, it brings up the R documentation for that function in the Help tab (bottom right). Here’s what to look for:

  • The name of the function {and the package it is from} are listed in the top left. {base} means that it comes pre-loaded in R.
  • Description: A (jargon-filled, usually unhelpful) description of the function
  • Usage: A template for how the function might look when you type it out, with default arguments shown. Kinda useful, but looks like gibberish at first.
  • Arguments: A description of each argument a function can take. Kinda useful.
  • Details/Value/Note: Not helpful most of the time.
  • References: Need help sleeping at night? Read this!
  • See Also: Links to similar/related functions.
  • Examples: Everyone likes examples. I often skip straight here.
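
Here is a sketch of a few other built-in ways to reach the same documentation, using the round function from earlier. These are all standard R functions, but which ones you find handy is a matter of taste.

```r
help("round")      # the same as ?round: opens the help page for round
args(round)        # shows just the argument list, a short version of the Usage section
example("round")   # runs the code from the Examples section so you can see the output
```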
  1. Get to the help page for the paste function in R Studio using a command.
  2. Get to the help page for the paste function in R Studio using the Help tab.
  3. Explore the paste R help page. Which section(s) did you find most useful in helping you understand paste? Which were least useful?
  4. Using what you learned, use the paste command to combine and print the following strings:
  • “R”
  • “is”
  • “the”
  • “best!”

Using Google

You can use Google to get more information about a function

Try googling “r print” or “r rep” or “r {function that I want to know more about}”. Some of the links will just be the R Documentation, but others will be helpful humans providing actual examples and explanations.

You can use Google to find the function that you need

Try googling “R repeat a string X times” or “R make a histogram” to find a function to accomplish what you want.
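
If you half-remember a function name, R can also search for it without leaving RStudio. This is offered as a supplement to Googling, not a replacement. apropos() and help.search() are standard R functions, and the search words here are only examples.

```r
apropos("hist")            # lists currently available functions and objects whose names contain "hist"
help.search("histogram")   # searches the installed help pages; ??histogram is the shortcut
```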

Using the internet, find an R function that can…

  1. Calculate the mean of the numbers (2, 4, 6, 8, 1999)
  2. Calculate the standard deviation of the numbers (2, 4, 6, 8, 1999)
  3. Add up the set of numbers (2, 4, 6, 8, 1999)
  4. Find out how long a word is
  5. Find the longest run of numbers in a set
  6. Take every letter “e” in “Enter to learn, go forth to serve” and replace it with the word “CHEESE”

Now do this:

  1. Pick two of these functions. Put them in a script and show that you can run them without error.

Packages

Base R is useful, but what makes R especially powerful are the many, many, MANY packages that are available for it.

A package is a collection of functions, data sets, and other R objects. It expands the capacity of R, allowing it to do new, specialized things.

The tidyverse

One package we will use EVERY TIME we load R is the tidyverse package. So, let’s load this package together.

How Do I Get an R Package?

You can get an R package in 2 ways:

  • Use R Studio
  • Type a command

Let’s see how each one works.

Installing a package using R Studio

  1. Click on the Packages tab (bottom right pane)
  2. Click the “Install” button
  3. Type the name of the package you want
  4. Click install

Advantages of this approach: it’s easy and intuitive.

Disadvantages: you have to do it every time you change computers or update R.

  1. Follow the above steps to install the tidyverse package.

Installing a package using code

Type the following command in your script and run it:

install.packages("tidyverse")

Note the quotes around the package name.

Advantages of this approach: you can put this in your code and it will execute whenever you run the script, ensuring that the package is available.

Disadvantages: Typing is required. Also, this code will reinstall the package every time you run the script, which isn’t necessary if you’re using the same computer.

Advanced Tip

Try this code instead:

if (!require("tidyverse")) install.packages("tidyverse")

This tries to load the package with require(), which returns FALSE if the package isn’t installed on your computer. In that case, the package gets installed. Classy!
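
One thing to watch with the one-liner above: when it has to install the package, the package still needs to be loaded afterwards. Here is a slightly longer sketch that always ends with the package loaded. The helper name install_if_missing is made up for this example, and it is demonstrated on tools, a package that ships with R, so nothing is downloaded when you try it. Swap in "tidyverse" for real use.

```r
# Hypothetical helper (not part of R or the tidyverse): install only when needed, then load.
install_if_missing <- function(pkg) {
  # requireNamespace() returns FALSE when the package cannot be found or loaded
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE tells library() that pkg holds the package name as text
  library(pkg, character.only = TRUE)
}

install_if_missing("tools")     # already installed, so this only loads it
"package:tools" %in% search()   # TRUE once the package is loaded
```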

Loading a package

Installing a package (see above) puts it on the computer. BUT it doesn’t make it immediately available for use. To do this, we next have to load the package. Again, there are 2 ways:

Loading a package using R Studio

Do the following

  1. Click on the Packages tab (bottom right pane)
  2. Type the name of the package in the search bar (or just scroll forever)
  3. Check the box to the left of the package name

Advantages of this approach: it’s easy and intuitive

Disadvantages: you have to do it every time you start R.

Loading a package using code

Type the following command and run it:

library(tidyverse)

Note that quotes are not required around the package name.

Advantages of this approach: you can put this in your code and it will execute whenever you run the script, ensuring that the package is loaded every time with no effort on your part!

Disadvantages: Typing is required. Which isn’t much of a disadvantage - each keystroke only burns a tiny fraction of a calorie. Really, this is the better way to load a package.
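
To see the difference loading makes, here is a small sketch that uses tools, a package that ships with R, as a stand-in for tidyverse so that it runs without installing anything. file_ext() is a real function in that package; the file name is invented.

```r
# Before loading, a function can still be reached with the package::function form.
tools::file_ext("analysis.R")   # returns "R"

# After library(), the package's functions can be called by name alone.
library(tools)
file_ext("analysis.R")          # returns "R"

# search() lists the packages that are currently loaded.
"package:tools" %in% search()   # TRUE
```

Calling file_ext() by name alone before the library(tools) line would fail with an error, which is exactly the situation loading a package is meant to fix.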

  1. Describe what a package is in R.
  2. Explain the difference between installing a package and loading a package.
  3. Try out both ways to install a new package in R. Which is the better way to use most of the time? Why?
  4. Try out both ways to load a package. Which is the better way to use most of the time? Why?

Finding a package

How can we find out which package we need? We use Google.

  1. Use Google to find a package that has a function that computes Standard Error (not Standard Deviation). What is that package?
  2. Use Google to find a package that has a function for creating a correlation matrix. What is that package?

There are so many options!

R has an astounding number of packages available:

https://cran.r-project.org/web/packages/available_packages_by_name.html

This means you can do almost anything with R. And this list is always expanding!
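
The list at that link covers what you could install. To see what is already installed on your own computer, something like this should work (the number you get will differ from machine to machine):

```r
nrow(installed.packages())                       # how many packages are installed on this computer
"stats" %in% rownames(installed.packages())      # TRUE: the stats package comes with R itself
```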

(Bonus) Fun with packages!

  1. Find and install a package that is interesting or unusual from the list at the link above.
  2. Using a script, load the package.
  3. In the same script, run at least one command from that package to try it out.
