What is R?
A little bit of history: R first appeared in the 1990s as an implementation of the S statistical programming language, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. Although there are some important differences between the two, much code written for S runs unaltered under R.
The R programming language is a popular and important tool for data analysis and machine learning. When you think of traditional ways of doing data mining or analysis, you may think of tools like Microsoft Excel or SQL. These tools are definitely good at certain things; however, people are switching to R for some really good reasons:
- R is open source software: who doesn’t like free stuff, right?
- R has a great package ecosystem, which helps users accomplish a wide range of tasks: there are more than 6,200 add-on packages for R. These packages contain functions that help users with data manipulation, data mining, data visualization, complex statistical calculations, machine learning algorithms, and more.
- R is user friendly (if you use RStudio as your interface, which I highly recommend): in a later section, I will introduce the RStudio interface. One of the advantages of using RStudio is that you can save a series of complex commands as a script and re-use your analysis work on similar data.
There are many more reasons to prefer R over Excel, for example, better statistical capabilities, faster computation, the ability to handle larger data sets, better visualization tools, and more.
How to install R?
The first step is to go to the official R website (https://www.r-project.org/) to download and install the R programming language on your computer. R supports Windows, OS X, and various Unix platforms. The next step is to download and install RStudio from this website: https://www.rstudio.com/products/rstudio/. You have to install the R programming language before installing RStudio.
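Once both are installed, one quick way to confirm that R itself runs is to ask it for its version. This is only a sanity check sketched here, not a step from the installers; `R.version.string` is a built-in base R variable, and the exact text it prints depends on the version you installed.

```r
# Print the version of R that is currently running
print(R.version.string)
```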
As you can see in the above screenshot, the RStudio interface is divided into four parts: the R script editor, the interactive console, the workspace, and an information window.
- The top left window is the R script editor, where you can create a new R script file or open an existing one. This is where you can enter multiple lines of R code and run them all at once by clicking the ‘Run’ button at the top of the editor window. You can also run only part of the code by highlighting it and clicking ‘Run’.
- The bottom left window is the interactive console, where you can type R commands one line at a time. Whenever you run code from the editor window, the executed code and its result appear in this console.
- The top right window is the workspace, which lists the objects currently in memory. There is also a history tab with a list of your prior commands.
- The bottom right window is an information window that consists of four tabs. The Files tab lets you browse and manage the files in the directory you are currently working in. The Plots tab shows the plots created by your R commands. The Packages tab lists the installed packages. Finally, the Help tab is a search tool where you can type in keywords or the name of a package you want to learn more about.
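To see how the four panes work together, here is a small script you could paste into the script editor and run with the ‘Run’ button. It is only an illustrative sketch; the variable names and numbers are made up for this example.

```r
# Create two objects; they will be listed in the workspace pane (top right)
heights <- c(150, 160, 170, 180)
avg_height <- mean(heights)

# The printed result shows up in the console (bottom left)
print(avg_height)

# The plot is drawn in the Plots tab of the information window (bottom right)
plot(heights, type = "b", main = "Example heights")

# List the names of the objects currently in memory
ls()
```

If you highlight only the first two lines and click ‘Run’, only those lines are executed.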
Shortcuts in RStudio
There are many shortcuts within RStudio. If you want to see all of them, the full list is here: https://support.rstudio.com/hc/en-us/articles/200711853-Keyboard-Shortcuts. The following are the ones I find most useful:
|Description|Windows & Linux|Mac|
|---|---|---|
|Run current line/selection|Ctrl + Enter|Command + Enter|
|Select to Line Start|Alt + Shift + Left|Command + Shift + Left|
|New document (except on Chrome/Windows)|Ctrl + Shift + N|Command + Shift + N|
Setting up RStudio using basic commands
To get your current working directory, you can enter the following:
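A minimal sketch of this step, assuming the base R function `getwd()` is the command meant here; the path it prints depends on your own machine.

```r
# Store and print the absolute path of the current working directory
current_dir <- getwd()
print(current_dir)
```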
The result shows that my working directory is currently the Desktop. To change it to somewhere else, use the command below:
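A sketch of changing the directory with the base R function `setwd()`. The target used here, `tempdir()`, is only a stand-in chosen so the example runs anywhere; replace it with your own folder, for example "C:/Users/yourname/Documents" (a hypothetical path).

```r
# Remember the starting directory so the change can be undone
old_dir <- getwd()

# Stand-in target for illustration; use your own folder path instead
target_dir <- tempdir()
setwd(target_dir)

# Confirm the change
print(getwd())

# Switch back to where we started
setwd(old_dir)
```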
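Remember that when you enter the path, always use forward slashes (/), even on Windows. A backslash starts an escape sequence in an R string, so a Windows-style path would need every backslash doubled (\\).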
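The short sketch below illustrates the point about slashes. The folder names are hypothetical; `file.path()` and `gsub()` are base R functions.

```r
# file.path() joins path components with forward slashes on every platform
p <- file.path("C:", "Users", "yourname", "Desktop")
print(p)

# Inside an R string a backslash must be doubled to be taken literally
q <- "C:\\Users\\yourname\\Desktop"

# Convert the backslashes to forward slashes
q_fixed <- gsub("\\\\", "/", q)
print(q_fixed)
```

Both `p` and `q_fixed` end up as the same forward-slash path, which can be passed to `setwd()` if that folder exists on your machine.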
The diversity of packages is one of R’s main advantages, so you should know how to download and install one. I will use a popular package, car, as an example.
> ## First download and install the package
> install.packages("car")
Installing package into ‘C:/Users/Jeffrey/Documents/R/win-library/3.3’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.3/car_2.1-4.zip'
Content type 'application/zip' length 1483425 bytes (1.4 MB)
downloaded 1.4 MB
package ‘car’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
> ## Now, you can call the installed package from your library
> library(car)
package ‘car’ was built under R version 3.3.3
> ## Make sure you have the latest version of the package
> update.packages(oldPkgs = "car")
> ## If you decide to remove the package from your system
> remove.packages("car")
Removing package from ‘C:/Users/Jeffrey/Documents/R/win-library/3.3’
(as ‘lib’ is unspecified)
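If you are not sure whether a package is already on your machine, you can check before loading it. The helper below, `is_installed`, is a hypothetical name introduced only for this sketch; it relies on the base R function `requireNamespace()`.

```r
# Hypothetical helper: TRUE if the named package can be loaded, FALSE otherwise
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (is_installed("car")) {
  # The package is available, so attach it
  library(car)
} else {
  # Nothing is installed here; we only print a reminder
  message("car is not installed; run install.packages(\"car\") first")
}
```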
If you want to find out more about a function, do the following:
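A sketch of this step, assuming the `?` operator is the command meant here. The function `mean` is just an example I picked; any function name works.

```r
# Open the help page for the base function mean()
?mean
```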
The result appears in the Help tab of the information window. You can get the same result with the help function:
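A sketch using the base R function `help()`, again with `mean` as an example topic of my own choosing.

```r
# Look up the same help page by calling help() with the topic name
help("mean")
```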
If you want to search through R’s help documentation for a specific term, you can use:
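A sketch using the base R function `help.search()`. The search term "regression" is only an example; the matches you get depend on which packages you have installed.

```r
# Search the help pages of all installed packages for the term "regression"
results <- help.search("regression")

# Show the matching help pages
results
```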
There is also a shortcut:
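A sketch of the shorthand, assuming the `??` operator is what is meant here; it runs the same kind of search as `help.search()`, and "regression" is again just an example term.

```r
# Shorthand for searching the help documentation for "regression"
??regression
```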