#02 R Basics
Basic R
WH Chan @ 2021
Outline
• R Basics
• Packages in R
• Search Help in R
• Functions in R
• Control Structures in R
• Import/Export in R
R BASICS
Arithmetic with R
• In its most basic form, R can be used as a simple
calculator.
• It supports the following arithmetic operators:
– Addition: +
– Subtraction: -
– Multiplication: *
– Division: /
– Exponentiation: ^
– Modulo: %%
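As a small illustration of the operators listed above, the sketch below applies each one to two example values. The variable names a and b and their values are chosen only for this example.

```r
# Two example values (hypothetical, for illustration only)
a <- 7
b <- 2

sum_ab  <- a + b    # addition
diff_ab <- a - b    # subtraction
prod_ab <- a * b    # multiplication
quot_ab <- a / b    # division
pow_ab  <- a ^ b    # exponentiation
mod_ab  <- a %% b   # modulo: remainder after dividing a by b

print(c(sum_ab, diff_ab, prod_ab, quot_ab, pow_ab, mod_ab))
# prints: 9.0  5.0 14.0  3.5 49.0  1.0
```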
Variable Assignment in R
• In R, variable assignment is done using either "<-" or
"="
• For example: "x <- 4" is equivalent to "x = 4"
• However, some style guides, such as the Google R Style
Guide, prefer the use of "<-" instead of "="
https://google.github.io/styleguide/Rguide.xml
Basic Data Types in R
• Some of the basic data types in R are:
– Numeric: numbers stored as doubles by default, e.g. 1, 10, 15
– Integer: whole numbers written with an L suffix, e.g. 2L, 15L, 33L
– Logical: TRUE or FALSE
– Characters: Text or strings
• The data type of a variable can be checked with the
class() function.
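The following sketch ties assignment and data types together: it assigns with both operators and then checks each basic type with class(). All variable names and values are made up for the example.

```r
# Both assignment operators store the same value
x <- 4
y = 4
same_value <- identical(x, y)   # TRUE

# One example value per basic data type (hypothetical values)
num_value <- 10.5     # numeric
int_value <- 15L      # integer: note the L suffix
lgl_value <- TRUE     # logical
chr_value <- "text"   # character

print(class(num_value))   # "numeric"
print(class(int_value))   # "integer"
print(class(lgl_value))   # "logical"
print(class(chr_value))   # "character"

# A whole number typed without the L suffix is still numeric
print(class(15))          # "numeric"
```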
Vectors in R
• Vectors are one-dimensional arrays that can hold numeric,
integer, character or logical data.
• In R, vectors are created using the function c(), for example:
– numeric_vector <- c(1, 10, 49)
– character_vector <- c("a", "b", "c")
– boolean_vector <- c(TRUE, FALSE, TRUE)
• In R, we can give the elements of a vector custom names. Example:
– names(character_vector) <- c("One", "Two", "Three")
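A short sketch of the steps above, reusing the slide's vectors: once names are attached, an element can be looked up by name as well as by position.

```r
# Create vectors with c()
numeric_vector   <- c(1, 10, 49)
character_vector <- c("a", "b", "c")

# Attach a name to each element
names(character_vector) <- c("One", "Two", "Three")
print(character_vector)

# Look up one element by its name and by its position
by_name     <- character_vector["Two"]
by_position <- character_vector[2]
print(by_name)
```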
Arithmetic of Vectors in R
• Arithmetic on vectors is performed element-wise.
• The sum and average of the elements in a vector can be calculated using
sum(vector) and mean(vector) respectively.
• Vectors can be used in comparisons, for example:
– vector = c(1,2,3,4,5)
vector > 3
Output:
FALSE FALSE FALSE TRUE TRUE
• An element at a specific position in the vector can be selected using
vector[position]. A logical vector can also be used as the index, which
keeps only the elements where it is TRUE:
– Bool_value = c(TRUE, FALSE, TRUE, FALSE)
Vector = c(2, 3, 5, 6)
Vector[Bool_value]
Output:
2 5
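The operations above can be combined in one short sketch. The vector name scores is hypothetical; the values match the comparison example on the slide.

```r
scores <- c(1, 2, 3, 4, 5)

# Element-wise arithmetic
doubled <- scores * 2          # 2 4 6 8 10
summed  <- scores + scores     # 2 4 6 8 10

# Summary values
total   <- sum(scores)         # 15
average <- mean(scores)        # 3

# Comparison gives a logical vector, which can then be used as an index
above_three <- scores > 3      # FALSE FALSE FALSE TRUE TRUE
selected    <- scores[above_three]   # 4 5

# Selection by position
second         <- scores[2]          # 2
first_and_last <- scores[c(1, 5)]    # 1 5

print(selected)
```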
Factors in R
• Factors are categorical variables that are useful in
summary statistics, plots and regressions.
• To create factors in R, use the factor() function.
• The factor levels can be renamed using the levels()
function.
my_vector <- c("L", "S", "L", "M", "M")
# Option 1
my_factor <- factor(my_vector)
levels(my_factor) <- c("Large", "Medium", "Small")
# Option 2
my_factor <- factor(my_vector,
levels = c("S", "M", "L"),
labels = c("Small", "Medium", "Large"))
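The two options give the same labels for each element but differ in the order of the levels, which matters for summaries and plots. The sketch below makes that visible. The names factor_default and factor_ordered are chosen only for this example.

```r
my_vector <- c("L", "S", "L", "M", "M")

# Option 1: default levels are sorted alphabetically ("L", "M", "S"),
# so the new names must be given in that same order
factor_default <- factor(my_vector)
levels(factor_default) <- c("Large", "Medium", "Small")

# Option 2: state the level order and the labels explicitly
factor_ordered <- factor(my_vector,
                         levels = c("S", "M", "L"),
                         labels = c("Small", "Medium", "Large"))

print(levels(factor_default))   # "Large" "Medium" "Small"
print(levels(factor_ordered))   # "Small" "Medium" "Large"

# Counts per level follow the level order
print(table(factor_ordered))
```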
Matrices in R
• In R, a matrix is a collection of elements of the same data
type (numeric, character, or logical) arranged into a fixed
number of rows and columns.
• Since we are only working with rows and columns, a matrix is
called two-dimensional.
• You can construct a matrix in R with the matrix() function.
Consider the following example:
– matrix(1:9, byrow = TRUE, nrow = 3)
• The rows and columns of a matrix can be named using
rownames() and colnames() respectively.
– Example: colnames(matrix) <- names
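A minimal sketch of the matrix example above with names attached. The row and column names are hypothetical.

```r
# Fill a 3 x 3 matrix row by row with the numbers 1 to 9
m <- matrix(1:9, byrow = TRUE, nrow = 3)

# Name the rows and columns (names chosen for illustration)
rownames(m) <- c("r1", "r2", "r3")
colnames(m) <- c("c1", "c2", "c3")

print(m)
#    c1 c2 c3
# r1  1  2  3
# r2  4  5  6
# r3  7  8  9
```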
Operations in Matrices
• We can calculate the column sum and row sum using
colSums() and rowSums() respectively.
– Example: colSums(matrix)
• Additional data can be bound to a matrix using
cbind() and rbind().
– cbind(matrix_A, matrix_B): adds matrix_B to the “right” side
of matrix_A
– rbind(matrix_A, matrix_B): adds matrix_B to the “bottom”
side of matrix_A
• An element at a specific position in the matrix can be
selected using matrix[row_position, col_position].
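The matrix operations above can be sketched with two small matrices. The contents of matrix_A and matrix_B are made up for the example.

```r
# matrix() fills column by column by default
matrix_A <- matrix(1:4, nrow = 2)   # columns: (1, 2) and (3, 4)
matrix_B <- matrix(5:8, nrow = 2)   # columns: (5, 6) and (7, 8)

col_totals <- colSums(matrix_A)     # 3 7
row_totals <- rowSums(matrix_A)     # 4 6

wide <- cbind(matrix_A, matrix_B)   # 2 rows, 4 columns
tall <- rbind(matrix_A, matrix_B)   # 4 rows, 2 columns

# Select single elements by [row, column]
from_wide <- wide[1, 3]             # 5
from_tall <- tall[3, 1]             # 5

# Leaving the row position empty selects a whole column
second_col <- tall[, 2]             # 3 4 7 8
print(wide)
```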
Creating a Data Frame in R
• In R, all data in a matrix must be of the same data type. If the
data consists of different data types, a data frame is more suitable.
• A data frame can be constructed with the data.frame() function.
– Example: data.frame(vector1, vector2, vector3, vector4, vector5)
• The vectors passed as arguments become the
columns of the data frame.
• Because every column of a data frame has the same length, the vectors
you pass must also have the same length.
[Figure: vector1 to vector5 placed side by side as the columns of a data frame]
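A minimal sketch of building a data frame from vectors of different types. The vectors name, age and member and their values are hypothetical.

```r
# Three vectors of equal length but different data types
name   <- c("Ann", "Ben", "Cal")
age    <- c(28, 35, 41)
member <- c(TRUE, FALSE, TRUE)

# Each vector becomes one column, named after the vector
people <- data.frame(name, age, member)

print(people)
print(dim(people))   # 3 rows, 3 columns
```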
Data Frame in R
• When dealing with a large data frame, we can use head(dataframe)
or tail(dataframe) to show only the first few
observations or the last few observations of the data frame
respectively.
• To inspect the structure of the data frame, we can
use the function str()
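To try these functions without preparing any data, the sketch below uses mtcars, an example dataset that ships with R. It is an assumption of this example and is not part of the original slides.

```r
# mtcars is a built-in example data frame (32 rows, 11 columns)
first_rows <- head(mtcars)      # first 6 rows by default
last_rows  <- tail(mtcars, 3)   # last 3 rows

print(first_rows)
print(last_rows)

# Compact overview: number of observations, column names and types
str(mtcars)
```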
Accessing Data Frame
• A data frame can be accessed via dataframe[row
position, column position] or dataframe[row position,
column name]
• A column of the data frame can also be accessed using
“$”
– For example: dataframe$column1
• A subset of the data frame can be obtained with the
function subset(dataframe, subset=condition)
• The data frame can be sorted using the order() function, which returns the row positions in sorted order.
– For example: dataframe[order(dataframe$column), ]
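The access patterns above can be sketched on a small data frame. The data frame people and its contents are hypothetical.

```r
people <- data.frame(
  name = c("Ann", "Ben", "Cal", "Dee"),
  age  = c(35, 28, 41, 23),
  stringsAsFactors = FALSE   # keep names as plain character strings
)

by_position <- people[2, 1]       # "Ben"
by_name     <- people[2, "age"]   # 28
age_column  <- people$age         # 35 28 41 23

# Keep only the rows that satisfy a condition
over_30 <- subset(people, subset = age > 30)

# order() returns row positions; use them as the row index to sort
row_order     <- order(people$age)   # 4 2 1 3
sorted_people <- people[row_order, ]

print(sorted_people)
```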
Characters Data in R
• Character data in R is usually written in double quotes.
• Strings can be concatenated using the paste()
function. For example: paste(string1, string2, sep="")
• To print/display output in the console window, use the
print() function.
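A short sketch of paste() and print(). The string values are chosen for illustration. Note that paste() separates its arguments with a single space unless sep says otherwise.

```r
string1 <- "Hello"
string2 <- "World"

# sep = "" joins the strings with nothing in between
joined_no_sep <- paste(string1, string2, sep = "")   # "HelloWorld"

# The default separator is a single space
joined_space <- paste(string1, string2)              # "Hello World"

print(joined_no_sep)
print(joined_space)
```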
PACKAGES IN R
R Packages
• R is highly extensible, with widely available packages
contributed by the community.
• Packages make it easy to share scripts.
• Comprehensive R Archive Network (CRAN)
• Currently lists over 10000 free packages for R
• Examples of packages:
– ggplot2 – for advanced visualization of data
– e1071 – a widely used package for support vector machines (SVM)
• There are two ways to manage R packages:
– via R Console
– via RStudio
List available R Packages
• Line 3 of the code opens a URL in the web browser that shows
a large list of the package categories that are available.
• Line 5 of the code shows a long list of packages by name.
• To show the packages installed on the machine, use library()
• To show the currently attached packages, use search()
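The last two bullets can be tried directly in the console. The sketch below uses installed.packages() as a programmatic counterpart to library(), because it returns a value that can be stored. That substitution is a choice made for this example, not something taken from the slides.

```r
# Names of all packages installed on this machine
installed <- rownames(installed.packages())
print(head(installed))

# Packages (and environments) currently attached to the session
attached <- search()
print(attached)

# The base package is always installed and attached
base_installed <- "base" %in% installed
base_attached  <- "package:base" %in% attached
```

Calling library() with no arguments shows the same installed packages as a listing rather than as a value.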
Install R Packages
• There are several ways to install R packages in RStudio.
– Method 1: Use the function install.packages()
• For example: install.packages("package_name")
– Method 2: Install the package through the RStudio GUI (online/manual)
• Can be accessed through:
– Packages pane at the bottom right, then click Install.
– Tools menu > Install Packages
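For Method 1, a common pattern is to install a package only when it is not already present. The helper install_if_missing below is hypothetical and written only for this sketch. It is demonstrated with stats, which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it cannot be found
install_if_missing <- function(package_name) {
  if (!requireNamespace(package_name, quietly = TRUE)) {
    install.packages(package_name)
  }
  # Report whether the package is available afterwards
  invisible(requireNamespace(package_name, quietly = TRUE))
}

# "stats" is part of every R installation, so no download is triggered
ok <- install_if_missing("stats")

# Once installed, a package is attached with library()
library(stats)
```

The same call with another package name, for example install_if_missing("ggplot2"), would download that package from CRAN if it is missing.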
CONTROL STRUCTURES IN R
• For loop in R
for (x in 1:10) {
  print(x)
}
• While loop in R
x <- 1
while (x < 11) {
  print(x)
  x <- x + 1
}
IMPORT/EXPORT IN R
Save/Load and Import/Export
• In R, data can be saved as R binary data (.Rdata
extension), which is readable only by R.
• This can be done with the save() function.
– For example: save(variable, file="path/filename.Rdata")
– Data saved in Rdata format can be loaded into R using
load()
• In R, data can also be exported to an Excel-readable
format (.csv) using write.csv()
– For example: write.csv(dataframe, "path/filename.csv")
– The CSV file can be imported into R using read.csv()
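A round-trip sketch of both approaches. To keep it runnable anywhere, it writes to temporary files created with tempfile() rather than to a real path. The data frame scores is hypothetical.

```r
scores <- data.frame(id = 1:3, score = c(90.5, 72, 88))

# Temporary file paths, used here instead of "path/filename"
rdata_path <- tempfile(fileext = ".Rdata")
csv_path   <- tempfile(fileext = ".csv")

# Save to R binary format, remove the object, then restore it by name
save(scores, file = rdata_path)
rm(scores)
load(rdata_path)   # recreates the object "scores"

# Export to CSV without row names, then read it back
write.csv(scores, csv_path, row.names = FALSE)
scores_from_csv <- read.csv(csv_path)

print(scores_from_csv)
```

Passing row.names = FALSE stops write.csv() from adding an extra column of row numbers, so the file reads back with the same columns.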
Import from Excel
• In RStudio, an Excel file can be imported from the
menu: File > Import Dataset > From Excel
• On first use, RStudio will ask to download and
install a package.
• A file selection window will then appear.