0% found this document useful (0 votes)
33 views

Statistical Models Using R

R is a popular open-source programming language for statistical computing and data analysis. It typically includes a command-line interface and was developed at the University of Auckland in New Zealand. R supports modular programming via functions and can be integrated with procedures written in other languages like C and Python. In modern times, R is frequently used by data analysts, statisticians, and marketers for accessing, cleaning, and presenting data.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views

Statistical Models Using R

R is a popular open-source programming language for statistical computing and data analysis. It typically includes a command-line interface and was developed at the University of Auckland in New Zealand. R supports modular programming via functions and can be integrated with procedures written in other languages like C and Python. In modern times, R is frequently used by data analysts, statisticians, and marketers for accessing, cleaning, and presenting data.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

What is R?

R is a popular open-source programming language for statistical computing


and data analysis.

R typically includes a command-line interface.

It is an interpreted computer programming language developed by Ross


Ihaka and Robert Gentleman at the University of Auckland in New Zealand.

R supports not only branching and looping but also modular programming via
functions. To boost efficiency, R can be integrated with procedures written in
C, C++, .Net, Python, and FORTRAN.

In the modern world, data analysts, statisticians, and marketers frequently use
R as a tool for accessing, cleaning, and presenting data.

R is a software environment for analyzing statistical data and graphical


depiction. R is available on popular platforms such as Windows, Linux, and
macOS

Why to use R?
Open Source: R is an open-source language, meaning it's freely available for anyone
to use, modify, and distribute.

Supports cross language Integration: R can easily integrate with other languages
and tools.

Cross-Platform Compatibility: R is compatible with major operating systems like


Windows, macOS, and Linux, ensuring that users can work with R regardless of their
preferred platform.

Community Support: The R community is large and active. This makes it easier for
users to find help and learn from others.

Statistical Computing: R was developed with a focus on statistical computing and


data analysis. It provides a comprehensive set of statistical and graphical techniques,
making it a preferred choice for statisticians and data analysts.
Flexibility: R is a flexible language that allows users to customize and extend its
functionality according to their specific needs. Advanced users can write their own
functions and packages to tailor R to their requirements.

Applications of R
1. Statistical Analysis: R is widely used for statistical analysis in fields such as
academia, pharmaceuticals, finance, and social sciences. It offers a rich set of
statistical tools for hypothesis testing, regression analysis, time series analysis,
and more.
2. Data Visualization: R is a popular choice for data visualization due to
packages like ggplot2, which enable users to create a wide variety of high-
quality plots and charts for exploring and presenting data.
3. Machine Learning: R provides several packages for machine learning, such as
caret, mlr, and TensorFlow, making it suitable for tasks like classification,
regression, clustering, and dimensionality reduction.
4. Data Mining: R can be used for data mining tasks such as association rule
mining, cluster analysis, and anomaly detection. Packages like arules and
cluster facilitate these tasks.
5. Bioinformatics: R is extensively used in bioinformatics for analyzing and
visualizing biological data, including DNA sequencing data,
6. Social Network Analysis: R offers packages like igraph and network for
analyzing and visualizing social networks and complex networks

To install RStudio, follow these steps:


1. Download RStudio: Go to the RStudio website
(https://www.rstudio.com/products/rstudio/download/) and download the
appropriate version of RStudio for your operating system (Windows, macOS,
or Linux).
2. Install R: If you haven't already installed R on your system, you'll need to do
so. You can download R from the Comprehensive R Archive Network (CRAN)
website (https://cran.r-project.org/).
3. Run the Installer: Once the RStudio installer file is downloaded, run it to start
the installation process.
4. Follow the Instructions: Follow the on-screen instructions provided by the
installer. You may need to agree to the terms and conditions and choose
installation options such as the installation directory.
5. Complete Installation: After the installation is complete, you should be able
to launch RStudio from your desktop or applications menu.
6. Start Using RStudio: Once RStudio is launched, you can start using it to write
R code, run analyses, create visualizations, and more.

Data Types in R

Basic Data
Values Examples
Types
Numeric Set of all real numbers "numeric_value <- 3.14"

Integer Set of all integers, Z "integer_value <- 42L"


Logical TRUE and FALSE "logical_value <- TRUE"
Complex Set of complex numbers "complex_value <- 1 + 2i"

“a”, “b”, “c”, …, “@”, “#”, “$”, …., “1”, "character_value <- "Hello
Character
“2”, …etc Geeks"

raw as.raw() "single_raw <- as.raw(255)"

Data Structures in R Programming


A data structure is a particular way of organizing data in a computer so that it
can be used effectively. The idea is to reduce the space and time complexities
of different tasks. Data structures in R programming are tools for holding
multiple values.

R’s base data structures are often organized by their dimensionality (1D, 2D, or
nD) and whether they’re homogeneous (all elements must be of the identical
type) or heterogeneous (the elements are often of various types). This gives
rise to the three data types which are most frequently utilized in data analysis.

The most essential data structures used in R include:

• Vectors
• Dataframes
• Matrices

• Vectors

A vector is an ordered collection of basic data types of a given length. The


only key thing here is all the elements of a vector must be of the identical data
type e.g homogeneous data structures. Vectors are one-dimensional data
structures.

Example:

• Python3

# R program to illustrate Vector

# Vectors(ordered collection of same data type)

X = c(1, 3, 5, 7, 8)

# Printing those elements in console

print(X)

Output:

[1] 1 3 5 7 8

Dataframes
Dataframes are generic data objects of R which are used to store the tabular
data. Dataframes are the foremost popular data objects in R programming
because we are comfortable in seeing the data within the tabular form. They
are two-dimensional, heterogeneous data structures. These are lists of vectors
of equal lengths.

Data frames have the following constraints placed upon them:

• A data-frame must have column names and every row should have a
unique name.
• Each column must have the identical number of items.
• Each item in a single column must be of the same data type.
• Different columns may have different data types.

To create a data frame we use the data.frame() function.

Example:

• Python3

# R program to illustrate dataframe

# A vector which is a character vector

Name = c("Amiya", "Raj", "Asish")

# A vector which is a character vector

Language = c("R", "Python", "Java")

# A vector which is a numeric vector

Age = c(22, 25, 45)

# To create dataframe use data.frame command

# and then pass each of the vectors

# we have created as arguments

# to the function data.frame()

df = data.frame(Name, Language, Age)

print(df)

Output:

Name Language Age


1 Amiya R 22
2 Raj Python 25
3 Asish Java 45
Matrices
A matrix is a rectangular arrangement of numbers in rows and columns. In a
matrix, as we know rows are the ones that run horizontally and columns are
the ones that run vertically. Matrices are two-dimensional, homogeneous
data structures.
Now, let’s see how to create a matrix in R. To create a matrix in R you need to
use the function called matrix. The arguments to this matrix() are the set of
elements in the vector. You have to pass how many numbers of rows and how
many numbers of columns you want to have in your matrix and this is the
important point you have to remember that by default, matrices are in
column-wise order.

Example:

• Python3

# R program to illustrate a matrix

A = matrix(

# Taking sequence of elements

c(1, 2, 3, 4, 5, 6, 7, 8, 9),

# No of rows and columns

nrow = 3, ncol = 3,

# By default matrices are

# in column-wise order

# So this parameter decides

# how to arrange the matrix

byrow = TRUE

print(A)

Output:

[,1] [,2] [,3]


[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy