0% found this document useful (0 votes)
92 views9 pages

ALY 6000 Project 1-2

This document provides instructions for Project 1. Students are asked to complete 31 problems/tasks in an R script file. The problems involve creating vectors using functions like c(), seq(), and rep(), performing operations on vectors like extracting elements and removing values, and using functions like sum(), mean(), median(), max(), min() to analyze vectors. Students are to include comments in their code and answers in a written report. They should follow specific formatting guidelines and variable naming. The project also includes test code to check solutions.

Uploaded by

writersleed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
92 views9 pages

ALY 6000 Project 1-2

This document provides instructions for Project 1. Students are asked to complete 31 problems/tasks in an R script file. The problems involve creating vectors using functions like c(), seq(), and rep(), performing operations on vectors like extracting elements and removing values, and using functions like sum(), mean(), median(), max(), min() to analyze vectors. Students are to include comments in their code and answers in a written report. They should follow specific formatting guidelines and variable naming. The project also includes test code to check solutions.

Uploaded by

writersleed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

Project 1 – R Practice

ALY 6000

Project Instructions
In this project you will produce and submit two files. The first is an R script that you create
by following the instructions below. The second is a report containing answers and
visualizations as specified in the following instructions. Be sure to follow all formatting
guidelines.

Setting Up Your Project


When working in RStudio, be careful with cloud drives. RStudio does not always play well
with cloud drives’ longer file path names. You will be best served by using a local drive
whenever possible until you are comfortable troubleshooting technical issues.
1. Create a new project titled “Lastname-Project1”. For example, if your last name is
Smith your project would be titled “Smith-Project1”.
2. Create an R Script file within your project titled “Project1_Script.R”.
3.

4. Include your name, the date, and class in a comment as the first line of the script.

5. Include the following code at the very top of every R file you create. This will clear
out the environment each time you run your entire code and prevent past actions
from interfering with current work.
cat("\014") # clears console
rm(list = ls()) # clears global environment
try(dev.off(dev.list()["RStudioGD"]), silent = TRUE) # clears plots
try(p_unload(p_loaded(), character.only = TRUE), silent = TRUE) #
clears packages
options(scipen = 100) # disables scientific notation for entire R
session

6. [Optional] Install the pacman package. This is a simple, user-friendly package that
makes installing and loading other packages a one-line process.
# You should do this line only once in the entire course.
install.packages("pacman")
# Once you have done the install line, the following line is what you
will always need to do to utilize the pacman package in R
library(pacman)

Testing Your Solution


The file project1_tests.R contains a set of test cases which will affirm that some of the
problems you have solved are correct. To use this file, save it into your project folder.
Then, from the console line write the following code:
library(pacman)
p_load(testthat)
test_file("project1_tests.R")

Tests can be run at any time. This code will report the total number of tested items that
currently succeed as well as any errors preventing you from passing others. A single
problem may have multiple tests while others have none. You should be able to complete
this project with 100% accuracy on all tests. Use the exact names of variables stated in each
problem, pay extra attention to matching any upper and lower case letters.

Project 1 Instructions
Complete the following problems in the R script you created above. Be sure to name
specified variables exactly as they are given, paying special attention to spellings and the
use of upper and lower case letters.
1. Write lines of code to compute all of the following. Include the answers in your
written report.

123 * 453
5^2 * 40
TRUE & FALSE
TRUE | FALSE
75 %% 10
75 / 10

2. Create a vector using the c function with the values 17, 12, -33, 5 and assign it to a
variable called first_vector.

[1] 17 12 -33 5

3. Create a vector using the c function with the values 5, 10, 15, 20, 25, 30, 35 and
assign it to a variable called counting_by_fives.
4. Create a vector using the seq function containing every even number between 10
and 30 inclusive and assign it to a variable called second_vector.

[1] 10 12 14 16 18 20 22 24 26 28 30

5. Create a vector using the seq function containing the values 5, 10, 15, 20, 25, 30 , 35
and assign it to a variable called counting_by_fives_with_seq.

[1] 5 10 15 20 25 30 35

6. Create a vector using the function rep and provide it with first_vector as its first
argument and 10 as its second argument. Assign the result to a variable called
third_vector.
[1] 17 12 -33 5 17 12 -33 5 17 12 -33 5 17 12 -33 5
17 12 -33
[20] 5 17 12 -33 5 17 12 -33 5 17 12 -33 5 17 12 -33
5 17 12
[39] -33 5

7. Using the rep function, create a vector containing the number zero, 20 times. Store
the result in a variable called rep_vector.

[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

8. Create a vector using the range operator (the colon), that contains the numbers
from 10 to 1 . Store the result in a variable called fourth_vector.

[1] 10 9 8 7 6 5 4 3 2 1

9. Create a vector using the range operator that contains the numbers from 5 to 15.
Store the result in a variable called counting_vector.
[1] 5 6 7 8 9 10 11 12 13 14 15

10. Create a vector with the values (96, 100, 85, 92, 81, 72) and store it in a variable
called grades.
[1] 96 100 85 92 81 72

11. Add the number 3 to the vector grades. Store the result in a variable called
bonus_points_added.

[1] 99 103 88 95 84 75

12. Create a vector with the values 1 – 100. Store it in a variable called
one_to_one_hundred. Do not type out all 100 numbers.

[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52
53 54
[55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
89 90
[91] 91 92 93 94 95 96 97 98 99 100

13. Create a vector with values from 100 to -100 by 3s. Store the result in a variable
called reverse_numbers. To clarify, the first 3 numbers in this vector will be (100,
97, 94…)

[1] 100 97 94 91 88 85 82 79 76 73 70 67 64 61 58 55
52 49 46
[20] 43 40 37 34 31 28 25 22 19 16 13 10 7 4 1 -2 -
5 -8 -11
[39] -14 -17 -20 -23 -26 -29 -32 -35 -38 -41 -44 -47 -50 -53 -56 -59 -
62 -65 -68
[58] -71 -74 -77 -80 -83 -86 -89 -92 -95 -98

14. Write each of the following lines of code. Add a one-sentence comment above each
line explaining what is happening. Include your comments in the written report.
second_vector + 20
second_vector * 20
second_vector >= 20
second_vector != 20 # != means "not equal"

15. Using the built in sum function, compute the sum of one_to_one_hundred and store
it in a variable called total.
[1] 5050

16. Using the built in mean function, compute the average of one_to_one_hundred and
store the result in a variable called average_value.

[1] 50.5

17. Using the built in median function, compute the average of one_to_one_hundred
and store the result in a variable called median_value.

[1] 50.5

18. Using the built in max function, compute the average of one_to_one_hundred and
store the result in a variable called max_value.

[1] 100

19. Using the built in min function, compute the average of one_to_one_hundred and
store the result in a variable called min_value.
[1] 1

20. Using brackets, extract the first value from second_vector and store it in a variable
called first_value.
[1] 10

21. Using brackets, extract the first, second and third values from second_vector and
store it in a variable called first_three_values.
[1] 10 12 14

22. Using brackets, extract the 1st, 5th, 10th, and 11th elements of second_vector. Store
the resulting vector in a variable called vector_from_brackets.
[1] 10 18 28 30

23. Use the brackets to extract elements from the first_vector using the following vector
c(FALSE, TRUE, FALSE, TRUE). Store the result in a variable called
vector_from_boolean_brackets. Explain in a comment what happens. Include the
answer in your written report.
[1] 12 5

24. Examine the following piece of code and write a one-sentence comment explaining
what is happening. Include the answer in your written report.
second_vector >= 20

25. Examine the following piece of code and write a one-sentence comment explaining
what is happening.
ages_vector <- seq(from = 10, to = 30, by = 2)

26. Examine the following piece of code and write a one-sentence comment explaining
what is happening, assuming ages_vector was computed in the previous problem.
Include the answers in your written report.

ages_vector [ages_vector >= 20]

27. Using the same approach as the previous question, create a new vector by removing
from the grades vector all values lower than or equal to 85. Store the new vector in
a variable called lowest_grades_removed.
[1] 96 100 85 92

28. Use the grades vector to create a new vector with the 3rd and 4th elements of
grades removed. Store this in a variable called middle_grades_removed. Try
utilizing a vector of negative indexes to complete this task.
[1] 96 100 81 72
29. Use bracket notation to remove the 5th and 10th elements of second_vector. Store
the result in a variable called fifth_vector.

[1] 10 12 14 16 20 22 24 26 30

30. Write the following code. Explain in a comment what you think the code is doing.
Include the answer in your written report.

set.seed(5)
random_vector <- runif(n=10, min = 0, max = 1000)

31. Use the sum function to compute the total of random_vector. Store the result in a
variable called sum_vector.
[1] 5295.264

32. Use the cumsum function to compute the cumulative sum of random_vector. Store
the result in a variable called cumsum_vector.

[1] 200.2145 885.4330 1802.3088 2086.7083 2191.3584 2892.4159


3420.3759
[8] 4228.3111 5184.8112 5295.2642

33. Use the mean function to compute the mean of random_vector. Store the result in a
variable called mean_vector.
[1] 529.5264

34. Use the sd function to compute the standard deviation of random_vector. Store the
result in a variable called sd_vector.

[1] 331.3606

35. Use the round function to round the values of random_vector. Store the result in a
variable called round_vector.

[1] 200 685 917 284 105 701 528 808 957 110

36. Use the sort function to sort the values of random_vector. Store the result in a
variable called sort_vector.

[1] 104.6501 110.4530 200.2145 284.3995 527.9600 685.2186 701.0575


807.9352
[9] 916.8758 956.5001

37. Consider the following code. Explain in a comment what you think the code is doing.
Include the answer in your written report.

set.seed(5)
random_vector <- rnorm(n=1000, mean = 50, sd = 15)
38. Use the hist function and provide it with random_vector. Explain the result in a
comment. Include both the explanation and visualization in your report.

39. Download the datafile ds_salaries.csv from Canvas. Save it on your computer in the
same folder (directory) where your .R file for this project is located.

40. Using the p_load function, load the tidyverse set of libraries. Pacman will both
install and load the tidyverse libraries.

p_load(tidyverse)

41. The tidyverse has a function called read_csv to read files when you specify the
filename. Store the result of the read into a variable called first_dataframe.

42. Try each of the following blocks of code. Add a one-sentence comment describing
what you believe is happening. Include your answers in your written report.

head(first_dataframe)

head(first_dataframe, n = 7)

names(first_dataframe)

smaller_dataframe <- select(first_dataframe, job_title, salary_in_usd)

smaller_dataframe
better_smaller_dataframe <- arrange(smaller_dataframe,
desc(salary_in_usd))

better_smaller_dataframe

better_smaller_dataframe <- filter(smaller_dataframe, salary_in_usd >


80000)

better_smaller_dataframe

better_smaller_dataframe <-
mutate(smaller_dataframe, salary_in_euros = salary_in_usd * .94)

better_smaller_dataframe

better_smaller_dataframe <- slice(smaller_dataframe, 1, 1, 2, 3, 4, 10,


1)

better_smaller_dataframe

ggplot(better_smaller_dataframe) +
geom_col(mapping = aes(x = job_title, y = salary_in_usd), fill =
"blue") +
xlab("Job Title") +
ylab("Salary in US Dollars") +
labs(title = "Comparison of Jobs ") +
scale_y_continuous(labels = scales::dollar) +
theme(axis.text.x = element_text(angle = 50, hjust = 1))

Submitting to Canvas
When you are satisfied with your solution, take the following steps:
1. Remove any lines in your code with “install.packages.”

2. Remove any lines in your code that use the view function.

3. Submit two (2) files under the appropriate assignment in Canvas:

1. Your R script named Lastname_Project1.R.

2. A PDF of your report titled Lastname_Project1_Report.pdf.

Your report should contain the following information formatted as specified:


Title Page
Include your name, assignment title, and submission date.
Introduction and Key Findings
Include an overview of the assignment and any findings.
Conclusion/Recommendations
Include evidence-based recommendations and visualizations or direct presentation
of tabular data.
Works Cited
Include all sources, including YouTube videos, instruction materials, Google search
results, and texts that informed your study of statistics and R.
Your report should be as concise as possible while maintaining fluency. Your key findings
will be strongest if supported by visualizations or direct presentation of tabular data.

Your summary must adhere to APA guidelines, including page numbers on each page
(including the title page) in the upper right corner. See the following examples for title
pages, citations, and general APA formatting.
Congratulations on completing your first R project!

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy