Course: Getting and Cleaning Data

Submission Date: 17.09.2018

Prepared by: Oleksandr Titorchuk

Code description: run_analysis.R

This file describes how the original data was processed and how the script that handles it works.

The script is divided into 5 blocks according to the exercise description.

The script requires the 'reshape2' package to be installed.

Preparatory work includes checking whether the directory with the original dataset ("UCI HAR Dataset") exists, downloading the dataset if it does not, and reading the necessary data into R.
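
As a minimal sketch of the preparatory step, the helpers below download and unpack the archive only when the data directory is missing, and read one table from it. The function names `ensure_dataset` and `read_har` are hypothetical, and the download URL is not given in this description, so the caller must supply it.

```r
# Hypothetical helper: fetch and unpack the archive only when the data
# directory is missing. Returns TRUE if a download happened, FALSE otherwise.
# 'url' is deliberately left without a default: the address is not stated here.
ensure_dataset <- function(data_dir, url, zip_file = "dataset.zip") {
  if (dir.exists(data_dir)) {
    return(FALSE)
  }
  download.file(url, destfile = zip_file, mode = "wb")
  # Unpack next to the expected data directory.
  unzip(zip_file, exdir = dirname(data_dir))
  TRUE
}

# Hypothetical reader for one whitespace-separated table inside the directory.
read_har <- function(data_dir, ...) {
  read.table(file.path(data_dir, ...), stringsAsFactors = FALSE)
}
```

A call could look like `ensure_dataset("UCI HAR Dataset", url = dataset_url)` followed by `features <- read_har("UCI HAR Dataset", "features.txt")`, where `dataset_url` is a placeholder for the archive address and the file name assumes the standard layout of the archive.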

1. Merges the training and the test sets to create one data set

"xTrain"/"xTest" datasets are merged together through the rbind() function. Column names are assigned to the resulting table from the "features" file.

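The merge described in step 1 can be sketched on toy data as follows. The small data frames stand in for the real tables, and the assumption that the feature names sit in the second column of "features" is not confirmed by the text.

```r
# Toy stand-ins for the tables read from the dataset (assumed shapes).
xTrain <- data.frame(V1 = c(0.1, 0.2), V2 = c(0.3, 0.4), V3 = c(0.5, 0.6))
xTest  <- data.frame(V1 = 0.7, V2 = 0.8, V3 = 0.9)
features <- data.frame(
  id   = 1:3,
  name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-meanFreq()-X"),
  stringsAsFactors = FALSE
)

# Stack the training rows on top of the test rows.
X <- rbind(xTrain, xTest)

# Label the columns with the feature names.
colnames(X) <- features$name
```
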
2. Extracts only the measurements on the mean and standard deviation for each measurement

The grepl() function is used to create a logical vector indicating which variable names contain 'mean()' or 'std()'. This logical vector lets us keep only the needed columns of the 'X' dataset.
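
A minimal sketch of this selection on toy data is shown below. It assumes literal matching (`fixed = TRUE`) so that the parentheses are not read as regular-expression syntax; the script itself may build the pattern differently.

```r
# Toy table with three feature columns (names follow the dataset's style).
X <- data.frame(c(0.1, 0.2), c(0.3, 0.4), c(0.5, 0.6))
colnames(X) <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-meanFreq()-X")

# TRUE for names containing the literal text "mean()" or "std()".
keep <- grepl("mean()", colnames(X), fixed = TRUE) |
  grepl("std()", colnames(X), fixed = TRUE)

# Keep only the selected columns; drop = FALSE preserves the data.frame.
X_sel <- X[, keep, drop = FALSE]
```

With literal matching, a name such as "tBodyAcc-meanFreq()-X" is not selected, because it does not contain the exact text "mean()".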

3. Uses descriptive activity names to name the activities in the data set

The next step is to combine the "activity" and "subject" datasets, which form the first two columns of the final dataset, and to replace the id numbers in the 'activity' column with activity names.
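
One way to sketch this step on toy data is shown below. Using cbind() and factor() is an assumption made for illustration, since the text does not say which functions the script uses, and the two-row label table is a shortened stand-in for the real one.

```r
# Toy stand-ins: one activity id and one subject id per observation.
activity <- data.frame(activity = c(1, 2, 1))
subject  <- data.frame(subject = c(1, 1, 2))
activityLabels <- data.frame(
  id   = 1:2,
  name = c("WALKING", "WALKING_UPSTAIRS"),
  stringsAsFactors = FALSE
)

# Put the two id columns side by side; row order is preserved.
ids <- cbind(subject, activity)

# Replace numeric activity ids with their descriptive names.
ids$activity <- factor(ids$activity,
                       levels = activityLabels$id,
                       labels = activityLabels$name)
```

Binding columns side by side keeps the rows aligned with the measurement table, which a key-based merge() would not guarantee.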

4. Appropriately labels the data set with descriptive variable names

The gsub() function is used to rename the variables of the 'X' data.frame so that their names are more descriptive.
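
The renaming can be sketched as a chain of gsub() calls. The specific substitutions below are illustrative assumptions, not the mapping the script actually applies.

```r
old_names <- c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z")

# Hypothetical substitutions, applied one after another.
new_names <- gsub("^t", "time.", old_names)               # leading t
new_names <- gsub("^f", "freq.", new_names)               # leading f
new_names <- gsub("Acc", "Accelerometer", new_names)
new_names <- gsub("Gyro", "Gyroscope", new_names)
new_names <- gsub("-mean\\(\\)", ".mean", new_names)      # escape the parentheses
new_names <- gsub("-std\\(\\)", ".std", new_names)
new_names <- gsub("-", ".", new_names, fixed = TRUE)      # remaining hyphens
```

The parentheses are escaped because gsub() treats its pattern as a regular expression unless `fixed = TRUE` is set.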

5. Creates a second, independent tidy data set with the average of each variable for each activity and each subject

The 'reshape2' package lets us easily melt and cast the data by 'subject' and 'activity', applying the mean() function. After that, the columns of the tidy dataset are renamed with the suffix '.averaged' to show that they now hold averages. Finally, the tidy dataset is saved to the "tidy.data.txt" file.
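
A minimal sketch of the melt, cast, rename and save sequence on toy data follows. The column names and values are made up, `row.names = FALSE` is an assumption about how the file is written, and the output goes to a temporary directory here so that the example leaves the working directory untouched.

```r
library(reshape2)

# Toy dataset: two id columns followed by two measurement columns.
dat <- data.frame(
  subject  = c(1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "SITTING", "SITTING"),
  time.BodyAcc.mean.X = c(0.25, 0.75, 0.5, 1.5),
  time.BodyAcc.std.X  = c(1, 3, 2, 4)
)

# Long format: one row per (subject, activity, variable, value).
long <- melt(dat, id.vars = c("subject", "activity"))

# Wide format again, averaging the values within each subject/activity pair.
tidy <- dcast(long, subject + activity ~ variable, fun.aggregate = mean)

# Mark the measurement columns as averaged.
measure_cols <- !(colnames(tidy) %in% c("subject", "activity"))
colnames(tidy)[measure_cols] <- paste0(colnames(tidy)[measure_cols], ".averaged")

# Save the result.
out_file <- file.path(tempdir(), "tidy.data.txt")
write.table(tidy, out_file, row.names = FALSE)
```

The result has one row per subject and activity pair, so the full dataset would yield one row for every combination that occurs in the data.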

