coursera-data-science-learn/getcleandata


Course: Getting and Cleaning Data

Submission Date: 17.09.2018

Prepared by: Oleksandr Titorchuk

Code description: run_analysis.R

This file describes how the original data was processed and how the script that processes it works.

The script is divided into 5 blocks according to the exercise description.

The script requires the 'reshape2' package to be installed.

Preparatory work checks whether the directory with the original dataset ("UCI HAR Dataset") exists, downloads the dataset if it does not, and reads the necessary data into R.
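The existence check described above could look roughly like the sketch below. The helper name ensure_dataset, its arguments, and the zip file name are hypothetical and do not come from run_analysis.R; the download URL is deliberately left as a parameter because it is not stated in this file.

```r
# Hypothetical helper: download and unzip the dataset only when the
# "UCI HAR Dataset" folder is not already present.
ensure_dataset <- function(data_dir = "UCI HAR Dataset",
                           zip_url = NULL,
                           zip_file = "dataset.zip") {
  if (dir.exists(data_dir)) {
    return(invisible(FALSE))  # folder already there, nothing to download
  }
  if (is.null(zip_url)) {
    stop("dataset folder is missing and no download URL was supplied")
  }
  download.file(zip_url, destfile = zip_file, mode = "wb")
  unzip(zip_file)
  invisible(TRUE)  # TRUE signals that a download took place
}

# Example call (the URL must be supplied by the user):
# ensure_dataset(zip_url = "<dataset zip URL>")
```

Returning a logical flag makes it easy to tell whether the data was already on disk or had to be fetched.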

1. Merges the training and the test sets to create one data set

The "xTrain" and "xTest" datasets are combined row-wise with the rbind() function. Column names for the resulting table are taken from the "features" file.
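A minimal sketch of this step on toy data is shown below. The three-column tables and their values are made up for illustration; only the names xTrain, xTest and features follow the description above.

```r
# Toy stand-ins for the real tables (values are invented).
features <- data.frame(
  id = 1:3,
  name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X"),
  stringsAsFactors = FALSE
)
xTrain <- data.frame(V1 = c(0.1, 0.2), V2 = c(0.3, 0.4), V3 = c(0.5, 0.6))
xTest  <- data.frame(V1 = 0.7, V2 = 0.8, V3 = 0.9)

# Stack the training rows on top of the test rows.
x <- rbind(xTrain, xTest)

# Label the columns with the feature names.
names(x) <- features$name
```

rbind() requires both tables to have the same columns, which holds here because the training and test sets share one feature list.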

2. Extracts only the measurements on the mean and standard deviation for each measurement

The grepl() function is used to create a logical vector indicating which variable names contain 'mean()' or 'std()'. This logical vector lets us subset only the needed columns from the 'X' dataset.
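The selection can be sketched as follows. The regular expression is an assumption about how the match is written in the script; the feature names and values are toy data.

```r
# Toy feature names, including two that should NOT be selected.
feature_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                   "tBodyAcc-max()-X", "fBodyAcc-meanFreq()-X")
x <- as.data.frame(matrix(1:8, nrow = 2))
names(x) <- feature_names

# Parentheses are escaped so that they match literally; as a result
# "meanFreq()" is not picked up by the "mean()" pattern.
keep <- grepl("mean\\(\\)|std\\(\\)", feature_names)

# drop = FALSE keeps a data.frame even if only one column matches.
x_sub <- x[, keep, drop = FALSE]
```

Matching the literal "mean()" rather than just "mean" is a design choice: it excludes variables such as meanFreq(), which are a different kind of measurement.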

3. Uses descriptive activity names to name the activities in the data set

The next step is to combine the "activity" and "subject" datasets, which form the first 2 columns of the final dataset, and to replace the id numbers in the 'activity' column with activity names.
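One way to do this is a lookup with match(), sketched below on toy data. The lookup table here is a shortened illustration, and the use of cbind() and match() is an assumption; the script may achieve the same result differently.

```r
# Toy lookup table: activity id -> activity name (shortened for illustration).
activity_labels <- data.frame(
  id = 1:3,
  name = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS"),
  stringsAsFactors = FALSE
)

# Toy subject and activity columns, one row per observation.
subject  <- data.frame(subject = c(1, 1, 2))
activity <- data.frame(activity = c(3, 1, 2))

# Put the two identifier columns side by side.
ids <- cbind(subject, activity)

# Replace each activity id with the name found in the lookup table.
ids$activity <- activity_labels$name[match(ids$activity, activity_labels$id)]
```

match() preserves the row order of the observations, which matters because the rows must stay aligned with the measurement table.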

4. Appropriately labels the data set with descriptive variable names

The gsub() function is used to rename the variables of the 'X' data.frame to make them more descriptive.
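The exact substitutions are not listed in this file, so the replacements below are purely illustrative assumptions showing how a chain of gsub() calls can be applied to the names.

```r
# Two toy variable names in the style of the raw feature list.
old_names <- c("tBodyAcc-mean()-X", "fBodyGyro-std()-Z")

# Illustrative substitutions (assumed, not taken from run_analysis.R):
new_names <- gsub("()", "", old_names, fixed = TRUE)   # drop the parentheses
new_names <- gsub("^t", "time", new_names)             # leading t -> time
new_names <- gsub("^f", "frequency", new_names)        # leading f -> frequency
new_names <- gsub("-", ".", new_names, fixed = TRUE)   # dashes -> dots

print(new_names)
```

With these rules the two names become "timeBodyAcc.mean.X" and "frequencyBodyGyro.std.Z".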

5. Creates a second, independent tidy data set with the average of each variable for each activity and each subject

The 'reshape2' package lets us easily melt and cast the data by 'subject' and 'activity', applying the mean() function. After that, the columns of the tidy dataset are renamed to reflect that they are now averaged ('.averaged'). Finally, the tidy dataset is saved to the "tidy.data.txt" file.
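The melt-and-cast step might look like the sketch below, run on toy data. The column names and values are invented, the file is written to a temporary directory rather than the working directory, and row.names = FALSE is an assumption about how the file is saved.

```r
library(reshape2)

# Toy dataset: two identifier columns followed by two measurements.
data <- data.frame(
  subject = c(1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "WALKING", "SITTING"),
  tBodyAcc.mean.X = c(1, 3, 5, 7),
  tBodyAcc.std.X = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

# Long format: one row per (subject, activity, variable, value).
molten <- melt(data, id.vars = c("subject", "activity"))

# Wide format again, averaging values within each subject/activity pair.
tidy <- dcast(molten, subject + activity ~ variable, fun.aggregate = mean)

# Mark the measurement columns as averaged.
measure_cols <- !(names(tidy) %in% c("subject", "activity"))
names(tidy)[measure_cols] <- paste0(names(tidy)[measure_cols], ".averaged")

# Save the result (temporary directory used here to keep the sketch tidy).
out_file <- file.path(tempdir(), "tidy.data.txt")
write.table(tidy, file = out_file, row.names = FALSE)
```

The resulting table has one row per subject/activity combination that occurs in the data, so the toy input above yields three rows.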
