SAS Assignment 1 5 E057


SAS Assignment

Name: Jayant Singh


Roll Number: E057
Class: B. Tech CS
Div: E

* Activity 1.03 *;

* 1) View the code. How many steps are in the program? => 3

* 2) How many statements are in the PROC PRINT step? => 4

* 3) How many global statements are in the program? => 3

* 4) Run the program and view the log.


* 5) How many observations were read by the PROC PRINT step? => 11
* Activity 1.04 *;

* 1) Format the program to improve the spacing. What syntax error is detected? Fix the error
and run the program. *;
data canadashoes;
    set sashelp.shoes;
    where region="Canada";
    Profit=Sales-Returns;
run;

proc print data=canadashoes;
run;

* 2) Read the log and identify any additional syntax *;

* errors or warnings. Correct the program and *;

* format the code again. *;

* 3) Add a comment to describe the changes that you *;

* made to the program. *;

1. Added the missing semicolon after data canadashoes
2. Indented set sashelp.shoes and added the missing semicolon
3. Reformatted the run statements in both steps
4. Fixed a typo in the proc statement

* 4) Run the program and examine the log and results. *;

LOG:
RESULTS:

* How many rows are in the canadashoes data? => 37


* Activity 2.04 *;

* 1) Write a PROC CONTENTS step to generate a report *;

* of the STORM_SUMMARY.SAS7BDAT table properties. *;

* Highlight the step and run only the selected *;

* code. =>
proc contents data=pg1.storm_summary;
run;

* 2) How many observations are in the table? => 3118

* 3) How is the table sorted? => Sorted by Season Name

* Activity 2.07 *;

* 1) Complete the OPTIONS statement to ensure that the column names follow SAS
naming conventions. *;
options VALIDVARNAME=V7 ;

* 2) Complete the LIBNAME statement to create a library named NP that reads NP_INFO.XLSX
in the data folder. *;
LIBNAME NP XLSX "/folders/myfolders/2. Datasets/PG1/data/pg194/np_info.xlsx";
* 3) Highlight the OPTIONS and LIBNAME statements and *;

* run the selection. *;

* 4) Navigate to your list of libraries and open the *;

* NP library. Open each table and view the data. *;

NP.Parks

NP.SPECIES:

NP.VISITS
* Activity 2.08 *;

* 1) If necessary, update the path of the course files *;

* in the LIBNAME statement. *;


libname np xlsx "/folders/myfolders/2. Datasets/PG1/data/np_info.xlsx";

* 2) Complete the PROC CONTENTS step to read the parks *;

* table in the NP library. *;


proc contents data= np.parks;
run;

* 3) Complete the LIBNAME statement to clear the NP *;

* library. *;
libname NP CLEAR;

* 4) Run the program and examine the log. Which column *;

* names were modified to follow SAS naming *;

* conventions? *;
* Activity 2.09 *;

* 1) This program imports a tab-delimited file. Run *;

* the program twice and carefully read the log. *;

* What is different about the second submission? *;

* 2) Fix the program and rerun it to confirm that the *;

* import is successful. *;

FIRST RUN:
SECOND RUN:

Changed the output table name to storm_damage_tab_new, so the second import no longer collides with the table that the first run created.

proc import
    datafile="/folders/myfolders/2.Datasets/PG1/data/pg194/storm_damage.tab"
    dbms=tab out=storm_damage_tab;
run;

proc import
    datafile="/folders/myfolders/2.Datasets/PG1/data/pg194/storm_damage.tab"
    dbms=tab out=storm_damage_tab_new;
run;
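
An alternative fix, sketched here on the assumption that overwriting the first table is acceptable, is the REPLACE option of PROC IMPORT. It lets a rerun overwrite an existing output table instead of requiring a new table name. The path is copied from the step above and may need adjusting.

```sas
/* Sketch: REPLACE allows PROC IMPORT to overwrite WORK.STORM_DAMAGE_TAB */
/* when the step is submitted a second time.                             */
proc import
    datafile="/folders/myfolders/2.Datasets/PG1/data/pg194/storm_damage.tab"
    dbms=tab out=storm_damage_tab replace;
run;
```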

* Activity 3.02 *;

* 1) Run the program. Examine the results and the log. *;

* Are the two WHERE statements applied? *;

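Only the second WHERE statement is applied. A later WHERE statement replaces an earlier one in the same step, which is why the results still include rows such as MaxWindMPH=132, below the 156 threshold.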
* 2) Change the second WHERE statement to WHERE ALSO *;

* and rerun the code. Examine the results and the *;

* log. Are the two WHERE statements applied? *;

Modified Code:
proc print data=pg1.storm_summary;
    where MaxWindMPH>156;
    where also MinPressure>800 and MinPressure<920;
run;
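
WHERE ALSO adds its condition to the existing WHERE expression with a logical AND. As a sketch, the two statements above should select the same rows as this single WHERE statement:

```sas
/* Sketch: one WHERE statement equivalent to WHERE plus WHERE ALSO. */
proc print data=pg1.storm_summary;
    where MaxWindMPH>156 and MinPressure>800 and MinPressure<920;
run;
```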

Results:
* Activity 3.03 *;

* 1) Uncomment each WHERE statement one at a time and *;

* run the step to observe the rows that are *;

* included in the results. *;

FIRST Where
SECOND Where
THIRD Where
FOURTH Where

* 2) Comment all previous WHERE statements. Add a new *;

* WHERE statement to print storms that begin with *;

* Z. How many storms are included in the results? *;

CODE:
proc print data=pg1.storm_summary(obs=50);
    *where MinPressure is missing; /*same as MinPressure = .*/
    *where Type is not missing; /*same as Type ne " "*/
    *where MaxWindMPH between 150 and 155;
    *where Basin like "_I";
    where Name like 'Z%';
run;

* Activity 3.04 *;

* 1) Change the value in the %LET statement from NA to *;

* SP. *;

* 2) Run the program and carefully read the log. *;

* Which procedure did not produce a report? *;

* What is different about the WHERE statement in *;

* that step? *;

NA:
SP:
* Activity 3.06 *;

* 1) Highlight the PROC PRINT step and run the *;

* selected code. Notice how the values of Lat, Lon, *;

* StartDate, and EndDate are displayed in the *;

* report. *;

CODE:

proc print data=pg1.storm_summary(obs=20);
    format Lat Lon 4. StartDate EndDate date7.;
run;
* 2) Change the width of the DATE format to 7 and run *;

* the PROC PRINT step. How does the display of *;

* StartDate and EndDate change? *;

DATE WIDTH = 9

DATE WIDTH = 7 (Note 1980 changes to 80)


* 3) Change the width of the DATE format to 11 and run *;

* the PROC PRINT step. How does the display of *;

* StartDate and EndDate change? *;

DATE WIDTH = 11 (Hyphens are added)
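
The three widths can be compared side by side with a small DATA step that writes one date to the log. The date value is a hypothetical example chosen to match the 1980 storms mentioned above; it is not read from the course data.

```sas
/* Sketch: write the same date with three DATE format widths. */
data _null_;
    d = '01JAN1980'd;    /* hypothetical sample date */
    put d date7.;        /* 01JAN80     : two-digit year */
    put d date9.;        /* 01JAN1980   : four-digit year */
    put d date11.;       /* 01-JAN-1980 : hyphens added */
run;
```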

* 4) Highlight the PROC FREQ step and run the selected *;

* code. Notice that the report includes the number *;

* of storms for each StartDate. *;

CODE:

proc freq data=pg1.storm_summary order=freq;
    tables StartDate;
    *Add a FORMAT statement;
run;

* 5) Add a FORMAT statement to apply the MONNAME. *;

* format to StartDate and run the PROC FREQ step. *;

* How many rows are in the report? *;

CODE:

proc freq data=pg1.storm_summary order=freq;
    tables StartDate;
    format StartDate monname3.;
run;
* Activity 3.07 *;

* 1) Modify the OUT= option in the PROC SORT statement *;

* to create a temporary table named STORM_SORT. *;

* 2) Complete the WHERE and BY statements to answer *;

* the following question: Which storm in the North *;

* Atlantic basin (NA or na) had the strongest *;

* MaxWindMPH? *;

proc sort data=pg1.storm_summary out=STORM_SORT;
    where Basin = 'na' or Basin = 'NA';
    by descending MaxWindMPH;
run;

* Activity 4.01 *;

* 1) Complete the DATA step to create a temporary *;

* table named STORM_NEW and read PG1.STORM_SUMMARY. *;

* Run the program and read the log. *;

data STORM_NEW;
set PG1.storm_summary;
run;
* 2) Define a library named out pointing to the output *;

* folder in the main course files folder. *;

libname out "/folders/myfolders/2. Datasets/Libraries";

* 3) Change the program to save a permanent version of *;

* STORM_NEW in the out library. Run the modified *;

* program. *;

data out.STORM_NEW;
set PG1.storm_summary;
run;
* Activity 4.03 *;

* 1) Change the name of the output table to *;

* STORM_CAT5. *;

* 2) Include only Category 5 storms (MaxWindMPH *;

* greater than or equal to 156) with StartDate on *;

* or after 01JAN2000. *;

* 3) Add a statement to include the following columns *;

* in the output data: Season, Basin, Name, Type, *;

* and MaxWindMPH. How many Category 5 storms *;

* occurred since January 1, 2000? *;


CODE:
libname out "/folders/myfolders/2. Datasets/Libraries";

data out.storm_cat5;
    set pg1.storm_summary;
    where MaxWindMPH >= 156 and StartDate >= MDY(1,1,2000);
    keep Season Basin Name Type MaxWindMPH;
run;
* Activity 4.04 *;

* 1) Add an assignment statement to create StormLength *;

* that represents the number of days between *;

* StartDate and EndDate. *;

data storm_length;
    set pg1.storm_summary;
    drop Hem_EW Hem_NS Lat Lon;
    StormLength = EndDate - StartDate;
run;

* 2) Run the program. In 1980, how long did the storm *;

* named Agatha last? => 6 Days

* Activity 4.05 *;

* 1) Open the PG1.STORM_RANGE table and examine the *;

* columns. Notice that each storm has four wind *;

* speed measurements. *;

* 2) Create a new column named WindAvg that is the *;

* mean of Wind1, Wind2, Wind3, and Wind4. *;

* 3) Create a new column WindRange that is the range *;


* of Wind1, Wind2, Wind3, and Wind4. *;

Code:
data storm_windavg;
    set pg1.storm_range;
    WindAvg = mean(Wind1, Wind2, Wind3, Wind4);
    WindRange = range(Wind1, Wind2, Wind3, Wind4);
run;

* Activity 4.06 *;

* 1) Add a WHERE statement that uses the SUBSTR *;

* function to include rows where the second letter *;

* of Basin is P (Pacific ocean storms). *;


* 2) Run the program and view the log and data. How *;

* many storms were in the Pacific basin? *;

CODE:

data pacific;
    set pg1.storm_summary;
    drop Type Hem_EW Hem_NS MinPressure Lat Lon;
    where substr(Basin, 2, 1)='P';
run;

* Activity 4.07 *;

* 1) Add the ELSE keyword to test conditions *;

* sequentially until a true condition is met. *;

* 2) Change the final IF-THEN statement to an ELSE *;

* statement. *;

data storm_cat;
    set pg1.storm_summary;
    keep Name Basin MinPressure StartDate PressureGroup;
    *add ELSE keyword and remove final condition;
    if MinPressure=. then PressureGroup=.;
    else if MinPressure<=920 then PressureGroup=1;
    else PressureGroup=0;
run;

proc freq data=storm_cat;
    where PressureGroup = 1;
    tables PressureGroup;
run;
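
A note on why the missing-value test matters in the DATA step above: SAS treats a missing numeric value as smaller than any number, so MinPressure<=920 is true when MinPressure is missing. The sketch below uses a single hypothetical value, not the course data, to show the effect.

```sas
/* Sketch: a missing numeric value satisfies a "less than or equal" test. */
data _null_;
    MinPressure = .;                                /* hypothetical missing value */
    if MinPressure <= 920 then PressureGroup = 1;   /* true for missing */
    else PressureGroup = 0;
    put PressureGroup=;                             /* PressureGroup=1 */
run;
```

Testing for the missing value first, before the ELSE IF chain, keeps such rows out of PressureGroup 1.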

* 3) How many storms are in PressureGroup 1? => 1 Storm

* Activity 4.08 *;

* 1) Run the program and examine the results. Why is *;

* Ocean truncated? What value is assigned when *;

* Basin='na'?

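=> Ocean is truncated because no LENGTH statement declares it, so SAS takes its length from the first assigned value, "Indian" (6 characters), which is shorter than the other ocean names.

=> If Basin is 'na', the value assigned is Pacific. The comparison is case sensitive, so the lowercase a matches neither "I" nor "A" and the final ELSE branch runs.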

* 2) Modify the program to add a LENGTH statement to *;

* declare the name, type, and length of Ocean *;


* before the column is created. *;

* 3) Add an assignment statement after the KEEP *;

* statement to convert Basin to uppercase. Run the *;

* program. *;

* 4) Move the LENGTH statement to the end of the DATA *;

* step. Run the program. Does it matter where the *;

* LENGTH statement is in the DATA step? *;

Ocean Length = 10

Basin = Uppercase

Code:
data storm_summary2;
    set pg1.storm_summary;
    where Basin = 'na';
    length Ocean $10;
    keep Basin Season Name MaxWindMPH Ocean;
    Basin = upcase(Basin);
    OceanCode=substr(Basin,2,1);
    if OceanCode="I" then Ocean="Indian";
    else if OceanCode="A" then Ocean="Atlantic";
    else Ocean="Pacific";
run;
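
The two effects in this activity, truncation and case sensitivity, can be reproduced without the course data. This is a minimal sketch; the variable names Ocean1, Ocean2, and Match are hypothetical and exist only for illustration.

```sas
/* Sketch: character length is fixed at first use; comparisons are case sensitive. */
data _null_;
    /* The first reference to Ocean1 fixes its length at 6 (from "Indian"). */
    if 0 then Ocean1 = "Indian";
    Ocean1 = "Atlantic";
    put Ocean1=;                 /* Ocean1=Atlant */

    /* Declaring the length before the first assignment leaves enough room. */
    length Ocean2 $ 10;
    Ocean2 = "Atlantic";
    put Ocean2=;                 /* Ocean2=Atlantic */

    /* The second letter of 'na' is a lowercase a, which does not equal "A". */
    OceanCode = substr('na', 2, 1);
    if OceanCode = "A" then Match = "yes";
    else Match = "no";
    put Match=;                  /* Match=no */
run;
```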
* Activity 4.09 *;

* Run the program. Why does the program fail? *;

Reason for failure: two DO blocks are never closed with END statements, which also leaves the ELSE statement without a matching IF-THEN clause.

* Activity 5.01 *;
* 1) In the program, notice that there is a TITLE *;
* statement followed by two procedures. Run the *;
* program. Where does the title appear in the *;
* output? *;
* 2) Add a TITLE2 statement above PROC MEANS to print *;
* a second line: *;
* Summary Statistics for MaxWind and MinPressure *;
Code:
title "Storm Analysis";
title2 "Summary Statistics for MaxWind and MinPressure";

proc means data=pg1.storm_final;
    var MaxWindMPH MinPressure;
run;

proc freq data=pg1.storm_final;
    tables BasinName;
run;
* 3) Add another TITLE2 statement above PROC FREQ with *;
* this title: Frequency Report for Basin *;
Code:

title "Storm Analysis";
title2 "Summary Statistics for MaxWind and MinPressure";

proc means data=pg1.storm_final;
    var MaxWindMPH MinPressure;
run;

title2 "Frequency Report for Basin";

proc freq data=pg1.storm_final;
    tables BasinName;
run;

* 4) Run the program. Which titles appear above each *;


* report? *;
* Activity 5.02 *;
* Notice that there are no TITLE statements in the *;
* code. Run the program. Does the report have *;
* titles? *;

No Title
* Activity 5.03 *;
* 1) This code creates a macro variable named oc that *;
* stores the text string Pacific. The oc macro *;
* variable is then used in the WHERE statement to *;
* subset the data. *;
* 2) Update the TITLE2 statement to use the macro *;
* variable. Run the program. *;
Code:
%let oc=Pacific;
ods noproctitle;
title 'Storm Analysis';
title2 "&oc";

proc means data=pg1.storm_final;
    where Ocean="&oc";
    var MaxWindMPH MinPressure;
run;

ods proctitle;
title;

* 3) Change the value of the macro variable to *;


* Atlantic and run the program again. *;
Code:
%let oc=Atlantic;
ods noproctitle;
title 'Storm Analysis';
title2 "&oc";

proc means data=pg1.storm_final;
    where Ocean="&oc";
    var MaxWindMPH MinPressure;
run;

ods proctitle;
title;
* Activity 5.04 *;
* 1) Modify the LABEL statement in the DATA step to *;
* label the Invoice column as Invoice Price. *;
Code:
data cars_update;
    set sashelp.cars;
    keep Make Model MSRP Invoice AvgMPG;
    AvgMPG=mean(MPG_Highway, MPG_City);
    label MSRP="Manufacturer Suggested Retail Price"
          AvgMPG="Average Miles per Gallon"
          Invoice="Invoice Price";
run;

proc means data=cars_update min mean max;
    var MSRP Invoice;
run;

proc print data=cars_update;
    var Make Model MSRP Invoice AvgMPG;
run;

* 2) Run the program. Why do the labels appear in the *;


* PROC MEANS report but not in the PROC PRINT *;
* report? Fix the program and run it again. *;
Code:
data cars_update;
    set sashelp.cars;
    keep Make Model MSRP Invoice AvgMPG;
    AvgMPG=mean(MPG_Highway, MPG_City);
    label MSRP="Manufacturer Suggested Retail Price"
          AvgMPG="Average Miles per Gallon"
          Invoice="Invoice Price";
run;

proc means data=cars_update min mean max;
    var MSRP Invoice;
run;

proc print data=cars_update label;
    var Make Model MSRP Invoice AvgMPG;
run;

* Activity 5.05 *;
* 1) Create an output table named STORM_COUNT by *;
* completing the OUT= option in the TABLES *;
* statement. *;

Code:
* 2) Run the program. Which data values are included *;
* in the output table? Which statistics are *;
* included? *;
* 3) Put StartDate and BasinName in separate TABLES *;
* statements. Add the OUT= option in each *;
* statement, and name the tables MONTH_COUNT and *;
* BASIN_COUNT. *;

Code:
title "Frequency Report for Basin and Storm Month";
proc freq data=pg1.storm_final order=freq noprint;
    tables BasinName / out=BASIN_COUNT;
    tables StartDate / out=MONTH_COUNT;
    format StartDate monname.;
run;

* 4) Run the program and examine the two tables. Which *;


* month has the highest number of storms? *;
Ans: September with 486 Storms
* Activity 5.06 *;
* 1) Add options to include N (count), MEAN, and MIN *;
* statistics. Round each statistic to the nearest *;
* integer. *;
Code:
proc means data=pg1.storm_final n mean min maxdec=0;
    var MinPressure;
    where Season >=2010;
run;

* 2) Add a CLASS statement to group the data by Season *;


* and Ocean. Run the program. *;
Code:
proc means data=pg1.storm_final n mean min maxdec=0;
    var MinPressure;
    class Season Ocean;
    where Season >=2010;
run;
* 3) Modify the program to add the WAYS statement so *;
* that separate reports are created for Season and *;
* Ocean statistics. Run the program. *;
* Which ocean had the lowest mean for minimum *;
* pressure? *;
ANS: Pacific
* Which season had the lowest mean for minimum *;
* pressure? *;
ANS: 2015
CODE:
proc means data=pg1.storm_final n mean min maxdec=0;
    var MinPressure;
    class Season Ocean;
    ways 1;
    where Season >=2010;
run;

***********************************************************;
* Activity 5.07 *;
* 1) Run the PROC MEANS step and compare the report *;
* and the wind_stats table. Are the same statistics *;
* in the report and table? What do the first five *;
* rows in the table represent? *;
ANS: The first five rows have _TYPE_ = 0. They hold the statistics for the whole table rather than for an individual BasinName.
CODE:
proc means data=pg1.storm_final mean median max;
    var MaxWindMPH;
    class BasinName;
    *ways 1;
    output out=wind_stats;
run;
* 2) Uncomment the WAYS statement. Delete the *;
* statistics listed in the PROC MEANS statement and *;
* add the NOPRINT option. Run the program. Notice *;
* that a report is not generated and the first five *;
* rows from the previous table are excluded. *;
ANS:
CODE:
proc means data=pg1.storm_final noprint;
var MaxWindMPH;
class BasinName;
ways 1;
output out=wind_stats;
run;
* 3) Add the following options in the OUTPUT statement *;
* and run the program again. How many rows are in *;
* the output table? *;
* output out=wind_stats mean=AvgWind max=MaxWind; *;
ANS:
CODE:
proc means data=pg1.storm_final noprint;
var MaxWindMPH;
class BasinName;
ways 1;
output out=wind_stats mean=AvgWind max=MaxWind;
run;
**************************************************;
* Activity 5.08 *;
* Run the program and examine the results to *;
* see examples of other procedures that *;
* analyze and report on the data. *;
**************************************************;
ANS:
