Time Series Analysis in Spark SQL
I presented a script to process data with a Time Series Analysis algorithm in SQL
and PL/SQL earlier. In this article, roughly 90% of the code is carried over from
that previous article and the remaining 10% is PySpark code.
I used the following URLs to install and set up Spark on my desktop.
https://www.youtube.com/watch?v=IQfG0faDrzE
https://www.youtube.com/watch?v=WQErwxRTiW0
http://media.sundog-soft.com/spark-python-install.pdf
There are slight variations in the way Time Series Analysis is performed from
presentation to presentation. For the SQL and PL/SQL programming, and for the
Spark SQL presented in this article, I mostly followed the video presentation
below on Time Series Analysis.
https://www.youtube.com/watch?v=HIWXdHlDSFs --TIME SERIES ANALYSIS
Questions to be answered:
01) Using the ratio-to-moving-average method, calculate seasonally adjusted indices
for each quarter.
02) Obtain a regression trend line representing the above data.
03) Obtain a seasonally adjusted trend estimate for the 4th quarter of 2011.
Please modify the code with the location of the CSV file on your machine.
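The layout of the CSV file is not reproduced here; based on how the columns are
used below, it is assumed to contain one row per year, with the year in the first
column followed by four quarterly value columns (e.g. Year, Q1, Q2, Q3, Q4).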
"""
#spark-submit.cmd python/pysparkTimeSeriesAnalysis.py
from pyspark.sql import SparkSession
from pyspark import SparkContext,SparkConf
from pyspark.sql.functions import *
from pyspark.sql.window import Window
spark = SparkSession.builder.appName("TimeSeriesAnalysis") \
.master("local[*]").getOrCreate()
spark.conf.set("spark.sql.debug.maxToStringFields",100)
spark.conf.set("spark.sql.crossJoin.enabled", "true") #To enable cartesian product
in sql
df = spark.read.csv("e:/data/TimeSeries.csv",inferSchema=True,header=True)
df.printSchema() #printSchema() prints the schema itself, so it does not need to be wrapped in print()
print(df.columns)
df.show()
#trim alternative: when a DataFrame column name contains spaces,
#select that column by position with df.columns[<position number>]
df.select(df.columns[3]).show()
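#Only a sketch (not part of the original flow): if the column names contain
#spaces, they can also be normalised once so that the columns can later be
#referenced by name instead of by position.
df_renamed = df
for c in df_renamed.columns:
    df_renamed = df_renamed.withColumnRenamed(c, c.strip().replace(" ", "_"))
df_renamed.printSchema()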
#unpivot the table, i.e. rotate the quarter columns into rows
#a crosstab-style reshape can be implemented with the "explode" function
df = df.select(array(col(df.columns[1]), col(df.columns[2]),
                     col(df.columns[3]), col(df.columns[4])).alias("val"))
df = df.withColumn("val", explode(col("val")))
df.show()
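#Only a sketch of an alternative unpivot using Spark SQL's stack() function;
#it re-reads the CSV because df has just been overwritten with the exploded
#values, and it assumes the quarterly values sit in columns 1..4.
raw = spark.read.csv("e:/data/TimeSeries.csv", inferSchema=True, header=True)
stack_expr = "stack(4, " + ", ".join(
    "'{0}', `{0}`".format(c) for c in raw.columns[1:5]
) + ") as (quarter, val)"
raw.selectExpr(stack_expr).show()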
w = Window().orderBy("val")
df.select("*", row_number().over(w).alias("id")).show() #add a rownum/row_num/rowid to the output data; numbering starts from "1"
df2 = df.withColumn("rowid", row_number().over(Window.orderBy(monotonically_increasing_id())) + 0) #rownum starts from "1"
df2.show()
df = df.repartition(1).withColumn("rnum", monotonically_increasing_id() + 1) #add a rownum to the output data
#by default monotonically_increasing_id starts with "0"; add "+ 1" to start with "1"
df.select("*").show() #rnum in the output now starts from "1"
df.registerTempTable("DF")
spark.catalog.cacheTable("DF")
spark.sql("select rnum + 1 as id,val from DF").show() #you can add "+ 1" while
selecting data from the DataFrame also
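#Note: registerTempTable is deprecated in Spark 2.x; the equivalent call there is
#df.createOrReplaceTempView("DF")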
#######################################
spark.sql("""
with
frqma01 as (select round(avg(val),2) frqma_val from DF where rnum>=1 and rnum<=4 ),
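#The full CTE chain is not shown above; what follows is only a rough sketch of
#the same ratio-to-moving-average step using window functions, not the original
#query. Edge rows use partial windows here and would be excluded in the full
#calculation of the seasonal indices.
spark.sql("""
with ma as (
  select rnum, val,
         avg(val) over (order by rnum rows between 2 preceding and 1 following) as ma4_back,
         avg(val) over (order by rnum rows between 1 preceding and 2 following) as ma4_fwd
  from DF
)
select rnum, val,
       round((ma4_back + ma4_fwd) / 2, 2)         as centred_ma,
       round(val / ((ma4_back + ma4_fwd) / 2), 4) as ratio_to_ma
from ma
order by rnum
""").show()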
spark.stop()
#References:
#http://www.orafaq.com/node/3187 "TIME SERIES ANALYSIS IN SQL AND PL/SQL"
#http://www.orafaq.com/node/3204
#https://stackoverflow.com/questions/33742895/how-to-show-full-column-content-in-a-spark-dataframe