Introduction to Scala
Goal
• Our goal here is to learn enough Scala to write
Spark code
What is Scala?
• Scala is a general-purpose programming language
• Concise – like Python
• Scala source code compiles to Java bytecode that
runs on a Java virtual machine
• Language interoperability with Java
Features of Scala
• Scala ≡ Scalable Language
• Good support for functional programming
– higher-order functions, immutable values, lazy evaluation, optimization, pattern matching
• Good support for object-oriented programming
• A strong type system
• Implicits
– code that is concise and easier to understand
Why Learn Scala for Big Data
• Provides a boost to your professional career
• Write robust code with few bugs
• Spark is written in Scala
• Best support for Spark
• Faster Spark code
Installing Scala
• Make sure you have Java 8 or newer
• Download the Scala Binaries from
https://www.scala-lang.org/download/
• Add the scala\bin subdirectory to the system PATH
• On Windows (Coursier install): ~\AppData\Local\Coursier\data\bin
Using Scala
• To start Scala REPL, type scala at the command prompt
• To quit, type :quit, or :q
• REPL ≡ Read, Evaluate, Print, Loop
Alternative Way of Using Scala
• Use a text editor to type your code as a singleton object
• Assume we saved the following program as HelloWorld.scala
object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello World")
  }
}
• Then compile it from the command prompt:
c:\>scalac HelloWorld.scala
• The necessary bytecode (.class) files will be created. To run the program:
c:\>scala HelloWorld
Using an IDE with Scala
• You could use Eclipse or IntelliJ
• Instructions for IntelliJ at https://docs.scala-lang.org/getting-started-intellij-track/getting-started-with-scala-in-intellij.html
• For Eclipse
– Download from scala-ide.org
– Instructions at http://scala-ide.org/docs/current-user-doc/gettingstarted/index.html
Basic Types
Variable Type    Description
Byte             8-bit signed integer
Short            16-bit signed integer
Int              32-bit signed integer
Long             64-bit signed integer
Float            32-bit single precision float
Double           64-bit double precision float
Char             16-bit unsigned Unicode character
String           A sequence of Chars
Boolean          true or false
Basic Types (cont.)
• Scala has 7 numeric types and a Boolean type
• Each type in Scala is implemented as a class
• We can invoke methods on numbers
– 1.toString() //yields the string "1"
– 99.44.toInt //yields 99
– 1.to(10) // yields the Range(1, 2, 3, …, 10)
– 2.3.getClass.getSimpleName //res26: String = double
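The method calls listed above can be collected into a small sketch. The object name NumberMethodsDemo is hypothetical (not from the slides), and the comments show the values we would expect on a standard JVM.

```scala
object NumberMethodsDemo {
  val asText: String = 1.toString                       // "1"
  val truncated: Int = 99.44.toInt                      // 99 (the fraction is dropped)
  val oneToTen: Range = 1.to(10)                        // 1, 2, ..., 10 (inclusive)
  val className: String = 2.3.getClass.getSimpleName    // "double"

  def main(args: Array[String]): Unit = {
    println(asText)
    println(truncated)
    println(oneToTen.sum) // 55
    println(className)
  }
}
```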
Variables
• mutable vs. immutable
• Use var to declare a mutable variable
var x = 10
x = 20
• Use val to declare an immutable value
val x = 10
x = 20 //error: reassignment to val
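A minimal sketch of the var/val distinction. The names VariablesDemo, counter, and limit are made up for illustration; the commented-out line is the one the compiler would reject.

```scala
object VariablesDemo {
  // Returns the final values of a mutable and an immutable variable.
  def run(): (Int, Int) = {
    var counter = 10   // mutable: may be reassigned
    counter = 20
    val limit = 10     // immutable: cannot be reassigned
    // limit = 20      // would not compile: reassignment to val
    (counter, limit)
  }

  def main(args: Array[String]): Unit =
    println(run()) // (20,10)
}
```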
Remarks
• A semicolon at the end of a statement is optional
val y = 10;
val y = 10 // Equivalent
• The compiler infers types wherever possible
val y = 10 // type Int is inferred
val y: Int = 10 // explicit type annotation
• Scala is a statically typed language, so everything has a type
Lazy Values
• When a val is declared as lazy, its initialization is
deferred until it is needed
• Syntax:
lazy val x = 10
• Initialization takes place the first time x is used
print(x + 5)
val y = x + 2
• Note: only vals can be lazy
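To make the deferred initialization visible, the sketch below counts how often the initializer runs. The counter initCount and the object name LazyValDemo are assumptions added for illustration only.

```scala
object LazyValDemo {
  // Returns (initializer runs before first use, value of x + 5, initializer runs after two uses).
  def run(): (Int, Int, Int) = {
    var initCount = 0
    lazy val x = { initCount += 1; 10 } // the block runs only on first access
    val before = initCount              // x has not been touched yet
    val sum = x + 5                     // first use: initializer runs now
    val y = x + 2                       // second use: cached value, no re-run
    (before, sum, initCount)
  }

  def main(args: Array[String]): Unit =
    println(run()) // (0,15,1)
}
```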
Arithmetic and Operator Overloading
• Scala has the same arithmetic operators as other
languages: + - * / %
• In Scala, operators are actually methods
a+b
is a shorthand for
a.+(b)
• In general, these are equivalent:
a method b
a.method(b)
• So, 1.to(10) can be written as 1 to 10
• There is no ++ or -- increment/decrement operator in Scala; use += 1 or -= 1 instead
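The equivalence between operator notation and method calls can be checked directly. This is a small sketch; OperatorDemo and increment are hypothetical names.

```scala
object OperatorDemo {
  val a = 7
  val b = 3
  val infixSum: Int = a + b      // operator notation
  val methodSum: Int = a.+(b)    // the same call written as a method invocation
  val infixRange = 1 to 10       // infix form of a one-argument method
  val methodRange = 1.to(10)

  // There is no ++ for numbers: update a var with += instead.
  def increment(start: Int): Int = {
    var counter = start
    counter += 1
    counter
  }

  def main(args: Array[String]): Unit = {
    println(infixSum == methodSum)     // true
    println(infixRange == methodRange) // true
    println(increment(5))              // 6
  }
}
```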
More about Calling Methods
• When calling a method that has no parameters, don't write parentheses after the method's name
• Ex: the method sorted yields a new string with the letters in sorted order
"Bonjour".sorted // Yields the string "Bjnooru"
• The rule of thumb is that a parameterless method that doesn't modify the object has no parentheses
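A short sketch of the rule of thumb. The Temperature class and its fahrenheit method are hypothetical, defined here only to show a user-written parameterless method that does not modify its object.

```scala
object ParameterlessDemo {
  val sortedLetters: String = "Bonjour".sorted // "Bjnooru"
  val letterCount: Int = "Bonjour".length      // 7

  // Hypothetical class: fahrenheit is declared (and called) without parentheses.
  class Temperature(val celsius: Double) {
    def fahrenheit: Double = celsius * 9 / 5 + 32
  }

  val boiling = new Temperature(100.0)
  val boilingF: Double = boiling.fahrenheit    // 212.0

  def main(args: Array[String]): Unit = {
    println(sortedLetters)
    println(boilingF)
  }
}
```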
Importing Packages
• Packages serve the same purpose as packages in Python & Java or namespaces in C++
• They let us avoid naming conflicts, and imports let us write shorter code without the package prefix
• To print the square root of 4:
print(scala.math.sqrt(4)) // without an import statement; prints 2.0
import scala.math.sqrt
print(sqrt(4)) // with an import statement
• To import everything from a package use _
import scala.math._
• To import more than one member from a package use
import scala.math.{max, min, cos, Pi}
Importing Packages (cont.)
• We can rename members:
import scala.math.{max => maximum}
print(maximum(2, 3))
• Every Scala program implicitly starts with
import java.lang._
import scala._
import Predef._
• If a package name starts with scala., you can omit the scala prefix
math.sqrt is equivalent to scala.math.sqrt
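The import forms from the last two slides can be combined in one sketch. The object name ImportDemo is hypothetical, and the imports are placed inside the object only to keep the example self-contained.

```scala
object ImportDemo {
  import scala.math.sqrt                       // a single member
  import scala.math.{max => maximum, min, Pi}  // several members, one of them renamed

  val fullyQualified: Double = scala.math.sqrt(4) // no import needed
  val imported: Double = sqrt(4)                  // uses the import
  val shortPrefix: Double = math.sqrt(4)          // the scala. prefix is omitted
  val renamed: Int = maximum(2, 3)                // max under its new name
  val smallest: Int = min(2, 3)
  val halfTurn: Double = Pi

  def main(args: Array[String]): Unit = {
    println(imported) // 2.0
    println(renamed)  // 3
  }
}
```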
The apply Method
• Ex: val s = "Hello"
s(4) // yields 'o'
• s(4) is shorthand for s.apply(4): the () operator is implemented by the method apply
The apply Method
• Likewise, BigInt("1234567890") is a shortcut for
BigInt.apply("1234567890")
• It yields a new BigInt object
BigInt("1234567890") * BigInt("112358111321")
• Using the apply method of a companion object is a common Scala idiom for constructing objects
• For example, Array(1, 4, 9, 16) returns an array
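The sketch below puts the shorthand and the explicit apply call side by side. The Point class and its companion are hypothetical, added to show how the same idiom might be written for one's own types.

```scala
object ApplyDemo {
  val s = "Hello"
  val viaSugar: Char = s(4)        // 'o'
  val viaApply: Char = s.apply(4)  // the same call, spelled out

  val bigSugar = BigInt("1234567890")
  val bigApply = BigInt.apply("1234567890")

  val squares = Array(1, 4, 9, 16) // Array.apply builds the array

  // Hypothetical class with a companion object that offers apply.
  class Point(val x: Int, val y: Int)
  object Point {
    def apply(x: Int, y: Int): Point = new Point(x, y)
  }
  val p = Point(3, 4)              // shorthand for Point.apply(3, 4)

  def main(args: Array[String]): Unit = {
    println(viaSugar == viaApply) // true
    println(squares(2))           // 9
    println(p.x + p.y)            // 7
  }
}
```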
Constructs have Values
In Java or C++
• An expression has a value
– 2+3
• A statement carries out an action
– if statement, assignment statement
• In Scala, almost all constructs have values
– an if expression has a value
– a block has a value: the value of its last expression
• Benefit: concise and more readable code
Conditional Statements
• If/else has the same syntax as in Java/C++
• In Scala, an if/else has a value, namely the value of the expression that follows the if or the else
if (x > 0) 1 else -1
• So we can initialize a val directly:
val s = if (x > 0) 1 else -1
• This has the same effect as the following, except that s must then be declared as a var beforehand:
if (x > 0) s = 1 else s = -1
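Both styles can be written as small functions to confirm they agree. This is a sketch; sign and signWithVar are hypothetical names.

```scala
object IfExpressionDemo {
  // The if/else is an expression, so its value is the function result.
  def sign(x: Int): Int = if (x > 0) 1 else -1

  // Statement style: needs a var that is assigned in each branch.
  def signWithVar(x: Int): Int = {
    var s = 0
    if (x > 0) s = 1 else s = -1
    s
  }

  def main(args: Array[String]): Unit = {
    println(sign(5))         // 1
    println(signWithVar(-3)) // -1
  }
}
```

The expression form needs no mutable variable, which is why it is usually preferred.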
Type/Class Any
• In Scala, every expression has a type
• if (x > 0) 1 else -1
has type Int
• What is the type of
if (x > 0) "positive" else -1
• The type of a mixed-type expression is the
common supertype of both branches
• The common supertype of String and Int is called
Any
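A minimal sketch of a mixed-type if expression. The result type is annotated as Any explicitly so the example does not depend on what a particular compiler version infers; describe and kind are hypothetical names.

```scala
object AnyDemo {
  // One branch is a String, the other an Int; Any is a common supertype of both.
  def describe(x: Int): Any = if (x > 0) "positive" else -1

  // Pattern matching recovers the runtime type of an Any value.
  def kind(value: Any): String = value match {
    case _: String => "String"
    case _: Int    => "Int"
    case _         => "other"
  }

  def main(args: Array[String]): Unit = {
    println(kind(describe(5)))  // String
    println(kind(describe(-5))) // Int
  }
}
```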
Figure: The Inheritance Hierarchy of Scala Classes
Type/Class Unit
• What is the type of
if (x > 0) 1
• If x is not positive, this if expression yields no value
• Since every expression should have a value, Scala introduces a class Unit that has exactly one value, meaning “no value”, written as ()
• The above if expression is equivalent to
if (x > 0) 1 else ()
• Think of () as a placeholder for “no useful value,” and of Unit as an analog of void in Java/C++
• The common supertype of Int and Unit is AnyVal
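The explicit else () form can be tried out as follows. The result type AnyVal is written out as an annotation rather than left to inference, and the names UnitDemo and maybeOne are hypothetical.

```scala
object UnitDemo {
  // Int in one branch, Unit in the other: AnyVal is their common supertype.
  def maybeOne(x: Int): AnyVal = if (x > 0) 1 else ()

  val unitValue: Unit = () // the only value of type Unit

  def main(args: Array[String]): Unit = {
    println(maybeOne(5))  // 1
    println(maybeOne(-5)) // ()
  }
}
```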
Block Expressions and Assignments
• {} makes a block of code
• The value of a block is that of the last expression
inside it
• In Scala, an assignment has no useful value: its value is (), of type Unit
• So, a block that ends with an assignment statement has a Unit value
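The two cases, a block ending in an expression and a block ending in an assignment, might look like this. The function names distance and assignmentBlock are made up for the sketch.

```scala
object BlockDemo {
  // The block's value is its last expression, the square root.
  def distance(x: Double, y: Double, x0: Double, y0: Double): Double = {
    val dx = x - x0
    val dy = y - y0
    math.sqrt(dx * dx + dy * dy)
  }

  // A block that ends with an assignment has the Unit value ().
  def assignmentBlock(): (Unit, Int) = {
    var r = 1
    val blockValue: Unit = { r = r * 5 }
    (blockValue, r)
  }

  def main(args: Array[String]): Unit = {
    println(distance(3, 4, 0, 0)) // 5.0
    println(assignmentBlock())    // ((),5)
  }
}
```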
Remark on Chained Assignments
• Do not use chained assignments in Scala
x = y = 1 // No
• After y = 1 executes, the value of y is 1
• But the expression y = 1 itself has the Unit value ()
• Even where the syntax allows it, x would be assigned the Unit value, not 1
Input and Output
• print() and println()
val x = 3; val y = 5
println(x + y) //outputs: 8
• Scala has printf() with a C-style syntax
val name = "Mark"; val age = 5
printf("Hello %4s! You are %5d years old.\n", name, age)
//Hello Mark! You are     5 years old.
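To inspect what the width directives do, the same format string can be applied with the format method and the result examined as a value. This is a sketch; PrintfDemo is a hypothetical name.

```scala
object PrintfDemo {
  val name = "Mark"
  val age = 5

  // %4s pads the string to at least 4 characters, %5d pads the integer to 5.
  val line: String = "Hello %4s! You are %5d years old.".format(name, age)

  def main(args: Array[String]): Unit = {
    printf("Hello %4s! You are %5d years old.\n", name, age)
    println(line) // the same text, built as a value
  }
}
```

Since age has a single digit, %5d pads it on the left with four spaces.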
String Interpolations
• We can also use string interpolation
1. The f Interpolator (f-Strings)
– simple formatted strings; each variable reference can be followed by a printf-style format directive
• A formatted string can contain expressions and format directives
String Interpolations (cont.)
• With prefix s, a string can contain expressions but not format directives. Escape sequences are evaluated.
• With a prefix of raw, neither escape sequences nor format directives are evaluated
val name = "Mark"; val age = 5
print(f"Hello, $name! In six months, you'll be ${age + 0.5}%7.2f years old.%n")
//Hello, Mark! In six months, you'll be    5.50 years old.
print(s"Hello, $name! In six months, you'll be ${age + 0.5}%7.2f years old.%n")
//Hello, Mark! In six months, you'll be 5.5%7.2f years old.%n
print(s"Hello, $name! In six months, you'll be ${age + 0.5}%7.2f years old.\n")
//Hello, Mark! In six months, you'll be 5.5%7.2f years old.
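The three interpolators can be compared on short strings. The sketch below is illustrative only, InterpolationDemo is a hypothetical name, and the decimal separator produced by f depends on the JVM's default locale.

```scala
object InterpolationDemo {
  val name = "Mark"
  val age = 5

  // f: the format directive is applied, so the number is padded to width 7.
  val formatted: String = f"${age + 0.5}%7.2f"

  // s: the expression is evaluated, but %7.2f stays as literal text.
  val plain: String = s"${age + 0.5}%7.2f"

  // s evaluates the escape \t; raw keeps the backslash and the t as two characters.
  val sText: String = s"a\tb$name"
  val rawText: String = raw"a\tb$name"

  def main(args: Array[String]): Unit = {
    println("[" + formatted + "]")
    println(plain) // 5.5%7.2f
    println(sText.length + " vs " + rawText.length) // 7 vs 8
  }
}
```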
Loops
• Scala has the same while loop as Java/C++
• Scala 2 also has the do while loop; Scala 3 drops it in favor of the while { ...; condition } do (...) form shown second
var i = 5; var summation = 0
while (i > 0) {
  summation += i
  i -= 1
}
print(summation) //15
// 5 + 4 + 3 + 2 + 1
var i = 0; var summation = 0
while {
  i += 1
  i <= 5
} do (summation += i)
print(summation) //15
// 1 + 2 + 3 + 4 + 5
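The first loop, wrapped in a function so it can be reused and checked; this form compiles under both Scala 2 and Scala 3. The names WhileDemo and sumDown are hypothetical.

```scala
object WhileDemo {
  // Adds n + (n - 1) + ... + 1 with a while loop.
  def sumDown(n: Int): Int = {
    var i = n
    var summation = 0
    while (i > 0) {
      summation += i
      i -= 1
    }
    summation
  }

  def main(args: Array[String]): Unit =
    println(sumDown(5)) // 15
}
```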
Loops - for
• Scala does not have the C++/Java for (init; test; update) loop
• Instead, one can use this kind of loop
for (i <- 1 to n)
  do something
• More generally, i can range over the values of any suitable expression
for (i <- expr)
  do something
Loops – for (Examples)
for (i <- 1 to 5)
  print(i + " ")
// 1 2 3 4 5
val s = "ABC"
for (ch <- s)
  print(ch + " ")
//A B C
val s = "ABC"
var result = 0
for (i <- 0 to s.length - 1) // or: 0 until s.length
  result += s(i)
print("result is " + result)
//result is 198 (65 + 66 + 67, the character codes of A, B, C)
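The same loops, written as functions that return their results instead of printing them. This is a sketch: joinRange and charCodeSum are hypothetical names, and until is used in place of to s.length - 1.

```scala
object ForLoopDemo {
  // Collects 1..n separated by spaces, as the first loop would print them.
  def joinRange(n: Int): String = {
    val sb = new StringBuilder
    for (i <- 1 to n)
      sb.append(s"$i ")
    sb.toString
  }

  // Sums the character codes of a string by index.
  def charCodeSum(s: String): Int = {
    var result = 0
    for (i <- 0 until s.length) // until excludes the upper bound
      result += s(i)
    result
  }

  def main(args: Array[String]): Unit = {
    println(joinRange(5))       // 1 2 3 4 5
    println(charCodeSum("ABC")) // 198
  }
}
```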
For Comprehension
• If the body of the for loop starts with yield, the loop constructs a collection of values, one for each iteration:
for (i <- 1 to 10) yield i % 3
// Yields Vector(1, 2, 0, 1, 2, 0, 1, 2, 0, 1)
// res104: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 0, 1, 2, 0, 1, 2, 0, 1)
• This type of loop is called a for comprehension
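A for comprehension can be assigned to a val like any other expression. The sketch below repeats the slide's example and adds a second one over a String; ForComprehensionDemo and the lowercasing example are assumptions added for illustration.

```scala
object ForComprehensionDemo {
  // One result per iteration, collected into an indexed sequence.
  val remainders = for (i <- 1 to 10) yield i % 3

  // Iterating over a String and yielding Chars builds a new String.
  val lowered = for (ch <- "ABC") yield ch.toLower

  def main(args: Array[String]): Unit = {
    println(remainders) // Vector(1, 2, 0, 1, 2, 0, 1, 2, 0, 1)
    println(lowered)    // abc
  }
}
```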