Haskell Assignment 2024-25
Haskell Assignment 2024-25
Due Date: Sunday 5th January 2025 (20% of total exam marks)
Define the following Haskell functions. Always include the function signature. (Note: the
marks displayed sum up to 100)
1. is_square. [4]
Define a function is_square that takes an integer (positive) and returns True if the number is
the square of an integer number. DO NOT use the SQRT function.
2. freq_letter_pc [6]
Define a function freq_letter_pc, that displays the percentage of each letter instead of the
number of occurrences (do not count symbols other than [a..z]). Transform the text in small case
first. Example:
freq_letter_pc “this is text”
Output:
[(t,0.3),(s,0.2),(i,0.2),(h,0.1),(e,0.1),(x,0.1)]
A small database contains two tables, one is a list of cities, identified by a city_id, city_name,
city_population and country_ id. The other table is a list of countries
(country_id,country_name). Table country is related by a 1 to many relation with table cities,
country_id is the foreign key. The following is the content of the table:
Cities = [(1,”Paris”,7000000,1),(2,”London”,8000000,2),
(1,”Rome”,3000000,3),(1,”Edinburgh”,500000,2),(1,”Florence”,500
00,3),(1,”Venice”,200000,3),(1,”Lyon”,1000000,1),(1,”Milan”,30000
00,3), (1, ”Madrid”,6000000,4),(1,”Barcelona”,5000000,4),]
countries = [(1,”UK), (2,”France”), (3,”Italy”), (4,”Spain”)]
a. get_city_above n – to get all the names of the cities with a population above n (=input)
get_city_above 6000000 = [“Paris”,”London”,”Madrid”]
b. get_city country_name – to get all the cities given a country_name (not country_
id! You need to have a function that, given the country_name gets the country_id)
Create a function eucl_dist to compute the euclidean distance between two vectors. Each
vector is represented by a list of float number of unknown length. The two lists must have
the same number of elements.
The distance between list x and y is defined as follows: #
𝑑𝑖𝑠𝑡(𝑥, 𝑦) = (𝑥 − 𝑦 ) + (𝑥 − 𝑦 ) + . . +(𝑥 − 𝑦 )
The file lang.hs contains two lists of 26 float numbers each. The lists represent the
frequency of each letter in English (list eng_freq) and Portuguese (pt_freq). For instance, the
letter "b", the second letter in the alphabet, has a frequency of 1.492% in English and 1.04%
in Portuguese.
Write a Haskell function get_lang that gets a text and print the message "The text is in
English" if the text is written in English or "The text is in Portuguese" if the text is written in
Portuguese (as a simplification, we only use text containing the 26 basic letters, without
accents or other phonetic symbols). In order to identify the language, compute the
frequency distribution of the letters in the text, store it into a list and check if the
distribution is closer to eng_freq or pt_freq. You can use the function defined at exercise 2
and 6. Transform the text in small case first.
Using the function readfile, create a version of the program running from command line that
takes a file as an input (command line parameter) and detect the language of the file.
6. Caesar Cipher [5]
A (simplified) Caesar cypher is one of the oldest and simplest forms of encryption. Given a
positive integer number n (called the index) and a string of text (for simplicity, we only work
with sentences containing small letters and the space symbol), each letter is shifted by n.
If, for instance, n=3, then “a” becomes “d”, “b” -> “e” , “c” -> “f” and so on…
It is circular, therefore “x” becomes “a”, “y” -> “b” and “z” -> “c”. The space character is
unchanged.
The file encrypt.hs is the Haskell function (attached to this assignment) to read a file and
encrypt it. The function has two inputs: the name of the file and the index (number used to
shift the letters). All the text is converted to lower case before doing the encryption using
the function toLower.
You are required to write a function c_decrypt that takes a file encrypted with a Caesar
cypher, an integer positive number (the index of the cypher) and generate the decrypted
file.
Suppose that we receive an encrypted message, and we do not know the value of the index
n used to generate the message. We just know that the message was written in English.
We need to identify the value of the index. Since we know that the original message was in
English, we can identify the index by using a dictionary of English words. The idea is to try to
decrypt the messages with all the 26 possible indexes, and select the one that generates a
message with the highest number of English words.
You are required to write an Haskell function to read the three files and save into a single file
dict.txt all the distinct unique words in the novels.
You are required to write an Haskell function to compute the value of the area in grey (a
quarter of a circle).
You have to approximate the value of the area using a simple Montecarlo simulation. The
idea is to divide the square of side 1 into many points and count the proportion of points that
are inside the grey area, as depicted in the picture below.
then
math_series sample_series 4
goes to 𝜋 when k goes to infinity, by showing that the greater is 𝑘, the closer the sum of the
series gets to 𝜋.
x1 , x2 = the two integration intervals (x1 < x2), float numbers as well n = the number of
The integral is an approximation of the real integral, and it is the sum of the yellow
rectangles as shown in the picture below, where we use the function 𝑓 = 𝑥 and 𝑥 =
1, 𝑥 = 20, 𝑛 = 9. Note that the precise value of the integral is 99.75.
integral f 1 20 9 = 89.72222 (the sum of the area of the 9 rectangles with same
width).
Note that f (or any other function you want to use) must be defined. In our example f is
defined as follows (but it can be any function returning a real number): f x = 0.5*x