ENGG1811 Assignment 1: Fault Detection: Version: v1.03 On 11 March 2020. Updates
Due date: 5pm, Friday 3 April (week 7). Late submissions will be penalised at the
rate of 10% per day. The penalty applies to the maximum available mark.
Submissions will generally not be accepted after 5pm, Monday 6 April, 2020.
Version: v1.03 on 11 March 2020.
Updates:
Fault detection
Automatic detection of faults can be found in many engineering systems. There are
systems to automatically diagnose faults in engines, chemical plants, power
generation plants, robotic arms and so on.
In this assignment, you will write Python programs to perform fault detection. The
aim of your program is to process data sequences of solar irradiance and power to
determine whether there are faults and if so, when they have occurred.
Note that we chose the word inspired earlier because we have adapted the fault
detection problem in [1] as a programming assignment by simplifying and liberally
changing a few aspects of the original problem. In particular, we have made
changes so that, in this assignment, you will have to use the various Python
constructs that you have learnt. This means a few details of this assignment may
not be realistic in engineering terms, but on the whole, you will still get a taste of
how programming can be used to perform automatic fault detection.
Learning objectives
Prohibition
The algorithm uses two sets of measurements. The first is the amount of solar
irradiance which is the quantity of solar radiation falling on the solar panels. The
second is the amount of electrical power generated by the solar panels; we will
simply refer to that as power or power generated.
The key idea of the fault detection algorithm is to use the measured irradiance and
power to determine whether a fault has occurred. For a given amount of irradiance,
the algorithm uses a model (which in this case is a formula) to predict the
amount of power the PV plant is expected to generate. After that, the algorithm
compares the power predicted by the formula against the measured power. If the
difference between these two quantities is too big then the algorithm will decide
that a fault has occurred.
This section describes the requirements on the fault detection algorithm that you
will be programming in this assignment. You should be able to implement these
requirements by using only the Python skills that you have learnt in the first four
weeks of lectures in this course.
We begin by describing the data that the algorithm will operate on. We will use
the following Python code as an example and will refer to it as the sample
code. Note that the data and parameter values in the
sample code are for illustration only; your code should work with any allowed
input data and parameter values.
In the sample code, there are two data series which contain, respectively, the
irradiance and power measurements. Both series are Python lists whose entries are
of the float type. Their variable names are irradiance_time_series and
power_time_series. The irradiance is measured in watts per square metre (W/m^2)
and the power generated is measured in kilowatts (kW).
In the sample code, the irradiance and power measurements were collected once
every 12 and 60 minutes respectively. These values are stored in the
variables irradiance_sampling_time and power_sampling_time.
(Remark: In [1], the irradiance measurements were taken once every 5s, which is a
more realistic sampling time. We have chosen a sampling time of 12 minutes for
irradiance so that the length of the list irradiance_time_series will not be
exceedingly long in this example.)
We break the algorithm down into a number of steps. The first step is to compute
the average of the irradiance data.
Since irradiance and power were measured every 12 and 60 minutes, respectively,
there are 5 irradiance samples within the duration of a power sample. We
assume that the first power measurement power_time_series[0] corresponds to the
first 5 irradiance measurements:
irradiance_time_series[0], irradiance_time_series[1], irradiance_time_series[2],
irradiance_time_series[3], irradiance_time_series[4].
The second power measurement power_time_series[1] corresponds to the next 5
irradiance measurements:
irradiance_time_series[5], irradiance_time_series[6], irradiance_time_series[7],
irradiance_time_series[8], irradiance_time_series[9].
Similarly for power_time_series[2] and power_time_series[3].
Since we can only make (complete) correspondences between the first 4 power
measurements and the first 20 irradiance measurements, we will only use these
measurements for fault detection.
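The averaging step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name average_by_segment, its parameters and the data values are hypothetical and are not taken from the template files or the sample code.

```python
def average_by_segment(irradiance, samples_per_segment, num_segments):
    """Average consecutive, non-overlapping groups of irradiance samples.

    Hypothetical helper for illustration; not the template's function.
    """
    averages = []
    for segment in range(num_segments):
        start = segment * samples_per_segment
        chunk = irradiance[start:start + samples_per_segment]
        averages.append(sum(chunk) / samples_per_segment)
    return averages


# Hypothetical data: 3 irradiance samples per power sample, 2 complete
# segments; the trailing sample (999.0) has no complete segment and is ignored.
example = [100.0, 110.0, 120.0, 200.0, 210.0, 220.0, 999.0]
print(average_by_segment(example, 3, 2))  # [110.0, 210.0]
```

Note that no rounding is applied to the averages, in line with the instruction not to round any calculation.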
Note that we rounded the numbers in the last column to 2 decimal places for
display only. You should not be rounding any of your calculations in this
assignment.
(Use the average irradiance and model to predict the expected power generated)
The next step is to use the average irradiance in each segment to predict the
expected amount of power generated. To do that, we use a model (which in this
case is a formula) to calculate the expected power from irradiance. We first define
some notation:
P = G (a0 + a1 G + a2 log(G))
By using the values of a0, a1 and a2 from the sample code, and the average
irradiance calculated earlier, we can calculate the expected power generated for
each time segment:
Segment    Average irradiance    Predicted power generated (kW)
number     (W/m^2)               (rounded to 2 decimal places for display only)
0          251.54                27.98
1          288.76                32.61
2          374.52                43.69
3          482.00                58.38
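As a rough sketch, the formula P = G (a0 + a1 G + a2 log(G)) could be evaluated as below. Several things here are assumptions: that P is the predicted power and G is the average irradiance, that log denotes the natural logarithm, and that the parameters arrive as a list in the order [a0, a1, a2]. The function name predict_power and the parameter values in the usage lines are hypothetical; they are not the values from the sample code and will not reproduce the table above.

```python
import math


def predict_power(irradiance_average, model_para):
    """Evaluate P = G * (a0 + a1*G + a2*log(G)) for one average irradiance G.

    Assumes log is the natural logarithm and model_para is [a0, a1, a2].
    """
    a0, a1, a2 = model_para
    g = irradiance_average
    return g * (a0 + a1 * g + a2 * math.log(g))


# Hypothetical parameters chosen so the result is easy to check by hand:
# with G = 1.0, log(G) is 0, so P = 1.0 * (2.0 + 3.0 * 1.0 + 0) = 5.0.
print(predict_power(1.0, [2.0, 3.0, 4.0]))  # 5.0
```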
The next step is to compare the predicted power against the measured power. We
will use the algorithmic parameter margin which is defined in the sample code.
If the value of measured power minus predicted power is less than or equal
to margin and bigger than or equal to -margin, then the decision is that there is no
fault because the measured power is sufficiently close to the predicted power;
otherwise, there is a fault. For example, by using the value of margin from the
sample code, we have:
For segment number 0, the average irradiance is 251.54 W/m^2 which gives a
predicted power generation of 27.98 kW. The measured power is 31.2 kW,
which is 3.22 kW higher than the predicted value. Since the difference is
within the margin, there is no fault. We will use the Boolean value
of False to denote the absence of a fault.
For segment number 2, the average irradiance is 374.52 W/m^2 which gives a
predicted power generation of 43.69 kW. The measured power is 55.5 kW,
which is 11.81 kW higher than the predicted value. Since the difference is
higher than the margin, there is a fault. We will use the Boolean value
of True to denote the presence of a fault.
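The comparison rule for one power sample might be sketched like this. The function name is_fault is hypothetical, and the margin of 5.0 is an assumed value chosen only because it lies between the two differences quoted above (3.22 kW and 11.81 kW); the actual margin is defined in the sample code. The predicted values used are the rounded ones shown in the text, for illustration only.

```python
def is_fault(measured_power, predicted_power, margin):
    """Return True when measured - predicted lies outside [-margin, margin]."""
    difference = measured_power - predicted_power
    return not (-margin <= difference <= margin)


assumed_margin = 5.0  # hypothetical value, not taken from the sample code

print(is_fault(31.2, 27.98, assumed_margin))  # False: difference is about 3.22
print(is_fault(55.5, 43.69, assumed_margin))  # True: difference is about 11.81
```

A difference exactly equal to margin or to -margin counts as no fault, because the rule uses "less than or equal" and "bigger than or equal".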
The above examples show how the fault detection is to be performed for two
power measurements. The following table summarizes the result of fault detection
for the time series.
We will use a list to indicate when the faults occurred. For the above example,
we will represent the faults in the data series using [2,3] because the
measurements power_time_series[2] and power_time_series[3] are determined to
be faults.
In the case where there are no faults, we will indicate that by an empty list [ ].
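One way to build such a list of fault indices is sketched below. The function name fault_indices and all data values are hypothetical; in particular, the sketch takes already-predicted power values as input rather than computing them from irradiance, purely to keep the example short.

```python
def fault_indices(measured, predicted, margin):
    """Collect the indices where measured power is not within margin of predicted."""
    faults = []
    for index in range(len(predicted)):
        difference = measured[index] - predicted[index]
        if difference < -margin or difference > margin:
            faults.append(index)
    return faults


# Hypothetical values: differences are -1, 1, 5 and -10 with a margin of 2.
measured = [10.0, 20.0, 35.0, 30.0]
predicted = [11.0, 19.0, 30.0, 40.0]
print(fault_indices(measured, predicted, 2.0))   # [2, 3]
print(fault_indices(measured, predicted, 50.0))  # []
```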
The following figure illustrates the fault detection decision making. The solid blue
dots show the predicted power generated for the average irradiance. The vertical
lines are centred at the predicted power and have a height of 2*margin. The power
measurements are plotted with crosses. If the cross is within the vertical line, then
it is not a fault; otherwise, it is.
After a fault detection algorithm has been designed, the engineers will want to
check how good the algorithm is at catching the faults. One way that they can do
that is to monitor the PV plant manually to determine whether actual faults have
occurred. There are two possible types of error:
your_fault_list = [2,3]
real_fault_list = [1,2]
A task for this assignment is to determine the false alarms from the
given your_fault_list and real_fault_list. For this assignment, you will store the
false alarms in a list. In this example, it is [3]. In the case where there are no
false alarms, that should be indicated by an empty list [ ].
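A minimal sketch of this calculation follows, assuming that a false alarm is an index that the algorithm reported as a fault but that is not among the real faults. The function name false_alarms is hypothetical and is not the name used in the template files.

```python
def false_alarms(detected, actual):
    """Return the detected fault indices that are not real faults."""
    alarms = []
    for index in detected:
        if index not in actual:
            alarms.append(index)
    return alarms


your_fault_list = [2, 3]
real_fault_list = [1, 2]
print(false_alarms(your_fault_list, real_fault_list))  # [3]
```

Index 2 appears in both lists, so it is a correct detection; index 3 appears only in your_fault_list, so under the assumed definition it is the only false alarm.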
Note that the engineers should also be interested in missed detection, but the
calculation is very similar to false alarms, so we will not ask you to do that.
Validity checks
For each algorithmic parameter, the requirement for the parameter to be valid is
given first, followed by the assumptions you can make when testing or a further
explanation.

Algorithmic parameter: irradiance_sampling_time
Requirement for the parameter to be valid: Data type must be int and its value
must be strictly positive.
Assumptions you can make when testing or further explanation: Examples of
invalid parameter values are -5, -5.2, 5.7. You can assume that, when we test
your code, irradiance_sampling_time is always a number.

Algorithmic parameter: power_sampling_time
Requirement for the parameter to be valid: Data type must be int and its value
must be strictly positive.
Assumptions you can make when testing or further explanation: You can assume
that, when we test your code, power_sampling_time is always a number.

Algorithmic parameters: irradiance_sampling_time, power_sampling_time
Requirement for the parameters to be valid: The value of power_sampling_time
must be an integral multiple of the value of irradiance_sampling_time.
Assumptions you can make when testing or further explanation: For example,
if power_sampling_time is 12 and irradiance_sampling_time is 7, then the given
parameters are invalid because 12 is not an integral multiple of 7. You can also
assume that power_sampling_time and irradiance_sampling_time are given in the
same unit.

Algorithmic parameter: model_para
Requirement for the parameter to be valid: Must have exactly 3 entries in the
list.
Assumptions you can make when testing or further explanation: You can assume
that the given model_para is always a list and its entries are always numbers
(int or float). For example, if the given model_para has four entries, then it
is invalid.

Algorithmic parameter: margin
Requirement for the parameter to be valid: Must be a strictly positive number.
Assumptions you can make when testing or further explanation: You can assume
that the given margin is always a number (int or float).
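The validity requirements could be checked along the following lines. This is a sketch only: the function name parameters_are_valid and its return convention (a Boolean) are assumptions made for illustration, and the argument values in the usage lines are hypothetical.

```python
def parameters_are_valid(irradiance_sampling_time, power_sampling_time,
                         model_para, margin):
    """Return True only if every algorithmic parameter meets its requirement."""
    # Both sampling times must be of type int and strictly positive.
    if type(irradiance_sampling_time) is not int or irradiance_sampling_time <= 0:
        return False
    if type(power_sampling_time) is not int or power_sampling_time <= 0:
        return False
    # power_sampling_time must be an integral multiple of irradiance_sampling_time.
    if power_sampling_time % irradiance_sampling_time != 0:
        return False
    # The model needs exactly three parameters.
    if len(model_para) != 3:
        return False
    # The margin must be strictly positive.
    if margin <= 0:
        return False
    return True


print(parameters_are_valid(12, 60, [0.1, 0.2, 0.3], 5.0))  # True
print(parameters_are_valid(7, 12, [0.1, 0.2, 0.3], 5.0))   # False: 12 is not a multiple of 7
print(parameters_are_valid(5.7, 60, [0.1, 0.2, 0.3], 5.0)) # False: not an int
```

The type and sign checks come before the multiple check so that the % operator is only applied to positive integers.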
The above sample code shows the situation where the overall duration of power
measurements (6 samples times 60 minutes = 360 minutes) is more than that of the
irradiance measurements (22 samples times 12 minutes = 264 minutes). The above
example shows that we should only be using the first 4 power measurements and
the first 20 irradiance measurements.
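The counting in this example can be reproduced with integer division, as in the sketch below. The function name usable_counts is hypothetical, and the sketch assumes the sampling times have already passed the validity checks.

```python
def usable_counts(num_irradiance, num_power,
                  irradiance_sampling_time, power_sampling_time):
    """Return (power samples to use, irradiance samples to use)."""
    samples_per_segment = power_sampling_time // irradiance_sampling_time
    # A segment is usable only if it has a power sample and a full set of
    # irradiance samples.
    num_segments = min(num_power, num_irradiance // samples_per_segment)
    return num_segments, num_segments * samples_per_segment


# 22 irradiance samples every 12 minutes, 6 power samples every 60 minutes.
print(usable_counts(22, 6, 12, 60))  # (4, 20)
```

With 5 irradiance samples per power sample, 22 irradiance samples cover only 4 complete segments, so the last 2 power samples and the last 2 irradiance samples are not used.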
Another situation is when the overall duration of power measurements is less than
that of the irradiance measurements. Consider the following code:
In order for the fault detection algorithm to run, there must be enough power and
irradiance measurements. The requirements are:
Implementation requirements
You need to implement the following six functions. The first five functions,
working together, implement the fault detection algorithm. The sixth
function finds the false alarms.
3. def fault_detection_one_sample(irradiance_average_one_sample,
power_one_sample, model_para, margin):
o The aim of this function is to use one value of average irradiance, one
power measurement, the model parameters and the margin to
determine whether the given power measurement is a fault or not.
o An explanation of this computation is given earlier under the
heading Compare the predicted power generated against the
measured power to determine whether there is a fault - FOR ONE
POWER SAMPLE.
o The function should return one output which is of Python bool
(Boolean) type.
o This function requires power_prediction(). An import line has been
included in the template file for you. Please do not change it.
o This function can be tested using the file
test_fault_detection_one_sample.py.
4. def fault_detection_time_series(irradiance_time_series_average,
power_time_series, model_para, margin):
o The aim of this function is to use the time series of average irradiance
and power measurements to determine whether there are faults.
o An example of the computation is given earlier under the
heading Performing fault detection for a time-series
o The function should return one output which is a list. The list should
contain the indices in power_time_series that correspond to faults.
The list should be empty if there are no faults.
o You are expected to use fault_detection_one_sample() to complete
this function. An import line has been included in the template file
for you. Please do not change it.
o This function can be tested using the file
test_fault_detection_time_series.py.
o This function is called after all the input data have been specified, see
the last line in the sample code.
o The function has 6 inputs. The names for the inputs have been chosen
to match their roles in the description earlier.
o The function should return one output which can be a list (possibly
empty) or a string depending on the situation
o The expected steps within the function fault_detection_main() are:
The function should first check whether all algorithmic
parameters are valid. If any algorithmic parameter is invalid,
the function should return the string 'Corrupted input'. It should
not proceed to execute the next two steps. See the section with
heading Validity Checks for the requirements on the
algorithmic parameters.
If all algorithmic parameters are valid, the function should
determine whether there are enough data for the calculations. If
there are not enough data, the function should return the string
'Not enough data'. It should not proceed to execute the next
step. See the section with heading Checking whether there are
enough data for the requirements.
If all algorithmic parameters are valid and there are enough
data, then the function should proceed to determine the faults.
The function should return a list. The list should contain the
indices in power_time_series that correspond to faults. The list
should be empty if there are no faults.
o You can use the following test files: test_fault_detection_main_1.py
and test_fault_detection_main_2.py.
For the tests in test_fault_detection_main_1.py, there are
enough data and all algorithmic parameters are valid. Your
code should proceed to determine the faults.
Test 1 in test_fault_detection_main_1.py is based on
the sample code.
The test file test_fault_detection_main_2.py contains a number
of test cases where the algorithmic parameters are invalid
and/or there are not enough data. For all the test cases, the
function should return a string.
o This function requires the
functions calc_average() and fault_detection_time_series(). Two
import lines have been included in the template file for you. Please do
not change them.
o This function should return one output which is a list of false alarms.
The list should be empty when there are no false alarms.
o This function can be tested using the file test_find_false_alarms.py.
o Hint: The Python keyword in is useful here. You have seen how in is
used with for, but there is another usage of in. You can type in the
following lines of code in the Python console to see what the answers
are:
6 in [2,6,7]
3 in [2,6,7]
Additional requirements: In order to facilitate testing, you need to make sure that
within each submitted file, you only have the code required for that function.
Do not include test code in your submitted file.
Getting Started
1. Download the zip file assign1_prelim.zip (which contains 6 template files
and 7 test files) and unzip it. This will create the directory (folder) named
'assign1_prelim'.
2. Rename/move the directory (folder) you just created named 'assign1_prelim'
to 'assign1'. The name is different to avoid possibly overwriting your work if
you were to download the 'assign1_prelim.zip' file again later.
3. First browse through all the files provided including the test files.
4. (Incremental development) Do not try to implement too much at once, just
one function at a time and test that it is working before moving on.
5. Start implementing the first function, properly test it using the given testing
file, and once you are happy, move on to the second function, and so on.
6. Please do not use 'print' or 'input' statements. We won't be able to assess
your program properly if you do. Remember, all the required values are part
of the parameters, and your function needs to return the required answer. Do
not 'print' your answers.
Testing
You can use the provided Python programs (files like test_calc_average.py etc.) to
test your functions. Please note that each file covers a limited number of test cases.
We have purposely not included all the cases because we want you to think about
how you should be testing your code. You are welcome to use the forum to
discuss additional tests that you should use to test your code.
We will test each of your files independently. Let us give you an example. Let us
assume we are testing three files: prog_a.py, prog_b.py and prog_c.py. These files
contain one function each and they are: prog_a(), prog_b() and prog_c(). Let us say
prog_b() calls prog_a(); and prog_c() calls both prog_b() and prog_a(). We will
test your files as follows:
Submission
You need to submit the following six files. Do not submit any other files. For
example, you do not need to submit your modified test files.
calc_average.py
power_prediction.py
fault_detection_one_sample.py
fault_detection_time_series.py
fault_detection_main.py
find_false_alarms.py
To submit this assignment, go to the Assignment 1 page and click the tab named
"Make Submission".
Assessment Criteria
We will test your program thoroughly and objectively. This assignment will be
marked out of 27 where 21 marks are for correctness and 6 marks are for style.
Correctness
Criteria                                                                                  Nominal marks
Function calc_average.py                                                                  3
Function power_prediction.py                                                              3
Function fault_detection_one_sample.py                                                    3
Function fault_detection_time_series.py                                                   3
Function fault_detection_main.py Case 1: Expected output is the string 'Corrupted input'  2
Function fault_detection_main.py Case 2: Expected output is the string 'Not enough data'  1
Function fault_detection_main.py Case 3: Expected output is a list or an empty list.      3
Function find_false_alarms.py                                                             3
Style
Six (6) marks are awarded by your tutor for style and complexity of your solution.
The style assessment includes the following, in no particular order:
Assignment Originality
You are reminded that work submitted for assessment must be your own. It's OK to
discuss approaches to solutions with other students, and to get help from tutors, but
you must write the Python code yourself. Sophisticated software is used to identify
submissions that are unreasonably similar, and marks will be reduced or removed
in such cases.
Further Information
We will run Help Sessions for this assignment during Weeks 4-7. These are
face-to-face consultations in the lab on a first-come, first-served basis. The
timetable for the Help Sessions can be found on the course website.
Use the forum to ask general questions about the assignment, but take
specific ones to Help Sessions.
Keep an eye on the course webpage notice board for updates and responses.