My Revision Notes AQA CS A-Level
AQA
A-level
COMPUTER
SCIENCE
THIRD EDITION
Mark Clarkson
Every effort has been made to trace all copyright holders, but if any have been inadvertently
overlooked, the Publishers will be pleased to make the necessary arrangements at the first
opportunity.
Although every effort has been made to ensure that website addresses are correct at time of
going to press, Hodder Education cannot be held responsible for the content of any website
mentioned in this book. It is sometimes possible to find a relocated web page by typing in
the address of the home page for a website in the URL window of your browser.
Hachette UK’s policy is to use papers that are natural, renewable and recyclable products
and made from wood grown in well-managed forests and other controlled sources. The
logging and manufacturing processes are expected to conform to the environmental
regulations of the country of origin.
Orders: please contact Hachette UK Distribution, Hely Hutchinson Centre, Milton Road,
Didcot, Oxfordshire, OX11 7HH. Telephone: +44 (0)1235 827827.
Email education@hachette.co.uk. Lines are open from 9 a.m. to 5 p.m., Monday to Friday.
You can also order through our website: www.hoddereducation.co.uk
ISBN: 978 1 3983 2548 7
© Mark Clarkson 2021
First edition published in 2016. This edition published in 2021 by
Hodder Education,
An Hachette UK Company
Carmelite House
50 Victoria Embankment
London EC4Y 0DZ
www.hoddereducation.co.uk
Impression number 10 9 8 7 6 5 4 3 2 1
Year 2025 2024 2023 2022 2021
All rights reserved. Apart from any use permitted under UK copyright law, no part of this
publication may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying and recording, or held within any information
storage and retrieval system, without permission in writing from the publisher or under
licence from the Copyright Licensing Agency Limited. Further details of such licences (for
reprographic reproduction) may be obtained from the Copyright Licensing Agency Limited,
www.cla.co.uk
Cover photo © Maksym Yemelyanov - stock.adobe.com
Illustrations by Aptara, Inc.
Typeset in India by Aptara, Inc.
Printed in Spain
A catalogue record for this title is available from the British Library.
My Revision Planner
Types of program translator
Logic gates
Boolean algebra
7 Computer organisation and architecture
Internal hardware components
The stored program concept
Data structures and abstract data types
games and physical models, and are discussed in more detail towards the end of this chapter.
require a separate selection statement to interrogate each one, and additional code would be needed each time the number of variables increased.
Using a data structure called an array means that it is possible to step through each item using a loop, reducing the amount of code and making the code much more flexible.
Array A data structure for holding values in a table.
Single- and multi-dimensional arrays (or equivalent)
An array is a fixed-size collection of values, all of the same data type.
Single-dimensional arrays
A single-dimensional array can be written down as a single row of a table.
Each individual item in the array is referred to using the identifier (or variable name) of the array, followed by the index in brackets (usually square brackets). Each index position can hold data.
Index   [0]   [1]   [2]   [3]   [4]
Data    1     3     5     7     11
Index A value indicating the position of a value within an array or list.
List A data structure similar to an array, commonly used in Python in place of an array.
Exam tip
In most languages (including C#, Delphi/Pascal, Java & VB.Net) the array index starts at 0, but be very careful when reading exam questions as some languages and some scenarios will have the array index starting at 1.
Note
Arrays are always declared as a fixed size and cannot be changed later in the program.
In an array, all data must be of the same data type.
Lists are very similar to arrays; however, it is possible to alter the size of a list while the program is running and it is possible to store data of differing types in the same list.
In some languages (notably including Python) it is much more common to use a list, whereas in others it is much more common to use an array.
You are not expected or required to show your understanding of the difference between the two, and you should use whichever version is most suitable for your chosen programming language.
Arrays are an effective solution for storing several values because a FOR loop can be used to iterate (or loop) over an array.
In this example Numbers is the name of an array and Length(Numbers) returns the size of the array (the number of items it contains). Because the index starts at 0, the last valid index is Length(Numbers) - 1.
FOR i ← 0 TO Length(Numbers) - 1
    OUTPUT Numbers[i]
ENDFOR
Multi-dimensional arrays
A two-dimensional array can be written down as a table.
A two-dimensional array can be thought of as an array of one-dimensional arrays.
Numbers   [0]   [1]   [2]   [3]
[0]       1     3     5     7
[1]       11    13    17    19
[2]       23    29    31    37
[3]       41    43    47    51
Figure 2.1 A two-dimensional array called Numbers. Numbers[1], Numbers[2] and Numbers[3] each select a whole row; Numbers[2][2] selects 31 and Numbers[3][1] selects 43.
In the example above the contents of Numbers[0][3] can be found by isolating row 0 and then finding the item in position 3.
Exam tip
Arrays can also be defined using the first index to find the column and the second index to find the item in that column. Always read the question carefully as any ambiguities should be made clear.
A two-dimensional array is useful because it can store more complex data; for example, the scores for different students or values for different days.
Making links
One common use for a two-dimensional array is storing a matrix. Matrices are often used in complex maths, and can also be used to represent a graph data structure. This is explored further later in this chapter.
Matrix A rectangular, two-dimensional collection of values.
It is possible to find the size of a two-dimensional array using the Length function (exact syntax varies by programming language).
For an array with indexes defined as [row][column], to find the number of rows, find the number of one-dimensional arrays the table can be split into:
NumberOfRows ← Length(ArrayName)
To find the number of columns, find the length of a row in the table:
NumberOfColumns ← Length(ArrayName[0])
To interrogate a two-dimensional array it is necessary to use nested loops.
Nested A selection statement or loop inside another selection statement or loop.
FOR i ← 0 TO Length(Numbers) - 1 // Step through each row
    FOR j ← 0 TO Length(Numbers[i]) - 1 // Step through each value in that row
        OUTPUT Numbers[i][j]
    ENDFOR
ENDFOR
Exam tip
Make sure you are able to hand trace a program such as the one above to fully understand which counter variable is addressing the row or column index of the table.
Using the principle above, it is possible to think of a three-dimensional array as an array of two-dimensional arrays.
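The array loops in this section can be sketched in Python as follows. This is a minimal illustration, not the book's own code: it assumes Python lists are used in place of arrays, and the names numbers, table, one_d_output and two_d_output are made up for the example. The values are taken from the tables in this section.

```python
# Python lists stand in for arrays here (an assumption for this sketch).
numbers = [1, 3, 5, 7, 11]

# range(len(numbers)) counts from 0 to len(numbers) - 1,
# so the loop never goes past the last valid index.
one_d_output = []
for i in range(len(numbers)):
    one_d_output.append(numbers[i])

# A two-dimensional array written as a list of rows.
table = [
    [1, 3, 5, 7],
    [11, 13, 17, 19],
    [23, 29, 31, 37],
    [41, 43, 47, 51],
]

number_of_rows = len(table)        # number of one-dimensional rows
number_of_columns = len(table[0])  # length of the first row

# Nested loops: i addresses the row, j addresses the item within that row.
two_d_output = []
for i in range(number_of_rows):
    for j in range(len(table[i])):
        two_d_output.append(table[i][j])

print(one_d_output)
print(number_of_rows, number_of_columns)
print(table[0][3])
```

Hand tracing the nested loop shows that j runs through a complete row before i moves on to the next row.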
My Revision Planner
Introduction
Command words
1 Fundamentals of programming
Programming
Programming paradigms
Object-oriented programming
Structure and role of the processor and its components
External hardware devices
8 Consequences of using computers
Moral, ethical, legal and cultural issues and opportunities
9 Fundamentals of communication and networking
Communication
Networking
The internet
The Transmission Control Protocol/Internet Protocol
10 Fundamentals of databases
Conceptual models and entity relationship modelling
Relational databases
Database design and normalisation techniques
Client-server databases
11 Big Data
Big data
12 Functional programming
Functional programming paradigm
Writing functional programs
13 Systematic approach to problem solving
Aspects of software development
Glossary
Introduction
Assessment objectives
✚ AO1 You should be able to demonstrate knowledge and understanding of
the principles and concepts of computer science, including abstraction, logic,
algorithms and data representation.
✚ AO2 You must be able to apply your knowledge and understanding of the
principles and concepts of computer science, including to analyse problems in
computation terms.
✚ AO3 You must be able to design, program and evaluate computer systems
that solve problems, making reasoned judgements about these and presenting
conclusions.
Command words
Familiarity with the relevant command words is important. It helps you to
avoid wasting time in the exam room (for example, trying to evaluate when
there is no requirement for it). The command words used most frequently
in the A-level papers are listed here.
✚ Calculate… requires you to work out the value of something. A correct
final answer, to the required degree of accuracy and with the correct units,
will score full marks. Working is usually required, and correct working can
score marks. (AO2)
✚ Compare… requires you to identify similarities and differences between
ideas, technologies, or approaches. (AO1)
✚ Create… requires you to write program code that solves a problem. Even if
you struggle to get the syntax correct, marks are awarded for evidence of
the approach taken as well as for working code. (AO3)
✚ Define… requires you to specify the meaning of a technical term in order
to show that you understand what it means. (AO1)
✚ Describe… requires you to set out the characteristics of a device or a
computing concept. These can be very short questions worth 1 mark, or
long-answer questions worth up to 12 marks. (AO1)
✚ Discuss… requires you to present the key points. These questions are
typically of medium length, worth 4-6 marks. You should aim to present as
many points as you can and, where appropriate, provide balance between
advantages and disadvantages. (AO1)
✚ Draw… requires you to produce a diagram. These are usually technical
diagrams such as an E-R diagram for a database, or a logic circuit diagram. (AO2)
✚ Explain… requires you to provide purposes or reasons. These questions
are used to assess your knowledge of a topic and can sometimes be used
within a specific context. If a question is asked in relation to a particular
scenario then you should always make sure you link to the scenario in
your answer. (AO1, AO2)
✚ Express… requires you to convert an input into a particular format.
Examples might include encrypting a message or calculating a value. (AO2)
✚ Modify… requires you to take an existing section of program code and add
to or edit it. This is commonly asked as part of Section D in Component 1,
involving changes to the skeleton program. (AO3)
✚ State… requires you to express some knowledge in clear terms. These questions
are usually short-answer questions and are usually worth 1 mark. (AO1)
process should be included. (AO3)
✚ Write… requires you to create a program or a Boolean expression to solve
a specific problem. These questions can be quite lengthy and partially
complete answers are usually worth a significant proportion of the marks.
(AO2, AO3)
Component 1
Component 1 is an on-screen exam, largely focused on programming
and computation. In advance of the exam, you and your teachers will
be provided with a skeleton program and a small pack of background
information.
The exam will be taken using a programming language that you have studied
throughout the course and that has been pre-selected by your teacher. The available
languages are:
✚ C#
✚ Java
✚ Pascal/Delphi
✚ Python
✚ VB.net
The skeleton program is a working program that is functional but could
be improved. Typical examples include text-based role-playing games or
simulations.
For A-level students the pre-release information is available from September
of that academic year (typically Year 13).
The pre-release information should be given to you by your teacher, but they
may choose not to give the information out straight away, depending on how
and when they plan to deliver the content.
The A-level exam is split into four sections:
1 Section A: Questions about programming and computation; for example,
finite state machines, standard algorithms, trace tables, Turing machines and
computation logic.
2 Section B: A programming problem (unrelated to the skeleton program);
for example, writing a program to convert between binary and denary
numbers.
3 Section C: Questions about the skeleton program; for example, identifying
specific variables or programming constructs, hierarchy charts, class
diagrams and explaining the purpose of specific elements.
4 Section D: Improving the skeleton code; for example, adding an extra menu
option, improving exception handling, adding new functionality.
Programming syntax
Some questions in Component 1 will include algorithms written using AQA
pseudo-code. It is important to be able to read and understand this pseudo-
code and, in some cases, to write program code in your chosen programming
Component 2
Component 2 is a traditional-style written paper, largely focused on the more
theoretical components of this course.
The topics covered in this paper test subject content from Chapters 5–12, and
typically include:
✚ data representation
✚ computer systems
✚ computer hardware
✚ consequences of computing
✚ networking
✚ databases and Big Data
✚ functional programming.
As this is a written paper, no practical programming activities are assessed,
though some practical elements such as calculations, data conversions and
trace tables (especially for assembly language and instruction sets) are likely
to appear.
This paper will generally include short answer questions (1–2 marks) that will
assess your knowledge, questions that will assess your ability to apply your
knowledge (3–4 marks), and a small number of longer-answer questions that
require detailed discussion (6–12 marks).
For these longer-answer discussion questions, credit is awarded for
identifying, discussing and evaluating potential issues.
✚ Marks are awarded for identifying relevant knowledge; for example,
suggesting appropriate input devices for collecting data or identifying the
methods by which wireless data transmissions can be intercepted.
✚ To reach the top mark bands it is important to follow a line of reasoning, using
your knowledge to write in connected sentences in a way that makes sense
and relates to the context of the question. Explaining how each point links to
the scenario and adding as much technical detail and vocabulary as you can
makes it more likely that you will score well on this type of question.
✚ Always make sure you back up any arguments or suggestions you make
with facts, logical arguments and technical details, as unsubstantiated
statements don’t demonstrate your understanding.
Component 3
Component 3 is a non-examination assessment (NEA) component, based on
completing a programming project.
The programming project is extremely open-ended, and it is up to you to
identify a real-world problem and then work through each phase of the
systems development lifecycle in order to solve it.
It is beyond the scope of this book to go into detail in terms of completing
the programming project, but typical examples include online booking and
scheduling systems, computer games with a simple AI component, animal
population simulations, and so on. There is no definitive list of expected or
excluded projects and your teacher will be able to provide you with much more
specific guidance.
1 Fundamentals of programming
Programming
Data types
It is important to declare variables using the correct data type. This will make
sure that memory is not wasted, and that the program is able to process the
data correctly.
Different data types are processed in different ways; for example, adding two
strings produces a different result to adding two integers.
"123" + "456" = "123456"
Note
All programming languages deal with data types slightly differently.
It is important to note that Python does not support the character data type, using
only a string to store text of any length. Python also does not support the array data
type. The closest alternative is a list data type.
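As a small illustration of why data types matter, the sketch below shows the same + operator giving different results for strings and integers in Python. The variable names are made up for the example.

```python
# The + operator joins strings together (concatenation)...
text_result = "123" + "456"

# ...but adds integers arithmetically.
number_result = 123 + 456

# Converting the strings to integers first gives the arithmetic result.
converted_result = int("123") + int("456")

print(text_result)
print(number_result)
print(converted_result)
```

Input read from the keyboard normally arrives as a string, so it usually needs converting before any arithmetic is carried out on it.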
Programming concepts
For each of the programming concepts you should be familiar with both the pseudo-code used by AQA and the syntax used in your own programming language.
Pseudo-code A format for program code that is not specific to one programming language. Used extensively in Component 1. Pseudo-code is useful for describing an algorithm that could be coded in one of several different languages.
Syntax The strict rules and structures used within a specific programming language. You will only be assessed in the syntax of one programming language for Component 1.
Exam tip
In Component 1 you will be presented with pseudo-code algorithms to read, and to turn into program code, but you will not be expected to write answers using pseudo-code. Practical programming questions are intended to be coded using the specific programming language you have been entered for.
Declaration
Variable declaration is the process of creating a variable.
Declaration The creation of a variable in memory.
In languages which are statically typed (sometimes described as strongly typed; for example, C#, Pascal/Delphi, Java and VB.net) the data type of the variable is stated alongside the variable’s identifier. In these languages it is possible to declare a variable without initially assigning a value to it.
✚ int Age; (C# and Java)
✚ var Age: integer; (Delphi and Pascal)
✚ Dim Age As Integer (VB.net)
In languages which are dynamically typed (sometimes described as weakly typed; for example, Python) a value must be assigned to a new variable, and no data type is declared as the data type can be changed during the course of the program.
✚ Age = 18 (Python)
Constants are declared in a similar way, but with a key word that indicates the value cannot be changed. Because the value of a constant cannot be changed later, it must be declared with an initial value:
✚ const int MAX = 3; (C#)
✚ Const MAX = 3; (Pascal/Delphi)
✚ final int MIN = 3; (Java)
✚ Const MAX As Integer = 3 (VB.net)
Python does not support the use of constants and therefore it is not possible to prevent the accidental assignment of a new value to a constant.
Making links
It is important to consider where a variable is declared. A variable declared inside an IF statement, loop or subroutine can only be used within that section of the code and will be destroyed once that section has ended. This issue is dealt with more in the section on local and global variables.
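The Python behaviour described in this section can be sketched as follows. The identifiers age and MAX_LIVES are hypothetical, and writing a constant's name in upper case is only a widely used naming convention, not something the language enforces.

```python
# A variable is created by assigning a value; no data type is stated.
age = 18
first_type = type(age).__name__    # the type is taken from the value

# The same identifier can later hold a value of a different data type.
age = "eighteen"
second_type = type(age).__name__

# An upper-case name signals "treat this as a constant" to other programmers,
# but Python does not stop the value being reassigned.
MAX_LIVES = 3
MAX_LIVES = 4   # accidental reassignment is not prevented

print(first_type, second_type, MAX_LIVES)
```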
✚ Assignment in most programming languages uses an equals sign (for
example, Score = 12)
Sequencing
Program code is executed in the order in which it appears. For example, given
the code:
x ← 5
OUTPUT x
x ← x + 1
The value 5 will be output before the value of x is increased.
Nested selection
Nested IF statements occur when an IF statement is placed inside an IF statement.
Nested Placing a programming structure inside another programming structure.
IF animal = "dog" THEN
    IF target = "sheep" THEN
        OUTPUT "Dog chases sheep"
    ELSE
        OUTPUT "Dog wags tail"
    END IF
END IF
Revision activity
✚ Write a program in your chosen programming language that uses an IF statement to allow the user to choose one of four options.
✚ Re-write the program to use a SWITCH statement.
Now test yourself
9 What programming concept is implemented using an IF statement?
10 Which part of an IF statement does not need a condition?
11 What alternative to an IF statement is sometimes used if there are several possible options?
12 What is meant by a nested IF statement?
Answers available online
Iteration
When a block of code needs to be repeated, this is referred to as iteration.
Iteration The repetition of a process or block of code.
Definite iteration
Definite iteration, or count-controlled iteration, refers to the use of a FOR loop, where the number of times to repeat is known. Even if the number of times to repeat isn’t always the same, if it is known at the start of the loop then a FOR loop should be used.
Definite iteration, or count-controlled iteration Iterating a fixed number of times (also known as a count-controlled loop, implemented as a FOR loop).
You should be familiar with the syntax of a FOR loop in your own programming language as well as the pseudo-code you might see in an exam:
FOR i ← 1 TO 5
    OUTPUT "This is step " + i
ENDFOR
Exam tips
Be very careful with the start and end conditions when creating FOR loops. While a pseudo-code algorithm may explicitly start and end at given values, the implementation for a FOR loop may be less clear (for example, in Python for i in range(1,5) would stop at i = 4 and in Java for (int i = 1; i < 5; i++) would also stop at i = 4).
Although the specification refers to definite and indefinite iteration, question papers typically use the terms count-controlled and condition-controlled iteration.
Indefinite iteration
Indefinite iteration or condition-controlled iteration refers to a loop where the number of iterations is not known. A typical example might be a validation loop, asking a user to enter a valid input and repeating while their answer is invalid.
Indefinite iteration, or condition-controlled iteration Iterating until a condition is met (also known as a condition-controlled loop, implemented in AQA pseudo-code as a WHILE loop or DO-WHILE loop).
Indefinite loops can be further split into those that assess the condition at the start of the loop (a WHILE loop) and those that assess the condition at the end of the loop (a DO-WHILE loop).
In AQA pseudo-code:
A WHILE loop will have the condition at the top of the loop, allowing the program to bypass the code inside completely. For example:
value ← 5
WHILE value < 100 DO
    value ← value * 2
ENDWHILE
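The FOR and WHILE pseudo-code examples above might be written in Python along the following lines. This is a sketch only. The list steps is an illustrative addition so that the output can be inspected, and str(i) is needed because Python will not join a string and an integer with +.

```python
# Definite (count-controlled) iteration: FOR i ← 1 TO 5
# range(1, 6) is used because range stops one short of its end value.
steps = []
for i in range(1, 6):
    steps.append("This is step " + str(i))

# Indefinite (condition-controlled) iteration: the condition is checked
# before each pass, so the body could be skipped entirely.
value = 5
while value < 100:
    value = value * 2

print(steps)
print(value)
```

Tracing the WHILE loop by hand gives 5, 10, 20, 40, 80, 160, at which point the condition fails and the loop ends.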
A DO-WHILE loop will have the condition at the bottom of the loop, forcing the program to pass through the code at least once. For example:
DO
    OUTPUT "Enter shoe size:"
    ShoeSize ← INPUT
WHILE ShoeSize > 12
Note
Python does not support the use of a DO-WHILE loop, but students are still expected to be familiar with the concept. One solution to this problem is to copy and paste the first iteration of the code before a WHILE loop. Another is to create a REPEAT … UNTIL structure.
Nested iteration
Nested iteration means having a loop within a loop. This is used in a number of applications, including when working with 2D arrays, and in a number of standard algorithms including the bubble sort.
FOR i ← 1 TO 3
    FOR j ← 1 TO 5
        OUTPUT "Outer loop = " + i + " | Inner loop = " + j
    ENDFOR
ENDFOR
Remember
Remember that if you know how many times to loop you should use a FOR loop. If you don’t know how many times then use a WHILE or DO-WHILE loop.
Remember that the inner loop is completed multiple times for each step around the outer loop. Selection and iteration statements can be nested inside each other, potentially many layers deep.
Now test yourself
13 What type of loop is also called a count-controlled loop?
14 When should you use a condition-controlled loop?
15 Describe the difference between a WHILE loop and a DO-WHILE loop.
16 Suggest two possible uses for nested iteration.
Answers available online
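Because Python has no DO-WHILE loop, the shoe size example needs a workaround. One common idiom, sketched below, is an endless loop that breaks once the condition at the bottom is no longer true. This is the sketch's own choice of technique, closest in spirit to the REPEAT … UNTIL approach. The simulated_inputs iterator is an assumption that stands in for keyboard input so that the example runs unattended.

```python
# Simulated user entries (assumption): two invalid sizes, then a valid one.
simulated_inputs = iter([15, 13, 9])

prompts_shown = 0
while True:
    prompts_shown += 1                    # stands in for OUTPUT "Enter shoe size:"
    shoe_size = next(simulated_inputs)    # stands in for ShoeSize ← INPUT
    # Condition checked at the bottom: repeat only while the size is above 12.
    if not (shoe_size > 12):
        break

print(shoe_size, prompts_shown)
```

The body always runs at least once, which is the defining feature of a loop with its condition at the end.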
Revision activity
✚ Write a program that uses iteration to print out the 12 times table.
✚ Write a program that uses nested iteration to print out all of the times tables from 1 to 12.
Meaningful identifier names
It is important to choose identifiers for variables (and subroutines) that tell other programmers something about the purpose of that variable (or subroutine).
✚ Var1, Var2 and Var3 are very poor choices as they make it very hard to read the code and to debug if there are any errors.
✚ UserName, UserAge and DateUserLastLoggedIn are much more meaningful and are key to writing self-documenting code.
It is not usually possible to use spaces in identifier names. Common strategies to aid readability include the use of CamelCase, Kebab-Case or Snake_Case.
Debug The process of identifying and removing errors from program code.
Self-documenting code Program code that uses naming conventions and programming conventions that help other programmers to understand the code without needing to read additional documentation.
Exam tip
In Component 1, Section B and Section D, it is very important to use any identifier names exactly as they are provided in the question. If the identifier name is not given explicitly then remember that making it easier for the examiner to understand the code makes it more likely you will pick up the marks.
Note
The use of meaningful identifier names to produce self-documenting code can have a significant impact on the marks available in the NEA.
Now test yourself
17 Describe two advantages of using meaningful identifier names.
18 Explain the term self-documenting code.
Answers available online
Subroutines
Information on subroutines can be found in the section Purpose of subroutines.
Arithmetic operations
There are several key operators you should make sure you are familiar with.
Operators Symbols used to indicate a function.
Simple arithmetic operators include those for addition (+), subtraction (-), multiplication (*) and division (/).
It is important to understand the different types of division operation that can be carried out:
✚ Real or float division will result in a real or float answer; for example, 5 / 2 = 2.5
✚ Integer division will strip any fractional part of the answer, effectively rounding down; for example, 5 DIV 2 = 2
✚ The modulo operator will find the modulus – the remainder of an integer division, as a whole number; for example, 5 MOD 2 = 1
Modulus The remainder of the division of one number by another.
Exam tip
DIV and MOD are common terms for integer division and modulo. Make sure you are comfortable with how they function.
DIV and MOD are useful for converting between different number systems, including conversion between units of time, imperial measurements, and between numbers using different bases (for example, denary and hexadecimal).
Exponentiation refers to powers; for example, 2^3 = 2*2*2 = 8
Exponentiation The raising of one number to the power of another.
Rounding can be carried out to a given number of decimal places or significant figures, generally using a function; for example, round(3.14,1) = 3.1, round(3.16,1) = 3.2
Rounding Reducing the number of digits used to represent a value while maintaining its approximate value.
Additional functions can be used that always round up or always round down; for example, roundup(3.142,1) = 3.2
For positive values, rounding down has the same effect as truncation.
Truncation Removing any value after a certain number of decimal places.
Exam tip
Some questions in Component 1 will assess your understanding of programming principles in general and may involve having to read and show understanding of pseudo-code. Other questions will require you to write your own program code in your chosen programming language, so it is important to make sure you are familiar with the specific operators and functions that are used in that programming language.
Now test yourself
19 What is the difference between float division and integer division?
20 What is the value of 7.86 when rounded to one decimal place?
21 What is the value of 7.86 when truncated to one decimal place?
Answers available online
Revision activity
✚ Write a program that uses DIV and MOD to convert a given number of hours into a number of days.
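The arithmetic operations above can be tried out in Python, where DIV is written // and MOD is written %. The sketch below uses illustrative variable names and a made-up value of 135 seconds. Note that Python's // always rounds down, so for negative numbers it differs from truncation.

```python
import math

real_division = 5 / 2        # float division
integer_division = 5 // 2    # DIV
remainder = 5 % 2            # MOD
power = 2 ** 3               # exponentiation

# DIV and MOD used together to convert between units of time.
total_seconds = 135
minutes = total_seconds // 60
seconds_left = total_seconds % 60

# Rounding to one decimal place, and always rounding up to one decimal place.
rounded = round(3.14, 1)
rounded_up = math.ceil(3.142 * 10) / 10

# Truncation removes the fractional part; rounding down moves towards
# negative infinity. They agree for positive values only.
truncated_negative = math.trunc(-2.5)
floored_negative = math.floor(-2.5)

print(real_division, integer_division, remainder, power)
print(minutes, seconds_left)
print(rounded, rounded_up)
print(truncated_negative, floored_negative)
```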
Relational operations
Relational operators are used in comparisons, typically in selection
statements and condition-controlled loops. A relational operation, or
comparison operation, will always return either True or False.
The relational operators are as follows:
= or == Equal to
!= or <> or ≠ Not equal to
> Greater than
>= Greater than or equal to
< Less than
<= Less than or equal to
Note
Different programming languages have different syntax rules for the equal to comparison operator.
AQA pseudo-code, VB.net and Pascal/Delphi use a single equals (=).
C#, Java and Python use a double equals (==).
C# and Java are not able to make use of the standard relational operators when comparing strings. String handling operations are discussed on page 16.
Now test yourself
State whether each of these is True or False when x ← 60.
22 x != 60
23 x < 60
24 x >= 60
25 x = 60
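A short Python sketch of relational operations follows. Each comparison produces a Boolean value that can be stored or used directly in a selection statement or loop. The variable score and its value are made up for the illustration.

```python
score = 45

is_equal = score == 45        # double equals tests for equality in Python
is_not_equal = score != 45
is_greater = score > 50
is_at_least = score >= 45

# In Python the same operators also work on strings,
# comparing them character by character.
comes_first = "apple" < "banana"

print(is_equal, is_not_equal, is_greater, is_at_least, comes_first)
```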
Boolean operations
Boolean operations can be used to invert the logic of a conditional statement, or to combine two or more conditions together.
NOT Will invert the logic (that is, a condition that returns True will become False, and a condition that returns False will become True).
AND Both conditions must be True.
OR At least one condition must be True.
XOR Exactly one condition must be True; the other must be False.
Making links
The exam for Component 2 assesses understanding of Boolean logic in much more detail, including questions on Boolean algebra. The same basic principles apply whether combining logical statements in a practical programming setting or solving Boolean equations. For a more in-depth examination of Boolean logic see Chapter 6.
Now test yourself
Where a = 50 and b = 100, would each overall condition be True or False?
26 NOT (a = b)
27 a < 100 AND b > 100
28 a > b OR b = 2*a
29 a < b XOR b = 100
Revision activity
✚ Write a program to calculate the output of simple logic circuits for NOT, AND, OR and XOR.
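In Python the first three Boolean operations are written as the keywords not, and and or. There is no xor keyword, so the sketch below shows two common ways of expressing XOR for Boolean values: the != comparison and the ^ operator. The variables is_member and has_ticket are hypothetical.

```python
is_member = True
has_ticket = False

inverted = not is_member                       # NOT
both = is_member and has_ticket                # AND
at_least_one = is_member or has_ticket         # OR

# XOR: True when exactly one of the two conditions is True.
exactly_one = is_member != has_ticket
exactly_one_alt = is_member ^ has_ticket       # same result for Boolean values

print(inverted, both, at_least_one, exactly_one, exactly_one_alt)
```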
Variables are used to store values that are used in a computer program. The value of a variable can be changed while the program is running (such as a score, a running total, a user’s name).
A named constant is a variable whose value cannot be changed while the
Exam tip
Component 1, Section C will often start with asking you
✚ C#, Delphi and VB.net require the start position and the length.
Make sure you read any pseudo-code questions that involve substrings carefully,
and make sure you are comfortable programming with substrings using the language
chosen for your exam.
Some languages treat a string as an array of characters and can use an array index to
refer to a specific character; for example, string[3]. Some languages require the
use of functions such as charAt(int).
Make sure you are familiar with all of the operations above in your own language in
advance of the Component 1 exam.
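As a hedged illustration, the operations mentioned above might look like this in Python. The string is an arbitrary example. Note that Python slices take a start position and an end position, rather than the start position and length used by some other languages.

```python
word = "Computer"      # illustrative string

third = word[3]        # index access; counting starts at 0
part = word[3:6]       # substring from position 3 up to, but not including, 6
code = ord(word[0])    # character to numeric code
letter = chr(code)     # numeric code back to a character

print(third, part, code, letter)
```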
Revision activity
✚ Write a program that will ask for a string and display each character’s
numeric code.
✚ Write a program that will ask for a series of numeric codes and convert them
into a single string which is displayed.
Exam tip
In most years there is some random number generation included in the skeleton code.
If you forget the correct syntax during the exam then make sure you know where to
find a working example from the skeleton code to help you.
Revision activity
✚ Write a program that will ask for a minimum and maximum number and will
generate a float between those two values.
✚ Write a program like the one above that will generate an integer between those
two values.
Exception handling
Exception handling is used to deal with situations where a program may crash. The program should try to execute a block of code and then catch the
Purpose of subroutines
A fundamental aspect of improving the efficiency of program code is the use of subroutines.
A subroutine allows a programmer to take a section of program code that performs a specific task and move it out of line of the rest of the program. This means that the programmer can call the subroutine in order to run that block of code at any point.
Subroutines are called using the subroutine’s identifier, followed by any arguments that must be passed in parentheses; for example, DisplayGreeting(name).
Subroutines that do not require any parameters must still be called using parentheses; for example, DisplayDate().
The advantages of subroutines are:
✚ the subroutine can be called multiple times without needing to duplicate the code
✚ changes to the subroutine only need to be made once
✚ it is easier to read the code
✚ it is easier to debug the code if there is a problem
✚ subroutines can be re-used in other programs
✚ the job of writing a program can be split, with each programmer tackling their own subroutines.
Exam tip
Make sure you are able to explain the difference between a function and a procedure, and that you can identify which subroutines are which in the skeleton code. This is a common topic in Section C.
Subroutine A named block of code designed to carry out one specific task.
Call The process of running a subroutine by stating its identifier, followed by any required arguments in parentheses.
Pass The transfer of a value, or the value of a variable, to a subroutine.
Making links
Using meaningful, self-documenting identifiers for subroutines is important to make
your code more readable. It is essential to do this as part of your NEA.
Subroutines are broken down into two types: functions and procedures.
A function is a subroutine that returns a value once it has finished executing. A typical use of a function is to carry out a calculation and return the result.
A procedure is a subroutine that does not return a value. A typical use of a procedure is to display some data and/or prompts to the user.
Subroutines can be grouped together in a file to form a module or library. These subroutines can then be re-used in other programs.
It is common to import libraries that have already been written to help solve problems; for example, in Java import java.util.* or, in Python, import random.
Function A subroutine that returns a value.
Procedure A subroutine that does not return a value.
Module A file that contains one or more subroutines that can be imported and used in another program.
Library A collection of modules that provide related functionality.
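The difference between a function and a procedure can be sketched in Python as follows. Python uses the same `def` keyword for both, so the distinction here is simply whether a value is returned. The subroutine names are hypothetical.

```python
def area_of_rectangle(width, height):
    """Function: carries out a calculation and returns the result."""
    return width * height


def display_greeting(name):
    """Procedure: displays output and does not return a value."""
    print("Hello, " + name)


area = area_of_rectangle(3, 4)   # the returned value can be stored and re-used
display_greeting("Sam")          # called only for the output it displays
print(area)
```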
Parameters
The parameters of a subroutine are the variables that must be passed to a subroutine when it is called. These are declared when the subroutine is written.
For example, in the code Procedure DisplayTemperature(int Temp, bool Celsius), the subroutine called DisplayTemperature needs to be passed an integer value and a Boolean value.
When a subroutine is called, the values must be passed as arguments; for example, DisplayTemperature(20,True).
Parameters The variables that a subroutine needs in order for the subroutine to run.
Arguments The actual values that are passed to the subroutine at runtime.
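A possible Python version of the DisplayTemperature example is sketched below. The body of the subroutine (the unit letters and the output format) is an assumption, since the text only gives the heading.

```python
def display_temperature(temp: int, celsius: bool) -> None:
    # temp and celsius are the parameters, declared when the subroutine is written
    unit = "C" if celsius else "F"   # assumed output format
    print(str(temp) + " degrees " + unit)


display_temperature(20, True)   # 20 and True are the arguments passed at runtime
```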
Returning a value
Functions must always return a value at the end of their execution. This is carried out with a return statement.
For example:
Function Double(int StartVal)
  return 2*StartVal
End function
The value that is returned is passed back to the part of the program that called the function.
It is possible for a subroutine to have several different return statements – for example, within a selection structure – but the subroutine will stop once a value has been returned.
Return To pass a value or the contents of a variable back to the place in the program where the function was called.
Now test yourself
44 Describe what is meant by the term parameter.
45 Explain the purpose of a return statement.
Answers available online
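The Double function above, and a function with more than one return statement, might be sketched in Python like this. The describe_mark function and its pass mark of 50 are hypothetical, added only to show that execution stops at the first return that is reached.

```python
def double(start_val):
    return 2 * start_val


def describe_mark(mark):
    # Two return statements inside a selection structure
    if mark >= 50:
        return "Pass"   # the function stops here when the condition is True
    return "Fail"       # only reached when the condition is False


print(double(21))
print(describe_mark(72))
```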
Local variables
When a subroutine is called, any variables passed as parameters and any variables declared within that subroutine are referred to as local variables. These variables can only be accessed within that subroutine and will be destroyed once the subroutine has finished executing. This is referred to as the scope of the variable.
Local variables Variables that are declared within a subroutine and can only be accessed by the subroutine during the execution of that subroutine.
Scope The visibility of variables (either local or global).
the program
✚ easier to debug
✚ variable identifiers can be re-used in separate subroutines
✚ subroutines can be more easily re-used (subroutines are modular).
Modular Independent of other subroutines.
Knowing that local variables are destroyed once that section of code has
finished executing, it is important to note that variables declared within a
selection statement or iteration structure will also be destroyed at the end of
that section of code. It is therefore very important to choose carefully where
in the program a variable will be declared.
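Local scope at the subroutine level can be illustrated with a short Python sketch. The subroutine names are hypothetical. Python scopes a variable to the whole subroutine rather than to a selection or iteration block, so this sketch only demonstrates subroutine-level scope.

```python
def calculate_total(price, quantity):
    total = price * quantity    # total, price and quantity are local to this call
    return total


def calculate_average(scores):
    total = sum(scores)         # a separate local variable that re-uses the identifier
    return total / len(scores)


print(calculate_total(5, 3))
print(calculate_average([4, 6, 8]))

try:
    print(total)                # total does not exist outside the subroutines
except NameError:
    print("total is out of scope here")
```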
Global variables
Global variables are variables that are declared in the main program and can be read or altered in any subroutine.
Accessing global variables from within subroutines reduces the need for passing parameters and using return statements but should generally be avoided where possible.
Global variables are, however, useful for named constants.
Global variables Variables that can be accessed from any subroutine.
Making links
Try to limit your use of global variables when working on your NEA programming project. Using a modular structure for your subroutines will increase the range of marks you are able to access.
Now test yourself
46 Explain what is meant by the scope of a variable.
47 Explain three advantages of using local variables.
48 Explain three disadvantages of using global variables.
49 Describe a situation where it would be appropriate to use a global variable.
Answers available online
Stack frames
When a subroutine is called a stack frame is created. This stack frame contains:
✚ the return address – where to return to in the program once the subroutine has finished executing
✚ parameters – the variables to which data was passed when the subroutine was called
✚ local variables – any variables declared within that subroutine.
Newly called subroutines are added to the top of the call stack and the top stack frame is removed once that subroutine has been completed.
Stack frame The collection of data associated with a subroutine call.
Call stack A data structure that stores the stack frames for each active subroutine while the program is running.
Exam tip
Make sure you can recall the three things stored in a stack frame as this is a common question in Section A.
Making links
Stacks are a complex data structure with a variety of uses in programming. Stacks and other complex data structures are explored further in Chapter 2.
Now test yourself
Recursive techniques
Some algorithms are best solved by solving smaller and smaller instances of the same problem. To achieve this, a function must call itself repeatedly – this is known as recursion. One example of recursion is used in calculating a factorial.
A factorial is calculated by multiplying a number by all of the positive integers less than itself. The ‘!’ is used as the mathematical symbol for ‘factorial’. For example:
5! = 5 × 4 × 3 × 2 × 1
This can be simplified as 5 × 4!
4! = 4 × 3!
3! = 3 × 2!
2! = 2 × 1!
1! = 1
A recursive subroutine will continue to call itself (known as the general case) until it reaches a decision that returns a value without calling itself (known as the base case or terminating case).
The pseudo-code algorithm for a factorial calculator might read:
Function Factorial (int n)
  IF n <= 1 THEN
    RETURN 1
  ELSE
    RETURN n * Factorial (n-1)
  END IF
End Function
If n = 1 or less then the function cannot attempt to break the task down any further, and should stop calling itself recursively, returning the result of 1 (1! = 1). This is the base case.
In all other cases the function will call itself. This is the general case.
It is possible to have more than one general case and more than one base case.
Recursion The process of a function repeatedly calling itself.
General case A case in which a recursive function is called and must call itself.
Base case The case in which a recursive function terminates and does not call itself.
Exam tips
Following a recursive algorithm can be very tricky and it is important to get lots of practice using trace tables to step through recursive algorithms.
Section B questions can require the use of recursion to achieve full marks, though a non-recursive solution will always be possible.
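The factorial pseudo-code above translates fairly directly into Python. This is one possible version, shown as a sketch rather than a required form; the loop at the end is only there to display a few results.

```python
def factorial(n: int) -> int:
    if n <= 1:
        return 1                   # base case: no further recursive call
    return n * factorial(n - 1)    # general case: the function calls itself


for value in range(1, 6):
    print(value, factorial(value))
```

Tracing factorial(3) by hand gives 3 * factorial(2), then 2 * factorial(1), then the base case returns 1 and the results are multiplied back up the chain of calls.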
Now test yourself
Programming paradigms
Procedural-oriented programming
Procedural-oriented programming is designed to allow programmers to use a structured, top-down approach to solve a given problem.
The program designer uses decomposition to break the problem down into increasingly small sub-problems, each of which can then be solved using a subroutine (either a function or a procedure).
The main program will then be constructed by calling subroutines which will, in turn, call other subroutines in order to solve the original problem.
Data can be passed to subroutines and values can be returned from them, allowing the different parts of the program to interact with each other.
This type of computer program is generally simpler to understand and the subroutines can be re-used at different points in the program without needing to copy them. This type of program is very modular and can easily be updated by changing one subroutine.
Making links
For more on subroutines see the section Purpose of subroutines above on page 18.
Top-down approach A method of planning solutions that starts with the big picture and breaks it down into smaller sub-problems.
Decomposition A method of solving a larger problem by breaking it up into smaller and smaller problems until each problem can’t be broken down any further.
Hierarchy charts
Procedural-oriented programs can be represented using hierarchy charts. Simple hierarchy charts show the relationship between subroutines, such as the chart shown in Figure 1.1, with lines used to connect subroutines wherever one subroutine calls another.
Hierarchy charts A diagram that shows which subroutines call which other subroutines. More complex versions will also show what data is passed and returned.
Exam tip
Questions involving hierarchy charts almost exclusively appear in Section C, referring to the skeleton program, and usually involve a ‘fill the gaps’ style of question. Make your own hierarchy charts when studying the skeleton code to help you understand how the subroutines fit together and see past papers for examples of this style of question.
Figure 1.1 A simple hierarchy chart (Program at the top level; Initialise, Input, Process and Output below it; Capture form, Enter data and Validate at the lowest level)
Now test yourself
Revision activity
✚ Open a complex program you have been working on in your lessons and create a
hierarchy chart to show the relationships between each subroutine.
✚ Consider the steps involved in a two-player game such as Rock-Paper-Scissors
or Noughts & Crosses. Create a hierarchy chart to show how the game could be
broken down into subroutines and how those subroutines would be related.
Object-oriented programming
Classes
Object-oriented programming (often referred to as OOP) uses a different approach to programming.
The programmer thinks about real-world objects and creates a class to describe the attributes and methods for that type of object. For instance, in Figure 1.2, the class called Customer has:
✚ the attributes: Name, Address, and Date of birth
✚ the methods: Edit customer and Delete customer.
Class The definition of the attributes and methods of a group of similar objects.
Attributes The properties that an object of that type has, implemented using variables.
Methods Processes or actions that an object of that type can do, implemented using subroutines.
Customer: attributes Name, Address, Date of birth; methods Edit customer, Delete customer
Account: attributes Account number, Balance; methods Check balance, Add interest
Figure 1.2 Classes containing attributes and methods
Encapsulation
Deciding how to structure and organise classes can be difficult, but the main rule is to group together objects with common characteristics (attributes) and behaviours (methods). Keeping these features together in one class is called encapsulation.
Encapsulation is useful because it means that program code is more modular. This means that the code is easier to debug, can be re-used more easily and teams of programmers can work on individual classes without needing to know how other classes are programmed – they only need to know what methods can be accessed.
Encapsulation is also helpful as it allows for information hiding, in which the data stored in the attributes can be kept within that object and access to that data can be controlled.
Encapsulation The concept of grouping similar attributes, data, and methods together in one object.
Information hiding Controlling access to data stored within an object.
or method.
Access specifier | Symbol | Attributes | Methods
public | + | can be viewed or updated by any object of any class | can be called by any object of any class
private | - | cannot be viewed or updated by any other object, regardless of class | cannot be called by any other object, regardless of class
protected | # | can only be viewed or updated by this object or another object of that class, or a subclass | can only be called by this object or another object of that class, or a subclass
Public An access specifier that allows that attribute or method to be accessed by any other object.
Private An access specifier that protects that attribute or method from being accessed by any other object.
Protected An access specifier that protects that attribute or method from being accessed by other objects unless they are instances of that class or a subclass.
The convention is to declare attributes as private so that other objects cannot directly interact with or affect the values that are stored. This makes it less likely that a class written by another programmer could adversely affect the overall program.
To allow access to those data, a class should include getters and setters – public methods that allow other objects to ask an object to return the value of a specific attribute, or that ask an object to update a value.
Getter A function used to return the value of an attribute to another object.
Setter A procedure used to allow another object to set or update the value of an attribute.
One example might be a game character that has a score and a number of lives which are both set to private, and uses public methods to allow other objects to interact with those values:
Character = Class
  Private:
    Score: Int
    Lives: Int
  Public:
    Function GetScore()
    Function GetLives()
    Procedure AddPoints()
    Procedure LoseALife()
    Procedure GainExtraLife()
End Class
Revision activity
✚ Using pseudo-code, describe the class attributes and methods required for a virtual pet.
✚ Design and build an object-oriented program that uses a Calculator class with methods such as Press0, Press1, PressPlus and PressEquals.
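The Character class outlined in pseudo-code above could be sketched in Python as shown below. Python has no private keyword, so this sketch assumes the double-underscore naming convention as the nearest equivalent. The starting values, the points parameter and the validation checks are all assumptions added for illustration.

```python
class Character:
    def __init__(self):
        # Double underscores make Python name-mangle these attributes,
        # which discourages direct access from outside the class
        self.__score = 0
        self.__lives = 3            # assumed starting values

    def get_score(self):            # getter
        return self.__score

    def get_lives(self):            # getter
        return self.__lives

    def add_points(self, points):   # assumed parameter
        if points > 0:              # controlled update: ignore non-positive values
            self.__score += points

    def lose_a_life(self):
        if self.__lives > 0:
            self.__lives -= 1

    def gain_extra_life(self):
        self.__lives += 1


player = Character()
player.add_points(50)
player.lose_a_life()
print(player.get_score(), player.get_lives())
```

Because other objects can only change the score through add_points, the class decides which updates are acceptable, which is the point of information hiding.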
This means that the same code can be re-used without needing to be copied, and if the code needs to be changed then it only needs to be changed in the base class, reducing the risk of errors and making it easier to debug.
In class diagrams, inheritance is shown using a hollow arrow, which always points towards the base class.
Figure 1.4 An inheritance diagram (Current and Mortgage each point to Account)
Figure 1.4 shows a class structure in which two types of account (Current and Mortgage) are subclasses that will inherit the attributes and methods from the Account base class.
Revision activity
✚ Design a base class, Room, and subclasses Kitchen, LivingRoom and Bedroom. Consider the attributes that could be inherited and those which must be declared in the subclasses.
✚ Design classes that could be used to model various forms of public transport, including buses, taxis and trains.
Exam tip
In AQA Pseudo-code, subclasses are declared by adding the base class name in parentheses; for example:
Dog = Class (Animal)
It is rare for the exam to ask students to describe a class using pseudo-code. However, when this is the case, then language-specific syntax is accepted; for example:
Dog = Class extends Animal
Exam tip
The terms subclass and child class are interchangeable and both mean the same thing. The terms base class and parent class are also interchangeable. It is helpful to either use the terms subclass and base class OR child class and parent class for consistency, but both sets of terminology are acceptable.
Subclasses will have additional properties and methods specific to that type of object (for example, a current account might have a withdrawal limit or the ability to allow an overdraft, whereas a mortgage account might have a fixed end-date and the ability to allow a payment holiday).
Both subclasses will inherit the attributes and methods from the base class (that is, they both have an account number and a balance and both have methods for checking the balance and adding interest).
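The Account, Current and Mortgage relationship described above might be sketched in Python as follows. The constructor parameters, the example values and the way interest is calculated are assumptions, and the attributes are left public to keep the sketch short.

```python
class Account:
    def __init__(self, account_number, balance):
        self.account_number = account_number
        self.balance = balance

    def check_balance(self):
        return self.balance

    def add_interest(self, rate):
        self.balance += self.balance * rate


class Current(Account):     # a Current account 'is an' Account
    def __init__(self, account_number, balance, withdrawal_limit):
        super().__init__(account_number, balance)
        self.withdrawal_limit = withdrawal_limit   # specific to current accounts


class Mortgage(Account):    # a Mortgage account 'is an' Account
    def __init__(self, account_number, balance, end_date):
        super().__init__(account_number, balance)
        self.end_date = end_date                   # specific to mortgage accounts


current = Current("C001", 200, 500)
mortgage = Mortgage("M001", 1000, "2040-01-01")
mortgage.add_interest(0.5)   # inherited method, not redefined in Mortgage
print(current.check_balance(), mortgage.check_balance())
```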
Job Role
Manager Employee
Composition aggregation
Composition aggregation describes a more dependent relationship in which
the container object is directly made up (composed) of the associated objects.
If the container object is removed or destroyed then so are the associated
objects. In our example the container object Workforce is made up of
managers and employees. Removing Workforce from the model altogether
results in the removal of the employees and managers.
In class diagrams, composition aggregation is shown using a filled diamond,
which always points towards the container class.
Workforce
Manager Employee
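One way to sketch composition aggregation in Python is for the container to create its own associated objects, so that nothing outside the container refers to them. The class bodies, the names passed in and the headcount method are assumptions made for illustration.

```python
class Manager:
    def __init__(self, name):
        self.name = name


class Employee:
    def __init__(self, name):
        self.name = name


class Workforce:
    def __init__(self, manager_names, employee_names):
        # The Workforce creates and owns its staff objects, so no other
        # part of the program holds a reference to them
        self.managers = [Manager(n) for n in manager_names]
        self.employees = [Employee(n) for n in employee_names]

    def headcount(self):
        return len(self.managers) + len(self.employees)


workforce = Workforce(["Asha"], ["Ben", "Cara"])
print(workforce.headcount())
```

Because only the Workforce object refers to its managers and employees, discarding the Workforce leaves them unreachable as well.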
Polymorphism
Polymorphism means ‘many forms’. It covers the potential scenario of having two or more methods with the same name.
✚ A method to ring(PhoneNumber) would involve dialling that specific number.
✚ A method to ring(Name) would involve looking up the phone number for the person you want to ring and then ringing that number.
Both methods have the same name, but different parameters. By looking at the data type of the value that is passed, the object can identify which method to use.
Having methods with the same names but different parameters means that the same basic goal can be achieved using different steps, or methods, and is a common feature in object-oriented programming.
Polymorphism Literally ‘many forms’ – the ability for two methods with the same name to carry out their actions using different code.
Overriding A method in a subclass re-defining a method inherited from a base class.
Exam tip
Look carefully through the skeleton code for any examples of polymorphism. If there are any then this is a likely question to appear in Section C.
Overriding
Overriding is the situation where a base class has a method and a subclass has a method of the same name, but with different steps. The subclass method overrides (takes priority over) the method from the base class.
For example, an Account class can have a method CloseAccount that transfers the remaining balance to the customer.
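Both ideas can be sketched in Python, with some caveats. Python does not choose between same-named methods by parameter type automatically, so this sketch assumes `functools.singledispatchmethod` as a stand-in, and stores the phone number as an integer purely so the two versions of ring receive different data types. The Mortgage behaviour for closing an account is also an assumption.

```python
from functools import singledispatchmethod


class Phone:
    def __init__(self, contacts):
        self.contacts = contacts            # maps a name to a number

    @singledispatchmethod
    def ring(self, target):
        raise TypeError("Cannot ring this type of value")

    @ring.register
    def _(self, target: int):               # ring(PhoneNumber)
        return "Dialling " + str(target)

    @ring.register
    def _(self, target: str):               # ring(Name)
        return self.ring(self.contacts[target])


class Account:
    def close_account(self):
        return "Balance transferred to customer"


class Mortgage(Account):
    def close_account(self):                # overrides the inherited method
        return "Outstanding balance must be repaid first"


phone = Phone({"Sam": 5550123})
print(phone.ring(5550123))
print(phone.ring("Sam"))
print(Account().close_account())
print(Mortgage().close_account())
```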
Revision activity
✚ Create flashcards for each piece of key vocabulary in this chapter.
✚ Examine the skeleton program for your Component 1 exam, or a previous skeleton
program, and identify as many examples of instantiation, inheritance, aggregation,
polymorphism and overriding as possible.
classes, which is not true for subclasses.
Changing a base class can cause unexpected side-effects in subclasses,
so using composition rather than inheritance means it is safer to update
program code in the future.
A particular problem with inheritance is that a subclass can’t inherit from two
different classes at once. This means that if a new behaviour is common to
two different classes then it may not be easy to re-use code without copying
it into both classes.
on that concept will appear in Section C and it is possible you will be expected to use those techniques when writing your own program code for Section D.
Class diagrams
The design of classes can be described using a class diagram, using arrows and diamonds to describe the relationships.
Simple class diagrams just show the names of classes and the relationships between them.
More detailed class diagrams show the attributes and methods within each class as well.
These diagrams show the name of the class in the top section, attributes in the middle section and methods in the bottom section, along with access specifiers.
relationship symbol. Make sure you remember the following.
✚ Inheritance is shown using a hollow arrow, which always points towards the base class.
✚ Association aggregation is shown using a hollow diamond, which always points towards the container class.
✚ Composition aggregation is shown using a filled diamond, which always points towards the container class.
Base class or parent class
Account
– AccountNumber: String
– OpeningDate: Date
– CurrentBalance: Currency
– InterestRate: Real
+ GetAccountNumber()
+ GetCurrentBalance()
# AddInterest()
+ SetInterestRate()
Subclasses or child classes
Current
– PaymentType: String
– Overdraft: Boolean
+ SetPaymentType()
+ SetOverdraft()
+ GetOverdraft()
Mortgage
– EndDate: Date
+ GetEndDate()
+ SetEndDate()
Figure 1.7 A class diagram showing the design for different types of account
Note the way that attributes are described using Identifier: Data Type.
Revision activity
✚ Create a class diagram that could be used to model a house, including a House container class and associated classes that include Kitchen, LivingRoom, Bedroom and Garden. Consider the relationship between each class.
✚ Examine the skeleton program for your Component 1 exam, or a previous skeleton program, and create a detailed class diagram to show attributes and methods in each class as well as the relationships between them.
Now test yourself
85 What type of method can be called even if an object of that class has not been instantiated?
86 What key word means that a method can be overridden?
87 What type of method must be overridden?
88 Identify six things that should be included in a detailed class diagram.
Answers available online
Summary
a data type and a value. It is important to choose a meaningful identifier
✚ Named constants are variables whose values cannot be changed while the program is running
✚ Selection is used to decide which block of code to execute and is implemented using an IF or SWITCH statement
✚ Iteration is used to repeat a block of code and either uses a count-controlled (FOR) loop if the number of repetitions is known or a condition-controlled (WHILE or DO-WHILE) loop if the number of repetitions is not known
✚ Arithmetic operators include basic arithmetic, plus rounding, truncation and the use of DIV and MOD
✚ Relational operators can be combined with Boolean operators to create more complex conditions
✚ String handling operations include the basic skills of finding the length, addressing specific positions and concatenation (joining) of strings, as well as extracting substrings, converting to and from the numeric character codes and converting data types
✚ It is important to be able to use random number generation and exception handling in your programming
✚ Exception handling is an important tool when a block of code has a chance of failure (for example, opening a file that may not exist, or processing user inputs that may be in the wrong format)
✚ Subroutines allow for code to be re-used. Parameters describe the data that must be passed to a subroutine when it is called
✚ Functions are subroutines that return a value. Procedures are subroutines that do not return a value
✚ Local variables are preferable to global variables in most cases as they use less memory, make subroutines more modular and make programs easier to read and debug.
✚ Stack frames contain the return address, parameters, and local variables of a subroutine call while it is running
✚ A recursive function is a function that calls itself. It must include at least one base case, a point at which the function will stop calling itself
Programming paradigms
✚ There are two main programming paradigms to study – procedural-oriented programming and object-oriented programming
✚ In procedural-oriented programming the main problem is broken down into smaller sub-problems, each of which is solved using a subroutine
✚ Hierarchy charts are used to show which subroutines call other subroutines
behaviours of real-world objects
✚ A class contains the attributes (variables) and methods (subroutines) associated with that type of object and functions as a blueprint for how objects of that class will behave
✚ Objects are instances of a specific class and instantiated by calling their constructor
✚ The process of grouping objects with common attributes and behaviours together is called encapsulation
✚ Access to an object’s attributes and methods is controlled using access specifiers
✚ It is common for all attributes to be declared as private, and for setter and getter methods to be declared as public in order to allow other objects to interact in a controlled manner
✚ Inheritance describes an ‘is a’ relationship. A subclass will inherit the attributes and methods of a base class
✚ Aggregation describes a ‘has a’ relationship. A container class will be linked to one or more associated classes
✚ In association aggregation, if the container class is destroyed then the associated classes will be untouched
✚ In composition aggregation, if the container class is destroyed then the associated classes will be destroyed as well.
✚ Polymorphism is the term for having two or more methods with the same identifier in the same class, but using different parameters to allow the program to decide which method to run
✚ Overriding is the term for a subclass having a method with the same identifier as a method in the base class; the method in the subclass will always take precedence
✚ There are three design principles to remember: ‘Encapsulate what varies’, ‘Favour composition over inheritance’ and ‘Program to interfaces, not implementation’
✚ You should not be asked about abstract, virtual and static methods but these may appear in the skeleton code
✚ Class diagrams should always show the name of the class, the attributes for that class (including data types and access modifiers) and the methods for that class (including access modifiers)
✚ Inheritance is shown using a hollow arrow that points towards the base class
✚ Association aggregation is shown using a hollow diamond that points towards the container class
✚ Composition aggregation is shown using a filled diamond that points towards the container class
Exam practice
1 Dave has been asked to write a program as part of a project looking at rainfall. At the end of a week the user of the
program will enter the total rainfall (measured in mm) for each one of the last seven days as a number with one
FUNCTION Palindrome()
  OUTPUT "Enter a word or phrase"
  Phrase ← INPUT
  Result ← PCheck(Phrase)
  IF Result = TRUE THEN
    OUTPUT "This is a palindrome"
  ELSE
    OUTPUT "This is not a palindrome"
  ENDIF
ENDFUNCTION
FUNCTION PCheck(Phrase)
  NumChars ← Length(Phrase)
  IF NumChars < 2 THEN
    RETURN TRUE
  ELSEIF Phrase[0] = Phrase[NumChars-1] THEN
    RETURN PCheck(Phrase.Substring(1,NumChars-2))
  ELSE
    RETURN FALSE
  ENDIF
ENDFUNCTION
3 This class diagram is a partial representation of the relationships between some of the classes in a program for a
garage.
Vehicle
- Registration: String
- Colour: String
- New: Boolean
- Owner: Owner
+ GetDetails()
# ChangeColour()
+ ChangeOwner()
Each individual item in the array is referred to using the identifier (or variable name) of the array, followed by the index in brackets (usually square brackets). Each index position can hold data.
Index [0] [1] [2] [3] [4]
Data 1 3 5 7 11
List A data structure similar to an array, commonly used in Python in place of an array.
Exam tip
In most languages (including C#, Delphi/Pascal, Java & VB.Net) the array index starts at 0, but be very careful when reading exam questions as some languages and some scenarios will have the array index starting at 1.
Note
Arrays are always declared as a fixed size and cannot be changed later in the program. In an array, all data must be of the same data type.
Lists are very similar to arrays, however it is possible to alter the size of a list while the program is running and it is possible to store data of differing types in the same list.
In some languages (notably including Python) it is much more common to use a list, whereas in others it is much more common to use an array.
You are not expected or required to show your understanding of the difference between the two, and you should use whichever version is most suitable for your chosen programming language.
Arrays are an effective solution for storing several values because a FOR loop
can be used to iterate (or loop) over an array.
In this example Numbers is the name of an array and Length(Numbers)
returns the size of the array (the number of items it contains).
FOR i ← 0 TO Length(Numbers) - 1
  OUTPUT Numbers[i]
ENDFOR
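The loop above might be written in Python as follows, using a list in place of an array. The values are the ones from the index and data table earlier in this section.

```python
numbers = [1, 3, 5, 7, 11]

# range(len(numbers)) produces the indexes 0 up to len(numbers) - 1
for i in range(len(numbers)):
    print(numbers[i])
```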
This code will step through the items in the array n times, where n is the size of the array. This code would work for all arrays regardless of their size. This is far more efficient than stepping through several variables one at a time, as this would require a separate line of code for each variable and the program
Making links
One-dimensional arrays can be an effective way
Numbers
     [0] [1] [2] [3]
[0]   1   3   5   7
[1]  11  13  17  19
[2]  23  29  31  37
[3]  41  43  47  51
Isolating row 2 gives Numbers[2]; position 2 in that row is Numbers[2][2] = 31. In the same way, Numbers[3][1] = 43.
Figure 2.1 A two-dimensional array called Numbers
Exam tip
Arrays can also be defined using the first index to find the column and the second index to find the item in that column. Always read the question carefully as any ambiguities should be made clear.
Making links
One common use for a two-dimensional array is storing a matrix. Matrices are
often used in complex maths, and can also be used to represent a graph data
structure. This is explored further later in this chapter.
Matrix A rectangular, two-dimensional collection of values.
In the example above the contents of Numbers[0][3] can be found by isolating
row 0 and then finding the item in position 3.
A two-dimensional array is useful because it can store more complex data; for
example, the scores for different students or values for different days.
It is possible to find the size of a two-dimensional array using the Length
function (exact syntax varies by programming language).
For an array with indexes defined as [row][column], to find the number of
rows, find the number of one-dimensional arrays the table can be split into:
NumberOfRows ← Length(ArrayName)
To find the number of columns, find the length of a row in the table:
NumberOfColumns ← Length(ArrayName[0])
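The row and column calculations above can be sketched in Python, as one possible language choice. The variable and function names here are illustrative assumptions, and the data simply reuses the values from the Numbers example.

```python
# Hypothetical data: the same values as the Numbers array in Figure 2.1.
numbers = [
    [1, 3, 5, 7],
    [11, 13, 17, 19],
    [23, 29, 31, 37],
    [41, 43, 47, 51],
]

number_of_rows = len(numbers)        # how many one-dimensional arrays (rows)
number_of_columns = len(numbers[0])  # length of the first row


def all_items(grid):
    """Return every item in row-by-row order using nested FOR loops."""
    items = []
    for row in range(len(grid)):
        for column in range(len(grid[row])):
            items.append(grid[row][column])
    return items
```

Indexing with numbers[0][3] follows the [row][column] convention described in the text.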
Exam tip
While questions on two-dimensional arrays are quite common, three-dimensional
arrays are generally limited to potential skeleton programs and NEA programming
projects. Examine the skeleton code carefully for any examples of multi-dimensional
arrays.
Revision activity
✚ Create a two-dimensional array called MovieRatings. Populate the array with
film review ratings, with each row representing one review website and each
column representing one film.
✚ Create a menu and write program code that will allow the user to:
✚ find the min, max and mean average review score for each film
✚ find the min, max and mean average review score from each website
✚ find a specific review rating given the name of the film and the review website.
✚ Create a 2-dimensional array to store the game state for a game of noughts and
crosses.
Each item in a record can be a different data type, making a record more
complex than an array, but also more flexible.
There are many ways to implement a record, and this is handled differently in
different programming languages. Understanding and recognising records of
data and which fields are being used is the key concept.
One common concept when considering records is the importance of writing
records to a file.
File A persistent collection of data, saved for access and accessible once the
program or subroutine has finished running.
Text file A file that uses text encoding (such as ASCII or Unicode) to store data.
Binary file A file that uses binary values to represent each item of data.
Files are a way of permanently storing data which would otherwise be lost
once the program or subroutine has finished executing.
Text files use a character set (such as ASCII or Unicode) to store the data as
text and typically use a delimiter such as a comma (,) or colon (:) to separate
the individual items of data. These files are usually saved with a .txt or .csv
(comma separated value) file extension.
A saved file from a game of Noughts and Crosses might be saved like this as a
.csv file, where a comma is used to separate each item of data.
-  X  -
-  O  X
-  X  O
-,X,-,-,O,X,-,X,O
An alternative is to save the game as a binary file. A binary file only stores
binary information, and needs a key to translate binary into particular items
of data. For instance, for the Noughts and Crosses game, using two bits per
cell, the following key could be used:
00 Empty
01 X
10 O
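As a rough illustration of the two formats, the sketch below encodes the same board as comma-separated text and as packed binary using the two-bit key above. The function names are assumptions made for this example, and packing the 18 bits into 3 bytes is only one possible layout.

```python
# Key from the text: 00 = empty, 01 = X, 10 = O (two bits per cell).
CODES = {"-": 0b00, "X": 0b01, "O": 0b10}
SYMBOLS = {code: symbol for symbol, code in CODES.items()}


def to_text(cells):
    """Join the nine cells with commas, as in the .csv example."""
    return ",".join(cells)


def to_binary(cells):
    """Pack nine 2-bit codes into 18 bits, stored in 3 bytes."""
    packed = 0
    for cell in cells:
        packed = (packed << 2) | CODES[cell]
    return packed.to_bytes(3, "big")


def from_binary(data):
    """Unpack 3 bytes back into a list of nine cell symbols."""
    packed = int.from_bytes(data, "big")
    cells = []
    for _ in range(9):
        cells.append(SYMBOLS[packed & 0b11])
        packed >>= 2
    return cells[::-1]  # cells were read from last to first


board = ["-", "X", "-", "-", "O", "X", "-", "X", "O"]
```

The text form of this board needs 17 characters, while the packed form needs 3 bytes. The bytes could be written with a file opened in "wb" mode and the text with a file opened in "w" mode.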
Note
2 Fundamentals of data structures
For A-level it is only necessary to consider sequential file access. That is, reading files
from start to end, in order.
Exam tip
The syntax for writing to and reading from files varies significantly in different
programming languages. It is important to be familiar with file handling routines in
your chosen programming language and to look carefully for any file handling that
takes place in the skeleton program.
Revision activity
✚ Create a two-dimensional array to represent the state of a game of noughts and
crosses. Write program code that will:
✚ save the game state as a text file
✚ load a game state from a text file
✚ save the game state as a binary file
✚ load a game state as a binary file.
✚ Compare the file size of the text and binary files. Try opening both types of file in a
simple text editor (such as Notepad) and compare the contents.
Queues
A queue data structure works in a very similar way to a queue in the real
world.
First In, First Out (FIFO) Those items placed into the queue first will be the
first ones to be accessed.
Pointer A value that stores an address. In the context of queues this is
usually the index of the front or rear item.
Common uses for queues include:
✚ buffering (storing data as it arrives until it can be processed)
✚ simulating a card game (cards are drawn from the front and replaced at
the back).
Making links
Queues are a fundamental part of the breadth first search (BFS) which is explored in
Chapter 3.
FP RP
[0] [1] [2] [3] [4] [5] FP 0
Dave Angelina Faaris RP 2
RP
[0] [1] [2] [3] [4] [5]
Dave Angelina Faaris Kev RP 3
Items are retrieved from the front of the queue by moving each item forward
in the array:
RP
[0] [1] [2] [3] [4] [5]
Angelina Faaris Kev RP 2
This can be inefficient, especially for large queues, as moving each item in the
array takes time.
FP RP
[0] [1] [2] [3] [4] [5] FP 1
Dave Angelina Faaris Kev RP 3
Note
In a circular queue there is no need to delete the value from the array, only to move
the pointers. The data that has been dequeued will eventually be overwritten when
the rear pointer comes back around.
When a pointer reaches the end of the array, it wraps around back to the start.
For example, given the following state of a circular queue:
FP RP
[0] [1] [2] [3] [4] [5] FP 2
Dave Angelina Faaris Kev RP 5
When a new item is enqueued, the rear pointer will wrap around to 0.
RP FP
[0] [1] [2] [3] [4] [5] FP 2
Aadya Dave Angelina Faaris Kev RP 0
Note
Exact implementations vary slightly, for example some implementations have the rear
pointer pointing to the first empty space rather than the last full space.
A circular queue is much more efficient than a linear queue when the queue is
large as it avoids the need to move each item each time a value is dequeued.
A circular queue is also a more complex data structure, so a linear queue
may be more appropriate for a smaller queue.
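A minimal sketch of a circular queue in Python is shown below. It assumes the rear pointer marks the last full space (one of the variations mentioned in the note) and keeps a separate size counter to tell a full queue from an empty one; the class and method names are illustrative only.

```python
class CircularQueue:
    """Fixed-size circular queue; rear points at the last full space."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.front = 0   # index of the front item
        self.rear = -1   # index of the rear item (-1 until something is added)
        self.size = 0    # number of items currently stored

    def is_empty(self):
        return self.size == 0

    def is_full(self):
        return self.size == len(self.items)

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        self.rear = (self.rear + 1) % len(self.items)  # wrap around to 0
        self.items[self.rear] = item
        self.size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self.items[self.front]  # the old value is left in place
        self.front = (self.front + 1) % len(self.items)
        self.size -= 1
        return item
```

The modulo operation is what makes each pointer wrap back to index 0 after reaching the end of the array, so no items ever need to be moved.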
FP RP
[0] [1] [2] [3] [4] [5]
Engine Steering Calculate
Warning Input Position
5 4 3
FP RP
[0] [1] [2] [3] [4] [5]
Engine Steering Rudder Calculate
Warning Input Input Position
5 4 4 3
Stacks
A stack is a Last In, First Out (LIFO) data structure.
Much as with a stack of plates, new items are placed at the top of the stack
and items are also removed from the top of the stack.
Stack A data structure in which items are added to the top and removed from
the top, much like a stack of plates.
Last In, First Out (LIFO) Those items placed into the stack most recently will
be the first ones to be accessed.
Common uses for stacks include:
✚ reversing a list of values
✚ performing undo operations
✚ as a call stack for keeping track of subroutine calls.
A stack can be implemented using a single-dimensional array and a single
integer variable to point at the top of the stack.
[5] Top 2
[4]
[3]
[2] Change font Top
[1] Make bold
[0] Type
The IsFull and IsEmpty actions are necessary because a program will crash if
there is an attempt to push data when the stack is already full or to pop data
from an empty stack.
When a value is pushed onto the stack the top pointer is incremented and the
value is placed at the new top position.
[5] Top 3
[4]
[3] Delete selection
[2] Change font Top
[1] Make bold
[0] Type
When a value is popped from the stack the value at the top position is
returned and the top pointer is decremented.
[5] Top 2
[4]
[3] Delete selection
[2] Change font Top
[1] Make bold
[0] Type
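The push and pop behaviour shown in the diagrams above could be sketched in Python as follows. This is one possible layout, assuming the top pointer holds the index of the top item and is -1 when the stack is empty; the method names are illustrative.

```python
class Stack:
    """Array-based stack; top is the index of the top item (-1 when empty)."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = -1

    def is_empty(self):
        return self.top == -1

    def is_full(self):
        return self.top == len(self.items) - 1

    def push(self, item):
        if self.is_full():
            raise OverflowError("stack is full")
        self.top += 1
        self.items[self.top] = item

    def peek(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        return self.items[self.top]

    def pop(self):
        item = self.peek()  # raises an error if the stack is empty
        self.top -= 1       # the old value stays in the array
        return item
```

Checking is_full before a push and is_empty before a pop is what prevents the errors described earlier.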
In a stack there is no need to delete the value from the array, only to move
the pointer. The data that has been popped will eventually be overwritten when
new data is pushed and the top pointer moves back up.
17 State the acronym used to classify a stack data structure, and state the words this
stands for.
18 Describe two potential uses for a stack.
19 What error could occur when trying to Push an item? State how this situation could
be dealt with.
20 Describe the difference between Pop and Peek.
Answers available online
Revision activity
✚ Create a Stack class in your chosen programming language. Include the following
methods:
✚ IsFull
✚ IsEmpty
✚ Push
✚ Pop
✚ Peek
✚ Take a program that you have written that makes use of subroutines and write down
the state of the call stack each time a new subroutine is called or a value returned.
✚ Repeat the first bullet for a recursive program that you have written.
Graphs
A graph is a data structure designed to represent more complex
relationships.
Graph A data structure
Types of graph
Unweighted and undirected graphs
A simple graph (unweighted and undirected) shows how two or more items
are connected.
Figure 2.3 An unweighted, undirected graph (Town A connected to Town B)
Weighted graphs
A weighted graph is used where the connection between the two nodes has a
cost or value (such as a distance).
Figure 2.4 A weighted graph (Town A connected to Town B by an edge of weight 30)
Weighted graph A graph in which each edge has a value or cost associated with it.
Directed graph A graph in which some edges can only be traversed in one
direction, shown with an arrow head.
Figure 2.5 A directed graph (nodes Charles, Pauline, Harry, Dave and Jack)
In a directed graph, not all edges need to be one-directional.
Exam tip
Make sure you can identify the different types of graph. Be careful with the
term ‘edge’. This doesn’t mean the outskirts of the graph, but the connections
between each node.
Disconnected graphs
A disconnected graph is one in which two or more vertices are not connected
to the rest of the graph.
Figure 2.6 A disconnected graph (nodes A, B, C, D and E)
Now test yourself
21 Explain the following terms.
a Weighted graph
b Directed graph
c Disconnected graph
22 State two other terms for a line or a connector in a graph.
23 What symbol is only used in a directed graph?
24 Suggest two possible uses for a graph data structure.
Answers available online
For a weighted graph, this list also includes the cost of travelling along
each edge.
This is efficient for sparse graphs, where the presence of edges does not need
to be tested often, because sparse graphs have very few edges and so not
much storage space is needed. However reading and processing each item in
an adjacency list can be inefficient if the list is large.
An adjacency matrix uses a grid to store details of adjacent nodes.
For an unweighted graph the adjacency matrix uses a binary number to show
the presence or lack of an edge.
     A  B  C  D  E
A    0  1  1  0  0
B    1  0  1  0  1
C    1  1  0  1  0
D    0  0  1  0  1
E    0  1  0  1  0
This is efficient for dense graphs, where the presence of edges needs to be
tested often, because the adjacency matrix is simple to step through. However
a lot of space can be wasted if the graph has few edges.
Sparse graph A graph with few edges.
Adjacency matrix A method of representing a graph using a grid with values in
each cell to show which nodes are connected to each other.
Dense graph A graph with many edges.
Undirected graphs will produce an adjacency matrix with a diagonal line of
symmetry.
Directed graphs will produce an adjacency matrix without symmetry, as not
all edges can be traversed in either direction.
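To compare the two representations, the sketch below stores the unweighted, undirected example graph as an adjacency list (a Python dictionary of lists, which is an assumed layout) and builds the matching adjacency matrix from it. The helper names are illustrative.

```python
# Adjacency list for the unweighted, undirected example graph in the text.
adjacency_list = {
    "A": ["B", "C"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["B", "D"],
}


def to_matrix(adj):
    """Build an adjacency matrix (a list of rows) from an adjacency list."""
    nodes = sorted(adj)
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for node, neighbours in adj.items():
        for neighbour in neighbours:
            matrix[index[node]][index[neighbour]] = 1
    return matrix


def is_symmetric(matrix):
    """True when matrix[r][c] equals matrix[c][r] for every cell."""
    size = len(matrix)
    return all(matrix[r][c] == matrix[c][r]
               for r in range(size) for c in range(size))


matrix = to_matrix(adjacency_list)
```

The list stores 12 entries for this graph while the matrix always stores 25 cells, which illustrates the storage trade-off for sparse graphs.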
     A  B  C  D  E
A    0  1  0  1  0
B    0  0  0  0  1
Weighted graphs will produce an adjacency matrix that stores the cost of each
edge.
Where nodes are not adjacent the cost of traversing is infinite.
     A    B    C    D    E
A    ∞   20   30    ∞    ∞
B   20    ∞   30    ∞   25
C   30   30    ∞   35    ∞
D    ∞    ∞   35    ∞   40
E    ∞   25    ∞   40    ∞
Figure 2.11 An adjacency matrix for a weighted graph
The number of edges is the main consideration when deciding whether an
adjacency list or an adjacency matrix is the most appropriate data structure
for recording the details of a graph.
Adjacency list
Advantages: Good for sparse graphs (few edges) or situations where edges
don’t need to be tested often as it takes up less storage space.
Disadvantages: Poor for dense graphs as it takes longer to process each value.
Adjacency matrix
Advantages: Good for dense graphs (many edges) or situations where edges do
need to be tested often as it is easier to process each value.
Disadvantages: Poor for sparse graphs as it wastes storage space.
Exam tips
Make sure you are confident recalling the pros and cons of adjacency lists and
adjacency matrices, as this is a common question.
The difference between a graph and a tree is that a tree has no cycles. This
question comes up quite often.
25 Which representation for a graph, when written down, looks most like a grid?
26 Which representation for a graph is most suitable for a sparse graph?
27 When else would that representation be most suitable?
28 When would an adjacency matrix not be symmetrical?
29 True or false? An adjacency list cannot be used for a weighted graph.
Answers available online
✚ Create a weighted graph showing the time taken to travel between five
towns near your home.
✚ Create an adjacency matrix and an adjacency list to represent the graph.
Trees
A tree is a special example of a graph which has no cycles (or loops).
Trees are always connected (no disconnected nodes) and undirected (edges
can be traversed in both directions).
A rooted tree has one node as the root, or starting point, and all of the edges
tend to lead away from the root. Rooted trees are commonly used to represent
a hierarchical structure, such as the chapters in this book. They are very
commonly used as binary search trees, as described below.
Tree A graph which has no cycles (it is not possible to loop back around to a
previously traversed part of the tree).
Rooted tree A tree with a root node, from which all edges lead away.
Figure 2.12 A rooted tree to represent chapters and topics in this book
(Component 1 and Component 2, with topics 1 Programming, 2 Data Structures,
3 Algorithms and 4 Computation)
A non-rooted tree doesn’t have a clear start point and appears at a glance to
be more like a graph, though with no cycles.
A binary tree is a rooted tree in which each node has a maximum of two
children.
Figure 2.14 A binary search tree for storing pets in alphabetical order
(Cat at the root, with Budgie, Fish and Rabbit below)
Binary trees consist of:
Root the node that is the starting point for the tree
Branch a node that comes after the root and has one or more children
Leaf a node that does not have any children
Binary search trees are especially useful for storing data in an ordered way
and can be quickly and easily traversed to provide an ordered list.
When adding data to a binary search tree, each new node is added according
to its order.
In the example in Figure 2.14, where pets are stored alphabetically, to insert
the value Dog we would:
✚ Compare to the root: Dog comes after Cat so follow the right edge
✚ Compare to the next node: Dog comes before Fish in the alphabet so we
follow the left edge
✚ If there is no child node in that direction, then we add one:
Figure 2.15 A binary search tree with the addition of a node for Dog
(Cat at the root; Budgie and Fish below Cat; Dog and Rabbit below Fish)
The order in which the nodes are written will be different depending on the
order in which nodes are added. However the binary search tree will always
be able to provide a correctly ordered list.
Note
Binary trees are also useful for representing equations and can be traversed in such
a way as to produce equations in infix notation, Polish notation and reverse Polish
notation (RPN).
Binary tree A tree in which each node has a maximum of two children.
Root A node that has no parent nodes.
Branch A node that has a parent node and at least one child node.
Leaf A node that has no child nodes.
Binary search tree A binary tree used to store data in order so that it can be
quickly and easily searched.
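The steps for inserting Dog can be sketched in Python as below. The Node class and function names are assumptions made for this example, and the order in which the other pets are inserted is chosen so that the tree matches the one described in the text.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(node, value):
    """Insert value below node: go left if it comes first, otherwise right."""
    if node is None:
        return Node(value)  # no child in that direction, so add one
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node


def in_order(node):
    """Return the stored values as an ordered list (left, node, right)."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


root = None
for pet in ["Cat", "Budgie", "Fish", "Rabbit", "Dog"]:
    root = insert(root, pet)
```

Inserting the same pets in a different order would give a differently shaped tree, but in_order would still return them alphabetically.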
Hash tables
A hash table is a data structure in which values are found by looking up their
associated key by following a hashing algorithm.
Making links
Hashing algorithms are also used for checksums (to check the validity of a file or other
piece of data) and for encryption (hashed passwords are typically stored in databases
rather than plaintext passwords). Checksums and other error checking processes, and
encryption, are described in Chapter 5.
Now test yourself
33 Describe the relationship between a value and its key in a hash table.
34 What is meant by a collision in a hash table?
35 Describe two steps that must be taken when rehashing.
36 Describe three potential uses for a hashing algorithm.
37 A hashing algorithm for an ID number is written as ‘sum of each digit MOD 6’.
Calculate the hashes for each of the following values.
a 7216
b 5891
c 0275
Answers available online
Revision activity
✚ Design a simple hashing algorithm for storing the names of films based
on the length of the film title.
✚ Implement a hash table, storing the names of films in an array based on the
hashed keys. Display an error message if two films generate the same key
and devise an improved hashing algorithm to reduce the number of
collisions (for example, alter the modulo division).
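As a rough sketch of how a hash table might use a rule such as ‘sum of each digit MOD 6’, the Python below stores a value at the position given by its hashed ID and reports a collision rather than resolving it. The function names, the sample ID and the stored name are all hypothetical.

```python
TABLE_SIZE = 6  # matches the 'MOD 6' rule quoted in the text


def hash_id(id_number):
    """Sum of each digit MOD the table size, for an ID held as a string."""
    return sum(int(digit) for digit in id_number) % TABLE_SIZE


def store(table, id_number, value):
    """Place value at its hashed position; report a collision if occupied."""
    position = hash_id(id_number)
    if table[position] is not None:
        raise ValueError(f"collision at position {position}")
    table[position] = (id_number, value)
    return position


def look_up(table, id_number):
    """Return the stored value for id_number, or None if it is absent."""
    entry = table[hash_id(id_number)]
    if entry is not None and entry[0] == id_number:
        return entry[1]
    return None


table = [None] * TABLE_SIZE
```

Keeping the ID as a string preserves any leading zeros. A fuller implementation would need a strategy for handling collisions instead of raising an error.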
Dictionaries
A dictionary is a collection of key–value pairs, in which the value is found by
looking up the key.
Dictionary A data
Frequency analysis The process of examining how often something occurs. A
useful tool for trying to break some encryption methods.
Lossless compression Reducing the size of a file without the loss of any data.
One use for a dictionary is information retrieval, for example in frequency
analysis (storing the value of how many instances of a word, letter or other
key appears) or as a high score table:
{"Dave" : 23, "Kev" : 37, "Angelina" : 42}
Another use for a dictionary is lossless compression where a dictionary of
values is stored and the original file is represented with just the keys, in order.
For instance using the following dictionary:
Key Value
1 ask
2 not
3 what
4 your
5 country
6 can
7 do
8 for
9 you
10 ,
11 .
We can compress the sentence:
"Ask not what your country can do for you, ask what you can
do for your country."
Ignoring capitalisation, this sentence can be stored as:
1 2 3 4 5 6 7 8 9 10 1 3 9 6 7 8 4 5 11
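The compression idea can be checked with a short Python sketch. It assumes that capital letters are ignored and that commas and full stops are treated as separate items; the function names are illustrative.

```python
import re

# Dictionary from the text: key -> value.
dictionary = {
    1: "ask", 2: "not", 3: "what", 4: "your", 5: "country", 6: "can",
    7: "do", 8: "for", 9: "you", 10: ",", 11: ".",
}
keys_by_value = {value: key for key, value in dictionary.items()}


def compress(sentence):
    """Split into words and punctuation, then replace each with its key."""
    tokens = re.findall(r"[a-z]+|[,.]", sentence.lower())
    return [keys_by_value[token] for token in tokens]


def decompress(keys):
    """Look each key up again to recover the sequence of items."""
    return [dictionary[key] for key in keys]


sentence = ("Ask not what your country can do for you, "
            "ask what you can do for your country.")
encoded = compress(sentence)
```

Because every item maps back to exactly one dictionary entry, decompress recovers the same sequence of words and punctuation.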
Vectors
Vector representation
In mathematics, scalars are simple numbers that only have a size. Vectors on
the other hand are numbers that have both a magnitude and a direction.
Vectors are often used to represent a position. For example, a 2D vector can be
used to represent a position on a 2D plane.
Vectors can be shown in the list representation, in which the vector is written
as a list of numbers, such as [3,5]. In this two-dimensional vector the
first number represents the horizontal position, and the second number
represents the vertical position.
The same vector can be represented visually as a position on a two-dimensional
grid, or as an arrow.
Scalar A single number used to represent or adjust the scale of a vector.
Vector A data structure used to represent a position (for example, a 2D vector
represents a position on a 2D plane).
Vector manipulation
Because vectors are not normal numbers, they have special rules when adding
and multiplying. Vector addition involves adding the equivalent values
together. For example,
[3, 5] + [7, 1] =
[3+7, 5+1] =
[10, 6]
The result of vector addition is translation; for example, taking the vector [3,
5] and moving (or translating) by [7, 1].
The convex combination of two vectors means that each vector is multiplied by
a scalar and the results are added together. The two scalars must add up to
exactly 1 and must both be non-negative (greater than or equal to 0).
0.2 x [3, 5] + 0.8 x [7, 1] =
[0.6, 1.0] + [5.6, 0.8] =
[6.2, 1.8]
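Vector addition and the convex combination of two vectors can be sketched in Python using lists. The function names are assumptions for this example, and alpha stands for the first scalar, with the second taken to be 1 - alpha so that the two always add up to 1.

```python
def add(u, v):
    """Add two vectors of equal length, component by component."""
    return [a + b for a, b in zip(u, v)]


def scale(k, v):
    """Multiply every component of v by the scalar k."""
    return [k * a for a in v]


def convex_combination(alpha, u, v):
    """Return alpha*u + (1 - alpha)*v, where alpha is between 0 and 1."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return add(scale(alpha, u), scale(1 - alpha, v))
```

Calling convex_combination(0.2, [3, 5], [7, 1]) repeats the worked example, although floating-point rounding means the result may differ from the exact values in the last decimal places.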
The result of the convex combination of 3 (or more) vectors will always be a
vector that is within the shape described by the original vectors.
The dot product of two vectors is found by multiplying the equivalent values
together and adding the results. For example, [3, 5] · [7, 1] = 3 x 7 + 5 x 1 = 26.
Note
The result of a dot product is always a scalar, not a vector.
The dot product of two vectors can be used to calculate the angle between the
two vectors.
It is not necessary to know how to calculate the exact angle, but the table
below gives a general indication.
Dot product Angle between the vectors
0 Exactly 90°
Bigger than 0 Between 0 and 90°
Less than 0 Between 90° and 180°
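The table above can be turned into a short Python sketch. The function names are illustrative, and the classification assumes that neither vector is the zero vector.

```python
def dot(u, v):
    """Multiply matching components and add the results (a scalar)."""
    return sum(a * b for a, b in zip(u, v))


def angle_category(u, v):
    """Classify the angle between two non-zero vectors by the sign of the dot product."""
    product = dot(u, v)
    if product == 0:
        return "exactly 90 degrees"
    if product > 0:
        return "between 0 and 90 degrees"
    return "between 90 and 180 degrees"
```

For example, dot([3, 5], [7, 1]) is 3 x 7 + 5 x 1, a positive number, so the angle between those two vectors is less than 90°.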
Note
The dot product of two vectors is used in many computer games to calculate the
angles between characters in order to check whether sentries can see an intruder or
to help non-player characters make decisions relating to route finding.
Summary
✚ Data structures provide a way to store collections of data together rather
than in individual variables
✚ Trees are graphs that don’t include any cycles (loops)
✚ A rooted tree has one vertex which is the root and all
Exam practice
C D
A
B
C
D
E
c) Copy and complete the adjacency matrix for this graph. [2]
A B C D E
A 0 1
B
C
D
E
3 Fundamentals of algorithms
Several standard algorithms are covered in this chapter.
For each algorithm, you are expected to be familiar with:
✚ the purpose of the algorithm
✚ the general steps involved in the algorithm
✚ the program code for the algorithm
✚ how to hand-trace the algorithm.
You are not expected to memorise the program code for each algorithm,
although you should be able to recognise each algorithm if you see it.
Algorithm A sequence of instructions that are followed in order to solve a
problem.
Hand-trace Also known as dry run. Without using a computer, completing a
table to record the values of each variable as the program is executed
line-by-line.
Making links
Hand-tracing algorithms is an important concept that appears in every Component 1
exam. More detail on how to tackle trace table problems is covered in Chapter 4.
When you are confident with the principles of hand-tracing algorithms it is advised to
practise with each of the examples in this chapter.
Graph-traversal
Making links
For more on graphs, see Chapter 2.
Graph traversal algorithms are designed to find a route from one node of a
graph to another.
There are two main approaches to this problem, the breadth first search (BFS)
and depth first search (DFS).
In each of the examples below, the algorithm is being used to find a route
from node A to node F.
In each step, the item at the front of the queue is removed, and each adjacent
node is added to the queue.
In this first step, we remove the first item, A, from the queue and inspect it.
The adjacent nodes are B, C and D so these are added to the queue and A is
marked as visited:
Queue (front to rear): B, C, D
Visited: A
In the next iteration the first item (node B) is removed from the queue and
inspected. The only adjacent node to B is C, which is already in the queue. As
there are no new adjacent nodes no items are added to the queue. Node B is
marked as visited.
Queue (front to rear): C, D
Visited: A, B
Again, the first item in the queue is removed and inspected. In this case it is
node C. Node C is adjacent to nodes E and G which are added to the queue.
Node D is also adjacent to node C but is already in the queue. Node C is
marked as visited.
Queue (front to rear): D, E, G
Visited: A, B, C
The first item in the queue, node D, is removed and inspected. There are
no new adjacent nodes to node D, and so the queue is unchanged. Node D is
marked as visited.
Queue (front to rear): E, G
Visited: A, B, C, D
The first item in the queue, node E, is removed and inspected. Node E has an
adjacent node F (the target). F is added to the queue and E is marked as
visited.
Queue (front to rear): G, F
Visited: A, B, C, D, E
By the end of the algorithm the nodes have been visited in the order
[A,B,C,D,E,G,F,H], however by keeping track of the parent for each node (which
node was being interrogated when that node was found) it is possible to
identify the shortest route – in this case [A, C, E, F].
This algorithm will always find the shortest path through an unweighted
graph, however it may take many steps to do so in a large graph.
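A breadth first search that keeps track of each node's parent can be sketched in Python as below. The edge list is an assumption inferred from the walkthrough above (the original figure is not reproduced here), and the function name is illustrative.

```python
from collections import deque

# Edges inferred from the walkthrough in the text (an assumption).
GRAPH = {
    "A": ["B", "C", "D"], "B": ["A", "C"], "C": ["A", "B", "D", "E", "G"],
    "D": ["A", "C"], "E": ["C", "F"], "F": ["E"], "G": ["C", "H"], "H": ["G"],
}


def bfs_route(graph, start, target):
    """Return the shortest route from start to target, or None."""
    parent = {start: None}  # also records which nodes have been discovered
    queue = deque([start])
    while queue:
        node = queue.popleft()  # remove the item at the front
        if node == target:
            route = []
            while node is not None:  # follow the parents back to the start
                route.append(node)
                node = parent[node]
            return route[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)  # add to the rear
    return None
```

This version stops as soon as the target reaches the front of the queue, rather than visiting every remaining node.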
A queue is the most appropriate data structure to use as each newly
discovered node is added to the rear of the queue and nodes are inspected in
the order in which they are found (FIFO).
Revision activity
Using a section of the London Underground Walking Tube Map (you can find a PDF
of this on the Transport for London website, tfl.gov.uk), carry out a breadth
first search to find the shortest route between two stations.
Now test yourself
1 What data structure should be used in a breadth first search?
2 State one potential use for a breadth first search.
Answers available online
Stack (bottom to top): A
When node A is removed from the stack and inspected, the nodes B, C and D
are added.
Stack (bottom to top): B, C, D
The node at the top of the stack is always inspected first. Node D is not
adjacent to any new nodes, and so the stack is unchanged.
Stack (bottom to top): B, C
Inspecting node C means that nodes E and G are added to the stack.
Stack (bottom to top): B, E, G
Stack (bottom to top): B, E, H
Node H is inspected but does not have any new adjacent nodes, and so the
stack is unchanged.
Stack (bottom to top): B, E
Stack (bottom to top): B, F
At this point the target node has been identified and the route taken is
described as [A, D, C, E, F].
A stack is the most appropriate data structure to use as each newly
discovered node is inspected immediately (LIFO) and the route to this point
can be stepped through in reverse (popped) to find a new path.
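The stack-based walkthrough above can be sketched in Python. As with any reconstruction of the example, the edge list is an assumption inferred from the text, and the function simply returns the order in which nodes are inspected.

```python
# Edges inferred from the walkthrough in the text (an assumption).
GRAPH = {
    "A": ["B", "C", "D"], "B": ["A", "C"], "C": ["A", "B", "D", "E", "G"],
    "D": ["A", "C"], "E": ["C", "F"], "F": ["E"], "G": ["C", "H"], "H": ["G"],
}


def dfs_order(graph, start, target):
    """Return the order in which nodes are inspected until target is reached."""
    discovered = {start}
    stack = [start]
    order = []
    while stack:
        node = stack.pop()  # always inspect the node at the top of the stack
        order.append(node)
        if node == target:
            return order
        for neighbour in graph[node]:
            if neighbour not in discovered:
                discovered.add(neighbour)
                stack.append(neighbour)  # push newly discovered nodes
    return None
```

Swapping the stack for a queue is the only structural change needed to turn this into a breadth first search.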
Algorithm                    Data structure to use   Useful for
Breadth First Search (BFS)   Queue                   Finding the shortest path between two nodes
Depth First Search (DFS)     Stack                   Navigating a maze
3 What data structure should be used in a depth first search?
4 State one potential use for a depth first search.
Answers available online
Revision activity
Using a section of the London Underground Walking Tube Map, carry out a depth first
search to find the shortest route between two stations.
It is important to be familiar with the workings of the algorithms covered in
this chapter, and how to hand-trace the algorithm. This chapter focuses on the
principles of the algorithms. Hand tracing technique is covered in Trace tables
in Chapter 4 and the pseudo-code for the breadth first search and depth first
search is included in that chapter.
Exam tip
You are not expected to memorise the exact syntax for the BFS and DFS algorithms,
though you should be familiar with the principles of the algorithm and be able to
describe which algorithm might be chosen in different circumstances.
It is also important to be able to hand-trace both the BFS and DFS algorithms. The
pseudo-code algorithm would be provided for this, as well as a trace table to fill in.
Tree traversal
Tree traversal is carried out using one of three recursive algorithms, all of
which look similar.
✚ Pre-order tree traversal
✚ In-order tree traversal
✚ Post-order tree traversal
The difference between the algorithms is in when the node’s value is output.
// Pre-Order Traversal
FUNCTION Traverse (node)
  OUTPUT node
  IF node.left THEN
    Traverse(node.left)
  ENDIF
  IF node.right THEN
    Traverse(node.right)
  ENDIF
ENDFUNCTION
// In-Order Traversal
FUNCTION Traverse (node)
  IF node.left THEN
    Traverse(node.left)
  ENDIF
  OUTPUT node
  IF node.right THEN
    Traverse(node.right)
  ENDIF
ENDFUNCTION
// Post-Order Traversal
FUNCTION Traverse (node)
  IF node.left THEN
    Traverse(node.left)
  ENDIF
  IF node.right THEN
    Traverse(node.right)
  ENDIF
  OUTPUT node
ENDFUNCTION
Making links
For more on trees see Chapter 2.
Tree traversal Inspecting the items stored in a tree data structure.
Pre-order tree traversal Inspecting or displaying the value of each node before
moving on to other nodes.
Figure 3.1 A binary tree with root A; B and C are the children of A, D and E are
the children of B, and F is a child of C
In Figure 3.1, the green outline shows the order in which the nodes are
traversed, starting with Node A and then checking the left node each time
until a dead end is reached. The subroutine then goes back a stage and checks
the right-hand node.
Using pre-order tree traversal the value of the node is output as soon as that
node is first inspected. This can be shown by placing a horizontal bar to the
left each time the green outline meets a node.
The output of pre-order tree traversal can be found by following the outline
and writing down the value of the node each time it crosses a purple line.
With this tree the output would be A, B, D, E, C, F.
The pre-order tree traversal is used when copying a tree, as each node value
will be immediately copied to the new tree as soon as it is found, meaning
that it is impossible to copy node E, for example, before node B has already
been copied.
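The three traversals differ only in where the node's value is output, which the Python sketch below makes visible. The Node class is an assumption for this example, and the tree shape is inferred from the pre-order output quoted above, with F assumed to be the right child of C.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """Node first, then the left subtree, then the right subtree."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """Left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """Left subtree, then the right subtree, then the node last."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


# Assumed shape: D and E are children of B; F is the right child of C.
tree = Node("A", Node("B", Node("D"), Node("E")), Node("C", None, Node("F")))
```

Each function returns a list rather than printing, so the three orders can be compared side by side.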
Making links
A binary search tree is a data structure used to store data that can be quickly
searched. The binary search algorithm is explored on page 66.
This can be represented by using a vertical bar underneath each node.
Whenever the outline path crosses a vertical bar, the value from that node
should be output.
Making links
Infix notation is how people typically write mathematical equations. Postfix notation
is an alternative format that is easier and faster for a computer to process. Both
notations are described in more detail later in this chapter.
Each method of traversal has a different effect on the order in which the items
are displayed.
5 Which tree traversal algorithm will print the value of the root node as the
last line?
6 Which tree traversal algorithm would be most appropriate for copying a tree?
7 Describe two possible uses for post-order tree traversal.
8 Which tree traversal algorithm can be used to output a list in ascending order?
Revision activity
Practise creating a binary search tree by adding the names of people, pets, sports
teams, and so on.
✚ Try using all three tree traversal algorithms to copy the tree, and identify which
method works best.
✚ Try using all three tree traversal algorithms to empty the tree, and identify which
method works best.
✚ Try using all three tree traversal algorithms to display the values in the list, and
identify which is in the correct, sorted order.
Postfix, or reverse Polish notation (RPN), looks slightly different because the
operations are written after their operands; for example, 2 5 – 7 x.
Reverse Polish notation is useful for processing because:
✚ there is no need for parentheses/brackets
✚ the processing can be carried out using a stack.
When processing RPN using a stack, each value is added to the stack until
an operator is found. At this point two items are popped from the stack, the
answer calculated and then pushed back onto the stack.
Postfix / Reverse Polish notation A method of writing mathematical or logical
expressions in which the operators appear after their operands; for example,
2 2 +.
Operation A function to be applied; for example, +, -, AND, OR.
Worked example
In RPN, the following calculation 2 5 – 7 x is performed, using a stack, as follows:
Item read   Stack (bottom to top)
2           2
5           2, 5
–           –3
7           –3, 7
x           –21
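The stack-based evaluation can be sketched in Python as follows. The spellings of the operators ('-' for minus, 'x' for multiply, '/' for divide) and the function name are assumptions made for this example, and no error handling is included for badly formed expressions.

```python
import operator

# Assumed spellings: '-' for minus, 'x' for multiply, '/' for divide.
OPERATORS = {"+": operator.add, "-": operator.sub,
             "x": operator.mul, "/": operator.truediv}


def evaluate_rpn(expression):
    """Evaluate a space-separated RPN expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()  # the most recent value is the right operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()
```

The order of the two pops matters for subtraction and division, because the value popped first was pushed last.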
Transforming between infix and postfix/reverse Polish notation can be carried out by inspection (that is, by looking at the
expression and working out the answer in your head).
The transformation can also be carried out using a tree.
3 x
2 5
✚ Practise writing infix notation expressions using a binary tree and then using
post-order tree traversal to convert them into RPN.
✚ Draw out and step-through the process of using a stack to evaluate, or solve, RPN
expressions.
Make sure you are confident with converting between infix notation and reverse
Polish notation by inspection as well as understanding how trees can be used to
perform the translation algorithmically.
Now test yourself
9 Transform each of the following infix expressions into reverse Polish notation.
a 3 + 7
b 12 – 3 + 9
c 4 x 12 + 6
d (7 – 3) ÷ 2
10 Transform each of the following reverse Polish expressions into infix notation.
a 12 6 –
b 15 2 x 6 +
c 3 7 2 x –
11 Give two advantages for using reverse Polish notation.
Searching algorithms
Linear search
Linear search A searching algorithm in which each item is checked one-by-one.
The linear search is a very inefficient but simple algorithm for finding an item
in an array of items.
The basic aim of the algorithm is to check each item in an array, one-by-one,
from start to finish.
FOR i ← 0 TO Length(Items) - 1
  IF Items[i] = Target THEN
    OUTPUT "Found"
  ENDIF
ENDFOR
The linear search is very simple to program and can be effective for very small lists. It is also useful if the list is unordered. However, it is extremely inefficient for large lists.
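A linear search might be sketched in Python as follows. The function name and the choice to return the index of the item (or -1 if it is absent) are assumptions made for this example; the pseudo-code above simply outputs "Found".

```python
def linear_search(items, target):
    """Check each item in turn; return its index, or -1 if it is not present."""
    for index in range(len(items)):  # indexes 0 to len(items) - 1
        if items[index] == target:
            return index
    return -1


pets = ["Fish", "Cat", "Rabbit", "Budgie"]  # unordered list is fine
print(linear_search(pets, "Rabbit"))  # 2
print(linear_search(pets, "Dog"))     # -1
```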
Binary search
The binary search is an example of a ‘divide and conquer’ approach to searching.
Binary search A searching algorithm in which the middle item is checked and half of the list is discarded.
Given a sorted array of items the algorithm will check the middle item. Because the list is sorted, this means that half of the array can then be discarded, halving the possible number of values to search each time.
The implementation of a binary search uses three pointers, Start, End and Mid, to mark the start, end and midpoint of the part of the array still being searched. The midpoint is calculated by adding the Start and End values and halving the result.
Mid = (Start + End) / 2
Where there are an even number of items to choose from the algorithm can
be programmed to use the item on the left, or the item on the right. Using
integer division will mean that the item on the left is chosen in this case.
Mid = (Start + End) // 2
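A binary search using Start, End and Mid pointers might look like this in Python. It is a sketch under the assumption that the list is already sorted in ascending order; the function name and the -1 return value for a missing item are choices made for this example.

```python
def binary_search(items, target):
    """Search a sorted list; return the index of target, or -1 if absent."""
    start = 0
    end = len(items) - 1
    while start <= end:
        mid = (start + end) // 2  # integer division picks the left-hand middle
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            start = mid + 1       # discard the lower half
        else:
            end = mid - 1         # discard the upper half
    return -1


fruit = ["Apple", "Banana", "Cherry", "Damson", "Fig", "Grape"]
print(binary_search(fruit, "Banana"))  # 1
```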
Revision activity
✚ Practise programming the linear search to find an item from a list of celebrity names.
✚ Practise programming a binary search from scratch.
✚ Hand trace the binary search with some sample values and then use your programming platform’s debugging tool to check your accuracy.
Because each item is added to a tree in order, a recursive algorithm can be used to travel through the tree, moving through the left or right node depending on the target value.
a binary tree is traversed.
Binary tree A tree in which each node can have no more than two child nodes, each placed to the left or right of the preceding node so that the tree is always in order.
An example implementation might look like this:
FUNCTION BinaryTreeSearch (Node)
    IF Node.Value = Target THEN
        Return True
    ELSEIF Node.Value > Target AND Exists(Node.Left) THEN
        Return BinaryTreeSearch(Node.Left)
    ELSEIF Node.Value < Target AND Exists(Node.Right) THEN
        Return BinaryTreeSearch(Node.Right)
    ELSE
        Return False
    ENDIF
ENDFUNCTION
Figure 3.5 A binary tree (Cat is the root node, with Budgie as its left child and Fish as its right child; Rabbit is the right child of Fish)
And a trace table might look like this:
Target Call Node Value returned
Dog 1 Cat
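The recursive binary tree search can be sketched in Python using the tree from Figure 3.5. The `Node` class and the decision to pass the target as a parameter (rather than treating it as a global value, as the pseudo-code does) are assumptions made for this example.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def binary_tree_search(node, target):
    """Return True if target is in the tree rooted at node."""
    if node is None:
        return False  # ran off the bottom of the tree
    if node.value == target:
        return True
    if target < node.value:
        return binary_tree_search(node.left, target)
    return binary_tree_search(node.right, target)


# The tree from Figure 3.5: Cat at the root, Budgie left, Fish right, Rabbit right of Fish.
root = Node("Cat", Node("Budgie"), Node("Fish", right=Node("Rabbit")))
print(binary_tree_search(root, "Rabbit"))  # True
print(binary_tree_search(root, "Dog"))     # False
```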
17 State one advantage of using a binary search tree to store data and then a binary
tree search to retrieve data instead of:
a an array and a linear search
b an array and a binary search.
Revision activity
Create a binary search tree using any data you like (such as film names or book titles)
and hand trace the steps in searching for specific items.
Sorting algorithms
Bubble sort
The bubble sort is a relatively simple, but very inefficient sorting algorithm.
Bubble sort A sorting algorithm in which pairs of items are sorted, causing the largest item to bubble up to the top of the list.
In the first pass, each item is compared to its immediate neighbour and the values are swapped if necessary. In this way, the highest value bubbles to the end of the list each time.
Pass Travelling through a list from start to finish exactly once.
If there are n items in the list, there are (n-1) comparisons in each pass, so in a list of six items there are five comparisons.
The same algorithm is then repeated (n-1) times; that is, there are (n-1) passes because, in the worst case scenario (where the last number in the list is the smallest value), the value will be moved down one place each time.
FOR i ← 0 to n – 2 //This loop is for each pass: (n – 1) passes
    // Complete one pass:
    FOR j ← 0 to n – 2 //This loop is for each comparison: (n – 1) comparisons
        IF Items[j] > Items[j+1] THEN
            Temp ← Items[j]
            Items[j] ← Items[j+1]
            Items[j+1] ← Temp
        ENDIF
    ENDFOR
ENDFOR
Exam tip
You are expected to be very familiar with the bubble sort and you should be confident to program one from scratch in the Component 1 exam.
There are two main methods to improve the efficiency of the bubble sort.
1 Record the number of swaps completed in each pass. If 0 swaps were made
then the list must be in order and the algorithm can be stopped.
2 For each pass (outer loop), reduce the number of comparisons by 1. This is
because the highest value is guaranteed to have bubbled to the end of the
list each time, and so on the second pass it is not necessary to make the
final comparison. On the third pass it is not necessary to make the final
two comparisons, and so on.
For a large list, or a list that is nearly sorted, these improvements can have a
dramatic impact on the running time of the algorithm.
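Both improvements can be sketched together in Python. This is one possible arrangement, not the only one: the function name, the `swapped` flag and the choice to sort a copy of the list are assumptions made for this example.

```python
def bubble_sort(items):
    """Return a sorted copy of items using an improved bubble sort."""
    items = list(items)  # work on a copy so the original list is unchanged
    n = len(items)
    for completed_passes in range(n - 1):
        swapped = False
        # Improvement 2: one fewer comparison for every pass already completed.
        for j in range(n - 1 - completed_passes):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        # Improvement 1: no swaps in a full pass means the list is in order.
        if not swapped:
            break
    return items


print(bubble_sort([3, 7, 12, 9, 4, 8, 6, 2]))  # [2, 3, 4, 6, 7, 8, 9, 12]
```

For a list that is already in order, this version stops after a single pass.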
The bubble sort is still considered to be very inefficient in most cases and is rarely used in industry.
The bubble sort can be suitable for a list that is nearly sorted.
Making links
Time-wise complexity is explored in more detail in Comparing algorithms in Chapter 4.
Algorithm      Time-wise complexity
Bubble sort    O(n²)
18 State the number of comparisons needed to run through one pass of the bubble
sort for a list of 10 items.
19 Suggest two methods for improving the efficiency of the standard bubble sort.
Revision activity
✚ Take a physical collection (such as CDs, DVDs, books) and practise carrying out a
bubble sort in real life.
✚ Practise programming the basic bubble sort from scratch.
✚ Improve your basic implementation by reducing the number of steps in the inner
loop by 1 each time and by using a flag to check whether any swaps were made
during that pass.
✚ Hand-trace the bubble sort with some sample values and then use your
programming platform’s debugging tool to check your accuracy.
Merge sort
The merge sort is significantly more efficient than the bubble sort, though it is also significantly more complex.
Merge sort A sorting algorithm in which items are split into single items and then merged into sorted pairs, fours, eights and so on.
✚ A list is broken in half, and in half again until we have a series of single elements.
✚ The elements are placed into pairs and each pair is sorted.
✚ Each group of two pairs is merged into a sorted group of four.
✚ Each group of two fours is merged into a sorted group of eight.
✚ Repeat as necessary until the list is complete.
3 7 12 9 4 8 6 2
3 7 12 9 4 8 6 2
3 7 12 9 4 8 6 2
3 7 12 9 4 8 6 2
3 7 9 12 4 8 2 6
3 7 9 12 2 4 6 8
2 3 4 6 7 8 9 12
When merging two lists, only the top two values need to be compared each time as each list is already guaranteed to be in order.
This reduces the overall number of steps required and means that the algorithm is much more efficient than the bubble sort in most cases.
The merge sort also has a fixed number of steps, no matter how ordered or unordered the list may be. As a result, some lists that are nearly in order can be sorted more quickly using a bubble sort, though this is rare.
Making links
Time-wise complexity is explored in more detail in Comparing algorithms in Chapter 4.
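The splitting and merging steps can be sketched recursively in Python. The function names `merge` and `merge_sort` are assumptions made for this example; note how `merge` only ever compares the front value of each list.

```python
def merge(left, right):
    """Merge two sorted lists by repeatedly comparing their front values."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one list is exhausted; copy what remains
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Split the list in half until single items remain, then merge back up."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    return merge(merge_sort(items[:middle]), merge_sort(items[middle:]))


print(merge_sort([3, 7, 12, 9, 4, 8, 6, 2]))  # [2, 3, 4, 6, 7, 8, 9, 12]
```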
Note
Though not covered in this specification, there are many other sorting algorithms.
These include the quick sort, heap sort, insertion sort and shell sort. An investigation
into the efficiency of different sorting algorithms could make for an interesting NEA
programming project.
Exam tip
You may be asked to trace the merge sort, or to demonstrate your understanding
of how the merge sort works. You are unlikely to be asked to program a merge sort
from scratch so focus your revision on understanding the algorithm rather than
programming it from memory.
20 Give two situations in which the merge sort would be more suitable than the bubble sort.
21 Give one situation in which the bubble sort would be more suitable than the merge sort.
22 Which sorting algorithm always has a fixed number of steps?
Answers available online
As discussed, time-wise complexity is a measure of how efficient an algorithm is in terms of time and is explored in more detail in Comparing algorithms in Chapter 4. However, it is worth trying to remember the time-wise complexity for each of these algorithms:
Revision activity
Take a physical collection (such as CDs, DVDs, books) and practise carrying out a merge sort in real life.
Optimisation algorithms
Dijkstra’s shortest path algorithm
Dijkstra’s shortest path algorithm is a graph traversal algorithm that will find the shortest route from a given starting node to every other node on the graph. It is very efficient for route finding across weighted graphs, though a breadth first search is more efficient for unweighted graphs.
Shortest path The route between two nodes on a graph that incurs the least cost. In an unweighted graph the cost for each path can be assumed to be 1.
The trace table for Dijkstra’s algorithm provides the shortest overall distance to each node in the graph and the previous node to travel from. This means that the quickest path to different nodes from the starting node can be found quickly, without the need to carry out the algorithm again.
Initially the distance from the start node to every other node is set to ∞ (infinity). The distance to the start node itself is 0, and the start node is the first current node.
Then the algorithm works as follows:
✚ For every unvisited node attached to the current node:
    ✚ take the distance from the current node to this unvisited node, and add it to the distance from the start node to the current node – this is the distance of the unvisited node from the start node when travelling via the current node
    ✚ if this calculated distance is less than the distance currently recorded between the start node and the unvisited node, the value is updated – this is how the shortest route is calculated
    ✚ when the value is updated, note the current node as the previous ‘parent’ node for the unvisited node
    ✚ repeat until all nodes attached to the current node have distances recorded.
✚ Mark the current node as visited.
✚ Select a new current node, which is the closest unvisited node, and repeat all the steps above.
✚ Repeat until the target is reached (or until every node has been visited, if the shortest route to every node is needed).
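The steps above can be sketched in Python. The graph used here is a small hypothetical example, not the one in the worked example, and the function names `dijkstra` and `route_to` are assumptions. For simplicity the sketch picks the closest unvisited node with `min`, as a hand trace would; a priority queue is a common alternative.

```python
import math

# Hypothetical weighted graph: each node maps to its neighbours and edge weights.
graph = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}


def dijkstra(graph, start):
    """Return (shortest distance, previous node) tables for every node."""
    distance = {node: math.inf for node in graph}
    previous = {node: None for node in graph}
    distance[start] = 0
    unvisited = set(graph)
    while unvisited:
        # The new current node is the closest unvisited node.
        current = min(unvisited, key=lambda node: distance[node])
        if distance[current] == math.inf:
            break  # the remaining nodes cannot be reached from start
        for neighbour, weight in graph[current].items():
            if neighbour in unvisited:
                candidate = distance[current] + weight
                if candidate < distance[neighbour]:
                    distance[neighbour] = candidate
                    previous[neighbour] = current
        unvisited.remove(current)  # mark the current node as visited
    return distance, previous


def route_to(previous, target):
    """Work backwards through the previous-node table to rebuild the route."""
    route = []
    node = target
    while node is not None:
        route.append(node)
        node = previous[node]
    return route[::-1]


distance, previous = dijkstra(graph, "A")
print(distance)                 # {'A': 0, 'B': 2, 'C': 3, 'D': 4}
print(route_to(previous, "D"))  # ['A', 'B', 'C', 'D']
```

Because the tables cover every node, the route to any other node can be read off without running the algorithm again.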
Worked example
[Figure: a weighted graph with nodes A, B, C, D, E, F and G]
Node                 A    B    C    D    E    F    G
Shortest distance    0    2    5    11   14   9    11
Previous node        -    A    A    C    C    B    F
Once Dijkstra’s algorithm has produced a result, the shortest route from node A to any other node can be found by
working backwards, using the ‘previous node’.
✚ The shortest route to G ends by moving to node G from node F.
✚ Node F is reached from node B.
✚ Node B is reached from node A.
Therefore, the shortest path is A–B–F–G.
Exam tip
It is not necessary to remember each and every step in Dijkstra’s shortest path algorithm, but understanding how the algorithm works makes it simpler to complete a trace table. You should make sure you are confident identifying the shortest path to a given node once the trace is complete.
Making links
Hand-tracing algorithms is an important concept that is assessed in every Component 1 exam. More detail on how to tackle trace table problems is covered in Chapter 4. When you are confident with the principles of hand-tracing algorithms it is advised to practise with each of the examples in this chapter.
Now test yourself
23 Explain the purpose of Dijkstra’s algorithm.
24 State what two values must be stored for each node following the application of Dijkstra’s algorithm.
Revision activity
Using a section of the London Underground Walking Tube Map, carry out Dijkstra's shortest path algorithm to find the shortest route from one station to all other stations in that section.
Summary
✚ Graph traversal algorithms are used to find a path from one node to another in a graph
✚ The breadth first search (BFS) scans all of the nodes one step away from the start point, then all the nodes two steps away, and so on
✚ The breadth first search uses a queue data structure and is used to find the shortest path between two nodes of an unweighted graph
✚ The depth first search (DFS) continues in one direction until it reaches a dead end and then steps backwards until it finds a new, unused path
✚ The depth first search uses a stack data structure and is used to find a route through a maze
✚ Tree traversal algorithms are used to interrogate each node in a tree
✚ The three main tree traversal algorithms look very similar, with only the position of the print statement moving
✚ Pre-order tree traversal is used for copying a tree
✚ In-order tree traversal is used to output a sorted list of values from a binary search tree and for outputting an infix notation expression
✚ Post-order tree traversal is used for emptying a tree and for outputting a reverse Polish notation expression
✚ Operators are terms such as +, −, AND, OR
✚ Operands are the values on which operators act, such as numbers (for example, 3, 81.2) or variables (for example, x)
✚ Infix expressions place the operator between its operands
✚ Reverse Polish notation (RPN) or postfix expressions place the operator after its operands
✚ RPN expressions do not need to use brackets and can be evaluated using a stack
✚ RPN is used in interpreters based on a stack; for example, Postscript and bytecode
✚ The linear search steps through each item from a list in a line, and is very inefficient, with a time-wise complexity O(n)
✚ The binary search checks the middle item from a list and discards half of the values each step, making it much more efficient, with a time-wise complexity O(log n)
✚ The binary search can only be applied to an ordered list
✚ The binary tree search discards an entire branch each step, making the search quick and efficient, with a time-wise complexity O(log n)
✚ The bubble sort involves swapping pairs of values so that the largest value will bubble up to the end each pass
✚ The bubble sort can be improved by reducing the number of comparisons in each pass by 1 and by halting if no swaps are made in a full pass
✚ The bubble sort has a time-wise complexity O(n²)
✚ The merge sort involves splitting a list into individual items and then continually merging them into sorted lists of 2, 4, 8, … items, comparing only the front-most items in each case
✚ The merge sort is much more efficient than the bubble sort for large, unsorted lists, with a time-wise complexity O(n log n)
✚ Dijkstra’s shortest path algorithm is used to find the shortest route from one node to any other node in that graph
✚ Dijkstra’s algorithm is used for route finding across weighted graphs
Exam practice
1 This graph represents train routes between different towns, and the distance
between each station in miles. Dave lives in town A and wants to travel to town G.
[Figure: a weighted graph of train routes between towns A, B, C, D, E, F and G]
a) Dave has the choice of using a breadth first search or a depth first search.
State the name of the graph traversal algorithm that would be most suitable to
find the shortest path between two nodes. [1]
b) State what data structure would be most suitable to be used in the
algorithm identified in Question 1a), justifying your choice. [2]
c) Give a suitable use for the algorithm that was not chosen in Question 1a). [1]
d) Dave is advised to use Dijkstra’s shortest path algorithm since he travels by train a lot. Explain why using Dijkstra’s algorithm will save time overall. [2]
2 This array, Cars, contains the names of car manufacturers.
b) Copy and complete the trace table below by following the given algorithm. [4]
4 Theory of computation
Worked example
Statements:
✚ All elephants are green.
✚ Some elephants are big.
Proposed deduction:
✚ All big things are green.
Response:
The deduction is uncertain (or does not follow) because we can’t know if there are other big things that are not elephants.
Exam tip
Try looking online for puzzles, riddles and logic problems to help prepare for this topic, which is often covered in the first question in Component 1.
1 All sweets are blue. All sweets are small. Which of these statements follows?
A All small things are blue.
B All blue things are sweets.
C Some blue things are small.
2 A farmer is travelling with a wolf, a goat and a cabbage. If left alone the wolf will eat the goat and the goat will eat the cabbage. The farmer must cross a river in a boat, but can only take one companion at a time. How can the farmer get all three across?
Answers available online
Revision activity
Search the internet for logic puzzles such as:
✚ sudokus
✚ logic problems
✚ syllogisms
and practise solving them.
Pseudo-code doesn’t always have strict rules about the exact syntax that is
used, although AQA questions are written using a consistent format.
It is important to be familiar with reading and understanding each of the
following programming constructs in pseudo-code:
Sequence The concept that the instructions are carried out in the order that they are written.
Assignment The concept that a value, pointer, or the result of a process can be assigned to a variable; for example:
Total ← Total + ExtraValue
Selection The concept that a decision over whether to execute a particular part of the program can be made based on a condition; for example:
IF Total < 100 THEN
    …
ENDIF
Iteration The concept that a section of code can be repeated; for example:
FOR i ← 0 to 5
    …
ENDFOR
Sequence Executing instructions in the order they are written.
Assignment Changing the value stored in a variable.
Selection Using conditions to decide whether to execute part of a program.
Iteration Repeating or looping through a section of code.
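For comparison, the four constructs might look like this in Python. The starting values, the `else` branch and the body of the loop are invented for this illustration, since the pseudo-code above leaves them out. One difference worth noticing is that the pseudo-code FOR loop includes its upper limit, whereas Python's `range` stops one short of it.

```python
# Sequence: these statements run in the order they are written.
total = 90                   # assignment
extra_value = 15
total = total + extra_value  # Total ← Total + ExtraValue

# Selection: choose which branch to run based on a condition.
if total < 100:
    message = "below 100"
else:
    message = "100 or more"

# Iteration: FOR i ← 0 to 5 runs six times, so range needs an upper limit of 6.
squares = []
for i in range(0, 6):
    squares.append(i * i)

print(total, message, squares)  # 105 100 or more [0, 1, 4, 9, 16, 25]
```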
Trace tables
One way in which your ability to read and understand algorithms is assessed is using a trace table to hand-trace an algorithm.
Trace table A table for recording the values of variables as a program runs.
In these questions a pseudo-code algorithm is provided and the task is to fill in a table showing how the values of the variables in the program change.
Hand-trace Simulating the execution of an algorithm without running it, using a trace table to record the state of the program after each instruction.
The key advice for completing trace tables is:
✚ use your finger to record where you are up to in the algorithm at all times
✚ write the values of the variables as they change from left to right, moving to a new line when necessary
✚ only write down a value when that variable changes.
For example, hand-tracing the algorithm below using the following list of
values:
List
Target ← Banana
Start ← 0
End ← Length(Items) – 1
Start ← Mid + 1
ELSE
End ← Mid – 1
ENDIF
Mid ← (Start + End) // 2
ENDWHILE
Because you should always read left to right through the table, the final
change to the value of Start takes place on a new line, and it is not necessary
to replace the value of End.
It is also common to be asked to identify the purpose of an algorithm.
More complex algorithms lead to more complex trace tables and it is common
for some data to be filled in, and those boxes greyed out, in order to help
students check that their progress through the trace table is correct.
More complex trace tables often refer to the standard algorithms discussed in
Chapter 3. Common examples include the breadth first search and depth first
search.
The task is to trace the algorithm to show how the graph is traversed when
given the call Breadth(A,E).
Note
Making links
For a full and detailed explanation of how to complete this trace table see:
www.hoddereducation.co.uk/myrevisionnotesdownloads
Working backwards from node C we can see that the parent node is node B,
and from node B the parent node is node A. Therefore the route found is
A→ B → C.
Figure 4.2 [a graph with nodes A, B, C, D and E]
PROCEDURE Depth(Start,Target)
    Stack ← [Empty]
    Discovered ← [False]
    Stack.Push(Start)
    DO
        Current ← Stack.Pop()
        IF Not Discovered[Current] THEN
            Discovered[Current] ← True
            FOR each node Temp adjacent to Current AND not in Stack DO
                IF Not Discovered[Temp] THEN
                    Stack.Push(Temp)
                ENDIF
            ENDFOR
        ENDIF
    WHILE Stack is Not Empty And Current != Target
ENDPROCEDURE
The task is to trace the algorithm to show how the graph is traversed when
given the call Depth(A,C).
The first three lines of the algorithm have already been completed and the initial state of the program has been highlighted in grey in order to indicate that no changes should be made to those cells. A typical exam question might include the values of variables Current and Target in order to support students in ensuring they are carrying out the algorithm correctly.
Note
This particular algorithm is designed for an unweighted graph.
Making links
For a full and detailed explanation of how to complete this trace table see:
www.hoddereducation.co.uk/myrevisionnotesdownloads
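The stack-based depth first search can be sketched in Python to check a hand trace. The graph below is hypothetical (it is not a copy of Figure 4.2), and the function name and the returned visit order are assumptions made for this example.

```python
# Hypothetical unweighted graph stored as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B"],
    "E": ["C"],
}


def depth_first(graph, start, target):
    """Return the order in which nodes are visited until target is reached."""
    stack = [start]
    discovered = {node: False for node in graph}
    visit_order = []
    while stack:
        current = stack.pop()  # the most recently pushed node is explored first
        if not discovered[current]:
            discovered[current] = True
            visit_order.append(current)
            if current == target:
                break
            for temp in graph[current]:
                if not discovered[temp] and temp not in stack:
                    stack.append(temp)
    return visit_order


print(depth_first(graph, "A", "E"))  # ['A', 'C', 'E']
print(depth_first(graph, "A", "D"))  # ['A', 'C', 'E', 'B', 'D']
```

Because the last neighbour pushed is the first popped, the search follows C before B from node A.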
    Queue.Enqueue(Start)
    Discovered[Start] ← True
    Complete ← False
    DO
        Current ← Queue.Dequeue()
        FOR each node Temp adjacent to Current DO
            IF Discovered[Temp] = False AND Complete = False THEN
                Queue.Enqueue(Temp)
                Discovered[Temp] ← True
                Parent[Temp] ← Current
                IF Temp = Target THEN
                    Complete ← True
                ENDIF
            ENDIF
        ENDFOR
    WHILE Queue Not Empty AND Complete = False
ENDPROCEDURE
Figure 4.3
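A queue-based breadth first search along the same lines can be sketched in Python. The graph is hypothetical (it is not a copy of Figure 4.3), and the function name and the step that rebuilds the route from the parent table are assumptions made for this example.

```python
from collections import deque

# Hypothetical unweighted graph stored as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B"],
    "E": ["C"],
}


def shortest_path(graph, start, target):
    """Breadth first search; return the route from start to target, or None."""
    queue = deque([start])
    discovered = {node: False for node in graph}
    parent = {node: None for node in graph}
    discovered[start] = True
    complete = (start == target)
    while queue and not complete:
        current = queue.popleft()  # dequeue the oldest waiting node
        for temp in graph[current]:
            if not discovered[temp] and not complete:
                queue.append(temp)
                discovered[temp] = True
                parent[temp] = current
                if temp == target:
                    complete = True
    if not discovered[target]:
        return None
    # Work backwards from the target using the parent of each node.
    route = []
    node = target
    while node is not None:
        route.append(node)
        node = parent[node]
    return route[::-1]


print(shortest_path(graph, "A", "E"))  # ['A', 'C', 'E']
```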
a Copy and complete this trace table to show the steps involved in following the breadth first search. The function
call is ShortestPath(A,E).
C D
C D E
b Copy and complete the trace table below to show the steps involved in following the depth first search. The
function call is DFS(A,E).
PROCEDURE Depth(Start,Target)
Stack <- [Empty]
Discovered <- [False]
Stack.Push(Start)
DO
Current <- Stack.Pop()
IF Not Discovered[Current] THEN
Stack.Push(Temp)
ENDIF
ENDFOR
ENDIF
WHILE Stack is Not Empty And Current != Target
ENDPROCEDURE
Return BinaryTreeSearch(Node.Left)
ELSEIF Node.Value < Target AND Exists(Node.Right)
THEN
Return BinaryTreeSearch(Node.Right)
ELSE
Return False
ENDIF
ENDFUNCTION
ENDIF
ENDFUNCTION
ELSE
Return False
ENDIF
ENDFUNCTION
ELSE
Return False
ENDIF
ENDFUNCTION
If the target was ‘Rabbit’ then the trace table would look like this:
Worked example
Here is an example of another pseudo-code algorithm and its final trace table.
FUNCTION Fact(x)
IF x < 2 THEN
RETURN 1
ELSE
RETURN x * Fact(x – 1)
ENDIF
ENDFUNCTION
Call x Return
1 4
2 3
3 2
4 1 1
3 2 2
2 3 6
1 4 24
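To check a recursive trace like this one, the function can be written in Python with print statements that report each call and each return. The `depth` parameter is an addition made for this sketch so that the call number can be displayed; it is not part of the pseudo-code.

```python
def fact(x, depth=1):
    """Recursive factorial that prints one line per call and per return."""
    print(f"call {depth}: x = {x}")
    if x < 2:
        result = 1                          # base case
    else:
        result = x * fact(x - 1, depth + 1)  # general case
    print(f"call {depth}: x = {x}, return {result}")
    return result


print(fact(4))  # 24
```

The printed lines follow the same pattern as the trace table: the calls are listed going down, then the return values are filled in as the recursion unwinds.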
Worked example
A more complex example might include more than one recursive call. For example:
FUNCTION Fib (int x)
IF x < 2 THEN
RETURN 1
ELSE
RETURN Fib(x-1) + Fib(x-2)
ENDIF
ENDFUNCTION
As with the non-recursive trace tables, it is common to be provided with pre-
completed values highlighted in order to help you check that you are completing the
table correctly.
3 1 1
2 2 1
4 0 1
2 2 1 1 2
1 3 2
5 1 1
1 3 2 1 3
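The order of the calls in a trace with two recursive calls can be checked with a short Python sketch. The `calls` list is an addition made for this example so that the value of x for each call can be recorded in order; it is not part of the pseudo-code.

```python
def fib(x, calls):
    """Fib as defined in the worked example, recording x for every call made."""
    calls.append(x)
    if x < 2:
        return 1
    return fib(x - 1, calls) + fib(x - 2, calls)


calls = []
result = fib(3, calls)
print(result)  # 3
print(calls)   # [3, 2, 1, 0, 1]
```

The left-hand call Fib(x-1) is always followed all the way down before the right-hand call Fib(x-2) begins, which is why the call with x = 0 appears before the second call with x = 1.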
Making links
For a full and detailed explanation of how to complete this trace table see:
www.hoddereducation.co.uk/myrevisionnotesdownloads
For more on recursive algorithms, including general case and base case, see Chapter 1.
4 Complete the trace table below by hand-tracing the algorithm, from the FUNCTION
Fact (X) algorithm in the first Worked example above, when the call Fact(5) is made.
Call x Return
1 5
2
3
4
5
4
3
2
1
5 Complete the trace table below by hand-tracing the algorithm, from the FUNCTION
Fib (X) algorithm in the second Worked example above, when the call Fib(4) is made.
1
7
8
7
9
7
1
Abstraction
Abstraction is a method of reducing the complexity of a problem. This makes the problem easier to understand, and easier to solve.
Abstraction Making a problem simpler by removing or hiding features.
There are a number of different techniques, each discussed below.
Information hiding
When designing a solution it is helpful to hide any details that do not need to be accessed by other parts of the program.
In procedural-oriented programming this can be carried out by using local variables which are not accessed by other subroutines.
In object-oriented programming this can be carried out by declaring attributes and methods as private in order to hide them from other objects.
Procedural abstraction
In order to avoid repeatedly typing very similar code to carry out tasks that
rely on the same method over and over again, it is good practice to use a
Functional abstraction
Functional abstraction aims to hide the complexity of a specific algorithm, or
how a part of the program works, using a subroutine. A programmer can then
call that subroutine without needing to know exactly how it functions.
A good example is a method to print a message to the screen. This is actually
a very complex process that involves re-drawing the screen with the new
data. However it is often the very first program written by new programmers
and the complexity is almost completely hidden from the programmer by
putting the code inside a procedure called print.
When a programmer wants to carry out this very complex procedure, they
can simply call print (for example, print("Hello World")) - the need to
understand the detail of how the message is printed is avoided.
Data abstraction
In data abstraction, because the details of how the data are actually stored
in a computer are hidden, it is possible to create structures that behave in a
particular way.
It is straightforward to understand how structures such as stacks, queues,
graphs and trees work at a conceptual level. However, these structures are not
really stored in the computer as stacks, queues, graphs or trees.
For instance:
✚ Implementing a circular queue can be achieved using a one-dimensional
array and two pointers.
✚ Once implemented, a programmer can then interact with this circular
queue using subroutines such as enqueue, dequeue and peek.
✚ However, the programmer does not need to be concerned with the way
that the data is actually stored within the array – they just need to know
how to use the enqueue, dequeue and peek subroutines.
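A circular queue built on a fixed-size array might be sketched in Python as follows. The class name, the error types raised and the extra `size` counter (used alongside the front and rear pointers to tell a full queue from an empty one) are assumptions made for this example.

```python
class CircularQueue:
    """A fixed-capacity queue stored in a one-dimensional array."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.front = 0   # index of the next item to leave the queue
        self.rear = -1   # index of the most recently added item
        self.size = 0

    def enqueue(self, value):
        if self.size == len(self.items):
            raise OverflowError("queue is full")
        self.rear = (self.rear + 1) % len(self.items)  # wrap around the array
        self.items[self.rear] = value
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        value = self.items[self.front]
        self.front = (self.front + 1) % len(self.items)
        self.size -= 1
        return value

    def peek(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        return self.items[self.front]


queue = CircularQueue(3)
queue.enqueue("A")
queue.enqueue("B")
queue.enqueue("C")
print(queue.dequeue())  # A
queue.enqueue("D")      # reuses the array slot that A occupied
print(queue.peek())     # B
```

Code that uses the queue only calls `enqueue`, `dequeue` and `peek`; it never needs to know that the data wraps around inside an array.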
Problem abstraction/reduction
Problem abstraction involves removing unnecessary detail from the problem
in order to reduce the problem to one that has already been solved.
One example might be that the owner of a chain of shops may want to visit
each of their stores in order to award a prize for the employee of the month at
each location.
By removing the unnecessary detail of the purpose of the visit, this can be
identified as an example of the travelling salesman problem, for which a number of potential solutions already exist.
Decomposition
Decomposition is the process of taking a large problem and breaking it down into smaller and smaller sub-problems until each sub-problem covers exactly one task.
Decomposition Breaking a problem into smaller sub-problems.
A programmer can then create a subroutine for each task in order to create a
complete solution.
This technique doesn’t necessarily reduce the overall level of complexity,
but is intended to provide a method of breaking the problem into more
manageable chunks.
Composition
Composition refers to the concept of combining features together.
Composition Combining parts of a solution together to create a solution made of component parts.
One example is using a subroutine which calls another subroutine. For example, a subroutine to start a game of cards may be composed of subroutines to shuffle the deck and then deal the cards to the players.
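The card game example might be sketched in Python like this. All of the subroutine names, the 52-card deck and the dealing order are invented for this illustration; the point is only that `start_game` is composed of smaller subroutines.

```python
import random


def build_deck():
    """Create a standard 52-card deck as a list of strings."""
    suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
    ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
    return [f"{rank} of {suit}" for suit in suits for rank in ranks]


def shuffle_deck(deck):
    """Shuffle the deck in place."""
    random.shuffle(deck)


def deal(deck, players, cards_each):
    """Deal cards one at a time to each player in turn."""
    hands = [[] for _ in range(players)]
    for _ in range(cards_each):
        for hand in hands:
            hand.append(deck.pop())
    return hands


def start_game(players, cards_each):
    """Composed of the smaller subroutines above."""
    deck = build_deck()
    shuffle_deck(deck)
    return deal(deck, players, cards_each)


hands = start_game(players=4, cards_each=5)
print(len(hands), [len(hand) for hand in hands])  # 4 [5, 5, 5, 5]
```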
Automation
Automation is the overall goal of applying some or all of the abstraction methods discussed above.
Automation Designing, implementing and executing a solution to solve a problem automatically.
In order to solve a problem, it is necessary to:
✚ design an algorithm to solve the problem
✚ implement the data structures to store the data that must be stored
✚ implement the algorithms using program code
✚ execute the code.
By completing these four steps it is possible to automate processes and to
model real-world problems and scenarios in a meaningful and useful way.
Regular languages
Finite state machines (FSM) with and without output
A finite state machine can be used to represent potential states within a system in order to help the programmer understand the task more clearly.
Finite state machine (FSM) A computational model which has a fixed number of potential states.
For this use, each state represents a real-world state. For instance, in a computer program that controls a lift the states might be:
Figure 4.5 A state transition diagram for a finite state machine with no outputs
11 What type of state is shown with two concentric circles?
12 Identify two methods for representing a finite state machine.
13 A finite state machine can be used to represent the states of a real-world problem. Suggest one other use for an FSM.
Create state transition diagrams for everyday devices, such as hairdryers and ovens.
Figure 4.6 A state transition diagram showing a Mealy machine
The first value in the label for a transition is the input, and the second is the output.
Starting at state S0, an input of 1 will give an output of 0.
This finite state machine does not have an accepting state, as its purpose is not to validate an input, but to display an output based on its input.
In this particular case an input of 0110, reading from left to right, would give an output of 0011.
The purpose of this finite state machine is to carry out a right logical shift.
Mealy machines can also be described using a state transition table, with an extra column indicating the output.
Exam tip
It is very important to read the question very carefully. In the example above, reading the input from right to left would give a different answer – the output 1100, carrying out a left logical shift. You need to pay careful attention to these details in the question.
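A Mealy machine can be simulated in Python with a dictionary acting as the state transition table. The transitions below are an assumption: they were chosen to match the behaviour described in the text (each output is the previous input, starting with 0) and may not match every arrow in Figure 4.6.

```python
# (current state, input) -> (next state, output)
# Assumed transition table for a machine that outputs the previous input bit.
TRANSITIONS = {
    ("S0", "0"): ("S2", "0"),
    ("S0", "1"): ("S1", "0"),
    ("S1", "0"): ("S2", "1"),
    ("S1", "1"): ("S1", "1"),
    ("S2", "0"): ("S2", "0"),
    ("S2", "1"): ("S1", "0"),
}


def run_mealy(bits, state="S0"):
    """Feed the input bits in from left to right and collect the outputs."""
    output = []
    for bit in bits:
        state, out = TRANSITIONS[(state, bit)]
        output.append(out)
    return "".join(output)


print(run_mealy("0110"))  # 0011
```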
You should not be assessed specifically on this mathematical understanding, but it will form part of the basis for the regular expressions topic.
sets are collections of objects.
A set is an unordered collection of values, with no duplicates.
A = {1, 2, 3, 4, 5}
Set comprehension A collection of rules to define which values are in a set.
A set can also be described using a set of rules, known as set comprehension or set-builder notation.
These symbols are used in set comprehension:
| ‘such that’
∈ ‘is a member of’
∧ logical operator for ‘and’
≤, ≥, <, > logical operators for inequalities
Here is the set comprehension for A:
A = { x | x ∈ ℕ ∧ x ≥ 1}
This translates as: A is the set of x, such that x is a member of the set of natural numbers and x is greater than or equal to 1.
Making links
Set theory, and particularly the sets of ℕ (natural numbers), ℤ (integers), ℚ (rational numbers) and ℝ (real numbers), are essential to understanding the maths that underpins regular expressions. This topic is explained in more detail in Chapter 5.
A subset is made up of some elements of the main set. The main set is sometimes called the superset.
Subset A selection of the values found in a set.
Subsets can be described in one of these three ways:
✚ A ⊆ B means ‘A is a subset of B’.
This means that everything in A is also in B. They may or may not be equal. For example:
{0,1} ⊆ {0,1,2} and {0,1,2} ⊆ {0,1,2}
✚ A ⊂ B means ‘A is a proper subset of B’.
This means that everything in A is also in B, but they are not equal. There is at least one other value in B.
{0,1} ⊂ {0,1,2}
✚ A = B means ‘A is identical to B’.
{0,1,2} = {0,1,2}
Exam tip
Remember that the sign for a subset, ⊆, incorporates an underscore in the same way as the ‘greater than or equals’ symbol, ≥. This indicates that the subset can be equal to its superset. The symbol for a proper subset, ⊂, does not incorporate an underscore, and a proper subset cannot be the same as its superset.
Now test yourself
Set operations
For each example, A = {1, 2, 3} and B = {3, 4, 5}.
✚ Membership of a set indicates that a value is in that set.
1 is a member of A.
✚ Union (represented as ∪) means joining both sets together and including
all elements, see Figure 4.7.
The ∪ symbol is similar to the logical ∨ symbol, indicating OR.
Figure 4.7
A ∪ B = {1, 2, 3, 4, 5}
✚ Intersection (represented as ∩) means joining both sets together and
including only those elements common to both, see Figure 4.8.
The ∩ symbol is similar to the logical ∧ symbol, indicating AND.
A ∩ B = {3}
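Figure 4.8
✚ Difference (represented as Δ) means joining both sets together and including only those elements that are not common to both, see Figure 4.9. This is more precisely known as the symmetric difference.
A Δ B = {1, 2, 4, 5}
The \ symbol is normally used for the relative difference, which keeps only those elements of the first set that are not in the second, so A \ B = {1, 2}.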
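Python's built-in set type supports each of these operations, so the examples can be checked directly. The sketch below uses the same A and B; note that Python separates two kinds of difference, ^ (values in exactly one of the sets, which gives {1, 2, 4, 5}) and - (values in the first set that are not in the second).

```python
A = {1, 2, 3}
B = {3, 4, 5}

membership = 1 in A              # True: 1 is a member of A
union = A | B                    # all values from both sets
intersection = A & B             # only values common to both
symmetric_difference = A ^ B     # values in exactly one of the two sets
relative_difference = A - B      # values in A that are not in B

print(membership, union, intersection,
      symmetric_difference, relative_difference)
```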
Regular expressions
Regular expressions are a way of describing a set. They are also used to describe particular types of languages. They can be extremely useful to identify items that meet certain criteria; for example, for validation checks.
Regular expression A sequence of characters used to describe a pattern. Used in searching and validation.
Symbols used in regular expressions include:
✚ | meaning ‘or’; for example, a | b
The set of valid inputs would be {a, b}
✚ + meaning ‘1 or more’; for example, ab+
The set of valid inputs would include {ab, abb, abbb, abbbb, …}
✚ * meaning ‘0 or more’; for example, ab*
The set of valid inputs would include {a, ab, abb, abbb, …}
✚ ? meaning ‘0 or 1’; for example, ab?
The set of valid inputs would be {a, ab}
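The four symbols behave the same way in Python's standard re module, so the sets above can be tested. This is a sketch rather than an exhaustive check: the list of candidate strings and the helper name accepted are chosen here for illustration, and re.fullmatch is used so that the whole input must fit the pattern.

```python
import re

def accepted(pattern, candidates):
    """Return the candidates that match the whole pattern."""
    return [text for text in candidates if re.fullmatch(pattern, text)]

candidates = ["a", "b", "ab", "abb", "abbb", "ba"]

print(accepted(r"a|b", candidates))   # 'or'
print(accepted(r"ab+", candidates))   # 1 or more 'b'
print(accepted(r"ab*", candidates))   # 0 or more 'b'
print(accepted(r"ab?", candidates))   # 0 or 1 'b'
```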
Figure 4.10 A finite state machine (states S0, S1 and S2, with trap state S3)
In this machine the acceptable inputs include 0 or more ‘a’ values followed by
a single ‘b’ and a single ‘c’. Any other inputs would be sent to the trap state.
Therefore, the acceptable inputs would be written as: a*bc
[Figure: a second finite state machine, with states S0, S1, S2 and S3]
In this finite state machine the accepting state can be reached by inputting ‘a’
followed by 0 or more repetitions of ‘bc’.
This would be written as a(bc)*
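The link between a finite state machine and a regular expression can be tested in code. The sketch below models the first machine (Figure 4.10) as a transition table and compares it with the regular expression a*bc. The table is inferred from the prose description, so treat it as an assumption: S2 is taken to be the accepting state and S3 the trap state.

```python
import re

# Transitions inferred from the description of Figure 4.10 (an assumption).
TRANSITIONS = {
    ("S0", "a"): "S0",   # 0 or more 'a' values
    ("S0", "b"): "S1",   # a single 'b'
    ("S1", "c"): "S2",   # a single 'c'
}

def fsm_accepts(text, start="S0", accepting=("S2",), trap="S3"):
    state = start
    for symbol in text:
        # Any input not listed in the table leads to the trap state.
        state = TRANSITIONS.get((state, symbol), trap)
    return state in accepting

for sample in ["bc", "abc", "aaabc", "ab", "abcc", "acb", ""]:
    matches_regex = re.fullmatch(r"a*bc", sample) is not None
    print(repr(sample), fsm_accepts(sample), matches_regex)
```

For every sample the two columns agree, which is what we would expect if the machine and the expression describe the same language.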
Regular languages
A formal language is made up of words that contain letters from the alphabet
and where those words fit a set of rules.
A regular language is a formal language that can be represented using a
regular expression. If the language cannot be represented using a regular
expression then it is not a regular language.
A regular language can also be represented using a finite state machine.
b *
c ?
d |
24 Identify another way to describe a regular expression.
25 List the values which would be accepted by each regular expression:
a ab(c|d)
b ab?cd
c (ab)|(cd+)
d ab*a
26 What type of language can be described using a regular expression?
Answers available online
Revision activity
✚ For each regular expression you can find, create the equivalent finite state machine.
✚ For each finite state machine you can find with an accepting state, create the equivalent regular expression.
Context-free languages
Backus–Naur form (BNF) and syntax
diagrams
Backus–Naur form
Backus–Naur form (BNF) is a method of describing syntax rules.
Backus–Naur form (BNF) A notation used to describe the syntax of a language.
Syntax The set of rules for a given language.
Production rule The set of acceptable inputs for a given symbol.
Terminal A single value that cannot be broken down into smaller parts.
In BNF, each type of value is described as containing values from a set.
In each production rule, the symbol on the left is replaced with (::=) the values on the right. The symbol ::= can be read ‘is defined as’.
Values in <angle brackets> must be further broken down and defined until all values are made up of terminal values (that is, values that cannot be broken down any further). The symbol | means ‘or’.
In the example below, spaces are represented with an underscore.
For example, the syntax rules for entering a name could be defined as follows:
<fullname> ::= <title> _ <name>
Recursion is used in lines 3 and 4. To see how this works, consider how the set word can contain ‘DAVE’.
✚ Line 4 says if a word is not defined as a single letter then it is defined as a word followed by a letter.
✚ This would replace the word ‘DAVE’ with the word ‘DAV’ followed by the letter ‘E’.
✚ The new word ‘DAV’ is replaced with the word ‘DA’ followed by the letter ‘V’.
✚ The new word ‘DA’ is replaced with the word ‘D’ followed by the letter ‘A’.
✚ Finally, the word ‘D’ is replaced with a value from the set letter, which contains the terminal value ‘D’.
The set might contain other values which need to be defined. For example, a fullname is made up of further sets title and name. The set might be defined as containing one of a series of terminal values which do not need to be defined.
For example, acceptable titles include Mr, Miss, Mrs and Ms.
The set might use recursion to allow for values that can be of variable length.
For example, a name is made up of a word, or a name followed by a word.
‘Dave’ is a word, and so can be accepted as a name.
‘Dave Andrew’ is a name followed by a word, and so can be accepted as a name.
‘Dave Andrew Smith’ is a name followed by a word, and so can be accepted as a name.
Exam tip
When asked in an exam, BNF definitions that do not use recursion can be described using a regular expression (such as fullname, title and letter in the previous example). BNF definitions that do use recursion cannot be described using a regular expression (such as name and word in the previous example). This is a simplification of the real world but provides the expected answer for A-level exam requirements.
Syntax diagrams
Syntax diagrams are an alternative way to represent BNF.
Syntax diagram A diagram used to describe BNF graphically.
Definitions that contain terminals tend to be quite straightforward, demonstrating each possible value. For instance, a digit can be described as:
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
[Syntax diagram for Digit: one branch for each of the terminals 0 to 9]
Definitions that contain recursion are drawn using a loop, so that an integer can be described as:
<integer> ::= <digit> | <digit><integer>
[Syntax diagram for Integer: containing digit and integer]
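A recursive production rule maps naturally onto a recursive subroutine. The sketch below checks whether a string fits the rule for an integer given above; the function name is_integer and the choice to reject the empty string are assumptions made for this illustration.

```python
DIGITS = set("0123456789")   # <digit> ::= 0 | 1 | 2 | ... | 9

def is_integer(text):
    """Check text against <integer> ::= <digit> | <digit><integer>."""
    if len(text) == 1:
        return text in DIGITS                     # a single digit
    if len(text) > 1:
        # A digit followed by a (shorter) integer: the recursive case.
        return text[0] in DIGITS and is_integer(text[1:])
    return False                                  # empty input fits neither option

print(is_integer("7"), is_integer("2024"), is_integer("12a"))
```

Each recursive call deals with one digit and passes the rest of the string on, in the same way that the loop in a syntax diagram is followed once per digit.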
The rules for a name are given using the following syntax rules:
<name> ::= <word> | <word> _ <name>
<word> ::= <letter> | <letter> _ <word>
<letter> ::= A|B|C|D|E|F|G|H|…
27 Give one example of a terminal value in this syntax.
28 Explain why the rule for a name cannot be described using a regular expression.
29 Represent the production rule for a word using a syntax diagram.
Revision activity
✚ Identify patterns and rules in constructions such as postcodes and addresses, and
try to represent these using BNF.
✚ For each production rule, create the equivalent syntax diagram.
Classification of algorithms
Comparing algorithms
It is important to be able to compare the efficiency of different algorithms in order to identify which is the best solution for each particular problem.
Efficiency Being able to complete a task with the least use of time or memory.
As the size of the data to be processed grows, the efficiency of the algorithm becomes much more important. Sorting a list of 10 values with an inefficient algorithm may not cause a problem, but when sorting a list of 100 000 values it is much more important to choose wisely.
The efficiency of an algorithm can be described by looking at two main aspects:
✚ Time-wise complexity: Algorithms that have a lot of steps or comparisons will take longer to compute for large data sets.
✚ Space-wise complexity: Algorithms that store a lot of temporary values while the processes take place will take up a lot of memory.
Time-wise complexity A measure of how much time, or how many steps, will be required to complete an algorithm.
Space-wise complexity A measure of how much memory will be required to complete an algorithm.
An ideal algorithm will be efficient in terms of both time to execute and space needed for memory.
However, depending on the particular situation, sometimes it might be more
important to:
✚ choose an algorithm which is time-efficient even if it uses a lot of memory
to store values temporarily during execution
✚ choose an algorithm which is space-efficient even if it might take longer to
carry out the instructions.
Note
Recursive algorithms are often considered to be elegant solutions as they can be
described using few lines of code. They are typically inefficient however, particularly
space-wise, as the local variables for each call need to be stored while any further
calls are running.
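The space cost of recursion can be made visible by counting how many calls are open at once. The sketch below is an illustration under simple assumptions: it adds up a list recursively and iteratively, and the hypothetical depth parameter records the deepest call reached.

```python
def total_recursive(values, depth=1):
    """Return (sum of values, deepest call reached)."""
    if not values:
        return 0, depth
    # This call must wait, keeping its local variables, until the next one returns.
    rest_total, max_depth = total_recursive(values[1:], depth + 1)
    return values[0] + rest_total, max_depth

def total_iterative(values):
    """Return the sum of values using a single running total."""
    total = 0
    for value in values:
        total += value
    return total

print(total_recursive([3, 5, 8, 2]))
print(total_iterative([3, 5, 8, 2]))
```

For a list of four values the recursive version has five calls open at its deepest point, whereas the iterative version only ever stores one running total.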
Functions
A function is a method of mapping one set of values to another.
For example, the function y = 2x maps each value of x to a new value, y.
x → y (y = 2x)
−3 → −6
−2 → −4
−1 → −2
0 → 0
1 → 2
2 → 4
3 → 6
Figure 4.14 A mapping function
Making links
The issues surrounding function terminology, including domains, co-domains and sets, are typically only assessed in questions relating to functional programming. Functional programming is covered in detail in Chapter 12.
The set of input values is referred to as the domain, and the set from which the output values are drawn is referred to as the co-domain.
If the domain (input values) is made of integers and the co-domain is also made of integers then this can be described using the format ℤ → ℤ.
Consider each of the following graphs.
✚ The graph of y = 3 represents a constant function.
The value of y is constant and does not change as x changes.
✚ The graph of y = 2x represents a linear function.
The value of y goes up in a straight line, so if x doubles, so does y.
✚ The graph of y = 2x^2 represents a polynomial function.
The value of x is raised to a power. Other examples include graphs with x^3, x^4 and so on.
✚ The graph of y = 2^x represents an exponential function.
A value is raised to the power of x. Whenever x is used as a power the value of y will go up increasingly quickly.
✚ The graph of y = log₂ x represents a logarithmic function.
The value of y goes up very slowly. Using y = log₂ x is the opposite of using 2^x.
✚ The graph of y = x! represents a factorial function.
The value of y goes up extraordinarily quickly.
Linear Rising in a straight line; for example, the graph of y = 2x.
Polynomial An expression involving powers; for example, y = 2x^2.
Exponential An expression where the variable is used as an exponent, or power; for example, y = 2^x.
Logarithmic The opposite of exponential. If y = 2^x, then x = log₂ y.
Figure 4.15 A graph representing different types of function (key: y = log x, y = x, y = n log n, y = x^2, y = 2^x, y = x!)
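The difference in growth between these types of function is easier to appreciate with actual numbers. The sketch below tabulates each function for a few values of n; the helper name growth_row and the sample values are chosen here for illustration.

```python
import math

def growth_row(n):
    """Return the value of each type of function for a given n."""
    return {
        "log2 n": math.log2(n),
        "n": n,
        "n log2 n": n * math.log2(n),
        "n^2": n ** 2,
        "2^n": 2 ** n,
        "n!": math.factorial(n),
    }

for n in (1, 2, 4, 8, 16):
    print(n, growth_row(n))
```

Even by n = 16 the exponential and factorial columns are far larger than the others, which is why they leave the top of a graph like this one almost immediately.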
2^0     2^1     2^2     2^3     2^4      2^5      2^6      2^7       2^8
1       2       4       8       16       32       64       128       256
log₂ 1  log₂ 2  log₂ 4  log₂ 8  log₂ 16  log₂ 32  log₂ 64  log₂ 128  log₂ 256
0       1       2       3       4        5        6        7         8
Factorial The product of all positive integers less than or equal to a given integer.
Permutation One of the different ways that a set can be arranged.
Factorials and permutations
It is also important to be familiar with the concept of a factorial.
n! (pronounced ‘n factorial’) multiplies all of the positive integers up to and including n. For example:
5! = 5 × 4 × 3 × 2 × 1
6! = 6 × 5 × 4 × 3 × 2 × 1
The graph of n! grows more rapidly than an exponential function.
A factorial is used to describe the number of permutations of a set of values.
For example, if I want to visit cities A, B and C in any order then I could use any of the following permutations (written in round brackets because, unlike in a set, the order matters):
(A,B,C), (A,C,B), (B,A,C), (B,C,A), (C,A,B), (C,B,A)
If there are three values in the set, then there are 3! (= 6) possible permutations.
If there are four values in the set, then there are 4! (= 24) possible permutations.
Now test yourself
33 Give an example of a linear function.
34 What type of function is the graph of y = 2x^3?
35 Which graph rises most steeply: an exponential function or a logarithmic function?
36 What type of function is y = 1?
Answers available online
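The link between factorials and permutations can be confirmed with Python's standard library. This is a minimal sketch using itertools.permutations and math.factorial; the list of city names follows the example in the text.

```python
from itertools import permutations
from math import factorial

cities = ["A", "B", "C"]

# Every possible order in which the three cities could be visited.
routes = list(permutations(cities))

print(routes)
print(len(routes), factorial(len(cities)))
```

Both numbers printed on the last line are the same, and adding a fourth city would raise both to 24.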
Order of complexity
Time-wise efficiency is described using Big-O notation.
This is used to put the time efficiency of different algorithms into some sort of order. The notation is written in the form O(n), where n is the size of the input (for example, the number of items in a list), and different terms are used inside the brackets for each order of complexity.
Constant time
Finding the first item in a list, or jumping straight to a value at a given index,
always takes exactly one step.
An algorithm that always takes the same number of steps, no matter how big the list, has a time-wise complexity of O(1). This is true whether that fixed number is one step, two steps or three steps, because Big-O notation ignores constant factors.
Logarithmic time
A binary search of n values will usually take very little time because the
possible list of remaining values is halved each time. This halving at each step
means that the maximum possible number of steps taken to find the right
item goes up on a logarithmic scale.
We can, therefore, state that the time-wise complexity of the binary search is O(log n).
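The effect of halving the list can be seen by counting the comparisons a binary search makes. The sketch below is one common way to write the algorithm, not the only one; the steps counter is added purely to illustrate the logarithmic growth.

```python
def binary_search(sorted_values, target):
    """Return (index of target or None, number of comparisons made)."""
    low, high = 0, len(sorted_values) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid, steps
        if sorted_values[mid] < target:
            low = mid + 1      # discard the lower half
        else:
            high = mid - 1     # discard the upper half
    return None, steps

data = list(range(1024))
index, steps = binary_search(data, 1023)
print(index, steps)
```

A list of 1024 values never needs more than 11 comparisons, because log₂ 1024 = 10 halvings reduce the list to a single value.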
Linear time
In the worst case, a linear search of n items will take n steps. If the item is the last one in the list then every item must be checked before it is found.
We can therefore state that the time-wise complexity of the linear search is O(n).
Loglinear (n log n) time
The merge sort is a very efficient algorithm which, much like the binary search, involves splitting lists in half repeatedly. Sorting is always a more complex task than search, and so the time-wise complexity is not quite as small.
The time-wise complexity of the merge sort is O(n log n).
Notes
The term loglinear is not often used; simply writing n log n is more common.
Polynomial time refers to time-wise complexity where n is raised to a power; for example, O(n^2), O(n^3), O(n^4), and so on.
Polynomial time
To sort a list of n values using the bubble sort, each pass would require (n - 1) comparisons.
To fully sort the list it will be necessary to complete (n - 1) passes. The total steps would therefore be (n - 1) × (n - 1).
However, for a large list, n ≈ n - 1 (that is, n is almost equal to (n - 1)).
So, (n - 1)(n - 1) ≈ n^2.
We can, therefore, state that the time-wise complexity of the bubble sort is O(n^2).
Other algorithms that can be expressed as O(n^2) time-wise complexity include the insertion sort and selection sort.
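The (n - 1) × (n - 1) figure can be checked by counting comparisons in a simple bubble sort. The sketch below assumes the unoptimised version described above, with no early exit and no shortening of later passes; the comparisons counter is added for illustration.

```python
def bubble_sort(values):
    """Return (sorted copy of values, number of comparisons made)."""
    items = list(values)
    n = len(items)
    comparisons = 0
    for _ in range(n - 1):            # (n - 1) passes
        for i in range(n - 1):        # (n - 1) comparisons per pass
            comparisons += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items, comparisons

print(bubble_sort([5, 3, 8, 1]))
print(bubble_sort(range(10, 0, -1))[1])
```

Four values need 3 × 3 = 9 comparisons and ten values need 9 × 9 = 81, so the work grows roughly with the square of n.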
Exponential time
The time-wise complexity for an algorithm that uses exponential time will be written as O(k^n), where k is a constant and the value is raised to the power of n.
An example could be trying every possible value of a passcode. If a passcode uses four digits then there are 10^4 possible values to try. If the passcode uses six digits then there are 10^6 possible values.
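The passcode example can be sketched in code. The function names count_passcodes and crack are hypothetical, and the search simply tries every code in order, which is the exhaustive approach the text describes.

```python
from itertools import product

def count_passcodes(length, symbols="0123456789"):
    """Number of possible codes: (number of symbols) to the power of length."""
    return len(symbols) ** length

def crack(secret, symbols="0123456789"):
    """Try every code of the same length; return the number of attempts needed."""
    guesses = product(symbols, repeat=len(secret))
    for attempts, guess in enumerate(guesses, start=1):
        if "".join(guess) == secret:
            return attempts
    return None

print(count_passcodes(4), count_passcodes(6))
print(crack("9999"))
```

Adding two digits to the passcode multiplies the number of attempts by 100, which is the hallmark of exponential time.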
Factorial time
Factorial time is even worse than exponential time. The travelling salesman problem is an example of a problem where checking every possible route has a time-wise complexity of O(n!).
To make it easier to remember these time complexities, and examples of each,
consider this graph:
[Graph: Big-O notation, showing time for each order of complexity, with regions labelled Tractable and Intractable and curves including O(n!), O(k^n) and O(n^2) (bubble sort)]
Exam tip
Remember that searching is always simpler than sorting, and for each topic there
complexity O(n^2).
39 Explain why the linear search has a time-wise complexity of O(n).
40 State the time-wise complexity for the binary search.
41 Explain why the bubble sort has a time complexity of O(n^2).
Limits of computation
Although many problems can be solved with enough time and enough memory,
it is important to consider the physical constraints of computing hardware.
A time-efficient algorithm that uses a large amount of memory may not be
able to function effectively once all available RAM is used.
Similarly, while CPUs run at enormously high frequencies, time-complex
algorithms may still take longer than is reasonable to carry out their
processing.
As such, even if a problem can be computed in theory, hardware limits may
mean that it is not possible to compute it in a practical setting.
Now test yourself
42 What is meant by a tractable problem?
43 What two features define an intractable problem?
44 Suggest three strategies for tackling an intractable problem.
Answers available online
Revision activity
Investigate problems such as the travelling salesman problem, the bin-packing problem and the knapsack problem to see examples of intractable problems.
Some problems are non-computable. That means that they simply cannot be solved using a computer.
Halting problem
The halting problem is: given a computer program and an input, is it possible to determine whether that program would loop forever or would eventually stop, without running it?
A mathematical proof shows that an algorithm to solve the halting problem cannot exist.
Non-computable A problem that cannot be solved by a computer.
Halting problem A specific example of a non-computable problem that proves that some problems are non-computable.
Note
The proof for the halting problem is not required for A-level computer science, but it involves demonstrating that a successful solution would create a paradox, which is therefore impossible.
The significance of the halting problem is that it demonstrates that there are
some problems that cannot be solved using a computer.
A model of computation
Turing machine
A Turing machine is a theoretical computing device with a single, fixed program that is made up of:
✚ a finite set of states
✚ a finite alphabet of symbols
✚ an infinite length of tape, with marked off squares
✚ a read-write head that can travel across the tape, one square at a time.
Turing machine A model of computation with a single program which manipulates symbols on a strip of tape.
Starting state The state a Turing machine is in when it starts its program.
Halting state A state with no outgoing transitions, and so the Turing machine will stop.
Because a Turing machine uses a finite set of states, it can be described using a finite state machine (FSM), either using a state transition diagram or a state transition table.
An FSM should have one state as the starting state.
Any states with no outgoing transitions are called halting states.
Empty squares on the tape can be recorded using a number of symbols, but it is typical for AQA to use the # symbol.
Each transition is labelled with an input, output and an arrow indicating the movement of the read-write head.
An example of a Turing machine is described below using a state transition diagram.
Figure 4.16 [State transition diagram with states S0 and S1 and halting state SH; transition labels 0|0, 1|1, 0|1, 1|0 and #|#]
The diagram in Figure 4.16 could be shown using this state transition table:
The tape for the Turing machine is shown below, with an asterisk showing the current position of the read-write head.
… # 0 1 1 0* … S0
When the first value is read, the Turing machine is in State 0 and the input
is 0.
Reading the transition function for this situation, the state should remain as
State 0, the output 0 should be written to the tape and the read-write head
should move to the left.
This leads to the result below:
… # 0 1 1* 0 … S0
When the next value is read, the Turing machine is in State 0 – and the input
is 1.
Reading the transition function for this situation, the state should be set to
State 1, the output 1 should be written to the tape and read-write head should
move to the left.
… # 0 1* 1 0 … S1
… # 0 1 1 0* … S0
… # 0 1 1* 0 … S0
… # 0 1* 1 0 … S1
… # 0* 0 1 0 … S1
… #* 1 0 1 0 … S1
… # 1* 0 1 0 … SH
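The trace above can be reproduced with a short simulation. The transition rules in this sketch are inferred from the worked trace rather than copied from a transition table, so treat them as an assumption; in particular, the rule for reading # while in S0 is a guess based on the diagram labels. The names RULES and run are chosen for illustration.

```python
# (state, symbol read) -> (symbol written, head movement, next state)
# Inferred from the worked trace: -1 moves the head left, +1 moves it right.
RULES = {
    ("S0", "0"): ("0", -1, "S0"),
    ("S0", "1"): ("1", -1, "S1"),
    ("S0", "#"): ("#", +1, "SH"),   # assumption: not exercised by the trace
    ("S1", "0"): ("1", -1, "S1"),
    ("S1", "1"): ("0", -1, "S1"),
    ("S1", "#"): ("#", +1, "SH"),
}

def run(tape, head, state="S0", halting=("SH",)):
    """Return a list of (tape contents, head position, state) rows."""
    tape = list(tape)
    trace = [("".join(tape), head, state)]
    while state not in halting:
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += move
        trace.append(("".join(tape), head, state))
    return trace

for row in run("#0110", head=4):
    print(row)
```

The simulation produces the same six rows as the hand trace, finishing in the halting state with 1010 on the tape.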
The purpose of this Turing machine is to calculate the two’s complement of a binary number.
Now test yourself
48 What four things are needed for a Turing machine?
49 Give two ways that a Turing machine can be represented.
50 What is the name for a state with no outgoing transitions?
51 A state transition table contains a starting state and input, plus what three other pieces of data?
Answers available online
Revision activity
✚ Try creating your own Turing machine to carry out bit-wise operations such as two’s complement, left shifts and right shifts.
✚ Find examples of simple Turing machines online and hand-trace them using your own inputs.
Summary
✚ Logic problems can be solved using logical deductions
✚ An algorithm is a series of steps to complete a task, that always terminates
✚ Algorithms written in pseudo-code solve problems using the standard programming constructs: sequence, assignment, selection, iteration
✚ Algorithms can be hand-traced by completing a trace table
✚ In recursive algorithms, trace tables should include the call number
✚ It is important to be able to convert the description of an algorithm, whether in English or pseudo-code, into program code in your chosen language
✚ Where they are provided, always make sure to use the exact variable names and prompts as they appear in the exam paper
✚ Abstraction is a technique for simplifying a problem by removing or hiding complexity and includes: representational abstraction, generalisation, information hiding, procedural abstraction, functional abstraction, data abstraction and problem abstraction/reduction
✚ Decomposition is the technique of breaking a problem into increasingly small sub-problems
✚ Composition is the technique of combining individual subroutines to create compound subroutines that solve larger problems
✚ Automation is the technique of designing, implementing, and executing solutions to complex problems
✚ A finite state machine is a model of a system that can be described using a fixed number of states
✚ Each time an input is provided to a finite state machine there will be a transition to a new state
✚ A Mealy machine is a finite state machine where each input provides a specific output as well as a transition between states
✚ Finite state machines can be described using state transition diagrams and state transition tables
✚ A set is an unordered list of values, with no duplicates
✚ Sets can be described using set comprehension symbols; for example:
A = { x | x ∈ ℕ ∧ x ≥ 1 }
✚ Sets can also be described using compact representation; for example:
{ a^n b^n | n > 0 ∧ n < 4 }
✚ The cardinality of a set refers to the number of elements within it
✚ Some sets are finite, and some are infinite
✚ The elements within countably infinite sets can be counted off by the natural numbers
✚ The Cartesian product of two sets contains the set of all ordered pairs from those sets
✚ All values in a subset also belong in the superset
✚ All values in a proper subset belong in the superset, but this doesn’t contain all of the values from the superset
✚ The union of two sets includes all values from both sets combined
✚ The intersection of two sets only contains those values in both sets
✚ The difference of two sets contains only the values that are in one set, but not both
✚ A regular expression is a notation used to describe a set
✚ Regular expressions are used in searching and validation
✚ The symbols *, +, ?, and | are used in regular expressions and have specific meanings
✚ Regular expressions can also be represented using a finite state machine, and vice versa
✚ It is possible to be asked to represent a finite state machine as a regular expression, and vice versa
✚ A language is classed as a regular language if it can be represented as a regular expression
✚ The syntax rules of a language can be checked using Backus–Naur form or syntax diagrams
✚ Rules in BNF that do not use recursion can be represented using a regular expression
✚ If BNF uses recursion, then it can represent languages that cannot be represented using regular expressions
✚ The efficiency of algorithms can be described time-wise and space-wise
✚ Functions map one set of values (the domain) to another set of values (co-domain)
✚ The number of permutations of a set of n objects is n! (where, for example, 4! = 4 × 3 × 2 × 1)
✚ Big-O notation is used to describe the time-wise efficiency of algorithms
✚ Searching algorithms include the binary search, O(log n) (logarithmic time), and the linear search, O(n) (linear time)
✚ Sorting algorithms include the merge sort, O(n log n), and the bubble sort, O(n^2) (polynomial time)
✚ Other algorithms use constant time, O(1), exponential time, O(k^n) and factorial time, O(n!)
✚ Tractable problems can be solved in a reasonable (polynomial or less) time
✚ Intractable problems can be solved, but not in a reasonable time (because of hardware limits, for example)
✚ Heuristic methods such as reducing complexity or accepting a close-enough solution are used to solve intractable problems
✚ Some problems, such as the halting problem, cannot be solved algorithmically
✚ The halting problem states that a program that can check whether or not another program will loop forever, given a specific input, cannot exist
✚ The halting problem means that some problems cannot be solved by a computer
✚ A Turing machine is a theoretical computing device that can carry out one fixed program
✚ A Turing machine has a finite set of states, a finite alphabet, an infinite strip of tape and a read-write head
✚ Any algorithm that can be computed can be described using a Turing machine, and vice versa
✚ A Turing machine can be described using a finite state machine, state transition diagrams, state transition tables and transition functions
✚ A universal Turing machine can read in the description for any Turing machine and its tape in order to simulate it
✚ A universal Turing machine is a theoretical machine, more powerful than any real computer because it has infinite memory / an infinite tape, and provides the theoretical basis for modern computers
Exam practice
1 A 7-bit code, 010 1011 is processed by the finite state machine, as shown.
The code is processed from the most significant bit, so the first value processed will be 0.
[Figure: finite state machine with states SA and SB; transition labels 1|1, 1|0, 0|0 and 0|1]
a) State the output when the 7-bit code 010 1011 is processed. [1]
b) The last value that is output from the finite state machine is added to the 7-bit code before it is transmitted.
State the purpose of the final bit. [2]
c) The finite state machine could also be represented using a state transition table. Copy and complete the state
transition table below. [3]
d) A Turing machine to achieve the same outcome is shown, along with its tape, below. The current position is
labelled with an asterisk (*).
[Figure: Turing machine with states SA, SB and SC; transition labels 1|1, 1|0, 0|0, 0|1, #|0 and #|1]
… 1* 1 0 0 1 0 1 # # … SA
Copy and complete this trace table by hand-tracing the Turing machine. [4]
… 1* 1 0 0 1 0 1 # # … SA
… …
… …
… …
… …
… …
… …
… …
… …
e) As well as a finite set of states and a tape, identify two other features required by a Turing machine. [2]
FUNCTION Fold(List)
  IF Len(List) > 1 THEN
    Head = List[0]
    Tail = Copy(List, 1)
    RETURN Head * Fold(Tail)
  ELSE
    RETURN List[0]
  ENDIF
ENDFUNCTION
a) Copy and complete this trace table by hand-tracing the subroutine call Fold([3,5,8,2]). [4]
b) Explain why recursive algorithms are often considered to be space-wise inefficient. [2]
c) State the time-wise complexity for each of the following algorithms. [4]
d) Explain why the time-wise complexity for a bubble sort is O(n^2). [2]
e) State what is meant by an intractable problem. [2]
f) Suggest two possible approaches to solving an intractable problem. [2]
g) Explain the significance of the halting problem. [1]
3 Abstraction and automation are essential techniques in computer programming.
a) Explain what is meant by abstraction. [2]
b) Copy and complete this table by entering the appropriate letter in each case. [6]
A Abstraction by generalisation
B Data abstraction
C Information hiding
D Composition
E Functional abstraction
F Problem abstraction
Description Abstraction
Combining individual functions to create compound functions
Removing unnecessary details until the problem can be
[Figure: finite state machine with states S0, S1, S2 and S4; transition labels a, b and c]
Input Accepted?
12ab
ab12
3ab1
2abc
321cba
21caabc
5 Fundamentals of data representation
Number systems
Numbers can be grouped into different sets.
Many standard sets of numbers can be indicated using specialised set notations such as ℕ, ℤ, ℚ and ℝ.
Natural numbers
The set of natural numbers, ℕ, is the set of positive integers together with 0. A handy way to remember this is that, when humans first started counting, these were the numbers that were obvious: 0 people, one person, two people, three people, and so on.
ℕ = { 0, 1, 2, 3, 4, … }
The set of natural numbers is infinitely large, so we use the notation ‘…’ to show that the set carries on forever.
Natural numbers Positive integers, including 0. Numbers used to count things.
Integer numbers
The set of integer numbers, ℤ, is the set of all integers, both positive and negative.
ℤ = { … , −3 , −2 , −1 , 0 , 1 , 2 , 3 , … }
Integer numbers Whole numbers, including both positive and negative values.
Rational numbers
The set of rational numbers, ℚ, is the set of all numbers that can be represented as fractions. This includes integers, since all integers can be written as a fraction over 1 (for example, 3 = 3/1).
ℚ includes { … , −2 , −4/3 , 0 , 1/100 , … }
Rational numbers Any number that can be represented as a fraction, including an integer.
Irrational numbers
The set of irrational numbers is the set of all real numbers which are not rational numbers (in other words, numbers that cannot be represented as fractions). It does not use a specialised set notation, though it can be considered as the set of all real numbers, minus all rational numbers (ℝ \ ℚ).
The set of irrational numbers includes { … , π , √2 , e , … }
Irrational numbers Any number that cannot be represented as a fraction, including π and square roots of non-square numbers.
Real numbers
The set of real numbers, ℝ, contains all rational and irrational numbers. It does not include imaginary numbers, such as i or √−2.
Real numbers The collection of all rational and irrational numbers.
It is important to recognise the relationships between the sets. For example, the set of natural numbers is a subset of the set of integer numbers; the set of integer numbers is a subset of the set of rational numbers; the set of rational numbers and the set of irrational numbers are both subsets of the set of real numbers.
Making links
Sets are discussed in the Component 1 topic on regular expressions (Chapter 4), with
additional operators that can be used to define bespoke sets.
Ordinal numbers
Ordinal numbers are used to describe the order in which numbers appear. For
example, in the ordered set {apple, banana, clementine}, ordinal numbers can
be used to indicate that ‘clementine’ is the third item in the set.
Number Irrational
7
4.3
−2
√5
0
Number bases
Numbers are traditionally represented in powers of 10, because humans typically have 10 fingers and thumbs on which to count. This is referred to as base 10, denary, or decimal.
Decimal numbers can be written with the number 10 as a subscript; for example, 43₁₀.
In computer systems, which are based on on/off circuits, numbers are represented in powers of 2. This is referred to as base 2, or binary.
Binary numbers can be written with the number 2 as a subscript; for example, 0010 1011₂.
A different number system is used as a shorthand for binary that uses one digit to represent each group of four binary digits. Each group of four binary digits can have one of 16 values, from 0 to 15. This number system is referred to as base 16, or hexadecimal.
Number base The number of digits available in that number system.
Decimal Numbers with base 10.
Denary An alternative less ambiguous term for base 10 numbers.
Binary Numbers with base 2.
Hexadecimal Numbers with base 16.
5 What name is used for the number system based on two digits?
6 What name is used for the number system based on ten digits?
7 Why is it sometimes preferable to represent numbers from a computer system using hexadecimal?
Answers available online
Note
It is important to remember that computer systems always store and process data in a binary format. These representations may be used for input or output, but computers are only capable of storing values as 0s and 1s.
Converting between binary and decimal but computers are only
To fully understand how to convert between number bases you must be familiar with the concept of place value. We can write a decimal number such as 503 more formally like this:
Place value      100 (=10^2)  10 (=10^1)  1 (=10^0)
Decimal number   5            0           3
To obtain the decimal number we multiply each decimal digit by its place value: (5 × 100) + (0 × 10) + (3 × 1) = 503.
To convert a binary number to its decimal equivalent, we follow exactly the same process. However, the place values in binary are different because they are powers of 2 rather than powers of 10.
Starting at the right-hand side, the least significant bit (LSB) has a place value of 1. Each subsequent place value, moving left, is worth double the previous value.
Least significant bit (LSB) The right-most bit in a binary number. The bit with the smallest place value.
Place value     128 (=2^7)  64 (=2^6)  32 (=2^5)  16 (=2^4)  8 (=2^3)  4 (=2^2)  2 (=2^1)  1 (=2^0)
Binary number   0           1          1          0          1         0         1         0
The decimal value of the binary number is found by multiplying each binary
digit by its place value and adding them all together. In this example the
binary number 01101010 is expressed in decimal as:
(128 × 0) + (64 × 1) + (32 × 1) + (16 × 0) + (8 × 1) + (4 × 0) + (2 × 1) + (1 × 0)
= 64 + 32 + 8 + 2
= 106₁₀
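The place value method can be sketched in a few lines of Python. This is only an illustration; the function name `binary_to_decimal` is chosen here for the example and is not part of the specification.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert an unsigned binary string to decimal using place values."""
    total = 0
    place_value = 1  # the least significant bit has a place value of 1
    for digit in reversed(bits.replace(" ", "")):
        total += int(digit) * place_value
        place_value *= 2  # each place to the left is worth double
    return total


print(binary_to_decimal("0110 1010"))  # 106
```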
To convert a decimal number to its binary equivalent there are two main
methods. The more intuitive method is to remove the largest possible power
of 2, working from the most significant bit (MSB), and removing values one at
a time.
Most significant bit (MSB) The left-most bit in a binary number. The bit with the largest place value.
Worked example
Convert 123₁₀ to binary.
123₁₀ = 64 + 59
5 Fundamentals of data representation
59₁₀ = 32 + 27
Worked example
Convert 123₁₀ to binary.
123₁₀ ÷ 2 = 61 remainder 1.
61₁₀ ÷ 2 = 30 remainder 1.
30₁₀ ÷ 2 = 15 remainder 0.
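Both conversion methods can be followed through to the end in code. The sketch below is a minimal illustration, assuming an 8 bit result for the subtraction method; the function names are invented for this example.

```python
def by_subtraction(value: int, bits: int = 8) -> str:
    """Remove the largest possible power of 2, starting from the MSB."""
    result = ""
    for power in range(bits - 1, -1, -1):
        place_value = 2 ** power
        if value >= place_value:
            result += "1"
            value -= place_value
        else:
            result += "0"
    return result


def by_division(value: int) -> str:
    """Divide by 2 repeatedly; the remainders, read last to first, give the bits."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 2)
        remainders.append(str(remainder))
    return "".join(reversed(remainders))


print(by_subtraction(123))  # 01111011
print(by_division(123))     # 1111011
```

The division method produces no leading zeros, so the two results differ only in padding.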
Because each binary number is split into blocks of 4 binary digits, even large
numbers are quick and simple to convert:
1001 1101₂ = 9D₁₆
The reverse process involves expanding each hexadecimal digit back into its
binary equivalent.
F3₁₆ = 1111 0011₂
To convert from decimal to hexadecimal it is possible to convert to binary first:
173₁₀ = 1010 1101₂ = AD₁₆
Alternatively, the hexadecimal value can be found directly by dividing the
decimal number by 16 and representing the quotient and remainder in
hexadecimal.
Worked example
Convert 173₁₀ to hexadecimal.
173₁₀ ÷ 16
= 10 remainder 13
= AD₁₆
As 10₁₀ = A₁₆ and 13₁₀ = D₁₆
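The two routes into hexadecimal (grouping bits into nibbles, or dividing by 16) can be sketched as follows. The names `binary_to_hex` and `decimal_to_hex` are assumptions made for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Replace each block of four bits with one hexadecimal digit."""
    bits = bits.replace(" ", "")
    bits = bits.zfill((len(bits) + 3) // 4 * 4)  # pad on the left to whole nibbles
    return "".join(HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))


def decimal_to_hex(value: int) -> str:
    """Divide by 16 repeatedly, writing each remainder as a hexadecimal digit."""
    if value == 0:
        return "0"
    digits = ""
    while value > 0:
        value, remainder = divmod(value, 16)
        digits = HEX_DIGITS[remainder] + digits
    return digits


print(binary_to_hex("1001 1101"))  # 9D
print(decimal_to_hex(173))         # AD
```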
Units of information
Bits and bytes
A single binary digit is referred to as a bit (a shortening of the words
Binary digit).
Each bit can take the value of a 0 or a 1.
In order to improve the ease of reading binary data, bits are usually written in
blocks of four digits, referred to as a nibble.
For most purposes, eight binary digits are grouped together as a byte.
The number of unique combinations of values within a binary value is always
a power of 2.
If there are 2 bits then there are 2² possible combinations. 2² = 4.
00, 01, 10, 11
If there are 3 bits then there are 2³ possible combinations. 2³ = 8.
000, 001, 010, 011, 100, 101, 110, 111
If there are 8 bits then there are 2⁸ combinations. 2⁸ = 256.
In general, if there are n bits then there are 2ⁿ combinations.
Exam tip
If you are asked to complete a list of all unique combinations (for example, for a trace table) then write out the binary numbers in numerical order. That is, in order of the decimal values 0, 1, 2, 3, and so on.
Units
In modern computing a large amount of binary data is stored, transferred or
used, and so larger units are required.
As is the case with all metric measurements (grams, metres, and so on), a
decimal prefix is used for multiples of 1000 (10³).
Decimal prefix A shorthand used for multiples of 1000 (10³) bytes; for example, kilobyte, megabyte, gigabyte.
1000 bytes = 10³ bytes = 1 kilobyte, or 1 kB
1024 GiB = 2⁴⁰ bytes = 1 tebibyte, or 1 TiB
Remember that a letter ‘i’ in the name always refers to the binary prefix (for example, KiB, MiB, GiB). Always read the question carefully to check whether you are being asked about the metric prefix or the binary prefix.
Now test yourself
14 State the number of bits in a:
a byte
b nibble.
15 How many unique combinations can be made using five binary digits?
16 Without using a calculator, find the number of bytes in:
a 1 megabyte
b 2 kibibytes
c 120 TB
17 Which is bigger, 300 MB or 300 MiB?
Exam tip
Remember that each additional binary digit always doubles the possible range of
values – for example, a 7 bit number has 128 possible values and an 8 bit number has
256 possible values. This fact is critical and questions relying on this knowledge can
crop up anywhere in Component 2, or Component 1.
To add two unsigned binary numbers, write the numbers out one above the
other and add in columns, starting with the least significant bit.
To do this we need to remember sum and carry rules and note the following
when adding binary digits:
0+0=0
0+1=1
1+0=1
1 + 1 = 0 carry 1
1 + 1 + 1 = 1 carry 1
Binary digits can only be 0 or 1, so remember to carry if the answer would be 2
or more in decimal.
0 0 1 1 0 0 1 0 +
1 0 1 1 0 1 0 1
1 1 1 0 0 1 1 1 ← sum
  1 1           ← carry
Exam tip
Always write the numbers in binary. Marks are given for the method as well as the answer. Always show your working for binary arithmetic questions – never just write the answer.
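The sum and carry rules can be followed column by column in code. This is a minimal sketch; `add_binary` is a name invented for the example.

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings, starting from the least significant bit."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = ""
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result = str(total % 2) + result  # the sum digit for this column
        carry = total // 2                # carry 1 if the column total is 2 or 3
    if carry:
        result = "1" + result
    return result


print(add_binary("00110010", "10110101"))  # 11100111
```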
Multiplication
To multiply two binary numbers the method involves:
✚ splitting the smaller number into a sum of powers of 2
✚ multiplying the larger number by each of these powers of 2
✚ adding together the results.
Worked example
Calculate (11 × 5)₁₀ in binary.
Split the smaller number into powers of 2; in this case, 5 = 2² + 2⁰ = 4 + 1
11 × 5 = 11 × (1 + 4)
= (11 × 1) + (11 × 4)
Multiplying by powers of 2 can be done by performing left shifts. For each power of 2,
perform the appropriate left shifts and then add the resulting values.
1011 × 1 = 1011 = 11₁₀
1011 × 2 = 1 0110 = 22₁₀
1011 × 4 = 10 1100 = 44₁₀
Therefore, 11₁₀ × 5₁₀ = 1011₂ + 10 1100₂
Write this out by putting the smaller number across the top and the larger number
across the bottom.
    1 0 1 1 +
1 0 1 1 0 0
1 1 0 1 1 1
  1         ← carry
Finally, convert the result back to decimal:
11 0111₂ = 55₁₀
We can check that this is correct by noting in decimal that 11 × 5 is indeed 55.
Note
It is not necessary to perform binary division. Binary subtraction is explained in the next section, on two’s complement binary.
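The shift-and-add method can be sketched as below. It is an illustration only, assuming unsigned integers; `multiply_binary` is a hypothetical name.

```python
def multiply_binary(larger: int, smaller: int) -> int:
    """Multiply by adding left-shifted copies of the larger number."""
    total = 0
    shift = 0
    while smaller > 0:
        if smaller & 1:               # this power of 2 is part of the smaller number
            total += larger << shift  # a left shift multiplies by that power of 2
        smaller >>= 1
        shift += 1
    return total


product = multiply_binary(0b1011, 0b101)
print(format(product, "b"), product)  # 110111 55
```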
Revision activity
✚ Use a random number generator to choose numbers and practise converting them to
binary and adding them. Convert the result back to decimal to check your accuracy.
✚ Use a random number generator to choose one large number (up to 256) and one
small number (up to 10), then practise converting them to binary and multiplying
them. Convert the result back to decimal to check your accuracy.
−8 4 2 1
1 1 1 0
Exam tip
If you are asked to find the decimal representation of a two’s complement binary
number, or if you want to check your working, remembering to count the most
significant bit as a negative number is the quickest and simplest method.
Note
Remember that −1 is larger than −128. This can seem counter-intuitive but consider
that having −£1 (owing £1) is better than having −£100 (owing £100).
There are two methods for converting between a positive binary number and
its negative equivalent in two’s complement.
Worked example
To check:
−128 64 32 16 8 4 2 1
0 0 1 1 0 1 0 0
−128 64 32 16 8 4 2 1
1 1 0 0 1 1 0 0
Worked example
Method 2: flip from the right
Subtracting binary numbers can sometimes be complicated, and it is possible to carry
out the operations in the wrong order. Some people prefer a different method.
Regardless of the conversion (positive to negative, or negative to positive), copy the
binary sequence, starting with the right-most bit.
Copy the pattern up to and including the first 1 digit, and then flip all of the remaining
bits.
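The flip from the right method can be sketched as follows. The function name is an assumption made for this illustration, and the bit pattern is handled as a string of 0s and 1s.

```python
def negate_flip_from_right(bits: str) -> str:
    """Negate a two's complement value: keep everything up to and including
    the right-most 1, then flip every bit to the left of it."""
    last_one = bits.rfind("1")
    if last_one == -1:
        return bits  # zero has no negative equivalent to find
    kept = bits[last_one:]
    flipped = "".join("1" if bit == "0" else "0" for bit in bits[:last_one])
    return flipped + kept


print(negate_flip_from_right("00110100"))  # 11001100
print(negate_flip_from_right("11001100"))  # 00110100
```

Applying the function twice returns the original pattern, which is a quick way to check working.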
20 How can you tell instantly whether a two’s complement number is positive or
negative?
Use a random number generator to choose
Subtraction
Using two’s complement binary it is possible to subtract two numbers.
For instance, 73 − 14 can be solved by adding +73 to −14.
Worked example
73₁₀ = 0100 1001₂
14₁₀ = 0000 1110₂
To convert 14₁₀ to −14₁₀ in binary, we flip the bits and add 1:
−14₁₀ = 1111 0010₂
Now we perform the calculation: 73 + (−14):
  0 1 0 0 1 0 0 1 +
  1 1 1 1 0 0 1 0
  0 0 1 1 1 0 1 1
1 1               ← carry
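Subtraction by adding the negative can be checked with a short sketch. It assumes an 8 bit word and uses a bit mask to discard the carry out of the most significant bit; the function names are invented for this example.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern of a signed integer."""
    return format(value & (2 ** bits - 1), f"0{bits}b")


def subtract(a: int, b: int, bits: int = 8) -> str:
    """Calculate a - b by adding a to the two's complement of b."""
    mask = 2 ** bits - 1
    total = (a & mask) + (-b & mask)
    return format(total & mask, f"0{bits}b")  # the carry out of the MSB is dropped


print(to_twos_complement(-14))  # 11110010
print(subtract(73, 14))         # 00111011
```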
In general, for an n bit number, the possible range is −2ⁿ⁻¹ to 2ⁿ⁻¹ − 1
23 What is the possible range of values in a 7-bit two’s complement number?
24 Use binary arithmetic to add a positive and a negative number.
Create your own practice questions by picking two
−8 4 2 1 . 1/2 1/4 1/8 1/16
0 1 1 0 . 1 0 1 0
Worked example
Converting the decimal fraction 7.5625 into its fixed point binary representation.
Calculate the whole number part, in this case 7:
−8 4 2 1 .
0 1 1 1 .
Note that 7.5625 – 7 = 0.5625, which is the value left to find.
The first value after the binary point is worth 1/2 = 0.5
0.5625 – 0.5 = 0.0625
−8 4 2 1 . 0.5
0 1 1 1 . 1
The second value after the binary point is worth 1/4 = 0.25
0.25 can’t be taken from 0.0625
−8 4 2 1 . 0.5 0.25
0 1 1 1 . 1 0
The third value after the binary point is worth 1/8 = 0.125
0.125 can’t be taken from 0.0625
The fourth value after the binary point is worth 1/16 = 0.0625
0.0625 – 0.0625 = 0
−8 4 2 1 . 0.5 0.25 0.125 0.0625
0 1 1 1 . 1 0 0 1
Therefore, 7.5625₁₀ = 0111.1001₂
Exam tip
Fixed point binary questions may use two’s complement or unsigned binary representations. Always check the question carefully.
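The same subtract-the-place-value process can be sketched in code. This illustration assumes a non-negative value that fits in the whole number bits, with four bits either side of the binary point; `to_fixed_point` is a hypothetical name.

```python
def to_fixed_point(value: float, whole_bits: int = 4, fraction_bits: int = 4) -> str:
    """Convert a non-negative decimal value to fixed point binary."""
    whole = int(value)
    remainder = value - whole  # the value left to find after the binary point
    result = format(whole, f"0{whole_bits}b") + "."
    place_value = 0.5          # the first place after the binary point is worth 1/2
    for _ in range(fraction_bits):
        if remainder >= place_value:
            result += "1"
            remainder -= place_value
        else:
            result += "0"
        place_value /= 2
    return result


print(to_fixed_point(7.5625))  # 0111.1001
```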
The exponent is also stored in two’s complement format. For example, for a 4
bit exponent:
−8 4 2 1
Note
Exam questions using the floating point representation
0 1 1 0 0 1 0 0 0 0 1 1
Mantissa Exponent
The first step is to calculate the value of the exponent. In this case, +3.
−8 4 2 1
0 0 1 1
The second step is to move the binary point by that number of positions …
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
0 . 1 1 0 0 1 0 0
… becomes …
−8 4 2 1 . 1/2 1/4 1/8 1/16
0 1 1 0 . 0 1 0 0
The third step is to convert this value in the same way as with fixed point binary:
4 + 2 + 1/4
= 6.25
The same method is used for negative numbers. Remember that both the
mantissa and exponent are in two’s complement, which means if the first bit
is 1 then the number is negative.
Worked example
Calculate the decimal value of the floating point binary number 10110000 1111.
1 0 1 1 0 0 0 0 1 1 1 1
Mantissa Exponent
The first step is to calculate the value of the exponent. In this case, −1.
−8 4 2 1
1 1 1 1
The second step is to move the binary point by that number of positions. However, for
a negative exponent, rather than moving the binary point to the left, instead shift the
bits to the right, which achieves the same effect. The left-most bit is used to indicate
the sign, so each position vacated on the left is filled with a copy of the sign bit …
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
1 . 0 1 1 0 0 0 0
… becomes …
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
1 . 1 0 1 1 0 0 0
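A floating point value can also be checked arithmetically: the value is the mantissa multiplied by 2 to the power of the exponent. The sketch below assumes both parts are two's complement bit strings with the binary point after the first mantissa bit; the function names are chosen for this example.

```python
def twos_complement(bits: str) -> int:
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value


def decode_float(mantissa: str, exponent: str) -> float:
    """Return mantissa x 2^exponent for a two's complement floating point value."""
    # Dividing by 2^(length - 1) places the binary point after the first bit.
    m = twos_complement(mantissa) / 2 ** (len(mantissa) - 1)
    e = twos_complement(exponent)
    return m * 2 ** e


print(decode_float("01100100", "0011"))  # 6.25
print(decode_float("10110000", "1111"))  # -0.3125
```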
Making links
At the second step it is important that the two digits either side of the binary point
should be different. For example:
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
0 . 1
Or …
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
1 . 0
This is referred to as normalised floating point binary and is discussed in more detail
later in this chapter.
Worked example
Write the decimal number 11.25₁₀ as a floating point number with an 8 bit
mantissa and a 4 bit exponent.
1 Write the number in fixed point two’s complement.
11.25₁₀ = 01011.01₂
2 Move the binary point so that it comes after the first value.
Mantissa = 0.101101
3 Count the number of positions that the binary point has been moved by and express
this value in two’s complement. In this case the binary point moved four positions:
Exponent = 4₁₀ = 0100₂
4 Finally, write the full binary number down using the number of digits specified in
the question. We were asked for an 8 bit mantissa, so we need to add another bit,
of value 0, to the right of the mantissa.
0 1 0 1 1 0 1 0   0 1 0 0
Mantissa Exponent
Exam tip
Floating point binary is a topic that often trips students up. Remember that both mantissa and exponent will be written using two’s complement binary and ensure you get lots of practical practice converting between decimal and binary numbers, both large (positive exponents) and small (negative exponents).
When using floating point binary it is possible to represent very large and
very small numbers. Exam questions will usually specify the number of bits
to use for the mantissa and the exponent, with a mantissa of 8 bits and an
exponent of 4 bits being typical. Be prepared to work with exponents of up to
6 binary digits.
28 Which part of a floating point number is used to describe how far to move the
binary point?
29 Convert each of the following two’s complement, floating point binary numbers
into their decimal equivalent.
Mantissa Exponent
a 0.11010 001001
b 1.0110000 0010
c 0.11000 11110
30 Convert each of the following decimal numbers into their two’s complement,
floating point binary equivalent. Use a 6-bit mantissa and a 4-bit exponent in each
case.
a 12.5
b −7.25
c 0.125
d −0.9375
Revision activity
✚ Practise converting large integers and small fractions, both positive and negative, into floating point form and use online tools to check your accuracy.
✚ Generate random binary numbers to use as mantissa and exponent. Guess the scale of the number first (large or small, positive or negative) and then convert the number into decimal. Use online tools to check your accuracy.
Rounding errors
Due to the nature of fractional numbers which are written as fractions of
powers of 2, it is not possible to store some values with complete accuracy.
Representing 0.5₁₀ as a binary fraction is trivial, using 0.1₂.
Representing 0.1₁₀ accurately as a binary fraction is impossible, even using
an extremely large number of binary digits. It can only be represented
approximately, as the binary value is rounded up or down.
Worked example
Calculate the relative error when storing the decimal value 0.13₁₀ in binary.
0.13₁₀ is represented using 4 bits in binary as 0.001₂.
Notice that this relative error is much larger than the previous example, even
though the absolute error was smaller.
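The errors for 0.13₁₀ can be worked through in code. This sketch assumes the usual definitions (absolute error is the difference between the stored and the intended value; relative error is the absolute error divided by the intended value) and three bits after the binary point. The function names are invented for the example.

```python
def nearest_fixed_point(value: float, fraction_bits: int) -> float:
    """Round a value to the nearest multiple of 1 / 2^fraction_bits."""
    scale = 2 ** fraction_bits
    return round(value * scale) / scale


def absolute_error(intended: float, stored: float) -> float:
    return abs(stored - intended)


def relative_error(intended: float, stored: float) -> float:
    return absolute_error(intended, stored) / abs(intended)


stored = nearest_fixed_point(0.13, 3)  # 0.001 in binary, which is 0.125
print(stored)
print(f"{absolute_error(0.13, stored):.3f}")  # 0.005
print(f"{relative_error(0.13, stored):.2%}")  # 3.85%
```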
Fixed point binary
Advantages
✚ Simpler to convert and allows for quicker processing.
✚ All values can be stored with the same level of absolute precision.
✚ Values are processed in a very similar manner to binary integers, meaning that hardware can be re-purposed without needing to be re-designed.
Disadvantages
✚ A more limited range, particularly if large magnitude numbers and small magnitude numbers need to be stored using the same system.
✚ Limited precision.
Floating point binary
Advantages
✚ A wider range of values are possible by using a large exponent.
✚ More precise as a small exponent can be used.
Disadvantages
✚ The precision of two different values can be very different due to the different exponents.
✚ Processing takes longer.
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
0 . 1
Or …
−1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128
1 . 0
It is also important when converting numbers into floating point binary form
to ensure that they are normalised.
33 Give two reasons why someone might choose to use fixed point rather than
floating point binary.
34 Give two reasons why someone might choose to use floating point rather than
fixed point binary.
35 What is the quickest way to check if a floating point number is normalised?
36 For each floating point number, state whether it is normalised.
Mantissa Exponent
a 00101010 0110
b 10100 10110
c 11010110 11
d 0110 110101
Questions about whether a floating point number is normalised are very common. Remember that the first two digits should always be opposite to each other if the number is normalised.
Assuming the answer should be stored using 4 bits, as with the two original
values, the extra 1 from the calculation cannot fit in the available space.
Depending on the configuration of the system:
✚ this may cause the leading 1 to be dropped, resulting in an incorrect result
(in this case suggesting that 13 + 6 = 3)
✚ the large result may spill over into the next block of memory.
On some occasions overflow errors are not a problem, for example when
adding a positive and negative number using two’s complement.
For example, calculating 7 – 4 using two’s complement binary:
  0 1 1 1 +
  1 1 0 0
1 0 0 1 1
1 1       ← carry
The overflow digit can be safely dropped, as the solution is correct. This
is because the first digit is used to represent the sign, and so the overflow
simply allows for the leading bit to be flipped.
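Both cases can be reproduced with a short sketch that keeps only the lowest 4 bits of a result. It is an illustration under that assumption; `add_fixed_width` is a hypothetical name.

```python
def add_fixed_width(a: int, b: int, bits: int = 4) -> tuple[str, bool]:
    """Add two bit patterns and keep only the lowest `bits` bits."""
    mask = 2 ** bits - 1
    total = a + b
    stored = total & mask          # any leading carry bit is dropped
    carried_out = total > mask     # True if a bit did not fit
    return format(stored, f"0{bits}b"), carried_out


# Unsigned: 13 + 6 should be 19, but only 0011 (3) fits in 4 bits.
print(add_fixed_width(0b1101, 0b0110))  # ('0011', True)

# Two's complement: 7 + (-4). The dropped carry leaves the correct answer, 3.
print(add_fixed_width(0b0111, 0b1100))  # ('0011', True)
```

The stored pattern is the same in both cases; whether it is wrong depends on how the bits are being interpreted.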
An underflow can occur when a subtraction takes place and the value will not
There are two main character sets in use – ASCII and Unicode.
ASCII
ASCII stands for the American Standard Code for Information Interchange.
ASCII A 7-bit character set that can represent up to 128 unique characters.
The first 33 codes in the character set are reserved for special characters.
The other 95 values are used for letters and symbols, including:
✚ upper case letters
✚ lower case letters
✚ digits 0–9
✚ punctuation
✚ mathematical operators.
This is a section of the ASCII character set:
The character codes for upper case letters in ASCII are separated by 32₁₀ from
lower case letters. This means that only one bit (the sixth bit) needs to be
changed to change cases:
Also note that the last 5 binary digits can be used to identify the letter:
A = 0100 0001 = 64 + 1
B = 0100 0010 = 64 + 2
Z = 0101 1010 = 64 + 26
Exam tip
You are not expected to remember individual character codes, but it is useful to know that the last five digits can be used to identify the letter.
When a key on a keyboard is pressed, the binary code is transmitted to the
computer.
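The two bit patterns described above can be explored with Python's built-in `ord` and `chr`. The helper names below are invented for this sketch, which is only meaningful for the letters A to Z and a to z.

```python
def toggle_case(letter: str) -> str:
    """Switch between upper and lower case by flipping the bit worth 32."""
    return chr(ord(letter) ^ 0b0010_0000)


def letter_position(letter: str) -> int:
    """Use the last five bits to find the letter's position in the alphabet."""
    return ord(letter) & 0b0001_1111


print(format(ord("A"), "08b"))  # 01000001
print(toggle_case("A"))         # a
print(letter_position("Z"))     # 26
```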
Text files are saved by storing the binary code for each character in the file.
Some codes refer to special characters which are not normally visible when a
file is displayed on a screen, such as STX (start of text), BS (backspace), LF (line
feed, or new line), and so on.
Unicode
ASCII does not leave enough possible values to represent letters with accents,
characters from other alphabets or emojis.
Unicode is an expanded character set which uses a varied character length
of between 8 and 32 bits per character. This allows for all 1 million+ valid
characters to be uniquely represented.
In situations where these are not needed, ASCII can be preferable as it uses
less storage space.
It is important to note that the first 128 Unicode characters and codes are the
same as ASCII, so the two systems are compatible.
Note
There are various specific versions of Unicode including UTF-8, UTF-16 and UTF-32. It is not necessary to be aware of the differences and for the sake of this qualification all forms of Unicode are referred to under that single term.
Note
The Unicode Consortium
Now test yourself
Parity bits
A simple solution is to count the number of 1s in a sequence of binary values
and add a parity bit.
Parity bit A single binary digit added to some data in order to help with error checking.
In odd parity, there should be an odd number of 1s.
Odd parity A method of using parity in which each block of data should have an odd number of 1s.
The number of 1s is counted in the data and if there is already an odd number
then a 0 is added as the parity bit.
001 1010 – Original data
0001 1010 – Data to be transmitted
If there is an even number of 1s then an extra 1 is added as the parity bit, so
that the total number of 1s is now odd.
001 1011 – Original data
1001 1011 – Data to be transmitted
When the message is received, the number of 1s is counted again.
If the number of 1s is still odd then it is assumed that the data is correct.
If the number of 1s is even, then there has been an error, and the data is
re-requested.
1001 1011 – Message appears to be correct
1001 0011 – Message contains an error
The parity bit is then discarded and the original value can be processed.
With even parity, the process is identical except that the parity bit is chosen
so that the number of 1s is even.
Even parity A method of using parity in which each block of data should have an even number of 1s.
The receiver needs to know whether the data has been sent using even or
odd parity.
Exam tip
The parity bit can be added to the start or end of the message. Always read the question carefully to check whether the question refers to even parity or odd parity and to check which is the parity bit.
Overhead Additional data added to the original values.
Advantages
✚ Relatively small overhead in terms of adding extra data.
Disadvantages
✚ Cannot help correct the error.
✚ If two bits are incorrect, then the error will not be identified.
Note
A technique known as 2D parity can help identify multiple errors and can also correct some
errors. This is an interesting technique but is not covered in the AQA A-level specification.
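The odd parity scheme can be sketched as below. This illustration assumes the parity bit is placed at the start of the data, as in the examples; the function names are chosen for the sketch.

```python
def add_odd_parity(data: str) -> str:
    """Prefix a parity bit so that the total number of 1s is odd."""
    ones = data.count("1")
    parity_bit = "0" if ones % 2 == 1 else "1"
    return parity_bit + data


def check_odd_parity(received: str) -> bool:
    """Return True if the received bits still contain an odd number of 1s."""
    return received.count("1") % 2 == 1


print(add_odd_parity("0011010"))     # 00011010
print(add_odd_parity("0011011"))     # 10011011
print(check_odd_parity("10011011"))  # True
print(check_odd_parity("10010011"))  # False
```

Flipping two bits of a valid message leaves the count odd, which is why a double error goes undetected.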
Majority voting
Majority voting involves sending each bit repeatedly.
Majority voting Transmitting each bit an odd number of times in order to identify and correct any transmission errors.
Each bit must be sent an odd number of times and, within that transmission,
whichever digit is seen most commonly is considered to be the correct one.
For example the bit pattern 1010 is being sent using majority voting, and each
bit is sent 3 times.
The transmission should be 111 000 111 000.
Due to a timing error, the message received reads 110 000 011 000.
In each block of bits, the majority is considered to be correct and so the data
stored is 1010.
An odd number of bits must be used in order to avoid a tie when deciding
which is the most common value. Repetitions of 3, 5 and 7 bits are all
acceptable.
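Decoding a majority vote transmission can be sketched as follows, assuming each bit was sent three times. The function names are invented for this example.

```python
def majority_encode(data: str, repeats: int = 3) -> str:
    """Repeat each bit an odd number of times."""
    return " ".join(bit * repeats for bit in data)


def majority_decode(received: str, repeats: int = 3) -> str:
    """Take the most common bit in each block of repeated bits."""
    received = received.replace(" ", "")
    decoded = ""
    for start in range(0, len(received), repeats):
        block = received[start:start + repeats]
        decoded += "1" if block.count("1") > repeats // 2 else "0"
    return decoded


print(majority_encode("1010"))                 # 111 000 111 000
print(majority_decode("110 000 011 000"))      # 1010
```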
Advantages
✚ Makes it more likely that an error can be fixed.
✚ Less chance that the data will have to be re-sent.
Disadvantages
✚ The extra overhead in data transmission is very high.
Checksums
A checksum is a value that can be found by applying an arithmetic algorithm
to the original data.
Checksum A value derived by following an algorithm, used to check for errors.
A very simple algorithm might be to add up all of the decimal digits in a large
number and store the total.
For an original value of 374 268, the digits added together would total 30.
When the transmission is received the same process is repeated and the total
is compared to the checksum.
If the value 375 268 has been received then the sum of the individual digits
will be 31. Since this does not match, an error has been identified and the
data should be re-requested.
A more complex algorithm might be to add the square of the digits in the even
positions to the cube of the digits in the odd positions, counting positions from the right.
8³ + 6² + 2³ + 4² + 7³ + 3² = 512 + 36 + 8 + 16 + 343 + 9 = 924
Checksums are effective at identifying errors because two or more errors are
very unlikely to cancel each other out.
Simple checksums are not able to locate or correct the exact error.
Making links
Hashing algorithms can be used to generate checksums. While checksums appear in the specification for Component 2, hashing algorithms are explored as part of the data structures topic in Component 1 (see Hash tables in Chapter 2).
Note
One common checksum algorithm is the MD5 algorithm, and large downloads will sometimes have an MD5 checksum file that can be downloaded for checking once the download is complete.
Advantages
✚ The transmission overhead is smaller than using majority voting.
✚ Effective at identifying errors because two or more errors are very unlikely to cancel each other out.
Disadvantages
✚ The transmission overhead is larger than using parity bits.
✚ Simple checksums are not able to locate or correct the exact error.
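The two example checksum algorithms from this section can be sketched in code. Both are teaching examples rather than real-world algorithms, and the function names are chosen for this illustration. Positions are counted from the right-hand digit, which matches the worked calculation.

```python
def digit_sum(value: str) -> int:
    """Simple checksum: add up all of the decimal digits."""
    return sum(int(ch) for ch in value if ch.isdigit())


def square_cube_checksum(value: str) -> int:
    """Cube the digits in odd positions and square those in even positions,
    counting positions from the right, then add the results."""
    digits = [int(ch) for ch in value if ch.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits), start=1):
        total += digit ** 3 if position % 2 == 1 else digit ** 2
    return total


print(digit_sum("374 268"))             # 30
print(digit_sum("375 268"))             # 31, so an error is detected
print(square_cube_checksum("374 268"))  # 924
```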
ISBN-13 9 7 8 1 4 7 1 8 6 5 8 2
Weight 1 3 1 3 1 3 1 3 1 3 1 3
Partial sum 9 21 8 3 4 21 1 24 6 15 8 6
Total = 126
Total % 10 = 6
Value to add = 4
The final 13-digit code is therefore 978-1-4718-6582-4 (the spacing is added to
aid readability).
When the code is entered, the same calculation is performed on the full code.
ISBN-13 9 7 8 1 4 7 1 8 6 5 8 2 4
Weight 1 3 1 3 1 3 1 3 1 3 1 3 1
Partial sum 9 21 8 3 4 21 1 24 6 15 8 6 4
Total = 130
Total % 10 = 0
As the modulo result is 0, it can be assumed that the values have been
recorded or transmitted correctly.
Note
In some cases, such as ISBN-10, there are 11 possible values for the check digit. Using modulo 11 the final value could be a digit (0-9) or an X.
Advantages
✚ Lower overhead than a checksum, meaning that data can be transmitted more quickly.
Disadvantages
✚ With only 10 or 11 possible values for the check digit, it is possible that an erroneous transmission will be identified as correct.
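The ISBN-13 calculation can be sketched as below, using the alternating weights of 1 and 3 shown in the tables. The function names are invented for this illustration.

```python
def weighted_total(digits: list[int]) -> int:
    """Multiply the digits by alternating weights of 1 and 3 and add them up."""
    return sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))


def isbn13_check_digit(first_twelve: str) -> int:
    """Find the digit that brings the weighted total up to a multiple of 10."""
    digits = [int(ch) for ch in first_twelve if ch.isdigit()]
    return (10 - weighted_total(digits) % 10) % 10


def isbn13_is_valid(code: str) -> bool:
    """A full 13-digit code is accepted if its weighted total % 10 is 0."""
    digits = [int(ch) for ch in code if ch.isdigit()]
    return len(digits) == 13 and weighted_total(digits) % 10 == 0


print(isbn13_check_digit("978147186582"))    # 4
print(isbn13_is_valid("978-1-4718-6582-4"))  # True
```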
42 Explain how a parity bit can be used to check a binary value for an error.
43 Describe one situation where an incorrect binary value could pass a parity check.
44 Explain the difference between odd and even parity.
45 What rule is used to decide how many times to transmit each bit using majority voting?
46 Describe the difference between a checksum and a check digit.
47 Of the four methods described in this chapter, only majority voting is capable of
correcting errors. Explain why this isn’t the most commonly used method.
Answers available online
Revision activity
Carry out your own check digit calculations, using the ISBN numbers on the back of any textbooks and revision guides you have with you.
as a curved wave.
Analogue signals are continuous as there are an infinite number of values
between each measurement. They are usually represented as a wave with
smooth curves.
[Figure: an analogue signal drawn as a smooth wave, amplitude against time]
Analogue/digital conversion
In order for computers to read in analogue data, an analogue to digital
converter (ADC) must be used.
Data Values or information.
An ADC will use an analogue sensor to read in an analogue signal and
produce a digital output.
Analogue to digital converter (ADC) Converts an analogue signal into a digital signal.
Microphones and digital cameras are examples of devices that use an ADC.
Each digital value will be a close approximation to the analogue signal at any
particular moment.
Signal The electric or electromagnetic impulses that are used to transmit data.
A digital to analogue converter (DAC) is used to convert a digital signal into an
analogue signal.
This is used, for example, to convert a digital audio file into an analogue
signal which can be used to recreate the sound through a speaker.
Digital to analogue converter (DAC) Converts a digital signal to an analogue signal.
Exam tip
Questions about ADCs and DACs are usually written in the context of either digital
images or digital sound. Make sure you are able to apply your understanding of ADCs
and DACs to the topics on the following pages.
48 What is the main difference between an analogue signal and a digital signal?
49 Explain why it is not possible to store an analogue signal in a computer system.
50 What device is used to convert a digital signal so that it can be transmitted over an
analogue medium?
51 Other than a microphone, name one device which contains an ADC.
Bitmapped graphics
A bitmap (or raster) graphic is an image made up of individual blocks of colour
called pixels.
Bitmap An image format that uses a grid of pixels.
1 1 1 0 1 1 0 0
Figure 5.3 Each pixel in this black and white image is represented by one bit of data
Colour depth
Increasing the number of bits per pixel increases the possible number of
colours that each pixel can represent. This is known as the colour depth.
Most high-quality images use a colour depth of 24 bits per pixel – 8 bits each
for the amount of red, green and blue (RGB).
Increasing the colour depth allows for a more accurate representation of the
image; however, it also increases the size of the image file.
Metadata
Metadata means data about the data.
Metadata Data about data. Additional data stored in a file.
While an image file contains the binary data that makes up the colour of each
individual pixel, additional data is also needed.
This includes the width, height and colour depth of an image which is
necessary to convert the raw binary data back into a usable image.
Other metadata might include the time and date the image was created or
saved, the camera settings used (for a photograph), the name of the software
used to edit it and GPS data.
Vector graphics
Vector graphics use geometry to describe the objects that make up an image.
Vector graphic An image made of lines and shapes.
Each object is a shape (also known as a vector primitive), such as a line,
polygon, circle, curve or text.
Vector primitive Simple objects which can be combined to create a vector graphic.
Polygon A shape made of straight lines, for example, rectangle, hexagon, and so on.
Properties are used to describe each object in the image so that it can be
drawn as needed. These properties include co-ordinates, line colour, line
thickness and fill colour. Vector graphic files store the details of these
properties.
A vector graphic is made up of various vector primitives and the image can be
drawn to any scale.
Figure 5.5 Sampling an analogue wave (amplitude from 0 to 15 plotted against time from 0 to 9)
The fixed interval of time between samples determines the sample rate, which is
measured in Hz, where 1 Hz is one sample per second.
Sample rate The rate at which samples are taken, typically measured in kHz.
A sample rate of 48 kHz means that there will be 48 000 samples per second.
Using a higher sample rate means that more samples will be captured:
✚ increasing the accuracy of the recording
✚ while also increasing the file size.
The sampling resolution refers to the number of bits per sample used to measure
the amplitude. (This concept is similar to the colour depth of an image.)
Sampling resolution The number of bits per sample.
In Figure 5.5 each measurement of amplitude can take the integer values
0–15. This represents a 4 bit sample resolution (as 2⁴ = 16). For a 3 bit sample
resolution the vertical axis of Figure 5.5 would only have eight values
(as 2³ = 8).
Using a larger sampling resolution means that the amplitude measurement of
each sample will be captured more accurately, again increasing the accuracy
of the recording and the file size.
The file size required for an audio recording can be found by multiplying the
length of the recording (in seconds) by the sample rate and the sampling
resolution.
file size = length of recording (seconds) × sample rate (Hz) × sampling
resolution (bits per sample)
For example, for a 2 minute recording at 48 kHz and 16 bits per sample, the file
size is:
file size = 120 seconds × 48 000 Hz × 16 bits per sample
= 92 160 000 bits
= 11 520 000 bytes
= 11.52 MB
Note
If audio is recorded in stereo then there will be two recordings – a left channel and a right channel, doubling the overall file size requirements; in this example, the stereo file size would be 23.04 MB.
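The file size formula translates directly into code. In this sketch the `channels` parameter is an addition made for the illustration (1 for mono, 2 for stereo), and the function name is hypothetical.

```python
def audio_file_size_bits(seconds: int, sample_rate_hz: int,
                         bits_per_sample: int, channels: int = 1) -> int:
    """file size = length x sample rate x sampling resolution (x channels)."""
    return seconds * sample_rate_hz * bits_per_sample * channels


mono_bits = audio_file_size_bits(120, 48_000, 16)
print(mono_bits)                   # 92160000 bits
print(mono_bits // 8)              # 11520000 bytes
print(mono_bits / 8 / 1_000_000)   # 11.52 MB

stereo_bits = audio_file_size_bits(120, 48_000, 16, channels=2)
print(stereo_bits / 8 / 1_000_000)  # 23.04 MB
```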
Nyquist’s theorem
If the sample rate of a sound wave is too low then it is possible that the
recording will miss some peaks or troughs.
In order to ensure that the recorded sound does not miss this data, Nyquist’s
theorem states that the sample rate should be at least double the maximum
frequency that can be heard.
Nyquist’s theorem The rule that the sample rate should be at least double the maximum frequency that can be heard.
Since humans can hear sounds up to around 20 kHz, a minimum sample rate
of 40 kHz should be used.
Sounds recorded at a lower sample rate may still be recognisable, but will not
be as accurate.
Telephone calls typically use a lower sample rate, and while this is suitable for
speech, music played over a telephone line sounds tinny and lacks definition.
Data compression
Data compression means reducing the size taken up by data. This is useful as
it means that less storage space is used on storage devices, and less time is
taken to transmit data.
Compression Reducing the size of a file.
Sound files and image files are often compressed as these files typically use
large amounts of data.
Other files, including text files, data files and program files can also be
compressed.
Note
Video files are made up of image data and audio data. Video files are not explicitly covered in the specification, but it can be assumed that similar techniques apply.
Lossy compression
In lossy compression some data is removed from the file. Once removed this
data cannot be replaced.
Lossy compression Reducing the size of a file by permanently removing some data.
In image files this can involve reducing the pixel dimensions or reducing the
colour depth.
Reducing the resolution means that fewer pixels are needed. While this
reduces the image quality, it is often possible to reduce the number of pixels
quite significantly without making the image unusable, especially if the
image is to be viewed on screen.
Figure 5.6 A high resolution image on the left and a low resolution image on the right
Reducing the colour depth means that each pixel is stored using fewer bits.
The cost of this is that fewer colours are available which can lower the overall
accuracy of the image.
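The effect of reducing colour depth can be sketched in Python. This is only an illustration under stated assumptions: the helper names `reduce_depth` and `restore_depth` are hypothetical, and the sketch assumes 8-bit pixel values that are cut down to 4 bits by discarding the least significant bits.

```python
def reduce_depth(value, from_bits=8, to_bits=4):
    """Keep only the most significant bits of a pixel value (hypothetical helper)."""
    return value >> (from_bits - to_bits)

def restore_depth(value, from_bits=8, to_bits=4):
    """Scale a reduced value back up; the discarded bits cannot be recovered."""
    return value << (from_bits - to_bits)

pixels = [200, 201, 13, 255]
reduced = [reduce_depth(p) for p in pixels]
restored = [restore_depth(r) for r in reduced]

print(reduced)   # each value now fits in 4 bits
print(restored)  # close to, but not the same as, the original values
```

Note that 200 and 201 become the same value once reduced, which is why the original data cannot be recovered.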
The advantage of using lossy compression is that the file size can be
significantly reduced, often with minimal noticeable effects. The major
downside is that the lost data cannot be recovered.
Text files, program code and raw data cannot be compressed with lossy
compression as the data would lose its meaning - for example removing every
third letter from a piece of computer code would stop the code from working.
Lossless compression
Lossless compression, as the name suggests, does not involve losing or removing any data.
Lossless compression: Reducing the size of a file without removing any data.
One technique to achieve this is run length encoding (RLE).
A run is a sequence of identical values, and the data can be shortened by recording the data in pairs of values indicating the run length and then the value.
Run length encoding (RLE): A lossless compression technique of recording the length and value of each run in the data.
The text data AAAAAAAAABBBBBBBBAAAAAA could be shortened to 9A 8B 6A.
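Run length encoding can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names `rle_encode` and `rle_decode` are assumptions for this example, and each run is stored as a (length, value) pair.

```python
from itertools import groupby

def rle_encode(data):
    """Return a list of (run length, value) pairs for the given sequence."""
    return [(len(list(run)), value) for value, run in groupby(data)]

def rle_decode(pairs):
    """Rebuild the original text from (run length, value) pairs."""
    return "".join(value * length for length, value in pairs)

original = "A" * 9 + "B" * 8 + "A" * 6
encoded = rle_encode(original)
decoded = rle_decode(encoded)

print(encoded)
print(decoded == original)
```

Because decoding rebuilds the text exactly, no data is lost.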
Run: A series of identical values in a file or dataset.
Another method of lossless compression is to use a dictionary method. Each unique value in the data is recorded as a key-value pair. For example, each unique word in the phrase “how much wood could a woodchuck chuck if a woodchuck could chuck wood” can be given a key:
Key Value
1 how
2 much
3 wood
4 could
5 a
6 woodchuck
7 chuck
8 if
The phrase could then be stored using just the keys: “1 2 3 4 5 6 7 8 5 6 4 7 3”.
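The dictionary method can be sketched in Python as follows. This is a simplified illustration: the function names `dictionary_encode` and `dictionary_decode` are assumptions for this example, and keys are simply handed out in the order in which each new word is first seen.

```python
def dictionary_encode(words):
    """Return (dictionary, keys): each unique word gets the next free key."""
    dictionary = {}
    keys = []
    for word in words:
        if word not in dictionary:
            dictionary[word] = len(dictionary) + 1
        keys.append(dictionary[word])
    return dictionary, keys

def dictionary_decode(dictionary, keys):
    """Rebuild the list of words from the keys."""
    lookup = {key: word for word, key in dictionary.items()}
    return [lookup[key] for key in keys]

phrase = "how much wood could a woodchuck chuck if a woodchuck could chuck wood".split()
dictionary, keys = dictionary_encode(phrase)

print(dictionary)
print(keys)
```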
Making links
Dictionaries are a type of data structure commonly used in programming, and
The major advantage of lossless compression is that no data is lost and, when decompressed, the original file is recreated exactly as it was. Any file can potentially be compressed using lossless compression.
The major disadvantage of lossless compression is that file size reductions are usually much more limited than with lossy compression. Depending on the number of runs, or the number of unique values in the file, it is even possible that the compressed version will not be any smaller than the original.
Note
Other compression techniques exist, and data compression is a highly important area of research with far-reaching implications. Only those techniques described here are included in the specification.
Now test yourself
70 Explain the purpose of compressing data.
71 Identify two methods of reducing the file size of an image using lossy compression.
72 Why can program code not be compressed using lossy compression?
73 Other than the answer above, give one disadvantage for using lossless compression.
74 A digital image is made of repeating patterns. Explain how a dictionary could be used to compress the image file.
Revision activity
Find examples of simple 8-bit graphics using an image search and practise representing them using run length encoding.
Answers available online
Encryption
Encryption is a method of scrambling data so that it cannot be understood.
Encryption is carried out by starting with a plaintext message and applying a cipher and a key in order to generate a ciphertext message.
Once received, the ciphertext message is then decrypted back into the original plaintext message.
Encryption: A method of hiding the meaning of a message.
Plaintext: The original, unencrypted message.
Cipher: An algorithm for encrypting data.
Key: A value used to encrypt or decrypt data.
Ciphertext: The encrypted form of the message.
Decryption: A method of converting an encrypted message back to its original form.
Caesar cipher
One of the most basic ciphers is the Caesar cipher, an example of a substitution cipher.
In this cipher each letter is shifted by a fixed amount, or a key. For example, with a key of 2, each plaintext letter is moved two places further up the alphabet.
Plaintext letter A B C D E … X Y Z
Ciphertext letter C D E F G … Z A B
Caesar cipher: A substitution cipher in which each letter is shifted according to the key.
Substitution cipher: A cipher in which each plaintext letter is replaced with a ciphertext letter.
Making links
Writing a computer program to carry out a Caesar cipher is good practice, and the kind of task that can appear in Component 1, Section B. In order to wrap the alphabet back to the start, the modulo operator can be used. Make sure you are comfortable with the programming techniques in Chapter 1 and try writing your own programmed version of the Caesar cipher.
The plaintext message DAZE would be encrypted as FCBG.
Once the message is received, along with the key, the process is reversed, and the ciphertext message FCA would be decrypted as DAY.
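One possible Python version of the Caesar cipher is sketched below. It is an illustration only: the function name `caesar_shift` is an assumption, the sketch handles upper-case letters A to Z and leaves every other character unchanged, and the modulo operator is used to wrap the alphabet back to the start.

```python
def caesar_shift(message, key):
    """Shift each upper-case letter by key places, wrapping with modulo 26."""
    result = []
    for ch in message:
        if "A" <= ch <= "Z":
            position = (ord(ch) - ord("A") + key) % 26
            result.append(chr(position + ord("A")))
        else:
            result.append(ch)  # leave spaces, digits and punctuation alone
    return "".join(result)

ciphertext = caesar_shift("DAZE", 2)   # encrypt with a key of 2
plaintext = caesar_shift("FCA", -2)    # decrypt by shifting back by the key

print(ciphertext, plaintext)
```

Decryption reuses the same function with a negative key, because Python's `%` operator always returns a non-negative result for a positive divisor.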
cipher, frequency analysis can be used. This means that the most commonly
used letters (typically E and T in the English language) are likely to occur most
often.
Testing the keys where the most frequent letters align with E or T can further
reduce the time taken to crack this encryption method, though with only 25
possible keys it is trivially quick to break this code with a computer.
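To show how little work is involved, the sketch below tries every one of the 25 possible keys. It is a minimal illustration: the function names `shift_back` and `brute_force` are assumptions for this example, and a person (or a frequency count) would still need to pick out the candidate that reads as sensible English.

```python
def shift_back(ciphertext, key):
    """Undo a Caesar shift of key places on upper-case letters."""
    return "".join(
        chr((ord(ch) - ord("A") - key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )

def brute_force(ciphertext):
    """Return every candidate plaintext, indexed by the key that produced it."""
    return {key: shift_back(ciphertext, key) for key in range(1, 26)}

candidates = brute_force("FCBG")
for key, text in candidates.items():
    print(key, text)
```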
Exam tip
Although not listed directly in the specification, you may be asked a question based
on a substitution cipher of a similar complexity to the Caesar cipher (for example, with
the letters of the alphabet written backwards, or using space as a 27th character).
Practise with both of these examples to ensure that you are comfortable with the
principles of substitution ciphers.
Vernam cipher
The Vernam cipher is also known as a one-time pad.
The cipher starts with choosing a random sequence of characters or binary digits as the key. This key is the one-time pad as it must only be used once.
Vernam cipher: A cipher which uses a randomly generated key and is mathematically unbreakable.
One-time pad: The key used in the Vernam cipher.
The next task is to write down the binary codes for each character in the plaintext message and in the key.
Plaintext message n o w
Key g p w
Plaintext (binary) 0110 1110 0110 1111 0111 0111
Key (binary) 0110 0111 0111 0000 0111 0111
The individual bits in the plaintext and key characters are passed through an XOR gate in order to generate the ciphertext.
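The XOR step can be sketched in Python using the plaintext and key from the table above. This is a minimal illustration: the function name `vernam` is an assumption, and the sketch assumes 8-bit ASCII codes and a key at least as long as the message.

```python
def vernam(message, key):
    """XOR each message byte with the matching key byte."""
    if len(key) < len(message):
        raise ValueError("the key must be at least as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))

plaintext = b"now"
key = b"gpw"

ciphertext = vernam(plaintext, key)
recovered = vernam(ciphertext, key)   # applying XOR with the same key reverses it

print([format(byte, "08b") for byte in ciphertext])
print(recovered)
```

The same function both encrypts and decrypts, because XORing twice with the same key returns the original value.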
Summary
Number systems
✚ Numbers are classified in sets, including ℕ (natural), ℤ (integers), ℚ (rational), ℝ (real) and irrational numbers
✚ Ordinal numbers are used to identify the position of a value in a set
✚ Natural numbers (ℕ: positive integers, including 0) are used for counting
✚ Real numbers (ℝ) are used for measurement
Number bases
✚ Numbers can be represented using any base
✚ You are expected to be familiar with numbers in base 2 (binary), base 10 (decimal) and base 16 (hexadecimal)
✚ You are expected to be able to convert between numbers in different bases
✚ Hexadecimal is used as a shorthand for binary because each hexadecimal digit maps exactly to one binary nibble
Bits and bytes
✚ A single binary digit is called a bit
✚ Bits are usually grouped into blocks of 8, called a byte
✚ Binary values are usually written with a space after each block of 4, called a nibble
✚ A block of n binary bits can represent $2^n$ possible values
✚ Multiples of $10^3$ (1000) bytes are referred to as 1 kB, 1 MB, 1 GB, 1 TB, and so on
✚ Multiples of $2^{10}$ (1024) bytes are referred to as 1 KiB, 1 MiB, 1 GiB, 1 TiB, and so on
Binary number system
✚ Unsigned binary is used to represent natural numbers, with a maximum range between 0 and $2^n - 1$ (where n is the number of bits)
✚ It is important to be confident adding and multiplying binary numbers manually
✚ Two’s complement is used to represent signed (negative) binary numbers
✚ In two’s complement the MSB represents a negative value (for example, in a 4 bit number: −8, +4, +2, +1)
✚ Subtracting binary numbers can be done by adding the negative version of the second number
✚ The range of a two’s complement number is from $-2^{n-1}$ to $2^{n-1} - 1$, because 0 is included as a positive number
✚ Fractional numbers can be represented using fixed
✚ A check digit is similar to a checksum, but is only represented by a single digit, often making use of the modulo operator
✚ Lossy compression refers to compression techniques in which some data is removed (for example, pixels and/or colour depth in an image, samples and/or sample resolution in a sound file)
✚ The encryption algorithm is known as a cipher and the value used to encrypt and decrypt the data is known as the key
✚ The Caesar cipher is a substitution cipher in which each
Exam practice
1 Four numbers are listed below.
a) For each number, tick one or more boxes to show which sets it belongs to. Some numbers belong to more than one set. [4]
Number Natural Integer Rational Irrational Real
17.4
√2
7
−12
b) State which value from the table above is an ordinal number and explain the purpose of an ordinal number. [2]
2 a) Represent the number 113 as an 8-bit unsigned integer. [1]
b) Represent the number −47 as an 8-bit two’s complement integer. [1]
c) Use binary addition to find the value of 113 − 47. [2]
d) Using only binary values, calculate the value of 23 × 9. [3]
e) Name the problem that would occur if finding the value of 27 × 11 using 8-bit unsigned binary. [1]
f) Represent the number 113 as a hexadecimal number. [2]
g) Explain why it may sometimes be preferable to represent a value using hexadecimal rather than binary. [2]
h) Calculate the decimal value of the two’s complement fixed-point binary number 1101.0110. [2]
i) The number 7.8 is represented as 0111.1100. Calculate the absolute and relative error. [3]
j) Calculate the decimal value of the floating-point binary number with a mantissa of 0.110 1000 and an exponent of 0110. You must show your working. [2]
k) Write the normalised floating point representation of −426 using a 10 digit mantissa and a five digit exponent. [3]
3 The letter H is represented in ASCII using the binary code 100 1000.
a) State the binary codes for the letters I and K. [2]
b) Explain why someone writing using a different alphabet might not be able to use ASCII encoding and suggest an alternative system. [2]
c) The letter H is transmitted using odd parity, with the parity bit appended before the most significant bit. State the binary code which is transmitted. [2]
d) The binary code 1011 0101 is received, still using odd parity. State whether this transmission would be accepted. [1]
e) Identify two flaws with using a parity check. [2]
f) Another transmission is sent using majority voting in which each bit is transmitted three times. The data received is 111 000 110 011 101. State the original data. [1]
g) Explain why it would not have been suitable to transmit each bit four times. [1]
h) Some text has been encrypted using the Vernam cipher. Explain what is meant by encryption. [1]
i) Describe why the message was not encrypted using the Caesar cipher. [1]
j) The received message is as follows, and the key is MOFK. The ASCII code for the letter H is 100 1000. Decrypt the message and show the plaintext message. You must show your working. [4]
c) Identify two effects of increasing the colour depth of an image to 9 bits. [2]
d) Describe two advantages for saving the file as a vector graphic, rather than a bitmap. [2]
6 Fundamentals of computer systems
Classification of software
There are two broad classes of software: system software and application software.
System software is intended to allow the computer system to run. This includes:
✚ operating systems
✚ utility programs
✚ libraries
✚ translators.
System software is there to support the running of the computer system, rather than to achieve a specific outcome for the user.
Application software is intended to allow the computer to serve a useful purpose. This includes:
✚ word processing software
✚ spreadsheet software
✚ web browsers.
System software: Software intended to allow the computer system to run.
Application software: Software intended to allow the end-user to achieve a task.
General-purpose application software can be used to achieve multiple tasks, for example word processing software can be used to write a letter, make notes or create a display.
Special purpose application software is written with a more specific purpose in mind and cannot easily be used for a different purpose. Examples include audio editing software, flight simulator training software and computer games.
System software
Operating systems:
✚ provide an interface between the user and the hardware
User interface
The user interface can be graphical (GUI), command line (CLI), menu driven or
voice controlled.
Hardware resources
Running the computer hardware includes significant complexities, which
are hidden from the user because the operating system handles them in the
background.
Making links
In order to fully appreciate the hardware complexities which are hidden by the
operating system it is important to understand system architecture, including the
role of the CPU and RAM. This topic is discussed in Chapter 7 (Internal hardware
components).
Classification of programming languages
Computer programming has evolved over the lifetime of computers and there
are several different classifications for programming languages.
Low-level languages
There are two types of low-level language.
Machine code refers to the binary code which is directly acted upon by the processor.
Each type of processor has its own machine code instruction set with individual commands for tasks such as fetching data from memory, saving data to memory, adding values, etc.
The only language that a computer understands is machine code.
Assembly language is a text-based equivalent to machine code. Remembering and correctly entering the exact binary codes is difficult, so assembly language allows each binary instruction to instead be represented by a short code, which is closer to the English language. This makes it easier to program.
Low-level language: A programming language which describes exactly how to interact with the computer’s hardware.
Machine code: Each instruction is represented as a binary code.
Assembly language: Each instruction is represented as a text-based command.
in register 3.
Each assembly instruction maps to one machine code instruction, and each
type of processor has its own assembly language.
Making links
AQA has developed its own assembly language instruction set and it is normal to
include a question in which students are expected to read or write programs using
assembly language. The specific details are discussed in Chapter 7 (The processor
instruction set).
High-level language
High-level languages use structured statements (for example, IF statements, WHILE loops, and so on) in order to simplify the task of describing an algorithm.
Keywords and constructs are written in an English-like form which makes them more recognisable.
One line of code from a high-level language might require multiple machine code instructions.
High-level language: A programming language which uses keywords and constructs written in an English-like form.
High-level languages include those you will have studied for Component 1, such as:
✚ C#
✚ Delphi/Pascal
✚ Java
✚ Python
✚ Visual Basic.net
Imperative high-level languages are languages in which the commands are carried out in a programmer-defined order. The term imperative refers to types of programming language where the programmer is describing how the computer should achieve the desired result. All of the examples above are imperative high-level languages.
The alternative to imperative high-level languages is declarative high-level languages, in which the programmer writes a program that describes what the program should achieve, but not how.
For example, a declarative command for searching through a table of data states which records are wanted, but does not contain an explicit loop.
Examples of declarative programming languages include:
✚ SQL
✚ Haskell
✚ LISP
✚ Prolog
Imperative: A language in which commands are used to say how the computer should complete the task.
Declarative: A language in which the programmer codes what they want to happen, but not how.
Exam tip
Remember that imperative refers to the specifics of how to solve the problem, whereas declarative refers to what the program should achieve.
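The difference between the two styles can be illustrated with a short Python sketch that uses the standard `sqlite3` module to run an SQL query. The table name `students`, the sample data and the variable names are all made up for this example.

```python
import sqlite3

rows = [("Ada", 17), ("Ben", 15), ("Cal", 18)]

# Imperative: the programmer spells out HOW to search, step by step
imperative_result = []
for name, age in rows:
    if age >= 17:
        imperative_result.append(name)

# Declarative: the SQL query states WHAT is wanted; there is no explicit loop
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE students (name TEXT, age INTEGER)")
connection.executemany("INSERT INTO students VALUES (?, ?)", rows)
query = "SELECT name FROM students WHERE age >= 17 ORDER BY rowid"
declarative_result = [name for (name,) in connection.execute(query)]
connection.close()

print(imperative_result, declarative_result)
```

Both approaches give the same answer; only the way the task is described differs.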
13 Describe one reason why high-level languages are more commonly used than
low-level languages.
14 Explain the meaning of imperative in relation to high-level languages.
15 Suggest one situation where a programmer may prefer to use a low-level
programming language.
Assembler
An assembler is used to translate assembly language instructions into object code instructions.
Because assembly language and machine code map one-to-one, each assembly language instruction is translated directly into its binary equivalent.
Assembler: Translates assembly language code into object code.
Compiler
A compiler is used to translate high-level code into object code. This is a more complex process than with an assembler as one line of high-level code may involve several machine code operations.
A compiler will process all of the code in a program in one batch, and will produce an executable binary file (often with a .exe file extension, though not always).
Compiler: Translates high-level code into object code as a batch.
Once compiled, the executable file can be run multiple times without the
need to translate the source code again.
Advantages
✚ The executable file is much quicker to run than the source code.
✚ Once compiled, the executable file can be run without the need to translate the source code again.
✚ The programmer can share the executable file with others, but the program cannot be easily edited or algorithms in the source code copied as only the machine code is present in the file.
✚ If the compiler detects any errors then it will try to inform the user of all errors that have been found.
Disadvantages
✚ The executable file will be compiled specifically for one set of machine code instructions and the executable file will only run on that platform.
✚ The whole program must be fully compiled before it can be run and this can slow down the process of debugging.
✚ If there is an error in the program then the program will not compile, meaning that a partially working program cannot be tested.
Appropriate uses
✚ Producing a program that will be run many times without frequent changes to the source code.
✚ Producing a program where the programmer wishes to keep the details of the algorithms secret.
Interpreter
An interpreter achieves the same basic goal as a compiler: translating high-level code into machine code. The main difference is that an interpreter translates and then executes one line of code at a time.
An interpreter doesn’t produce an executable file, so the source code must be re-translated each time the program is run.
Interpreter: Translates and then runs high-level code one section at a time.
Bytecode
In order to help compiled programs run on a wider range of devices, some programming languages are compiled to bytecode, rather than object code.
Bytecode: An intermediate code between high-level
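Python's standard implementation is one example of this approach: source code is first compiled to bytecode, which is then executed by a virtual machine. The sketch below uses the built-in `compile` and `exec` functions to make the two stages visible; the variable names and the sample source line are assumptions for this example.

```python
source = "total = 6 * 7"

# Stage 1: compile the source code into a code object containing bytecode
code_object = compile(source, "<example>", "exec")
bytecode = code_object.co_code   # the raw bytecode, stored as bytes

# Stage 2: the Python virtual machine executes the bytecode
namespace = {}
exec(code_object, namespace)

print(type(bytecode).__name__, len(bytecode) > 0, namespace["total"])
```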
Logic gates
Logic gates are physical devices which take one or more inputs and produce an output according to certain logical rules.
There are six basic logic gates which students are expected to recognise and be able to draw.
The function of each logic gate can be described using a truth table, which shows the possible inputs and the equivalent outputs.
Each logic gate can also be represented using a Boolean expression.
Logic gate: Device which takes one or more binary inputs and produces a single binary output.
Truth table: A table showing the possible inputs and their corresponding outputs.
Boolean expression: A mathematical notation for logic gates and circuits.
A B Q
AND gate
The AND gate produces an
0 0 0
NAND gate
The NAND gate is equivalent to an AND gate followed by a NOT gate.
Note the round ‘nose’ on the front to indicate that the result should be inverted.
Figure: 2-input NAND gate with inputs A and B and output Q
A B Q
0 0 1
0 1 1
1 0 1
1 1 0
Boolean expression: $Q = \overline{A.B}$, that is, Q = NOT (A AND B)
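The NAND truth table can be reproduced by chaining an AND gate and a NOT gate in Python. This is a simple sketch: the function names `AND`, `NOT` and `NAND` are chosen for this example, and 0 and 1 are used to stand for off and on.

```python
def AND(a, b):
    return a & b

def NOT(a):
    return 1 - a

def NAND(a, b):
    """An AND gate followed by a NOT gate."""
    return NOT(AND(a, b))

# Build the truth table as rows of (A, B, Q)
nand_table = [(a, b, NAND(a, b)) for a in (0, 1) for b in (0, 1)]

print("A B Q")
for a, b, q in nand_table:
    print(a, b, q)
```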
A B Q
NOR gate
The NOR gate is equivalent
0 0 1
21 State the name of the gate which produces an output of ‘on’ (or 1) when one input
is on, but not both.
22 State the difference in the symbols for an AND gate and a NAND gate.
23 State another name for an inverter.
Logic circuits
Logic circuits which contain two or more gates can be devised to solve practical problems.
Logic circuit: A solution to a problem that uses one or more logic gates.
For example, the windscreen wipers (O) on a car may activate if the engine (E) is switched on and either the wiper switch is activated (W) or the rain sensor is activated (S).
The logic circuit for this scenario is shown below, along with the relevant truth table.
Figure: logic circuit with inputs E, W and S and output O
E W S O
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 0
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
The circuit can also be represented using its Boolean expression: O = E.(W+S)
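The truth table for this expression can be generated in Python, which is a useful way to check a table completed by hand. The function name `wipers` is an assumption for this sketch; `&` is used for AND and `|` for OR.

```python
from itertools import product

def wipers(e, w, s):
    """O = E.(W+S): the engine is on AND (wiper switch OR rain sensor)."""
    return e & (w | s)

# Every combination of E, W and S, in the same order as the truth table
wiper_table = [(e, w, s, wipers(e, w, s)) for e, w, s in product((0, 1), repeat=3)]

print("E W S O")
for row in wiper_table:
    print(*row)
```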
You may be asked a question which involves translating between any two of:
✚ a written description of a problem
✚ a logic circuit diagram
✚ a logical expression.
Worked example
–
Draw a logic circuit to represent the Boolean equation A.B + A.C.
–
The circuit can be created by combining the output of A.B with the output of A.C, using an OR gate.
Figure 6.8
When completing truth tables for complex circuits it can be helpful to include
intermediate points.
Worked example
C E
Using the expression Q = D⊕E and substituting for D and E we can construct the full
Boolean expression:
– –
Q = A.B ⊕ B.C
Figure 6.10
Adders
Two of the most important logic circuits are the half adder and the full adder.
Half adder: A logic circuit to add two binary digits.
Full adder: A logic circuit to add three binary digits.
The half adder is used to add two binary digits so that:
0 + 0 = 0 carry 0
0 + 1 = 1 carry 0
1 + 0 = 1 carry 0
1 + 1 = 0 carry 1
When written out as part of a larger binary addition it is common to use the
following layout.
0 1 1
1 + 0 + 1 +
1 1 1 0
The outputs are labelled as S (sum) and C (carry).
Figure 6.11 A half adder logic circuit
The Sum is calculated using an XOR gate and the Carry is calculated using an AND gate.
Exam tip
You may be asked to construct the circuit for a half adder. If so, start with the truth table and the two gates required can be worked out from there.
INPUT OUTPUT
A B C S
0 0 0 0
0 1 0 1
1 0 0 1
1 1 1 0
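The half adder can be modelled in Python with one XOR and one AND operation. This is a minimal sketch: the function name `half_adder` is an assumption, and the outputs are returned in the same order as the truth table columns, (C, S).

```python
def half_adder(a, b):
    """Return (carry, sum) for two binary digits."""
    carry = a & b   # AND gate
    total = a ^ b   # XOR gate
    return carry, total

half_adder_table = [(a, b) + half_adder(a, b) for a in (0, 1) for b in (0, 1)]

print("A B C S")
for row in half_adder_table:
    print(*row)
```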
When adding two digit (or larger) numbers, it is necessary to add 3 inputs: the two digits being added and the carry from the previous digit. These are labelled A, B and Cin (Carry in).
When adding three binary digits there is an additional possible case:
1 + 1 + 1 = 1 carry 1
The truth table gives some clues as to the overall purpose of the logic circuit.
INPUT OUTPUT
A B Cin Cout S
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
Exam tip
You will not be asked to construct a full adder, but you may be shown one and asked to explain its function, or to complete a truth table.
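One common way to build a full adder is from two half adders and an OR gate, and that construction is sketched in Python below. The function names `half_adder` and `full_adder` are assumptions for this example, and this is just one possible arrangement of gates rather than the only valid design.

```python
from itertools import product

def half_adder(a, b):
    """Return (carry, sum) for two binary digits."""
    return a & b, a ^ b

def full_adder(a, b, c_in):
    """Return (carry out, sum) for two digits plus a carry in."""
    carry1, partial = half_adder(a, b)        # add the two digits
    carry2, total = half_adder(partial, c_in) # then add the carry in
    return carry1 | carry2, total             # either stage may produce a carry

full_adder_table = [
    (a, b, c) + full_adder(a, b, c) for a, b, c in product((0, 1), repeat=3)
]

print("A B Cin Cout S")
for row in full_adder_table:
    print(*row)
```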
D-type flip-flop
The edge triggered D-type flip-flop is a logic circuit which can be used to store a state. This concept is the basis of computer memory.
✚ The circuit takes two inputs, a data input, and a clock input signal which changes from 0 to 1 and back again at regular intervals.
✚ When the clock input signal changes from 0 to 1 (called the rising edge), the output changes to match the current data input signal.
✚ In between the rising edges, at any other point in the clock signal, any changes in the input data have no effect on the output; the old output will still appear, regardless of the current data input.
✚ This can be used to check if a value has changed since the last timing signal (by comparing the current output of the signal to its input), or to store data to be used later.
It should be noted that this circuit still requires power and is therefore a type of volatile memory.
Edge triggered D-type flip-flop: A logic circuit used to store the state of an input.
Rising edge: The point at which the clock signal changes from 0 to 1.
Volatile memory: Memory which can only store a value when supplied with power.
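The behaviour described above can be simulated in Python. This is a behavioural sketch only, not a model of the internal gates: the class name `DFlipFlop` and its `update` method are assumptions for this example, and the stored output is assumed to start at 0.

```python
class DFlipFlop:
    """Simulates the outward behaviour of an edge triggered D-type flip-flop."""

    def __init__(self):
        self.q = 0             # the stored output
        self._last_clock = 0   # previous clock value, used to spot a rising edge

    def update(self, data, clock):
        """Apply the current data and clock inputs and return the output."""
        if self._last_clock == 0 and clock == 1:   # rising edge
            self.q = data
        self._last_clock = clock
        return self.q

flip_flop = DFlipFlop()
outputs = [
    flip_flop.update(data=1, clock=0),  # no rising edge yet, output stays 0
    flip_flop.update(data=1, clock=1),  # rising edge: output becomes 1
    flip_flop.update(data=0, clock=1),  # clock still high: data change ignored
    flip_flop.update(data=0, clock=0),  # falling edge: data change ignored
    flip_flop.update(data=0, clock=1),  # next rising edge: output becomes 0
]

print(outputs)
```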
Exam tip
You are not expected to recall the inner workings of a D-type flip-flop. However, you
should remember that:
Boolean algebra
Using Boolean algebra
Each logic gate and each logic circuit can be written as a Boolean expression.
More complex logic circuits can often be replaced with a simpler design, and
so Boolean expressions are checked to see if they can be simplified.
There are three main steps involved in simplifying a Boolean expression.
Boolean identities
A Boolean identity refers to an expression where there is only one input, and therefore the circuit can be simplified.
Boolean identity: A relation that is always true.
For example the Boolean expression A.1 would produce the following truth
table.
A 1 A.1
0 1 0
1 1 1
As the second input is always on (represented as ‘1’), the output is always the
same as the value of A.
Another Boolean expression that can be simplified is $A.\overline{A}$.
$A$ $\overline{A}$ $A.\overline{A}$
0 1 0
1 0 0
As A and NOT A are opposite, the output from the AND gate must always be 0.
There are eight main Boolean identities to remember:
$A + 0 = A$
$A.0 = 0$
$A + 1 = 1$
$A.1 = A$
$A.A = A$
$A + A = A$
$A.\overline{A} = 0$
$A + \overline{A} = 1$
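Because A can only be 0 or 1, each identity can be checked by trying both values. The Python sketch below does this exhaustively; the dictionary name `identities` and the helper `NOT` are assumptions for this example, with `&` standing for AND and `|` for OR.

```python
def NOT(a):
    return 1 - a

# Each entry maps the identity (as text) to a pair of functions: left side, right side
identities = {
    "A + 0 = A": (lambda a: a | 0, lambda a: a),
    "A.0 = 0": (lambda a: a & 0, lambda a: 0),
    "A + 1 = 1": (lambda a: a | 1, lambda a: 1),
    "A.1 = A": (lambda a: a & 1, lambda a: a),
    "A.A = A": (lambda a: a & a, lambda a: a),
    "A + A = A": (lambda a: a | a, lambda a: a),
    "A.NOT A = 0": (lambda a: a & NOT(a), lambda a: 0),
    "A + NOT A = 1": (lambda a: a | NOT(a), lambda a: 1),
}

# An identity holds if both sides agree for every possible value of A
results = {
    name: all(left(a) == right(a) for a in (0, 1))
    for name, (left, right) in identities.items()
}

for name, holds in results.items():
    print(name, holds)
```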
Exam tip
In the Boolean expressions above, whilst A is a single input it could represent a more
complex expression.
Figure 6.13 Q = A + 0
Now consider the following diagram:
Figure 6.14
Now we can see that A is in fact the output of B.C and, therefore, the expression
above can be written as:
Q = B.C + 0, where A = B.C
and, since A + 0 = A,
B.C + 0 = B.C
In fact, it doesn’t matter what expression is contained within the dotted box: if the box has one output then you can think of it as ‘box + 0 = box’. And the same goes for any of the Boolean identities above.
The first two terms cannot be removed as a Boolean expression because the $\overline{A}$ must be multiplied first.
Note
AND operations should be considered to have the
Expanding and factorising brackets
De Morgan’s laws
Given a Boolean expression with a bar over two or more inputs, the bar can be split:
$\overline{A + B} = \overline{A}.\overline{B}$
The reverse is also true, and the bar can be joined:
$\overline{A} + \overline{B} = \overline{A.B}$
De Morgan’s laws can be simplified to a simple phrase: ‘Break the line, change the sign’.
De Morgan’s laws: Break the line, change the sign (and vice versa).
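Both of De Morgan's laws can be confirmed by checking every combination of A and B, as in the Python sketch below. The helper name `NOT` and the variable names are assumptions for this example, with `&` standing for AND and `|` for OR.

```python
from itertools import product

def NOT(a):
    return 1 - a

# NOT (A + B) = (NOT A).(NOT B): breaking the bar changes OR into AND
first_law_holds = all(
    NOT(a | b) == NOT(a) & NOT(b) for a, b in product((0, 1), repeat=2)
)

# NOT (A.B) = (NOT A) + (NOT B): breaking the bar changes AND into OR
second_law_holds = all(
    NOT(a & b) == NOT(a) | NOT(b) for a, b in product((0, 1), repeat=2)
)

print(first_law_holds, second_law_holds)
```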
Using this technique it is common to find you have a double bar. Applying the first De Morgan law:
– – – –
A + B = A. B
Alternatively applying the second De Morgan law gives:
Exam tip
Any question set on the exam paper must
Summary
✚ The term hardware describes the physical components of a computer system
✚ The term software refers to the programs that run on that system
✚ Software can be classified as system software or application software
✚ System software is concerned with allowing the computer system to run and includes the operating system, utility programs, libraries, and translators
✚ Application software is not associated with the operation of a computer but instead allows it to serve a useful purpose by performing tasks
✚ The operating system is a piece of system software which hides the complexity of the hardware and provides a user interface
✚ The operating system also performs resource allocation, specifically memory management, controlling CPU access and peripheral management
✚ Utilities are programs which deal with one aspect of running the computer system
✚ Examples of utility programs include disk defragmenters, back-up software, virus scanners and disk cleanup
✚ Libraries are collections of subroutines which can be accessed by other programs
✚ Translators are programs that convert source code into object code (assemblers, compilers and interpreters)
✚ The low-level languages are machine code and assembly language
✚ Machine code is a set of instructions for a particular CPU made up of individual binary codes
✚ Assembly language code is written in a form closer to English
✚ Each machine code and assembly language is specific to one platform
✚ Each assembly language instruction is equivalent to one machine code instruction
✚ High-level languages use structured statements and English-like keywords and constructs
✚ High-level languages support the use of local variables, parameters, named constants and indentation
✚ Imperative high-level languages describe how a problem should be solved, step by step
✚ Declarative high-level languages describe what should happen, but not how
✚ High-level languages make it easier for programmers to write code and can be run on different platforms
✚ Low-level languages use less memory and can be optimised for the platform on which they run
✚ An assembler is used to translate assembly language code into object code
✚ A compiler or an interpreter can be used to translate high-level code into object code
✚ A compiler translates all of the code at once and produces an executable file which can be run without further translation
✚ An interpreter translates and then executes one line of code at a time, and does not produce an executable file
✚ A compiler will attempt to identify all errors when compiling and ensures that the source code can be kept secret, as only the executable file is distributed
✚ A compiled program will only run on the system it has been compiled for
✚ An interpreter will re-translate the program each time it is run, so that the same source code can be run on more platforms without needing to provide a different executable file
✚ An interpreter can allow for faster debugging as a partially complete program will still run
✚ When sharing an interpreted program, the source code must be shared, and the program must be re-translated each time it is run
✚ Bytecode is used in some languages as an intermediate step as this bytecode can then be executed in a virtual machine, allowing compiled programs to run on a wide range of platforms
✚ Logic gates take one or more binary inputs and produce a single binary output
✚ The six main logic gates are the NOT, AND, OR, XOR, NAND and NOR gates
✚ Each logic gate has its own truth table, which describes the range of possible inputs and the corresponding outputs
✚ Logic gates can be combined to create a logic circuit, which can be used to solve a larger range of problems
✚ A half adder is used to add two binary digits
✚ A full adder is used to add three binary digits
✚ A D-type flip-flop stores the state of the input and can therefore be used as a unit of memory
✚ A D-type flip-flop has two inputs: a data signal and a clock signal
✚ Boolean expressions use the ‘+’ symbol to mean OR and the ‘.’ operator (meaning multiply) to mean AND
✚ BIDMAS rules apply, and so AND operations should be applied before OR operations
✚ Boolean identities can be used to simplify Boolean expressions, for example, A.1 = A
✚ Boolean expressions can be factorised, and brackets expanded, just as in normal algebra
✚ A bar over any part of a Boolean expression refers to a NOT operator
✚ De Morgan’s Law means that if you break a line, you change the sign
✚ The reverse is true, and two bars can be joined, with the operator where the break was being changed
✚ A rule of thumb when simplifying Boolean expressions is to look for opportunities to use De Morgan’s Law, followed by looking for Boolean identities and finally looking for opportunities to use brackets (factorising or expansion)
Exam practice
1 A computer has been newly set up, with an operating system, disk defragmenter, hard disk drive and a compiler.
a) State which item from the list above is an item of hardware. [1]
b) State which category of system software a disk defragmenter belongs to. [1]
c) Describe two different types of resource management carried out by the operating system. [2]
d) Identify one other feature of the operating system. [1]
2 Dani has written a computer program using an imperative high-level language.
a) Explain the meaning of the term imperative high-level language. [2]
b) Suggest two reasons why Dani may have chosen to use a high-level language rather than a low-level language. [2]
c) State the name of one low-level language. [1]
d) State the purpose of a compiler. [2]
e) Describe two advantages to Dani for choosing to use a compiler. [4]
f) Describe one disadvantage to the end-user of Dani’s program if she had chosen to use an interpreter. [2]
g) Dani’s program is compiled to bytecode instead of object code. Suggest one reason why she has chosen this method. [1]
h) Explain the process of executing the bytecode. [2]
3 a) State the name of this logic gate. [1]
[Figure: a logic gate with inputs A and B and output Q]
b) Copy and complete the truth table for this logic gate. [2]
[Truth table with columns A, B and Q]
c) Draw the circuit diagram for the logical expression. [4]
Q = A.B + B.C
d) Copy and complete the truth table for the logic circuit described in part c). [3]
[Truth table with columns A, B, C and Q]
e) Use the truth table to draw a simplified logic circuit. [3]
f) State the name of this logic circuit. [1]
[Figure: a logic circuit with an output labelled Cout]
4 Simplify each of these Boolean expressions.
–
a) 1 ⊕ B [1]
b) A.B + B [3]
– –
c) (A + B). (A + B) [4]
The components
Processor
The processor, or central processing unit (CPU), is used to process instructions.
Every other aspect of computer hardware is designed around passing instructions and data to the CPU to be processed, and then returning the results.
In some systems it is possible to remove and replace the processor in order to upgrade the system, while in other systems the processor is permanently fixed and cannot be replaced.
The individual components of a processor are discussed later in this chapter.
CPU: Electronic device used to process instructions.
Main memory
Main memory stores data and instructions that are directly accessed by the CPU. There are two main forms of main memory: RAM and ROM.
✚ RAM (random access memory) is volatile storage used to hold the instructions and data for programs that are currently running.
✚ ROM (read only memory) is non-volatile storage used to hold the startup instructions for the computer.
RAM can be easily upgraded in most desktop and laptop computers as each memory module is built onto a small circuit board with standardised connectors.
Main memory: Memory that can be directly accessed by the CPU.
RAM: Short term storage for currently running programs and currently used data.
ROM: Long term storage for startup instructions.
Volatile: Data is lost when electrical power is removed.
Non-volatile: Data is retained even when electrical power is removed.
Note
When a computer system runs out of main memory a portion of the secondary storage can be allocated as virtual memory. The least frequently accessed contents of main memory are placed in virtual memory, which has significantly slower data access times due to the physical limitations (for example, spinning up a magnetic hard drive).
Virtual memory: A portion of secondary storage used to store the least frequently-used instructions and data.
Addressable memory
Inside main memory, each block of memory is given a unique address so that the correct data can be retrieved and passed to the CPU. Without using blocks, the whole of the memory would need to be retrieved each time.
Buses
Three buses are used to allow communication between the processor, RAM and the I/O controllers.
Bus: A communication system for transferring data.
[Figure: the processor, RAM, video controller (connected to the monitor) and keyboard controller (connected to the keyboard), all linked by the address bus, data bus and control bus]
von Neumann architecture A computer system design with one shared memory
for instructions and data.
Harvard architecture A computer system design with separate memory for
instructions and data.
✚ Memory address register (MAR) – stores the address of the item in memory currently being addressed. This could refer to an instruction or to data
✚ Memory buffer register (MBR) – stores the data that is currently being fetched from memory
Exam tip
Be familiar with the acronyms for each
15 Name the five main components in a processor – both the full names and the
abbreviations.
16 What two tasks are carried out by the ALU?
17 Which part of the processor is used to decode instructions?
18 What data can be stored in a general-purpose register?
Fetch
In this part of the cycle the next instruction is fetched from memory.
1 The contents of the PC are copied to the MAR.
2 The contents of the PC are incremented.
3 The address bus transfers the contents of the MAR to main memory.
4 The data at that memory location is transferred to the processor via the
data bus.
5 This fetched data is stored in the memory buffer register (MBR).
6 The data (which is an instruction) is transferred to the CIR.
Decode
In this part of the cycle the contents of the CIR are decoded by the processor’s
control unit.
Execute
The final part varies significantly depending on the exact instruction.
If the instruction involves fetching data from memory then:
✚ the address of that data is stored in the MAR
✚ the address bus transfers the contents of the MAR to main memory
✚ the data at that memory location is transferred to the processor via the
data bus
✚ the fetched data is stored in the memory buffer register (MBR).
If the instruction involves writing data to memory, then the process will be
almost identical, except that the data will be transferred from the MBR to
main memory via the data bus.
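The six fetch steps described above can be traced with a short Python sketch. This is a simplified model under stated assumptions: registers are held in a dictionary, main memory is a dictionary keyed by address, and the two buses are represented only by comments. The memory contents are invented for illustration.

```python
def fetch(registers, memory):
    """Carry out the fetch stage, updating the registers in place."""
    registers["MAR"] = registers["PC"]      # 1 PC copied to MAR
    registers["PC"] += 1                    # 2 PC incremented
    address = registers["MAR"]              # 3 address sent along the address bus
    data = memory[address]                  # 4 data returned along the data bus
    registers["MBR"] = data                 # 5 fetched data stored in the MBR
    registers["CIR"] = registers["MBR"]     # 6 instruction copied to the CIR
    return registers


# Hypothetical memory contents: two instructions and one data item.
memory = {0: "LDR R0, 100", 1: "HALT", 100: 8}
registers = {"PC": 0, "MAR": None, "MBR": None, "CIR": None}
fetch(registers, memory)
```

After one call the PC already points at the next instruction, while the MAR still holds the address of the instruction that was just fetched.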
Sometimes the cycle is described in five steps:
1 MAR ← [PC]
2 a PC ← [PC]+1
b MBR ← [Memory]addressed
3 CIR ← [MBR]
4 Decode instruction
5 Execute instruction
In this style of question, the square brackets read ‘the contents of’ and the arrow, ←, reads ‘copied to’. For example, Step 1 reads ‘The contents of the program counter are copied to the memory address register’.
Note that steps 2a and 2b can occur simultaneously, as both tasks use different registers and do not rely on the result of the other operation.
Exam tip
Questions referring to the fetch-execute cycle are typically longer-answer questions. Make sure you are confident explaining the fetch part of the cycle in particular, including the use of both buses and registers.
Now test yourself
19 Identify the three main stages in the fetch-execute cycle.
20 Describe the first part of the fetch stage.
21 What is the purpose of the program counter?
22 Which register communicates via the address bus?
23 Which register communicates via the data bus?
Note that in machine code the addressing mode is included as part of the opcode. In assembly language the addressing mode is typically shown as part of the operand in order to make the assembly language instruction easier to follow. However, the addressing mode bit is still stored as part of the opcode.
opcode operand(s)
basic machine operation addressing mode
1 0 0 1 1 1 1 1 1 1 0 0 0 1 1
Figure 7.4 An example machine code instruction
CMP r1, #10 ‘compare the value in register 1 with the value 10’
Addressing modes
The addressing mode describes whether the operand is an immediate value, or the address of a value.
Addressing mode: Either immediate or direct.
Immediate addressing: The operand is the value which is to be processed.
Direct addressing: The operand is the address of the value which is to be processed.
Exam tip
To help remember the vocabulary consider that
Immediate addressing is when the operand of an instruction contains the number itself. For example:
ADD R1, R2, #3
This instruction will add the number 3 to the value in register 2 and store the answer in register 1.
Direct addressing is when the operand of an instruction contains an address (either a register number or an address in memory). For example:
ADD R1, R2, R3
This instruction will add the value in register 3 to the value in register 2 and store the answer in register 1.
The exact format of the addressing mode in machine code is specific to that processor’s instruction set.
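The difference between the two ADD examples can be sketched in Python. This is a simplified model, not an implementation of the AQA instruction set: the helper names are hypothetical, registers are stored in a dictionary, and only register operands are handled for direct addressing.

```python
def resolve_operand(operand, registers):
    """Return the value that the final operand refers to."""
    if operand.startswith("#"):
        # Immediate addressing: the operand is the value itself.
        return int(operand[1:])
    # Direct addressing: the operand names a register holding the value.
    return registers[operand]


def add(dest, source, operand, registers):
    """Model ADD dest, source, operand."""
    registers[dest] = registers[source] + resolve_operand(operand, registers)


registers = {"R1": 0, "R2": 5, "R3": 7}   # assumed starting values

add("R1", "R2", "#3", registers)          # ADD R1, R2, #3
immediate_result = registers["R1"]

add("R1", "R2", "R3", registers)          # ADD R1, R2, R3
direct_result = registers["R1"]
```

With the assumed starting values, the immediate form adds the number 3 to R2, while the direct form adds whatever is currently stored in R3.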
AQA has its own instruction set which is used in Component 2 examinations.
The AQA Assembly Language Instruction Set can be found under ‘Other assessment resources’ on the AQA website, aqa.org.uk. Search for ‘7516-7517’ and then scroll down to ‘Past papers and mark schemes’.
Exam tip
The instruction set is always included in the question paper with details of what each operation does and how it should be structured, so you do not need to memorise this content.
Questions may ask you to write your own assembly language code, or to trace an existing algorithm (or both).
This section often causes some confusion for students, so while you don’t need to memorise the instruction set, you do need to have practical experience of using it.
The first instruction will fetch the item from memory location 100 and copy it
into register 0.
The second instruction will copy the item from register 2 into memory
address 103.
Arithmetic
With arithmetic, remember that the first register is the location for storing the result.
28 For the instruction LDR R2, 102, which one of these statements is true?
A The value 102 will be stored in register 2
B The value stored in register 2 will be copied into memory
C The value stored at memory address 102 will be copied into register 2
29 Write the assembly language code for storing the value in register 4 into the
memory address 101.
30 What symbol is used to indicate immediate addressing?
31 Explain the effect of executing the instruction MOV R4, R0.
32 If R1 = 10, R2 = 20 and R3 = 30, state the values stored in each register
after executing the instruction SUB R2, R1, R3.
Answers available online
Remember that you cannot branch on a condition unless you have performed a comparison first.
Branch: Jump to a labelled part of the program.
Selection: A method of choosing whether to execute the following instruction(s).
Iteration: A method of repeating a section of code.
Condition: A comparison to see if two values are equal, or whether one is greater/less than the other.
Label: A named point in the program.
A selection structure can be created as follows:
CMP R3, R5
BEQ same
B diff
The first operation will compare the values in register 3 and register 5. The result will be either:
✚ EQ (equal to)
✚ NE (not equal to)
✚ GT (greater than – if R3 > R5)
✚ LT (less than – if R3 < R5)
The second operation will branch (go to) the part of the program with the label ‘same’ if the result was EQ.
The third operation is only reached if that branch was not taken, and branches unconditionally to the label ‘diff’.
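The CMP, BEQ and B sequence above behaves like an if/else statement. The Python sketch below mirrors it under simplifying assumptions: `compare` is a hypothetical stand-in for CMP that returns the set of conditions that hold, and the labels are represented as returned strings.

```python
def compare(x, y):
    """Model CMP: return the set of condition codes that are true."""
    flags = {"EQ"} if x == y else {"NE"}
    if x > y:
        flags.add("GT")
    if x < y:
        flags.add("LT")
    return flags


def select(r3, r5):
    """Model CMP R3, R5 followed by BEQ same and B diff."""
    flags = compare(r3, r5)   # CMP R3, R5
    if "EQ" in flags:         # BEQ same: branch only if the result was EQ
        return "same"
    return "diff"             # B diff: unconditional branch otherwise
```

Note that NE is true together with either GT or LT, so a program could test any of those conditions after the same comparison.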
Alternative operations that would also be based on the result of the
comparison would be:
The values of register 1 and register 2 would be processed using a bitwise AND.
The other logical operations all use three characters in the AQA assembly
language.
ORR R1, R1, #10
EOR R2, R3, #20
MVN R3, R3
The first instruction will perform a bitwise OR between the value in register 1
and the decimal number 10, storing the result in register 1.
The second instruction will perform a bitwise XOR (exclusive OR) between the
value in register 3 and the decimal number 20, storing the result in register 2.
The third instruction will perform a bitwise NOT (move NOT) on the value in register 3 and store the answer in register 3.
0001 0100 OR 0000 1010 = 0001 1110
0001 0010 XOR 0001 0100 = 0000 0110
NOT 0001 0010 = 1110 1101
Exam tip
To find out if two values are equal, an XOR operation can be used. A XOR 2 (A⊕2) will return 0 if and only if A = 2.
Registers
R0 R1 R2 R3 R4 R5
4 30 6 237 28 18
Main memory
100 101 102 103
8 12 25 5
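The bitwise OR, XOR and NOT examples in this section, and the exam tip about testing equality with XOR, can be checked in Python. The sketch assumes 8-bit values, so the NOT result is masked to eight bits; the helper `to_bits` is a hypothetical formatting function.

```python
MASK = 0xFF  # assume 8-bit registers for this illustration


def to_bits(value):
    """Format a value as two groups of four bits, e.g. '0001 1110'."""
    bits = format(value & MASK, "08b")
    return bits[:4] + " " + bits[4:]


or_result = 0b00010100 | 0b00001010    # 20 OR 10
xor_result = 0b00010010 ^ 0b00010100   # 18 XOR 20
not_result = ~0b00010010 & MASK        # NOT 18, kept to 8 bits


def equals_two(a):
    """A XOR 2 is 0 if and only if A equals 2."""
    return (a ^ 2) == 0
```

Python integers are not fixed width, which is why the NOT result has to be masked; a real 8-bit register would discard the higher bits automatically.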
36 Convert each value to unsigned binary and carry out the bitwise operation.
a 12 AND 61
This program multiplies the value in register 3 by the value in memory address 100 and
stores the result in memory address 103.
Lines 2 and 3 of the program will jump to the HALT operation if the value in register 4
is less than 1.
44 Copy and complete the trace table for the following program.
MOV R1, R0
Registers
R0 R1 R2 R3 R4 R5
10 15 12
Interrupts
The processor is continually processing instructions, even when a computer
seems to be idle.
It is important to be able to get the processor’s attention and to alter the
sequence of instructions which is about to be processed. This might happen
for a number of reasons, such as:
✚ timing interrupts; for example, at fixed intervals the screen must be
redrawn
✚ program error interrupts; for example, if a program has attempted to
divide by 0
✚ hardware error interrupts; for example, a printer reports a paper jam
✚ I/O interrupts; for example, the user has pressed a key.
An interrupt is sent to the processor in order to ensure that it can deal with this event.
In order for the processor to recognise this, the FDE cycle includes a step to check for an interrupt.
Figure 7.6 A fetch-execute cycle model with a check for any interrupts (stages shown: Fetch, Decode, Execute, Check for interrupt)
Interrupt: A signal sent to the processor in order to alert it to an event.
Interrupt service routine (ISR): A program which will examine an interrupt and handle the event.
Stack: A data structure in which the most recently added item is returned first.
Because programs will require a large number of instructions to be processed, when an interrupt is detected then that program’s execution must be paused.
✚ The current state of the processor and its registers is saved on a stack.
✚ The source of the interrupt is identified.
✚ The appropriate interrupt service routine is called.
✚ The state of the processor is restored from the stack.
An interrupt service routine (ISR) is a software routine which will take an interrupt and examine it in order to determine the best course of action.
This will usually involve adding new instructions to the queue of instructions to be processed.
Different ISRs are written for different types of interrupt as the responses to different interrupts will need to deal with that specific event.
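The four steps for handling an interrupt can be sketched in Python. This is an illustrative model only: the register names, the `keyboard_isr` routine and the dictionary of service routines are all assumptions made for the example, and a Python list stands in for the stack.

```python
def handle_interrupt(registers, stack, source, service_routines):
    """Save state, find the right ISR, run it, then restore state."""
    stack.append(dict(registers))        # save the processor state on the stack
    routine = service_routines[source]   # identify the source of the interrupt
    routine(registers)                   # call the appropriate ISR
    registers.clear()
    registers.update(stack.pop())        # restore the saved state from the stack


log = []


def keyboard_isr(registers):
    """Hypothetical ISR: it may overwrite registers while it runs."""
    registers["R0"] = 999
    log.append("key press handled")


registers = {"PC": 42, "R0": 7}
stack = []
handle_interrupt(registers, stack, "keyboard", {"keyboard": keyboard_isr})
```

Because the state was saved before the ISR ran, the interrupted program resumes with its original register values even though the ISR changed R0.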
Cores
A single-core processor can carry out one instruction at a time.
A dual-core processor can carry out two instructions at a time.
A quad-core processor can carry out four instructions at a time.
… and so on.
Single-core: A processor containing a single CPU.
Dual-core: A processor containing two CPUs, both working simultaneously.
Quad-core: A processor containing four CPUs, all working simultaneously.
However, it is not true that a quad-core processor will be four times faster than a single-core because:
✚ some tasks cannot be run in parallel
✚ other tasks may cause a delay while waiting for the result of another instruction
✚ there is a processing overhead in splitting the task into separate threads.
That said, a quad-core processor will typically complete a set of tasks more quickly than an equivalent single-core processor.
Cache memory
A cache is a small amount of memory located on the processor chip which acts as main memory for the most frequently accessed instructions and data.
The cache can be accessed much more quickly than RAM and, therefore, increasing the size of the cache reduces the average time it takes to fetch frequently-used instructions and data from memory.
Cache: A small unit of volatile memory placed on the same circuit board as a processor for fast access.
Clock: A component in the processor for generating timing signals.
Note
Most desktop and laptop processors will have three levels of cache: L1, L2 and L3.
L1 is the smallest and offers the quickest access.
L3 is the largest and offers the slowest access – though it is still faster than accessing
RAM.
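The effect of cache size on average access time can be illustrated with a small calculation. The sketch below uses a standard weighted-average model that the text does not state explicitly, and every number in it (hit rates and access times) is invented for illustration. It also assumes that a larger cache leads to a higher hit rate.

```python
def average_access_time(hit_rate, cache_ns, ram_ns):
    """Weighted average of cache and RAM access times, in nanoseconds."""
    return hit_rate * cache_ns + (1 - hit_rate) * ram_ns


# Assumed figures: cache access takes 1 ns and RAM access takes 100 ns.
small_cache = average_access_time(0.80, 1.0, 100.0)   # 80% of requests hit the cache
large_cache = average_access_time(0.95, 1.0, 100.0)   # 95% of requests hit the cache
```

Under these assumptions the average falls from roughly 20.8 ns to roughly 5.95 ns, which shows why even a modest improvement in hit rate matters when RAM is much slower than the cache.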
Clock speed
The clock is the processor component that generates timing signals.
✚ Most modern processors have a clock speed of at least 1 GHz, which means there are 1 billion clock cycles per second.
✚ Increasing the clock speed means that each instruction is processed more quickly.
✚ One significant downside of increasing clock speed is that this generates more heat which will ultimately shorten the lifespan of the processor and make it more prone to errors.
Word length
A binary word is a piece of binary data that can be processed in one unit.
✚ Using a 4-bit word length, for example, means that only 4 bits can be processed at once. This would mean that adding two 16-bit numbers would
Word: The maximum number of bits that can
The barcode scanner needs direct line of sight in order to function. Obscured, folded or damaged barcodes cannot be read.
Note
Barcodes can be thought of as a one-dimensional data
Figure 7.8 A digital camera: light is let in through the shutter (1) and focused by the lens (2); it is directed through RGB filters (3) before being focused onto the sensor array (4)
✚ RFID tags are fairly cheap, as well as being small and light. They are slightly more expensive than printing a barcode.
✚ RFID tags are often used for advertising on public transport, for contactless payment and contactless passports.
Exam tip
Barcode readers, digital cameras and RFID
Questions on storage devices generally fall into one of two categories:
1 Explain the workings of one particular type of storage device in detail.
2 Compare the advantages and disadvantages of different storage devices for a given scenario.
For questions of type 1, focus on learning and using the technical vocabulary and you can construct your answer around these key terms.
Volatile: Loses data when powered down.
Exam tip
Many students incorrectly assume that secondary storage only refers to backups. In order to avoid this trap, whenever you see the term ‘secondary storage’, just think ‘storage’.
Mechanical hard disk drive
A mechanical hard disk drive (HDD) uses a metal disk, called a platter, which is coated in a thin film of magnetic material.
✚ The film is made up of concentric rings, or tracks, each of which is split up into sectors.
✚ Each sector is made up of thousands of magnetic charges, to indicate 0s or 1s that represent data.
✚ The platter spins at high speed and a read/write head is moved over the platter, which can both detect and change the magnetic charges in that sector.
Most mechanical hard drives use multiple platters, each with its own read/write head.
Hard disk drive (HDD): A storage device which saves data using magnetic film.
Platter: A metal disk used to store the data in a HDD.
Track: A concentric ring on a platter.
Sector: A small section of a track.
Read/write head: The device used to read and write magnetic data to each sector.
[Figure: a hard disk drive showing the head assembly, disks, cylinders, a track and a sector]
Solid state drives are gradually replacing HDDs in desktop and laptop computer systems.
Block: A subdivision of the storage on an SSD.
Page: A subdivision of a block.
Latency: The time taken for the first signal to reach its destination.
Figure 7.12 shows the principle of NAND flash memory. When the control gate is turned on, electrons flow from the source to the drain, and some of those electrons are attracted into the floating gate. When the control gate is turned off, the electron flow stops and electrons in the floating gate are trapped there. The presence, or not, of electrons in the floating gate corresponds to a ‘1’ or ‘0’ state.
[Figure: a NAND memory cell showing the control gate, floating gate, insulating oxide layers, source (n+), drain (n+) and electrons]
Note
The spelling ‘disk’ is generally used in computing terms, though the terms CD and DVD are standards defined using the spelling ‘disc’. You will not be penalised for your choice of spelling.
drive?
60 What two components make up a solid state disk drive?
61 How is data stored on an optical disk?
Answers available online
Figure 7.13 The workings of an optical disk
Comparison of hard disk drives, solid state disk drives and optical disks
Capacity
✚ Hard disk drives: Typically have a very large storage capacity (frequently measured in TB).
✚ Solid state disk drives: Have a range of capacities, typically measured in GB and increasingly available in the TB range.
✚ Optical disks: Small storage capacity (700 MB for CDs up to 50 GB for Blu-ray).
Access time
✚ Hard disk drives: Access time is quite slow.
✚ Solid state disk drives: Access time is extremely fast due to the lack of moving parts.
✚ Optical disks: Access times are extremely poor and are by far the slowest of the three technologies discussed here.
Cost
✚ Hard disk drives: The overall cost of storage is very low.
✚ Solid state disk drives: The cost is higher than for an equivalent capacity HDD.
✚ Optical disks: The cost to produce one disk is very low, though the overall cost per GB is quite high.
Robustness
✚ Hard disk drives: Fairly robust, although they can be damaged if dropped and can be affected by very strong magnetic fields.
✚ Solid state disk drives: Very robust and difficult to accidentally damage. However, there are a limited number of read and write actions before the memory degrades.
✚ Optical disks: Robust – they can be dropped without too much damage. However, they are easily scratched and exposure to UV light can degrade the tracks.
Applications
✚ Hard disk drives: An effective solution for archived storage of large quantities of data.
✚ Solid state disk drives: An effective solution as a storage device in a laptop or desktop due to their fast response time.
✚ Optical disks: Suitable for transporting data in small batches, such as computer programs, films and music.
Capacity: The amount of data that can be stored.
Access speed: The time taken to read or write data.
Cost: This can mean the cost per device, or cost per GB.
Robustness: How likely the device is to break or be damaged.
Exam tip
When asked to compare solutions it is vital to refer back to the scenario given in the question, as the effectiveness of each solution will depend heavily on the requirements in that specific case.
Summary
✚ The internal hardware of a computer system includes the processor, main memory, I/O controllers, and buses
✚ Main memory is addressable so that each section of memory can be retrieved when needed
✚ I/O controllers are hardware interfaces between the internal components and external devices
✚ The three buses are the control bus, address bus and data bus
✚ The control bus is used to transmit control and status signals
✚ The address bus is used to transmit the address where data is to be read or written, from the processor to memory
✚ The data bus is used to transmit data to and from different components
✚ von Neumann architecture uses one area of memory for both instructions and data, and is used in general-purpose computers
✚ Harvard architecture uses two distinct areas of memory, one for instructions only and one for data only, and is used in embedded systems such as digital signal processing systems (DSPs)
✚ Programs can be saved in memory so that they can be executed repeatedly without having to input the program each time
✚ Machine code instructions are fetched from main memory and executed by the processor
✚ The processor is made of five key components – the arithmetic logic unit (ALU), control unit, clock, general-purpose registers and dedicated registers
✚ The ALU carries out all arithmetic and logical operations
✚ The control unit decodes instructions and sends control signals to other devices
✚ The clock generates a timing signal so that each process can be synchronised
✚ General-purpose registers are used to store data that is being worked on by the processor
✚ Each dedicated register is used for a specific purpose, mostly related to the fetch-execute cycle
✚ The dedicated registers are the program counter (PC), memory address register (MAR), memory buffer register (MBR), current instruction register (CIR) and the status register (SR)
✚ The fetch-execute cycle is often referred to as the fetch-decode-execute cycle as there are three phases
✚ It is important to be familiar with the steps involved in the fetch phase of the F-E cycle
✚ Each processor has its own instruction set, describing the machine code for each operation
✚ Each instruction is made up of an opcode (including the addressing mode) and an operand
✚ When using immediate addressing the operand should be treated as the value (also known as the datum) to be operated on
✚ When using direct addressing the operand should be treated as the address of a value in memory or a register
✚ In assembly code programming each opcode is represented as a short piece of text
✚ AQA examinations use a consistent assembly language instruction set which is provided in the exam paper – it is important to be familiar with using this assembly language instruction set to write simple programs and to hand trace existing programs
✚ Basic AQA assembly language operations include: load, add, subtract, store, branching, compare and halt
✚ Bitwise logical operations such as AND, OR (ORR), XOR (EOR) and NOT (MVN) are carried out one bit at a time
✚ Logical shifts involve shifting the binary values to the left (LSL) or right (LSR) a given number of times
✚ A logical shift left has the effect of doubling the value each time
✚ A logical shift right has the effect of halving the value each time, though at the risk of loss of accuracy due to underflow
✚ Interrupts are signals which need to get the processor’s attention as it carries out the F-E cycle
✚ There are four main types of interrupt – timing, program error, hardware error and I/O interrupt
✚ When an interrupt is triggered the state of the processor is stored on a stack, the source of the interrupt is identified, the appropriate interrupt service routine (ISR) is called and the state of the processor is restored
✚ An ISR is a program designed to decide on the best course of action to follow for a given interrupt
✚ Increasing the number of cores in a processor means that more instructions can be carried out simultaneously
✚ Increasing the clock speed of a processor means that each instruction takes less time to complete
✚ A cache is a small, very fast area of memory used for storing the most frequently accessed instructions and data
✚ The word length dictates the number of bits which can be processed in a processor core in any one cycle
✚ The address bus width limits the range of available memory addresses
✚ The data bus width limits the amount of data which can be fetched from memory in one cycle
✚ A barcode reader emits a red light which is aimed at a black and white image made of bars
✚ The sensor in the scanner reads the reflected light and the decoder converts this into a digital code
✚ Barcodes are cheap, small and light, but require line of sight and can easily be obscured or damaged
✚ A digital camera uses a lens to focus the incoming light onto a sensor chip with an array of sensors which produces an electrical signal
✚ Each sensor represents one pixel
✚ A colour filter is used to separate the amount of red, green, and blue light being captured
✚ Each sensor electrical signal is processed by an ADC
✚ Digital cameras collect a large amount of data, which requires a lot of processing in an automated system
✚ RFID (radio frequency identification) is able to transmit data over a short distance using radio waves
✚ Active RFID tags use a battery to allow them to broadcast the signal and have a longer range
✚ Passive RFID tags take power from the radio frequency energy taken from the signal sent by the RFID scanner
✚ RFID tags are cheap, small, and light, and do not require line of sight
✚ A laser printer applies electrical static charges to a drum and toner and uses a laser beam, switching on and off, to remove the electric charge on the drum and attract toner to areas that should be black
✚ The drum rolls over the paper, transferring the toner which is then fused to the paper
✚ A mechanical hard disk drive (HDD) uses platters split into tracks and sectors
✚ The platter of a HDD is coated in a magnetic film which can be read and written to using a read/write head
✚ A solid state disk drive contains NAND flash memory and a controller
✚ The SSD data is stored using floating gate transistors and is split into blocks and pages
✚ An optical disk uses a single spiral track filled with pits and lands
✚ A laser beam is used to read the data on an optical disk, with a transition between a pit and a land representing a 1
✚ HDDs typically have a large capacity and a medium access time
✚ SSDs typically have a medium capacity and a fast access time
✚ Optical disks typically have a small capacity and a very slow access time
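The summary points on logical shifts (doubling with LSL, halving with LSR, and the loss of accuracy when a bit is shifted out) can be checked with a brief Python sketch. The function names copy the AQA mnemonics, but the 8-bit register width is an assumption made for this example.

```python
def lsl(value, places, width=8):
    """Logical shift left, discarding any bits pushed beyond the register width."""
    return (value << places) & ((1 << width) - 1)


def lsr(value, places):
    """Logical shift right; bits shifted off the right-hand end are lost."""
    return value >> places


doubled = lsl(5, 1)                  # 0000 0101 becomes 0000 1010
halved = lsr(10, 1)                  # 0000 1010 becomes 0000 0101
halved_odd = lsr(5, 1)               # the least significant 1 is lost
shifted_out = lsl(0b10000000, 1)     # the top bit falls off the left-hand end
```

Halving 5 gives 2 rather than 2.5, which is the loss of accuracy the summary refers to; shifting left can also lose information when the most significant bit is pushed out of the register.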
Exam practice
4 Write an assembly language program to carry out the following high level algorithm.
The mathematical operator // represents an integer division.
Assume that register R1 holds the value of A, R2 holds the value of B, R3 holds the value of
Definitions
✚ Moral and ethical issues relate to what is right or wrong, and how consequences might affect individuals and groups of people. For example, an automated telephone answering system might mean that a company makes more profit by reducing the need for human call handlers. However, it might also mean that the user experience for the customer is less satisfying, and it will likely have a negative consequence for the staff who are no longer needed.
✚ Legal issues relate to the law. This might refer to those making laws,
enforcing laws and those who may break laws.
✚ Cultural issues relate to how the culture of a society changes. For example,
the widespread availability of computing equipment and fast internet
access has allowed many more people to work from home. This has both
positive and negative consequences such as reduced travel costs, but also
a lack of social interaction that previously happened in the workplace.
The people who develop systems must think carefully about what data is to
be collected, how it can be collected, how it should be stored, what processing
will occur and how the results of the processing should be used.
Another point to consider is that software is scalable. This means that:
✚ once written and compiled, the internet allows the same software to be easily distributed to almost anywhere in the world
✚ one piece of software could be installed on literally millions or even billions of computers.
Scalable: Used to describe a product, service or business that can cope with increased demand.
This represents the tremendous impact that software developers and
computer scientists can have, and why they must consider all of the ethical,
moral, legal and cultural impacts of their code.
1 Identify four items of personal data that are typically stored in online systems.
2 When creating software, suggest three ways that inappropriate consequences
could be introduced into the program.
Legislation
It is not necessary to have a detailed knowledge of computing-related
legislation, though it may be useful to have a basic knowledge of some key
examples.
Making links
Encryption is an important aspect of ensuring that data cannot be understood.
Techniques for encryption are explored in Chapter 5.
Network security concerns, including malware, are discussed in Chapter 9.
3 What piece of legislation makes it illegal to distribute films, games and albums
without permission?
4 Other than computing-specific legislation, suggest two other legal issues relating
to self-driving cars.
5 Dave keeps a list of email addresses and phone numbers for the players in a small,
local football team. What legislation does he need to consider?
Answers available online
Case studies
In order to explore the topic of the consequences of computing, it can be
useful to look at past examples and to discuss both the technological details
Malicious programmers
There have been many cases where a programmer has written code that will
collect personal data that is not necessarily needed or has created a method
by which they can collect data that they should not have access to.
Relevant legislation such as the Computer Misuse Act and the Data Protection
Act is appropriate for discussion here, and it is reasonable to ask how the
situation could have been prevented.
The sharing and auditing of code within an organisation is important in order
to be able to recognise situations where a programmer has, either maliciously
Exam tip
It is common for questions of this type to not have a ‘right’ and ‘wrong’ answer. Credit
is awarded for identifying, discussing and evaluating potential issues.
Marks are awarded for identifying relevant knowledge; for example, suggesting
appropriate input devices for collecting data or identifying the methods by which
wireless data transmissions can be intercepted.
To reach the top mark bands it is important to follow a line of reasoning, using your
knowledge to write in connected sentences in a way that makes sense and relates
to the context of the question. Explaining how each point links to the scenario and
adding as much technical detail and vocabulary as you can makes it more likely that
you will score well on this type of question.
Always make sure you back up any arguments or suggestions you make with
facts, logical arguments and technical details, as unsubstantiated statements don’t
demonstrate your understanding.
Hypotheticals
Another strategy for preparing for these types of questions is to ask ‘what if…?’.
Summary
✚ Computer systems make it easier and more common for organisations to collect personal data
✚ How that personal data is used has significant consequences that affect individuals, groups, and larger societies
✚ Moral and ethical consequences are those that affect individuals and groups of people. These consequences can be positive or negative, and are usually elements of both
✚ Cultural consequences affect the way that a society works, thinks or behaves
✚ Computer scientists and software engineers have a responsibility to ensure that systems are developed with these consequences in mind
✚ Design decisions should consider how data is to be collected, stored, processed and used
✚ Consideration should be taken of the impact on individuals and on society at large
✚ Computer-based legislation includes the Computer Misuse Act, the Data Protection Act, the Regulation of Investigatory Powers Act, and the Copyright, Designs and Patents Act
✚ Other legislation is often relevant depending on the circumstances, including health & safety, road safety and consumer rights legislation
✚ Legislation often lags behind changes in technology as it takes time to recognise the need for it and then create tightly defined legislation to deal with the consequences of that technology
✚ Case studies of current and recent ethical consequences of computing issues are an effective way of preparing for questions on this topic
✚ Hypothetical, what if, questions are a useful approach to preparing for this topic
✚ Questions on ethics will often include a technical element that relies on knowledge of other topics such as network security, data collection methods and effective program design
✚ To reach the top mark bands it is important to follow a line of reasoning and to back up your points with facts, logical arguments and technical details
Exam practice
1 A company wishes to produce an automated garage door opener. The door should automatically open when the homeowner returns.
Discuss a range of technologies that could be used to allow the automated garage door opener to function and consider the moral, ethical, legal and cultural consequences of using this device. [12]
9 Fundamentals of communication
and networking
Communication
Communication methods
In serial data transmission each bit is sent one after another. This can be done with a single wire, or a single track on a circuit board.
In parallel data transmission multiple bits are sent simultaneously. To achieve this there must be multiple wires or multiple tracks on which to send the data.
In theory it is quicker to send data over a parallel connection as more data can be transferred at once. However, there are several reasons why most communication is sent serially:
✚ Crosstalk occurs when the signal in one wire causes electrical interference with the signals on the neighbouring wires.
✚ Data skew can occur, meaning that one or more bits can be read incorrectly by the receiver if the transmission is not correctly synchronised.
✚ The hardware for serial communication is simpler (and, therefore, cheaper) to produce.
In synchronous data transmission both sender and receiver must be in sync. This means that a common clock must be used and the timing signal must be sent in addition to the data.
In asynchronous data transmission there is no common clock.
The sender will send a start bit at the beginning of a transmission to indicate that the data is about to be sent. A stop bit will be sent at the end of that transmission. A start bit will be 0 and a stop bit will be 1.
The process then repeats, and the receiver is able to use the start bit to synchronise its clock to that of the sender. This synchronisation is carried out each time a start bit is received.
Additionally, a parity bit might be added before the stop bit in order to allow error checking to occur.
Finally, the transmission is often shown in reverse, indicating which bits will be received first by the receiver. If so, then the MSB would be on the right hand side.
Serial data transmission Data is sent down one wire, one bit after another.
Parallel data transmission Data is sent on several wires, simultaneously.
Crosstalk Interference caused when two or more wires are in close proximity.
Data skew When data that was sent at the same time arrives at slightly different times.
Synchronous data transmission Where data is sent along with a timing signal.
Asynchronous data transmission Where data is sent without a timing signal.
Clock Used to provide a timing signal.
Start bit A bit sent at the start of a message in order to provide timing data.
Stop bit A bit sent to mark the end of a message.
Parity bit A bit used in error detection.
A typical transmission of an ASCII character using odd parity might work as follows:
Character to transmit: M
ASCII value: 77₁₀ or 100 1101₂
Parity bit: 1
Data transmitted:
Stop bit    Parity bit    Data             Start bit
1           1             1 0 1 1 0 0 1    0
ASCII A 7-bit code used for representing characters as binary numbers.
Exam tip
Parity bits are sometimes included in questions on start and stop bits.
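The framing of the character M described above can be checked with a short Python sketch. This is only an illustration: the function names are invented for this example, and it assumes 7-bit ASCII, odd parity, a start bit of 0 and a stop bit of 1, as in the worked example.

```python
def odd_parity_bit(data_bits: str) -> str:
    """Return the parity bit that makes the total number of 1s odd."""
    ones = data_bits.count("1")
    return "1" if ones % 2 == 0 else "0"


def build_frame(character: str) -> dict:
    """Build the start, data, parity and stop bits for one 7-bit ASCII character."""
    data_bits = format(ord(character), "07b")  # MSB first, so 'M' gives '1001101'
    return {
        "start": "0",
        "data": data_bits,
        "parity": odd_parity_bit(data_bits),
        "stop": "1",
    }


def as_shown_in_reverse(frame: dict) -> str:
    """Lay the frame out with the first bit to be received on the right."""
    # Reversing the data bits puts the MSB on the right-hand side.
    return frame["stop"] + frame["parity"] + frame["data"][::-1] + frame["start"]


frame = build_frame("M")
print(" ".join(as_shown_in_reverse(frame)))  # prints 1 1 1 0 1 1 0 0 1 0
```

The data bits for M contain four 1s, so the parity bit must be 1 to make the total odd, which matches the table above.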
Making links
The use of ASCII to represent characters using binary numbers is an important aspect
of data transmission as non-numeric data must often be communicated and is
1 Describe the main difference between serial and parallel data transmission.
2 In synchronous data transmission, what is transmitted other than the data itself?
3 In asynchronous data transmission, what is transmitted other than the data itself?
Communication basics
When data is transmitted a signal is sent which changes at fixed time intervals to indicate a new value.
Each signal can represent more than one binary digit by using more than one possible value. For example, an electrical current can be sent with a different voltage.
Figure 9.1 Sending multiple bits per value at a baud rate of 1 kHz and a bit rate of 2000 bits per second (vertical axis: signal values 00, 01, 10, 11; horizontal axis: time in 1/1000s of a second)
The baud rate is the number of state changes per second and is measured in Hz. For example, at a baud rate of 100 kHz there are 100 000 changes to the signal per second.
The bit rate is the maximum number of bits that can be transferred per second. This can be calculated by multiplying the baud rate by the maximum number of bits per signal.
The bandwidth describes the range of possible signals, and a larger bandwidth means that a larger number of distinct values can potentially be sent in one signal.
The bit rate is directly proportional to the bandwidth; in other words, if the bandwidth doubles, then there are twice as many possible unique symbols that could be sent. This means that the bit rate doubles as well.
The latency describes the time it takes for the first signal to reach its destination. This is not linked to the baud rate, bit rate or bandwidth of a connection. For instance, a high bandwidth connection can have a high or low latency depending on a number of other factors such as physical distance.
A protocol is a set of rules or standards, and a number of transmission protocols exist. These are essential as different systems need to be able to communicate with each other.
Baud rate The number of state changes per second.
Bit rate The number of bits that can be transmitted per second.
Bandwidth The range of possible signals that can be sent in one signal.
Latency The time taken for the first signal to reach its destination.
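The bit rate calculation described above (baud rate multiplied by the number of bits per signal) can be sketched in Python. The function names are hypothetical, and the sketch assumes that a signal with a given number of distinct values carries a whole number of bits.

```python
def bits_per_signal(distinct_values: int) -> int:
    """Whole number of bits that one signal can carry, e.g. 4 values -> 2 bits."""
    return distinct_values.bit_length() - 1


def bit_rate(baud_rate_hz: int, bits_in_each_signal: int) -> int:
    """Bit rate in bits per second = baud rate x bits per signal."""
    return baud_rate_hz * bits_in_each_signal


# Figure 9.1: four possible values (00, 01, 10, 11) at a baud rate of 1 kHz.
print(bit_rate(1000, bits_per_signal(4)))  # prints 2000
```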
Protocol A set of rules or standards.
Now test yourself
4 What term refers to the range of possible values in one received signal?
5 What term refers to the time it takes for a signal to arrive at its destination?
6 What term refers to the number of signals that can be sent or received per second?
7 How is the total bit rate calculated?
8 What is the definition of a protocol?
Networking
Network topology
Figure 9.2 A bus network (figure labels: a terminator at each end of the bus; a printer attached to the bus)
Star topology
In a star topology, a central switch is used to connect each device on the network together. All data transmissions are passed through a link to the switch which then forwards the data on to the intended recipient.
Many small networks, and almost all home networks, are based on this topology.
Star topology A network arranged with a switch (or hub) at the centre.
Switch A device that receives and forwards data on a network.
Link A physical connection between two devices.
Figure 9.3 A star network
9 What device is always used in a star network?
10 What does the word ‘bus’ refer to in a bus network?
11 What is the definition of a physical topology?
12 How can a network have a different physical topology and logical topology?
13 Give two advantages and one disadvantage for using a star network.
Answers available online
Client–server
In a client–server network, a server controls access to a centralised resource; for example:
✚ files, in a file server
✚ emails, in an email server
✚ web pages, in a web server.
The client requests access to the resource and the server then processes the request, decides whether access should be granted and provides the service or response.
Client–server network A network in which clients make requests to servers.
Client A device which makes requests.
Server A device which controls centralised access to a resource.
Peer-to-peer
In a peer-to-peer network all devices have an equal status, and no resource is
centrally controlled.
Each resource can be stored on one or multiple devices on the network and can be accessed by any other device on that network.
This effectively shares the storage and processing load across the devices and is useful for situations where no central control is needed, or when a large number of requests might cause delays if relying on a central server.
Peer-to-peer networks are frequently used for accessing or sharing large files, as segments of the file can be stored on and copied from any device on the network with that segment.
Peer-to-peer networks don’t require a central server and are typically easier to set up and maintain. Management of security is more difficult, however, and it is difficult to ensure data consistency across the network.
One example of a peer-to-peer network is a simple home Wi-Fi network in which files can be sent to or from any device on the network.
Peer-to-peer network A network in which all devices are peers.
Peer Of equal standing, able to act as a client or a server.
Client–server: advantages
✚ Better security – software and security updates managed centrally, logins and access to files/folders also controlled.
✚ Data backups easier to manage as all held in one place.
Client–server: disadvantages
✚ If the server goes down then all clients are affected.
✚ Access times may be slow if the server gets too many requests from different clients at the same time.
✚ More expensive because of the costs involved with maintaining the central server.
✚ A hacker targeting the server can bring down the network.
Peer-to-peer: advantages
✚ Easy to set up.
✚ Less expensive to set up – doesn’t require specialist hardware.
✚ Network unaffected if one device fails.
Peer-to-peer: disadvantages
✚ Less secure – all devices have to have software and security updates run individually, so outdated security is more likely.
✚ Multiple versions of files on different devices.
✚ Each device needs to be backed up individually.
Exam tip
Questions on client–server or peer-to-peer networks tend to be longer discussion
and comparison questions. Make sure that you consider the context of the question
and remember that a server is always used in systems where centralised control is
important.
Wireless networking
Wireless networking uses radio frequencies in order to transmit data over a relatively small area (such as a house or office), called a local area network (LAN).
Wi-Fi is an international communication standard for wireless networking.
No cabling is required in a wireless network, meaning that devices can be moved easily without disrupting communication.
In order to connect to a wireless network, a wireless access point must be used. This is a device which broadcasts and receives the wireless signal to and from the devices on the network.
A wireless access point can be, and frequently is, connected to a wired network in order to allow both wired and wireless connections, as necessary.
Each device connecting to the wireless network requires a wireless network adapter. This can be built into the device (for example, on a computer’s motherboard) or added later (for example, through an expansion card or USB wireless adapter).
Each wireless network has a service set identifier (SSID) which appears as the ‘name’ of the network. Enabling the broadcast of the SSID makes it easier for new users to join a network.
Wireless network A network that allows devices to transmit data using radio frequencies.
Wi-Fi A set of technology standards that allows devices to communicate using radio frequencies.
Wireless access point A hardware device for allowing other devices to connect to an existing wireless network.
Wireless network adapter A hardware device for enabling devices to communicate using Wi-Fi.
SSID Service set identifier: an identifier, or name, for a wireless network.
Wireless security
The nature of wireless networking means that any device within range of the wireless signal will be able to intercept and read the data being transmitted, without needing to gain physical access to the network.
There are several methods for dealing with this security risk:
✚ Disabling SSID broadcast
Broadcasting of the network’s SSID can be disabled. Disabling the broadcast means that only users who are aware of the network and know its SSID will be able to connect.
✚ Strong encryption
Data sent over a wireless network should be strongly encrypted. Several encryption methods exist including WEP and WPA. Most wireless networks use WPA2 encryption as this offers significantly stronger encryption than other methods.
Only users with the encryption key are able to send and receive data over the wireless network, and while other devices can read the data being transmitted, this cannot be understood without the key.
✚ MAC address whitelist
A MAC address (or media access control address) is a unique hardware address assigned to each network interface card (which includes wireless network adapters).
By enforcing a whitelist, only those MAC addresses on the whitelist will be able to access the network and any other devices will be refused access.
Note
Wi-Fi is not an acronym and is not a directly shortened form of any longer words.
Wi-Fi is not the only wireless communication method. Bluetooth, 5G, NFC and Zigbee all make use of radio frequencies to allow wireless communication.
Encryption A method of scrambling data so that it cannot be understood.
MAC address A physical address, uniquely assigned to each piece of network hardware.
Whitelist A list of things considered to be acceptable or trustworthy.
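The MAC address whitelist described in the wireless security section can be sketched as a simple membership check. This is a minimal illustration, not how any particular access point is implemented; the addresses and function names are made up for the example.

```python
def normalise_mac(mac: str) -> str:
    """Write a MAC address in one consistent form, e.g. 00:1a:2b:3c:4d:5e."""
    return mac.replace("-", ":").lower()


# Hypothetical whitelist of devices that are allowed to join the network.
WHITELIST = {normalise_mac(mac) for mac in ["00:1A:2B:3C:4D:5E", "00:1A:2B:3C:4D:5F"]}


def is_allowed(mac: str, whitelist: set) -> bool:
    """Only devices whose MAC address is on the whitelist are given access."""
    return normalise_mac(mac) in whitelist


print(is_allowed("00-1a-2b-3c-4d-5e", WHITELIST))  # prints True
print(is_allowed("66:77:88:99:AA:BB", WHITELIST))  # prints False
```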
20 What is an SSID?
21 What two pieces of information are usually required for a device to access a
wireless network?
22 Identify three ways to ensure that wireless data transmissions are secure
Avoiding collisions
If two or more devices attempt to transmit data at the same time, there will be a collision and the data will be unreadable.
To avoid this there are two systems in use, often used together.
CSMA/CA stands for carrier sense multiple access with collision avoidance.
The aim of CSMA/CA is to allow devices to recognise when another broadcast is already occurring and to wait a random period of time before trying to broadcast again. It works as follows:
✚ Sending device checks for traffic.
✚ If another device is broadcasting, the sending device waits before repeating the process.
✚ If no other device is broadcasting, the data is transmitted.
RTS/CTS stands for request to send / clear to send.
Using RTS/CTS the device wishing to transmit sends an RTS request and waits for a CTS response from the receiver. If the CTS response is not received, the sender waits a random amount of time before sending the RTS again.
The random amount of time is used to ensure that, if two devices attempt to broadcast at the same time, it is unlikely that both will attempt to broadcast at exactly the same time again.
Once a CTS response has been received, the data is transmitted.
Finally, the receiver should respond with an acknowledgement (ACK). If this is not received then the data is resent.
Collision Where two items of data are transmitted at the same time, causing both to be lost.
CSMA/CA Carrier sense multiple access/collision avoidance. A method of collision avoidance by checking for existing transmissions.
RTS/CTS Request to send / clear to send. A method of collision avoidance by requesting clearance to transmit.
The full process can be written as follows.
✚ Sending device checks for traffic.
✚ If another device is broadcasting, the sending device waits before repeating
the process.
✚ If no other device is broadcasting, an RTS signal is sent.
✚ The receiving device sends a CTS response if it is ready to receive the
transmission.
✚ If no CTS response is received, the sending device waits a random period
of time before repeating the process.
✚ If a CTS response is received, the data is transmitted.
✚ The receiving device sends an acknowledgement (ACK) once all data has
been received.
✚ If the sending device does not receive an ACK then it repeats the process.
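The full process listed above can be traced with a small Python simulation. This is a sketch only: the three callables passed in (channel_busy, cts_received and ack_received) are hypothetical stand-ins for real radio hardware, the back-off is recorded rather than actually waited for, and the limit on attempts is an assumption added so that the loop always ends.

```python
import random


def transmit(channel_busy, cts_received, ack_received, max_attempts=10, rng=None):
    """Follow the CSMA/CA with RTS/CTS steps and return (success, log of events)."""
    rng = rng or random.Random()
    log = []
    for _attempt in range(max_attempts):
        if channel_busy():                      # check for traffic
            log.append("channel busy, wait")
            continue
        log.append("send RTS")                  # channel is free, request to send
        if not cts_received():
            backoff = rng.randint(1, 10)        # random wait, arbitrary time units
            log.append(f"no CTS, back off {backoff}")
            continue
        log.append("CTS received, send data")
        if ack_received():
            log.append("ACK received")
            return True, log
        log.append("no ACK, repeat")            # start the whole process again
    return False, log


# Example run: channel busy once, then one missing CTS, then success.
busy = iter([True, False, False])
cts = iter([False, True])
ok, events = transmit(lambda: next(busy), lambda: next(cts), lambda: True)
print(ok)
for event in events:
    print(event)
```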
The internet
The internet and how it works
Router
✚ A router is used to connect different networks together. When a router
receives a packet it will read the destination IP address (see IP address
section below).
✚ If the packet is intended for that router’s network then the packet will be
passed into the network. If the packet is intended for another network then
it will be forwarded to another router.
✚ A router will try to pass the packet to its destination via the fastest route
possible. This is done either by using the fewest number of steps or using
the route that is least congested at that moment.
✚ In order to complete this task a routing table is created which is used to
store the routes to particular network destinations.
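The idea of a routing table can be sketched in Python as a lookup from destination networks to candidate routes. Everything here is illustrative: the networks, router names and step counts are invented, and choosing the route with the fewest steps is just one of the two strategies mentioned above.

```python
import ipaddress

# Hypothetical routing table: destination network -> list of (next hop, number of steps).
ROUTING_TABLE = {
    ipaddress.ip_network("192.168.1.0/24"): [("deliver locally", 0)],
    ipaddress.ip_network("10.0.0.0/8"): [("router B", 4), ("router C", 2)],
}
DEFAULT_ROUTE = "router D"  # used when no entry in the table matches


def next_hop(destination: str) -> str:
    """Read the destination IP address and choose where to send the packet next."""
    address = ipaddress.ip_address(destination)
    for network, routes in ROUTING_TABLE.items():
        if address in network:
            hop, _steps = min(routes, key=lambda route: route[1])  # fewest steps
            return hop
    return DEFAULT_ROUTE


print(next_hop("192.168.1.20"))  # prints deliver locally
print(next_hop("10.1.2.3"))      # prints router C
print(next_hop("8.8.8.8"))       # prints router D
```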
packet, a gateway will strip most of the header data away and create new
headers so that the packet can be transmitted through the next network.
Figure 9.6 Carrying out a DNS lookup before requesting data from a server (figure labels: Home Laptop, DNS Server, BBC Server; request for IP address; IP address sent)
Internet security
The nature of the internet means that data is routed across several other networks before it reaches its destination. This means that packets can be intercepted and that any device connected to the internet can also be subject to a potential attack.
Firewalls
A firewall is used to check and potentially stop packets entering or leaving the network. It can be a piece of software on each individual computer or a hardware device that acts as a proxy server.
✚ Stateful inspection refers to examining the contents of a packet in order to decide whether the data itself is suspicious. The firewall keeps a record of all current connections in order to identify whether a packet is part of an ongoing communication.
✚ A proxy server is a physical device that sits between a private network and a public network. All data to and from the private network goes via the proxy server. It can therefore stop packets that have left a computer before they reach the public network and can prevent packets from the public network entering the private network, before they get to the computer.
✚ Packet filtering refers to stopping packets based on their destination or source IP address or their protocol. A firewall can be configured to allow web traffic, but to block FTP packets, or can refuse to send or receive packets addressed to or from a suspicious IP address. This can help prevent data being unknowingly sent from the device and can stop a malicious program downloading other malicious files.
Proxy servers give each device inside a private network a level of anonymity as the public network can only communicate with the proxy server. It also allows for packet filtering to occur for the entire network, at an institutional level, regardless of the configuration of each individual device within a network.
Firewall A software or hardware service that blocks or allows individual packets from entering or leaving.
Stateful inspection Inspecting the data contained in a packet.
Proxy server A device that sits between a private and a public network.
Packet filtering Stopping packets based on their IP address or protocol.
Now test yourself
37 Is a firewall a piece of hardware or a piece of software?
38 One way in which a firewall secures network traffic is packet filtering. Identify two others.
39 Explain how packet filtering is used to keep a device secure.
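The packet filtering described in the Firewalls section (allowing web traffic, blocking FTP, and refusing packets to or from a suspicious address) can be sketched as a pair of simple rules. The Packet class, the rule sets and the example addresses are all assumptions made for illustration; real firewalls use far richer rules.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source_ip: str
    destination_ip: str
    protocol: str


# Hypothetical rules: block all FTP traffic and one suspicious address.
BLOCKED_PROTOCOLS = {"FTP"}
BLOCKED_ADDRESSES = {"203.0.113.66"}


def allow(packet: Packet) -> bool:
    """Return True if the packet may pass, False if the filter should stop it."""
    if packet.protocol.upper() in BLOCKED_PROTOCOLS:
        return False
    if packet.source_ip in BLOCKED_ADDRESSES or packet.destination_ip in BLOCKED_ADDRESSES:
        return False
    return True


print(allow(Packet("192.168.1.10", "198.51.100.7", "HTTP")))  # prints True
print(allow(Packet("192.168.1.10", "198.51.100.7", "FTP")))   # prints False
print(allow(Packet("203.0.113.66", "192.168.1.10", "HTTP")))  # prints False
```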
Making links
Encryption is an important principle that is used extensively in network communications, but also in non-networking situations as well. The basic principles of encryption, as well as two examples of symmetric encryption ciphers, are discussed in Encryption in Chapter 5.
Symmetric encryption refers to encryption where the same key is used to encrypt and decrypt the data. Symmetric encryption can be secure; however, the need for both the sender and the receiver to have access to the same key creates a significant security risk.
In asymmetric encryption a different key is used to encrypt and to decrypt the data. Knowing the key used to encrypt the data does not allow the data to be decrypted. This works as follows:
✚ The intended recipient of some data generates a public key and a private key that are mathematically related. The public key is made publicly available, but the private key is not.
✚ The sender uses the public key to encrypt the data which cannot be decrypted without the private key. Therefore, only the intended recipient can decrypt the data and they are never required to share the private key.
Encryption Scrambling data so that it cannot be understood.
Symmetric encryption The same key is used to encrypt and to decrypt a message.
Key A value used to encrypt or decrypt an encrypted message.
Asymmetric encryption Different keys are used to encrypt and to decrypt a message.
Public key A key that is made public. Has a matching private key.
Private key A key that is kept secret. Has a matching public key.
Figure 9.7 Private/public key encryption (A encrypts the data using B’s public key; the data is sent; B decrypts the data using B’s private key)
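The public/private key idea can be demonstrated with a toy version of RSA, which is one well-known asymmetric scheme (the text above does not name a specific algorithm, so RSA is an assumption chosen for illustration). The primes used here are far too small to be secure and the function names are invented for this sketch.

```python
def make_key_pair(p: int = 61, q: int = 53, e: int = 17):
    """Generate a mathematically related public and private key from two primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)       # modular inverse of e (needs Python 3.8 or later)
    return (e, n), (d, n)     # (public key, private key)


def encrypt(message: int, public_key) -> int:
    """Anyone can encrypt using the recipient's public key."""
    e, n = public_key
    return pow(message, e, n)


def decrypt(ciphertext: int, private_key) -> int:
    """Only the holder of the private key can decrypt."""
    d, n = private_key
    return pow(ciphertext, d, n)


b_public, b_private = make_key_pair()   # B generates the keys and publishes b_public
ciphertext = encrypt(65, b_public)      # A encrypts the number 65 for B
print(ciphertext)                       # prints 2790
print(decrypt(ciphertext, b_private))   # prints 65
```

Note that B never shares the private key: A only ever needs the public key in order to send B an encrypted message.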
Digital signatures
A digital signature uses a checksum to ensure that a message has not been altered during transmission.
The original message is passed through a checksum algorithm to produce a digest. The digest is then encrypted with the sender’s private key which means that it can be decrypted by anyone with the sender’s public key.
The digest is included at the end of the original message before it is encrypted with the recipient’s public key and transmitted.
When the recipient receives the message it is decrypted with the recipient’s private key, and the digest can then be additionally decrypted using the sender’s public key.
The checksum algorithm can be applied to the decrypted message and compared to the decrypted digest. If the two do not match it is assumed that the message was altered in some way during transmission.
Digital signature A method of checking that an encrypted message has not been altered.
Checksum A value, calculated by an algorithm, based on the contents of the original data.
Digest A checksum value used in a digital signature.
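The signing and checking steps described above can be sketched in Python. This is a simplified illustration under several assumptions: SHA-256 from the standard library stands in for the checksum algorithm, the toy RSA key pair (17, 3233) and (2753, 3233) is far too small to be secure, the digest is reduced to fit that tiny key, and the extra step of encrypting the whole message with the recipient’s public key is left out.

```python
import hashlib

# Hypothetical toy key pair for the sender (not secure, for illustration only).
SENDER_PUBLIC = (17, 3233)
SENDER_PRIVATE = (2753, 3233)


def digest(message: bytes, n: int) -> int:
    """Checksum of the message, reduced so that it fits the toy key size."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, private_key) -> int:
    """Sender: encrypt the digest with the sender's private key."""
    d, n = private_key
    return pow(digest(message, n), d, n)


def verify(message: bytes, signature: int, public_key) -> bool:
    """Recipient: decrypt the digest with the sender's public key and compare."""
    e, n = public_key
    return pow(signature, e, n) == digest(message, n)


signature = sign(b"Pay Bob 10", SENDER_PRIVATE)
print(verify(b"Pay Bob 10", signature, SENDER_PUBLIC))  # prints True
```

If the message is changed after signing, the recalculated digest will almost certainly no longer match the decrypted one, so verify returns False.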
Figure 9.8 Private/public key encryption including a digital signature (panels: Sender, Recipient; steps shown: checksum algorithm, encrypt digest with sender’s private key, append to original message, encrypt with recipient’s public key, transmit, decrypt with recipient’s private key, separate message and digest, decrypt digest with sender’s public key, checksum algorithm, compare)
Making links
Checksum algorithms are used in a variety of situations to verify that data has not been damaged or altered, not just in networking. The topic of error checking, including a more detailed look at checksum algorithms, is discussed in Chapter 5.
Exam tip
It is very easy to mix up the terms digital certificate and digital signature. Remember that you cannot award yourself a certificate; that must be done by someone else (a certification authority). You can sign a document yourself though, and this method uses the encrypted checksum/digest.
Now test yourself
40 Who can issue a digital certificate?
41 What is the purpose of using a digital signature?
42 In a digital signature, why is the digest encrypted using the sender’s private key?
43 Identify at least six of the 10 steps involved in using a digital signature.
Answers available online
Some malware will attempt to delete or destroy data, others will capture data such as usernames and passwords which can be sent back to the malware’s creator.
Virus
✚ A virus is a small piece of malicious code which attaches itself to an existing program. The virus remains inert until the program it is attached to runs, at which point the malicious code is executed.
✚ A virus will typically add a copy of itself to another program and will often attempt to distribute itself to other devices by automatically sending emails with an infected attachment or infecting files in shared workspaces.
✚ The key feature of a virus is that the host program must be run in order to execute the malicious code.
Trojans
✚ A Trojan is a piece of malware that is disguised as a useful program. When the program is run then the malicious code is executed.
✚ Unlike viruses, Trojans do not attempt to replicate themselves into other programs. Trojans are typically spread through user interaction, for example downloading a program or opening an email attachment.
program and self-replicates when executed.
Trojan A malicious program that pretends to be a useful program, does not self-replicate.
Worm A malicious program that copies itself over a network.
Anti-virus Software that scans programs for malicious code.
Bug A mistake in a computer program that causes unexpected results.
Edge-case A problem that only occurs in an extreme setting.
Note
A Trojan gets its name from the story of the Trojan horse. A Greek carpenter built a
giant statue of a horse which was then filled with soldiers and left as a gift for the
Trojans who were at war with the Greeks. The Trojans brought the horse into their
fortified city and the soldiers were able to climb out at night and open the gates,
allowing the rest of the army in. This is also why the word Trojan is usually capitalised.
Worms
A worm functions in a manner similar to a virus in that it is self-replicating. However, worms do not need to attach themselves to a file in the way that a virus does, and instead spread over a network.
Once a worm is active it exploits vulnerabilities in the systems software to replicate and transmit its code to other devices on the network.
Protective measures
There are many ways to protect a system from malware.
✚ Firewalls can be used to monitor packets being sent across a network. They can identify unusual or suspicious communication and prevent packets from being sent or received.
✚ Anti-virus programs scan files and emails for malicious code. It is vital to keep anti-virus software up to date as new malicious code is always being created.
✚ Education for users can help prevent people from running suspicious files or downloading content from suspicious emails.
✚ Code quality is key. Bugs, errors, and edge-cases in software programs can all be exploited by malicious programmers.
Note
The detailed nature of specific techniques to avoid security flaws is beyond the scope of this topic; however, it is a tremendously important and growing industry in the field of computer science and software engineering.
44 What is meant by malware?
45 Name two types of malware that can copy themselves.
46 Name one type of malware that does not copy itself.
47 Which type of malware can copy itself over a network without needing other users to run it?
48 Identify four ways of protecting against malware.
Answers available online
This formatted data will then be dealt with at the transport layer, which is responsible for two main jobs:
✚ Checking that all packets have arrived and are in the correct order. To achieve this the transport layer adds a packet number to the original data.
✚ Identifying which application layer software should deal with the packets. To achieve this the transport layer adds a port number to the packet. The port number indicates which application layer protocol should be used when the packet is received.
The network layer deals with addressing and routing of the data packet, carrying out three main jobs.
✚ Adding the sender IP address and recipient IP address.
✚ Routing a packet to the next host.
✚ Adding error checking bits (when sending data) and checking for errors (when receiving data).
Exam tip
Always try to be as specific as possible. If asked about the function of the network layer, the response ‘adding an IP address’ is not detailed enough. ‘Adding the sender’s IP address’ is a much more specific answer.
Transport layer Adds a port number, packet number and error detection data.
Port number Extra data added to a packet that identifies what application layer protocol should be used to process the data.
Network layer Adds sender and receiver IP addresses.
Link layer Deals with the physical medium used for transferring the data.
TCP/IP stack The use of the TCP/IP layers to add header data which is then processed in reverse order.
The link layer is responsible for the physical transmission of the packet. This
differs between data sent over a wired network and a wireless network, for
example.
Once a packet has been sent from one device to another it is processed in the
reverse order. For this reason it is often referred to as the TCP/IP stack.
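The way each layer adds its own data when sending, and the way that data is removed in reverse order when receiving, can be sketched using a Python list as a stack. The header contents and function names are assumptions for illustration only; real headers contain many more fields.

```python
def encapsulate(data: str, headers: dict) -> list:
    """Sending: each layer in turn pushes its header on top of the data."""
    stack = [("application", data)]
    for layer in ["transport", "network", "link"]:
        stack.append((layer, headers[layer]))
    return stack


def decapsulate(stack: list) -> list:
    """Receiving: headers are popped in the reverse order (last in, first out)."""
    remaining = list(stack)      # work on a copy so the original is unchanged
    order = []
    while remaining:
        layer, _content = remaining.pop()
        order.append(layer)
    return order


# Hypothetical header values for one packet.
headers = {
    "transport": {"port": 80, "packet_number": 1},
    "network": {"sender_ip": "192.168.1.10", "recipient_ip": "107.162.140.19"},
    "link": {"medium": "wireless"},
}
packet = encapsulate("GET /index.html", headers)
print(decapsulate(packet))  # prints ['link', 'network', 'transport', 'application']
```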
Making links
A stack is a LIFO (last in, first out) data structure which is useful in a range of
situations. This, and other data structures, are explored in Component 1 and it is
expected that you will be familiar with using stacks in a programming context. For
more detail please refer to Stacks in Chapter 2.
If a packet has been sent to a wireless access point then the receiving device will remove the data relating to the link layer in order to check the destination address, before adding new link layer data required to send the packet over a wired part of the network to a router.
Note
There are several models for the TCP/IP stack, including
When the port number and IP address are both known, this is commonly
written using a colon as a separator; for example:
107.162.140.19:80
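As a minimal sketch (Python is assumed here, and `split_socket_address` is a hypothetical helper name rather than anything from the specification), the colon notation can be separated back into its IP address and port number:

```python
def split_socket_address(address: str) -> tuple[str, int]:
    """Split text of the form 'IP:port' into an IP address and a port number."""
    # Split on the last colon so that only the port is separated off.
    ip_text, port_text = address.rsplit(":", 1)
    return ip_text, int(port_text)


ip_address, port_number = split_socket_address("107.162.140.19:80")
print(ip_address)   # the network layer part of the address
print(port_number)  # the transport layer part of the address
```

The IP address identifies the device and the port number identifies which application layer protocol should process the data on that device.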
FTP
An FTP server manages access to a store of files and a client using FTP client
software can request access to those files. Permissions can be granted on
a file by file and folder by folder basis, allowing the client to potentially
download and/or upload files remotely.
FTP servers can also be configured to require a login, or to provide
anonymous access.
Exam tip
It is helpful to have had practical experience using FTP software, although
questions will not be asked in reference to specific examples of FTP programs.
HTTP/HTTPS
A web server functions in a similar way, storing the resources and layout
information for web pages. A web browser acts as a client, requesting
particular information which is generated and returned by the web server.
Once the data has been received the web browser renders the data in the
appropriate format on the screen.
Data that does not need to be secure can be sent using HTTP, which is
unencrypted. Data that does need to be secure is sent over HTTPS, which is
SMTP/POP3
An email server receives email messages and stores them securely until the
data is accessed. If an email client program is used, then the email messages
are downloaded to the client software. If a web browser is used then the
emails stay on the server but are transmitted as a web page.
New emails are composed on the client device and sent to the email server
which forwards the messages on to the intended recipient’s email server.
SMTP is the protocol used for sending emails from a device to an email server.
POP3 is a protocol used for receiving emails from the email server.
Note
There are several other protocols in common usage (for example, FTPS is the
secure equivalent of FTP); however, only those listed above are required for
the AQA specification.
SSH
SSH is an encrypted method for remotely managing a computer and is often
used to provide remote access to a server. The client is provided with a
command line interface which can be used to interact with the server.
SSH can be used to access data using other application layer protocols. For
instance, an SSH client can connect on port 80 to carry out web requests using
commands such as GET, or port 25 to carry out email requests.
Exam tip
It is helpful to have had practical experience using SSH to connect to a remote
computer and it is possible that questions will check your knowledge that SSH can
be used to carry out commands using other application layer protocols. However, it
is not necessary to be familiar with commands specifically relating to remote server
management.
Now test yourself
58 What protocol is used to remotely manage a computer?
59 What service is accessed using POP3 and SMTP?
60 What is FTP used for?
61 How can SSH be used to access data that relies on other protocols?
Answers available online
IP address structure
An IP address is a numeric addressing method that uses four 8-bit numbers,
each between 0 and 255, separated by full stops. An IP address is typically
Network identifier The first part of an IP address,
In order to separate devices on the network this can be further divided into
subnets, which restricts the range of IP addresses for that particular type
of device.
A software company’s network is going to be split so that there is one subnet
for admin purposes and one subnet for development work.
✚ Devices on the admin network will have an IP address between 192.168.0.0
and 192.168.191.255.
✚ Devices on the development network will have an IP address between
192.168.192.0 and 192.168.255.255.
Therefore, if a device has an IP address of 192.168.192.23, then we know that it
is on the development network.
192.168.192.23:
Subnet masking
Subnet masking is used to identify the network identifier in an IP address.
This is achieved by applying a logical AND between the IP address of any
device on the network and the subnet mask.
Subnet mask A 32-bit number used to isolate the network identifier in an IP
address.
Logical AND The operation of applying an AND function between the individual
bits in two binary numbers.
For a 192.168.x.x network this can be achieved using the subnet mask 255.255.0.0
on a device on the network, such as one with the address 192.168.0.3:
    11000000.10101000.00000000.00000011   192.168.0.3
AND 11111111.11111111.00000000.00000000   255.255.0.0
  = 11000000.10101000.00000000.00000000   192.168.0.0
This has the effect of cancelling out, or zeroing, the host identifier. With this
example it is fairly straightforward to see how this works using the decimal
representation of the IP address, though it is easier to work directly in the
binary format.
In the example above the first 16 bits are used for the network identifier and
the last 16 bits are used for the host identifier. This means that there could be
2^16 (65,536) possible devices on the network.
Exam tip
The most likely question on this topic is to be asked to write out a subnet mask
given the number of bits allocated to the network identifier and to calculate the
maximum number of devices that can be connected to the network at the same time.
In the following example the first 26 bits are used for the network identifier
and the last 6 bits are used for the host identifier. This means that there could
be 2^6 (64) possible devices on the network. For a device on this network with
the IP address 230.117.156.130 then:
    11100110.01110101.10011100.10000010   230.117.156.130
AND 11111111.11111111.11111111.11000000   255.255.255.192
  = 11100110.01110101.10011100.10000000   230.117.156.128
Making links
For more on logical AND see Chapter 6.
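The masking described in this section can be checked with a short sketch. It assumes Python and its standard `ipaddress` module; the function names `mask_from_prefix`, `network_identifier` and `host_address_count` are hypothetical and chosen only for illustration.

```python
import ipaddress


def mask_from_prefix(network_bits: int) -> ipaddress.IPv4Address:
    """Build a subnet mask whose first `network_bits` bits are 1s."""
    mask_value = (0xFFFFFFFF << (32 - network_bits)) & 0xFFFFFFFF
    return ipaddress.IPv4Address(mask_value)


def network_identifier(ip: str, mask: str) -> ipaddress.IPv4Address:
    """Apply a bitwise AND between an IP address and a subnet mask."""
    ip_value = int(ipaddress.IPv4Address(ip))
    mask_value = int(ipaddress.IPv4Address(mask))
    return ipaddress.IPv4Address(ip_value & mask_value)


def host_address_count(network_bits: int) -> int:
    """Number of values the host identifier can take: 2^(32 - network bits)."""
    return 2 ** (32 - network_bits)


# 16-bit network identifier, as in the 192.168.x.x example
print(network_identifier("192.168.0.3", "255.255.0.0"))
print(host_address_count(16))

# 26-bit network identifier, as in the 230.117.156.130 example
mask_26 = mask_from_prefix(26)
print(mask_26)
print(network_identifier("230.117.156.130", str(mask_26)))
print(host_address_count(26))
```

The AND operation keeps every bit of the address where the mask holds a 1 and zeroes every bit where the mask holds a 0, which is why only the host identifier is cleared.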
IP standards
The examples of IP addresses so far have all referred to IPv4 (version 4).
IPv4 was the default standard for IP addresses for a large number of years.
However the range of possible IP addresses for a 32 bit number is 2^32 (4.3
billion), and this has proven to be insufficient as more and more devices are
being connected to the internet.
As a result, IPv6 was introduced.
IPv4 Internet Protocol version 4; uses 32 bits to represent each address.
IPv6 Internet Protocol version 6; uses 128 bits to represent each address.
Note
The problem with the limited supply of IPv4 addresses was identified many years ago
and IPv6 was introduced back in 1995; however, the widespread implementation and
rollout has taken several decades and is still not complete.
Many older routers, switches and devices have hardware that is not capable of, or
configured for, IPv6 and will stop working when IPv4 is no longer used. Therefore both
standards are used on all new devices for both backwards and forwards compatibility.
While IPv4 uses 32 bits for each address, IPv6 uses 128 bits and is typically
represented in hexadecimal. This provides over 3.4 × 10^38 possible addresses.
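The two address-space sizes follow directly from the number of bits available. A quick check, assuming Python (the variable names are illustrative only):

```python
# Each extra bit doubles the number of possible addresses.
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

print(f"{ipv4_addresses:,}")    # IPv4 total, written out in full
print(f"{ipv6_addresses:.1e}")  # IPv6 total, in standard form
```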
Private IP addresses are only used within networks. These must be unique
within that network, but can be re-used in other networks.
Private IP address An address that can only be accessed within that network.
There are three main classifications for private IP addresses:
✚ The 10 range (10.0.0.0 – 10.255.255.255) with a subnet mask of 255.0.0.0
✚ The 172 range (172.16.0.0 – 172.31.255.255) with a subnet mask of 255.240.0.0
✚ The 192 range (192.168.0.0 – 192.168.255.255) with a subnet mask of 255.255.0.0
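The three private ranges can be expressed as network address and subnet mask pairs and tested in code. This is a sketch assuming Python's standard `ipaddress` module; `PRIVATE_RANGES` and `is_private` are hypothetical names used for illustration.

```python
import ipaddress

# Each range is written as "first address/subnet mask", matching the list above.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/255.0.0.0"),
    ipaddress.ip_network("172.16.0.0/255.240.0.0"),
    ipaddress.ip_network("192.168.0.0/255.255.0.0"),
]


def is_private(ip: str) -> bool:
    """Return True if the address falls inside one of the three private ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in PRIVATE_RANGES)


print(is_private("192.168.0.3"))      # inside the 192 range
print(is_private("172.31.255.255"))   # last address of the 172 range
print(is_private("172.32.0.0"))       # just outside the 172 range
print(is_private("230.117.156.130"))  # a public address
```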
Note
You are not expected to memorise all of the ranges for private IP addresses, though
it is useful to recognise that 192.168.*.* addresses are private as this is the default in
✚ The computer inside the network sends a request for a web page. The
router changes the sender’s IP address on the packets to 230.117.156.130 so
that the response will be returned to the router.
✚ When the response is received the router changes the destination IP
address for each packet to 192.168.0.3 and the computer inside the network
receives the packets.
Note
The form of NAT described here works well if only one computer inside the network
needs to communicate, but if two or more computers are requesting web pages at the
same time then the router will not know which device to send the responses to.
In this case dynamic NAT can be used, with the router using a pool of public IP
addresses, issuing a different IP address for each device so that the router knows
which private IP address to send each response to.
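The address rewriting performed by the router can be modelled in a few lines. This is a toy sketch in Python, not how a real router is implemented: the `Packet` and `DynamicNatRouter` classes are hypothetical, as are the second public address 230.117.156.131, the second private address 192.168.0.4 and the web server address 203.0.113.10.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    source_ip: str
    destination_ip: str
    payload: str


class DynamicNatRouter:
    """Toy model of dynamic NAT: one public IP is lent to each private IP."""

    def __init__(self, public_pool):
        self.free_public_ips = list(public_pool)
        self.private_to_public = {}
        self.public_to_private = {}

    def send_out(self, packet):
        """Replace the sender's private IP with a public IP from the pool."""
        private_ip = packet.source_ip
        if private_ip not in self.private_to_public:
            # Raises IndexError if the pool is empty; a real router would
            # need a policy for that case.
            public_ip = self.free_public_ips.pop(0)
            self.private_to_public[private_ip] = public_ip
            self.public_to_private[public_ip] = private_ip
        return Packet(self.private_to_public[private_ip],
                      packet.destination_ip, packet.payload)

    def receive(self, packet):
        """Replace the public destination IP with the matching private IP."""
        private_ip = self.public_to_private[packet.destination_ip]
        return Packet(packet.source_ip, private_ip, packet.payload)


router = DynamicNatRouter(["230.117.156.130", "230.117.156.131"])
request = router.send_out(Packet("192.168.0.3", "203.0.113.10", "GET /"))
response = router.receive(Packet("203.0.113.10", request.source_ip, "<html>"))
print(request.source_ip)        # public address seen by the web server
print(response.destination_ip)  # private address the response is delivered to
```

Because each private address is paired with its own public address, responses for two devices can no longer be confused, which is the problem the note above describes.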
Port forwarding
Port forwarding can be used in conjunction with network address translation
by inspecting port numbers to aid the process.
Port forwarding Using the port number of a packet to identify which private IP
address should be used.
When a device with a private IP address sends a request the router will use
NAT to change the sender’s IP address and can also use the port number in
order to help identify which device to send the response to.
If one device has sent a web request on port 80 and another has sent an email
request on port 25 then the router will know which device to send each response
to even though both responses are returned to the same public IP address.
Note
Port forwarding can also be used when multiple devices inside a network want to use
the same port number. A router can be configured so that all messages received on
a given port number, such as 4000, are forwarded to one private IP address, and all
messages received on another port number, such as 4010, are forwarded to another
private IP address. This is useful when hosting a server inside a private network.
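A router's port forwarding configuration is essentially a lookup table from port number to private IP address. The sketch below assumes Python; the port numbers 4000 and 4010 come from the note above, while the private addresses and the names `FORWARDING_RULES` and `forward` are hypothetical.

```python
# Port number on the router's public IP address -> private IP address to forward to.
FORWARDING_RULES = {
    4000: "192.168.0.3",
    4010: "192.168.0.4",
}


def forward(destination_port: int):
    """Return the private IP for a port, or None if no rule has been configured."""
    return FORWARDING_RULES.get(destination_port)


print(forward(4000))
print(forward(4010))
print(forward(5000))  # no rule, so the packet is not forwarded
```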
Client–server model
Client–server
In the client–server model the server controls access to a centralised resource.
✚ When a client wants to access a resource (such as a file or a web page) then
the client sends a request to the server.
✚ The server processes the request and returns an appropriate response (which
could be the requested resource, a login page, a refusal, or something else).
Server A device that controls access to a centralised resource.
Client A device that makes requests of a server.
In the case of many websites, rather than simply storing a series of HTML
files, the data that makes up the content of the website is stored in a database.
When a request to view a webpage is sent, the server queries the database
in order to generate the exact content on the webpage based on the current
values in its database. This means that websites can be automatically updated
to reflect live data (such as view counts, likes, new posts, etc.).
Websocket A communications protocol using TCP to create a full-duplex
connection.
The basic client–server model is relatively passive, with the server only
responding when a request is received from a client. In some situations,
Figure 9.11 (left) An XML document for storing details of users; (right) A JSON
document for storing details of users
Exam tip
You are not expected to write exact syntax using either JSON or XML, though you
should be able to recognise which is which. Remember that XML looks a lot like HTML
with its use of tags.
73 What protocol is used to create a full-duplex connection between the client and
the server?
74 State the REST calls for each of the following database commands.
a SELECT
b INSERT
c DELETE
d UPDATE
75 What two formats are typically used to send data between the server and the
web application?
76 Give two advantages for using JSON.
Summary
Communication
✚ Serial transmission involves sending one bit at a time down a single wire
✚ Parallel transmission involves sending bits simultaneously down multiple wires
✚ Parallel transmission is hampered by crosstalk and data skew
✚ Synchronous data transmission requires an additional timing signal to be transmitted
✚ In asynchronous data transmission, start, stop and parity bits are added to provide the receiver with timing data and error detection
✚ Baud rate refers to the number of state changes per second in a transmission
✚ The bit rate is the maximum number of bits that can be transferred per second
✚ The bit rate can be higher than the baud rate if each signal can represent more than one value
✚ The bandwidth describes the range of values that can be sent in a single signal
✚ The bit rate is directly proportional to the bandwidth
✚ Latency refers to the time taken for the first part of a transmission to reach its destination
Networking
✚ A protocol is a set of rules or standards
✚ A topology describes the physical or logical structure of a network
✚ In a star topology all devices are connected to a central switch
✚ In a bus topology all devices are connected to a central cable or bus
✚ Star topologies require more cabling, but cope better with high levels of traffic and are more secure
✚ A physical star topology can act as a logical bus network by using a hub which forwards all transmissions to all devices
✚ In client–server networking a server controls access to a centralised resource and a client requests access to that resource
✚ In peer-to-peer networking all devices have an equal status and can act as a client or server in any interaction
✚ Wi-Fi is a protocol to provide a wireless network over a small area using radio frequencies
✚ A wireless network adapter allows wireless devices to connect to a network
✚ A wireless network access point provides a connection point to the wired part of the network
✚ Wireless networks require encryption as any device in range can potentially pick up the signals
✚ Each wireless network has an SSID, which can be enabled for ease of access or disabled for security
✚ A wireless network can use a MAC address whitelist to block or allow individual devices from accessing that network
✚ CSMA/CA is used by wireless devices to prevent them from broadcasting at the same time as other devices on the network
✚ RTS/CTS involves sending a request to send and waiting for a clear to send response before transmitting
The internet
✚ The internet is a global connection of millions of networks in a mesh topology
✚ A router is a device for connecting networks together that use the same protocol
✚ A gateway is a device for connecting networks together that use different protocols
✚ Data is split into packets and additional data is added to a header before the packet is transmitted
✚ Packet switching means that each packet can be routed across a different route in order to avoid routes that are slow, busy or have failed
✚ A URL, or uniform resource locator, is a method of addressing an online resource and contains a protocol, a domain name and a path
✚ An IP address is a numeric address that uniquely addresses a device on a network
✚ Domain names are used in web browsers in preference to IP addresses as they are easier to remember, easier to type accurately and are more meaningful
✚ A hostname appears before the domain name and specifies the host, or device, which is being addressed; for example, in mail.awebsite.com and www.awebsite.com the hostnames are mail and www
✚ A fully qualified domain name (FQDN) must include both a hostname and a domain name
✚ DNS (domain name system) is used to look up domain names to find the corresponding IP address
✚ Internet registries assign IP addresses to networks in order to avoid accidental duplication
✚ A firewall can be a software or hardware service used to stop suspicious transmissions
✚ Packet filtering involves stopping packets based on their IP address or port number
✚ A hardware firewall can act as a proxy server if it sits between a private network and a public network by providing a layer of anonymity
✚ Stateful inspection involves examining the data content of a packet in order to decide whether it is suspicious
✚ Symmetric key encryption means using a cipher in which the same key is used to encrypt and decrypt a message
✚ Asymmetric key encryption means that a different key must be used to either encrypt or decrypt a message
✚ Private/public key encryption uses two mathematically related keys – one public and one private
✚ The sender encrypts the message using the receiver’s public key so that only the receiver can decrypt the message using their private key
✚ A digital certificate is issued by a certification authority in order to ensure that the public key is accurate and genuine
✚ A digital signature is a digest or checksum that is used to check that an encrypted message has not been altered in any way
✚ The digest is encrypted using the sender’s private key so that the receiver knows that it is genuine once decrypted using the sender’s public key
✚ Malware is a term meaning malicious software (or software that causes harm)
✚ A virus is a form of malware that attaches malicious code to an existing program and can self-replicate when executed
✚ A Trojan is a form of malware that pretends to be a helpful program but causes damage when executed. It is not self-replicating
✚ A worm is a form of malware that replicates itself across a network without attaching itself to another program
✚ Malware can be protected against using firewalls, anti-virus programs, improved user education and improved code quality
TCP/IP
✚ TCP/IP is a fundamental communication protocol, and is split into four layers
✚ The application layer deals with the formatting of data specific to an application; for example, HTTP
✚ The transport layer adds a packet number and port number to each packet so that the packets can be reassembled and the completed data passed to the correct application layer protocol
✚ The network layer adds the IP address of both the sender and the recipient as well as adding error checking bits
✚ The link layer is used to transmit the packet over a physical medium; for example, wired (ethernet) cable, Wi-Fi, and so on.
✚ The four layers of the TCP/IP stack can be remembered using the mnemonic All Teachers Need Llamas
✚ A socket is an endpoint for communication based on an IP address and a port number
✚ A MAC address is a unique physical address linked to an individual network interface card installed in a device
✚ Well known ports include 80 and 443 for web traffic, 25 and 110 for email, 21 for file transfer and 22 for shell access
✚ Application layer protocols include FTP, HTTP, HTTPS, POP3, SMTP and SSH
✚ FTP is used for file transfer
✚ SSH is used for remote management using a TCP connection to a remote port
✚ HTTP and HTTPS are used to transfer HTML from a web server to a web browser which it renders into text, images and other media
✚ POP3 and SMTP are used for email
✚ An IP address is split into a network identifier and a host identifier
✚ A subnet mask is used to identify the network identifier part of an IP address
✚ IPv4 is used throughout the AQA specification, but IPv6 is a newer format with a much wider range of possible addresses
✚ IP addresses in the 10, 172 and 192 range are only used inside private networks and are non-routable because they can be duplicated in other networks
✚ Public IP addresses are routable because they are
✚ The Websocket protocol allows a full-duplex connection between a web browser and a server over TCP
✚ A web browser can connect to a database using a
Exam practice
1 The number 82 is sent using asynchronous transmission at a rate of 30 megabits per second
a) Explain the difference between the baud rate and the bit rate of the data transmission. [2]
b) State the maximum bit rate of the transmission if the bandwidth is doubled. [1]
c) Explain why increasing the bandwidth of the transmission will not improve the latency. [2]
d) Copy and complete the table below to indicate the data that will be transferred, using even parity. [3]
Stop bit    Parity bit    Start bit
2 A network is configured using a physical star topology
a) Explain the operation of a star topology. [2]
b) Explain how a physical star topology can function as a logical bus topology. [2]
c) The network is to be extended to allow Wi-Fi connections. Identify two advantages for allowing wireless connections. [2]
d) Identify two pieces of hardware required to create a Wi-Fi network. [2]
e) Describe two methods of ensuring that a wireless network can be secured. [4]
f) CSMA/CA with RTS/CTS are used to reduce the number of collisions on the wireless network. Explain the steps involved in CSMA/CA and RTS/CTS. [5]
3 A private network is connected to the internet using a router
a) State the name of an alternative device that can be used to connect a private network to the internet, and the circumstances in which this device would be chosen. [2]
b) A user enters the URL for a webpage. Describe the process by which the web browser finds the IP address for the webpage as well as how it retrieves and displays the content of that webpage. [5]
c) The webpage is transmitted using a secure protocol. Justify the decision to use asymmetric rather than symmetric encryption. [2]
d) Explain the principles of operation of asymmetric encryption and the significance of using a digital certificate. [5]
4 A router connecting a private network to the internet uses the public IP address 17.43.0.61 and the private IP address 10.0.0.1. The private network contains several workstations, a file server, a DHCP server and a printer.
a) Why does the router have two IP addresses? [2]
b) Explain how, when a device connects to the network, it is possible to ensure that it is given a unique IP address. [2]
c) When an email request is sent to a public email server, the workstation sending the request initially has a non-routable address. Explain what this means and identify one concept that can be used to overcome this problem. [3]
d) A user on the network wishes to download a file from the file server and a socket is used. State what is meant by a socket and identify two pieces of information that will be required to connect to the file server. [3]
e) The network is to be subdivided into several subnets. Each subnet will use a 27-bit network identifier and will include one switch and one printer. Write the subnet mask that will be needed and state the maximum number of workstations that can be connected to one subnet. [2]
10 Fundamentals of databases
The single line connected to the left-hand box represents the ‘one’ and the
forked lines on the right represent the ‘many’. The surgery can have many
patients, but each patient only has one surgery.
Within the doctor’s surgery, it is however possible for a patient to see more
than one doctor. This is an example of a many-many relationship:
Doctor Patient
Figure 10.3 An E-R diagram showing a three table database using two one-many
relationships
one patient
It is quite common to be asked to draw in the relationships between entities,
and this includes both the direct relationships and the indirect relationships.
In this case we would draw in the many-many relationship between the
doctor and the patient.
wrong way round. Say to yourself ‘one doctor can have many appointments’
(or the equivalent for your question) and remember that the ‘many’ symbol
should attach to the ‘many’ entity.
Figure 10.4 An E-R diagram showing all of the relationships between three entities
(Doctor, Appointment and Patient)
E-R diagrams can also be written in a format that shows the attributes within
each table in brackets, for instance:
Doctor (DoctorID, FirstName, LastName, Specialism)
Patient (PatientID, FirstName, LastName, PhoneNumber,
DateOfBirth)
Appointment (ApptDate, ApptTime, DoctorID, PatientID,
Notes)
The key fields are traditionally shown using an underline. (For an explanation
of key fields, see the next section.)
Relational databases
When databases are created, each entity in a data model is represented in a
database as a table.
Table The representation of
OrderID  FirstName  LastName  Email           ItemOrdered  Price
1        Dave       Smith     dave@smith.com  Motherboard  £89.99
2        Dave       Smith     dave@smith.com  CPU          £132.99
3        Dave       Smith     dave@smith.com  RAM          £79.99
4        Dave       Smith     dave@smith.com  GPU          £219.99
Record The data about one item from a table.
Flat file database A database made up of one table.
Relational database A database made up of two or more related tables.
This could be split into two tables:
Table: Customers
CustomerID FirstName LastName Email
1 Dave Smith dave@smith.com
Table: Orders
OrderID CustomerID ItemOrdered Price
1 1 Motherboard £89.99
2 1 CPU £132.99
3 1 RAM £79.99
4 1 GPU £219.99
This reduces the need for data entry, for data storage and reduces data
duplication. It also means that any changes to a customer’s details only need
to be made once, reducing the risk of inconsistencies in the data.
✚ Each entity is made up of attributes, or characteristics, of that type of
object. Attributes are represented in tables as fields.
✚ Each table is populated with records, in which each record describes one
item in that table.
✚ Each table requires a primary key, which is an attribute with values that
uniquely identify each record in that table. This is important to avoid
ambiguity. There cannot be any duplicate values in a primary key field.
✚ A primary key can be based on a single attribute or can be composed of
several attributes combined. This is known as a composite primary key, or
a composite key for short.
✚ A foreign key is an attribute in a table which matches the primary key in a
related table. This creates the link between the two tables.
In the Customers table above, the CustomerID is a primary key.
In the Orders table, OrderID is a primary key and the CustomerID is a
foreign key.
Primary key An attribute used to uniquely identify a record.
Composite primary key A primary key made up of two or more attributes.
Foreign key An attribute in one table that links to the primary key in a related
table.
Exam tip
AQA sometimes refer to database tables as ‘relations’ and sometimes as
‘entities’. Be careful not to confuse the terms ‘relation’ and ‘relationship’.
Now test yourself
The rules around making sure that a database design is efficient are referred
to as normalisation.
✚ Normalisation ensures that there is less redundancy and less
inconsistency in the data.
✚ A poorly designed database might result in data being stored twice because
the same attributes exist in different tables, or because different records
include the same data (for example, storing the address for a customer
every time they make a purchase).
✚ A poorly designed database might result in erroneous data because
duplicate data has been incorrectly updated or deleted (for example, if
the address for a customer is stored every time they make a purchase but
needs to be changed then it is possible to miss one or more records with
the old address).
✚ A fully normalised database is atomic. Each table, or group of attributes,
cannot be broken down any further and so there are no repeating groups
of attributes.
For example:
✚ a ‘name’ attribute could be broken down into ‘first name’ and ‘last
name’
✚ an address attribute could be broken down into ‘house number’, ‘street’,
‘town’, ‘county’ and ‘postcode’.
✚ A fully normalised database has no partial dependencies. A partial
dependency is when an attribute depends on part of a composite key but
not the whole composite key (note, this point is only relevant to tables with
composite keys).
For example, a table for an appointment with a doctor uses a composite
key:
appointment(date, time, doctorName, patientID, notes, doctorSpecialism)
The doctorSpecialism attribute depends on the doctorName, but not on
the rest of the attributes in the composite primary key.
The solution is to create a separate doctor entity:
doctor(doctorID, doctorName, doctorSpecialism)
appointment(date, time, doctorID, patientID, notes)
✚ A fully normalised database has no non-key dependencies. The value of
each attribute must only depend on the key. If the value of an attribute can
be determined using any other attribute, then it is not independent. If this
is the case, then the table normally needs splitting into two further linked
tables.
For example, an address entity is designed as follows:
address(addressID, houseNumber, street, town, county, postcode)
Two neighbours have a different house number, but the same street, town,
county and postcode. The attributes for street, town and county can be
determined from the postcode as well as the primary key addressID, as
everyone on that street has the same postcode.
To resolve this, it can then be split into two tables:
address(addressID, houseNumber, postcode)
postcode(postcode, street, town, county)
Note
There are several normalisation rules, and the AQA specification considers
these up to and including the third normal form. It is not necessary to know
which rule refers to which specific level of normalisation.
Normalisation Structuring a database in a way designed to reduce data
redundancy and improve integrity.
Redundancy Not needed, for example data that is unnecessarily stored twice.
Inconsistency The result is not always the same.
Third normal form A database in which all data is atomic, there are no partial
dependencies and no non-key dependencies.
Exam tip
If you are asked what it means for a database to be fully normalised remember
that data must be atomic, and that all of the attributes in the table depend on
the whole primary key (including all parts of a composite key, if the primary
key is composite) and no other attributes.
Now test yourself
8 What are two advantages of normalising a database?
9 What does the term atomic mean?
10 What are two other requirements for a fully normalised database?
Answers available online
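The effect of splitting the address entity into address and postcode tables can be demonstrated with a small sketch. It assumes Python with the standard `sqlite3` module; the postcode, street, town and county values are invented sample data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE postcode (
    postcode VARCHAR(8) PRIMARY KEY,
    street   VARCHAR(40),
    town     VARCHAR(40),
    county   VARCHAR(40)
);
CREATE TABLE address (
    addressID   INTEGER PRIMARY KEY,
    houseNumber INTEGER,
    postcode    VARCHAR(8) REFERENCES postcode(postcode)
);
INSERT INTO postcode VALUES ('AB1 2CD', 'High Street', 'Exampleton', 'Exampleshire');
INSERT INTO address VALUES (1, 12, 'AB1 2CD');
INSERT INTO address VALUES (2, 14, 'AB1 2CD');
""")

# The street name is stored once, so a single update corrects it for both neighbours.
db.execute("UPDATE postcode SET street = 'High Road' WHERE postcode = 'AB1 2CD'")

neighbours = db.execute("""
    SELECT address.houseNumber, postcode.street
    FROM address, postcode
    WHERE address.postcode = postcode.postcode
    ORDER BY address.houseNumber
""").fetchall()
print(neighbours)
```

In the unnormalised design the same update would have to be repeated for every address record, with the risk of missing one and leaving inconsistent data.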
There are many possible data types that can be used in SQL. Any valid type
will be accepted but, for numeric types, it is important to consider whether an
integer or non-integer type should be used.
Example data          Appropriate datatypes   Notes
fixed length text     CHAR(10)                Each entry must be exactly the length specified.
variable length text  VARCHAR(10)             Each entry can be any length up to the specified length.
whole number          INT                     Should only be used for numbers that can never be fractional.
fractional number     FLOAT, REAL, CURRENCY   FLOAT and REAL are both used for fractional numbers – they just
                                              have different levels of precision. CURRENCY should only be
                                              used for financial data.
date or time          DATE, TIME, DATETIME    DATETIME stores both a date and a time.
yes or no             BOOLEAN, TINYINT        A Boolean is ultimately stored as a 0 or 1 and in some database
                                              systems will be automatically converted to a TINYINT (an
                                              integer type that is one byte in size).
Now test yourself
11 Give two SQL data types that could be used to store a fractional number.
12 Suggest three SQL data types suitable for a date and time.
13 Give one way of declaring a customer ID as a primary key in SQL.
Answers available online
Retrieve (SELECT)
To retrieve data from a query, we use three SQL commands:
SELECT <fields>
FROM <table>
WHERE <condition>
The first two are mandatory but WHERE is optional. These can also be written
in one line:
SELECT ProductName, Price FROM Products WHERE Price < 100
An asterisk (*) can be used as a wildcard in order to select all fields:
SELECT * FROM Products WHERE Price < 100
It is possible to use logical operations such as AND, OR and NOT to further
refine queries:
SELECT * FROM Products WHERE Price < 100 AND Type = "Canned"
The results from queries can be further improved by sorting the results using
the ORDER BY command:
SELECT * FROM Products WHERE Price < 100 ORDER BY Price ASC
SELECT * FROM Products WHERE Price < 100 ORDER BY Price DESC
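The queries above can be tried against a small sample table. This sketch assumes Python with the standard `sqlite3` module; the product names, prices and types are invented, and text values are written with single quotes, which SQLite accepts in every configuration.

```python
import sqlite3

catalogue = sqlite3.connect(":memory:")
catalogue.execute(
    "CREATE TABLE Products (ProductName VARCHAR(30), Price REAL, Type VARCHAR(20))"
)
catalogue.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [
        ("Baked beans", 0.85, "Canned"),
        ("Coffee machine", 149.00, "Electrical"),
        ("Tinned tomatoes", 0.60, "Canned"),
        ("Kettle", 24.99, "Electrical"),
    ],
)

# Named fields, a condition, and ascending order.
cheapest_first = catalogue.execute(
    "SELECT ProductName, Price FROM Products WHERE Price < 100 ORDER BY Price ASC"
).fetchall()

# Wildcard, two conditions joined with AND, and descending order.
canned_dearest_first = catalogue.execute(
    "SELECT * FROM Products WHERE Price < 100 AND Type = 'Canned' ORDER BY Price DESC"
).fetchall()

print(cheapest_first)
print(canned_dearest_first)
```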
✚ It is important to add all tables used in the FROM field, but also important
Exam tip
to exclude any tables in the database not relevant to this query.
✚ In this case there are two one-many links between the tables, and so When writing complex
the foreign keys and their corresponding primary keys (DoctorID and queries, always work
Create (INSERT)
To insert data into a database it is easiest to insert an entire record in one go.
In order to do this, the data must be presented in the same order as the fields
appear in the table:
INSERT INTO Appointment VALUES
("2023-03-27","13:50",113,26850,"Routine checkup")
It is possible to enter data by specifying the fields in a different order. It is also
possible to exclude some fields altogether, if they are not required:
INSERT INTO Appointment(DoctorID, PatientID, ApptTime, ApptDate)
VALUES (127,39254,"09:30","2023-04-03")
It is generally easiest to use the first format in exam questions.
Note that dates and times are always entered using speech marks.
✚ Dates are formatted “YYYY-MM-DD”; for example, "2023-01-17".
✚ Times are formatted “HH:MM:SS” or “HH:MM”; for example, "09:17:02".
✚ The datetime format uses both, with the date first and a space before the
time; for example, "2023-01-17 09:17:02".
Note that Boolean values can be entered as TRUE, FALSE, 1, or 0.
Exam tips
Don’t forget to include the key word VALUES when inserting data, so that the
system knows you are referring to data values and not field names.
Always remember to include quotation marks around date and time values on
database questions, but never around Boolean values.
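Both INSERT formats can be checked with a short sqlite3 sketch. The field order of the Appointment table, and the name of its last field (Notes), are assumptions chosen to fit the two statements above; the real table definition is not shown in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Field order and the Notes field name are assumptions made for this example.
conn.execute(
    "CREATE TABLE Appointment (ApptDate DATE, ApptTime TIME, "
    "DoctorID INT, PatientID INT, Notes VARCHAR(50))"
)

# Format 1: a whole record, values in the same order as the fields in the table.
conn.execute(
    "INSERT INTO Appointment VALUES "
    "('2023-03-27', '13:50', 113, 26850, 'Routine checkup')"
)

# Format 2: named fields in a different order; the omitted field is left empty (NULL).
conn.execute(
    "INSERT INTO Appointment (DoctorID, PatientID, ApptTime, ApptDate) "
    "VALUES (127, 39254, '09:30', '2023-04-03')"
)

rows = conn.execute("SELECT * FROM Appointment ORDER BY ApptDate").fetchall()
for row in rows:
    print(row)
```

The second record comes back with None in the final position, which is how Python shows a field that was left out of the INSERT.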
all records from a table.
DELETE FROM Doctor
WHERE DoctorID = 128
To delete all records, no WHERE clause is needed.
Though presented in a different order here, the four main types of database
manipulation can be remembered using the acronym CRUD. This stands for:
✚ create
✚ retrieve
✚ update
✚ delete.
update query?
24 What are the two main clauses in a delete query?
25 How can you delete all data from a table?
Answers available online
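The difference between deleting with and without a WHERE clause can be seen in a small sqlite3 sketch. The Doctor table here is hypothetical: only DoctorID appears in the text, and the Surname field and all three rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Doctor table; the Surname field and the rows are invented for illustration.
conn.execute("CREATE TABLE Doctor (DoctorID INT PRIMARY KEY, Surname VARCHAR(20))")
conn.executemany(
    "INSERT INTO Doctor VALUES (?, ?)",
    [(127, "Patel"), (128, "Jones"), (129, "Okoro")],
)

# With a WHERE clause only the matching record is removed.
conn.execute("DELETE FROM Doctor WHERE DoctorID = 128")
after_one = conn.execute("SELECT DoctorID FROM Doctor ORDER BY DoctorID").fetchall()
print(after_one)

# With no WHERE clause every record is removed, but the empty table still exists.
conn.execute("DELETE FROM Doctor")
after_all = conn.execute("SELECT COUNT(*) FROM Doctor").fetchone()[0]
print(after_all)
```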
Making links
Databases are often run on servers, with clients accessing databases remotely. To
achieve this over a web interface, a RESTful API can be used, which maps HTTP
requests to the equivalent SQL commands. RESTful APIs and web-based database
access are discussed in more detail in Chapter 9.
Revision activity
Using the W3Schools website you can create your own database tables using SQL
syntax.
Go to www.w3schools.com, scroll down to Learn SQL, and then click the Try It Yourself
button.
1 Use the command DROP TABLE <tablename> to delete the existing tables.
2 Choose a past paper question and create the tables using SQL code.
3 Make up some suitable data and add this to the database using an INSERT query.
4 Create SELECT queries, including complex queries, according to the questions on
the paper.
5 Create UPDATE and DELETE queries to alter the data in the database, running
SELECT queries after each one to check the results.
Client–server databases
In a client–server database system it is possible for two clients to attempt to
access, or alter, the data in a database at the same time. This is referred to as
concurrent access.
Concurrent: At the same time.
If two or more clients attempt to alter data concurrently then it is possible
that the integrity of the database will be compromised.
For example, a bank balance is currently £1000.
✚ A payment of £10 is made into the account and so the balance is checked,
£10 added to the figure and the new balance of £1010 is written to the
account. The write process was slightly slow, however.
✚ At the same time, another transaction took place in which the account
holder spent £800. This processing was slightly quicker and the balance
was reduced to £200; however, the first transaction had not yet completed
and, when it did, the balance was overwritten as £1010.
There are several possible methods to preserve the integrity of the data.
✚ Record locks: When a record is initially accessed it is locked and cannot
be accessed again until the lock has been removed. In the case above the
record for that customer would be locked by the first transaction until it
was complete. Only then could the second transaction be processed.
✚ Serialisation: The database system manages the transactions so that only
one can be carried out at a time. As both transactions were received at the
same time the second transaction would be held in a queue momentarily
until the first transaction had completed.
✚ Timestamp ordering: Each time access to the database is made, the
timestamp is recorded. In the above example, the second transaction
would change the timestamp at which the balance was last changed.
When the first transaction attempts to complete it would see that the
timestamp had changed and so would start again.
✚ Commitment ordering: Transactions are arranged into an order that avoids
potential conflicts, using an algorithm within the database system.
Now test yourself
26 What does concurrent mean?
27 Why is concurrent access a potential problem?
28 How can record locks prevent problems with concurrent access?
29 Identify two other methods of preserving data integrity against concurrent
access issues.
Answers available online
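The bank balance example from this section can be replayed in Python. This is a simplified sketch rather than how a real database system works: the first half steps through the two transactions by hand in the order described, and the second half uses a threading.Lock as a stand-in for a record lock. The variable and function names are invented for the example.

```python
import threading

balance = 1000

# Lost update, replayed step by step in the order described in the bank example.
read_by_payment = balance            # the payment reads 1000
read_by_spend = balance              # the spend also reads 1000
balance = read_by_spend - 800        # the quicker spend writes 200 first
balance = read_by_payment + 10       # the slow payment overwrites it with 1010
lost_update_result = balance
print(lost_update_result)

# Record lock: each transaction holds the lock for its whole read-modify-write.
balance = 1000
record_lock = threading.Lock()

def apply_change(amount):
    global balance
    with record_lock:                # a second transaction waits here until released
        current = balance
        balance = current + amount

threads = [threading.Thread(target=apply_change, args=(a,)) for a in (10, -800)]
for t in threads:
    t.start()
for t in threads:
    t.join()
locked_result = balance
print(locked_result)
```

Without any protection the £800 spend is lost; with the lock, both changes are applied whichever transaction happens to run first.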
Summary
Conceptual data modelling and E-R modelling
✚ Entity-Relationship diagrams show the relationships between entities as:
one-to-one, one-to-many or many-to-many
✚ Entities contain attributes which each describe a characteristic of that entity
✚ Underlining can be used to identify the key fields
Relational databases
✚ Relational databases are designed to represent data using multiple related
entities
✚ Data is stored in records, where each record describes one object of that type
✚ A primary key is an attribute which must have a unique value in order to
identify a record
✚ A composite primary key is a primary key made up of two or more attributes
which must have a unique combination
✚ A foreign key is used to reference a primary key from another table in order
to create a relationship between those entities
Database design and normalisation
✚ Normalisation is the technique of ensuring that a database design reduces
data redundancy and improves data integrity
✚ In a database normalised to the third normal form, data should be atomic,
data should have no partial-key dependencies and should have no non-key
dependencies
SQL
✚ SQL commands are used to create and manipulate database tables
✚ Tables are created using the command CREATE TABLE ( … )
✚ Fields are declared by stating their name, followed by their data type
✚ Text fields are declared using the data types CHAR(n) or VARCHAR(n), where
n refers to the fixed (for CHAR) or maximum (for VARCHAR) length of the text
✚ Dates and times are declared using the data types DATE, TIME, or DATETIME
✚ Primary keys are declared using the command PRIMARY KEY (<fieldname>)
✚ Foreign keys are declared using the command FOREIGN KEY (<foreignkey>)
REFERENCES <table>(<primarykey>)
✚ The four main database manipulation commands can be remembered using
the acronym CRUD
✚ Creating data is done using an INSERT query:
INSERT INTO <table> VALUES (…,…,…)
Check your understanding and progress at www.hoddereducation.co.uk/myrevisionnotesdownloads
✚ Retrieving data is done using a SELECT query:
SELECT <fields> FROM <table> WHERE <condition>
✚ Updating data is done using an UPDATE query:
UPDATE <table> SET <field> = <value> WHERE <condition>
✚ Deleting data is done using a DELETE query:
DELETE FROM <table> WHERE <condition>
✚ Complex queries can retrieve data from more than one table by including
additional conditions linking the primary and foreign keys; for example,
AND <table>.<foreignkey> = <table>.<primarykey>
✚ Data can be sorted once retrieved by adding an extra clause
ORDER BY <fieldname> ASC or ORDER BY <fieldname> DESC
✚ Dates and times are written using the format "YYYY-MM-DD HH:MM:SS"
✚ Boolean values are written as either TRUE, FALSE, 1, or 0
Client server databases
✚ In a client–server database system it is possible for two or more clients to
attempt to access a record concurrently, threatening the integrity of the database
✚ This can be avoided using record locks, serialisation, timestamp ordering or
commitment ordering.
Exam practice
1 A community cinema with a single screen wishes to use a database to store
the details of film showings and ticket bookings.
✚ Each film is identified with a unique number and the film’s title, certificate
and running length (in minutes) are recorded.
✚ Each showing is identified with a combination of the film’s ID, date of
showing, and time of showing.
✚ Each customer is identified with a unique code made up from the first three
letters of their first name and the first five letters of their surname. The
customer’s first name, last name and email address are stored, as well as
whether or not they are a member of the cinema’s loyalty scheme.
✚ Each ticket is identified with a unique number and the seat number is stored.
a) The entity-relationship diagram below is incomplete.
Draw on three additional relationships. [3]
[Entity-relationship diagram showing four entities: Film, Showing, Customer, Ticket]
b) The Film, Showing and Customer relations have already been defined.
Film(FilmID, Title, Certificate, Length)
Showing(FilmID, Date, StartTime)
Customer(CustomerID, FirstName, LastName, Email, LoyaltyMember)
State the type of primary key used in the Showing table. [1]
c) Design the remaining relation, underlining the attribute(s) for the primary
key. [2]
d) The database should be fully normalised. Define what it means for a
database to be fully normalised. [2]
e) Write the SQL code needed to create the Film table. [3]
f) Write the SQL code needed to add a new customer named Nicola Tandy.
Their email address is nicolatandy@hotmail.com and they are not currently a
member of the loyalty scheme. [2]
g) A number of showings have had the wrong start time recorded. Write the
SQL code needed to change all films scheduled to start before 9am so that they
will now start at 9:30am. [3]
h) The cinema has cancelled a Halloween special. Write the SQL code needed
to delete all showings for the 30 October 2023. [2]
i) Write an SQL query to show a list of all the names and email addresses for
loyalty scheme members. [3]
j) Write an SQL query to show the title and start times for all 18 certificate
films on the 23 May 2023. [5]
k) The owners of the community cinema are considering splitting the building
into several screens, each with a different number of seats. Describe the
changes that would need to be made to the database in order to accommodate
this, ensuring that tickets could only be booked for seats that exist, and are
available. You do not need to provide any SQL code. [4]
l) Customers can book tickets in the cinema, and also online, allowing
concurrent access to occur. Describe one problem that could occur. [2]
m) Identify two possible methods for avoiding issues caused by concurrent
access. [2]
11 Big Data
Big data
Defining Big Data
Modern computer systems can be required to collect and process a huge
amount of data that lacks structure. This is referred to as Big Data.
Simply having a lot of data doesn’t necessarily mean that this is Big Data.
There are three characteristics that are used to define Big Data: volume,
velocity and variety.
Volume
This refers to an amount of data so large that it cannot fit into a single server.
Such a large volume of data must be processed across multiple servers
simultaneously. This has an impact because relational databases do not work
well across multiple devices. Therefore, even structured data becomes hard to
analyse and process if there is a lot of it.
Velocity
This refers to the rate at which data is generated or collected – for example
social media platforms receive hundreds of millions of digital images per day.
A high velocity of data means that one server would be unable to cope with
processing the data, particularly if the processing relies on data currently
stored on another server.
Variety
This refers to the unstructured format of data and the different forms of data
being received or captured, including images, video, audio, text and numeric
data.
The variety of data means that it will not fit into a row-and-column format
that is required for processing in a traditional database table.
These three issues combined make it difficult to store data, retrieve data and
process data on demand.
Machine learning techniques are often used to process Big Data, as they allow
the system to identify patterns in the data which make it easier to extract
meaningful information.
Typical sources of Big Data include:
✚ systems with networked sensors
✚ video surveillance systems
✚ social media platforms.
Big Data: Data that cannot easily be stored, retrieved, and processed because it
is too varied, too large or acquired too rapidly.
Volume: The quantity of data to be stored, typically too much to fit on a single
server.
Velocity: The rate at which data is generated or collected.
Variety: Data in many forms; that is, unstructured.
Exam tip
If you are asked what constitutes Big Data, remember the three Vs of volume,
velocity and variety, but ensure that you explain your answers. Simply
recalling the word will not be sufficient to show understanding.
Now test yourself
1 What are the three Vs of Big Data?
2 Why is it a problem if data is captured by the system in a wide range of forms?
3 Suggest one source of Big Data.
Answers available online
distributed programming: immutable data structures, statelessness, and
higher-order functions.
✚ Immutable data structures: Rather than changing data and re-assigning a
new value to a variable, data is not changed in functional programming.
Instead, the data is processed by a series of functions, and the result is
returned, leaving the original data unaltered.
✚ Statelessness: The functions in a functional program don’t change
depending on another factor elsewhere in the program. This means
that the state of another part of the system will not impact the
process being applied. This is important for distributed processing as
it may not be possible to check the states of other parts of
the system.
✚ Higher-order functions: Higher-order functions can take another function
as an argument. This means that functions can be chained together to
produce a result rather than having to store the intermediate values in
additional variables.
Immutable data structures: Data structures that cannot be changed.
Statelessness: A system in which the processing does not depend on the state of
another part of the program.
Higher-order function: A function that can take another function as a parameter.
Now test yourself
4 What is meant by distributed processing?
5 What type of programming is typically used in distributed processing?
6 What are the three main features of that type of programming?
Answers available online
Making links
Functional programming is a different paradigm, or style of programming, to
procedural and to object-oriented programming. Your ability to understand and to
write functional programs is assessed in Component 2 and this topic is explored in
much more detail in Chapter 12.
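The three features described in this section can be illustrated in Python, although Python is not a purely functional language. The sketch below uses invented names (readings, double, apply_to_all) and a tuple, which Python does not allow to be changed once created.

```python
from functools import reduce

readings = (3, 9, 25, 10)            # a tuple is immutable in Python

def double(x):
    # Stateless: the result depends only on the argument, never on outside variables.
    return x * 2

def apply_to_all(func, values):
    # Higher-order: takes another function as an argument.
    return tuple(func(v) for v in values)

doubled = apply_to_all(double, readings)
# Functions are chained; no existing data is altered along the way.
total = reduce(lambda acc, v: acc + v, doubled, 0)

print(readings)   # the original data is unaltered
print(doubled)
print(total)
```

Because double never reads or changes anything outside itself, each item could in principle be processed on a different server without the servers needing to share any state.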
[Figure: graph schema. Object nodes: Teacher: Andrea Pla; Student: Dave Smith; Student: Nicola Tandy; Student: Freya Smith. Data nodes: Teaching Room: CP02; Tutor Group: 12AP. Edge labels: Teaches (twice), Siblings.]
Figure 11.1 A graph schema with nodes (the shapes) and edges (the lines) showing
facts about staff and students at a college
In this example:
✚ the teacher and the students are objects and so they are shown in ovals
✚ each object has some data that relates to that person, which is shown in
rectangles
✚ dashed edges are used to connect the data to the person (for example,
Freya is in tutor group 12AP)
✚ solid edges are used to connect the objects (in this case, people)
✚ sometimes, no arrow is shown (for example, Freya and Dave are siblings)
because this is not a one-way relationship
✚ sometimes, an arrow is shown because Andrea teaches Dave and Nicola
(they don’t teach Andrea)
✚ for all solid edges, a label is added to describe the relationship.
Now test yourself
7 In a graph schema, what is represented by a rectangular node?
8 What is represented by an oval node?
9 What kind of edge should connect an item of data to an object?
10 Using the figure above, state three things you know about Dave Smith.
Answers available online
Summary
✚ Big Data refers to data that does not fit into standard data structures
✚ Big Data can be described as data that has one or all of the following
characteristics: volume, velocity, and variety
✚ Volume: data can be classified as Big Data if the volume is so big that it will not fit
on one server
✚ Velocity: data can be classified as Big Data if it is being generated or captured at an
extremely high rate
✚ Variety: data can be classified as Big Data if it is unstructured, and hence cannot
be represented using the table structures within relational databases
✚ Machine learning techniques are often used to identify patterns and to extract
meaningful information from Big Data
✚ Distributed processing is often required in order to process the large quantities of
data across multiple servers
✚ Functional programming is often used as the programming paradigm of choice for
processing Big Data because it lends itself to distributed processing
✚ Functional programming supports immutable data structures, statelessness and
higher-order functions
✚ Big Data can be represented using a fact-based model
✚ A graph schema is a graphical representation of a fact-based model
✚ Objects are represented using an oval node
✚ Items of data are represented using a rectangular node
✚ Connections between objects are represented using a solid edge, which can be
directional
✚ Connections between data and objects are represented using a dashed edge
Exam practice
1 Describe three characteristics that might result in a set of data being
classified as Big Data. [3]
2 Explain why functional programming is a suitable method to use for
processing Big Data. [2]
3 This is a graph schema for a fact-based model:
[Figure: graph schema. Nodes: Vet: Andy Frey; Vet: Rahal Meyrick; Patient: Colin; Patient: Shadow; Age: 27. Edge label: Treated.]
Copy and complete the graph schema, adding the following information. [4]
✚ Andy Frey is 32 years old.
✚ Rahal Meyrick treated a cat called Shadow. The cat is black.
✚ Both vets are employed by a vet surgery called Soucek’s Pets, based in Crewe.
✚ The email address for the vet surgery is info@soucekspets.co.uk.
12 Functional programming
Functional programming paradigm
Function type
A function is a rule that takes a value from a set of inputs (A) and assigns it to
an output that is contained within set B.
For example, a function f that takes a natural number as an input, and
produces an output which is double that input, can be written as
follows:
f: {0,1,2,3,4,…} → {0,2,4,6,8,…}
The domain is the set from which all of the input values are chosen.
The co-domain is the set from which all of the output values are chosen.
It is not necessary for all values in the co-domain to be produced as outputs.
For instance, in this example the outputs never include the numbers 1,
5 and 9. However, all of the outputs are natural numbers.
The domain and co-domain are always subsets of objects of a given data
type (for example, integers, rational numbers, characters, and so on). In the
function above, the domain is the set of natural numbers (ℕ).
The co-domain is also the set of natural numbers (ℕ), although not every
value from that set will be produced.
A function f has a function type:
f: A → B
A is the argument type and B is the result type. They refer to the data types
from which the domain and co-domain are taken. For instance, for the
example above where the domain and co-domain are taken from the set of
natural numbers (integers), we can say:
f: ℕ → ℕ
or
f: integer → integer
However, this notation only describes the data type of the domain and co-domain
and does not indicate the purpose or effect of the function.
Function: A rule that takes one or more values as an input and produces an
output.
Set: A collection of objects.
Domain: A set from which all input values are chosen.
Co-domain: A set from which all output values are chosen.
Making links
When describing the domain and co-domain it is important to be familiar with
the main sets of ℕ, ℤ, ℚ and ℝ, as well as the principles of subsets. The sets are
described in Chapter 5. The topic of subsets and their mathematical descriptions
is discussed in Chapter 4.
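Type hints in Python give a rough equivalent of a function type. In the sketch below, f is the doubling function from this section, while g is an invented second function included only to show that two functions can share the type integer → integer and still behave differently.

```python
def f(n: int) -> int:
    """f: integer -> integer. The type says nothing about what f actually does."""
    return 2 * n

def g(n: int) -> int:
    """g has exactly the same function type as f but a different effect."""
    return n + 1

domain_sample = [0, 1, 2, 3, 4]
outputs = [f(n) for n in domain_sample]
print(outputs)                       # every output is a natural number
print(1 in outputs, 5 in outputs)    # some natural numbers are never produced
print([g(n) for n in domain_sample])
```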
First-class object
A first-class object is an object that can:
✚ appear in expressions
✚ be assigned to a variable
✚ be passed as an argument
✚ be returned from a function.
First-class object: An object that can be passed as an argument or returned
from a function.
Here, the value of StartNum has been passed as an argument and used in an
expression. The value resulting from that expression has been assigned to a
variable and returned from the function.
In functional programming, a function is also a first-class object. This means
Function application
Function application is the process of applying a function to its arguments.
For example, a function to add two values takes two integer arguments as an
input and produces a single integer result as an output.
add(6,2) → 8
The domain of the function is described as ℕ × ℕ, which is the Cartesian
product of ℕ and ℕ. This means that the input contains two values, both of
which are integers.
Note
The Cartesian product of two sets contains ordered pairs of the values from both sets.
For example:
{0,1,2,…} × {0,1,2,…} = {(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),
(2,1),(2,2),…}
This means that the function call must be either add(0,0) or add(0,1) or
add(0,2) or … etc.
Since the domain only contains integers, the call add(1,0.5) would not be accepted.
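The Cartesian product in the Note can be generated with itertools.product from the Python standard library. This sketch uses a small three-value set so the output stays readable, and the type check inside add is an assumption about how an out-of-domain call might be rejected.

```python
from itertools import product

def add(a: int, b: int) -> int:
    # Reject arguments from outside the domain; bool is excluded because it subclasses int.
    for value in (a, b):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("add expects two integers")
    return a + b

small_set = [0, 1, 2]
pairs = list(product(small_set, small_set))   # ordered pairs, as in the Note above
print(pairs)
print(add(6, 2))

try:
    add(1, 0.5)
except TypeError as error:
    print("rejected:", error)
```

Every pair printed is a valid pair of arguments for add, whereas (1, 0.5) is not in the product and so the call is refused.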
The function add_six will add 6 to a single integer (since it takes an integer
input, and produces an integer output): fy: ℕ → ℕ
For example: fy: 2 → 8
Note that the new function is an anonymous function and does not need to
be written by the programmer; it is simply a method of describing how the
partial function application occurs.
The whole process can be described by saying that the function add takes one
integer input in order to create a function that takes the second integer input.
The result of carrying out the new function produces an integer output.
This is written as: fx: ℕ → (ℕ → ℕ)
which means that function fx takes an integer argument and produces
another function that takes an integer argument, which produces an integer
output.
This can also be written as: fx: ℕ → ℕ → ℕ
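Partial function application can be sketched in Python using a function that returns another function. The inner name add_a and the helper add_pair are invented for this example; functools.partial is a standard library tool that does a similar job for ordinary multi-argument functions.

```python
from functools import partial

def add(a):
    # add takes one integer and returns a new function awaiting the second integer.
    def add_a(b):
        return a + b
    return add_a

add_six = add(6)          # partial application: the first argument is fixed at 6
print(add_six(2))
print(add(6)(2))          # both arguments supplied one at a time

# functools.partial does the same job for an ordinary two-argument function.
def add_pair(a, b):
    return a + b

add_six_again = partial(add_pair, 6)
print(add_six_again(2))
```

All three calls give the same result, matching the idea that add takes one integer and produces a function from integer to integer.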
Composition of functions
Functional composition involves combining two or more functions together.
Functional composition: The process of combining two or more functions.
Function fx takes an integer input and returns the double of that value:
fx a = 2a
5 A function fx takes two positive integers as arguments and returns the product
of those arguments: fx [a,b] = a*b
a Write the type of the function using function application.
b Write the type of the function using partial function application.
6 fx a = 2*a
fy b = b²
fz c = c – 1
a What is the result of fx ∘ fy 2?
b What is the result of fz ∘ fx 10?
c What is the result of fy ∘ fx ∘ fz 3?
Answers available online
functional programs.
It is useful to have done some practical programming in a functional
programming language. However, this is not necessary and questions will only
be asked on Component 2, the written exam.
A function that takes a single argument will typically be written using the
following format:
fx a = a + 2
This means that when the function fx is called, whatever value is passed to
the function as an argument, that value + 2 will be returned. For example:
fx 3 → 5
A function that takes two arguments will typically be written using the
following format:
fy [b,c] = b + c
This means that when the function fy is called, the value returned will be the
sum of the values passed as arguments. For example:
fy [7,4] → 11
A function combination will typically be written using the following format:
fz [d,e] = fx (fy [d,e])
This means that the function fy will be applied to the values passed as
arguments. The function fx will then be applied to the result. For example:
fz[3,1] → 6 (since fy [3,1] → 4 and fx 4 → 6)
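The same three functions can be written in Python to check the worked values. The compose helper is hypothetical; it is included only to show that the combined function can be built from fx and fy rather than written out by hand.

```python
def fx(a):
    return a + 2

def fy(b, c):
    return b + c

def fz(d, e):
    # fy is applied first, then fx is applied to its result.
    return fx(fy(d, e))

def compose(outer, inner):
    # Hypothetical helper: builds the combined function instead of writing fz by hand.
    return lambda *args: outer(inner(*args))

print(fx(3))
print(fy(7, 4))
print(fz(3, 1))
print(compose(fx, fy)(3, 1))
```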
There are three higher-order functions that you are expected to be familiar
with: map, filter and fold. Each one takes a function and a list as arguments
(fold also takes a starting value).
Map
Map is a higher-order function that applies a given function to each item in
a list of values. Once complete, a new list containing the results is returned.
(Lists are described in more detail in the next section.)
Map: A higher-order function that applies a given function to each item in a
list.
The map function will typically be written in the following format:
fu f = map fx f
From the previous section, remember that fx a = a + 2. This means that
the function fx will be applied to each item in the list f, creating a new list of
values which is returned. For example:
If list = [3,9,25,10]
fu list → [5,11,27,12]
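Python has a built-in map function that behaves in a similar way. In this sketch the list is called numbers rather than list, because list is already the name of a built-in type in Python.

```python
def fx(a):
    return a + 2

def fu(values):
    # map returns an iterator in Python, so list() collects the results.
    return list(map(fx, values))

numbers = [3, 9, 25, 10]
result = fu(numbers)
print(result)
print(numbers)    # the input list is left unchanged
```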
[Figure: List1 = 1, 2, 3, 4, 5, 6, … → apply function fx a = a + 2 → List2 = 3, 4, 5, 6, 7, 8, …]
Figure 12.1 Using the map function to apply a function to a list; in this case the
function adds 2 to each input
Filter
Filter is a higher-order function that uses a conditional operator to generate a
new list.
Filter: A higher-order function that produces a new list containing all items
from the input list that meet a given condition.
The filter function will typically be written in the following format:
fv g = filter (<10) g
This means that the function fv will return a new list containing all of the
values in list g that meet the criteria in parentheses (in this case, values less
than 10).
If list = [3,9,25,10]
fv list → [3,9]
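Python's built-in filter function takes the condition as a function that returns True or False. The sketch below writes the condition (<10) as a lambda, and adds an invented is_odd function to show a named condition.

```python
def fv(values):
    # Keep only the items for which the condition (less than 10) is true.
    return list(filter(lambda x: x < 10, values))

def is_odd(x):
    return x % 2 == 1

numbers = [3, 9, 25, 10]
print(fv(numbers))
print(list(filter(is_odd, [1, 2, 3, 4, 5, 6])))
```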
[Figure: List1 = 1, 2, 3, 4, 5, 6, … → apply function Filter = odd → List2 = 1, 3, 5, …]
Figure 12.2 Using the filter function to create a new list according to a condition.
Filters can use inequalities (for example, <10) or filters such as ‘odd’ or ‘even’
9 Write a function, fz, that will take two numbers, carry out function fy, and then
carry out function fx on the result.
10 What will be the result of using the function map fx [1,2,3,4]?
11 What will be the result of using the function filter (even) [1,2,3,4]?
12 What will be the result of using the function fold (*) 1 [1,2,3,4]?
13 Describe the purpose of the fold function.
This is because
tail(list) → [9,25,10]
and the head of this list is 9.
Using head and tail it is also possible to describe the application of the fold
function in more detail.
Re-visit the table in the Fold section above to see how the head:tail structure is
used to recursively call the same combining function, reaching the base case
in which the tail is an empty list.
Making links
Recursion is a programming technique in which a subroutine will call itself
repeatedly until it reaches a non-recursing, base case. Recursion is explored in
Recursive techniques in Chapter 1.
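A recursive fold built from head and tail might look like the sketch below. It assumes the fold works through the list from the left, combining the running value with the head each time; for addition and multiplication the direction makes no difference to the answer. The function names mirror those in the text but the implementation is illustrative only.

```python
def head(items):
    return items[0]

def tail(items):
    return items[1:]

def fold(combine, start, items):
    # Base case: an empty list, so the accumulated value is the answer.
    if items == []:
        return start
    # Recursive case: combine the head, then fold the tail.
    return fold(combine, combine(start, head(items)), tail(items))

values = [3, 9, 25, 10]
print(head(tail(values)))
print(fold(lambda acc, x: acc + x, 0, values))
print(fold(lambda acc, x: acc * x, 1, [1, 2, 3, 4]))
```

Each recursive call receives a shorter list, so the base case of an empty list is always reached.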
14 Given that list = [2,3,5,7,11,13], state the result of applying the function:
a head(list)
b tail(list)
c head(tail(tail(list))).
15 How is an empty list represented?
16 State the result of [1] ++ list.
17 Explain how head and tail can be used to recursively carry out a filter function.
Summary
Exam practice
[Figure: the five stages of system development: Analysis, Design, Implementation, Testing, Evaluation]
Figure 13.1 Stages of system development
Exam tip
It is not necessary to be familiar with any specific model of software
development. However, make sure you are familiar with the five key stages
described below.
Making links
The NEA programming project is broken down into the same five steps outlined below. It
is rare for questions on this topic to appear specifically on the Component 2 examination,
though the knowledge here will be very helpful in supporting you with the NEA.
identify what features are common, which are most used, and which are
unnecessary or irrelevant to this scenario.
✚ Identify required features
Having looked at the current system, compared existing solutions and
discussed the needs with a variety of stakeholders, it is important to
narrow down a list of key features that the solution must include.
✚ Development platforms
It is important to consider whether the solution should take the form of
a desktop program, mobile program, web-based platform or some other
form. It is also important to consider what programming languages could
be used in the development and details such as whether a database system
or file management system is necessary.
✚ Modelling
A model for the data must be created in order to understand the problem
and identify a suitable solution. Describing both the problem, and the
proposed solution, using diagrams, charts and tables is an effective way
to generate and refine the data model and to understand how the solution
might fit together. It can also help to demonstrate ideas to a client, who
may not be a computing expert.
This is an example of using abstraction to help model aspects of the
external world for use in a computer program.
✚ User Interface
A user interface is an extremely important consideration for any computer
system. The solution could involve a command line interface, a graphical
user interface (GUI), a menu driven interface or even a voice-controlled
interface. Sketches and ideas for prompts, dialog boxes and other elements
of the user interface should be produced as early as possible.
Client discussion is key to analysis, both in order to help to understand the
problem and to ensure that both parties agree on the proposed solution. You
should show your ideas, plans, sketches and models to your client to help
them understand your proposed solution and you may wish to take an agile
approach, producing ever-more complete prototypes throughout the life of the
project.
Modelling: Graphical descriptions such as flowcharts, E-R diagrams, class
diagrams and hierarchy charts.
User interface: The inputs and outputs of a program.
Agile: A method of project management in which a rough prototype is created
and repeatedly improved.
Making links
The agile approach makes use of rapidly built prototypes to plan and develop the
project as you go. This can be a risky strategy and this is discussed in more detail in
the design section.
The final stage is to create a list of detailed, formal objectives. Each objective
should be SMART – specific, measurable, achievable, relevant, and timely. Any
objective that does not meet these criteria will be difficult to measure,
difficult to achieve, or will add little value to the overall solution.
The client may have questions to help them understand the proposal, the
developer may have questions to help clarify certain parts of the problem, and
the client might well have additional ideas or concerns about the proposed
solution that are important to consider.
Design
The design of the solution is something easily overlooked or rushed, as people
are often keen to get straight into the practical development of the solution.
As projects become larger and more complex, however, the time taken in the
design phase becomes increasingly important to avoid problems later in the
development.
The design should consist of several sections.
Top level design
A top level design gives an overview of the solution. The aim is to look at the
big picture in terms of what the project aims to achieve. Details of specific
algorithms can be skipped at this stage in order to provide a clear design that
covers the whole solution in one place.
The nature of the top-level design will depend very much on the type of
project. For example:
✚ a flowchart might be ideal for a heavily algorithm-based project
✚ an entity-relationship diagram might be more suitable for a heavily
database centred project
✚ a class diagram can help to summarise an object-oriented project
✚ a hierarchy chart can help to show how a large task is split into
subroutines.
A combination of two or three diagrams might be needed to give this top-level
overview.
Design: The second stage of systems development, in which the details of the
solution are planned.
Top level design: An outline design that shows how the whole program will
fit together.
Flowchart: A diagram using symbols to represent parts of an algorithm and
arrows to show which step to follow next.
Entity-relationship diagram: A diagram showing the relationship between
different entities in a database.
Class diagram: A diagram showing the structure of and relationship between
classes in an object-oriented program.
Hierarchy chart: A diagram showing which subroutines will call which other
subroutines.
Making links
Class diagrams are an effective design tool to help describe the structure of an
object-oriented program. Object oriented programming is explored in Chapter 1.
Algorithm design
Algorithms are a key component of any software project. It is therefore
necessary to plan algorithms before they are programmed. It is common
to create the broad, top-level design initially and then plan each individual
subroutine in more detail.
It is not usually necessary to design every single algorithm as some are
relatively straightforward. However, more complex and significant algorithms
should always be planned in advance. This can be done with pseudo-code,
flowcharts, or other forms of modelling as appropriate.
Algorithm: A sequence of steps to follow in order to achieve an outcome.
Pseudo-code: A form of program design that uses code which is not specific to
one programming language.
Using a modular structure means that parts of the program are loosely
coupled. This means that if one part of the program needs to be replaced
then the rest of the program should be unaffected. This can be achieved by
avoiding the use of global variables and creating subroutines that solve a
general problem. For example, a subroutine in a card game should shuffle
whatever array of cards it is given rather than always shuffling the whole
deck. This way the whole deck can be passed if needed, but a smaller selection
of cards can also be shuffled if needed.
Modular: Made up of independent units or parts.
Modular programs are easier to debug and easier to update. They also require
less code, because modules can be re-used within the project and in future
projects.
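The card-shuffling example above can be sketched in Python. This is only an illustration: the name shuffle_cards and the way each card is represented as a (rank, suit) pair are assumptions made for this sketch, not part of any specification.

```python
import random


def shuffle_cards(cards):
    """Return a shuffled copy of whatever list of cards is passed in."""
    shuffled = list(cards)      # work on a copy so the caller's list is unchanged
    random.shuffle(shuffled)    # standard-library shuffle, applied to the copy
    return shuffled


# Build a full deck as (rank, suit) pairs (an assumed representation).
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = [(rank, suit) for suit in SUITS for rank in RANKS]

shuffled_deck = shuffle_cards(deck)       # the whole deck can be passed...
shuffled_hand = shuffle_cards(deck[:5])   # ...or just a smaller selection

print(len(shuffled_deck), len(shuffled_hand))
```

Because the subroutine receives its data as a parameter and does not rely on a global deck variable, it could be replaced or re-used in another project without changing the rest of the program.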
User interface
Though the user interface should have been considered in the analysis phase,
the design phase is likely to be much more detailed and so a comprehensive
design of the user interface should be included.
This involves designs for all screens, pages, prompts, responses, and error
messages that may be required as part of the solution.
The design phase is often an iterative process, meaning that it is unlikely that
the whole project would be designed in one go before being implemented.
In practice it makes much more sense to design part of the project and then
implement that section. This means that practical problems can be overcome
quickly and future iterations of the designs can take this into account.
Iterative: Repeating a process in order to approach an end point.
Agile approach
Agile: A method of project management in which a rough prototype is created
and repeatedly improved.
You may wish to take an agile approach to solving the problem. This involves
starting with a rough prototype that is repeatedly improved. In each phase
a relatively small part of the solution is quickly planned, implemented and
tested.
Phases are very short and approaches can be quickly tried and discarded if
not suitable. This can be an effective solution, but comes with risks as it is
possible to spend several phases working towards an end-goal, only to realise
that there is a critical flaw that requires a complete re-write to solve.
Implementation
Implementation refers to the creation of the algorithms and data structures
required to solve the problem.
Implementation: The third stage of systems development, in which the program
code and data structures are created.
Testing
Testing should be carried out throughout the development of the solution. The
aim of testing is not only to focus on whether the system works with expected
data, but to test that it copes appropriately with extreme and invalid data.
Testing: The fourth stage of systems development, in which the program is tested
to ensure it functions as it should.
For example, a database designed to track shoe sizes up to and including size
13 should be tested with:
✚ normal data (such as 6, 8, 11)
✚ boundary data (such as 13)
✚ erroneous data (such as 14, 15, 106, -2, ‘orange’).
This will ensure that the boundaries are set appropriately and that incorrect
or invalid inputs are dealt with cleanly.
Boundary data: Data on the edge of what should be accepted.
Erroneous data: Data that is in error or invalid.
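The shoe-size test plan above could be run against a small validation routine such as the Python sketch below. The function name validate_shoe_size and the lower limit of 1 are assumptions made for illustration, as the text only states the upper limit of 13. Python writes its exception-handling structure as try … except.

```python
MAX_SHOE_SIZE = 13  # upper limit taken from the example above
MIN_SHOE_SIZE = 1   # assumed lower limit; the text does not state one


def validate_shoe_size(raw_value):
    """Return the size as an integer if it is acceptable, otherwise None."""
    try:
        size = int(raw_value)
    except (TypeError, ValueError):
        return None  # not a whole number, for example 'orange'
    if MIN_SHOE_SIZE <= size <= MAX_SHOE_SIZE:
        return size
    return None      # a number, but outside the accepted range


# Test data grouped into the three categories listed above.
test_data = {
    "normal": [6, 8, 11],
    "boundary": [13],
    "erroneous": [14, 15, 106, -2, "orange"],
}

for category, values in test_data.items():
    for value in values:
        result = validate_shoe_size(value)
        outcome = "accepted" if result is not None else "rejected"
        print(f"{category:9} {value!r:>8} -> {outcome}")
```

Every normal and boundary value should be accepted and every erroneous value rejected, without the program crashing on the non-numeric input.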
Defensive programming should be used to prevent the program from
crashing. For example, exception handling should be used to deal with invalid
inputs such as a string being entered where a number is expected.
Defensive programming: Programming intended to deal with unforeseen
circumstances.
Exception handling: A program technique that uses a try … catch structure to
deal with errors that would otherwise cause the program to crash; for example,
if the user enters a string where a number is expected.
Making links
Exception handling is an effective technique for dealing with invalid inputs. The syntax
and structure of exception handling is discussed in Chapter 1.
Acceptance testing should also take place towards the end of the project.
At this point, the client and/or end-users should be provided with the
opportunity to test the solution for themselves in order to check:
✚ they are happy that the solution solves the original problem
✚ it functions as it should
✚ it does not contain any errors or problems.
Acceptance testing should be at an appropriate time in the software
development schedule so that if the feedback suggests that improvements are
necessary, there is time to make them.
Acceptance testing: Testing by the client and/or end-user to ensure that the
system meets the specification.
Evaluation
The final stage of any project is to evaluate the success of the project. To do
this the original objectives should be reviewed, using the testing process as
evidence, and each item marked as complete or incomplete.
Evaluation: The fifth and final stage of systems development, in which the
success of the project is considered.
✚ The most fundamental aspect of an evaluation is to consider whether the
original problem has been solved. It is possible to make a fully functioning,
complex program that doesn’t fully meet the requirements initially set out.
✚ User feedback should be considered. This includes feedback from the
client, from the end-users and from other stakeholders.
It is perfectly acceptable for feedback to suggest further improvements and
developments for the future, and this does not necessarily mean that the
project is not a success. There are always opportunities to further refine and
extend any project.
Summary
✚ Large software development projects can be split into five sections
✚ Analysis is focused on identifying and describing the problem, establishing
system requirements from potential users, creating data models and agreeing a
specification
✚ Design is focused on planning the data structures, algorithms, modular
structure and user interface required to solve the problem
✚ Implementation is focused on the creation of the program code and data
structures
✚ Testing is focused on ensuring that the program works for valid data and
copes well with boundary data and erroneous data
✚ Testing also includes acceptance testing with intended users of the system
✚ Evaluation is focused on reflecting on the finished solution and making sure
that it solves the original problem
✚ An iterative approach may be used at each stage as part of a
prototyping/agile approach
Exam practice
1 Anne-Marie has been asked to develop a computer program for a company that
manufactures vending machines.
a) Copy and complete the table by entering the appropriate letter,
A-E, to match the activities with the stages of systems development. [3]
A Analysis
B Design
C Implementation
D Testing
E Evaluation
Tasks | Stage
Creating a flowchart to describe an algorithm for part of the program |
Reviewing the overall success of the finished program |
Trying out other vending machines in order to see what features they include |
Entering invalid data to see how the program copes |
Writing the program code |
Glossary
Term Definition Page
Absolute error The difference between the intended value and the actual value. 118
Abstract data A data structure which can be implemented using a combination of other, simpler 238
structure data structures.
Abstract data type A complex data structure in which the complexity of how the data is stored or 38
accessed is typically hidden from the programmer.
Abstract method A method that has no program code and is intended to be overridden. 29
Abstraction Making a problem simpler by removing or hiding features. 81
Acceptance testing Testing by the client and/or end-user to ensure that the system meets the 239
specification.
Access specifier (or A keyword that sets the accessibility of an attribute or method. 25
access modifier)
Access speed The time taken to read or write data. 178
ADC Analogue to digital converter. 174
Address bus A bus for transmitting memory addresses. 160
Address bus width The maximum number of bits that can be used to address a memory location. 173
Addressing mode Either immediate or direct. 164
Adjacency list A method of representing a graph by listing, for each node, just the nodes that are 44
connected to it directly.
Adjacency matrix A method of representing a graph using a grid with values in each cell to show which 44
nodes are connected to each other.
Aggregation The idea that one class can be related to another using a ‘has a’ relationship. 26
Agile A method of project management in which a rough prototype is created and 236, 238
repeatedly improved.
Algorithm A sequence of instructions that are followed in order to solve a problem. 53, 71,
237
Amplitude A measure of how large a vibration or oscillation is. 129
Analogue A continuous signal, usually represented as a curved wave. 126
Analogue to digital Converts an analogue signal into a digital signal. 126
converter (ADC)
Analysis The first stage of systems development, in which the problem is identified, and 235
solutions proposed.
Anti-virus Software that scans programs for malicious code. 200
Append Add something to the end of a list. 232
Application layer Formats data according to the application to be used. 201
Application A tool that allows other programs to make calls or requests. 209
programming
interface (API)
Application Software intended to allow the end-user to achieve a task. 139
software
Arguments The actual values that are passed to the subroutine at runtime. 19
Arithmetic Mathematical operations such as addition, subtraction and multiplication. 162
Arithmetic logic Carries out arithmetic (mathematical) and logical operations. 162
unit (ALU)
Array A data structure for holding values in a table. 34
ASCII A 7-bit code used for representing characters as binary numbers. 122, 188
Assembler Translates assembly language code into object code. 143
associated object)
Association A type of aggregation where the container object can be destroyed without 26
aggregation destroying its associated objects.
Asymmetric Different keys are used for encryption and decryption. 135
Asymmetric Different keys are used to encrypt and to decrypt a message. 198
encryption
Asynchronous data Where data is sent without a timing signal. 188
transmission
Attribute A single characteristic of an entity. 214, 216
Attributes The properties that an object of that type has, implemented using variables. 23
Automation Designing, implementing and executing a solution to solve a problem automatically. 83
Backus–Naur form A notation used to describe the syntax of a language. 89
(BNF)
Bandwidth The range of possible signals that can be sent in one signal. 189
Barcode A black and white image consisting of vertical bars of different widths to represent 173
data.
Barcode reader A device for reading barcodes. 173
Base case The case in which a recursive function terminates and does not call itself. 22
Base class (or A class whose attributes and methods are inherited by another class. 25
parent class)
Baud rate The number of state changes per second. 189
Bi-directional Data travels in both directions. 159
Big Data Data that cannot easily be stored, retrieved, and processed because it is too varied, 224
too large or acquired too rapidly.
Binary Numbers with base 2. 104
Binary file A file that uses binary values to represent each item of data. 37
Binary point Used to separate the whole number and fractional parts of a binary number. 114
Binary prefix A shorthand used for multiples of 1024 (2^10) bytes; for example kibibyte, mebibyte, 109
gibibyte.
Binary search A searching algorithm in which the middle item is checked and half of the list is 62
discarded.
Binary tree A tree in which each node can have no more than two child nodes, each placed to the 64
left or right of the preceding node so that the tree is always in order.
Binary tree search A searching algorithm in which a binary tree is traversed. 64
Bit rate The number of bits that can be transmitted per second. 189
Bitmap An image format that uses a grid of pixels. 127
Bitwise An operation that works on one bit at a time. 167
Block A subdivision of the storage on an SSD. 177
Blu-Ray A high-capacity optical disc typically used for high-definition video and large computer 177
programs.
Boolean expression A mathematical notation for logic gates and circuits. 145
Boolean identity A relation that is always true. 152
Bottleneck A congested path within a network. 195
Boundary data Data on the edge of what should be accepted. 239
Branch Jump to a labelled part of the program. 166
Breadth first search Searching a graph by checking every node one step from the starting point, then 53
(BFS) every node two steps away, etc.
Brute force Trying every possible combination. 134
Bubble sort A sorting algorithm in which pairs of items are sorted, causing the largest item to 65
bubble up to the top of the list.
Bug A mistake in a computer program that causes unexpected results. 200
Bus A communication system for transferring data. 159
Bus topology A network arranged with a main cable, or bus, connecting all devices. 190
Bytecode An intermediate code between high-level and object code which can run in a virtual 145
machine.
Cache A small unit of volatile memory placed on the same circuit board as a processor for 172
fast access.
Caesar cipher A substitution cipher in which each letter is shifted according to the key. 133
Call The process of running a subroutine by stating its identifier, followed by any required 18
arguments in parentheses.
Call stack A data structure that stores the stack frames for each active subroutine while the 20
program is running.
Capacity The amount of data that can be stored. 178
Cardinality The number of values in a finite set. 86
Cartesian product The set of ordered pairs of values from both sets. 86
CD Compact disc, typically used for audio and small computer programs. 177
Certification A body capable of providing digital certificates. 199
authority
Character set The set of all characters that can be understood by a computer system, and their 121
associated character codes.
Check digit A single digit derived by following an algorithm, used to check for errors. 124
Checksum A value, calculated by an algorithm, based on the contents of the original data. 124, 199
Cipher An algorithm for encrypting data. 133
Ciphertext The encrypted form of the message. 133
Circular queue A queue which wraps around in a circle. If implemented using an array, the last index 39
is followed by the first index.
Class The definition of the attributes and methods of a group of similar objects. 23
Class diagram A diagram showing the structure of and relationship between classes in an object- 237
oriented program.
Client A device that makes requests of a server. 191, 208
Client–server A network in which clients make requests to servers. 191
network
Clipart A cartoon style image. 129
Clock A component in the processor for generating timing signals. 162,
172, 188
CMYK In a colour printer the colours used are cyan, magenta, yellow and key (black). 175
Co-domain A set from which all output values are chosen. 227
Collision Where two items of data are transmitted at the same time, causing both to be lost. 47, 194
Colour depth The number of bits used to store the colour of each pixel. 127
Command line A text-only interface which is less user-friendly but often more powerful for 140
interface (CLI) knowledgeable users.
Common factor A term which appears in 2 or more parts of an expression. 154
Compare The operation used to check the condition between two values. 166
Compiler Translates high-level code into object code as a batch. 143
Composite primary A primary key made up of two or more attributes. 216
key
Composition Combining parts of a solution together to create a solution made of component parts. 83
Composition A type of aggregation where, if the container object is destroyed, its associated 26
aggregation objects are also destroyed.
(DFS) stepping back one node at a time.
Design The second stage of systems development, in which the details of the solution are 237
planned.
Dictionary A data structure based on key–value pairs. 48, 132
Digest A checksum value used in a digital signature. 199
Digital A discrete signal, often represented using a stepped wave. 126
Digital camera A device for taking photographs stored as digital data. 174
Digital certificate A digital document indicating that a public key is valid. 199
Digital signal Manipulating digitised data relating to analogue signals. 161
processing
Digital signature A method of checking that an encrypted message has not been altered. 199
Digital to analogue Converts a digital signal to an analogue signal. 126
converter (DAC)
Direct addressing The operand is the address of the value which is to be processed. 164
Directed graph A graph in which some edges can only be traversed in one direction, shown with an 43
arrowhead.
Disconnected graph A graph in which two or more nodes are not connected to the rest of the graph. 43
Discrete Can only take certain, countable values, such as shoe sizes or ordinal numbers. 104
Distributed A system that uses multiple processors or servers to process data. 224
processing
Domain A set from which all input values are chosen. 227
Domain name The identifier for an individual or organisation’s online presence. 196
Domain name A system for looking up domain names to find their corresponding IP address. 197
system (DNS)
Dot product A calculation used to help find the angle between two vectors. 50
Dotted-decimal A format for writing an IP address with decimal values separated by dots. 197
Drum A round device used to attract the toner. 175
Dual-core A processor containing two CPUs, both working simultaneously. 172
DVD Digital versatile disc, typically used for medium quality video and medium sized 177
computer programs.
Dynamic The size of, and the memory assigned to that data structure can change. 38
Dynamic host A system for issuing private IP addresses within a network. 207
configuration
protocol (DHCP)
E-R diagram An entity-relationship diagram. 214
Edge A line used to represent the relationship between two nodes. Also known as an arc. 43, 225
Edge triggered A logic circuit used to store the state of an input. 151
D-type flip-flop
Edge-case A problem that only occurs in an extreme setting. 200
Efficiency Being able to complete a task with the least use of time or memory. 91
Encapsulation The concept of grouping similar attributes, data, and methods together in one object. 24
Encryption A method of hiding the meaning of a message. 133,
193, 198
Entity A type of real-world object about which data should be stored. 214
Entity-relationship A diagram showing the relationship between different entities in a database. 237
diagram
Erroneous data Data that is in error or invalid. 239
Evaluation The fifth and final stage of systems development, in which the success of the project 239
is considered.
Even parity A method of using parity in which each block of data should have an even number 123
of 1s.
Exception handling A program technique that uses a try … catch structure to deal with errors that would 239
otherwise cause the program to crash; for example, if the user enters a string where a
number is expected.
Expanding brackets The process of removing brackets by multiplying. 154
Exponent The part of a floating point binary number which states how far to move the decimal 115
point.
Exponential An expression where the variable is used as an exponent, or power; for example, 92
y = 2^x.
Exponentiation The raising of one number to the power of another. 14
Extensible markup A file format for human-readable text used to transmit data objects. 209
language (XML)
Factorial The product of all positive integers less than or equal to a given integer. 93
Factorising The process of removing a common factor. 154
Fetch-execute cycle The process by which instructions are fetched from memory, decoded and executed. 162
Field A category of data within a record or group of records. 37, 216
File A persistent collection of data, saved for access and accessible once the program or 37
subroutine has finished running.
Filter A higher-order function that produces a new list containing all items from the input list 231
that meet a given condition.
Finite set A set with a fixed number of values. 86
Finite state A computational model which has a fixed number of potential states. 83
machine (FSM)
Firewall A software or hardware service that blocks or allows individual packets from entering 198
or leaving.
First In, First Out Those items placed into the queue first will be the first ones to be accessed. 39
(FIFO)
First-class object An object that can be assigned as an argument or returned from a function. 227
Fixed point binary A method for representing fractional numbers where the number of binary digits 114
before and after the decimal point is fixed.
Flat file database A database made up of one table. 216
Floating gate Used as a memory cell in flash memory. 177
transistor
Floating point A method for representing fractional numbers where the position of the decimal point 115
binary is moved according to the value of the exponent.
Flowchart A diagram using symbols to represent parts of an algorithm and arrows to show which 237
step to follow next.
Fold A higher-order function that reduces a list to a single value by repeatedly applying a 231
combining function.
Foreign key An attribute in one table that links to the primary key in a related table. 216
Frequency analysis The process of examining how often something occurs. A useful tool for trying to 48
break some encryption methods.
Full adder A logic circuit to add three binary digits. 150
Full-duplex Both devices can send transmissions at the same time. 209
Fully qualified A domain name that addresses an exact resource, including a hostname. 196
domain name
(FQDN)
Function A subroutine that returns a value. 19
Function A rule that takes one or more values as an input and produces an output. 227
Gateway A network hardware device for connecting networks that use different protocols. 196
General case A case in which a recursive function is called and must call itself. 22
General-purpose Registers that can be used to store any data. 162
registers
Getter A function used to return the value of an attribute to another object. 25
Global variables Variables that can be accessed from any subroutine. 20
Graph A data structure designed to represent the relationships between items. 43
Graph schema A graphical tool for representing fact-based models. 225
Graph traversal Inspecting the items stored in a graph data structure. 53
Graphical user An image-based interface that uses windows, icons, menus and pointers. 140
interface (GUI)
Half adder A logic circuit to add two binary digits. 150
Halting problem A specific example of a non-computable problem that proves that some problems are 96
non-computable.
Halting state A state with no outgoing transitions, and so the Turing machine will stop. 96
Hand-trace Also known as dry run. Without using a computer, completing a table to record the 53, 72,
values of each variable as the program is executed line-by-line. 170
Hard disk drive A storage device which saves data using magnetic film. 176
(HDD)
Hardware The physical components of a computer system. 139
Hash table A data structure for holding data that is indexed according to its key. 47
Hashing algorithm An algorithm for calculating a key for each value. 47
Head The first item from a list. 232
Header Extra data added to a packet such as the destination address or packet number. 195
Heuristic An approach to solving intractable problems in a reasonable amount of time by 95
accepting an imperfect solution.
Hexadecimal Numbers with base 16. 105
Hierarchy chart A diagram showing which subroutines will call which other subroutines. 237
Hierarchy charts A diagram that shows which subroutines call which other subroutines. More complex 22
versions will also show what data is passed and returned.
High-level language A programming language which uses keywords and constructs written in an English- 142
like form.
High-order function A function that can take another function as a parameter. 225
Higher-order A function that takes a function as an argument, or returns a function, or both. 228
function
Host identifier The second part of an IP address, indicating the device on that network. 205
Hostname The device within the network that is being addressed. 196
Hypothetical Consideration of a ‘what if…’ scenario. 182
I/O controller The physical interface between an I/O device and the internal components. 159
I/O devices Input and output devices such as keyboard, mice, printers and monitors. 159
Identifier A technical term for the name of a variable. 15
Immediate The operand is the value, which is to be processed. 164
addressing
Immutable data Data structures that cannot be changed. 225
structures
Imperative A language in which commands are used to say how the computer should complete 142
the task.
Implementation The third stage of systems development, in which the program code and data 239
structures are created.
In-order tree Inspecting or displaying the value of each node after checking the node on the left but 57
traversal before checking the node on the right.
Linear search A searching algorithm in which each item is checked one-by-one. 62
Link A physical connection between two devices. 190
Link layer Deals with the physical medium used for transferring the data. 201
List A data structure similar to an array, commonly used in Python in place of an array. 34
Load Fetch a value from memory into a register in the processor. 165
Local variables Variables that are declared within a subroutine and can only be accessed by the 19
subroutine during the execution of that subroutine.
Logarithmic The opposite of exponential. If y = 2^x, then x = log_2(y). 92
Logic circuit A solution to a problem that uses one or more logic gates. 147
Logic gate A device which takes one or more binary inputs and produces a single binary output. 145
Logic problem A puzzle which is intended to be solved using logical reasoning. 71
Logical Operations using computation logic such as AND, OR and NOT. 162
Logical AND The operation of applying an AND function between the individual bits in two binary 205
numbers.
Logical bus network A star network that uses a hub as the central device, which causes it to operate like a 191
topology bus network.
Logical shift An operation involving moving each bit in a given direction. 169
Logical shift left Moving all of the bits left, doubling the value each time. 169
Logical shift right Moving all of the bits to the right, halving the value each time. 169
Lossless Reducing the size of a file without the loss of any data. 48, 132
compression
Lossy compression Reducing the size of a file by permanently removing some data. 132
Low-level language A programming language which describes exactly how to interact with the computer’s 141
hardware.
MAC address A physical address, uniquely assigned to each piece of network hardware. 193
Machine code Binary code instructions which can be read and understood by a processor. 141, 162
Main memory Memory that can be directly accessed by the CPU. 1258
Majority voting Transmitting each bit an odd number of times in order to identify and correct any 124
transmission errors.
Malware Malicious software. 200
Mantissa The part of a floating point binary number which provides the value of that number. 115
Map A higher-order function that applies a given function to each item in a list. 230
Matrix A rectangular, two-dimensional collection of values. 35
Mealy machine A finite state machine in which each input has a corresponding output as well as a 85
transition between states.
Media Access A hardware, or physical, address that does not change. 203
Control (MAC)
address
Memory The location where instructions and data are stored in a computer. 15
Merge sort A sorting algorithm in which items are split into single items and then merged into 66
sorted pairs, fours, eights and so on.
Mesh topology A network topology in which each device can be directly linked to several other 195
devices.
Metadata Data about data. Additional data stored in a file. 128
Methods Processes or actions that an object of that type can do, implemented using 23
subroutines.
Modelling Graphical descriptions such as flowcharts, E-R diagrams, class diagrams and hierarchy 236
charts.
Modulo Finding the remainder when one number is divided by another. 124
Modulus The remainder of the division of one number by another. 14
Moral Relating to the principles of right and wrong. 183
Most significant bit The left-most bit in a binary number. The bit with the largest place value. 105
(MSB)
Motherboard The circuit board to which all other components are connected. 158
Musical Instrument A type of file that uses the instructions for how to create each note rather than 131
Digital Interface capturing the analogue sound wave.
(MIDI)
Named constant A variable whose value cannot be changed while the program is running. 16
NAND flash Memory which can be electrically erased and reprogrammed. 177
memory
Natural numbers Positive integers, including 0. Numbers used to count things. 103
Nested Placing a programming structure inside another programming structure. 12, 35
Network address Substituting a private IP address for a public IP address, or vice versa. 207
translation (NAT)
Network identifier The first part of an IP address, indicating the network. 205
Network interface The component in any device that physically connects to the network. 203
card (NIC)
Network layer Adds sender and receiver IP addresses. 201
Node Used to represent an item of data, drawn as a circle. Also known as a vertex. 43
Node An item in a graph. 225
Non-computable A problem that cannot be solved by a computer. 96
Non-routable An address that can only be accessed within that network. 207
Non-volatile Data is retained even when electrical power is removed. 158, 176
Normalisation Structuring a database in a way designed to reduce data redundancy and improve 217
integrity.
Normalised The standard way of writing floating point binary, in which the first two bits are 119
opposite to each other.
Number base The number of digits available in that number system. 104
Nyquist’s theorem The rule that the sample rate should be at least double the maximum frequency that 130
can be heard.
Object A specific instance of a class. 24
Object code Low-level code, translated from the source code. 143
Object-oriented programming An approach to solving a problem using a model based on real-world objects that are designed to interact in a realistic way. 22
Odd parity A method of using parity in which each block of data should have an odd number of 123
1s.
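As a rough illustration of odd parity, the sketch below works out the parity bit for a block of data bits and checks a received block. The function names `odd_parity_bit` and `check_odd_parity` are hypothetical and chosen only for this example.

```python
def odd_parity_bit(bits: str) -> str:
    """Return the parity bit that makes the total number of 1s odd."""
    ones = bits.count("1")
    # If the data already has an odd number of 1s, add a 0; otherwise add a 1.
    return "0" if ones % 2 == 1 else "1"


def check_odd_parity(block: str) -> bool:
    """Return True if the block (data plus parity bit) has an odd number of 1s."""
    return block.count("1") % 2 == 1


data = "1011001"                      # four 1s, so the parity bit must be 1
block = data + odd_parity_bit(data)   # "10110011"
```

A block that fails `check_odd_parity` contains an error, although an even number of flipped bits would go undetected.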
One-time pad The key used in the Vernam cipher. 134
Opcode An operation code, describing what operation is to be carried out. 164
Operand A value or object that is to be operated on. 60, 164
Operation A function to be applied; for example, +, –, AND, OR. 60
Operators Symbols used to indicate a function. 14
Optical disk A storage medium which uses light to read the data. 177
Ordinal numbers Numbers used to count the order that something appears; for example, 1st, 2nd, 3rd, 104
and so on.
Overflow Where the result of a calculation is too large to be stored in the available digits. 120, 169
Overhead Additional data added to the original values. 123
Packet switching The process of sending individual packets on different routes. 195
Page A subdivision of a block. 177
Paradigm A particular style or approach to designing a solution to a problem. 22
Parallel data transmission Data is sent on several wires simultaneously. 188
Parameters The variables that a subroutine needs in order for the subroutine to run. 19
Parity bit A single binary digit added to some data in order to help with error checking. 123, 188
Parse Process a string of symbols or letters. 210
Partial function application Providing some of a function's arguments and producing a function that takes the remainder of the arguments. 228
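Partial function application can be sketched in Python with `functools.partial` from the standard library. The function `add` and the name `add_five` are made up for this example, and a functional language would express the same idea more directly.

```python
from functools import partial


def add(x: int, y: int) -> int:
    """A two-argument function used only to demonstrate partial application."""
    return x + y


# Supply only the first argument. The result is a new function
# that takes the remaining argument.
add_five = partial(add, 5)

result = add_five(10)  # equivalent to add(5, 10)
```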
Pass The transfer of a value, or the value of a variable, to a subroutine. 18
Pass Travelling through a list from start to finish exactly once. 65
Path The location of the file or folder on a server that is being addressed. 196
Peer Of equal standing, able to act as a client or a server. 192
Peer-to-peer network A network in which all devices are peers. 192
Peripheral External hardware devices, on the periphery of the computer system. 139
Permutation One of the different ways that a set can be arranged. 93
Personal data Data that relates to an identifiable person. 183
Pits Small indentations in the track. 177
Pixel The smallest addressable element of an image. 127, 174
Plaintext The original, unencrypted message. 133
Platter A metal disk used to store the data in a HDD. 176
Pointer A value that stores an address. In the context of queues this is usually the index of the 39
front or rear item.
Polygon A shape made of straight lines, for example, rectangle, hexagon, and so on. 128
Polymorphism Literally ‘many forms’ – the ability for two methods with the same name to carry out 27
their actions using different code.
Polynomial An expression involving powers; for example, y = 2x². 92
Port forwarding Using the port number of a packet to identify which private IP address should be used. 208
Port number Extra data added to a packet that identifies what application layer protocol should be 201
used to process the data.
Post-order tree traversal Inspecting or displaying the value of each node after checking for all child nodes. 57
Postfix/Reverse Polish notation A method of writing mathematical or logical expressions in which the operators appear after their operands; for example, 2 2 +. 60
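One common way to evaluate a postfix expression is with a stack: operands are pushed, and each operator pops its two operands and pushes the result. The sketch below assumes space-separated tokens and only the four basic arithmetic operators; `evaluate_postfix` is a hypothetical name.

```python
def evaluate_postfix(expression: str) -> float:
    """Evaluate a space-separated postfix expression such as '2 2 +'."""
    stack = []
    for token in expression.split():
        if token in ("+", "-", "*", "/"):
            # The operator applies to the two most recent operands.
            right = stack.pop()
            left = stack.pop()
            if token == "+":
                stack.append(left + right)
            elif token == "-":
                stack.append(left - right)
            elif token == "*":
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            # Anything that is not an operator is treated as a number.
            stack.append(float(token))
    return stack.pop()
```

For example, `evaluate_postfix("3 4 2 * +")` multiplies 4 by 2 first and then adds 3, with no brackets needed.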
Pre-order tree traversal Inspecting or displaying the value of each node before moving on to other nodes. 57
Prepend Add something to the start of a list. 232
Primary key An attribute used to uniquely identify a record. 216
Printer A device for creating a hard copy of a document, usually on paper. 175
Priority queue A queue which stores a priority for each value so that the items with the highest 39
priority can be accessed first.
Private An access specifier that protects that attribute or method from being accessed by any 25
other object.
Private IP address An address that can only be accessed within that network. 206
Private key A key that is kept secret. Has a matching public key. 198
Router A network hardware device for connecting different networks together. 195
RTS/CTS Request to send/clear to send. A method of collision avoidance by requesting 194
clearance to transmit.
Run A series of identical values in a file or dataset. 132
Run length encoding (RLE) A lossless compression technique of recording the length and value of each run in the data. 132
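A minimal sketch of run length encoding on a string follows. It stores each run as a (length, value) pair; the function names and the pair layout are assumptions made for this example, and real file formats pack runs differently.

```python
def rle_encode(data: str) -> list[tuple[int, str]]:
    """Record the length and value of each run of identical symbols."""
    runs: list[tuple[int, str]] = []
    for symbol in data:
        if runs and runs[-1][1] == symbol:
            # Same value as the current run, so extend it by one.
            runs[-1] = (runs[-1][0] + 1, symbol)
        else:
            # A different value starts a new run of length 1.
            runs.append((1, symbol))
    return runs


def rle_decode(runs: list[tuple[int, str]]) -> str:
    """Rebuild the original data, showing that the compression is lossless."""
    return "".join(symbol * length for length, symbol in runs)
```

Decoding the encoded runs returns exactly the original data, which is what makes the technique lossless.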
Sample rate The rate at which samples are taken, typically measured in kHz. 130
Sampling resolution The number of bits per sample. 130
Scalable Used to describe a product, service or business that can cope with increased 183
demand.
Scalar A single number used to represent or adjust the scale of a vector. 48
Scalar-vector multiplication Multiplying each element of a vector by a number in order to increase the scale of a vector. 49
Scaling Changing the scale of a vector by multiplying the vector by a scalar. 49
Scope The visibility of variables (either local or global). 19
Secondary storage Persistent, non-volatile storage. 176
Sector A small section of a track. 176
Selection A program structure that makes a decision about which block of code to execute, typically implemented using an IF statement. 11, 72, 166
Self-documenting code Program code that uses naming conventions and programming conventions that help other programmers to understand the code without needing to read additional documentation. 13
Sequence Executing instructions in the order they are written. 72
Serial data transmission Data is sent down one wire, one bit after another. 188
Server A device which controls centralised access to a resource. 191, 208
Set A collection of objects. 227
Set comprehension A collection of rules to define which values are in a set. 86
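Python has a set comprehension syntax that mirrors the mathematical idea of defining a set by rules. The sets below are invented for illustration.

```python
# Rule: x is a natural number below 10 and x is even.
evens = {x for x in range(10) if x % 2 == 0}

# Rule: the squares of the whole numbers 1 to 4.
squares = {x * x for x in range(1, 5)}
```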
Set theory A branch of mathematical logic in which sets are collections of objects. 86
Setter A procedure used to allow another object to set or update the value of an attribute. 25
Shortest path The route between two nodes on a graph that incurs the least cost. In an unweighted 67
graph the cost for each path can be assumed to be 1.
Signal The electric or electromagnetic impulses that are used to transmit data. 126
Single-core A processor containing a single CPU. 172
Size The pixel dimensions measured as the width x height of an image in pixels. 127
Socket The endpoint, or final destination, of a packet. Described using both the IP address 203
and port number.
Software The programs that run on a computer system. 139
Solid state disk drive (SSD) A storage device with no moving parts that uses flash memory and a controller. 177
Source code Code written by a programmer. 143
Space-wise complexity A measure of how much memory will be required to complete an algorithm. 91
Sparse graph A graph with few edges. 44
SSID Service set identifier – an identifier, or name, for a wireless network. 193
Stack A data structure in which items are added to the top and removed from the top, much 41, 171
like a stack of plates.
Stack frame The collection of data associated with a subroutine call. 20
Stakeholders The people with an interest in the problem or the solution. 235
Star topology A network arranged with a switch (or hub) at the centre. 190
Start bit A bit sent at the start of a message in order to provide timing data. 188
Starting state The state a Turing machine is in when it starts its program. 96
State transition diagram A diagram showing the states, inputs and transitions in a FSM. 84
State transition table A table showing the states, inputs and transitions in a FSM. 84
Stateful inspection Inspecting the data contained in a packet. 198
Statelessness A system in which the processing does not depend on the state of another part of the 225
program.
Static The size of, and the memory assigned to, that data structure is fixed and cannot be changed. 38
Static method A method within a class that can be called without the need to create an object of 29
that class.
Stop bit A bit sent to mark the end of a message. 188
Store Copy a value from a register in the processor and save it in memory. 165
Subclass (or child class) A class that inherits the attributes and methods from another class. 25
Subnet A subdivision of a network. 205
Subnet mask A 32-bit number used to isolate the network identifier in an IP address. 205
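The network identifier can be isolated by applying a bitwise AND between the IP address and the subnet mask. The sketch below uses Python's standard `ipaddress` module to convert between dotted notation and integers; the function name and example addresses are assumptions.

```python
import ipaddress


def network_identifier(ip: str, mask: str) -> str:
    """Return the network identifier obtained by ANDing the address with the mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    # Bits that are 1 in the mask are kept; bits that are 0 are cleared.
    return str(ipaddress.IPv4Address(ip_int & mask_int))


network = network_identifier("192.168.1.37", "255.255.255.0")
```

Two devices are on the same subnet when their addresses give the same result with the same mask.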
Subroutine A named block of code designed to carry out one specific task. 18
Subset A selection of the values found in a set. 87
Substitution cipher A cipher in which each plaintext letter is replaced with a ciphertext letter. 133
Substring A series of characters taken from a string. 17
SWITCH An alternative selection structure to an IF statement that is slightly quicker to execute 11
if used with exact values rather than a range.
Switch A device that receives and forwards data on a network. 190
Syllogism An argument that uses logical reasoning. 71
Symmetric The same key is used in encryption and decryption. 135
Symmetric encryption The same key is used to encrypt and to decrypt a message. 198
Synchronous data transmission Where data is sent along with a timing signal. 188
Syntax The strict rules and structures used within a specific programming language. 10, 89
Syntax diagram A diagram used to describe BNF graphically. 90
System software Software intended to allow the computer system to run. 139
Table The representation of an entity within a database, that is a structure for storing 216
attributes.
Tail The remainder of a list, without its head. 232
TCP/IP stack The use of the TCP/IP layers to add header data which is then processed in reverse 201
order.
Terminal A single value that cannot be broken down into smaller parts. 89
Testing The fourth stage of systems development, in which the program is tested to ensure it 239
functions as it should.
Text file A file that uses text encoding (such as ASCII or Unicode) to store data. 37
Thick-client A significant amount of processing is done on the client side. 211
Thin-client Almost all processing is done on the server side. 211
Third normal form A database in which all data is atomic, there are no partial dependencies and no non- 217
key dependencies.
Time-wise complexity A measure of how much time, or how many steps, will be required to complete an algorithm. 91
Toner A powdered form of ink with a static charge. 175
Top level design An outline design that shows how the whole program will fit together. 237
Top level domain (TLD) The right-most term in a domain name, such as .com or .uk. 196
Top-down approach A method of planning solutions that starts with the big picture and breaks it down into 22
smaller subproblems.
Topology The physical or logical arrangement of connections in a network. 190
Trace table A table for recording the values of variables as a program runs. 72, 170
Track A concentric ring on a platter. 176, 177
Tractable A problem which can be completed in a reasonable amount of time (polynomial time 95
or less).
Translation Moving a vector by adding another vector to it. 49
Translator Systems software for converting one form of code into object code. 143
Transmission Control Protocol / Internet Protocol (TCP/IP) One of the main protocols used in network communications. 201
Transport layer Adds a port number, packet number and error detection data. 201
Tree A graph which has no cycles (it is not possible to loop back around to a previously 45
traversed part of the tree).
Tree traversal Inspecting the items stored in a tree data structure. 57
Trojan A malicious program that pretends to be a useful program. It does not self-replicate. 200
Truncation Removing any value after a certain number of decimal places. 14
Truth table A table showing the possible inputs and their corresponding outputs. 145
Turing machine A model of computation with a single program which manipulates symbols on a strip 96
of tape.
Two’s complement A representation of binary numbers that includes both positive and negative values. 111
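To make two's complement concrete, the sketch below converts between Python integers and fixed-width bit patterns. The 8-bit default width and the function names are assumptions for illustration.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit pattern of value using the given width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError("value does not fit in the available bits")
    # Masking a negative number with 2^bits - 1 gives its two's complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Convert a two's complement bit pattern back to an integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":
        # The most significant bit has a negative place value.
        value -= 1 << len(pattern)
    return value
```

With 8 bits the representable range is -128 to 127, and the most significant bit indicates a negative value.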
Underflow Where the result of a calculation is too small to be stored in the available digits. 121, 169
Uni-directional Data only travels in one direction. 160
Unicode A variable size character set in which each character code can be 8, 16 or 32 bits in 122
length.
Uniform resource locator (URL) A standard structure for addressing an online resource using alphanumeric strings. 196
Universal Turing machine (UTM) A Turing machine that takes the description of another Turing machine and its tape as inputs. It can simulate any conceivable Turing machine. 98
Unsigned binary A system that is only capable of representing positive numbers (0 or larger). 109
User interface The tools that are provided for a user to interact with a computer system. 140, 236
Variety Data in many forms; that is, unstructured. 224
Vector A data structure used to represent a position (for example, a two-dimensional vector 48
represents a position on a 2D plane).
Vector addition Adding two vectors together in order to perform a translation. 49
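Vector addition and scaling can be sketched with vectors stored as Python lists of numbers. The helper names `add_vectors` and `scale` are hypothetical.

```python
def add_vectors(a: list[float], b: list[float]) -> list[float]:
    """Translate vector a by vector b by adding matching elements."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return [x + y for x, y in zip(a, b)]


def scale(vector: list[float], scalar: float) -> list[float]:
    """Multiply each element of the vector by the scalar."""
    return [scalar * component for component in vector]


moved = add_vectors([1, 2], [3, 4])   # translation on a 2D plane
doubled = scale([1, 2], 2)            # scalar-vector multiplication
```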
Vector graphic An image made of lines and shapes. 128
Vector primitive Simple objects which can be combined to create a vector graphic. 128
Velocity The rate at which data is generated or collected. 224
Vernam cipher A cipher which uses a randomly generated key and is mathematically unbreakable. 134
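The Vernam cipher combines each unit of the message with the matching unit of the key using XOR. The sketch below works on bytes and assumes the key is random, at least as long as the message, and never reused; the function name `vernam` and the sample message are made up.

```python
import secrets


def vernam(message: bytes, key: bytes) -> bytes:
    """XOR each message byte with the matching key byte."""
    if len(key) < len(message):
        raise ValueError("the key must be at least as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))


plaintext = b"HELLO"
key = secrets.token_bytes(len(plaintext))   # randomly generated one-time pad
ciphertext = vernam(plaintext, key)
recovered = vernam(ciphertext, key)         # applying XOR again decrypts
```

Because XOR reverses itself, the same function both encrypts and decrypts.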
Virtual machine Software that emulates or simulates a computer system. 145
Virtual memory A portion of secondary storage used to store the least frequently-used instructions 158
and data.
Virtual method A method that may be overridden (this is the default in many languages). 29
Virus Malicious code that attaches to another program and self-replicates when executed. 200
Volatile Data is lost when electrical power is removed. 158, 176
Volatile memory Memory which can only store a value when supplied with power. 151
Volume The quantity of data to be stored, typically too much to fit on a single server. 224
Wi-Fi A set of technology standards that allows devices to communicate using radio 193
frequencies.
Wireless access point A hardware device for allowing other devices to connect to an existing wireless network. 193
Wireless network A network that allows devices to transmit data using radio frequencies. 193
Wireless network adapter A hardware device for enabling devices to communicate using Wi-Fi. 193
Word The maximum number of bits that can be processed in one instruction. 173
Worm A malicious program that copies itself over a network. 200
XOR A logical operation whose output is true only when its two inputs are different. 134