COMP1511 Intro To Programming
All information for this course will come from Andrew Taylor's and Andrew Bennett's slides/notes
Variables
Variables are used to store a value. The value a variable holds may change at any time. At any point in time a
variable stores exactly one value (except on quantum computers).
C variables have a type. For this course the main variables we use are:
• int - for integer values
• double - for decimal numbers
• char - for characters
Integer Representation
Typically 4 bytes are used to store an int variable.
4 bytes -> 32 bits -> 2^32 possible values (bit patterns)
This means only 2^32 integers can be represented.
These integers are: -2^31 to 2^31 - 1 (-2,147,483,648 to +2,147,483,647)
These limits are asymmetric because zero needs a pattern (the pattern being all zeros).
Integer Overflow/Underflow
Storing a value in an int that is outside the range it can represent is illegal. This can result in unexpected
behaviour from most C implementations, or it may cause programs to halt or fail to terminate. It can also create
security holes.
Bits used for int can be different on other platforms. For example, C on a tiny embedded CPU in a washing machine
may use 16 bits. For now we assume int uses 32 bits.
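Because overflow is illegal, it can help to check whether an addition would leave the int range before doing it. The following is a minimal sketch, not taken from the course notes: the helper name add_would_overflow is hypothetical, and it relies on INT_MAX and INT_MIN from limits.h, which hold the limits for whatever size int has on the current platform.

```c
#include <limits.h>

// Hypothetical helper: returns 1 if a + b would fall outside the
// range of int, 0 otherwise. The check itself never overflows.
int add_would_overflow(int a, int b) {
    if (b > 0 && a > INT_MAX - b) {
        return 1; // result would be above INT_MAX
    }
    if (b < 0 && a < INT_MIN - b) {
        return 1; // result would be below INT_MIN
    }
    return 0;
}
```

A caller would test add_would_overflow(a, b) first and only compute a + b when it returns 0.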
Real Representation
Commonly 8 bytes are used to store a double variable.
8 bytes -> 64 bits -> 2^64 possible values (bit patterns)
64 bits give a huge number of patterns, but there are infinitely many reals, so most reals can only be approximated.
Variable names:
Variable names can be made up of letters, digits and underscores.
Here are some unwritten rules about variable names:
• Use a lowercase letter to start your variable name
• Beware! variable names are case sensitive
• Beware! certain words can't be used as variable names; e.g. if, while, return, int, double. This is because
these keywords have special meanings in C programs
where components in brackets [] are optional. The minimum is therefore a % and a conversion character (e.g. %i).
Flags
Flag    Meaning
-       The output is left justified in its field, not right justified (the default).
+       Signed numbers will always be printed with a leading sign (+ or -).
space   Positive numbers are preceded by a space (negative numbers by a - sign).
0       For numeric conversions, pad with leading zeros to the field width.
#       An alternative output form. For o, the first digit will be '0'. For x or X, "0x" or "0X" will be
        prefixed to a non-zero result. For e, E, f, F, g and G, the output will always have a decimal
        point; for g and G, trailing zeros will not be removed.
Field width Converted argument will be printed in a field at least this wide, and wider if necessary.
If the converted argument has fewer characters than the field width, it will be padded on the left (or
right, if left adjustment has been requested) to make up the field width. The padding character is
normally ' ' (space), but is '0' if the zero padding flag (0) is present.
If the field width is specified as *, the value is computed from the next argument, which must be
an int.
Precision A dot '.' separates the field width from the precision.
If the precision is specified as *, the value is computed from the next argument, which must be an int.
Conversion Meaning
s The maximum number of characters to be printed from the string.
e, E, f The number of digits to be printed after the decimal point.
g, G The number of significant digits.
d, i, o, u, x,X The minimum number of digits to be printed. Leading zeros will be added to make
up the field width.
p Display a pointer (to any type). The representation is implementation dependent.
% Display the % character.
Above information from <http://personal.ee.surrey.ac.uk/Personal/R.Bowden/C/printf.html>
One method of doing this is the #define statement. #define statements go at the top of your program after the
#include statements and #define names should always be in capital letters and underscores.
#define NAME_OF_CONSTANT 100
Mathematics in C
C supports the usual maths operations: + - * /. The usual order of operations (BODMAS) applies.
Mathematical functions are kept in a separate maths library rather than the core C library, because tiny CPUs may not support them.
The library math.h contains mathematical functions such as sqrt(), sin(), cos(), tan() (These take
double as arguments and return double).
If Statements
Wednesday, 11 April 2018 8:43 PM
Many problems require executing statements only in some circumstances. This is sometimes called
control flow, branching or conditional execution. We do this in C using if statements.
Relational Operators
C has the usual operators to compare numbers:
Operator Meaning
> greater than
>= greater than or equal to
< less than
<= less than or equal to
!= not equal to
== Equal to
Be careful when comparing doubles for equality == or !=. Recall that doubles are approximations.
Many languages have a separate type for true and false. C just uses 0 for false and any non-zero number for
true.
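Since doubles are approximations, a common alternative to == is to ask whether two doubles are close enough. The sketch below assumes a hypothetical helper called nearly_equal and a tolerance chosen by the caller; neither comes from the course notes, and a suitable tolerance depends on the problem.

```c
// Hypothetical helper: returns 1 if a and b differ by less than
// tolerance, 0 otherwise.
int nearly_equal(double a, double b, double tolerance) {
    double difference = a - b;
    if (difference < 0) {
        difference = -difference; // take the absolute value
    }
    return difference < tolerance;
}
```

For example, nearly_equal(0.1 + 0.2, 0.3, 1e-9) treats the two values as equal even though their bit patterns may differ slightly.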
Logical Operators
C also has logical operators:
Operator Meaning
&& and operator - true if both operands are true
|| or operator - true if either operand is true
! not operator - true if and only if its operand is false
Functions
Wednesday, 11 April 2018 8:42 PM
Calling a function
When calling a function, you type the function name and the variable you will pass in to the function.
E.g. x = cube(number);
- cube is the function name
- number is the variable we are passing into the function.
Function Properties
• Functions have a type. This is the type of value they return.
• Type void is for functions that return no value. void is also used to indicate that a function has no
parameters.
• Functions cannot return arrays
• Functions have their own variables, which are created when the function is called and destroyed
when the function returns.
• A function's variables are not accessible outside the function
• return statements stop the execution of a function
• return statements specify the value to return unless the function is of type void
• A run-time error occurs if the end of a non-void function is reached without a return
Function Prototypes
Function prototypes allow a function to be called before it is defined. A prototype specifies key information about the function:
• function return type
• function name
• number and type of function parameters
Prototypes allow a top-down order of functions in the file, which makes it more readable. They also allow us to put function
definitions in a separate file. This matters for sharing code and for larger programs.
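As a small illustration of calling a function before its definition, the sketch below reuses the cube function mentioned earlier; sum_of_cubes is a hypothetical caller added only for this example.

```c
// Prototype: tells the compiler the return type, name and
// parameter types of cube before its definition appears.
int cube(int n);

// sum_of_cubes can call cube here because of the prototype above.
int sum_of_cubes(int a, int b) {
    return cube(a) + cube(b);
}

// The definition of cube comes after its first use.
int cube(int n) {
    return n * n * n;
}
```

Without the prototype line, the compiler would meet the call to cube before knowing its return type or parameters.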
Library Function
Over 700 functions are defined in the C standard library.
The C compiler needs to see a prototype for these functions before you use them. You do this indirectly with
an #include line. For example stdio.h contains prototypes for printf and scanf.
While Statements
Wednesday, 11 April 2018 8:42 PM
if statements only allow us to execute or not execute code. This means that they execute code either
once or never. while statements, by contrast, allow us to execute code repeatedly (zero or more times).
Like if, while statements have a controlling expression, but while statements execute their body of
statements repeatedly until the controlling expression is false.
Loop Counter
Often we use a loop counter variable to count loop repetitions. This allows us to have a while loop
execute n times.
loop_counter = 0;
while (loop_counter < n) {
printf("*");
loop_counter++;
// this is the same as loop_counter = loop_counter + 1;
}
printf("\n");
Termination
We can control the termination (stopping) of while loops in many ways. However, it is very easy to
write a while loop which does not terminate. Often a sentinel variable is used to stop a while loop
when a condition occurs in the body of the loop.
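A sentinel variable can be sketched as follows. The function name first_negative_index and the variable finished are hypothetical, chosen only for this example; the loop stops either when the counter reaches the end or when the sentinel is set in the loop body.

```c
// Hypothetical example: returns the index of the first negative
// value in numbers[0..length-1], or -1 if there is none.
int first_negative_index(int numbers[], int length) {
    int found_index = -1;
    int finished = 0; // sentinel variable: set to 1 to stop the loop
    int i = 0;
    while (i < length && !finished) {
        if (numbers[i] < 0) {
            found_index = i;
            finished = 1;
        }
        i = i + 1;
    }
    return found_index;
}
```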
We often need to nest while loops. When we do this we need separate counter variables for each
nested loop.
i = 0;
while (i < 10) { // outer loop reconstructed; a count of 10 rows is assumed
    j = 0;
    while (j < 10) {
        printf("* ");
        j = j + 1;
    }
    printf("\n");
    i = i + 1;
}
Arrays
Wednesday, 11 April 2018 8:42 PM
A C array is a collection of variables called array elements. All array elements must be of the same type. Array
elements do not have names; instead they are accessed by a number called the array index. The valid array indices for
an array with n elements are:
0, 1, 2, … , n - 1
Arrays must be initialised, or else weird stuff will happen, which you don’t want.
You also can't assign, scanf or printf whole arrays; instead you assign, scanf or printf individual array elements.
For example, to print an array you must print each element individually:
i = 0;
while (i < ARRAY_SIZE) {
printf("%d\n", array[i]);
i = i + 1;
}
A two-dimensional array (a matrix) would be useful for storing both rows and columns of data.
Declaring and initialising an array would look like this:
int matrix[3][3] = { {1, 2, 3},
{4, 5, 6},
{7, 8, 9} };
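Processing a two-dimensional array usually takes one loop per dimension, each with its own counter. A minimal sketch, assuming a hypothetical helper matrix_sum and a fixed size MATRIX_SIZE (both names are illustrative only):

```c
#define MATRIX_SIZE 3

// Hypothetical helper: adds up every element of a
// MATRIX_SIZE x MATRIX_SIZE matrix using one counter per loop.
int matrix_sum(int matrix[MATRIX_SIZE][MATRIX_SIZE]) {
    int sum = 0;
    int row = 0;
    while (row < MATRIX_SIZE) {
        int col = 0;
        while (col < MATRIX_SIZE) {
            sum = sum + matrix[row][col];
            col = col + 1;
        }
        row = row + 1;
    }
    return sum;
}
```

For the matrix declared above, the elements 1 to 9 add up to 45.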
Strings
Wednesday, 11 April 2018 8:42 PM
End of Input
Input functions such as scanf or getchar can fail because no input is available, e.g. if input is coming from a file
and the end of the file is reached. On UNIX-like systems (Linux/OSX) typing ctrl + D signals to the operating
system that there is no more input from the terminal. Windows has no equivalent to this, although some Windows programs
interpret ctrl + Z similarly.
getchar returns a special value to indicate there is no input available. This non-ASCII value is #defined as EOF in
stdio.h. On most systems EOF == -1. There is no end-of-file character on modern operating systems.
The programming pattern for reading characters to the end of input looks like this:
int ch;
ch = getchar();
while (ch != EOF) {
printf("'%c' read, ASCII code is %d\n", ch, ch);
ch = getchar();
}
Strings
A string in computer science is a sequence of characters. In C, strings are arrays of char containing ASCII
codes. These arrays of char have an extra element containing a 0. The extra 0 can also be written '\0' and may
be called a NULL character or NULL-terminator. This is convenient because programs don't have to track the
length of the string.
Note: "hello" will have 6 elements; 5 for the individual characters, 1 for the NULL terminator '\0'.
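To see why the terminator removes the need to track a length, here is a sketch of how a length can be recovered by scanning for '\0'. The function string_length is a hypothetical re-implementation written for illustration; real programs would normally call strlen from string.h.

```c
// Hypothetical re-implementation of strlen for illustration:
// counts characters until the '\0' terminator is reached.
int string_length(char string[]) {
    int length = 0;
    while (string[length] != '\0') {
        length = length + 1;
    }
    return length;
}
```

The count excludes the terminator, so a 5-character string still needs an array of at least 6 chars.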
The C library includes some useful functions which operate on characters. Here are some below:
#include <ctype.h>
Reading a string/line - fgets
fgets(array, array size, stream) reads a line of text
1. array - is a char type array. This is where the line will be stored.
2. array size - is the size of the array. This size is how big the line can be.
3. stream - is where the line will be read from e.g. stdin.
fgets will not store more characters than the array size, so it cannot overflow the array.
fgets always stores a terminating character, '\0', in the array.
fgets stores a newline character, '\n', in the array, if it reads an entire line. We often need to overwrite this newline character:
int i = strlen(line);
if (i > 0 && line[i - 1] == '\n') {
    line[i - 1] = '\0';
}
NEVER use the similar C function gets, which can overflow the array and is a major source of security exploits.
The programming pattern for using fgets looks like this:
#define MAX_LINE_LENGTH 1024
...
char line[MAX_LINE_LENGTH];
printf("Enter a line: ");
// fgets returns NULL if it can't read any characters
if (fgets(line, MAX_LINE_LENGTH, stdin) != NULL) {
    fputs(line, stdout);
    // or
    printf("%s", line); // same as fputs
}
string.h functions
Here are some string functions that may be useful (prototypes simplified):
#include <string.h>
// string length (not including '\0')
int strlen(char *s);
// string copy
char *strcpy(char *dest, char *src);
char *strncpy(char *dest, char *src, int n);
// string concatenation/append
char *strcat(char *dest, char *src);
char *strncat(char *dest, char *src, int n);
// string compare
int strcmp(char *s1, char *s2);
int strncmp(char *s1, char *s2, int n);
int strcasecmp(char *s1, char *s2);
int strncasecmp(char *s1, char *s2, int n);
// character search
char *strchr(char *s, int c);
char *strrchr(char *s, int c);
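The string.h functions are often combined. The sketch below uses strlen, strcpy and strcat together; the helper make_greeting and the constant MAX_GREETING are hypothetical names for this example. It checks the length first because strcpy and strcat do not check that the destination array is big enough.

```c
#include <string.h>

#define MAX_GREETING 64

// Hypothetical helper: writes "Hello, <name>" into greeting, which
// must have room for MAX_GREETING chars. Returns 0 if name is too
// long to fit, 1 otherwise.
int make_greeting(char greeting[MAX_GREETING], char name[]) {
    // 7 characters for "Hello, " plus 1 for the '\0' terminator
    if (strlen(name) + 8 > MAX_GREETING) {
        return 0;
    }
    strcpy(greeting, "Hello, ");
    strcat(greeting, name);
    return 1;
}
```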
Week 6 Tutorial
char is a data type that stores single characters
%c is used to read and print characters when using printf and scanf
When you initialise an array of characters use double quotes ""
e.g. char string[8] = "";
But when assigning specific ASCII values use single quotes
e.g. char character = 'A';
char string[9] = { 'C', 'O', 'M', 'P', '1', '5', '1', '1', '\0'};
printf("%s\n", string);
Remember: a string is an array of characters with a NULL terminator at the end.
Instead of individually assigning each ASCII character into the array you can just do this:
char string[9] = "COMP1511";
printf("%s\n", string);
if (argc != 2) {
    // perror takes a single string, so use fprintf to stderr for a formatted usage message
    fprintf(stderr, "Usage: %s <number>\n", argv[0]);
    return 1;
}
When we type in characters in terminal it doesn’t immediately print out the characters. This is because the
characters are held in a buffer waiting to be sent to the program once you press enter.
Cracking the caesar cipher: BRUTE FORCE!!!! There are only 25 shifts
Check against English words
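A brute-force attack can be built from a function that applies one candidate shift; the caller then tries each of the 25 shifts in turn. The sketch below is an assumption about how this might be written: caesar_shift and shift_string are hypothetical names, only lowercase letters are handled, and the "check against English words" step is left out.

```c
// Hypothetical helper: shifts a lowercase letter forward by shift
// places (0..25), wrapping from 'z' back to 'a'. Other characters
// are returned unchanged.
int caesar_shift(int character, int shift) {
    if (character >= 'a' && character <= 'z') {
        return 'a' + (character - 'a' + shift) % 26;
    }
    return character;
}

// Applies one candidate shift to every character of text in place.
void shift_string(char text[], int shift) {
    int i = 0;
    while (text[i] != '\0') {
        text[i] = caesar_shift(text[i], shift);
        i = i + 1;
    }
}
```

Shifting forward by 26 - k undoes a shift of k, so text encrypted with a shift of 3 is recovered by a shift of 23.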
I/O, Reading and Writing Files
Wednesday, 11 April 2018 2:04 PM
Can access with fprintf. This is just like printf but works with files
int fgetc(FILE *f);
This reads a single character from a stream. It returns EOF if no character is available.
int fputc(int, FILE *f);
This writes a single character to a stream.
int fscanf(FILE *f, …);
This performs scanf from a stream and returns the number of values read.
int fprintf(FILE *f, …);
This prints to a specified stream.
Example:
// opens a file called output.txt in "writing" mode
FILE* output_file = fopen("output.txt", "w");
The return value will be NULL if the file cannot be opened or if you try to open a file you don't have permission to access
fopen mode parameters
Mode   Description
"r"    opens an existing file for reading purposes
"w"    opens a text file for writing
       if it doesn’t exist a new file is created
       start writing from the top of the file (i.e. it will overwrite the original file)
"a"    opens a text file for writing in appending mode
       if doesn’t exist a new file is created
       starts writing at the end of existing file content
"r+"   opens a file for both reading and writing
"w+"   w+ truncates the file to zero if it exists
"a+"   a+ starts reading from the start of the file and writing at the end of the existing file contents
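Putting fopen, fprintf and fscanf together, a write-then-read round trip might look like the sketch below. The helpers save_number and load_number are hypothetical names for this example, and fclose (a standard stdio.h function not covered above) is used to close each file when finished.

```c
#include <stdio.h>

// Hypothetical helper: writes number to the named file.
// Returns 1 on success, 0 if the file could not be opened.
int save_number(char filename[], int number) {
    FILE *output_file = fopen(filename, "w");
    if (output_file == NULL) {
        return 0;
    }
    fprintf(output_file, "%d\n", number);
    fclose(output_file);
    return 1;
}

// Hypothetical helper: reads one int from the named file into *number.
// Returns 1 on success, 0 if the file could not be opened or parsed.
int load_number(char filename[], int *number) {
    FILE *input_file = fopen(filename, "r");
    if (input_file == NULL) {
        return 0;
    }
    int values_read = fscanf(input_file, "%d", number);
    fclose(input_file);
    return values_read == 1;
}
```

Both helpers check for NULL before using the stream, since fopen can fail for a missing file or missing permissions.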
Memory and Pointers
Wednesday, 11 April 2018 3:14 PM
Hexadecimal Representation
We can interpret the hexadecimal number 3AF1 as:
3 × 16^3 + 10 × 16^2 + 15 × 16^1 + 1 × 16^0
The base or radix is 16, and the digits are:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
The place values:
… 4096 256 16 1
… 16^3 16^2 16^1 16^0
We can write the number as 3AF1_16 (= 15089_10)
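The place-value expansion can be turned into code by multiplying a running total by 16 for each new digit. This is a sketch only: hex_to_int is a hypothetical helper, and it assumes the result fits in an int.

```c
// Hypothetical helper: converts a string of hexadecimal digits
// (0-9, A-F, a-f) to its integer value. Returns -1 on a bad digit.
// Assumes the result fits in an int.
int hex_to_int(char digits[]) {
    int value = 0;
    int i = 0;
    while (digits[i] != '\0') {
        int digit;
        if (digits[i] >= '0' && digits[i] <= '9') {
            digit = digits[i] - '0';
        } else if (digits[i] >= 'A' && digits[i] <= 'F') {
            digit = digits[i] - 'A' + 10;
        } else if (digits[i] >= 'a' && digits[i] <= 'f') {
            digit = digits[i] - 'a' + 10;
        } else {
            return -1;
        }
        // each step multiplies the running total by the base, 16
        value = value * 16 + digit;
        i = i + 1;
    }
    return value;
}
```

For "3AF1" the running total goes 3, 58, 943, 15089, matching the expansion above.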
Memory Organisation
Memory is effectively a GIANT array of bytes. When a program is
executed, program variables are stored in memory. Everything is
stored in memory somewhere. Since everything is stored in
memory, everything has an address.
Variables in Memory
int k;
int m;
printf("address of k is %p\n", &k);
// prints address of k is 0xbffffb80
printf("address of m is %p\n", &m);
// prints address of m is 0xbffffb84
k occupies the four bytes from 0xbffffb80 to 0xbffffb83
m occupies the four bytes from 0xbffffb84 to 0xbffffb87
Arrays in Memory
The elements of an array will be stored in consecutive memory locations.
int a[5];
int i = 0;
while (i < 5) {
    printf("address of a[%d] is %p\n", i, &a[i]);
    i++; // move to the next element; without this the loop never ends
}
// prints:
// address of a[0] is 0xbffffb60
// address of a[1] is 0xbffffb64
// address of a[2] is 0xbffffb68
// address of a[3] is 0xbffffb6c
// address of a[4] is 0xbffffb70
Note: to print the address in printf, use %p.
Pointers
A pointer is a data type whose value is a reference to another variable.
int *ip; // pointer to int
char *cp; // pointer to char
double *fp; // pointer to double
In most C implementations, pointers store the memory address of the variable they refer to; i.e. they
point to the variable, whose address they store.
Pointer syntax:
[type] *[some_name] = &[something];
For example:
int *my_pointer = &my_variable;
Importantly: the value of the pointer is the address of the variable it points to
For example:
int i = 7;
int *ip = &i;
printf("%d\n", *ip); // prints 7
*ip = *ip * 6;
printf("%d\n", i); // prints 42
i = 24;
printf("%d\n", *ip); // prints 24
Like other variables, pointers need to be initialised before they are used. It is best if novice programmers
initialise pointers as soon as they are declared. The value NULL can be assigned to a pointer to indicate it
does not refer to anything. NULL is a #define in stdio.h and NULL and 0 are interchangeable (in modern
C), however most programmers prefer NULL for readability.
Size of Pointers
Just like any other variable of a certain type, a variable that is a pointer also occupies space in memory.
The number of bytes depends on the computer's architecture.
• 32-bit platform: pointers are likely to be 4 bytes
• 64-bit platform: pointers are likely to be 8 bytes
• Tiny embedded CPU: pointers could be 2 bytes (e.g. your microwave)
Pointer Arguments
When we pass primitive variable types as arguments to functions, they are passed by value and any
changes made to them are not reflected in the caller function. Recall that scanf is a function that takes
a variable from the main function as an argument. How does a function like scanf manage to update the
value of a variable found in the main function? It takes pointers to those variables as arguments!!
We use pointers to pass variables by reference. By passing the address of a variable rather than its
value, we can change the value of that variable and have these changes be reflected in the caller
function.
void increment(int *n); // prototype, so main can call increment before its definition

int main(void) {
    int i = 1;
    increment(&i);
    printf("%d\n", i);
    // prints 2
    return 0;
}

void increment(int *n) {
    *n = *n + 1;
}
The function increment is called with the address of i, and that address is passed to the parameter n. The function
increases the value at the location referenced by n by 1.
In a sense, pointer arguments allow a function to 'return' more than one value. This increases the
versatility of functions. For example, scanf is able to read multiple values and it uses its return value as
an error status.
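The same idea can be sketched with a function that produces two results through pointer arguments and keeps its return value for an error status, in the style of scanf. The function divide and its parameter names are hypothetical, chosen for this example.

```c
// Hypothetical example: "returns" two results through pointer
// arguments and uses the real return value as an error status.
// Returns 1 on success, 0 if divisor is zero.
int divide(int dividend, int divisor, int *quotient, int *remainder) {
    if (divisor == 0) {
        return 0;
    }
    *quotient = dividend / divisor;
    *remainder = dividend % divisor;
    return 1;
}
```

A caller passes the addresses of two of its own variables, e.g. divide(17, 5, &q, &r), and checks the return value before using q and r.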
You have to be extremely careful when returning pointers. Returning a pointer to a local variable is
illegal - that variable is destroyed when the function returns. But, you can return a pointer that was
given as an argument.
// the return type must be int * because the function returns the pointer n
int *increment(int *n) {
    *n = *n + 1;
    return n;
}
Nested calling of functions is now possible: increment(increment(&i));
Array Representation
A C array has a very simple underlying representation: it is stored in an unbroken memory block, and the array name
acts as a pointer to the beginning of the block.
char s[] = "Hi!";
printf("s: %p *s: %c\n\n", s, *s);
printf("&s[0]: %p s[0]: %c\n", &s[0], s[0]);
printf("&s[1]: %p s[1]: %c\n", &s[1], s[1]);
printf("&s[2]: %p s[2]: %c\n", &s[2], s[2]);
printf("&s[3]: %p s[3]: %c\n", &s[3], s[3]);
// prints
// s: 0x7fff4b741060 *s: H
// &s[0]: 0x7fff4b741060 s[0]: H
// &s[1]: 0x7fff4b741061 s[1]: i
// &s[2]: 0x7fff4b741062 s[2]: !
// &s[3]: 0x7fff4b741063 s[3]:
Since array variables act as pointers, it now becomes clear why we can pass arrays to scanf without the need for
address-of (&) and why arrays are passed to functions by reference.
We can even use another pointer to act as the array name.
int nums[] = {1, 2, 3, 4, 5};
int *p = nums;
printf("%d\n", nums[2]);
printf("%d\n", p[2]);
// both print: 3
Since nums acts as a pointer we can directly assign its value to the pointer p.
We can even make a pointer point to the middle of an array:
int nums[] = {1, 2, 3, 4, 5};
int *p = &nums[2];
printf("%d %d\n", *p, p[0]);
// prints: 3 3
So what is the difference between an array variable and a pointer?
int i = 5;
p = &i; // this is OK
nums = &i; // this is an error
Unlike a regular pointer, an array variable is defined to point to the beginning of the array; it is constant and may
not be modified.
A good explanation about pointers and arrays:
https://edstem.org/courses/1950/discussion/80387
Extra C Features
Tuesday, 17 April 2018 9:06 PM
C Features we don't want you to use but are telling you what they are and how to use them
global variables
Variables declared outside of any function are available to all functions. They are called external variables or global
variables.
int g = 12;
void f(void) {
printf("The value of g is %d\n", g); // prints 12
g = 42;
}
int main(void) {
f();
printf("The value of g is %d\n", g); // prints 42
return 0;
}
static functions
Functions are shared between files by default. This is undesirable in large programs because name clashes become more
likely. Name clashes also make code difficult to reuse.
The keyword static makes functions visible only within the file. In other words, static limits the function's scope. If a
function doesn’t need to be visible declare it static e.g.
static double helper_function(int x, double y);
It allows files to be de facto modules in C. Similarly static makes global variables visible only within the file. Beware,
static has different meanings for local (function) variables.
When a function is called, its variables are created. When a function returns, its variables are destroyed. static changes
the lifetime of a function's (local) variable: the value is preserved between function calls. Static variables make
concurrency difficult and also make programs harder to read and understand. There is rarely a good reason to use
static variables - so do NOT use them in COMP1511. Note: static has a very different meaning outside
functions; poor language design much.
For example, here is a function that counts how many times it has been called.
void count(void) {
static int call_count = 0;
call_count++;
printf("I have been called %d times\n", call_count);
}
More C Operators
C provides some additional operators, which allow shorter statements. This can make your code a little more readable
or a lot less readable.
printf("%d %d", k, n); // k=6, n=7
Exiting a Program
In main, return will terminate a program.
stdlib.h provides a function useful outside main:
void exit(int status);
status is passed to exit in the same way as the return value of main.
stdlib.h defines EXIT_SUCCESS and EXIT_FAILURE. Exiting with EXIT_SUCCESS means the program executed successfully,
while exiting with EXIT_FAILURE means that the program stopped due to an error. EXIT_SUCCESS == 0 on Unix-like and
almost all other systems.
The compiler steps in and performs an automatic conversion, known as an implicit cast, from integer to double.
double d = 3; // 3 is converted to double
int i = 5;
...
d = d + i; // i is converted to double
Implicit conversions are generally performed when considered 'safe', e.g. numeric types are converted to
other numeric types with larger capacity. But sometimes unsafe implicit conversions are also performed. This is an
aspect of C that is often criticised. Consider:
int i = 1000;
char c1 = 100; // statically checked, OK
char c2 = 1000; // statically checked, warning
char c3 = i; // no warning
You should be mindful of implicit conversions. Often they make coding easier, but sometimes they mask
programming errors.
implicit conversion types and make them explicit.
#include <limits.h>
#include <assert.h>
...
assert(i >= CHAR_MIN && i <= CHAR_MAX);
char c = (char) i; // for some int i
Note: When using explicit casts the compiler will often assume that you know what you are doing and not issue warnings
even when a cast is very likely to be unsafe. For example:
int i = 1000;
char c = (char) i;
int *ip = (int *) i;
int nums[] = {0};
printf("%c\n", (char) i);
printf("%s\n", (char *) &i);
printf("%s\n", (char *) nums);
Here the casts are used to view one type as another. This is often dangerous!!!
typedef
We use the keyword typedef to give a name to a type:
typedef double real;
This means variables can be declared as numeric (real), but they will actually be of type double. Do not overuse
typedef - it can make programs harder to read. For example:
typedef int andrew;
andrew main(void) {
andrew i,j;
…
real matrix[1000][1000][1000];
real my_atanh(real x) {
real u = (1.0 - x)/(1.0 + x);
return -0.5 * log(u);
}
If we move to a platform with little RAM, we can save memory (and lose precision) by changing the typedef:
typedef float real;
struct student {
    int zid;
    char name[MAX_NAME];
    double lab_marks[N_LABS];
    double assignment1_mark;
    double assignment2_mark;
}; // need semi-colon to indicate you are done
   // declaring the struct
We can declare an array to hold the details of all students:
struct student comp1511_students[900];
Note: You can't just assign strings to structs. E.g.
comp1511_students[0].name = "Andrew";
You need to use strcpy:
strcpy(comp1511_students[0].name, "Andrew");
Combining structs and typedef
A common use of typedef is to give a name to a struct type.
typedef struct student student_t;
struct student {
    int zid;
    char name[64];
    double lab_marks[N_LABS];
    double assignment1_mark;
    double assignment2_mark;
};
student_t comp1511_students[900];
Programmers often use a naming convention to distinguish type names,
e.g. a _t suffix
Unlike arrays, a copy will be made of the entire structure, and only this copy will be passed to the function.
Unlike an array, a struct can be returned by a function:
student_t read_student_from_file(char filename[]) {
    ....
}
Pointers to structs
If a function needs to modify a struct's field or if we want to avoid the inefficiency of copying the entire struct,
we can instead pass a pointer to the struct as a parameter:
int scan_zid(student_t *s) {
    return scanf("%d", &((*s).zid));
}
The "arrow" operator is more readable:
int scan_zid(student_t *s) {
    return scanf("%d", &(s->zid));
}
If s is a pointer to a struct, s->field is equivalent to (*s).field
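The difference between passing a struct by value and passing a pointer to it can be seen in a small sketch. The struct point and both functions below are hypothetical and exist only for this illustration.

```c
// Hypothetical struct used only for this illustration.
struct point {
    int x;
    int y;
};

// Passing a struct by value: the function gets a copy, so the
// caller's struct is unchanged.
void move_copy(struct point p, int dx) {
    p.x = p.x + dx;
}

// Passing a pointer: the function changes the caller's struct.
void move_in_place(struct point *p, int dx) {
    p->x = p->x + dx; // same as (*p).x = (*p).x + dx
}
```

After move_copy(p, 10) the caller's p.x is unchanged, whereas move_in_place(&p, 10) adds 10 to it.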
Nested Structures
One structure can be nested inside another:
typedef struct date Date;
typedef struct time Time;
typedef struct speeding Speeding;
struct date {
int day, month, year;
};
struct time {
int hour, minute;
};
struct speeding {
Date date;
Time time;
double speed;
char plate[MAX_PLATE];
};
Malloc
Wednesday, 2 May 2018 2:08 PM
malloc() allocates memory in "the heap" and returns a pointer to a block of memory (it returns the
address of the memory it allocated). malloc() returns a (void *) pointer, which can be assigned to any
pointer type. If insufficient memory is available then malloc() returns NULL.
Memory "lives" on forever until we free it. free() indicates that you have finished using a block of
memory. Continuing to use memory after free() results in very nasty bugs. Using
free() on a memory block twice can also cause bad bugs. If a program keeps calling malloc() without
corresponding free() calls, then the program's memory will grow steadily larger. This is called a
memory leak. Memory leaks are major issues for long-running programs.
sizeof
sizeof is a C operator that yields the number of bytes needed for a type or variable.
You use it like this:
sizeof (type)
sizeof variable_name
Note: the syntax is unusual (badly designed); brackets indicate that the argument is a type.
You should use sizeof for every malloc call.
Here are some more examples of using sizeof:
// sizeof yields a size_t, which is printed with %zu
printf("%zu", sizeof (char)); // 1
printf("%zu", sizeof (int)); // 4 commonly
printf("%zu", sizeof (double)); // 8 commonly
printf("%zu", sizeof (int[10])); // 40 commonly
printf("%zu", sizeof (int *)); // 4 or 8 commonly
printf("%zu", sizeof "hello"); // 6
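A typical malloc call combines sizeof, a NULL check and a later free. The sketch below assumes a hypothetical helper make_squares; the caller is responsible for calling free on the result exactly once.

```c
#include <stdlib.h>

// Hypothetical helper: allocates an array of length ints on the heap
// and fills it with the squares 0, 1, 4, 9, ...
// Returns NULL if malloc fails. The caller must free() the result.
int *make_squares(int length) {
    int *squares = malloc(length * sizeof (int));
    if (squares == NULL) {
        return NULL;
    }
    int i = 0;
    while (i < length) {
        squares[i] = i * i;
        i = i + 1;
    }
    return squares;
}
```

Using sizeof (int) rather than a hard-coded 4 keeps the allocation correct on platforms where int has a different size.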
Linked Lists
Wednesday, 2 May 2018 2:08 PM
Self-Referential Structures
We can define a structure containing a pointer to the same type of structure like this:
struct node {
    struct node *next;
    int data;
};
These "self-referential" pointers can be used to build larger "dynamic" data structures out of
smaller building blocks.
Linked Lists
The most fundamental of these dynamic data structures is the Linked List.
It is based on the idea of a sequence of data items or nodes. Linked lists are more flexible
than arrays:
• Items do not have to be located next to each other in memory
• Items can easily be rearranged by altering pointers
• The number of items can change dynamically
• Items can be added or removed in any order
A linked list is a sequence of items.
Each item contains data and a pointer to the next item. You need to separately store a
pointer to the first item or "head" of the list. The last item in the list is special - it contains
NULL in its next field instead of a pointer to an item.
Example of List Item
Example of a list item used to store an address
In C code:
struct address_node {
    struct address_node *next;
    char *name;
    char *address;
    char *telephone;
    char *email;
};
Finding an item in a list: using a shorter while loop
Same function but using a more concise while loop.
Shorter does not always mean more readable.
// return pointer to first node containing
// specified value, return NULL if no such node
struct node *find_node(struct node *head, int data) {
    struct node *n = head;
    while (n != NULL && n->data != data) {
        n = n->next;
    }
    return n;
}
Finding an item in a list: recursive
Same function but function calls itself.
// return pointer to first node containing
// specified value, return NULL if no such node
struct node *find_node(struct node *head, int data) {
    if (head == NULL) {
        return NULL;
    }
    if (head->data == data) {
        return head;
    }
    return find_node(head->next, data);
}
Finding an item in a list: shorter recursive
Same function but a more concise recursive version.
Shorter does not always mean more readable.
// return pointer to first node containing
// specified value, return NULL if no such node
struct node *find_node(struct node *head, int data) {
    if (head == NULL || head->data == data) {
        return head;
    }
    return find_node(head->next, data);
}
return n;
head = create_node(17, head);
} else {
head = create_node(13, head);
struct node *l = last(head);
l->next = n;
Summing a list return list;
// return sum of list data fields }
}
int sum(struct node *head) {
int sum = 0; Deleting all items from a list
struct node *n = head;
// Delete all the items from a linked list.
// execute until end of list
while (n != NULL) {
void delete_all(struct node *head) {
sum += n->data;
struct node *n = head;
// make n point to next item
struct node *tmp;
n = n->next;
while (n != NULL) {
}
tmp = n;
return sum;
n = n->next;
}
free(tmp);
}
Summing a list: using a for loop }
// return sum of list data fields
int sum(struct node *head) {
    int sum = 0;
    for (struct node *n = head; n != NULL; n = n->next) {
        sum += n->data;
    }
    return sum;
}

Summing a list: recursive
Same function but using a recursive call.
// return sum of list data fields
int sum2(struct node *head) {
    if (head == NULL) {
        return 0;
    }
    return head->data + sum2(head->next);
}

Finding an item in a list
// return pointer to first node containing
// specified value, return NULL if no such node
struct node *find_node(struct node *head, int data) {
    struct node *n = head;
    // search until end of list reached
    while (n != NULL) {
        if (n->data == data) {
            // matching item found
            return n;
        }
        // make n point to next item
        n = n->next;
    }
    // item not in list
    return NULL;
}

Finding an item in a list: using a for loop
// return pointer to first node containing
// specified value, return NULL if no such node
struct node *find_node(struct node *head, int data) {
    for (struct node *n = head; n != NULL; n = n->next) {
        if (n->data == data) {
            return n;
        }
    }
    return NULL;
}

Inserting a node into an ordered list
struct node *insert(struct node *head, struct node *node) {
    // previous must start as NULL so the "insert at head" test below works
    struct node *previous = NULL;
    struct node *n = head;
    // find correct position
    while (n != NULL && node->data > n->data) {
        previous = n;
        n = n->next;
    }
    // link new node into list
    if (previous == NULL) {
        head = node;
    } else {
        previous->next = node;
    }
    node->next = n;
    return head;
}

Inserting a node into an ordered list: recursive
struct node *insert(struct node *head, struct node *node) {
    if (head == NULL || head->data >= node->data) {
        node->next = head;
        return node;
    }
    head->next = insert(head->next, node);
    return head;
}

Deleting a node from a list
struct node *delete(struct node *head, struct node *node) {
    if (node == head) {
        head = head->next; // remove first item
        free(node);
    } else {
        struct node *previous = head;
        while (previous != NULL && previous->next != node) {
            previous = previous->next;
        }
        if (previous != NULL) { // node found in list
            previous->next = node->next;
            free(node);
        } else {
            fprintf(stderr, "warning: node not in list\n");
        }
    }
    return head;
}

Deleting a node from a list: recursive
struct node *delete(struct node *head, struct node *node) {
    if (head == NULL) {
        fprintf(stderr, "warning: node not in list\n");
    } else if (node == head) {
        head = head->next; // remove first item
        free(node);
    } else {
        // node is not the first item: delete it from the rest of the list
        head->next = delete(head->next, node);
    }
    return head;
}
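A minimal sketch of how the list functions on this page fit together. The struct node layout (an int data field plus a next pointer) and the create_node helper are assumptions based on how they are used in these notes; the names insert_ordered, sum_list and free_list are chosen only so the sketch compiles as a single file.

```c
#include <stdio.h>
#include <stdlib.h>

// Assumed node layout: one int of data plus a link to the next node.
struct node {
    int data;
    struct node *next;
};

// Assumed helper: allocate a node holding data, linked in front of next.
struct node *create_node(int data, struct node *next) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    n->data = data;
    n->next = next;
    return n;
}

// Insert node into a list kept in non-decreasing order (iterative version).
struct node *insert_ordered(struct node *head, struct node *node) {
    struct node *previous = NULL;   // NULL means "insert before the head"
    struct node *n = head;
    while (n != NULL && node->data > n->data) {
        previous = n;
        n = n->next;
    }
    if (previous == NULL) {
        head = node;
    } else {
        previous->next = node;
    }
    node->next = n;
    return head;
}

// Return the sum of the data fields.
int sum_list(struct node *head) {
    int total = 0;
    for (struct node *n = head; n != NULL; n = n->next) {
        total += n->data;
    }
    return total;
}

// Free every node in the list.
void free_list(struct node *head) {
    while (head != NULL) {
        struct node *tmp = head;
        head = head->next;
        free(tmp);
    }
}
```

Inserting 13, 5 and 17 in that order with insert_ordered should leave the list as 5, 13, 17, and sum_list should then return 35. Note that every function that can change the first node returns the (possibly new) head, so the caller must write head = insert_ordered(head, ...).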
Multiple C files, Header files
Tuesday, 15 May 2018 1:17 PM
We can make programs with multiple .c files. Compiling is no different from compiling a program from one .c file; list every .c file on the command line:
dcc -o program_name first.c second.c
Note: You cannot have more than one main function, so make sure there is only ONE main function amongst the
code in your .c files.
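Functions defined in one .c file are usually shared with the other files through a header file that holds their prototypes. A minimal sketch, shown as a single listing so it can be compiled as-is; the file names maths.h, maths.c and main.c and the function square are hypothetical.

```c
// ---- maths.h (hypothetical header): declarations shared between .c files ----
#ifndef MATHS_H
#define MATHS_H

int square(int x);   // prototype only; the definition lives in maths.c

#endif

// ---- maths.c (hypothetical): would begin with  #include "maths.h" ----
int square(int x) {
    return x * x;
}

// ---- main.c (hypothetical): would also begin with  #include "maths.h" ----
// Its main function could then call square(7) even though square
// is defined in a different file.
```

With the three parts saved as separate files, the program would be compiled by naming only the .c files, for example: dcc -o program_name main.c maths.c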
Stacks and Queues
Wednesday, 16 May 2018 2:24 PM
Stacks and Queues are ubiquitous data structures in computing. They are part of many algorithms and are a good
example of abstract data types.
Stack - Abstract Data Type - C Interface
typedef struct stack_internals *stack;
stack stack_create(void);
void stack_free(stack stack);
void stack_push(stack stack, int item);
int stack_pop(stack stack);
int stack_is_empty(stack stack);
int stack_top(stack stack);
int stack_size(stack stack);

Using stacks in C interface:
stack s;
s = stack_create();
stack_push(s, 10);
stack_push(s, 11);
stack_push(s, 12);
printf("%d\n", stack_size(s)); // prints 3
printf("%d\n", stack_top(s)); // prints 12
printf("%d\n", stack_pop(s)); // prints 12
printf("%d\n", stack_pop(s)); // prints 11
printf("%d\n", stack_pop(s)); // prints 10
The implementation of a stack is opaque (hidden from the user). A user's program cannot depend on how the stack is
implemented, so the stack implementation can change without risk of breaking users' programs. This type of
information hiding is crucial to managing complexity in large software systems.

Queue - Abstract Data Type - C Interface
typedef struct queue_internals *queue;
queue queue_create(void);
void queue_free(queue queue);
void queue_enqueue(queue queue, int item);
int queue_dequeue(queue queue);
int queue_is_empty(queue queue);
int queue_front(queue queue);
int queue_size(queue queue);

Using queues in C interface:
queue q;
q = queue_create();
queue_enqueue(q, 10);
queue_enqueue(q, 11);
queue_enqueue(q, 12);
printf("%d\n", queue_size(q)); // prints 3
printf("%d\n", queue_front(q)); // prints 10
printf("%d\n", queue_dequeue(q)); // prints 10
printf("%d\n", queue_dequeue(q)); // prints 11
printf("%d\n", queue_dequeue(q)); // prints 12
Like stacks, the implementation of queues is opaque. The queue implementation can change without risk of breaking
user programs.
Implementing A Stack with a Linked List
A stack can be implemented using a linked list, by adding and removing at the head
[push() and pop()]. For a queue, we need to either add or remove at the tail.
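A sketch of one possible linked-list implementation of the stack interface, with push and pop both working at the head of the list. The layout of struct stack_internals, the struct stack_node type and the parameter name s are assumptions; the interface only promises the function names and the opaque stack type.

```c
#include <assert.h>
#include <stdlib.h>

// One list node per stacked item (assumed layout).
struct stack_node {
    int item;
    struct stack_node *next;
};

// The hidden internals behind the opaque pointer type.
struct stack_internals {
    struct stack_node *top;   // head of the linked list
    int size;                 // number of items currently stored
};

typedef struct stack_internals *stack;

stack stack_create(void) {
    stack s = malloc(sizeof *s);
    assert(s != NULL);
    s->top = NULL;
    s->size = 0;
    return s;
}

// Push: add a new node at the head of the list.
void stack_push(stack s, int item) {
    struct stack_node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->item = item;
    n->next = s->top;
    s->top = n;
    s->size++;
}

// Pop: remove the node at the head of the list and return its item.
int stack_pop(stack s) {
    assert(s->top != NULL);   // popping an empty stack is a caller error
    struct stack_node *n = s->top;
    int item = n->item;
    s->top = n->next;
    free(n);
    s->size--;
    return item;
}

int stack_is_empty(stack s) {
    return s->top == NULL;
}

int stack_top(stack s) {
    assert(s->top != NULL);
    return s->top->item;
}

int stack_size(stack s) {
    return s->size;
}

// Free any remaining nodes, then the stack itself.
void stack_free(stack s) {
    while (!stack_is_empty(s)) {
        stack_pop(s);
    }
    free(s);
}
```

Both push and pop touch only the first node, so neither needs to traverse the list. Keeping a size counter is a design choice of this sketch; stack_size could instead count the nodes each time it is called.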
Adding an item to the Tail of a list:
Adding an item to the tail is achieved by making the last node of the list point to the new
node. We first need to scan along the list to find the last item.
struct node *add_to_tail(struct node *new_node, struct node *head) {
if (head == NULL) { // list is empty
head = new_node;
} else { // list not empty
struct node *node = head;
while (node->next != NULL) {
node = node->next; // scan to end
}
node->next = new_node;
}
return head;
}
Efficiency Issues:
Unfortunately, this implementation is very slow. Every time a new item is inserted, we
need to traverse the entire list (which could be very large). We can do the job more
efficiently if we retain a direct link to the last item, or "tail", of the list:
if (tail == NULL) { // list is empty
head = node;
} else { // list not empty
tail->next = node;
}
tail = node;
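Note: With a singly linked list there is no way to efficiently remove items from the tail, because the node
before the tail can only be found by traversing the list.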
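A sketch of a queue that keeps both a head and a tail pointer, so items are added at the tail and removed at the head without any traversal. The layout of struct queue_internals, the struct queue_node type and the parameter name q are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct queue_node {
    int item;
    struct queue_node *next;
};

struct queue_internals {
    struct queue_node *head;   // front of the queue: items leave here
    struct queue_node *tail;   // back of the queue: items join here
    int size;
};

typedef struct queue_internals *queue;

queue queue_create(void) {
    queue q = malloc(sizeof *q);
    assert(q != NULL);
    q->head = NULL;
    q->tail = NULL;
    q->size = 0;
    return q;
}

// Enqueue: link the new node after the current tail.
void queue_enqueue(queue q, int item) {
    struct queue_node *node = malloc(sizeof *node);
    assert(node != NULL);
    node->item = item;
    node->next = NULL;
    if (q->tail == NULL) {     // list is empty
        q->head = node;
    } else {                   // list not empty
        q->tail->next = node;
    }
    q->tail = node;
    q->size++;
}

// Dequeue: remove the node at the head.
int queue_dequeue(queue q) {
    assert(q->head != NULL);   // dequeuing an empty queue is a caller error
    struct queue_node *node = q->head;
    int item = node->item;
    q->head = node->next;
    if (q->head == NULL) {     // removed the last item, so no tail either
        q->tail = NULL;
    }
    free(node);
    q->size--;
    return item;
}

int queue_is_empty(queue q) {
    return q->head == NULL;
}

int queue_front(queue q) {
    assert(q->head != NULL);
    return q->head->item;
}

int queue_size(queue q) {
    return q->size;
}

void queue_free(queue q) {
    while (!queue_is_empty(q)) {
        queue_dequeue(q);
    }
    free(q);
}
```

The one subtle step is in queue_dequeue: when the last item is removed, tail must be reset to NULL as well, otherwise the next enqueue would write through a pointer to freed memory.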
A calculator using RPN (Reverse Polish Notation) is called a Postfix Calculator. It can be implemented using a stack:
• When a number is entered: push it onto the stack
• When an operator is entered: pop the top two items from the stack, apply the
operator to them, and push the result back onto the stack
#include <stdio.h>
#include <ctype.h>
#include "stack.h"
int main(void) {
int ch;
stack s = stack_create();
while ((ch = getc(stdin)) != EOF) {
if (ch == '\n') {
printf("Result: %d\n", stack_pop(s));
} else if (isdigit(ch)) {
ungetc(ch, stdin); // put first digit back
int num;
scanf("%d", &num); // now scan entire number
stack_push(s, num);
} else if (ch == '+' || ch == '-' || ch == '*') {
int a = stack_pop(s);
int b = stack_pop(s);
int result;
if (ch == '+') {
result = b + a;
} else if (ch == '-') {
result = b - a;
} else {
result = b * a;
}
stack_push(s, result);
}
}
}
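A variant of the calculator above that evaluates a string instead of reading stdin, which makes it easier to test. This is only a sketch: the function name rpn_eval, the RPN_MAX_DEPTH limit and the use of a small fixed-size array in place of the stack ADT are all assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define RPN_MAX_DEPTH 64   // arbitrary limit chosen for this sketch

// Evaluate a postfix expression such as "3 4 + 2 *" and return the result.
// Supports non-negative integers and the operators + - *.
int rpn_eval(const char *expr) {
    int stack[RPN_MAX_DEPTH];
    int depth = 0;             // number of items currently on the stack
    const char *p = expr;

    while (*p != '\0') {
        if (isdigit((unsigned char)*p)) {
            // scan the entire number, then continue after its last digit
            char *end;
            long num = strtol(p, &end, 10);
            assert(depth < RPN_MAX_DEPTH);
            stack[depth++] = (int)num;
            p = end;
        } else if (*p == '+' || *p == '-' || *p == '*') {
            assert(depth >= 2);        // an operator needs two operands
            int a = stack[--depth];    // popped first: right-hand operand
            int b = stack[--depth];    // popped second: left-hand operand
            int result;
            if (*p == '+') {
                result = b + a;
            } else if (*p == '-') {
                result = b - a;
            } else {
                result = b * a;
            }
            stack[depth++] = result;
            p++;
        } else {
            p++;                       // skip spaces and other characters
        }
    }
    assert(depth == 1);                // exactly one value should remain
    return stack[0];
}
```

The order of the two pops matters for subtraction: in "10 3 -" the 3 is popped first, so the result is computed as 10 - 3.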
Illegal C
Wednesday, 23 May 2018 2:33 PM
Consequences of bugs
• The compiler gives syntax/semantic errors (if you're very lucky)
• The program halts with run-time error (if you're lucky)
• The program never halts (if you're lucky-ish)
• The program halts, but with incorrect results (if you're unlucky)
• The program appears correct, but has security holes (if you're unlucky)
Changed Variable
int a[10];
int b[10];
printf("a[0] is at address %p\n",&a[0]);
printf("a[9] is at address %p\n", &a[9]);
printf("b[0] is at address %p\n",&b[0]);
printf("b[9] is at address %p\n", &b[9]);
for (int i = 0; i < 10; i++) {
a[i] = 77;
}
for (int i = 0; i <= 12; i++) {
b[i] = 42;
}
for (int i = 0; i < 10; i++) {
printf("%d ", a[i]);
}
printf("\n");
The C program assigns to b[10]..b[12], which do not exist.
The consequence could be anything: a C implementation is permitted to behave in
any manner given an invalid program. On gcc 6.3 on Linux/x86_64 it happens to
change a[0] to 42, because b[12] would lie at the same address as a[0]:
$ gcc invalid_array_index0.c
$ a.out
a[0] is at address 0x7fffc9cbcbf0
a[9] is at address 0x7fffc9cbcc14
b[0] is at address 0x7fffc9cbcbc0
b[9] is at address 0x7fffc9cbcbe4
42 77 77 77 77 77 77 77 77 77
Changed Termination
int i;
int a[10];
printf("i is at address %p\n", &i);
printf("a[0] is at address %p\n", &a[0]);
printf("a[9] is at address %p\n", &a[9]);
printf("a[10] would be stored at address %p\n", &a[10]);
for (i = 0; i <= 11; i++) {
a[i] = 0;
}
Another invalid C program assigning to a non-existent array element.
On gcc 6.3 on Linux/x86_64 a[11] happens to lie at the same address as i, so the
assignment a[11] = 0 resets i and the loop doesn't terminate.
So a single-character error makes the program invalid, and seemingly certain termination
does not occur.
$ gcc invalid1.c
$ a.out
i is at address 0x7fffbb72bfdc
a[0] is at address 0x7fffbb72bfb0
a[9] is at address 0x7fffbb72bfd4
a[10] is equivalent to address 0x7fffbb72bfd8
....
int main(void) {
int answer = 42;
f();
answer = 24;
printf("answer=%d\n", answer);
return 0;
}
Yet another invalid C program assigning to a non-existent array element.
With gcc 6.3 on Linux/x86_64 it changes where the function f returns to in main, so the
assignment answer = 24 is skipped.
$ gcc invalid3.c
$ a.out
answer=42
Bypassing Authentication
int authenticated = 0;
char password[8];
printf("Enter your password: ");
gets(password);
if (strcmp(password, "secret") == 0) {
authenticated = 1;
}
// a password longer than 7 characters will overflow the
// array password (8 bytes, one of them for the '\0');
// on gcc 6.3 on Linux/x86_64 this can overwrite the
// variable authenticated and allow access
if (authenticated) {
printf("Welcome. You are authorized.\n");
} else {
printf("Welcome. You are unauthorized. ");
printf("Your death will now be implemented.\n");
printf("Welcome. You will experience ");
printf("a tingling sensation and then death. \n");
printf("Remain calm while your life is extracted.\n");
}
Yet another invalid C program assigning to a non-existent array element.
gets does not check the size of the array it writes to, so a password longer than 7 characters
(plus the terminating '\0') will overflow the 8-byte array password. This is often termed a buffer overflow.
$ gcc invalid4.c
$ a.out
Enter your password: secret
Welcome. You are authorized.
$ a.out
Enter your password: wrong
Welcome. You are unauthorized.
Your death will now be implemented.
Welcome. You will experience a
tingling sensation and then death.
Remain calm while your life is extracted.
$ a.out
Enter your password: longcorrectpassword
Welcome. You are authorized.
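One way to avoid the overflow is to read the line with fgets, which is told the size of the array and never writes past it. A sketch only; read_line and check_password are hypothetical helpers, not part of the lecture code.

```c
#include <stdio.h>
#include <string.h>

// Read one line from stream into buffer, storing at most size - 1 characters
// plus a terminating '\0'. The trailing newline is removed and any input
// that did not fit is discarded. size must be at least 1.
// Returns 1 if the whole line fit, 0 if it was too long or nothing was read.
int read_line(char *buffer, int size, FILE *stream) {
    if (fgets(buffer, size, stream) == NULL) {
        buffer[0] = '\0';
        return 0;
    }
    size_t length = strlen(buffer);
    if (length > 0 && buffer[length - 1] == '\n') {
        buffer[length - 1] = '\0';    // strip the newline
        return 1;
    }
    // No newline was stored: look at what follows in the input.
    int ch = getc(stream);
    if (ch == '\n' || ch == EOF) {
        return 1;                     // the line fitted exactly
    }
    while (ch != '\n' && ch != EOF) { // discard the rest of a long line
        ch = getc(stream);
    }
    return 0;
}

// Returns 1 only if the line read from stream is exactly "secret".
int check_password(FILE *stream) {
    char password[8];
    int authenticated = 0;
    if (read_line(password, sizeof password, stream)
            && strcmp(password, "secret") == 0) {
        authenticated = 1;
    }
    return authenticated;
}
```

With this version an over-long password such as longcorrectpassword is cut off at 7 characters and rejected, and the variable authenticated can no longer be overwritten by the input.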
Implementation vs Language
C was designed for much smaller, slower computers (28K of RAM, 1 MHz clock).
Program speed and size were much more important then, and dominated language design
choices.
Most C implementations still focus on maximising performance of valid programs.
Most C implementations do not check array bounds or for arithmetic overflow
because these checks have performance costs. The C definition does not require them.
A C implementation can check array bounds and halt if invalid indexes are used.
A C implementation could check and halt if an uninitialised value is used, but this is
difficult/expensive to track for arrays.
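Since the language does not insert bounds checks itself, a program can make the check by hand at a small run-time cost. A minimal sketch; checked_store is a hypothetical helper, not a standard C function.

```c
#include <stdio.h>

// Store value in array[index] only if index is a valid position.
// Returns 1 on success; returns 0 and leaves the array untouched
// if index is outside 0..length-1.
int checked_store(int array[], int length, int index, int value) {
    if (index < 0 || index >= length) {
        fprintf(stderr, "index %d out of bounds (length %d)\n", index, length);
        return 0;
    }
    array[index] = value;
    return 1;
}
```

If the loop from the Changed Variable example (i from 0 to 12 on a 10-element array) used checked_store, the three invalid stores would be reported and skipped instead of silently changing another variable.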
#0 0x55819087cd2b in test3 debug_examples.c:33
#1 0x55819087d19c in main debug_examples.c:96
#2 0x7fccf078d2b0 in __libc_start_main (/lib/...
#3 0x55819087caf9 in _start ...
....
dcc uses -fsanitize=address (with clang) but makes messages more comprehensible for
beginner programmers:
$ cd /home/cs1511/public_html/lec/illegal_C/code/
$ dcc debug_examples.c
$ ./a.out 3
ASAN:DEADLYSIGNAL
int *a = NULL;
// dereferencing NULL pointer
--> a[5] = 42;
}
Values when execution stopped:
Address Sanitizer does not detect the use of uninitialised values. e.g.:
% ./debug_examples 4
0
1
2
3
-2115323248
5
6
7
8
9
...
dcc --valgrind
dcc --valgrind causes valgrind to be used to run your program. It makes messages more
comprehensible for beginner programmers:
$ dcc --valgrind debug_examples.c
% ./a.out 4
Runtime error: uninitialized variable accessed.
Execution stopped in test4() debug_examples.c line 45:
// accessing uninitialized array element (a[4])
for (i = 0; i < 10; i++)
--> printf("%d\n", a[i]);
}
Searching and Sorting
Wednesday, 23 May 2018 2:33 PM
Efficiency
COMP1511 focuses on writing programs, but efficiency is also important. We often need to consider:
• Execution time
• Memory use
A correct but slow program can be useless. Efficiency often depends on the size of the data being processed.
Understanding this dependency lets us predict program performance on larger data.
Searching

Linear Search Unordered Array
int linear_search(int array[], int length, int x) {
    for (int i = 0; i < length; i = i + 1) {
        if (array[i] == x) {
            return 1;
        }
    }
    return 0;
}
An informal analysis:
Operations:
• Start at the first element
• Inspect each element in turn
• Stop when you find X or reach the end
If there are N elements to search:
• Best case scenario: we only check 1 element
• Worst case scenario: we need to check N elements
• If the element is in the list we will check on average N/2 elements
• If it is not in the list, we will check N elements

Linear Search Ordered Array
int linear_ordered(int array[], int length, int x) {
    for (int i = 0; i < length; i = i + 1) {
        if (array[i] == x) {
            return 1;
        } else if (array[i] > x) {
            return 0;
        }
    }
    return 0;
}
An informal analysis:
Operations:
• Start at the first element
• Inspect each element in turn
• Stop when you find X or find a value > X or reach the end
If there are N elements to search:
• Best case scenario: we only check 1 element
• Worst case scenario: we need to check N elements
• If the element is in the list we will check on average N/2 elements
• If it is not in the list, we will check on average N/2 elements

Binary Search Ordered Array
int binary_search(int array[], int length, int x) {
    int lower = 0;
    int upper = length - 1;
    while (lower <= upper) {
        int mid = (lower + upper) / 2;
        if (array[mid] == x) {
            return 1;
        } else if (array[mid] > x) {
            upper = mid - 1;
        } else {
            lower = mid + 1;
        }
    }
    return 0;
}
An informal analysis:
Operations:
• Start with the entire array
• At each step halve the range the element may be in
• Stop when you find X or when the range is empty
If there are N elements to search:
• Best case scenario: we only check 1 element
• Worst case scenario: we need to check log2(N) + 1 elements
• If the element is in the list we will check on average about log2(N) elements
log2(N) grows very slowly:
• log2(10) ≈ 3.3
• log2(1000) ≈ 10
• log2(1000000) ≈ 20
• log2(1000000000) ≈ 30
• log2(1000000000000) ≈ 40
Physicists estimate 10^80 atoms in the universe: log2(10^80) ≈ 266
A binary search over all the atoms in the universe would take < 1 microsecond

Sorting
Aim: to rearrange a sequence so it is in non-decreasing order
Advantages:
• Sorted sequences can be searched efficiently
• Items with equal keys are located together
Disadvantages:
• Simple obvious algorithms are too slow at sorting large sequences
• Better algorithms can sort very large sequences
Sorting has been studied extensively and many algorithms have been proposed.
One slow obvious algorithm is bubblesort and one fast algorithm is quicksort.

Bubblesort Code:
void bubblesort(int array[], int length) {
    int swapped = 1;
    while (swapped) {
        swapped = 0;
        for (int i = 1; i < length; i = i + 1) {
            if (array[i] < array[i - 1]) {
                int tmp = array[i];
                array[i] = array[i - 1];
                array[i - 1] = tmp;
                swapped = 1;
            }
        }
    }
}

Quicksort - Code
// prototypes, needed because quicksort calls functions defined after it
void quicksort1(int array[], int lo, int hi);
int partition(int array[], int lo, int hi);

void quicksort(int array[], int length) {
    quicksort1(array, 0, length - 1);
}

void quicksort1(int array[], int lo, int hi) {
    if (lo >= hi) {
        return;
    }
    int p = partition(array, lo, hi);
    // sort lower part of array
    quicksort1(array, lo, p);
    // sort upper part of array
    quicksort1(array, p + 1, hi);
}

int partition(int array[], int lo, int hi) {
    int i = lo, j = hi;
    int pivotValue = array[(lo + hi) / 2];
    while (1) {
        while (array[i] < pivotValue) {
            i = i + 1;
        }
        while (array[j] > pivotValue) {
            j = j - 1;
        }
        if (i >= j) {
            return j;
        }
        int temp = array[i];
        array[i] = array[j];
        array[j] = temp;
        i = i + 1;
        j = j - 1;
    }
    return j;
}

Quicksort and Bubblesort Compared
If we use the quicksort and bubblesort code, we see:
Array size (n)    Bubblesort operations    Quicksort operations
10                81                       24
100               8415                     457
1000              981018                   9351
10000             98790120                 102807
• Bubblesort is proportional to n^2
• Quicksort is proportional to n log2(n)
• If n is small, there is little difference between the algorithms
• If n is large, there is a significant difference between the algorithms
• For large n, you need a good sorting algorithm like quicksort
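The logarithmic bound for binary search can be checked by counting how many elements a search inspects. A sketch: binary_search_counted and its probes parameter are hypothetical additions to the binary_search code in these notes.

```c
// Binary search that also reports how many elements were inspected.
// On return, *probes holds the number of array elements compared with x.
// Returns 1 if x is in the (sorted) array, 0 otherwise.
int binary_search_counted(int array[], int length, int x, int *probes) {
    int lower = 0;
    int upper = length - 1;
    *probes = 0;
    while (lower <= upper) {
        // same midpoint as (lower + upper) / 2, but cannot overflow an int
        int mid = lower + (upper - lower) / 2;
        *probes = *probes + 1;
        if (array[mid] == x) {
            return 1;
        } else if (array[mid] > x) {
            upper = mid - 1;
        } else {
            lower = mid + 1;
        }
    }
    return 0;
}
```

For a sorted array of 1000 elements, no search should report more than 10 probes, which matches the worst case of about log2(N) + 1 given in the analysis of binary search.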