C For Biologists
Foreword
Acknowledgement
1 INTRODUCTION
1.1 Classification of Programming Language
2 C FUNDAMENTALS
2.1 History
3 CONTROL STATEMENTS IN C
3.1 Control statements
3.2 Branching statements
3.2.1 Conditional branching statements [ if statement ]
3.2.2 Conditional branching statements [ if – else – if statement ]
4 ARRAYS
4.1 One dimensional array
4.1.1 Declaration
4.1.2 Memory representation
4.1.3 Initialization
4.2 Two dimensional array
5 FUNCTIONS
5.1 What is Function
5.2 Advantages of Function
6 STRINGS
6.1 String Declaration
7 POINTERS
7.1 What is a pointer
7.2 Advantages of pointers
7.3 Pointer variables
7.3.1 The address operator
7.3.2 The indirection operator
7.4 Declaring pointer variables
9 FILE PROCESSING IN C
9.1 Defining and Opening a File
9.2 Closing a File
9.3 File Input and Output
9.3.1 Character Input and Output
I wish to thank Prof. P.P. Mathur, Centre Head, Centre for Bioinformatics, Pondicherry
University, Pondicherry, who inspired me to write this book.
My sincere thanks to Mr. M. Sundaramohan, Information Officer, Pondicherry University, for his
valuable suggestions in composing this work, and to Dr. Ayalaru Murali, Assistant Professor,
Centre for Bioinformatics, Pondicherry University, for his timely help in reviewing this work
and in the successful completion of the book.
My heartfelt thanks to all my colleagues for their undivided support and encouragement, and to
all of my students, who motivated me to write this book.
I am very grateful to my husband, my son and my family members for their continuous
support in writing this book.
G.Jeyakodi
1
Introduction
Language is a medium of communication. A language in which instructions are given to a
computer to do its work is called a programming language. Computer programming languages are
developed with the primary objective of enabling a large number of people to use computers
without needing to know the details of the internal structure of the computer. Using a set of
instructions or statements, users develop a program or software; this activity is called
programming.
Examples: BASIC, COBOL, PASCAL, C, C++, JAVA, etc.
To stay within the scope of the book and considering the limitation of space, this chapter
gives a very brief account of the history of computer languages and of the various tools
needed for efficient programming. Readers who are interested in a detailed account of these
topics are encouraged to refer to the books listed in the reference section.
All computer languages can be classified broadly into the following three categories:
• Machine language
• Assembly language
• High level language
The language that the computer directly understands is called the machine language of the
computer. This language is composed of sequences of 1’s and 0’s. The circuitry of a computer
is wired so that it recognizes machine language instructions immediately and converts them
into the electrical signals needed to execute them.
A machine language instruction normally has a two-part format. The first part is the operation
code and the second part is the operand. The operation code tells the computer what function
to perform, and the operand tells the computer where to find or store the data to be
manipulated, including the length and locations of the data fields involved in the operation.
Every computer has a set of operation codes called its instruction set. Each operation code
(or opcode) in the instruction set is meant to perform a specific basic operation or function.
Arithmetic operations, logical operations, branch operations and data movement operations are
the typical operations included in the instruction set.
Example
ADVANTAGES:
• No translation is needed
DISADVANTAGES:
• Machine dependent
• Difficult to understand
The language that allows instructions and storage locations to be represented by letters and
symbols instead of numbers is called assembly language or symbolic language. A program
written in an assembly language is called an assembly language program or symbolic program.
Since an assembly language program is written in symbols, the computer cannot execute it
directly. It requires a translator to convert the assembly code into machine code.
“An assembler is the system software that translates assembly code into machine code”.
Mnemonic Meaning
• Easier to modify
• Easily relocatable
DISADVANTAGES:
A language composed of statements that resemble normal English is called a high level
language. A program in such a language can easily be understood by the programmer, but it
requires a translator to convert the code into machine-understandable form. “A compiler is
system software that converts a high level program into machine code”.
ADVANTAGES:
• Machine independent
• Fewer errors
• Easier to maintain
DISADVANTAGES:
• Less efficient
• Requires a translator
The third generation languages are considered to be procedural languages. The fourth
generation languages are much simpler: they have a minimal number of syntax rules, so
non-specialists can use them easily.
i. Query languages
ADVANTAGES:
DISADVANTAGES:
• Inflexible.
Fifth generation languages might be the future of programming languages. These languages
will be able to process natural language. The user will be freed from learning any programming
language to communicate with the computer.
Problem solving is the task of expressing the solution to a complex problem in terms of
simple operations understood by the computer. Problem solving using a computer requires a
well-defined sequence of steps carried out in a systematic manner.
• Problem definition
• Problem analysis
• Coding or programming
• Documentation
The basic concept, which should be clearly understood, is that “success can be achieved in
problem solving only when we know what we want to do”; that is, the problem should first be
clearly understood by the user. Proper involvement in this process always helps in generating
a good workable solution.
This step of problem solving analyzes what must be done rather than how to do it. A clear
understanding of the problem is a must. This step also requires the exact specification of the
problem. The specification may be in the user’s language, such as English or some other
natural language. It may include charts, tables and equations of different kinds. Knowledge
about the specification can be gathered using techniques such as observation of the actual
task, interviews and so on.
Example 1.1
Suppose we have a set of visiting cards, each card containing a name, an address and a
telephone number. The problem is to find the telephone number corresponding to any given
name.
In the first stage of problem solving it is necessary to be more precise about the
structure of the input data. We may want to know whether the visiting cards are arranged in
alphabetical order of names or in random order. We will assume that the cards are not in
order, but that we can read through the cards one at a time from the first to the last. Next
we must decide what action to perform if the name is not present on any of the visiting
cards. The problem was not precisely defined because it did not tell us about this.
Therefore, the problem statement is corrected by indicating that the output should be
either the phone number of the person or a message that the name is not present.
The problem analysis step focuses on understanding the requirements of the problem to be
solved. This process is the first step towards the solution domain. Explicit requirements
about input and output, time constraints, processing requirements, accuracy, memory
limitations, error handling and interfaces are understood at this stage. The end result of
this analysis is either the selection of a method to be used on the computer, or a decision
that a computer should not be used because, given the constraints, manual methods are better.
To specify the input to a program completely and properly, it is necessary to answer certain
questions such as:
5. What is the range of values allowed for a particular input?
Similarly, to specify the output of a program it is necessary to answer certain questions such
as: What are the outputs generated by the program? (The outputs should be clearly and
unambiguously specified.)
1. What is the format of the outputs? (This includes type, accuracy and units.)
3. How should the outputs be displayed? (This includes spacing, layout and headings.)
Example 1.2: Telephone number search problem
In the analysis of the problem, it is decided that the only reasonable method, given the
structure of the input data, is to look at each card one after the other and compare the name
it contains with the name being searched for. If the name is found, stop and report the
telephone number; otherwise continue with the remaining cards one after the other. If the
search reaches the end of the cards without finding the name, output the message that the
name is not present.
After the completion of problem definition and problem analysis, it is necessary to define the
solution of the problem. The solution should include a sequence of steps that will input and
manipulate the data and produce the desired output. Good design is helped by the choice of
suitable design tools. Algorithms and flowcharts are two design tools that help to represent
the solution of a problem.
One more design technique is top-down design, which is used to solve complex problems more
effectively by dividing them into sub-problems. Sub-problems are easier to solve than
the complete problem.
I. ALGORITHM
CHARACTERISTICS OF AN ALGORITHM
Example 1.3
Step1: Start
1. Clearly understand the problem statement so that a proper algorithm can be evolved.
3. Design the process, which will produce the desired result after taking the input.
5. Test the algorithm by giving the test data and see if the desired output is generated. If
not, make appropriate changes in the process and repeat the process.
ADVANTAGES OF ALGORITHM
• Easy to understand
• It has got a definite procedure, which can be executed within a set period of time.
• It is easy to first develop an algorithm, and then convert it into a flowchart and then
into a computer program.
• It is easy to debug as every step has got its own logical sequence.
DISADVANTAGES OF ALGORITHM
II. FLOWCHART
1. Clearly understand the problem statements so that a proper flowchart can be evolved.
3. Design the process, which will produce the desired result after taking the input.
5. Test the flowchart by giving the test data and see if the desired output is generated. If
not, make appropriate changes in the process and repeat the process.
FLOWCHART SYMBOLS
For easy visual recognition, standard conventions are used for drawing flowcharts.
In a program.
Advantages of flowcharts
1. Flowcharts provide an excellent means of communication, which is very easy to
understand.
2. It has got a definite procedure, which shows all the major parts of a program.
5. It is easy to debug as every step has got its own logical sequence.
Disadvantages of flowcharts
1. It is time consuming.
Example 1.4
[Flowchart: Start; Input ntype; decision “If ntype = T” with true and false branches; outputs
“Deoxy Ribonucleic Acid”, “RNA” and “Wrong Nucleotide type”; Stop]
Debugging is the process of isolating and correcting the errors in a program. Debugging is a
very important and time-consuming phase of software development. The process of
debugging ensures that the program does what the programmer intends it to do. This stage is
also referred to as verification.
2. All the detected errors are corrected and the program is recompiled.
The process is repeated till no errors are displayed.
4. Once all the errors are corrected the program is recompiled again and
executed.
COMPILER: It is software which checks the entire program created by the user for a certain
type of error called syntax errors. If the program is error free, the complete program is
translated to its equivalent machine language program.
Source program: The program created by the user in a high level language is referred to as
source program.
Object program: The machine language program generated by the compiler is referred to as
object program.
Syntax: It refers to the set of rules, which should be followed while creating every statement
and structure in a program.
Syntax error: An error, which occurs due to the wrong use of syntax, is referred to as syntax
error.
Semantic errors: An error that occurs due to the wrong use of logic is referred to as a
semantic error.
Logic error: If a correct translation of an algorithm or flowchart still causes the program to
produce wrong results, the error is referred to as a logic error.
Runtime errors: Errors, which are detected during the execution of a program, are referred
to as runtime errors.
Testing: It is the process of checking whether the program works correctly according to the
requirements of the user, i.e., whether the program generates the correct results for a given
input of data.
The correctness of the program can be determined by trying a large number of carefully
chosen data sets and then seeing whether the program generates the correct output in all
those cases.
If debugging is referred to as verification, testing is referred to as validation.
• Comment statement
• System manual
• User manual
Comments
Comments are natural language statements put within a program to assist anyone reading the
source program listing in understanding the logic of the program. They do not contain any
program logic, and are ignored by a language processor.
System manual
• Problem definition
• Software description
• List of program names and their descriptions
• File layout, that is, the detailed layout of input and output records.
User manual
Software must have a good user manual to ensure its easy and smooth usage. It is the user,
not the developer or the programmer, who will regularly use the software once it is installed
and commissioned for use. The user manual must contain the following:
• List of error conditions with explanation for their re-entry into the system.
Maintenance is the process of updating or upgrading programs to new versions so that they
meet the present requirements of the user.
NEED
• The needs of the user have changed over a period of time and the program has to be
modified to meet the present day needs.
• The user has seen new types of software using a Graphical User Interface (GUI) and thus
feels that the existing system has to be changed to the new system.
2
C Fundamentals
2.1 HISTORY
Dennis M. Ritchie, a systems engineer at Bell Laboratories, New Jersey (now part of AT&T Bell
Labs, USA), developed C in the early 1970s.
The root of modern languages is ALGOL, introduced in the early 1960s. ALGOL was the first
computer language to use block structure. It gave the concept of structured programming
to the computer science community.
In 1967, Martin Richards developed a language called BCPL (Basic Combined Programming
Language) primarily for writing system software. In 1970, Ken Thompson created a language
using many features of BCPL and called it simply B. B was used to create early versions of
UNIX operating system at Bell Laboratories.
C was evolved from ALGOL, BCPL and B by Dennis Ritchie at Bell Laboratories in 1972. C
uses many concepts from these languages and adds the concept of data types and other
powerful features. Since it was developed along with the UNIX operating system, it is
strongly associated with UNIX. This operating system was coded almost entirely in C.
To ensure that the C language remained standard, in 1983 the American National Standards
Institute (ANSI) appointed a technical committee to define a standard C. The committee
approved a version of C in December 1989, which is now known as ANSI C. It was then
approved by the International Organization for Standardization (ISO) in 1990. This version of
C is also referred to as C89.
During the 1990s, C++, a language entirely based on C with a number of improvements and
changes, was developed. During the same period Sun Microsystems of USA created a new
language, Java, modeled on C and C++.
2.2 FEATURES OF C
The capability of the C language to work at machine level makes it well suited for systems
programming. A systems program is one of a large class of programs developed for the
purpose of simplifying the process of using the system. Some important systems programs
include operating systems, compilers, interpreters, assemblers and editors.
Being a structured language, C is also very useful in developing large application programs.
Some important application programs include Word Processors, Spreadsheets, CAD
applications, animation and games.
ADVANTAGES OF C
• It has a wide variety of derived data types like pointers, arrays, structures and
unions, apart from fundamental data types like integers, floating point numbers and
characters (discussed at length in Chapters 4 and 5).
• Programs written in C are found to execute faster compared to many other languages.
• It provides a rich set of built-in functions.
• It is easily expandable to meet the requirements.
• It has the ability to deal efficiently with bits, bytes, words, addresses, etc.
2.4 BASIC STRUCTURE OF C PROGRAM:
Documentation section
Link section
Definition section
Global declaration section
main() function section
{
    Declaration part
    Executable part
}
Subprogram section
    Function 1
    Function 2
    -
    -
    Function n
The documentation section consists of a set of comment lines giving the name of the
program, the author and other details. The link section provides instructions to the compiler
to link functions from the system library. The definition section defines all symbolic
constants.
Some variables are used in more than one function. Such variables are called global
variables and are declared in the global declaration section, which is outside all the
functions. This section also declares user-defined functions.
Every C program must have one main() function section. This section consists of two parts,
the declaration part and the executable part. The declaration part declares all the variables
used in the executable part. There is at least one statement in the executable part. These two
parts must be enclosed between the opening and closing braces. Program execution begins at
the opening brace and ends at the closing brace. All statements in the declaration and
executable parts end with a semicolon (;).
The subprogram section contains all the user-defined functions that are called in the main
function. User-defined functions are generally placed immediately after the main function,
although they may appear in any order. All sections except the main function section may be
absent when they are not required.
1. Editing phase
2. Compilation phase
3. Execution phase
In the editing phase the user enters the program. The program created by the user is called
the source program. The source program is created with the help of software called an
editor.
Editor: Software used to interactively review and modify text materials and other
program instructions.
In the compilation phase the compiler first checks the program for syntax errors. Once all the
errors are corrected, the program is converted into its equivalent machine language code,
also called the object program. The compilation is performed by software called the
compiler.
In the execution phase the program is executed to check whether it gives the proper result
or not.
Executing a C program
A C program can be executed on different platforms such as Windows, UNIX and LINUX.
Though the steps for executing a source program are the same in all operating systems, the
approach is slightly different and specific to each operating system. The next two sections
explain how to execute source code in these environments.
On the Windows platform, Turbo C and Borland C are the commonly available C compilers.
These compilers also have built-in editors. After installing the Turbo C compiler, a new file
can be created by choosing the New option from the File menu. After entering the source code,
the file should be saved with the .c extension (filename.c).
A C program must be compiled and linked with the necessary libraries to build an executable
version (.exe) of that program. The program can be compiled using the Alt+C option. If the
compiler displays any errors, they should be corrected and the file saved and compiled again.
Once the source code is error free, the object file (filename.obj) and the executable file
(filename.exe) are created. Press Alt+R to execute or run the program. The results are viewed
by pressing Alt+F5. Once the exe file is created, it can also be run in the DOS environment by
simply typing the name of the exe file at the command prompt.
On UNIX and LINUX platforms, the source program can be created in any of the standard
editors such as vi, gedit or gvim. The source code should be saved with the .c extension
(Ex: welcome.c). The following command is used for compiling the source code.
cc welcome.c or gcc welcome.c
After compiling the source code, the compiler generates a.out if it is error free. If there are
any errors, the code should be debugged and recompiled. Once the file a.out is created, it can
be run at the command prompt by typing the following command
./a.out
To store the executable file under any name other than the default a.out, the option -o
is used, as in cc -o welcome welcome.c or gcc -o welcome welcome.c
Now the executable is stored in the file welcome. So, instead of a.out, the program is run
using the filename welcome as
./welcome
Note: In Linux, the “-lm” option is used during compilation to link the math library. It is
placed after the source file, so the compilation command would be
cc pH.c -lm
Numerals ( 0 to 9 )
=! &%<>``‘. #\
2.7 C TOKENS
In a passage of text, individual words and punctuation marks are called tokens. Similarly, in
C language the smallest individual units are known as C tokens. C has six types of tokens as
shown in the figure.
[Figure: C TOKENS, with examples such as the identifiers molecule and seq_length and the
special symbols {} and []]
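Every C word is classified as either a keyword or an identifier. All keywords have fixed
meanings and these meanings cannot be changed. Keywords serve as the basic building blocks of
program statements. All keywords must be written in lowercase. The ANSI C keywords are
listed below:
auto      break     case      char      const     continue  default   do
double    else      enum      extern    float     for       goto      if
int       long      register  return    short     signed    sizeof    static
struct    switch    typedef   union     unsigned  void      volatile  while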
ii) IDENTIFIERS
Identifiers refer to the names of variables, functions and arrays. These are user-defined names
and consist of a sequence of letters and digits, with a letter as a first character. Both
uppercase and lowercase letters are permitted, although lowercase letters are commonly used.
The underscore character is also permitted in identifiers.
iii) CONSTANTS
Constants in C refer to fixed values that do not change during the execution of a program. C
constants are illustrated in Figure 2.1.
a. Integer Constants
An integer constant refers to a sequence of digits. There are three types of integers, namely,
• Decimal integer
• Octal integer
• Hexadecimal integer.
Embedded spaces, commas and non-digit characters are not permitted between digits.
For example, the following are not valid integer constants:
12 34 10,000 $298
An octal integer constant consists of any combination of digits from the set 0 through 7, with
a leading 0. Some examples are:
The largest integer value that can be stored is machine dependent. It is 32767 on 16-bit
machines and 2,147,483,647 on 32-bit machines. It is also possible to store larger integer
constants on these machines by appending qualifiers such as U (unsigned), L (long) and UL
(unsigned long) to the constants.
Examples:
b. Real Constants
Integer numbers are inadequate to represent quantities that vary continuously, such as
distances, temperatures, prices and so on. These quantities are represented by numbers
containing fractional parts, like 23.89. Such numbers are called real (or floating point)
constants.
Examples:
A real number may also be expressed in exponential (or scientific) notation. For example, the
value 215.65 may be written as 2.1565e2 in exponential notation; e2 means multiply by 10^2.
The general form is:
mantissa e exponent
The mantissa is either a real number expressed in decimal notation or an integer. The
exponent is an integer with an optional + or – sign. The letter e separating the
mantissa and the exponent can be written in either lowercase or uppercase.
Examples:
Examples:
d. String constants
A string constant is a sequence of characters enclosed in double quotes. The characters may
be letters, numbers, special characters and blank spaces.
Examples:
NOTE:
The character constant 'a' is not equivalent to the string constant "a". Further, a
single-character string constant does not have an equivalent integer value, while a character
constant has an integer value.
Fig. 2.1 Constants Hierarchy
e. Backslash Character Constants
C supports some backslash character constants that are used in output functions. These
backslash character constants are listed in Table 2.1 below:
Constant    Meaning
'\a'        Audible alert (bell)
'\b'        Back space
'\f'        Form feed
'\n'        New line
'\r'        Carriage return
'\t'        Horizontal tab
'\v'        Vertical tab
'\''        Single quote
'\"'        Double quote
'\?'        Question mark
'\\'        Backslash
'\0'        Null
iv) VARIABLES
A variable is a data name that may be used to store a data value. Unlike constants, which
remain unchanged during the execution of a program, a variable may take different values at
different times during the execution. A variable name can be chosen by the programmer in a
meaningful way so as to reflect its function or nature in the program. Some examples are:
Nucleotide
Amino_acid
Molecule_structure
• The ANSI standard recognizes a length of 31 characters. However, the length should normally
not be more than eight characters, since only the first eight characters are treated as
significant by many compilers.
v) DATA TYPES
The C language is rich in data types. A variety of data types is available to allow the
programmer to select the type appropriate to the needs of the application as well as the
machine.
ANSI C supports three classes of data types:
• Primary (or fundamental) data types
• Derived data types
• User-defined data types
All C compilers support five fundamental data types, namely integer (int), character (char),
floating point (float), double-precision floating point (double) and void. Many of them also
offer extended data types such as long int and long double.
In order to provide some control over the range of numbers and storage space, C has three
classes of integer storage, namely short int, int and long int, in both signed and unsigned
forms. A short int represents fairly small integer values and typically requires half the
amount of storage that a regular int uses. Unlike signed integers, unsigned integers use all
the bits for the magnitude of the number and are never negative. We declare long and unsigned
integers to increase the range of values. The use of the qualifier signed on integers is
optional because the default declaration assumes a signed number.
[Figure: short int, int, long int]
Fig. 2.2 Integer modifiers
[Figure: float, double, long double]
Variables must be declared before they are used in a program. The syntax for declaring
variables is:
data-type v1, v2, …, vn;
where v1, v2, …, vn are the names of the variables. Variables are separated by commas.
A declaration statement must end with a semicolon.
For example:
int seq_length;
double ratio;
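C allows users to define an identifier that represents an existing data type, using typedef.
It takes the general form:
typedef type identifier;
where type refers to the existing data type and identifier refers to the new name given to the
data type.
Example:
typedef int length;
typedef char base;
Here length symbolizes int and base symbolizes char. They can later be used to declare
variables.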
Another user-defined data type is the enumerated data type provided by the ANSI standard. It
is defined as follows:
enum identifier {value1, value2, …, valuen};
The identifier is a user-defined enumerated data type which can be used to declare variables
that can have one of the values enclosed within the braces (known as enumeration constants).
After this definition, we can declare variables to be of this “new” type as below:
enum identifier v1, v2, …, vn;
The enumerated variables v1, v2, …, vn can only have one of the values value1, value2, …,
valuen. For example:
v1=value3; v5=value1;
Example:
week_st=Monday;
week_end=Friday;
if(week_st==Tuesday)
week_end=Saturday;
The compiler automatically assigns integer values beginning with 0 to all enumeration
constants. That is, the enumeration constant value1 is assigned 0, value2 is assigned 1, and
so on. However, the automatic assignments can be overridden by assigning values explicitly to
the enumeration constants. For example, if the definition begins with Monday=1, the constant
Monday is assigned the value 1. The remaining constants are assigned values that increase
successively by 1.
The definition and declaration of enumerated variables can be combined in one statement.
Example:
variable_name=constant;
Example: Length=100;
It is also possible to assign a value to a variable at the time of its declaration. It takes
the following form:
data-type variable_name=constant;
Example:
int mol_wt=150;
char base='A';
We can assign values to variables in two ways: compile time assignment and execution time
assignment.
Example:
int total=68;
float weight=12.3;
Example:
int main(void)
{
char nucleotide1, nucleotide2;
nucleotide1='a';
nucleotide2='t';
}
Data input and output statements are among the most important statements, as they transfer
information or data between the computer and the standard input/output devices (e.g.,
keyboard, mouse, monitor, data files, etc.).
In C, the getchar() and putchar() functions are used for character input and output from the
standard I/O devices.
Syntax of input:
char variable = getchar();
The above function reads a single character from the keyboard and assigns it to the character
variable specified.
Syntax of output
putchar(char variable);
The above function displays the single character specified by the argument on the monitor.
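#include <stdio.h>
#include <ctype.h> // tolower()

int main(void)
{
char x;
x = getchar(); // Reads one character from the keyboard
putchar(x); // Displays a character on the screen
putchar('\n'); // New line
putchar(tolower((unsigned char)x)); // Displays a lower case letter
return 0;
}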
Sample I/O
char string_name[size];
Example:
char sequence[50];
The gets() and puts() functions are used to read and display a string from the standard I/O
device.
gets(string variable);
The above function reads a string from the keyboard and assigns it to the string variable.
Because gets() cannot limit the number of characters read, it was removed from the C standard
in C11; fgets() is the usual replacement in new code.
puts(string variable);
The above function displays the string held in the string variable on the screen.
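#include <stdio.h>

int main(void)
{
char string[40];
gets(string); /* reads a line; needs a pre-C11 compiler, unsafe beyond 39 characters */
puts(string); /* displays the string followed by a new line */
return 0;
}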
Sample I/O
2.12.3 Formatted Input and Output
The scanf() and printf() functions are used to read and print formatted input and output from
a standard I/O device.
The control string contains the conversion characters (each preceded by the % symbol) for each
data item to be read in. The commonly used conversion characters for different data types are
given in Table 2.2 below.
Conversion character    Data type
c                       char
s                       string
d                       decimal integer
o                       octal integer
u                       unsigned decimal integer
f                       float
Each variable name in the variable list must be preceded by the address operator (&), except
for the name of a string (character array), which is already an address.
Example
M
35
25000.00
Rajan
After the execution of the scanf() statement we get sex='M', age=35, salary=25000.00 and
name="Rajan".
The printf() function prints the data onto any standard output device in the format specified
by the control string, using the values of the variables var-1, var-2, …, var-n.
Example
printf("Sex=%c\nAge=%d\nName=%s",sex,age,name);
C provides a way to specify the width of the data being displayed by writing the width w
between the % symbol and the conversion character for the data type. For example, %7d
specifies integer data 7 characters wide. The output is displayed right justified.
It also provides a way to specify the number of decimal places (n) for floating-point data
by using the format %w.nf. For example, the format %12.3f indicates that the floating-point
data has a width of 12 characters, of which 3 are used for the decimal places, one is used for
the decimal point and the remaining 8 are used for the integer part.
#include<stdio.h>
main()
{ printf("Welcome to C programming\n");
printf("Bioinformatics is the combination of\n");
printf("Biology, Information Technology and Statistics");
}
Sample I/O
Welcome to C programming
Bioinformatics is the combination of
Sample Program 2.4
Write a C program to illustrate the scanf() and printf() function
#include<stdio.h>
main()
{
int pH;
char name[20];
float mol_wt;
printf("Enter amino acid name,pH value: ");
scanf("%s %d",name,&pH);
printf("Enter molecular weight: ");
scanf("%f",&mol_wt);
printf("\nAmino acid details\n");
printf("Name:%s",name);
printf("\npH value:%d",pH);
printf("\nMol.wt:%12.4f",mol_wt);
}
Sample I/O
2.13 OPERATORS AND EXPRESSIONS
C supports a rich set of operators. An operator is a symbol that tells the computer to perform
certain mathematical or logical manipulations. They usually form a part of mathematical or
logical expressions.
1. Arithmetic operators
2. Relational operators
3. Logical operators
4. Assignment operators
5. Increment and decrement operators
6. Conditional operators
7. Bitwise operators
8. Special operators
10+10
is an expression whose value is 20. The value can be of any type other than void.
C provides all the basic arithmetic operators which are listed below:
Operator Meaning
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulo division
When both the operands in an expression are integers, the expression is called an integer
expression. For example, if a=15 and b=10 we have the following results:
a-b=5
a + b = 25
a * b = 150
a % b = 5 (remainder of division)
An arithmetic operation involving only real operands is called real arithmetic. A real operand
may assume values either in decimal or exponential notation.
When one of the operands is real and the other is integer, the expression is mixed-mode.
For example:
15/10.0 = 1.5
Relational operators are used for comparisons. C supports six relational operators in all,
which are tabulated below in Table 2.4.
Operator Meaning
< Is less than
<= Is less than or equal to
> Is greater than
>= Is greater than or equal to
== Is equal to
!= Is not equal to
The value of relational expression is either one or zero. It is one if specified relation is true
and zero if the relation is false.
For example:
10<20 is true
But
20<10 is false
&& meaning logical AND
|| meaning logical OR
! meaning logical NOT
The logical operators && and || are used when we want to test more than one condition. An
example is:
a > b && x == 10
An expression of this kind which combines two or more relational expressions is known as
logical expression or compound relational expression.
Truth Table
op-1 op-2 op-1 && op-2 op-1 || op-2
Non-zero Non-zero 1 1
Non-zero 0 0 1
0 Non-zero 0 1
0 0 0 0
Assignment operators are used to assign the result of an expression to a variable.
a=a+1 a+=1
a=a-1 a-=1
a=a*(n+1) a*=n+1
a=a%b a%=b
The increment and decrement operators are ++ and --. The operator ++ adds 1 to the operand,
while -- subtracts 1. Both are unary operators and take the prefix and postfix forms as follows:
++m; or m++;
--m; or m--;
We use the increment and decrement operators extensively in for and while loops. While ++m and
m++ mean the same thing when they form statements independently, they behave differently
when they are used in expressions on the right hand side of an assignment statement.
Consider the following:
m=5;
y=++m;
In this case, the value of y and m would be 6. Suppose if we rewrite the above statement as
m=5;
y=m++;
Then, the value of y would be 5 and m would be 6. A prefix operator first adds 1 to the
operand and then the result is assigned to the variable on left. On the other hand, a postfix
operator first assigns the value to the variable on the left and then increments the operand.
The conditional operator ? : is used in the form
exp1 ? exp2 : exp3
and works as follows: exp1 is evaluated first. If it is nonzero (true), then the
expression exp2 is evaluated and becomes the value of the expression. If exp1 is false, exp3 is
evaluated and its value becomes the value of the expression. For example, consider the
following:
a=10;
b=15;
x=(a>b) ? a : b;
In this example x will be assigned the value of b. The same result can be achieved by using
if - else statements as follows:
if (a> b)
x=a;
else
x=b;
Bitwise operators are used to manipulate data at bit level. These operators are used for testing
the bits, or shifting them right or left. Bitwise operators may not be applied to float and
double. The following table illustrates the bitwise operators:
Bitwise Operators
Operator Meaning
& bitwise AND
| bitwise OR
^ bitwise exclusive OR
<< shift left
>> shift right
~ one's complement
C supports some special operators such as comma operator, sizeof operator, pointer operators
(& and *) and member selection operators (. and ->).
The comma operator can be used to link the related expressions together. A comma linked list
of expressions is evaluated from left to right and the value of the right most expression is the
value of the combined expression.
For example, the statement
value = (x = 10, y = 5, x + y);
first assigns the value 10 to x, then assigns 5 to y, and finally assigns 15 (i.e. 10 + 5) to the
variable value.
The sizeof operator is a compile time operator and, when used with an operand, it returns the
number of bytes the operand occupies. The operand may be a variable, a constant , an
expression or a data type qualifier.
Examples:
m=sizeof (sum);
n=sizeof (long int);
k=sizeof (123L);
The sizeof operator is normally used to determine the lengths of arrays and structures when
their sizes are not known to the programmer. It is also used to allocate the memory space
dynamically to variables during execution of a program.
#include<stdio.h>
main()
{
int a=1,b=2,c=3,d;
d = c++;
Sample I/O
Expressions
Algebraic expression C expression
ab-c a*b-c
(m+n)(x+y) (m+n)*(x+y)
(ab/c) a*b/c
3x^2+2x+1 3*x*x+2*x+1
High priority: * / %
Low priority : + -
The basic evaluation procedure includes two left to right passes. During the first pass, the
high priority operators (if any) are applied as they are encountered. During the second pass,
the low priority operators (if any) are applied as they are encountered.
For example, consider the statement
x=a-b/3+c*2-1
When a=9, b=12 and c=3, the statement becomes
x=9-12/3+3*2-1
and is evaluated as follows:
First pass
Step 1: x=9-4+3*2-1
Step 2: x=9-4+6-1
Second pass
Step 3: x=5+6-1
Step 4: x=11-1
Step 5: x=10
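The order of evaluation can be changed by introducing parentheses into an expression. Consider
the same expression with parentheses as shown below:
9-12/(3+3)*(2-1)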
Whenever parentheses are used, the expression within the parentheses assumes highest
priority. If two or more sets of parentheses appear one after another, the expression contained
in the left-most set is evaluated first and the right-most in the last. Given below are the new
steps:
First pass
Step 1: 9-12/6*(2-1)
Step 2: 9-12/6*1
Second pass
Step 3: 9-2*1
Step 4: 9-2
Third pass
Step 5: 7
Parentheses may also be nested, and in such case, evaluation of the expression will proceed
outward from the inner-most set of parentheses.
For example:
9-(12/ (3+2)*2)-1=4
• If parentheses are nested, the evaluation begins with the innermost sub expression.
• The associativity rule is applied when two or more operators of the same precedence
level appear in the sub expression.
• Arithmetic expressions are evaluated from left to right using the rules of precedence.
• When parentheses are used, the expressions within the parentheses assume highest priority.
During evaluation, C adheres to very strict rules of type conversion. If the operands are of
different types, the lower type is automatically converted to the higher type before the
operation proceeds. The result is of the higher type.
The following rules apply while evaluating expressions:
All short and char operands are automatically converted to int; then
• If one operand is long double, the other will be converted to long double and the result
will be long double.
• If one operand is double, the other will be converted to double and result will be
double.
• If one operand is float, the other will be converted to float and result will be float.
• If one of the operand is unsigned long int, the other will be converted into unsigned
long int and result will be unsigned long int.
• If one operand is long int and another is unsigned int, then
a. If unsigned int can be converted to long int, then unsigned int operand will be
converted as such and the result will be long int.
b. Else both operands will be converted to unsigned long int and the result will be
unsigned long int.
• If one of the operand is long int, the other will be converted to long int and the result
will be a long int.
• If one operand is unsigned int the other will be converted to unsigned int and the
result will be unsigned int.
Conversion Hierarchy:
In C, implicit type conversions are made from the lower size type to the higher size type.
The hierarchy, from highest to lowest, is shown below:
long double
double
float
long int
unsigned int
int
short, char
However the following changes are introduced during the final assignment:
1. Float to int causes truncation of fractional part.
3. Long int to int causes dropping of the excess higher order bits.
Uses of casts
Example Action
x=(int)1.3 1.3 is converted to integer by truncation
y=(int)2.1/(int)4.5 Evaluated as 2/4
z=cos((double)x) Converts x to double before using it
m=(int)(a+b) The result of a+b is converted to integer
n=(int)a+b a is converted to integer and then added to b
p=(double)sum/n Division is done in floating point mode
Table 2.9 Type casting
Each operator in C has a precedence associated with it. The precedence is used to determine
how an expression involving more than one operator is evaluated. There are distinct levels of
precedence and an operator may belong to one of these levels. The operators of higher
precedence are evaluated first.
The operators of same precedence are evaluated from right to left or from left to right
depending on the level. This is known as associativity property of an operator.
The table 2.10 given below gives the precedence of each operator:
Sample program 2.5
Write a C program to find the pH of a given solution for any given hydrogen ion
concentration. [ pH = - log10[H+] ]
#include<stdio.h>
#include<math.h>
main()
{
float H,pH;
printf("Give H value: ");
scanf("%f",&H);
pH = - log10(H);
printf("\nH = %f pH = %f",H,pH);
}
Sample I/O
H = 0.000002 pH = 5.698970
Sample program 2.6
Write a C program to find the Body Mass Index (BMI) of a person
[ BMI = weight in kg / (height in metres)^2 ]
bmi = w / (h*h);
printf("Weight = %.2f",w);
printf("\nHeight = %.2f",h);
printf("\nBody Mass Index = %.5f",bmi);
}
Sample I/O
Weight = 50.00
Height = 1.60
Body Mass Index = 19.53125
Sample program 2.7
Write a C program to find the pH value for a given [OH-] concentration
[ pH = 14.0 – pOH and pOH = - log10[OH-] ]
Sample I/O
Give OH_con value: 0.04
Sample program 2.8
Write a C program to find the rpm value
[ rpm = 1000 * √( RCF / (11.17 * r) ), where RCF is the Relative Centrifugal Force and r is the
maximum radius ]
#include<stdio.h>
main()
{
float rcf, rpm, radius;
printf("Give rcf, radius values: ");
scanf("%f %f",&rcf, &radius);
Sample I/O
Sample Program 2.9
Write a C program to compute RCF value
[ RCF = 11.17 * r_max * ( rpm/1000) ^2, where r_max is given in cm ]
In the previous chapter, we studied the fundamental concepts of C. The sample programs
studied there require only simple formulae to be evaluated to obtain the desired result. In all
these programs, the statements are executed sequentially to obtain the result and no
conditional or control statements are involved. This chapter discusses the control statements
required to solve problems in which conditions are involved.
Statements which alter the sequential order (or flow) of execution of a program based on
certain conditions are called ‘control’ statements. There are two types of control statements
namely,
i. Branching statements
ii. Looping statements
(i) Branching statements: The statements which are used to execute a group of instructions
or statements upon satisfying some conditions are called branching statements. C has two
types of branching statements.
(ii) if – else if
(iii) Nested - if
Apart from this unconditional branching statement, some of the other branching statements or
functions available in C are given below:
break
exit()
continue
(ii) Looping statements: The statements which are used to execute a group of instructions or
statements repeatedly until some specific condition is satisfied are called looping statements.
The three looping statements available in C are given below
while
do – while
for
The simplest control statement is if – else statement. This statement is used to alter the flow
of execution of the program based on the comparison of two quantities.
if (expression)
statement-1;
else
statement-2;
The expression must be enclosed within parentheses. In this case, statement-1 will be
executed if the expression is satisfied; otherwise statement-2 will be executed. If the
statements are compound statements, they must be enclosed within curly braces {}. The else
part is optional and need not always be present.
The if-else control statement results in a two-way branching. This can be described by the
flow chart as shown in Fig. 3.1.
Examples
1. if ( n % 2 == 0)
printf("\n Given number %d is even ",n);
else
Sample Program 3.1
Write a C program to find the bonus value and gift for the workers.
#include <stdio.h>
#include <ctype.h>
main()
{
char gender;
Sample Program 3.2
/*
To find the number of purines and pyrimidines
pcount.c
*/
#include <stdio.h>
main()
{
int a,t,g,c;
int purines,pyrimidines;
if( a - t == 0)
{
purines = a + g;
pyrimidines = t + c;
printf("\n No. of purines = %d ", purines);
printf("\n No. of pyrimidines = %d",
pyrimidines);
}
else
{
3.2.2 Conditional branching statements (ii) if- else if statement
The if - else if statement is used to put ifs together when multipath decisions are involved.
A multipath decision is a chain of ifs in which the statement associated with each else is an if.
if ( condition 1)
Statement – 1;
else if ( condition 2)
Statement – 2;
else if ( condition 3)
Statement – 3;
……
else
Default statement;
Statement – x;
If the statements are compound statements, they must be enclosed within curly braces {}.
This construct is known as the else if ladder. The conditions are evaluated from the top
downwards. As soon as a true condition is found, the statement associated with it is executed
and control is transferred to statement-x (skipping the rest of the ladder). When all the
n conditions are false, the final else containing the default statement is executed. This can
be described by the flow chart shown in Fig. 3.2.
[Fig. 3.2: flow chart of the else if ladder, testing Condition-1 to Condition-n in turn and
ending at Statement-x]
Sample Example 3.3
/*
Find the nature of the given solution
phnature.c
*/
#include <stdio.h>
main()
{
float pH;
printf("Enter the pH value: ");
scanf("%f",&pH);
if( pH == 7.0)
printf("\nGiven solution is Neutral");
else if (pH > 0.0 && pH < 7.0)
printf("\n Given solution is Acidic");
else if (pH > 7.0 && pH <= 14.0)
printf("\n Given solution is Basic ");
else
printf("\n Invalid pH value");
}
Sample program 3.4
Write a C program to identify the glucose level in blood
The glucose level is identified by <70 – hypoglycemia, 70-180 hyperglycemia,
> 180 diabetics
#include<stdio.h>
main()
{
int glucose;
Sample program 3.5
Write a C program to find the anemic level
Anemic level is identified by the hemoglobin value. If the hemoglobin level is 9.6 – 13, mild;
8 – 9.5, moderate; < 8, severe; > 13 and <= 17, normal
/*
To find the anemic level
anemic.c
*/
#include <stdio.h>
main()
3.2.3. Conditional branching statements (iii) Nested if statement
In C, control statements can be nested. In a C program segment, one if-else statement can be
completely embedded into another and so on.
if ( condition 1)
if ( condition2)
Statement – A
else
Statement – B
else
if ( condition3)
Statement – C
else
Statement – D
If condition1 is true, then condition2 is checked. If both are true, statement-A will be
executed; otherwise statement-B will be executed. If condition1 is false, control moves to
condition3. If it is true, statement-C will be executed; otherwise statement-D will be
executed. This can be described by the flow chart shown in Fig. 3.3.
Fig. 3.3. The flowchart of the Nested if statement
Sample program 3.6
Write a C program to compare two peptides and determine the type of each peptide, i.e. small
or poly, based on the number of amino acids entered for each.
The amino acid range for a small peptide is < 1 – 8 > whereas it is < 9 – 50 > for a poly peptide.
/*
Compare two peptides
acid_type.c
*/
#include<stdio.h>
main()
{
int peptide1,peptide2;
if ( peptide1 <=8 )
if ( peptide2 <= 8 )
printf("\n Entered amino acids are small peptides");
else
printf("\n Entered amino acids are both small and poly peptides");
else
if ( peptide2 <= 8 )
printf(“\n Entered amino acid are both
Sample program 3.7
Write a C program to determine which among three values has the maximum value. This can
be used while aligning amino acid sequences to find the maximal score value among
diagonal, left and above cell scores.
/*
To determine the maximum score among diagonal, left and
above
max_score.c
*/
#include <stdio.h>
main()
{
int diagonal, left, above;
3.2.4 Conditional branching statements (iv) the switch-case statement
The if control statement works well when decisions are to be made from a few alternatives.
However, if there are too many alternatives to select from, the if-else structure becomes
tedious and confusing. In such cases the switch statement is used. It works as a multi-way
decision-making tool that compares the value of an expression against a list of alternatives.
switch ( expression )
{
case value-1 :
block1;
break;
case value-2 :
block2;
break;
……
……
default :
default block;
}
Statement – x;
The expression specified in the switch is compared with case value-1, case value-2 and so on.
If any of the values matches, then that particular block is executed. The break statement
at the end of each block signals the end of the particular case and causes the program to exit
from the switch statement. Control is then transferred to the statement-x following the
switch. The default statement is optional; if it is not present and all case matches fail, no
action takes place and control is transferred to statement-x. Figure 3.4 shows the flowchart
of the switch-case statement, where exp represents the expression, v1, v2, v3 … vn represent
value-1, value-2 … value-n respectively and b1, b2, b3 …. bn represent block1, block2,
block3 … blockn respectively.
[Fig. 3.4: flowchart of the switch-case statement]
The break statement
The break statement is used to exit from the switch-case, thus preventing more than one block
being executed. If the break statement is not used, execution falls through from the matching
case into all the case blocks that follow it. In general, the break statement is used to
terminate the nearest enclosing switch or looping statement.
The general form of break statement is,
break;
The exit() function is used to terminate the program execution. It is declared in the stdlib.h
header file. A normal ending of the execution of the program returns a zero value. Any other
type of ending, due to errors, may be signalled by the integer value specified within the
parentheses of the exit() function. The general format of the exit() function is,
exit(status);
The exit() function is commonly used with the default block in a switch-case statement, so as
to terminate the execution of the program, if wrong values are detected for the case values.
Difference between if statement and switch statement
No. if statement | switch statement
1 Applicable for all data types | Applicable only for integer / character constants
3 If any one condition is satisfied, control exits from the if statement after executing the statements. No need for a break statement | Even though a case label matches, control does not exit from the switch statement. The break statement transfers control out of the switch statement.
Sample Program 3.8
Write a C program to determine the type of DNA depending on the base number.
main()
{
int base_no;
#include<stdio.h>
#include<ctype.h>
main()
{
char x;
switch(tolower(x))
{
case 'a' :
case 'e' :
case 'i' :
case 'o' :
Sample program 3.10
Write a C program to create amino acid dictionary
#include<stdio.h>
main()
{
char aa;
printf("Enter the starting letter of the amino acid : ");
scanf("%c",&aa);
printf("Amino acids starting with %c are \n",aa);
switch(aa)
{
case 'a':
case 'A': printf("alanine\n");
printf("arginine\n");
printf("asparagine\n");
printf("aspartic acid\n");
break;
case 'c':
case 'C': printf("cysteine\n");
break;
case 'g':
case 'm':
case 'M': printf("methionine\n");
break;
case 'p':
case 'P': printf("phenylalanine\n");
printf("proline\n");
break;
case 's':
case 'S': printf("serine\n");
break;
case 't':
case 'T': printf("threonine\n");
printf("tryptophan\n");
printf("tyrosine\n");
break;
case 'v':
case 'V': printf("valine\n");
break;
3.3 LOOPING STATEMENTS
The conditional statements execute a statement or block of statements once, based on the
condition. Sometimes a statement or block of statements must be executed more than
once. Looping statements are used for this purpose. The while, do – while and for statements
are examples of looping statements.
Looping statement
The test condition checks whether the statement or block should be repeated or not. The
statement or block is repeated if the condition is true.
Sentinel loops
Loops are classified as fixed loops and sentinel loops. If the number of repetitions is known
in advance, the loop is referred to as a counter-controlled loop or definite repetition loop.
The for loop is an example of this definite loop.
If the number of executions of the loop is not fixed, it is referred to as a sentinel-controlled
loop. The repetition is based on a special value called the sentinel value. For example, when
reading data we can indicate the end of data by a special value such as -1 or any negative
number. The control variable is called the sentinel variable. The while and do – while
statements are examples of this indefinite repetition loop.
The simplest looping structure is the while statement. This construct is also called the
pre-tested looping statement.
The general format of the while statement is
while ( test-condition )
{
body of the loop
}
The while is the entry controlled loop here. The test-condition is evaluated first. If the
condition is true, body of the loop is executed. After executing the body of the loop, the test
condition is once again evaluated. If it is true, body of the loop is repeated again. This process
continues until the condition becomes false.
[Flow diagram of the while loop: loop entry, test condition on the control variable, statement /
block, exit loop]
Examples
i = 1; // initialization
while ( i <= 15 ) // testing
{
printf("%d,",i);
i++; // incrementing
}
(ii) Another example which uses the keyboard input
character = ‘ ‘;
The while statement executes the statements only if the condition is true, so the body of the
loop may not be executed even once. Sometimes the body of the loop must be executed
before the condition is tested. The do-while statement is used for handling these situations.
This statement is referred to as the post-tested looping statement. The body of a do-while
statement is executed at least once.
do
{
body of the loop
} while ( test-condition );
Without checking any condition the body of the loop is executed first. After that the condition
is evaluated. If the condition is true, the body of the loop is executed again otherwise the loop
terminates.
Fig 3.5 – Flow diagram for while loop
Example
choice = 'y';
do
{
printf("Welcome to Bioinformatics \n");
printf("Do you want to print again :");
scanf(" %c",&choice);
} while ( choice == 'y');
The above structure prints Welcome to Bioinformatics again and again as long as the choice is y.
3.3.3. The for statement
The for statement executes a statement or block of statements a certain number of times.
for ( expression-1; expression-2; expression-3)
Loop statements;
where expression-1 is used to initialize some parameter, expression-2 represents the test
condition and expression-3 is used to alter the value of the parameter initially assigned by
expression-1.
The flow diagram of the for loop is given in Fig 3.6, where i indicates the counter variable.
Example
The for loop can also be used to write an infinite loop. The format is as follows
for ( ; ; )
{
………..
………..
}
The break statement is used to terminate the infinite loop.
#include <stdio.h>
main()
{
int n,r;
while ( n > 0 )
{
r = n % 10;
Sample program 3.12
#include <stdio.h>
main()
{
int n,bn,r,sum=0;
while ( n > 0 )
{
r = n % 10;
Sample program 3.13
( Eg. 153 = 1^3 + 5^3 + 3^3 )
#include <stdio.h>
main()
{
int n,bn,r,sum=0;
while ( n > 0 )
{
r = n % 10;
sum += r * r * r;
n = n / 10;
}
Sample program 3.14
Write a C program to count the number of base character entered through the keyboard using
do-while statement
#include <stdio.h>
main()
{
char base,choice;
int count=0,n=0;
do
{
printf("Enter the character : ");
scanf(" %c",&base);
switch(base)
{
case 'a' :
case 'g' :
case 'c' :
case 't' :
case 'u' : count++;
}
n++; /*To count the total number of character*/
printf("Do U want to enter another character (y/n) : ");
Enter the character : c
Do U want to enter another character (y/n) : y
Enter the character : t
Do U want to enter another character (y/n) : n
Sample program 3.15
#include <stdio.h>
main()
{
int n,f,i;
for(i=1,f=1;i<=n;i++)
{ f *= i; }
Sample program 3.16
#include<stdio.h>
main()
{
int n,i,a,b,c;
a = -1;
b = 1;
for(i=1; i<=n; i++)
{
c = a + b;
printf("%d\t",c);
Sample program 3.17
#include <stdio.h>
main()
{
int n,flag=1,i;
if (flag)
printf("\n %d is a prime number",n);
The continue statement
The continue statement is used to skip a part of the statement block execution. The break
statement terminates the loop. The continue statement causes the loop to continue with the
next iteration by skipping all or any statement after it.
continue;
Example
for ( i=1;i<=10;i++)
{
if ( i ==7)
continue;
printf("%d\t",i);
}
#include<stdio.h>
#include<math.h>
main()
{
printf("Enter residual value for the given molecule number :");
scanf("%f",&residual_value);
if(residual_value < 0)
{
printf("Enter positive number for finding sqrt \n");
continue;
}
residual_sqrt = sqrt(residual_value);
printf("SQRT of residual value: %.3f <for molecule number
4
Arrays
An array is a sequenced collection of related data items that share a common name. Each
storage location in an array is called an array element. Individual array elements are
accessed by their index / subscript value. There are two kinds of arrays.
1. Static array: This kind of array contains a fixed number of elements and is allocated
when the program is compiled.
2. Dynamic array: This kind of array contains a dynamic number of elements decided by
the dynamic memory management when the program is run. The array is resized as
per requirement.
Where,
datatype: any structured or simple datatype
arrayname: any valid identifier
size: the number of array elements
Example
int genes[20];
char nucleosides[10];
Arrays are not only used to represent simple lists; they are also used to represent tables of
data in two, three or more dimensions. Depending on the dimensions, arrays are classified as,
One dimensional array has only one subscript or index. The index is used to identify an array
element. The index value starts from 0.
Example
int marks[5];
The one dimensional array consists of a set of contiguous memory locations and each
location is accessed by an offset value from the first location.
The following fig. 4.1 shows the hierarchical memory representation of a one dimensional
array
A[9]
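The size required to store an array in memory is the product of the size of the datatype of
the array and the number of elements in the array. For example, an integer array of 10
elements, on a compiler that uses 2 bytes for an int, requires
Size = 2 * 10 = 20 bytes.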
After the array declaration, its elements must be initialized. An array can be initialized at
either compile time or run time.
In the compile time initialization, the array values are assigned during the declaration time
itself.
The list of values is separated by commas.
Example
If the number of values is less than the array size, the value 0 will be assigned to the
remaining positions.
float hemoglobin[5] = { 7.6, 12.5, 11.2 };
will initialize the first three elements to 7.6, 12.5, 11.2 and the remaining two elements to 0.0.
The size of the array may be omitted. In such case the compiler allocates the space for all
initialized array elements.
An array can be explicitly initialized at run time. This approach is usually applied for
initializing large arrays. For example, the following C segment,
{ A[i] = i*5; }
The read function such as scanf is also used to initialize an array. For example, the statements
int molecules[3];
for ( i = 0; i < 3; i++ )
{ scanf ("%d",&molecules[i]); }
will initialize array elements with the values entered through the keyboard.
Sample program 4.1
Write a C program to declare an array containing the ORF (Open Reading Frame is the part
of an organism’s genome containing the sequence of bases that could potentially encode a
protein) lengths of various genes and initialize it.
Sample program 4.2
Write a C program to count the number of molecules with positive, negative and zero energy
stored in an array
#include <stdio.h>
main()
{
pe=ne=ze=ec=0;
printf("\n\n Number of energy molecules are %d",ec);
printf("\n Number of positive energy molecules are %d",pe);
printf("\n Number of negative energy molecules are %d",ne);
printf("\n Number of zero energy molecules are %d",ze);
Sample I/O
/*
Find the average base count
base_avg.c
*/
#include <stdio.h>
main()
{
char base[4] = { 'a', 't', 'g', 'c' };
char bmax,bmin;
int base_count[4],count;
int max,min,avg,btotal=0;
max = base_count[0];
min = base_count[0];
btotal += base_count[count];
}
Sample program 4.4
/* sort n characters
csort.c
*/
#include <stdio.h>
main()
{
char x[10],t;
int n,i,j;
printf("\nEnter %d characters\n",n);
for(i=0; i<n; i++)
scanf(" %c",&x[i]);
Result1:
Enter the number of characters to sort : 12
Maximum size of the array is 10. Try again
Result2:
Enter the number of character to sort :5
Enter 5 characters
i
w
The simplest form of multidimensional array is the two dimensional array. It is generally used
to represent tables of values. This array uses two indices. The first index represents the row
and the second index represents the column.
Example
int marks[4][3];
The following Fig. 4.2 represents the array of 3 rows and 3 columns memory representation
Row0
Row1
Row2
The size of the two dimensional array is calculated by using the formula,
Size = number of rows * number of columns * size of the datatype
For example, the size of an integer array with 3 rows and 4 columns, int a[3][4], assuming
2 bytes per int, is
Size = 3 * 4 * 2 = 24 bytes
Just like one dimensional array, two dimensional array values can also be initialized in either
compile time or run time.
In the compile time initialization, the array values are assigned during the declaration time
itself.
Example
To represent the marks of two students in three subjects, a two dimensional array with 2 rows
and 3 columns is required. The representation is int marks[2][3] ;
The values can be initialized as
int Marks[2][3] = {40,60,56,73,79,80}; (or)
int Marks[2][3] = { {40,60,56} , {73,79,80} };
will assign,
Marks[0][0] = 40; Marks[0][1] = 60; Marks[0][2] = 56;
Marks[1][0] = 73; Marks[1][1] = 79; Marks[1][2] = 80;
If the number of data elements is less than the size, the value 0 will be assigned to the
remaining locations.
Initialization at run time, usually applied for large arrays, can be done by the
following C segment,
int Marks[2][3];
The read function such as scanf is also used to initialize an array. For example, the statements
int Marks[2][3];
for ( student = 0; student <2; student++)
for ( subject=0; subject<3; subject++)
{
scanf("%d",&Marks[student][subject] );
}
initialize the array elements with the values entered through the keyboard.
#include<stdio.h>
main()
{
int matrix[5][5],m,n,row,col;
Sample I/O
Result1:
Enter the order of the matrix m*n : 6*2
Max. matrix size is 5*5
Try Again
Result2:
Enter the order of the matrix m*n : 3*2
datatype arrayname[size1][size2][size3]…………[sizen];
Example
int m[4][3][6][5];
In multidimensional arrays, the computer needs time to compute the position of an element from each index. This means that accessing an element in a multidimensional array can be slower than accessing an element in a single-dimensional array.
5
Functions
This chapter discusses functions, an important feature of the C language that is very useful for structured programming.
• Debugging is easier
• It is easier to understand the logic involved in the program
• Testing is easier
• Recursive call is possible
• Irrelevant details in the user point of view are hidden in functions
• Functions are helpful in generalizing the program
• Built-in / Library functions
• User-defined functions
The functions which are already predefined in C are called built-in/library functions. These functions are grouped into different header files. Some of the header files and the library functions they declare are given below.
1. math.h - contains the mathematical functions
2. stdio.h - contains the standard i/o functions such as scanf, printf etc.
3. ctype.h - contains the character functions
4. string.h - contains the string functions
All the mathematical functions are included in math.h header file. So before using any one of
the mathematical functions in a program, one should include the line:
# include <math.h>
The following Table 5.1 lists some of the standard mathematical functions
Function      Meaning
cos(x)        Cosine of x
sin(x)        Sine of x
tan(x)        Tangent of x
fabs(x)       Absolute value of x; fabs(-50.0) = 50.0 (for integers, abs(x) from stdlib.h)
fmod(x,y)     Remainder of x/y; fmod(10.0,3.0) = 1.0
exp(x)        e to the power x (e^x)
log(x)        Natural log of x, x>0
log10(x)      Base 10 log of x, x>0
pow(x,y)      x to the power y (x^y)
sqrt(x)       Square root of x, x>=0
ceil(x)       x rounded up to the nearest integer; ceil(12.2) = 13.0
floor(x)      x rounded down to the nearest integer
Table 5.1 Mathematical functions
#include<stdio.h>
#include<math.h>
main()
{
float OH_con,pOH,pH;
printf("\n OH concentration = %f\n pOH = %f \n pH = %f",OH_con, pOH, pH);
}
Sample I/O
The character functions are useful for testing and transforming the characters. All the
character functions are included in the ctype.h header file. This header file must be included
for using any one of the character functions.
Function      Meaning
isalpha(x)    Returns true if x is a letter
isupper(x)    Returns true if x is an uppercase letter
islower(x)    Returns true if x is a lowercase letter
isdigit(x)    Returns true if x is a digit [0-9]
isalnum(x)    Returns true if x is an alphanumeric character
isspace(x)    Returns true if x is a space, tab, return, newline or vertical tab character
tolower(x)    Converts an uppercase letter to a lowercase letter
toupper(x)    Converts a lowercase letter to an uppercase letter
Sample program 5.2
Write a C program to accept a single character from keyboard and identify whether it is a
letter or digit. If it is a letter change into opposite case
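#include<stdio.h>
#include<ctype.h>
int main(void)
{
char x;
/* The input and final output statements are missing from this copy;
the scanf and the last printf below are reconstructions. */
printf("Enter a character: ");
scanf("%c",&x);
if(isalpha(x))
{
if(islower(x))
printf("Uppercase of %c is %c",x,toupper(x));
else
printf("Lowercase of %c is %c",x,tolower(x));
}
else if(isdigit(x))
printf("%c is a digit",x);
else
printf("%c is neither a letter nor a digit",x);
return 0;
}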
5.3.2 User defined functions
The functions which are developed by the user are called user defined functions. These functions are created depending on the need of the user. All user defined functions can be compiled separately.
a. Function Definition
A function definition, also known as function implementation shall include the following
elements:
1. Function name
2. Function type
3. List of parameters
4. Local variable declarations
5. Function statements and
6. A return statement
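All the six elements are grouped into two parts, namely, the function header and the function body. The general form is
function_type function_name(parameter list)
{
local variable declarations;
statement1;
statement2;
…………
return statement;
}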
The first line is known as the function header and the statements within the opening and
closing braces constitute the function body, which is a compound statement.
Function Header
The function header consists of three parts: the function type (also known as return type), the
function name and the formal parameter list. Semicolon is not used at the end of the function
header.
The function type specifies the type of value (like float or double) that the function is expected to return to the program calling the function. If the return type is not explicitly specified, older (C89) compilers assume that it is an integer type; C99 and later standards require the return type to be written explicitly. If the function is not returning anything, then the return type must be specified as void. void is one of the fundamental data types in C. The value returned is the output produced by the function.
The function name is any valid C identifier and therefore must follow the same rules of
formation as other variable names in C.
The parameter list declares the variables that will receive the data sent by the calling program. They serve as input data to the function to carry out the specified task. Since they stand in for the actual input values, they are referred to as formal parameters. When pointers are used, these parameters can also send values back to the calling program. The parameters are also known as arguments.
The parameter list contains declarations of variables separated by commas and surrounded by parentheses. There is no semicolon after the parentheses.
Examples:
Declarations of parameter variables cannot be combined. That is, void swap(int x,y) is illegal. A function need not always receive values from the calling program. In this situation, the keyword void is used to indicate that the formal parameter list is empty.
Example
void printline(void)
{
…
…
}
This function neither receives any input values nor returns back any value. Many compilers
accept an empty set of parentheses, like void printline(). But it is good programming style to
use void to indicate a null parameter list.
Function Body
The function body contains the declarations and statements necessary for performing the required task. The body, enclosed in braces, contains three parts, in the following order:
1. Local declarations that specify the variables needed by the function
2. Function statements that perform the task of the function
3. A return statement that returns the value evaluated by the function
The return statement is not required if the function does not return any value.
The function can return only one value in one function call. The general form of the return
statement is
return;
(or)
return (expr);
Example
return; // does not return any value
return(a); // returns the value of the identifier a
return( a*b ); // returns the value of the expression a*b
b. Function calls
The function can be called by the function name followed by the list of actual parameters.
Example
main()
{
int y;
y = sum(15,-10);
printf("Sum = %d",y);
}
When the program reaches the function call, the control is transferred to the function sum(). This function is then executed line by line as described and a value is returned when a return statement is encountered. This value is assigned to y. This is illustrated below.
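main()
{
int y;
y = sum(15,-10); /* function call */
printf("Sum = %d",y);
}
int sum(int x, int y) /* called function */
{
return x+y;
}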
The function call sends two integer values 15 and -10, which are assigned to x and y
respectively. The function computes the value and returns the value 5 to the main where it is
assigned to y.
When a prototype of the function is in scope, the compiler reports an error if the number of actual parameters does not match the number of formal parameters. In old-style code without a prototype the mismatch is not detected and the behaviour is undefined: in practice, extra actual arguments are usually discarded and unmatched formal arguments hold garbage values. A mismatch in data types may likewise result in garbage values.
c. Function Declaration
Like variables, all functions in a C program must be declared, before they are invoked. A
function declaration (also known as function prototype) consists of four parts.
• Function type
• Function name
• Parameter list
• Terminating semicolon
For example, sum function defined in the previous section will be declared as:
int sum( int x, int y); /* function prototype */
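A function, depending on whether arguments are present or not and whether a value is returned or not, may belong to one of the following categories:
• Functions with no arguments and no return value
• Functions with arguments and no return value
• Functions with arguments and a return value
• Functions with no arguments and a return value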
Sample program 5.3
Write a user defined function to draw a line in different pattern
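#include<stdio.h>
void drawLine(char,int);
int main(void)
{
drawLine('*',15);
drawLine('#',10);
drawLine('-',20);
drawLine('@',5);
return 0;
}
/* The definition is missing from this copy; a minimal reconstruction. */
void drawLine(char ch,int n)
{
int i;
for(i=0;i<n;i++)
putchar(ch); /* print the pattern character n times */
putchar('\n');
}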
Sample program 5.4
Write a user defined function to search an element in an array
#include <stdio.h>
main()
{
int a[20],n,i,se,p;
int Linear_Search(int[],int,int);
printf("Enter %d elements\n",n);
for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("Enter the searching element: ");
scanf("%d",&se);
Sample I/O
Result1:
Enter the number of elements : 25
Max. size of an array is 20
Try Again
Result2:
Enter the number of elements : 5
5.5 RECURSION
#include<stdio.h>
main()
{
int n,r,nf,rf,nrf,ncr;
int fact(int);
if (n>=r)
5.6 THE SCOPE, VISIBILITY AND LIFETIME OF VARIABLES
All the variables in C have a storage class. The storage class is used to describe the scope, visibility and life time of variables. The scope of a variable determines over what region of the program the variable is actually available for use (active). The visibility refers to the accessibility of a variable from the memory. The life time of a variable refers to the period during which the variable retains a given value during execution of the program (alive).
• Automatic variables
• External variables
• Static variables
• Register variables
The variables may also be broadly categorized as internal (local) or external (global), depending on the place of their declaration. Variables which are declared inside a function are called internal variables and those which are declared outside any function are called external variables.
Automatic variables are the variables which are declared inside the function. They are created
when the function is called and destroyed automatically when the function is exited, hence
the name automatic. Automatic variables are therefore local to the functions and also called
as local or internal variables.
A variable declared inside a function without any storage class specification is, by default, an automatic variable. The keyword auto may also be used to declare an automatic variable explicitly.
Example
main()
{
auto int x;
-------
-------
}
The important feature of automatic variables is that their values cannot be changed by any other function in the same program. This ensures that the same variable name can be used in different functions of the same program without causing any conflict.
#include<stdio.h>
main()
{
auto int chromosomes = 36;
5.6.2 EXTERNAL VARIABLES
Variables that are both alive and active throughout the entire program are known as external
variables. They are also known as global variables. Global variables can be accessed by any
function in the program. The variables which are declared outside the function are referred to
as external variables by default.
Example
int x ; /* external variable – implicit declaration */
main()
{
extern int y; // external variable – explicit declaration
-------------
-------------
}
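#include<stdio.h>
char blood_gp; /* global variable */
int main(void)
{
void change_blood_gp();
blood_gp='A';
printf("Blood Group\n");
printf("\nBefore function call: %c",blood_gp);
change_blood_gp();
printf("\nAfter function call: %c",blood_gp);
return 0;
}
/* The rest of the listing is missing from this copy; the definition below
is a minimal reconstruction and the new value 'B' is an assumption. */
void change_blood_gp()
{
blood_gp='B'; /* modifies the global variable */
}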
5.6.3. STATIC VARIABLES
The value of static variables persists until the end of the program. A variable can be declared
static using the keyword static like
static int x;
A static variable may be either an internal type or an external type depending on the place of
declaration.
Internal static variables are those which are declared inside a function. The scope of an internal static variable extends up to the end of the function in which it is defined. Therefore, internal static variables are similar to auto variables, except that they remain alive throughout the remainder of the program. Internal static variables can thus be used to retain values between function calls. For example, one can be used to count the number of calls made to a function.
#include<stdio.h>
main()
{
int i;
void degradation();
5.6.4 REGISTER VARIABLES
A register variable asks the compiler to keep the variable in one of the machine's registers, instead of keeping it in memory. Since a register access is much faster than a memory access, keeping frequently used variables in registers can lead to faster execution of the program. This is requested as follows:
register int x;
Although the ANSI standard does not restrict its application to any particular data type, most compilers allow only int or char variables to be placed in a register. Only a few variables can be placed in registers. However, C will automatically treat register variables as non-register variables once the limit is reached.
6
Strings
A string is a sequence of characters that is treated as a single data item. A string is terminated
by a null character. The null character serves as the “end-of-string” marker.
C does not support string as a data type. Instead, a string is represented as a character array. The general form of string declaration is as follows:
char string_name[size];
The size determines the number of characters assigned to the string variable.
Example
char sequence[20];
char acid_name[10];
When the compiler assigns a character string to a character array, it automatically supplies a
null character(‘\0’) at the end of the string. Therefore, the size should be equal to the
maximum number of characters in the string plus one.
6.2 STRING INITIALIZATION
The size of the array is declared as 8, because the alanine length is 7 plus one is assigned for
the null character.
The size is optional when the array is initialized in the declaration. If the values are given as a list of individual characters, the null character must also be included explicitly.
Example
char acid_name[] = { 'g','l','u','t','a','m','i','n','e','\0' };
C does not allow the initialization to be separated from the declaration. That is,
char seq[5];
seq = "agtc"; // is not allowed.
Similarly, C does not allow one string to be assigned to another with the assignment operator.
Example
char food[8]="protein";
char pulses[8];
pulses = food; // is not allowed
The standard input function scanf with %s as the control string is used to read strings from the terminal.
Example
char carbo_hydrate[15];
scanf("%s",carbo_hydrate);
The scanf function terminates its input on the first white space it finds. White space includes blanks, tabs, carriage returns, form feeds, and new lines. Therefore, if the following line is typed in at the terminal,
brown rice
then only the string "brown" will be read into the array carbo_hydrate. While reading strings, the ampersand (&) symbol is not required in the scanf function, because the array name itself points to the memory location.
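To avoid this problem, the gets function can be used to read strings along with the white space. This function is declared in the stdio.h header file. The general form of the gets function is
gets(string);
Because gets cannot limit the number of characters it reads, it was removed from the C standard in C11; fgets(string, size, stdin) is the safer replacement.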
The printf and puts functions are used to print strings to the screen. The general format of these functions is as follows:
printf("%s",string_name);
puts(string_name);
Example
printf("%s",carbo_hydrate);
puts(carbo_hydrate);
6.5 READING A LINE OF TEXT
C supports a format specification known as edit set conversion code %[..] that can be used to
read a line containing a variety of characters, including whitespaces.
Example
char line[80];
scanf("%[^\n]",line);
printf("%s",line);
will read a line of input from the keyboard and display the same on the screen.
The following Table 6.1 lists the most commonly used string functions. All the string functions are declared in the string.h header file.
Function                      Meaning
strlen(string)                Returns the number of characters in the string before '\0'
strcpy(string1,string2)       Copies string2 into string1
strncpy(string1,string2,n)    Copies at most n characters of string2 into string1
strcat(string1,string2)       Appends string2 to the end of string1
strncat(string1,string2,n)    Appends at most n characters of string2 to the end of string1
strchr(string,ch)             Returns a pointer to the first instance of ch in string. Returns a NULL pointer if ch is not in the string
strcmp(string1,string2)       Compares string1 and string2. Returns 0 if they are the same, a number <0 if string1<string2, a number >0 if string1>string2
strncmp(string1,string2,n)    Compares up to n characters of string1 and string2. Returns 0 if they are the same, a number <0 if string1<string2, a number >0 if string1>string2
strstr(string1,string2)       Returns a pointer to the first instance of string2 in string1. Returns a NULL pointer if string2 is not in string1
strrev(string)                Reverses the string (not part of standard C; provided only by some compilers)
Table 6.1 String Handling Functions
#include<stdio.h>
main()
{
int i,a,g,c,t,gap;
float ac,gc,cc,tc;
char seq[50];
a=g=c=t=gap=0;
Sample I/O
agtcgcatgccta
Base a=3
Sample Program 6.2
Write a C program to count the number of gaps in the given sequence
#include<stdio.h>
main()
{
Sample Program 6.3
Write a c program to convert the DNA sequence to RNA sequence
#include<stdio.h>
#include<ctype.h>
main()
{
int i;
char seq[20];
for(i=0;seq[i]!='\0';i++)
Sample Program 6.4
Write a C program to find the reverse complement of the given sequence
#include<stdio.h>
#include<string.h>
#include<ctype.h>
main()
{
void reverse (char[]);
char seq[60], new[60];
int i,j,l;
for(i=0;seq[i]!='\0';i++)
{
if(islower(seq[i]))
seq[i]=toupper(seq[i]);
}
for(i=0;seq[i]!='\0';i++)
{
switch(seq[i])
{
case 'A':
seq[i]= 'T';break;
case 'T':
seq[i]= 'A';break;
case 'G':
seq[i]= 'C';break;
case 'C':
seq[i]= 'G';break;
default:
seq[i]= '-';
}
}
l= strlen(seq);
for(i=l-1,j=0;i>=0;i--,j++)
{
new[j]= seq[i];
}
new[j]= '\0';
printf("\tTRANSLATED SEQ.: ");
Sample Program 6.5
Write a C program to perform case conversion
#include<stdio.h>
#include<ctype.h>
main()
{
char string[20];
void lowercase(char[]);
void uppercase(char[]);
int choice;
printf("\tCASE CONVERSION");
printf("\n1.Lowercase");
printf("\n2.Uppercase");
printf("\n3.Exit\n");
do
{
printf("\nEnter the choice: ");
scanf("%d",&choice);
switch(choice)
{
case 1 : printf("Enter the string in uppercase\n");
void lowercase(char s[])
{
int i;
for(i=0;s[i]!='\0';i++)
s[i]=tolower(s[i]);
printf("Lower case string is\n");
puts(s);
}
Sample I/O
CASE CONVERSION
1.Lowercase
2.Uppercase
3.Exit
Sample program 6.6
Write a C program to check the given sequence is palindrome sequence or not
#include<stdio.h>
#include<string.h>
main()
{
char seq[40];
int palindrome(char[]);
int p;
p=palindrome(seq);
if(p)
printf("\n%s\nis a palindrome sequence",seq);
else
printf("\n%s\nis not a palindrome sequence",seq);
TCTAGACTGA
Sample Program 6.7
Molecular weight of single stranded DNA molecules
The total molecular weight (MW) of single stranded DNA molecules, such as synthetic
oligonucleotides, can be determined by adding the molecular weight of individual nucleotides
( a, t, g and c) using the formula
Write a C program to find the molecular weight of a single stranded DNA molecule after checking whether it is phosphorylated or dephosphorylated.
#include<stdio.h>
#include<ctype.h>
#include<string.h>
printf("Phosphorylated [p] or dephosphorylated [d]: ");
scanf("%c",&ptype);
len=strlen(dna);
na=nt=ng=nc=0;
for(i=0;i<len;i++)
{
switch(tolower(dna[i]))
{
case 'a' : na++;
break;
case 't' : nt++;
break;
case 'g' : ng++;
break;
case 'c' : nc++;
break;
default :
printf("\n Wrong base value");
printf("\nExiting");
return;
Sample I/O
P-type = P
The given DNA sequence : atgcatgc
#include<stdio.h>
#include<string.h>
main ()
{
char *rs,str[100],mtf[100];
int count=0;
Write a C program to align two sequences
#include<stdio.h>
#include<string.h>
main()
{
int i,l1,l2;
char a, seq1[100],seq2[100],aseq[100];
l1 = strlen(seq1);
l2 = strlen(seq2);
if( l1>l2)
for(i=l2;i<l1;i++)
aseq[i]='-';
else
for(i=l1;i<l2;i++)
aseq[i]='-';
aseq[i]='\0';
printf("\nSequence are\n");
printf("%s\n%s",seq1,seq2);
Sample I/O
Result1
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGG
AGAGG
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGA
ATGCC
Sequences are
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGG
Sample Program 6.10
Write a C program to sort ’n’ names
/* sort n names
sort_names.c
*/
#include <stdio.h>
#include <string.h>
main()
{
char names[20][30],t[30];
int n,i,j;
printf("\nEnter %d names\n",n);
for(i=0; i<n; i++)
scanf("%s",names[i]);
Sample I/O
Enter 5 names
protein
dna
alanine
nucleotide
rna
Sample program 6.11
Write a C program to build a matrix from two sequences inputted by the user
#include<stdio.h>
#include<string.h>
#define M_ROW 20
#define M_COL 20
void print_seq_matrix(int,int);
void build_seq_matrix();
void build_seq_matrix()
{
int i,j,rs;
int row,col;
char seq1[15],seq2[15];
rs=0;
// Fills up the rest of the matrix with 1
print_seq_matrix(row,col);
}
Sample I/O
Result1
Enter the first DNA sequence
agctcga
The computer’s memory is a sequential collection of storage cells. Each cell, commonly
known as a byte, has a number called address associated with it. Typically, the addresses are
numbered consecutively, starting from zero. The last address depends on the memory size. A
computer system having 64K memory will have its last address as 65,535.
Consider the statement
int seq_length = 120;
This statement instructs the system to find a location for the integer variable seq_length and to put the value 120 in that location. Suppose the system chooses the address location 4800 for seq_length. This can be represented by the following figure Fig.7.1
Variable : seq_length
Value    : 120
Address  : 4800
Fig.7.1 Variable representation
The value associated with the variable may be accessed by using either the variable name or the address. Since memory addresses are simply numbers, they can be assigned to variables that are themselves stored in memory. Variables that hold the address of another variable are called pointers.
A pointer is a special type of variable which points to the location of another variable. In general, variables hold the values that are stored in them. In contrast, pointer variables hold the address of the memory location of other variables. Pointers can point to variables of any basic data type, such as int, char, float and double, and also to derived data types like arrays, structures and unions (the last two will be discussed later in the book).
7.2 ADVANTAGES OF POINTERS
For example
p= &seq_length; /*assigns the address 4800 */
For example
p = &seq_length;
Then *p contains the value 120.
Since pointer variables contain addresses, which belong to a separate data type, they must be declared as pointers before they are used. The declaration of a pointer variable is given below:
data_type *ptr_name;
Example
int *a;
float *pH;
A pointer variable can be declared for any data type. It is important that a pointer variable of integer type holds the address of an integer variable only. Similarly, a pointer variable of float type holds the address of a float variable only. The data type of the pointer variable must match the type of the variable whose memory address is stored in it.
The following figure Fig 7.2 shows that each ordinary variable has a memory location address, where the actual value is stored. For example, the ordinary variable x has the value 100 and the memory location 4800, and the ordinary variable y has the value 200 and the memory location 4802 (assuming a 2-byte int). The address of x is obtained by using the & operator and stored into the pointer variable px by a simple assignment statement. Similarly, the address of y is obtained by using the & operator and stored into the pointer variable py. In C this can be done by the following segment,
int x=100,y=200;
int *px, *py;
px = &x;
py = &y;
Sample Program 7.1
Write a C program to illustrate the concept of pointer
There is a close relationship between pointers and arrays. Any operation which can be achieved by array subscripting can also be done with pointers. Pointer arithmetic was traditionally considered faster than array indexing, although modern compilers usually generate equivalent code for both. If an array aacid[] is declared, then aacid is the address of the first element; &aacid[0] also represents the address of its first element.
Consider the declaration
char aacid[50], *paacid;
In the above statement, a sequence array aacid of length 50 characters and a char pointer paacid are declared. The pointer can be initialized to the address of the first element of the array aacid[] in the following manner.
paacid = aacid; (or) paacid = &aacid[0];
As the pointer is a variable, it is easy to point it to any element in the array and modify the contents it points to.
The pointer paacid in the above statement points to the first element in the array. To access the next element in the array, the pointer should be incremented so that it points to the next element. The two most common operations on pointers are increment and decrement. When a pointer is incremented or decremented, the pointer value increases or decreases by the size of the data type, thus pointing to the next or previous element of the array.
The increment operator increases the value of the pointer by one element, which means that the pointer points to the next storage location of its type. In case of an array, it points to the next array element.
paacid++ the value of the pointer is increased by the size of a char (1 byte), as each element in the array is of char type. If paacid were an integer pointer, the value would increase by the size of an int instead.
paacid-- the value of the pointer is decreased by the size of a char (1 byte), as each element in the array is of char type.
7.7. POINTER AND FUNCTIONS
Syntax
Example
7.8 FUNCTION CALL BY VALUE AND CALL BY REFERENCE
In C, there are two ways in which arguments can be passed to a function. The first is call by value. This method copies the value of an argument into the formal parameter of the function. In this case, changes made to the parameter have no effect on the argument.
Call by reference is the second way of passing arguments to a function. In this method, the address of an argument is copied into the parameter. Inside the function, the address is used to access the actual argument used in the call. This means that changes made through the parameter affect the argument. In C, call by reference is achieved by passing a pointer to an argument, instead of passing the argument itself.
Sample Program 7.5
Write a C program to demonstrate the function call by reference
8
Structures & Unions
An array is a collection of similar data items. C provides a user defined data type called structure which can group similar or dissimilar data types. The variables in the structure are called members of the structure. So a structure may contain members of type int, float, char or double. A structure can also be nested, and it can contain an array of another structure. The main advantage of a structure is that one can handle the related data of a particular entity as a single record.
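The structure is declared by the keyword struct. The general format of the structure is as given below.
struct structure_name
{
data_type member-1;
data_type member-2;
…………
};
The variables of the structure are declared as given below:
struct structure_name variable-1, variable-2, variable-3;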
Example
struct organism
{
int temperature;
char salt[10];
float pH;
};
The structure can also be created using the keyword typedef. The general format of a structure using the keyword typedef is as follows:
typedef struct
{
data_type member-1;
data_type member-2;
…………
} structure_name;
If the structure is created using typedef, the keyword struct is not needed while declaring the actual variables. The actual variables can be created as follows:
structure_name variable-1, variable-2, variable-3;
Example
typedef struct {
int temperature;
char salt[10];
float pH;
} organism;
The advantage of creating a structure using the keyword typedef is, to avoid typing the
keyword struct repeatedly while declaring the variables, passing the structure in a function,
allocating memory and so on.
In general, the structures are declared outside the main() function globally, so that they can
also be accessed by all other functions.
8.2 STRUCTURE INITIALIZATION
The members of the structure can also be initialized in the same format as an array.
Example,
The initialization for the above structure created using typedef is as follows:
The structure member can be accessed by using the member operator (.). The general syntax
of accessing structure member is as follows:
structure_variable.member_name;
For example, org.temperature, org.salt and org.pH denote the three members of the organism structure variable org.
8.4 ARRAY OF STRUCTURES
Arrays of structures are declared in the same way as ordinary arrays.
For example
organism org[20];
declares the variable org as an array of 20 elements of the organism structure type. So we can process the data of 20 organisms.
Sample Program 8.2
Write a C program to check the stability of n DNA sequences
8.5 NESTED STRUCTURES
A structure can contain other structures that have their own members.
Example
typedef struct
{
int concentration;
float absorbtion;
}absorbance;
This can be nested in the structure DNA_abs_260 as follows:
typedef struct
{
absorbance r1;
absorbance r2;
absorbance r3;
absorbance r4;
}DNA_abs_260;
8.6 UNIONS
A union is similar to a structure in that it can group different data types. The main difference is that a union can hold data for only one of its members at a given time, whereas a structure can hold data for all of its members. A structure allocates memory for each member separately, but a union allocates only one memory space and all the members share the same memory location. The size of the union is at least the size of its largest member.
Sample program 8.4
Write a C program to illustrate the use of union
9
File Processing in C
The formatted input and output functions such as scanf() and printf() are used to read and write data through the console, that is, the keyboard and the monitor. These methods are fine if the amount of data is small. But many real life problems involve large amounts of data, such as reading 100 sequences or sorting 100 student names. In this situation console I/O operations are subject to two major problems: they are time consuming, and the input and output data are not preserved. Once the program terminates, the entire data is lost. So it is necessary to store the data and read it whenever required. This can be achieved by using files. A file is a place on the disk where a group of related data is stored. By using files the data can be stored permanently. C provides a set of functions for processing the data in a file. By using these functions, data can easily be written to a file and read back from it.
Example
Seq.txt
Protein
Amino_acid
The data structure of a file is defined as FILE in the standard I/O (stdio.h) C library. FILE is a predefined data type. Every file used in a program must be declared through a pointer of the FILE data type.
To open the file the function fopen() is used.
Syntax
file-pointer = fopen(file-name,file-mode);
The function has two string arguments. The first one is the name of the file to open and the second one specifies the purpose (file-mode) of opening the file. The most important modes for file processing are "w" to open the file for writing, "r" to open the file for reading and "a" to open the file for appending data. fopen returns a NULL pointer if the file cannot be opened.
Example
FILE *fin, *fout;
fin = fopen("dna.seq","r");
fout = fopen("rna.seq","w");
The first fopen statement opens the file dna.seq for reading using fin and the second opens the file rna.seq for writing using fout.
Example
fclose (fin);
fclose (fout);
The following table 9.1 shows the important file handling functions that are available in C
library.
Syntax
Example
fputc (ch,fout);
ch = fgetc(fin);
The first statement writes a character to the file using the fout pointer and the second statement reads a single character from the file and stores it in the variable ch using the fin pointer.
Syntax
Example
char line[25] = "agctagctagctagcgagctct";
fputs(line,fout);
fgets(line, 25, fin);
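The first statement writes a string into the file using the fout file-pointer and the second statement reads a string from the file and stores it into the variable line, using the fin file-pointer.
Syntax
fscanf(file-pointer, "control string", &var-1, &var-2, …, &var-n);
fprintf(file-pointer, "control string", var-1, var-2, …, var-n);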
The first statement reads the formatted data from the file using the file pointer in the format
specified by the control string and assign the values to the variables var-1,var-2, …. , var-n.
The second statement writes the data in the variables var-1, var-2, var-3, … var-n into a file
using the file pointer in the format specified by the control string.
Example
char ptype[] = "DNA", pt[10];
int length = 120,l;
fprintf ( fout, "%s %d", ptype, length);
fscanf( fin, "%s %d", pt, &l);
Sample Program 9.2
Write a C program to convert the DNA sequences into RNA sequences using FILE
Sample Program 9.4
Write a C program to access the formatted data from the file