
Records & Arrays

The document discusses records (structures) and arrays in programming languages, detailing how different languages implement these data types. It covers the definition, memory layout, and initialization of records, as well as the characteristics, operations, and categories of arrays. Key differences in handling records and arrays across languages like C, Pascal, Java, and Fortran are highlighted, alongside their implications for memory management and performance.

Uploaded by

M Nandini

Records (Structures)

Records (Structures) Types
• Record types allow related data of heterogeneous types to be stored and manipulated together.
• Algol 68, C, and Common Lisp use the term "structure" instead of "record".
• Fortran 90 simply calls its records "types".
• Structures in C++ are defined as a special form of class.
• Java uses classes in all cases.
• C# uses a reference model for variables of class types and a value model for variables of struct types.
Records and Variants contd.

Pascal:
type element = record
    name : array[1..2] of char;
    atomic_number : integer;
    atomic_weight : real;
    metallic : Boolean
end;

C:
struct element {
    char name[2];
    int atomic_number;
    double atomic_weight;
    _Bool metallic;
};

ML:
type element = {
    name: string,
    atomic_number: int,
    atomic_weight: real,
    metallic: bool
};
Records contd.
Each component of a record is known as a field.
To refer to a given field of a record, most languages use "dot" notation.
Ex: element copper;
copper.name[0] = 'C'; copper.name[1] = 'u';
ML differs from most languages in specifying that the order of record fields is insignificant.
ML record value example:
{name = "Cu", atomic_number = 29, atomic_weight = 63.5
Records (Structures)
• Memory layout and its impact (structures)

Figure: Likely layout in memory for objects of type element on a 32-bit machine. Alignment restrictions lead to the shaded "holes."
Records (Structures)
Memory layout and its impact (structures)
Layout in memory for objects of type element on a 32-bit machine:
The name field is only two characters long, so it occupies 2 bytes in memory.
Since atomic_number is an integer and must be word aligned, there is a 2-byte hole between the end of name and the beginning of atomic_number.
Since a Boolean variable occupies a single byte, there are three bytes of empty space between the end of the metallic field and the next aligned location.
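The holes described above can be observed directly. The following is a minimal sketch in Python, assuming that the standard ctypes module lays out the structure the way the platform's C compiler would. The helper names layout and holes are illustrative only, and the exact sizes depend on the machine (a 64-bit machine will typically pad the end of the record to an 8-byte boundary rather than a 4-byte one).

```python
import ctypes

class Element(ctypes.Structure):
    # Mirrors the C declaration of struct element.
    _fields_ = [
        ("name", ctypes.c_char * 2),
        ("atomic_number", ctypes.c_int),
        ("atomic_weight", ctypes.c_double),
        ("metallic", ctypes.c_bool),
    ]

def layout(struct_type):
    """Return a list of (field name, offset, size) in declaration order."""
    rows = []
    for field_name, _field_type in struct_type._fields_:
        descriptor = getattr(struct_type, field_name)
        rows.append((field_name, descriptor.offset, descriptor.size))
    return rows

def holes(struct_type):
    """Return (start offset, length) of every padding gap, including the tail."""
    gaps = []
    end_of_previous = 0
    for _name, offset, size in layout(struct_type):
        if offset > end_of_previous:
            gaps.append((end_of_previous, offset - end_of_previous))
        end_of_previous = offset + size
    total = ctypes.sizeof(struct_type)
    if total > end_of_previous:
        gaps.append((end_of_previous, total - end_of_previous))
    return gaps

for row in layout(Element):
    print(row)
print("holes:", holes(Element))
```

On common platforms the first reported hole is the 2-byte gap between name and atomic_number.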
Records (Structures)
• Memory layout and its impact (structures, Pascal)

Figure 7.2: Likely memory layout for packed element records (putting the fields together, without holes). The atomic_number and atomic_weight fields are nonaligned, and can only be read or written (on most machines) via multi-instruction sequences.
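The space saved by packing can be estimated with Python's standard struct module. This is only a sketch: the format string is an assumed encoding of the element record, and struct does not add padding after the last field, so the aligned figure is a lower bound on what a C compiler would report with sizeof.

```python
import struct

# Format codes: 2s = char[2], i = int, d = double, ? = _Bool.
FIELDS = "2sid?"

# "@" uses the machine's native sizes and alignment, so padding is inserted
# between fields. "=" uses standard sizes with no padding: a packed layout.
aligned_size = struct.calcsize("@" + FIELDS)
packed_size = struct.calcsize("=" + FIELDS)

print("aligned:", aligned_size, "packed:", packed_size)
```

The packed size is simply the sum of the field sizes (2 + 4 + 8 + 1 bytes).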
Records (Structures)
• Memory layout and its impact (structures)

Figure: Rearranging record fields to minimize holes. By sorting fields according to the size of their alignment constraint, a compiler can minimize the space devoted to holes, while keeping the fields aligned.
[byte aligned, half-word aligned, word aligned]
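The effect of sorting fields by alignment can be sketched with a small layout calculator. The function record_size and the size and alignment figures below are assumptions chosen to match a 32-bit machine with 4-byte word alignment; a real compiler's rules may differ.

```python
def record_size(fields, word=4):
    """Lay out (name, size, alignment) fields in order.

    Returns (offsets by field name, total size including tail padding).
    """
    offsets = {}
    position = 0
    for name, size, alignment in fields:
        # Round position up to the next multiple of the field's alignment.
        position = (position + alignment - 1) // alignment * alignment
        offsets[name] = position
        position += size
    # Pad the record so the next record in an array stays word aligned.
    total = (position + word - 1) // word * word
    return offsets, total

# Assumed sizes and alignments for a 32-bit machine.
element = [
    ("name", 2, 1),
    ("atomic_number", 4, 4),
    ("atomic_weight", 8, 4),
    ("metallic", 1, 1),
]

declared_offsets, declared_total = record_size(element)

# Sort by alignment constraint, smallest first (byte, half word, word).
by_alignment = sorted(element, key=lambda field: field[2])
sorted_offsets, sorted_total = record_size(by_alignment)

print("declared order:", declared_total, "bytes")
print("sorted order:  ", sorted_total, "bytes")
```

Under these assumptions the declared order needs 20 bytes and the sorted order 16, because name and metallic share one word.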
University Question
Q: Consider the following record of a particular language. Let the size of each char variable be 1 byte, int be 4 bytes, and boolean be 1 byte. Draw the memory layout for the record on a 32-bit aligned machine.
struct student {
    char name[2];
    int age;
    boolean scholarship;
};
University Question
UQ: How are records represented in programming languages?
Record types allow related data of heterogeneous types to be stored and manipulated together.
Each component of a record is known as a field.
To refer to a given field of a record, most languages use "dot" notation.
Ex: element copper;
copper.name[0] = 'C'; copper.name[1] = 'u';

PASCAL RECORD
type element = record
    name : array[1..2] of char;
    atomic_number : integer;
    atomic_weight : real;
    metallic : Boolean
end;
Arrays
Arrays
• Arrays are the most common and important
composite data types
• An array is an aggregate of homogeneous data
elements in which an individual element is identified
by its position in the aggregate, relative to the first
element.
• A reference to an array element in a program often
includes one or more nonconstant subscripts.
• Such references require a run-time calculation to
determine the memory location being referenced.
Arrays & indexes
• Indexing is a mapping from indices to elements.
• The mapping can be shown as:
map(array_name, index_value_list) → an element
• C-based languages use [ ] to delimit array indices.
• Two distinct types are involved in an array type:
o The element type, and
o The type of the subscripts.
Ex: A(3) in FORTRAN and Ada, A[3] in Pascal and C
Array requirements
• Subscript types:
FORTRAN, C: int only
Pascal: any ordinal type (int, boolean, char, enum)
Ada: int or enum (includes boolean and char)
Java: integer types only
• Index range checking:
C, C++, Perl, and Fortran do not specify range checking.
Java, ML, and C# specify range checking.
In Ada, the default is to require range checking, but it can be turned off.
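A range check is simply a comparison against the array bounds made before the element is accessed. The sketch below spells that check out in Python; checked_get is a hypothetical helper written for illustration. Python's own indexing also raises IndexError for out-of-range subscripts, but it accepts negative subscripts, which this helper deliberately rejects.

```python
upper = ["A", "B", "C"]

def checked_get(array, index):
    """Return array[index], rejecting any index outside 0..len(array)-1."""
    if not 0 <= index < len(array):
        raise IndexError(f"index {index} outside 0..{len(array) - 1}")
    return array[index]

print(checked_get(upper, 2))
try:
    checked_get(upper, 3)
except IndexError as error:
    print("rejected:", error)
```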
Arrays: Declaration
• In C:
char upper[26];

• In Pascal:
var upper : array ['a'..'z'] of char;

• In Fortran:
character, dimension (1:26) :: upper
character (26) upper   ! shorthand

• Note that array indices start at 0 in C, whereas in Fortran they start at 1 by default.
Design Issues
• What types are legal for subscripts?
• Are subscripting expressions in element
references range checked?
• When are subscript ranges bound?
• When does array allocation take place?
• Are ragged or rectangular multidimensioned
arrays allowed, or both?
• Can arrays be initialized when they have
their storage allocated?
• What kinds of slices are allowed, if any?
Arrays: Initialization
• Some languages allow an array to be initialized with a list of values that are placed in the array in the order in which the array elements are stored in memory.
• C and C++: put the values in braces; the compiler can count them to determine the array length.
int stuff[] = {2, 4, 6, 8};
• Ada provides two mechanisms for initializing arrays in the declaration statement: listing the values in the order in which they are to be stored, or directly assigning them to index positions using the => operator, which in Ada is called an arrow. For example:
List : array (1..5) of Integer := (1, 3, 5, 7, 9);
Bunch : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0);
Array layouts: why we care
• Layout makes a big difference for access speed. One trick in high-performance computing is simply to traverse an array in the order in which it is laid out in memory (row-major order in most languages), so that consecutive accesses fall in the same cache line.
• Two layout strategies for arrays:
• Contiguous elements
• Row pointers
Layout of Arrays
1.a) Row-major layout: each row of the array is in a contiguous chunk of memory
• row major: used by most languages other than Fortran
1.b) Column-major layout: each column of the array is in a contiguous chunk of memory
• column major: used by Fortran
2. Row-pointer layout: an array of pointers to rows lying anywhere in memory
Array layouts

1. Contiguous elements
a) Column major: consecutive memory locations hold elements that differ by 1 in the initial subscript.
• A[2,4] is followed by A[3,4]
• used by Fortran
b) Row major: consecutive memory locations hold elements that differ by 1 in the final subscript.
• A[2,4] is followed by A[2,5] in memory
• used by most other languages
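The two contiguous layouts differ only in which subscript is multiplied by the length of the other dimension. A minimal sketch, assuming zero-based subscripts and offsets counted in elements rather than bytes (the function names are illustrative):

```python
def row_major_offset(i, j, rows, cols):
    """Offset of A[i, j] when each row is stored contiguously."""
    return i * cols + j

def column_major_offset(i, j, rows, cols):
    """Offset of A[i, j] when each column is stored contiguously."""
    return j * rows + i

ROWS, COLS = 10, 10

# Row major: A[2,4] is immediately followed by A[2,5].
print(row_major_offset(2, 4, ROWS, COLS), row_major_offset(2, 5, ROWS, COLS))

# Column major: A[2,4] is immediately followed by A[3,4].
print(column_major_offset(2, 4, ROWS, COLS), column_major_offset(3, 4, ROWS, COLS))
```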
Arrays

Figure Row- and column-major memory layout for two-dimensional arrays. In row-major order, the
elements of a row are contiguous in memory; in column-major order, the elements of a column are
contiguous. The second cache line of each array is shaded, on the assumption that each element is
an eight-byte floating-point number, that cache lines are 32 bytes long (a common size), and that
the array begins at a cache line boundary. If the array is indexed from A[0,0] to A[9,9], then in
the row-major case elements A[0,4] through A[0,7] share a cache line; in the column-major case
elements A[4,0] through A[7,0] share a cache line.
Array Layout
2.Row pointers
• an option in C
• allows rows to be put anywhere - nice for big
arrays on machines with segmentation
problems
• avoids multiplication
• nice for matrices whose rows are of
different lengths
• e.g. an array of strings
• requires extra space for the pointers
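The space trade-off between the two layouts can be estimated with a short calculation. The sketch below assumes an array of NUL-terminated day names and 4-byte pointers; both are illustrative assumptions, not figures from the slides.

```python
days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

POINTER_SIZE = 4  # assumption: 32-bit pointers

# Contiguous layout: every row is as wide as the longest string plus its NUL byte.
row_width = max(len(day) for day in days) + 1
contiguous_bytes = len(days) * row_width

# Row-pointer layout: each string takes only what it needs, plus one pointer per row.
string_bytes = sum(len(day) + 1 for day in days)
row_pointer_bytes = string_bytes + len(days) * POINTER_SIZE

print("contiguous:  ", contiguous_bytes, "bytes")
print("row pointers:", row_pointer_bytes, "bytes")
```

With these short strings the pointers cost more than the holes they eliminate; the balance shifts as row lengths become more uneven.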
Arrays

Figure: Contiguous array allocation v. row pointers in C. The declaration on the left is a
true two-dimensional array. The slashed boxes are NUL bytes; the shaded areas are holes. The
declaration on the right is a ragged array of pointers to arrays of characters. In both
cases, we have omitted bounds in the declaration that can be deduced from the size of the
initializer (aggregate). Both data structures permit individual characters to be accessed
using double subscripts, but the memory layout (and corresponding address arithmetic) is
quite different.
Array Operations
• The most common array operations are assignment, catenation, comparison for equality and inequality, and slices.
• The C-based languages do not provide any array operations, except through the methods of Java, C++, and C#.
• Perl supports array assignments but does not support comparisons.
• Ada allows array assignments and also provides catenation, specified by the ampersand (&).
• Python provides array assignment, although it is only a reference change. Python also has operations for array catenation (+) and element membership (in).
• Python includes two different comparison operators: one that determines whether the two variables reference the same object (is) and one that compares all corresponding objects in the referenced objects, regardless of how deeply they are nested, for equality (==).
• Fortran 95+ includes a number of array operations that are called elemental because they are operations between pairs of array elements.
• For example, the add operator (+) between two arrays results in an array of the sums of the element pairs of the two arrays.
• F# includes many array operators in its Array module. Among these are Array.append, Array.copy, and Array.length.
• In APL, the four basic arithmetic operations are defined for vectors (single-dimensioned arrays) and matrices, as well as scalar operands.
• For example, A + B is a valid expression, whether A and B are scalar variables, vectors, or matrices.
• APL also includes a collection of unary operators for vectors and matrices.
Rectangular and Jagged Arrays
• A rectangular array is a multidimensioned array in which all of the rows have the same number of elements and all of the columns have the same number of elements.
• A jagged array is one in which the lengths of the rows need not be the same. For example, a jagged matrix may consist of three rows, one with 5 elements, one with 7 elements, and one with 12 elements.
• C, C++, and Java support jagged arrays but not rectangular arrays.
• In those languages, a reference to an element of a multidimensioned array uses a separate pair of brackets for each dimension. For example, myArray[3][7]
• Fortran, Ada, C#, and F# support rectangular arrays. (C# and F# also support jagged arrays.) In these cases, all subscript expressions in references to elements are placed in a single pair of brackets.
For example,
Arrays: Slice
• A slice or section is a rectangular portion of an array.
• A slice is some substructure of an array; nothing more than a referencing mechanism.
• Slice examples: consider the following Python declarations:
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
• The syntax of a Python slice reference is a pair of numeric expressions separated by a colon.
• vector[3:6] is a three-element array with the fourth through sixth elements of vector (those elements with the subscripts 3, 4, and 5).
• A row of a matrix is specified by giving just one subscript.
• For example, mat[1] refers to the second row of mat.
• A part of a row can be specified with the same syntax as a part of a single-dimensioned array.
• For example, mat[0][0:2] refers to the first and second elements of the first row of mat, which is [1, 2].
• Python also supports more complex slices of arrays.
• For example, vector[0:7:2] references every other element of vector, up to but not including the element with the subscript 7, starting with the subscript 0, which is [2, 6, 10, 14].
• Perl supports slices of two forms, a list of specific subscripts or a range of subscripts. For example,
@list[1..5] = @list2[3, 5, 7, 9, 13];
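The Python slice examples above can be run as written. One detail worth noting, shown at the end of this sketch, is that slicing a built-in Python list produces a new list, so assigning into the slice does not change the original (the variable name part is illustrative).

```python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(vector[3:6])    # elements with subscripts 3, 4, and 5
print(mat[1])         # the second row
print(mat[0][0:2])    # first two elements of the first row
print(vector[0:7:2])  # every other element, stopping before subscript 7

# A slice of a list is a separate list: changing it leaves vector untouched.
part = vector[3:6]
part[0] = 99
print(vector[3])
```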
Array Categories
• Static array
• Fixed stack-dynamic array
• Stack-dynamic array
• Fixed heap-dynamic array
• Heap-dynamic array
• Static array: subscript ranges are statically bound and storage allocation is static (before run time).
• Advantage: efficiency (no dynamic allocation).
• Ex: Arrays declared in C and C++ functions with the static modifier are static.
Array Categories
• Fixed stack-dynamic: subscript ranges are statically bound, but the allocation is done at elaboration time during execution.
• Advantage: space efficiency. A large array in one subprogram can use the same space as a large array in a different subprogram.
• Ex: Arrays declared in C and C++ functions without the static modifier are fixed stack-dynamic arrays.
• Stack-dynamic: the subscript ranges are dynamically bound, and the storage allocation is dynamic (during execution). Once bound, they remain fixed during the lifetime of the variable.
• Advantage: flexibility. The size of the array need not be known until the array is about to be used.
Array Categories

The user inputs the number of desired elements for array List. The elements are then dynamically allocated when execution reaches the declare block. When execution reaches the end of the block, the array is deallocated.
Array Categories
• Fixed heap-dynamic: similar to fixed stack-dynamic; storage binding is dynamic but fixed after allocation.
• The bindings are done when the user program requests them, rather than at elaboration time, and the storage is allocated on the heap, rather than the stack.
• Ex: C and C++ also provide fixed heap-dynamic arrays.
• The functions malloc and free are used in C. The operators new and delete are used in C++.
Array Categories
• Heap-dynamic: subscript range and storage bindings are dynamic and can change during the array's lifetime.
• In Java, all arrays are objects allocated on the heap. A built-in Java array cannot change length once created (fixed heap-dynamic); growable behavior comes from library classes such as ArrayList.
e.g. (Fortran 90)
INTEGER, ALLOCATABLE, DIMENSION(:,:) :: MAT
(Declares MAT to be a dynamic 2-dim array)
ALLOCATE (MAT(10, NUMBER_OF_COLS))
(Allocates MAT to have 10 rows and NUMBER_OF_COLS columns)
DEALLOCATE (MAT)
(Deallocates MAT's storage)
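For comparison, a Python list behaves as a heap-dynamic array: its length can change after creation. The sketch below loosely mirrors the allocate and deallocate steps; number_of_cols is a hypothetical value chosen only for illustration.

```python
number_of_cols = 3  # hypothetical value, chosen only for illustration

# Comparable to ALLOCATE: storage for 10 rows is created at run time.
mat = [[0] * number_of_cols for _ in range(10)]
rows_after_allocate = len(mat)

# Unlike a fixed heap-dynamic array, the subscript range can still change.
mat.append([0] * number_of_cols)   # grow by one row
rows_after_growth = len(mat)
del mat[:5]                        # shrink by five rows
rows_after_shrink = len(mat)

# Comparable to DEALLOCATE: drop the reference so the storage can be reclaimed.
mat = None

print(rows_after_allocate, rows_after_growth, rows_after_shrink)
```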
Array Categories contd.
• C and C++ arrays that include the static modifier are static.
• C and C++ arrays without the static modifier are fixed stack-dynamic.
• C and C++ can provide fixed heap-dynamic arrays.
• C# includes a second array class, ArrayList, that provides heap-dynamic arrays.
• Perl, JavaScript, Python, and Ruby support heap-dynamic arrays; Java provides them through library classes such as ArrayList.
Arrays: Dimensions, Bounds, and Allocation
• The shape of an array consists of the number of dimensions and the bounds of each dimension in the array.
• The time at which the shape of an array is bound has an impact on how the array is stored in memory:
• global lifetime, static shape: If the shape of an array is known at compile time, and if the array can exist throughout the execution of the program, then the compiler can allocate space for the array in static global memory.
Arrays
• local lifetime, static shape: If the shape of the array is known at compile time, but the array should not exist throughout the execution of the program, then space can be allocated in the subroutine's stack frame at run time.
• local lifetime, shape bound at run/elaboration time: allocate in the variable-size part of the local stack frame.
• arbitrary lifetime, shape bound at run time: allocate from the heap or reference an existing array.
• arbitrary lifetime, dynamic shape: also known as dynamic arrays; must allocate (and potentially reallocate) from the heap.
Dope vector
• A dope vector contains the dimension, bounds, and size information for an array.
• Dynamic arrays require that the dope vector be held in memory at run time.
• For contiguous elements, the dope vector contains the lower bound of each dimension and the size of each dimension.
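A dope vector can be modeled as a small record that travels with the array and supplies everything needed for bounds checks and address arithmetic. The class below is a hedged sketch: its field names, the row-major assumption, and the example values are all illustrative rather than taken from any particular compiler.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DopeVector:
    base_address: int      # address of the first element
    element_size: int      # bytes per element
    lower_bounds: tuple    # lower bound of each dimension
    extents: tuple         # number of elements in each dimension

    def address(self, *indices):
        """Bounds-check the indices, then return the row-major element address."""
        if len(indices) != len(self.extents):
            raise IndexError("wrong number of subscripts")
        offset = 0
        for index, low, extent in zip(indices, self.lower_bounds, self.extents):
            if not low <= index < low + extent:
                raise IndexError(f"subscript {index} outside {low}..{low + extent - 1}")
            offset = offset * extent + (index - low)
        return self.base_address + offset * self.element_size

# Hypothetical 3 x 4 array of 8-byte elements, indexed from 1, stored at 1000.
matrix = DopeVector(base_address=1000, element_size=8,
                    lower_bounds=(1, 1), extents=(3, 4))
print(matrix.address(2, 3))
```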
Memory Allocation of Arrays

Figure: Elaboration-time allocation of arrays in Ada or C99.
Array operations
• An array operation is one that operates on an array
as a unit.
• The most common array operations are
assignment, catenation, comparison for equality
and inequality, and slices
• The C-based languages do not provide any array
operations, except through the methods of Java, C++,
and C#.
• Perl supports array assignments but does not support
comparisons.
• Ada allows array assignments, including those where
the right side is an aggregate value rather than an array name.
Array operations
• Ada also provides catenation, specified by the
ampersand (&). Catenation is defined between two
single-dimensioned arrays and between a
single-dimensioned array and a scalar.
• Ada also has built-in relational operators for
equality and inequality.
• Python provides array assignment, although it is
only a reference change.
• Python also has operations for array catenation
(+) and element membership (in).
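The Python behaviors listed above can be checked in a few lines. This is a small sketch with illustrative variable names; it also shows the difference between the identity test (is) and the deep equality test (==).

```python
a = [1, [2, 3]]
b = a               # assignment copies the reference, not the elements
c = [1, [2, 3]]     # a distinct object with equal (nested) contents

b.append(4)         # visible through a as well, since a and b are one object
c.append(4)

same_object = a is b       # identity: both names refer to one list
equal_contents = a == c    # equality: compared element by element, nested too
distinct_objects = a is c  # equal contents, but two separate lists

joined = a + [5, 6]        # catenation builds a new list
has_five = 5 in joined     # element membership

print(same_object, equal_contents, distinct_objects, joined, has_five)
```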
Arrays: Address calculations
• Example: Suppose we have a three-dimensional array, declared here in Pascal-like notation:
A : array [L1..U1] of array [L2..U2] of array [L3..U3] of elem-type;

S3 = size of elem-type
S2 = (U3-L3+1) * S3 (* size of a row *)
S1 = (U2-L2+1) * S2 (* size of a plane *)

• Then the address of A[i, j, k] is:

(Address of A) + (i-L1)*S1 + (j-L2)*S2 + (k-L3)*S3
Arrays

Figure Virtual location of an array with nonzero lower bounds. By computing the constant portions of an array index at
compile time, we effectively index into an array whose starting address is offset in memory, but whose lower bounds are all
zero.
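The idea in the figure can be sketched as follows: the terms involving the lower bounds are constants, so they can be folded into a single "virtual origin" once, leaving only multiplications by the actual subscripts at each access. The function names and the example bounds are assumptions made for illustration.

```python
def strides(bounds, element_size):
    """Row-major strides in bytes for bounds given as (lower, upper) pairs."""
    result = [0] * len(bounds)
    size = element_size
    for d in range(len(bounds) - 1, -1, -1):
        result[d] = size
        low, high = bounds[d]
        size *= high - low + 1
    return result

def virtual_origin(base, bounds, element_size):
    """Address the element with all-zero subscripts would have (computed once)."""
    s = strides(bounds, element_size)
    return base - sum(low * stride for (low, _high), stride in zip(bounds, s))

def address(origin, indices, s):
    """Per-access work: no lower-bound subtractions are needed."""
    return origin + sum(i * stride for i, stride in zip(indices, s))

# Hypothetical array [1..8, 1..5, 1..7] of 4-byte elements at address 900.
bounds = [(1, 8), (1, 5), (1, 7)]
s = strides(bounds, 4)
origin = virtual_origin(900, bounds, 4)
print(s, origin, address(origin, (5, 3, 6), s))
```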
Arrays: Address calculations
• Given an array A[1..8, 1..5, 1..7] of integers, calculate the address of element A[5,3,6] using the row-major and column-major methods, if the base address (BA) is 900.
Solution (row-major order, assuming 4-byte integers):
Address of A[i,j,k] = BA + (i-L1)*S1 + (j-L2)*S2 + (k-L3)*S3
Let i=5, j=3, k=6, L1=L2=L3=1, U1=8, U2=5, U3=7
S3 = 4 (size of element type)
S2 = (U3-L3+1)*S3 = (7-1+1)*4 = 28
S1 = (U2-L2+1)*S2 = (5-1+1)*28 = 140
Address of A[5,3,6] = 900 + (5-1)*S1 + (3-1)*S2 + (6-1)*S3
= 900 + 4*140 + 2*28 + 5*4
= 1536
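The calculation above can be checked mechanically. The sketch below assumes 4-byte integers, as the solution does, and adds the column-major variant, in which the first subscript varies fastest; element_address is an illustrative helper, not part of the original solution.

```python
def element_address(base, bounds, indices, element_size, order="row"):
    """Address of an element in a contiguous array.

    bounds is a list of (lower, upper) pairs, one per dimension.
    order is "row" (last subscript varies fastest) or "column" (first does).
    """
    dims = range(len(bounds))
    fastest_first = reversed(dims) if order == "row" else dims
    address = base
    stride = element_size
    for d in fastest_first:
        low, high = bounds[d]
        address += (indices[d] - low) * stride
        stride *= high - low + 1
    return address

bounds = [(1, 8), (1, 5), (1, 7)]
row_major = element_address(900, bounds, (5, 3, 6), 4, order="row")
column_major = element_address(900, bounds, (5, 3, 6), 4, order="column")
print(row_major, column_major)
```

Under these assumptions the column-major address works out to 900 + 4*4 + 2*32 + 5*160 = 1780.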
University Question
UQ: What memory layouts are used for arrays? How is the address calculation done for three-dimensional arrays?
String
Strings
• String literal syntax is similar across languages: characters in quotes.
• Pascal has one kind of quotes, Ada has two:
'A' is a character, "A" is a string.
• The allowed length of strings is a design issue:
fixed-length strings: Pascal, Ada, Fortran;
variable-length strings: C, Java, Perl.
• A character may be treated as a string of length 1, or as a separate data structure.
• Many languages (Pascal, Ada, C, Prolog) treat strings as special cases of arrays or lists.
String operations
Typical operations on strings
string × string → string // concatenation
string × int × int → string // substring
string → characters // decompose into an array or list
characters → string // convert an array or list into a string
String operations
string → integer // length
string → boolean // is it empty?
string × string → boolean // equality, ordering
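Each of the operation signatures above has a direct counterpart in Python. The sketch below uses arbitrary example strings; the variable names are illustrative.

```python
a, b = "Cu", "prum"

concatenated = a + b               # string x string -> string
substring = concatenated[2:4]      # string x int x int -> string
characters = list(concatenated)    # string -> characters
rebuilt = "".join(characters)      # characters -> string
length = len(concatenated)         # string -> integer
is_empty = concatenated == ""      # string -> boolean
is_equal = rebuilt == concatenated # string x string -> boolean (equality)
comes_first = a < b                # string x string -> boolean (ordering)

print(concatenated, substring, characters, length, is_empty, is_equal, comes_first)
```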
