CA Memory System(Updated)
Outline
Syllabus:
Computer Memory System: Memory Systems: Basic Concepts, Semiconductor RAM Memories, Read Only Memories, Speed, Size, and Cost, Cache Memories–Mapping Functions, Performance Considerations, Virtual Memory, Secondary Memory (Disk).
Objectives:
❖ Understand the basic structure and types of memory.
❖ Design the various types of RAM memory chips.
❖ Understand the purpose of cache and cache mapping techniques.
❖ Understand the basic concepts of virtual memory and the address translation method.
❖ Understand the secondary memory types in a computer system.
Memory System
❖ The maximum size of the memory that can be used in any
computer is determined by the addressing scheme.
❖ 16-bit address: 2^16 = 64K memory locations
❖ 32-bit address: 2^32 = 4G memory locations
❖ The control line is used for co-ordinating data transfer.
❖ The processor reads the data from the memory by loading the
address of the required memory location into MAR and setting
the R/~W line to 1.
❖ The memory responds by placing the data from the addressed location onto the data lines and confirms this action by asserting the MFC signal.
❖ Upon receipt of the MFC signal, the processor loads the data on the data lines into the MDR register.
Memory System
❖ The processor writes data into a memory location by loading the address of this location into MAR, loading the data into MDR, and setting the R/~W line to 0.
❖ How is the speed of a memory measured?
❖ Memory Access Time: the time that elapses between the initiation of an operation and the completion of that operation.
❖ Memory Cycle Time: the minimum time delay required between the initiation of two successive memory operations.
❖ RAM: a memory in which any location can be accessed for a Read/Write operation in a fixed amount of time, independent of the location's address.
❖ Cache Memory: a small, fast memory that is inserted between the main memory and the processor.
Semiconductor RAM Memories
INTERNAL ORGANIZATION OF MEMORY CHIPS
❖ Memory cells are usually organized in the form of array, in which each cell is capable of
storing one bit of information.
❖ Each row of cells constitutes a memory word, and all cells of a row are connected to a common line called the word line.
❖ The cells in each column are connected to a Sense/Write circuit by two bit lines.
❖ The Sense/Write circuits are connected to the data input/output lines of the chip.
❖ During a write operation, the Sense/Write circuits receive input information and store it in the cells of the selected word.
Semiconductor RAM Memories
INTERNAL ORGANIZATION OF MEMORY CHIPS
❖ A memory chip organised as 16 x 8 requires 4 connections for the address, 8 for the data lines, and 2 for the R/~W and CS control inputs.
❖ A total of 14 external connections is required, not counting power and ground.
❖ Transistor pairs (T3, T5) and (T4, T6) form the inverters in the latch.
❖ In state 1, the voltage at point X is kept high by having T3 and T6 on while T4 and T5 are off.
❖ Thus, if T1 and T2 are turned on (closed), bit lines b and b' will have high and low signals respectively.
❖ CMOS requires a power supply voltage of 5 V (in older versions) or 3.3 V (in newer versions).
❖ Continuous power is needed for the cell to retain its state.
❖ It has low power consumption because current flows in the cell only when the cell is being accessed.
CMOS Memory Cell
❖ SRAMs are said to be volatile memories because their contents are lost when the power is interrupted.
DRAM
❖ Information is stored in a dynamic memory cell in the form of a charge on a capacitor, and this charge can be maintained for only tens of milliseconds.
❖ The contents must be periodically refreshed by restoring the
capacitor charge to its full value.
❖ In order to store information in the cell, the transistor T is turned
on & the appropriate voltage is applied to the bit line, which
charges the capacitor.
❖ After the transistor is turned off, the capacitor begins to discharge. This is caused by the capacitor's own leakage resistance.
❖ Hence the information stored in the cell can be retrieved correctly only if it is read before the charge on the capacitor drops below a threshold value.
Single Dynamic Memory Cell
DRAM Chip Organization
❖ The cells are organized as a 4K x 4K array. The 4096 cells in each row are divided into 512 groups of 8, so each row can store 512 bytes.
❖ 12 bits are needed to select a row, and 9 bits to select a group of 8 bits in a row, for a total of 21 address bits.
❖ The number of address pins is reduced by multiplexing the row and column addresses. The row address is applied first, and the RAS (Row Address Strobe) signal latches it.
❖ The column address is applied next, and the CAS (Column Address Strobe) signal latches it.
❖ All the contents of a row are selected based on the row address. A particular byte is selected based on the column address.
❖ A latch is added at the output of each sense circuit. All the latches are loaded when the row is selected.
❖ Different column addresses can be applied to select and place different bytes on the data lines.
❖ A consecutive sequence of column addresses can be applied under the control of the CAS signal, without reselecting the row.
❖ The timing of the memory device is controlled asynchronously. A special memory controller circuit provides the necessary control signals, RAS and CAS, that govern the timing.
❖ This allows a block of data to be transferred at a much faster rate than random accesses. A small collection/group of bytes is usually referred to as a block. This transfer capability is referred to as the fast page mode feature.
2M X 8 DRAM Chip
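The row/column multiplexing described above can be sketched in Python. This is only an illustration: it assumes the 12 row bits occupy the high-order positions of the 21-bit address and the 9 column bits the low-order positions, and the function name is hypothetical.

```python
ROW_BITS = 12   # selects one of 4096 rows
COL_BITS = 9    # selects one of 512 byte groups within a row


def split_dram_address(addr):
    """Split a 21-bit byte address into its (row, column) parts."""
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address must fit in 21 bits")
    row = addr >> COL_BITS                # applied first, latched by RAS
    col = addr & ((1 << COL_BITS) - 1)    # applied second, latched by CAS
    return row, col


row, col = split_dram_address(0x1ABCDE)
print("row", row, "column", col)
```

In fast page mode only the column part changes between successive accesses, so the row part is sent once.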
Asynchronous DRAM
❖ Asynchronous DRAM does not use a system clock to synchronise or coordinate memory accesses.
❖ Asynchronous DRAM works in low-speed memory systems but is not appropriate for modern high-speed memory systems.
❖ Fast Page Mode: Bytes are transferred in sequential order by applying a consecutive sequence of column addresses under the control of successive CAS signals. This scheme allows a block of data to be transferred at a faster rate. This block transfer capability is called Fast Page Mode.
Synchronous DRAM
❖ Here the operations are directly synchronised with a clock signal.
❖ The address and data connections are buffered by means of registers.
❖ The output of each sense amplifier is connected to a latch.
❖ A Read operation causes the contents of all cells in the selected row to be loaded into these latches.
❖ Data held in the latches that correspond to the selected columns are transferred into the data output register, thus becoming available on the data output pins.
Synchronous DRAM
❖ First, the row address is latched under control of the RAS signal.
❖ The memory typically takes 2 or 3 clock cycles to activate the selected row.
❖ Then the column address is latched under control of the CAS signal.
❖ After a delay of one clock cycle, the first set of data bits is placed on the data lines.
❖ The SDRAM automatically increments the column address to access the next 3 sets of bits in the selected row, which are placed on the data lines in the next 3 clock cycles.
Burst Read of Length 4 in an SDRAM
DRAMs
❖ Performance is measured by two parameters:
❖ Latency
❖ Bandwidth
❖ Latency:
❖ It refers to the amount of time it takes to transfer a word of data to or from the memory.
❖ For a transfer of a single word, the latency provides a complete indication of memory performance.
❖ For a block transfer, the latency denotes the time it takes to transfer the first word of data.
❖ Bandwidth:
❖ It is defined as the number of bits or bytes that can be transferred in one second.
❖ Bandwidth mainly depends on the speed of access to the stored data and on the number of bits that can be accessed in parallel.
DDR-SDRAM
❖ Double Data Rate SDRAM (DDR-SDRAM)
❖ The standard SDRAM performs all actions on the rising edge of the clock signal.
❖ The double data rate SDRAM transfers data on both edges of the clock (leading edge and trailing edge).
❖ The bandwidth of DDR-SDRAM is doubled for long burst transfers.
❖ To make it possible to access the data at a high rate, the cell array is organized into two banks.
❖ Each bank can be accessed separately.
❖ Consecutive words of a given block are stored in different banks.
❖ Such interleaving of words allows simultaneous access to two words that are transferred on successive edges of the clock.
Larger Memories
❖ Static Memory Systems
❖ Consider a memory consisting of 2M (2,097,152) words of 32 bits each.
❖ Each column in the figure consists of four chips, which implement one byte position. Four of these sets provide the required 2M x 32 memory.
❖ Each chip has a control input called Chip Select. When this input is set to 1, it enables the chip to accept data from or to place data on its data lines.
❖ The data output for each chip is of the three-state type. Only the selected chip places data on the data output line, while all other outputs are in the high-impedance state.
❖ Twenty-one address bits are needed to select a 32-bit word in this memory.
❖ The high-order 2 bits of the address are decoded to determine which of the four Chip Select control signals should be activated, and the remaining 19 address bits are used to access specific byte locations inside each chip of the selected row.
2M X 32 Memory using 512K X 8 Memory Chips
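The address decoding for the 2M x 32 memory can be sketched as follows. This is a minimal illustration of the 2-bit / 19-bit split stated above; the function names are hypothetical, and the chip-count helper assumes the word count and word width divide evenly by the chip dimensions.

```python
CHIP_ADDR_BITS = 19   # 512K locations inside each chip
WORD_ADDR_BITS = 21   # 2M words in the whole memory


def decode_word_address(addr):
    """Return (chip_select_row, internal_address) for a 21-bit word address."""
    if not 0 <= addr < (1 << WORD_ADDR_BITS):
        raise ValueError("address must fit in 21 bits")
    chip_row = addr >> CHIP_ADDR_BITS               # high-order 2 bits pick a row of chips
    internal = addr & ((1 << CHIP_ADDR_BITS) - 1)   # low-order 19 bits go to every chip in that row
    return chip_row, internal


def chips_needed(words, word_bits, chip_words, chip_bits):
    """Number of chips for a memory built as rows x columns of identical chips."""
    return (words // chip_words) * (word_bits // chip_bits)


print(decode_word_address(2**19))
print(chips_needed(2 * 2**20, 32, 512 * 2**10, 8))
```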
Larger Memories
❖ If a large memory is built by placing DRAM chips directly on the main system printed-circuit board that contains the processor, often referred to as a motherboard, it will occupy an unacceptably large amount of space on the board.
❖ Also, it is awkward to provide for future expansion of the memory, because space must be
allocated and wiring provided for the maximum expected size.
❖ These problems led to the development of larger memory units known as SIMMs (Single In-line Memory Modules) and DIMMs (Dual In-line Memory Modules).
❖ Such a module is an assembly of several memory chips on a separate small board that plugs
vertically into a single socket on the motherboard.
❖ SIMMs and DIMMs of different sizes are designed to use the same size socket.
Rambus Memory
❖ The only way to increase the amount of data transferred over a speed limited bus is by increasing the width
of the bus.
❖ A very wide bus is expensive and requires a lot of space on a motherboard.
❖ An alternative approach is to implement a narrow bus that is much faster.
❖ This approach was used by Rambus Inc. to develop a proprietary design known as Rambus.
❖ The key feature of Rambus technology is a fast signaling method used to transfer information between chips.
❖ Instead of using signals that have voltage levels of either 0 or Vsupply to represent the logic values, the signals
consist of much smaller voltage swings around a reference voltage, Vref.
❖ The reference voltage is about 2 V, and the two logic values are represented by 0.3 V swings above and below
Vref. This type of signalling is generally known as differential signalling.
❖ Also called Differential Rambus Signalling Levels (DRSL), it offers a high-performance, low-power and cost-effective solution for getting bandwidth on and off chip.
Rambus Memory
❖ Small voltage swings make it possible to have short transition times, which allows for a high
speed of transmission
❖ These chips use cell arrays based on the standard DRAM technology.
❖ Multiple banks of cell arrays are used to access more than one word at a time.
❖ Circuitry needed to interface to the Rambus channel is included on the chip. Such chips are known as Rambus DRAMs (RDRAMs).
Memory System Considerations
❖ Memory Controller
❖ The required multiplexing of address bits is usually
performed by a memory controller circuit, which is
interposed between the processor and the dynamic memory.
❖ The controller accepts a complete address and the R/W
signal from the processor, under control of a Request signal
which indicates that a memory access operation is needed.
❖ The controller then forwards the row and column portions of the address to the memory and generates the RAS and CAS signals (active-low signals).
Memory Controller
❖ A sense circuit at the end of the bit line generates the proper output
value. Data are written into a ROM when it is manufactured.
Types of ROM
EPROM:
❖ This type of ROM chip allows the stored data to be erased and new data to be loaded. Such an erasable, reprogrammable ROM is usually called an EPROM.
❖ It provides considerable flexibility during the development phase of digital systems.
❖ Since EPROMs are capable of retaining stored information for a long time, they can be used in place of ROMs.
❖ The important advantage of EPROM chips is that their contents can be erased and reprogrammed.
❖ Erasure requires dissipating the charges trapped in the transistors of memory cells; this can be done by exposing the chip to ultraviolet light.
EEPROM:
❖ A significant disadvantage of EPROMs is that a chip must be physically removed from the circuit for reprogramming and that its entire contents are erased by the ultraviolet light.
❖ It is possible to implement another version of erasable PROMs that can be both programmed and erased electrically.
❖ The only disadvantage of EEPROMs is that different voltages are needed for erasing, writing, and reading the stored data.
Flash Memory
❖ Flash memory is an electronic non-volatile computer storage medium that can be electrically
erased and reprogrammed. It is a type of EEPROM.
❖ It must be erased (in blocks) before being overwritten.
❖ It has a limited number of write cycles.
❖ It is cheaper than SDRAM, but more expensive than disk. It is slower than SRAM, and faster
than disk.
❖ It is extensively used in PDAs, digital audio players, digital cameras, mobile phones, etc.
❖ Its mechanical shock resistance and high durability are the reasons for its popularity over hard disks in portable devices.
SPEED, Size and Cost
❖ The fastest access is to data held in the processor registers.
❖ The processor cache holds copies of instructions and data.
❖ There are two levels of cache. The primary cache, Level 1 (L1), is located on the processor chip.
❖ The secondary cache, Level 2 (L2), is placed between the processor and the main memory.
❖ Next is the main memory, implemented as SIMMs and DIMMs.
❖ Last is the secondary memory (magnetic disk).
❖ The increase in size, speed and cost per bit across these levels is shown in the Figure.
Memory Hierarchy
Cache Memory
❖ For good performance, the processor cannot spend
much of its time waiting to access instructions and
data in main memory.
❖ Hence, it is important to devise a scheme that
reduces the time needed to access the necessary
information.
❖ Since the speed of the main memory unit is limited by electronic and packaging constraints, the solution must be sought in a different architectural arrangement.
Cache Memory
Cache Memory
❖ The effectiveness of the cache mechanism is based on a property of computer programs called
locality of reference. Analysis of programs shows that most of their execution time is spent on
routines in which many instructions are executed repeatedly.
❖ Temporal: a recently executed instruction is likely to be executed again very soon.
❖ The temporal aspect of the locality of reference suggests that whenever an information item (instruction or data) is first needed, this item should be brought into the cache, where it will hopefully remain until it is needed again.
❖ Spatial: instructions in close proximity to a recently executed instruction (with respect to the instructions' addresses) are also likely to be executed soon.
❖ Instead of fetching just one item from the main memory to the cache, it is useful to fetch several items that reside at adjacent addresses as well.
Cache Memory
❖ The correspondence between the main memory blocks and those in the cache is specified by a mapping function.
❖ When the cache is full and a new block must be brought in, the cache control hardware must decide which block to remove. The collection of rules for making this decision constitutes the replacement algorithm.
❖ A write operation can proceed in two ways:
❖ Write-through: The cache location and the main memory location are updated simultaneously.
❖ Write-back (copy-back): Only the cache location is updated, and it is marked as updated with an associated flag bit, often called the dirty or modified bit.
❖ The main memory location of the word is updated later, when the block containing this marked word is to be removed from the cache to make room for a new block.
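The difference between the two write policies can be seen in a toy model. This is only a sketch under simplifying assumptions: each cache block holds a single word, a Python dict stands in for main memory, and the class and method names are hypothetical.

```python
class TinyCache:
    """Toy one-word-per-block cache in front of a dict that stands in for main memory."""

    def __init__(self, memory, write_back):
        self.memory = memory
        self.write_back = write_back
        self.lines = {}      # address -> value currently held in the cache
        self.dirty = set()   # addresses whose cached value is newer than memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            self.dirty.add(addr)        # only mark the block; memory is updated later
        else:
            self.memory[addr] = value   # write-through: memory is updated at once

    def evict(self, addr):
        if addr in self.dirty:          # dirty block: copy it back before removal
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)
        self.lines.pop(addr, None)


memory = {0: 1}
cache = TinyCache(memory, write_back=True)
cache.write(0, 5)
print(memory[0])   # memory still holds the old value
cache.evict(0)
print(memory[0])   # memory is updated when the block leaves the cache
```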
Mapping Functions
❖ Consider a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K) words, and
assume that the main memory is addressable by a 16-bit address.
❖ The main memory has 64K words, which we will view as 4K blocks of 16 words each.
❖ Mapping Functions:
❖ Direct Mapping: Block j of the main memory maps onto block j modulo 128 of the cache.
❖ Whenever one of the main memory blocks 0, 128, 256, ... is loaded into the cache, it is stored in cache block 0.
❖ Blocks 1, 129, 257, ... are stored in cache block 1, and so on.
❖ Placement of a block in the cache is determined from the memory address.
❖ The memory address can be divided into three fields, as shown in the Figure. The low-order 4 bits select one of 16 words in a block.
❖ When a new block enters the cache, the 7-bit cache block field determines the cache position in which this block must be stored.
❖ The high-order 5 bits of the memory address of the block are stored in 5 tag bits associated with its location in the cache.
❖ The tag bits identify which of the 32 blocks that are mapped into this cache position is currently resident in the cache.
Direct Mapping
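The three address fields used in direct mapping can be extracted with shifts and masks. The sketch below follows the 5-bit tag, 7-bit block and 4-bit word layout of this example; the function name is hypothetical.

```python
def direct_mapped_fields(addr):
    """Split a 16-bit word address into (tag, cache_block, word) = 5 + 7 + 4 bits."""
    if not 0 <= addr < (1 << 16):
        raise ValueError("address must fit in 16 bits")
    word = addr & 0xF             # low-order 4 bits: word within the block
    block = (addr >> 4) & 0x7F    # next 7 bits: cache block position
    tag = addr >> 11              # high-order 5 bits: stored as the tag
    return tag, block, word


# Word 3 of main memory block 129 lands in cache block 129 mod 128 = 1.
print(direct_mapped_fields(129 * 16 + 3))
```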
Associative Mapping
❖ Associative Mapping: Main memory block can be placed
into any cache block position.
❖ In this case, 12 tag bits are required to identify a memory
block when it is resident in the cache.
❖ It gives complete freedom in choosing the cache location in which to place the memory block. Thus, the space in the cache can be used more efficiently.
❖ The cost of an associative cache is higher than the cost of
a direct-mapped cache because of the need to search all
128 tag patterns to determine whether a given block is in
the cache. A search of this kind is called an associative
search.
Associative Mapping
Set Associative Mapping
❖ A combination of the direct- and associative-mapping techniques
can be used.
❖ Blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to reside in any block of a specific set.
❖ For a cache with two blocks per set, memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and they can occupy either of the two block positions within this set.
❖ Having 64 sets means that the 6-bit set field of the address determines which set of the cache might contain the desired block.
❖ The tag field of the address must then be associatively compared to the tags of the two blocks of the set to check if the desired block is present. This two-way associative search is simple to implement.
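The address split for the set-associative case can be sketched the same way. The defaults below follow the two-way example above (6-bit set field, 4-bit word field, leaving a 6-bit tag in a 16-bit address); passing `set_bits=0` gives the fully associative layout with a 12-bit tag. The function name is hypothetical.

```python
def set_associative_fields(addr, set_bits=6, word_bits=4):
    """Split a 16-bit word address into (tag, set_index, word)."""
    if not 0 <= addr < (1 << 16):
        raise ValueError("address must fit in 16 bits")
    word = addr & ((1 << word_bits) - 1)
    set_index = (addr >> word_bits) & ((1 << set_bits) - 1)
    tag = addr >> (word_bits + set_bits)
    return tag, set_index, word


# Main memory blocks 64 and 4032 both map into set 0, with different tags.
print(set_associative_fields(64 * 16 + 5))
print(set_associative_fields(4032 * 16))
```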
length. The virtual-memory mechanism bridges the size and speed gap between main memory and secondary storage, and it is implemented partly with software techniques.
❖ Each virtual address generated by the processor contains a virtual page number (high-order bits) and an offset (low-order bits). The virtual page number selects a page, and the offset specifies the location of a particular byte (or word) within that page.
❖ Page Table:
❖ It contains the information about the main memory address where the page is stored and the current status of the page.
of the corresponding entry in the page table, i.e., it gives the starting address of the page if that page currently resides in memory.
❖ Control Bits in Page Table:
❖ The control bits specify the status of the page while it is in main memory.
❖ One control bit indicates the validity of the page, i.e., whether the page is actually loaded in the main memory.
❖ Another indicates whether the page has been modified during its residency in the memory; this information is needed to determine whether the page should be written back to the disk before it is removed from the main memory to make room for another page.
Address Translation
❖ The Page table information is used by MMU for
every read & write access.
❖ The Page table is placed in the main memory but a
copy of the small portion of the page table is
located within MMU.
❖ This small portion or small cache is called
Translation Look Aside Buffer (TLB).
❖ This portion consists of the page table entries that correspond to the most recently accessed pages, and each entry also contains the virtual address of its page.
Address Translation
❖ When the operating system changes the contents of the page table, a control bit in the TLB is used to invalidate the corresponding TLB entry. Given a virtual address, the MMU looks in the TLB for the referenced page.
❖ If the page table entry for this page is found in TLB, the physical address is
obtained immediately. If there is a miss in TLB, then the required entry is
obtained from the page table in the main memory & TLB is updated.
❖ When a program generates an access request to a page that is not in the main
memory, then Page Fault will occur.
❖ The operating system suspends the execution of the task that caused the page fault and begins execution of another task whose pages are in main memory, because a long delay occurs while the page transfer takes place.
❖ When the task resumes, either the interrupted instruction must continue from the point of interruption or the instruction must be restarted.
❖ If a new page is brought from the disk when the main memory is full, it must replace one of the resident pages. In that case, the LRU (least recently used) algorithm can be used, which removes the page that has gone unreferenced for the longest time.
Use of Associative TLB
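The TLB lookup, page table fallback and page fault described above can be sketched as a toy MMU. Several things here are assumptions made only for illustration: a 4 KB page size (the slides do not fix one), a dict standing in for the page table, LRU replacement inside the TLB, and hypothetical class names.

```python
from collections import OrderedDict

PAGE_BITS = 12  # assumed 4 KB pages; not specified in the slides


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


class ToyMMU:
    def __init__(self, page_table, tlb_entries=4):
        self.page_table = page_table   # virtual page number -> page frame number
        self.tlb = OrderedDict()       # small cache of recently used entries
        self.tlb_entries = tlb_entries

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_BITS                   # high-order bits: virtual page number
        offset = vaddr & ((1 << PAGE_BITS) - 1)    # low-order bits: offset within the page
        if vpn in self.tlb:                        # TLB hit: frame is available immediately
            self.tlb.move_to_end(vpn)
            frame = self.tlb[vpn]
        else:                                      # TLB miss: consult the page table
            if vpn not in self.page_table:
                raise PageFault(vpn)
            frame = self.page_table[vpn]
            self.tlb[vpn] = frame
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)       # drop the least recently used entry
        return (frame << PAGE_BITS) | offset


mmu = ToyMMU({0: 5, 1: 9})
print(hex(mmu.translate(0x0123)))
```

A real operating system would handle the page fault by loading the page from disk and retrying; here the exception simply marks where that would happen.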
Secondary Memory
❖ Secondary storage devices meet the need for larger storage capacity. Some of the secondary storage devices are:
❖ Magnetic Disk
❖ Optical Disk
❖ Magnetic Tapes.
Magnetic Disk
❖ A magnetic disk system consists of one or more disks mounted on a common spindle.
❖ A thin magnetic film is deposited on each disk, usually on both sides.
❖ The disks are placed in a rotary drive so that the magnetised surfaces move in close proximity to the read/write heads.
❖ Each head consists of a magnetic yoke and a magnetising coil.
❖ Digital information can be stored on the magnetic film by applying a current pulse of suitable polarity to the magnetising coil.
❖ Only changes in the magnetic field under the head can be sensed during the Read operation.
❖ Therefore, if the binary states 0 and 1 are represented by two opposite states of magnetisation, a voltage is induced in the head only at 0-to-1 and 1-to-0 transitions in the bit stream.
❖ A long string of consecutive 0s or 1s is resolved by using a clock, which is mainly used for synchronisation.
❖ Phase Encoding, or Manchester Encoding, is a technique for combining the clocking information with the data.
❖ Manchester Encoding is one way in which a self-clocking scheme is implemented.
Disk Principles
❖ The disks and read/write heads are placed in a sealed, air-filtered enclosure. This approach is known as Winchester technology.
❖ In such units, the read/write heads can operate closer to the magnetic track surfaces because the dust particles, which are a problem in unsealed assemblies, are absent.
❖ Seek time – Time required to move the read/write head to the proper track.
❖ Latency – The amount of time that elapses after the head is positioned over the correct track until the starting position of the addressed sector passes under the read/write head. This is also called rotational delay. On average it is the time for half a rotation of the disk, i.e., the desired sector is assumed to be halfway around the disk.
❖ Disk access time = Seek time + Latency
Problems
❖ Disk Capacity:
❖ Let P = number of data recording surfaces in the disk
❖ Q = number of tracks/surface
❖ M = number of sectors/track
❖ N = number of data bytes/sector
❖ Total Capacity = (P x Q x M x N) bytes
❖ Q1. A disk has 200 tracks, each with 128 sectors, and each sector can store 512 bytes. What is the total storage capacity of the disk?
❖ ANS: Total Capacity = Tracks × Sectors per Track × Bytes per Sector = 200 × 128 × 512 = 13,107,200 bytes ≈ 13.1 MB (taking a single recording surface)
❖ Q2. A disk has 16 data recording surfaces and 2048 tracks/surface, each track is divided into 128 sectors, and each sector can store 512 bytes. What is the total storage capacity of the disk?
❖ Total Capacity = Surfaces × Tracks per Surface × Sectors per Track × Bytes per Sector
❖ ANS: 16 × 2048 × 128 × 512 = 2,147,483,648 bytes = 2 GB (2^31 bytes)
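The capacity formula can be checked with a few lines of Python. The function name is hypothetical, and Q1 is evaluated with a single recording surface because the question does not give a surface count.

```python
def disk_capacity_bytes(surfaces, tracks_per_surface, sectors_per_track, bytes_per_sector):
    """Total capacity = P x Q x M x N bytes."""
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector


q1 = disk_capacity_bytes(1, 200, 128, 512)     # Q1, assuming one surface
q2 = disk_capacity_bytes(16, 2048, 128, 512)   # Q2
print(q1, "bytes")
print(q2, "bytes")
```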
Problems
❖ Data Transfer Rate
❖ Let K = number of revolutions per minute of the disk
❖ Then the data transfer rate in bytes per second is
❖ Tr = (M x N x K)/60
❖ Q3. Let K = 6000 rpm, M = 256, N = 512. Calculate the data transfer rate.
❖ Ans: Tr = (256 x 512 x 6000)/60 = 13,107,200 bytes/s ≈ 13.1 MBps
❖ A complete track can be read or written in 60/6000 = 0.01 s = 10 ms
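The transfer-rate formula and the track time can be reproduced as follows. The function names are hypothetical; the arithmetic is the Tr = (M x N x K)/60 formula given above.

```python
def transfer_rate_bytes_per_s(sectors_per_track, bytes_per_sector, rpm):
    """Tr = (M x N x K) / 60: one full track passes under the head per revolution."""
    return sectors_per_track * bytes_per_sector * rpm / 60


def track_time_ms(rpm):
    """Time for one revolution, i.e. to read or write one complete track."""
    return 60_000 / rpm


print(transfer_rate_bytes_per_s(256, 512, 6000), "bytes/s")
print(track_time_ms(6000), "ms")
```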
Problems
Q1. Given a disk with a seek time of 8 ms and a rotational speed of 7200 RPM, calculate the total access time.
Ans:
Rotational speed = 7200/60 revolutions per second = 120 RPS
Time required to complete one revolution = 1/120 s ≈ 8.33 ms
Average rotational latency = 1/(2 × 120) s ≈ 4.17 ms
Total access time = seek time + latency = 8 ms + 4.17 ms = 12.17 ms
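The same calculation in Python, as a small sketch with hypothetical function names. It uses the average rotational latency of half a revolution and, like the worked answer, leaves out the data transfer time.

```python
def average_rotational_latency_ms(rpm):
    """Half of one revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def access_time_ms(seek_ms, rpm):
    """Disk access time = seek time + average rotational latency."""
    return seek_ms + average_rotational_latency_ms(rpm)


print(round(average_rotational_latency_ms(7200), 2), "ms")
print(round(access_time_ms(8, 7200), 2), "ms")
```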
Summary
❖ Basics of Memory System
❖ Memory Types
❖ Chip Organization
❖ Larger Memory Chip Organization
❖ Cache Memory
❖ Memory Mapping
❖ ROM
❖ Virtual Memory
❖ Virtual Memory Mapping
❖ Secondary Memory
❖ Note:
❖ Read Chapter 5 up to and including 5.9.1 in your reference / textbook.