Lec17 Disks
[Figure: I/O layering, top to bottom: Kernel I/O Subsystem, Device Driver Top Half, Device Driver Bottom Half, Device Hardware]
3/22/06 Joseph CS162 ©UCB Spring 2006 Lec 17.9
I/O Device Notifying the OS
• The OS needs to know when:
– The I/O device has completed an operation
– The I/O operation has encountered an error
• I/O Interrupt:
– Device generates an interrupt whenever it needs service
– Handled in bottom half of device driver
» Often run on special kernel-level stack
– Pro: handles unpredictable events well
– Con: interrupts relatively high overhead
• Polling:
– OS periodically checks a device-specific status register
» I/O device puts completion information in status register
» Could use timer to invoke bottom half of drivers occasionally
– Pro: low overhead
– Con: may waste many cycles on polling if infrequent or
unpredictable I/O operations
• Actual devices combine both polling and interrupts
– For instance: High-bandwidth network device:
» Interrupt for first incoming packet
» Poll for following packets until hardware empty
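The combined scheme above can be sketched as a toy simulation. This is only an illustration under stated assumptions: `ToyNic`, `receive`, and `drain_by_polling` are hypothetical names, not part of any real driver API, and "raising an interrupt" is modeled as a return value.

```python
from collections import deque


class ToyNic:
    """Hypothetical device model: a receive queue plus an interrupt-enable flag."""

    def __init__(self):
        self.rx_queue = deque()
        self.interrupts_enabled = True
        self.interrupts_raised = 0

    def receive(self, packet):
        """Hardware side: queue a packet; return True if an interrupt is raised."""
        self.rx_queue.append(packet)
        if self.interrupts_enabled:
            self.interrupts_raised += 1
            # Mask further interrupts; the driver will poll from here on.
            self.interrupts_enabled = False
            return True
        return False


def drain_by_polling(nic):
    """Driver side: poll until the hardware queue is empty, then re-arm interrupts."""
    handled = []
    while nic.rx_queue:
        handled.append(nic.rx_queue.popleft())
    nic.interrupts_enabled = True
    return handled


nic = ToyNic()
# A burst of three packets arrives before the driver runs: only the first interrupts.
burst = [nic.receive(p) for p in ("p1", "p2", "p3")]
handled = drain_by_polling(nic)
# A later, isolated packet interrupts again because interrupts were re-armed.
late = nic.receive("p4")
print(burst, handled, late, nic.interrupts_raised)
```

In this sketch four packets cost two interrupts: the burst is absorbed by polling, while the isolated packet still gets prompt service.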
Administrivia
• Projects:
– Phase 2 code due Thursday at 11:59pm
– Make sure you check-out COFF files as binary
• Current News:
– Intel just released the 3.73GHz Pentium Extreme
Edition 965 dual-core processor
– Microsoft moves Vista general release date from
November 2006 to January 2007
» Vista is Microsoft's first major OS update since
Windows XP was released in late 2001
» The delay underlines the challenges of developing a
new operating system that must be compatible with
old software
[Figure: IBM/Hitachi Microdrive, side view showing the read/write head]
Properties of a Hard Magnetic Disk
[Figure: disk geometry, labeling platters, tracks, and sectors]
• Properties
– Independently addressable element: sector
» OS always transfers groups of sectors together: "blocks"
– A disk can access directly any given block of information
it contains (random access). Can access any file either
sequentially or randomly.
– A disk can be rewritten in place: it is possible to
read/modify/write a block from the disk
• Typical numbers (depending on the disk size):
– 500 to more than 20,000 tracks per surface
– 32 to 800 sectors per track
» A sector is the smallest unit that can be read or written
• Zoned bit recording
– Constant bit density: more sectors on outer tracks
– Speed varies with track location
Disk I/O Performance
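[Figure: a request travels from the User Thread through a Queue [OS Paths] to the Controller and Disk, and the Result returns; latency components are the Software Queue (Device Driver) and Media Time (Seek+Rot+Xfer). Accompanying plot: Response Time (ms), axis ticks 100 to 300]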
• Highest Bandwidth:
– Transfer large group of blocks sequentially from one track
Typical Numbers of a Magnetic Disk
• Average seek time as reported by the industry:
– Typically in the range of 8 ms to 12 ms
– Due to locality of disk reference may only be 25% to 33%
of the advertised number
• Rotational Latency:
– Most disks rotate at 3,600 to 7,200 RPM (up to
15,000 RPM or more)
– Approximately 16 ms to 8 ms per revolution, respectively
– An average latency to the desired information is halfway
around the disk: 8 ms at 3600 RPM, 4 ms at 7200 RPM
• Transfer Time is a function of:
– Transfer size (usually a sector): 512B – 1KB per sector
– Rotation speed: 3600 RPM to 15000 RPM
– Recording density: bits per inch on a track
– Diameter: ranges from 1 in to 5.25 in
– Typical values: 2 to 50 MB per second
• Controller time depends on controller hardware
• Cost drops by factor of two per year (since 1991)
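The rotational-latency figures above follow directly from the spindle speed. A minimal sketch of the arithmetic (the function names are illustrative only):

```python
def revolution_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def avg_rotational_latency_ms(rpm):
    """On average the desired sector is half a revolution away."""
    return revolution_ms(rpm) / 2.0


for rpm in (3600, 7200, 15000):
    print(f"{rpm:>6} RPM: {revolution_ms(rpm):6.2f} ms/rev, "
          f"{avg_rotational_latency_ms(rpm):5.2f} ms average latency")
```

The exact values at 3,600 and 7,200 RPM are about 16.7 ms and 8.3 ms per revolution; the slide rounds these to 16 ms and 8 ms.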
Disk Performance
• Assumptions:
– Ignoring queuing and controller times for now
– Avg seek time of 5ms, avg rotational delay of 4ms
– Transfer rate of 4MByte/s, sector size of 1 KByte
• Random place on disk:
– Seek (5ms) + Rot. Delay (4ms) + Transfer (0.25ms)
– Roughly 10ms to fetch/put data: 100 KByte/sec
• Random place in same cylinder:
– Rot. Delay (4ms) + Transfer (0.25ms)
– Roughly 5ms to fetch/put data: 200 KByte/sec
• Next sector on same track:
– Transfer (0.25ms): 4 MByte/sec
• Key to using disk effectively (esp. for filesystems)
is to minimize seek and rotational delays
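The three cases above can be reproduced with a few lines of arithmetic. This sketch assumes 4 MByte/s means 4000 KByte/s (which gives the 0.25 ms transfer time used on the slide); the variable and function names are illustrative.

```python
SEEK_MS = 5.0            # average seek time
ROT_MS = 4.0             # average rotational delay
RATE_KB_PER_S = 4000.0   # 4 MByte/s, assuming 1 MByte = 1000 KByte
SECTOR_KB = 1.0          # sector size


def transfer_ms(size_kb, rate_kb_per_s):
    """Time to move size_kb at the media transfer rate, in milliseconds."""
    return size_kb / rate_kb_per_s * 1000.0


def throughput_kb_per_s(size_kb, total_ms):
    """Effective bandwidth when each size_kb request takes total_ms."""
    return size_kb / (total_ms / 1000.0)


xfer = transfer_ms(SECTOR_KB, RATE_KB_PER_S)
cases = {
    "random place on disk": SEEK_MS + ROT_MS + xfer,
    "random place in same cylinder": ROT_MS + xfer,
    "next sector on same track": xfer,
}
for name, total in cases.items():
    print(f"{name}: {total:.2f} ms, "
          f"{throughput_kb_per_s(SECTOR_KB, total):.0f} KByte/s")
```

Without rounding, the first two cases come out near 108 KByte/s and 235 KByte/s; the slide rounds the access times to 10 ms and 5 ms, giving 100 and 200 KByte/s.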
Disk Tradeoffs
[Figure: queuing system: Arrivals enter a Queue, are served by the Controller and Disk, and leave as Departures]
• What about queuing time??
– Let’s apply some queuing theory
– Queuing Theory applies to long term, steady state
behavior ⇒ Arrival rate = Departure rate
• Little’s Law:
Mean # tasks in system = arrival rate x mean response time
– Observed by many, Little was first to prove
– Simple interpretation: you should see the same number of
tasks in queue when entering as when leaving.
• Applies to any system in equilibrium, as long as nothing
in black box is creating or destroying tasks
– Typical queuing theory doesn’t deal with transient
behavior, only steady-state behavior
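Little's Law can be checked numerically on a small trace. The sketch below uses made-up (arrival, departure) times, chosen so the system is empty at the start and end of the observation window; the job list and function name are hypothetical.

```python
def mean_tasks_in_system(jobs, horizon):
    """Time-average of N(t), found by sweeping arrival/departure events."""
    events = sorted([(a, +1) for a, _ in jobs] + [(d, -1) for _, d in jobs])
    area, count, last_t = 0.0, 0, 0.0
    for t, delta in events:
        area += count * (t - last_t)   # N(t) is constant between events
        count += delta
        last_t = t
    return area / horizon


# Hypothetical trace: (arrival time, departure time) for each task.
jobs = [(0, 3), (1, 4), (2, 6), (5, 8)]
horizon = 8.0                                   # observe from t=0 to t=8

arrival_rate = len(jobs) / horizon              # tasks per unit time
mean_response = sum(d - a for a, d in jobs) / len(jobs)
mean_in_system = mean_tasks_in_system(jobs, horizon)

print(mean_in_system, arrival_rate * mean_response)
```

Both printed values are 1.625: the mean number of tasks in the system equals arrival rate times mean response time for this trace.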
Disk Scheduling
• Disk can service only one request at a time; in what order
should queued requests be serviced?
[Figure: queue of user requests waiting for the disk head, labeled 3,10; 2,2; 5,2; 7,2; 2,1; 2,3]
• FIFO Order
– Fair among requesters, but order of arrival may be to
random spots on the disk ⇒ Very long seeks
• SSTF: Shortest seek time first
– Pick the request that’s closest on the disk
– Although called SSTF, today must include
rotational delay in calculation, since
rotation can be as long as seek
– Con: SSTF good at reducing seeks, but
may lead to starvation
• SCAN: Implements an Elevator Algorithm: take the
closest request in the direction of travel
– No starvation, but retains flavor of SSTF
• C-SCAN: Circular-Scan: only goes in one direction
– Skips any requests on the way back
– Fairer than SCAN, not biased towards pages in middle
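The four policies can be compared with a small sketch that considers only track numbers (seek distance), ignoring rotational delay. The request list and head position are made up for illustration. The SCAN variant here reverses at the last pending request, matching "closest request in the direction of travel", and the C-SCAN total counts the return sweep as head movement.

```python
def fifo(head, requests):
    """Service requests in arrival order."""
    return list(requests)


def sstf(head, requests):
    """Repeatedly pick the pending request closest to the current head position."""
    pending, order = list(requests), []
    while pending:
        nxt = min(pending, key=lambda t: abs(t - head))
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order


def scan(head, requests):
    """Elevator: sweep toward higher tracks first, then reverse."""
    up = sorted(t for t in requests if t >= head)
    down = sorted((t for t in requests if t < head), reverse=True)
    return up + down


def c_scan(head, requests):
    """Circular scan: service only while moving up, then jump back and sweep up again."""
    up = sorted(t for t in requests if t >= head)
    wrapped = sorted(t for t in requests if t < head)
    return up + wrapped


def total_seek(head, order):
    """Total tracks traversed when servicing requests in the given order."""
    distance = 0
    for t in order:
        distance += abs(t - head)
        head = t
    return distance


head, requests = 50, [95, 10, 60, 20, 70]   # hypothetical track numbers
for policy in (fifo, sstf, scan, c_scan):
    order = policy(head, requests)
    print(f"{policy.__name__:>6}: {order} -> {total_seek(head, order)} tracks")
```

On this input FIFO moves the head 270 tracks, SSTF and SCAN 130, and C-SCAN 140. SSTF and SCAN happen to agree here; they differ when new requests keep arriving near the head, which is where SSTF can starve distant requests.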
Building a File System
• File System: Layer of OS that transforms block
interface of disks (or other block devices) into Files,
Directories, etc.
• File System Components
– Disk Management: collecting disk blocks into files
– Naming: Interface to find files by name, not by blocks
– Protection: Layers to keep data secure
– Reliability/Durability: Keeping files durable despite
crashes, media failures, attacks, etc.
• User vs. System View of a File
– User’s view:
» Durable Data Structures
– System’s view (system call interface):
» Collection of Bytes (UNIX)
» Doesn’t matter to system what kind of data structures you
want to store on disk!
– System’s view (inside OS):
» Collection of blocks (a block is a logical transfer unit, while
a sector is the physical transfer unit)
» Block size ≥ sector size; in UNIX, block size is 4KB
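The gap between the byte-oriented system call view and the block-oriented view inside the OS comes down to integer arithmetic. The sketch below assumes 4KB blocks and 512-byte sectors (the low end of the sector sizes quoted earlier) and a hypothetical layout in which block b occupies a contiguous run of sectors; none of the names come from a real file system.

```python
BLOCK_SIZE = 4096    # bytes per block (UNIX value from the slide)
SECTOR_SIZE = 512    # bytes per sector (assumed)
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE


def blocks_for_range(offset, length):
    """Logical block numbers that hold bytes [offset, offset + length)."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return list(range(first, last + 1))


def sectors_for_block(block):
    """Sectors backing a block, assuming blocks map to contiguous sectors."""
    start = block * SECTORS_PER_BLOCK
    return list(range(start, start + SECTORS_PER_BLOCK))


print(blocks_for_range(2, 11))       # bytes 2..12 fit in one block
print(blocks_for_range(4090, 12))    # bytes 4090..4101 straddle two blocks
print(sectors_for_block(1))
```

Even an 11-byte read makes the OS fetch a whole 4KB block (eight sectors under these assumptions) and return only the requested portion.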
Translating from User to System View
[Figure: File System]
[Figure: File Header pointing to a chain of linked blocks ending in Null]
– Pros: Can grow files dynamically, Free list same as file
– Cons: Bad Sequential Access (seek between each block),
Unreliable (lose block, lose rest of file)
– Serious Con: Bad random access!!!!
– Technique originally from Alto (First PC, built at Xerox)
» No attempt to allocate contiguous blocks
• MSDOS used a similar linked approach
– Links not in pages, but in the File Allocation Table (FAT)
» FAT contains an entry for each block on the disk
» FAT Entries corresponding to blocks of file linked together
– Compare with Linked List Approach:
» Sequential access costs more unless FAT cached in memory
» Random access is better if FAT cached in memory
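A FAT-style table can be sketched as an array indexed by block number, where each entry names the next block of the same file. This is a simplified illustration, not the actual MSDOS on-disk format: the `END` marker, the table contents, and the function names are all assumptions.

```python
END = -1   # hypothetical end-of-file marker


def file_blocks(fat, start):
    """Sequential access: follow the chain from the file's first block."""
    blocks, block = [], start
    while block != END:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start, n):
    """Random access: walk n links in the table without touching data blocks."""
    block = start
    for _ in range(n):
        block = fat[block]
        if block == END:
            raise IndexError("file has fewer blocks than requested")
    return block


# One entry per disk block; a file starting at block 2 is chained 2 -> 5 -> 3.
fat = [END] * 8
fat[2], fat[5], fat[3] = 5, 3, END

print(file_blocks(fat, 2))
print(nth_block(fat, 2, 2))
```

With the table cached in memory, finding the n-th block needs n table lookups but no disk reads, whereas a linked list with pointers stored in the blocks themselves requires reading every preceding block from disk.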
How to Organize Files on Disk (continued)