Memory Manager in Windows
Landy Wang, Software Design Engineer, Windows Kernel Team, Microsoft Corporation
Support for x64 platform added
4GB virtual address space added for 32-bit large address space aware applications
Further increases performance of WOW layer on both Itanium and x64 systems
2005 Microsoft Corporation
Large page support added for user images and pagefile-backed sections
Large pages now also used in 32-bit, even when booted with /3GB switch, for
  Kernel
  Page Frame Number (PFN) database
  Initial non-paged pool
Prior large page support (added in Server 2003) was for the following
  User private memory
  Device driver image mappings
  Kernel, when not booted with /3GB switch
Overlapped asynchronous flushing for user requests to maximize I/O throughput
Pagefiles zeroed in parallel instead of serially
  Faster shutdown when "zero my pagefile" is set
Working set lock used to synchronize PTE updates and working set list changes to an address space
  One lock per address space: system, session or process
This lock converted from a mutex to a pushlock
  Pushlocks support both shared and exclusive acquire modes
  Mutexes support only exclusive acquisitions
In conjunction with 2-byte interlocked operations this allows parallelization of many operations
MmProbeAndLockPages
Completely remove the PFN lock acquire from this very hot routine
Boot with very large registries on 32-bit machines
  With and without /3GB switch
  Important for large multipath LUN machines
MM locates registry VA space used by boot loader & reuses it as dynamic kernel virtual address space
Features enabled without reboot, yet have no cost if not used
64-bit systems grow to maximum limit regardless of underlying physical configuration
  128GB paged pool, nonpaged pool
  1TB system cache/system PTEs/special pool
  128GB session pool
  128GB session views (desktop heaps), etc
Windows Vista Planned Enhancements for NUMA, Large System, Large Page Support
Initial nonpaged pool now NUMA aware, with separate VA ranges for each node
Per-node look-asides for full pages
Page table allocation for system PTEs, the system cache, etc. distributed across nodes
  More even locality
  Avoids exhausting free pages from the boot node
Zeroing of pages for these APIs bounds number of threads more intelligently
Win32 APIs that specify nodes for allocations & mapped views on per VAD & per section basis
  VirtualAllocExNuma
  CreateFileMappingExNuma
  MapViewOfFileExNuma
Scalable query
  QueryWorkingSetEx
PFN database & initial nonpaged pool always mapped with large pages regardless of physical memory sparseness
/3GB mode on 32-bit systems supports up to 64GB of RAM
  Booting in /3GB mode on 32-bit systems now supports up to 64GB of RAM instead of just 16GB
  Booting without /3GB on 32-bit systems continues to support up to 128GB of RAM
Much faster large page allocations in kernel & user
Support for cache-aligned pool allocation directives
Data structures describing non-paged pool free list converted from linked list to bitmap
  Reduced lock contention by over 50%
  Bitmaps can be searched opportunistically lock-free
  Costly combining of adjacent allocations on free no longer necessary
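The bitmap idea can be illustrated with a small model. This is only a sketch under stated assumptions, not the kernel's implementation: the class name PageBitmap and its methods are hypothetical, and Python is used purely for readability. One bit per page records whether the page is allocated, so freeing a run only clears bits and adjacent free runs merge implicitly, with no explicit combining step.

```python
class PageBitmap:
    """Toy free-page tracker (hypothetical): bit i set means page i is allocated."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.bits = 0  # a Python int serves as an arbitrary-width bitmap

    def allocate(self, count):
        """Return the first page index of a free run of `count` pages, or None."""
        run_mask = (1 << count) - 1
        for start in range(self.num_pages - count + 1):
            if (self.bits & (run_mask << start)) == 0:
                self.bits |= run_mask << start
                return start
        return None

    def free(self, start, count):
        """Clear the bits; neighbouring free runs merge with no extra work."""
        self.bits &= ~(((1 << count) - 1) << start)
```

In this model, freeing two neighbouring runs immediately makes the combined range available to a larger request, which is the property the slide attributes to the bitmap conversion.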
First time a Windows-based OS has supported fully pageable mappings with arbitrary cache attributes
Pages for the I/O are put in transition (not valid)
  No VA space is required
  If the pages are not subsequently referenced, no working set trim and TLB flush is needed either
Further emphasizes that driver writers must be aware that the contents of MDL pages can change!
Windows Vista I/O Section Access Improvements
Significant changes in pagefile writing
  Larger clusters up to 4GB
  Align near neighbors
  Sort by virtual address (VA)
  Reduced fragmentation
  Improved reads
Cache manager read ahead size limitations in thread structure removed
Improved synchronization between cache manager and memory manager data flushing to maximize filesystem/disk throughput and efficiency
Mapped file writing and file flushing performance increases
  Support for writes of any size up to 4GB instead of previous 64KB limit per write
  Multiple asynchronous flushes can be issued, both internally and by the caller, to satisfy a single call
Elimination of pagefile writes and potential subsequent re-reads of completely zero pages
  Check pages at trim time to see if they are all zero
  Optimization used to make this nearly free
    The user virtual address is used to check whether the first and last ULONG_PTR are zero
    If both are, then after the page is trimmed and the TLB invalidated, a kernel mapping is used to make the final check of the entire page
    Avoids needless scans & TLB flushes
  We've measured over 90% success rate with this algorithm
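The two-stage check described above can be sketched as follows. This is a simplified model, not the memory manager's code: the function names are hypothetical, a 4096-byte page and an 8-byte ULONG_PTR are assumed, and the model cannot show the real distinction between the user-address probe made before the trim and the kernel-mapping scan made after it.

```python
PAGE_SIZE = 4096   # assumed page size for this sketch
WORD_SIZE = 8      # assumed sizeof(ULONG_PTR) on a 64-bit system

def quick_check(page):
    """Cheap filter: are the first and last pointer-sized words zero?"""
    first = page[:WORD_SIZE]
    last = page[-WORD_SIZE:]
    return not any(first) and not any(last)

def is_zero_page(page):
    """Full scan of the page, attempted only when the cheap filter passes."""
    if not quick_check(page):
        return False
    return not any(page)
```

Most non-zero pages fail the cheap filter, so the full scan is reserved for pages that are likely to be zero.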
Access to large section performance increases
  A subsection is the name of the data structure used to describe on-disk file spans for sections
  The subsection structure was converted
    From a singly linked list (i.e., linear walk required)
    To a balanced AVL tree
  Enables huge performance gain for sections mapping large files
    User mappings & flushes, system cache mappings, flushes & purges, section-based backups, etc
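The difference between the two lookup strategies can be modelled in a few lines. This is a sketch only: Python's standard library has no AVL tree, so binary search over a sorted list is substituted here as a stand-in that gives the same logarithmic lookup cost. The function names and the (start, length) span layout are assumptions made for illustration.

```python
from bisect import bisect_right

def find_linear(subsections, offset):
    """Walk (start, length) spans in order, as a singly linked list requires."""
    for index, (start, length) in enumerate(subsections):
        if start <= offset < start + length:
            return index
    return None

def find_sorted(starts, subsections, offset):
    """Binary search over sorted start offsets: O(log n) comparisons."""
    index = bisect_right(starts, offset) - 1
    if index >= 0:
        start, length = subsections[index]
        if offset < start + length:
            return index
    return None
```

For a file described by a thousand spans, the linear walk may visit every span to reach the last one, while the sorted lookup needs roughly ten comparisons.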
Dependencies between modified writer & mapped writer removed to
  Increase parallelism
  Reduce filesystem deadlock rules
  Provide the cache manager with a way to influence which portions of files get written first
    To optimize disk seek as well as avoiding valid data length extension costs
Page faults, modified writes, page color generation, MDL construction for fault I/Os, and so on
Address Windowing Extensions (AWE) non-zeroed allocations are more than 10x faster than in SP1
  Can now therefore be used for HTTP responses, for example
This has been improved by adding a zero and free page SLIST for every NUMA node and page color
Now obtain the page without needing the PFN lock in many instances where we need a single page
  Demand zero faults, copy on write faults, etc
  For example, the fault processing path length is cut in half
Alleviates pressure on both the working set pushlock & PFN lock
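A toy model of the per-node, per-color lists is sketched below. It rests on several assumptions: the class and method names are hypothetical, a plain Python list stands in for a lock-free SLIST, and the fallback path simply counts the cases where the real system would need the PFN lock.

```python
from collections import defaultdict

class FreePageLists:
    """Toy model: one LIFO list of pages per (NUMA node, page color)."""

    def __init__(self):
        self.lists = defaultdict(list)
        self.slow_path_calls = 0  # fallbacks that would need the PFN lock

    def add_page(self, node, color, pfn):
        self.lists[(node, color)].append(pfn)

    def get_page(self, node, color):
        """Pop from the matching list; otherwise fall back to the slow path."""
        pages = self.lists.get((node, color))
        if pages:
            return pages.pop()
        self.slow_path_calls += 1
        return self._slow_path(node)

    def _slow_path(self, node):
        # Stand-in for the locked search: take any page on the same node.
        for (n, _color), pages in self.lists.items():
            if n == node and pages:
                return pages.pop()
        return None
```

A single-page request such as a demand zero fault is served from the matching list when one is available, and only the remaining cases take the slower path.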
Support for hot-patching of session-space drivers
64-bit Windows uses demand zero pages instead of pool for WOW64 page table bitmaps
Windows Vista Additional Robustness and Diagnosability
.pagein debugger support for kernel/driver addresses added
  Allows for viewing memory addresses which have been paged out to disk when debugging crashes
Call to Action
Consider these significant Memory Manager enhancements as you develop drivers for Windows Server 2003 and Windows Vista
Use new APIs when available in Windows Vista