Untitled document
Latency is the time between a cause and an effect. An example of latency is input lag, or the time between moving
your mouse and the cursor moving on the screen. A good portion of latency comes from the operating system. This
guide covers reducing it for real-time applications such as games on Windows. Google is your friend if you’re not sure about something in this guide (avoid
forums and Reddit). These tweaks aren’t listed in any particular order, but they are all important, otherwise I wouldn’t
bother listing them. Individually, many of these tweaks probably won’t produce a perceivable difference, but if you do
every single tweak you will end up with a significantly more responsive system, even if you usually can’t tell.
You’ll have to change the way you use a PC. In terms of programs, you will need a minimalistic approach. Don’t run
anything in the background that you don’t absolutely need. Heavy programs such as your web browser (Spotify and
Discord are reskinned Google Chrome) will slow down your system and cause stuttering. Close them before gaming
and reopen them when you’re done. The same goes for other programs. Windows will allocate CPU time to any service or
program that is running in the background and will halt all other programs until the designated program gets its CPU
time. This is how multitasking works on operating systems. If you’re curious about scheduling and multitasking, read
this, or this.
The averages are quite low. The averages are what you are looking to improve. Intel will have lower averages than AMD.
Different timers (TSC/HPET/PMT etc.) will give different results.
The Tweaks:
Disable Hyper-threading / Simultaneous Multithreading (SMT) in UEFI
This feature allows the operating system to see a physical core as two virtual cores. Although good for
highly-threaded loads such as rendering or compiling, this feature massively increases the system’s latency. This is
because the two virtual cores share a single physical core’s execution resources. The operating system makes this
worse by spreading the load across both virtual processors of the same core, which creates a stall while the core’s
execution resources are busy with the second virtual processor.
It is ideal to simply disable HT/SMT if you have more cores than your game requires, or force the game to run on
separate cores by changing the affinity to every other logical processor in Task Manager or Process Lasso (example:
CPUs 1,3,5,7+ or 0,2,4,6+ etc.). If you have eight or more cores, you can safely turn it off for almost all games. If you
have six or fewer cores, you might be forced to leave it on and change the affinity of the game to prevent contention
between the logical CPUs. Another benefit to disabling SMT is lower power consumption, which raises overclocking
headroom.
- Latency test of HT on vs. off
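The "every other logical processor" affinity described above is usually expressed as a bitmask, where bit n stands for logical CPU n. Below is a minimal sketch of that arithmetic. It assumes the two SMT siblings of each physical core are numbered adjacently (0/1, 2/3, ...), which you should verify on your own system, and the function name is purely illustrative.

```python
def every_other_cpu_mask(logical_cpus: int, start: int = 0) -> int:
    """Return an affinity bitmask selecting every second logical CPU.

    Bit n set means logical CPU n is allowed. start=0 picks 0,2,4,...;
    start=1 picks 1,3,5,... (assumes SMT siblings are numbered adjacently).
    """
    if start not in (0, 1):
        raise ValueError("start must be 0 or 1")
    mask = 0
    for cpu in range(start, logical_cpus, 2):
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # 8 cores / 16 threads, even logical CPUs: 0,2,4,...,14
    print(hex(every_other_cpu_mask(16)))     # 0x5555
    # Same machine, odd logical CPUs instead: 1,3,5,...,15
    print(hex(every_other_cpu_mask(16, 1)))  # 0xaaaa
```

Task Manager and Process Lasso let you tick the same CPUs by hand; the mask is only useful if you script the affinity instead.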
If you happened to buy a multi-CCX Ryzen, you have a few options to minimize latency:
- Use Downcore Control in UEFI to disable a CCX (Zen 1/2) or CCD on Zen 3 (5900X, 5950X)
- Intercore latencies: Zen 1 / Zen+ / Zen 2 / Zen 3
- Windows 10 1903 has a scheduler update to group threads to CCXs, but this does not have the same effect
as disabling a CCX. Another drawback is that you have to use Windows 10
- If you absolutely need all 8 cores, set affinity to 0-3 or 4-7 (SMT off) in Task Manager to minimize inter-CCX
communication, use alternate logical CPUs if SMT is on (0/2/4/6 or 8/10/12/14 - odd or even doesn’t matter)
Disabling a CCX will reduce latency since only local cores are available
Setting 4+0 in BIOS on Ryzen dramatically reduces interrupt to DPC latency
BCDEdit
Run Command Prompt as admin and paste these italicized commands:
- To undo a command in BCDEdit, do bcdedit /deletevalue X (where X is tscsyncpolicy, useplatformtick, etc.)
bcdedit /set disabledynamictick yes (Windows 8+)
- This command forces the kernel timer to constantly poll for interrupts instead of waiting for them; dynamic tick
was implemented as a power saving feature for laptops but hurts desktop performance
bcdedit /set hypervisorlaunchtype off
- Disables the hypervisor which is unneeded on a gaming PC
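If you want to keep track of which BCDEdit options you changed, it helps to generate the "apply" and "undo" commands from one list. The sketch below only builds the command text from the options listed above; it does not run anything, and the helper names are made up for illustration. Paste the printed lines into an admin Command Prompt yourself.

```python
# Options taken from the commands above; extend the dict with any others you use.
TWEAKS = {
    "disabledynamictick": "yes",
    "hypervisorlaunchtype": "off",
}


def apply_command(option: str, value: str) -> str:
    """Command that sets a boot option."""
    return f"bcdedit /set {option} {value}"


def undo_command(option: str) -> str:
    """Command that deletes the option again (the undo form described above)."""
    return f"bcdedit /deletevalue {option}"


def build_scripts(tweaks: dict) -> tuple:
    """Return (apply_lines, undo_lines) for a dict of option -> value."""
    apply_lines = [apply_command(opt, val) for opt, val in tweaks.items()]
    undo_lines = [undo_command(opt) for opt in tweaks]
    return apply_lines, undo_lines


if __name__ == "__main__":
    apply_lines, undo_lines = build_scripts(TWEAKS)
    print("\n".join(apply_lines))
    print("\n".join(undo_lines))
```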
Device Manager
Open Device Manager (devmgmt.msc) and disable anything you’re not using. Be careful not to disable something
you use. Uninstalling a driver via Device Manager will most likely result in it reinstalling after reboot. In order to
completely disable a driver, you must disable it instead of uninstalling. When you disable something in Device
Manager, the driver is unloaded. Drivers interrupt the CPU, halting everything until the driver gets CPU time (some
drivers are poorly programmed and can cause the system to halt for a very long time [stuttering]). What to disable:
Display adapters:
- Intel graphics (if you don’t use it, ideally should be disabled in the BIOS)
Network adapters:
- All WAN miniports
- Microsoft ISATAP Adapter
Storage controllers:
- Microsoft iSCSI Initiator
System devices:
- Composite Bus Enumerator
- Intel Management Engine / AMD PSP
- Intel SPI (flash) Controller
- Microsoft GS Wavetable Synth
- Microsoft Virtual Drive Enumerator (if not using virtual drives)
- NDIS Virtual Network Adapter Enumerator
- Remote Desktop Device Redirector Bus
- SMBus
- System speaker
- Terminal Server Mouse/Keyboard drivers
- UMBus
- In the “Properties” window, be sure to disable “Power Management” for devices such as USB root hubs,
network controllers, etc.
- Here is an example of someone’s device manager to give you a better idea: https://i.imgur.com/9sdzhbl.png
Another way to disable services via the registry is simply with a .reg file. Use the “Properties” box in services.msc to
get the name of the service, then create a .reg file with entries such as:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BluetoothUserService]
"Start"=dword:00000004
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Spooler]
"Start"=dword:00000004
If you get an error when trying to run the .reg, use PowerRun (some services require the TrustedInstaller privilege in
order to be modified, such as Windefend or wuauserv).
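Writing the .reg entries by hand gets tedious once the list of services grows. Here is a rough sketch that generates the same kind of file from a list of service names. The two names are the ones from the example above, the function name is hypothetical, and the script only prints the text so you can review it before saving and importing it.

```python
REG_HEADER = "Windows Registry Editor Version 5.00"
SERVICES_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
START_DISABLED = 4  # same "Start" value as in the entries above


def build_disable_reg(service_names):
    """Return .reg file text that sets Start=4 for each given service."""
    lines = [REG_HEADER, ""]
    for name in service_names:
        lines.append(f"[{SERVICES_KEY}\\{name}]")
        lines.append(f'"Start"=dword:{START_DISABLED:08x}')
        lines.append("")  # blank line between entries
    return "\n".join(lines)


if __name__ == "__main__":
    # Service names come from the "Properties" box in services.msc.
    print(build_disable_reg(["BluetoothUserService", "Spooler"]))
```

Keep a note of each service's original Start value before importing, since this sketch does not record it for you.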
DHCP Client, Network Connections, Network List Service, Network Location Awareness, and Network Store Interface
Service are required to automatically connect to a local network, but once a static IP is set, they can be disabled. See
below to set up a static IP:
- Open Network and Sharing Center → Change adapter settings → right click, properties → IPv4 properties
→ open cmd and type “ipconfig” and fill out the settings as such: https://i.imgur.com/o1PGS2E.png
- If you’re still confused, see this tutorial on how to set a static IP:
https://pureinfotech.com/set-static-ip-address-windows-10/
- Network Store Interface Service may be required on Windows 10 for dogshit programs that determine
whether or not you’re connected to the Internet such as Apex Legends
- System Events Broker is required for Radeon Software to open on Windows 10
Startup
Prevent useless bloat such as Discord/Realtek/Steam/RGB/mouse/keyboard software etc. from starting up with
Windows. Your PC will start up faster, and once started will run fewer unnecessary programs.
1. Press “Windows key+R” → type “msconfig” → go to the “Startup” tab
2. Uncheck everything unless you absolutely need it. Launch it manually instead.
Windows 10 Specific
- Disable VBS/HVCI (Windows 10/11)
- Under “Exploit Protection” (use search), disable all the mitigations (obviously this will reduce system security
so use caution)
- Disable fast startup (Control panel → Power Options → Choose what the power buttons do → uncheck
“Turn on fast startup”)
- Replace the start menu with OpenShell; OpenShell is faster/lighter than the M$ start menu
- Add .old to StartMenuExperienceHost.exe and SearchApp.exe in C:\Windows\SystemApps to
prevent the Win10 start menu from running
- If you want a Windows 7-like configuration, import this .xml to OpenShell by pressing “Backup” in
OpenShell settings (right-click start button)
Power Plan
By default, Windows uses the “Balanced” power plan which attempts to save energy when possible. Instead, set the
plan to “High Performance” in Control Panel→Power Options or even make a custom power plan using
PowerSettingsExplorer. The default “High Performance” plan still has many energy-saving features enabled which is
why it is better to create a custom plan. On W10 1803+ you may enable the “Ultimate Performance” power plan which
is a slight step above the regular “High Performance” plan by pasting this command into CMD as admin:
powercfg -duplicatescheme e9a42b02-d5df-448d-aa00-03f14749eb61
Disable Spectre and Meltdown protection / other mitigations (Windows 10/11 or updated 7/8)
- https://www.grc.com/inspectre.htm
- Example image of what it should look like when you disable mitigations
- In C:\Windows\System32, Rename “mcupdate_GenuineIntel.dll” to “mcupdate_GenuineIntel.dll.old” (change
file permissions in Properties→Security)
- Rename “mcupdate_AuthenticAMD.dll” to “mcupdate_AuthenticAMD.dll.old” if using an AMD CPU
Process scheduling
“Quantum” is the amount of time the Windows process scheduler allocates to a thread. You may choose between
short or long quantum. Furthermore, you can choose to boost the foreground quanta by double or triple, meaning the
currently highlighted program gets two or three times longer quantum. For gaming, it makes most sense to use long
quantum and three times foreground boost, since we want to maximize CPU time the game gets. The higher the
boost, the less the game will be interrupted by background programs. When not gaming, the drawback to using
longer quantum is that apparent responsiveness when using multiple programs may be reduced. In general, the
longer the duration of quanta, the more we minimize context switching. Context switching is computationally
expensive and should be minimized to reduce jitter from background processes/threads when gaming.
The table below lists the possible configurations you can tell the scheduler to use. You may select short or long
quantum, fixed or variable; and if you select variable, how much boost (2x or 3x) to give the foreground program.
What quantum you decide depends on your use case. The default quantum is dec. 38 for non-server Windows
editions, while for server it is dec. 24. This can be checked in Advanced System Settings → Performance →
Advanced. My personal recommendation is dec. 22.
- From the table, add together the decimal values you want and enter that as a decimal to the
Win32PrioritySeparation key. You cannot use the third column unless you use variable quantum. If you are
using fixed quantum, ignore the third column
- Changes to Win32PrioritySeparation apply instantly; no restart required
- Examples: Short, 3x = 32+4+2 = dec. 38; Long, 3x = 16+4+2 = dec. 22; Short, fixed = 32+8 = dec. 40
- Possible values, in decimal:
- 20 = Long, variable, no foreground boost (12:12)
- 21 = Long, variable, 2x foreground boost (24:12)
- 22 = Long, variable, 3x foreground boost (36:12)
- 24 = Long, fixed (36:36)
- 36 = Short, variable, no foreground boost (6:6)
- 37 = Short, variable, 2x foreground boost (12:6)
- 38 = Short, variable, 3x foreground boost (18:6)
- 40 = Short, fixed (18:18)
From http://csit.udc.edu/~byu/UDC3529315/WindowsInternals-4e.pdf, page 344
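The addition described above is easy to get wrong by hand, so here is a small sketch that does it for you. The component values (32/16 for short/long, 8/4 for fixed/variable, 0/1/2 for the foreground boost) are read off the examples and the list of possible values above; the function and dictionary names are just illustrative.

```python
INTERVAL = {"short": 32, "long": 16}
LENGTH = {"variable": 4, "fixed": 8}
BOOST = {1: 0, 2: 1, 3: 2}  # foreground multiplier -> value to add


def priority_separation(interval: str, length: str, boost: int = 1) -> int:
    """Return the decimal Win32PrioritySeparation value for a configuration."""
    value = INTERVAL[interval] + LENGTH[length]
    if length == "variable":
        value += BOOST[boost]
    elif boost != 1:
        # The third column only applies to variable quantum.
        raise ValueError("foreground boost requires variable quantum")
    return value


if __name__ == "__main__":
    print(priority_separation("short", "variable", 3))  # 38
    print(priority_separation("long", "variable", 3))   # 22
    print(priority_separation("short", "fixed"))        # 40
```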
Example of IRQ sharing - four devices share IRQ 16 which will cause interrupts from these devices to compete with each other
Lock GPU clocks (Nvidia only, see the section below for Radeon cards)
This tweak forces the GPU to always run at boost clocks. This prevents the GPU from constantly switching back and
forth between different clock speeds which will negatively impact performance. Ensure you have adequate load
temperatures (<70°C) or you will shorten the lifespan of your card. Note that starting with Nvidia 1000 series cards,
you cannot completely lock clocks. The core clock will fluctuate based on load, temperature, or power usage.
In regedit, navigate to the path below and create a dword as such (if you have multiple GPUs installed in your
system, the 0000 may be 0001, 0002, etc.):
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4d36e968-e325-11ce-bfc1-08002be10318}\0000]
"DisableDynamicPstate"=dword:00000001
- Read more here:
https://github.com/djdallmann/GamingPCSetup/blob/master/CONTENT/RESEARCH/WINDRIVERS/README.md
- See Cancerogeno’s guide “A slightly better way to overclock and tweak your Nvidia GPU” which explains
how to properly lock clocks on modern Nvidia GPUs
Radeon Settings
- Radeon Software settings:
- Graphics tab (see this image for all graphics settings):
- Radeon Super Resolution: Off
- Radeon Anti-Lag: Off
- Radeon Chill: Off
- Radeon Boost: Off
- Radeon Image Sharpening: Off
- Radeon Enhanced Sync: Off
- Wait for Vertical Refresh: Off
- Everything else: Off / Lowest
- Display tab:
- AMD FreeSync: Off is almost always better, test 2
- Virtual Super Resolution: Off
- GPU Scaling: Off (unless your monitor doesn’t have scaling support)
- Display Color Enhancement: Off
- HDCP Support: Off (unless you view DRM content)
- Change power limit to max in Performance→Tuning if your power supply is capable
- Raise VRAM clock to something stable
- Radeon Software (the control panel) requires DWM+composition in order to not crash on Windows
7, but it should be disabled once you’ve dialed the settings
- Fullscreen games require Desktop Window Manager and Themes services and Aero
theme if using more than one monitor
- Fix for "Do you want to change the color scheme to improve performance?" on Windows 7
Download and install MorePowerTool; this section will massively help with stuttering caused by power saving
and downclocking. This tool allows you to change certain VBIOS values without having to flash the GPU’s
BIOS; instead they are read from the Registry, meaning you can easily change or revert them if something is
wrong.
- Use GPU-Z to dump your current VBIOS - see this picture if you can’t find the button
- Open MorePowerTool, then use the VBIOS you dumped from GPU-Z to load the Power Play Tables
to edit the settings below
- Features
- Feature Control (5xxx)
- Feature Control (6xxx)
- Power:
- You can slightly raise the GPU’s power limit but this is not an overclocking guide
so be extremely conservative unless you know what you are doing, see here for
more info
- Frequency: Set reasonable minimum frequencies for GFX, SoC, and Fclk (1900, 900,
1700, respectively, are good starting values for 6xxx)
- Setting these too high will result in instability which will cause microstuttering,
stuttering, or crashing
- Ensure your GPU’s cooler is adequate to run these frequencies 24/7, otherwise
your GPU will quickly degrade
- SoC and Fclk can have equal min/max which is ideal, equal GFX requires
MoreClockTool below
- Dcefclk must remain at its default value
- Fan: Disable “Zero RPM Enable”; this will increase stability and performance in game,
while also increasing the lifetime of your card
- Additionally, set a custom fan curve using MoreClockTool below to further prevent
degradation from ridiculous default fan curves
- Once finished, click “Write SPPT” and reboot (restarting the driver is not enough)
- Use HWiNFO to make sure minimum frequencies were applied. If your GPU is still
downclocking below the minimum you set, it means you have to start over and figure out
which clock was set too high (the driver will use default SPPT values if it doesn’t like a
value)
- If you’re satisfied with the values you’ve set, click “Save” so you can apply these changes
whenever you update drivers as it deletes the SPPT (Soft Power Play Table)
Download and install MoreClockTool; using this tool you can raise your maximum clock beyond what is
possible through MorePowerTool. Additionally, it allows you to set your minimum and maximum core clocks
to the same value (e.g. 2500 min, 2500 max, although it will still fluctuate) which is not possible through
MorePowerTool or Radeon Software. You can also change a few other performance-related settings through
this tool which is faster than using Radeon Software. If you haven’t already, change the fan curve because
stock fan curves are generally extremely relaxed for the high level of heat that GPUs output (the VRAM and
other components on the PCB get cooked due to inadequate airflow). Your fan curve should be at 100% at
80°C or lower. One thing to keep note of, depending on your cooler and the type of load, the hotspot
temperature can be 30°C higher than the “GPU” temperature, which can cause rapid degradation if not kept
in check (e.g. 60°C GPU, 90°C hotspot).
- If you get an error when clicking “Set” then you need to change your performance tuning profile to
“Custom” in Radeon Software
- Every time the system crashes (regardless of whether the GPU was at fault) you have to redo
everything, so saving the settings to a file will save some time
Interrupt affinity
Using Microsoft’s Interrupt-Affinity Policy Tool (backup link), you can set affinity for a driver’s interrupts. Do not go
overboard as you can make the system perform worse if you randomly start changing affinities. Ideally each device
should have its own core, or be left alone if you have already dedicated your most important devices to every available
core.
- Changing the interrupt affinity of some drivers may prevent you from booting. If this is the case, use recovery
mode to boot from last known good configuration
- Default install dir: “C:\Program Files (x86)\Microsoft Corporation\Interrupt Affinity Policy Tool” (use the x64
executable)
Non MSI-X drivers perform best when affinity is set to a single core (IrqPolicySpecifiedProcessors). If a
device uses MSI-X, it will use IrqPolicySpreadMessagesAcrossAllProcessors by design, regardless of what
affinity you set it to use. If you want to force an MSI-X device to a specific core, you must set its message
limit to 1 via MSI Util. Every time you update a driver (such as your GPU driver) you will have to set the
affinity again. Examples of devices to change:
- GPU
- Setting the graphics card onto a single core gives the best performance, however setting it
to a busy core will result in worse performance. You will have to find out which core
performs best by benchmarking, such as using menu FPS or something very consistent
with high FPS (500+) that you can reproduce easily. Usually it is the last core.
- USB controllers (xHCI/EHCI)
- Audio controllers (does not apply to usbaudio devices, change USB controller interrupt affinity
instead)
- Network controller
- When using RSS, set to IrqPolicySpreadMessagesAcrossAllProcessors. You will also
need to change RssBaseCpu, as interrupts will always land on the RssBaseCpu first, then
each consecutive CPU (depending on how many RSS CPUs). You can change
RssBaseCpu via the GUI from Device Manager under your network adapter’s properties.
If the setting is unavailable then create a dword using regedit called “RssBaseCpu” under
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Ndis\Parameters, then
enter the number for the corresponding CPU (e.g. 3)
Benchmarking affinities or driver latency
- MouseTester for benchmarking xHCI/EHCI controller affinities
- liblava-demo for benchmarking GPU affinities, or anything else with extremely high FPS
- Use CapFrameX or something similar to benchmark average FPS, 1% and .1% lows
- Use AutoGPUAffinity to automate the process
- Xperf for benchmarking execution latencies for each driver. A script will make using it very easy
- My simple batch script which includes a Windows 7 download link without having to install
all of ADK
- Timecard's script which uses PowerShell if you prefer that instead
- Perfmon (Windows key+R → type “perfmon”) allows you to see DPCs and interrupts per core
- Go to “Performance Monitor” → click the green “+” sign → Processor → select DPC Rate,
DPCs Queued/sec, Interrupts/sec → <All instances> → press “Add > >” then “OK” →
Change to “Report” view
- Run the game or program that you normally would and move perfmon to another monitor
to see DPC/interrupt activity in realtime
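If you log frame times yourself, average FPS and the 1% / 0.1% lows are simple to compute. The sketch below assumes frame times in milliseconds and defines the "lows" as the average FPS over the slowest 1% (or 0.1%) of frames. That is only one common definition; tools such as CapFrameX offer several, so don't expect the numbers to match every tool exactly. All names are illustrative.

```python
def fps_summary(frame_times_ms):
    """Return average FPS plus 1% and 0.1% lows from frame times in ms."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    n = len(frame_times_ms)
    avg_fps = 1000.0 * n / sum(frame_times_ms)

    # Longest frame times first, so the worst frames are at the front.
    slowest_first = sorted(frame_times_ms, reverse=True)

    def low(divisor):
        # divisor=100 -> slowest 1% of frames, divisor=1000 -> slowest 0.1%
        count = max(1, n // divisor)
        worst = slowest_first[:count]
        return 1000.0 * count / sum(worst)

    return {"avg": avg_fps, "low_1pct": low(100), "low_0.1pct": low(1000)}


if __name__ == "__main__":
    # 990 frames at 2 ms, nine 5 ms hitches and one 20 ms stutter
    frames = [2.0] * 990 + [5.0] * 9 + [20.0]
    for name, value in fps_summary(frames).items():
        print(f"{name}: {value:.1f} FPS")
```

Compare runs using the same capture length and the same scene, otherwise the lows are not comparable between affinity settings.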
CPUs:
For optimal smoothness in gaming, an 8-core CPU is the minimum. 6-core CPUs are obsolete and will not be able to
smoothly run modern games at high frame rates. Ryzen is excluded for latency reasons. 12th and 13th gen
(Alder/Raptor Lake) CPUs are the only CPUs that make sense to buy in this current market. Also, ensure your CPU’s
frequency is locked across all cores to minimize latency and jitter from constant clock switching.
i7-3770K (4C/8T)
- Outdated for modern games; however, the L2 hit latency is 10ns lower than current Skylake-based CPUs
(~10ns vs. ~20ns)
- Uses DDR3 which is lower latency than DDR4 due to DDR4’s grouped banks and timing limits on Skylake
(ex. tRCDtRP, 28 tRAS, 16 tFAW)
Disable E-cores on 12th generation CPUs (Alder Lake) through UEFI as they massively limit overclocking potential of
the uncore while also increasing system jitter. The 12900K was superseded by the generally higher performing
13700K.
i7-12700K/F (8P/4E)
- Lower-binned 12900K, expect lower clocks
- Still a viable option if you need 8 P-cores for less than the 13700K
Raptor Lake is similar to Alder Lake, with the exception of 8x2MB L2 vs. 8x1.25MB L2 on Alder Lake and efficiency
improvements. E-cores should be disabled for lower latency, however the performance penalty of leaving them
enabled is not as high as with Alder Lake. If you opt to use E-cores, you will need something like Process Lasso to
automate setting affinities/CPU sets to ensure the game doesn’t get scheduled on the E-cores, or use Windows 11,
which has higher latency than previous Windows versions.
i7-13700K/F (8P/8E)
- Lower-binned 13900K, main difference is 8 fewer E-cores and 6MB less L3
i9-13900K/F (8P/16E)
- Higher overclocking potential, but at a large premium over the 13700K
CPU Cooling:
As with any other electronic component, the electrical losses are lower the better they are cooled, resulting
in better efficiency. Therefore it’s important to have a strong cooler for the CPU, as the IHS and small die
size massively limit cooling performance. AIOs offer better cooling performance than air coolers because the
radiators have higher fin density and the warm air can be directly exhausted out of the case. Another benefit
to having water cooling is the ability to mount a RAM fan due to the free space from not having a tower
cooler.
Motherboards:
Cheap motherboards will not allow your hardware to run at its full potential; RAM overclocking is highly dependent on
the motherboard, and to a lesser extent CPU overclocking as well, therefore it is important to be selective when
choosing one. Motherboards can be judged by hardware design; things like PCB layout and trace design, PCB layer
count, VRM design, heatsinks, etc. all play a massive role in quality. On the software side, the firmware is also critical
to RAM overclocking. A poorly optimized firmware will not take advantage of the (hopefully) good hardware.
Motherboards with 2 DIMM slots such as mini-ITX will have higher RAM overclocking potential than boards with 4
DIMM slots due to shorter distance from the CPU. 2 DIMM ATX boards will cost more compared to mini-ITX boards,
but have much stronger VRMs. Mini-ITX motherboards also have a large drawback: the RAM is right next to the GPU
which emits a lot of heat. Even if you do not plan on drawing high current, extra EPS connectors help provide more
stable power for the CPU’s VRM via lower resistance. Gigabyte motherboards lack user addressable IOL/RTL
settings, which can very negatively impact RAM latency. Intel Z390 is the last generation to have T-topology
motherboards for 4 DIMM overclocking (vs. the daisy chain layout, which suffers when 4 DIMMs are present). One
extremely useful feature to consider is BIOS flashback, as it allows you to flash your BIOS without having to turn the
system on. In the case of a failed BIOS flash, flashback should allow you to recover, especially with the newer
WSON8 packages (previously SOIC8) where using external programmers with clips is nearly impossible. Flashback
should also allow you to bypass downgrade and modded BIOS restrictions.
Z490:
Asus Z490i
- Reportedly better RAM OC than MSI Z490i, but much weaker VRM
- 8 layer PCB
MSI Z490i Unify
- Requires firmware updates for CR1 support
- 10 layer PCB
- Direct phase design
MSI Z490 Unify ATX
- 6 layer PCB, decent value for quad rank, ample VRM but uses doublers
Asus Z490 XII Apex
- Only 6 layer PCB, 2 DIMM slots
EVGA Z490 Dark
- Windows XP support
- 10 layer PCB, 2 DIMM slots
- Direct phase design, can disable LLC
Z590:
Avoid the Z590 Gigabyte Aorus Elite/Pro due to faulty power plane design
MSI Z590 Unify-X: $270
- 8 layer PCB, 2 DIMM slots
- 16+2+1 “mirrored” VRM
ASRock Z590 OC Formula: $480
- 12 layer PCB, 2 DIMM slots
- Missing “Hidden OC Item” making the UEFI nearly useless
Asus Z590 XIII Apex: $500
- 10 layer PCB, 2 DIMM slots
- Has new “Vlatch” feature which can detect and report minimum and maximum Vcore voltages through
HWInfo
Gigabyte Z590 Tachyon: $530
- 8 layer PCB, 2 DIMM slots
- Direct phase design
EVGA Z590 Dark: $600
- 10 layer PCB, 2 DIMM slots
- Direct phase design, can disable LLC
Z690:
If using Windows 7 and motherboard audio, beware of the Realtek ALC4080 audio chip present on most higher-end
Z690 boards as there is no W7 driver. ASRock and Gigabyte boards are questionable this generation. Be sure to get
an aftermarket ILM (independent loading mechanism) since the stock ILM causes warpage resulting in very poor
thermal performance:
- igor’sLAB: German Engineered Bend Aids for Intels LGA1700 – Thermal Grizzly CPU Contact Frame and
Alphacool Apex Backplate Thermal Testing | Review
- Gamer’sNexus: $4.35 Fix for Intel Thermal Problems | Thermalright 12th Gen Contact Frame
MSI PRO Z690-A DDR4: $200 (was $160, no longer good value)
- 6 layer PCB, twin phase design
- Barebones Z690 board with a “good enough” VRM for most builds
- There was a batch with broken flashback; if you buy it, ensure flashback works before it’s too late to
exchange/return
EVGA Z690 Classified DDR5: $330
- 10 layer PCB, 19 direct phases
- Mid-range DDR5 board with typical EVGA quality and features
- Uses ALC1220
EVGA Z690 Dark: $400
- 10 layer PCB, 2 DIMM slots, direct phase design
- No Vccgt for iGPU (Quick Sync, backup GPU, etc.)
- Uses ALC1220, overall better board than the Unify-X
Z790:
The same ILM issue applies to Z790 as well, so be sure to install an aftermarket contact frame (or delid). While
Raptor Lake CPUs are compatible with Z690 boards after a UEFI update, the updated UEFIs may not be well
optimized for the platform (such as with RAM overclocking). For that reason, if you have the money and want to avoid
potential headaches, buy a Z790 motherboard. As Z690 stock dries up, you’ll be forced to buy one anyway.
Asus Z790-A Strix DDR4: $347
- 6 layer PCB, unknown phase design
- Backwards-compatible with LGA 1200 coolers (additional mounting holes)
Asus Z790 Apex: $700
- 8? layer PCB, 12 phase teamed design
- Beware of bowed PCBs
EVGA Z790 Dark: $800
- 14 layer PCB, 21 phase (direct?)
RAM:
Having fast and stable RAM can dramatically decrease system latency since numerous games and programs heavily
depend on RAM to feed the CPU with data (anything that is not immediately on the CPU must be retrieved from
RAM, which is orders of magnitude slower than the CPU). By default on most systems, DDR4 RAM is clocked at
2133 15-15-15, which is extremely slow compared to something easily attainable like 3600 15-15-15 with tuned
subtimings, any decent Samsung B-die RAM kit, and a recent platform like Z390/X570 and up. Here you can see
some benchmarks of what results are possible just by overclocking RAM:
- Impact of RAM speed on Intel's Skylake desktop architecture by KingFaris
- A benchmark I did a while ago
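To get a feel for why 2133 15-15-15 is so much slower than 3600 15-15-15, you can convert the CAS latency from clock cycles to nanoseconds. DDR memory transfers data twice per clock, so one clock cycle lasts 2000 / (MT/s) nanoseconds. This sketch covers the first timing only; it is not the total memory latency a benchmark would report, which also depends on subtimings, the memory controller, and the platform.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    The memory clock in MHz is half the data rate in MT/s,
    so one cycle takes 2000 / data_rate_mts nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate_mts


if __name__ == "__main__":
    for rate, cl in [(2133, 15), (3600, 15), (3600, 14)]:
        print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

With the same CL, raising the frequency alone shortens every cycle, which is why 3600 15-15-15 is such a large step up from the default.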
Built-in overclocking profiles like XMP/DOCP/EOCP can be toggled for better performance, however they are still
overclocks and thus do not guarantee stability. On top of that, the profiles do not include subtimings, meaning there is
still a large amount of performance left on the table. Therefore, it’s a good idea to learn how to overclock the RAM
yourself to ensure it is running at its full potential and is stable. You can reference this guide to learn more:
- DDR4 OC Guide by integralfx
When overclocking anything (CPU/GPU/RAM, etc.) it is important to ensure the overclock is stable and temperatures
are controlled. Higher temperatures result in lower stability which can lead to errors that result in data corruption
and/or crashes. Thus, when stress testing, it is critical that you use multiple stress tests (not at once) for multiple
hours to guarantee stability. An unstable overclock that does not immediately appear to be unstable can also have
devastating effects. It can result in constant error-correction which in games can lead to inconsistent frame times
which will be perceived as low smoothness/microstuttering. It is nearly impossible to pinpoint such an issue unless
you recognize your system is overclocked (such as with XMP), as stress tests rarely pick up this type of instability.
Another thing to keep note of is heat from other components such as the CPU/GPU can heat up your RAM which will
lead to instability. Removing case panels can help mitigate this issue, but heat will still build up without proper airflow.
A RAM fan is a must when overclocking. Any 140mm fan will do, but it must be securely mounted to blow directly
onto the DIMMs, otherwise the effectiveness will be little to nothing.
RGB on RAM is detrimental to performance due to the additional traces and components required for the LEDs. This
will increase power draw which will in turn increase heat and electrical noise which will both interfere with RAM
operation, all while driving up cost to you. In terms of DDR4 DRAM voltage, anything under 1.5-1.6V is “safe” for daily
use, but around these voltages the RAM will quickly be thermally limited, even with a fan. The metallic covers on
DIMMs are only there for aesthetic and safety purposes to prevent accidental damage from user error. These covers
can be removed for better thermals since they use low quality thermal tape (or just glue) and cover the back of the
PCB with foam spacers which make the RAM run hotter than if the “heatsinks” weren’t there in the first place. The
temperature sensors present on DIMMs are located on the Serial Presence Detect (SPD) chip and do not report the
actual junction temperatures of the dies. In reality the memory is probably overheating when the temperature sensor
is reporting only 40°C, which is the ambient air immediate to the DIMMs.
All else equal, dual-rank RAM performs better than single-rank RAM. This is because the data is more evenly spread
out across different banks, meaning the memory controller is less likely to run into a bank that is busy refreshing.
However, more ranks require more voltage for the same timings and require a high quality motherboard for better
signal integrity. There is also more heat being produced which requires more powerful cooling. Since manufacturers
do not state whether their DIMMs are dual rank or not, the only way to really determine if you’re buying dual rank is to
know what chips are being used. In the case of Samsung B-die, a dual rank kit will be 2x16GB since a single rank
B-die kit is 2x8GB.
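The 2x8GB vs. 2x16GB rule follows from simple arithmetic on the die density. The sketch below assumes x8 chips (each chip supplies 8 bits of the 64-bit rank), which is typical for desktop DIMMs but is an assumption here, not something the DIMM's label will tell you. The function name is illustrative.

```python
def dimm_capacity_gb(die_density_gbit: int, ranks: int,
                     chip_width_bits: int = 8,
                     rank_width_bits: int = 64) -> float:
    """Capacity of one DIMM in GB for a given die density and rank count."""
    chips_per_rank = rank_width_bits // chip_width_bits
    total_gbit = die_density_gbit * chips_per_rank * ranks
    return total_gbit / 8  # 8 Gbit = 1 GB


if __name__ == "__main__":
    # Samsung B-die is an 8Gb die
    print(dimm_capacity_gb(8, ranks=1))  # 8.0  -> single rank DIMM
    print(dimm_capacity_gb(8, ranks=2))  # 16.0 -> dual rank DIMM
```

This only works in one direction: a 16GB DIMM made from 16Gb dies would be single rank, so capacity alone tells you the rank count only once you know which die is on the stick.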
If your hardware allows for it, make sure the “command rate” timing (in your BIOS/UEFI) is set to 1. CR2 (command
rate 2) is the default setting on most motherboards since it is easier to guarantee stability, but it carries a latency
penalty since the memory controller will wait X (1, 2, …) cycles before issuing commands to the RAM chips.
Stabilizing command rate 1 requires a very high quality motherboard, a good IMC (integrated memory controller),
and good RAM (Samsung B-die for DDR4). On top of that, if you have an 11th or 12th gen. Intel CPU, ensure your
memory controller is set to “gear 1.” Gear 2 incurs a large latency penalty since the memory controller is running at
half the memory’s frequency. 11th gen. IMCs typically top out around 3600 MT/s in gear 1, while 12th gen. IMCs
typically top out around 4000 MT/s, with some leeway offered if Vccsa is increased. Ryzen CPUs are similar in that
the Mclk and Fclk need to be 1:1, otherwise you incur a large latency penalty just like 11th/12th gen. Intel CPUs in
gear 2. Zen 3 will cap around 3733 MT/s. Whether command rate 1 and gear 1 are achievable also depends on the
load on the memory controller. Tighter timings, higher frequency, additional ranks, additional DIMMs, and additional
channels (if applicable) all add stress to the memory controller, with the items later in that list being the heaviest
loads. Therefore it is important to recognize what your limiting factors are.
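To put numbers on the gear and Mclk:Fclk talk, here is a minimal Python sketch. It assumes only that DDR memory transfers data twice per clock (so the memory clock is half the MT/s rating) and that gear 2 halves the controller clock as described above. The function name is made up for illustration and is not from any tool.

```python
def memory_clocks(mt_per_s: float, gear: int = 1) -> dict:
    """Return the memory clock and memory-controller clock in MHz.

    Assumption: DDR transfers twice per clock, so Mclk = MT/s / 2.
    Gear 1 runs the controller at Mclk, gear 2 at half of Mclk.
    """
    if gear not in (1, 2):
        raise ValueError("gear must be 1 or 2")
    mclk = mt_per_s / 2
    return {"mclk_mhz": mclk, "controller_mhz": mclk / gear}


# The speeds mentioned above: 3600 and 4000 MT/s on Intel, 3733 MT/s on Zen 3.
for speed, gear in [(3600, 1), (4000, 1), (4000, 2), (3733, 1)]:
    clocks = memory_clocks(speed, gear)
    print(f"{speed} MT/s, gear {gear}: Mclk {clocks['mclk_mhz']} MHz, "
          f"controller {clocks['controller_mhz']} MHz")
```

On Ryzen, the "controller" number for gear 1 is the Fclk you would need for a 1:1 ratio, e.g. 3733 MT/s needs roughly 1866 MHz Fclk.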
The “best” consumer DDR4 RAM die in most cases is Samsung 8Gb B-die, as it scales well with voltage allowing for
lower timings. Beware of A0 PCB kits which are usually older (2017-2018). This older PCB layout is less ideal due to
the chips being farther away from the DIMM’s pins. The A2 layout is generally better, and is found in recently
released kits. Listed below are typical B-die timings, though matching them does not guarantee Samsung B-die. Use
these as base timings; higher price does not guarantee a better bin. Keep in mind many of the kits with these timings
have RGB, which is detrimental to performance. If you find two kits with similar timings but dissimilar voltage, the
lower voltage could imply a better bin.
- 3200 14-14-14-XX
- 3600 14-14-14-XX
- 3600 14-15-15-XX
- 3600 15-15-15-XX
- 3600 16-16-16-XX
- 4000 14-15-15-XX
- 4000 15-16-16-XX
- 4000 16-16-16-XX
- 4000 17-17-17-XX
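If you want to compare the bins above at a glance, the first timing (CL) can be converted to nanoseconds. This is a rough sketch and only covers CAS latency, not total memory latency (which also depends on the other timings, command rate, and gear/Fclk). The function name is hypothetical; the only assumption is that one memory clock lasts 2000 / (MT/s) nanoseconds.

```python
def cas_latency_ns(mt_per_s: int, cl: int) -> float:
    """CAS latency in nanoseconds for a DDR kit.

    Assumption: DDR transfers twice per clock, so one clock cycle
    lasts 2000 / (MT/s) nanoseconds.
    """
    return cl * 2000 / mt_per_s


# (speed in MT/s, CL) pairs taken from the list above.
kits = [(3200, 14), (3600, 14), (3600, 15), (3600, 16),
        (4000, 14), (4000, 15), (4000, 16), (4000, 17)]

# Print the kits from lowest to highest CAS latency.
for speed, cl in sorted(kits, key=lambda kit: cas_latency_ns(*kit)):
    print(f"{speed} CL{cl}: {cas_latency_ns(speed, cl):.2f} ns")
```

Treat the result as a starting point for comparing kits, not as a ranking of real-world latency.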
GPUs:
At low settings, the CPU and RAM are more important than the GPU for high refresh rate gaming. You want a stable
foundation (CPU and RAM) before buying a GPU, so a modern (10700K/5800X+) overclocked eight-core CPU is the
minimum for driving high refresh rates. Avoid buying blower cards (one fan), avoid overly cheap cards, and be wary
of problems brought up in reviews. Nvidia cards are much better optimized for DX11 games where the CPU is the
bottleneck. AMD GPUs typically perform better in DX12/Vulkan. AMD’s video encoder is very far behind Nvidia’s in
both quality and stability, so keep this in mind if you stream or record. Linux driver support is typically better for
AMD. If you have a Radeon 6000 or Nvidia 30 series card, consider enabling Resizable BAR as it may help with
performance. See this article for more information about requirements and how to enable Resizable BAR, as well as
benchmarks. If you have an Nvidia GPU and your game has the option, consider enabling Reflex to reduce the latency
that comes with higher GPU load. Use “On” instead of “On+Boost” as the latter results in worse performance. Manually
lock the GPU clock instead.
6800 XT / 6900 XT
- Only good for DX12/Vulkan. OpenGL, DX9, DX10, and DX11 performance falls behind Nvidia’s offerings
(games such as Apex Legends, Counter-Strike, Fortnite, Minecraft, etc.). When bound by single-threaded
CPU performance, frame rates will quickly plummet below any Nvidia offering due to how well Nvidia’s
drivers are optimized
- Beware of driver issues with Radeon cards
- Windows 7 drivers are practically unusable on 5000 and 6000 series, consider Nvidia 30 series for
Windows 7 instead
- AMD has no equivalent of Nvidia’s Reflex which helps even when not GPU bound
RX 7xxx
- Avoid due to locked PowerPlay tables (= no overclocking) and, on top of that, having to use inferior AMD
drivers
Storage:
Random accesses are generally what regular usage involves (e.g. gaming, desktop usage), so choosing an SSD with
low latency and high RND4K read speeds is important. NVMe SSDs have much lower latency than SATA SSDs.
HDDs should be avoided unless absolutely necessary as they are inherently slow; they take longer to spin up and
seek to files, while making extra noise (acoustic and EMI) and using a lot of energy to do so. Most M.2 ports interface
through the chipset instead of directly to the CPU. While this isn’t terrible, there is a latency penalty. Since
Zen 3 and Intel 11th gen., motherboards have at least a single x4 M.2 slot that interfaces directly with the CPU
instead of PCH, so if applicable, use those ports for higher performance. One thing to keep in mind is that higher
capacity SSDs typically have higher performance and endurance ratings, so note this when choosing which size to
buy (500GB, 1TB, 2TB, etc.). When looking at SSD reviews, any reviewer that doesn’t list system specifications or
isn’t using a platform newer than Intel 12th gen. or Ryzen 7000 should be disregarded, as CPU performance
massively dictates SSD performance. If the reviewer has the Samsung 980 Pro in a comparison and isn’t getting
>95MB/s in RND4K Q1T1 in CrystalDiskMark (CDM), that’s a massive red flag.
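The link between Q1T1 throughput and latency is simple arithmetic: with one thread and a queue depth of 1, only one 4K request is in flight at a time, so throughput is just block size divided by the average time per request. Below is a rough sketch of that relationship. It assumes 4096-byte blocks and decimal megabytes (1 MB = 1,000,000 bytes), and the function names are made up for illustration.

```python
BLOCK_BYTES = 4096  # assumed size of one RND4K request


def qd1_throughput_mb_s(latency_us: float) -> float:
    """MB/s at queue depth 1 for a given average latency per request.

    Bytes per microsecond is numerically equal to decimal MB/s.
    """
    return BLOCK_BYTES / latency_us


def qd1_latency_us(throughput_mb_s: float) -> float:
    """Average latency per request implied by a Q1T1 throughput figure."""
    return BLOCK_BYTES / throughput_mb_s


# The 35µs figure quoted in this section for the 990 Pro.
print(f"35 us per request -> {qd1_throughput_mb_s(35):.1f} MB/s")
# The 95MB/s red-flag threshold for the 980 Pro.
print(f"95 MB/s -> {qd1_latency_us(95):.1f} us per request")
```

This is why a slow CPU or an untuned test system drags the RND4K Q1T1 number down: every microsecond of added software and platform latency comes straight out of the result.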
From the software side, the operating system and storage drivers also play an important role in SSD speed. The
generic NVMe driver that comes with Windows generally performs worse than manufacturer drivers such as
Samsung’s (see note 1 below). Ensure your drive never thermal throttles and has some form of cooling (heatsink, fan, or both). For
optimal SSD response, the CPU should be running at a fixed frequency across all cores with SMT disabled, ASPM
and C-states disabled from UEFI, and idle disabled in power plan settings.
1. Use the modded Samsung NVMe driver from Fernando (modded to work with non-Samsung drives):
- Download: Windows 7, Windows 10
Samsung’s reliability is very questionable; however, firmware updates have been released that supposedly address the
issues. Be mindful that SSDs frequently go on sale, so never buy at list price. Prices are for 1TB at typical sale prices.
Solidigm P44 Pro: $70 (Tom’s Hardware, TweakTown)
- Nearly identical to the Hynix P41 while being slightly cheaper (Solidigm was Intel’s SSD division, now owned
by SK Hynix)
WD SN850X: $78
- Beware of issues on Ryzen platforms, otherwise solid performer
Samsung 990 Pro: $90 (TweakTown)
- Highest RND4K performance and lowest latency out of consumer SSDs, ~115MB/s @ 35µs, but very low
Q1T1 SEQ1M read (~4800MB/s)
- Warning: requires firmware update before use
Hynix P41: $90 (exclusive to North America)
- Higher bandwidth than SN850X and 980 Pro, edges out the SN850X for a premium
- Consider the P44 Pro instead due to it being nearly the same SSD under different branding
983 ZET/900p/P4800X/905p/P5800X (Samsung Z-SSD and Intel Optane)
- AIC and U.2 drives with much higher performance than M.2 drives, listed in order of highest to lowest
latency
- Can be acquired on eBay cheaply, but ask about wear level before buying
- Requires 20 CPU PCIe lanes so that the drive neither runs through the chipset nor forces the GPU into x8
mode, meaning Intel Z590 or newer
Mice:
Do not use wireless peripherals unless you are willing to accept a latency penalty of 1+ milliseconds. Higher DPI
results in lower latency unless there is smoothing (HERO, Focus+, 3366, and certain 3370/3389 implementations can
do 12000+ DPI without additional smoothing). Turn off RGB as it uses extra power, creates additional interference,
and loads the MCU, which can impact the performance of the mouse. Ensure your polling rate is set to 1000Hz or
higher.
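To see why polling rate matters, here is a small sketch of the wait added by polling alone. It assumes the mouse has new data ready at a random point within each polling interval, so the average wait is half the interval and the worst case is the full interval. It ignores sensor, MCU, and USB controller delays, and the function names are hypothetical.

```python
def polling_interval_ms(polling_hz: float) -> float:
    """Time between two USB polls in milliseconds."""
    return 1000 / polling_hz


def average_polling_wait_ms(polling_hz: float) -> float:
    """Average wait before the next poll, assuming input arrives at a
    uniformly random point within the polling interval."""
    return polling_interval_ms(polling_hz) / 2


for rate in (125, 500, 1000, 8000):
    print(f"{rate} Hz: interval {polling_interval_ms(rate)} ms, "
          f"average wait {average_polling_wait_ms(rate)} ms")
```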
Your CPU or chipset’s USB controllers will usually result in the lowest jitter and latency. Ryzen CPUs have a USB
controller on-die, while Intel CPUs have it integrated in the PCH. Regardless of your platform, avoid using third-party
controllers/hubs such as ASMedia as they are almost always worse than the native solutions offered by the
CPU/PCH. To verify that you’re not using the wrong controller/hub, you can check in HWiNFO (main window)→Bus.
On AMD, it will show the chipset and on-die xHCI controllers separately; your mouse should be connected to the
CPU’s controller. Because the ports are not labeled, you will have to try different ports to see if you are connected to
your intended controller/hub.
Monitors:
I will only cover 240Hz+ monitors since CRTs are no longer in production. The latency can be split into two
categories: processing and pixel response time. Processing is the delay of the monitor processing the signal,
whereas response time is how quickly the pixel can change states (manifests as motion blur). An example below
shows the separation of the processing and response time latencies. Note that this selection of monitors is very
limited, so don’t base your monitor purchase off a single source. Typically IPS monitors such as the VG279QM will
have lower processing latency than TN monitors, but will suffer from worse response times. Avoid monitors with PWM
(pulse-width modulation) at all costs, even if high frequency. Amazon Renewed monitors are often much cheaper
than brand new monitors while only having damaged packaging. They are worthwhile as you can save a lot of money and
still have a 30 day return policy if you are not content. Higher overdrive means lower latency, so set it as high as you can
tolerate. Black frame insertion (e.g. DyAc, ELMB, etc.) increases latency and introduces flicker which causes eye
strain. Therefore, BFI is not a substitute for having a panel with good response times.
Also, ensure you cap your game’s FPS to your monitor’s refresh rate (or any integer multiple/factor) if not using
adaptive sync. If you cap your FPS improperly, there will be a beat frequency which can be observed as stuttering.
For example, if you have a 240Hz monitor, cap your game to 240 FPS, or even 120 or 480 FPS depending on what
you can steadily hold. An example of an improper cap would be 240Hz / 237FPS because neither number divides
evenly into the other. If adaptive sync is not enabled, this will cause 3 stutters per second (240 - 237). 240Hz / 250FPS
would cause 10 stutters per second, and so on. Once the cap is properly set, and the frame limiter is accurate
(in-game limiters are generally awful for accuracy), the tear line should stay in one location (stutters or a bad limiter
will cause it to wander). One thing to note is that the default resolutions monitors ship with may use an offset of 0.1%
for the vertical refresh frequency (59.94Hz, 239.76Hz, etc.), so you will need to adjust it in CRU (Custom Resolution
Utility) to make sure the refresh rate is an integer, otherwise the cap will not work properly.
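The stutter counts in the paragraph above are just the beat frequency between the cap and the refresh rate. Here is a minimal sketch of that arithmetic. Extending it to integer multiples and factors by comparing against the nearest one is my assumption; the function name is made up for illustration.

```python
def stutters_per_second(refresh_hz: float, fps_cap: float) -> float:
    """Beat frequency between a frame cap and a fixed refresh rate.

    Assumption: the cap is compared against the nearest integer
    multiple or factor of the refresh rate, so 120 or 480 FPS on a
    240Hz monitor counts as a proper cap.
    """
    if refresh_hz <= 0 or fps_cap <= 0:
        raise ValueError("rates must be positive")
    if fps_cap >= refresh_hz:
        multiple = round(fps_cap / refresh_hz)
        return abs(fps_cap - multiple * refresh_hz)
    factor = round(refresh_hz / fps_cap)
    return abs(refresh_hz - factor * fps_cap)


# The last pair is a 240 FPS cap on a monitor running at 239.76Hz.
for hz, cap in [(240, 237), (240, 250), (240, 120), (240, 480), (239.76, 240)]:
    print(f"{hz}Hz / {cap}FPS: {stutters_per_second(hz, cap):.2f} stutters per second")
```

The last case shows why the 0.1% refresh rate offset matters: a 240 FPS cap on a 239.76Hz mode still drifts, just slowly (about one beat every four seconds).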
Source: https://www.tftcentral.co.uk/reviews/asus_rog_swift_360hz_pg259qn.htm#lag
Monitor review sites with latency measurements (do not compare latency measurements from different sources due
to differing test methods)
https://www.tftcentral.co.uk/reviews.htm
https://www.rtings.com/monitor/reviews
https://pcmonitors.info/reviews/archive/
Miscellaneous links
Windows activation
- https://www.reddit.com/r/Piracy/wiki/megathread/tools
How LCD Response Times are Measured, and Why 10% to 90% GtG Measurements are Moderately Deceptive
https://www.youtube.com/watch?v=MbZUgKpzTA0
Fujitsu Primergy Server BIOS Settings for Performance, Low-Latency and Energy Efficiency
https://sp.ts.fujitsu.com/dmsp/Publications/public/wp-bios-settings-primergy-ww-en.pdf
Follow me on Twitter
https://twitter.com/CaIypto
The fruit of my labor. One of the hardest scenarios in Kovaak’s Aim Trainer (now outdated but still a decent score)