NES Architecture: PPU and CPU timing

Have you encountered slowdown or dropped frames in your game? That usually means the game needs more CPU cycles than fit in a single frame. But how many cycles do you actually have? This short post shows how many CPU cycles you can use per frame, and why it’s exactly that number.

PPU cycles

The resolution of the NES is 256 by 240 “pixels”. In quotes, because the NES PPU is designed to drive a CRT screen, which doesn’t exactly plot pixels, but rather sweeps an electron beam across the screen line by line at a blazingly high speed. This video by Retro Game Mechanics Explained explains it better than I ever could. Either way, I will call those units dots from now on.

Every scanline takes 341 PPU cycles, one dot per cycle. The PPU starts with one idle cycle, then plots 256 dots, and then spends the remaining 84 cycles resetting to the start of the next scanline. Counting the idle cycle, that leaves 85 cycles per scanline in which nothing is drawn. This process repeats 240 times; once for each scanline on screen.

After the last visible scanline, there’s one scanline* worth of idle time, and after that, the PPU has 20 scanlines** worth of time to reset itself to the initial position to plot the new screen. This period is called vertical blanking, or vBlank for short. When the PPU enters this state, it triggers the NMI (provided NMIs are enabled). After the vertical blanking period, there is one more scanline worth of pre-rendering time. Because the length of the vBlank period is system-dependent, we number this scanline -1.

* NTSC and PAL systems have 1 scanline worth of idle time; DENDY systems, however, have 51.
** NTSC and DENDY systems have 20 scanlines worth of vBlank time; PAL, however, has 70.
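
To make the frame layout concrete, here is a small Python sketch that adds up the scanlines described above. The constant and function names are my own, and the 341 dots per scanline come from the overview table at the end of this post; treat it as an illustration of the arithmetic, not as emulator code.

```python
# Scanline layout shared by all three systems.
DOTS_PER_SCANLINE = 341      # idle cycle + 256 visible dots + horizontal blanking
VISIBLE_SCANLINES = 240
PRE_RENDER_SCANLINES = 1     # the scanline numbered -1

# (post-render idle scanlines, vBlank scanlines), taken from the footnotes above.
SYSTEMS = {
    "NTSC": (1, 20),
    "PAL": (1, 70),
    "DENDY": (51, 20),
}


def scanlines_per_frame(system):
    """Total scanlines in one frame, visible or not."""
    idle, vblank = SYSTEMS[system]
    return VISIBLE_SCANLINES + idle + vblank + PRE_RENDER_SCANLINES


def dots_per_frame(system):
    """Total PPU dots (PPU cycles) in one frame."""
    return scanlines_per_frame(system) * DOTS_PER_SCANLINE


if __name__ == "__main__":
    for name in SYSTEMS:
        print(name, scanlines_per_frame(name), dots_per_frame(name))
```

Note how PAL and DENDY end up with the same frame length, even though they spend the extra scanlines in different places.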

PPU and CPU clock speeds

The picture processing unit (PPU) and central processing unit (CPU) are separate chips inside the NES, and their clock speeds differ as well. The PPU plots exactly 3 dots per CPU cycle on NTSC and DENDY systems, and 3.2 dots per CPU cycle on PAL. That means, for example, that by the time the CPU has written one byte to an “absolute,X” address (e.g. STA Object_x_hi,x)*, the PPU has already drawn 15 dots on the screen, and even 16 on PAL systems. This illustrates how frugal you need to be with cycles when writing your game!

* STA absolute,X takes 5 CPU cycles, as documented in the 6502 Instruction Set.
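
The conversion from CPU cycles to PPU dots is a single multiplication. Below is a minimal Python sketch of it; the names are hypothetical, and I use exact fractions so that the PAL ratio of 3.2 (16/5) doesn't pick up floating-point noise.

```python
from fractions import Fraction

# PPU dots drawn per CPU cycle, per system.
DOTS_PER_CPU_CYCLE = {
    "NTSC": Fraction(3),
    "PAL": Fraction(16, 5),   # 3.2
    "DENDY": Fraction(3),
}

STA_ABSOLUTE_X_CYCLES = 5  # cycle count of STA absolute,X


def dots_during(cpu_cycles, system):
    """Number of PPU dots that pass while the CPU spends `cpu_cycles` cycles."""
    return cpu_cycles * DOTS_PER_CPU_CYCLE[system]


if __name__ == "__main__":
    for name in DOTS_PER_CPU_CYCLE:
        print(name, dots_during(STA_ABSOLUTE_X_CYCLES, name))
```

The same function works for whole routines: sum the cycle counts of the instructions and pass the total in.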

Updating the screen

As you may know, you can’t really update the screen while the PPU is actively sending dots to the TV: writing data to the PPU at that moment corrupts whatever output it is handling, which causes graphical glitches. The good news is that there are two periods when the PPU does not actively send dots to the screen: horizontal blanking (or hBlank, the 85 non-drawing cycles of each scanline) and vertical blanking (vBlank, the 20 or 70 scanlines at the end of each frame). The bad news is that hBlank is basically too short to do anything interesting. On NTSC systems, there are only about 28.33 CPU cycles per hBlank period, which is barely enough to, let’s say, update a single palette color. On PAL it’s even less: about 26.56 CPU cycles. That’s why pretty much every NES game updates graphics during vBlank, which takes up 6,820 PPU cycles (about 2,273 CPU cycles) on NTSC, and a whopping 23,870 PPU cycles (about 7,459 CPU cycles) on PAL.
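
If you want to check these numbers yourself, the sketch below converts the blanking periods from PPU dots to CPU cycles. It is plain arithmetic in Python with names of my own choosing, using the 341 dots per scanline from the overview table.

```python
DOTS_PER_SCANLINE = 341
HBLANK_DOTS = 85


def cpu_cycles(ppu_dots, dots_per_cpu_cycle):
    """Convert a duration in PPU dots to (fractional) CPU cycles."""
    return ppu_dots / dots_per_cpu_cycle


# hBlank: the same 85 dots on every system, but PAL's CPU gets fewer cycles out of them.
ntsc_hblank = cpu_cycles(HBLANK_DOTS, 3.0)
pal_hblank = cpu_cycles(HBLANK_DOTS, 3.2)

# vBlank: 20 scanlines on NTSC, 70 on PAL.
ntsc_vblank = cpu_cycles(20 * DOTS_PER_SCANLINE, 3.0)
pal_vblank = cpu_cycles(70 * DOTS_PER_SCANLINE, 3.2)

if __name__ == "__main__":
    print(f"hBlank: NTSC {ntsc_hblank:.2f}, PAL {pal_hblank:.2f} CPU cycles")
    print(f"vBlank: NTSC {ntsc_vblank:.2f}, PAL {pal_vblank:.2f} CPU cycles")
```

The vBlank budget is roughly 80 times the hBlank budget on NTSC, which is why vBlank is where the graphics updates go.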

You’ll have enough time to update all sprites on screen (a standard OAM DMA takes 513 or 514 CPU cycles), update some of the background tiles, and run the sound driver. You can’t update an entire screen in this still fairly narrow timeframe, though. That’s why the screen goes blank when NESmaker has to update an entire screen: the PPU gets disabled, so the game can update graphics outside these blanking periods without messing up the output.
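
To get a feel for how much of vBlank is left once the sprites are transferred, here is a rough budget calculation in Python. The function names are hypothetical, and I assume the worst-case 514 cycles for the OAM DMA; how far the remaining cycles get you depends entirely on your own update code.

```python
# vBlank length in CPU cycles: vBlank dots divided by dots per CPU cycle.
VBLANK_CPU_CYCLES = {
    "NTSC": 6820 / 3.0,
    "PAL": 23870 / 3.2,
    "DENDY": 6820 / 3.0,
}

OAM_DMA_CYCLES = 514  # assumption: worst case of the 513 or 514 cycles


def remaining_after_oam_dma(system):
    """CPU cycles left in vBlank after one OAM DMA transfer."""
    return VBLANK_CPU_CYCLES[system] - OAM_DMA_CYCLES


def oam_dma_share(system):
    """Fraction of the vBlank budget that the OAM DMA consumes."""
    return OAM_DMA_CYCLES / VBLANK_CPU_CYCLES[system]


if __name__ == "__main__":
    for name in VBLANK_CPU_CYCLES:
        left = remaining_after_oam_dma(name)
        share = oam_dma_share(name)
        print(f"{name}: {left:.0f} cycles left, OAM DMA uses {share:.0%}")
```

On NTSC the sprite transfer alone eats a bit more than a fifth of vBlank, so background updates have to be planned carefully.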

Overview of cycles

To close things off for now, here’s a table containing the values of cycles, scanlines, frames and blank lengths for the three main TV systems.

                              NTSC       PAL        DENDY
Frames per second (approx.)   60         50         50
PPU dots per CPU cycle        3.0        3.2        3.0
Scanlines per frame           262        312        312
PPU dots per scanline         341        341        341
PPU dots per frame            89342      106392     106392
CPU cycles per frame          29780.67   33247.5    35464
hBlank size in PPU dots       85         85         85
hBlank size in CPU cycles     28.33      26.56      28.33
vBlank size in scanlines      20         70         20
vBlank size in PPU dots       6820       23870      6820
vBlank size in CPU cycles     2273.33    7459.38    2273.33
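
As a cross-check on the frame rates, you can divide the CPU clock speed by the number of CPU cycles per frame. The clock speeds in this Python sketch are not from this post; they are the commonly cited CPU frequencies for each system, so treat them as an assumption.

```python
# Assumption: commonly cited CPU clock speeds in Hz (not taken from this post).
CPU_CLOCK_HZ = {
    "NTSC": 1_789_773,
    "PAL": 1_662_607,
    "DENDY": 1_773_448,
}

# CPU cycles per frame: PPU dots per frame divided by dots per CPU cycle.
CPU_CYCLES_PER_FRAME = {
    "NTSC": 89342 / 3.0,
    "PAL": 106392 / 3.2,
    "DENDY": 106392 / 3.0,
}


def frames_per_second(system):
    """Approximate frame rate: CPU cycles per second over CPU cycles per frame."""
    return CPU_CLOCK_HZ[system] / CPU_CYCLES_PER_FRAME[system]


if __name__ == "__main__":
    for name in CPU_CLOCK_HZ:
        print(f"{name}: {frames_per_second(name):.3f} frames per second")
```

Under these assumptions, NTSC lands just above 60 frames per second, while PAL and DENDY both land just above 50.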

 

References / further reading