More progress

2026-02-22 09:35:35 -07:00 · 2025-02-02 21:58:52 -03:00
parent 95f5c5eb5f
commit 351d5932e8
1 changed files with 17 additions and 11 deletions
--- a/_posts/2025-01-22-riva128-part-1.md
+++ b/_posts/2025-01-22-riva128-part-1.md
@@ -182,9 +182,9 @@ This MMIO area has numerous functional subsystems of the GPU mapped into it, wit
 | `0x2000-0x3FFF`     | PFIFO       | FIFO buffer for graphics command submission from DMA                        |
 | `0x4000-0x4FFF`     | PRM         | Real mode device support (e.g. MPU-401)                                     |
 | `0x6000-0x6FFF`     | PRAM        | Controls RAMIN area configuration                                           |
-| `0x7000-0x7FFF`     | PRMIO       | Real Mode Access registers                                                  |
+| `0x7000-0x7FFF`     | PRMA        | Real Mode Access registers                                                  |
 | `0x9000-0x9FFF`     | PTIMER      | Custom programmable interval timer                                          |
-| `0xA0000-0xAFFFF`   | VGA RAM     | Emulated VGA video memory                                                   |
+| `0xA0000-0xAFFFF`   | PRMFB       | Real Mode Framebuffer: emulated VGA video memory                            |
 | `0xC0000-0xCFFFF`   | PRMVIO      | Real Mode Video: VGA emulation registers (Weitek)                           |
 | `0x100000-0x100FFF` | PFB         | Framebuffer interface (config, debug, initialisation)                       |
 | `0x101000-0x101FFF` | PEXTDEV     | External Device interface                                                   |
@@ -235,19 +235,25 @@ This area is effectively the last megabyte of VRAM (regardless of VRAM size), bu

 #### Interrupts

-Any graphics card worth its salt needs an interrupt system. So a REALLY good one must have two completely different systems for notifying other parts of the GPU about events, right? There is a traditional interrupt system, with both software and hardware support (indicated by bit 31 of the interrupt status register) controlled by a register in `PMC` that turns on and off interrupts for different components of the GPU. Each component of the GPU also allows individual interrupts to be turned on or off, and has its own interrupt status register. Each component (including the removed-in-revision-B `PAUDIO` for some reason) is represented by a bit in the `PMC` interrupt status register. If the interrupt status register of a component, ANDED with the interrupt status register, is 1, an interrupt is declared to be pending (with some minor exceptions that will be explained in later parts) and a PCI/AGP IRQ is sent. The interrupt registers are set up such that, when they are viewed in hexadecimal, an enabled interrupt appears as a 1 and a disabled interrupt as a 0. Interrupts can be turned off GPU-wide (or for one of just hardware or software) via the `PMC_INTR_EN` register (at `0x0140`)
+A traditional interrupt system is implemented, supporting interrupts issued by different GPU components. `PMC` contains an interrupt status register and an interrupt enable register, with one bit for each component (including the eventually-removed `PAUDIO`), as well as a software interrupt represented by bit 31; components also have a local status register and enable register, with each bit representing an individual interrupt from that block. If the `PMC` interrupt status and enable bits for a given component are both 1, with some minor exceptions to be explained in later parts, an interrupt is declared to be pending and a PCI IRQ is sent.

-This allows an interrupt to be implemented as:
+Interrupts can be turned off globally (or just component interrupts, or just the software interrupt) via the `PMC_INTR_EN` register at `0x0140`.

-```
-<component>.interrupt_status |= (1 << interrupt_number)
-```
+#### Programmable interval timer

-#### Programmable Interrupt Timer: PTIMER
-Time-sensitive functions are provided by a nice, simple (except for the fact that, for some strange reason, the counter is 56-bit, split into two 32-bit registers `PTIMER_TIME0`, of which only bits 31 through 5 are meaningful, and `PTIMER_TIME1`...which has bits 28 through 0 meaningful instead?) programmable interval timer that fires an interrupt whenever the threshold value (set by the `PTIMER_ALARM`) is exceeded in nanoseconds. This is how the drivers internally keep track of many actions that they need to perform and is the first functional block you need to get right if you ever hope to emulate the RIVA 128.
+Time-sensitive functions are provided by a relatively simple programmable interval timer, `PTIMER`, that fires an interrupt whenever its counter, measured in nanoseconds, exceeds the threshold value set in `PTIMER_ALARM`. This is how the drivers internally keep track of many actions that they need to perform, and it is the first functional block that must be done right if you ever hope to emulate the RIVA 128.

-#### Graphics Commands & DMA Engine Overview
-What may be called graphics commands in other GPU architectures are instead called graphics objects in the NV3 architecture, and in fact all NVIDIA architectures use this nomenclature. They are submitted into the GPU core via a custom DMA engine (although Parallel I/O can be used) with its own translation lookaside buffer and other memory management structures. There are 8 DMA channels (only one is allowed at a time; a mechanism known as "context switching" must be performed to use other channels (involving writing to PGRAPH registers for every class to set the current channel ID), with channel 0 being the default). All DMA channels are 64-kilobytes in size of RAM called RAMIN (which will be explained later), and are further divided into subchannels that are `0x2000` bytes in length. The meaning of what is in those subchannels depends on the type (or, as NVIDIA calls it - class) of object submitted into them, with the attributes of each object being called a method. All objects have a defined name (really just a 32-bit value) and another 32-bit value storing various information about the object - where it is relative to the start of `RAMIN`, if it is a software-injected or hardware graphical rendering object (bit 31), the channel and subchannel ID the object is associated with, and the object's class. This is called their *context*. Their contexts are stored in an area of RAM called `RAMFC` if the channel they are in is not being used, and if it is, they are stored in `RAMHT` - a hash table*, where the hash key is every byte of the object's name (which must be above 4096 due to NVIDIA's drivers reserving IDs below that) XORed individually, which is XORed with the channel ID to get the final hash ID. This is then multiplied by 16 to get the object's offset from the start of RAMHT. (It seems the drivers have to manage that this area does not get full on their own with only basic error handling from the hardware itself!). 
The first four bytes are its name, then its context, and finally the actual methods of the objects that we discussed earlier.
+The least straightforward part of this timer is the counter, a 56-bit value split across two 32-bit registers: the lower 27 bits are stored in bits [31:5] of `PTIMER_TIME0`, and the upper 29 bits are stored in bits [28:0] of `PTIMER_TIME1`.
+
+#### Graphics commands and DMA engine
+
+What may be called *graphics commands* in other GPU architectures are instead called *graphics objects* in the NV3 and all other NVIDIA architectures. Objects are submitted into the GPU core via a custom direct memory access engine with its own translation lookaside buffer and other memory management structures, although programmed I/O can also be used.
+
+There are 8 DMA channels, with the default being channel 0, but only one can be used at a time; using other channels requires a *context switch*, which entails writing the current channel ID to PGRAPH registers for every class. All DMA channels use 64 KB of RAMIN memory (to be explained later), further divided into 8 KB subchannels; the meaning of what is in those subchannels depends on the type (or *class*, to use NVIDIA terminology) of object submitted into them, with the attributes of each object being called a *method*.
+
+All objects have a *context*, consisting of a 32-bit "name" and another 32-bit value storing its class, associated channel and subchannel ID, where it is relative to the start of `RAMIN`, and whether it's a software-injected or hardware graphical rendering object (bit 31). Contexts are stored in an area of RAM called `RAMFC` if the object's channel is not being used; otherwise, they are stored in `RAMHT`, a *hash table* where the hash key is a single byte calculated by XORing each byte of the object's name[^htdriver] as well as the channel ID. Objects are stored in `RAMHT` as structures consisting of their 8-byte context followed by the *methods* mentioned earlier; an object's byte offset in `RAMHT` is its hash multiplied by 16.
+
+[^htdriver]: Object names below 4096 are reserved by NVIDIA's drivers, which are also responsible for keeping the hash table area from filling up, as the hardware itself provides only basic error handling.

 The exact methods of every graphics object are incredibly long and often shared between several different types of objects (although the first `0x100` bytes are shared and usually the first bytes after that are shared too) and won't be listed in part 1, but an overall list of graphics objects (note - these are the graphics objects defined by the *hardware*, the *drivers* implement their own, much larger set of graphics objects that do not map exactly to the ones in the GPU; furthermore, as you will see later, due to the large - 8KB - size of each object, *only one object does not mean only one - or even any - single object is drawn!*):