diff --git a/_posts/2025-01-22-riva128-part-1.md b/_posts/2025-01-22-riva128-part-1.md index 833f10a..28d3ae5 100644 --- a/_posts/2025-01-22-riva128-part-1.md +++ b/_posts/2025-01-22-riva128-part-1.md @@ -24,26 +24,26 @@ Please contact me @ thefrozenstar_ on Discord, via the 86Box Discord, my email a ## Introduction -The NVidia RIVA 128 is a graphics card released in 1997 by NVidia (nowadays of AI and $2000 overpriced quad-slot GPU fame). It was a Direct3D 5.0-capable accelerator, and one of the first to use a standard graphics API, such as DirectX, as its "native" API. I have been working on emulating this graphics card for the last several months; currently, while VGA works and the drivers are loading successfully on Windows 2000, they are not rendering any kind of accelerated output yet. Many people, including me, have asked for and even tried to develop emulation for this graphics card and other similar cards (such as its successor, "NV4" or Riva TNT), but have not succeeded yet (although many of these efforts continue). This is the first part of a series where I explore the architecture and my experiences in emulating this graphics card. I can't guarantee success, but if it was successful, it appears that it would be the first time that a full functional emulation of this card has been developed (although later Nvidia cards have been emulated to at least some extent, such as the GeForce 3 in Cxbx-Reloaded and Xemu). +The NVidia RIVA 128 is a graphics card released in 1997 by NVidia (nowadays of AI and $2000 overpriced quad-slot GPU fame). It was a Direct3D 5.0-capable accelerator, and one of the first to use a standard graphics API, such as DirectX, as its "native" API. I have been working on emulating this graphics card for the last several months; currently, while VGA works and the drivers are loading successfully on Windows 2000 (a few days ago, the emulation successfully passed the bootscreen!), they are not rendering any kind of accelerated output yet. Many people, including me, have asked for and even tried to develop emulation for this graphics card and other similar cards (such as its successor, "NV4" or Riva TNT), but have not succeeded yet (although many of these efforts continue). This is the first part of a series where I explore the architecture and my experiences in emulating this graphics card. I can't guarantee success, but if it is successful, it appears that it would be the first time that a fully functional emulation of this card has been developed (although later Nvidia cards have been emulated to at least some extent, such as the GeForce 3 in Cxbx-Reloaded and Xemu). This series of blog posts aims to demystify, once and for all, the Nvidia Riva 128. This first part will dive into the history of Nvidia up to the release of the NVidia RIVA 128, and give a brief overview of how the Riva 128 actually works. The second part will dive into the architecture of Nvidia's drivers and how they relate to the hardware, and the third part will follow the lifetime of a graphics object from birth to display on the screen in extreme detail. Then, part four - and an unknown number of parts after it - will go into detail on the experience of developing a functional emulation for this graphics card. 
## A brief history ### Beginnings -NVidia was conceived in 1992 by three engineers from LSI Logic and Sun Microsystems - Jensen Huang (now one of the world's richest men, still the CEO and, apparently, mobbed by fans in the country of his birth, Taiwan), Curtis Priem (whose boss almost convinced him to work on Java instead of founding the company) and Chris Malachowsky (a veteran of graphics chip development). They saw a business opportunity in the PC graphics and audio market, which was dominated by low-end, high-volume players such as S3 Graphics, Tseng Labs, Cirrus Logic and Matrox (the only one that still exists today - after exiting the consumer graphics market in 2003 and ceasing to design graphics cards entirely in 2014). The company was formally founded on April 5, 1993, after all three left their jobs at LSI Logic and Sun between December 1992 and March 1993. Immediately (well, after the requisite $3 million of venture capital funding was acquired - a little nepotism owing to their reputation helped) began work on its first generation graphics chip; it was one of the first of a rush of dozens of companies attempting to develop graphics cards - both established players in the 2D graphics market such as Number Nine and S3, and new companies, almost all of which no longer exist - and many of which failed to even release a single graphics card. The name was initially GXNV ("GX next version", after a graphics card Malachowsky led the development of at Sun), but Huang requested him to rename the card to NV1 in order to not get sued. This also inspired the name of the company - NVidia, after other names such as "Primal Graphics" and "Huaprimal" were considered and rejected, and their originally chosen name - Invision - turned out to have been trademarked by a toilet paper company. In a perhaps ironic twist of fate, toilet paper turned out to be an apt metaphor for the sales, if not quality, of their first product, which Jensen Huang appears to be embarassed to discuss when asked, and has been quotes as saying "You don't build NV1 because you're great". The product was released in 1995 after a two-year development cycle and the creation of what Nvidia dubbed a hardware simulator, but actually appears to have been simply a set of Windows 3.x drivers intended to emulate their architecture, called the NV0 in 1994. +NVidia was conceived in 1992 by three engineers from LSI Logic and Sun Microsystems - Jensen Huang (now one of the world's richest men, still the CEO and, apparently, mobbed by fans in the country of his birth, Taiwan), Curtis Priem (whose boss almost convinced him to work on Java instead of founding the company) and Chris Malachowsky (a veteran of graphics chip development). They saw a business opportunity in the PC graphics and audio market, which was dominated by low-end, high-volume players such as S3 Graphics, Tseng Labs, Cirrus Logic and Matrox (the only one that still exists today - after exiting the consumer graphics market in 2003 and ceasing to design graphics cards entirely in 2014). The company was formally founded on April 5, 1993, after all three left their jobs at LSI Logic and Sun between December 1992 and March 1993. Immediately - well, after the requisite $3 million of venture capital funding was acquired at least (a little nepotism owing to their reputation helped!) - the company 
began work on its first generation graphics chip; it was one of the first of a rush of dozens of companies attempting to develop graphics cards - both established players in the 2D graphics market such as Number Nine and S3, and new companies, almost all of which no longer exist - and many of which failed to even release a single graphics card. The name was initially GXNV ("GX next version", after a graphics card Malachowsky led the development of at Sun), but Huang requested that the card be renamed to NV1 in order to not get sued. This also inspired the name of the company - NVidia, after other names such as "Primal Graphics" and "Huaprimal" were considered and rejected, and their originally chosen name - Invision - turned out to have been trademarked by a toilet paper company. In a perhaps ironic twist of fate, toilet paper turned out to be an apt metaphor for the sales, if not quality, of their first product, which Jensen Huang appears to be embarrassed to discuss when asked, and has been quoted as saying "You don't build NV1 because you're great". The product was released in 1995 after a two-year development cycle and the creation, in 1994, of what Nvidia dubbed a hardware simulator, called the NV0 - which actually appears to have been simply a set of Windows 3.x drivers intended to emulate their architecture. ### The NV1 -The NV1 was a combination graphics, audio, DRM (yes, really) and game port card implementing what Nvidia dubbed the "NV Unified Media Architecture (UMA)"; the chip was manufactured by SGS-Thomson Microelectronics - now STMicroelectronics - on the 350 nanometer node, who also white-labelled Nvidia's design (except the DAC, which seems to have been designed by SGS, at least based on the original contract text from 1993) as the STG-2000 (without audio functionality - this was also called the "NV1-V32", for 32-bit VRAM, in internal documentation, with Nvidia's being the NV1-D64). The card was designed to implement a reasonable level of 3D graphics functionality, as well as audio, public-key encryption for DRM purposes (which was never used, as it would have required the cooperation of software companies) and Sega Saturn game ports in a single megabyte of memory (memory cost $50 a megabyte when the initial design of the NV1 chip began in 1993). In order to achieve this, many techniques had to be used that ultimately compromised the quality of the 3D rendering of the card, such as using forward texture mapping, where a texel (output pixel) of a texture is directly mapped to a point on the screen, instead of the more traditional inverse texture mapping, which iterates through pixels and maps texels from those. While this has memory space advantages (as you can cache the texture in the very limited amount of VRAM Nvidia had to work with very easily), it has many more disadvantages - firstly, this approach does not support UV mapping (a special coordinate system used to map textures to three-dimensional objects) and other aspects of what would be considered to be today basic graphical functionality. Additionally, the fundamental implementation of 3D rendering used quad patching instead of traditional triangle-based approaches - this has very advantageous implications for things like curved surfaces, and may have been a very effective design for the CAD/CAM customers purchasing more high end 3D products. However, it turned out to not be particularly useful at all for the actually intended target market - gaming. 
There was also a total lack of SoundBlaster compatibility (required for audio to sound half-decent in many games) in the audio engine, and partially-emulated and very slow VGA compatibility, which led to slow performance in the games people *actually played*, unless your favourite game was a crappier, slower version of Descent, Virtua Cop or Daytona USA for some reason. Another body blow to Nvidia was received when Microsoft released Direct3D in 1996 with DirectX 2.0, which simultaneously used triangles, became the standard 3D API and killed all of the numerous non-OpenGL proprietary 3D apis, including S3's S3D and later Metal, ATIs 3DCIF, and Nvidia's NVLIB. +The NV1 was a combination graphics, audio, DRM (yes, really) and game port card implementing what Nvidia dubbed the "NV Unified Media Architecture (UMA)"; the chip was manufactured by SGS-Thomson Microelectronics - now STMicroelectronics - on the 350 nanometer process node, who also white-labelled Nvidia's design (except the DAC, which seems to have been designed by SGS, at least based on the original contract text from 1993) as the STG-2000 (without audio functionality - this was also called the "NV1-V32", for 32-bit VRAM, in internal documentation, with Nvidia's being the NV1-D64). The card was designed to implement a reasonable level of 3D graphics functionality, as well as audio, public-key encryption for DRM purposes (which was never used, as it would have required the cooperation of software companies) and Sega Saturn game ports in a single megabyte of memory (memory cost $50 a megabyte when the initial design of the NV1 chip began in 1993). In order to achieve this, many techniques had to be used that ultimately compromised the quality of the 3D rendering of the card, such as using forward texture mapping, where a texel (a pixel of the texture) is directly mapped to a point on the screen, instead of the more traditional inverse texture mapping, which iterates through screen pixels and looks up the corresponding texels. While this has memory space advantages (as you can cache the texture in the very limited amount of VRAM Nvidia had to work with very easily), it has many more disadvantages - firstly, this approach does not support UV mapping (a special coordinate system used to map textures to three-dimensional objects) and other aspects of what would today be considered basic graphical functionality. Additionally, the fundamental implementation of 3D rendering used quad patching instead of traditional triangle-based approaches - this has very advantageous implications for things like curved surfaces, and may have been a very effective design for the CAD/CAM customers purchasing higher-end 3D products. However, it turned out to not be particularly useful at all for the actual intended target market - gaming. There was also a total lack of SoundBlaster compatibility (required for audio to sound half-decent in many games) in the audio engine, and partially-emulated and very slow VGA compatibility, which led to slow performance in the games people *actually played*, unless your favourite game was a crappier, slower version of Descent, Virtua Cop or Daytona USA for some reason. Another body blow to Nvidia was received when Microsoft released Direct3D in 1996 with DirectX 2.0, which simultaneously used triangles, became the standard 3D API and killed all of the numerous non-OpenGL proprietary 3D APIs, including S3's S3D and later Metal, ATI's 3DCIF, and Nvidia's NVLIB. 
-The upshot of all of this was, despite its innovative silicon design, what can be understood as nothing less than the total failure of Nvidia to sell, or convince anyone to develop for in any way, the NV1. While Diamond Multimedia bought 250,000 chips to place into boards (marketed as the Diamond "Edge 3D" series), barely any of them sold, and those that did sell were often returned, leading to the chips themselves being returned to NVidia and hundreds of thousands of chips sitting simply unused in warehouses. Barely any NV1-capable software was released - the few pieces of software that were released came via a partnership with Sega, which I will elaborate on later - and most of it was forced to run under software emulators for Direct3D (or other APIs) written by Priem - which was only possible due to the software architecture Nvidia chose for their drivers - which were slow (slower and worse-looking than software), buggy, and extremely unappealing. Nvidia lost $6.4 million in 1995 on a revenue of $1.1 million, and $3 million on a revenue of $3.9 million in 1996; most of the capital that allowed Nvidia to continue operating were from the milestone payments from SGS-Thomson for developing the card, their NV2 contract with Sega (which wlll be explored later), and their venture capital funding - not the very few NV1 sales. The card reviewed poorly, had very little software and ultimately no sales, and despite various desperate efforts to revive it, including releasing the SDK for free (a proprietary API called NVLIB was used to develop games against the NV1) and straight up begging their customers on their website to spam developers with requests to develop NV1-compatible versions of games, the card was effectively dead within a year. +The upshot of all of this was, despite its innovative silicon design, what can be understood as nothing less than the total failure of Nvidia to sell, or convince anyone to develop for in any way, the NV1. While Diamond Multimedia bought 250,000 chips to place into boards (marketed as the Diamond "Edge 3D" series), and at least several other companies manufactured boards, barely any of them sold, and those that did sell were often returned, leading to the chips themselves being returned to NVidia and hundreds of thousands of chips sitting simply unused in warehouses. Barely any NV1-capable software was released - the few pieces of software that were released came via a partnership with Sega, which I will elaborate on later - and most of it was forced to run under software emulators for Direct3D (or other APIs) written by Priem (which was only possible due to the software architecture Nvidia chose for their drivers); the Direct3D emulation, due to the fundamental incompatibility with the NV1 hardware, was slower and worse-looking than software rendering, buggy, and generally extremely unappealing. Nvidia lost $6.4 million in 1995 on a revenue of $1.1 million, and $3 million on a revenue of $3.9 million in 1996; most of the capital that allowed Nvidia to continue operating came from the milestone payments from SGS-Thomson for developing the card, their NV2 contract with Sega (which will be explored later), and their venture capital funding - not the very few NV1 sales. 
The card reviewed poorly, had very little software and ultimately almost no sales; despite various desperate efforts to revive it, including releasing the SDK for free (a proprietary API called NVLIB was created to develop games against the NV1, although programming the hardware directly was also possible) and, by early 1996, straight up begging their customers on their website to spam developers with requests to develop NV1-compatible versions of games, the card was effectively dead within a year. ### The NV2 and the near destruction of the company Nevertheless, Nvidia (which by this time had close to a hundred employees, including sales and marketing teams), and especially its cofounders, remained confident in their architecture and overall prospects of success. They had managed to solidify a business relationship with Sega, to the point where they had initially won the contract to provide the graphics hardware for the successor to the Sega Saturn, at that time codenamed "V08". The GPU was codenamed "Mutara" (after the nebula critical to the plot in Star Trek II: The Wrath of Khan) and the overall architecture was the NV2. It maintained many of the functional characteristics of the NV1 and was essentially a more powerful successor to that card. -However, problems started to emerge almost immediately. Game developers, especially Sega's internal game developers, were not happy with having to use a GPU with such a heterodox design (for example, porting games to or from the PC, which Sega did do at the time, would be made far harder). This position was especially championed by Yu Suzuki, the head of one of Sega's most prestigious internal development teams Sega-AM2 (responsible, among others, for the Daytona USA, Virtua Racing, Virtua Fighter, and Shenmue series), who sent his best graphics programmer to interface with Nvidia and push for triangles. At this point, the story diverges - some tellings claim that Nvidia simply refused to accede to Sega's request and this severely damaged their relationship; others that the NV2 wasn't killed until it failed to produce any video during a demonstration, and Sega still paid Nvidia for developing it to prevent bankruptcy (it does appear that one engineer got it working for the sole purpose of receiving a milestone payment). At some point, Sega, as a traditional Japanese company, couldn't simply kill the deal, so officially, they relegated the NV2 to be used in the successor to the educational aimed-at-toddlers Sega Pico, while in reality, Sega of America had already been told to "not worry" about NVIDIA anymore. Nvidia got the hint, and the NV2 was cancelled. A bonus fact that isn't really applicable anywhere is that the NV2 appears to have been manufactured by the then-just founded Helios Semiconductor, based on available sources, the only Nvidia card to have been manufactured by them. +However, problems started to emerge almost immediately. Game developers, especially Sega's internal game developers, were not happy with having to use a GPU with such a heterodox design (for example, porting games to or from the PC, which Sega did do at the time, would be made far harder). 
This position was especially championed by Yu Suzuki, the head of one of Sega's most prestigious internal development teams, Sega-AM2 (responsible, among others, for the Daytona USA, Virtua Racing, Virtua Fighter, and Shenmue series), who sent his best graphics programmer to interface with Nvidia and push for the rendering method of the GPU to be changed to a more traditional triangle-based approach. At this point, the story diverges - some tellings claim that Nvidia simply refused to accede to Sega's request and this severely damaged their relationship; others that the NV2 wasn't killed until it failed to produce any video during a demonstration, and Sega still paid Nvidia for developing it to prevent bankruptcy (it does appear that one engineer got it working for the sole purpose of receiving a milestone payment). At some point, Sega, as a traditional Japanese company, couldn't simply kill the deal, so officially, they relegated the NV2 to be used in the successor to the educational, aimed-at-toddlers Sega Pico console, while in reality, Sega of America had already been told to "not worry" about NVIDIA anymore. Nvidia got the hint, and the NV2 was cancelled. A bonus fact that isn't really applicable anywhere is that the NV2 appears to have been manufactured by the then-just-founded Helios Semiconductor - based on available sources, the only Nvidia chip they ever manufactured. -At this point, Nvidia had no sales, no customers, and barely any money (at some point in late 1996, Nvidia had $3 million - and was burning through $330,000 a month). Most of the NV2 team had been redeployed to the next generation - the NV3. No venture capital funding was going to be forthcoming due to the failure to actually create any products people wanted to buy, at least not without extremely unfavourable terms on things like ownership. It was effectively almost a complete failure and a waste of years of the employees time. By the end of 1996, things had gotten infinitely worse - the competition was heating up. Despite NV1 being the first texture-mapped consumer GPU ever released, they had been fundamentally outclassed by their competition. It was a one-two punch: initially, Rendition, founded around the same time as Nvidia in 1993 released its custom RISC architecture V1000 chip. While not particularly fast, this was, for a few months, the only card that could run Quake (the hottest game of 1996) in hardware accelerated mode and was an early market leader (as well as S3's laughably bad ViRGE - Video and Rendering Graphics Engine - which at launch was slower than software on high-end CPUs, and was reserved for OEM bargain-bin disaster machines). +At this point, Nvidia had no sales, no customers, and barely any money (at some point in late 1996, Nvidia had $3 million in the bank - and was burning through $330,000 a month). Most of the NV2 team had been redeployed to the next generation - the NV3. No venture capital funding was going to be forthcoming due to the failure to actually create any products people wanted to buy, at least not without extremely unfavourable terms on things like ownership. It was effectively almost a complete failure and a waste of years of the employees' time. By the end of 1996, things had gotten infinitely worse - the competition was heating up. Despite the NV1 being the first texture-mapped consumer GPU ever released, they had been fundamentally outclassed by their competition. 
It was a one-two punch: initially, Rendition, founded around the same time as Nvidia in 1993, released its custom RISC-architecture V1000 chip. While not particularly fast, this was, for a few months, the only card that could run Quake (the hottest game of 1996) in hardware-accelerated mode and was an early market leader (as well as S3's laughably bad ViRGE - Video and Rendering Graphics Engine - which at launch was slower than software rendering on high-end CPUs, and was reserved for OEM bargain-bin disaster machines). However, this was nothing compared to the body blow about to hit the entire industry, Nvidia included. At a conference in early 1996, an $80,000 SiliconGraphics (then the world leader in accelerated graphics) machine crashed during a demo by the then-CEO Ed McCracken. While they were rebooting the machine, if accounts of the event are to be believed, the people at the event started leaving, many of whom, based on rumours that they had heard, headed downstairs to another demo by a then-tiny company made up of ex-SGI employees calling itself "3D/fx" (later shortened to 3dfx), which was claiming comparable graphics quality for $250...and had demos to prove it. In many of the cases of supposed "wonder innovations" in the tech industry, it turns out to be too good to be true, but when their card, the "Voodoo Graphics", was first released in the form of the "Righteous 3D" by Orchid in October 1996, it turned out to be true. Despite the fact that it was a 3D-only card and required a 2D card to be installed, and the fact it could not accelerate graphics in a window (which almost all other 3D cards could do), the card's performance was so high relative to the other efforts (including the NV1) that it not only had rave reviews on its own but kicked off a revolution in consumer 3D graphics, which especially caught fire when GLQuake was released in January 1997. @@ -52,7 +52,7 @@ The reasons that 3dfx was able to design such an effective GPU when all others f Effectively, Nvidia had to design a graphics architecture that could at the very least get close to 3dfx's performance, on a shoestring budget, with very few resources (60% of the staff, including the entire sales and marketing teams, having been laid off to preserve money). Since they did not have the time, they could not completely redesign the NV1 from scratch even if they felt the need to do this (that would take two years - time that Nvidia simply didn't have - and any design that came out of this effort would be immediately obsoleted by other companies, such as 3dfx's Voodoo line and ATI with its initially rather pointless, but rapidly advancing in performance and driver stability, Rage series of chips); on top of that, the chip would have to work reasonably well on the first tapeout, as they simply did not have the capital to produce more revisions of the chip. The fact they were able to achieve a successful design in the form of the NV3 under such conditions was testament to the intelligence, skill and luck of Nvidia's designers. ### The NV3 (Riva 128) -It was with these financial, competitive and time constraints in mind that design on the NV3, which would eventually be commercialised as the RIVA 128 ("Real-time Interactive Video and Animation accelerator", with the 128 owing to its at-the-time very large 128-bit size of its internal bus), began in 1996. 
Nvidia retained SGS-Thomson (soon to be renamed to STMicroelectronics, which is the name it is still under today) as the manufacturing partner, in return for SGS-Thomson cancelling their rival GPU - the STG-3001. In a similar vein to the NV1, Nvidia was initially going to sell the card as the "NV3" with inbuilt audio functionality and SGS-Thomson was going to white-label the chip as the SGS-Thomson STG-3000 without audio functionality - it seems, based on the original contract language, which for some reason is [**only available on a website for example contracts, where it has been around since 2004**](https://contracts.onecle.com/nvidia/sgs.collab.1993.11.10.shtml), but appears to have originated from a filing with the U.S. Securities and Exchange Commission, based on the format and references to "the Commission", that Nvidia convinced them to cancel their own GPU, the STG-3001, and manufacture the NV3 instead - which would prove to be a terrible decision for STMicro when Nvidia dropped them and moved to TSMC for the RIVA 128ZX due to yield issues, and the fact that Nvidia's venture capital funders were pressuring them to move to TSMC. STMicro manufactured PowerVR cards for a few more years, but they had dropped out of the market entirely by 2001. +It was with these financial, competitive and time constraints in mind that design began in 1996 on the NV3, which would eventually be commercialised as the RIVA 128 ("Real-time Interactive Video and Animation accelerator", the 128 referring to the at-the-time very large 128-bit width of its internal bus). Nvidia retained SGS-Thomson (soon to be renamed to STMicroelectronics, which is the name it is still under today) as the manufacturing partner, in return for SGS-Thomson cancelling their rival GPU - the STG-3001. In a similar vein to the NV1, Nvidia was initially going to sell the card as the "NV3" with inbuilt audio functionality, while SGS-Thomson was going to white-label the chip as the SGS-Thomson STG-3000 without audio functionality. This arrangement seems to be spelled out in the original contract language - which for some reason is [**only available on a website for example contracts, where it has been around since 2004**](https://contracts.onecle.com/nvidia/sgs.collab.1993.11.10.shtml), but appears to have originated from a filing with the U.S. Securities and Exchange Commission, judging by the format and references to "the Commission" - with Nvidia apparently convincing SGS-Thomson to drop the STG-3001 in favour of manufacturing the NV3. This would prove to be a terrible decision for STMicro when Nvidia dropped them and moved to TSMC for the RIVA 128ZX, both due to yield issues and because Nvidia's venture capital funders were pressuring them to make the move. STMicro manufactured PowerVR Midas-3 and Kyro series cards for a few more years, but they had dropped out of the market entirely by 2001, with the Kyro 3 being cancelled. After the NV2 disaster, the company made several calls on the NV3's design that turned out to be very good decisions. First, they acquiesced to Sega's advice (which they might have done already, but too late, to save the Mutara V08/NV2) and moved to an inverse texture mapping, triangle-based model (although some remnants of the original quad patching design remain) and removed the never-used DRM functionality from the card. 
This may have been assisted by the replacement of Curtis Priem with the rather egg-shaped David Kirk, perhaps notable as a "Special Thanks" credit on Gex and the producer of the truly unparalleled *3D Baseball* on the Sega Saturn during his time at Crystal Dynamics, as chief designer - Priem insisted on including the DRM functionality with the NV1, because back when he worked at Sun, the game he had written as a demo of the GX GPU designed by Malachowsky was regularly pirated. Another decision that turned out to pay very large dividends was deciding to forgo a native API entirely and instead build the card around accelerating the most popular graphical APIs - which led to an initial focus on Direct3D (although OpenGL drivers were first publicly released in alpha form in December 1997, and released fully in early 1998). Initially DirectX 3.0 was targeted, but 5.0 came out late during the development of the chip (4.0 was cancelled due to [**lack of developer interest in its new functionality**](https://devblogs.microsoft.com/oldnewthing/20040122-00/?p=40963)) and the chip is mostly Direct3D 5.0 compliant (with the exception of some blending modes such as additive blending, which Jensen Huang later claimed was due to Microsoft not giving them the specification in time), which was made much easier by the design of their driver (which allowed, and still allows, graphical APIs to be plugged in as "clients" to the Resource Manager kernel - as I mentioned earlier, this will be explained in full detail later). The VGA core (which was so separate from the main GPU on the NV1 that it had its own PCI ID) was replaced by a VGA core licensed from Weitek (who would soon exit the graphics market), placed in the chip parallel to the main GPU with its own 32-bit bus; this massively accelerated performance in unaccelerated VESA titles, like Doom - and provided a real advantage over the 3D-only 3dfx cards (3dfx did have a combination card, the SST-96 or Voodoo Rush, but it used a crappy Alliance 2D chip and was generally considered a failure). Finally, Huang, in his capacity as the CEO, allowed the chip to be expanded (in terms of physical size and number of gates) from its original specification, allowing for a more complex design with more features. @@ -98,7 +98,7 @@ The RIVA 128 is not dependent on the host clock of the machine that it is insert The RAMDAC in the card, which handles final conversion of the digital image generated by the GPU into an analog video signal and clock generation (via three phase-locked loops), has its own clock (ACLK) that ran at around 200 MHz in the RIVA 128 (revision A/B) and 260 MHz in the revision C (RIVA 128 ZX) cards. It was not configurable by OEM manufacturers, unlike the other clocks. -Generally, most manufacturers set the memory clock at around 100 megahertz and the pixel clock at around 40 Megahertz. +Generally, most manufacturers set the memory clock at around 100 MHz and the pixel clock at around 40 MHz, although in some cases the drivers seem to downclock this. ### Memory Mapping Before we can discuss any part of how the RIVA 128 works, we must explain the memory architecture, since this is a fundamental requirement to even access the graphics card's registers in the first place. Nvidia picked a fairly strange memory mapping architecture, at least for cards of that time. The exact setup of the memory mapping changed numerous times as Nvidia's architecture evolved, so only the NV3-based GPUs will be analysed. 
@@ -178,7 +178,7 @@ or in the form of bitwise math - code is from my in progress RIVA 128 emulatino I'm not entirely sure why they did this, but I assume it was for providing a more convenient interface to the user and for general efficiency reasons. #### Interrupts -Any graphics card worth its salt needs an interrupt system. So a REALLY good one must have two completely different systems for notifying other parts of the GPU about events, right? There is a traditional interrupt system, with both software and hardware support (indicated by bit 31 of the interrupt status register) controlled by a register in `PMC` that turns on and off interrupts for different components of the GPU. Each component of the GPU also allows individual interrupts to be turned on or off, and has its own interrupt status register. Each component (including the removed-in-revision-B `PAUDIO` for some reason) is represented by a bit in the `PMC` interrupt status register. If the interrupt status register of a component, ANDED with the interrupt status register, is 1, an interrupt is declared to be pending (with some minor exceptions that will be explained in later parts) and a PCI/AGP IRQ is sent. The interrupt registers are set up such that, when they are viewed in hexadecimal, an enabled interrupt appears as a 1 and a disabled interrupt as a 0. Interrupts can be turned off GPU-wide (or for one of just hardware or software) via the `PMC_INTR_EN` register (at `0x0140`) +Any graphics card worth its salt needs an interrupt system. So a REALLY good one must have two completely different systems for notifying other parts of the GPU about events, right? There is a traditional interrupt system, with both software and hardware support (indicated by bit 31 of the interrupt status register) controlled by a register in `PMC` that turns on and off interrupts for different components of the GPU. Each component of the GPU also allows individual interrupts to be turned on or off, and has its own interrupt status register. Each component (including the removed-in-revision-B `PAUDIO` for some reason) is represented by a bit in the `PMC` interrupt status register. If the interrupt status register of a component, bitwise ANDed with its interrupt enable register, is nonzero, an interrupt is declared to be pending (with some minor exceptions that will be explained in later parts) and a PCI/AGP IRQ is sent. The interrupt registers are set up such that, when they are viewed in hexadecimal, an enabled interrupt appears as a 1 and a disabled interrupt as a 0. Interrupts can be turned off GPU-wide (or for one of just hardware or software) via the `PMC_INTR_EN` register (at `0x0140`). This allows an interrupt to be implemented as: @@ -186,6 +186,14 @@ This allows an interrupt to be implemented as: ``` .interrupt_status |= (1 << interrupt_number) ``` +and the interrupt status register is updated by writing: + +``` +.interrupt_status &= ~value; +``` + +where `value` is the value written to the status register by the driver - each bit set to 1 acknowledges (clears) the corresponding pending interrupt, and bits set to 0 are left untouched. + #### Programmable Interrupt Timer: PTIMER Time-sensitive functions are provided by a nice, simple (except for the fact that, for some strange reason, the counter is 56-bit, split into two 32-bit registers `PTIMER_TIME0`, of which only bits 31 through 5 are meaningful, and `PTIMER_TIME1`...which has bits 28 through 0 meaningful instead?) programmable interval timer that fires an interrupt whenever the threshold value (set by the `PTIMER_ALARM` register, in nanoseconds) is exceeded. 
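+To make the pieces above concrete, here is a minimal sketch - deliberately not taken from any real driver or emulator, and with every register name, bit position and field beyond the ones already named in the text being an assumption purely for illustration - of how an emulated PTIMER alarm might feed the two-level status/enable scheme and end up asserting the PCI/AGP IRQ:
+
+```
+#include <stdint.h>
+#include <stdbool.h>
+
+/* All bit positions here are assumptions for illustration only. */
+#define PTIMER_INTR_ALARM (1u << 0)  /* alarm bit in PTIMER's own interrupt status       */
+#define PMC_INTR_PTIMER   (1u << 20) /* PTIMER's bit in the PMC-level status register    */
+#define PMC_INTR_EN_HW    (1u << 0)  /* "hardware interrupts enabled" bit in PMC_INTR_EN */
+
+typedef struct nv3 {
+    uint64_t ptimer_time;    /* simplified stand-in for PTIMER_TIME0/TIME1 (56 bits, ns) */
+    uint64_t ptimer_alarm;   /* PTIMER_ALARM threshold, in nanoseconds                   */
+    uint32_t ptimer_intr;    /* PTIMER interrupt status                                  */
+    uint32_t ptimer_intr_en; /* PTIMER interrupt enable                                  */
+    uint32_t pmc_intr;       /* PMC-level status - one bit per subsystem                 */
+    uint32_t pmc_intr_en;    /* PMC_INTR_EN (0x0140)                                     */
+    bool     irq_line;       /* state of the emulated PCI/AGP IRQ line                   */
+} nv3_t;
+
+/* Re-derive the PMC-level status from each subsystem's "status AND enable",
+   then decide whether the card should be asserting its IRQ at all.          */
+static void nv3_update_irq(nv3_t *nv)
+{
+    if (nv->ptimer_intr & nv->ptimer_intr_en)
+        nv->pmc_intr |= PMC_INTR_PTIMER;
+    else
+        nv->pmc_intr &= ~PMC_INTR_PTIMER;
+    /* ...repeat the same pattern for PFIFO, PGRAPH, PVIDEO and friends... */
+
+    nv->irq_line = (nv->pmc_intr != 0) && (nv->pmc_intr_en & PMC_INTR_EN_HW);
+}
+
+/* Called as emulated time advances; fires the alarm once the threshold passes. */
+static void nv3_ptimer_tick(nv3_t *nv, uint64_t now_ns)
+{
+    nv->ptimer_time = now_ns & ((1ULL << 56) - 1);  /* keep only the 56-bit counter          */
+    if (nv->ptimer_time >= nv->ptimer_alarm) {
+        nv->ptimer_intr |= PTIMER_INTR_ALARM;       /* driver acknowledges with a write of 1 */
+        nv3_update_irq(nv);
+    }
+}
+```
+
+The point of the sketch is the shape of the logic rather than the exact bits: each subsystem computes "status AND enable", the result rolls up into one bit of the `PMC` status register, and `PMC_INTR_EN` decides whether any of that actually reaches the bus as an IRQ.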
This is how the drivers internally keep track of many actions that they need to perform and is the first functional block you need to get right if you ever hope to emulate the RIVA 128. @@ -283,7 +291,7 @@ The same as `0x08` (point), but the zeta and alpha buffer can be applied to it t Any values not listed are invalid. In theory, since there are 5 bits in the FIFO object context reserved for classes, there can be up to 32 classes, but Nvidia did not implement 32 classes and moved to a different approach (one where the classes are somewhat more constructed in software) with the NV4 architecture. -These graphics objects are then sent (via one of two methods - Parallel I/O, which is basically DMA but only using Channel 0(?) and slower, or using the full DMA engine) to one of two caches within the `PFIFO` subsystem, the single-entry `CACHE0` (which is really intended for the aforementioned notifier engine to be able to inject graphics commands) or the multi-entry (32 on revision A or B cards; 64 on revision C or higher) `CACHE1`. These effectively - a full exploration of what these critical components actually do will be later parts of this - just store object names and contexts as they are waiting to be sent to `RAMIN`; a "pusher" pushes them in from the bus and a "puller" pulls them out of the bus and sends them where they need to be inside of the VRAM (or if they are invalid, to `RAMRO`). Once they are pulled out, the GPU will simply manipulate the various registers in the `PGRAPH` subsystem in order to draw the object. Objects do not "disappear" on frame refresh - in fact, it would simply appear that they are simply drawn over. Most likely, any renderer will simply clear the entire screen - e.g. with a Rectangle object, before resubmitting any graphics objects that they need to render. +These graphics objects are then sent (via one of two methods - Parallel I/O, which is basically DMA but only using Channel 0(?) and slower, or using the full DMA engine) to one of two caches within the `PFIFO` subsystem, the single-entry `CACHE0` (which is really intended for the aforementioned notifier engine to be able to inject graphics commands) or the multi-entry (32 on revision A or B cards; 64 on revision C or higher) `CACHE1`. These effectively - a full exploration of what these critical components actually do will come in later parts of this series - just store object names and contexts as they are waiting to be sent to `RAMIN`; initially, objects are submitted via the `NV_USER` area, with a "pusher" pushing them in from the bus and a "puller" pulling them out and sending them where they need to be inside of the VRAM (or if they are invalid, to `RAMRO`). Once they are pulled out, the GPU will simply manipulate the various registers in the `PGRAPH` subsystem in order to draw the object and do DMA transfers using the DMA subsystem to pull required resources as needed. Objects do not "disappear" on frame refresh - in fact, it would appear that they are simply drawn over. Most likely, any renderer will simply clear the entire screen - e.g. with a Rectangle object - before resubmitting any graphics objects that they need to render. Objects are connected together with a special type of object called a "patchcord" (a name leftover from the old NV1 quad patching days). @@ -295,7 +303,7 @@ We already covered RAMFC and RAMAU. 
But there is another important structure sto If the GPU detects either that the cache ran out during submission, that the cache was turned off, or any kind of illegal access that it doesn't like, your graphics object submission will not be processed, but will instead be sent to a special area of `RAMIN` known as `RAMRO` (which is always half the size of `RAMHT`) that will store the object, what went wrong, and whether you were trying to write or read when it happened, and will report an error by firing an interrupt (the `PFIFO_RUNOUT_STATUS` register also holds the current state of the `RAMRO` region, and whether any errors occurred) so that any drivers running on the system can catch the error and (hopefully) correct it. #### RAMAU -Not really sure what this is for but I assume it's a spare area for random stuff. +This area of RAM was used in NV1 and revision A NV3 cards to store audio data being streamed from the CPU or RAM. In revision B cards and later, it still exists, but has absolutely no functionality. #### Interrupts 2.0: Notifiers However, some people at Nvidia decided that they were too cool for interrupts. Why have an interrupt that tells the GPU to do something, when *you could have an interrupt that has the GPU tell the drivers to do something*? So they implemented the incredible "notifier" system. It appears to have been implemented to allow the drivers to manage the GPU resources when the silicon could not implement them. Every single subsystem in the GPU has a notifier enable register alongside its interrupt enable register (some have multiple different notifier enable registers for different types of notifiers!). Notifiers appear to be intended to work with the object class system (although they may also exist within GPU subsystems, they mostly exist within `PGRAPH`, `PME` and `PVIDEO`) and are actually different *per class of object* - with each object having a set of "notification parameters" that can be used to trigger a notification and are triggered by the `SetNotify` method at `0x104` within an object when it is stored inside of RAMHT. There is also the `SetNotifyCtxDma` method, usually but not always at `0x0`, which is used for the aforementioned context switching. Notifiers appear to be "requested" until the GPU processes them, and PGRAPH can take up to 16 software and 1 hardware notifier type.
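+As a closing illustration of the error path described above, here is a rough sketch of how an emulator might park a rejected submission in `RAMRO` and raise the runout interrupt. Only `RAMRO`, `RAMHT`, `PFIFO` and `PFIFO_RUNOUT_STATUS` come from the text; the entry layout, bit positions and helper names are assumptions made up for the example:
+
+```
+#include <stdint.h>
+#include <string.h>
+#include <stdbool.h>
+
+/* Illustrative layout only - the text above just says RAMRO records what was
+   submitted, whether it was a read or a write, and what went wrong.          */
+typedef struct ramro_entry {
+    uint32_t address;  /* where in the FIFO submission area the access landed */
+    uint32_t data;     /* the value the driver tried to submit                */
+    uint32_t is_write; /* 1 = write, 0 = read                                 */
+    uint32_t reason;   /* one of the runout reasons below                     */
+} ramro_entry_t;
+
+enum { RUNOUT_CACHE_RAN_OUT, RUNOUT_CACHE_OFF, RUNOUT_ILLEGAL_ACCESS };
+
+#define PFIFO_INTR_RUNOUT (1u << 4)   /* assumed bit position */
+
+typedef struct pfifo {
+    uint8_t  ramin[1 << 20];          /* emulated instance memory (RAMIN)           */
+    uint32_t ramro_base, ramro_size;  /* RAMRO lives in RAMIN, half of RAMHT's size */
+    uint32_t ramro_put;               /* next free slot inside RAMRO                */
+    uint32_t runout_status;           /* models PFIFO_RUNOUT_STATUS                 */
+    uint32_t intr;                    /* PFIFO interrupt status                     */
+} pfifo_t;
+
+extern void pfifo_raise_irq(void);    /* stand-in: bubbles up to PMC / the PCI IRQ line */
+
+/* Instead of letting the puller hand the object to PGRAPH, park the failed
+   submission in RAMRO, note the error and let the driver recover via the IRQ. */
+static void pfifo_runout(pfifo_t *f, uint32_t addr, uint32_t data,
+                         bool is_write, uint32_t reason)
+{
+    ramro_entry_t e = { addr, data, is_write ? 1u : 0u, reason };
+
+    memcpy(&f->ramin[f->ramro_base + f->ramro_put], &e, sizeof(e));
+    f->ramro_put = (f->ramro_put + (uint32_t)sizeof(e)) % f->ramro_size;
+
+    f->runout_status |= 1u;           /* "an error occurred" (assumed encoding) */
+    f->intr |= PFIFO_INTR_RUNOUT;
+    pfifo_raise_irq();
+}
+```
+
+The driver would then read the entries back out of `RAMRO`, work out what it did wrong, and acknowledge the interrupt with the same write-1-to-clear scheme described earlier.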