Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 0/7] Pre-requisite patches for mm and irq in nova-core
These patches are prerequisites for nova-core as support is added for memory management and interrupt handling. I rebased them on drm-rust-next and would like them to be considered for the next merge window. I also included a simple Rust documentation patch fixing some issues I noticed while reading it :).
The series adds support for the PRAMIN aperture mechanism, which is required to bootstrap virtual memory (we cannot write to VRAM using virtual memory because we need to write page tables to VRAM in the first place).
I also add page table related structures (mm/types.rs) using the bitfield macro, which will be used for page table walking, memory mapping, etc. This code is currently unused: without physical memory allocation (using the buddy allocator), page table pages cannot be allocated in the first place. However, I have included several examples in the file showing how these structures will be used. I have also simplified the code, keeping future additions to it for later.
For interrupts, I have only added support for GSP's message queue interrupt. I am working on adding support for the interrupt controller module (VFN), which is the next thing for me to post after this series. I have it prototyped and working; however, I am currently making several changes to it related to virtual functions. For now in this series, I just want to get the GSP-specific patch out of the way, hence I am including it here.
I have also added a patch for the bitfield macro which constructs a bitfield struct given its storage type. This is used in a later GSP interrupt patch in the series to read from one register and write to another.
Joel Fernandes (7):
  docs: rust: Fix a few grammatical errors
  gpu: nova-core: Add support to convert bitfield to underlying type
  docs: gpu: nova-core: Document GSP RPC message queue architecture
  docs: gpu: nova-core: Document the PRAMIN aperture mechanism
  gpu: nova-core: Add support for managing GSP falcon interrupts
  nova-core: mm: Add support to use PRAMIN windows to write to VRAM
  nova-core: mm: Add data structures for page table management
 Documentation/gpu/nova/core/msgq.rst     | 159 +++++++++
 Documentation/gpu/nova/core/pramin.rst   | 113 +++++++
 Documentation/gpu/nova/index.rst         |   2 +
 Documentation/rust/coding-guidelines.rst |   4 +-
 drivers/gpu/nova-core/bitfield.rs        |   7 +
 drivers/gpu/nova-core/falcon/gsp.rs      |  26 +-
 drivers/gpu/nova-core/gpu.rs             |   2 +-
 drivers/gpu/nova-core/mm/mod.rs          |   4 +
 drivers/gpu/nova-core/mm/pramin.rs       | 241 ++++++++++++++
 drivers/gpu/nova-core/mm/types.rs        | 405 +++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs       |   1 +
 drivers/gpu/nova-core/regs.rs            |  39 ++-
 12 files changed, 996 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/gpu/nova/core/msgq.rst
 create mode 100644 Documentation/gpu/nova/core/pramin.rst
 create mode 100644 drivers/gpu/nova-core/mm/mod.rs
 create mode 100644 drivers/gpu/nova-core/mm/pramin.rs
 create mode 100644 drivers/gpu/nova-core/mm/types.rs
-- 
2.34.1
Fix two grammatical errors in the Rust coding guidelines document.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 Documentation/rust/coding-guidelines.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/rust/coding-guidelines.rst b/Documentation/rust/coding-guidelines.rst
index 6ff9e754755d..d556f0db042b 100644
--- a/Documentation/rust/coding-guidelines.rst
+++ b/Documentation/rust/coding-guidelines.rst
@@ -97,7 +97,7 @@ should still be used. For instance:
 	// TODO: ...
 	fn f() {}
 
-One special kind of comments are the ``// SAFETY:`` comments. These must appear
+One special kind of comment is the ``// SAFETY:`` comment. These must appear
 before every ``unsafe`` block, and they explain why the code inside the block is
 correct/sound, i.e. why it cannot trigger undefined behavior in any case, e.g.:
 
@@ -166,7 +166,7 @@ in the kernel:
 - While not shown here, if a function may panic, the conditions under which
   that happens must be described under a ``# Panics`` section.
 
-  Please note that panicking should be very rare and used only with a good
+  Please note that panicking should be very rare and used only for a good
   reason. In almost all cases, a fallible approach should be used, typically
   returning a ``Result``.
 
-- 
2.34.1
Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 2/7] gpu: nova-core: Add support to convert bitfield to underlying type
To support the use case where we read a register and write to another
with identical bit layout, add support to convert bitfield to underlying type.
Another way to do this is to read individual fields on the caller
side and write to the destination fields, but that is both cumbersome
and error-prone as new bits added in hardware may be missed.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 drivers/gpu/nova-core/bitfield.rs | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/nova-core/bitfield.rs b/drivers/gpu/nova-core/bitfield.rs
index 0994505393dd..2266abc3f7ab 100644
--- a/drivers/gpu/nova-core/bitfield.rs
+++ b/drivers/gpu/nova-core/bitfield.rs
@@ -72,6 +72,7 @@
 /// - Field setters: `set_mode()`, `set_state()`, etc. (supports chaining with builder pattern).
 ///   Note that the compiler will error out if the size of the setter's arg exceeds the
 ///   struct's storage size.
+/// - Conversion from the underlying storage type (e.g., `From<u32>`).
 /// - Debug and Default implementations.
 ///
 /// Note: Field accessors and setters inherit the same visibility as the struct itself.
@@ -117,6 +118,12 @@ fn from(val: $name) -> $storage {
             }
         }
 
+        impl ::core::convert::From<$storage> for $name {
+            fn from(val: $storage) -> $name {
+                $name(val)
+            }
+        }
+
         bitfield!(@fields_dispatcher $vis $name $storage { $($fields)* });
     };
 
-- 
2.34.1
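To illustrate the conversion added in patch 2/7, here is a minimal standalone sketch of the round trip it enables: read one register, convert it to the raw storage value, and build a second register type with the same bit layout. The `IrqStat` and `IrqClear` types and the `clear_mask_for()` helper are hypothetical hand-written stand-ins for what the `bitfield!` macro would generate, not the nova-core definitions; only the bit positions (4 for halt, 6 for swgen0) are taken from this series.

```rust
/// Hypothetical stand-in for a status register (not the nova-core type).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct IrqStat(u32);

/// Hypothetical stand-in for a clear register with the same bit layout.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct IrqClear(u32);

// Bitfield -> storage type (this direction already existed before the patch).
impl From<IrqStat> for u32 {
    fn from(val: IrqStat) -> u32 {
        val.0
    }
}

impl From<IrqClear> for u32 {
    fn from(val: IrqClear) -> u32 {
        val.0
    }
}

// Storage type -> bitfield (the direction the patch adds to the macro).
impl From<u32> for IrqStat {
    fn from(val: u32) -> Self {
        IrqStat(val)
    }
}

impl From<u32> for IrqClear {
    fn from(val: u32) -> Self {
        IrqClear(val)
    }
}

impl IrqClear {
    /// Bit 4, as in the register definitions in this series.
    pub fn halt(self) -> bool {
        self.0 & (1 << 4) != 0
    }

    /// Bit 6, as in the register definitions in this series.
    pub fn swgen0(self) -> bool {
        self.0 & (1 << 6) != 0
    }
}

/// Copies every bit of the status value into the clear register type,
/// including bits that have no named field.
pub fn clear_mask_for(status: IrqStat) -> IrqClear {
    IrqClear::from(u32::from(status))
}
```

Because the raw value is carried over as a whole, a bit that hardware adds later is copied without any change on the caller side, which is the motivation given in the commit message.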
Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 3/7] docs: gpu: nova-core: Document GSP RPC message queue architecture
Document the GSP RPC message queue architecture in detail.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 Documentation/gpu/nova/core/msgq.rst | 159 +++++++++++++++++++++++++++
 Documentation/gpu/nova/index.rst     |   1 +
 2 files changed, 160 insertions(+)
 create mode 100644 Documentation/gpu/nova/core/msgq.rst
diff --git a/Documentation/gpu/nova/core/msgq.rst b/Documentation/gpu/nova/core/msgq.rst
new file mode 100644
index 000000000000..84e25be69cd6
--- /dev/null
+++ b/Documentation/gpu/nova/core/msgq.rst
@@ -0,0 +1,159 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Nova GPU RPC Message Passing Architecture
+=========================================
+
+.. note::
+   The following description is approximate and current as of the Ampere family.
+   It may change for future generations and is intended to assist in understanding
+   the driver code.
+
+Overview
+========
+
+The Nova GPU driver communicates with the GSP (GPU System Processor) firmware
+using an RPC (Remote Procedure Call) mechanism built on top of circular message
+queues in shared memory. This document describes the structure of RPC messages
+and the mechanics of the message passing system.
+
+Message Queue Architecture
+==========================
+
+The communication between CPU and GSP uses two unidirectional circular queues:
+
+1. **CPU Queue (cpuq)**: CPU writes, GSP reads
+2. **GSP Queue (gspq)**: GSP writes, CPU reads
+
+The advantage of this approach is that no synchronization is required to access
+the queues: if one entity wants to communicate with the other (CPU or GSP), it
+simply writes into its own queue.
+
+Memory Layout
+-------------
+
+The shared memory region (GspMem) where the queues reside has the following
+layout::
+
+    +------------------------+ GspMem DMA Handle (base address)
+    |    PTE Array (4KB)     |  <- Self-mapping page table
+    | PTE[0] = base + 0x0000 |     Points to this page
+    | PTE[1] = base + 0x1000 |     Points to CPU queue Header page
+    | PTE[2] = base + 0x2000 |     Points to first page of CPU queue data
+    | ...                    |     ...
+    | ...                    |     ...
+    +------------------------+ base + 0x1000
+    |    CPU Queue Header    |  MsgqTxHeader + MsgqRxHeader
+    |    - TX Header (32B)   |
+    |    - RX Header (4B)    | (1 page)
+    |    - Padding           |
+    +------------------------+ base + 0x2000
+    |    CPU Queue Data      | (63 pages)
+    |    (63 x 4KB pages)    |  Circular buffer for messages
+    | ...                    |     ...
+    +------------------------+ base + 0x41000
+    |    GSP Queue Header    |  MsgqTxHeader + MsgqRxHeader
+    |    - TX Header (32B)   |
+    |    - RX Header (4B)    | (1 page)
+    |    - Padding           |
+    +------------------------+ base + 0x42000
+    |    GSP Queue Data      | (63 pages)
+    |    (63 x 4KB pages)    |  Circular buffer for messages
+    | ...                    |     ...
+    +------------------------+ base + 0x81000
+
+
+Message Passing Mechanics
+-------------------------
+The split read/write pointer design allows bidirectional communication between the
+CPU and GSP without synchronization (if it were a shared queue), for example, the
+following diagram illustrates pointer updates, when CPU sends message to GSP::
+
+    +--------------------------------------------------------------------------+
+    |                     DMA coherent Shared Memory (GspMem)                  |
+    +--------------------------------------------------------------------------+
+    |                          (CPU sending message to GSP)                    |
+    |  +-------------------+                      +-------------------+        |
+    |  |   GSP Queue       |                      |   CPU Queue       |        |
+    |  |                   |                      |                   |        |
+    |  | +-------------+   |                      | +-------------+   |        |
+    |  | |  TX Header  |   |                      | |  TX Header  |   |        |
+    |  | | write_ptr   |   |                      | | write_ptr   |---+----,   |
+    |  | |             |   |                      | |             |   |    |   |
+    |  | +-------------+   |                      | +-------------+   |    |   |
+    |  |                   |                      |                   |    |   |
+    |  | +-------------+   |                      | +-------------+   |    |   |
+    |  | |  RX Header  |   |                      | |  RX Header  |   |    |   |
+    |  | |  read_ptr ------+-------,              | |  read_ptr   |   |    |   |
+    |  | |             |   |       |              | |             |   |    |   |
+    |  | +-------------+   |       |              | +-------------+   |    |   |
+    |  |                   |       |              |                   |    |   |
+    |  | +-------------+   |       |              | +-------------+   |    |   |
+    |  | |   Page 0    |   |       |              | |   Page 0    |   |    |   |
+    |  | +-------------+   |       |              | +-------------+   |    |   |
+    |  | |   Page 1    |   |       `--------------> |   Page 1    |   |    |   |
+    |  | +-------------+   |                      | +-------------+   |    |   |
+    |  | |   Page 2    |   |                      | |   Page 2    |<--+----'   |
+    |  | +-------------+   |                      | +-------------+   |        |
+    |  | |     ...     |   |                      | |     ...     |   |        |
+    |  | +-------------+   |                      | +-------------+   |        |
+    |  | |   Page 62   |   |                      | |   Page 62   |   |        |
+    |  | +-------------+   |                      | +-------------+   |        |
+    |  |   (63 pages)      |                      |   (63 pages)      |        |
+    |  +-------------------+                      +-------------------+        |
+    |                                                                          |
+    +--------------------------------------------------------------------------+
+
+When the CPU sends a message to the GSP, it writes the message to its own
+queue (CPU queue) and updates the write pointer in its queue's TX header. The GSP
+then reads the read pointer in its own queue's RX header and knows that there are
+pending messages from the CPU because its RX header's read pointer is behind the
+CPU's TX header's write pointer. After reading the message, the GSP updates its RX
+header's read pointer to catch up. The same happens in reverse.
+
+Page-based message passing
+--------------------------
+The message queue is page-based, which means that the message is stored in a
+page-aligned buffer. The page size is 4KB. Each message starts at the beginning of
+a page. If the message is shorter than a page, the remaining space in the page is
+wasted. The next message starts at the beginning of the next page no matter how
+small the previous message was.
+
+Note that messages larger than a page will span multiple pages. This means that
+it is possible that the first part of the message lands on the last page, and the
+second part of the message lands on the first page, thus requiring out-of-order
+memory access. The SBuffer data structure in Nova tackles this use case.
+
+RPC Message Structure:
+======================
+
+An RPC message is also called a "Message Element". The entire message has
+multiple headers. There is a "message element" header which handles message
+queue specific details and integrity, followed by a "RPC" header which handles
+the RPC protocol details::
+
+    +----------------------------------+
+    |        GspMsgHeader (64B)        | (aka, Message Element Header)
+    +----------------------------------+
+    | auth_tag_buffer[16]              | --+
+    | aad_buffer[16]                   |   |
+    | checksum        (u32)            |   +-- Security & Integrity
+    | sequence        (u32)            |   |
+    | elem_count      (u32)            |   |
+    | pad             (u32)            | --+
+    +----------------------------------+
+    |        GspRpcHeader (32B)        |
+    +----------------------------------+
+    | header_version  (0x03000000)     | --+
+    | signature       (0x43505256)     |   |
+    | length          (u32)            |   +-- RPC Protocol
+    | function        (u32)            |   |
+    | rpc_result      (u32)            |   |
+    | rpc_result_private (u32)         |   |
+    | sequence        (u32)            |   |
+    | cpu_rm_gfid     (u32)            | --+
+    +----------------------------------+
+    |                                  |
+    |        Payload (Variable)        | --- Function-specific data
+    |                                  |
+    +----------------------------------+
diff --git a/Documentation/gpu/nova/index.rst b/Documentation/gpu/nova/index.rst
index e39cb3163581..46302daace34 100644
--- a/Documentation/gpu/nova/index.rst
+++ b/Documentation/gpu/nova/index.rst
@@ -32,3 +32,4 @@ vGPU manager VFIO driver and the nova-drm driver.
    core/devinit
    core/fwsec
    core/falcon
+   core/msgq
-- 
2.34.1
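The page-based queue mechanics described in patch 3/7 can be sketched with a few standalone helpers. This is only an illustration under stated assumptions: the page size (4KB) and queue depth (63 pages) come from the document, while the function names and the choice to express read and write pointers as page indices are hypothetical and not taken from the nova-core code.

```rust
/// Page size and data pages per queue, as stated in the document.
pub const PAGE_SIZE: usize = 4096;
pub const QUEUE_PAGES: usize = 63;

/// Number of whole pages a message of `len` bytes occupies.
/// A message always starts on a page boundary, so the tail of the
/// last page is left unused.
pub fn pages_for(len: usize) -> usize {
    (len + PAGE_SIZE - 1) / PAGE_SIZE
}

/// Pages written by the producer but not yet consumed by the reader.
/// Assumption: both pointers are page indices in `0..QUEUE_PAGES`.
pub fn pending_pages(write_ptr: usize, read_ptr: usize) -> usize {
    (write_ptr + QUEUE_PAGES - read_ptr) % QUEUE_PAGES
}

/// Splits a message that starts at page `start` and spans `count` pages
/// into at most two contiguous `(first_page, page_count)` ranges.
/// The second range is empty unless the message wraps past the last page.
/// Assumption: `start < QUEUE_PAGES` and `count <= QUEUE_PAGES`.
pub fn split_ranges(start: usize, count: usize) -> [(usize, usize); 2] {
    let first_len = core::cmp::min(count, QUEUE_PAGES - start);
    let second_len = count - first_len;
    [(start, first_len), (0, second_len)]
}
```

The two-range result mirrors the wrap-around case mentioned in the text, where the first part of a message lands on the last page and the rest continues on the first page.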
Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 4/7] docs: gpu: nova-core: Document the PRAMIN aperture mechanism
While not terribly complicated, a little bit of documentation will help
augment the code for this very important mechanism.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 Documentation/gpu/nova/core/pramin.rst | 113 +++++++++++++++++++++++++
 Documentation/gpu/nova/index.rst       |   1 +
 2 files changed, 114 insertions(+)
 create mode 100644 Documentation/gpu/nova/core/pramin.rst
diff --git a/Documentation/gpu/nova/core/pramin.rst b/Documentation/gpu/nova/core/pramin.rst
new file mode 100644
index 000000000000..19615e504db9
--- /dev/null
+++ b/Documentation/gpu/nova/core/pramin.rst
@@ -0,0 +1,113 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+PRAMIN aperture mechanism
+=========================
+
+.. note::
+   The following description is approximate and current as of the Ampere family.
+   It may change for future generations and is intended to assist in understanding
+   the driver code.
+
+Introduction
+============
+
+PRAMIN is a hardware aperture mechanism that provides CPU access to GPU Video RAM (VRAM) before
+the GPU's Memory Management Unit (MMU) and page tables are initialized. This 1MB sliding window,
+located at a fixed offset within BAR0, is essential for setting up page tables and other critical
+GPU data structures without relying on the GPU's MMU.
+
+Architecture Overview
+=====================
+
+Logically, the PRAMIN aperture mechanism is implemented by the GPU's PBUS (PCIe Bus Controller Unit)
+and provides a CPU-accessible window into VRAM through the PCIe interface::
+
+    +-----------------+    PCIe     +------------------------------+
+    |      CPU        |<----------->|           GPU                |
+    +-----------------+             |                              |
+                                    |  +----------------------+    |
+                                    |  |       PBUS           |    |
+                                    |  |  (Bus Controller)    |    |
+                                    |  |                      |    |
+                                    |  |  +--------------.<------------ (window always starts at
+                                    |  |  |   PRAMIN     |    |    |     BAR0 + 0x700000)
+                                    |  |  |   Window     |    |    |
+                                    |  |  |   (1MB)      |    |    |
+                                    |  |  +--------------+    |    |
+                                    |  |         |            |    |
+                                    |  +---------|------------+    |
+                                    |            |                 |
+                                    |            v                 |
+                                    |  .----------------------.<------------ (Program PRAMIN to any
+                                    |  |       VRAM           |    |    64KB VRAM physical boundary)
+                                    |  |    (Several GBs)     |    |
+                                    |  |                      |    |
+                                    |  |  FB[0x000000000000]  |    |
+                                    |  |          ...         |    |
+                                    |  |  FB[0x7FFFFFFFFFF]   |    |
+                                    |  +----------------------+    |
+                                    +------------------------------+
+
+PBUS (PCIe Bus Controller) is, among other things, responsible in the GPU for handling MMIO
+accesses to the BAR registers.
+
+PRAMIN Window Operation
+=======================
+
+The PRAMIN window provides a 1MB sliding aperture that can be repositioned over
+the entire VRAM address space using the NV_PBUS_BAR0_WINDOW register.
+
+Window Control Mechanism
+-------------------------
+
+The window position is controlled via the PBUS BAR0_WINDOW register::
+
+    NV_PBUS_BAR0_WINDOW Register
+    +-----+-----+--------------------------------------+
+    |31-26|25-24|           23-0                       |
+    |     |TARG |         BASE_ADDR                    |
+    |     | ET  |        (bits 39:16 of VRAM address)  |
+    +-----+-----+--------------------------------------+
+
+    TARGET field values:
+    - 0x0: VID_MEM (Video Memory / VRAM)
+    - 0x1: SYS_MEM_COHERENT (Coherent system memory)
+    - 0x2: SYS_MEM_NONCOHERENT (Non-coherent system memory)
+
+64KB Alignment Requirement
+---------------------------
+
+The PRAMIN window must be aligned to 64KB boundaries in VRAM. This is enforced
+by the BASE_ADDR field representing bits [39:16] of the target address::
+
+    VRAM Address Calculation:
+    actual_vram_addr = (BASE_ADDR << 16) + pramin_offset
+    Where:
+    - BASE_ADDR: 24-bit value from NV_PBUS_BAR0_WINDOW[23:0]
+    - pramin_offset: 20-bit offset within PRAMIN window [0x00000-0xFFFFF]
+    Example Window Positioning:
+    +---------------------------------------------------------+
+    |                    VRAM Space                           |
+    |                                                         |
+    |  0x000000000  +-----------------+ <-- 64KB aligned      |
+    |               | PRAMIN Window   |                       |
+    |               |    (1MB)        |                       |
+    |  0x0000FFFFF  +-----------------+                       |
+    |                                                         |
+    |       |              ^                                  |
+    |       |              | Window can slide                 |
+    |       v              | to any 64KB boundary             |
+    |                                                         |
+    |  0x123400000  +-----------------+ <-- 64KB aligned      |
+    |               | PRAMIN Window   |                       |
+    |               |    (1MB)        |                       |
+    |  0x1234FFFFF  +-----------------+                       |
+    |                                                         |
+    |                       ...                               |
+    |                                                         |
+    |  0x7FFFF0000  +-----------------+ <-- 64KB aligned      |
+    |               | PRAMIN Window   |                       |
+    |               |    (1MB)        |                       |
+    |  0x7FFFFFFFF  +-----------------+                       |
+    +---------------------------------------------------------+
diff --git a/Documentation/gpu/nova/index.rst b/Documentation/gpu/nova/index.rst
index 46302daace34..e77d3ee336a4 100644
--- a/Documentation/gpu/nova/index.rst
+++ b/Documentation/gpu/nova/index.rst
@@ -33,3 +33,4 @@ vGPU manager VFIO driver and the nova-drm driver.
    core/fwsec
    core/falcon
    core/msgq
+   core/pramin
-- 
2.34.1
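The address calculation documented in patch 4/7, `actual_vram_addr = (BASE_ADDR << 16) + pramin_offset`, can be checked with a small standalone sketch. The function names are hypothetical and this is not the driver code; the constants and field widths (24-bit BASE_ADDR holding bits 39:16, window at BAR0 + 0x700000) come from the document. Choosing the nearest 64KB boundary below the target address as the window base is one valid option among several, since the window is 1MB wide.

```rust
/// BAR0 offset of the PRAMIN window, as stated in the document.
pub const PRAMIN_BASE: u64 = 0x70_0000;

/// Splits a VRAM address into the BASE_ADDR field value and the BAR0
/// offset at which the same byte is visible through PRAMIN.
/// Returns `None` if the address does not fit in 40 bits, since
/// BASE_ADDR only holds bits 39:16.
/// Assumption: the window base is the closest 64KB boundary at or below
/// `vram_addr`, so the in-window offset is just the low 16 bits.
pub fn split_vram_addr(vram_addr: u64) -> Option<(u32, u64)> {
    if vram_addr >> 40 != 0 {
        return None;
    }
    let base_addr = (vram_addr >> 16) as u32; // fits in 24 bits here
    let pramin_offset = vram_addr & 0xFFFF;
    Some((base_addr, PRAMIN_BASE + pramin_offset))
}

/// Inverse direction: the documented formula
/// `actual_vram_addr = (BASE_ADDR << 16) + pramin_offset`.
pub fn join_vram_addr(base_addr: u32, pramin_offset: u64) -> u64 {
    (u64::from(base_addr) << 16) + pramin_offset
}
```

For example, VRAM address 0x123401234 maps to BASE_ADDR 0x12340 and is reachable at BAR0 offset 0x701234.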
Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 5/7] gpu: nova-core: Add support for managing GSP falcon interrupts
Add support for managing GSP falcon interrupts. These are required for
GSP message queue interrupt handling.
Also rename clear_swgen0_intr() to clear_msgq_interrupt() for
readability.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 drivers/gpu/nova-core/falcon/gsp.rs | 26 +++++++++++++++++++++++---
 drivers/gpu/nova-core/gpu.rs        |  2 +-
 drivers/gpu/nova-core/regs.rs       | 10 ++++++++++
 3 files changed, 34 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/nova-core/falcon/gsp.rs b/drivers/gpu/nova-core/falcon/gsp.rs
index f17599cb49fa..6da63823996b 100644
--- a/drivers/gpu/nova-core/falcon/gsp.rs
+++ b/drivers/gpu/nova-core/falcon/gsp.rs
@@ -22,11 +22,31 @@ impl FalconEngine for Gsp {
 }
 
 impl Falcon<Gsp> {
-    /// Clears the SWGEN0 bit in the Falcon's IRQ status clear register to
-    /// allow GSP to signal CPU for processing new messages in message queue.
-    pub(crate) fn clear_swgen0_intr(&self, bar: &Bar0) {
+    /// Enable the GSP Falcon message queue interrupt (SWGEN0 interrupt).
+    #[expect(dead_code)]
+    pub(crate) fn enable_msgq_interrupt(&self, bar: &Bar0) {
+        regs::NV_PFALCON_FALCON_IRQMASK::alter(bar, &Gsp::ID, |r| r.set_swgen0(true));
+    }
+
+    /// Check if the message queue interrupt is pending.
+    #[expect(dead_code)]
+    pub(crate) fn has_msgq_interrupt(&self, bar: &Bar0) -> bool {
+        regs::NV_PFALCON_FALCON_IRQSTAT::read(bar, &Gsp::ID).swgen0()
+    }
+
+    /// Clears the message queue interrupt to allow GSP to signal CPU
+    /// for processing new messages.
+    pub(crate) fn clear_msgq_interrupt(&self, bar: &Bar0) {
         regs::NV_PFALCON_FALCON_IRQSCLR::default()
             .set_swgen0(true)
             .write(bar, &Gsp::ID);
     }
+
+    /// Acknowledge all pending GSP interrupts.
+    #[expect(dead_code)]
+    pub(crate) fn ack_all_interrupts(&self, bar: &Bar0) {
+        // Read status and write the raw value to IRQSCLR to clear all pending interrupts.
+        let status = regs::NV_PFALCON_FALCON_IRQSTAT::read(bar, &Gsp::ID);
+        regs::NV_PFALCON_FALCON_IRQSCLR::from(u32::from(status)).write(bar, &Gsp::ID);
+    }
 }
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index af20e2daea24..fb120cf7b15d 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -216,7 +216,7 @@ pub(crate) fn new<'a>(
                 bar,
                 spec.chipset > Chipset::GA100,
             )
-            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
+            .inspect(|falcon| falcon.clear_msgq_interrupt(bar))?,
 
             sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset, bar, true)?,
 
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 206dab2e1335..a3836a01996b 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -198,6 +198,16 @@ pub(crate) fn vga_workspace_addr(self) -> Option<u64> {
 
 // PFALCON
 
+register!(NV_PFALCON_FALCON_IRQMASK @ PFalconBase[0x00000014] {
+    4:4     halt as bool;
+    6:6     swgen0 as bool;
+});
+
+register!(NV_PFALCON_FALCON_IRQSTAT @ PFalconBase[0x00000008] {
+    4:4     halt as bool;
+    6:6     swgen0 as bool;
+});
+
 register!(NV_PFALCON_FALCON_IRQSCLR @ PFalconBase[0x00000004] {
     4:4     halt as bool;
     6:6     swgen0 as bool;
-- 
2.34.1
Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 6/7] nova-core: mm: Add support to use PRAMIN windows to write to VRAM
Required for writing page tables directly to VRAM physical memory,
before page tables and MMU are set up.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 drivers/gpu/nova-core/mm/mod.rs    |   3 +
 drivers/gpu/nova-core/mm/pramin.rs | 241 +++++++++++++++++++++++++++++
 drivers/gpu/nova-core/nova_core.rs |   1 +
 drivers/gpu/nova-core/regs.rs      |  29 +++-
 4 files changed, 273 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/nova-core/mm/mod.rs
 create mode 100644 drivers/gpu/nova-core/mm/pramin.rs
diff --git a/drivers/gpu/nova-core/mm/mod.rs b/drivers/gpu/nova-core/mm/mod.rs
new file mode 100644
index 000000000000..54c7cd9416a9
--- /dev/null
+++ b/drivers/gpu/nova-core/mm/mod.rs
@@ -0,0 +1,3 @@
+// SPDX-License-Identifier: GPL-2.0
+
+pub(crate) mod pramin;
diff --git a/drivers/gpu/nova-core/mm/pramin.rs b/drivers/gpu/nova-core/mm/pramin.rs
new file mode 100644
index 000000000000..4f4e1b8c0b9b
--- /dev/null
+++ b/drivers/gpu/nova-core/mm/pramin.rs
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Direct VRAM access through PRAMIN window before page tables are set up.
+//! PRAMIN can also write to system memory, however for simplicity we only
+//! support VRAM access.
+//!
+//! # Examples
+//!
+//! ## Writing u32 data to VRAM
+//!
+//! ```no_run
+//! use crate::driver::Bar0;
+//! use crate::mm::pramin::PraminVram;
+//!
+//! fn write_data_to_vram(bar: &Bar0) -> Result {
+//!     let pramin = PraminVram::new(bar);
+//!     // Write 4 32-bit words to VRAM at offset 0x10000
+//!     let data: [u32; 4] = [0xDEADBEEF, 0xCAFEBABE, 0x12345678, 0x87654321];
+//!     pramin.write::<u32>(0x10000, &data)?;
+//!     Ok(())
+//! }
+//! ```
+//!
+//! ## Reading bytes from VRAM
+//!
+//! ```no_run
+//! use crate::driver::Bar0;
+//! use crate::mm::pramin::PraminVram;
+//!
+//! fn read_data_from_vram(bar: &Bar0, buffer: &mut KVec<u8>) -> Result {
+//!     let pramin = PraminVram::new(bar);
+//!     // Read a u8 from VRAM starting at offset 0x20000
+//!     pramin.read::<u8>(0x20000, buffer)?;
+//!     Ok(())
+//! }
+//! ```
+
+#![expect(dead_code)]
+
+use crate::driver::Bar0;
+use crate::regs;
+use core::mem;
+use kernel::prelude::*;
+
+/// PRAMIN is a window into the VRAM (not a hardware block) that is used to access
+/// the VRAM directly. These addresses are consistent across all GPUs.
+const PRAMIN_BASE: usize = 0x700000; // PRAMIN is always at BAR0 + 0x700000
+const PRAMIN_SIZE: usize = 0x100000; // 1MB aperture - max access per window position
+
+/// Trait for types that can be read/written through PRAMIN.
+pub(crate) trait PraminNum: Copy + Default + Sized {
+    fn read_from_bar(bar: &Bar0, offset: usize) -> Result<Self>;
+
+    fn write_to_bar(self, bar: &Bar0, offset: usize) -> Result;
+
+    fn size_bytes() -> usize {
+        mem::size_of::<Self>()
+    }
+
+    fn alignment() -> usize {
+        Self::size_bytes()
+    }
+}
+
+/// Macro to implement PraminNum trait for unsigned integer types.
+macro_rules! impl_pramin_unsigned_num {
+    ($bits:literal) => {
+        ::kernel::macros::paste! {
+            impl PraminNum for [<u $bits>] {
+                fn read_from_bar(bar: &Bar0, offset: usize) -> Result<Self> {
+                    bar.[<try_read $bits>](offset)
+                }
+
+                fn write_to_bar(self, bar: &Bar0, offset: usize) -> Result {
+                    bar.[<try_write $bits>](self, offset)
+                }
+            }
+        }
+    };
+}
+
+impl_pramin_unsigned_num!(8);
+impl_pramin_unsigned_num!(16);
+impl_pramin_unsigned_num!(32);
+impl_pramin_unsigned_num!(64);
+
+/// Direct VRAM access through PRAMIN window before page tables are set up.
+pub(crate) struct PraminVram<'a> {
+    bar: &'a Bar0,
+    saved_window_addr: usize,
+}
+
+impl<'a> PraminVram<'a> {
+    /// Create a new PRAMIN VRAM accessor, saving current window state,
+    /// the state is restored when the accessor is dropped.
+    ///
+    /// The BAR0 window base must be 64KB aligned but provides 1MB of VRAM access.
+    /// Window is repositioned automatically when accessing data beyond 1MB boundaries.
+    pub(crate) fn new(bar: &'a Bar0) -> Self {
+        let saved_window_addr = Self::get_window_addr(bar);
+        Self {
+            bar,
+            saved_window_addr,
+        }
+    }
+
+    /// Set BAR0 window to point to specific FB region.
+    ///
+    /// # Arguments
+    ///
+    /// * `fb_offset` - VRAM byte offset where the window should be positioned.
+    ///                 Must be 64KB aligned (lower 16 bits zero).
+    fn set_window_addr(&self, fb_offset: usize) -> Result {
+        // FB offset must be 64KB aligned (hardware requirement for window_base field)
+        // Once positioned, the window provides access to 1MB of VRAM through PRAMIN aperture
+        if fb_offset & 0xFFFF != 0 {
+            return Err(EINVAL);
+        }
+
+        let window_reg = regs::NV_PBUS_BAR0_WINDOW::default().set_window_addr(fb_offset);
+        window_reg.write(self.bar);
+
+        // Read back to ensure it took effect
+        let readback = regs::NV_PBUS_BAR0_WINDOW::read(self.bar);
+        if readback.window_base() != window_reg.window_base() {
+            return Err(EIO);
+        }
+
+        Ok(())
+    }
+
+    /// Get current BAR0 window offset.
+    ///
+    /// # Returns
+    ///
+    /// The byte offset in VRAM where the PRAMIN window is currently positioned.
+    /// This offset is always 64KB aligned.
+    fn get_window_addr(bar: &Bar0) -> usize {
+        let window_reg = regs::NV_PBUS_BAR0_WINDOW::read(bar);
+        window_reg.get_window_addr()
+    }
+
+    /// Common logic for accessing VRAM data through PRAMIN with windowing.
+    ///
+    /// # Arguments
+    ///
+    /// * `fb_offset` - Starting byte offset in VRAM (framebuffer) where access begins.
+    ///                 Must be aligned to `T::alignment()`.
+    /// * `num_items` - Number of items of type `T` to process.
+    /// * `operation` - Closure called for each item to perform the actual read/write.
+    ///                 Takes two parameters:
+    ///                 - `data_idx`: Index of the item in the data array (0..num_items)
+    ///                 - `pramin_offset`: BAR0 offset in the PRAMIN aperture to access
+    ///
+    /// The function automatically handles PRAMIN window repositioning when accessing
+    /// data that spans multiple 1MB windows.
+    fn access_vram<T: PraminNum, F>(
+        &self,
+        fb_offset: usize,
+        num_items: usize,
+        mut operation: F,
+    ) -> Result
+    where
+        F: FnMut(usize, usize) -> Result,
+    {
+        // FB offset must be aligned to `T::alignment()`
+        if fb_offset & (T::alignment() - 1) != 0 {
+            return Err(EINVAL);
+        }
+
+        let mut offset_bytes = fb_offset;
+        let mut remaining_items = num_items;
+        let mut data_index = 0;
+
+        while remaining_items > 0 {
+            // Align the window to 64KB boundary
+            let target_window = offset_bytes & !0xFFFF;
+            let window_offset = offset_bytes - target_window;
+
+            // Set window if needed
+            if target_window != Self::get_window_addr(self.bar) {
+                self.set_window_addr(target_window)?;
+            }
+
+            // Calculate how many items we can access from this window position
+            // We can access up to 1MB total, minus the offset within the window
+            let remaining_in_window = PRAMIN_SIZE - window_offset;
+            let max_items_in_window = remaining_in_window / T::size_bytes();
+            let items_to_write = core::cmp::min(remaining_items, max_items_in_window);
+
+            // Process data through PRAMIN
+            for i in 0..items_to_write {
+                // Calculate the byte offset in the PRAMIN window to access.
+                let pramin_offset_bytes = PRAMIN_BASE + window_offset + (i * T::size_bytes());
+                operation(data_index + i, pramin_offset_bytes)?;
+            }
+
+            // Move to next chunk.
+            data_index += items_to_write;
+            offset_bytes += items_to_write * T::size_bytes();
+            remaining_items -= items_to_write;
+        }
+
+        Ok(())
+    }
+
+    /// Generic write of data to VRAM through PRAMIN.
+    ///
+    /// # Arguments
+    ///
+    /// * `fb_offset` - Starting byte offset in VRAM where data will be written.
+    ///                 Must be aligned to `T::alignment()`.
+    /// * `data` - Slice of items to write to VRAM. All items will be written sequentially
+    ///            starting at `fb_offset`.
+    pub(crate) fn write<T: PraminNum>(&self, fb_offset: usize, data: &[T]) -> Result {
+        self.access_vram::<T, _>(fb_offset, data.len(), |data_idx, pramin_offset| {
+            data[data_idx].write_to_bar(self.bar, pramin_offset)
+        })
+    }
+
+    /// Generic read of data from VRAM through PRAMIN.
+    ///
+    /// # Arguments
+    ///
+    /// * `fb_offset` - Starting byte offset in VRAM where data will be read from.
+    ///                 Must be aligned to `T::alignment()`.
+    /// * `data` - Mutable slice that will be filled with data read from VRAM.
+    ///            The number of items read equals `data.len()`.
+    pub(crate) fn read<T: PraminNum>(&self, fb_offset: usize, data: &mut [T]) -> Result {
+        self.access_vram::<T, _>(fb_offset, data.len(), |data_idx, pramin_offset| {
+            data[data_idx] = T::read_from_bar(self.bar, pramin_offset)?;
+            Ok(())
+        })
+    }
+}
+
+impl<'a> Drop for PraminVram<'a> {
+    fn drop(&mut self) {
+        let _ = self.set_window_addr(self.saved_window_addr); // Restore original window.
+    }
+}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index 112277c7921e..6bd9fc3372d6 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -13,6 +13,7 @@
 mod gfw;
 mod gpu;
 mod gsp;
+mod mm;
 mod regs;
 mod util;
 mod vbios;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index a3836a01996b..ba09da7e1541 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -12,6 +12,7 @@
     FalconModSelAlgo, FalconSecurityModel, PFalcon2Base, PFalconBase, PeregrineCoreSelect,
 };
 use crate::gpu::{Architecture, Chipset};
+use kernel::bits::genmask_u32;
 use kernel::prelude::*;
 
 // PMC
@@ -43,7 +44,8 @@ pub(crate) fn chipset(self) -> Result<Chipset> {
     }
 }
 
-// PBUS
+// PBUS - PBUS is a bus control unit that helps the GPU communicate with the PCI bus.
+// Handles the BAR windows, decoding of MMIO read/writes on the BARs, etc.
 
 register!(NV_PBUS_SW_SCRATCH @ 0x00001400[64]  {});
 
@@ -52,6 +54,31 @@ pub(crate) fn chipset(self) -> Result<Chipset> {
     31:16   frts_err_code as u16;
 });
 
+// BAR0 window control register to configure the BAR0 window for PRAMIN access
+// (direct physical VRAM access).
+register!(NV_PBUS_BAR0_WINDOW @ 0x00001700, "BAR0 window control register" {
+    25:24   target as u8, "Target (0=VID_MEM, 1=SYS_MEM_COHERENT, 2=SYS_MEM_NONCOHERENT)";
+    23:0    window_base as u32, "Window base address (bits 39:16 of FB addr)";
+});
+
+impl NV_PBUS_BAR0_WINDOW {
+    /// Returns the 64KB-aligned VRAM address of the window.
+    pub(crate) fn get_window_addr(self) -> usize {
+        (self.window_base() as usize) << 16
+    }
+
+    /// Sets the window address from a framebuffer offset.
+    /// The fb_offset must be 64KB aligned (lower 16 bits are discarded).
+    pub(crate) fn set_window_addr(self, fb_offset: usize) -> Self {
+        // Calculate window base (bits 39:16 of FB address)
+        // The total FB address is 40 bits, mask anything above. Since we are
+        // right shifting the offset by 16 bits, the mask is only 24 bits.
+        let mask = genmask_u32(0..=23) as usize;
+        let window_base = ((fb_offset >> 16) & mask) as u32;
+        self.set_window_base(window_base)
+    }
+}
+
 // PFB
 
 // The following two registers together hold the physical system memory address that is used by the
-- 
2.34.1
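
The windowing arithmetic in `access_vram` and the `window_base` encoding in
`NV_PBUS_BAR0_WINDOW` can be tried out on the host without any hardware. The
following is only a sketch under stated assumptions: `plan_chunks`,
`encode_window_base`, `decode_window_addr`, and `Chunk` are hypothetical names
that do not appear in the patch, and the two constants are taken from the
comments in the patch (64KB window granularity, 1MB aperture) rather than from
hardware documentation. It computes which window positions a transfer would
need, without touching any register.

```rust
// Assumed values, copied from the comments in the patch.
const WINDOW_ALIGN: usize = 0x1_0000; // 64KB window granularity
const PRAMIN_SIZE: usize = 0x10_0000; // 1MB visible through the aperture

/// Hypothetical helper: pack a VRAM offset into the 24-bit `window_base` field
/// (bits 39:16 of the FB address).
fn encode_window_base(fb_offset: usize) -> u32 {
    ((fb_offset >> 16) & 0x00FF_FFFF) as u32
}

/// Hypothetical helper: recover the 64KB-aligned VRAM offset from `window_base`.
fn decode_window_addr(window_base: u32) -> usize {
    (window_base as usize) << 16
}

/// One run of items that can be served from a single window position.
#[derive(Debug, PartialEq, Eq)]
struct Chunk {
    window_addr: usize,   // 64KB-aligned VRAM offset the window points at
    window_offset: usize, // byte offset of the first item inside the window
    items: usize,         // number of items accessed at this position
}

/// Split a transfer of `num_items` items of `item_size` bytes, starting at
/// `fb_offset`, into per-window chunks. Mirrors the loop in `access_vram`.
fn plan_chunks(fb_offset: usize, num_items: usize, item_size: usize) -> Vec<Chunk> {
    // With a power-of-two item size no larger than 64KB, every window
    // position has room for at least one item, so the loop always advances.
    assert!(item_size.is_power_of_two() && item_size <= WINDOW_ALIGN);
    assert!(fb_offset % item_size == 0);

    let mut chunks = Vec::new();
    let mut offset = fb_offset;
    let mut remaining = num_items;
    while remaining > 0 {
        let window_addr = offset & !(WINDOW_ALIGN - 1);
        let window_offset = offset - window_addr;
        let max_items = (PRAMIN_SIZE - window_offset) / item_size;
        let items = remaining.min(max_items);
        chunks.push(Chunk { window_addr, window_offset, items });
        offset += items * item_size;
        remaining -= items;
    }
    chunks
}

fn main() {
    // One u32 more than fits in a single 1MB window: needs two positions.
    for chunk in plan_chunks(0, 0x4_0001, 4) {
        println!("{:?}", chunk);
    }
}
```

Because the window is repositioned at 64KB granularity while the aperture is
1MB wide, any starting offset leaves at least 960KB reachable before the next
reposition, so small transfers rarely need more than one window write.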
Joel Fernandes
2025-Oct-20  18:55 UTC
[PATCH 7/7] nova-core: mm: Add data structures for page table management
Add data structures and helpers for page table management. Uses
bitfield for cleanly representing and accessing the bitfields in the
structures.
Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com>
---
 drivers/gpu/nova-core/mm/mod.rs   |   1 +
 drivers/gpu/nova-core/mm/types.rs | 405 ++++++++++++++++++++++++++++++
 2 files changed, 406 insertions(+)
 create mode 100644 drivers/gpu/nova-core/mm/types.rs
diff --git a/drivers/gpu/nova-core/mm/mod.rs b/drivers/gpu/nova-core/mm/mod.rs
index 54c7cd9416a9..f4985780a8a1 100644
--- a/drivers/gpu/nova-core/mm/mod.rs
+++ b/drivers/gpu/nova-core/mm/mod.rs
@@ -1,3 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0
 
 pub(crate) mod pramin;
+pub(crate) mod types;
diff --git a/drivers/gpu/nova-core/mm/types.rs b/drivers/gpu/nova-core/mm/types.rs
new file mode 100644
index 000000000000..0a2dec6b9145
--- /dev/null
+++ b/drivers/gpu/nova-core/mm/types.rs
@@ -0,0 +1,405 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Page table data management for NVIDIA GPUs.
+//!
+//! This module provides data structures for GPU page table management, including
+//! address types, page table entries (PTEs), page directory entries (PDEs), and
+//! the page table hierarchy levels.
+//!
+//! # Examples
+//!
+//! ## Creating and writing a PDE
+//!
+//! ```no_run
+//! let new_pde = Pde::default()
+//!     .set_valid(true)
+//!     .set_aperture(AperturePde::VideoMemory)
+//!     .set_table_frame_number(new_table.frame_number());
+//! // Call a function to write PDE to VRAM address
+//! write_pde(pde_addr, new_pde)?;
+//! ```
+//!
+//! ## Given a PTE address, get or allocate a PFN (page frame number)
+//!
+//! ```no_run
+//! fn get_frame_number(pte_addr: VramAddress) -> Result<Pfn> {
+//!     // Call a function to read 64-bit PTE value from VRAM address
+//!     let pte = Pte(read_u64_from_vram(pte_addr)?);
+//!     if pte.valid() {
+//!         // Return physical frame number from existing mapping
+//!         Ok(Pfn::new(pte.frame_number()))
+//!     } else {
+//!         // Create new PTE mapping
+//!         // Call a function to allocate a physical page, returning a Pfn
+//!         let phys_pfn = allocate_page()?;
+//!         let new_pte = Pte::default()
+//!             .set_valid(true)
+//!             .set_frame_number(phys_pfn.raw())
+//!             .set_aperture(AperturePte::VideoMemory)
+//!             .set_privilege(false)   // User-accessible
+//!             .set_read_only(false);  // Writable
+//!
+//!         // Call a function to write 64-bit PTE value to VRAM address
+//!         write_u64_to_vram(pte_addr, new_pte.raw())?;
+//!         Ok(phys_pfn)
+//!     }
+//! }
+//! ```
+
+#![expect(dead_code)]
+
+/// Memory size constants
+pub(crate) const KB: usize = 1024;
+pub(crate) const MB: usize = KB * 1024;
+
+/// Page size: 4 KiB
+pub(crate) const PAGE_SIZE: usize = 4 * KB;
+
+/// Page Table Level hierarchy
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum PageTableLevel {
+    Pdb, // Level 0 - Page Directory Base
+    L1,  // Level 1
+    L2,  // Level 2
+    L3,  // Level 3 - Dual PDE (128-bit entries)
+    L4,  // Level 4 - PTEs
+}
+
+impl PageTableLevel {
+    /// Get the entry size for this level.
+    pub(crate) fn entry_size(&self) -> usize {
+        match self {
+            Self::L3 => 16, // 128-bit dual PDE
+            _ => 8,         // 64-bit PDE/PTE
+        }
+    }
+
+    /// PDE levels constant array for iteration.
+    const PDE_LEVELS: [PageTableLevel; 4] = [
+        PageTableLevel::Pdb,
+        PageTableLevel::L1,
+        PageTableLevel::L2,
+        PageTableLevel::L3,
+    ];
+
+    /// Get iterator over PDE levels.
+    pub(crate) fn pde_levels() -> impl Iterator<Item = PageTableLevel> {
+        Self::PDE_LEVELS.into_iter()
+    }
+}
+
+/// Memory aperture for Page Directory Entries (PDEs)
+#[repr(u8)]
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
+pub(crate) enum AperturePde {
+    #[default]
+    Invalid = 0,
+    VideoMemory = 1,
+    SystemCoherent = 2,
+    SystemNonCoherent = 3,
+}
+
+impl From<u8> for AperturePde {
+    fn from(val: u8) -> Self {
+        match val {
+            1 => Self::VideoMemory,
+            2 => Self::SystemCoherent,
+            3 => Self::SystemNonCoherent,
+            _ => Self::Invalid,
+        }
+    }
+}
+
+impl From<AperturePde> for u8 {
+    fn from(val: AperturePde) -> Self {
+        val as u8
+    }
+}
+
+/// Memory aperture for Page Table Entries (PTEs)
+#[repr(u8)]
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
+pub(crate) enum AperturePte {
+    #[default]
+    VideoMemory = 0,
+    PeerVideoMemory = 1,
+    SystemCoherent = 2,
+    SystemNonCoherent = 3,
+}
+
+impl From<u8> for AperturePte {
+    fn from(val: u8) -> Self {
+        match val {
+            0 => Self::VideoMemory,
+            1 => Self::PeerVideoMemory,
+            2 => Self::SystemCoherent,
+            3 => Self::SystemNonCoherent,
+            _ => Self::VideoMemory,
+        }
+    }
+}
+
+impl From<AperturePte> for u8 {
+    fn from(val: AperturePte) -> Self {
+        val as u8
+    }
+}
+
+/// Common trait for address types
+pub(crate) trait Address {
+    /// Get raw u64 value.
+    fn raw(&self) -> u64;
+
+    /// Convert an Address to a frame number.
+    fn frame_number(&self) -> u64 {
+        self.raw() >> 12
+    }
+
+    /// Get the frame offset within an Address.
+    fn frame_offset(&self) -> u16 {
+        (self.raw() & 0xFFF) as u16
+    }
+}
+
+bitfield! {
+    pub(crate) struct VramAddress(u64), "Physical VRAM address representation." {
+        11:0    offset          as u16;    // Offset within 4KB page
+        63:12   frame_number    as u64;    // Frame number
+    }
+}
+
+impl Address for VramAddress {
+    fn raw(&self) -> u64 {
+        self.0
+    }
+}
+
+impl From<Pfn> for VramAddress {
+    fn from(pfn: Pfn) -> VramAddress {
+        VramAddress::default().set_frame_number(pfn.raw())
+    }
+}
+
+bitfield! {
+    pub(crate) struct VirtualAddress(u64), "Virtual address representation for GPU." {
+        11:0    offset      as u16;    // Offset within 4KB page
+        20:12   l4_index    as u16;    // Level 4 index (PTE)
+        29:21   l3_index    as u16;    // Level 3 index
+        38:30   l2_index    as u16;    // Level 2 index
+        47:39   l1_index    as u16;    // Level 1 index
+        56:48   l0_index    as u16;    // Level 0 index (PDB)
+
+        63:12   frame_number as u64;   // Frame number (combination of levels).
+    }
+}
+
+impl VirtualAddress {
+    /// Get index for a specific page table level.
+    ///
+    /// # Example
+    ///
+    /// ```no_run
+    /// let va = VirtualAddress::default();
+    /// let pte_idx = va.level_index(PageTableLevel::L4);
+    /// ```
+    pub(crate) fn level_index(&self, level: PageTableLevel) -> u16 {
+        match level {
+            PageTableLevel::Pdb => self.l0_index(),
+            PageTableLevel::L1 => self.l1_index(),
+            PageTableLevel::L2 => self.l2_index(),
+            PageTableLevel::L3 => self.l3_index(),
+            PageTableLevel::L4 => self.l4_index(),
+        }
+    }
+}
+
+impl Address for VirtualAddress {
+    fn raw(&self) -> u64 {
+        self.0
+    }
+}
+
+impl From<Vfn> for VirtualAddress {
+    fn from(vfn: Vfn) -> VirtualAddress {
+        VirtualAddress::default().set_frame_number(vfn.raw())
+    }
+}
+
+bitfield! {
+    pub(crate) struct Pte(u64), "Page Table Entry (PTE) to map virtual pages to physical frames." {
+        0:0     valid           as bool;    // (1 = valid for PTEs)
+        1:1     privilege       as bool;    // P - Privileged/kernel-only access
+        2:2     read_only       as bool;    // RO - Write protection
+        3:3     atomic_disable  as bool;    // AD - Disable atomic ops
+        4:4     encrypted       as bool;    // E - Encryption enabled
+        39:8    frame_number    as u64;     // PA[39:8] - Physical frame number (32 bits)
+        41:40   aperture        as u8 => AperturePte;   // Memory aperture type.
+        42:42   volatile        as bool;    // VOL - Volatile flag
+        50:43   kind            as u8;      // K[7:0] - Compression/tiling kind
+        63:51   comptag_line    as u16;     // CTL[12:0] - Compression tag line
+    }
+}
+
+impl Pte {
+    /// Set the physical address mapped by this PTE.
+    pub(crate) fn set_address(&mut self, addr: VramAddress) {
+        self.set_frame_number(addr.frame_number());
+    }
+
+    /// Get the physical address mapped by this PTE.
+    pub(crate) fn address(&self) -> VramAddress {
+        VramAddress::default().set_frame_number(self.frame_number())
+    }
+
+    /// Get raw u64 value.
+    pub(crate) fn raw(&self) -> u64 {
+        self.0
+    }
+}
+
+bitfield! {
+    pub(crate) struct Pde(u64), "Page Directory Entry (PDE) pointing to next-level page tables." {
+        0:0     valid_inverted       as bool;    // V - Valid bit (0=valid for PDEs)
+        2:1     aperture             as u8 => AperturePde;      // Memory aperture type
+        3:3     volatile             as bool;    // VOL - Volatile flag
+        39:8    table_frame_number   as u64;     // PA[39:8] - Table frame number (32 bits)
+    }
+}
+
+impl Pde {
+    /// Check if PDE is valid.
+    pub(crate) fn is_valid(&self) -> bool {
+        !self.valid_inverted() && self.aperture() != AperturePde::Invalid
+    }
+
+    /// The valid bit is inverted so add an accessor to flip it.
+    pub(crate) fn set_valid(&self, value: bool) -> Pde {
+        self.set_valid_inverted(!value)
+    }
+
+    /// Set the physical table address mapped by this PDE.
+    pub(crate) fn set_table_address(&mut self, addr: VramAddress) {
+        self.set_table_frame_number(addr.frame_number());
+    }
+
+    /// Get the physical table address mapped by this PDE.
+    pub(crate) fn table_address(&self) -> VramAddress {
+        VramAddress::default().set_frame_number(self.table_frame_number())
+    }
+
+    /// Get raw u64 value.
+    pub(crate) fn raw(&self) -> u64 {
+        self.0
+    }
+}
+
+/// Dual PDE at Level 3 - 128-bit entry containing both LPT and SPT pointers.
+/// Lower 64 bits = big/large page, upper 64 bits = small page.
+///
+/// # Example
+///
+/// ## Set the SPT (small page table) address in a Dual PDE
+///
+/// ```no_run
+/// // Call a function to read dual PDE from VRAM address
+/// let mut dual_pde: DualPde = read_dual_pde(dpde_addr)?;
+/// // Call a function to allocate a page table and return its VRAM address
+/// let spt_addr = allocate_page_table()?;
+/// dual_pde.set_spt(Pfn::from(spt_addr), AperturePde::VideoMemory);
+/// // Call a function to write dual PDE to VRAM address
+/// write_dual_pde(dpde_addr, dual_pde)?;
+/// ```
+#[repr(C)]
+#[derive(Debug, Clone, Copy)]
+pub(crate) struct DualPde {
+    pub lpt: Pde, // Large/Big Page Table pointer (2MB pages) - bits 63:0 (lower)
+    pub spt: Pde, // Small Page Table pointer (4KB pages) - bits 127:64 (upper)
+}
+
+impl DualPde {
+    /// Create a new empty dual PDE.
+    pub(crate) fn new() -> Self {
+        Self {
+            spt: Pde::default(),
+            lpt: Pde::default(),
+        }
+    }
+
+    /// Set the Small Page Table address with aperture.
+    pub(crate) fn set_small_pt_address(&mut self, addr: VramAddress, aperture: AperturePde) {
+        self.spt = Pde::default()
+            .set_valid(true)
+            .set_table_frame_number(addr.frame_number())
+            .set_aperture(aperture);
+    }
+
+    /// Set the Large Page Table address with aperture.
+    pub(crate) fn set_large_pt_address(&mut self, addr: VramAddress, aperture: AperturePde) {
+        self.lpt = Pde::default()
+            .set_valid(true)
+            .set_table_frame_number(addr.frame_number())
+            .set_aperture(aperture);
+    }
+
+    /// Check whether the Small Page Table pointer is valid.
+    pub(crate) fn has_small_pt_address(&self) -> bool {
+        self.spt.is_valid()
+    }
+
+    /// Check whether the Large Page Table pointer is valid.
+    pub(crate) fn has_large_pt_address(&self) -> bool {
+        self.lpt.is_valid()
+    }
+
+    /// Set SPT (Small Page Table) using Pfn.
+    pub(crate) fn set_spt(&mut self, pfn: Pfn, aperture: AperturePde) {
+        self.spt = Pde::default()
+            .set_valid(true)
+            .set_aperture(aperture)
+            .set_table_frame_number(pfn.raw());
+    }
+}
+
+/// Virtual Frame Number - virtual address divided by 4KB.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) struct Vfn(u64);
+
+impl Vfn {
+    /// Create a new VFN from a frame number.
+    pub(crate) const fn new(frame_number: u64) -> Self {
+        Self(frame_number)
+    }
+
+    /// Get raw frame number.
+    pub(crate) const fn raw(&self) -> u64 {
+        self.0
+    }
+}
+
+impl From<VirtualAddress> for Vfn {
+    fn from(vaddr: VirtualAddress) -> Self {
+        Self(vaddr.frame_number())
+    }
+}
+
+/// Physical Frame Number - physical address divided by 4KB.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) struct Pfn(u64);
+
+impl Pfn {
+    /// Create a new PFN from a frame number.
+    pub(crate) const fn new(frame_number: u64) -> Self {
+        Self(frame_number)
+    }
+
+    /// Get raw frame number.
+    pub(crate) const fn raw(&self) -> u64 {
+        self.0
+    }
+}
+
+impl From<VramAddress> for Pfn {
+    fn from(addr: VramAddress) -> Self {
+        Self(addr.frame_number())
+    }
+}
-- 
2.34.1
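
The bit layouts declared in `types.rs` can be checked on the host with plain
integer arithmetic. The following is only a sketch, assuming the field
positions given in the `VirtualAddress`, `Pte`, and `Pde` definitions above;
the function names (`level_index`, `make_pte`, `pte_frame_number`, `make_pde`,
`pde_is_valid`) are hypothetical stand-ins and do not use the driver's
`bitfield!` macro. It shows how a virtual address splits into five 9-bit
indices, and how the valid bit has opposite polarity in PTEs and PDEs.

```rust
/// Extract the 9-bit table index for a level, where level 0 is the PDB
/// (bits 56:48) and level 4 is the PTE level (bits 20:12).
fn level_index(va: u64, level: u32) -> u16 {
    assert!(level <= 4);
    let shift = 12 + 9 * (4 - level);
    ((va >> shift) & 0x1FF) as u16
}

/// Build a valid PTE: bit 0 = valid (1 means valid), bit 2 = read-only,
/// bits 39:8 = frame number, bits 41:40 = aperture.
fn make_pte(frame_number: u64, aperture: u8, read_only: bool) -> u64 {
    let mut raw = 1u64;
    raw |= (read_only as u64) << 2;
    raw |= (frame_number & 0xFFFF_FFFF) << 8;
    raw |= ((aperture & 0x3) as u64) << 40;
    raw
}

/// Read back the frame number field (bits 39:8) of a PTE.
fn pte_frame_number(raw: u64) -> u64 {
    (raw >> 8) & 0xFFFF_FFFF
}

/// Build a valid PDE: bit 0 stays 0 (the valid bit is inverted for PDEs),
/// bits 2:1 = aperture, bits 39:8 = table frame number.
fn make_pde(table_frame_number: u64, aperture: u8) -> u64 {
    (((aperture & 0x3) as u64) << 1) | ((table_frame_number & 0xFFFF_FFFF) << 8)
}

/// A PDE is valid when the inverted valid bit is clear and the aperture is
/// not 0 (`AperturePde::Invalid`), matching `Pde::is_valid` in the patch.
fn pde_is_valid(raw: u64) -> bool {
    (raw & 1) == 0 && ((raw >> 1) & 0x3) != 0
}

fn main() {
    // Arbitrary example address with a distinct index at each level.
    let va: u64 = (3 << 48) | (5 << 39) | (7 << 30) | (9 << 21) | (11 << 12) | 0x123;
    for level in 0..=4 {
        println!("level {} index = {}", level, level_index(va, level));
    }
    println!("pte = {:#x}", make_pte(0x1234, 0, false));
    println!("pde = {:#x}", make_pde(0x10, 1));
}
```

Note that an all-zero PDE has its inverted valid bit clear, so it is only the
aperture check that makes a default-initialized entry read as invalid.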
John Hubbard
2025-Oct-20  21:20 UTC
[PATCH 0/7] Pre-requisite patches for mm and irq in nova-core
On 10/20/25 11:55 AM, Joel Fernandes wrote:
> These patches have some prerequistes needed for nova-core as support is added
> for memory management and interrupt handling. I rebased them on drm-rust-next
> and would like them to be considered for the next merge window. I also included
> a simple rust documentation patch fixing some issues I noticed while reading it :).

Just to be clear, it appears to me that one must first apply
"[PATCH v7 0/4] bitfield initial refactor within nova-core" [1], in
order to apply this series, yes?

[1] https://lore.kernel.org/20251016150204.1189641-1-joelagnelf at nvidia.com

thanks,
-- 
John Hubbard

>
> The series adds support for the PRAMIN aperture mechanism, which is a
> prerequisite for virtual memory as it is required to boot strap virtual memory
> (we cannot write to VRAM using virtual memory because we need to write page
> tables to VRAM in the first place).
>
> I also add page table related structures (mm/types.rs) using the bitfield
> macro, which will be used for page table walking, memory mapping, etc. This is
> currently unused code, because without physical memory allocation (using the
> buddy allocator), we cannot use this code as page table pages need to be
> allocated in the first place. However, I have included several examples in the
> file about how these structures will be used. I have also simplified the code
> keeping future additions to it for later.
>
> For interrupts, I only have added additional support for GSP's message queue
> interrupt. I am working on adding support to the interrupt controller module
> (VFN) which is the next thing for me to post after this series. I have it
> prototyped and working, however I am currently making several changes to it
> related to virtual functions. For now in this series, I just want to get the
> GSP-specific patch out of the way, hence I am including it here.
>
> I also have added a patch for bitfield macro which constructs a bitfield struct
> given its storage type. This is used in a later GSP interrupt patch in the
> series to read from one register and write to another.
>
> Joel Fernandes (7):
>   docs: rust: Fix a few grammatical errors
>   gpu: nova-core: Add support to convert bitfield to underlying type
>   docs: gpu: nova-core: Document GSP RPC message queue architecture
>   docs: gpu: nova-core: Document the PRAMIN aperture mechanism
>   gpu: nova-core: Add support for managing GSP falcon interrupts
>   nova-core: mm: Add support to use PRAMIN windows to write to VRAM
>   nova-core: mm: Add data structures for page table management
>
>  Documentation/gpu/nova/core/msgq.rst     | 159 +++++++++
>  Documentation/gpu/nova/core/pramin.rst   | 113 +++++++
>  Documentation/gpu/nova/index.rst         |   2 +
>  Documentation/rust/coding-guidelines.rst |   4 +-
>  drivers/gpu/nova-core/bitfield.rs        |   7 +
>  drivers/gpu/nova-core/falcon/gsp.rs      |  26 +-
>  drivers/gpu/nova-core/gpu.rs             |   2 +-
>  drivers/gpu/nova-core/mm/mod.rs          |   4 +
>  drivers/gpu/nova-core/mm/pramin.rs       | 241 ++++++++++++++
>  drivers/gpu/nova-core/mm/types.rs        | 405 +++++++++++++++++++++++
>  drivers/gpu/nova-core/nova_core.rs       |   1 +
>  drivers/gpu/nova-core/regs.rs            |  39 ++-
>  12 files changed, 996 insertions(+), 7 deletions(-)
>  create mode 100644 Documentation/gpu/nova/core/msgq.rst
>  create mode 100644 Documentation/gpu/nova/core/pramin.rst
>  create mode 100644 drivers/gpu/nova-core/mm/mod.rs
>  create mode 100644 drivers/gpu/nova-core/mm/pramin.rs
>  create mode 100644 drivers/gpu/nova-core/mm/types.rs
>
Alexandre Courbot
2025-Oct-22  06:57 UTC
[PATCH 0/7] Pre-requisite patches for mm and irq in nova-core
On Tue Oct 21, 2025 at 3:55 AM JST, Joel Fernandes wrote:
> These patches have some prerequistes needed for nova-core as support is added
> for memory management and interrupt handling. I rebased them on drm-rust-next
> and would like them to be considered for the next merge window. I also included
> a simple rust documentation patch fixing some issues I noticed while reading it :).
>
> The series adds support for the PRAMIN aperture mechanism, which is a
> prerequisite for virtual memory as it is required to boot strap virtual memory
> (we cannot write to VRAM using virtual memory because we need to write page
> tables to VRAM in the first place).
>
> I also add page table related structures (mm/types.rs) using the bitfield
> macro, which will be used for page table walking, memory mapping, etc. This is
> currently unused code, because without physical memory allocation (using the
> buddy allocator), we cannot use this code as page table pages need to be
> allocated in the first place. However, I have included several examples in the
> file about how these structures will be used. I have also simplified the code
> keeping future additions to it for later.
>
> For interrupts, I only have added additional support for GSP's message queue
> interrupt. I am working on adding support to the interrupt controller module
> (VFN) which is the next thing for me to post after this series. I have it
> prototyped and working, however I am currently making several changes to it
> related to virtual functions. For now in this series, I just want to get the
> GSP-specific patch out of the way, hence I am including it here.
>
> I also have added a patch for bitfield macro which constructs a bitfield struct
> given its storage type. This is used in a later GSP interrupt patch in the
> series to read from one register and write to another.

So IIUC, this series contains the following:

1. Add PRAMIN support,
2. Add Page mapping support, albeit this is unexercised until we have a
   user (e.g. buddy allocator),
3. Add Falcon interrupt support,
4. Add missing bitfield functionality, albeit not used yet,
5. Various documentation patches.

This is a bit confusing, as there is close to no logical relationship or
dependency between these patches. I see potential for several different
submissions here:

- The core documentation fix, as Miguel pointed out, since it should be
  merged into the rust tree and not nova-core.
- The bitfield patch is a useful addition and should be sent separately.
- PRAMIN/Page mapping should come with code that exercises them, so I
  think they belong as the first patches of a series that ends with basic
  memory allocation capabilities. But feel free to send a RFC if you want
  early feedback.
- The falcon interrupts patch does not seem to be used by the last two
  patches? I guess it belongs to the series that will add support for the
  interrupt controller.
- Other documentation patches belong to the series that introduces the
  feature they document.