Hi all,

I am a Nouveau user on FC12 with a GeForce 9500GT.
I have read the Nouveau wiki documents, and they imply that there are ways to set the GPU to send interrupts to the CPU when we want to be notified of something, e.g., when a DMA transfer or a GPU operation is completed.

By default, when I run an OpenGL demo application from Gallium3D, the driver gets no interrupts from the GPU in nouveau_irq_handler(), except for one NV_PFIFO_INTR_CACHE_ERROR interrupt right after the FIFO is allocated.

According to the wiki docs, I need to set NV_MEMORY_TO_MEMORY_FORMAT_NOTIFY_STYLE_WRITE_LE_AWAKEN into the 'notify' field of an object in a channel.
Hence, I tried setting a flag on a DMA notifier in nouveau_dma_init():

    // seems entry[1] is related to a DMA notifier?
    nv_wo32(dev, m2mf, 1,
            NV_MEMORY_TO_MEMORY_FORMAT_NOTIFY_STYLE_WRITE_LE_AWAKEN);

I also tried sending a command:

    // guess this is a very wrong way ;-)
    BEGIN_RING(chan, NV_MEMORY_TO_MEMORY_FORMAT_NOTIFY, 1);
    OUT_RING(chan, NV_MEMORY_TO_MEMORY_FORMAT_NOTIFY_STYLE_WRITE_LE_AWAKEN);

But neither of them worked...
How can we set the GPU to send interrupts to the CPU?
I would appreciate your comments.

Thanks,
- Shinpei
Hi,

I am responding to myself...

Interrupts now work; I should have set the NV_MEMORY_TO_MEMORY_FORMAT_NOTIFY_STYLE_WRITE_LE_AWAKEN flag every time a command is sent.

In fact, I thought this flag tells the GPU to notify us when DMA transfers are done, but I got PGRAPH_NOTIFY interrupts from it.
It seems to me that they notify us when GPU operations are done rather than when DMA transfers are done.
# PGRAPH_BUFFER_NOTIFY sounds more related to DMA transfers.

Is my understanding wrong?
I would appreciate any comments or information about this.

Best regards,
- Shinpei

> -----Original Message-----
> From: nouveau-bounces at lists.freedesktop.org
> [mailto:nouveau-bounces at lists.freedesktop.org] On Behalf Of Shinpei KATO
> Sent: Friday, March 12, 2010 7:21 AM
> To: nouveau at lists.freedesktop.org
> Subject: [Nouveau] Interrupt setting
>
> _______________________________________________
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau
Dear Luca,

> > Ok, I think I was misunderstanding.
> > This means we have to tell a GPU by sending another command to increment a
> > fence value.
> > Finally I believe I am correct ;-)
>
> It actually sets an arbitrary value. The relevant code is in nouveau_fence.c:
>     BEGIN_RING(chan, NvSubSw, USE_REFCNT ? 0x0050 : 0x0150, 1);
>     OUT_RING(chan, fence->sequence);
>
> The current sequence number is read with (in the same file):
>     sequence = nvchan_rd32(chan, 0x48);
>
> (on Riva TNT/TNT2, USE_REFCNT is false, and we trigger an interrupt
> for each fence completion so we can emulate that in software)

Sorry that I did not read the code carefully; I just remembered the code doing:

    fence->sequence = ++chan->fence.sequence

This means the fence # is incremented, but it does not mean the fence # executed by the GPU is always sequential.
I should have been more precise about this.

> > On the other hand, this means that, for now in Nouveau, we don't have a way
> > to tell a GPU to notify us of a completion of a DMA transfer.
> > Maybe an interrupt from a GPU (PGRAPH_BUFFER_NOTIFY or something) is needed
> > for this.
> > I understand this notification is not necessary now, because DMA
> > transactions are managed by ring buffers.
>
> Not sure what you mean.
> In theory, once the FIFO processes the next command, all transfers
> done by the previous one should be completed.
> Note that simple DMA data copies are done with FIFO commands too (with
> m2mf aka MEMORY_TO_MEMORY_FORMAT).
> Also, all the commands before it have obviously already been fetched
> from the FIFO.

Ok, I may still be misunderstanding.
I thought that, in nouveau_gem_ioctl_pushbuf(), you transfer the buffers to the GPU with nv50_dma_push() in the NV50 case.
My assumption is that each pushbuf may include more than one command; i.e., one pushbuf procedure may include both a DMA data copy and object drawing (I used the term "GPU operation" for drawing).
Since you create one fence object for each pushbuf, I thought that we can synchronize only with the last command.
Not sure if my assumption is correct...

Best,
- Shinpei
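The fence handling quoted above (the CPU hands out a value, the GPU stores it when it reaches the fence command, the CPU reads it back) can be illustrated with a small user-space model. This is only a sketch: the `model_*` names and structs are hypothetical and are not the driver's types, and the wrap-safe comparison is an assumption for illustration rather than what nouveau_fence.c necessarily does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a channel; not the real struct nouveau_channel. */
struct model_channel {
    uint32_t next_sequence; /* last sequence number handed out by the CPU */
    uint32_t hw_sequence;   /* models the value read back at offset 0x48 */
};

struct model_fence {
    uint32_t sequence;
};

/* CPU side: give the fence the next sequence number. In the driver this value
 * would then be written into the ring after the commands it guards. */
static struct model_fence model_fence_emit(struct model_channel *chan)
{
    assert(chan);
    struct model_fence fence = { .sequence = ++chan->next_sequence };
    return fence;
}

/* GPU side (simulated): the fence command stores its value once reached. */
static void model_gpu_reach_fence(struct model_channel *chan,
                                  const struct model_fence *fence)
{
    assert(chan && fence);
    chan->hw_sequence = fence->sequence;
}

/* Has the channel reached this fence's value or a later one? The signed
 * difference keeps the test valid across 32-bit wrap-around on the usual
 * two's-complement compilers (assumption for this sketch). */
static bool model_fence_signalled(const struct model_channel *chan,
                                  const struct model_fence *fence)
{
    assert(chan && fence);
    return (int32_t)(chan->hw_sequence - fence->sequence) >= 0;
}
```

Emitting two fences and letting the simulated GPU reach only the first one leaves the second unsignalled, which matches the point that the stored value says how far the channel has progressed, not which command was submitted last.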
> Since you create one fence object for each pushbuf, I thought that we can
> synchronize only with the last command.
> Not sure if my assumption is correct...

All the commands in the pushbuffer are executed sequentially and the fence setting command is written at the end of the pushbuffer, so when the fence register is updated, all commands will already have been executed.

However, this does indeed mean that to wait for the completion of any command, we also need to wait for the completion of all the other commands in the pushbuffer sent in the pushbuf ioctl call.

This could be improved, but I doubt it is worth the significant additional complexity and CPU performance costs.
Userspace can always send commands in smaller chunks if it wants to.
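The granularity trade-off described here can be put into numbers with a tiny helper. This is a sketch under stated assumptions: `commands_before_fence` is a hypothetical name, commands are counted as equal units, and each pushbuf submission is assumed to end with exactly one fence.

```c
#include <assert.h>
#include <stddef.h>

/* Number of commands that must finish before the fence covering command
 * `index` signals, when `total` commands are submitted in pushbuffers of at
 * most `chunk` commands, each followed by one fence. Indices are 0-based. */
static size_t commands_before_fence(size_t index, size_t total, size_t chunk)
{
    assert(chunk > 0 && index < total);
    size_t end = (index / chunk + 1) * chunk; /* end of this command's chunk */
    return end < total ? end : total;         /* last chunk may be short */
}
```

With 8 commands in a single pushbuf, waiting on the third command (index 2) means waiting for all 8; submitting in chunks of 2 brings that down to 4, at the cost of more ioctl calls and more fences.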
Dear Luca,

> > Since you create one fence object for each pushbuf, I thought that we can
> > synchronize only with the last command.
> > Not sure if my assumption is correct...
>
> All the commands in the pushbuffer are executed sequentially and the
> fence setting command is written at the end of the pushbuffer, so when
> the fence register is updated, all commands will have been already
> executed.
>
> However, this does indeed mean that to wait for the completion of any
> command, we also need to wait for the completion of all the other
> commands in the pushbuffer sent in the pushbuf ioctl call.

Interesting...
So fencing at the end of the pushbuf ioctl results in synchronizing not necessarily with the last command sent from the CPU but with the last command finished on the GPU.

> This could be improved, but I doubt it is worth the significant
> additional complexity and CPU performance costs. Userspace can always
> send commands in smaller chunks if it wants to.

Totally agree with you.
I learned a lot here, thanks!

Best,
- Shinpei
> The presence of some rendering artifacts casts some doubts on this,
> but the consensus is that it is more likely due to missing cache
> flushing or something else.

I cannot comment on pre-NV50, but on NV50 PGRAPH commands are asynchronous for sure. You can insert a PGRAPH command barrier by using method SERIALIZE [0x110].

No idea yet about PFIFO<->PGRAPH sync.

Marcin Kościelnicki
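For illustration, here is roughly what emitting such a barrier into a command buffer could look like. This is a sketch, not driver code: the `model_*` names are hypothetical, the header packing follows my understanding of how the BEGIN_RING macro built method headers at the time (count in bits 18 and up, subchannel in bits 13-15, method offset below), and the data word of 0 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MTHD_SERIALIZE 0x0110u /* method offset quoted in the thread */

/* Pack a method header: word count from bit 18, subchannel in bits 13-15,
 * method offset in the low bits (assumed layout, see the lead-in). */
static uint32_t model_method_header(uint32_t subc, uint32_t mthd, uint32_t size)
{
    assert(subc < 8 && mthd < 0x2000 && (mthd & 3) == 0 && size < 0x800);
    return (size << 18) | (subc << 13) | mthd;
}

/* Write a SERIALIZE barrier into buf, which must hold at least two words.
 * Returns the number of words written. The data word 0 is an assumption. */
static unsigned model_emit_serialize(uint32_t *buf, uint32_t subc)
{
    assert(buf);
    buf[0] = model_method_header(subc, MODEL_MTHD_SERIALIZE, 1);
    buf[1] = 0;
    return 2;
}
```

The subchannel to use depends on where the PGRAPH object is bound in the channel, which this sketch leaves to the caller.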