Dear Devs,

(I hope this question is not too off-topic for this list.)

My question is about fast OpenGL context switching, i.e. when several processes use the same NVIDIA card, each with its own OpenGL context. In my specific case, I am trying to display 720p video simultaneously in multiple windows using OpenGL textures.

To begin with, I have a process that spawns child processes (*not* threads, i.e. using the fork system call).

Each of these child processes has a video decoder that produces bitmaps at 25 fps. I upload the bitmaps to OpenGL using pixel buffer objects (PBOs) and textures.

Each time a PBO is uploaded, and again later when the texture is shown, each process needs to switch to its own context with the GLX call

    glXMakeCurrent(display_id, window_id, context_id)

So there is quite a bit of context switching going on! With 30 windows, that's > 750 context switches per second. I am able to pull this off with Intel and ATI graphics cards, but not with any NVIDIA card (I tried a few of them), neither with the proprietary driver nor with nouveau.

Are there any specific tricks to achieve this? Or should I just use one context from one process, i.e. use multithreading instead of multiprocessing and avoid context switching altogether? (Is there something fundamentally wrong with my approach of multiple processes and context switching?)

Interestingly, when using N>2 child processes (spawned with fork), only one of the textures updates; the others freeze. However, when running the same code as independently launched programs, I can achieve N=10+. This does not always seem to be stable, though (the whole X server sometimes freezes).

On the other hand, I can also use, say, several VLC clients to create a similar situation, and it seems to work.

Any insight is highly appreciated.

Regards,

Sampsa Riikonen
Context switching = very slow on NVIDIA. Don't do it if you can avoid it. Each context is like a megabyte of data, if not more. Each time it has to get saved off and restored.

On Wed, Feb 8, 2017 at 1:38 PM, Sampsa Riikonen <sampsa.riikonen at iki.fi> wrote:
> [snip]
>
> _______________________________________________
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
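
To put the "megabyte per context" remark in perspective, here is a rough back-of-envelope sketch in Python. It is not a measurement: the 1 MiB context size is the reply's own ballpark, the switch rate is the 30 windows at 25 fps from the original post, and counting one save plus one restore per switch is an assumption made here. The function name is hypothetical.

```python
MIB = 1 << 20  # bytes in one mebibyte


def context_traffic_bytes_per_second(switches_per_second, context_bytes,
                                     copies_per_switch=2):
    """Bytes moved per second just to save and restore context state.

    copies_per_switch=2 is an assumption: one save of the outgoing
    context plus one restore of the incoming one.
    """
    return switches_per_second * context_bytes * copies_per_switch


if __name__ == "__main__":
    # 30 windows at 25 fps, one switch per frame (figures from the original post).
    switches = 30 * 25
    traffic = context_traffic_bytes_per_second(switches, 1 * MIB)
    print(f"{switches} switches/s -> {traffic / MIB:.0f} MiB/s of context traffic")
```

Under these assumptions the state traffic alone is on the order of 1.5 GiB per second, before any video data is uploaded. If the context is larger than a megabyte, as the reply suggests it may be, the figure grows proportionally.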
For details, see my presentation on day 3 of XDC2016.

TL;DW: A context switch takes ~25 microseconds on average, but depending on the display resolution and the load on the card, times of up to 130 microseconds have been observed. The average does not appear to differ much between cards, as the (growing) size of the context is balanced by the increased DRAM bandwidth. Measured worst cases appear to improve on higher-end cards, but no hard real-time guarantees are given.

Roy

On 08-02-17 at 18:41, Ilia Mirkin wrote:
> Context switching = very slow on NVIDIA. Don't do it if you can avoid
> it. Each context is like a megabyte of data, if not more. Each time it
> has to get saved off and restored.
>
> On Wed, Feb 8, 2017 at 1:38 PM, Sampsa Riikonen <sampsa.riikonen at iki.fi> wrote:
>> [snip]
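
As a quick sanity check on the numbers in this thread, the following Python sketch combines the switch rate from the original post (30 windows at 25 fps) with the ~25 microsecond average and 130 microsecond worst case quoted above. The function name and the `switches_per_frame` parameter are hypothetical, and the sketch assumes switches are serialized and cost nothing beyond the quoted time.

```python
def switch_time_fraction(windows, fps, switches_per_frame, switch_us):
    """Fraction of each second spent purely on context switches.

    Assumes every switch costs switch_us microseconds and that
    switches do not overlap with other work.
    """
    switches_per_second = windows * fps * switches_per_frame
    return switches_per_second * switch_us * 1e-6


if __name__ == "__main__":
    for label, cost_us in (("average", 25.0), ("worst case", 130.0)):
        # One switch per frame gives the 750 switches/s from the original post.
        frac = switch_time_fraction(30, 25, 1, cost_us)
        print(f"{label}: {frac * 1000:.2f} ms of every second")
```

With one switch per frame this comes to roughly 19 ms per second on average and just under 100 ms per second in the worst case. If the upload and the display each need their own switch (`switches_per_frame=2`), both figures double.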