thr3ads.net - Linux Virtualization - [PATCH 05/10] s390/cio: introduce DMA pools to cio [May 2019]

If this information is useful, please help other people find it:
Share via:

Cornelia Huck

2019-May-10 14:10 UTC

[PATCH 05/10] s390/cio: introduce DMA pools to cio

On Fri, 10 May 2019 00:11:12 +0200
Halil Pasic <pasic at linux.ibm.com> wrote:
> On Thu, 9 May 2019 12:11:06 +0200
> Cornelia Huck <cohuck at redhat.com> wrote:
> 
> > On Wed, 8 May 2019 23:22:10 +0200
> > Halil Pasic <pasic at linux.ibm.com> wrote:
> >   
> > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > Sebastian Ott <sebott at linux.ibm.com> wrote:  
> >   
> > > > > @@ -1063,6 +1163,7 @@ static int __init
css_bus_init(void)
> > > > >  	
unregister_reboot_notifier(&css_reboot_notifier);
> > > > >  		goto out_unregister;
> > > > >  	}
> > > > > +	cio_dma_pool_init();      
> > > > 
> > > > This is too late for early devices (ccw console!).    
> > > 
> > > You have already raised concern about this last time (thanks). I
think,
> > > I've addressed this issue: tje cio_dma_pool is only used by
the airq
> > > stuff. I don't think the ccw console needs it. Please have an
other look
> > > at patch #6, and explain your concern in more detail if it
persists.
> > 
> > What about changing the naming/adding comments here, so that (1) folks
> > aren't confused by the same thing in the future and (2) folks
don't try
> > to use that pool for something needed for the early ccw consoles?
> >   
> 
> I'm all for clarity! Suggestions for better names?
css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
need it?

Halil Pasic

2019-May-12 18:22 UTC

head link

[PATCH 05/10] s390/cio: introduce DMA pools to cio

On Fri, 10 May 2019 16:10:13 +0200
Cornelia Huck <cohuck at redhat.com> wrote:
> On Fri, 10 May 2019 00:11:12 +0200
> Halil Pasic <pasic at linux.ibm.com> wrote:
> 
> > On Thu, 9 May 2019 12:11:06 +0200
> > Cornelia Huck <cohuck at redhat.com> wrote:
> > 
> > > On Wed, 8 May 2019 23:22:10 +0200
> > > Halil Pasic <pasic at linux.ibm.com> wrote:
> > >   
> > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > Sebastian Ott <sebott at linux.ibm.com> wrote:  
> > >   
> > > > > > @@ -1063,6 +1163,7 @@ static int __init
css_bus_init(void)
> > > > > >  	
unregister_reboot_notifier(&css_reboot_notifier);
> > > > > >  		goto out_unregister;
> > > > > >  	}
> > > > > > +	cio_dma_pool_init();      
> > > > > 
> > > > > This is too late for early devices (ccw console!).    
> > > > 
> > > > You have already raised concern about this last time
(thanks). I think,
> > > > I've addressed this issue: the cio_dma_pool is only used
by the airq
> > > > stuff. I don't think the ccw console needs it. Please
have an other look
> > > > at patch #6, and explain your concern in more detail if it
persists.
> > > 
> > > What about changing the naming/adding comments here, so that (1)
folks
> > > aren't confused by the same thing in the future and (2) folks
don't try
> > > to use that pool for something needed for the early ccw consoles?
> > >   
> > 
> > I'm all for clarity! Suggestions for better names?
> 
> css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> need it?
> 
Ouch! I was considering to use cio_dma_zalloc() for the adapter
interruption vectors but I ended up between the two chairs in the end.
So with this series there are no uses for cio_dma pool.

I don't feel strongly about this going one way the other.

Against getting rid of the cio_dma_pool and sticking with the speaks
dma_alloc_coherent() that we waste a DMA page per vector, which is a
non obvious side effect.

What speaks against cio_dma_pool is that it is slightly more code, and
this currently can not be used for very early stuff, which I don't
think is relevant. What also used to speak against it is that
allocations asking for more than a page would just fail, but I addressed
that in the patch I've hacked up on top of the series, and I'm going to
paste below. While at it I addressed some other issues as well.

I've also got code that deals with AIRQ_IV_CACHELINE by turning the
kmem_cache into a dma_pool.

Cornelia, Sebastian which approach do you prefer:
1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
vector, or
2) go with the approach taken by the patch below?


Regards,
Halil
-----------------------8<---------------------------------------------
From: Halil Pasic <pasic at linux.ibm.com>
Date: Sun, 12 May 2019 18:08:05 +0200
Subject: [PATCH] WIP: use cio dma pool for airqs

Let's not waste a DMA page per adapter interrupt bit vector.
---
Lightly tested...
---
 arch/s390/include/asm/airq.h |  1 -
 drivers/s390/cio/airq.c      | 10 +++-------
 drivers/s390/cio/css.c       | 18 +++++++++++++++---
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 1492d48..981a3eb 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
 	unsigned long *vector;	/* Adapter interrupt bit vector */
-	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
 	unsigned long *avail;	/* Allocation bit mask for the bit vector */
 	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
 	unsigned long *ptr;	/* Pointer associated with each bit */
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index 7a5c0a0..f11f437 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned
long flags)
 		goto out;
 	iv->bits = bits;
 	size = iv_size(bits);
-	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
-						 &iv->vector_dma, GFP_KERNEL);
+	iv->vector = cio_dma_zalloc(size);
 	if (!iv->vector)
 		goto out_free;
 	if (flags & AIRQ_IV_ALLOC) {
@@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned
long flags)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->avail);
-	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
-			  iv->vector_dma);
+	cio_dma_free(iv->vector, size);
 	kfree(iv);
 out:
 	return NULL;
@@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
 	kfree(iv->data);
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
-	kfree(iv->vector);
-	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
-			  iv->vector, iv->vector_dma);
+	cio_dma_free(iv->vector, iv_size(iv->bits));
 	kfree(iv->avail);
 	kfree(iv);
 }
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index 7087cc3..88d9c92 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device
*dma_dev, int nr_pages)
 static void __gp_dma_free_dma(struct gen_pool *pool,
 			      struct gen_pool_chunk *chunk, void *data)
 {
-	dma_free_coherent((struct device *) data, PAGE_SIZE,
+
+	size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
+
+	dma_free_coherent((struct device *) data, chunk_size,
 			 (void *) chunk->start_addr,
 			 (dma_addr_t) chunk->phys_addr);
 }
@@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct
device *dma_dev,
 {
 	dma_addr_t dma_addr;
 	unsigned long addr = gen_pool_alloc(gp_dma, size);
+	size_t chunk_size;
 
 	if (!addr) {
+		chunk_size = round_up(size, PAGE_SIZE);
 		addr = (unsigned long) dma_alloc_coherent(dma_dev,
-					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
+					 chunk_size, &dma_addr, CIO_DMA_GFP);
 		if (!addr)
 			return NULL;
-		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
+		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
 		addr = gen_pool_alloc(gp_dma, size);
 	}
 	return (void *) addr;
@@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void
*cpu_addr, size_t size)
 	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
 }
 
+/**
+ * Allocate dma memory from the css global pool. Intended for memory not
+ * specific to any single device within the css.
+ *
+ * Caution: Not suitable for early stuff like console.
+ *
+ */
 void *cio_dma_zalloc(size_t size)
 {
 	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
-- 
2.5.5

Cornelia Huck

2019-May-13 13:29 UTC

head link

[PATCH 05/10] s390/cio: introduce DMA pools to cio

On Sun, 12 May 2019 20:22:56 +0200
Halil Pasic <pasic at linux.ibm.com> wrote:
> On Fri, 10 May 2019 16:10:13 +0200
> Cornelia Huck <cohuck at redhat.com> wrote:
> 
> > On Fri, 10 May 2019 00:11:12 +0200
> > Halil Pasic <pasic at linux.ibm.com> wrote:
> >   
> > > On Thu, 9 May 2019 12:11:06 +0200
> > > Cornelia Huck <cohuck at redhat.com> wrote:
> > >   
> > > > On Wed, 8 May 2019 23:22:10 +0200
> > > > Halil Pasic <pasic at linux.ibm.com> wrote:
> > > >     
> > > > > On Wed, 8 May 2019 15:18:10 +0200 (CEST)
> > > > > Sebastian Ott <sebott at linux.ibm.com> wrote:
> > > >     
> > > > > > > @@ -1063,6 +1163,7 @@ static int __init
css_bus_init(void)
> > > > > > >  	
unregister_reboot_notifier(&css_reboot_notifier);
> > > > > > >  		goto out_unregister;
> > > > > > >  	}
> > > > > > > +	cio_dma_pool_init();        
> > > > > > 
> > > > > > This is too late for early devices (ccw console!).
> > > > > 
> > > > > You have already raised concern about this last time
(thanks). I think,
> > > > > I've addressed this issue: the cio_dma_pool is only
used by the airq
> > > > > stuff. I don't think the ccw console needs it.
Please have an other look
> > > > > at patch #6, and explain your concern in more detail if
it persists.
> > > > 
> > > > What about changing the naming/adding comments here, so that
(1) folks
> > > > aren't confused by the same thing in the future and (2)
folks don't try
> > > > to use that pool for something needed for the early ccw
consoles?
> > > >     
> > > 
> > > I'm all for clarity! Suggestions for better names?  
> > 
> > css_aiv_dma_pool, maybe? Or is there other cross-device stuff that may
> > need it?
> >   
> 
> Ouch! I was considering to use cio_dma_zalloc() for the adapter
> interruption vectors but I ended up between the two chairs in the end.
> So with this series there are no uses for cio_dma pool.
> 
> I don't feel strongly about this going one way the other.
> 
> Against getting rid of the cio_dma_pool and sticking with the speaks
> dma_alloc_coherent() that we waste a DMA page per vector, which is a
> non obvious side effect.
That would basically mean one DMA page per virtio-ccw device, right?
For single queue devices, this seems like quite a bit of overhead.

Are we expecting many devices in use per guest?
> 
> What speaks against cio_dma_pool is that it is slightly more code, and
> this currently can not be used for very early stuff, which I don't
> think is relevant. 
Unless properly documented, it feels like something you can easily trip
over, however.

I assume that the "very early stuff" is basically only ccw consoles.
Not sure if we can use virtio-serial as an early ccw console -- IIRC
that was only about 3215/3270? While QEMU guests are able to use a 3270
console, this is experimental and I would not count that as a use case.
Anyway, 3215/3270 don't use adapter interrupts, and probably not
anything cross-device, either; so unless early virtio-serial is a
thing, this restriction is fine if properly documented.
> What also used to speak against it is that
> allocations asking for more than a page would just fail, but I addressed
> that in the patch I've hacked up on top of the series, and I'm
going to
> paste below. While at it I addressed some other issues as well.
Hm, which "other issues"?
> 
> I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> kmem_cache into a dma_pool.
Isn't that percolating to other airq users again? Or maybe I just don't
understand what you're proposing here...
> 
> Cornelia, Sebastian which approach do you prefer:
> 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> vector, or
> 2) go with the approach taken by the patch below?
I'm not sure that I properly understand this (yeah, you probably
guessed); so I'm not sure I can make a good call here.
> 
> 
> Regards,
> Halil
> -----------------------8<---------------------------------------------
> From: Halil Pasic <pasic at linux.ibm.com>
> Date: Sun, 12 May 2019 18:08:05 +0200
> Subject: [PATCH] WIP: use cio dma pool for airqs
> 
> Let's not waste a DMA page per adapter interrupt bit vector.
> ---
> Lightly tested...
> ---
>  arch/s390/include/asm/airq.h |  1 -
>  drivers/s390/cio/airq.c      | 10 +++-------
>  drivers/s390/cio/css.c       | 18 +++++++++++++++---
>  3 files changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 1492d48..981a3eb 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -30,7 +30,6 @@ void unregister_adapter_interrupt(struct airq_struct
*airq);
>  /* Adapter interrupt bit vector */
>  struct airq_iv {
>  	unsigned long *vector;	/* Adapter interrupt bit vector */
> -	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
>  	unsigned long *avail;	/* Allocation bit mask for the bit vector */
>  	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
>  	unsigned long *ptr;	/* Pointer associated with each bit */
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index 7a5c0a0..f11f437 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -136,8 +136,7 @@ struct airq_iv *airq_iv_create(unsigned long bits,
unsigned long flags)
>  		goto out;
>  	iv->bits = bits;
>  	size = iv_size(bits);
> -	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
> -						 &iv->vector_dma, GFP_KERNEL);
> +	iv->vector = cio_dma_zalloc(size);
>  	if (!iv->vector)
>  		goto out_free;
>  	if (flags & AIRQ_IV_ALLOC) {
> @@ -172,8 +171,7 @@ struct airq_iv *airq_iv_create(unsigned long bits,
unsigned long flags)
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
>  	kfree(iv->avail);
> -	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
> -			  iv->vector_dma);
> +	cio_dma_free(iv->vector, size);
>  	kfree(iv);
>  out:
>  	return NULL;
> @@ -189,9 +187,7 @@ void airq_iv_release(struct airq_iv *iv)
>  	kfree(iv->data);
>  	kfree(iv->ptr);
>  	kfree(iv->bitlock);
> -	kfree(iv->vector);
> -	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
> -			  iv->vector, iv->vector_dma);
> +	cio_dma_free(iv->vector, iv_size(iv->bits));
>  	kfree(iv->avail);
>  	kfree(iv);
>  }
> diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
> index 7087cc3..88d9c92 100644
> --- a/drivers/s390/cio/css.c
> +++ b/drivers/s390/cio/css.c
> @@ -1063,7 +1063,10 @@ struct gen_pool *cio_gp_dma_create(struct device
*dma_dev, int nr_pages)
>  static void __gp_dma_free_dma(struct gen_pool *pool,
>  			      struct gen_pool_chunk *chunk, void *data)
>  {
> -	dma_free_coherent((struct device *) data, PAGE_SIZE,
> +
> +	size_t chunk_size = chunk->end_addr - chunk->start_addr + 1;
> +
> +	dma_free_coherent((struct device *) data, chunk_size,
>  			 (void *) chunk->start_addr,
>  			 (dma_addr_t) chunk->phys_addr);
>  }
> @@ -1088,13 +1091,15 @@ void *cio_gp_dma_zalloc(struct gen_pool *gp_dma,
struct device *dma_dev,
>  {
>  	dma_addr_t dma_addr;
>  	unsigned long addr = gen_pool_alloc(gp_dma, size);
> +	size_t chunk_size;
>  
>  	if (!addr) {
> +		chunk_size = round_up(size, PAGE_SIZE);
Doesn't that mean that we still go up to chunks of at least PAGE_SIZE?
Or can vectors now share the same chunk?
>  		addr = (unsigned long) dma_alloc_coherent(dma_dev,
> -					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
> +					 chunk_size, &dma_addr, CIO_DMA_GFP);
>  		if (!addr)
>  			return NULL;
> -		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
> +		gen_pool_add_virt(gp_dma, addr, dma_addr, chunk_size, -1);
>  		addr = gen_pool_alloc(gp_dma, size);
>  	}
>  	return (void *) addr;
> @@ -1108,6 +1113,13 @@ void cio_gp_dma_free(struct gen_pool *gp_dma, void
*cpu_addr, size_t size)
>  	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
>  }
>  
> +/**
> + * Allocate dma memory from the css global pool. Intended for memory not
> + * specific to any single device within the css.
> + *
> + * Caution: Not suitable for early stuff like console.
Maybe add "Do not use prior to <point in startup>"?
> + *
> + */
>  void *cio_dma_zalloc(size_t size)
>  {
>  	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);

Sebastian Ott

2019-May-16 13:59 UTC

head link

[PATCH 05/10] s390/cio: introduce DMA pools to cio

On Sun, 12 May 2019, Halil Pasic wrote:> I've also got code that deals with AIRQ_IV_CACHELINE by turning the
> kmem_cache into a dma_pool.
> 
> Cornelia, Sebastian which approach do you prefer:
> 1) get rid of cio_dma_pool and AIRQ_IV_CACHELINE, and waste a page per
> vector, or
> 2) go with the approach taken by the patch below?
We only have a couple of users for airq_iv:

virtio_ccw.c: 2K bits

pci with floating IRQs: <= 2K (for the per-function bit vectors)
                        1..4K (for the summary bit vector)

pci with CPU directed IRQs: 2K (for the per-CPU bit vectors)
                            1..nr_cpu (for the summary bit vector)


The options are:
* page allocations for everything
* dma_pool for AIRQ_IV_CACHELINE ,gen_pool for others
* dma_pool for everything

I think we should do option 3 and use a dma_pool with cachesize
alignment for everything (as a prerequisite we have to limit
config PCI_NR_FUNCTIONS to 2K - but that is not a real constraint).

Sebastian

Reasonably Related Threads

Search for more possibly parallel threads

Linux Virtualization - May 2019 - [PATCH 05/10] s390/cio: introduce DMA pools to cio

[PATCH 05/10] s390/cio: introduce DMA pools to cio

[PATCH 05/10] s390/cio: introduce DMA pools to cio

[PATCH 05/10] s390/cio: introduce DMA pools to cio

[PATCH 05/10] s390/cio: introduce DMA pools to cio

Reasonably Related Threads