thr3ads.net - Xen devel - [Xen-devel] x86 swiotlb questions [Dec 2006]

If this information is useful, please help other people find it:
Share via:

Jan Beulich

2006-Dec-15 12:50 UTC

[Xen-devel] x86 swiotlb questions

I''m not certain when this code was changed significantly from what I
remember, but

- What is the purpose of using alloc_bootmem_low variants here? I.e., where is
the
dependency on physical addresses being below 4G here (machine addresses are
being restricted after the allocation anyway)? The panic message text after the
failed allocation is confusing me additionally.

- While I can see the idea behind the overflow buffer, it doesn''t seem
to prevent
data corruption, and if I understand it correctly it doesn''t even
prevent memory
corruption (since its machine address doesn''t get restricted anywhere,
so the fall
back return value would not necessarily meet the device requirements).

- With various parameters now being command line configurable, if any of these
get set inconsistently or incorrectly the user would probably get a cryptic
crash
(from the BUG_ON() following the call to xen_create_contiguous_region()).

- The default bit width for DMA (also in Xen itself) was now changed to 30, just
because of a single device (b44). Shouldn''t it, with the settings being
customizable now, rather be 32 (and those who own such ill devices need to
make use of the option)?

- The DMA bit widths can be set to different values in Xen and kernel, which can
lead to surprising results, I would think. Shouldn''t the kernel rather
obtain Xen''s
value, so they are consistent?

Thanks, Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-15 13:35 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 15/12/06 12:50, "Jan Beulich" <jbeulich@novell.com> wrote:
> I''m not certain when this code was changed significantly from what
I remember,
> but
> 
> - What is the purpose of using alloc_bootmem_low variants here? I.e., where
is
> the dependency on physical addresses being below 4G here (machine addresses
are> being restricted after the allocation anyway)? The panic message text after
> the
> failed allocation is confusing me additionally.
This is how it''s always been since we took ia64''s swiotlb.c.
> - While I can see the idea behind the overflow buffer, it doesn''t
seem to
> prevent
> data corruption, and if I understand it correctly it doesn''t even
prevent
> memory
> corruption (since its machine address doesn''t get restricted
anywhere, so the
> fall
> back return value would not necessarily meet the device requirements).
Same here. We didn''t implement this. It doesn''t seem to make
that much
sense. Sync''ing with lib/swiotb.c and throwing away our special one
would be
very nice. :-)
> - With various parameters now being command line configurable, if any of
these
> get set inconsistently or incorrectly the user would probably get a cryptic
> crash
> (from the BUG_ON() following the call to xen_create_contiguous_region()).
True. Best know what you''re doing if you mess with these.
> - The default bit width for DMA (also in Xen itself) was now changed to 30,
> just
> because of a single device (b44). Shouldn''t it, with the settings
being
> customizable now, rather be 32 (and those who own such ill devices need to
> make use of the option)?
Arguable. Depends how common 31- and 30-bit limitations are. Limiting the
swiotlb to 1GB of memory doesn''t seem that harsh -- how much swiotlb
memory
are you likely to want, system wide?
> - The DMA bit widths can be set to different values in Xen and kernel,
which
> can
> lead to surprising results, I would think. Shouldn''t the kernel
rather obtain
> Xen''s
> value, so they are consistent?
We would like to generalise Xen''s heap allocator so that it keeps
separate
heaps for different bit widths. Then there would be no ''DMA
width'' or ''DMA
pool'' in Xen.

 -- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-15 13:53 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>> - The DMA bit widths can be set to different values in Xen and kernel,
which
>> can
>> lead to surprising results, I would think. Shouldn''t the
kernel rather obtain
>> Xen''s
>> value, so they are consistent?
>
>We would like to generalise Xen''s heap allocator so that it keeps
separate
>heaps for different bit widths. Then there would be no ''DMA
width'' or ''DMA
>pool'' in Xen.
I already have patches ready to do this (the DMA thing really is a nice side
effect, I mostly wanted it for 32on64, so that I can restrict domain
allocations for 32-bit domains). Are you saying I should throw away the
DMA specialization then altogether (I already have no special DMA heap
anymore)? The leftovers from it are so that one can reserve some portion
of low memory to be returned only when the width restriction is low enough
(i.e. to retain dma_emergency_pool functionality), which certainly
isn''t
really appropriate anymore now (it should rather be a percentage or
something like that, so that the lower you get the more of the memory
remains reserved for specialized allocations).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-15 14:03 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 15/12/06 13:53, "Jan Beulich" <jbeulich@novell.com> wrote:
> I already have patches ready to do this (the DMA thing really is a nice
side
> effect, I mostly wanted it for 32on64, so that I can restrict domain
> allocations for 32-bit domains). Are you saying I should throw away the
> DMA specialization then altogether (I already have no special DMA heap
> anymore)? The leftovers from it are so that one can reserve some portion
> of low memory to be returned only when the width restriction is low enough
> (i.e. to retain dma_emergency_pool functionality), which certainly
isn''t
> really appropriate anymore now (it should rather be a percentage or
> something like that, so that the lower you get the more of the memory
> remains reserved for specialized allocations).
I think dma_emergency_pool as is can go. Possibly it should be replaced by
allocator-management tools in dom0 to allow setting of limits on a
per-bitwidth basis.

Is this one of the patches you already sent in your 32-on-64 batch, or an
additional one?

 -- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-15 14:17 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>>> Keir Fraser <keir@xensource.com> 15.12.06 15:03 >>>
>On 15/12/06 13:53, "Jan Beulich" <jbeulich@novell.com>
wrote:
>
>> I already have patches ready to do this (the DMA thing really is a nice
side
>> effect, I mostly wanted it for 32on64, so that I can restrict domain
>> allocations for 32-bit domains). Are you saying I should throw away the
>> DMA specialization then altogether (I already have no special DMA heap
>> anymore)? The leftovers from it are so that one can reserve some
portion
>> of low memory to be returned only when the width restriction is low
enough
>> (i.e. to retain dma_emergency_pool functionality), which certainly
isn''t
>> really appropriate anymore now (it should rather be a percentage or
>> something like that, so that the lower you get the more of the memory
>> remains reserved for specialized allocations).
>
>I think dma_emergency_pool as is can go. Possibly it should be replaced by
>allocator-management tools in dom0 to allow setting of limits on a
>per-bitwidth basis.
Okay, but I think I''ll leave this as a separate change (that we
probably first
should reach agreement on what it really ought to do and not do).
>Is this one of the patches you already sent in your 32-on-64 batch, or an
>additional one?
An additional one (or actually, a set of them, to make the individual steps
more clear). As a followup, I''m also planning to get rid of the Xen
heaps
on those arches where they aren''t needed (x86-64, not sure about ppc
and ia64, but I would assume it''s really only x86-32 that needs it).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-15 14:19 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 15/12/06 14:17, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Is this one of the patches you already sent in your 32-on-64 batch, or
an
>> additional one?
> 
> An additional one (or actually, a set of them, to make the individual steps
> more clear). As a followup, I''m also planning to get rid of the
Xen heaps
> on those arches where they aren''t needed (x86-64, not sure about
ppc
> and ia64, but I would assume it''s really only x86-32 that needs
it).
Yes, that would be great. There are a couple of minor caveats around special
semantics of the Xen heap but those could do with cleaning up and being made
explicit anyway.

 -- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-15 14:46 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>>> Keir Fraser <keir@xensource.com> 15.12.06 15:19 >>>
>On 15/12/06 14:17, "Jan Beulich" <jbeulich@novell.com>
wrote:
>
>>> Is this one of the patches you already sent in your 32-on-64 batch,
or an
>>> additional one?
>> 
>> An additional one (or actually, a set of them, to make the individual
steps
>> more clear). As a followup, I''m also planning to get rid of
the Xen heaps
>> on those arches where they aren''t needed (x86-64, not sure
about ppc
>> and ia64, but I would assume it''s really only x86-32 that
needs it).
>
>Yes, that would be great. There are a couple of minor caveats around special
>semantics of the Xen heap but those could do with cleaning up and being made
>explicit anyway.
Mind adding a little more detail here?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Alan

2006-Dec-15 16:19 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

> Arguable. Depends how common 31- and 30-bit limitations are. Limiting the
> swiotlb to 1GB of memory doesn''t seem that harsh -- how much
swiotlb memory
> are you likely to want, system wide?
31bit pops up now and then, 30bit also. There are a few devices that are
even worse but they are older sound cards so not important.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-15 16:47 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 15/12/06 14:46, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Yes, that would be great. There are a couple of minor caveats around
special
>> semantics of the Xen heap but those could do with cleaning up and being
made
>> explicit anyway.
> 
> Mind adding a little more detail here?
Some allocations need to be below 4GB (e.g., domain pointers, as they get
pickled into a 32-bit field in page structures). These should use an
explicit address-width parameter to the allocator.

Pages allocated from the ''Xen heap'' but shared with a domain
have special
semantics when their refcount falls to zero. This includes shared_info and
grant-table pages. We may need an extra PGC_ bit to flag these pages.

That''s all I can think of.

 -- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-18 07:44 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>> - What is the purpose of using alloc_bootmem_low variants here? I.e.,
where is
>> the dependency on physical addresses being below 4G here (machine
addresses
>are
>> being restricted after the allocation anyway)? The panic message text
after
>> the
>> failed allocation is confusing me additionally.
>
>This is how it''s always been since we took ia64''s
swiotlb.c.
Okay, then I must have forgotten about how it looked like. However, the
specific panic message has a Xen-specific addition, so I still wonder what its
background is...
>> - While I can see the idea behind the overflow buffer, it
doesn''t seem to
>> prevent
>> data corruption, and if I understand it correctly it doesn''t
even prevent
>> memory
>> corruption (since its machine address doesn''t get restricted
anywhere, so the
>> fall
>> back return value would not necessarily meet the device requirements).
>
>Same here. We didn''t implement this. It doesn''t seem to
make that much
>sense. Sync''ing with lib/swiotb.c and throwing away our special one
would be
>very nice. :-)
Trying to do that I find one extra issue: in_swiotlb_aperture() does its check
based on pfn, while lib/swiotlb.c uses the virtual address in the respective
checks instead. Is there some subtlety behind that (that then should be
commented upon), or is this just due to this originally having been an
mfn-based check?

Thanks, Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-18 09:39 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 18/12/06 7:44 am, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Same here. We didn''t implement this. It doesn''t seem
to make that much
>> sense. Sync''ing with lib/swiotb.c and throwing away our
special one would be
>> very nice. :-)
> 
> Trying to do that I find one extra issue: in_swiotlb_aperture() does its
check
> based on pfn, while lib/swiotlb.c uses the virtual address in the
respective
> checks instead. Is there some subtlety behind that (that then should be
> commented upon), or is this just due to this originally having been an
> mfn-based check?
Yes, it''s because we used to do an mfn range check which was okay when
the
swiotlb aperture was filled with contiguous machine memory. Since it is
composed of discontiguous slabs now, we changed to a pfn check but that
could equally well be a virtual-address check.

Do we merge okay with lib/swiotlb.c then? One concern I had was with our
preferred setup semantics -- we really want the user to be able to forcibly
enable the swiotlb via a boot parameter *but* not have to suffer using it
for every DMA operation. Last I looked the generic swiotlb didn''t have
that
option. That and our very Xen-specific checks for whether to auto-enable the
swiotlb led me to think that the very start-of-day setup of swiotlb would
need to be overridable by architecture.

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-19 12:48 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>Do we merge okay with lib/swiotlb.c then?
Not yet - because of the highmem handling needed for i386. I wonder, however,
how native Linux gets away with not handling this through swiotlb, and why
nevertheless Xen needs to special case this. Any ideas?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-19 14:14 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 19/12/06 12:48, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Do we merge okay with lib/swiotlb.c then?
> 
> Not yet - because of the highmem handling needed for i386. I wonder,
however,
> how native Linux gets away with not handling this through swiotlb, and why
> nevertheless Xen needs to special case this. Any ideas?
Probably because GFP_KERNEL and GFP_DMA allocations are guaranteed to be
DMAable by 30-bit-capable devices on native, but not on Xen.

 -- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-19 14:39 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>>> Keir Fraser <keir@xensource.com> 19.12.06 15:14 >>>
>On 19/12/06 12:48, "Jan Beulich" <jbeulich@novell.com>
wrote:
>
>>> Do we merge okay with lib/swiotlb.c then?
>> 
>> Not yet - because of the highmem handling needed for i386. I wonder,
however,
>> how native Linux gets away with not handling this through swiotlb, and
why
>> nevertheless Xen needs to special case this. Any ideas?
>
>Probably because GFP_KERNEL and GFP_DMA allocations are guaranteed to be
>DMAable by 30-bit-capable devices on native, but not on Xen.
Not sure I understand your thinking here. Nothing prevents user pages (or
anything
else that I/O may happen against) to come from highmem, hence the bounce logic
in mm/highmem.c needs to control this anyway (as I understand it). And since all
we''re talking about here are physical addresses (and their translations
to virtual
ones), I would rather conclude that we''ll never see a page in the I/O
path that
page_address() would return NULL for, but if that''s the case, then
there''s no need
to kmap such pages or to favor page_to_bus() over virt_to_bus(page_address()).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-19 14:46 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

One thing is we change the value of PCI_BUS_IS_PHYS (or some similar macro
which I can''t quite remember the name of) which I believe turns off
some
bounce-buffer logic contained within the block-device subsystem. So that
will mean that we get highmem requests hitting the DMA interfaces, where on
native they would have got filtered earlier by the highmem/lowmem bounce
buffer logic that is specific to block-device requests.

Obviously we want to turn off that lowmem/highmem bounce buffer logic on Xen
as it is nonsense when there is no direct correspondence to low/high machine
addresses.

 -- Keir

On 19/12/06 14:39, "Jan Beulich" <jbeulich@novell.com> wrote:
>>> Not yet - because of the highmem handling needed for i386. I
wonder,
>>> however,
>>> how native Linux gets away with not handling this through swiotlb,
and why
>>> nevertheless Xen needs to special case this. Any ideas?
>> 
>> Probably because GFP_KERNEL and GFP_DMA allocations are guaranteed to
be
>> DMAable by 30-bit-capable devices on native, but not on Xen.
> 
> Not sure I understand your thinking here. Nothing prevents user pages (or
> anything
> else that I/O may happen against) to come from highmem, hence the bounce
logic
> in mm/highmem.c needs to control this anyway (as I understand it). And
since
> all
> we''re talking about here are physical addresses (and their
translations to
> virtual
> ones), I would rather conclude that we''ll never see a page in the
I/O path
> that
> page_address() would return NULL for, but if that''s the case, then
there''s no
> need
> to kmap such pages or to favor page_to_bus() over
virt_to_bus(page_address()).

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Muli Ben-Yehuda

2006-Dec-19 17:07 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On Tue, Dec 19, 2006 at 02:46:41PM +0000, Keir Fraser
wrote:> 
> One thing is we change the value of PCI_BUS_IS_PHYS (or some similar macro
> which I can''t quite remember the name of) which I believe turns
off some
> bounce-buffer logic contained within the block-device subsystem. So that
> will mean that we get highmem requests hitting the DMA interfaces, where on
> native they would have got filtered earlier by the highmem/lowmem bounce
> buffer logic that is specific to block-device requests.
PCI_DMA_BUS_IS_PHYS, commonly found in asm-$(arch)/pci.h.
 
Cheers,
Muli

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-20 16:40 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>Do we merge okay with lib/swiotlb.c then? One concern I had was with our
>preferred setup semantics -- we really want the user to be able to forcibly
>enable the swiotlb via a boot parameter *but* not have to suffer using it
>for every DMA operation. Last I looked the generic swiotlb didn''t
have that
>option. That and our very Xen-specific checks for whether to auto-enable the
>swiotlb led me to think that the very start-of-day setup of swiotlb would
>need to be overridable by architecture.
I think I retained all of the semantics, attached the patches as I have them
by now. This is a submission for review only, as the first four patches will
need to go to kernel.org (and hopefully will get accepted). The Xen
customization is fairly ugly, but I didn''t see anything nicer than that
while
also keeping the amount of changes to lib/swiotlb.c reasonable.

Patch order is
swiotlb-bugs.patch
swiotlb-bus.patch
swiotlb-cleanup.patch
swiotlb-split.patch
xen-swiotlb.patch







_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-22 14:49 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

Patch update, fixing a bug on x86/PAE, and making include/xen/swiotlb.h look
a lot nicer (but still not really nice). My plan is to submit the non-Xen ones
to
lkml right after New Year, unless I hear negative feedback.

What are the plans on the Xen side - pull the non-Xen patches into patches/,
or ignore everything until (hopefully) mainline has picked up some or all of
the native ones?

Jan
>>Do we merge okay with lib/swiotlb.c then? One concern I had was with our
>>preferred setup semantics -- we really want the user to be able to
forcibly
>>enable the swiotlb via a boot parameter *but* not have to suffer using
it
>>for every DMA operation. Last I looked the generic swiotlb
didn''t have that
>>option. That and our very Xen-specific checks for whether to auto-enable
the
>>swiotlb led me to think that the very start-of-day setup of swiotlb
would
>>need to be overridable by architecture.
>
>I think I retained all of the semantics, attached the patches as I have them
>by now. This is a submission for review only, as the first four patches will
>need to go to kernel.org (and hopefully will get accepted). The Xen
>customization is fairly ugly, but I didn''t see anything nicer than
that while
>also keeping the amount of changes to lib/swiotlb.c reasonable.
>
>Patch order is
>swiotlb-bugs.patch
>swiotlb-bus.patch
>swiotlb-cleanup.patch
>swiotlb-split.patch
>xen-swiotlb.patch






_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-22 16:20 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

One more thing: Is it really necessary to restrict dma_alloc_coherent() to
dma_bits?
I.e., couldn''t we, once the bit-level page allocator is merged, use the
real bit width
needed for the requesting device here? If not, this would then permit using the
original implementation of swiotlb_dma_supported() (as dma_alloc_coherent() then
no longer depends on dma_bits), and perhaps even auto-setting dma_bits based
on what memory we can get out of Xen in swiotlb_init(), making the mismatching
of
command line options (between Xen and kernel) impossible (the kernel simply
wouldn''t have one anymore).

As a nice side effect, using the original implementation of
swiotlb_dma_supported()
would require slightly less tweaking of lib/swiotlb.c, hence slightly raising
the
chances of the changes getting accepted into mainline. And clearly, if the
kernel
manages to allocate the swiotlb at an address with less than dma_bits bits,
there
seems to be no reason to refuse use of I/O devices that the actual buffer fits,
but
dma_bits doesn''t.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Herbert Xu

2006-Dec-22 21:00 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On Fri, Dec 22, 2006 at 04:20:28PM +0000, Jan Beulich
wrote:> One more thing: Is it really necessary to restrict dma_alloc_coherent() to
dma_bits?
> I.e., couldn''t we, once the bit-level page allocator is merged,
use the real bit width
> needed for the requesting device here? If not, this would then permit using
the
Yes I think that should work.  If the width is less than what the
hypervisor supports it''ll simply fail which is better than returning
the wrong memory silently.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-23 09:48 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

Firstly, I think we should wait and see if the patches are acceptable
upstream in their current form before switching to using them.

As for dma_alloc_coherent() -- yes it makes sense to make an allocation
request with the device''s specified bit width. And we could be
opportunistic
about the bit width we advertise from swiotlb if we happen to get lower
memory than we asked for. *but*:
 1. Our swiotlb is made up of separately allocated strides, so the swiotlb
is not contiguous in machine memory. That needs to be kept in mind when
calculating the bit width as it''ll be max over all strides.
 2. Also because of this, the existing swiotlb_dma_supported() cannot work
as is. Firstly it would need to use virt_to_machine(), and even then it
doesn''t take into account that the aperture is not contiguous in
machine
memory.

And as I said before, dma_bits will disappear from Xen in due course when
the dma pool goes away and is replaced with something more flexible. The
plan is to leave it (or a similarly-named parameter) in Linux, at least as a
guide to the swiotlb pre-allocation (even if no longer used for
dma_alloc_coherent).

 -- Keir

On 22/12/06 4:20 pm, "Jan Beulich" <jbeulich@novell.com> wrote:
> One more thing: Is it really necessary to restrict dma_alloc_coherent() to
> dma_bits?
> I.e., couldn''t we, once the bit-level page allocator is merged,
use the real
> bit width
> needed for the requesting device here? If not, this would then permit using
> the
> original implementation of swiotlb_dma_supported() (as dma_alloc_coherent()
> then
> no longer depends on dma_bits), and perhaps even auto-setting dma_bits
based
> on what memory we can get out of Xen in swiotlb_init(), making the
mismatching
> of
> command line options (between Xen and kernel) impossible (the kernel simply
> wouldn''t have one anymore).
> 
> As a nice side effect, using the original implementation of
> swiotlb_dma_supported()
> would require slightly less tweaking of lib/swiotlb.c, hence slightly
raising
> the
> chances of the changes getting accepted into mainline. And clearly, if the
> kernel
> manages to allocate the swiotlb at an address with less than dma_bits bits,
> there
> seems to be no reason to refuse use of I/O devices that the actual buffer
> fits, but
> dma_bits doesn''t.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Muli Ben-Yehuda

2006-Dec-25 04:50 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On Fri, Dec 22, 2006 at 02:49:53PM +0000, Jan Beulich wrote:
> Patch update, fixing a bug on x86/PAE, and making
> include/xen/swiotlb.h look a lot nicer (but still not really
> nice). My plan is to submit the non-Xen ones to lkml right after New
> Year, unless I hear negative feedback.
> >Patch order is
> >swiotlb-bugs.patch
> >swiotlb-bus.patch
> >swiotlb-cleanup.patch
> >swiotlb-split.patch
> >xen-swiotlb.patch
Comments inline.

swiotlb-bugs.patch:

[snip]
>  /*
> @@ -758,8 +739,10 @@ swiotlb_sync_sg(struct device *hwdev, st
>  
>  	for (i = 0; i < nelems; i++, sg++)
>  		if (sg->dma_address != SG_ENT_PHYS_ADDRESS(sg))
> -			sync_single(hwdev, (void *) sg->dma_address,
> +			sync_single(hwdev, phys_to_virt(sg->dma_address),
>  				    sg->dma_length, dir, target);
Fix looks correct and bug looks painful. I think you should send this
one to mainline immediately.

swiotlb-bus.patch:
> Convert all phys_to_virt/virt_to_phys uses to
> bus_to_virt/virt_to_bus.
"... because Xen needs it", otherwise someone is bound to ask why, as
all other archs define _bus as _phys.
> Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>

swiotlb-cleanup:
> This patch
> - adds proper __init decoration to swiotlb''s init code (and the
code calling
>   it, where not already the case)
> - replaces uses of ''unsigned long'' with dma_addr_t where
appropriate
> - does miscellaneous simplicfication and cleanup
> 
> Signed-off-by: Jan Beulich <jbeulich@novell.com>
Looks good in general, not acking yet because I want to give it a spin
first.

swiotlb-split.patch:
> This patch adds abstraction so that the file can be used by environments
other
> than IA64 and EM64T, namely for Xen.
I''m sorry, but this patch is horrible. swiotlb.c is now pretty much
unreadable. I''d be surprised if mainline accepted it - I would
certainly NAK it with my mainline hat on, especially for an unmerged
architecture.

If Xen needs so many "abstractions", I have to ask whether it
isn''t
better off just using its own swiotlb.c as we are now.

I''ll take another look at this later and try to come up with a
different way of merging them that isn''t quite this horrible. Maybe
using function pointers for the "low level" operations?

Cheers,
Muli

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-25 10:20 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 25/12/06 4:50 am, "Muli Ben-Yehuda" <muli@il.ibm.com> wrote:
> I''m sorry, but this patch is horrible. swiotlb.c is now pretty
much
> unreadable. I''d be surprised if mainline accepted it - I would
> certainly NAK it with my mainline hat on, especially for an unmerged
> architecture.
> 
> If Xen needs so many "abstractions", I have to ask whether it
isn''t
> better off just using its own swiotlb.c as we are now.
> 
> I''ll take another look at this later and try to come up with a
> different way of merging them that isn''t quite this horrible.
Maybe
> using function pointers for the "low level" operations?
A lot of this ugliness comes from swiotlb_[un]map_page, the introduction of
a structural ''io_tlb_addr_t'', and a more complicated
synchronisation
function than memcpy (which uses copy_to_user and kmap).

However, I don''t remember *why* we need these Xen-specific
customisations
(even though I was the one who originally made them, way back). That given,
it would probably be sensible to make a more brutal merge that discards Xen
differences which we cannot explain. For example:

 * Why can''t we turn dma_[un]map_page into dma_[un]map_single, as
x86_64
does? This would avoid needing to expand the swiotlb api.

 * Why can''t we store virtual addresses in the io_tlb_orig_addr[] array
just
like lib/swiotlb.c, given that the native swiotlb is happy to use
page_address() on any ''struct page'' that is passed to it? I
can''t see why we
actually need KM_SWIOTLB and to use kmap_atomic.

 * If we make the above changes, do we need special sync_single() any more?
We''ll no longer need kmap_atomic, but we''ll still have
copy_to_user_inatomic, due to abuses of the DMA API by the block-device
subsystem. Maybe that has been fixed already? Or maybe, in merging this
upstream, we should aim to fix the block-device drivers rather than work
around?

Even if we wait for some of the patches to be merged upstream before
applying the swiotlb changes to our own Linux tree, I''d consider
patches
just to remove the above differences relative to lib/swiotlb.c. This might
reduce the diff but would also let us get some testing of these swiotlb
simplifications.

The differences I was expecting to need to make were just the following:
 * Initialisation time -- we want to be able to automatically conditionally
initialise the swiotlb, and allocate the aperture contents in a special way.
 * Usage -- ''force'' has difference semantics for Xen (means
forcibly
allocate a swiotlb, but not use it on every device access whether it is
needed or not). We want to be able to use the swiotlb only when it is needed
for a particular device access; this may in particular require special
treatment of the unmap api calls to decide whether or not the given
dma_address resides in the swiotlb aperture.

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2006-Dec-30 17:32 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 12/25/06 11:20 AM
>>>
>On 25/12/06 4:50 am, "Muli Ben-Yehuda" <muli@il.ibm.com>
wrote:
>
>> I''m sorry, but this patch is horrible. swiotlb.c is now pretty
much
>> unreadable. I''d be surprised if mainline accepted it - I would
>> certainly NAK it with my mainline hat on, especially for an unmerged
>> architecture.
>> 
>> If Xen needs so many "abstractions", I have to ask whether it
isn''t
>> better off just using its own swiotlb.c as we are now.
>> 
>> I''ll take another look at this later and try to come up with a
>> different way of merging them that isn''t quite this horrible.
Maybe
>> using function pointers for the "low level" operations?
>
>A lot of this ugliness comes from swiotlb_[un]map_page, the introduction of<a structural ''io_tlb_addr_t'', and a more complicated
synchronisation>function than memcpy (which uses copy_to_user and kmap).
>
>However, I don''t remember *why* we need these Xen-specific
customisations
>(even though I was the one who originally made them, way back). That given,
>it would probably be sensible to make a more brutal merge that discards Xen
>differences which we cannot explain. For example:
>
> * Why can''t we turn dma_[un]map_page into dma_[un]map_single, as
x86_64
>does? This would avoid needing to expand the swiotlb api.
Because we allow highmem pages in the I/O path, hence page_address() cannot
be used. As you may have concluded from my sending of a second rev of the
patches, I had a bug in exactly that path, so I know it is being exercised.
Of course, all this exists for x86-32/PAE *only*, so it may be valid to
raise the question if it''s worth it. But otoh with supporting (only)
32-bit
PAE PV guests on x86-64 we are in the process of widening the use case here.
> * Why can''t we store virtual addresses in the io_tlb_orig_addr[]
array just
>like lib/swiotlb.c, given that the native swiotlb is happy to use
>page_address() on any ''struct page'' that is passed to it?
I can''t see why we
>actually need KM_SWIOTLB and to use kmap_atomic.
Likewise.
> * If we make the above changes, do we need special sync_single() any more?
>We''ll no longer need kmap_atomic, but we''ll still have
>copy_to_user_inatomic, due to abuses of the DMA API by the block-device
>subsystem. Maybe that has been fixed already? Or maybe, in merging this
>upstream, we should aim to fix the block-device drivers rather than work
>around?
Fixing the block drivers would certainly be nice (I''m not exactly clear
where
this dependency lives on their side), but due to the above we''ll be
unable to
completely switch over to memcpy() only.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2006-Dec-30 17:47 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 30/12/06 5:32 pm, "Jan Beulich" <jbeulich@novell.com> wrote:
>> * Why can''t we turn dma_[un]map_page into dma_[un]map_single,
as x86_64
>> does? This would avoid needing to expand the swiotlb api.
> 
> Because we allow highmem pages in the I/O path, hence page_address() cannot
> be used. As you may have concluded from my sending of a second rev of the
> patches, I had a bug in exactly that path, so I know it is being exercised.
> Of course, all this exists for x86-32/PAE *only*, so it may be valid to
> raise the question if it''s worth it. But otoh with supporting
(only) 32-bit
> PAE PV guests on x86-64 we are in the process of widening the use case
here.
Ah of course, the generic swiotlb has not (yet) been used by an architecture
with highmem requiring use of kmap(). I forgot about that.

Unfortunately highmem does rather complicate things -- I guess it''s up
to
the lib/swiotlb maintainers whether they want to keep that complexity in an
xen-i386-specific swiotlb.c or attempt a merge.

Here''s a thought: if the highmem DMA requests come from *only* the
blkdev
subsystem, then perhaps we could use its highmem bounce buffer (I think that
still exists?). We turn that off on Xen right now, but we could re-enable
it, leading to a slightly odd ''double bounce buffer'': the
first taking us
from high pseudophysical memory to low pseudophysical memory, and the second
taking us from high machine memory to low machine memory. If we can ensure
only lowmem requests get to the swiotlb then a lot of the Xen diffs go away.
I''m not sure whether we might get DMA requests from high memory from
things
other than block devices though, nor whether all block devices would
actually pass through the blkdev bounce buffer code.

 Thanks,
 Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2007-Jan-02 08:39 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>Here''s a thought: if the highmem DMA requests come from *only* the
blkdev
>subsystem, then perhaps we could use its highmem bounce buffer (I think that
>still exists?). We turn that off on Xen right now, but we could re-enable
>it, leading to a slightly odd ''double bounce buffer'': the
first taking us
>from high pseudophysical memory to low pseudophysical memory, and the second
>taking us from high machine memory to low machine memory. If we can ensure
>only lowmem requests get to the swiotlb then a lot of the Xen diffs go away.
>I''m not sure whether we might get DMA requests from high memory
from things
>other than block devices though, nor whether all block devices would
>actually pass through the blkdev bounce buffer code.
Proving that no driver ever passes highmem pages into any of the DMA ops should
be at least difficult, but would be needed since native i386 has no requirement
that
dma_map_sg() or dma_map_page() be called only with non-highmem pages. I am
therefore not thinking this is a realistic option.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2007-Jan-03 07:10 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>Here''s a thought: if the highmem DMA requests come from *only* the
blkdev
>subsystem, then perhaps we could use its highmem bounce buffer (I think that
>still exists?).
Another, only partially related thought: The addition of KM_SWIOTLB is the only
difference to native kmap_types.h. It would seem to me that, at the price of
disabling interrupts around the use, it should be possible to replace that with
KM_BOUNCE_READ and let go of the Xen specific header...

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2007-Jan-03 09:32 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

On 3/1/07 07:10, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Here''s a thought: if the highmem DMA requests come from *only*
the blkdev
>> subsystem, then perhaps we could use its highmem bounce buffer (I think
that
>> still exists?).
> 
> Another, only partially related thought: The addition of KM_SWIOTLB is the
> only
> difference to native kmap_types.h. It would seem to me that, at the price
of
> disabling interrupts around the use, it should be possible to replace that
> with
> KM_BOUNCE_READ and let go of the Xen specific header...
Sounds reasonable, although KM_BOUNCE_READ is a misnomer (since we write to
the kmap as well as read from it).

 -- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich

2007-Jan-03 11:01 UTC

head link

Re: [Xen-devel] x86 swiotlb questions

>>> Keir Fraser <keir@xensource.com> 03.01.07 10:32 >>>
>On 3/1/07 07:10, "Jan Beulich" <jbeulich@novell.com> wrote:
>> Another, only partially related thought: The addition of KM_SWIOTLB is
the
>> only
>> difference to native kmap_types.h. It would seem to me that, at the
price of
>> disabling interrupts around the use, it should be possible to replace
that
>> with
>> KM_BOUNCE_READ and let go of the Xen specific header...
>
>Sounds reasonable, although KM_BOUNCE_READ is a misnomer (since we write to
>the kmap as well as read from it).
... similar to the (mis-)use of it in the EDAC driver.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Xen devel - Dec 2006 - x86 swiotlb questions

[Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions

Re: [Xen-devel] x86 swiotlb questions