Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 00/10] s390: virtio: support protected virtualization
Enhanced virtualization protection technology may require the use of
bounce buffers for I/O. While support for this was built into the virtio
core, virtio-ccw wasn't changed accordingly.
Some background on technology (not part of this series) and the
terminology used.
* Protected Virtualization (PV):
Protected Virtualization guarantees, that non-shared memory of a  guest
that operates in PV mode private to that guest. I.e. any attempts by the
hypervisor or other guests to access it will result in an exception. If
supported by the environment (machine, KVM, guest VM) a guest can decide
to change into PV mode by doing the appropriate ultravisor calls. Unlike
some other enhanced virtualization protection technology, 
* Ultravisor:
A hardware/firmware entity that manages PV guests, and polices access to
their memory. A PV guest prospect needs to interact with the ultravisor,
to enter PV mode, and potentially to share pages (for I/O which should
be encrypted by the guest). A guest interacts with the ultravisor via so
called ultravisor calls. A hypervisor needs to interact with the
ultravisor to facilitate interpretation, emulation and swapping. A
hypervisor  interacts with the ultravisor via ultravisor calls and via
the SIE state description. Generally the ultravisor sanitizes hypervisor
inputs so that the guest can not be corrupted (except for denial of
service.
What needs to be done
====================
Thus what needs to be done to bring virtio-ccw up to speed with respect
to protected virtualization is:
* use some 'new' common virtio stuff
* make sure that virtio-ccw specific stuff uses shared memory when
  talking to the hypervisor (except control/communication blocks like ORB,
  these are handled by the ultravisor)
* make sure the DMA API does what is necessary to talk through shared
  memory if we are a protected virtualization guest.
* make sure the common IO layer plays along as well (airqs, sense).
Important notes
===============
* This patch set is based on Martins features branch
 (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch
 'features').
* Documentation is still very sketchy. I'm committed to improving this,
  but I'm currently hampered by some dependencies currently.  
* The existing naming in the common infrastructure (kernel internal
  interfaces) is pretty much based on the AMD SEV terminology. Thus the
  names aren't always perfect. There might be merit to changing these
  names to more abstract ones. I did not put much thought into that at
  the current stage.
* Testing: Please use iommu_platform=on for any virtio devices you are
  going to test this code with (so virtio actually uses the DMA API).
Change log
=========
RFC --> v1:
* Fixed bugs found by Connie (may_reduce and handling reduced,  warning,
  split move -- thanks Connie!).
* Fixed console bug found by Sebastian (thanks Sebastian!).
* Removed the completely useless duplicate of dma-mapping.h spotted by
  Christoph (thanks Christoph!).
* Don't use the global DMA pool for subchannel and ccw device
  owned memory as requested by Sebastian. Consequences:
	* Both subchannel and ccw devices have their dma masks
	now (both specifying 31 bit addressable)
	* We require at least 2 DMA pages per ccw device now, most of
	this memory is wasted though.
	* DMA memory allocated by virtio is also 31 bit addressable now
        as virtio uses the parent (which is the ccw device).
* Enabled packed ring.
* Rebased onto Martins feature branch; using the actual uv (ultravisor)
  interface instead of TODO comments.
* Added some explanations to the cover letter (Connie, David).
* Squashed a couple of patches together and fixed some text stuff. 
Looking forward to your review, or any other type of input.
Halil Pasic (10):
  virtio/s390: use vring_create_virtqueue
  virtio/s390: DMA support for virtio-ccw
  virtio/s390: enable packed ring
  s390/mm: force swiotlb for protected virtualization
  s390/cio: introduce DMA pools to cio
  s390/cio: add basic protected virtualization support
  s390/airq: use DMA memory for adapter interrupts
  virtio/s390: add indirection to indicators access
  virtio/s390: use DMA memory for ccw I/O and classic notifiers
  virtio/s390: make airq summary indicators DMA
 arch/s390/Kconfig                   |   5 +
 arch/s390/include/asm/airq.h        |   2 +
 arch/s390/include/asm/ccwdev.h      |   4 +
 arch/s390/include/asm/cio.h         |  11 ++
 arch/s390/include/asm/mem_encrypt.h |  18 +++
 arch/s390/mm/init.c                 |  50 +++++++
 drivers/s390/cio/airq.c             |  18 ++-
 drivers/s390/cio/ccwreq.c           |   8 +-
 drivers/s390/cio/cio.h              |   1 +
 drivers/s390/cio/css.c              | 101 +++++++++++++
 drivers/s390/cio/device.c           |  65 +++++++--
 drivers/s390/cio/device_fsm.c       |  40 +++---
 drivers/s390/cio/device_id.c        |  18 +--
 drivers/s390/cio/device_ops.c       |  21 ++-
 drivers/s390/cio/device_pgid.c      |  20 +--
 drivers/s390/cio/device_status.c    |  24 ++--
 drivers/s390/cio/io_sch.h           |  21 ++-
 drivers/s390/virtio/virtio_ccw.c    | 275 +++++++++++++++++++-----------------
 include/linux/virtio.h              |  17 ---
 19 files changed, 499 insertions(+), 220 deletions(-)
 create mode 100644 arch/s390/include/asm/mem_encrypt.h
-- 
2.16.4
The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API")
establishes a new way of allocating virtqueues (as a part of the effort
that taught DMA to virtio rings).
In the future we will want virtio-ccw to use the DMA API as well.
Let us switch from the legacy method of allocating virtqueues to
vring_create_virtqueue() as the first step into that direction.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 74c328321889..2c66941ef3d0 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -108,7 +108,6 @@ struct virtio_rev_info {
 struct virtio_ccw_vq_info {
 	struct virtqueue *vq;
 	int num;
-	void *queue;
 	union {
 		struct vq_info_block s;
 		struct vq_info_block_legacy l;
@@ -423,7 +422,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct
ccw1 *ccw)
 	struct virtio_ccw_device *vcdev = to_vc_device(vq->vdev);
 	struct virtio_ccw_vq_info *info = vq->priv;
 	unsigned long flags;
-	unsigned long size;
 	int ret;
 	unsigned int index = vq->index;
 
@@ -461,8 +459,6 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct
ccw1 *ccw)
 			 ret, index);
 
 	vring_del_virtqueue(vq);
-	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-	free_pages_exact(info->queue, size);
 	kfree(info->info_block);
 	kfree(info);
 }
@@ -494,8 +490,9 @@ static struct virtqueue *virtio_ccw_setup_vq(struct
virtio_device *vdev,
 	int err;
 	struct virtqueue *vq = NULL;
 	struct virtio_ccw_vq_info *info;
-	unsigned long size = 0; /* silence the compiler */
+	u64 queue;
 	unsigned long flags;
+	bool may_reduce;
 
 	/* Allocate queue. */
 	info = kzalloc(sizeof(struct virtio_ccw_vq_info), GFP_KERNEL);
@@ -516,33 +513,30 @@ static struct virtqueue *virtio_ccw_setup_vq(struct
virtio_device *vdev,
 		err = info->num;
 		goto out_err;
 	}
-	size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN));
-	info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO);
-	if (info->queue == NULL) {
-		dev_warn(&vcdev->cdev->dev, "no queue\n");
-		err = -ENOMEM;
-		goto out_err;
-	}
+	may_reduce = vcdev->revision > 0;
+	vq = vring_create_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN,
+				    vdev, true, may_reduce, ctx,
+				    virtio_ccw_kvm_notify, callback, name);
 
-	vq = vring_new_virtqueue(i, info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev,
-				 true, ctx, info->queue, virtio_ccw_kvm_notify,
-				 callback, name);
 	if (!vq) {
 		/* For now, we fail if we can't get the requested size. */
 		dev_warn(&vcdev->cdev->dev, "no vq\n");
 		err = -ENOMEM;
 		goto out_err;
 	}
+	/* it may have been reduced */
+	info->num = virtqueue_get_vring_size(vq);
 
 	/* Register it with the host. */
+	queue = virtqueue_get_desc_addr(vq);
 	if (vcdev->revision == 0) {
-		info->info_block->l.queue = (__u64)info->queue;
+		info->info_block->l.queue = queue;
 		info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
 		info->info_block->l.index = i;
 		info->info_block->l.num = info->num;
 		ccw->count = sizeof(info->info_block->l);
 	} else {
-		info->info_block->s.desc = (__u64)info->queue;
+		info->info_block->s.desc = queue;
 		info->info_block->s.index = i;
 		info->info_block->s.num = info->num;
 		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
@@ -572,8 +566,6 @@ static struct virtqueue *virtio_ccw_setup_vq(struct
virtio_device *vdev,
 	if (vq)
 		vring_del_virtqueue(vq);
 	if (info) {
-		if (info->queue)
-			free_pages_exact(info->queue, size);
 		kfree(info->info_block);
 	}
 	kfree(info);
-- 
2.16.4
Currently virtio-ccw devices do not work if the device has
VIRTIO_F_IOMMU_PLATFORM. In future we do want to support DMA API with
virtio-ccw.
Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works
with virtio-ccw.
Let us also switch from legacy avail/used accessors to the DMA aware
ones (even if it isn't strictly necessary), and remove the legacy
accessors (we were the last users).
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 22 +++++++++++++++-------
 include/linux/virtio.h           | 17 -----------------
 2 files changed, 15 insertions(+), 24 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 2c66941ef3d0..42832a164546 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,6 +66,7 @@ struct virtio_ccw_device {
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
+	u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -539,8 +540,8 @@ static struct virtqueue *virtio_ccw_setup_vq(struct
virtio_device *vdev,
 		info->info_block->s.desc = queue;
 		info->info_block->s.index = i;
 		info->info_block->s.num = info->num;
-		info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
-		info->info_block->s.used = (__u64)virtqueue_get_used(vq);
+		info->info_block->s.avail = (__u64)virtqueue_get_avail_addr(vq);
+		info->info_block->s.used = (__u64)virtqueue_get_used_addr(vq);
 		ccw->count = sizeof(info->info_block->s);
 	}
 	ccw->cmd_code = CCW_CMD_SET_VQ;
@@ -772,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device
*vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
 	/*
-	 * Packed ring isn't enabled on virtio_ccw for now,
-	 * because virtio_ccw uses some legacy accessors,
-	 * e.g. virtqueue_get_avail() and virtqueue_get_used()
-	 * which aren't available in packed ring currently.
+	 * There shouldn't be anything that precludes supporting packed.
+	 * TODO: Remove the limitation after having another look into this.
 	 */
 	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
@@ -1258,6 +1257,16 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
+
+	vcdev->vdev.dev.parent = &cdev->dev;
+	cdev->dev.dma_mask = &vcdev->dma_mask;
+	/* we are fine with common virtio infrastructure using 64 bit DMA */
+	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
+	if (ret) {
+		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
+		goto out_free;
+	}
+
 	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
 				   GFP_DMA | GFP_KERNEL);
 	if (!vcdev->config_block) {
@@ -1272,7 +1281,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 
 	vcdev->is_thinint = virtio_ccw_use_airq; /* at least try */
 
-	vcdev->vdev.dev.parent = &cdev->dev;
 	vcdev->vdev.dev.release = virtio_ccw_release_dev;
 	vcdev->vdev.config = &virtio_ccw_config_ops;
 	vcdev->cdev = cdev;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 673fe3ef3607..15f906e4a748 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -90,23 +90,6 @@ dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
 
-/*
- * Legacy accessors -- in almost all cases, these are the wrong functions
- * to use.
- */
-static inline void *virtqueue_get_desc(struct virtqueue *vq)
-{
-	return virtqueue_get_vring(vq)->desc;
-}
-static inline void *virtqueue_get_avail(struct virtqueue *vq)
-{
-	return virtqueue_get_vring(vq)->avail;
-}
-static inline void *virtqueue_get_used(struct virtqueue *vq)
-{
-	return virtqueue_get_vring(vq)->used;
-}
-
 /**
  * virtio_device - representation of a device using virtio
  * @index: unique position on the virtio bus
-- 
2.16.4
Nothing precludes to accepting  VIRTIO_F_RING_PACKED any more.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 42832a164546..6d989c360f38 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -773,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device
*vdev)
 static void ccw_transport_features(struct virtio_device *vdev)
 {
 	/*
-	 * There shouldn't be anything that precludes supporting packed.
-	 * TODO: Remove the limitation after having another look into this.
+	 * Currently nothing to do here.
 	 */
-	__virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED);
 }
 
 static int virtio_ccw_finalize_features(struct virtio_device *vdev)
-- 
2.16.4
Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 04/10] s390/mm: force swiotlb for protected virtualization
On s390, protected virtualization guests have to use bounced I/O
buffers.  That requires some plumbing.
Let us make sure, any device that uses DMA API with direct ops correctly
is spared from the problems, that a hypervisor attempting I/O to a
non-shared page would bring.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 arch/s390/Kconfig                   |  4 +++
 arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++
 arch/s390/mm/init.c                 | 50 +++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+)
 create mode 100644 arch/s390/include/asm/mem_encrypt.h
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 1c3fcf19c3af..5500d05d4d53 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+config ARCH_HAS_MEM_ENCRYPT
+        def_bool y
+
 config MMU
 	def_bool y
 
@@ -191,6 +194,7 @@ config S390
 	select ARCH_HAS_SCALED_CPUTIME
 	select VIRT_TO_BUS
 	select HAVE_NMI
+	select SWIOTLB
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/mem_encrypt.h
b/arch/s390/include/asm/mem_encrypt.h
new file mode 100644
index 000000000000..0898c09a888c
--- /dev/null
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef S390_MEM_ENCRYPT_H__
+#define S390_MEM_ENCRYPT_H__
+
+#ifndef __ASSEMBLY__
+
+#define sme_me_mask	0ULL
+
+static inline bool sme_active(void) { return false; }
+extern bool sev_active(void);
+
+int set_memory_encrypted(unsigned long addr, int numpages);
+int set_memory_decrypted(unsigned long addr, int numpages);
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* S390_MEM_ENCRYPT_H__ */
+
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 3e82f66d5c61..7e3cbd15dcfa 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -18,6 +18,7 @@
 #include <linux/mman.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
+#include <linux/swiotlb.h>
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/pagemap.h>
@@ -29,6 +30,7 @@
 #include <linux/export.h>
 #include <linux/cma.h>
 #include <linux/gfp.h>
+#include <linux/dma-mapping.h>
 #include <asm/processor.h>
 #include <linux/uaccess.h>
 #include <asm/pgtable.h>
@@ -42,6 +44,8 @@
 #include <asm/sclp.h>
 #include <asm/set_memory.h>
 #include <asm/kasan.h>
+#include <asm/dma-mapping.h>
+#include <asm/uv.h>
 
 pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir);
 
@@ -126,6 +130,50 @@ void mark_rodata_ro(void)
 	pr_info("Write protected read-only-after-init data: %luk\n", size
>> 10);
 }
 
+int set_memory_encrypted(unsigned long addr, int numpages)
+{
+	int i;
+
+	/* make all pages shared, (swiotlb, dma_free) */
+	for (i = 0; i < numpages; ++i) {
+		uv_remove_shared(addr);
+		addr += PAGE_SIZE;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_encrypted);
+
+int set_memory_decrypted(unsigned long addr, int numpages)
+{
+	int i;
+	/* make all pages shared (swiotlb, dma_alloca) */
+	for (i = 0; i < numpages; ++i) {
+		uv_set_shared(addr);
+		addr += PAGE_SIZE;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(set_memory_decrypted);
+
+/* are we a protected virtualization guest? */
+bool sev_active(void)
+{
+	return is_prot_virt_guest();
+}
+EXPORT_SYMBOL_GPL(sev_active);
+
+/* protected virtualization */
+static void pv_init(void)
+{
+	if (!sev_active())
+		return;
+
+	/* make sure bounce buffers are shared */
+	swiotlb_init(1);
+	swiotlb_update_mem_attributes();
+	swiotlb_force = SWIOTLB_FORCE;
+}
+
 void __init mem_init(void)
 {
 	cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask);
@@ -134,6 +182,8 @@ void __init mem_init(void)
 	set_max_mapnr(max_low_pfn);
         high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
 
+	pv_init();
+
 	/* Setup guest page hinting */
 	cmma_init();
 
-- 
2.16.4
To support protected virtualization cio will need to make sure the
memory used for communication with the hypervisor is DMA memory.
Let us introduce one global cio, and some tools for pools seated
at individual devices.
Our DMA pools are implemented as a gen_pool backed with DMA pages. The
idea is to avoid each allocation effectively wasting a page, as we
typically allocate much less than PAGE_SIZE.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 arch/s390/Kconfig           |   1 +
 arch/s390/include/asm/cio.h |  11 +++++
 drivers/s390/cio/cio.h      |   1 +
 drivers/s390/cio/css.c      | 101 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 114 insertions(+)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 5500d05d4d53..5861311d95d9 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -195,6 +195,7 @@ config S390
 	select VIRT_TO_BUS
 	select HAVE_NMI
 	select SWIOTLB
+	select GENERIC_ALLOCATOR
 
 
 config SCHED_OMIT_FRAME_POINTER
diff --git a/arch/s390/include/asm/cio.h b/arch/s390/include/asm/cio.h
index 1727180e8ca1..43c007d2775a 100644
--- a/arch/s390/include/asm/cio.h
+++ b/arch/s390/include/asm/cio.h
@@ -328,6 +328,17 @@ static inline u8 pathmask_to_pos(u8 mask)
 void channel_subsystem_reinit(void);
 extern void css_schedule_reprobe(void);
 
+extern void *cio_dma_zalloc(size_t size);
+extern void cio_dma_free(void *cpu_addr, size_t size);
+extern struct device *cio_get_dma_css_dev(void);
+
+struct gen_pool;
+void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
+			size_t size);
+void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size);
+void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev);
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages);
+
 /* Function from drivers/s390/cio/chsc.c */
 int chsc_sstpc(void *page, unsigned int op, u16 ctrl, u64 *clock_delta);
 int chsc_sstpi(void *page, void *result, size_t size);
diff --git a/drivers/s390/cio/cio.h b/drivers/s390/cio/cio.h
index 92eabbb5f18d..f23f7e2c33f7 100644
--- a/drivers/s390/cio/cio.h
+++ b/drivers/s390/cio/cio.h
@@ -113,6 +113,7 @@ struct subchannel {
 	enum sch_todo todo;
 	struct work_struct todo_work;
 	struct schib_config config;
+	u64 dma_mask;
 } __attribute__ ((aligned(8)));
 
 DECLARE_PER_CPU_ALIGNED(struct irb, cio_irb);
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c
index aea502922646..7087cc314fe9 100644
--- a/drivers/s390/cio/css.c
+++ b/drivers/s390/cio/css.c
@@ -20,6 +20,8 @@
 #include <linux/reboot.h>
 #include <linux/suspend.h>
 #include <linux/proc_fs.h>
+#include <linux/genalloc.h>
+#include <linux/dma-mapping.h>
 #include <asm/isc.h>
 #include <asm/crw.h>
 
@@ -199,6 +201,8 @@ static int css_validate_subchannel(struct subchannel_id
schid,
 	return err;
 }
 
+static u64 css_dev_dma_mask = DMA_BIT_MASK(31);
+
 struct subchannel *css_alloc_subchannel(struct subchannel_id schid,
 					struct schib *schib)
 {
@@ -224,6 +228,9 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id
schid,
 	INIT_WORK(&sch->todo_work, css_sch_todo);
 	sch->dev.release = &css_subchannel_release;
 	device_initialize(&sch->dev);
+	sch->dma_mask = css_dev_dma_mask;
+	sch->dev.dma_mask = &sch->dma_mask;
+	sch->dev.coherent_dma_mask = sch->dma_mask;
 	return sch;
 
 err:
@@ -899,6 +906,9 @@ static int __init setup_css(int nr)
 	dev_set_name(&css->device, "css%x", nr);
 	css->device.groups = cssdev_attr_groups;
 	css->device.release = channel_subsystem_release;
+	/* some cio DMA memory needs to be 31 bit addressable */
+	css->device.coherent_dma_mask = css_dev_dma_mask,
+	css->device.dma_mask = &css_dev_dma_mask;
 
 	mutex_init(&css->mutex);
 	css->cssid = chsc_get_cssid(nr);
@@ -1018,6 +1028,96 @@ static struct notifier_block css_power_notifier = {
 	.notifier_call = css_power_event,
 };
 
+#define POOL_INIT_PAGES 1
+static struct gen_pool *cio_dma_pool;
+/* Currently cio supports only a single css */
+#define  CIO_DMA_GFP (GFP_KERNEL | __GFP_ZERO)
+
+
+struct device *cio_get_dma_css_dev(void)
+{
+	return &channel_subsystems[0]->device;
+}
+
+struct gen_pool *cio_gp_dma_create(struct device *dma_dev, int nr_pages)
+{
+	struct gen_pool *gp_dma;
+	void *cpu_addr;
+	dma_addr_t dma_addr;
+	int i;
+
+	gp_dma = gen_pool_create(3, -1);
+	if (!gp_dma)
+		return NULL;
+	for (i = 0; i < nr_pages; ++i) {
+		cpu_addr = dma_alloc_coherent(dma_dev, PAGE_SIZE, &dma_addr,
+					      CIO_DMA_GFP);
+		if (!cpu_addr)
+			return gp_dma;
+		gen_pool_add_virt(gp_dma, (unsigned long) cpu_addr,
+				  dma_addr, PAGE_SIZE, -1);
+	}
+	return gp_dma;
+}
+
+static void __gp_dma_free_dma(struct gen_pool *pool,
+			      struct gen_pool_chunk *chunk, void *data)
+{
+	dma_free_coherent((struct device *) data, PAGE_SIZE,
+			 (void *) chunk->start_addr,
+			 (dma_addr_t) chunk->phys_addr);
+}
+
+void cio_gp_dma_destroy(struct gen_pool *gp_dma, struct device *dma_dev)
+{
+	if (!gp_dma)
+		return;
+	/* this is qite ugly but no better idea */
+	gen_pool_for_each_chunk(gp_dma, __gp_dma_free_dma, dma_dev);
+	gen_pool_destroy(gp_dma);
+}
+
+static void __init cio_dma_pool_init(void)
+{
+	/* No need to free up the resources: compiled in */
+	cio_dma_pool = cio_gp_dma_create(cio_get_dma_css_dev(), 1);
+}
+
+void *cio_gp_dma_zalloc(struct gen_pool *gp_dma, struct device *dma_dev,
+			size_t size)
+{
+	dma_addr_t dma_addr;
+	unsigned long addr = gen_pool_alloc(gp_dma, size);
+
+	if (!addr) {
+		addr = (unsigned long) dma_alloc_coherent(dma_dev,
+					PAGE_SIZE, &dma_addr, CIO_DMA_GFP);
+		if (!addr)
+			return NULL;
+		gen_pool_add_virt(gp_dma, addr, dma_addr, PAGE_SIZE, -1);
+		addr = gen_pool_alloc(gp_dma, size);
+	}
+	return (void *) addr;
+}
+
+void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size)
+{
+	if (!cpu_addr)
+		return;
+	memset(cpu_addr, 0, size);
+	gen_pool_free(gp_dma, (unsigned long) cpu_addr, size);
+}
+
+void *cio_dma_zalloc(size_t size)
+{
+	return cio_gp_dma_zalloc(cio_dma_pool, cio_get_dma_css_dev(), size);
+}
+
+void cio_dma_free(void *cpu_addr, size_t size)
+{
+	cio_gp_dma_free(cio_dma_pool, cpu_addr, size);
+}
+
 /*
  * Now that the driver core is running, we can setup our channel subsystem.
  * The struct subchannel's are created during probing.
@@ -1063,6 +1163,7 @@ static int __init css_bus_init(void)
 		unregister_reboot_notifier(&css_reboot_notifier);
 		goto out_unregister;
 	}
+	cio_dma_pool_init();
 	css_init_done = 1;
 
 	/* Enable default isc for I/O subchannels. */
-- 
2.16.4
Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 06/10] s390/cio: add basic protected virtualization support
As virtio-ccw devices are channel devices, we need to use the dma area
for any communication with the hypervisor.
This patch addresses the most basic stuff (mostly what is required for
virtio-ccw), and does take care of QDIO or any devices.
An interesting side effect is that virtio structures are now going to
get allocated in 31 bit addressable storage.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 arch/s390/include/asm/ccwdev.h   |  4 +++
 drivers/s390/cio/ccwreq.c        |  8 ++---
 drivers/s390/cio/device.c        | 65 +++++++++++++++++++++++++++++++++-------
 drivers/s390/cio/device_fsm.c    | 40 ++++++++++++-------------
 drivers/s390/cio/device_id.c     | 18 +++++------
 drivers/s390/cio/device_ops.c    | 21 +++++++++++--
 drivers/s390/cio/device_pgid.c   | 20 ++++++-------
 drivers/s390/cio/device_status.c | 24 +++++++--------
 drivers/s390/cio/io_sch.h        | 21 +++++++++----
 drivers/s390/virtio/virtio_ccw.c | 10 -------
 10 files changed, 148 insertions(+), 83 deletions(-)
diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
index a29dd430fb40..865ce1cb86d5 100644
--- a/arch/s390/include/asm/ccwdev.h
+++ b/arch/s390/include/asm/ccwdev.h
@@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *);
 extern void ccw_device_wait_idle(struct ccw_device *);
 extern int ccw_device_force_console(struct ccw_device *);
 
+extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size);
+extern void ccw_device_dma_free(struct ccw_device *cdev,
+				void *cpu_addr, size_t size);
+
 int ccw_device_siosl(struct ccw_device *);
 
 extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *);
diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c
index 603268a33ea1..dafbceb311b3 100644
--- a/drivers/s390/cio/ccwreq.c
+++ b/drivers/s390/cio/ccwreq.c
@@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc)
 		return;
 	req->done = 1;
 	ccw_device_set_timeout(cdev, 0);
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	if (rc && rc != -ENODEV && req->drc)
 		rc = req->drc;
 	req->callback(cdev, req->data, rc);
@@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev)
 			continue;
 		}
 		/* Perform start function. */
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		rc = cio_start(sch, cp, (u8) req->mask);
 		if (rc == 0) {
 			/* I/O started successfully. */
@@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev)
  */
 static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb)
 {
-	struct irb *irb = &cdev->private->irb;
+	struct irb *irb = &cdev->private->dma_area->irb;
 	struct cmd_scsw *scsw = &irb->scsw.cmd;
 	enum uc_todo todo;
 
@@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev,
struct irb *lcirb)
 		CIO_TRACE_EVENT(2, "sensedata");
 		CIO_HEX_EVENT(2, &cdev->private->dev_id,
 			      sizeof(struct ccw_dev_id));
-		CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT);
+		CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw,
SENSE_MAX_COUNT);
 		/* Check for command reject. */
 		if (irb->ecw[0] & SNS0_CMD_REJECT)
 			return IO_REJECTED;
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c
index 1540229a37bb..a3310ee14a4a 100644
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -24,6 +24,7 @@
 #include <linux/timer.h>
 #include <linux/kernel_stat.h>
 #include <linux/sched/signal.h>
+#include <linux/dma-mapping.h>
 
 #include <asm/ccwdev.h>
 #include <asm/cio.h>
@@ -687,6 +688,9 @@ ccw_device_release(struct device *dev)
 	struct ccw_device *cdev;
 
 	cdev = to_ccwdev(dev);
+	cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area,
+			sizeof(*cdev->private->dma_area));
+	cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev);
 	/* Release reference of parent subchannel. */
 	put_device(cdev->dev.parent);
 	kfree(cdev->private);
@@ -696,15 +700,31 @@ ccw_device_release(struct device *dev)
 static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch)
 {
 	struct ccw_device *cdev;
+	struct gen_pool *dma_pool;
 
 	cdev  = kzalloc(sizeof(*cdev), GFP_KERNEL);
-	if (cdev) {
-		cdev->private = kzalloc(sizeof(struct ccw_device_private),
-					GFP_KERNEL | GFP_DMA);
-		if (cdev->private)
-			return cdev;
-	}
+	if (!cdev)
+		goto err_cdev;
+	cdev->private = kzalloc(sizeof(struct ccw_device_private),
+				GFP_KERNEL | GFP_DMA);
+	if (!cdev->private)
+		goto err_priv;
+	cdev->dev.dma_mask = &cdev->private->dma_mask;
+	*cdev->dev.dma_mask = *sch->dev.dma_mask;
+	cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
+	dma_pool = cio_gp_dma_create(&cdev->dev, 1);
+	cdev->private->dma_pool = dma_pool;
+	cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev,
+					sizeof(*cdev->private->dma_area));
+	if (!cdev->private->dma_area)
+		goto err_dma_area;
+	return cdev;
+err_dma_area:
+	cio_gp_dma_destroy(dma_pool, &cdev->dev);
+	kfree(cdev->private);
+err_priv:
 	kfree(cdev);
+err_cdev:
 	return ERR_PTR(-ENOMEM);
 }
 
@@ -884,7 +904,7 @@ io_subchannel_recog_done(struct ccw_device *cdev)
 			wake_up(&ccw_device_init_wq);
 		break;
 	case DEV_STATE_OFFLINE:
-		/* 
+		/*
 		 * We can't register the device in interrupt context so
 		 * we schedule a work item.
 		 */
@@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch)
 	if (!io_priv)
 		goto out_schedule;
 
+	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
+				sizeof(*io_priv->dma_area),
+				&io_priv->dma_area_dma, GFP_KERNEL);
+	if (!io_priv->dma_area) {
+		kfree(io_priv);
+		goto out_schedule;
+	}
+
 	set_io_private(sch, io_priv);
 	css_schedule_eval(sch->schid);
 	return 0;
@@ -1088,6 +1116,8 @@ static int io_subchannel_remove(struct subchannel *sch)
 	set_io_private(sch, NULL);
 	spin_unlock_irq(sch->lock);
 out_free:
+	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+			  io_priv->dma_area, io_priv->dma_area_dma);
 	kfree(io_priv);
 	sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group);
 	return 0;
@@ -1593,20 +1623,31 @@ struct ccw_device * __init
ccw_device_create_console(struct ccw_driver *drv)
 		return ERR_CAST(sch);
 
 	io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA);
-	if (!io_priv) {
-		put_device(&sch->dev);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!io_priv)
+		goto err_priv;
+	io_priv->dma_area = dma_alloc_coherent(&sch->dev,
+				sizeof(*io_priv->dma_area),
+				&io_priv->dma_area_dma, GFP_KERNEL);
+	if (!io_priv->dma_area)
+		goto err_dma_area;
 	set_io_private(sch, io_priv);
 	cdev = io_subchannel_create_ccwdev(sch);
 	if (IS_ERR(cdev)) {
 		put_device(&sch->dev);
+		dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+				  io_priv->dma_area, io_priv->dma_area_dma);
 		kfree(io_priv);
 		return cdev;
 	}
 	cdev->drv = drv;
 	ccw_device_set_int_class(cdev);
 	return cdev;
+
+err_dma_area:
+		kfree(io_priv);
+err_priv:
+	put_device(&sch->dev);
+	return ERR_PTR(-ENOMEM);
 }
 
 void __init ccw_device_destroy_console(struct ccw_device *cdev)
@@ -1617,6 +1658,8 @@ void __init ccw_device_destroy_console(struct ccw_device
*cdev)
 	set_io_private(sch, NULL);
 	put_device(&sch->dev);
 	put_device(&cdev->dev);
+	dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area),
+			  io_priv->dma_area, io_priv->dma_area_dma);
 	kfree(io_priv);
 }
 
diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c
index 9169af7dbb43..fea23b44795b 100644
--- a/drivers/s390/cio/device_fsm.c
+++ b/drivers/s390/cio/device_fsm.c
@@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev)
 			       sizeof(struct tcw), 0);
 	} else {
 		printk(KERN_WARNING "cio: orb indicates command mode\n");
-		if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw ||
-		    (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws)
+		if ((void *)(addr_t)orb->cmd.cpa ==
&private->dma_area->sense_ccw ||
+		    (void *)(addr_t)orb->cmd.cpa ==
cdev->private->dma_area->iccws)
 			printk(KERN_WARNING "cio: last channel program "
 			       "(intern):\n");
 		else
@@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev)
 void ccw_device_update_sense_data(struct ccw_device *cdev)
 {
 	memset(&cdev->id, 0, sizeof(cdev->id));
-	cdev->id.cu_type   = cdev->private->senseid.cu_type;
-	cdev->id.cu_model  = cdev->private->senseid.cu_model;
-	cdev->id.dev_type  = cdev->private->senseid.dev_type;
-	cdev->id.dev_model = cdev->private->senseid.dev_model;
+	cdev->id.cu_type   = cdev->private->dma_area->senseid.cu_type;
+	cdev->id.cu_model  = cdev->private->dma_area->senseid.cu_model;
+	cdev->id.dev_type  = cdev->private->dma_area->senseid.dev_type;
+	cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model;
 }
 
 int ccw_device_test_sense_data(struct ccw_device *cdev)
 {
-	return cdev->id.cu_type == cdev->private->senseid.cu_type &&
-		cdev->id.cu_model == cdev->private->senseid.cu_model &&
-		cdev->id.dev_type == cdev->private->senseid.dev_type &&
-		cdev->id.dev_model == cdev->private->senseid.dev_model;
+	return cdev->id.cu_type ==
cdev->private->dma_area->senseid.cu_type &&
+		cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model
&&
+		cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type
&&
+		cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model;
 }
 
 /*
@@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state)
 		cio_disable_subchannel(sch);
 
 	/* Reset device status. */
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	cdev->private->state = state;
 
@@ -509,13 +509,13 @@ void ccw_device_verify_done(struct ccw_device *cdev, int
err)
 		ccw_device_done(cdev, DEV_STATE_ONLINE);
 		/* Deliver fake irb to device driver, if needed. */
 		if (cdev->private->flags.fake_irb) {
-			create_fake_irb(&cdev->private->irb,
+			create_fake_irb(&cdev->private->dma_area->irb,
 					cdev->private->flags.fake_irb);
 			cdev->private->flags.fake_irb = 0;
 			if (cdev->handler)
 				cdev->handler(cdev, cdev->private->intparm,
-					      &cdev->private->irb);
-			memset(&cdev->private->irb, 0, sizeof(struct irb));
+					      &cdev->private->dma_area->irb);
+			memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		}
 		ccw_device_report_path_events(cdev);
 		ccw_device_handle_broken_paths(cdev);
@@ -672,7 +672,7 @@ ccw_device_online_verify(struct ccw_device *cdev, enum
dev_event dev_event)
 
 	if (scsw_actl(&sch->schib.scsw) != 0 ||
 	    (scsw_stctl(&sch->schib.scsw) & SCSW_STCTL_STATUS_PEND) ||
-	    (scsw_stctl(&cdev->private->irb.scsw) &
SCSW_STCTL_STATUS_PEND)) {
+	    (scsw_stctl(&cdev->private->dma_area->irb.scsw) &
SCSW_STCTL_STATUS_PEND)) {
 		/*
 		 * No final status yet or final status not yet delivered
 		 * to the device driver. Can't do path verification now,
@@ -719,7 +719,7 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 	 *  - fast notification was requested (primary status)
 	 *  - unsolicited interrupts
 	 */
-	stctl = scsw_stctl(&cdev->private->irb.scsw);
+	stctl = scsw_stctl(&cdev->private->dma_area->irb.scsw);
 	ending_status = (stctl & SCSW_STCTL_SEC_STATUS) ||
 		(stctl == (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND)) ||
 		(stctl == SCSW_STCTL_STATUS_PEND);
@@ -735,9 +735,9 @@ static int ccw_device_call_handler(struct ccw_device *cdev)
 
 	if (cdev->handler)
 		cdev->handler(cdev, cdev->private->intparm,
-			      &cdev->private->irb);
+			      &cdev->private->dma_area->irb);
 
-	memset(&cdev->private->irb, 0, sizeof(struct irb));
+	memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 	return 1;
 }
 
@@ -759,7 +759,7 @@ ccw_device_irq(struct ccw_device *cdev, enum dev_event
dev_event)
 			/* Unit check but no sense data. Need basic sense. */
 			if (ccw_device_do_sense(cdev, irb) != 0)
 				goto call_handler_unsol;
-			memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+			memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 			cdev->private->state = DEV_STATE_W4SENSE;
 			cdev->private->intparm = 0;
 			return;
@@ -842,7 +842,7 @@ ccw_device_w4sense(struct ccw_device *cdev, enum dev_event
dev_event)
 	if (scsw_fctl(&irb->scsw) &
 	    (SCSW_FCTL_CLEAR_FUNC | SCSW_FCTL_HALT_FUNC)) {
 		cdev->private->flags.dosense = 0;
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 		ccw_device_accumulate_irb(cdev, irb);
 		goto call_handler;
 	}
diff --git a/drivers/s390/cio/device_id.c b/drivers/s390/cio/device_id.c
index f6df83a9dfbb..ea8a0fc6c0b6 100644
--- a/drivers/s390/cio/device_id.c
+++ b/drivers/s390/cio/device_id.c
@@ -99,7 +99,7 @@ static int diag210_to_senseid(struct senseid *senseid, struct
diag210 *diag)
 static int diag210_get_dev_info(struct ccw_device *cdev)
 {
 	struct ccw_dev_id *dev_id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	struct diag210 diag_data;
 	int rc;
 
@@ -134,8 +134,8 @@ static int diag210_get_dev_info(struct ccw_device *cdev)
 static void snsid_init(struct ccw_device *cdev)
 {
 	cdev->private->flags.esid = 0;
-	memset(&cdev->private->senseid, 0,
sizeof(cdev->private->senseid));
-	cdev->private->senseid.cu_type = 0xffff;
+	memset(&cdev->private->dma_area->senseid, 0,
sizeof(cdev->private->dma_area->senseid));
+	cdev->private->dma_area->senseid.cu_type = 0xffff;
 }
 
 /*
@@ -143,16 +143,16 @@ static void snsid_init(struct ccw_device *cdev)
  */
 static int snsid_check(struct ccw_device *cdev, void *data)
 {
-	struct cmd_scsw *scsw = &cdev->private->irb.scsw.cmd;
+	struct cmd_scsw *scsw = &cdev->private->dma_area->irb.scsw.cmd;
 	int len = sizeof(struct senseid) - scsw->count;
 
 	/* Check for incomplete SENSE ID data. */
 	if (len < SENSE_ID_MIN_LEN)
 		goto out_restart;
-	if (cdev->private->senseid.cu_type == 0xffff)
+	if (cdev->private->dma_area->senseid.cu_type == 0xffff)
 		goto out_restart;
 	/* Check for incompatible SENSE ID data. */
-	if (cdev->private->senseid.reserved != 0xff)
+	if (cdev->private->dma_area->senseid.reserved != 0xff)
 		return -EOPNOTSUPP;
 	/* Check for extended-identification information. */
 	if (len > SENSE_ID_BASIC_LEN)
@@ -170,7 +170,7 @@ static int snsid_check(struct ccw_device *cdev, void *data)
 static void snsid_callback(struct ccw_device *cdev, void *data, int rc)
 {
 	struct ccw_dev_id *id = &cdev->private->dev_id;
-	struct senseid *senseid = &cdev->private->senseid;
+	struct senseid *senseid = &cdev->private->dma_area->senseid;
 	int vm = 0;
 
 	if (rc && MACHINE_IS_VM) {
@@ -200,7 +200,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 {
 	struct subchannel *sch = to_subchannel(cdev->dev.parent);
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	CIO_TRACE_EVENT(4, "snsid");
 	CIO_HEX_EVENT(4, &cdev->private->dev_id,
sizeof(cdev->private->dev_id));
@@ -208,7 +208,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev)
 	snsid_init(cdev);
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_ID;
-	cp->cda		= (u32) (addr_t) &cdev->private->senseid;
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->senseid;
 	cp->count	= sizeof(struct senseid);
 	cp->flags	= CCW_FLAG_SLI;
 	/* Request setup. */
diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c
index 4435ae0b3027..be4acfa9265a 100644
--- a/drivers/s390/cio/device_ops.c
+++ b/drivers/s390/cio/device_ops.c
@@ -429,8 +429,8 @@ struct ciw *ccw_device_get_ciw(struct ccw_device *cdev,
__u32 ct)
 	if (cdev->private->flags.esid == 0)
 		return NULL;
 	for (ciw_cnt = 0; ciw_cnt < MAX_CIWS; ciw_cnt++)
-		if (cdev->private->senseid.ciw[ciw_cnt].ct == ct)
-			return cdev->private->senseid.ciw + ciw_cnt;
+		if (cdev->private->dma_area->senseid.ciw[ciw_cnt].ct == ct)
+			return cdev->private->dma_area->senseid.ciw + ciw_cnt;
 	return NULL;
 }
 
@@ -699,6 +699,23 @@ void ccw_device_get_schid(struct ccw_device *cdev, struct
subchannel_id *schid)
 }
 EXPORT_SYMBOL_GPL(ccw_device_get_schid);
 
+/**
+ * Allocate zeroed dma coherent 31 bit addressable memory using
+ * the subchannels dma pool. Maximal size of allocation supported
+ * is PAGE_SIZE.
+ */
+void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
+{
+	return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev,
size);
+}
+EXPORT_SYMBOL(ccw_device_dma_zalloc);
+
+void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr, size_t size)
+{
+	cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size);
+}
+EXPORT_SYMBOL(ccw_device_dma_free);
+
 EXPORT_SYMBOL(ccw_device_set_options_mask);
 EXPORT_SYMBOL(ccw_device_set_options);
 EXPORT_SYMBOL(ccw_device_clear_options);
diff --git a/drivers/s390/cio/device_pgid.c b/drivers/s390/cio/device_pgid.c
index d30a3babf176..e97baa89cbf8 100644
--- a/drivers/s390/cio/device_pgid.c
+++ b/drivers/s390/cio/device_pgid.c
@@ -57,7 +57,7 @@ static void verify_done(struct ccw_device *cdev, int rc)
 static void nop_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp->cmd_code	= CCW_CMD_NOOP;
 	cp->cda		= 0;
@@ -134,9 +134,9 @@ static void nop_callback(struct ccw_device *cdev, void
*data, int rc)
 static void spid_build_cp(struct ccw_device *cdev, u8 fn)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
-	struct pgid *pgid = &cdev->private->pgid[i];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[i];
 
 	pgid->inf.fc	= fn;
 	cp->cmd_code	= CCW_CMD_SET_PGID;
@@ -300,7 +300,7 @@ static int pgid_cmp(struct pgid *p1, struct pgid *p2)
 static void pgid_analyze(struct ccw_device *cdev, struct pgid **p,
 			 int *mismatch, u8 *reserved, u8 *reset)
 {
-	struct pgid *pgid = &cdev->private->pgid[0];
+	struct pgid *pgid = &cdev->private->dma_area->pgid[0];
 	struct pgid *first = NULL;
 	int lpm;
 	int i;
@@ -342,7 +342,7 @@ static u8 pgid_to_donepm(struct ccw_device *cdev)
 		lpm = 0x80 >> i;
 		if ((cdev->private->pgid_valid_mask & lpm) == 0)
 			continue;
-		pgid = &cdev->private->pgid[i];
+		pgid = &cdev->private->dma_area->pgid[i];
 		if (sch->opm & lpm) {
 			if (pgid->inf.ps.state1 != SNID_STATE1_GROUPED)
 				continue;
@@ -368,7 +368,7 @@ static void pgid_fill(struct ccw_device *cdev, struct pgid
*pgid)
 	int i;
 
 	for (i = 0; i < 8; i++)
-		memcpy(&cdev->private->pgid[i], pgid, sizeof(struct pgid));
+		memcpy(&cdev->private->dma_area->pgid[i], pgid, sizeof(struct
pgid));
 }
 
 /*
@@ -435,12 +435,12 @@ static void snid_done(struct ccw_device *cdev, int rc)
 static void snid_build_cp(struct ccw_device *cdev)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 	int i = pathmask_to_pos(req->lpm);
 
 	/* Channel program setup. */
 	cp->cmd_code	= CCW_CMD_SENSE_PGID;
-	cp->cda		= (u32) (addr_t) &cdev->private->pgid[i];
+	cp->cda		= (u32) (addr_t) &cdev->private->dma_area->pgid[i];
 	cp->count	= sizeof(struct pgid);
 	cp->flags	= CCW_FLAG_SLI;
 	req->cp		= cp;
@@ -516,7 +516,7 @@ static void verify_start(struct ccw_device *cdev)
 	sch->lpm = sch->schib.pmcw.pam;
 
 	/* Initialize PGID data. */
-	memset(cdev->private->pgid, 0, sizeof(cdev->private->pgid));
+	memset(cdev->private->dma_area->pgid, 0,
sizeof(cdev->private->dma_area->pgid));
 	cdev->private->pgid_valid_mask = 0;
 	cdev->private->pgid_todo_mask = sch->schib.pmcw.pam;
 	cdev->private->path_notoper_mask = 0;
@@ -626,7 +626,7 @@ struct stlck_data {
 static void stlck_build_cp(struct ccw_device *cdev, void *buf1, void *buf2)
 {
 	struct ccw_request *req = &cdev->private->req;
-	struct ccw1 *cp = cdev->private->iccws;
+	struct ccw1 *cp = cdev->private->dma_area->iccws;
 
 	cp[0].cmd_code = CCW_CMD_STLCK;
 	cp[0].cda = (u32) (addr_t) buf1;
diff --git a/drivers/s390/cio/device_status.c b/drivers/s390/cio/device_status.c
index 7d5c7892b2c4..a9aabde604f4 100644
--- a/drivers/s390/cio/device_status.c
+++ b/drivers/s390/cio/device_status.c
@@ -79,15 +79,15 @@ ccw_device_accumulate_ecw(struct ccw_device *cdev, struct
irb *irb)
 	 * are condition that have to be met for the extended control
 	 * bit to have meaning. Sick.
 	 */
-	cdev->private->irb.scsw.cmd.ectl = 0;
+	cdev->private->dma_area->irb.scsw.cmd.ectl = 0;
 	if ((irb->scsw.cmd.stctl & SCSW_STCTL_ALERT_STATUS) &&
 	    !(irb->scsw.cmd.stctl & SCSW_STCTL_INTER_STATUS))
-		cdev->private->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
+		cdev->private->dma_area->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl;
 	/* Check if extended control word is valid. */
-	if (!cdev->private->irb.scsw.cmd.ectl)
+	if (!cdev->private->dma_area->irb.scsw.cmd.ectl)
 		return;
 	/* Copy concurrent sense / model dependent information. */
-	memcpy (&cdev->private->irb.ecw, irb->ecw, sizeof (irb->ecw));
+	memcpy (&cdev->private->dma_area->irb.ecw, irb->ecw, sizeof
(irb->ecw));
 }
 
 /*
@@ -118,7 +118,7 @@ ccw_device_accumulate_esw(struct ccw_device *cdev, struct
irb *irb)
 	if (!ccw_device_accumulate_esw_valid(irb))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/* Copy last path used mask. */
 	cdev_irb->esw.esw1.lpum = irb->esw.esw1.lpum;
@@ -210,7 +210,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct
irb *irb)
 		ccw_device_path_notoper(cdev);
 	/* No irb accumulation for transport mode irbs. */
 	if (scsw_is_tm(&irb->scsw)) {
-		memcpy(&cdev->private->irb, irb, sizeof(struct irb));
+		memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb));
 		return;
 	}
 	/*
@@ -219,7 +219,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct
irb *irb)
 	if (!scsw_is_solicited(&irb->scsw))
 		return;
 
-	cdev_irb = &cdev->private->irb;
+	cdev_irb = &cdev->private->dma_area->irb;
 
 	/*
 	 * If the clear function had been performed, all formerly pending
@@ -227,7 +227,7 @@ ccw_device_accumulate_irb(struct ccw_device *cdev, struct
irb *irb)
 	 * intermediate accumulated status to the device driver.
 	 */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_CLEAR_FUNC)
-		memset(&cdev->private->irb, 0, sizeof(struct irb));
+		memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb));
 
 	/* Copy bits which are valid only for the start function. */
 	if (irb->scsw.cmd.fctl & SCSW_FCTL_START_FUNC) {
@@ -329,9 +329,9 @@ ccw_device_do_sense(struct ccw_device *cdev, struct irb
*irb)
 	/*
 	 * We have ending status but no sense information. Do a basic sense.
 	 */
-	sense_ccw = &to_io_private(sch)->sense_ccw;
+	sense_ccw = &to_io_private(sch)->dma_area->sense_ccw;
 	sense_ccw->cmd_code = CCW_CMD_BASIC_SENSE;
-	sense_ccw->cda = (__u32) __pa(cdev->private->irb.ecw);
+	sense_ccw->cda = (__u32) __pa(cdev->private->dma_area->irb.ecw);
 	sense_ccw->count = SENSE_MAX_COUNT;
 	sense_ccw->flags = CCW_FLAG_SLI;
 
@@ -364,7 +364,7 @@ ccw_device_accumulate_basic_sense(struct ccw_device *cdev,
struct irb *irb)
 
 	if (!(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK) &&
 	    (irb->scsw.cmd.dstat & DEV_STAT_CHN_END)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 	}
 	/* Check if path verification is required. */
@@ -386,7 +386,7 @@ ccw_device_accumulate_and_sense(struct ccw_device *cdev,
struct irb *irb)
 	/* Check for basic sense. */
 	if (cdev->private->flags.dosense &&
 	    !(irb->scsw.cmd.dstat & DEV_STAT_UNIT_CHECK)) {
-		cdev->private->irb.esw.esw0.erw.cons = 1;
+		cdev->private->dma_area->irb.esw.esw0.erw.cons = 1;
 		cdev->private->flags.dosense = 0;
 		return 0;
 	}
diff --git a/drivers/s390/cio/io_sch.h b/drivers/s390/cio/io_sch.h
index 90e4e3a7841b..4cf38c98cd0b 100644
--- a/drivers/s390/cio/io_sch.h
+++ b/drivers/s390/cio/io_sch.h
@@ -9,15 +9,20 @@
 #include "css.h"
 #include "orb.h"
 
+struct io_subchannel_dma_area {
+	struct ccw1 sense_ccw;	/* static ccw for sense command */
+};
+
 struct io_subchannel_private {
 	union orb orb;		/* operation request block */
-	struct ccw1 sense_ccw;	/* static ccw for sense command */
 	struct ccw_device *cdev;/* pointer to the child ccw device */
 	struct {
 		unsigned int suspend:1;	/* allow suspend */
 		unsigned int prefetch:1;/* deny prefetch */
 		unsigned int inter:1;	/* suppress intermediate interrupts */
 	} __packed options;
+	struct io_subchannel_dma_area *dma_area;
+	dma_addr_t dma_area_dma;
 } __aligned(8);
 
 #define to_io_private(n) ((struct io_subchannel_private *) \
@@ -115,6 +120,13 @@ enum cdev_todo {
 #define FAKE_CMD_IRB	1
 #define FAKE_TM_IRB	2
 
+struct ccw_device_dma_area {
+	struct senseid senseid;	/* SenseID info */
+	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
+	struct irb irb;		/* device status */
+	struct pgid pgid[8];	/* path group IDs per chpid*/
+};
+
 struct ccw_device_private {
 	struct ccw_device *cdev;
 	struct subchannel *sch;
@@ -156,11 +168,7 @@ struct ccw_device_private {
 	} __attribute__((packed)) flags;
 	unsigned long intparm;	/* user interruption parameter */
 	struct qdio_irq *qdio_data;
-	struct irb irb;		/* device status */
 	int async_kill_io_rc;
-	struct senseid senseid;	/* SenseID info */
-	struct pgid pgid[8];	/* path group IDs per chpid*/
-	struct ccw1 iccws[2];	/* ccws for SNID/SID/SPGID commands */
 	struct work_struct todo_work;
 	enum cdev_todo todo;
 	wait_queue_head_t wait_q;
@@ -169,6 +177,9 @@ struct ccw_device_private {
 	struct list_head cmb_list;	/* list of measured devices */
 	u64 cmb_start_time;		/* clock value of cmb reset */
 	void *cmb_wait;			/* deferred cmb enable/disable */
+	u64 dma_mask;
+	struct gen_pool *dma_pool;
+	struct ccw_device_dma_area *dma_area;
 	enum interruption_class int_class;
 };
 
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 6d989c360f38..bb7a92316fc8 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -66,7 +66,6 @@ struct virtio_ccw_device {
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
-	u64 dma_mask;
 };
 
 struct vq_info_block_legacy {
@@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		ret = -ENOMEM;
 		goto out_free;
 	}
-
 	vcdev->vdev.dev.parent = &cdev->dev;
-	cdev->dev.dma_mask = &vcdev->dma_mask;
-	/* we are fine with common virtio infrastructure using 64 bit DMA */
-	ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64));
-	if (ret) {
-		dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");
-		goto out_free;
-	}
-
 	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
 				   GFP_DMA | GFP_KERNEL);
 	if (!vcdev->config_block) {
-- 
2.16.4
Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
Protected virtualization guests have to use shared pages for airq
notifier bit vectors, because hypervisor needs to write these bits.
Let us make sure we allocate DMA memory for the notifier bit vectors.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 arch/s390/include/asm/airq.h |  2 ++
 drivers/s390/cio/airq.c      | 18 ++++++++++++++----
 2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index fcf539efb32f..1492d4856049 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -11,6 +11,7 @@
 #define _ASM_S390_AIRQ_H
 
 #include <linux/bit_spinlock.h>
+#include <linux/dma-mapping.h>
 
 struct airq_struct {
 	struct hlist_node list;		/* Handler queueing. */
@@ -29,6 +30,7 @@ void unregister_adapter_interrupt(struct airq_struct *airq);
 /* Adapter interrupt bit vector */
 struct airq_iv {
 	unsigned long *vector;	/* Adapter interrupt bit vector */
+	dma_addr_t vector_dma; /* Adapter interrupt bit vector dma */
 	unsigned long *avail;	/* Allocation bit mask for the bit vector */
 	unsigned long *bitlock;	/* Lock bit mask for the bit vector */
 	unsigned long *ptr;	/* Pointer associated with each bit */
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index a45011e4529e..7a5c0a08ee09 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -19,6 +19,7 @@
 
 #include <asm/airq.h>
 #include <asm/isc.h>
+#include <asm/cio.h>
 
 #include "cio.h"
 #include "cio_debug.h"
@@ -113,6 +114,11 @@ void __init init_airq_interrupts(void)
 	setup_irq(THIN_INTERRUPT, &airq_interrupt);
 }
 
+static inline unsigned long iv_size(unsigned long bits)
+{
+	return BITS_TO_LONGS(bits) * sizeof(unsigned long);
+}
+
 /**
  * airq_iv_create - create an interrupt vector
  * @bits: number of bits in the interrupt vector
@@ -123,14 +129,15 @@ void __init init_airq_interrupts(void)
 struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 {
 	struct airq_iv *iv;
-	unsigned long size;
+	unsigned long size = 0;
 
 	iv = kzalloc(sizeof(*iv), GFP_KERNEL);
 	if (!iv)
 		goto out;
 	iv->bits = bits;
-	size = BITS_TO_LONGS(bits) * sizeof(unsigned long);
-	iv->vector = kzalloc(size, GFP_KERNEL);
+	size = iv_size(bits);
+	iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size,
+						 &iv->vector_dma, GFP_KERNEL);
 	if (!iv->vector)
 		goto out_free;
 	if (flags & AIRQ_IV_ALLOC) {
@@ -165,7 +172,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned
long flags)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->avail);
-	kfree(iv->vector);
+	dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector,
+			  iv->vector_dma);
 	kfree(iv);
 out:
 	return NULL;
@@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv)
 	kfree(iv->ptr);
 	kfree(iv->bitlock);
 	kfree(iv->vector);
+	dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits),
+			  iv->vector, iv->vector_dma);
 	kfree(iv->avail);
 	kfree(iv);
 }
-- 
2.16.4
Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 08/10] virtio/s390: add indirection to indicators access
This will come in handy soon when we pull out the indicators from
virtio_ccw_device to a memory area that is shared with the hypervisor
(in particular for protected virtualization guests).
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index bb7a92316fc8..1f3e7d56924f 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -68,6 +68,16 @@ struct virtio_ccw_device {
 	void *airq_info;
 };
 
+static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
+{
+	return &vcdev->indicators;
+}
+
+static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
+{
+	return &vcdev->indicators2;
+}
+
 struct vq_info_block_legacy {
 	__u64 queue;
 	__u32 align;
@@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct
virtio_ccw_device *vcdev,
 		ccw->cda = (__u32)(unsigned long) thinint_area;
 	} else {
 		/* payload is the address of the indicators */
-		indicatorp = kmalloc(sizeof(&vcdev->indicators),
+		indicatorp = kmalloc(sizeof(indicators(vcdev)),
 				     GFP_DMA | GFP_KERNEL);
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
 		ccw->cmd_code = CCW_CMD_SET_IND;
-		ccw->count = sizeof(&vcdev->indicators);
+		ccw->count = sizeof(indicators(vcdev));
 		ccw->cda = (__u32)(unsigned long) indicatorp;
 	}
 	/* Deregister indicators from host. */
-	vcdev->indicators = 0;
+	*indicators(vcdev) = 0;
 	ccw->flags = 0;
 	ret = ccw_io_helper(vcdev, ccw,
 			    vcdev->is_thinint ?
@@ -656,10 +666,10 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev,
unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = kmalloc(sizeof(&vcdev->indicators), GFP_DMA | GFP_KERNEL);
+	indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL);
 	if (!indicatorp)
 		goto out;
-	*indicatorp = (unsigned long) &vcdev->indicators;
+	*indicatorp = (unsigned long) indicators(vcdev);
 	if (vcdev->is_thinint) {
 		ret = virtio_ccw_register_adapter_ind(vcdev, vqs, nvqs, ccw);
 		if (ret)
@@ -668,21 +678,21 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev,
unsigned nvqs,
 	}
 	if (!vcdev->is_thinint) {
 		/* Register queue indicators with host. */
-		vcdev->indicators = 0;
+		*indicators(vcdev) = 0;
 		ccw->cmd_code = CCW_CMD_SET_IND;
 		ccw->flags = 0;
-		ccw->count = sizeof(&vcdev->indicators);
+		ccw->count = sizeof(indicators(vcdev));
 		ccw->cda = (__u32)(unsigned long) indicatorp;
 		ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_IND);
 		if (ret)
 			goto out;
 	}
 	/* Register indicators2 with host for config changes */
-	*indicatorp = (unsigned long) &vcdev->indicators2;
-	vcdev->indicators2 = 0;
+	*indicatorp = (unsigned long) indicators2(vcdev);
+	*indicators2(vcdev) = 0;
 	ccw->cmd_code = CCW_CMD_SET_CONF_IND;
 	ccw->flags = 0;
-	ccw->count = sizeof(&vcdev->indicators2);
+	ccw->count = sizeof(indicators2(vcdev));
 	ccw->cda = (__u32)(unsigned long) indicatorp;
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_CONF_IND);
 	if (ret)
@@ -1092,17 +1102,17 @@ static void virtio_ccw_int_handler(struct ccw_device
*cdev,
 			vcdev->err = -EIO;
 	}
 	virtio_ccw_check_activity(vcdev, activity);
-	for_each_set_bit(i, &vcdev->indicators,
-			 sizeof(vcdev->indicators) * BITS_PER_BYTE) {
+	for_each_set_bit(i, indicators(vcdev),
+			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
 		/* The bit clear must happen before the vring kick. */
-		clear_bit(i, &vcdev->indicators);
+		clear_bit(i, indicators(vcdev));
 		barrier();
 		vq = virtio_ccw_vq_by_ind(vcdev, i);
 		vring_interrupt(0, vq);
 	}
-	if (test_bit(0, &vcdev->indicators2)) {
+	if (test_bit(0, indicators2(vcdev))) {
 		virtio_config_changed(&vcdev->vdev);
-		clear_bit(0, &vcdev->indicators2);
+		clear_bit(0, indicators2(vcdev));
 	}
 }
 
-- 
2.16.4
Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
Before virtio-ccw could get away with not using DMA API for the pieces of
memory it does ccw I/O with. With protected virtualization this has to
change, since the hypervisor needs to read and sometimes also write these
pieces of memory.
The hypervisor is supposed to poke the classic notifiers, if these are
used, out of band with regards to ccw I/O. So these need to be allocated
as DMA memory (which is shared memory for protected virtualization
guests).
Let us factor out everything from struct virtio_ccw_device that needs to
be DMA memory in a satellite that is allocated as such.
Note: The control blocks of I/O instructions do not need to be shared.
These are marshalled by the ultravisor.
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++------------------
 1 file changed, 96 insertions(+), 81 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 1f3e7d56924f..613b18001a0c 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -46,9 +46,15 @@ struct vq_config_block {
 #define VIRTIO_CCW_CONFIG_SIZE 0x100
 /* same as PCI config space size, should be enough for all drivers */
 
+struct vcdev_dma_area {
+	unsigned long indicators;
+	unsigned long indicators2;
+	struct vq_config_block config_block;
+	__u8 status;
+};
+
 struct virtio_ccw_device {
 	struct virtio_device vdev;
-	__u8 *status;
 	__u8 config[VIRTIO_CCW_CONFIG_SIZE];
 	struct ccw_device *cdev;
 	__u32 curr_io;
@@ -58,24 +64,22 @@ struct virtio_ccw_device {
 	spinlock_t lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
-	unsigned long indicators;
-	unsigned long indicators2;
-	struct vq_config_block *config_block;
 	bool is_thinint;
 	bool going_away;
 	bool device_lost;
 	unsigned int config_ready;
 	void *airq_info;
+	struct vcdev_dma_area *dma_area;
 };
 
 static inline unsigned long *indicators(struct virtio_ccw_device *vcdev)
 {
-	return &vcdev->indicators;
+	return &vcdev->dma_area->indicators;
 }
 
 static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev)
 {
-	return &vcdev->indicators2;
+	return &vcdev->dma_area->indicators2;
 }
 
 struct vq_info_block_legacy {
@@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct
virtio_device *vdev)
 	return container_of(vdev, struct virtio_ccw_device, vdev);
 }
 
+static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size)
+{
+	return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size);
+}
+
+static inline void __vc_dma_free(struct virtio_device *vdev, size_t size,
+				 void *cpu_addr)
+{
+	return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size);
+}
+
+#define vc_dma_alloc_struct(vdev, ptr) \
+	({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); })
+#define vc_dma_free_struct(vdev, ptr) \
+	__vc_dma_free(vdev, sizeof(*(ptr)), (ptr))
+
 static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
 {
 	unsigned long i, flags;
@@ -335,8 +355,7 @@ static void virtio_ccw_drop_indicator(struct
virtio_ccw_device *vcdev,
 	struct airq_info *airq_info = vcdev->airq_info;
 
 	if (vcdev->is_thinint) {
-		thinint_area = kzalloc(sizeof(*thinint_area),
-				       GFP_DMA | GFP_KERNEL);
+		vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
 		if (!thinint_area)
 			return;
 		thinint_area->summary_indicator @@ -347,8 +366,8 @@ static void
virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev,
 		ccw->cda = (__u32)(unsigned long) thinint_area;
 	} else {
 		/* payload is the address of the indicators */
-		indicatorp = kmalloc(sizeof(indicators(vcdev)),
-				     GFP_DMA | GFP_KERNEL);
+		indicatorp = __vc_dma_alloc(&vcdev->vdev,
+					   sizeof(indicators(vcdev)));
 		if (!indicatorp)
 			return;
 		*indicatorp = 0;
@@ -368,8 +387,9 @@ static void virtio_ccw_drop_indicator(struct
virtio_ccw_device *vcdev,
 			 "Failed to deregister indicators (%d)\n", ret);
 	else if (vcdev->is_thinint)
 		virtio_ccw_drop_indicators(vcdev);
-	kfree(indicatorp);
-	kfree(thinint_area);
+	__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+		      indicatorp);
+	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 }
 
 static inline long __do_kvm_notify(struct subchannel_id schid,
@@ -416,15 +436,15 @@ static int virtio_ccw_read_vq_conf(struct
virtio_ccw_device *vcdev,
 {
 	int ret;
 
-	vcdev->config_block->index = index;
+	vcdev->dma_area->config_block.index = index;
 	ccw->cmd_code = CCW_CMD_READ_VQ_CONF;
 	ccw->flags = 0;
 	ccw->count = sizeof(struct vq_config_block);
-	ccw->cda = (__u32)(unsigned long)(vcdev->config_block);
+	ccw->cda = (__u32)(unsigned
long)(&vcdev->dma_area->config_block);
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_VQ_CONF);
 	if (ret)
 		return ret;
-	return vcdev->config_block->num ?: -ENOENT;
+	return vcdev->dma_area->config_block.num ?: -ENOENT;
 }
 
 static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
@@ -469,7 +489,7 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct
ccw1 *ccw)
 			 ret, index);
 
 	vring_del_virtqueue(vq);
-	kfree(info->info_block);
+	vc_dma_free_struct(vq->vdev, info->info_block);
 	kfree(info);
 }
 
@@ -479,7 +499,7 @@ static void virtio_ccw_del_vqs(struct virtio_device *vdev)
 	struct ccw1 *ccw;
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
@@ -488,7 +508,7 @@ static void virtio_ccw_del_vqs(struct virtio_device *vdev)
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
 		virtio_ccw_del_vq(vq, ccw);
 
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
@@ -511,8 +531,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct
virtio_device *vdev,
 		err = -ENOMEM;
 		goto out_err;
 	}
-	info->info_block = kzalloc(sizeof(*info->info_block),
-				   GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, info->info_block);
 	if (!info->info_block) {
 		dev_warn(&vcdev->cdev->dev, "no info block\n");
 		err = -ENOMEM;
@@ -576,7 +595,7 @@ static struct virtqueue *virtio_ccw_setup_vq(struct
virtio_device *vdev,
 	if (vq)
 		vring_del_virtqueue(vq);
 	if (info) {
-		kfree(info->info_block);
+		vc_dma_free_struct(vdev, info->info_block);
 	}
 	kfree(info);
 	return ERR_PTR(err);
@@ -590,7 +609,7 @@ static int virtio_ccw_register_adapter_ind(struct
virtio_ccw_device *vcdev,
 	struct virtio_thinint_area *thinint_area = NULL;
 	struct airq_info *info;
 
-	thinint_area = kzalloc(sizeof(*thinint_area), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, thinint_area);
 	if (!thinint_area) {
 		ret = -ENOMEM;
 		goto out;
@@ -626,7 +645,7 @@ static int virtio_ccw_register_adapter_ind(struct
virtio_ccw_device *vcdev,
 		virtio_ccw_drop_indicators(vcdev);
 	}
 out:
-	kfree(thinint_area);
+	vc_dma_free_struct(&vcdev->vdev, thinint_area);
 	return ret;
 }
 
@@ -642,7 +661,7 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev,
unsigned nvqs,
 	int ret, i, queue_idx = 0;
 	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
 
@@ -666,7 +685,7 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev,
unsigned nvqs,
 	 * We need a data area under 2G to communicate. Our payload is
 	 * the address of the indicators.
 	*/
-	indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL);
+	indicatorp = __vc_dma_alloc(&vcdev->vdev, sizeof(indicators(vcdev)));
 	if (!indicatorp)
 		goto out;
 	*indicatorp = (unsigned long) indicators(vcdev);
@@ -698,12 +717,16 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev,
unsigned nvqs,
 	if (ret)
 		goto out;
 
-	kfree(indicatorp);
-	kfree(ccw);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+			       indicatorp);
+	vc_dma_free_struct(vdev, ccw);
 	return 0;
 out:
-	kfree(indicatorp);
-	kfree(ccw);
+	if (indicatorp)
+		__vc_dma_free(&vcdev->vdev, sizeof(indicators(vcdev)),
+			       indicatorp);
+	vc_dma_free_struct(vdev, ccw);
 	virtio_ccw_del_vqs(vdev);
 	return ret;
 }
@@ -713,12 +736,12 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
 	/* Zero status bits. */
-	*vcdev->status = 0;
+	vcdev->dma_area->status = 0;
 
 	/* Send a reset ccw on device. */
 	ccw->cmd_code = CCW_CMD_VDEV_RESET;
@@ -726,22 +749,22 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
 	ccw->count = 0;
 	ccw->cda = 0;
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static u64 virtio_ccw_get_features(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
 	struct virtio_feature_desc *features;
+	struct ccw1 *ccw;
 	int ret;
 	u64 rc;
-	struct ccw1 *ccw;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return 0;
 
-	features = kzalloc(sizeof(*features), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, features);
 	if (!features) {
 		rc = 0;
 		goto out_free;
@@ -774,8 +797,8 @@ static u64 virtio_ccw_get_features(struct virtio_device
*vdev)
 		rc |= (u64)le32_to_cpu(features->features) << 32;
 
 out_free:
-	kfree(features);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, features);
+	vc_dma_free_struct(vdev, ccw);
 	return rc;
 }
 
@@ -800,11 +823,11 @@ static int virtio_ccw_finalize_features(struct
virtio_device *vdev)
 		return -EINVAL;
 	}
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
 
-	features = kzalloc(sizeof(*features), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, features);
 	if (!features) {
 		ret = -ENOMEM;
 		goto out_free;
@@ -839,8 +862,8 @@ static int virtio_ccw_finalize_features(struct virtio_device
*vdev)
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_FEAT);
 
 out_free:
-	kfree(features);
-	kfree(ccw);
+	vc_dma_free_struct(vdev, features);
+	vc_dma_free_struct(vdev, ccw);
 
 	return ret;
 }
@@ -854,11 +877,11 @@ static void virtio_ccw_get_config(struct virtio_device
*vdev,
 	void *config_area;
 	unsigned long flags;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = kzalloc(VIRTIO_CCW_CONFIG_SIZE, GFP_DMA | GFP_KERNEL);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE);
 	if (!config_area)
 		goto out_free;
 
@@ -880,8 +903,8 @@ static void virtio_ccw_get_config(struct virtio_device
*vdev,
 		memcpy(buf, config_area + offset, len);
 
 out_free:
-	kfree(config_area);
-	kfree(ccw);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static void virtio_ccw_set_config(struct virtio_device *vdev,
@@ -893,11 +916,11 @@ static void virtio_ccw_set_config(struct virtio_device
*vdev,
 	void *config_area;
 	unsigned long flags;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
-	config_area = kzalloc(VIRTIO_CCW_CONFIG_SIZE, GFP_DMA | GFP_KERNEL);
+	config_area = __vc_dma_alloc(vdev, VIRTIO_CCW_CONFIG_SIZE);
 	if (!config_area)
 		goto out_free;
 
@@ -916,61 +939,61 @@ static void virtio_ccw_set_config(struct virtio_device
*vdev,
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_CONFIG);
 
 out_free:
-	kfree(config_area);
-	kfree(ccw);
+	__vc_dma_free(vdev, VIRTIO_CCW_CONFIG_SIZE, config_area);
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static u8 virtio_ccw_get_status(struct virtio_device *vdev)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	u8 old_status = *vcdev->status;
+	u8 old_status = vcdev->dma_area->status;
 	struct ccw1 *ccw;
 
 	if (vcdev->revision < 1)
-		return *vcdev->status;
+		return vcdev->dma_area->status;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return old_status;
 
 	ccw->cmd_code = CCW_CMD_READ_STATUS;
 	ccw->flags = 0;
-	ccw->count = sizeof(*vcdev->status);
-	ccw->cda = (__u32)(unsigned long)vcdev->status;
+	ccw->count = sizeof(vcdev->dma_area->status);
+	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
 	ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_STATUS);
 /*
  * If the channel program failed (should only happen if the device
  * was hotunplugged, and then we clean up via the machine check
- * handler anyway), vcdev->status was not overwritten and we just
+ * handler anyway), vcdev->dma_area->status was not overwritten and we
just
  * return the old status, which is fine.
 */
-	kfree(ccw);
+	vc_dma_free_struct(vdev, ccw);
 
-	return *vcdev->status;
+	return vcdev->dma_area->status;
 }
 
 static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
 {
 	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
-	u8 old_status = *vcdev->status;
+	u8 old_status = vcdev->dma_area->status;
 	struct ccw1 *ccw;
 	int ret;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(vdev, ccw);
 	if (!ccw)
 		return;
 
 	/* Write the status to the host. */
-	*vcdev->status = status;
+	vcdev->dma_area->status = status;
 	ccw->cmd_code = CCW_CMD_WRITE_STATUS;
 	ccw->flags = 0;
 	ccw->count = sizeof(status);
-	ccw->cda = (__u32)(unsigned long)vcdev->status;
+	ccw->cda = (__u32)(unsigned long)&vcdev->dma_area->status;
 	ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS);
 	/* Write failed? We assume status is unchanged. */
 	if (ret)
-		*vcdev->status = old_status;
-	kfree(ccw);
+		vcdev->dma_area->status = old_status;
+	vc_dma_free_struct(vdev, ccw);
 }
 
 static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
@@ -1003,8 +1026,7 @@ static void virtio_ccw_release_dev(struct device *_d)
 	struct virtio_device *dev = dev_to_virtio(_d);
 	struct virtio_ccw_device *vcdev = to_vc_device(dev);
 
-	kfree(vcdev->status);
-	kfree(vcdev->config_block);
+	vc_dma_free_struct(&vcdev->vdev, vcdev->dma_area);
 	kfree(vcdev);
 }
 
@@ -1212,12 +1234,12 @@ static int virtio_ccw_set_transport_rev(struct
virtio_ccw_device *vcdev)
 	struct ccw1 *ccw;
 	int ret;
 
-	ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, ccw);
 	if (!ccw)
 		return -ENOMEM;
-	rev = kzalloc(sizeof(*rev), GFP_DMA | GFP_KERNEL);
+	vc_dma_alloc_struct(&vcdev->vdev, rev);
 	if (!rev) {
-		kfree(ccw);
+		vc_dma_free_struct(&vcdev->vdev, ccw);
 		return -ENOMEM;
 	}
 
@@ -1247,8 +1269,8 @@ static int virtio_ccw_set_transport_rev(struct
virtio_ccw_device *vcdev)
 		}
 	} while (ret == -EOPNOTSUPP);
 
-	kfree(ccw);
-	kfree(rev);
+	vc_dma_free_struct(&vcdev->vdev, ccw);
+	vc_dma_free_struct(&vcdev->vdev, rev);
 	return ret;
 }
 
@@ -1265,14 +1287,9 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 		goto out_free;
 	}
 	vcdev->vdev.dev.parent = &cdev->dev;
-	vcdev->config_block = kzalloc(sizeof(*vcdev->config_block),
-				   GFP_DMA | GFP_KERNEL);
-	if (!vcdev->config_block) {
-		ret = -ENOMEM;
-		goto out_free;
-	}
-	vcdev->status = kzalloc(sizeof(*vcdev->status), GFP_DMA | GFP_KERNEL);
-	if (!vcdev->status) {
+	vcdev->cdev = cdev;
+	vc_dma_alloc_struct(&vcdev->vdev, vcdev->dma_area);
+	if (!vcdev->dma_area) {
 		ret = -ENOMEM;
 		goto out_free;
 	}
@@ -1281,7 +1298,6 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 
 	vcdev->vdev.dev.release = virtio_ccw_release_dev;
 	vcdev->vdev.config = &virtio_ccw_config_ops;
-	vcdev->cdev = cdev;
 	init_waitqueue_head(&vcdev->wait_q);
 	INIT_LIST_HEAD(&vcdev->virtqueues);
 	spin_lock_init(&vcdev->lock);
@@ -1312,8 +1328,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	return ret;
 out_free:
 	if (vcdev) {
-		kfree(vcdev->status);
-		kfree(vcdev->config_block);
+		vc_dma_free_struct(&vcdev->vdev, vcdev->dma_area);
 	}
 	kfree(vcdev);
 	return ret;
-- 
2.16.4
Halil Pasic
2019-Apr-26  18:32 UTC
[PATCH 10/10] virtio/s390: make airq summary indicators DMA
Hypervisor needs to interact with the summary indicators, so these
need to be DMA memory as well (at least for protected virtualization
guests).
Signed-off-by: Halil Pasic <pasic at linux.ibm.com>
---
 drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 613b18001a0c..6058b07fea08 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -140,11 +140,17 @@ static int virtio_ccw_use_airq = 1;
 
 struct airq_info {
 	rwlock_t lock;
-	u8 summary_indicator;
+	u8 summary_indicator_idx;
 	struct airq_struct airq;
 	struct airq_iv *aiv;
 };
 static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
+static u8 *summary_indicators;
+
+static inline u8 *get_summary_indicator(struct airq_info *info)
+{
+	return summary_indicators + info->summary_indicator_idx;
+}
 
 #define CCW_CMD_SET_VQ 0x13
 #define CCW_CMD_VDEV_RESET 0x33
@@ -225,7 +231,7 @@ static void virtio_airq_handler(struct airq_struct *airq)
 			break;
 		vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai));
 	}
-	info->summary_indicator = 0;
+	*(get_summary_indicator(info)) = 0;
 	smp_wmb();
 	/* Walk through indicators field, summary indicator not active. */
 	for (ai = 0;;) {
@@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq)
 	read_unlock(&info->lock);
 }
 
-static struct airq_info *new_airq_info(void)
+/* call with airq_areas_lock held */
+static struct airq_info *new_airq_info(int index)
 {
 	struct airq_info *info;
 	int rc;
@@ -252,7 +259,8 @@ static struct airq_info *new_airq_info(void)
 		return NULL;
 	}
 	info->airq.handler = virtio_airq_handler;
-	info->airq.lsi_ptr = &info->summary_indicator;
+	info->summary_indicator_idx = index;
+	info->airq.lsi_ptr = get_summary_indicator(info);
 	info->airq.lsi_mask = 0xff;
 	info->airq.isc = VIRTIO_AIRQ_ISC;
 	rc = register_adapter_interrupt(&info->airq);
@@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct virtqueue
*vqs[], int nvqs,
 	unsigned long bit, flags;
 
 	for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) {
+		/* TODO: this seems to be racy */
 		if (!airq_areas[i])
-			airq_areas[i] = new_airq_info();
+			airq_areas[i] = new_airq_info(i);
 		info = airq_areas[i];
 		if (!info)
 			return 0;
@@ -359,7 +368,7 @@ static void virtio_ccw_drop_indicator(struct
virtio_ccw_device *vcdev,
 		if (!thinint_area)
 			return;
 		thinint_area->summary_indicator -			(unsigned long)
&airq_info->summary_indicator;
+			(unsigned long) get_summary_indicator(airq_info);
 		thinint_area->isc = VIRTIO_AIRQ_ISC;
 		ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
 		ccw->count = sizeof(*thinint_area);
@@ -624,7 +633,7 @@ static int virtio_ccw_register_adapter_ind(struct
virtio_ccw_device *vcdev,
 	}
 	info = vcdev->airq_info;
 	thinint_area->summary_indicator -		(unsigned long)
&info->summary_indicator;
+		(unsigned long) get_summary_indicator(info);
 	thinint_area->isc = VIRTIO_AIRQ_ISC;
 	ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER;
 	ccw->flags = CCW_FLAG_SLI;
@@ -1500,6 +1509,7 @@ static int __init virtio_ccw_init(void)
 {
 	/* parse no_auto string before we do anything further */
 	no_auto_parse();
+	summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS);
 	return ccw_driver_register(&virtio_ccw_driver);
 }
 device_initcall(virtio_ccw_init);
-- 
2.16.4
Christoph Hellwig
2019-Apr-26  19:27 UTC
[PATCH 04/10] s390/mm: force swiotlb for protected virtualization
On Fri, Apr 26, 2019 at 08:32:39PM +0200, Halil Pasic wrote:> +EXPORT_SYMBOL_GPL(set_memory_encrypted);> +EXPORT_SYMBOL_GPL(set_memory_decrypted);> +EXPORT_SYMBOL_GPL(sev_active);Why do you export these? I know x86 exports those as well, but it shoudn't be needed there either.
On Fri, 26 Apr 2019 20:32:36 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> The commit 2a2d1382fe9d ("virtio: Add improved queue allocation API") > establishes a new way of allocating virtqueues (as a part of the effort > that taught DMA to virtio rings). > > In the future we will want virtio-ccw to use the DMA API as well. > > Let us switch from the legacy method of allocating virtqueues to > vring_create_virtqueue() as the first step into that direction. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 30 +++++++++++------------------- > 1 file changed, 11 insertions(+), 19 deletions(-)Reviewed-by: Cornelia Huck <cohuck at redhat.com> I'd vote for merging this patch right away for 5.2.
On Fri, 26 Apr 2019 20:32:37 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Currently virtio-ccw devices do not work if the device has > VIRTIO_F_IOMMU_PLATFORM. In future we do want to support DMA API with > virtio-ccw. > > Let us do the plumbing, so the feature VIRTIO_F_IOMMU_PLATFORM works > with virtio-ccw. > > Let us also switch from legacy avail/used accessors to the DMA aware > ones (even if it isn't strictly necessary), and remove the legacy > accessors (we were the last users). > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 22 +++++++++++++++------- > include/linux/virtio.h | 17 ----------------- > 2 files changed, 15 insertions(+), 24 deletions(-) > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index 2c66941ef3d0..42832a164546 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c(...)> @@ -1258,6 +1257,16 @@ static int virtio_ccw_online(struct ccw_device *cdev) > ret = -ENOMEM; > goto out_free; > } > + > + vcdev->vdev.dev.parent = &cdev->dev; > + cdev->dev.dma_mask = &vcdev->dma_mask; > + /* we are fine with common virtio infrastructure using 64 bit DMA */ > + ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64)); > + if (ret) { > + dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n");Drop the trailing period?> + goto out_free; > + } > + > vcdev->config_block = kzalloc(sizeof(*vcdev->config_block), > GFP_DMA | GFP_KERNEL); > if (!vcdev->config_block) {(...) Reviewed-by: Cornelia Huck <cohuck at redhat.com> Also 5.2 material, I think.
On Fri, 26 Apr 2019 20:32:38 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Nothing precludes to accepting VIRTIO_F_RING_PACKED any more."precludes us from accepting"> > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index 42832a164546..6d989c360f38 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c > @@ -773,10 +773,8 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev) > static void ccw_transport_features(struct virtio_device *vdev) > { > /* > - * There shouldn't be anything that precludes supporting packed. > - * TODO: Remove the limitation after having another look into this. > + * Currently nothing to do here. > */ > - __virtio_clear_bit(vdev, VIRTIO_F_RING_PACKED); > } > > static int virtio_ccw_finalize_features(struct virtio_device *vdev)Not sure whether we should merge this into the previous patch instead. Anyway, I think we can go ahead with that for 5.2 as well. Reviewed-by: Cornelia Huck <cohuck at redhat.com>
Cornelia Huck
2019-May-03  09:55 UTC
[PATCH 00/10] s390: virtio: support protected virtualization
On Fri, 26 Apr 2019 20:32:35 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Enhanced virtualization protection technology may require the use of > bounce buffers for I/O. While support for this was built into the virtio > core, virtio-ccw wasn't changed accordingly. > > Some background on technology (not part of this series) and the > terminology used. > > * Protected Virtualization (PV): > > Protected Virtualization guarantees, that non-shared memory of a guest > that operates in PV mode private to that guest. I.e. any attempts by the > hypervisor or other guests to access it will result in an exception. If > supported by the environment (machine, KVM, guest VM) a guest can decide > to change into PV mode by doing the appropriate ultravisor calls. Unlike > some other enhanced virtualization protection technology,I think that sentence misses its second part?> > * Ultravisor: > > A hardware/firmware entity that manages PV guests, and polices access to > their memory. A PV guest prospect needs to interact with the ultravisor, > to enter PV mode, and potentially to share pages (for I/O which should > be encrypted by the guest). A guest interacts with the ultravisor via so > called ultravisor calls. A hypervisor needs to interact with the > ultravisor to facilitate interpretation, emulation and swapping. A > hypervisor interacts with the ultravisor via ultravisor calls and via > the SIE state description. Generally the ultravisor sanitizes hypervisor > inputs so that the guest can not be corrupted (except for denial of > service. > > > What needs to be done > ====================> > Thus what needs to be done to bring virtio-ccw up to speed with respect > to protected virtualization is: > * use some 'new' common virtio stuffDoing this makes sense regardless of the protected virtualization use case, and I think we should go ahead and merge those patches for 5.2.> * make sure that virtio-ccw specific stuff uses shared memory when > talking to the hypervisor (except control/communication blocks like ORB, > these are handled by the ultravisor)TBH, I'm still a bit hazy on what needs to use shared memory and what doesn't.> * make sure the DMA API does what is necessary to talk through shared > memory if we are a protected virtualization guest. > * make sure the common IO layer plays along as well (airqs, sense). > > > Important notes > ===============> > * This patch set is based on Martins features branch > (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch > 'features'). > > * Documentation is still very sketchy. I'm committed to improving this, > but I'm currently hampered by some dependencies currently.I understand, but I think this really needs more doc; also for people who want to understand the code in the future. Unfortunately lack of doc also hampers others in reviewing this :/> > * The existing naming in the common infrastructure (kernel internal > interfaces) is pretty much based on the AMD SEV terminology. Thus the > names aren't always perfect. There might be merit to changing these > names to more abstract ones. I did not put much thought into that at > the current stage. > > * Testing: Please use iommu_platform=on for any virtio devices you are > going to test this code with (so virtio actually uses the DMA API). > > Change log > =========> > RFC --> v1: > * Fixed bugs found by Connie (may_reduce and handling reduced, warning, > split move -- thanks Connie!). > * Fixed console bug found by Sebastian (thanks Sebastian!). > * Removed the completely useless duplicate of dma-mapping.h spotted by > Christoph (thanks Christoph!). > * Don't use the global DMA pool for subchannel and ccw device > owned memory as requested by Sebastian. Consequences: > * Both subchannel and ccw devices have their dma masks > now (both specifying 31 bit addressable) > * We require at least 2 DMA pages per ccw device now, most of > this memory is wasted though. > * DMA memory allocated by virtio is also 31 bit addressable now > as virtio uses the parent (which is the ccw device). > * Enabled packed ring. > * Rebased onto Martins feature branch; using the actual uv (ultravisor) > interface instead of TODO comments. > * Added some explanations to the cover letter (Connie, David). > * Squashed a couple of patches together and fixed some text stuff. > > Looking forward to your review, or any other type of input. > > Halil Pasic (10): > virtio/s390: use vring_create_virtqueue > virtio/s390: DMA support for virtio-ccw > virtio/s390: enable packed ring > s390/mm: force swiotlb for protected virtualization > s390/cio: introduce DMA pools to cio > s390/cio: add basic protected virtualization support > s390/airq: use DMA memory for adapter interrupts > virtio/s390: add indirection to indicators access > virtio/s390: use DMA memory for ccw I/O and classic notifiers > virtio/s390: make airq summary indicators DMA > > arch/s390/Kconfig | 5 + > arch/s390/include/asm/airq.h | 2 + > arch/s390/include/asm/ccwdev.h | 4 + > arch/s390/include/asm/cio.h | 11 ++ > arch/s390/include/asm/mem_encrypt.h | 18 +++ > arch/s390/mm/init.c | 50 +++++++ > drivers/s390/cio/airq.c | 18 ++- > drivers/s390/cio/ccwreq.c | 8 +- > drivers/s390/cio/cio.h | 1 + > drivers/s390/cio/css.c | 101 +++++++++++++ > drivers/s390/cio/device.c | 65 +++++++-- > drivers/s390/cio/device_fsm.c | 40 +++--- > drivers/s390/cio/device_id.c | 18 +-- > drivers/s390/cio/device_ops.c | 21 ++- > drivers/s390/cio/device_pgid.c | 20 +-- > drivers/s390/cio/device_status.c | 24 ++-- > drivers/s390/cio/io_sch.h | 21 ++- > drivers/s390/virtio/virtio_ccw.c | 275 +++++++++++++++++++----------------- > include/linux/virtio.h | 17 --- > 19 files changed, 499 insertions(+), 220 deletions(-) > create mode 100644 arch/s390/include/asm/mem_encrypt.h >
Juergen Gross
2019-May-03  10:03 UTC
[PATCH 00/10] s390: virtio: support protected virtualization
On 03/05/2019 11:55, Cornelia Huck wrote:> On Fri, 26 Apr 2019 20:32:35 +0200 > Halil Pasic <pasic at linux.ibm.com> wrote: > >> Enhanced virtualization protection technology may require the use of >> bounce buffers for I/O. While support for this was built into the virtio >> core, virtio-ccw wasn't changed accordingly. >> >> Some background on technology (not part of this series) and the >> terminology used. >> >> * Protected Virtualization (PV):Uuh, you are aware that "PV" in virtualization environment is used as "Para-Virtualization" (e.g. in Xen) today? I believe you are risking a major mis-understanding here. Juergen
Cornelia Huck
2019-May-03  13:33 UTC
[PATCH 00/10] s390: virtio: support protected virtualization
On Fri, 3 May 2019 11:55:11 +0200 Cornelia Huck <cohuck at redhat.com> wrote:> On Fri, 26 Apr 2019 20:32:35 +0200 > Halil Pasic <pasic at linux.ibm.com> wrote: >> > What needs to be done > > ====================> > > > Thus what needs to be done to bring virtio-ccw up to speed with respect > > to protected virtualization is: > > * use some 'new' common virtio stuff > > Doing this makes sense regardless of the protected virtualization use > case, and I think we should go ahead and merge those patches for 5.2. > > > * make sure that virtio-ccw specific stuff uses shared memory when > > talking to the hypervisor (except control/communication blocks like ORB, > > these are handled by the ultravisor) > > TBH, I'm still a bit hazy on what needs to use shared memory and what > doesn't. > > > * make sure the DMA API does what is necessary to talk through shared > > memory if we are a protected virtualization guest. > > * make sure the common IO layer plays along as well (airqs, sense).I've skimmed some more over the rest of the patches; but I think this needs review by someone else. For example, I'm not sure what the semantics of using the main css device as the dma device are, as I'm not sufficiently familiar with the intricacies of the dma infrastructure. Combine this with a lack of documentation of the hardware architecture, and I think that others can provide much better feedback on this than I am able to.
Halil Pasic
2019-May-04  13:58 UTC
[PATCH 00/10] s390: virtio: support protected virtualization
On Fri, 3 May 2019 11:55:11 +0200 Cornelia Huck <cohuck at redhat.com> wrote:> On Fri, 26 Apr 2019 20:32:35 +0200 > Halil Pasic <pasic at linux.ibm.com> wrote: > > > Enhanced virtualization protection technology may require the use of > > bounce buffers for I/O. While support for this was built into the virtio > > core, virtio-ccw wasn't changed accordingly. > > > > Some background on technology (not part of this series) and the > > terminology used. > > > > * Protected Virtualization (PV): > > > > Protected Virtualization guarantees, that non-shared memory of a guest > > that operates in PV mode private to that guest. I.e. any attempts by the > > hypervisor or other guests to access it will result in an exception. If > > supported by the environment (machine, KVM, guest VM) a guest can decide > > to change into PV mode by doing the appropriate ultravisor calls. Unlike > > some other enhanced virtualization protection technology, > > I think that sentence misses its second part? >I wanted to kill the whole sentence, but killed only a part of it. :( Sorry. If any, the sentence had only significance for judging how well inherited some names fit.> > > > * Ultravisor: > > > > A hardware/firmware entity that manages PV guests, and polices access to > > their memory. A PV guest prospect needs to interact with the ultravisor, > > to enter PV mode, and potentially to share pages (for I/O which should > > be encrypted by the guest). A guest interacts with the ultravisor via so > > called ultravisor calls. A hypervisor needs to interact with the > > ultravisor to facilitate interpretation, emulation and swapping. A > > hypervisor interacts with the ultravisor via ultravisor calls and via > > the SIE state description. Generally the ultravisor sanitizes hypervisor > > inputs so that the guest can not be corrupted (except for denial of > > service. > > > > > > What needs to be done > > ====================> > > > Thus what needs to be done to bring virtio-ccw up to speed with respect > > to protected virtualization is: > > * use some 'new' common virtio stuff > > Doing this makes sense regardless of the protected virtualization use > case, and I think we should go ahead and merge those patches for 5.2. >I agree.> > * make sure that virtio-ccw specific stuff uses shared memory when > > talking to the hypervisor (except control/communication blocks like ORB, > > these are handled by the ultravisor) > > TBH, I'm still a bit hazy on what needs to use shared memory and what > doesn't. >It is all in the code :). To have complete and definitive answers here we would need some sort of public UV architecture.> > * make sure the DMA API does what is necessary to talk through shared > > memory if we are a protected virtualization guest. > > * make sure the common IO layer plays along as well (airqs, sense). > > > > > > Important notes > > ===============> > > > * This patch set is based on Martins features branch > > (git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git branch > > 'features'). > > > > * Documentation is still very sketchy. I'm committed to improving this, > > but I'm currently hampered by some dependencies currently. > > I understand, but I think this really needs more doc; also for people > who want to understand the code in the future. > > Unfortunately lack of doc also hampers others in reviewing this :/ >I'm not sure how much can we do on the doc front. Without a complete architecture, one basically needs to trust the guys with access to the architecture. Many thanks for your feedback. Regards, Halil [..]
Claudio Imbrenda
2019-May-08  13:15 UTC
[PATCH 04/10] s390/mm: force swiotlb for protected virtualization
On Fri, 26 Apr 2019 20:32:39 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> On s390, protected virtualization guests have to use bounced I/O > buffers. That requires some plumbing. > > Let us make sure, any device that uses DMA API with direct ops > correctly is spared from the problems, that a hypervisor attempting > I/O to a non-shared page would bring. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > arch/s390/Kconfig | 4 +++ > arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++ > arch/s390/mm/init.c | 50 > +++++++++++++++++++++++++++++++++++++ 3 files changed, 72 > insertions(+) create mode 100644 arch/s390/include/asm/mem_encrypt.h > > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig > index 1c3fcf19c3af..5500d05d4d53 100644 > --- a/arch/s390/Kconfig > +++ b/arch/s390/Kconfig > @@ -1,4 +1,7 @@ > # SPDX-License-Identifier: GPL-2.0 > +config ARCH_HAS_MEM_ENCRYPT > + def_bool y > + > config MMU > def_bool y > > @@ -191,6 +194,7 @@ config S390 > select ARCH_HAS_SCALED_CPUTIME > select VIRT_TO_BUS > select HAVE_NMI > + select SWIOTLB > > > config SCHED_OMIT_FRAME_POINTER > diff --git a/arch/s390/include/asm/mem_encrypt.h > b/arch/s390/include/asm/mem_encrypt.h new file mode 100644 > index 000000000000..0898c09a888c > --- /dev/null > +++ b/arch/s390/include/asm/mem_encrypt.h > @@ -0,0 +1,18 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef S390_MEM_ENCRYPT_H__ > +#define S390_MEM_ENCRYPT_H__ > + > +#ifndef __ASSEMBLY__ > + > +#define sme_me_mask 0ULLThis is rather ugly, but I understand why it's there> + > +static inline bool sme_active(void) { return false; } > +extern bool sev_active(void); > + > +int set_memory_encrypted(unsigned long addr, int numpages); > +int set_memory_decrypted(unsigned long addr, int numpages); > + > +#endif /* __ASSEMBLY__ */ > + > +#endif /* S390_MEM_ENCRYPT_H__ */ > + > diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c > index 3e82f66d5c61..7e3cbd15dcfa 100644 > --- a/arch/s390/mm/init.c > +++ b/arch/s390/mm/init.c > @@ -18,6 +18,7 @@ > #include <linux/mman.h> > #include <linux/mm.h> > #include <linux/swap.h> > +#include <linux/swiotlb.h> > #include <linux/smp.h> > #include <linux/init.h> > #include <linux/pagemap.h> > @@ -29,6 +30,7 @@ > #include <linux/export.h> > #include <linux/cma.h> > #include <linux/gfp.h> > +#include <linux/dma-mapping.h> > #include <asm/processor.h> > #include <linux/uaccess.h> > #include <asm/pgtable.h> > @@ -42,6 +44,8 @@ > #include <asm/sclp.h> > #include <asm/set_memory.h> > #include <asm/kasan.h> > +#include <asm/dma-mapping.h> > +#include <asm/uv.h> > > pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir); > > @@ -126,6 +130,50 @@ void mark_rodata_ro(void) > pr_info("Write protected read-only-after-init data: %luk\n", > size >> 10); } > > +int set_memory_encrypted(unsigned long addr, int numpages) > +{ > + int i; > + > + /* make all pages shared, (swiotlb, dma_free) */this is a copypaste typo, I think? (should be UNshared?) also, it doesn't make ALL pages unshared, but only those specified in the parameters with this fixed: Reviewed-by: Claudio Imbrenda <imbrenda at linux.ibm.com>> + for (i = 0; i < numpages; ++i) { > + uv_remove_shared(addr); > + addr += PAGE_SIZE; > + } > + return 0; > +} > +EXPORT_SYMBOL_GPL(set_memory_encrypted); > + > +int set_memory_decrypted(unsigned long addr, int numpages) > +{ > + int i; > + /* make all pages shared (swiotlb, dma_alloca) */same here with ALL> + for (i = 0; i < numpages; ++i) { > + uv_set_shared(addr); > + addr += PAGE_SIZE; > + } > + return 0; > +} > +EXPORT_SYMBOL_GPL(set_memory_decrypted); > + > +/* are we a protected virtualization guest? */ > +bool sev_active(void)this is also ugly. the correct solution would be probably to refactor everything, including all the AMD SEV code.... let's not go there> +{ > + return is_prot_virt_guest(); > +} > +EXPORT_SYMBOL_GPL(sev_active); > + > +/* protected virtualization */ > +static void pv_init(void) > +{ > + if (!sev_active())can't you just use is_prot_virt_guest here?> + return; > + > + /* make sure bounce buffers are shared */ > + swiotlb_init(1); > + swiotlb_update_mem_attributes(); > + swiotlb_force = SWIOTLB_FORCE; > +} > + > void __init mem_init(void) > { > cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask); > @@ -134,6 +182,8 @@ void __init mem_init(void) > set_max_mapnr(max_low_pfn); > high_memory = (void *) __va(max_low_pfn * PAGE_SIZE); > > + pv_init(); > + > /* Setup guest page hinting */ > cmma_init(); >
On Fri, 26 Apr 2019, Halil Pasic wrote:> @@ -224,6 +228,9 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid, > INIT_WORK(&sch->todo_work, css_sch_todo); > sch->dev.release = &css_subchannel_release; > device_initialize(&sch->dev); > + sch->dma_mask = css_dev_dma_mask; > + sch->dev.dma_mask = &sch->dma_mask; > + sch->dev.coherent_dma_mask = sch->dma_mask;Could we do: sch->dev.dma_mask = &sch->dev.coherent_dma_mask; sch->dev.coherent_dma_mask = css_dev_dma_mask; ?> +#define POOL_INIT_PAGES 1 > +static struct gen_pool *cio_dma_pool; > +/* Currently cio supports only a single css */ > +#define CIO_DMA_GFP (GFP_KERNEL | __GFP_ZERO)__GFP_ZERO has no meaning with the dma api (since all implementations do an implicit zero initialization) but let's keep it for the sake of documentation. We need GFP_DMA here (which will return addresses < 2G on s390)!> +void cio_gp_dma_free(struct gen_pool *gp_dma, void *cpu_addr, size_t size) > +{ > + if (!cpu_addr) > + return; > + memset(cpu_addr, 0, size);Hm, normally I'd do the memset during alloc not during free - but maybe this makes more sense here with your usecase in mind.> @@ -1063,6 +1163,7 @@ static int __init css_bus_init(void) > unregister_reboot_notifier(&css_reboot_notifier); > goto out_unregister; > } > + cio_dma_pool_init();This is too late for early devices (ccw console!).
Sebastian Ott
2019-May-08  13:46 UTC
[PATCH 06/10] s390/cio: add basic protected virtualization support
On Fri, 26 Apr 2019, Halil Pasic wrote:> static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch) > {[..]> + cdev->private = kzalloc(sizeof(struct ccw_device_private), > + GFP_KERNEL | GFP_DMA);Do we still need GFP_DMA here (since we now have cdev->private->dma_area)?> @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch) > if (!io_priv) > goto out_schedule; > > + io_priv->dma_area = dma_alloc_coherent(&sch->dev, > + sizeof(*io_priv->dma_area), > + &io_priv->dma_area_dma, GFP_KERNEL);This needs GFP_DMA. You use a genpool for ccw_private->dma and not for iopriv->dma - looks kinda inconsistent.
Sebastian Ott
2019-May-08  13:58 UTC
[PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
On Fri, 26 Apr 2019, Halil Pasic wrote:> @@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv) > kfree(iv->ptr); > kfree(iv->bitlock); > kfree(iv->vector);- kfree(iv->vector);> + dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits), > + iv->vector, iv->vector_dma); > kfree(iv->avail); > kfree(iv); > }Looks good to me but needs adaption to current code. Probably you can just revert my changes introducing cacheline aligned vectors since we now use a whole page.
Pierre Morel
2019-May-08  14:23 UTC
[PATCH 06/10] s390/cio: add basic protected virtualization support
On 26/04/2019 20:32, Halil Pasic wrote:> As virtio-ccw devices are channel devices, we need to use the dma area > for any communication with the hypervisor. > > This patch addresses the most basic stuff (mostly what is required for > virtio-ccw), and does take care of QDIO or any devices. > > An interesting side effect is that virtio structures are now going to > get allocated in 31 bit addressable storage. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > arch/s390/include/asm/ccwdev.h | 4 +++ > drivers/s390/cio/ccwreq.c | 8 ++--- > drivers/s390/cio/device.c | 65 +++++++++++++++++++++++++++++++++------- > drivers/s390/cio/device_fsm.c | 40 ++++++++++++------------- > drivers/s390/cio/device_id.c | 18 +++++------ > drivers/s390/cio/device_ops.c | 21 +++++++++++-- > drivers/s390/cio/device_pgid.c | 20 ++++++------- > drivers/s390/cio/device_status.c | 24 +++++++-------- > drivers/s390/cio/io_sch.h | 21 +++++++++---- > drivers/s390/virtio/virtio_ccw.c | 10 ------- > 10 files changed, 148 insertions(+), 83 deletions(-) > > diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h > index a29dd430fb40..865ce1cb86d5 100644 > --- a/arch/s390/include/asm/ccwdev.h > +++ b/arch/s390/include/asm/ccwdev.h > @@ -226,6 +226,10 @@ extern int ccw_device_enable_console(struct ccw_device *); > extern void ccw_device_wait_idle(struct ccw_device *); > extern int ccw_device_force_console(struct ccw_device *); > > +extern void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size); > +extern void ccw_device_dma_free(struct ccw_device *cdev, > + void *cpu_addr, size_t size); > + > int ccw_device_siosl(struct ccw_device *); > > extern void ccw_device_get_schid(struct ccw_device *, struct subchannel_id *); > diff --git a/drivers/s390/cio/ccwreq.c b/drivers/s390/cio/ccwreq.c > index 603268a33ea1..dafbceb311b3 100644 > --- a/drivers/s390/cio/ccwreq.c > +++ b/drivers/s390/cio/ccwreq.c > @@ -63,7 +63,7 @@ static void ccwreq_stop(struct ccw_device *cdev, int rc) > return; > req->done = 1; > ccw_device_set_timeout(cdev, 0); > - memset(&cdev->private->irb, 0, sizeof(struct irb)); > + memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb)); > if (rc && rc != -ENODEV && req->drc) > rc = req->drc; > req->callback(cdev, req->data, rc); > @@ -86,7 +86,7 @@ static void ccwreq_do(struct ccw_device *cdev) > continue; > } > /* Perform start function. */ > - memset(&cdev->private->irb, 0, sizeof(struct irb)); > + memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb)); > rc = cio_start(sch, cp, (u8) req->mask); > if (rc == 0) { > /* I/O started successfully. */ > @@ -169,7 +169,7 @@ int ccw_request_cancel(struct ccw_device *cdev) > */ > static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb) > { > - struct irb *irb = &cdev->private->irb; > + struct irb *irb = &cdev->private->dma_area->irb; > struct cmd_scsw *scsw = &irb->scsw.cmd; > enum uc_todo todo; > > @@ -187,7 +187,7 @@ static enum io_status ccwreq_status(struct ccw_device *cdev, struct irb *lcirb) > CIO_TRACE_EVENT(2, "sensedata"); > CIO_HEX_EVENT(2, &cdev->private->dev_id, > sizeof(struct ccw_dev_id)); > - CIO_HEX_EVENT(2, &cdev->private->irb.ecw, SENSE_MAX_COUNT); > + CIO_HEX_EVENT(2, &cdev->private->dma_area->irb.ecw, SENSE_MAX_COUNT); > /* Check for command reject. */ > if (irb->ecw[0] & SNS0_CMD_REJECT) > return IO_REJECTED; > diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c > index 1540229a37bb..a3310ee14a4a 100644 > --- a/drivers/s390/cio/device.c > +++ b/drivers/s390/cio/device.c > @@ -24,6 +24,7 @@ > #include <linux/timer.h> > #include <linux/kernel_stat.h> > #include <linux/sched/signal.h> > +#include <linux/dma-mapping.h> > > #include <asm/ccwdev.h> > #include <asm/cio.h> > @@ -687,6 +688,9 @@ ccw_device_release(struct device *dev) > struct ccw_device *cdev; > > cdev = to_ccwdev(dev); > + cio_gp_dma_free(cdev->private->dma_pool, cdev->private->dma_area, > + sizeof(*cdev->private->dma_area)); > + cio_gp_dma_destroy(cdev->private->dma_pool, &cdev->dev); > /* Release reference of parent subchannel. */ > put_device(cdev->dev.parent); > kfree(cdev->private); > @@ -696,15 +700,31 @@ ccw_device_release(struct device *dev) > static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch) > { > struct ccw_device *cdev; > + struct gen_pool *dma_pool; > > cdev = kzalloc(sizeof(*cdev), GFP_KERNEL); > - if (cdev) { > - cdev->private = kzalloc(sizeof(struct ccw_device_private), > - GFP_KERNEL | GFP_DMA); > - if (cdev->private) > - return cdev; > - } > + if (!cdev) > + goto err_cdev; > + cdev->private = kzalloc(sizeof(struct ccw_device_private), > + GFP_KERNEL | GFP_DMA); > + if (!cdev->private) > + goto err_priv; > + cdev->dev.dma_mask = &cdev->private->dma_mask; > + *cdev->dev.dma_mask = *sch->dev.dma_mask; > + cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask; > + dma_pool = cio_gp_dma_create(&cdev->dev, 1); > + cdev->private->dma_pool = dma_pool; > + cdev->private->dma_area = cio_gp_dma_zalloc(dma_pool, &cdev->dev, > + sizeof(*cdev->private->dma_area)); > + if (!cdev->private->dma_area) > + goto err_dma_area; > + return cdev; > +err_dma_area: > + cio_gp_dma_destroy(dma_pool, &cdev->dev); > + kfree(cdev->private); > +err_priv: > kfree(cdev); > +err_cdev: > return ERR_PTR(-ENOMEM); > } > > @@ -884,7 +904,7 @@ io_subchannel_recog_done(struct ccw_device *cdev) > wake_up(&ccw_device_init_wq); > break; > case DEV_STATE_OFFLINE: > - /* > + /* > * We can't register the device in interrupt context so > * we schedule a work item. > */ > @@ -1062,6 +1082,14 @@ static int io_subchannel_probe(struct subchannel *sch) > if (!io_priv) > goto out_schedule; > > + io_priv->dma_area = dma_alloc_coherent(&sch->dev, > + sizeof(*io_priv->dma_area), > + &io_priv->dma_area_dma, GFP_KERNEL); > + if (!io_priv->dma_area) { > + kfree(io_priv); > + goto out_schedule; > + } > + > set_io_private(sch, io_priv); > css_schedule_eval(sch->schid); > return 0; > @@ -1088,6 +1116,8 @@ static int io_subchannel_remove(struct subchannel *sch) > set_io_private(sch, NULL); > spin_unlock_irq(sch->lock); > out_free: > + dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area), > + io_priv->dma_area, io_priv->dma_area_dma); > kfree(io_priv); > sysfs_remove_group(&sch->dev.kobj, &io_subchannel_attr_group); > return 0; > @@ -1593,20 +1623,31 @@ struct ccw_device * __init ccw_device_create_console(struct ccw_driver *drv) > return ERR_CAST(sch); > > io_priv = kzalloc(sizeof(*io_priv), GFP_KERNEL | GFP_DMA); > - if (!io_priv) { > - put_device(&sch->dev); > - return ERR_PTR(-ENOMEM); > - } > + if (!io_priv) > + goto err_priv; > + io_priv->dma_area = dma_alloc_coherent(&sch->dev, > + sizeof(*io_priv->dma_area), > + &io_priv->dma_area_dma, GFP_KERNEL); > + if (!io_priv->dma_area) > + goto err_dma_area; > set_io_private(sch, io_priv); > cdev = io_subchannel_create_ccwdev(sch); > if (IS_ERR(cdev)) { > put_device(&sch->dev); > + dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area), > + io_priv->dma_area, io_priv->dma_area_dma); > kfree(io_priv); > return cdev; > } > cdev->drv = drv; > ccw_device_set_int_class(cdev); > return cdev; > + > +err_dma_area: > + kfree(io_priv); > +err_priv: > + put_device(&sch->dev); > + return ERR_PTR(-ENOMEM); > } > > void __init ccw_device_destroy_console(struct ccw_device *cdev) > @@ -1617,6 +1658,8 @@ void __init ccw_device_destroy_console(struct ccw_device *cdev) > set_io_private(sch, NULL); > put_device(&sch->dev); > put_device(&cdev->dev); > + dma_free_coherent(&sch->dev, sizeof(*io_priv->dma_area), > + io_priv->dma_area, io_priv->dma_area_dma); > kfree(io_priv); > } > > diff --git a/drivers/s390/cio/device_fsm.c b/drivers/s390/cio/device_fsm.c > index 9169af7dbb43..fea23b44795b 100644 > --- a/drivers/s390/cio/device_fsm.c > +++ b/drivers/s390/cio/device_fsm.c > @@ -67,8 +67,8 @@ static void ccw_timeout_log(struct ccw_device *cdev) > sizeof(struct tcw), 0); > } else { > printk(KERN_WARNING "cio: orb indicates command mode\n"); > - if ((void *)(addr_t)orb->cmd.cpa == &private->sense_ccw || > - (void *)(addr_t)orb->cmd.cpa == cdev->private->iccws) > + if ((void *)(addr_t)orb->cmd.cpa == &private->dma_area->sense_ccw || > + (void *)(addr_t)orb->cmd.cpa == cdev->private->dma_area->iccws) > printk(KERN_WARNING "cio: last channel program " > "(intern):\n"); > else > @@ -143,18 +143,18 @@ ccw_device_cancel_halt_clear(struct ccw_device *cdev) > void ccw_device_update_sense_data(struct ccw_device *cdev) > { > memset(&cdev->id, 0, sizeof(cdev->id)); > - cdev->id.cu_type = cdev->private->senseid.cu_type; > - cdev->id.cu_model = cdev->private->senseid.cu_model; > - cdev->id.dev_type = cdev->private->senseid.dev_type; > - cdev->id.dev_model = cdev->private->senseid.dev_model; > + cdev->id.cu_type = cdev->private->dma_area->senseid.cu_type; > + cdev->id.cu_model = cdev->private->dma_area->senseid.cu_model; > + cdev->id.dev_type = cdev->private->dma_area->senseid.dev_type; > + cdev->id.dev_model = cdev->private->dma_area->senseid.dev_model; > } > > int ccw_device_test_sense_data(struct ccw_device *cdev) > { > - return cdev->id.cu_type == cdev->private->senseid.cu_type && > - cdev->id.cu_model == cdev->private->senseid.cu_model && > - cdev->id.dev_type == cdev->private->senseid.dev_type && > - cdev->id.dev_model == cdev->private->senseid.dev_model; > + return cdev->id.cu_type == cdev->private->dma_area->senseid.cu_type && > + cdev->id.cu_model == cdev->private->dma_area->senseid.cu_model && > + cdev->id.dev_type == cdev->private->dma_area->senseid.dev_type && > + cdev->id.dev_model == cdev->private->dma_area->senseid.dev_model; > } > > /* > @@ -342,7 +342,7 @@ ccw_device_done(struct ccw_device *cdev, int state) > cio_disable_subchannel(sch); > > /* Reset device status. */ > - memset(&cdev->private->irb, 0, sizeof(struct irb)); > + memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb)); > > cdev->private->state = state; > > @@ -509,13 +509,13 @@ void ccw_device_verify_done(struct ccw_device *cdev, int err) > ccw_device_done(cdev, DEV_STATE_ONLINE); > /* Deliver fake irb to device driver, if needed. */ > if (cdev->private->flags.fake_irb) { > - create_fake_irb(&cdev->private->irb, > + create_fake_irb(&cdev->private->dma_area->irb, > cdev->private->flags.fake_irb); > cdev->private->flags.fake_irb = 0; > if (cdev->handler) > cdev->handler(cdev, cdev->private->intparm, > - &cdev->private->irb); > - memset(&cdev->private->irb, 0, sizeof(struct irb)); > + &cdev->private->dma_area->irb); > + memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb)); > } > ccw_device_report_path_events(cdev); > ccw_device_handle_broken_paths(cdev); > @@ -672,7 +672,7 @@ ccw_device_online_verify(struct ccw_device *cdev, enum dev_event dev_event) > > if (scsw_actl(&sch->schib.scsw) != 0 || > (scsw_stctl(&sch->schib.scsw) & SCSW_STCTL_STATUS_PEND) || > - (scsw_stctl(&cdev->private->irb.scsw) & SCSW_STCTL_STATUS_PEND)) { > + (scsw_stctl(&cdev->private->dma_area->irb.scsw) & SCSW_STCTL_STATUS_PEND)) { > /* > * No final status yet or final status not yet delivered > * to the device driver. Can't do path verification now, > @@ -719,7 +719,7 @@ static int ccw_device_call_handler(struct ccw_device *cdev) > * - fast notification was requested (primary status) > * - unsolicited interrupts > */ > - stctl = scsw_stctl(&cdev->private->irb.scsw); > + stctl = scsw_stctl(&cdev->private->dma_area->irb.scsw); > ending_status = (stctl & SCSW_STCTL_SEC_STATUS) || > (stctl == (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND)) || > (stctl == SCSW_STCTL_STATUS_PEND); > @@ -735,9 +735,9 @@ static int ccw_device_call_handler(struct ccw_device *cdev) > > if (cdev->handler) > cdev->handler(cdev, cdev->private->intparm, > - &cdev->private->irb); > + &cdev->private->dma_area->irb); > > - memset(&cdev->private->irb, 0, sizeof(struct irb)); > + memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb)); > return 1; > } > > @@ -759,7 +759,7 @@ ccw_device_irq(struct ccw_device *cdev, enum dev_event dev_event) > /* Unit check but no sense data. Need basic sense. */ > if (ccw_device_do_sense(cdev, irb) != 0) > goto call_handler_unsol; > - memcpy(&cdev->private->irb, irb, sizeof(struct irb)); > + memcpy(&cdev->private->dma_area->irb, irb, sizeof(struct irb)); > cdev->private->state = DEV_STATE_W4SENSE; > cdev->private->intparm = 0; > return; > @@ -842,7 +842,7 @@ ccw_device_w4sense(struct ccw_device *cdev, enum dev_event dev_event) > if (scsw_fctl(&irb->scsw) & > (SCSW_FCTL_CLEAR_FUNC | SCSW_FCTL_HALT_FUNC)) { > cdev->private->flags.dosense = 0; > - memset(&cdev->private->irb, 0, sizeof(struct irb)); > + memset(&cdev->private->dma_area->irb, 0, sizeof(struct irb)); > ccw_device_accumulate_irb(cdev, irb); > goto call_handler; > } > diff --git a/drivers/s390/cio/device_id.c b/drivers/s390/cio/device_id.c > index f6df83a9dfbb..ea8a0fc6c0b6 100644 > --- a/drivers/s390/cio/device_id.c > +++ b/drivers/s390/cio/device_id.c > @@ -99,7 +99,7 @@ static int diag210_to_senseid(struct senseid *senseid, struct diag210 *diag) > static int diag210_get_dev_info(struct ccw_device *cdev) > { > struct ccw_dev_id *dev_id = &cdev->private->dev_id; > - struct senseid *senseid = &cdev->private->senseid; > + struct senseid *senseid = &cdev->private->dma_area->senseid; > struct diag210 diag_data; > int rc; > > @@ -134,8 +134,8 @@ static int diag210_get_dev_info(struct ccw_device *cdev) > static void snsid_init(struct ccw_device *cdev) > { > cdev->private->flags.esid = 0; > - memset(&cdev->private->senseid, 0, sizeof(cdev->private->senseid)); > - cdev->private->senseid.cu_type = 0xffff; > + memset(&cdev->private->dma_area->senseid, 0, sizeof(cdev->private->dma_area->senseid)); > + cdev->private->dma_area->senseid.cu_type = 0xffff; > } > > /* > @@ -143,16 +143,16 @@ static void snsid_init(struct ccw_device *cdev) > */ > static int snsid_check(struct ccw_device *cdev, void *data) > { > - struct cmd_scsw *scsw = &cdev->private->irb.scsw.cmd; > + struct cmd_scsw *scsw = &cdev->private->dma_area->irb.scsw.cmd; > int len = sizeof(struct senseid) - scsw->count; > > /* Check for incomplete SENSE ID data. */ > if (len < SENSE_ID_MIN_LEN) > goto out_restart; > - if (cdev->private->senseid.cu_type == 0xffff) > + if (cdev->private->dma_area->senseid.cu_type == 0xffff) > goto out_restart; > /* Check for incompatible SENSE ID data. */ > - if (cdev->private->senseid.reserved != 0xff) > + if (cdev->private->dma_area->senseid.reserved != 0xff) > return -EOPNOTSUPP; > /* Check for extended-identification information. */ > if (len > SENSE_ID_BASIC_LEN) > @@ -170,7 +170,7 @@ static int snsid_check(struct ccw_device *cdev, void *data) > static void snsid_callback(struct ccw_device *cdev, void *data, int rc) > { > struct ccw_dev_id *id = &cdev->private->dev_id; > - struct senseid *senseid = &cdev->private->senseid; > + struct senseid *senseid = &cdev->private->dma_area->senseid; > int vm = 0; > > if (rc && MACHINE_IS_VM) { > @@ -200,7 +200,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev) > { > struct subchannel *sch = to_subchannel(cdev->dev.parent); > struct ccw_request *req = &cdev->private->req; > - struct ccw1 *cp = cdev->private->iccws; > + struct ccw1 *cp = cdev->private->dma_area->iccws; > > CIO_TRACE_EVENT(4, "snsid"); > CIO_HEX_EVENT(4, &cdev->private->dev_id, sizeof(cdev->private->dev_id)); > @@ -208,7 +208,7 @@ void ccw_device_sense_id_start(struct ccw_device *cdev) > snsid_init(cdev); > /* Channel program setup. */ > cp->cmd_code = CCW_CMD_SENSE_ID; > - cp->cda = (u32) (addr_t) &cdev->private->senseid; > + cp->cda = (u32) (addr_t) &cdev->private->dma_area->senseid; > cp->count = sizeof(struct senseid); > cp->flags = CCW_FLAG_SLI; > /* Request setup. */ > diff --git a/drivers/s390/cio/device_ops.c b/drivers/s390/cio/device_ops.c > index 4435ae0b3027..be4acfa9265a 100644 > --- a/drivers/s390/cio/device_ops.c > +++ b/drivers/s390/cio/device_ops.c > @@ -429,8 +429,8 @@ struct ciw *ccw_device_get_ciw(struct ccw_device *cdev, __u32 ct) > if (cdev->private->flags.esid == 0) > return NULL; > for (ciw_cnt = 0; ciw_cnt < MAX_CIWS; ciw_cnt++) > - if (cdev->private->senseid.ciw[ciw_cnt].ct == ct) > - return cdev->private->senseid.ciw + ciw_cnt; > + if (cdev->private->dma_area->senseid.ciw[ciw_cnt].ct == ct) > + return cdev->private->dma_area->senseid.ciw + ciw_cnt; > return NULL; > } > > @@ -699,6 +699,23 @@ void ccw_device_get_schid(struct ccw_device *cdev, struct subchannel_id *schid) > } > EXPORT_SYMBOL_GPL(ccw_device_get_schid); > > +/** > + * Allocate zeroed dma coherent 31 bit addressable memory using > + * the subchannels dma pool. Maximal size of allocation supported > + * is PAGE_SIZE. > + */ > +void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size) > +{ > + return cio_gp_dma_zalloc(cdev->private->dma_pool, &cdev->dev, size); > +} > +EXPORT_SYMBOL(ccw_device_dma_zalloc); > + > +void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr, size_t size) > +{ > + cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size); > +} > +EXPORT_SYMBOL(ccw_device_dma_free); > + > EXPORT_SYMBOL(ccw_device_set_options_mask); > EXPORT_SYMBOL(ccw_device_set_options); > EXPORT_SYMBOL(ccw_device_clear_options); > diff --git a/drivers/s390/cio/device_pgid.c b/drivers/s390/cio/device_pgid.c > index d30a3babf176..e97baa89cbf8 100644 > --- a/drivers/s390/cio/device_pgid.c > +++ b/drivers/s390/cio/device_pgid.c > @@ -57,7 +57,7 @@ static void verify_done(struct ccw_device *cdev, int rc) > static void nop_build_cp(struct ccw_device *cdev) > { > struct ccw_request *req = &cdev->private->req; > - struct ccw1 *cp = cdev->private->iccws; > + struct ccw1 *cp = cdev->private->dma_area->iccws; > > cp->cmd_code = CCW_CMD_NOOP; > cp->cda = 0; > @@ -134,9 +134,9 @@ static void nop_callback(struct ccw_device *cdev, void *data, int rc) > static void spid_build_cp(struct ccw_device *cdev, u8 fn) > { > struct ccw_request *req = &cdev->private->req; > - struct ccw1 *cp = cdev->private->iccws; > + struct ccw1 *cp = cdev->private->dma_area->iccws; > int i = pathmask_to_pos(req->lpm); > - struct pgid *pgid = &cdev->private->pgid[i]; > + struct pgid *pgid = &cdev->private->dma_area->pgid[i]; > > pgid->inf.fc = fn; > cp->cmd_code = CCW_CMD_SET_PGID; > @@ -300,7 +300,7 @@ static int pgid_cmp(struct pgid *p1, struct pgid *p2) > static void pgid_analyze(struct ccw_device *cdev, struct pgid **p, > int *mismatch, u8 *reserved, u8 *reset) > { > - struct pgid *pgid = &cdev->private->pgid[0]; > + struct pgid *pgid = &cdev->private->dma_area->pgid[0]; > struct pgid *first = NULL; > int lpm; > int i; > @@ -342,7 +342,7 @@ static u8 pgid_to_donepm(struct ccw_device *cdev) > lpm = 0x80 >> i; > if ((cdev->private->pgid_valid_mask & lpm) == 0) > continue; > - pgid = &cdev->private->pgid[i]; > + pgid = &cdev->private->dma_area->pgid[i]; > if (sch->opm & lpm) { > if (pgid->inf.ps.state1 != SNID_STATE1_GROUPED) > continue; > @@ -368,7 +368,7 @@ static void pgid_fill(struct ccw_device *cdev, struct pgid *pgid) > int i; > > for (i = 0; i < 8; i++) > - memcpy(&cdev->private->pgid[i], pgid, sizeof(struct pgid)); > + memcpy(&cdev->private->dma_area->pgid[i], pgid, sizeof(struct pgid)); > } > > /* > @@ -435,12 +435,12 @@ static void snid_done(struct ccw_device *cdev, int rc) > static void snid_build_cp(struct ccw_device *cdev) > { > struct ccw_request *req = &cdev->private->req; > - struct ccw1 *cp = cdev->private->iccws; > + struct ccw1 *cp = cdev->private->dma_area->iccws; > int i = pathmask_to_pos(req->lpm); > > /* Channel program setup. */ > cp->cmd_code = CCW_CMD_SENSE_PGID; > - cp->cda = (u32) (addr_t) &cdev->private->pgid[i]; > + cp->cda = (u32) (addr_t) &cdev->private->dma_area->pgid[i]; > cp->count = sizeof(struct pgid); > cp->flags = CCW_FLAG_SLI; > req->cp = cp; > @@ -516,7 +516,7 @@ static void verify_start(struct ccw_device *cdev) > sch->lpm = sch->schib.pmcw.pam; > > /* Initialize PGID data. */ > - memset(cdev->private->pgid, 0, sizeof(cdev->private->pgid)); > + memset(cdev->private->dma_area->pgid, 0, sizeof(cdev->private->dma_area->pgid)); > cdev->private->pgid_valid_mask = 0; > cdev->private->pgid_todo_mask = sch->schib.pmcw.pam; > cdev->private->path_notoper_mask = 0; > @@ -626,7 +626,7 @@ struct stlck_data { > static void stlck_build_cp(struct ccw_device *cdev, void *buf1, void *buf2) > { > struct ccw_request *req = &cdev->private->req; > - struct ccw1 *cp = cdev->private->iccws; > + struct ccw1 *cp = cdev->private->dma_area->iccws; > > cp[0].cmd_code = CCW_CMD_STLCK; > cp[0].cda = (u32) (addr_t) buf1; > diff --git a/drivers/s390/cio/device_status.c b/drivers/s390/cio/device_status.c > index 7d5c7892b2c4..a9aabde604f4 100644 > --- a/drivers/s390/cio/device_status.c > +++ b/drivers/s390/cio/device_status.c > @@ -79,15 +79,15 @@ ccw_device_accumulate_ecw(struct ccw_device *cdev, struct irb *irb) > * are condition that have to be met for the extended control > * bit to have meaning. Sick. > */ > - cdev->private->irb.scsw.cmd.ectl = 0; > + cdev->private->dma_area->irb.scsw.cmd.ectl = 0; > if ((irb->scsw.cmd.stctl & SCSW_STCTL_ALERT_STATUS) && > !(irb->scsw.cmd.stctl & SCSW_STCTL_INTER_STATUS)) > - cdev->private->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl; > + cdev->private->dma_area->irb.scsw.cmd.ectl = irb->scsw.cmd.ectl; > /* Check if extended control word is valid. */ > - if (!cdev->private->irb.scsw.cmd.ectl) > + if (!cdev->private->dma_area->irb.scsw.cmd.ectl) > return; > /* Copy concurrent sense / model dependent information. */ > - memcpy (&cdev->private->irb.ecw, irb->ecw, sizeof (irb->ecw)); > + memcpy (&cdev->private->dma_area->irb.ecw, irb->ecw, sizeof (irb->ecw));NIT, may be you should take the opportunity to remove the blanc before the parenthesis. NIT again, Some lines over 80 character too. just a first check, I will go deeper later. Regards, Pierre -- Pierre Morel Linux/KVM/QEMU in B?blingen - Germany
Pierre Morel
2019-May-08  14:31 UTC
[PATCH 08/10] virtio/s390: add indirection to indicators access
On 26/04/2019 20:32, Halil Pasic wrote:> This will come in handy soon when we pull out the indicators from > virtio_ccw_device to a memory area that is shared with the hypervisor > (in particular for protected virtualization guests). > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 40 +++++++++++++++++++++++++--------------- > 1 file changed, 25 insertions(+), 15 deletions(-) > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index bb7a92316fc8..1f3e7d56924f 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c > @@ -68,6 +68,16 @@ struct virtio_ccw_device { > void *airq_info; > }; > > +static inline unsigned long *indicators(struct virtio_ccw_device *vcdev) > +{ > + return &vcdev->indicators; > +} > + > +static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev) > +{ > + return &vcdev->indicators2; > +} > + > struct vq_info_block_legacy { > __u64 queue; > __u32 align; > @@ -337,17 +347,17 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev, > ccw->cda = (__u32)(unsigned long) thinint_area; > } else { > /* payload is the address of the indicators */ > - indicatorp = kmalloc(sizeof(&vcdev->indicators), > + indicatorp = kmalloc(sizeof(indicators(vcdev)), > GFP_DMA | GFP_KERNEL); > if (!indicatorp) > return; > *indicatorp = 0; > ccw->cmd_code = CCW_CMD_SET_IND; > - ccw->count = sizeof(&vcdev->indicators); > + ccw->count = sizeof(indicators(vcdev));This looks strange to me. Was already weird before. Lucky we are indicators are long... may be just sizeof(long)> ccw->cda = (__u32)(unsigned long) indicatorp; > } > /* Deregister indicators from host. */ > - vcdev->indicators = 0; > + *indicators(vcdev) = 0; > ccw->flags = 0; > ret = ccw_io_helper(vcdev, ccw, > vcdev->is_thinint ? > @@ -656,10 +666,10 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs, > * We need a data area under 2G to communicate. Our payload is > * the address of the indicators. > */ > - indicatorp = kmalloc(sizeof(&vcdev->indicators), GFP_DMA | GFP_KERNEL); > + indicatorp = kmalloc(sizeof(indicators(vcdev)), GFP_DMA | GFP_KERNEL); > if (!indicatorp) > goto out; > - *indicatorp = (unsigned long) &vcdev->indicators; > + *indicatorp = (unsigned long) indicators(vcdev); > if (vcdev->is_thinint) { > ret = virtio_ccw_register_adapter_ind(vcdev, vqs, nvqs, ccw); > if (ret) > @@ -668,21 +678,21 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs, > } > if (!vcdev->is_thinint) { > /* Register queue indicators with host. */ > - vcdev->indicators = 0; > + *indicators(vcdev) = 0; > ccw->cmd_code = CCW_CMD_SET_IND; > ccw->flags = 0; > - ccw->count = sizeof(&vcdev->indicators); > + ccw->count = sizeof(indicators(vcdev));same as before> ccw->cda = (__u32)(unsigned long) indicatorp; > ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_IND); > if (ret) > goto out; > } > /* Register indicators2 with host for config changes */ > - *indicatorp = (unsigned long) &vcdev->indicators2; > - vcdev->indicators2 = 0; > + *indicatorp = (unsigned long) indicators2(vcdev); > + *indicators2(vcdev) = 0; > ccw->cmd_code = CCW_CMD_SET_CONF_IND; > ccw->flags = 0; > - ccw->count = sizeof(&vcdev->indicators2); > + ccw->count = sizeof(indicators2(vcdev));here too> ccw->cda = (__u32)(unsigned long) indicatorp; > ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_CONF_IND); > if (ret) > @@ -1092,17 +1102,17 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev, > vcdev->err = -EIO; > } > virtio_ccw_check_activity(vcdev, activity); > - for_each_set_bit(i, &vcdev->indicators, > - sizeof(vcdev->indicators) * BITS_PER_BYTE) { > + for_each_set_bit(i, indicators(vcdev), > + sizeof(*indicators(vcdev)) * BITS_PER_BYTE) { > /* The bit clear must happen before the vring kick. */ > - clear_bit(i, &vcdev->indicators); > + clear_bit(i, indicators(vcdev)); > barrier(); > vq = virtio_ccw_vq_by_ind(vcdev, i); > vring_interrupt(0, vq); > } > - if (test_bit(0, &vcdev->indicators2)) { > + if (test_bit(0, indicators2(vcdev))) { > virtio_config_changed(&vcdev->vdev); > - clear_bit(0, &vcdev->indicators2); > + clear_bit(0, indicators2(vcdev)); > } > } > >Here again just a fast check. I will go in the functional later. Regards, Pierre -- Pierre Morel Linux/KVM/QEMU in B?blingen - Germany
Pierre Morel
2019-May-08  14:46 UTC
[PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
On 26/04/2019 20:32, Halil Pasic wrote:> Before virtio-ccw could get away with not using DMA API for the pieces of > memory it does ccw I/O with. With protected virtualization this has to > change, since the hypervisor needs to read and sometimes also write these > pieces of memory. > > The hypervisor is supposed to poke the classic notifiers, if these are > used, out of band with regards to ccw I/O. So these need to be allocated > as DMA memory (which is shared memory for protected virtualization > guests). > > Let us factor out everything from struct virtio_ccw_device that needs to > be DMA memory in a satellite that is allocated as such. > > Note: The control blocks of I/O instructions do not need to be shared. > These are marshalled by the ultravisor. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++------------------ > 1 file changed, 96 insertions(+), 81 deletions(-) > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index 1f3e7d56924f..613b18001a0c 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c > @@ -46,9 +46,15 @@ struct vq_config_block { > #define VIRTIO_CCW_CONFIG_SIZE 0x100 > /* same as PCI config space size, should be enough for all drivers */ > > +struct vcdev_dma_area { > + unsigned long indicators; > + unsigned long indicators2; > + struct vq_config_block config_block; > + __u8 status; > +}; > + > struct virtio_ccw_device { > struct virtio_device vdev; > - __u8 *status; > __u8 config[VIRTIO_CCW_CONFIG_SIZE]; > struct ccw_device *cdev; > __u32 curr_io; > @@ -58,24 +64,22 @@ struct virtio_ccw_device { > spinlock_t lock; > struct mutex io_lock; /* Serializes I/O requests */ > struct list_head virtqueues; > - unsigned long indicators; > - unsigned long indicators2; > - struct vq_config_block *config_block; > bool is_thinint; > bool going_away; > bool device_lost; > unsigned int config_ready; > void *airq_info; > + struct vcdev_dma_area *dma_area; > }; > > static inline unsigned long *indicators(struct virtio_ccw_device *vcdev) > { > - return &vcdev->indicators; > + return &vcdev->dma_area->indicators; > } > > static inline unsigned long *indicators2(struct virtio_ccw_device *vcdev) > { > - return &vcdev->indicators2; > + return &vcdev->dma_area->indicators2; > } > > struct vq_info_block_legacy { > @@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev) > return container_of(vdev, struct virtio_ccw_device, vdev); > } > > +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size) > +{ > + return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size); > +} > + > +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size, > + void *cpu_addr) > +{ > + return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size); > +} > + > +#define vc_dma_alloc_struct(vdev, ptr) \ > + ({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); }) > +#define vc_dma_free_struct(vdev, ptr) \ > + __vc_dma_free(vdev, sizeof(*(ptr)), (ptr)) > + > static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info) > { > unsigned long i, flags; > @@ -335,8 +355,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev, > struct airq_info *airq_info = vcdev->airq_info; > > if (vcdev->is_thinint) { > - thinint_area = kzalloc(sizeof(*thinint_area), > - GFP_DMA | GFP_KERNEL); > + vc_dma_alloc_struct(&vcdev->vdev, thinint_area); > if (!thinint_area) > return; > thinint_area->summary_indicator > @@ -347,8 +366,8 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev, > ccw->cda = (__u32)(unsigned long) thinint_area; > } else { > /* payload is the address of the indicators */ > - indicatorp = kmalloc(sizeof(indicators(vcdev)), > - GFP_DMA | GFP_KERNEL); > + indicatorp = __vc_dma_alloc(&vcdev->vdev, > + sizeof(indicators(vcdev)));should be sizeof(long) ? This is a recurrent error, but it is not an issue because the size of the indicators is unsigned long as the size of the pointer. Regards, Pierre -- Pierre Morel Linux/KVM/QEMU in B?blingen - Germany
Pierre Morel
2019-May-08  15:11 UTC
[PATCH 10/10] virtio/s390: make airq summary indicators DMA
On 26/04/2019 20:32, Halil Pasic wrote:> Hypervisor needs to interact with the summary indicators, so these > need to be DMA memory as well (at least for protected virtualization > guests). > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++------- > 1 file changed, 17 insertions(+), 7 deletions(-) > > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index 613b18001a0c..6058b07fea08 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c > @@ -140,11 +140,17 @@ static int virtio_ccw_use_airq = 1; > > struct airq_info { > rwlock_t lock; > - u8 summary_indicator; > + u8 summary_indicator_idx; > struct airq_struct airq; > struct airq_iv *aiv; > }; > static struct airq_info *airq_areas[MAX_AIRQ_AREAS]; > +static u8 *summary_indicators; > + > +static inline u8 *get_summary_indicator(struct airq_info *info) > +{ > + return summary_indicators + info->summary_indicator_idx; > +} > > #define CCW_CMD_SET_VQ 0x13 > #define CCW_CMD_VDEV_RESET 0x33 > @@ -225,7 +231,7 @@ static void virtio_airq_handler(struct airq_struct *airq) > break; > vring_interrupt(0, (void *)airq_iv_get_ptr(info->aiv, ai)); > } > - info->summary_indicator = 0; > + *(get_summary_indicator(info)) = 0; > smp_wmb(); > /* Walk through indicators field, summary indicator not active. */ > for (ai = 0;;) { > @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq) > read_unlock(&info->lock); > } > > -static struct airq_info *new_airq_info(void) > +/* call with airq_areas_lock held */ > +static struct airq_info *new_airq_info(int index) > { > struct airq_info *info; > int rc; > @@ -252,7 +259,8 @@ static struct airq_info *new_airq_info(void) > return NULL; > } > info->airq.handler = virtio_airq_handler; > - info->airq.lsi_ptr = &info->summary_indicator; > + info->summary_indicator_idx = index; > + info->airq.lsi_ptr = get_summary_indicator(info); > info->airq.lsi_mask = 0xff; > info->airq.isc = VIRTIO_AIRQ_ISC; > rc = register_adapter_interrupt(&info->airq); > @@ -273,8 +281,9 @@ static unsigned long get_airq_indicator(struct virtqueue *vqs[], int nvqs, > unsigned long bit, flags; > > for (i = 0; i < MAX_AIRQ_AREAS && !indicator_addr; i++) { > + /* TODO: this seems to be racy */yes, my opinions too, was already racy, in my opinion, we need another patch in another series to fix this. However, not sure about the comment.> if (!airq_areas[i]) > - airq_areas[i] = new_airq_info(); > + airq_areas[i] = new_airq_info(i); > info = airq_areas[i]; > if (!info) > return 0; > @@ -359,7 +368,7 @@ static void virtio_ccw_drop_indicator(struct virtio_ccw_device *vcdev, > if (!thinint_area) > return; > thinint_area->summary_indicator > - (unsigned long) &airq_info->summary_indicator; > + (unsigned long) get_summary_indicator(airq_info); > thinint_area->isc = VIRTIO_AIRQ_ISC; > ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER; > ccw->count = sizeof(*thinint_area); > @@ -624,7 +633,7 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev, > } > info = vcdev->airq_info; > thinint_area->summary_indicator > - (unsigned long) &info->summary_indicator; > + (unsigned long) get_summary_indicator(info); > thinint_area->isc = VIRTIO_AIRQ_ISC; > ccw->cmd_code = CCW_CMD_SET_IND_ADAPTER; > ccw->flags = CCW_FLAG_SLI; > @@ -1500,6 +1509,7 @@ static int __init virtio_ccw_init(void) > { > /* parse no_auto string before we do anything further */ > no_auto_parse(); > + summary_indicators = cio_dma_zalloc(MAX_AIRQ_AREAS); > return ccw_driver_register(&virtio_ccw_driver); > } > device_initcall(virtio_ccw_init); >-- Pierre Morel Linux/KVM/QEMU in B?blingen - Germany
Cornelia Huck
2019-May-09  11:37 UTC
[PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
On Fri, 26 Apr 2019 20:32:42 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Protected virtualization guests have to use shared pages for airq > notifier bit vectors, because hypervisor needs to write these bits. > > Let us make sure we allocate DMA memory for the notifier bit vectors. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > arch/s390/include/asm/airq.h | 2 ++ > drivers/s390/cio/airq.c | 18 ++++++++++++++---- > 2 files changed, 16 insertions(+), 4 deletions(-)As an aside, there are some other devices that use adapter interrupts as well (pci, ap, qdio). How does that interact with their needs? Do they continue to work on non-protected virt guests (kvm or z/VM), and can they be accommodated if support for them on protected guests is added in the future? (For some of the indicator bit handling, I suspect millicode takes care of it anyway, but at least for pci, there's some hypervisor action to be aware of.) Also, as another aside, has this been tested with a regular guest under tcg as well? If not, can you provide a branch for quick verification somewhere?
Jason J. Herne
2019-May-09  18:05 UTC
[PATCH 04/10] s390/mm: force swiotlb for protected virtualization
> Subject: [PATCH 04/10] s390/mm: force swiotlb for protected virtualization > Date: Fri, 26 Apr 2019 20:32:39 +0200 > From: Halil Pasic <pasic at linux.ibm.com> > To: kvm at vger.kernel.org, linux-s390 at vger.kernel.org, Cornelia Huck <cohuck at redhat.com>, > Martin Schwidefsky <schwidefsky at de.ibm.com>, Sebastian Ott <sebott at linux.ibm.com> > CC: Halil Pasic <pasic at linux.ibm.com>, virtualization at lists.linux-foundation.org, Michael > S. Tsirkin <mst at redhat.com>, Christoph Hellwig <hch at infradead.org>, Thomas Huth > <thuth at redhat.com>, Christian Borntraeger <borntraeger at de.ibm.com>, Viktor Mihajlovski > <mihajlov at linux.ibm.com>, Vasily Gorbik <gor at linux.ibm.com>, Janosch Frank > <frankja at linux.ibm.com>, Claudio Imbrenda <imbrenda at linux.ibm.com>, Farhan Ali > <alifm at linux.ibm.com>, Eric Farman <farman at linux.ibm.com> > > On s390, protected virtualization guests have to use bounced I/O > buffers.? That requires some plumbing. > > Let us make sure, any device that uses DMA API with direct ops correctly > is spared from the problems, that a hypervisor attempting I/O to a > non-shared page would bring. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > ?arch/s390/Kconfig?????????????????? |? 4 +++ > ?arch/s390/include/asm/mem_encrypt.h | 18 +++++++++++++ > ?arch/s390/mm/init.c???????????????? | 50 +++++++++++++++++++++++++++++++++++++ > ?3 files changed, 72 insertions(+) > ?create mode 100644 arch/s390/include/asm/mem_encrypt.h > > diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig > index 1c3fcf19c3af..5500d05d4d53 100644 > --- a/arch/s390/Kconfig > +++ b/arch/s390/Kconfig > @@ -1,4 +1,7 @@ > ?# SPDX-License-Identifier: GPL-2.0 > +config ARCH_HAS_MEM_ENCRYPT > +??????? def_bool y > + > ?config MMU > ???? def_bool y > ?@@ -191,6 +194,7 @@ config S390 > ???? select ARCH_HAS_SCALED_CPUTIME > ???? select VIRT_TO_BUS > ???? select HAVE_NMI > +??? select SWIOTLB > ?? config SCHED_OMIT_FRAME_POINTER > diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h > new file mode 100644 > index 000000000000..0898c09a888c > --- /dev/null > +++ b/arch/s390/include/asm/mem_encrypt.h > @@ -0,0 +1,18 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef S390_MEM_ENCRYPT_H__ > +#define S390_MEM_ENCRYPT_H__ > + > +#ifndef __ASSEMBLY__ > + > +#define sme_me_mask??? 0ULL > + > +static inline bool sme_active(void) { return false; } > +extern bool sev_active(void); > +I noticed this patch always returns false for sme_active. Is it safe to assume that whatever fixups are required on x86 to deal with sme do not apply to s390?> +int set_memory_encrypted(unsigned long addr, int numpages); > +int set_memory_decrypted(unsigned long addr, int numpages); > + > +#endif??? /* __ASSEMBLY__ */ > + > +#endif??? /* S390_MEM_ENCRYPT_H__ */ > + > diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c > index 3e82f66d5c61..7e3cbd15dcfa 100644 > --- a/arch/s390/mm/init.c > +++ b/arch/s390/mm/init.c > @@ -18,6 +18,7 @@ > ?#include <linux/mman.h> > ?#include <linux/mm.h> > ?#include <linux/swap.h> > +#include <linux/swiotlb.h> > ?#include <linux/smp.h> > ?#include <linux/init.h> > ?#include <linux/pagemap.h> > @@ -29,6 +30,7 @@ > ?#include <linux/export.h> > ?#include <linux/cma.h> > ?#include <linux/gfp.h> > +#include <linux/dma-mapping.h> > ?#include <asm/processor.h> > ?#include <linux/uaccess.h> > ?#include <asm/pgtable.h> > @@ -42,6 +44,8 @@ > ?#include <asm/sclp.h> > ?#include <asm/set_memory.h> > ?#include <asm/kasan.h> > +#include <asm/dma-mapping.h> > +#include <asm/uv.h> > ? pgd_t swapper_pg_dir[PTRS_PER_PGD] __section(.bss..swapper_pg_dir); > ?@@ -126,6 +130,50 @@ void mark_rodata_ro(void) > ???? pr_info("Write protected read-only-after-init data: %luk\n", size >> 10); > ?} > ?+int set_memory_encrypted(unsigned long addr, int numpages) > +{ > +??? int i; > + > +??? /* make all pages shared, (swiotlb, dma_free) */This comment should be "make all pages unshared"?> +??? for (i = 0; i < numpages; ++i) { > +??????? uv_remove_shared(addr); > +??????? addr += PAGE_SIZE; > +??? } > +??? return 0; > +} > +EXPORT_SYMBOL_GPL(set_memory_encrypted); > + > +int set_memory_decrypted(unsigned long addr, int numpages) > +{ > +??? int i; > +??? /* make all pages shared (swiotlb, dma_alloca) */ > +??? for (i = 0; i < numpages; ++i) { > +??????? uv_set_shared(addr); > +??????? addr += PAGE_SIZE; > +??? } > +??? return 0; > +} > +EXPORT_SYMBOL_GPL(set_memory_decrypted);The addr arguments for the above functions appear to be referring to virtual addresses. Would vaddr be a better name?> + > +/* are we a protected virtualization guest? */ > +bool sev_active(void) > +{ > +??? return is_prot_virt_guest(); > +} > +EXPORT_SYMBOL_GPL(sev_active); > + > +/* protected virtualization */ > +static void pv_init(void) > +{ > +??? if (!sev_active()) > +??????? return; > + > +??? /* make sure bounce buffers are shared */ > +??? swiotlb_init(1); > +??? swiotlb_update_mem_attributes(); > +??? swiotlb_force = SWIOTLB_FORCE; > +} > + > ?void __init mem_init(void) > ?{ > ???? cpumask_set_cpu(0, &init_mm.context.cpu_attach_mask); > @@ -134,6 +182,8 @@ void __init mem_init(void) > ???? set_max_mapnr(max_low_pfn); > ???????? high_memory = (void *) __va(max_low_pfn * PAGE_SIZE); > ?+??? pv_init(); > + > ???? /* Setup guest page hinting */ > ???? cmma_init(); > ?-- 2.16.4 > >-- -- Jason J. Herne (jjherne at linux.ibm.com)
Claudio Imbrenda
2019-May-10  07:49 UTC
[PATCH 04/10] s390/mm: force swiotlb for protected virtualization
On Thu, 9 May 2019 14:05:20 -0400 "Jason J. Herne" <jjherne at linux.ibm.com> wrote: [...]> > +#define sme_me_mask??? 0ULL > > + > > +static inline bool sme_active(void) { return false; } > > +extern bool sev_active(void); > > + > > I noticed this patch always returns false for sme_active. Is it safe > to assume that whatever fixups are required on x86 to deal with sme > do not apply to s390?yes, on x86 sev_active returns false if SEV is enabled. SME is for host memory encryption. from arch/x86/mm/mem_encrypt.c: bool sme_active(void) { return sme_me_mask && !sev_enabled; } and it makes sense because you can't have both SME and SEV enabled on the same kernel, because either you're running on bare metal (and then you can have SME) __or__ you are running as a guest (and then you can have SEV). The key difference is that DMA operations don't need bounce buffers with SME, but they do with SEV. I hope this clarifies your doubts :) [...]
Cornelia Huck
2019-May-13  09:41 UTC
[PATCH 06/10] s390/cio: add basic protected virtualization support
On Fri, 26 Apr 2019 20:32:41 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> As virtio-ccw devices are channel devices, we need to use the dma area > for any communication with the hypervisor. > > This patch addresses the most basic stuff (mostly what is required for > virtio-ccw), and does take care of QDIO or any devices."does not take care of QDIO", surely? (Also, what does "any devices" mean? Do you mean "every arbitrary device", perhaps?)> > An interesting side effect is that virtio structures are now going to > get allocated in 31 bit addressable storage.Hm...> > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > arch/s390/include/asm/ccwdev.h | 4 +++ > drivers/s390/cio/ccwreq.c | 8 ++--- > drivers/s390/cio/device.c | 65 +++++++++++++++++++++++++++++++++------- > drivers/s390/cio/device_fsm.c | 40 ++++++++++++------------- > drivers/s390/cio/device_id.c | 18 +++++------ > drivers/s390/cio/device_ops.c | 21 +++++++++++-- > drivers/s390/cio/device_pgid.c | 20 ++++++------- > drivers/s390/cio/device_status.c | 24 +++++++-------- > drivers/s390/cio/io_sch.h | 21 +++++++++---- > drivers/s390/virtio/virtio_ccw.c | 10 ------- > 10 files changed, 148 insertions(+), 83 deletions(-)(...)> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c > index 6d989c360f38..bb7a92316fc8 100644 > --- a/drivers/s390/virtio/virtio_ccw.c > +++ b/drivers/s390/virtio/virtio_ccw.c > @@ -66,7 +66,6 @@ struct virtio_ccw_device { > bool device_lost; > unsigned int config_ready; > void *airq_info; > - u64 dma_mask; > }; > > struct vq_info_block_legacy { > @@ -1255,16 +1254,7 @@ static int virtio_ccw_online(struct ccw_device *cdev) > ret = -ENOMEM; > goto out_free; > } > - > vcdev->vdev.dev.parent = &cdev->dev; > - cdev->dev.dma_mask = &vcdev->dma_mask; > - /* we are fine with common virtio infrastructure using 64 bit DMA */ > - ret = dma_set_mask_and_coherent(&cdev->dev, DMA_BIT_MASK(64)); > - if (ret) { > - dev_warn(&cdev->dev, "Failed to enable 64-bit DMA.\n"); > - goto out_free; > - }This means that vring structures now need to fit into 31 bits as well, I think? Is there any way to reserve the 31 bit restriction for channel subsystem structures and keep vring in the full 64 bit range? (Or am I fundamentally misunderstanding something?)> - > vcdev->config_block = kzalloc(sizeof(*vcdev->config_block), > GFP_DMA | GFP_KERNEL); > if (!vcdev->config_block) {
Cornelia Huck
2019-May-13  12:20 UTC
[PATCH 10/10] virtio/s390: make airq summary indicators DMA
On Fri, 26 Apr 2019 20:32:45 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Hypervisor needs to interact with the summary indicators, so these > need to be DMA memory as well (at least for protected virtualization > guests). > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++------- > 1 file changed, 17 insertions(+), 7 deletions(-)(...)> @@ -237,7 +243,8 @@ static void virtio_airq_handler(struct airq_struct *airq) > read_unlock(&info->lock); > } > > -static struct airq_info *new_airq_info(void) > +/* call with airq_areas_lock held */Hm, where is airq_areas_lock defined? If it was introduced in one of the previous patches, I have missed it.> +static struct airq_info *new_airq_info(int index) > { > struct airq_info *info; > int rc;
Cornelia Huck
2019-May-13  12:59 UTC
[PATCH 07/10] s390/airq: use DMA memory for adapter interrupts
On Fri, 26 Apr 2019 20:32:42 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Protected virtualization guests have to use shared pages for airq > notifier bit vectors, because hypervisor needs to write these bits. > > Let us make sure we allocate DMA memory for the notifier bit vectors.[Looking at this first, before I can think about your update in patch 5.]> > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > arch/s390/include/asm/airq.h | 2 ++ > drivers/s390/cio/airq.c | 18 ++++++++++++++---- > 2 files changed, 16 insertions(+), 4 deletions(-)(...)> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c > index a45011e4529e..7a5c0a08ee09 100644 > --- a/drivers/s390/cio/airq.c > +++ b/drivers/s390/cio/airq.c > @@ -19,6 +19,7 @@ > > #include <asm/airq.h> > #include <asm/isc.h> > +#include <asm/cio.h> > > #include "cio.h" > #include "cio_debug.h" > @@ -113,6 +114,11 @@ void __init init_airq_interrupts(void) > setup_irq(THIN_INTERRUPT, &airq_interrupt); > } > > +static inline unsigned long iv_size(unsigned long bits) > +{ > + return BITS_TO_LONGS(bits) * sizeof(unsigned long); > +} > + > /** > * airq_iv_create - create an interrupt vector > * @bits: number of bits in the interrupt vector > @@ -123,14 +129,15 @@ void __init init_airq_interrupts(void) > struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags) > { > struct airq_iv *iv; > - unsigned long size; > + unsigned long size = 0;Why do you need to init this to 0?> > iv = kzalloc(sizeof(*iv), GFP_KERNEL); > if (!iv) > goto out; > iv->bits = bits; > - size = BITS_TO_LONGS(bits) * sizeof(unsigned long); > - iv->vector = kzalloc(size, GFP_KERNEL); > + size = iv_size(bits); > + iv->vector = dma_alloc_coherent(cio_get_dma_css_dev(), size, > + &iv->vector_dma, GFP_KERNEL);Indent is a bit off. But more importantly, I'm also a bit vary about ap and pci. IIRC, css support is mandatory, so that should not be a problem; and unless I remember incorrectly, ap only uses summary indicators. How does this interact with pci devices? I suppose any of their dma properties do not come into play with the interrupt code here? (Just want to be sure.)> if (!iv->vector) > goto out_free; > if (flags & AIRQ_IV_ALLOC) { > @@ -165,7 +172,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags) > kfree(iv->ptr); > kfree(iv->bitlock); > kfree(iv->avail); > - kfree(iv->vector); > + dma_free_coherent(cio_get_dma_css_dev(), size, iv->vector, > + iv->vector_dma); > kfree(iv); > out: > return NULL; > @@ -182,6 +190,8 @@ void airq_iv_release(struct airq_iv *iv) > kfree(iv->ptr); > kfree(iv->bitlock); > kfree(iv->vector); > + dma_free_coherent(cio_get_dma_css_dev(), iv_size(iv->bits), > + iv->vector, iv->vector_dma); > kfree(iv->avail); > kfree(iv); > }
Cornelia Huck
2019-May-13  13:54 UTC
[PATCH 09/10] virtio/s390: use DMA memory for ccw I/O and classic notifiers
On Fri, 26 Apr 2019 20:32:44 +0200 Halil Pasic <pasic at linux.ibm.com> wrote:> Before virtio-ccw could get away with not using DMA API for the pieces of > memory it does ccw I/O with. With protected virtualization this has to > change, since the hypervisor needs to read and sometimes also write these > pieces of memory. > > The hypervisor is supposed to poke the classic notifiers, if these are > used, out of band with regards to ccw I/O. So these need to be allocated > as DMA memory (which is shared memory for protected virtualization > guests). > > Let us factor out everything from struct virtio_ccw_device that needs to > be DMA memory in a satellite that is allocated as such. > > Note: The control blocks of I/O instructions do not need to be shared. > These are marshalled by the ultravisor. > > Signed-off-by: Halil Pasic <pasic at linux.ibm.com> > --- > drivers/s390/virtio/virtio_ccw.c | 177 +++++++++++++++++++++------------------ > 1 file changed, 96 insertions(+), 81 deletions(-) >> @@ -176,6 +180,22 @@ static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev) > return container_of(vdev, struct virtio_ccw_device, vdev); > } > > +static inline void *__vc_dma_alloc(struct virtio_device *vdev, size_t size) > +{ > + return ccw_device_dma_zalloc(to_vc_device(vdev)->cdev, size); > +} > + > +static inline void __vc_dma_free(struct virtio_device *vdev, size_t size, > + void *cpu_addr) > +{ > + return ccw_device_dma_free(to_vc_device(vdev)->cdev, cpu_addr, size); > +}Hm, why do these use leading underscores? Also, maybe make the _free function safe for NULL to simplify the cleanup paths?> + > +#define vc_dma_alloc_struct(vdev, ptr) \ > + ({ptr = __vc_dma_alloc(vdev, sizeof(*(ptr))); }) > +#define vc_dma_free_struct(vdev, ptr) \ > + __vc_dma_free(vdev, sizeof(*(ptr)), (ptr))I find these a bit ugly... does adding a wrapper help that much?> + > static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info) > { > unsigned long i, flags;
Apparently Analagous Threads
- [RFC PATCH 01/12] virtio/s390: use vring_create_virtqueue
- [PATCH RFC 10/11] KVM: s390: virtio-ccw revision 1 SET_VQ
- [PATCH v5] virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
- [PATCH v4] virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
- [PATCH v4] virtio: add VIRTIO_F_NOTIFICATION_DATA feature support