Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 00 of 16] [V2] amd iommu: support ATS device passthru on IOMMUv2 systems
ATS devices with PRI and PASID capabilities can communicate with IOMMUv2 to perform two-level (nested) address translation and demand paging for DMA. To pass such devices through, the iommu driver has to be enabled in the guest OS. This patch set adds initial iommu emulation for hvm guests to support ATS device passthru.

Changes in v2:
* Do not use a linked list to access guest iommu tables.
* Do not parse the iommu parameter in libxl_device_model_info again.
* Fix an incorrect logical calculation in patch 11.
* Fix the hypercall definition for non-x86 systems.

Please review. Thanks,
Wei
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 01 of 16] amd iommu: Refactoring iommu ring buffer definition
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569370 -3600 # Node ID 4c986253976d7efd19640500aa3bb69a5a534637 # Parent 492914c658383e071bef2f907cb62ede69ba48a4 amd iommu: Refactoring iommu ring buffer definition. Introduce struct ring_buffer to represent iommu cmd buffer, event log and ppr log Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 492914c65838 -r 4c986253976d xen/drivers/passthrough/amd/iommu_cmd.c --- a/xen/drivers/passthrough/amd/iommu_cmd.c Wed Dec 21 10:47:30 2011 +0000 +++ b/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:10 2011 +0100 @@ -29,7 +29,7 @@ static int queue_iommu_command(struct am u32 tail, head, *cmd_buffer; int i; - tail = iommu->cmd_buffer_tail; + tail = iommu->cmd_buffer.tail; if ( ++tail == iommu->cmd_buffer.entries ) tail = 0; @@ -40,13 +40,13 @@ static int queue_iommu_command(struct am if ( head != tail ) { cmd_buffer = (u32 *)(iommu->cmd_buffer.buffer + - (iommu->cmd_buffer_tail * + (iommu->cmd_buffer.tail * IOMMU_CMD_BUFFER_ENTRY_SIZE)); for ( i = 0; i < IOMMU_CMD_BUFFER_U32_PER_ENTRY; i++ ) cmd_buffer[i] = cmd[i]; - iommu->cmd_buffer_tail = tail; + iommu->cmd_buffer.tail = tail; return 1; } @@ -57,7 +57,7 @@ static void commit_iommu_command_buffer( { u32 tail; - set_field_in_reg_u32(iommu->cmd_buffer_tail, 0, + set_field_in_reg_u32(iommu->cmd_buffer.tail, 0, IOMMU_CMD_BUFFER_TAIL_MASK, IOMMU_CMD_BUFFER_TAIL_SHIFT, &tail); writel(tail, iommu->mmio_base+IOMMU_CMD_BUFFER_TAIL_OFFSET); diff -r 492914c65838 -r 4c986253976d xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Wed Dec 21 10:47:30 2011 +0000 +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:10 2011 +0100 @@ -294,20 +294,20 @@ static int amd_iommu_read_event_log(stru IOMMU_EVENT_LOG_TAIL_MASK, IOMMU_EVENT_LOG_TAIL_SHIFT); - while ( tail != iommu->event_log_head ) + while ( tail != iommu->event_log.head ) { /* read event log entry */ event_log = (u32 *)(iommu->event_log.buffer + - (iommu->event_log_head * + (iommu->event_log.head * IOMMU_EVENT_LOG_ENTRY_SIZE)); parse_event_log_entry(iommu, event_log); - if ( ++iommu->event_log_head == iommu->event_log.entries ) - iommu->event_log_head = 0; + if ( ++iommu->event_log.head == iommu->event_log.entries ) + iommu->event_log.head = 0; /* update head pointer */ - set_field_in_reg_u32(iommu->event_log_head, 0, + set_field_in_reg_u32(iommu->event_log.head, 0, IOMMU_EVENT_LOG_HEAD_MASK, IOMMU_EVENT_LOG_HEAD_SHIFT, &head); writel(head, iommu->mmio_base + IOMMU_EVENT_LOG_HEAD_OFFSET); @@ -346,7 +346,7 @@ static void amd_iommu_reset_event_log(st writel(entry, iommu->mmio_base+IOMMU_STATUS_MMIO_OFFSET); /*reset event log base address */ - iommu->event_log_head = 0; + iommu->event_log.head = 0; set_iommu_event_log_control(iommu, IOMMU_CONTROL_ENABLED); } @@ -605,71 +605,83 @@ static void enable_iommu(struct amd_iomm } -static void __init deallocate_iommu_table_struct( - struct table_struct *table) +static void __init deallocate_buffer(void *buf, uint32_t sz) { int order = 0; - if ( table->buffer ) + if ( buf ) { - order = get_order_from_bytes(table->alloc_size); - __free_amd_iommu_tables(table->buffer, order); - table->buffer = NULL; + order = get_order_from_bytes(sz); + __free_amd_iommu_tables(buf, order); } } -static int __init allocate_iommu_table_struct(struct table_struct *table, - const char *name) +static void __init deallocate_device_table(struct table_struct *table) { - int order = 0; - if ( table->buffer == NULL ) - { - order = get_order_from_bytes(table->alloc_size); 
- table->buffer = __alloc_amd_iommu_tables(order); - - if ( table->buffer == NULL ) - { - AMD_IOMMU_DEBUG("Error allocating %s\n", name); - return -ENOMEM; - } - memset(table->buffer, 0, PAGE_SIZE * (1UL << order)); - } - return 0; + deallocate_buffer(table->buffer, table->alloc_size); + table->buffer = NULL; } -static int __init allocate_cmd_buffer(struct amd_iommu *iommu) +static void __init deallocate_ring_buffer(struct ring_buffer *ring_buf) +{ + deallocate_buffer(ring_buf->buffer, ring_buf->alloc_size); + ring_buf->buffer = NULL; + ring_buf->head = 0; + ring_buf->tail = 0; +} + +static void * __init allocate_buffer(uint32_t alloc_size, const char *name) +{ + void * buffer; + int order = get_order_from_bytes(alloc_size); + + buffer = __alloc_amd_iommu_tables(order); + + if ( buffer == NULL ) + { + AMD_IOMMU_DEBUG("Error allocating %s\n", name); + return NULL; + } + + memset(buffer, 0, PAGE_SIZE * (1UL << order)); + return buffer; +} + +static void * __init allocate_ring_buffer(struct ring_buffer *ring_buf, + uint32_t entry_size, + uint64_t entries, const char *name) +{ + ring_buf->head = 0; + ring_buf->tail = 0; + + ring_buf->entry_size = entry_size; + ring_buf->alloc_size = PAGE_SIZE << get_order_from_bytes(entries * + entry_size); + ring_buf->entries = ring_buf->alloc_size / entry_size; + ring_buf->buffer = allocate_buffer(ring_buf->alloc_size, name); + return ring_buf->buffer; +} + +static void * __init allocate_cmd_buffer(struct amd_iommu *iommu) { /* allocate ''command buffer'' in power of 2 increments of 4K */ - iommu->cmd_buffer_tail = 0; - iommu->cmd_buffer.alloc_size = PAGE_SIZE << - get_order_from_bytes( - PAGE_ALIGN(IOMMU_CMD_BUFFER_DEFAULT_ENTRIES - * IOMMU_CMD_BUFFER_ENTRY_SIZE)); - iommu->cmd_buffer.entries = iommu->cmd_buffer.alloc_size / - IOMMU_CMD_BUFFER_ENTRY_SIZE; - - return (allocate_iommu_table_struct(&iommu->cmd_buffer, "Command Buffer")); + return allocate_ring_buffer(&iommu->cmd_buffer, sizeof(cmd_entry_t), + IOMMU_CMD_BUFFER_DEFAULT_ENTRIES, + "Command Buffer"); } -static int __init allocate_event_log(struct amd_iommu *iommu) +static void * __init allocate_event_log(struct amd_iommu *iommu) { - /* allocate ''event log'' in power of 2 increments of 4K */ - iommu->event_log_head = 0; - iommu->event_log.alloc_size = PAGE_SIZE << - get_order_from_bytes( - PAGE_ALIGN(IOMMU_EVENT_LOG_DEFAULT_ENTRIES * - IOMMU_EVENT_LOG_ENTRY_SIZE)); - iommu->event_log.entries = iommu->event_log.alloc_size / - IOMMU_EVENT_LOG_ENTRY_SIZE; - - return (allocate_iommu_table_struct(&iommu->event_log, "Event Log")); + /* allocate ''event log'' in power of 2 increments of 4K */ + return allocate_ring_buffer(&iommu->event_log, sizeof(event_entry_t), + IOMMU_EVENT_LOG_DEFAULT_ENTRIES, "Event Log"); } static int __init amd_iommu_init_one(struct amd_iommu *iommu) { - if ( allocate_cmd_buffer(iommu) != 0 ) + if ( allocate_cmd_buffer(iommu) == NULL ) goto error_out; - if ( allocate_event_log(iommu) != 0 ) + if ( allocate_event_log(iommu) == NULL ) goto error_out; if ( map_iommu_mmio_region(iommu) != 0 ) @@ -708,8 +720,8 @@ static void __init amd_iommu_init_cleanu list_del(&iommu->list); if ( iommu->enabled ) { - deallocate_iommu_table_struct(&iommu->cmd_buffer); - deallocate_iommu_table_struct(&iommu->event_log); + deallocate_ring_buffer(&iommu->cmd_buffer); + deallocate_ring_buffer(&iommu->event_log); unmap_iommu_mmio_region(iommu); } xfree(iommu); @@ -719,7 +731,7 @@ static void __init amd_iommu_init_cleanu iterate_ivrs_entries(amd_iommu_free_intremap_table); /* free device table */ - 
deallocate_iommu_table_struct(&device_table); + deallocate_device_table(&device_table); /* free ivrs_mappings[] */ radix_tree_destroy(&ivrs_maps, xfree); @@ -830,8 +842,10 @@ static int __init amd_iommu_setup_device device_table.entries = device_table.alloc_size / IOMMU_DEV_TABLE_ENTRY_SIZE; - if ( allocate_iommu_table_struct(&device_table, "Device Table") != 0 ) - return -ENOMEM; + device_table.buffer = allocate_buffer(device_table.alloc_size, + "Device Table"); + if ( device_table.buffer == NULL ) + return -ENOMEM; /* Add device table entries */ for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ ) diff -r 492914c65838 -r 4c986253976d xen/include/asm-x86/amd-iommu.h --- a/xen/include/asm-x86/amd-iommu.h Wed Dec 21 10:47:30 2011 +0000 +++ b/xen/include/asm-x86/amd-iommu.h Thu Dec 22 16:56:10 2011 +0100 @@ -30,12 +30,43 @@ extern struct list_head amd_iommu_head; +#pragma pack(1) +typedef struct event_entry +{ + uint32_t data[4]; +} event_entry_t; + +typedef struct ppr_entry +{ + uint32_t data[4]; +} ppr_entry_t; + +typedef struct cmd_entry +{ + uint32_t data[4]; +} cmd_entry_t; + +typedef struct dev_entry +{ + uint32_t data[8]; +} dev_entry_t; +#pragma pack() + struct table_struct { void *buffer; unsigned long entries; unsigned long alloc_size; }; +struct ring_buffer { + void *buffer; + unsigned long entries; + unsigned long alloc_size; + unsigned long entry_size; + uint32_t tail; + uint32_t head; +}; + typedef struct iommu_cap { uint32_t header; /* offset 00h */ uint32_t base_low; /* offset 04h */ @@ -60,10 +91,8 @@ struct amd_iommu { unsigned long mmio_base_phys; struct table_struct dev_table; - struct table_struct cmd_buffer; - u32 cmd_buffer_tail; - struct table_struct event_log; - u32 event_log_head; + struct ring_buffer cmd_buffer; + struct ring_buffer event_log; int exclusion_enable; int exclusion_allow_all;
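The point of the refactoring is that the command buffer, the event log and (in a later patch) the PPR log all share one producer/consumer convention: entries are written at the tail and consumed from the head, both indices wrap at the ring size, and the ring counts as full when advancing the tail would collide with the head. Below is a minimal, self-contained sketch of that wraparound arithmetic, mirroring queue_iommu_command(); demo_ring, demo_queue and ENTRIES are illustrative names, not Xen code.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define ENTRIES 8                /* ring size; Xen derives this from alloc_size */

    struct demo_ring {
        uint32_t cmd[ENTRIES][4];    /* IOMMU commands are four u32 words */
        uint32_t head, tail;         /* consumer / producer indices */
    };

    /* Mirrors queue_iommu_command(): compute the wrapped successor of the
     * tail first; if it lands on the head, the ring is full and nothing is
     * queued. Otherwise store at the old tail and publish the new one. */
    static int demo_queue(struct demo_ring *r, const uint32_t entry[4])
    {
        uint32_t tail = r->tail + 1;

        if ( tail == ENTRIES )
            tail = 0;                /* wrap around */
        if ( tail == r->head )
            return 0;                /* full */
        memcpy(r->cmd[r->tail], entry, sizeof(r->cmd[0]));
        r->tail = tail;
        return 1;
    }

    int main(void)
    {
        struct demo_ring r = { .head = 0, .tail = 0 };
        const uint32_t cmd[4] = { 1, 2, 3, 4 };
        int queued = 0;

        while ( demo_queue(&r, cmd) )
            queued++;
        printf("queued %d of %d slots\n", queued, ENTRIES);   /* 7 of 8 */
        return 0;
    }

One slot is deliberately left unused so that head == tail can always be read unambiguously as "empty".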
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 02 of 16] amd iommu: Introduce new helper functions to simplify iommu bitwise operations
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569374 -3600 # Node ID e15194f68f99a64b65046ba3d29a3f06ccdca950 # Parent 4c986253976d7efd19640500aa3bb69a5a534637 amd iommu: Introduce new helper functions to simplify iommu bitwise operations Signed-off-by: Wei Wang <wei.wang2@amd.com>
diff -r 4c986253976d -r e15194f68f99 xen/drivers/passthrough/amd/iommu_cmd.c --- a/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:10 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:14 2011 +0100 @@ -33,10 +33,8 @@ static int queue_iommu_command(struct am if ( ++tail == iommu->cmd_buffer.entries ) tail = 0; - head = get_field_from_reg_u32(readl(iommu->mmio_base + - IOMMU_CMD_BUFFER_HEAD_OFFSET), - IOMMU_CMD_BUFFER_HEAD_MASK, - IOMMU_CMD_BUFFER_HEAD_SHIFT); + head = iommu_get_rb_pointer(readl(iommu->mmio_base + + IOMMU_CMD_BUFFER_HEAD_OFFSET)); if ( head != tail ) { cmd_buffer = (u32 *)(iommu->cmd_buffer.buffer + @@ -55,11 +53,9 @@ static int queue_iommu_command(struct am static void commit_iommu_command_buffer(struct amd_iommu *iommu) { - u32 tail; + u32 tail = 0; - set_field_in_reg_u32(iommu->cmd_buffer.tail, 0, - IOMMU_CMD_BUFFER_TAIL_MASK, - IOMMU_CMD_BUFFER_TAIL_SHIFT, &tail); + iommu_set_rb_pointer(&tail, iommu->cmd_buffer.tail); writel(tail, iommu->mmio_base+IOMMU_CMD_BUFFER_TAIL_OFFSET); }
diff -r 4c986253976d -r e15194f68f99 xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:10 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:14 2011 +0100 @@ -106,21 +106,21 @@ static void register_iommu_dev_table_in_ u64 addr_64, addr_lo, addr_hi; u32 entry; + ASSERT( iommu->dev_table.buffer ); + addr_64 = (u64)virt_to_maddr(iommu->dev_table.buffer); addr_lo = addr_64 & DMA_32BIT_MASK; addr_hi = addr_64 >> 32; - set_field_in_reg_u32((u32)addr_lo >> PAGE_SHIFT, 0, - IOMMU_DEV_TABLE_BASE_LOW_MASK, - IOMMU_DEV_TABLE_BASE_LOW_SHIFT, &entry); + entry = 0; + iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT); set_field_in_reg_u32((iommu->dev_table.alloc_size / PAGE_SIZE) - 1, entry, IOMMU_DEV_TABLE_SIZE_MASK, IOMMU_DEV_TABLE_SIZE_SHIFT, &entry); writel(entry, iommu->mmio_base + IOMMU_DEV_TABLE_BASE_LOW_OFFSET); - set_field_in_reg_u32((u32)addr_hi, 0, - IOMMU_DEV_TABLE_BASE_HIGH_MASK, - IOMMU_DEV_TABLE_BASE_HIGH_SHIFT, &entry); + entry = 0; + iommu_set_addr_hi_to_reg(&entry, addr_hi); writel(entry, iommu->mmio_base + IOMMU_DEV_TABLE_BASE_HIGH_OFFSET); } @@ -130,21 +130,21 @@ static void register_iommu_cmd_buffer_in u32 power_of2_entries; u32 entry; + ASSERT( iommu->cmd_buffer.buffer ); + addr_64 = (u64)virt_to_maddr(iommu->cmd_buffer.buffer); addr_lo = addr_64 & DMA_32BIT_MASK; addr_hi = addr_64 >> 32; - set_field_in_reg_u32((u32)addr_lo >> PAGE_SHIFT, 0, - IOMMU_CMD_BUFFER_BASE_LOW_MASK, - IOMMU_CMD_BUFFER_BASE_LOW_SHIFT, &entry); + entry = 0; + iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT); writel(entry, iommu->mmio_base + IOMMU_CMD_BUFFER_BASE_LOW_OFFSET); power_of2_entries = get_order_from_bytes(iommu->cmd_buffer.alloc_size) + IOMMU_CMD_BUFFER_POWER_OF2_ENTRIES_PER_PAGE; - set_field_in_reg_u32((u32)addr_hi, 0, - IOMMU_CMD_BUFFER_BASE_HIGH_MASK, - IOMMU_CMD_BUFFER_BASE_HIGH_SHIFT, &entry); + entry = 0; + iommu_set_addr_hi_to_reg(&entry, addr_hi); set_field_in_reg_u32(power_of2_entries, entry, IOMMU_CMD_BUFFER_LENGTH_MASK, IOMMU_CMD_BUFFER_LENGTH_SHIFT, &entry); @@ -157,21 +157,21 @@ static void register_iommu_event_log_in_ u32 power_of2_entries; u32 entry; + ASSERT( iommu->event_log.buffer ); + addr_64 = (u64)virt_to_maddr(iommu->event_log.buffer); addr_lo = addr_64 & DMA_32BIT_MASK; addr_hi = addr_64 >> 32; - set_field_in_reg_u32((u32)addr_lo >> PAGE_SHIFT, 0, - IOMMU_EVENT_LOG_BASE_LOW_MASK, - IOMMU_EVENT_LOG_BASE_LOW_SHIFT, &entry); + entry = 0; + iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT); writel(entry, iommu->mmio_base + IOMMU_EVENT_LOG_BASE_LOW_OFFSET); power_of2_entries = get_order_from_bytes(iommu->event_log.alloc_size) + IOMMU_EVENT_LOG_POWER_OF2_ENTRIES_PER_PAGE; - set_field_in_reg_u32((u32)addr_hi, 0, - IOMMU_EVENT_LOG_BASE_HIGH_MASK, - IOMMU_EVENT_LOG_BASE_HIGH_SHIFT, &entry); + entry = 0; + iommu_set_addr_hi_to_reg(&entry, addr_hi); set_field_in_reg_u32(power_of2_entries, entry, IOMMU_EVENT_LOG_LENGTH_MASK, IOMMU_EVENT_LOG_LENGTH_SHIFT, &entry); @@ -234,14 +234,12 @@ static void register_iommu_exclusion_ran addr_lo = iommu->exclusion_base & DMA_32BIT_MASK; addr_hi = iommu->exclusion_base >> 32; - set_field_in_reg_u32((u32)addr_hi, 0, - IOMMU_EXCLUSION_BASE_HIGH_MASK, - IOMMU_EXCLUSION_BASE_HIGH_SHIFT, &entry); + entry = 0; + iommu_set_addr_hi_to_reg(&entry, addr_hi); writel(entry, iommu->mmio_base+IOMMU_EXCLUSION_BASE_HIGH_OFFSET); - set_field_in_reg_u32((u32)addr_lo >> PAGE_SHIFT, 0, - IOMMU_EXCLUSION_BASE_LOW_MASK, - IOMMU_EXCLUSION_BASE_LOW_SHIFT, &entry); + entry = 0; + iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT); set_field_in_reg_u32(iommu->exclusion_allow_all, entry, IOMMU_EXCLUSION_ALLOW_ALL_MASK, @@ -490,9 +488,7 @@ static void parse_event_log_entry(struct if ( code == IOMMU_EVENT_IO_PAGE_FAULT ) { - device_id = get_field_from_reg_u32(entry[0], - IOMMU_EVENT_DEVICE_ID_MASK, - IOMMU_EVENT_DEVICE_ID_SHIFT); + device_id = iommu_get_devid_from_event(entry[0]); domain_id = get_field_from_reg_u32(entry[1], IOMMU_EVENT_DOMAIN_ID_MASK, IOMMU_EVENT_DOMAIN_ID_SHIFT);
diff -r 4c986253976d -r e15194f68f99 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Thu Dec 22 16:56:10 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Thu Dec 22 16:56:14 2011 +0100 @@ -191,5 +191,85 @@ static inline int iommu_has_feature(stru return 0; return !!(iommu->features & (1U << bit)); } +/* access tail or head pointer of ring buffer */ +#define IOMMU_RING_BUFFER_PTR_MASK 0x0007FFF0 +#define IOMMU_RING_BUFFER_PTR_SHIFT 4 +static inline uint32_t iommu_get_rb_pointer(uint32_t reg) +{ + return get_field_from_reg_u32(reg, IOMMU_RING_BUFFER_PTR_MASK, + IOMMU_RING_BUFFER_PTR_SHIFT); +} + +static inline void iommu_set_rb_pointer(uint32_t *reg, uint32_t val) +{ + set_field_in_reg_u32(val, *reg, IOMMU_RING_BUFFER_PTR_MASK, + IOMMU_RING_BUFFER_PTR_SHIFT, reg); +} + +/* access device field from iommu cmd */ +#define IOMMU_CMD_DEVICE_ID_MASK 0x0000FFFF +#define IOMMU_CMD_DEVICE_ID_SHIFT 0 + +static inline uint16_t iommu_get_devid_from_cmd(uint32_t cmd) +{ + return get_field_from_reg_u32(cmd, IOMMU_CMD_DEVICE_ID_MASK, + IOMMU_CMD_DEVICE_ID_SHIFT); +} + +static inline void iommu_set_devid_to_cmd(uint32_t *cmd, uint16_t id) +{ + set_field_in_reg_u32(id, *cmd, IOMMU_CMD_DEVICE_ID_MASK, + IOMMU_CMD_DEVICE_ID_SHIFT, cmd); +} + +/* access address field from iommu cmd */ +#define IOMMU_CMD_ADDR_LOW_MASK 0xFFFFF000 +#define IOMMU_CMD_ADDR_LOW_SHIFT 12 +#define IOMMU_CMD_ADDR_HIGH_MASK 0xFFFFFFFF +#define IOMMU_CMD_ADDR_HIGH_SHIFT 0 + +static inline uint32_t iommu_get_addr_lo_from_cmd(uint32_t cmd) +{ + return get_field_from_reg_u32(cmd, IOMMU_CMD_ADDR_LOW_MASK, + IOMMU_CMD_ADDR_LOW_SHIFT); +} +
+static inline uint32_t iommu_get_addr_hi_from_cmd(uint32_t cmd) +{ + return get_field_from_reg_u32(cmd, IOMMU_CMD_ADDR_HIGH_MASK, + IOMMU_CMD_ADDR_HIGH_SHIFT); +} + +#define iommu_get_devid_from_event iommu_get_devid_from_cmd + +/* access iommu base addresses from mmio regs */ +#define IOMMU_REG_BASE_ADDR_BASE_LOW_MASK 0xFFFFF000 +#define IOMMU_REG_BASE_ADDR_LOW_SHIFT 12 +#define IOMMU_REG_BASE_ADDR_HIGH_MASK 0x000FFFFF +#define IOMMU_REG_BASE_ADDR_HIGH_SHIFT 0 + +static inline void iommu_set_addr_lo_to_reg(uint32_t *reg, uint32_t addr) +{ + set_field_in_reg_u32(addr, *reg, IOMMU_REG_BASE_ADDR_BASE_LOW_MASK, + IOMMU_REG_BASE_ADDR_LOW_SHIFT, reg); +} + +static inline void iommu_set_addr_hi_to_reg(uint32_t *reg, uint32_t addr) +{ + set_field_in_reg_u32(addr, *reg, IOMMU_REG_BASE_ADDR_HIGH_MASK, + IOMMU_REG_BASE_ADDR_HIGH_SHIFT, reg); +} + +static inline uint32_t iommu_get_addr_lo_from_reg(uint32_t reg) +{ + return get_field_from_reg_u32(reg, IOMMU_REG_BASE_ADDR_BASE_LOW_MASK, + IOMMU_REG_BASE_ADDR_LOW_SHIFT); +} + +static inline uint32_t iommu_get_addr_hi_from_reg(uint32_t reg) +{ + return get_field_from_reg_u32(reg, IOMMU_REG_BASE_ADDR_HIGH_MASK, + IOMMU_REG_BASE_ADDR_HIGH_SHIFT); +} #endif /* _ASM_X86_64_AMD_IOMMU_PROTO_H */
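All of these helpers are thin wrappers around the existing get_field_from_reg_u32()/set_field_in_reg_u32() mask-and-shift primitives. The sketch below models those primitives under the semantics the patch relies on (extract: mask, then shift down; insert: clear the field, then or in the shifted value) and round-trips the 15-bit ring buffer pointer that lives in bits 18:4 of the head/tail registers. get_field and set_field are illustrative stand-ins, not the Xen functions themselves.

    #include <assert.h>
    #include <stdint.h>

    /* Assumed semantics of the Xen field helpers (signatures simplified):
     * 'mask' selects the field's bits, 'shift' is the mask's lowest set bit. */
    static uint32_t get_field(uint32_t reg, uint32_t mask, uint32_t shift)
    {
        return (reg & mask) >> shift;
    }

    static uint32_t set_field(uint32_t reg, uint32_t mask, uint32_t shift,
                              uint32_t val)
    {
        return (reg & ~mask) | ((val << shift) & mask);
    }

    /* The ring buffer pointer field from the patch: bits 18:4. */
    #define RB_PTR_MASK  0x0007FFF0
    #define RB_PTR_SHIFT 4

    int main(void)
    {
        uint32_t reg = 0;

        /* Round trip: what iommu_set_rb_pointer() writes,
         * iommu_get_rb_pointer() reads back unchanged. */
        reg = set_field(reg, RB_PTR_MASK, RB_PTR_SHIFT, 0x1234);
        assert(get_field(reg, RB_PTR_MASK, RB_PTR_SHIFT) == 0x1234);

        /* Values wider than the 15-bit field are truncated by the mask. */
        reg = set_field(reg, RB_PTR_MASK, RB_PTR_SHIFT, 0xFFFFF);
        assert(get_field(reg, RB_PTR_MASK, RB_PTR_SHIFT) == 0x7FFF);
        return 0;
    }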
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 03 of 16] amd iommu: Add iommu emulation for hvm guests
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569377 -3600 # Node ID 07f338ae663242ba9080f1ab84298894783da3e2 # Parent e15194f68f99a64b65046ba3d29a3f06ccdca950 amd iommu: Add iommu emulation for hvm guests An ATS device driver that supports the PASID [1] and PRI [2] capabilities needs to work with the iommu driver in the OS. If we want to pass such ATS devices through to hvm guests running an unmodified OS, we have to expose iommu functionality to the HVM guest. Signed-off-by: Wei Wang <wei.wang2@amd.com> [1] http://www.pcisig.com/specifications/pciexpress/specifications/ECN-PASID-ATS-2011-03-31.pdf [2] http://www.pcisig.com/members/downloads/specifications/iov/ats_r1.1_26Jan09.pdf
diff -r e15194f68f99 -r 07f338ae6632 xen/drivers/passthrough/amd/Makefile --- a/xen/drivers/passthrough/amd/Makefile Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/drivers/passthrough/amd/Makefile Thu Dec 22 16:56:17 2011 +0100 @@ -5,3 +5,4 @@ obj-y += pci_amd_iommu.o obj-bin-y += iommu_acpi.init.o obj-y += iommu_intr.o obj-y += iommu_cmd.o +obj-y += iommu_guest.o
diff -r e15194f68f99 -r 07f338ae6632 xen/drivers/passthrough/amd/iommu_cmd.c --- a/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:17 2011 +0100 @@ -398,3 +398,15 @@ void amd_iommu_flush_all_caches(struct a invalidate_iommu_all(iommu); flush_command_buffer(iommu); } + +void amd_iommu_send_guest_cmd(struct amd_iommu *iommu, u32 cmd[]) +{ + unsigned long flags; + + spin_lock_irqsave(&iommu->lock, flags); + + send_iommu_command(iommu, cmd); + flush_command_buffer(iommu); + + spin_unlock_irqrestore(&iommu->lock, flags); +}
diff -r e15194f68f99 -r 07f338ae6632 xen/drivers/passthrough/amd/iommu_guest.c --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:17 2011 +0100 @@ -0,0 +1,916 @@ +/* + * Copyright (C) 2011 Advanced Micro Devices, Inc. + * Author: Wei Wang <wei.wang2@amd.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details.
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include <xen/sched.h> +#include <asm/p2m.h> +#include <asm/hvm/iommu.h> +#include <asm/amd-iommu.h> +#include <asm/hvm/svm/amd-iommu-proto.h> + + +#define IOMMU_MMIO_SIZE 0x8000 +#define IOMMU_MMIO_PAGE_NR 0x8 +#define RING_BF_LENGTH_MASK 0x0F000000 +#define RING_BF_LENGTH_SHIFT 24 + +#define PASMAX_9_bit 0x8 +#define GUEST_CR3_1_LEVEL 0x0 +#define GUEST_ADDRESS_SIZE_6_LEVEL 0x2 +#define HOST_ADDRESS_SIZE_6_LEVEL 0x2 + +#define guest_iommu_set_status(iommu, bit) \ + iommu_set_bit(&((iommu)->reg_status.lo), bit) + +#define guest_iommu_clear_status(iommu, bit) \ + iommu_clear_bit(&((iommu)->reg_status.lo), bit) + +#define reg_to_u64(reg) (((uint64_t)reg.hi << 32) | reg.lo ) +#define u64_to_reg(reg, val) \ + do \ + { \ + (reg)->lo = val & 0xFFFFFFFF; \ + (reg)->hi = (val >> 32) & 0xFFFFFFFF; \ + } while(0) + +static unsigned int machine_bdf(struct domain *d, uint16_t guest_bdf) +{ + return guest_bdf; +} + +static uint16_t guest_bdf(struct domain *d, uint16_t machine_bdf) +{ + return machine_bdf; +} + +static inline struct guest_iommu *domain_iommu(struct domain *d) +{ + return domain_hvm_iommu(d)->g_iommu; +} + +static inline struct guest_iommu *vcpu_iommu(struct vcpu *v) +{ + return domain_hvm_iommu(v->domain)->g_iommu; +} + +static void guest_iommu_enable(struct guest_iommu *iommu) +{ + iommu->enabled = 1; +} + +static void guest_iommu_disable(struct guest_iommu *iommu) +{ + iommu->enabled = 0; +} + +static uint64_t get_guest_cr3_from_dte(dev_entry_t *dte) +{ + uint64_t gcr3_1, gcr3_2, gcr3_3; + + gcr3_1 = get_field_from_reg_u32(dte->data[1], + IOMMU_DEV_TABLE_GCR3_1_MASK, + IOMMU_DEV_TABLE_GCR3_1_SHIFT); + gcr3_2 = get_field_from_reg_u32(dte->data[2], + IOMMU_DEV_TABLE_GCR3_2_MASK, + IOMMU_DEV_TABLE_GCR3_2_SHIFT); + gcr3_3 = get_field_from_reg_u32(dte->data[3], + IOMMU_DEV_TABLE_GCR3_3_MASK, + IOMMU_DEV_TABLE_GCR3_3_SHIFT); + + return ((gcr3_3 << 31) | (gcr3_2 << 15 ) | (gcr3_1 << 12)) >> PAGE_SHIFT; +} + +static uint16_t get_domid_from_dte(dev_entry_t *dte) +{ + return get_field_from_reg_u32(dte->data[2], IOMMU_DEV_TABLE_DOMAIN_ID_MASK, + IOMMU_DEV_TABLE_DOMAIN_ID_SHIFT); +} + +static uint16_t get_glx_from_dte(dev_entry_t *dte) +{ + return get_field_from_reg_u32(dte->data[1], IOMMU_DEV_TABLE_GLX_MASK, + IOMMU_DEV_TABLE_GLX_SHIFT); +} + +static uint16_t get_gv_from_dte(dev_entry_t *dte) +{ + return get_field_from_reg_u32(dte->data[1],IOMMU_DEV_TABLE_GV_MASK, + IOMMU_DEV_TABLE_GV_SHIFT); +} + +static unsigned int host_domid(struct domain *d, uint64_t g_domid) +{ + /* Only support one PPR device in guest for now */ + return d->domain_id; +} + +static unsigned long get_gfn_from_base_reg(uint64_t base_raw) +{ + uint64_t addr_lo, addr_hi, addr64; + + addr_lo = iommu_get_addr_lo_from_reg(base_raw & DMA_32BIT_MASK); + addr_hi = iommu_get_addr_hi_from_reg(base_raw >> 32); + addr64 = (addr_hi << 32) | (addr_lo << PAGE_SHIFT); + + ASSERT ( addr64 != 0 ); + + return addr64 >> PAGE_SHIFT; +} + +static void guest_iommu_deliver_msi(struct domain *d) +{ + uint8_t vector, dest, dest_mode, delivery_mode, trig_mode; + struct guest_iommu *iommu = domain_iommu(d); + + vector = iommu->msi.vector; + dest = iommu->msi.dest; + dest_mode = iommu->msi.dest_mode; + delivery_mode = iommu->msi.delivery_mode; + trig_mode = iommu->msi.trig_mode; + + vmsi_deliver(d, vector, dest, dest_mode, 
delivery_mode, trig_mode); +} + +static unsigned long guest_iommu_get_table_mfn(struct domain *d, + uint64_t base_raw, + unsigned int entry_size, + unsigned int pos) +{ + unsigned long idx, gfn, mfn; + p2m_type_t p2mt; + + gfn = get_gfn_from_base_reg(base_raw); + idx = (pos * entry_size) >> PAGE_SHIFT; + + mfn = mfn_x(get_gfn(d, gfn + idx, &p2mt)); + put_gfn(d, gfn); + + return mfn; +} + +static void guest_iommu_enable_dev_table(struct guest_iommu *iommu) +{ + uint32_t length_raw = get_field_from_reg_u32(iommu->dev_table.reg_base.lo, + IOMMU_DEV_TABLE_SIZE_MASK, + IOMMU_DEV_TABLE_SIZE_SHIFT); + iommu->dev_table.size = (length_raw + 1) * PAGE_SIZE; +} + +static void guest_iommu_enable_ring_buffer(struct guest_iommu *iommu, + struct guest_buffer *buffer, + uint32_t entry_size) +{ + uint32_t length_raw = get_field_from_reg_u32(buffer->reg_base.hi, + RING_BF_LENGTH_MASK, + RING_BF_LENGTH_SHIFT); + buffer->entries = 1 << length_raw; +} + +void guest_iommu_add_ppr_log(struct domain *d, u32 entry[]) +{ + uint16_t gdev_id; + unsigned long mfn, tail, head; + ppr_entry_t *log, *log_base; + struct guest_iommu *iommu; + + iommu = domain_iommu(d); + tail = iommu_get_rb_pointer(iommu->ppr_log.reg_tail.lo); + head = iommu_get_rb_pointer(iommu->ppr_log.reg_head.lo); + + if ( tail >= iommu->ppr_log.entries || head >= iommu->ppr_log.entries ) + { + AMD_IOMMU_DEBUG("Error: guest iommu ppr log overflows\n"); + guest_iommu_disable(iommu); + return; + } + + mfn = guest_iommu_get_table_mfn(d, reg_to_u64(iommu->ppr_log.reg_base), + sizeof(ppr_entry_t), tail); + ASSERT(mfn_valid(mfn)); + + log_base = map_domain_page(mfn); + log = log_base + tail % (PAGE_SIZE / sizeof(ppr_entry_t)); + + /* Convert physical device id back into virtual device id */ + gdev_id = guest_bdf(d, iommu_get_devid_from_cmd(entry[0])); + iommu_set_devid_to_cmd(&entry[0], gdev_id); + + memcpy(log, entry, sizeof(ppr_entry_t)); + + /* Now shift ppr log tail pointer */ + if ( (++tail) >= iommu->ppr_log.entries ) + { + tail = 0; + guest_iommu_set_status(iommu, IOMMU_STATUS_PPR_LOG_OVERFLOW_SHIFT); + } + iommu_set_rb_pointer(&iommu->ppr_log.reg_tail.lo, tail); + unmap_domain_page(log_base); + + guest_iommu_deliver_msi(d); +} + +void guest_iommu_add_event_log(struct domain *d, u32 entry[]) +{ + uint16_t dev_id; + unsigned long mfn, tail, head; + event_entry_t *log, *log_base; + struct guest_iommu *iommu; + + iommu = domain_iommu(d); + tail = iommu_get_rb_pointer(iommu->event_log.reg_tail.lo); + head = iommu_get_rb_pointer(iommu->event_log.reg_head.lo); + + if ( tail >= iommu->event_log.entries || head >= iommu->event_log.entries ) + { + AMD_IOMMU_DEBUG("Error: guest iommu event overflows\n"); + guest_iommu_disable(iommu); + return; + } + + mfn = guest_iommu_get_table_mfn(d, reg_to_u64(iommu->event_log.reg_base), + sizeof(event_entry_t), tail); + ASSERT(mfn_valid(mfn)); + + log_base = map_domain_page(mfn); + log = log_base + tail % (PAGE_SIZE / sizeof(event_entry_t)); + + /* re-write physical device id into virtual device id */ + dev_id = guest_bdf(d, iommu_get_devid_from_cmd(entry[0])); + iommu_set_devid_to_cmd(&entry[0], dev_id); + memcpy(log, entry, sizeof(event_entry_t)); + + /* Now shift event log tail pointer */ + if ( (++tail) >= iommu->event_log.entries ) + { + tail = 0; + guest_iommu_set_status(iommu, IOMMU_STATUS_EVENT_OVERFLOW_SHIFT); + } + + iommu_set_rb_pointer(&iommu->event_log.reg_tail.lo, tail); + unmap_domain_page(log_base); + + guest_iommu_deliver_msi(d); +} + +static int do_complete_ppr_request(struct domain *d, cmd_entry_t *cmd) +{ 
+ uint16_t dev_id; + struct amd_iommu *iommu; + + dev_id = machine_bdf(d, iommu_get_devid_from_cmd(cmd->data[0])); + iommu = find_iommu_for_device(0, dev_id); + + if ( !iommu ) + { + AMD_IOMMU_DEBUG("%s Fail to find iommu for bdf %x\n", + __func__, dev_id); + return -ENODEV; + } + + /* replace virtual device id into physical */ + iommu_set_devid_to_cmd(&cmd->data[0], dev_id); + amd_iommu_send_guest_cmd(iommu, cmd->data); + + return 0; +} + +static int do_invalidate_pages(struct domain *d, cmd_entry_t *cmd) +{ + uint16_t gdom_id, hdom_id; + struct amd_iommu *iommu = NULL; + + gdom_id = get_field_from_reg_u32(cmd->data[1], + IOMMU_INV_IOMMU_PAGES_DOMAIN_ID_MASK, + IOMMU_INV_IOMMU_PAGES_DOMAIN_ID_SHIFT); + + hdom_id = host_domid(d, gdom_id); + set_field_in_reg_u32(hdom_id, cmd->data[1], + IOMMU_INV_IOMMU_PAGES_DOMAIN_ID_MASK, + IOMMU_INV_IOMMU_PAGES_DOMAIN_ID_SHIFT, &cmd->data[1]); + + for_each_amd_iommu ( iommu ) + amd_iommu_send_guest_cmd(iommu, cmd->data); + + return 0; +} + +static int do_invalidate_all(struct domain *d, cmd_entry_t *cmd) +{ + struct amd_iommu *iommu = NULL; + + for_each_amd_iommu ( iommu ) + amd_iommu_flush_all_pages(d); + + return 0; +} + +static int do_invalidate_iotlb_pages(struct domain *d, cmd_entry_t *cmd) +{ + struct amd_iommu *iommu; + uint16_t dev_id; + + dev_id = machine_bdf(d, iommu_get_devid_from_cmd(cmd->data[0])); + + iommu = find_iommu_for_device(0, dev_id); + if ( !iommu ) + { + AMD_IOMMU_DEBUG("%s Fail to find iommu for bdf %x\n", + __func__, dev_id); + return -ENODEV; + } + + iommu_set_devid_to_cmd(&cmd->data[0], dev_id); + amd_iommu_send_guest_cmd(iommu, cmd->data); + + return 0; +} + +static int do_completion_wait(struct domain *d, cmd_entry_t *cmd) +{ + bool_t com_wait_int_en, com_wait_int, i, s; + struct guest_iommu *iommu; + unsigned long gfn; + p2m_type_t p2mt; + + iommu = domain_iommu(d); + + i = iommu_get_bit(cmd->data[0], IOMMU_COMP_WAIT_I_FLAG_SHIFT); + s = iommu_get_bit(cmd->data[0], IOMMU_COMP_WAIT_S_FLAG_SHIFT); + + if ( i ) + guest_iommu_set_status(iommu, IOMMU_STATUS_COMP_WAIT_INT_SHIFT); + + if ( s ) + { + uint64_t gaddr_lo, gaddr_hi, gaddr_64, data; + void *vaddr; + + data = (uint64_t) cmd->data[3] << 32 | cmd->data[2]; + gaddr_lo = get_field_from_reg_u32(cmd->data[0], + IOMMU_COMP_WAIT_ADDR_LOW_MASK, + IOMMU_COMP_WAIT_ADDR_LOW_SHIFT); + gaddr_hi = get_field_from_reg_u32(cmd->data[1], + IOMMU_COMP_WAIT_ADDR_HIGH_MASK, + IOMMU_COMP_WAIT_ADDR_HIGH_SHIFT); + + gaddr_64 = (gaddr_hi << 32) | (gaddr_lo << 3); + + gfn = gaddr_64 >> PAGE_SHIFT; + vaddr = map_domain_page(mfn_x(get_gfn(d, gfn ,&p2mt))); + put_gfn(d, gfn); + + write_u64_atomic((uint64_t*)(vaddr + (gaddr_64 & (PAGE_SIZE-1))), data); + unmap_domain_page(vaddr); + } + + com_wait_int_en = iommu_get_bit(iommu->reg_ctrl.lo, + IOMMU_CONTROL_COMP_WAIT_INT_SHIFT); + com_wait_int = iommu_get_bit(iommu->reg_status.lo, + IOMMU_STATUS_COMP_WAIT_INT_SHIFT); + + if ( com_wait_int_en && com_wait_int ) + guest_iommu_deliver_msi(d); + + return 0; +} + +static int do_invalidate_dte(struct domain *d, cmd_entry_t *cmd) +{ + uint16_t gbdf, mbdf, req_id, gdom_id, hdom_id; + dev_entry_t *gdte, *mdte, *dte_base; + struct amd_iommu *iommu = NULL; + struct guest_iommu *g_iommu; + uint64_t gcr3_gfn, gcr3_mfn; + uint8_t glx, gv; + unsigned long dte_mfn, flags; + p2m_type_t p2mt; + + g_iommu = domain_iommu(d); + gbdf = iommu_get_devid_from_cmd(cmd->data[0]); + mbdf = machine_bdf(d, gbdf); + + /* Guest can only update DTEs for its passthru devices */ + if ( mbdf == 0 || gbdf == 0 ) + return 0; + + /* Sometimes 
guest invalidates devices from non-exists dtes */ + if ( (gbdf * sizeof(dev_entry_t)) > g_iommu->dev_table.size ) + return 0; + + dte_mfn = guest_iommu_get_table_mfn(d, + reg_to_u64(g_iommu->dev_table.reg_base), + sizeof(dev_entry_t), gbdf); + ASSERT(mfn_valid(dte_mfn)); + + dte_base = map_domain_page(dte_mfn); + + gdte = dte_base + gbdf % (PAGE_SIZE / sizeof(dev_entry_t)); + + gdom_id = get_domid_from_dte(gdte); + gcr3_gfn = get_guest_cr3_from_dte(gdte); + + /* Do not update host dte before gcr3 has been set */ + if ( gcr3_gfn == 0 ) + return 0; + + gcr3_mfn = mfn_x(get_gfn(d, gcr3_gfn, &p2mt)); + put_gfn(d, gcr3_gfn); + + ASSERT(mfn_valid(gcr3_mfn)); + + /* Read guest dte information */ + iommu = find_iommu_for_device(0, mbdf); + if ( !iommu ) + { + AMD_IOMMU_DEBUG("%s Fail to find iommu!\n",__func__); + return -ENODEV; + } + + glx = get_glx_from_dte(gdte); + gv = get_gv_from_dte(gdte); + + unmap_domain_page(dte_base); + + /* Setup host device entry */ + hdom_id = host_domid(d, gdom_id); + req_id = get_dma_requestor_id(iommu->seg, mbdf); + mdte = iommu->dev_table.buffer + (req_id * sizeof(dev_entry_t)); + + spin_lock_irqsave(&iommu->lock, flags); + iommu_dte_set_guest_cr3((u32*)mdte, hdom_id, + gcr3_mfn << PAGE_SHIFT, gv, glx); + + amd_iommu_flush_device(iommu, req_id); + spin_unlock_irqrestore(&iommu->lock, flags); + + return 0; +} + +static void guest_iommu_process_command(unsigned long _d) +{ + unsigned long opcode, tail, head, entries_per_page, cmd_mfn; + cmd_entry_t *cmd, *cmd_base; + struct domain *d; + struct guest_iommu *iommu; + + d = (struct domain*) _d; + iommu = domain_iommu(d); + + if ( !iommu->enabled ) + return; + + head = iommu_get_rb_pointer(iommu->cmd_buffer.reg_head.lo); + tail = iommu_get_rb_pointer(iommu->cmd_buffer.reg_tail.lo); + + /* Tail pointer is rolled over by guest driver, value outside + * cmd_buffer_entries cause iommu disabled + */ + + if ( tail >= iommu->cmd_buffer.entries || + head >= iommu->cmd_buffer.entries ) + { + AMD_IOMMU_DEBUG("Error: guest iommu cmd buffer overflows\n"); + guest_iommu_disable(iommu); + return; + } + + entries_per_page = PAGE_SIZE / sizeof(cmd_entry_t); + + while ( head != tail ) + { + int ret = 0; + + cmd_mfn = guest_iommu_get_table_mfn(d, + reg_to_u64(iommu->cmd_buffer.reg_base), + sizeof(cmd_entry_t), head); + ASSERT(mfn_valid(cmd_mfn)); + + cmd_base = map_domain_page(cmd_mfn); + cmd = cmd_base + head % entries_per_page; + + opcode = get_field_from_reg_u32(cmd->data[1], + IOMMU_CMD_OPCODE_MASK, + IOMMU_CMD_OPCODE_SHIFT); + switch ( opcode ) + { + case IOMMU_CMD_COMPLETION_WAIT: + ret = do_completion_wait(d, cmd); + break; + case IOMMU_CMD_INVALIDATE_DEVTAB_ENTRY: + ret = do_invalidate_dte(d, cmd); + break; + case IOMMU_CMD_INVALIDATE_IOMMU_PAGES: + ret = do_invalidate_pages(d, cmd); + break; + case IOMMU_CMD_INVALIDATE_IOTLB_PAGES: + ret = do_invalidate_iotlb_pages(d, cmd); + break; + case IOMMU_CMD_INVALIDATE_INT_TABLE: + break; + case IOMMU_CMD_COMPLETE_PPR_REQUEST: + ret = do_complete_ppr_request(d, cmd); + break; + case IOMMU_CMD_INVALIDATE_IOMMU_ALL: + ret = do_invalidate_all(d, cmd); + break; + default: + AMD_IOMMU_DEBUG("CMD: Unknown command cmd_type = %lx " + "head = %ld\n", opcode, head); + break; + } + + unmap_domain_page(cmd_base); + if ( (++head) >= iommu->cmd_buffer.entries ) + head = 0; + if ( ret ) + guest_iommu_disable(iommu); + } + + /* Now shift cmd buffer head pointer */ + iommu_set_rb_pointer(&iommu->cmd_buffer.reg_head.lo, head); + return; +} + +static int guest_iommu_write_ctrl(struct guest_iommu *iommu, 
uint64_t newctrl) +{ + bool_t cmd_en, event_en, iommu_en, ppr_en, ppr_log_en; + bool_t cmd_en_old, event_en_old, iommu_en_old; + bool_t cmd_run; + + iommu_en = iommu_get_bit(newctrl, + IOMMU_CONTROL_TRANSLATION_ENABLE_SHIFT); + iommu_en_old = iommu_get_bit(iommu->reg_ctrl.lo, + IOMMU_CONTROL_TRANSLATION_ENABLE_SHIFT); + + cmd_en = iommu_get_bit(newctrl, + IOMMU_CONTROL_COMMAND_BUFFER_ENABLE_SHIFT); + cmd_en_old = iommu_get_bit(iommu->reg_ctrl.lo, + IOMMU_CONTROL_COMMAND_BUFFER_ENABLE_SHIFT); + cmd_run = iommu_get_bit(iommu->reg_status.lo, + + IOMMU_STATUS_CMD_BUFFER_RUN_SHIFT); + event_en = iommu_get_bit(newctrl, + IOMMU_CONTROL_EVENT_LOG_ENABLE_SHIFT); + event_en_old = iommu_get_bit(iommu->reg_ctrl.lo, + IOMMU_CONTROL_EVENT_LOG_ENABLE_SHIFT); + + ppr_en = iommu_get_bit(newctrl, + IOMMU_CONTROL_PPR_ENABLE_SHIFT); + ppr_log_en = iommu_get_bit(newctrl, + IOMMU_CONTROL_PPR_LOG_ENABLE_SHIFT); + + if ( iommu_en ) + { + guest_iommu_enable(iommu); + guest_iommu_enable_dev_table(iommu); + } + + if ( iommu_en && cmd_en ) + { + guest_iommu_enable_ring_buffer(iommu, &iommu->cmd_buffer, + sizeof(cmd_entry_t)); + /* Enable iommu command processing */ + tasklet_schedule(&iommu->cmd_buffer_tasklet); + } + + if ( iommu_en && event_en ) + { + guest_iommu_enable_ring_buffer(iommu, &iommu->event_log, + sizeof(event_entry_t)); + guest_iommu_set_status(iommu, IOMMU_STATUS_EVENT_LOG_RUN_SHIFT); + guest_iommu_clear_status(iommu, IOMMU_STATUS_EVENT_OVERFLOW_SHIFT); + } + + if ( iommu_en && ppr_en && ppr_log_en ) + { + guest_iommu_enable_ring_buffer(iommu, &iommu->ppr_log, + sizeof(ppr_entry_t)); + guest_iommu_set_status(iommu, IOMMU_STATUS_PPR_LOG_RUN_SHIFT); + guest_iommu_clear_status(iommu, IOMMU_STATUS_PPR_LOG_OVERFLOW_SHIFT); + } + + if ( iommu_en && cmd_en_old && !cmd_en ) + { + /* Disable iommu command processing */ + tasklet_kill(&iommu->cmd_buffer_tasklet); + } + + if ( event_en_old && !event_en ) + { + guest_iommu_clear_status(iommu, IOMMU_STATUS_EVENT_LOG_RUN_SHIFT); + } + + if ( !iommu_en && iommu_en_old ) + { + guest_iommu_disable(iommu); + } + + u64_to_reg(&iommu->reg_ctrl, newctrl); + return 0; +} + +static uint64_t iommu_mmio_read64(struct guest_iommu *iommu, + unsigned long offset) +{ + uint64_t val; + + switch ( offset ) + { + case IOMMU_DEV_TABLE_BASE_LOW_OFFSET: + val = reg_to_u64(iommu->dev_table.reg_base); + break; + case IOMMU_CMD_BUFFER_BASE_LOW_OFFSET: + val = reg_to_u64(iommu->cmd_buffer.reg_base); + break; + case IOMMU_EVENT_LOG_BASE_LOW_OFFSET: + val = reg_to_u64(iommu->event_log.reg_base); + break; + case IOMMU_PPR_LOG_BASE_LOW_OFFSET: + val = reg_to_u64(iommu->ppr_log.reg_base); + break; + case IOMMU_CMD_BUFFER_HEAD_OFFSET: + val = reg_to_u64(iommu->cmd_buffer.reg_head); + break; + case IOMMU_CMD_BUFFER_TAIL_OFFSET: + val = reg_to_u64(iommu->cmd_buffer.reg_tail); + break; + case IOMMU_EVENT_LOG_HEAD_OFFSET:; + val = reg_to_u64(iommu->event_log.reg_head); + break; + case IOMMU_EVENT_LOG_TAIL_OFFSET: + val = reg_to_u64(iommu->event_log.reg_tail); + break; + case IOMMU_PPR_LOG_HEAD_OFFSET: + val = reg_to_u64(iommu->ppr_log.reg_head); + break; + case IOMMU_PPR_LOG_TAIL_OFFSET: + val = reg_to_u64(iommu->ppr_log.reg_tail); + break; + case IOMMU_CONTROL_MMIO_OFFSET: + val = reg_to_u64(iommu->reg_ctrl); + break; + case IOMMU_STATUS_MMIO_OFFSET: + val = reg_to_u64(iommu->reg_status); + break; + case IOMMU_EXT_FEATURE_MMIO_OFFSET: + val = reg_to_u64(iommu->reg_ext_feature); + break; + + default: + AMD_IOMMU_DEBUG("Guest reads unknown mmio offset = %lx\n", + offset); + val = 0; + break; + } + + 
return val; +} + +static int guest_iommu_mmio_read(struct vcpu *v, unsigned long addr, + unsigned long len, unsigned long *pval) +{ + struct guest_iommu *iommu = vcpu_iommu(v); + unsigned long offset; + uint64_t val; + uint32_t mmio, shift; + uint64_t mask = 0; + + offset = addr - iommu->mmio_base; + + if ( unlikely((offset & (len - 1 )) || (len > 8)) ) + { + AMD_IOMMU_DEBUG("iommu mmio read access is not aligned." + "offset = %lx, len = %lx \n", offset, len); + return X86EMUL_UNHANDLEABLE; + } + + mask = (len == 8) ? (~0ULL) : (1ULL << (len * 8)) - 1; + shift = (offset & 7u) * 8; + + /* mmio access is always aligned on 8-byte boundary */ + mmio = offset & (~7u); + + spin_lock(&iommu->lock); + val = iommu_mmio_read64(iommu, mmio); + spin_unlock(&iommu->lock); + + *pval = (val >> shift ) & mask; + + return X86EMUL_OKAY; +} + +static void guest_iommu_mmio_write64(struct guest_iommu *iommu, + unsigned long offset, uint64_t val) +{ + switch ( offset ) + { + case IOMMU_DEV_TABLE_BASE_LOW_OFFSET: + u64_to_reg(&iommu->dev_table.reg_base, val); + break; + case IOMMU_CMD_BUFFER_BASE_LOW_OFFSET: + u64_to_reg(&iommu->cmd_buffer.reg_base, val); + break; + case IOMMU_EVENT_LOG_BASE_LOW_OFFSET: + u64_to_reg(&iommu->event_log.reg_base, val); + break; + case IOMMU_PPR_LOG_BASE_LOW_OFFSET: + u64_to_reg(&iommu->ppr_log.reg_base, val); + break; + case IOMMU_CONTROL_MMIO_OFFSET: + guest_iommu_write_ctrl(iommu, val); + break; + case IOMMU_CMD_BUFFER_HEAD_OFFSET: + u64_to_reg(&iommu->cmd_buffer.reg_head, val); + break; + case IOMMU_CMD_BUFFER_TAIL_OFFSET: + u64_to_reg(&iommu->cmd_buffer.reg_tail, val); + tasklet_schedule(&iommu->cmd_buffer_tasklet); + break; + case IOMMU_EVENT_LOG_HEAD_OFFSET: + u64_to_reg(&iommu->event_log.reg_head, val); + break; + case IOMMU_EVENT_LOG_TAIL_OFFSET: + u64_to_reg(&iommu->event_log.reg_tail, val); + break; + case IOMMU_PPR_LOG_HEAD_OFFSET: + u64_to_reg(&iommu->ppr_log.reg_head, val); + break; + case IOMMU_PPR_LOG_TAIL_OFFSET: + u64_to_reg(&iommu->ppr_log.reg_tail, val); + break; + case IOMMU_STATUS_MMIO_OFFSET: + u64_to_reg(&iommu->reg_status, val); + break; + + default: + AMD_IOMMU_DEBUG("guest writes unknown mmio offset = %lx, " + "val = %lx\n", offset, val); + break; + } +} + +static int guest_iommu_mmio_write(struct vcpu *v, unsigned long addr, + unsigned long len, unsigned long val) +{ + struct guest_iommu *iommu = vcpu_iommu(v); + unsigned long offset; + uint64_t reg_old, mmio; + uint32_t shift; + uint64_t mask = 0; + + offset = addr - iommu->mmio_base; + + if ( unlikely((offset & (len - 1 )) || (len > 8)) ) + { + AMD_IOMMU_DEBUG("iommu mmio write access is not aligned." + "offset = %lx, len = %lx \n", offset, len); + return X86EMUL_UNHANDLEABLE; + } + + mask = (len == 8) ? 
(~0ULL): (1ULL << (len * 8)) - 1; + shift = (offset & 7u) * 8; + + /* mmio access is always aligned on 8-byte boundary */ + mmio = offset & (~7u); + + spin_lock(&iommu->lock); + + reg_old = iommu_mmio_read64(iommu, mmio); + reg_old &= ~( mask << shift ); + val = reg_old | ((val & mask) << shift ); + guest_iommu_mmio_write64(iommu, mmio, val); + + spin_unlock(&iommu->lock); + + return X86EMUL_OKAY; +} + +int guest_iommu_set_base(struct domain *d, uint64_t base) +{ + p2m_type_t t; + struct guest_iommu *iommu = domain_iommu(d); + + iommu->mmio_base = base; + base >>= PAGE_SHIFT; + + for ( int i = 0; i < IOMMU_MMIO_PAGE_NR; i++ ) + { + unsigned long gfn = base + i; + + get_gfn_query(d, gfn, &t); + p2m_change_type(d, gfn, t, p2m_mmio_dm); + put_gfn(d, gfn); + } + + return 0; +} + +/* Initialize mmio read only bits */ +static void guest_iommu_reg_init(struct guest_iommu *iommu) +{ + uint32_t lower, upper; + + lower = upper = 0; + /* Support prefetch */ + iommu_set_bit(&lower,IOMMU_EXT_FEATURE_PREFSUP_SHIFT); + /* Support PPR log */ + iommu_set_bit(&lower,IOMMU_EXT_FEATURE_PPRSUP_SHIFT); + /* Support guest translation */ + iommu_set_bit(&lower,IOMMU_EXT_FEATURE_GTSUP_SHIFT); + /* Support invalidate all command */ + iommu_set_bit(&lower,IOMMU_EXT_FEATURE_IASUP_SHIFT); + + /* Host translation size has 6 levels */ + set_field_in_reg_u32(HOST_ADDRESS_SIZE_6_LEVEL, lower, + IOMMU_EXT_FEATURE_HATS_MASK, + IOMMU_EXT_FEATURE_HATS_SHIFT, + &lower); + /* Guest translation size has 6 levels */ + set_field_in_reg_u32(GUEST_ADDRESS_SIZE_6_LEVEL, lower, + IOMMU_EXT_FEATURE_GATS_MASK, + IOMMU_EXT_FEATURE_GATS_SHIFT, + &lower); + /* Single level gCR3 */ + set_field_in_reg_u32(GUEST_CR3_1_LEVEL, lower, + IOMMU_EXT_FEATURE_GLXSUP_MASK, + IOMMU_EXT_FEATURE_GLXSUP_SHIFT, &lower); + /* 9 bit PASID */ + set_field_in_reg_u32(PASMAX_9_bit, upper, + IOMMU_EXT_FEATURE_PASMAX_MASK, + IOMMU_EXT_FEATURE_PASMAX_SHIFT, &upper); + + iommu->reg_ext_feature.lo = lower; + iommu->reg_ext_feature.hi = upper; +} + +/* Domain specific initialization */ +int guest_iommu_init(struct domain* d) +{ + struct guest_iommu *iommu; + struct hvm_iommu *hd = domain_hvm_iommu(d); + + if ( !is_hvm_domain(d) ) + return 0; + + iommu = xzalloc(struct guest_iommu); + if ( !iommu ) + { + AMD_IOMMU_DEBUG("Error allocating guest iommu structure.\n"); + return 1; + } + + guest_iommu_reg_init(iommu); + iommu->domain = d; + hd->g_iommu = iommu; + + tasklet_init(&iommu->cmd_buffer_tasklet, + guest_iommu_process_command, (unsigned long)d); + + spin_lock_init(&iommu->lock); + + return 0; +} + +void guest_iommu_destroy(struct domain *d) +{ + struct guest_iommu *iommu; + + if ( !is_hvm_domain(d) ) + return; + + iommu = domain_iommu(d); + + tasklet_kill(&iommu->cmd_buffer_tasklet); + xfree(iommu); + + domain_hvm_iommu(d)->g_iommu = NULL; +} + +static int guest_iommu_mmio_range(struct vcpu *v, unsigned long addr) +{ + struct guest_iommu *iommu = vcpu_iommu(v); + + return ( addr >= iommu->mmio_base && + addr < (iommu->mmio_base + IOMMU_MMIO_SIZE) ); +} + +const struct hvm_mmio_handler iommu_mmio_handler = { + .check_handler = guest_iommu_mmio_range, + .read_handler = guest_iommu_mmio_read, + .write_handler = guest_iommu_mmio_write +}; diff -r e15194f68f99 -r 07f338ae6632 xen/drivers/passthrough/amd/iommu_map.c --- a/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:17 2011 +0100 @@ -234,6 +234,53 @@ void __init iommu_dte_add_device_entry(u dte[3] = entry; } +void 
iommu_dte_set_guest_cr3(u32 *dte, u16 dom_id, u64 gcr3, + int gv, unsigned int glx) +{ + u32 entry, gcr3_1, gcr3_2, gcr3_3; + + gcr3_3 = gcr3 >> 31; + gcr3_2 = (gcr3 >> 15) & 0xFFFF; + gcr3_1 = (gcr3 >> PAGE_SHIFT) & 0x7; + + /* I bit must be set when gcr3 is enabled */ + entry = dte[3]; + set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry, + IOMMU_DEV_TABLE_IOTLB_SUPPORT_MASK, + IOMMU_DEV_TABLE_IOTLB_SUPPORT_SHIFT, &entry); + /* update gcr3 */ + set_field_in_reg_u32(gcr3_3, entry, + IOMMU_DEV_TABLE_GCR3_3_MASK, + IOMMU_DEV_TABLE_GCR3_3_SHIFT, &entry); + dte[3] = entry; + + set_field_in_reg_u32(dom_id, entry, + IOMMU_DEV_TABLE_DOMAIN_ID_MASK, + IOMMU_DEV_TABLE_DOMAIN_ID_SHIFT, &entry); + /* update gcr3 */ + entry = dte[2]; + set_field_in_reg_u32(gcr3_2, entry, + IOMMU_DEV_TABLE_GCR3_2_MASK, + IOMMU_DEV_TABLE_GCR3_2_SHIFT, &entry); + dte[2] = entry; + + entry = dte[1]; + /* Enable GV bit */ + set_field_in_reg_u32(!!gv, entry, + IOMMU_DEV_TABLE_GV_MASK, + IOMMU_DEV_TABLE_GV_SHIFT, &entry); + + /* 1 level guest cr3 table */ + set_field_in_reg_u32(glx, entry, + IOMMU_DEV_TABLE_GLX_MASK, + IOMMU_DEV_TABLE_GLX_SHIFT, &entry); + /* update gcr3 */ + set_field_in_reg_u32(gcr3_1, entry, + IOMMU_DEV_TABLE_GCR3_1_MASK, + IOMMU_DEV_TABLE_GCR3_1_SHIFT, &entry); + dte[1] = entry; +} + u64 amd_iommu_get_next_table_from_pte(u32 *entry) { u64 addr_lo, addr_hi, ptr; diff -r e15194f68f99 -r 07f338ae6632 xen/drivers/passthrough/amd/pci_amd_iommu.c --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Thu Dec 22 16:56:17 2011 +0100 @@ -260,6 +260,8 @@ static int amd_iommu_domain_init(struct hd->domain_id = d->domain_id; + guest_iommu_init(d); + return 0; } @@ -443,6 +445,7 @@ static void deallocate_iommu_page_tables static void amd_iommu_domain_destroy(struct domain *d) { + guest_iommu_destroy(d); deallocate_iommu_page_tables(d); amd_iommu_flush_all_pages(d); } diff -r e15194f68f99 -r 07f338ae6632 xen/include/asm-x86/amd-iommu.h --- a/xen/include/asm-x86/amd-iommu.h Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/include/asm-x86/amd-iommu.h Thu Dec 22 16:56:17 2011 +0100 @@ -24,6 +24,7 @@ #include <xen/types.h> #include <xen/list.h> #include <xen/spinlock.h> +#include <xen/tasklet.h> #include <asm/hvm/svm/amd-iommu-defs.h> #define iommu_found() (!list_empty(&amd_iommu_head)) @@ -130,4 +131,55 @@ struct ivrs_mappings *get_ivrs_mappings( int iterate_ivrs_mappings(int (*)(u16 seg, struct ivrs_mappings *)); int iterate_ivrs_entries(int (*)(u16 seg, struct ivrs_mappings *)); +/* iommu tables in guest space */ +struct mmio_reg { + uint32_t lo; + uint32_t hi; +}; + +struct guest_dev_table { + struct mmio_reg reg_base; + uint32_t size; +}; + +struct guest_buffer { + struct mmio_reg reg_base; + struct mmio_reg reg_tail; + struct mmio_reg reg_head; + uint32_t entries; +}; + +struct guest_iommu_msi { + uint8_t vector; + uint8_t dest; + uint8_t dest_mode; + uint8_t delivery_mode; + uint8_t trig_mode; +}; + +/* virtual IOMMU structure */ +struct guest_iommu { + + struct domain *domain; + spinlock_t lock; + bool_t enabled; + + struct guest_dev_table dev_table; + struct guest_buffer cmd_buffer; + struct guest_buffer event_log; + struct guest_buffer ppr_log; + + struct tasklet cmd_buffer_tasklet; + + uint64_t mmio_base; /* MMIO base address */ + + /* MMIO regs */ + struct mmio_reg reg_ctrl; /* MMIO offset 0018h */ + struct mmio_reg reg_status; /* MMIO offset 2020h */ + struct mmio_reg reg_ext_feature; /* MMIO offset 0030h */ + + /* guest interrupt settings */ 
+ struct guest_iommu_msi msi; +}; + #endif /* _ASM_X86_64_AMD_IOMMU_H */ diff -r e15194f68f99 -r 07f338ae6632 xen/include/asm-x86/hvm/svm/amd-iommu-defs.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h Thu Dec 22 16:56:17 2011 +0100 @@ -117,6 +117,13 @@ #define IOMMU_DEV_TABLE_PAGE_TABLE_PTR_LOW_SHIFT 12 /* DeviceTable Entry[63:32] */ +#define IOMMU_DEV_TABLE_GV_SHIFT 23 +#define IOMMU_DEV_TABLE_GV_MASK 0x800000 +#define IOMMU_DEV_TABLE_GLX_SHIFT 24 +#define IOMMU_DEV_TABLE_GLX_MASK 0x3000000 +#define IOMMU_DEV_TABLE_GCR3_1_SHIFT 26 +#define IOMMU_DEV_TABLE_GCR3_1_MASK 0x1c000000 + #define IOMMU_DEV_TABLE_PAGE_TABLE_PTR_HIGH_MASK 0x000FFFFF #define IOMMU_DEV_TABLE_PAGE_TABLE_PTR_HIGH_SHIFT 0 #define IOMMU_DEV_TABLE_IO_READ_PERMISSION_MASK 0x20000000 @@ -127,6 +134,8 @@ /* DeviceTable Entry[95:64] */ #define IOMMU_DEV_TABLE_DOMAIN_ID_MASK 0x0000FFFF #define IOMMU_DEV_TABLE_DOMAIN_ID_SHIFT 0 +#define IOMMU_DEV_TABLE_GCR3_2_SHIFT 16 +#define IOMMU_DEV_TABLE_GCR3_2_MASK 0xFFFF0000 /* DeviceTable Entry[127:96] */ #define IOMMU_DEV_TABLE_IOTLB_SUPPORT_MASK 0x00000001 @@ -155,6 +164,8 @@ #define IOMMU_DEV_TABLE_INT_TABLE_IGN_UNMAPPED_SHIFT 5 #define IOMMU_DEV_TABLE_INT_TABLE_PTR_LOW_MASK 0xFFFFFFC0 #define IOMMU_DEV_TABLE_INT_TABLE_PTR_LOW_SHIFT 6 +#define IOMMU_DEV_TABLE_GCR3_3_SHIFT 11 +#define IOMMU_DEV_TABLE_GCR3_3_MASK 0xfffff800 /* DeviceTable Entry[191:160] */ #define IOMMU_DEV_TABLE_INT_TABLE_PTR_HIGH_MASK 0x000FFFFF @@ -164,7 +175,6 @@ #define IOMMU_DEV_TABLE_INT_CONTROL_MASK 0x30000000 #define IOMMU_DEV_TABLE_INT_CONTROL_SHIFT 28 - /* Command Buffer */ #define IOMMU_CMD_BUFFER_BASE_LOW_OFFSET 0x08 #define IOMMU_CMD_BUFFER_BASE_HIGH_OFFSET 0x0C @@ -192,6 +202,7 @@ #define IOMMU_CMD_INVALIDATE_IOMMU_PAGES 0x3 #define IOMMU_CMD_INVALIDATE_IOTLB_PAGES 0x4 #define IOMMU_CMD_INVALIDATE_INT_TABLE 0x5 +#define IOMMU_CMD_COMPLETE_PPR_REQUEST 0x7 #define IOMMU_CMD_INVALIDATE_IOMMU_ALL 0x8 /* COMPLETION_WAIT command */ @@ -282,6 +293,28 @@ #define IOMMU_EVENT_DEVICE_ID_MASK 0x0000FFFF #define IOMMU_EVENT_DEVICE_ID_SHIFT 0 +/* PPR Log */ +#define IOMMU_PPR_LOG_ENTRY_SIZE 16 +#define IOMMU_PPR_LOG_POWER_OF2_ENTRIES_PER_PAGE 8 +#define IOMMU_PPR_LOG_U32_PER_ENTRY (IOMMU_PPR_LOG_ENTRY_SIZE / 4) + +#define IOMMU_PPR_LOG_BASE_LOW_OFFSET 0x0038 +#define IOMMU_PPR_LOG_BASE_HIGH_OFFSET 0x003C +#define IOMMU_PPR_LOG_BASE_LOW_MASK 0xFFFFF000 +#define IOMMU_PPR_LOG_BASE_LOW_SHIFT 12 +#define IOMMU_PPR_LOG_BASE_HIGH_MASK 0x000FFFFF +#define IOMMU_PPR_LOG_BASE_HIGH_SHIFT 0 +#define IOMMU_PPR_LOG_LENGTH_MASK 0x0F000000 +#define IOMMU_PPR_LOG_LENGTH_SHIFT 24 +#define IOMMU_PPR_LOG_HEAD_MASK 0x0007FFF0 +#define IOMMU_PPR_LOG_HEAD_SHIFT 4 +#define IOMMU_PPR_LOG_TAIL_MASK 0x0007FFF0 +#define IOMMU_PPR_LOG_TAIL_SHIFT 4 +#define IOMMU_PPR_LOG_HEAD_OFFSET 0x2030 +#define IOMMU_PPR_LOG_TAIL_OFFSET 0x2038 +#define IOMMU_PPR_LOG_DEVICE_ID_MASK 0x0000FFFF +#define IOMMU_PPR_LOG_DEVICE_ID_SHIFT 0 + /* Control Register */ #define IOMMU_CONTROL_MMIO_OFFSET 0x18 #define IOMMU_CONTROL_TRANSLATION_ENABLE_MASK 0x00000001 @@ -309,6 +342,11 @@ #define IOMMU_CONTROL_RESTART_MASK 0x80000000 #define IOMMU_CONTROL_RESTART_SHIFT 31 +#define IOMMU_CONTROL_PPR_LOG_ENABLE_SHIFT 13 +#define IOMMU_CONTROL_PPR_INT_SHIFT 14 +#define IOMMU_CONTROL_PPR_ENABLE_SHIFT 15 +#define IOMMU_CONTROL_GT_ENABLE_SHIFT 16 + /* Exclusion Register */ #define IOMMU_EXCLUSION_BASE_LOW_OFFSET 0x20 #define IOMMU_EXCLUSION_BASE_HIGH_OFFSET 0x24 @@ -342,7 +380,8 @@ #define 
IOMMU_EXT_FEATURE_HATS_MASK 0x00000C00 #define IOMMU_EXT_FEATURE_GATS_SHIFT 0x12 #define IOMMU_EXT_FEATURE_GATS_MASK 0x00003000 -#define IOMMU_EXT_FEATURE_GLXSUP 0x14 +#define IOMMU_EXT_FEATURE_GLXSUP_SHIFT 0x14 +#define IOMMU_EXT_FEATURE_GLXSUP_MASK 0x0000C000 #define IOMMU_EXT_FEATURE_PASMAX_SHIFT 0x0 #define IOMMU_EXT_FEATURE_PASMAX_MASK 0x0000001F @@ -359,6 +398,9 @@ #define IOMMU_STATUS_EVENT_LOG_RUN_SHIFT 3 #define IOMMU_STATUS_CMD_BUFFER_RUN_MASK 0x00000010 #define IOMMU_STATUS_CMD_BUFFER_RUN_SHIFT 4 +#define IOMMU_STATUS_PPR_LOG_OVERFLOW_SHIFT 5 +#define IOMMU_STATUS_PPR_LOG_INT_SHIFT 6 +#define IOMMU_STATUS_PPR_LOG_RUN_SHIFT 7 /* I/O Page Table */ #define IOMMU_PAGE_TABLE_ENTRY_SIZE 8 diff -r e15194f68f99 -r 07f338ae6632 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Thu Dec 22 16:56:17 2011 +0100 @@ -71,6 +71,8 @@ void amd_iommu_set_root_page_table( u32 *dte, u64 root_ptr, u16 domain_id, u8 paging_mode, u8 valid); void iommu_dte_set_iotlb(u32 *dte, u8 i); void iommu_dte_add_device_entry(u32 *dte, struct ivrs_mappings *ivrs_dev); +void iommu_dte_set_guest_cr3(u32 *dte, u16 dom_id, u64 gcr3, + int gv, unsigned int glx); /* send cmd to iommu */ void amd_iommu_flush_all_pages(struct domain *d); @@ -106,6 +108,14 @@ void amd_iommu_resume(void); void amd_iommu_suspend(void); void amd_iommu_crash_shutdown(void); +/* guest iommu support */ +void amd_iommu_send_guest_cmd(struct amd_iommu *iommu, u32 cmd[]); +void guest_iommu_add_ppr_log(struct domain *d, u32 entry[]); +void guest_iommu_add_event_log(struct domain *d, u32 entry[]); +int guest_iommu_init(struct domain* d); +void guest_iommu_destroy(struct domain *d); +int guest_iommu_set_base(struct domain *d, uint64_t base); + static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift) { u32 field; diff -r e15194f68f99 -r 07f338ae6632 xen/include/xen/hvm/iommu.h --- a/xen/include/xen/hvm/iommu.h Thu Dec 22 16:56:14 2011 +0100 +++ b/xen/include/xen/hvm/iommu.h Thu Dec 22 16:56:17 2011 +0100 @@ -47,6 +47,7 @@ struct hvm_iommu { int domain_id; int paging_mode; struct page_info *root_table; + struct guest_iommu *g_iommu; /* iommu_ops */ const struct iommu_ops *platform_ops;
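One detail of the patch worth spelling out is how guest_iommu_mmio_read()/guest_iommu_mmio_write() emulate 1-, 2- and 4-byte guest accesses against registers that are modeled as aligned 64-bit words: the offset is rounded down to an 8-byte boundary, and a mask/shift pair either extracts the requested bytes (read) or folds them into a read-modify-write of the containing word (write). Here is a self-contained sketch of just the write-side folding; regs and mmio_write are illustrative stand-ins, not the guest_iommu layout.

    #include <assert.h>
    #include <stdint.h>

    static uint64_t regs[0x8000 / 8];       /* 32KB, matching IOMMU_MMIO_SIZE */

    static void mmio_write(uint64_t offset, unsigned int len, uint64_t val)
    {
        /* Same mask/shift computation as guest_iommu_mmio_write(). */
        uint64_t mask = (len == 8) ? ~0ULL : (1ULL << (len * 8)) - 1;
        unsigned int shift = (offset & 7u) * 8;        /* byte offset in word */
        uint64_t *reg = &regs[(offset & ~7ULL) / 8];   /* containing word */

        /* Read-modify-write: clear the targeted bytes, insert the new ones. */
        *reg = (*reg & ~(mask << shift)) | ((val & mask) << shift);
    }

    int main(void)
    {
        /* A 2-byte write at offset 0x12 touches only bytes 2..3 of the
         * 8-byte register at offset 0x10; the other six bytes survive. */
        regs[0x10 / 8] = 0x1111111111111111ULL;
        mmio_write(0x12, 2, 0xABCD);
        assert(regs[0x10 / 8] == 0x11111111ABCD1111ULL);
        return 0;
    }

The read path is symmetrical: iommu_mmio_read64() fetches the whole word under the lock and the handler returns (val >> shift) & mask.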
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569381 -3600 # Node ID 33f88c76776c318eea74b8fc1ba467389407ad57 # Parent 07f338ae663242ba9080f1ab84298894783da3e2 amd iommu: Enable ppr log. IOMMUv2 writes peripheral page service request (PPR) records into ppr log to report DMA page request from ATS devices to OS. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 07f338ae6632 -r 33f88c76776c xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:17 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:21 2011 +0100 @@ -178,6 +178,34 @@ static void register_iommu_event_log_in_ writel(entry, iommu->mmio_base+IOMMU_EVENT_LOG_BASE_HIGH_OFFSET); } +static void register_iommu_ppr_log_in_mmio_space(struct amd_iommu *iommu) +{ + u64 addr_64, addr_lo, addr_hi; + u32 power_of2_entries; + u32 entry; + + ASSERT ( iommu->ppr_log.buffer ); + + addr_64 = (u64)virt_to_maddr(iommu->ppr_log.buffer); + addr_lo = addr_64 & DMA_32BIT_MASK; + addr_hi = addr_64 >> 32; + + entry = 0; + iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT); + writel(entry, iommu->mmio_base + IOMMU_PPR_LOG_BASE_LOW_OFFSET); + + power_of2_entries = get_order_from_bytes(iommu->ppr_log.alloc_size) + + IOMMU_PPR_LOG_POWER_OF2_ENTRIES_PER_PAGE; + + entry = 0; + iommu_set_addr_hi_to_reg(&entry, addr_hi); + set_field_in_reg_u32(power_of2_entries, entry, + IOMMU_PPR_LOG_LENGTH_MASK, + IOMMU_PPR_LOG_LENGTH_SHIFT, &entry); + writel(entry, iommu->mmio_base + IOMMU_PPR_LOG_BASE_HIGH_OFFSET); +} + + static void set_iommu_translation_control(struct amd_iommu *iommu, int enable) { @@ -278,6 +306,35 @@ static void set_iommu_event_log_control( writel(entry, iommu->mmio_base + IOMMU_CONTROL_MMIO_OFFSET); } +static void set_iommu_ppr_log_control(struct amd_iommu *iommu, + int enable) +{ + u32 entry; + + entry = readl(iommu->mmio_base + IOMMU_CONTROL_MMIO_OFFSET); + + /*reset head and tail pointer manually before enablement */ + if ( enable ) + { + writel(0x0, iommu->mmio_base + IOMMU_PPR_LOG_HEAD_OFFSET); + writel(0x0, iommu->mmio_base + IOMMU_PPR_LOG_TAIL_OFFSET); + + iommu_set_bit(&entry, IOMMU_CONTROL_PPR_ENABLE_SHIFT); + iommu_set_bit(&entry, IOMMU_CONTROL_PPR_INT_SHIFT); + iommu_set_bit(&entry, IOMMU_CONTROL_PPR_LOG_ENABLE_SHIFT); + } + else + { + iommu_clear_bit(&entry, IOMMU_CONTROL_PPR_ENABLE_SHIFT); + iommu_clear_bit(&entry, IOMMU_CONTROL_PPR_INT_SHIFT); + iommu_clear_bit(&entry, IOMMU_CONTROL_PPR_LOG_ENABLE_SHIFT); + } + + writel(entry, iommu->mmio_base + IOMMU_CONTROL_MMIO_OFFSET); + if ( enable ) + AMD_IOMMU_DEBUG("PPR Log Enabled.\n"); +} + static void parse_event_log_entry(struct amd_iommu *, u32 entry[]); static int amd_iommu_read_event_log(struct amd_iommu *iommu) @@ -585,12 +642,19 @@ static void enable_iommu(struct amd_iomm register_iommu_event_log_in_mmio_space(iommu); register_iommu_exclusion_range(iommu); + if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) + register_iommu_ppr_log_in_mmio_space(iommu); + iommu_msi_set_affinity(irq_to_desc(iommu->irq), &cpu_online_map); amd_iommu_msi_enable(iommu, IOMMU_CONTROL_ENABLED); set_iommu_ht_flags(iommu); set_iommu_command_buffer_control(iommu, IOMMU_CONTROL_ENABLED); set_iommu_event_log_control(iommu, IOMMU_CONTROL_ENABLED); + + if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) + set_iommu_ppr_log_control(iommu, IOMMU_CONTROL_ENABLED); + set_iommu_translation_control(iommu, IOMMU_CONTROL_ENABLED); if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_IASUP_SHIFT) ) @@ 
-672,16 +736,29 @@ static void * __init allocate_event_log( IOMMU_EVENT_LOG_DEFAULT_ENTRIES, "Event Log"); } +static void * __init allocate_ppr_log(struct amd_iommu *iommu) +{ + /* allocate ''ppr log'' in power of 2 increments of 4K */ + return allocate_ring_buffer(&iommu->ppr_log, sizeof(ppr_entry_t), + IOMMU_PPR_LOG_DEFAULT_ENTRIES, "PPR Log"); +} + static int __init amd_iommu_init_one(struct amd_iommu *iommu) { + if ( map_iommu_mmio_region(iommu) != 0 ) + goto error_out; + + get_iommu_features(iommu); + if ( allocate_cmd_buffer(iommu) == NULL ) goto error_out; if ( allocate_event_log(iommu) == NULL ) goto error_out; - if ( map_iommu_mmio_region(iommu) != 0 ) - goto error_out; + if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) + if ( allocate_ppr_log(iommu) == NULL ) + goto error_out; if ( set_iommu_interrupt_handler(iommu) == 0 ) goto error_out; @@ -694,8 +771,6 @@ static int __init amd_iommu_init_one(str iommu->dev_table.entries = device_table.entries; iommu->dev_table.buffer = device_table.buffer; - get_iommu_features(iommu); - enable_iommu(iommu); printk("AMD-Vi: IOMMU %d Enabled.\n", nr_amd_iommus ); nr_amd_iommus++; @@ -718,6 +793,7 @@ static void __init amd_iommu_init_cleanu { deallocate_ring_buffer(&iommu->cmd_buffer); deallocate_ring_buffer(&iommu->event_log); + deallocate_ring_buffer(&iommu->ppr_log); unmap_iommu_mmio_region(iommu); } xfree(iommu); @@ -916,6 +992,10 @@ static void disable_iommu(struct amd_iom amd_iommu_msi_enable(iommu, IOMMU_CONTROL_DISABLED); set_iommu_command_buffer_control(iommu, IOMMU_CONTROL_DISABLED); set_iommu_event_log_control(iommu, IOMMU_CONTROL_DISABLED); + + if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) + set_iommu_ppr_log_control(iommu, IOMMU_CONTROL_DISABLED); + set_iommu_translation_control(iommu, IOMMU_CONTROL_DISABLED); iommu->enabled = 0; diff -r 07f338ae6632 -r 33f88c76776c xen/include/asm-x86/amd-iommu.h --- a/xen/include/asm-x86/amd-iommu.h Thu Dec 22 16:56:17 2011 +0100 +++ b/xen/include/asm-x86/amd-iommu.h Thu Dec 22 16:56:21 2011 +0100 @@ -94,6 +94,7 @@ struct amd_iommu { struct table_struct dev_table; struct ring_buffer cmd_buffer; struct ring_buffer event_log; + struct ring_buffer ppr_log; int exclusion_enable; int exclusion_allow_all; diff -r 07f338ae6632 -r 33f88c76776c xen/include/asm-x86/hvm/svm/amd-iommu-defs.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h Thu Dec 22 16:56:17 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h Thu Dec 22 16:56:21 2011 +0100 @@ -27,6 +27,9 @@ /* IOMMU Event Log entries: in power of 2 increments, minimum of 256 */ #define IOMMU_EVENT_LOG_DEFAULT_ENTRIES 512 +/* IOMMU PPR Log entries: in power of 2 increments, minimum of 256 */ +#define IOMMU_PPR_LOG_DEFAULT_ENTRIES 512 + #define PTE_PER_TABLE_SHIFT 9 #define PTE_PER_TABLE_SIZE (1 << PTE_PER_TABLE_SHIFT) #define PTE_PER_TABLE_MASK (~(PTE_PER_TABLE_SIZE - 1))
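As a sanity check on the length-field arithmetic in register_iommu_ppr_log_in_mmio_space(): the register wants log2 of the entry count, and the code derives it as the allocation's page order plus entries-per-page. A minimal standalone sketch, assuming 16-byte PPR records (i.e. IOMMU_PPR_LOG_POWER_OF2_ENTRIES_PER_PAGE == 8, matching the 128-bit PPR entry format):

    #include <stdio.h>

    #define PAGE_SIZE                4096
    #define PPR_ENTRY_SIZE           16   /* 128-bit PPR record (assumed) */
    #define ENTRIES_PER_PAGE_POWER   8    /* 4096 / 16 == 2^8 */

    static unsigned int get_order_from_bytes(unsigned long bytes)
    {
        unsigned int order = 0;
        while ( ((unsigned long)PAGE_SIZE << order) < bytes )
            order++;
        return order;
    }

    int main(void)
    {
        unsigned long entries = 512;  /* IOMMU_PPR_LOG_DEFAULT_ENTRIES */
        unsigned long bytes = entries * PPR_ENTRY_SIZE;       /* 8192 */
        unsigned int power = get_order_from_bytes(bytes)      /* 1 */
                           + ENTRIES_PER_PAGE_POWER;          /* + 8 */
        printf("length field = %u -> %lu entries\n", power, 1UL << power);
        return 0;   /* prints: length field = 9 -> 512 entries */
    }

So the default 512-entry log (two pages) encodes as 9, and the hardware decodes 2^9 = 512 entries back.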
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569385 -3600 # Node ID 120144c1fac2dd26926b3b2a6362d30a4029620f # Parent 33f88c76776c318eea74b8fc1ba467389407ad57 amd iommu: Enable guest level translation. Similar to nested paging for SVM, IOMMUv2 supports two level translations for DMA. This patch enables this feature. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 33f88c76776c -r 120144c1fac2 xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:21 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:25 2011 +0100 @@ -220,6 +220,23 @@ static void set_iommu_translation_contro writel(entry, iommu->mmio_base+IOMMU_CONTROL_MMIO_OFFSET); } +static void set_iommu_guest_translation_control(struct amd_iommu *iommu, + int enable) +{ + u32 entry; + + entry = readl(iommu->mmio_base + IOMMU_CONTROL_MMIO_OFFSET); + + enable ? + iommu_set_bit(&entry, IOMMU_CONTROL_GT_ENABLE_SHIFT): + iommu_clear_bit(&entry, IOMMU_CONTROL_GT_ENABLE_SHIFT); + + writel(entry, iommu->mmio_base+IOMMU_CONTROL_MMIO_OFFSET); + + if ( enable ) + AMD_IOMMU_DEBUG("Guest Translation Enabled.\n"); +} + static void set_iommu_command_buffer_control(struct amd_iommu *iommu, int enable) { @@ -655,6 +672,9 @@ static void enable_iommu(struct amd_iomm if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) set_iommu_ppr_log_control(iommu, IOMMU_CONTROL_ENABLED); + if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_GTSUP_SHIFT) ) + set_iommu_guest_translation_control(iommu, IOMMU_CONTROL_ENABLED); + set_iommu_translation_control(iommu, IOMMU_CONTROL_ENABLED); if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_IASUP_SHIFT) ) @@ -996,6 +1016,9 @@ static void disable_iommu(struct amd_iom if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) set_iommu_ppr_log_control(iommu, IOMMU_CONTROL_DISABLED); + if ( iommu_has_feature(iommu, IOMMU_EXT_FEATURE_GTSUP_SHIFT) ) + set_iommu_guest_translation_control(iommu, IOMMU_CONTROL_DISABLED); + set_iommu_translation_control(iommu, IOMMU_CONTROL_DISABLED); iommu->enabled = 0;
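The iommu_has_feature() gates used here and in the previous patch test one bit of the extended feature register that get_iommu_features() cached at initialization; on pre-IOMMUv2 hardware that word is zero, so every gate fails and the new paths stay disabled. A rough sketch of the likely shape of the check (the real helper lives in the AMD IOMMU driver; this is illustrative only):

    #include <stdint.h>

    static inline int iommu_has_feature(uint64_t cached_efr,
                                        unsigned int shift)
    {
        /* cached_efr == 0 when the IOMMU exposes no extended features. */
        return (cached_efr >> shift) & 1;
    }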
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 06 of 16] amd iommu: add ppr log processing into iommu interrupt handling
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569388 -3600 # Node ID bf9f21ad9a0ec245b409f3862a5a36c0e070f333 # Parent 120144c1fac2dd26926b3b2a6362d30a4029620f amd iommu: add ppr log processing into iommu interrupt handling PPR log and event log share the same interrupt source. Interrupt handler should check both of them. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 120144c1fac2 -r bf9f21ad9a0e xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:25 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:28 2011 +0100 @@ -352,75 +352,91 @@ static void set_iommu_ppr_log_control(st AMD_IOMMU_DEBUG("PPR Log Enabled.\n"); } -static void parse_event_log_entry(struct amd_iommu *, u32 entry[]); +/* read event log or ppr log from iommu ring buffer */ +static int iommu_read_log(struct amd_iommu *iommu, + struct ring_buffer *log, + void (*parse_func)(struct amd_iommu *, u32 *)) +{ + u32 tail, head, *entry, tail_offest, head_offset; -static int amd_iommu_read_event_log(struct amd_iommu *iommu) -{ - u32 tail, head, *event_log; - - BUG_ON( !iommu ); + BUG_ON( !iommu || ((log != &iommu->event_log) && + (log != &iommu->ppr_log)) ); /* make sure there''s an entry in the log */ - tail = readl(iommu->mmio_base + IOMMU_EVENT_LOG_TAIL_OFFSET); - tail = get_field_from_reg_u32(tail, - IOMMU_EVENT_LOG_TAIL_MASK, - IOMMU_EVENT_LOG_TAIL_SHIFT); + tail_offest = ( log == &iommu->event_log ) ? + IOMMU_EVENT_LOG_TAIL_OFFSET: + IOMMU_PPR_LOG_TAIL_OFFSET; - while ( tail != iommu->event_log.head ) + head_offset = ( log == &iommu->event_log ) ? + IOMMU_EVENT_LOG_HEAD_OFFSET: + IOMMU_PPR_LOG_HEAD_OFFSET; + + tail = readl(iommu->mmio_base + tail_offest); + tail = iommu_get_rb_pointer(tail); + + while ( tail != log->head ) { /* read event log entry */ - event_log = (u32 *)(iommu->event_log.buffer + - (iommu->event_log.head * - IOMMU_EVENT_LOG_ENTRY_SIZE)); + entry = (u32 *)(log->buffer + log->head * log->entry_size); - parse_event_log_entry(iommu, event_log); - - if ( ++iommu->event_log.head == iommu->event_log.entries ) - iommu->event_log.head = 0; + parse_func(iommu, entry); + if ( ++log->head == log->entries ) + log->head = 0; /* update head pointer */ - set_field_in_reg_u32(iommu->event_log.head, 0, - IOMMU_EVENT_LOG_HEAD_MASK, - IOMMU_EVENT_LOG_HEAD_SHIFT, &head); - writel(head, iommu->mmio_base + IOMMU_EVENT_LOG_HEAD_OFFSET); + head = 0; + iommu_set_rb_pointer(&head, log->head); + + writel(head, iommu->mmio_base + head_offset); } return 0; } -static void amd_iommu_reset_event_log(struct amd_iommu *iommu) +/* reset event log or ppr log when overflow */ +static void iommu_reset_log(struct amd_iommu *iommu, + struct ring_buffer *log, + void (*ctrl_func)(struct amd_iommu *iommu, int)) { u32 entry; - int log_run; + int log_run, run_bit, of_bit; int loop_count = 1000; + BUG_ON( !iommu || ((log != &iommu->event_log) && + (log != &iommu->ppr_log)) ); + + run_bit = ( log == &iommu->event_log ) ? + IOMMU_STATUS_EVENT_LOG_RUN_SHIFT: + IOMMU_STATUS_PPR_LOG_RUN_SHIFT; + + of_bit = ( log == &iommu->event_log ) ? 
+ IOMMU_STATUS_EVENT_OVERFLOW_SHIFT: + IOMMU_STATUS_PPR_LOG_OVERFLOW_SHIFT; + /* wait until EventLogRun bit = 0 */ do { entry = readl(iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); - log_run = iommu_get_bit(entry, IOMMU_STATUS_EVENT_LOG_RUN_SHIFT); + log_run = iommu_get_bit(entry, run_bit); loop_count--; } while ( log_run && loop_count ); if ( log_run ) { - AMD_IOMMU_DEBUG("Warning: EventLogRun bit is not cleared" - "before reset!\n"); + AMD_IOMMU_DEBUG("Warning: Log Run bit %d is not cleared" + "before reset! \n", run_bit); return; } - set_iommu_event_log_control(iommu, IOMMU_CONTROL_DISABLED); + ctrl_func(iommu, IOMMU_CONTROL_DISABLED); - /* read event log for debugging */ - amd_iommu_read_event_log(iommu); /*clear overflow bit */ - iommu_clear_bit(&entry, IOMMU_STATUS_EVENT_OVERFLOW_SHIFT); - - writel(entry, iommu->mmio_base+IOMMU_STATUS_MMIO_OFFSET); + iommu_clear_bit(&entry, of_bit); + writel(entry, iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); /*reset event log base address */ - iommu->event_log.head = 0; + log->head = 0; - set_iommu_event_log_control(iommu, IOMMU_CONTROL_ENABLED); + ctrl_func(iommu, IOMMU_CONTROL_ENABLED); } static void iommu_msi_set_affinity(struct irq_desc *desc, const cpumask_t *mask) @@ -592,30 +608,93 @@ static void parse_event_log_entry(struct } } -static void amd_iommu_page_fault(int irq, void *dev_id, - struct cpu_user_regs *regs) +static void iommu_check_event_log(struct amd_iommu *iommu) { u32 entry; unsigned long flags; - struct amd_iommu *iommu = dev_id; spin_lock_irqsave(&iommu->lock, flags); - amd_iommu_read_event_log(iommu); + + iommu_read_log(iommu, &iommu->event_log, parse_event_log_entry); /*check event overflow */ entry = readl(iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); if ( iommu_get_bit(entry, IOMMU_STATUS_EVENT_OVERFLOW_SHIFT) ) - amd_iommu_reset_event_log(iommu); + iommu_reset_log(iommu, &iommu->event_log, set_iommu_event_log_control); /* reset interrupt status bit */ entry = readl(iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); iommu_set_bit(&entry, IOMMU_STATUS_EVENT_LOG_INT_SHIFT); - writel(entry, iommu->mmio_base+IOMMU_STATUS_MMIO_OFFSET); + writel(entry, iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); + spin_unlock_irqrestore(&iommu->lock, flags); } +void parse_ppr_log_entry(struct amd_iommu *iommu, u32 entry[]) +{ + + u16 device_id; + u8 bus, devfn; + struct pci_dev *pdev; + struct domain *d; + + /* here device_id is physical value */ + device_id = iommu_get_devid_from_cmd(entry[0]); + bus = device_id >> 8; + devfn = device_id & 0xFF; + + local_irq_enable(); + + spin_lock(&pcidevs_lock); + pdev = pci_get_pdev(0, bus, devfn); + spin_unlock(&pcidevs_lock); + + local_irq_disable(); + + if ( pdev == NULL ) + return; + + d = pdev->domain; + + guest_iommu_add_ppr_log(d, entry); +} + +static void iommu_check_ppr_log(struct amd_iommu *iommu) +{ + u32 entry; + unsigned long flags; + + spin_lock_irqsave(&iommu->lock, flags); + + iommu_read_log(iommu, &iommu->ppr_log, parse_ppr_log_entry); + + /*check event overflow */ + entry = readl(iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); + + if ( iommu_get_bit(entry, IOMMU_STATUS_PPR_LOG_OVERFLOW_SHIFT) ) + iommu_reset_log(iommu, &iommu->ppr_log, set_iommu_ppr_log_control); + + /* reset interrupt status bit */ + entry = readl(iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); + iommu_set_bit(&entry, IOMMU_STATUS_PPR_LOG_INT_SHIFT); + + writel(entry, iommu->mmio_base + IOMMU_STATUS_MMIO_OFFSET); + + spin_unlock_irqrestore(&iommu->lock, flags); +} + +static void iommu_interrupt_handler(int irq, void *dev_id, + 
struct cpu_user_regs *regs) +{ + struct amd_iommu *iommu = dev_id; + iommu_check_event_log(iommu); + + if ( iommu->ppr_log.buffer != NULL ) + iommu_check_ppr_log(iommu); +} + static int __init set_iommu_interrupt_handler(struct amd_iommu *iommu) { int irq, ret; @@ -628,8 +707,7 @@ static int __init set_iommu_interrupt_ha } irq_desc[irq].handler = &iommu_msi_type; - ret = request_irq(irq, amd_iommu_page_fault, 0, - "amd_iommu", iommu); + ret = request_irq(irq, iommu_interrupt_handler, 0, "amd_iommu", iommu); if ( ret ) { irq_desc[irq].handler = &no_irq_type;
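The unified iommu_read_log() above is a textbook ring-buffer consumer: hardware produces records and advances the tail register, software consumes from its cached head and writes the new head back so slots can be reused. Stripped of the register-field packing, the loop looks like this (types and callbacks are illustrative, not the driver's real interface):

    #include <stdint.h>

    struct log_ring {
        uint8_t  *buffer;      /* ring storage */
        uint32_t  head;        /* next entry software will read */
        uint32_t  entries;     /* ring size */
        uint32_t  entry_size;  /* bytes per record */
    };

    static void drain_log(struct log_ring *log, uint32_t hw_tail,
                          void (*parse)(uint32_t *entry),
                          void (*write_head_reg)(uint32_t head))
    {
        while ( log->head != hw_tail )
        {
            uint32_t *entry =
                (uint32_t *)(log->buffer + log->head * log->entry_size);

            parse(entry);

            if ( ++log->head == log->entries )
                log->head = 0;

            /* Tell the IOMMU this slot may be overwritten. */
            write_head_reg(log->head);
        }
    }

Because the event log and the PPR log only differ in their head/tail register offsets, entry size and parser, one loop now serves both.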
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569392 -3600 # Node ID 40d61d0390ec930cf53ce5cbf91faada8c7192bd # Parent bf9f21ad9a0ec245b409f3862a5a36c0e070f333 amd iommu: Add 2 hypercalls for libxc iommu_set_msi: used by qemu to inform hypervisor iommu vector number in guest space. Hypervisor needs this vector to inject msi into guest when PPR logging happens. iommu_bind_bdf: used by xl to bind guest bdf number to machine bdf number. IOMMU emulations codes receives commands from guest iommu driver and forwards them to host iommu. But virtual device id from guest should be converted into physical before sending to real hardware. Signed -off-by: Wei Wang <wei.wang2@amd.com> diff -r bf9f21ad9a0e -r 40d61d0390ec xen/drivers/passthrough/amd/iommu_guest.c --- a/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:28 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:32 2011 +0100 @@ -50,12 +50,27 @@ static unsigned int machine_bdf(struct domain *d, uint16_t guest_bdf) { - return guest_bdf; + struct pci_dev *pdev; + uint16_t mbdf = 0; + + for_each_pdev( d, pdev ) + { + if ( pdev->gbdf == guest_bdf ) + { + mbdf = PCI_BDF2(pdev->bus, pdev->devfn); + break; + } + } + return mbdf; } static uint16_t guest_bdf(struct domain *d, uint16_t machine_bdf) { - return machine_bdf; + struct pci_dev *pdev; + + pdev = pci_get_pdev_by_domain(d, 0, PCI_BUS(machine_bdf), + PCI_DEVFN2(machine_bdf)); + return pdev->gbdf; } static inline struct guest_iommu *domain_iommu(struct domain *d) @@ -913,3 +928,43 @@ const struct hvm_mmio_handler iommu_mmio .read_handler = guest_iommu_mmio_read, .write_handler = guest_iommu_mmio_write }; + +/* iommu hypercall handler */ +int iommu_bind_bdf(struct domain* d, uint16_t gbdf, uint16_t mbdf) +{ + struct pci_dev *pdev; + int ret = -ENODEV; + + if ( !iommu_found() ) + return 0; + + spin_lock(&pcidevs_lock); + + for_each_pdev( d, pdev ) + { + if ( (pdev->bus != PCI_BUS(mbdf) ) || + (pdev->devfn != PCI_DEVFN2(mbdf)) ) + continue; + + pdev->gbdf = gbdf; + ret = 0; + } + + spin_unlock(&pcidevs_lock); + return ret; +} + +void iommu_set_msi(struct domain* d, uint16_t vector, uint16_t dest, + uint16_t dest_mode, uint16_t delivery_mode, + uint16_t trig_mode) +{ + struct guest_iommu *iommu = domain_iommu(d); + + if ( !iommu_found() ) + return; + + iommu->msi.vector = vector; + iommu->msi.dest = dest; + iommu->msi.dest_mode = dest_mode; + iommu->msi.trig_mode = trig_mode; +} diff -r bf9f21ad9a0e -r 40d61d0390ec xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Thu Dec 22 16:56:28 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Thu Dec 22 16:56:32 2011 +0100 @@ -648,6 +648,40 @@ int iommu_do_domctl( put_domain(d); break; +#ifndef __ia64__ + case XEN_DOMCTL_guest_iommu_op: + { + xen_domctl_guest_iommu_op_t * guest_op; + + if ( unlikely((d = get_domain_by_id(domctl->domain)) == NULL) ) + { + gdprintk(XENLOG_ERR, + "XEN_DOMCTL_guest_iommu_op: get_domain_by_id() failed\n"); + ret = -EINVAL; + break; + } + + guest_op = &(domctl->u.guest_iommu_op); + switch ( guest_op->op ) + { + case XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI: + iommu_set_msi(d, guest_op->u.msi.vector, + guest_op->u.msi.dest, + guest_op->u.msi.dest_mode, + guest_op->u.msi.delivery_mode, + guest_op->u.msi.trig_mode); + ret = 0; + break; + case XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF: + ret = iommu_bind_bdf(d, guest_op->u.bdf_bind.g_bdf, + guest_op->u.bdf_bind.m_bdf); + break; + } + put_domain(d); + break; + } +#endif + default: ret = -ENOSYS; break; diff -r bf9f21ad9a0e -r 
40d61d0390ec xen/include/public/domctl.h --- a/xen/include/public/domctl.h Thu Dec 22 16:56:28 2011 +0100 +++ b/xen/include/public/domctl.h Thu Dec 22 16:56:32 2011 +0100 @@ -848,6 +848,29 @@ struct xen_domctl_set_access_required { typedef struct xen_domctl_set_access_required xen_domctl_set_access_required_t; DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_access_required_t); +/* Support for guest iommu emulation */ +struct xen_domctl_guest_iommu_op { + /* XEN_DOMCTL_GUEST_IOMMU_OP_* */ +#define XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI 0 +#define XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF 1 + uint8_t op; + union { + struct iommu_msi { + uint8_t vector; + uint8_t dest; + uint8_t dest_mode; + uint8_t delivery_mode; + uint8_t trig_mode; + } msi; + struct bdf_bind { + uint32_t g_bdf; + uint32_t m_bdf; + } bdf_bind; + } u; +}; +typedef struct xen_domctl_guest_iommu_op xen_domctl_guest_iommu_op_t; +DEFINE_XEN_GUEST_HANDLE(xen_domctl_guest_iommu_op_t); + struct xen_domctl { uint32_t cmd; #define XEN_DOMCTL_createdomain 1 @@ -912,6 +935,7 @@ struct xen_domctl { #define XEN_DOMCTL_getvcpuextstate 63 #define XEN_DOMCTL_set_access_required 64 #define XEN_DOMCTL_audit_p2m 65 +#define XEN_DOMCTL_guest_iommu_op 66 #define XEN_DOMCTL_gdbsx_guestmemio 1000 #define XEN_DOMCTL_gdbsx_pausevcpu 1001 #define XEN_DOMCTL_gdbsx_unpausevcpu 1002 @@ -960,6 +984,7 @@ struct xen_domctl { struct xen_domctl_debug_op debug_op; struct xen_domctl_mem_event_op mem_event_op; struct xen_domctl_mem_sharing_op mem_sharing_op; + struct xen_domctl_guest_iommu_op guest_iommu_op; #if defined(__i386__) || defined(__x86_64__) struct xen_domctl_cpuid cpuid; struct xen_domctl_vcpuextstate vcpuextstate; diff -r bf9f21ad9a0e -r 40d61d0390ec xen/include/xen/iommu.h --- a/xen/include/xen/iommu.h Thu Dec 22 16:56:28 2011 +0100 +++ b/xen/include/xen/iommu.h Thu Dec 22 16:56:32 2011 +0100 @@ -164,6 +164,14 @@ int iommu_do_domctl(struct xen_domctl *, void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count); void iommu_iotlb_flush_all(struct domain *d); +#ifndef __ia64_ +/* Only used by AMD IOMMU */ +void iommu_set_msi(struct domain* d, uint16_t vector, uint16_t dest, + uint16_t dest_mode, uint16_t delivery_mode, + uint16_t trig_mode); +int iommu_bind_bdf(struct domain* d, uint16_t gbdf, uint16_t mbdf); +#endif + /* * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to * avoid unecessary iotlb_flush in the low level IOMMU code. diff -r bf9f21ad9a0e -r 40d61d0390ec xen/include/xen/pci.h --- a/xen/include/xen/pci.h Thu Dec 22 16:56:28 2011 +0100 +++ b/xen/include/xen/pci.h Thu Dec 22 16:56:32 2011 +0100 @@ -63,6 +63,9 @@ struct pci_dev { const u8 devfn; struct pci_dev_info info; u64 vf_rlen[6]; + + /* used by amd iomm to represent bdf value in guest space */ + u16 gbdf; }; #define for_each_pdev(domain, pdev) \
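For readers not steeped in PCI addressing: the 16-bit BDF values that machine_bdf()/guest_bdf() translate pack the bus number into the high byte and devfn (5-bit device, 3-bit function) into the low byte. A tiny worked example, with the macros mirroring Xen's definitions:

    #include <stdint.h>
    #include <stdio.h>

    #define PCI_BDF2(bus, devfn)  ((((bus) & 0xff) << 8) | ((devfn) & 0xff))
    #define PCI_BUS(bdf)          (((bdf) >> 8) & 0xff)
    #define PCI_DEVFN2(bdf)       ((bdf) & 0xff)

    int main(void)
    {
        /* Device 05:02.1 -> devfn = (2 << 3) | 1 = 0x11, bdf = 0x0511 */
        uint16_t bdf = PCI_BDF2(0x05, (0x02 << 3) | 0x1);
        printf("bdf=%04x bus=%02x devfn=%02x\n",
               bdf, PCI_BUS(bdf), PCI_DEVFN2(bdf));
        return 0;
    }

The bind hypercall therefore only needs to store the guest-side 16-bit value in pdev->gbdf; the lookups above do the unpacking.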
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569395 -3600 # Node ID 2329dad2786f6f20ea69c9609ab60208cad6fca9 # Parent 40d61d0390ec930cf53ce5cbf91faada8c7192bd amd iommu: Add a hypercall for hvmloader. IOMMU MMIO base address is dynamically allocated by firmware. This patch allows hvmloader to notify hypervisor where the iommu mmio pages are. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 40d61d0390ec -r 2329dad2786f xen/arch/x86/hvm/hvm.c --- a/xen/arch/x86/hvm/hvm.c Thu Dec 22 16:56:32 2011 +0100 +++ b/xen/arch/x86/hvm/hvm.c Thu Dec 22 16:56:35 2011 +0100 @@ -65,6 +65,7 @@ #include <public/memory.h> #include <asm/mem_event.h> #include <public/mem_event.h> +#include <asm/hvm/svm/amd-iommu-proto.h> bool_t __read_mostly hvm_enabled; @@ -3677,6 +3678,9 @@ long do_hvm_op(unsigned long op, XEN_GUE case HVM_PARAM_BUFIOREQ_EVTCHN: rc = -EINVAL; break; + case HVM_PARAM_IOMMU_BASE: + rc = guest_iommu_set_base(d, a.value); + break; } if ( rc == 0 ) diff -r 40d61d0390ec -r 2329dad2786f xen/include/public/hvm/params.h --- a/xen/include/public/hvm/params.h Thu Dec 22 16:56:32 2011 +0100 +++ b/xen/include/public/hvm/params.h Thu Dec 22 16:56:35 2011 +0100 @@ -142,6 +142,10 @@ /* Boolean: Enable nestedhvm (hvm only) */ #define HVM_PARAM_NESTEDHVM 24 -#define HVM_NR_PARAMS 27 +#ifndef __ia64__ +#define HVM_PARAM_IOMMU_BASE 27 +#endif + +#define HVM_NR_PARAMS 28 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
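Although in this series it is hvmloader itself that issues HVMOP_set_param (see the IVRS patch later on), any libxc client could set the same parameter through the existing generic wrapper. A sketch, with the MMIO address purely illustrative:

    #include <xenctrl.h>

    int set_guest_iommu_base(xc_interface *xch, uint32_t domid)
    {
        unsigned long mmio_base = 0xfed90000UL;  /* example value only */

        /* Fails where guest_iommu_set_base() returns non-zero,
         * i.e. on non-IOMMUv2 hosts. */
        return xc_set_hvm_param(xch, domid, HVM_PARAM_IOMMU_BASE,
                                mmio_base);
    }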
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569398 -3600 # Node ID dd808bdd61c581b041d5b7e816b18674de51da6f # Parent 2329dad2786f6f20ea69c9609ab60208cad6fca9 amd iommu: add iommu mmio handler. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 2329dad2786f -r dd808bdd61c5 xen/arch/x86/hvm/intercept.c --- a/xen/arch/x86/hvm/intercept.c Thu Dec 22 16:56:35 2011 +0100 +++ b/xen/arch/x86/hvm/intercept.c Thu Dec 22 16:56:38 2011 +0100 @@ -38,7 +38,8 @@ hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] &hpet_mmio_handler, &vlapic_mmio_handler, &vioapic_mmio_handler, - &msixtbl_mmio_handler + &msixtbl_mmio_handler, + &iommu_mmio_handler }; static int hvm_mmio_access(struct vcpu *v, diff -r 2329dad2786f -r dd808bdd61c5 xen/include/asm-x86/hvm/io.h --- a/xen/include/asm-x86/hvm/io.h Thu Dec 22 16:56:35 2011 +0100 +++ b/xen/include/asm-x86/hvm/io.h Thu Dec 22 16:56:38 2011 +0100 @@ -69,8 +69,9 @@ extern const struct hvm_mmio_handler hpe extern const struct hvm_mmio_handler vlapic_mmio_handler; extern const struct hvm_mmio_handler vioapic_mmio_handler; extern const struct hvm_mmio_handler msixtbl_mmio_handler; +extern const struct hvm_mmio_handler iommu_mmio_handler; -#define HVM_MMIO_HANDLER_NR 4 +#define HVM_MMIO_HANDLER_NR 5 int hvm_io_intercept(ioreq_t *p, int type); void register_io_handler(
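For context, registration in hvm_mmio_handlers is all that is needed because the MMIO intercept path probes each handler's range check before routing the access; roughly (paraphrasing hvm_mmio_access() in xen/arch/x86/hvm/intercept.c — a sketch, not the literal code):

    static const struct hvm_mmio_handler *
    find_mmio_handler(struct vcpu *v, unsigned long addr)
    {
        unsigned int i;

        for ( i = 0; i < HVM_MMIO_HANDLER_NR; i++ )
            if ( hvm_mmio_handlers[i]->check_handler(v, addr) )
                return hvm_mmio_handlers[i];

        return NULL; /* not emulated here */
    }

The IOMMU handler is unusual only in that its range check (guest_iommu_mmio_range() from the earlier patch) is per-domain, since the base address is programmed at run time via HVM_PARAM_IOMMU_BASE.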
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 10 of 16] amd iommu: Enable FC bit in iommu host level PTE
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569401 -3600 # Node ID 30b1f434160d989be5e0bb6c6956bb7e3985db59 # Parent dd808bdd61c581b041d5b7e816b18674de51da6f amd iommu: Enable FC bit in iommu host level PTE Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r dd808bdd61c5 -r 30b1f434160d xen/drivers/passthrough/amd/iommu_map.c --- a/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:38 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:41 2011 +0100 @@ -83,6 +83,11 @@ static bool_t set_iommu_pde_present(u32 set_field_in_reg_u32(ir, entry, IOMMU_PDE_IO_READ_PERMISSION_MASK, IOMMU_PDE_IO_READ_PERMISSION_SHIFT, &entry); + + /* IOMMUv2 needs FC bit enabled */ + if ( next_level == IOMMU_PAGING_MODE_LEVEL_0 ) + set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry, + IOMMU_PTE_FC_MASK, IOMMU_PTE_FC_SHIFT, &entry); pde[1] = entry; /* mark next level as ''present'' */
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 11 of 16] amd iommu: Add a new flag to indicate whether the iommuv2 feature is enabled
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569405 -3600 # Node ID b70cf1dcb110f79546f3efac3201a08d70c8b96e # Parent 30b1f434160d989be5e0bb6c6956bb7e3985db59 amd iommu: Add a new flag to indication iommuv2 feature enabled or not. Hypercalls should return early on non-iommuv2 systems. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 30b1f434160d -r b70cf1dcb110 xen/drivers/passthrough/amd/iommu_guest.c --- a/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:41 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:45 2011 +0100 @@ -48,6 +48,8 @@ (reg)->hi = (val >> 32) & 0xFFFFFFFF; \ } while(0) +extern bool_t iommuv2_enabled; + static unsigned int machine_bdf(struct domain *d, uint16_t guest_bdf) { struct pci_dev *pdev; @@ -819,6 +821,9 @@ int guest_iommu_set_base(struct domain * p2m_type_t t; struct guest_iommu *iommu = domain_iommu(d); + if ( !is_hvm_domain(d) || !iommuv2_enabled ) + return 1; + iommu->mmio_base = base; base >>= PAGE_SHIFT; @@ -878,7 +883,7 @@ int guest_iommu_init(struct domain* d) struct guest_iommu *iommu; struct hvm_iommu *hd = domain_hvm_iommu(d); - if ( !is_hvm_domain(d) ) + if ( !is_hvm_domain(d) || !iommuv2_enabled ) return 0; iommu = xzalloc(struct guest_iommu); @@ -902,13 +907,11 @@ int guest_iommu_init(struct domain* d) void guest_iommu_destroy(struct domain *d) { - struct guest_iommu *iommu; + struct guest_iommu *iommu = domain_iommu(d); - if ( !is_hvm_domain(d) ) + if ( !is_hvm_domain(d) || !iommuv2_enabled ) return; - iommu = domain_iommu(d); - tasklet_kill(&iommu->cmd_buffer_tasklet); xfree(iommu); @@ -919,6 +922,9 @@ static int guest_iommu_mmio_range(struct { struct guest_iommu *iommu = vcpu_iommu(v); + if ( !iommu_found() || !iommuv2_enabled ) + return 0; + return ( addr >= iommu->mmio_base && addr < (iommu->mmio_base + IOMMU_MMIO_SIZE) ); } @@ -935,7 +941,7 @@ int iommu_bind_bdf(struct domain* d, uin struct pci_dev *pdev; int ret = -ENODEV; - if ( !iommu_found() ) + if ( !iommu_found() || !iommuv2_enabled ) return 0; spin_lock(&pcidevs_lock); @@ -960,7 +966,7 @@ void iommu_set_msi(struct domain* d, uin { struct guest_iommu *iommu = domain_iommu(d); - if ( !iommu_found() ) + if ( !iommu_found() || !iommuv2_enabled ) return; iommu->msi.vector = vector; diff -r 30b1f434160d -r b70cf1dcb110 xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:41 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:45 2011 +0100 @@ -36,6 +36,7 @@ unsigned short ivrs_bdf_entries; static struct radix_tree_root ivrs_maps; struct list_head amd_iommu_head; struct table_struct device_table; +bool_t iommuv2_enabled; static int iommu_has_ht_flag(struct amd_iommu *iommu, u8 mask) { @@ -759,6 +760,10 @@ static void enable_iommu(struct amd_iomm amd_iommu_flush_all_caches(iommu); iommu->enabled = 1; + + if ( iommu->features ) + iommuv2_enabled = 1; + spin_unlock_irqrestore(&iommu->lock, flags); }
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569408 -3600 # Node ID bb76ba2457e629d668c08e57f2168b13bdfb7930 # Parent b70cf1dcb110f79546f3efac3201a08d70c8b96e hvmloader: Build IVRS table. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r b70cf1dcb110 -r bb76ba2457e6 tools/firmware/hvmloader/acpi/acpi2_0.h --- a/tools/firmware/hvmloader/acpi/acpi2_0.h Thu Dec 22 16:56:45 2011 +0100 +++ b/tools/firmware/hvmloader/acpi/acpi2_0.h Thu Dec 22 16:56:48 2011 +0100 @@ -389,6 +389,60 @@ struct acpi_20_madt_intsrcovr { #define ACPI_2_0_WAET_REVISION 0x01 #define ACPI_1_0_FADT_REVISION 0x01 +#define IVRS_SIGNATURE ASCII32(''I'',''V'',''R'',''S'') +#define IVRS_REVISION 1 +#define IVRS_VASIZE 64 +#define IVRS_PASIZE 52 +#define IVRS_GVASIZE 64 + +#define IVHD_BLOCK_TYPE 0x10 +#define IVHD_FLAG_HTTUNEN (1 << 0) +#define IVHD_FLAG_PASSPW (1 << 1) +#define IVHD_FLAG_RESPASSPW (1 << 2) +#define IVHD_FLAG_ISOC (1 << 3) +#define IVHD_FLAG_IOTLBSUP (1 << 4) +#define IVHD_FLAG_COHERENT (1 << 5) +#define IVHD_FLAG_PREFSUP (1 << 6) +#define IVHD_FLAG_PPRSUP (1 << 7) + +#define IVHD_EFR_GTSUP (1 << 2) +#define IVHD_EFR_IASUP (1 << 5) + +#define IVHD_SELECT_4_BYTE 0x2 + +struct ivrs_ivhd_block +{ + uint8_t type; + uint8_t flags; + uint16_t length; + uint16_t devid; + uint16_t cap_offset; + uint64_t iommu_base_addr; + uint16_t pci_segment; + uint16_t iommu_info; + uint32_t reserved; +}; + +/* IVHD 4-byte device entries */ +struct ivrs_ivhd_device +{ + uint8_t type; + uint16_t dev_id; + uint8_t flags; +}; + +#define PT_DEV_MAX_NR 32 +#define IOMMU_CAP_OFFSET 0x40 +struct acpi_40_ivrs +{ + struct acpi_header header; + uint32_t iv_info; + uint32_t reserved[2]; + struct ivrs_ivhd_block ivhd_block; + struct ivrs_ivhd_device ivhd_device[PT_DEV_MAX_NR]; +}; + + #pragma pack () struct acpi_config { diff -r b70cf1dcb110 -r bb76ba2457e6 tools/firmware/hvmloader/acpi/build.c --- a/tools/firmware/hvmloader/acpi/build.c Thu Dec 22 16:56:45 2011 +0100 +++ b/tools/firmware/hvmloader/acpi/build.c Thu Dec 22 16:56:48 2011 +0100 @@ -23,6 +23,8 @@ #include "ssdt_pm.h" #include "../config.h" #include "../util.h" +#include "../hypercall.h" +#include <xen/hvm/params.h> #define align16(sz) (((sz) + 15) & ~15) #define fixed_strcpy(d, s) strncpy((d), (s), sizeof(d)) @@ -198,6 +200,77 @@ static struct acpi_20_waet *construct_wa return waet; } +extern uint32_t ptdev_bdf[PT_DEV_MAX_NR]; +extern uint32_t ptdev_nr; +extern uint32_t iommu_bdf; +static struct acpi_40_ivrs* construct_ivrs(void) +{ + struct acpi_40_ivrs *ivrs; + uint64_t mmio; + struct ivrs_ivhd_block *ivhd; + struct ivrs_ivhd_device *dev_entry; + struct xen_hvm_param p; + + if (ptdev_nr == 0) return NULL; + + ivrs = mem_alloc(sizeof(*ivrs), 16); + if (!ivrs) return NULL; + + memset(ivrs, 0, sizeof(*ivrs)); + + /* initialize acpi header */ + ivrs->header.signature = IVRS_SIGNATURE; + ivrs->header.revision = IVRS_REVISION; + fixed_strcpy(ivrs->header.oem_id, ACPI_OEM_ID); + fixed_strcpy(ivrs->header.oem_table_id, ACPI_OEM_TABLE_ID); + + ivrs->header.oem_revision = ACPI_OEM_REVISION; + ivrs->header.creator_id = ACPI_CREATOR_ID; + ivrs->header.creator_revision = ACPI_CREATOR_REVISION; + + ivrs->header.length = sizeof(*ivrs); + + /* initialize IVHD Block */ + ivhd = &ivrs->ivhd_block; + ivrs->iv_info = (IVRS_VASIZE << 15) | (IVRS_PASIZE << 8) | + (IVRS_GVASIZE << 5); + + ivhd->type = IVHD_BLOCK_TYPE; + ivhd->flags = IVHD_FLAG_PPRSUP | IVHD_FLAG_IOTLBSUP; + ivhd->devid = iommu_bdf; + ivhd->cap_offset = IOMMU_CAP_OFFSET; + + /*reserve 32K IOMMU MMIO space */ + 
mmio = virt_to_phys(mem_alloc(0x8000, 0x1000)); + if (!mmio) return NULL; + + p.domid = DOMID_SELF; + p.index = HVM_PARAM_IOMMU_BASE; + p.value = mmio; + + /* Return non-zero if IOMMUv2 hardware is not avaliable */ + if ( hypercall_hvm_op(HVMOP_set_param, &p) ) + return NULL; + + ivhd->iommu_base_addr = mmio; + ivhd->reserved = IVHD_EFR_IASUP | IVHD_EFR_GTSUP; + + /* Build IVHD device entries */ + dev_entry = ivrs->ivhd_device; + for ( int i = 0; i < ptdev_nr; i++ ) + { + dev_entry[i].type = IVHD_SELECT_4_BYTE; + dev_entry[i].dev_id = ptdev_bdf[i]; + dev_entry[i].flags = 0; + } + + ivhd->length = sizeof(*ivhd) + sizeof(*dev_entry) * PT_DEV_MAX_NR; + set_checksum(ivrs, offsetof(struct acpi_header, checksum), + ivrs->header.length); + + return ivrs; +} + static int construct_secondary_tables(unsigned long *table_ptrs, struct acpi_info *info) { @@ -206,6 +279,7 @@ static int construct_secondary_tables(un struct acpi_20_hpet *hpet; struct acpi_20_waet *waet; struct acpi_20_tcpa *tcpa; + struct acpi_40_ivrs *ivrs; unsigned char *ssdt; static const uint16_t tis_signature[] = {0x0001, 0x0001, 0x0001}; uint16_t *tis_hdr; @@ -293,6 +367,13 @@ static int construct_secondary_tables(un } } + if ( hvm_info->iommu_enabled ) + { + ivrs = construct_ivrs(); + if ( ivrs != NULL ) + table_ptrs[nr_tables++] = (unsigned long)ivrs; + } + table_ptrs[nr_tables] = 0; return nr_tables; } diff -r b70cf1dcb110 -r bb76ba2457e6 tools/firmware/hvmloader/pci.c --- a/tools/firmware/hvmloader/pci.c Thu Dec 22 16:56:45 2011 +0100 +++ b/tools/firmware/hvmloader/pci.c Thu Dec 22 16:56:48 2011 +0100 @@ -34,11 +34,17 @@ unsigned long pci_mem_end = PCI_MEM_END; enum virtual_vga virtual_vga = VGA_none; unsigned long igd_opregion_pgbase = 0; +/* support up to 32 passthrough devices */ +#define PT_DEV_MAX_NR 32 +uint32_t ptdev_bdf[PT_DEV_MAX_NR]; +uint32_t ptdev_nr; +uint32_t iommu_bdf; + void pci_setup(void) { uint32_t base, devfn, bar_reg, bar_data, bar_sz, cmd, mmio_total = 0; uint32_t vga_devfn = 256; - uint16_t class, vendor_id, device_id; + uint16_t class, vendor_id, device_id, sub_vendor_id; unsigned int bar, pin, link, isa_irq; /* Resources assignable to PCI devices via BARs. */ @@ -72,12 +78,34 @@ void pci_setup(void) class = pci_readw(devfn, PCI_CLASS_DEVICE); vendor_id = pci_readw(devfn, PCI_VENDOR_ID); device_id = pci_readw(devfn, PCI_DEVICE_ID); + sub_vendor_id = pci_readw(devfn, PCI_SUBSYSTEM_VENDOR_ID); + if ( (vendor_id == 0xffff) && (device_id == 0xffff) ) continue; ASSERT((devfn != PCI_ISA_DEVFN) || ((vendor_id == 0x8086) && (device_id == 0x7000))); + /* Found amd iommu device. */ + if ( class == 0x0806 && vendor_id == 0x1022 ) + { + iommu_bdf = devfn; + continue; + } + /* IVRS: Detecting passthrough devices. + * sub_vendor_id != citrix && sub_vendor_id != qemu */ + if ( sub_vendor_id != 0x5853 && sub_vendor_id != 0x1af4 ) + { + /* found amd iommu device */ + if ( ptdev_nr < PT_DEV_MAX_NR ) + { + ptdev_bdf[ptdev_nr] = devfn; + ptdev_nr ++; + } + else + printf("Number of passthru devices > PT_DEV_MAX_NR \n"); + } + switch ( class ) { case 0x0300: diff -r b70cf1dcb110 -r bb76ba2457e6 xen/include/public/hvm/hvm_info_table.h --- a/xen/include/public/hvm/hvm_info_table.h Thu Dec 22 16:56:45 2011 +0100 +++ b/xen/include/public/hvm/hvm_info_table.h Thu Dec 22 16:56:48 2011 +0100 @@ -67,6 +67,9 @@ struct hvm_info_table { /* Bitmap of which CPUs are online at boot time. */ uint8_t vcpu_online[(HVM_MAX_VCPUS + 7)/8]; + + /* guest iommu enabled */ + uint8_t iommu_enabled; }; #endif /* __XEN_PUBLIC_HVM_HVM_INFO_TABLE_H__ */
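The set_checksum() call at the end follows the standard ACPI rule: the checksum byte is chosen so that all bytes of the table sum to zero modulo 256. hvmloader has its own implementation; a self-contained sketch of the idea:

    #include <stddef.h>
    #include <stdint.h>

    static void set_checksum(void *table, size_t checksum_offset,
                             size_t length)
    {
        uint8_t *p = table;
        uint8_t sum = 0;
        size_t i;

        p[checksum_offset] = 0;
        for ( i = 0; i < length; i++ )
            sum += p[i];

        /* (256 - sum) mod 256: the whole table now sums to zero. */
        p[checksum_offset] = -sum;
    }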
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569412 -3600 # Node ID 00bdad5dfb7f5efd0b1cc54622b3ccb1b1b1b16f # Parent bb76ba2457e629d668c08e57f2168b13bdfb7930 libxc: add wrappers for new hypercalls Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r bb76ba2457e6 -r 00bdad5dfb7f tools/libxc/xc_domain.c --- a/tools/libxc/xc_domain.c Thu Dec 22 16:56:48 2011 +0100 +++ b/tools/libxc/xc_domain.c Thu Dec 22 16:56:52 2011 +0100 @@ -1352,6 +1352,55 @@ int xc_domain_bind_pt_isa_irq( PT_IRQ_TYPE_ISA, 0, 0, 0, machine_irq)); } +int xc_domain_update_iommu_msi( + xc_interface *xch, + uint32_t domid, + uint8_t vector, + uint8_t dest, + uint8_t dest_mode, + uint8_t delivery_mode, + uint8_t trig_mode) +{ + int rc; + DECLARE_DOMCTL; + xen_domctl_guest_iommu_op_t * iommu_op; + + domctl.cmd = XEN_DOMCTL_guest_iommu_op; + domctl.domain = (domid_t)domid; + + iommu_op = &(domctl.u.guest_iommu_op); + iommu_op->op = XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI; + iommu_op->u.msi.vector = vector; + iommu_op->u.msi.dest = dest; + iommu_op->u.msi.dest_mode = dest_mode; + iommu_op->u.msi.delivery_mode = delivery_mode; + iommu_op->u.msi.trig_mode = trig_mode; + + rc = do_domctl(xch, &domctl); + return rc; +} + +int xc_domain_bind_pt_bdf(xc_interface *xch, + uint32_t domid, + uint32_t gbdf, + uint32_t mbdf) +{ + int rc; + DECLARE_DOMCTL; + xen_domctl_guest_iommu_op_t * guest_op; + + domctl.cmd = XEN_DOMCTL_guest_iommu_op; + domctl.domain = (domid_t)domid; + + guest_op = &(domctl.u.guest_iommu_op); + guest_op->op = XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF; + guest_op->u.bdf_bind.g_bdf = gbdf; + guest_op->u.bdf_bind.m_bdf = mbdf; + + rc = do_domctl(xch, &domctl); + return rc; +} + int xc_domain_memory_mapping( xc_interface *xch, uint32_t domid, diff -r bb76ba2457e6 -r 00bdad5dfb7f tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h Thu Dec 22 16:56:48 2011 +0100 +++ b/tools/libxc/xenctrl.h Thu Dec 22 16:56:52 2011 +0100 @@ -1697,6 +1697,19 @@ int xc_domain_bind_pt_isa_irq(xc_interfa uint32_t domid, uint8_t machine_irq); +int xc_domain_bind_pt_bdf(xc_interface *xch, + uint32_t domid, + uint32_t gbdf, + uint32_t mbdf); + +int xc_domain_update_iommu_msi(xc_interface *xch, + uint32_t domid, + uint8_t vector, + uint8_t dest, + uint8_t dest_mode, + uint8_t delivery_mode, + uint8_t trig_mode); + int xc_domain_set_machine_address_size(xc_interface *xch, uint32_t domid, unsigned int width);
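Putting the two wrappers together, a management tool assigning a device to an IOMMU-enabled guest might do something like the following (values are placeholders; the MSI fields qemu really passes come from the guest driver's programming of the virtual IOMMU's MSI capability):

    #include <xenctrl.h>

    int setup_guest_iommu(xc_interface *xch, uint32_t domid,
                          uint32_t guest_bdf, uint32_t machine_bdf,
                          uint8_t ppr_vector)
    {
        int rc;

        /* Map the guest-visible BDF onto the physical device... */
        rc = xc_domain_bind_pt_bdf(xch, domid, guest_bdf, machine_bdf);
        if ( rc )
            return rc;

        /* ...and tell Xen which vector to use when injecting the
         * guest IOMMU's interrupt on PPR log events. */
        return xc_domain_update_iommu_msi(xch, domid, ppr_vector,
                                          0 /* dest */,
                                          0 /* dest_mode: physical */,
                                          0 /* delivery_mode: fixed */,
                                          0 /* trig_mode: edge */);
    }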
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 14 of 16] libxl: bind virtual bdf to physical bdf after device assignment
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569415 -3600 # Node ID 7ac92c11a11a4dbdf85ef503445128be861cb400 # Parent 00bdad5dfb7f5efd0b1cc54622b3ccb1b1b1b16f libxl: bind virtual bdf to physical bdf after device assignment Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 00bdad5dfb7f -r 7ac92c11a11a tools/libxl/libxl_pci.c --- a/tools/libxl/libxl_pci.c Thu Dec 22 16:56:52 2011 +0100 +++ b/tools/libxl/libxl_pci.c Thu Dec 22 16:56:55 2011 +0100 @@ -735,6 +735,13 @@ out: LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_assign_device failed"); return ERROR_FAIL; } + if (LIBXL__DOMAIN_IS_TYPE(gc, domid, HVM)) { + rc = xc_domain_bind_pt_bdf(ctx->xch, domid, pcidev->vdevfn, pcidev_encode_bdf(pcidev)); + if ( rc ) { + LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_bind_pt_bdf failed"); + return ERROR_FAIL; + } + } } if (!starting)
Wei Wang
2011-Dec-23 11:29 UTC
[PATCH 15 of 16] libxl: Introduce a new guest config file parameter
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569419 -3600 # Node ID 8ea73cd1a367b318f72026a02629c774d24f999f # Parent 7ac92c11a11a4dbdf85ef503445128be861cb400 libxl: Introduce a new guest config file parameter Use iommu = {1,0} to enable or disable guest iommu emulation. Default value is 0. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 7ac92c11a11a -r 8ea73cd1a367 tools/libxl/libxl_create.c --- a/tools/libxl/libxl_create.c Thu Dec 22 16:56:55 2011 +0100 +++ b/tools/libxl/libxl_create.c Thu Dec 22 16:56:59 2011 +0100 @@ -99,6 +99,7 @@ int libxl_init_build_info(libxl_ctx *ctx b_info->u.hvm.vpt_align = 1; b_info->u.hvm.timer_mode = 1; b_info->u.hvm.nested_hvm = 0; + b_info->u.hvm.iommu = 0; break; case LIBXL_DOMAIN_TYPE_PV: b_info->u.pv.slack_memkb = 8 * 1024; diff -r 7ac92c11a11a -r 8ea73cd1a367 tools/libxl/libxl_dom.c --- a/tools/libxl/libxl_dom.c Thu Dec 22 16:56:55 2011 +0100 +++ b/tools/libxl/libxl_dom.c Thu Dec 22 16:56:59 2011 +0100 @@ -266,6 +266,10 @@ static int hvm_build_set_params(xc_inter va_hvm = (struct hvm_info_table *)(va_map + HVM_INFO_OFFSET); va_hvm->apic_mode = info->u.hvm.apic; va_hvm->nr_vcpus = info->max_vcpus; + + if ( info->u.hvm.iommu ) + va_hvm->iommu_enabled = 1; + memcpy(va_hvm->vcpu_online, &info->cur_vcpus, sizeof(info->cur_vcpus)); for (i = 0, sum = 0; i < va_hvm->length; i++) sum += ((uint8_t *) va_hvm)[i]; diff -r 7ac92c11a11a -r 8ea73cd1a367 tools/libxl/libxl_types.idl --- a/tools/libxl/libxl_types.idl Thu Dec 22 16:56:55 2011 +0100 +++ b/tools/libxl/libxl_types.idl Thu Dec 22 16:56:59 2011 +0100 @@ -184,6 +184,7 @@ libxl_domain_build_info = Struct("domain ("vpt_align", bool), ("timer_mode", integer), ("nested_hvm", bool), + ("iommu", bool), ])), ("pv", Struct(None, [("kernel", libxl_file_reference), ("slack_memkb", uint32), diff -r 7ac92c11a11a -r 8ea73cd1a367 tools/libxl/xl_cmdimpl.c --- a/tools/libxl/xl_cmdimpl.c Thu Dec 22 16:56:55 2011 +0100 +++ b/tools/libxl/xl_cmdimpl.c Thu Dec 22 16:56:59 2011 +0100 @@ -360,6 +360,7 @@ static void printf_info(int domid, printf("\t\t\t(vpt_align %d)\n", b_info->u.hvm.vpt_align); printf("\t\t\t(timer_mode %d)\n", b_info->u.hvm.timer_mode); printf("\t\t\t(nestedhvm %d)\n", b_info->u.hvm.nested_hvm); + printf("\t\t\t(iommu %d)\n", b_info->u.hvm.iommu); printf("\t\t\t(device_model %s)\n", dm_info->device_model ? : "default"); printf("\t\t\t(videoram %d)\n", dm_info->videoram); @@ -764,6 +765,8 @@ static void parse_config_data(const char b_info->u.hvm.timer_mode = l; if (!xlu_cfg_get_long (config, "nestedhvm", &l, 0)) b_info->u.hvm.nested_hvm = l; + if (!xlu_cfg_get_long (config, "iommu", &l, 0)) + b_info->u.hvm.iommu = l; break; case LIBXL_DOMAIN_TYPE_PV: {
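With this applied, enabling the emulated IOMMU from a guest config file is a single line, e.g. (fragment of an HVM guest config, everything else elided):

    builder = "hvm"
    iommu   = 1     # expose an emulated AMD IOMMU to this guest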
# HG changeset patch # User Wei Wang <wei.wang2@amd.com> # Date 1324569422 -3600 # Node ID 1aad019305172c4abca8c2c9d7e632fe3291fead # Parent 8ea73cd1a367b318f72026a02629c774d24f999f libxl: pass iommu parameter to qemu-dm. When iomm = 0, virtual iommu device will be disabled. Signed-off-by: Wei Wang <wei.wang2@amd.com> diff -r 8ea73cd1a367 -r 1aad01930517 tools/libxl/libxl_create.c --- a/tools/libxl/libxl_create.c Thu Dec 22 16:56:59 2011 +0100 +++ b/tools/libxl/libxl_create.c Thu Dec 22 16:57:02 2011 +0100 @@ -559,7 +559,7 @@ static int do_domain_create(libxl__gc *g libxl_device_vkb_dispose(&vkb); dm_info->domid = domid; - ret = libxl__create_device_model(gc, dm_info, + ret = libxl__create_device_model(gc, dm_info, &d_config->b_info, d_config->disks, d_config->num_disks, d_config->vifs, d_config->num_vifs, &dm_starting); diff -r 8ea73cd1a367 -r 1aad01930517 tools/libxl/libxl_dm.c --- a/tools/libxl/libxl_dm.c Thu Dec 22 16:56:59 2011 +0100 +++ b/tools/libxl/libxl_dm.c Thu Dec 22 16:57:02 2011 +0100 @@ -84,6 +84,7 @@ static const char *libxl__domain_bios(li static char ** libxl__build_device_model_args_old(libxl__gc *gc, const char *dm, libxl_device_model_info *info, + libxl_domain_build_info *b_info, libxl_device_disk *disks, int num_disks, libxl_device_nic *vifs, int num_vifs) { @@ -199,6 +200,9 @@ static char ** libxl__build_device_model if (info->gfx_passthru) { flexarray_append(dm_args, "-gfx_passthru"); } + if (b_info && b_info->u.hvm.iommu) { + flexarray_append(dm_args, "-iommu"); + } } if (info->saved_state) { flexarray_vappend(dm_args, "-loadvm", info->saved_state, NULL); @@ -237,6 +241,7 @@ static const char *qemu_disk_format_stri static char ** libxl__build_device_model_args_new(libxl__gc *gc, const char *dm, libxl_device_model_info *info, + libxl_domain_build_info *b_info, libxl_device_disk *disks, int num_disks, libxl_device_nic *vifs, int num_vifs) { @@ -409,6 +414,9 @@ static char ** libxl__build_device_model if (info->gfx_passthru) { flexarray_append(dm_args, "-gfx_passthru"); } + if (b_info && b_info->u.hvm.iommu) { + flexarray_append(dm_args, "-iommu"); + } } if (info->saved_state) { /* This file descriptor is meant to be used by QEMU */ @@ -500,6 +508,7 @@ static char ** libxl__build_device_model static char ** libxl__build_device_model_args(libxl__gc *gc, const char *dm, libxl_device_model_info *info, + libxl_domain_build_info *b_info, libxl_device_disk *disks, int num_disks, libxl_device_nic *vifs, int num_vifs) { @@ -507,11 +516,11 @@ static char ** libxl__build_device_model switch (info->device_model_version) { case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: - return libxl__build_device_model_args_old(gc, dm, info, + return libxl__build_device_model_args_old(gc, dm, info, b_info, disks, num_disks, vifs, num_vifs); case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: - return libxl__build_device_model_args_new(gc, dm, info, + return libxl__build_device_model_args_new(gc, dm, info, b_info, disks, num_disks, vifs, num_vifs); default: @@ -619,7 +628,7 @@ static int libxl__create_stubdom(libxl__ goto out; } - args = libxl__build_device_model_args(gc, "stubdom-dm", info, + args = libxl__build_device_model_args(gc, "stubdom-dm", info, NULL, disks, num_disks, vifs, num_vifs); if (!args) { @@ -782,6 +791,7 @@ out: int libxl__create_device_model(libxl__gc *gc, libxl_device_model_info *info, + libxl_domain_build_info *b_info, libxl_device_disk *disks, int num_disks, libxl_device_nic *vifs, int num_vifs, libxl__spawner_starting **starting_r) @@ -820,7 +830,7 @@ int 
libxl__create_device_model(libxl__gc rc = ERROR_FAIL; goto out; } - args = libxl__build_device_model_args(gc, dm, info, disks, num_disks, + args = libxl__build_device_model_args(gc, dm, info, b_info, disks, num_disks, vifs, num_vifs); if (!args) { rc = ERROR_FAIL; @@ -1046,7 +1056,7 @@ int libxl__create_xenpv_qemu(libxl__gc * libxl__spawner_starting **starting_r) { libxl__build_xenpv_qemu_args(gc, domid, vfb, info); - libxl__create_device_model(gc, info, NULL, 0, NULL, 0, starting_r); + libxl__create_device_model(gc, info, NULL, NULL, 0, NULL, 0, starting_r); return 0; } diff -r 8ea73cd1a367 -r 1aad01930517 tools/libxl/libxl_internal.h --- a/tools/libxl/libxl_internal.h Thu Dec 22 16:56:59 2011 +0100 +++ b/tools/libxl/libxl_internal.h Thu Dec 22 16:57:02 2011 +0100 @@ -466,6 +466,7 @@ _hidden const char *libxl__domain_device libxl_device_model_info *info); _hidden int libxl__create_device_model(libxl__gc *gc, libxl_device_model_info *info, + libxl_domain_build_info *b_info, libxl_device_disk *disk, int num_disks, libxl_device_nic *vifs, int num_vifs, libxl__spawner_starting **starting_r);
On Fri, 2011-12-23 at 11:29 +0000, Wei Wang wrote:
> diff -r b70cf1dcb110 -r bb76ba2457e6
> xen/include/public/hvm/hvm_info_table.h
> --- a/xen/include/public/hvm/hvm_info_table.h  Thu Dec 22 16:56:45
> 2011 +0100
> +++ b/xen/include/public/hvm/hvm_info_table.h  Thu Dec 22 16:56:48
> 2011 +0100
> @@ -67,6 +67,9 @@ struct hvm_info_table {
>
>      /* Bitmap of which CPUs are online at boot time. */
>      uint8_t vcpu_online[(HVM_MAX_VCPUS + 7)/8];
> +
> +    /* guest iommu enabled */
> +    uint8_t iommu_enabled;
>  };
>
>  #endif /* __XEN_PUBLIC_HVM_HVM_INFO_TABLE_H__ */

Please can we avoid adding new things to this struct and use xenstore to
pass new configuration items instead. See Paul Durrant's recent patches
to make s3 optional via the platform/acpi_s3 node.

Ian.
Ian Campbell
2011-Dec-23 11:37 UTC
Re: [PATCH 14 of 16] libxl: bind virtual bdf to physical bdf after device assignment
On Fri, 2011-12-23 at 11:29 +0000, Wei Wang wrote:
> # HG changeset patch
> # User Wei Wang <wei.wang2@amd.com>
> # Date 1324569415 -3600
> # Node ID 7ac92c11a11a4dbdf85ef503445128be861cb400
> # Parent 00bdad5dfb7f5efd0b1cc54622b3ccb1b1b1b16f
> libxl: bind virtual bdf to physical bdf after device assignment
>
> Signed-off-by: Wei Wang <wei.wang2@amd.com>
>
> diff -r 00bdad5dfb7f -r 7ac92c11a11a tools/libxl/libxl_pci.c
> --- a/tools/libxl/libxl_pci.c Thu Dec 22 16:56:52 2011 +0100
> +++ b/tools/libxl/libxl_pci.c Thu Dec 22 16:56:55 2011 +0100
> @@ -735,6 +735,13 @@ out:
>          LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_assign_device failed");
>          return ERROR_FAIL;
>      }
> +	if (LIBXL__DOMAIN_IS_TYPE(gc, domid, HVM)) {
> +	    rc = xc_domain_bind_pt_bdf(ctx->xch, domid, pcidev->vdevfn, pcidev_encode_bdf(pcidev));
> +	    if ( rc ) {
> +	        LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc, "xc_domain_bind_pt_bdf failed");
> +	        return ERROR_FAIL;
> +	    }
> +	}

Indentation is wrong here, or you've used hard tabs where you need 4
spaces.

Ian.
On Friday 23 December 2011 12:36:32 Ian Campbell wrote:
> On Fri, 2011-12-23 at 11:29 +0000, Wei Wang wrote:
> > diff -r b70cf1dcb110 -r bb76ba2457e6
> > xen/include/public/hvm/hvm_info_table.h
> > --- a/xen/include/public/hvm/hvm_info_table.h  Thu Dec 22 16:56:45
> > 2011 +0100
> > +++ b/xen/include/public/hvm/hvm_info_table.h  Thu Dec 22 16:56:48
> > 2011 +0100
> > @@ -67,6 +67,9 @@ struct hvm_info_table {
> >
> >      /* Bitmap of which CPUs are online at boot time. */
> >      uint8_t vcpu_online[(HVM_MAX_VCPUS + 7)/8];
> > +
> > +    /* guest iommu enabled */
> > +    uint8_t iommu_enabled;
> >  };
> >
> >  #endif /* __XEN_PUBLIC_HVM_HVM_INFO_TABLE_H__ */
>
> Please can we avoid adding new things to this struct and use xenstore to
> pass new configuration items instead. See Paul Durrant's recent patches
> to make s3 optional via the platform/acpi_s3 node.
> Ian.

Good point. I will use xenstore_read in the next version.

Thanks,
Wei
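For the next spin, the hvmloader side might look roughly like the sketch below. This is illustrative only: both the node name "platform/iommu" and the two-argument xenstore_read() helper (taking a default value, as used by the acpi_s3 patches Ian mentions) are assumptions, not part of this series:

    /* Hypothetical replacement for hvm_info->iommu_enabled. */
    static int guest_iommu_enabled(void)
    {
        /* Default to "0" if the toolstack wrote no node. */
        const char *s = xenstore_read("platform/iommu", "0");

        return s && s[0] == '1';
    }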
Wei Wang2
2011-Dec-23 11:56 UTC
Re: [PATCH 14 of 16] libxl: bind virtual bdf to physical bdf after device assignment
On Friday 23 December 2011 12:37:17 Ian Campbell wrote:
> On Fri, 2011-12-23 at 11:29 +0000, Wei Wang wrote:
> > # HG changeset patch
> > # User Wei Wang <wei.wang2@amd.com>
> > # Date 1324569415 -3600
> > # Node ID 7ac92c11a11a4dbdf85ef503445128be861cb400
> > # Parent 00bdad5dfb7f5efd0b1cc54622b3ccb1b1b1b16f
> > libxl: bind virtual bdf to physical bdf after device assignment
> >
> > Signed-off-by: Wei Wang <wei.wang2@amd.com>
> >
> > diff -r 00bdad5dfb7f -r 7ac92c11a11a tools/libxl/libxl_pci.c
> > --- a/tools/libxl/libxl_pci.c Thu Dec 22 16:56:52 2011 +0100
> > +++ b/tools/libxl/libxl_pci.c Thu Dec 22 16:56:55 2011 +0100
> > @@ -735,6 +735,13 @@ out:
> >          LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
> > "xc_assign_device failed"); return ERROR_FAIL;
> >      }
> > +	if (LIBXL__DOMAIN_IS_TYPE(gc, domid, HVM)) {
> > +	    rc = xc_domain_bind_pt_bdf(ctx->xch, domid, pcidev->vdevfn,
> > pcidev_encode_bdf(pcidev)); +	    if ( rc ) {
> > +	        LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, rc,
> > "xc_domain_bind_pt_bdf failed"); +	        return ERROR_FAIL;
> > +	    }
> > +	}
>
> Indentation is wrong here, or you've used hard tabs where you need 4
> spaces.

Yes, indeed. I will fix it.

Thanks,
Wei

> Ian.
Jan Beulich
2012-Jan-02 11:29 UTC
Re: [PATCH 11 of 16] amd iommu: Add a new flag to indication iommuv2 feature enabled or not
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote: > # HG changeset patch > # User Wei Wang <wei.wang2@amd.com> > # Date 1324569405 -3600 > # Node ID b70cf1dcb110f79546f3efac3201a08d70c8b96e > # Parent 30b1f434160d989be5e0bb6c6956bb7e3985db59 > amd iommu: Add a new flag to indication iommuv2 feature enabled or not. > Hypercalls should return early on non-iommuv2 systems.Shouldn''t this be done right away when those functions/hypercalls get introduced?> Signed-off-by: Wei Wang <wei.wang2@amd.com> > > diff -r 30b1f434160d -r b70cf1dcb110 xen/drivers/passthrough/amd/iommu_guest.c > --- a/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:41 2011 +0100 > +++ b/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:45 2011 > +0100 > @@ -48,6 +48,8 @@ > (reg)->hi = (val >> 32) & 0xFFFFFFFF; \ > } while(0) > > +extern bool_t iommuv2_enabled;No new extern declarations in .c files, please. Jan> + > static unsigned int machine_bdf(struct domain *d, uint16_t guest_bdf) > { > struct pci_dev *pdev; > @@ -819,6 +821,9 @@ int guest_iommu_set_base(struct domain * > p2m_type_t t; > struct guest_iommu *iommu = domain_iommu(d); > > + if ( !is_hvm_domain(d) || !iommuv2_enabled ) > + return 1; > + > iommu->mmio_base = base; > base >>= PAGE_SHIFT; > > @@ -878,7 +883,7 @@ int guest_iommu_init(struct domain* d) > struct guest_iommu *iommu; > struct hvm_iommu *hd = domain_hvm_iommu(d); > > - if ( !is_hvm_domain(d) ) > + if ( !is_hvm_domain(d) || !iommuv2_enabled ) > return 0; > > iommu = xzalloc(struct guest_iommu); > @@ -902,13 +907,11 @@ int guest_iommu_init(struct domain* d) > > void guest_iommu_destroy(struct domain *d) > { > - struct guest_iommu *iommu; > + struct guest_iommu *iommu = domain_iommu(d); > > - if ( !is_hvm_domain(d) ) > + if ( !is_hvm_domain(d) || !iommuv2_enabled ) > return; > > - iommu = domain_iommu(d); > - > tasklet_kill(&iommu->cmd_buffer_tasklet); > xfree(iommu); > > @@ -919,6 +922,9 @@ static int guest_iommu_mmio_range(struct > { > struct guest_iommu *iommu = vcpu_iommu(v); > > + if ( !iommu_found() || !iommuv2_enabled ) > + return 0; > + > return ( addr >= iommu->mmio_base && > addr < (iommu->mmio_base + IOMMU_MMIO_SIZE) ); > } > @@ -935,7 +941,7 @@ int iommu_bind_bdf(struct domain* d, uin > struct pci_dev *pdev; > int ret = -ENODEV; > > - if ( !iommu_found() ) > + if ( !iommu_found() || !iommuv2_enabled ) > return 0; > > spin_lock(&pcidevs_lock); > @@ -960,7 +966,7 @@ void iommu_set_msi(struct domain* d, uin > { > struct guest_iommu *iommu = domain_iommu(d); > > - if ( !iommu_found() ) > + if ( !iommu_found() || !iommuv2_enabled ) > return; > > iommu->msi.vector = vector; > diff -r 30b1f434160d -r b70cf1dcb110 xen/drivers/passthrough/amd/iommu_init.c > --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:41 2011 +0100 > +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:45 2011 +0100 > @@ -36,6 +36,7 @@ unsigned short ivrs_bdf_entries; > static struct radix_tree_root ivrs_maps; > struct list_head amd_iommu_head; > struct table_struct device_table; > +bool_t iommuv2_enabled; > > static int iommu_has_ht_flag(struct amd_iommu *iommu, u8 mask) > { > @@ -759,6 +760,10 @@ static void enable_iommu(struct amd_iomm > amd_iommu_flush_all_caches(iommu); > > iommu->enabled = 1; > + > + if ( iommu->features ) > + iommuv2_enabled = 1; > + > spin_unlock_irqrestore(&iommu->lock, flags); > > }
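Concretely, the cleanup Jan asks for is to declare the flag once in a header that both files include, rather than repeating an extern in the .c file; a sketch (the choice of header is a plausible guess, not fixed by this series):

    /* xen/include/asm-x86/hvm/svm/amd-iommu-proto.h */
    extern bool_t iommuv2_enabled;

    /* xen/drivers/passthrough/amd/iommu_guest.c then only needs: */
    #include <asm/hvm/svm/amd-iommu-proto.h>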
Jan Beulich
2012-Jan-02 11:36 UTC
Re: [PATCH 10 of 16] amd iommu: Enable FC bit in iommu host level PTE
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> # HG changeset patch
> # User Wei Wang <wei.wang2@amd.com>
> # Date 1324569401 -3600
> # Node ID 30b1f434160d989be5e0bb6c6956bb7e3985db59
> # Parent dd808bdd61c581b041d5b7e816b18674de51da6f
> amd iommu: Enable FC bit in iommu host level PTE
>
> Signed-off-by: Wei Wang <wei.wang2@amd.com>
>
> diff -r dd808bdd61c5 -r 30b1f434160d xen/drivers/passthrough/amd/iommu_map.c
> --- a/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:38 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:41 2011 +0100
> @@ -83,6 +83,11 @@ static bool_t set_iommu_pde_present(u32
>      set_field_in_reg_u32(ir, entry,
>                           IOMMU_PDE_IO_READ_PERMISSION_MASK,
>                           IOMMU_PDE_IO_READ_PERMISSION_SHIFT, &entry);
> +
> +    /* IOMMUv2 needs FC bit enabled */

This comment suggests that the patches prior to that aren't consistent.
Is this really a proper standalone patch, or is the word "needs" too
strict, or should it really be moved ahead in the series?

> +    if ( next_level == IOMMU_PAGING_MODE_LEVEL_0 )
> +        set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry,
> +                             IOMMU_PTE_FC_MASK, IOMMU_PTE_FC_SHIFT, &entry);

This is being done no matter whether it actually is a v2 IOMMU that you
deal with here - if that's correct, the comment above should be adjusted
accordingly.

Jan

>      pde[1] = entry;
>
>      /* mark next level as 'present' */
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> # HG changeset patch
> # User Wei Wang <wei.wang2@amd.com>
> # Date 1324569398 -3600
> # Node ID dd808bdd61c581b041d5b7e816b18674de51da6f
> # Parent 2329dad2786f6f20ea69c9609ab60208cad6fca9
> amd iommu: add iommu mmio handler.

This could be part of patch 3, or (perhaps better) the respective bits
should get moved here instead of adding them as dead code there.

Jan

> Signed-off-by: Wei Wang <wei.wang2@amd.com>
>
> diff -r 2329dad2786f -r dd808bdd61c5 xen/arch/x86/hvm/intercept.c
> --- a/xen/arch/x86/hvm/intercept.c Thu Dec 22 16:56:35 2011 +0100
> +++ b/xen/arch/x86/hvm/intercept.c Thu Dec 22 16:56:38 2011 +0100
> @@ -38,7 +38,8 @@ hvm_mmio_handlers[HVM_MMIO_HANDLER_NR]
>      &hpet_mmio_handler,
>      &vlapic_mmio_handler,
>      &vioapic_mmio_handler,
> -    &msixtbl_mmio_handler
> +    &msixtbl_mmio_handler,
> +    &iommu_mmio_handler
>  };
>
>  static int hvm_mmio_access(struct vcpu *v,
> diff -r 2329dad2786f -r dd808bdd61c5 xen/include/asm-x86/hvm/io.h
> --- a/xen/include/asm-x86/hvm/io.h Thu Dec 22 16:56:35 2011 +0100
> +++ b/xen/include/asm-x86/hvm/io.h Thu Dec 22 16:56:38 2011 +0100
> @@ -69,8 +69,9 @@ extern const struct hvm_mmio_handler hpe
>  extern const struct hvm_mmio_handler vlapic_mmio_handler;
>  extern const struct hvm_mmio_handler vioapic_mmio_handler;
>  extern const struct hvm_mmio_handler msixtbl_mmio_handler;
> +extern const struct hvm_mmio_handler iommu_mmio_handler;
>
> -#define HVM_MMIO_HANDLER_NR 4
> +#define HVM_MMIO_HANDLER_NR 5
>
>  int hvm_io_intercept(ioreq_t *p, int type);
>  void register_io_handler(
Jan Beulich
2012-Jan-02 11:41 UTC
Re: [PATCH 08 of 16] amd iommu: Add a hypercall for hvmloader
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> # HG changeset patch
> # User Wei Wang <wei.wang2@amd.com>
> # Date 1324569395 -3600
> # Node ID 2329dad2786f6f20ea69c9609ab60208cad6fca9
> # Parent 40d61d0390ec930cf53ce5cbf91faada8c7192bd
> amd iommu: Add a hypercall for hvmloader.
> IOMMU MMIO base address is dynamically allocated by firmware.
> This patch allows hvmloader to notify hypervisor where the
> iommu mmio pages are.
>
> Signed-off-by: Wei Wang <wei.wang2@amd.com>
>
> diff -r 40d61d0390ec -r 2329dad2786f xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c Thu Dec 22 16:56:32 2011 +0100
> +++ b/xen/arch/x86/hvm/hvm.c Thu Dec 22 16:56:35 2011 +0100
> @@ -65,6 +65,7 @@
>  #include <public/memory.h>
>  #include <asm/mem_event.h>
>  #include <public/mem_event.h>
> +#include <asm/hvm/svm/amd-iommu-proto.h>
>
>  bool_t __read_mostly hvm_enabled;
>
> @@ -3677,6 +3678,9 @@ long do_hvm_op(unsigned long op, XEN_GUE
>          case HVM_PARAM_BUFIOREQ_EVTCHN:
>              rc = -EINVAL;
>              break;
> +        case HVM_PARAM_IOMMU_BASE:
> +            rc = guest_iommu_set_base(d, a.value);
> +            break;
>          }
>
>          if ( rc == 0 )
> diff -r 40d61d0390ec -r 2329dad2786f xen/include/public/hvm/params.h
> --- a/xen/include/public/hvm/params.h Thu Dec 22 16:56:32 2011 +0100
> +++ b/xen/include/public/hvm/params.h Thu Dec 22 16:56:35 2011 +0100
> @@ -142,6 +142,10 @@
>  /* Boolean: Enable nestedhvm (hvm only) */
>  #define HVM_PARAM_NESTEDHVM 24
>
> -#define HVM_NR_PARAMS 27
> +#ifndef __ia64__

As with the domctl definitions, I fail to see why this should be
excluded for IA64 - the general concept, even if not currently
implemented, is valid for any architecture that could potentially have
IOMMUs.

Jan

> +#define HVM_PARAM_IOMMU_BASE 27
> +#endif
> +
> +#define HVM_NR_PARAMS 28
>
>  #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
Jan Beulich
2012-Jan-02 12:15 UTC
Re: [PATCH 07 of 16] amd iommu: Add 2 hypercalls for libxc
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> # HG changeset patch
> # User Wei Wang <wei.wang2@amd.com>
> # Date 1324569392 -3600
> # Node ID 40d61d0390ec930cf53ce5cbf91faada8c7192bd
> # Parent bf9f21ad9a0ec245b409f3862a5a36c0e070f333
> amd iommu: Add 2 hypercalls for libxc
>
> iommu_set_msi: used by qemu to inform the hypervisor of the iommu vector
> number in guest space. The hypervisor needs this vector to inject an msi
> into the guest when PPR logging happens.
>
> iommu_bind_bdf: used by xl to bind a guest bdf number to a machine bdf
> number. The IOMMU emulation code receives commands from the guest iommu
> driver and forwards them to the host iommu, but the virtual device id
> from the guest must be converted into the physical one before sending
> to the real hardware.
>
> Signed-off-by: Wei Wang <wei.wang2@amd.com>
>
> diff -r bf9f21ad9a0e -r 40d61d0390ec xen/drivers/passthrough/amd/iommu_guest.c
> --- a/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:28 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_guest.c Thu Dec 22 16:56:32 2011 +0100
> @@ -50,12 +50,27 @@
>
>  static unsigned int machine_bdf(struct domain *d, uint16_t guest_bdf)
>  {
> -    return guest_bdf;
> +    struct pci_dev *pdev;
> +    uint16_t mbdf = 0;
> +
> +    for_each_pdev( d, pdev )
> +    {
> +        if ( pdev->gbdf == guest_bdf )
> +        {
> +            mbdf = PCI_BDF2(pdev->bus, pdev->devfn);
> +            break;
> +        }
> +    }
> +    return mbdf;
>  }
>
>  static uint16_t guest_bdf(struct domain *d, uint16_t machine_bdf)
>  {
> -    return machine_bdf;
> +    struct pci_dev *pdev;
> +
> +    pdev = pci_get_pdev_by_domain(d, 0, PCI_BUS(machine_bdf),
> +                                  PCI_DEVFN2(machine_bdf));
> +    return pdev->gbdf;
>  }
>
>  static inline struct guest_iommu *domain_iommu(struct domain *d)
> @@ -913,3 +928,43 @@ const struct hvm_mmio_handler iommu_mmio
>      .read_handler = guest_iommu_mmio_read,
>      .write_handler = guest_iommu_mmio_write
>  };
> +
> +/* iommu hypercall handler */
> +int iommu_bind_bdf(struct domain* d, uint16_t gbdf, uint16_t mbdf)
> +{
> +    struct pci_dev *pdev;
> +    int ret = -ENODEV;
> +
> +    if ( !iommu_found() )
> +        return 0;
> +
> +    spin_lock(&pcidevs_lock);
> +
> +    for_each_pdev( d, pdev )
> +    {
> +        if ( (pdev->bus != PCI_BUS(mbdf) ) ||
> +             (pdev->devfn != PCI_DEVFN2(mbdf)) )
> +            continue;
> +
> +        pdev->gbdf = gbdf;
> +        ret = 0;
> +    }
> +
> +    spin_unlock(&pcidevs_lock);
> +    return ret;
> +}
> +
> +void iommu_set_msi(struct domain* d, uint16_t vector, uint16_t dest,
> +                   uint16_t dest_mode, uint16_t delivery_mode,
> +                   uint16_t trig_mode)
> +{
> +    struct guest_iommu *iommu = domain_iommu(d);
> +
> +    if ( !iommu_found() )
> +        return;
> +
> +    iommu->msi.vector = vector;
> +    iommu->msi.dest = dest;
> +    iommu->msi.dest_mode = dest_mode;
> +    iommu->msi.trig_mode = trig_mode;
> +}
> diff -r bf9f21ad9a0e -r 40d61d0390ec xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c Thu Dec 22 16:56:28 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c Thu Dec 22 16:56:32 2011 +0100
> @@ -648,6 +648,40 @@ int iommu_do_domctl(
>          put_domain(d);
>          break;
>
> +#ifndef __ia64__

While I understand your reasons for putting the #ifdef here, I would
like to see it removed in favor of a proper abstraction through vectors
in struct iommu_ops.

> +    case XEN_DOMCTL_guest_iommu_op:
> +    {
> +        xen_domctl_guest_iommu_op_t * guest_op;
> +
> +        if ( unlikely((d = get_domain_by_id(domctl->domain)) == NULL) )
> +        {
> +            gdprintk(XENLOG_ERR,
> +                     "XEN_DOMCTL_guest_iommu_op: get_domain_by_id() failed\n");
> +            ret = -EINVAL;
> +            break;
> +        }
> +
> +        guest_op = &(domctl->u.guest_iommu_op);
> +        switch ( guest_op->op )
> +        {
> +        case XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI:
> +            iommu_set_msi(d, guest_op->u.msi.vector,
> +                          guest_op->u.msi.dest,
> +                          guest_op->u.msi.dest_mode,
> +                          guest_op->u.msi.delivery_mode,
> +                          guest_op->u.msi.trig_mode);
> +            ret = 0;
> +            break;
> +        case XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF:
> +            ret = iommu_bind_bdf(d, guest_op->u.bdf_bind.g_bdf,
> +                                 guest_op->u.bdf_bind.m_bdf);
> +            break;
> +        }
> +        put_domain(d);
> +        break;
> +    }
> +#endif
> +
>      default:
>          ret = -ENOSYS;
>          break;
>
> diff -r bf9f21ad9a0e -r 40d61d0390ec xen/include/public/domctl.h
> --- a/xen/include/public/domctl.h Thu Dec 22 16:56:28 2011 +0100
> +++ b/xen/include/public/domctl.h Thu Dec 22 16:56:32 2011 +0100
> @@ -848,6 +848,29 @@ struct xen_domctl_set_access_required {
>  typedef struct xen_domctl_set_access_required xen_domctl_set_access_required_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_set_access_required_t);
>
> +/* Support for guest iommu emulation */
> +struct xen_domctl_guest_iommu_op {
> +    /* XEN_DOMCTL_GUEST_IOMMU_OP_* */
> +#define XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI 0
> +#define XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF 1
> +    uint8_t op;
> +    union {
> +        struct iommu_msi {
> +            uint8_t vector;
> +            uint8_t dest;
> +            uint8_t dest_mode;
> +            uint8_t delivery_mode;
> +            uint8_t trig_mode;
> +        } msi;
> +        struct bdf_bind {
> +            uint32_t g_bdf;
> +            uint32_t m_bdf;
> +        } bdf_bind;
> +    } u;
> +};
> +typedef struct xen_domctl_guest_iommu_op xen_domctl_guest_iommu_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_guest_iommu_op_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain 1
> @@ -912,6 +935,7 @@ struct xen_domctl {
>  #define XEN_DOMCTL_getvcpuextstate 63
>  #define XEN_DOMCTL_set_access_required 64
>  #define XEN_DOMCTL_audit_p2m 65
> +#define XEN_DOMCTL_guest_iommu_op 66
>  #define XEN_DOMCTL_gdbsx_guestmemio 1000
>  #define XEN_DOMCTL_gdbsx_pausevcpu 1001
>  #define XEN_DOMCTL_gdbsx_unpausevcpu 1002
> @@ -960,6 +984,7 @@ struct xen_domctl {
>      struct xen_domctl_debug_op debug_op;
>      struct xen_domctl_mem_event_op mem_event_op;
>      struct xen_domctl_mem_sharing_op mem_sharing_op;
> +    struct xen_domctl_guest_iommu_op guest_iommu_op;
>  #if defined(__i386__) || defined(__x86_64__)
>      struct xen_domctl_cpuid cpuid;
>      struct xen_domctl_vcpuextstate vcpuextstate;
> diff -r bf9f21ad9a0e -r 40d61d0390ec xen/include/xen/iommu.h
> --- a/xen/include/xen/iommu.h Thu Dec 22 16:56:28 2011 +0100
> +++ b/xen/include/xen/iommu.h Thu Dec 22 16:56:32 2011 +0100
> @@ -164,6 +164,14 @@ int iommu_do_domctl(struct xen_domctl *,
>  void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
>  void iommu_iotlb_flush_all(struct domain *d);
>
> +#ifndef __ia64__
> +/* Only used by AMD IOMMU */

Even without said abstraction, the conditional is pointless here, and
the comment would seem unnecessary to me.

> +void iommu_set_msi(struct domain* d, uint16_t vector, uint16_t dest,
> +                   uint16_t dest_mode, uint16_t delivery_mode,
> +                   uint16_t trig_mode);
> +int iommu_bind_bdf(struct domain* d, uint16_t gbdf, uint16_t mbdf);
> +#endif
> +
>  /*
>   * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
>   * avoid unecessary iotlb_flush in the low level IOMMU code.
> diff -r bf9f21ad9a0e -r 40d61d0390ec xen/include/xen/pci.h
> --- a/xen/include/xen/pci.h Thu Dec 22 16:56:28 2011 +0100
> +++ b/xen/include/xen/pci.h Thu Dec 22 16:56:32 2011 +0100
> @@ -63,6 +63,9 @@ struct pci_dev {
>      const u8 devfn;
>      struct pci_dev_info info;
>      u64 vf_rlen[6];
> +
> +    /* used by amd iomm to represent bdf value in guest space */
> +    u16 gbdf;

For one, this would better be placed immediately after devfn (on 64-bit
there's a 32-bit hole there). Second, what about the segment number - is
that one always going to be the physical one? Or always zero?

Finally, please correct the typo ("iommu") and remove "amd" as once
again the concept ought to be generic.

Jan

>  };
>
>  #define for_each_pdev(domain, pdev) \
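For illustration, the iommu_ops abstraction Jan asks for could look
roughly like the sketch below. The struct name, the vector names and the
dispatch helper are assumptions made for the example, not code from the
series.

/* Hypothetical sketch: reach the guest-iommu operations through
 * per-vendor vectors instead of an architecture #ifdef in common code.
 * Vendors that do not implement them simply leave the vectors NULL. */
struct guest_iommu_ops {
    void (*set_msi)(struct domain *d, uint16_t vector, uint16_t dest,
                    uint16_t dest_mode, uint16_t delivery_mode,
                    uint16_t trig_mode);
    int (*bind_bdf)(struct domain *d, uint16_t gbdf, uint16_t mbdf);
};

static int guest_iommu_domctl(struct domain *d,
                              const xen_domctl_guest_iommu_op_t *op,
                              const struct guest_iommu_ops *ops)
{
    switch ( op->op )
    {
    case XEN_DOMCTL_GUEST_IOMMU_OP_SET_MSI:
        if ( !ops->set_msi )
            return -ENOSYS;
        ops->set_msi(d, op->u.msi.vector, op->u.msi.dest,
                     op->u.msi.dest_mode, op->u.msi.delivery_mode,
                     op->u.msi.trig_mode);
        return 0;
    case XEN_DOMCTL_GUEST_IOMMU_OP_BIND_BDF:
        if ( !ops->bind_bdf )
            return -ENOSYS;
        return ops->bind_bdf(d, op->u.bdf_bind.g_bdf,
                             op->u.bdf_bind.m_bdf);
    default:
        return -ENOSYS;
    }
}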
Jan Beulich
2012-Jan-02 12:44 UTC
Re: [PATCH 01 of 16] amd iommu: Refactoring iommu ring buffer definition
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> --- a/xen/include/asm-x86/amd-iommu.h Wed Dec 21 10:47:30 2011 +0000
> +++ b/xen/include/asm-x86/amd-iommu.h Thu Dec 22 16:56:10 2011 +0100
> ...
> +struct ring_buffer {
> +    void *buffer;
> +    unsigned long entries;
> +    unsigned long alloc_size;
> +    unsigned long entry_size;

I haven't been able to spot a real use of this field (throughout the
patch series).

Jan

> +    uint32_t tail;
> +    uint32_t head;
> +};
> +
>  typedef struct iommu_cap {
>      uint32_t header; /* offset 00h */
>      uint32_t base_low; /* offset 04h */
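For context, entry_size would only earn its keep if entry addresses were
computed generically from the struct, along the lines of the sketch
below (not code from the series); the patches instead use per-buffer
constants such as IOMMU_CMD_BUFFER_ENTRY_SIZE, which is why the field
ends up unused.

/* Hypothetical generic accessor that would justify keeping entry_size:
 * entry i lives at buffer + i * entry_size. */
static void *ring_buffer_entry(const struct ring_buffer *rb, uint32_t idx)
{
    return (char *)rb->buffer + (unsigned long)idx * rb->entry_size;
}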
Jan Beulich
2012-Jan-02 12:52 UTC
Re: [PATCH 02 of 16] amd iommu: Introduces new helper functions to simplify iommu bitwise operations
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> --- a/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:10 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c Thu Dec 22 16:56:14 2011 +0100
> @@ -33,10 +33,8 @@ static int queue_iommu_command(struct am
>      if ( ++tail == iommu->cmd_buffer.entries )
>          tail = 0;
>
> -    head = get_field_from_reg_u32(readl(iommu->mmio_base +
> -                                        IOMMU_CMD_BUFFER_HEAD_OFFSET),
> -                                  IOMMU_CMD_BUFFER_HEAD_MASK,
> -                                  IOMMU_CMD_BUFFER_HEAD_SHIFT);
> +    head = iommu_get_rb_pointer(readl(iommu->mmio_base +
> +                                      IOMMU_CMD_BUFFER_HEAD_OFFSET));
>      if ( head != tail )
>      {
>          cmd_buffer = (u32 *)(iommu->cmd_buffer.buffer +
> @@ -55,11 +53,9 @@ static int queue_iommu_command(struct am
>
>  static void commit_iommu_command_buffer(struct amd_iommu *iommu)
>  {
> -    u32 tail;
> +    u32 tail = 0;
>
> -    set_field_in_reg_u32(iommu->cmd_buffer.tail, 0,
> -                         IOMMU_CMD_BUFFER_TAIL_MASK,
> -                         IOMMU_CMD_BUFFER_TAIL_SHIFT, &tail);
> +    iommu_set_rb_pointer(&tail, iommu->cmd_buffer.tail);
>      writel(tail, iommu->mmio_base+IOMMU_CMD_BUFFER_TAIL_OFFSET);
>  }
>

Afaict with these two changes IOMMU_CMD_BUFFER_{HEAD,TAIL}_{MASK,SHIFT}
are unused, so please remove them.

> --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:10 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:14 2011 +0100
> @@ -106,21 +106,21 @@ static void register_iommu_dev_table_in_
>      u64 addr_64, addr_lo, addr_hi;
>      u32 entry;
>
> +    ASSERT( iommu->dev_table.buffer );
> +
>      addr_64 = (u64)virt_to_maddr(iommu->dev_table.buffer);
>      addr_lo = addr_64 & DMA_32BIT_MASK;
>      addr_hi = addr_64 >> 32;
>
> -    set_field_in_reg_u32((u32)addr_lo >> PAGE_SHIFT, 0,
> -                         IOMMU_DEV_TABLE_BASE_LOW_MASK,
> -                         IOMMU_DEV_TABLE_BASE_LOW_SHIFT, &entry);
> +    entry = 0;
> +    iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT);

While I didn't check, I suspect the same applies to the old definitions
here and further down in this same file.

> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Thu Dec 22 16:56:10 2011 +0100
> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Thu Dec 22 16:56:14 2011 +0100
> @@ -191,5 +191,85 @@ static inline int iommu_has_feature(stru
>          return 0;
>      return !!(iommu->features & (1U << bit));
>  }
> +/* access tail or head pointer of ring buffer */
> +#define IOMMU_RING_BUFFER_PTR_MASK 0x0007FFF0
> +#define IOMMU_RING_BUFFER_PTR_SHIFT 4

I suppose these (and the others below) really belong into
amd-iommu-defs.h?

Jan

> +static inline uint32_t iommu_get_rb_pointer(uint32_t reg)
> +{
> +    return get_field_from_reg_u32(reg, IOMMU_RING_BUFFER_PTR_MASK,
> +                                  IOMMU_RING_BUFFER_PTR_SHIFT);
> +}
> +
> +static inline void iommu_set_rb_pointer(uint32_t *reg, uint32_t val)
> +{
> +    set_field_in_reg_u32(val, *reg, IOMMU_RING_BUFFER_PTR_MASK,
> +                         IOMMU_RING_BUFFER_PTR_SHIFT, reg);
> +}
> +
> +/* access device field from iommu cmd */
> +#define IOMMU_CMD_DEVICE_ID_MASK 0x0000FFFF
> +#define IOMMU_CMD_DEVICE_ID_SHIFT 0
> +
> +static inline uint16_t iommu_get_devid_from_cmd(uint32_t cmd)
> +{
> +    return get_field_from_reg_u32(cmd, IOMMU_CMD_DEVICE_ID_MASK,
> +                                  IOMMU_CMD_DEVICE_ID_SHIFT);
> +}
> +
> +static inline void iommu_set_devid_to_cmd(uint32_t *cmd, uint16_t id)
> +{
> +    set_field_in_reg_u32(id, *cmd, IOMMU_CMD_DEVICE_ID_MASK,
> +                         IOMMU_CMD_DEVICE_ID_SHIFT, cmd);
> +}
> +
> +/* access address field from iommu cmd */
> +#define IOMMU_CMD_ADDR_LOW_MASK 0xFFFFF000
> +#define IOMMU_CMD_ADDR_LOW_SHIFT 12
> +#define IOMMU_CMD_ADDR_HIGH_MASK 0xFFFFFFFF
> +#define IOMMU_CMD_ADDR_HIGH_SHIFT 0
> +
> +static inline uint32_t iommu_get_addr_lo_from_cmd(uint32_t cmd)
> +{
> +    return get_field_from_reg_u32(cmd, IOMMU_CMD_ADDR_LOW_MASK,
> +                                  IOMMU_CMD_ADDR_LOW_SHIFT);
> +}
> +
> +static inline uint32_t iommu_get_addr_hi_from_cmd(uint32_t cmd)
> +{
> +    return get_field_from_reg_u32(cmd, IOMMU_CMD_ADDR_HIGH_MASK,
> +                                  IOMMU_CMD_ADDR_HIGH_SHIFT);
> +}
> +
> +#define iommu_get_devid_from_event iommu_get_devid_from_cmd
> +
> +/* access iommu base addresses from mmio regs */
> +#define IOMMU_REG_BASE_ADDR_BASE_LOW_MASK 0xFFFFF000
> +#define IOMMU_REG_BASE_ADDR_LOW_SHIFT 12
> +#define IOMMU_REG_BASE_ADDR_HIGH_MASK 0x000FFFFF
> +#define IOMMU_REG_BASE_ADDR_HIGH_SHIFT 0
> +
> +static inline void iommu_set_addr_lo_to_reg(uint32_t *reg, uint32_t addr)
> +{
> +    set_field_in_reg_u32(addr, *reg, IOMMU_REG_BASE_ADDR_BASE_LOW_MASK,
> +                         IOMMU_REG_BASE_ADDR_LOW_SHIFT, reg);
> +}
> +
> +static inline void iommu_set_addr_hi_to_reg(uint32_t *reg, uint32_t addr)
> +{
> +    set_field_in_reg_u32(addr, *reg, IOMMU_REG_BASE_ADDR_HIGH_MASK,
> +                         IOMMU_REG_BASE_ADDR_HIGH_SHIFT, reg);
> +}
> +
> +static inline uint32_t iommu_get_addr_lo_from_reg(uint32_t reg)
> +{
> +    return get_field_from_reg_u32(reg, IOMMU_REG_BASE_ADDR_BASE_LOW_MASK,
> +                                  IOMMU_REG_BASE_ADDR_LOW_SHIFT);
> +}
> +
> +static inline uint32_t iommu_get_addr_hi_from_reg(uint32_t reg)
> +{
> +    return get_field_from_reg_u32(reg, IOMMU_REG_BASE_ADDR_HIGH_MASK,
> +                                  IOMMU_REG_BASE_ADDR_HIGH_SHIFT);
> +}
>
> #endif /* _ASM_X86_64_AMD_IOMMU_PROTO_H */
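For readers following along: the two underlying helpers are plain
mask-and-shift operations, so the new accessors behave like the minimal
model below (my paraphrase of the semantics, not the Xen source).

/* Minimal model of the helpers the new accessors wrap. */
static inline uint32_t get_field_from_reg_u32(uint32_t reg, uint32_t mask,
                                              uint32_t shift)
{
    return (reg & mask) >> shift;
}

static inline void set_field_in_reg_u32(uint32_t field, uint32_t reg,
                                        uint32_t mask, uint32_t shift,
                                        uint32_t *out)
{
    *out = (reg & ~mask) | ((field << shift) & mask);
}

/* Example: with IOMMU_RING_BUFFER_PTR_MASK 0x0007FFF0 and shift 4,
 * iommu_get_rb_pointer(0x00012340) yields 0x1234. */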
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> # HG changeset patch
> # User Wei Wang <wei.wang2@amd.com>
> # Date 1324569381 -3600
> # Node ID 33f88c76776c318eea74b8fc1ba467389407ad57
> # Parent 07f338ae663242ba9080f1ab84298894783da3e2
> amd iommu: Enable ppr log.
> IOMMUv2 writes peripheral page service request (PPR) records into the
> ppr log to report DMA page requests from ATS devices to the OS.
>
> Signed-off-by: Wei Wang <wei.wang2@amd.com>
>
> diff -r 07f338ae6632 -r 33f88c76776c xen/drivers/passthrough/amd/iommu_init.c
> --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:17 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:21 2011 +0100
> @@ -178,6 +178,34 @@ static void register_iommu_event_log_in_
>      writel(entry, iommu->mmio_base+IOMMU_EVENT_LOG_BASE_HIGH_OFFSET);
>  }
>
> +static void register_iommu_ppr_log_in_mmio_space(struct amd_iommu *iommu)
> +{
> +    u64 addr_64, addr_lo, addr_hi;

The latter two should be u32.

> +    u32 power_of2_entries;
> +    u32 entry;
> +
> +    ASSERT ( iommu->ppr_log.buffer );
> +
> +    addr_64 = (u64)virt_to_maddr(iommu->ppr_log.buffer);

Pointless cast?

> +    addr_lo = addr_64 & DMA_32BIT_MASK;

DMA_32BIT_MASK is clearly not meant to be used here (and if addr_lo was
of type u32, a plain assignment would be all that's needed here).

Jan

> +    addr_hi = addr_64 >> 32;
> +
> +    entry = 0;
> +    iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT);
> +    writel(entry, iommu->mmio_base + IOMMU_PPR_LOG_BASE_LOW_OFFSET);
> +
> +    power_of2_entries = get_order_from_bytes(iommu->ppr_log.alloc_size) +
> +                        IOMMU_PPR_LOG_POWER_OF2_ENTRIES_PER_PAGE;
> +
> +    entry = 0;
> +    iommu_set_addr_hi_to_reg(&entry, addr_hi);
> +    set_field_in_reg_u32(power_of2_entries, entry,
> +                         IOMMU_PPR_LOG_LENGTH_MASK,
> +                         IOMMU_PPR_LOG_LENGTH_SHIFT, &entry);
> +    writel(entry, iommu->mmio_base + IOMMU_PPR_LOG_BASE_HIGH_OFFSET);
> +}
> +
> +
>  static void set_iommu_translation_control(struct amd_iommu *iommu,
>                                            int enable)
>  {
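Folding Jan's three points in (u32 locals, no cast, no DMA_32BIT_MASK),
the function could be reworked roughly as follows; this is a sketch of
one possible v3, not what was eventually committed.

static void register_iommu_ppr_log_in_mmio_space(struct amd_iommu *iommu)
{
    u64 addr_64;
    u32 addr_lo, addr_hi, power_of2_entries, entry;

    ASSERT( iommu->ppr_log.buffer );

    addr_64 = virt_to_maddr(iommu->ppr_log.buffer);
    addr_lo = addr_64;          /* plain assignment keeps the low 32 bits */
    addr_hi = addr_64 >> 32;

    entry = 0;
    iommu_set_addr_lo_to_reg(&entry, addr_lo >> PAGE_SHIFT);
    writel(entry, iommu->mmio_base + IOMMU_PPR_LOG_BASE_LOW_OFFSET);

    power_of2_entries = get_order_from_bytes(iommu->ppr_log.alloc_size) +
                        IOMMU_PPR_LOG_POWER_OF2_ENTRIES_PER_PAGE;

    entry = 0;
    iommu_set_addr_hi_to_reg(&entry, addr_hi);
    set_field_in_reg_u32(power_of2_entries, entry,
                         IOMMU_PPR_LOG_LENGTH_MASK,
                         IOMMU_PPR_LOG_LENGTH_SHIFT, &entry);
    writel(entry, iommu->mmio_base + IOMMU_PPR_LOG_BASE_HIGH_OFFSET);
}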
Jan Beulich
2012-Jan-02 13:13 UTC
Re: [PATCH 06 of 16] amd iommu: add ppr log processing into iommu interrupt handling
>>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:25 2011 +0100
> +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:28 2011 +0100
> ...
>      spin_unlock_irqrestore(&iommu->lock, flags);
>  }
>
> +void parse_ppr_log_entry(struct amd_iommu *iommu, u32 entry[])

static?

> +{
> +
> +    u16 device_id;
> ...

Jan
Wei Wang2
2012-Jan-03 08:58 UTC
Re: [PATCH 06 of 16] amd iommu: add ppr log processing into iommu interrupt handling
Hi Jan,

Thanks for the review. I will fix all of them in the next try.

Wei

On Monday 02 January 2012 14:13:00 Jan Beulich wrote:
> >>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> >
> > --- a/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:25 2011 +0100
> > +++ b/xen/drivers/passthrough/amd/iommu_init.c Thu Dec 22 16:56:28 2011 +0100
> > ...
> >      spin_unlock_irqrestore(&iommu->lock, flags);
> >  }
> >
> > +void parse_ppr_log_entry(struct amd_iommu *iommu, u32 entry[])
>
> static?
>
> > +{
> > +
> > +    u16 device_id;
> > ...
>
> Jan
Wei Wang2
2012-Jan-03 10:05 UTC
Re: [PATCH 10 of 16] amd iommu: Enable FC bit in iommu host level PTE
On Monday 02 January 2012 12:36:08 Jan Beulich wrote:
> >>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> >
> > # HG changeset patch
> > # User Wei Wang <wei.wang2@amd.com>
> > # Date 1324569401 -3600
> > # Node ID 30b1f434160d989be5e0bb6c6956bb7e3985db59
> > # Parent dd808bdd61c581b041d5b7e816b18674de51da6f
> > amd iommu: Enable FC bit in iommu host level PTE
> >
> > Signed-off-by: Wei Wang <wei.wang2@amd.com>
> >
> > diff -r dd808bdd61c5 -r 30b1f434160d xen/drivers/passthrough/amd/iommu_map.c
> > --- a/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:38 2011 +0100
> > +++ b/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:41 2011 +0100
> > @@ -83,6 +83,11 @@ static bool_t set_iommu_pde_present(u32
> >      set_field_in_reg_u32(ir, entry,
> >                           IOMMU_PDE_IO_READ_PERMISSION_MASK,
> >                           IOMMU_PDE_IO_READ_PERMISSION_SHIFT, &entry);
> > +
> > +    /* IOMMUv2 needs FC bit enabled */
>
> This comment suggests that the patches prior to that aren't consistent.
> Is this really a proper standalone patch, or is the word "needs" too
> strict, or should it really be moved ahead in the series?
>
> > +    if ( next_level == IOMMU_PAGING_MODE_LEVEL_0 )
> > +        set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry,
> > +                             IOMMU_PTE_FC_MASK, IOMMU_PTE_FC_SHIFT, &entry);
>
> This is being done no matter whether it actually is a v2 IOMMU that
> you deal with here - if that's correct, the comment above should be
> adjusted accordingly.

This bit forces the pci-defined no-snoop bit to be cleared. It helps to
solve potential issues in ATS devices with early drivers. I did not see
any breakage on legacy devices and iommuv1 with FC = 1. But if you like,
I could make this v2-only or change the comment a bit.

Thanks,
Wei

> Jan
>
> >      pde[1] = entry;
> >
> >      /* mark next level as 'present' */
Jan Beulich
2012-Jan-03 10:12 UTC
Re: [PATCH 10 of 16] amd iommu: Enable FC bit in iommu host level PTE
>>> On 03.01.12 at 11:05, Wei Wang2 <wei.wang2@amd.com> wrote:
> On Monday 02 January 2012 12:36:08 Jan Beulich wrote:
>> >>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
>> >
>> > # HG changeset patch
>> > # User Wei Wang <wei.wang2@amd.com>
>> > # Date 1324569401 -3600
>> > # Node ID 30b1f434160d989be5e0bb6c6956bb7e3985db59
>> > # Parent dd808bdd61c581b041d5b7e816b18674de51da6f
>> > amd iommu: Enable FC bit in iommu host level PTE
>> >
>> > Signed-off-by: Wei Wang <wei.wang2@amd.com>
>> >
>> > diff -r dd808bdd61c5 -r 30b1f434160d xen/drivers/passthrough/amd/iommu_map.c
>> > --- a/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:38 2011 +0100
>> > +++ b/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:41 2011 +0100
>> > @@ -83,6 +83,11 @@ static bool_t set_iommu_pde_present(u32
>> >      set_field_in_reg_u32(ir, entry,
>> >                           IOMMU_PDE_IO_READ_PERMISSION_MASK,
>> >                           IOMMU_PDE_IO_READ_PERMISSION_SHIFT, &entry);
>> > +
>> > +    /* IOMMUv2 needs FC bit enabled */
>>
>> This comment suggests that the patches prior to that aren't consistent.
>> Is this really a proper standalone patch, or is the word "needs" too
>> strict, or should it really be moved ahead in the series?
>>
>> > +    if ( next_level == IOMMU_PAGING_MODE_LEVEL_0 )
>> > +        set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry,
>> > +                             IOMMU_PTE_FC_MASK, IOMMU_PTE_FC_SHIFT, &entry);
>>
>> This is being done no matter whether it actually is a v2 IOMMU that
>> you deal with here - if that's correct, the comment above should be
>> adjusted accordingly.
>
> This bit forces the pci-defined no-snoop bit to be cleared. It helps to
> solve potential issues in ATS devices with early drivers. I did not see
> any breakage on legacy devices and iommuv1 with FC = 1. But if you like,
> I could make this v2-only or change the comment a bit.

Whatever is the most appropriate thing to do here (you definitely
know better than I do).

Jan
Wei Wang2
2012-Jan-03 10:37 UTC
Re: [PATCH 10 of 16] amd iommu: Enable FC bit in iommu host level PTE
On Tuesday 03 January 2012 11:12:35 Jan Beulich wrote:
> >>> On 03.01.12 at 11:05, Wei Wang2 <wei.wang2@amd.com> wrote:
> >
> > On Monday 02 January 2012 12:36:08 Jan Beulich wrote:
> >> >>> On 23.12.11 at 12:29, Wei Wang <wei.wang2@amd.com> wrote:
> >> >
> >> > # HG changeset patch
> >> > # User Wei Wang <wei.wang2@amd.com>
> >> > # Date 1324569401 -3600
> >> > # Node ID 30b1f434160d989be5e0bb6c6956bb7e3985db59
> >> > # Parent dd808bdd61c581b041d5b7e816b18674de51da6f
> >> > amd iommu: Enable FC bit in iommu host level PTE
> >> >
> >> > Signed-off-by: Wei Wang <wei.wang2@amd.com>
> >> >
> >> > diff -r dd808bdd61c5 -r 30b1f434160d xen/drivers/passthrough/amd/iommu_map.c
> >> > --- a/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:38 2011 +0100
> >> > +++ b/xen/drivers/passthrough/amd/iommu_map.c Thu Dec 22 16:56:41 2011 +0100
> >> > @@ -83,6 +83,11 @@ static bool_t set_iommu_pde_present(u32
> >> >      set_field_in_reg_u32(ir, entry,
> >> >                           IOMMU_PDE_IO_READ_PERMISSION_MASK,
> >> >                           IOMMU_PDE_IO_READ_PERMISSION_SHIFT, &entry);
> >> > +
> >> > +    /* IOMMUv2 needs FC bit enabled */
> >>
> >> This comment suggests that the patches prior to that aren't consistent.
> >> Is this really a proper standalone patch, or is the word "needs" too
> >> strict, or should it really be moved ahead in the series?
> >>
> >> > +    if ( next_level == IOMMU_PAGING_MODE_LEVEL_0 )
> >> > +        set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry,
> >> > +                             IOMMU_PTE_FC_MASK, IOMMU_PTE_FC_SHIFT, &entry);
> >>
> >> This is being done no matter whether it actually is a v2 IOMMU that
> >> you deal with here - if that's correct, the comment above should be
> >> adjusted accordingly.
> >
> > This bit forces the pci-defined no-snoop bit to be cleared. It helps
> > to solve potential issues in ATS devices with early drivers. I did
> > not see any breakage on legacy devices and iommuv1 with FC = 1. But
> > if you like, I could make this v2-only or change the comment a bit.
>
> Whatever is the most appropriate thing to do here (you definitely
> know better than I do).

I intend to keep FC = 1 for both v1 and v2. This setup also aligns with
the Linux iommu driver. I will change the comment.

Thanks,
Wei

> Jan
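The agreed outcome - keep the bit set unconditionally and reword the
comment - might end up looking like this (hypothetical wording; the
final v3 text is not in this thread):

    /* Force Coherent: have the IOMMU clear the PCI-defined no-snoop
     * attribute on upstream accesses.  Set for v1 and v2 IOMMUs alike;
     * it avoids potential issues with ATS devices and early drivers,
     * showed no breakage on legacy devices, and matches what the Linux
     * AMD IOMMU driver does. */
    if ( next_level == IOMMU_PAGING_MODE_LEVEL_0 )
        set_field_in_reg_u32(IOMMU_CONTROL_ENABLED, entry,
                             IOMMU_PTE_FC_MASK, IOMMU_PTE_FC_SHIFT, &entry);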
Ian Jackson
2012-Jan-03 16:02 UTC
Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter
Wei Wang writes ("[PATCH 15 of 16] libxl: Introduce a new guest config file parameter"):> libxl: Introduce a new guest config file parameter > Use iommu = {1,0} to enable or disable guest iommu emulation. > Default value is 0.I''m not sure I like this name. It''s confusing because it''s not at first glance clear whether it refers to the host''s putative iommu, or some kind of provision to the guest. And it needs documentation, which I hope will answer my other question which is "what is this good for?" :-). Ian.
Ian Jackson
2012-Jan-03 16:03 UTC
Re: [PATCH 14 of 16] libxl: bind virtual bdf to physical bdf after device assignment
Wei Wang writes ("[PATCH 14 of 16] libxl: bind virtual bdf to physical bdf after device assignment"):> + rc = xc_domain_bind_pt_bdf(ctx->xch, domid, pcidev->vdevfn, pcidev_encode_bdf(pcidev));This and many of your other patches to libxl need rewrapping to fit within 75-80 columns, correct indentation, and correct the coding style to fit with the rest of the code. Ian.
Ian Jackson
2012-Jan-10 17:12 UTC
Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter [and 1 more messages]
Wei Wang writes ("[PATCH 14 of 14 V3] libxl: Introduce a new guest config file parameter"):> # HG changeset patch > # User Wei Wang <wei.wang2@amd.com> > # Date 1326213623 -3600 > # Node ID 39eb093ea89eeaa4dbff29439499f2a289291ff0 > # Parent 9e89b6485b6c91a8d563c46c47a8d768eee7d1f2 > libxl: Introduce a new guest config file parameter > Use iommu = {1,0} to enable or disable guest iommu emulation. > Default value is 0.Ian Jackson writes ("Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter"):> Wei Wang writes ("[PATCH 15 of 16] libxl: Introduce a new guest config file parameter"): > > libxl: Introduce a new guest config file parameter > > Use iommu = {1,0} to enable or disable guest iommu emulation. > > Default value is 0. > > I''m not sure I like this name. It''s confusing because it''s not at > first glance clear whether it refers to the host''s putative iommu, or > some kind of provision to the guest. > > And it needs documentation, which I hope will answer my other question > which is "what is this good for?" :-).Ian.
Wei Wang2
2012-Jan-11 10:20 UTC
Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter [and 1 more messages]
On Tuesday 10 January 2012 18:12:05 Ian Jackson wrote:
> Wei Wang writes ("[PATCH 14 of 14 V3] libxl: Introduce a new guest config file parameter"):
> > # HG changeset patch
> > # User Wei Wang <wei.wang2@amd.com>
> > # Date 1326213623 -3600
> > # Node ID 39eb093ea89eeaa4dbff29439499f2a289291ff0
> > # Parent 9e89b6485b6c91a8d563c46c47a8d768eee7d1f2
> > libxl: Introduce a new guest config file parameter
> > Use iommu = {1,0} to enable or disable guest iommu emulation.
> > Default value is 0.
>
> Ian Jackson writes ("Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter"):
> > Wei Wang writes ("[PATCH 15 of 16] libxl: Introduce a new guest config file parameter"):
> > > libxl: Introduce a new guest config file parameter
> > > Use iommu = {1,0} to enable or disable guest iommu emulation.
> > > Default value is 0.
> >
> > I'm not sure I like this name.  It's confusing because it's not at
> > first glance clear whether it refers to the host's putative iommu, or
> > some kind of provision to the guest.

How about guest_iommu = {1, 0} or ats_passthru = {1, 0}? Actually the
idea of the whole patch queue is to enable passthru for sophisticated
ATS devices (e.g. Tahiti, the latest AMD gfx/gpgpu), so that the device
can be used for general-purpose computations in the guest.

> > And it needs documentation, which I hope will answer my other question
> > which is "what is this good for?" :-).

Sure, I am working on the docs now. They will be sent together with the
next try, after I have collected enough comments on this version.

Thanks,
Wei

> Ian.
Ian Jackson
2012-Jan-23 13:59 UTC
Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter [and 1 more messages]
Wei Wang2 writes ("Re: [PATCH 15 of 16] libxl: Introduce a new guest config file parameter [and 1 more messages]"):
> How about guest_iommu = {1, 0} or ats_passthru = {1, 0}?

I think "guest_iommu" is more likely to be something people have heard
of.  Personally I'm afraid I have no idea what (an) "ATS" is ...

> Actually the idea of the whole patch queue is to enable passthru for
> sophisticated ATS devices (e.g. Tahiti, the latest AMD gfx/gpgpu), so
> that the device can be used for general-purpose computations in the
> guest.

Right.

Ian.
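Assuming the rename to guest_iommu goes through, a guest config
exercising the feature would presumably look like the fragment below.
This is hypothetical: the documented syntax was still pending at this
point in the thread, and the device BDF is a placeholder.

# xl HVM guest config fragment: emulate an IOMMU in the guest so a
# PRI/PASID-capable ATS device can be passed through.
guest_iommu = 1
pci = [ '01:00.0' ]   # placeholder BDF of the assigned ATS device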