Satoshi Uchida
2007-Jul-30 08:50 UTC
[Xen-devel] [PATCH][0/4][RFC][IOMGR] Virtual disk I/O management mechanism in Xen
This patch set provides a new function for disk I/O management. It controls the dispatching of I/O requests from virtual block devices (blkback/blktap) in order to protect or differentiate I/O performance among those devices. In this version, protection and differentiation are based on the number of dispatched requests.

This patch set includes the following two parts.

- A control framework for virtual I/O requests.
  This framework provides an interface for controlling virtual I/O requests in the blkback and blktap mechanisms. A virtual I/O request controller can be inserted and removed as a module.

- A simple virtual I/O request scheduler.
  This scheduler controls virtual I/O requests based on the number of requests per turn. It makes a blkback/blktap thread wait once its share for the turn is used up. A share is the number of requests per turn allocated to each virtual block device. Virtual I/O requests are therefore controlled in proportion to the shares of the blkback/blktap threads.

We made this function because Xen needs the ability to manage I/O resources. Xen has differentiated services for CPU and memory (and maybe network), but not for I/O. (There may currently be a control function based on CFQ in XenEnterprise.) So we developed an I/O control function for blkback/blktap. The I/O performance of each blkback/blktap thread is protected in proportion to its share among all threads, and is shielded from the influence of other domains' I/O.

We think there are two places where I/O can be controlled in Xen: the dispatching point in blkback/blktap, and the I/O scheduler within Linux. The former has the advantage of sitting inside the Xen architecture (the backend/frontend split), while the latter has the advantage of using an unmodified Linux I/O scheduler. Virtualization should be OS agnostic, so this patch set implements I/O control at the former place.

The first part is the control framework. Its aim is to let one control module control blkback, blktap, or both.
This makes it easy to develop and test management functions. The control framework is constructed as follows.

        iomgr === control module
        /   |
  blkback  blktap

"iomgr" is the core of the control framework; it connects a control module to blkback/blktap. In addition, it counts the total number of pending requests.

The second part is a simple scheduler, which is one implementation of a control module. Our scheduler controls the number of virtual I/O requests per turn. At present the backend driver runs in Domain 0, which is Linux, so I/O performance is affected by the Linux I/O schedulers. To differentiate I/O performance, the dispatching of I/O requests has to be held back. Our scheduler controls I/O requests based on the number of dispatched I/O requests. When a blkback/blktap thread has used up its share, it waits until all threads have used up their own shares. As an exception, when some threads still have share left but no thread has any requests, the turn moves to a new round and all threads get their maximum shares back. To decide this, our scheduler checks whether there are pending requests. We call this scheduler the "turn-based scheduler".

We think that control modules can be extended with other I/O control functions: for example, absolute control, throughput control, access control management and so on.

The procedure to enable our I/O management is as follows.

1. Enable the relevant config option (CONFIG_XEN_IOMGR) and the runtime-configurable kernel module (CONFIG_XEN_IOSCHED_TURN).
2. Build and boot Xen and this kernel.
3. Insert the control module into domain 0 (run "modprobe turn_iosched").
4. Configure the shares of the virtual block devices.

The interface for configuring the I/O request share of each virtual block device is exposed through sysfs. To check a share, read the entry with a command such as cat or less. To set a share, write to the entry, for example with echo and redirection.
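The read and write operations described above can also be sketched as two small helpers. This is only an illustration in Python and is not part of the patch set: the function names read_share and write_share are hypothetical, and the path of the sysfs entry is supplied by the caller.

```python
from pathlib import Path


def read_share(entry):
    """Return the integer held in a share entry (the equivalent of `cat entry`)."""
    return int(Path(entry).read_text().strip())


def write_share(entry, value):
    """Store a new share value (the equivalent of `echo value > entry`)."""
    Path(entry).write_text(f"{int(value)}\n")
```

Writing to a real sysfs entry in domain 0 would normally require root privileges.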
For example, for domain 1 with blktap device xvda:

/sys/devices/xen-backend/tap-1-51712/iomgr/max_cap
    shows the maximum share per turn. The default is 64. (The default can be changed by writing to /sys/module/turn_iosched/parameters/default_max_cap.) To configure the share of the virtual block device, write a value into this entry. For example:

        echo 128 > /sys/devices/xen-backend/tap-1-51712/iomgr/max_cap

/sys/devices/xen-backend/tap-1-51712/iomgr/req_cap
    shows the share remaining in the current turn. This entry is read-only.
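As a rough model of the turn-based scheduler described earlier in this mail, the following Python sketch simulates the dispatch rule in user space. It is not the kernel implementation: the Vbd class, the simulate function and the thread names are hypothetical. It also assumes one reading of the exception rule, namely that a new turn starts as soon as no thread has both remaining share and pending requests.

```python
from dataclasses import dataclass


@dataclass
class Vbd:
    """One blkback/blktap thread in the model (hypothetical class)."""
    name: str
    max_cap: int      # share per turn
    pending: int      # requests waiting to be dispatched
    req_cap: int = 0  # share remaining in the current turn


def simulate(vbds, budget):
    """Dispatch at most `budget` requests; return the count dispatched per thread."""
    if any(v.max_cap < 1 for v in vbds):
        raise ValueError("every share must be at least 1")
    done = {v.name: 0 for v in vbds}
    for v in vbds:
        v.req_cap = v.max_cap          # first turn: everyone gets a full share
    while budget > 0:
        # A thread may dispatch only if it has share left and work to do.
        runnable = [v for v in vbds if v.req_cap > 0 and v.pending > 0]
        if not runnable:
            if all(v.pending == 0 for v in vbds):
                break                  # nothing left to dispatch anywhere
            for v in vbds:             # assumed rule: start a new turn
                v.req_cap = v.max_cap
            continue
        for v in runnable:             # each runnable thread dispatches one request
            if budget == 0:
                break
            v.pending -= 1
            v.req_cap -= 1
            done[v.name] += 1
            budget -= 1
    return done


if __name__ == "__main__":
    vbds = [Vbd("tap-a", 128, 10_000), Vbd("tap-b", 64, 10_000)]
    print(simulate(vbds, 1920))  # {'tap-a': 1280, 'tap-b': 640}
```

With shares of 128 and 64 and both threads always busy, the dispatched requests come out in a 2:1 ratio, which is the proportional behaviour the scheduler aims for. A thread with share left but nothing to do does not hold up the others, because the model starts a new turn in that case.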