Satoshi UCHIDA
2008-Nov-12 08:15 UTC
[PATCH][RFC][12+2][v3] An expanded CFQ scheduler for cgroups
This patchset expands the traditional CFQ scheduler to support cgroups, and improves on the previous version. The improvements are as follows.

 * Modularize the new CFQ scheduler.
   The expanded CFQ scheduler is registered/unregistered as a new I/O
   elevator called "cfq-cgroups". Therefore the traditional CFQ
   scheduler, which does not handle cgroups, and the new CFQ scheduler,
   which does, can be used at the same time on different devices.

 * Allow parameters to be set per device.
   The expanded CFQ scheduler allows users to set parameters per
   device, so users can decide the share (priority) for each device.

--- Optional functions ---

 * Add a validation flag for 'think time'. (Opt-1 patch)
   CFQ shows poor scalability. One of its causes is the think time.
   The think time is used to improve I/O performance by handling
   queues with poor I/O as the IDLE class. However, when many tasks
   issue I/O requests, the think time of each task becomes long and
   all queues end up handled as the IDLE class. As a result, the
   dispatching of I/O requests is dispersed and I/O performance falls.
   The think time valid flag controls this think time judgment.

 * Add an ioprio class for cgroups. (Opt-2 patch)
   The previous expanded CFQ scheduler could not implement ioprio
   classes. This optional patch implements a prototype: it provides
   basic service tree control for the ioprio class of cgroups, but
   does not yet provide the preempt function, the completed function
   and so on.

1. Introduction.

This patchset introduces "Yet Another" I/O bandwidth controlling subsystem for cgroups based on CFQ (called 2-layer CFQ).

The idea of 2-layer CFQ is to build fairness control per group on top of the existing CFQ control. We added a new data structure called CFQ driver data on top of cfqd in order to control I/O bandwidth for cgroups. The CFQ driver data controls the cfq_datas through a service tree (rb-tree) and the CFQ algorithm for synchronous I/O. An active cfqd controls its cfq queues through its own service tree. Namely, the CFQ driver data controls the traditional CFQ data, and each CFQ data runs conventionally.

         cfqdd     cfqdd        (cfqdd = cfq driver data)
           |         |
  cfqc -- cfqd ---- cfqd        (cfqd  = cfq data,
   |       |                     cfqc  = cfq cgroup data)
  cfqc --[cfqd]---- cfqd
           ^
           |
    conventional control.

This patchset is against 2.6.28-rc2.

2. Build

 i.   Apply this patchset (series 01 - 12) to kernel 2.6.28-rc2.
      If you want to use the optional functions, also apply the
      opt-1/opt-2 patches.
 ii.  Build the kernel with the IOSCHED_CFQ_CGROUP=y option.
 iii. Reboot into the new kernel.

3. Usage of 2-layer CFQ

 * Preparation for using 2-layer CFQ

   i.  Mount the cfq_cgroup special device on a cgroup directory.
       ex.
         mkdir /dev/cgroup
         mount -t cgroup -o cfq cfq /dev/cgroup

   ii. Change the elevator scheduler of the device to "cfq-cgroups".
       ex.
         echo cfq-cgroups > /sys/block/sda/queue/scheduler

 * Usage of grouping control.

   - Create a new group.
     Make a new directory under /dev/cgroup.
     For example, the following command creates a 'test1' group.
       mkdir /dev/cgroup/test1

   - Insert a task into a group.
     Write its process id (pid) to the "tasks" entry of the
     corresponding group.
     For example, the following command puts the task with pid 1100
     into the test1 group.
       echo 1100 > /dev/cgroup/test1/tasks
     New child tasks of this task are also inserted into the test1
     group.

   - Change the I/O priority of a group.
     Write a priority to the "cfq.ioprio" entry of the corresponding
     group.
     For example, the following command sets a priority of rank 2 for
     the 'test1' group.
       echo 2 > /dev/cgroup/test1/cfq.ioprio
     The I/O priority for cgroups takes a value from 0 to 7, the same
     as the existing per-task CFQ.
     If you want to change the I/O priority of a group on only a
     specific device, add the device name as a second parameter.
     For example, the following command sets a priority of rank 2 for
     the 'test1' group on the 'sda' device.
       echo 2 sda > /dev/cgroup/test1/cfq.ioprio

     You can also change the I/O priority of a specific device and
     group via sysfs. In that case, add the cgroup path as the second
     parameter.
     For example, the following command sets a priority of rank 2 for
     the 'test1' group on the 'sda' device via sysfs.
       echo 2 /test1 > /sys/block/sda/queue/iosched/ioprio

     The cfq_data parameters (slice_sync, back_seek_penalty and so on)
     of a specific device and group can be changed in the same way. If
     you write only one parameter via sysfs, the setting is applied to
     all groups.

     When the elevator scheduler of a device is set to cfq-cgroups,
     each group's I/O priority for that device is initialized to the
     group's default priority. If you want to change this default
     priority, write the priority with "default" as the second
     parameter to the "cfq.ioprio" entry of the corresponding group.
     For example,
       echo 2 default > /dev/cgroup/test1/cfq.ioprio

   - Change the I/O priority of a task.
     Use the existing "ionice" command.

4. Usage of Optional Functions.

 i.  Usage of the validation flag for 'think time'.
     This parameter is set via sysfs in the same way as the other
     cfq_data parameters. Its entry name is 'ttime_valid'. This flag
     decides whether the think time is checked. With value 0, queues
     are always handled as the IDLE class (in practice, the
     idle_window flag is cleared). With value 1, queues are handled
     the same as in traditional CFQ. With value 2, the think time is
     ignored. (Example commands follow at the end of this mail.)

 ii. Usage of the ioprio class for cgroups.
     The ioprio class is set via cgroupfs in the same way as ioprio.
     Its entry name is 'cfq.ioprio_class'. Its values are the same as
     the I/O classes of traditional CFQ:
       0: IOPRIO_CLASS_NONE (equal to IOPRIO_CLASS_BE)
       1: IOPRIO_CLASS_RT
       2: IOPRIO_CLASS_BE
       3: IOPRIO_CLASS_IDLE

5. Future work.

We still have to implement the following.

 * Handle buffered I/O.
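For concreteness, here is a minimal command sketch for the two optional functions above. The entry names and values come from section 4; the 'sda' device and 'test1' group are assumed for illustration only.

   # Opt-1: think time validation flag, per device via sysfs
   echo 0 > /sys/block/sda/queue/iosched/ttime_valid   # always IDLE class
   echo 1 > /sys/block/sda/queue/iosched/ttime_valid   # traditional CFQ behavior
   echo 2 > /sys/block/sda/queue/iosched/ttime_valid   # ignore think time

   # Opt-2: ioprio class of a group, per group via cgroupfs
   echo 1 > /dev/cgroup/test1/cfq.ioprio_class         # IOPRIO_CLASS_RT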
Satoshi UCHIDA
2008-Nov-12 08:23 UTC
[PATCH][cfq-cgroups][01/12] Move basic structure variables to header file.
The "cfq_data" structure and few definition are moved into header file. Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com> --- block/cfq-iosched.c | 68 +------------------------------------- include/linux/cfq-iosched.h | 77 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+), 67 deletions(-) create mode 100644 include/linux/cfq-iosched.h diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 6a062ee..024d392 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -12,6 +12,7 @@ #include <linux/rbtree.h> #include <linux/ioprio.h> #include <linux/blktrace_api.h> +#include <linux/cfq-iosched.h> /* * tunables @@ -62,73 +63,6 @@ static DEFINE_SPINLOCK(ioc_gone_lock); #define sample_valid(samples) ((samples) > 80) /* - * Most of our rbtree usage is for sorting with min extraction, so - * if we cache the leftmost node we don't have to walk down the tree - * to find it. Idea borrowed from Ingo Molnars CFS scheduler. We should - * move this into the elevator for the rq sorting as well. - */ -struct cfq_rb_root { - struct rb_root rb; - struct rb_node *left; -}; -#define CFQ_RB_ROOT (struct cfq_rb_root) { RB_ROOT, NULL, } - -/* - * Per block device queue structure - */ -struct cfq_data { - struct request_queue *queue; - - /* - * rr list of queues with requests and the count of them - */ - struct cfq_rb_root service_tree; - unsigned int busy_queues; - - int rq_in_driver; - int sync_flight; - - /* - * queue-depth detection - */ - int rq_queued; - int hw_tag; - int hw_tag_samples; - int rq_in_driver_peak; - - /* - * idle window management - */ - struct timer_list idle_slice_timer; - struct work_struct unplug_work; - - struct cfq_queue *active_queue; - struct cfq_io_context *active_cic; - - /* - * async queue for each priority case - */ - struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR]; - struct cfq_queue *async_idle_cfqq; - - sector_t last_position; - unsigned long last_end_request; - - /* - * tunables, see top of file - */ - unsigned int cfq_quantum; - unsigned int cfq_fifo_expire[2]; - unsigned int cfq_back_penalty; - unsigned int cfq_back_max; - unsigned int cfq_slice[2]; - unsigned int cfq_slice_async_rq; - unsigned int cfq_slice_idle; - - struct list_head cic_list; -}; - -/* * Per process-grouping structure */ struct cfq_queue { diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h new file mode 100644 index 0000000..adb2410 --- /dev/null +++ b/include/linux/cfq-iosched.h @@ -0,0 +1,77 @@ +#ifndef _LINUX_CFQ_IOSCHED_H +#define _LINUX_CFQ_IOSCHED_H + +#include <linux/rbtree.h> +#include <linux/list.h> + +struct request_queue; +struct cfq_io_context; + +/* + * Most of our rbtree usage is for sorting with min extraction, so + * if we cache the leftmost node we don't have to walk down the tree + * to find it. Idea borrowed from Ingo Molnars CFS scheduler. We should + * move this into the elevator for the rq sorting as well. 
+ */ +struct cfq_rb_root { + struct rb_root rb; + struct rb_node *left; +}; +#define CFQ_RB_ROOT (struct cfq_rb_root) { RB_ROOT, NULL, } + +/* + * Per block device queue structure + */ +struct cfq_data { + struct request_queue *queue; + + /* + * rr list of queues with requests and the count of them + */ + struct cfq_rb_root service_tree; + unsigned int busy_queues; + + int rq_in_driver; + int sync_flight; + + /* + * queue-depth detection + */ + int rq_queued; + int hw_tag; + int hw_tag_samples; + int rq_in_driver_peak; + + /* + * idle window management + */ + struct timer_list idle_slice_timer; + struct work_struct unplug_work; + + struct cfq_queue *active_queue; + struct cfq_io_context *active_cic; + + /* + * async queue for each priority case + */ + struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR]; + struct cfq_queue *async_idle_cfqq; + + sector_t last_position; + unsigned long last_end_request; + + /* + * tunables, see top of file + */ + unsigned int cfq_quantum; + unsigned int cfq_fifo_expire[2]; + unsigned int cfq_back_penalty; + unsigned int cfq_back_max; + unsigned int cfq_slice[2]; + unsigned int cfq_slice_async_rq; + unsigned int cfq_slice_idle; + + struct list_head cic_list; +}; + +#endif /* _LINUX_CFQ_IOSCHED_H */ -- 1.5.6.5
Satoshi UCHIDA
2008-Nov-12 08:24 UTC
[PATCH][cfq-cgroups][02/12] Introduce "cfq_driver_data" structure.
This patch introduces "cfq_driver_Data" structure. This structure extract driver unique data from "cfq_data" structure. Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com> --- block/cfq-iosched.c | 218 ++++++++++++++++++++++++++----------------- include/linux/cfq-iosched.h | 32 ++++--- 2 files changed, 151 insertions(+), 99 deletions(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 024d392..b726e85 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -144,9 +144,10 @@ CFQ_CFQQ_FNS(sync); #undef CFQ_CFQQ_FNS #define cfq_log_cfqq(cfqd, cfqq, fmt, args...) \ - blk_add_trace_msg((cfqd)->queue, "cfq%d " fmt, (cfqq)->pid, ##args) + blk_add_trace_msg((cfqd)->cfqdd->queue, \ + "cfq%d " fmt, (cfqq)->pid, ##args) #define cfq_log(cfqd, fmt, args...) \ - blk_add_trace_msg((cfqd)->queue, "cfq " fmt, ##args) + blk_add_trace_msg((cfqd)->cfqdd->queue, "cfq " fmt, ##args) static void cfq_dispatch_insert(struct request_queue *, struct request *); static struct cfq_queue *cfq_get_queue(struct cfq_data *, int, @@ -184,9 +185,11 @@ static inline int cfq_bio_sync(struct bio *bio) */ static inline void cfq_schedule_dispatch(struct cfq_data *cfqd) { + struct cfq_driver_data *cfqdd = cfqd->cfqdd; if (cfqd->busy_queues) { cfq_log(cfqd, "schedule dispatch"); - kblockd_schedule_work(cfqd->queue, &cfqd->unplug_work); + kblockd_schedule_work(cfqdd->queue, + &cfqdd->unplug_work); } } @@ -271,7 +274,7 @@ cfq_choose_req(struct cfq_data *cfqd, struct request *rq1, struct request *rq2) s1 = rq1->sector; s2 = rq2->sector; - last = cfqd->last_position; + last = cfqd->cfqdd->last_position; /* * by definition, 1KiB is 2 sectors @@ -548,7 +551,7 @@ static void cfq_add_rq_rb(struct request *rq) * if that happens, put the alias on the dispatch list */ while ((__alias = elv_rb_add(&cfqq->sort_list, rq)) != NULL) - cfq_dispatch_insert(cfqd->queue, __alias); + cfq_dispatch_insert(cfqd->cfqdd->queue, __alias); if (!cfq_cfqq_on_rr(cfqq)) cfq_add_cfqq_rr(cfqd, cfqq); @@ -591,22 +594,24 @@ cfq_find_rq_fmerge(struct cfq_data *cfqd, struct bio *bio) static void cfq_activate_request(struct request_queue *q, struct request *rq) { struct cfq_data *cfqd = q->elevator->elevator_data; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; - cfqd->rq_in_driver++; + cfqdd->rq_in_driver++; cfq_log_cfqq(cfqd, RQ_CFQQ(rq), "activate rq, drv=%d", - cfqd->rq_in_driver); + cfqdd->rq_in_driver); - cfqd->last_position = rq->hard_sector + rq->hard_nr_sectors; + cfqdd->last_position = rq->hard_sector + rq->hard_nr_sectors; } static void cfq_deactivate_request(struct request_queue *q, struct request *rq) { struct cfq_data *cfqd = q->elevator->elevator_data; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; - WARN_ON(!cfqd->rq_in_driver); - cfqd->rq_in_driver--; + WARN_ON(!cfqdd->rq_in_driver); + cfqdd->rq_in_driver--; cfq_log_cfqq(cfqd, RQ_CFQQ(rq), "deactivate rq, drv=%d", - cfqd->rq_in_driver); + cfqdd->rq_in_driver); } static void cfq_remove_request(struct request *rq) @@ -619,7 +624,7 @@ static void cfq_remove_request(struct request *rq) list_del_init(&rq->queuelist); cfq_del_rq_rb(rq); - cfqq->cfqd->rq_queued--; + cfqq->cfqd->cfqdd->rq_queued--; if (rq_is_meta(rq)) { WARN_ON(!cfqq->meta_pending); cfqq->meta_pending--; @@ -715,10 +720,12 @@ static void __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq, int timed_out) { + struct cfq_driver_data *cfqdd = cfqd->cfqdd; + cfq_log_cfqq(cfqd, cfqq, "slice expired t=%d", timed_out); if (cfq_cfqq_wait_request(cfqq)) - del_timer(&cfqd->idle_slice_timer); + 
del_timer(&cfqdd->idle_slice_timer); cfq_clear_cfqq_must_dispatch(cfqq); cfq_clear_cfqq_wait_request(cfqq); @@ -736,9 +743,9 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq, if (cfqq == cfqd->active_queue) cfqd->active_queue = NULL; - if (cfqd->active_cic) { - put_io_context(cfqd->active_cic->ioc); - cfqd->active_cic = NULL; + if (cfqdd->active_cic) { + put_io_context(cfqdd->active_cic->ioc); + cfqdd->active_cic = NULL; } } @@ -777,15 +784,17 @@ static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd) static inline sector_t cfq_dist_from_last(struct cfq_data *cfqd, struct request *rq) { - if (rq->sector >= cfqd->last_position) - return rq->sector - cfqd->last_position; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; + + if (rq->sector >= cfqdd->last_position) + return rq->sector - cfqdd->last_position; else - return cfqd->last_position - rq->sector; + return cfqdd->last_position - rq->sector; } static inline int cfq_rq_close(struct cfq_data *cfqd, struct request *rq) { - struct cfq_io_context *cic = cfqd->active_cic; + struct cfq_io_context *cic = cfqd->cfqdd->active_cic; if (!sample_valid(cic->seek_samples)) return 0; @@ -809,6 +818,7 @@ static int cfq_close_cooperator(struct cfq_data *cfq_data, static void cfq_arm_slice_timer(struct cfq_data *cfqd) { struct cfq_queue *cfqq = cfqd->active_queue; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; struct cfq_io_context *cic; unsigned long sl; @@ -817,7 +827,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd) * for devices that support queuing, otherwise we still have a problem * with sync vs async workloads. */ - if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag) + if (blk_queue_nonrot(cfqdd->queue) && cfqdd->hw_tag) return; WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list)); @@ -832,13 +842,13 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd) /* * still requests with the driver, don't idle */ - if (cfqd->rq_in_driver) + if (cfqdd->rq_in_driver) return; /* * task has exited, don't wait */ - cic = cfqd->active_cic; + cic = cfqdd->active_cic; if (!cic || !atomic_read(&cic->ioc->nr_tasks)) return; @@ -861,7 +871,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd) if (sample_valid(cic->seek_samples) && CIC_SEEKY(cic)) sl = min(sl, msecs_to_jiffies(CFQ_MIN_TT)); - mod_timer(&cfqd->idle_slice_timer, jiffies + sl); + mod_timer(&cfqdd->idle_slice_timer, jiffies + sl); cfq_log(cfqd, "arm_idle: %lu", sl); } @@ -880,7 +890,7 @@ static void cfq_dispatch_insert(struct request_queue *q, struct request *rq) elv_dispatch_sort(q, rq); if (cfq_cfqq_sync(cfqq)) - cfqd->sync_flight++; + cfqd->cfqdd->sync_flight++; } /* @@ -950,7 +960,7 @@ static struct cfq_queue *cfq_select_queue(struct cfq_data *cfqd) * flight or is idling for a new request, allow either of these * conditions to happen (or time out) before selecting a new queue. 
*/ - if (timer_pending(&cfqd->idle_slice_timer) || + if (timer_pending(&cfqd->cfqdd->idle_slice_timer) || (cfqq->dispatched && cfq_cfqq_idle_window(cfqq))) { cfqq = NULL; goto keep_queue; @@ -972,6 +982,7 @@ static int __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq, int max_dispatch) { + struct cfq_driver_data *cfqdd = cfqd->cfqdd; int dispatched = 0; BUG_ON(RB_EMPTY_ROOT(&cfqq->sort_list)); @@ -989,13 +1000,13 @@ __cfq_dispatch_requests(struct cfq_data *cfqd, struct cfq_queue *cfqq, /* * finally, insert request into driver dispatch list */ - cfq_dispatch_insert(cfqd->queue, rq); + cfq_dispatch_insert(cfqdd->queue, rq); dispatched++; - if (!cfqd->active_cic) { + if (!cfqdd->active_cic) { atomic_inc(&RQ_CIC(rq)->ioc->refcount); - cfqd->active_cic = RQ_CIC(rq); + cfqdd->active_cic = RQ_CIC(rq); } if (RB_EMPTY_ROOT(&cfqq->sort_list)) @@ -1022,7 +1033,7 @@ static int __cfq_forced_dispatch_cfqq(struct cfq_queue *cfqq) int dispatched = 0; while (cfqq->next_rq) { - cfq_dispatch_insert(cfqq->cfqd->queue, cfqq->next_rq); + cfq_dispatch_insert(cfqq->cfqd->cfqdd->queue, cfqq->next_rq); dispatched++; } @@ -1054,6 +1065,7 @@ static int cfq_dispatch_requests(struct request_queue *q, int force) { struct cfq_data *cfqd = q->elevator->elevator_data; struct cfq_queue *cfqq; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; int dispatched; if (!cfqd->busy_queues) @@ -1077,12 +1089,12 @@ static int cfq_dispatch_requests(struct request_queue *q, int force) break; } - if (cfqd->sync_flight && !cfq_cfqq_sync(cfqq)) + if (cfqdd->sync_flight && !cfq_cfqq_sync(cfqq)) break; cfq_clear_cfqq_must_dispatch(cfqq); cfq_clear_cfqq_wait_request(cfqq); - del_timer(&cfqd->idle_slice_timer); + del_timer(&cfqdd->idle_slice_timer); dispatched += __cfq_dispatch_requests(cfqd, cfqq, max_dispatch); } @@ -1248,7 +1260,7 @@ static void cfq_exit_single_io_context(struct io_context *ioc, struct cfq_data *cfqd = cic->key; if (cfqd) { - struct request_queue *q = cfqd->queue; + struct request_queue *q = cfqd->cfqdd->queue; unsigned long flags; spin_lock_irqsave(q->queue_lock, flags); @@ -1272,7 +1284,7 @@ cfq_alloc_io_context(struct cfq_data *cfqd, gfp_t gfp_mask) struct cfq_io_context *cic; cic = kmem_cache_alloc_node(cfq_ioc_pool, gfp_mask | __GFP_ZERO, - cfqd->queue->node); + cfqd->cfqdd->queue->node); if (cic) { cic->last_end_request = jiffies; INIT_LIST_HEAD(&cic->queue_list); @@ -1332,12 +1344,13 @@ static void changed_ioprio(struct io_context *ioc, struct cfq_io_context *cic) { struct cfq_data *cfqd = cic->key; struct cfq_queue *cfqq; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; unsigned long flags; if (unlikely(!cfqd)) return; - spin_lock_irqsave(cfqd->queue->queue_lock, flags); + spin_lock_irqsave(cfqdd->queue->queue_lock, flags); cfqq = cic->cfqq[ASYNC]; if (cfqq) { @@ -1353,7 +1366,7 @@ static void changed_ioprio(struct io_context *ioc, struct cfq_io_context *cic) if (cfqq) cfq_mark_cfqq_prio_changed(cfqq); - spin_unlock_irqrestore(cfqd->queue->queue_lock, flags); + spin_unlock_irqrestore(cfqdd->queue->queue_lock, flags); } static void cfq_ioc_set_ioprio(struct io_context *ioc) @@ -1367,6 +1380,7 @@ cfq_find_alloc_queue(struct cfq_data *cfqd, int is_sync, struct io_context *ioc, gfp_t gfp_mask) { struct cfq_queue *cfqq, *new_cfqq = NULL; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; struct cfq_io_context *cic; retry: @@ -1385,16 +1399,16 @@ retry: * the allocator to do whatever it needs to attempt to * free memory. 
*/ - spin_unlock_irq(cfqd->queue->queue_lock); + spin_unlock_irq(cfqdd->queue->queue_lock); new_cfqq = kmem_cache_alloc_node(cfq_pool, gfp_mask | __GFP_NOFAIL | __GFP_ZERO, - cfqd->queue->node); - spin_lock_irq(cfqd->queue->queue_lock); + cfqdd->queue->node); + spin_lock_irq(cfqdd->queue->queue_lock); goto retry; } else { cfqq = kmem_cache_alloc_node(cfq_pool, gfp_mask | __GFP_ZERO, - cfqd->queue->node); + cfqdd->queue->node); if (!cfqq) goto out; } @@ -1547,9 +1561,11 @@ cfq_cic_lookup(struct cfq_data *cfqd, struct io_context *ioc) static int cfq_cic_link(struct cfq_data *cfqd, struct io_context *ioc, struct cfq_io_context *cic, gfp_t gfp_mask) { + struct cfq_driver_data *cfqdd = cfqd->cfqdd; unsigned long flags; int ret; + ret = radix_tree_preload(gfp_mask); if (!ret) { cic->ioc = ioc; @@ -1565,9 +1581,11 @@ static int cfq_cic_link(struct cfq_data *cfqd, struct io_context *ioc, radix_tree_preload_end(); if (!ret) { - spin_lock_irqsave(cfqd->queue->queue_lock, flags); + spin_lock_irqsave(cfqdd->queue->queue_lock, + flags); list_add(&cic->queue_list, &cfqd->cic_list); - spin_unlock_irqrestore(cfqd->queue->queue_lock, flags); + spin_unlock_irqrestore(cfqdd->queue->queue_lock, + flags); } } @@ -1590,7 +1608,7 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask) might_sleep_if(gfp_mask & __GFP_WAIT); - ioc = get_io_context(gfp_mask, cfqd->queue->node); + ioc = get_io_context(gfp_mask, cfqd->cfqdd->queue->node); if (!ioc) return NULL; @@ -1676,7 +1694,7 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq, enable_idle = old_idle = cfq_cfqq_idle_window(cfqq); if (!atomic_read(&cic->ioc->nr_tasks) || !cfqd->cfq_slice_idle || - (cfqd->hw_tag && CIC_SEEKY(cic))) + (cfqd->cfqdd->hw_tag && CIC_SEEKY(cic))) enable_idle = 0; else if (sample_valid(cic->ttime_samples)) { if (cic->ttime_mean > cfqd->cfq_slice_idle) @@ -1731,7 +1749,7 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq, if (rq_is_meta(rq) && !cfqq->meta_pending) return 1; - if (!cfqd->active_cic || !cfq_cfqq_wait_request(cfqq)) + if (!cfqd->cfqdd->active_cic || !cfq_cfqq_wait_request(cfqq)) return 0; /* @@ -1774,8 +1792,9 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq, struct request *rq) { struct cfq_io_context *cic = RQ_CIC(rq); + struct cfq_driver_data *cfqdd = cfqd->cfqdd; - cfqd->rq_queued++; + cfqdd->rq_queued++; if (rq_is_meta(rq)) cfqq->meta_pending++; @@ -1793,8 +1812,8 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq, */ if (cfq_cfqq_wait_request(cfqq)) { cfq_mark_cfqq_must_dispatch(cfqq); - del_timer(&cfqd->idle_slice_timer); - blk_start_queueing(cfqd->queue); + del_timer(&cfqdd->idle_slice_timer); + blk_start_queueing(cfqdd->queue); } } else if (cfq_should_preempt(cfqd, cfqq, rq)) { /* @@ -1804,7 +1823,7 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq, */ cfq_preempt_queue(cfqd, cfqq); cfq_mark_cfqq_must_dispatch(cfqq); - blk_start_queueing(cfqd->queue); + blk_start_queueing(cfqdd->queue); } } @@ -1827,49 +1846,50 @@ static void cfq_insert_request(struct request_queue *q, struct request *rq) * Update hw_tag based on peak queue depth over 50 samples under * sufficient load. 
*/ -static void cfq_update_hw_tag(struct cfq_data *cfqd) +static void cfq_update_hw_tag(struct cfq_driver_data *cfqdd) { - if (cfqd->rq_in_driver > cfqd->rq_in_driver_peak) - cfqd->rq_in_driver_peak = cfqd->rq_in_driver; + if (cfqdd->rq_in_driver > cfqdd->rq_in_driver_peak) + cfqdd->rq_in_driver_peak = cfqdd->rq_in_driver; - if (cfqd->rq_queued <= CFQ_HW_QUEUE_MIN && - cfqd->rq_in_driver <= CFQ_HW_QUEUE_MIN) + if (cfqdd->rq_queued <= CFQ_HW_QUEUE_MIN && + cfqdd->rq_in_driver <= CFQ_HW_QUEUE_MIN) return; - if (cfqd->hw_tag_samples++ < 50) + if (cfqdd->hw_tag_samples++ < 50) return; - if (cfqd->rq_in_driver_peak >= CFQ_HW_QUEUE_MIN) - cfqd->hw_tag = 1; + if (cfqdd->rq_in_driver_peak >= CFQ_HW_QUEUE_MIN) + cfqdd->hw_tag = 1; else - cfqd->hw_tag = 0; + cfqdd->hw_tag = 0; - cfqd->hw_tag_samples = 0; - cfqd->rq_in_driver_peak = 0; + cfqdd->hw_tag_samples = 0; + cfqdd->rq_in_driver_peak = 0; } static void cfq_completed_request(struct request_queue *q, struct request *rq) { struct cfq_queue *cfqq = RQ_CFQQ(rq); struct cfq_data *cfqd = cfqq->cfqd; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; const int sync = rq_is_sync(rq); unsigned long now; now = jiffies; cfq_log_cfqq(cfqd, cfqq, "complete"); - cfq_update_hw_tag(cfqd); + cfq_update_hw_tag(cfqdd); - WARN_ON(!cfqd->rq_in_driver); + WARN_ON(!cfqdd->rq_in_driver); WARN_ON(!cfqq->dispatched); - cfqd->rq_in_driver--; + cfqdd->rq_in_driver--; cfqq->dispatched--; if (cfq_cfqq_sync(cfqq)) - cfqd->sync_flight--; + cfqdd->sync_flight--; if (!cfq_class_idle(cfqq)) - cfqd->last_end_request = now; + cfqdd->last_end_request = now; if (sync) RQ_CIC(rq)->last_end_request = now; @@ -1889,7 +1909,7 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq) cfq_arm_slice_timer(cfqd); } - if (!cfqd->rq_in_driver) + if (!cfqdd->rq_in_driver) cfq_schedule_dispatch(cfqd); } @@ -2034,9 +2054,9 @@ queue_fail: static void cfq_kick_queue(struct work_struct *work) { - struct cfq_data *cfqd - container_of(work, struct cfq_data, unplug_work); - struct request_queue *q = cfqd->queue; + struct cfq_driver_data *cfqdd + container_of(work, struct cfq_driver_data, unplug_work); + struct request_queue *q = cfqdd->queue; unsigned long flags; spin_lock_irqsave(q->queue_lock, flags); @@ -2051,12 +2071,13 @@ static void cfq_idle_slice_timer(unsigned long data) { struct cfq_data *cfqd = (struct cfq_data *) data; struct cfq_queue *cfqq; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; unsigned long flags; int timed_out = 1; cfq_log(cfqd, "idle timer fired"); - spin_lock_irqsave(cfqd->queue->queue_lock, flags); + spin_lock_irqsave(cfqdd->queue->queue_lock, flags); cfqq = cfqd->active_queue; if (cfqq) { @@ -2088,13 +2109,13 @@ expire: out_kick: cfq_schedule_dispatch(cfqd); out_cont: - spin_unlock_irqrestore(cfqd->queue->queue_lock, flags); + spin_unlock_irqrestore(cfqdd->queue->queue_lock, flags); } -static void cfq_shutdown_timer_wq(struct cfq_data *cfqd) +static void cfq_shutdown_timer_wq(struct cfq_driver_data *cfqdd) { - del_timer_sync(&cfqd->idle_slice_timer); - kblockd_flush_work(&cfqd->unplug_work); + del_timer_sync(&cfqdd->idle_slice_timer); + kblockd_flush_work(&cfqdd->unplug_work); } static void cfq_put_async_queues(struct cfq_data *cfqd) @@ -2115,9 +2136,10 @@ static void cfq_put_async_queues(struct cfq_data *cfqd) static void cfq_exit_queue(elevator_t *e) { struct cfq_data *cfqd = e->elevator_data; - struct request_queue *q = cfqd->queue; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; + struct request_queue *q = cfqdd->queue; - cfq_shutdown_timer_wq(cfqd); 
+ cfq_shutdown_timer_wq(cfqdd); spin_lock_irq(q->queue_lock); @@ -2136,11 +2158,37 @@ static void cfq_exit_queue(elevator_t *e) spin_unlock_irq(q->queue_lock); - cfq_shutdown_timer_wq(cfqd); + cfq_shutdown_timer_wq(cfqdd); + kfree(cfqdd); kfree(cfqd); } +static struct cfq_driver_data * +cfq_init_driver_data(struct request_queue *q, struct cfq_data *cfqd) +{ + struct cfq_driver_data *cfqdd; + + cfqdd = kmalloc_node(sizeof(*cfqdd), + GFP_KERNEL | __GFP_ZERO, q->node); + if (!cfqdd) + return NULL; + + cfqdd->queue = q; + + init_timer(&cfqdd->idle_slice_timer); + cfqdd->idle_slice_timer.function = cfq_idle_slice_timer; + cfqdd->idle_slice_timer.data = (unsigned long) cfqd; + + INIT_WORK(&cfqdd->unplug_work, cfq_kick_queue); + + cfqdd->last_end_request = jiffies; + + cfqdd->hw_tag = 1; + + return cfqdd; +} + static void *cfq_init_queue(struct request_queue *q) { struct cfq_data *cfqd; @@ -2149,18 +2197,15 @@ static void *cfq_init_queue(struct request_queue *q) if (!cfqd) return NULL; + cfqd->cfqdd = cfq_init_driver_data(q, cfqd); + if (!cfqd->cfqdd) { + kfree(cfqd); + return NULL; + } + cfqd->service_tree = CFQ_RB_ROOT; INIT_LIST_HEAD(&cfqd->cic_list); - cfqd->queue = q; - - init_timer(&cfqd->idle_slice_timer); - cfqd->idle_slice_timer.function = cfq_idle_slice_timer; - cfqd->idle_slice_timer.data = (unsigned long) cfqd; - - INIT_WORK(&cfqd->unplug_work, cfq_kick_queue); - - cfqd->last_end_request = jiffies; cfqd->cfq_quantum = cfq_quantum; cfqd->cfq_fifo_expire[0] = cfq_fifo_expire[0]; cfqd->cfq_fifo_expire[1] = cfq_fifo_expire[1]; @@ -2170,7 +2215,6 @@ static void *cfq_init_queue(struct request_queue *q) cfqd->cfq_slice[1] = cfq_slice_sync; cfqd->cfq_slice_async_rq = cfq_slice_async_rq; cfqd->cfq_slice_idle = cfq_slice_idle; - cfqd->hw_tag = 1; return cfqd; } diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h index adb2410..50003f7 100644 --- a/include/linux/cfq-iosched.h +++ b/include/linux/cfq-iosched.h @@ -20,17 +20,11 @@ struct cfq_rb_root { #define CFQ_RB_ROOT (struct cfq_rb_root) { RB_ROOT, NULL, } /* - * Per block device queue structure + * Driver unique data structure */ -struct cfq_data { +struct cfq_driver_data { struct request_queue *queue; - /* - * rr list of queues with requests and the count of them - */ - struct cfq_rb_root service_tree; - unsigned int busy_queues; - int rq_in_driver; int sync_flight; @@ -48,18 +42,30 @@ struct cfq_data { struct timer_list idle_slice_timer; struct work_struct unplug_work; - struct cfq_queue *active_queue; struct cfq_io_context *active_cic; + sector_t last_position; + unsigned long last_end_request; +}; + +/* + * Per block device queue structure + */ +struct cfq_data { + /* + * rr list of queues with requests and the count of them + */ + struct cfq_rb_root service_tree; + unsigned int busy_queues; + + struct cfq_queue *active_queue; + /* * async queue for each priority case */ struct cfq_queue *async_cfqq[2][IOPRIO_BE_NR]; struct cfq_queue *async_idle_cfqq; - sector_t last_position; - unsigned long last_end_request; - /* * tunables, see top of file */ @@ -72,6 +78,8 @@ struct cfq_data { unsigned int cfq_slice_idle; struct list_head cic_list; + + struct cfq_driver_data *cfqdd; }; #endif /* _LINUX_CFQ_IOSCHED_H */ -- 1.5.6.5
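The bulk of this patch is the mechanical switch from cfqd->queue (and friends) to cfqd->cfqdd->queue. A hypothetical helper, not part of the patch itself, shows the resulting ownership in one place:

   /*
    * Illustration only -- the patch open-codes cfqd->cfqdd->... at
    * each call site instead of adding accessors like this one.
    * Per-device state (request queue, idle timer, request counters)
    * now lives in the shared cfq_driver_data reached via cfqd->cfqdd.
    */
   static inline struct request_queue *cfqd_queue(struct cfq_data *cfqd)
   {
           return cfqd->cfqdd->queue;
   }

This split is what the later patches rely on: several cfq_datas (one per group) can then share a single cfq_driver_data per device.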
Satoshi UCHIDA
2008-Nov-12 08:25 UTC
[PATCH][cfq-cgroups][03/12] Add cfq-cgroup file and modify configuration files.
This patch adds the cfq-cgroup file and modifies configure files. The cfq-cgroup file store some functions which use to expand CFQ scheduler for handling cgroups. Expanded CFQ scheduler is registered for "cfq-cgroups". Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com> --- block/Kconfig.iosched | 16 ++++++++++++++++ block/Makefile | 1 + block/cfq-cgroup.c | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 49 insertions(+), 0 deletions(-) create mode 100644 block/cfq-cgroup.c diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched index 7e803fc..dc61120 100644 --- a/block/Kconfig.iosched +++ b/block/Kconfig.iosched @@ -40,6 +40,18 @@ config IOSCHED_CFQ working environment, suitable for desktop systems. This is the default I/O scheduler. +config IOSCHED_CFQ_CGROUP + tristate "expanded CFQ I/O Scheduler for cgroups" + default n + depends on IOSCHED_CFQ && CGROUPS + ---help--- + The expanded CFQ I/O scheduelr for cgroups tries to distribute + bandwidth equally among all groups and among all processes within + groups in the system. It should provide a fair working environment, + suitable for consolidated environment which have some destop systems. + This scheduler expands the CFQ I/O scheduler into two layer control + -- per group layer and per task layer --. + choice prompt "Default I/O scheduler" default DEFAULT_CFQ @@ -56,6 +68,9 @@ choice config DEFAULT_CFQ bool "CFQ" if IOSCHED_CFQ=y + config DEFAULT_CFQ_CGROUP + bool "CFQ-Cgroups" if IOSCHED_CFQ_CGROUP=y + config DEFAULT_NOOP bool "No-op" @@ -66,6 +81,7 @@ config DEFAULT_IOSCHED default "anticipatory" if DEFAULT_AS default "deadline" if DEFAULT_DEADLINE default "cfq" if DEFAULT_CFQ + default "cfq-cgroups" if DEFAULT_CFQ_CGROUP default "noop" if DEFAULT_NOOP endmenu diff --git a/block/Makefile b/block/Makefile index bfe7304..3c0f59d 100644 --- a/block/Makefile +++ b/block/Makefile @@ -12,6 +12,7 @@ obj-$(CONFIG_IOSCHED_NOOP) += noop-iosched.o obj-$(CONFIG_IOSCHED_AS) += as-iosched.o obj-$(CONFIG_IOSCHED_DEADLINE) += deadline-iosched.o obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.o +obj-$(CONFIG_IOSCHED_CFQ_CGROUP) += cfq-cgroup.o obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c new file mode 100644 index 0000000..3deef41 --- /dev/null +++ b/block/cfq-cgroup.c @@ -0,0 +1,32 @@ +/* + * CFQ CGROUP disk scheduler. + * + * This program is a wrapper program that is + * extend CFQ disk scheduler for handling + * cgroup subsystem. + * + * This program is based on original CFQ code. + * + * Copyright (C) 2008 Satoshi UCHIDA <s-uchida at ap.jp.nec.com> + * and NEC Corp. + */ + +#include <linux/blkdev.h> +#include <linux/cgroup.h> +#include <linux/cfq-iosched.h> + +static int __init cfq_cgroup_init(void) +{ + return 0; +} + +static void __exit cfq_cgroup_exit(void) +{ +} + +module_init(cfq_cgroup_init); +module_exit(cfq_cgroup_exit); + +MODULE_AUTHOR("Satoshi UCHIDA"); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Expanded CFQ IO scheduler for CGROUPS"); -- 1.5.6.5
Satoshi UCHIDA
2008-Nov-12 08:26 UTC
[PATCH][cfq-cgroups][04/12] Register/unregister the "cfq-cgroups" module.
This patch introduce a register/unregister functions of "cfq-cgroups" module. A elevator_type variables is inherited one of the original CFQ scheduler. Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com> --- block/cfq-cgroup.c | 122 +++++++++++++++++++++++++++++++++++++++++++ block/cfq-iosched.c | 2 +- include/linux/cfq-iosched.h | 2 + 3 files changed, 125 insertions(+), 1 deletions(-) diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c index 3deef41..aaa00ef 100644 --- a/block/cfq-cgroup.c +++ b/block/cfq-cgroup.c @@ -15,13 +15,135 @@ #include <linux/cgroup.h> #include <linux/cfq-iosched.h> +/* + * sysfs parts below --> + */ +static ssize_t +cfq_cgroup_var_show(char *page, struct cfq_data *cfqd, + int (func)(struct cfq_data *)) +{ + int val, retval = 0; + + val = func(cfqd); + + retval = snprintf(page, PAGE_SIZE, "%d\n", val); + + return retval; +} + +#define SHOW_FUNCTION(__FUNC, __VAR, __CONV) \ +static int val_transrate_##__FUNC(struct cfq_data *cfqd) \ +{ \ + if (__CONV) \ + return jiffies_to_msecs(cfqd->__VAR); \ + else \ + return cfqd->__VAR; \ +} \ +static ssize_t __FUNC(elevator_t *e, char *page) \ +{ \ + struct cfq_data *cfqd = e->elevator_data; \ + \ + return cfq_cgroup_var_show((page), (cfqd), \ + val_transrate_##__FUNC); \ +} +SHOW_FUNCTION(cfq_cgroup_quantum_show, cfq_quantum, 0); +SHOW_FUNCTION(cfq_cgroup_fifo_expire_sync_show, cfq_fifo_expire[1], 1); +SHOW_FUNCTION(cfq_cgroup_fifo_expire_async_show, cfq_fifo_expire[0], 1); +SHOW_FUNCTION(cfq_cgroup_back_seek_max_show, cfq_back_max, 0); +SHOW_FUNCTION(cfq_cgroup_back_seek_penalty_show, cfq_back_penalty, 0); +SHOW_FUNCTION(cfq_cgroup_slice_idle_show, cfq_slice_idle, 1); +SHOW_FUNCTION(cfq_cgroup_slice_sync_show, cfq_slice[1], 1); +SHOW_FUNCTION(cfq_cgroup_slice_async_show, cfq_slice[0], 1); +SHOW_FUNCTION(cfq_cgroup_slice_async_rq_show, cfq_slice_async_rq, 0); +#undef SHOW_FUNCTION + +static ssize_t +cfq_cgroup_var_store(const char *page, size_t count, struct cfq_data *cfqd, + void (func)(struct cfq_data *, unsigned int)) +{ + int err; + unsigned long val; + + err = strict_strtoul(page, 10, &val); + if (err) + return 0; + + func(cfqd, val); + + return count; +} + +#define STORE_FUNCTION(__FUNC, __VAR, MIN, MAX, __CONV) \ +static void val_transrate_##__FUNC(struct cfq_data *cfqd, \ + unsigned int __data) \ +{ \ + if (__data < (MIN)) \ + __data = (MIN); \ + else if (__data > (MAX)) \ + __data = (MAX); \ + if (__CONV) \ + cfqd->__VAR = msecs_to_jiffies(__data); \ + else \ + cfqd->__VAR = __data; \ +} \ +static ssize_t __FUNC(elevator_t *e, const char *page, size_t count) \ +{ \ + struct cfq_data *cfqd = e->elevator_data; \ + int ret = cfq_cgroup_var_store((page), count, cfqd, \ + val_transrate_##__FUNC); \ + return ret; \ +} +STORE_FUNCTION(cfq_cgroup_quantum_store, cfq_quantum, 1, UINT_MAX, 0); +STORE_FUNCTION(cfq_cgroup_fifo_expire_sync_store, cfq_fifo_expire[1], 1, + UINT_MAX, 1); +STORE_FUNCTION(cfq_cgroup_fifo_expire_async_store, cfq_fifo_expire[0], 1, + UINT_MAX, 1); +STORE_FUNCTION(cfq_cgroup_back_seek_max_store, cfq_back_max, 0, UINT_MAX, 0); +STORE_FUNCTION(cfq_cgroup_back_seek_penalty_store, cfq_back_penalty, 1, + UINT_MAX, 0); +STORE_FUNCTION(cfq_cgroup_slice_idle_store, cfq_slice_idle, + 0, UINT_MAX, 1); +STORE_FUNCTION(cfq_cgroup_slice_sync_store, cfq_slice[1], 1, UINT_MAX, 1); +STORE_FUNCTION(cfq_cgroup_slice_async_store, cfq_slice[0], 1, UINT_MAX, 1); +STORE_FUNCTION(cfq_cgroup_slice_async_rq_store, cfq_slice_async_rq, 1, + UINT_MAX, 0); +#undef STORE_FUNCTION + +#define CFQ_CGROUP_ATTR(name) \ + 
__ATTR(name, S_IRUGO|S_IWUSR, cfq_cgroup_##name##_show, \ + cfq_cgroup_##name##_store) + +static struct elv_fs_entry cfq_cgroup_attrs[] = { + CFQ_CGROUP_ATTR(quantum), + CFQ_CGROUP_ATTR(fifo_expire_sync), + CFQ_CGROUP_ATTR(fifo_expire_async), + CFQ_CGROUP_ATTR(back_seek_max), + CFQ_CGROUP_ATTR(back_seek_penalty), + CFQ_CGROUP_ATTR(slice_sync), + CFQ_CGROUP_ATTR(slice_async), + CFQ_CGROUP_ATTR(slice_async_rq), + CFQ_CGROUP_ATTR(slice_idle), + __ATTR_NULL +}; + +static struct elevator_type iosched_cfq_cgroup = { + .elevator_attrs = cfq_cgroup_attrs, + .elevator_name = "cfq-cgroups", + .elevator_owner = THIS_MODULE, +}; + static int __init cfq_cgroup_init(void) { + iosched_cfq_cgroup.ops = iosched_cfq.ops; + + elv_register(&iosched_cfq_cgroup); + return 0; } static void __exit cfq_cgroup_exit(void) { + elv_unregister(&iosched_cfq_cgroup); } module_init(cfq_cgroup_init); diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index b726e85..e105827 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -2332,7 +2332,7 @@ static struct elv_fs_entry cfq_attrs[] = { __ATTR_NULL }; -static struct elevator_type iosched_cfq = { +struct elevator_type iosched_cfq = { .ops = { .elevator_merge_fn = cfq_merge, .elevator_merged_fn = cfq_merged_request, diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h index 50003f7..a28ef00 100644 --- a/include/linux/cfq-iosched.h +++ b/include/linux/cfq-iosched.h @@ -82,4 +82,6 @@ struct cfq_data { struct cfq_driver_data *cfqdd; }; +extern struct elevator_type iosched_cfq; + #endif /* _LINUX_CFQ_IOSCHED_H */ -- 1.5.6.5
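As a reading aid, here is roughly what one use of the SHOW_FUNCTION macro above expands to, with the arguments substituted by hand (a sketch, edited for readability):

   /* SHOW_FUNCTION(cfq_cgroup_quantum_show, cfq_quantum, 0) becomes: */
   static int val_transrate_cfq_cgroup_quantum_show(struct cfq_data *cfqd)
   {
           return cfqd->cfq_quantum;   /* __CONV == 0: no jiffies conversion */
   }
   static ssize_t cfq_cgroup_quantum_show(elevator_t *e, char *page)
   {
           struct cfq_data *cfqd = e->elevator_data;

           return cfq_cgroup_var_show(page, cfqd,
                                      val_transrate_cfq_cgroup_quantum_show);
   }

STORE_FUNCTION works the same way in the write direction: the generated val_transrate_* helper clamps the value to [MIN, MAX] and optionally converts msecs to jiffies before storing it.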
Satoshi UCHIDA
2008-Nov-12 08:26 UTC
[PATCH][cfq-cgroups][05/12] Introduce cgroups structure with ioprio entry.
Satoshi UCHIDA
2008-Nov-12 08:27 UTC
[PATCH][cfq-cgroups][06/12] Add sibling tree control for driver data (cfq_driver_data).
This patch adds a tree control for siblings of driver data(cfq_driver_data). This tree controls cfq data(cfq_data) for same device, and is used mainly when a new device is registered or a existed device is unregistered. Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com> --- block/cfq-cgroup.c | 114 +++++++++++++++++++++++++++++++++++++++++++ block/cfq-iosched.c | 50 +++++++++++++++--- include/linux/cfq-iosched.h | 27 ++++++++++ 3 files changed, 182 insertions(+), 9 deletions(-) diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c index 733980d..ce35af2 100644 --- a/block/cfq-cgroup.c +++ b/block/cfq-cgroup.c @@ -17,6 +17,7 @@ #define CFQ_CGROUP_MAX_IOPRIO (7) +static struct cfq_ops cfq_cgroup_op; struct cfq_cgroup { struct cgroup_subsys_state css; @@ -36,6 +37,70 @@ static inline struct cfq_cgroup *task_to_cfq_cgroup(struct task_struct *tsk) } +/* + * Add device or cgroup data functions. + */ +static void cfq_cgroup_init_driver_data_opt(struct cfq_driver_data *cfqdd, + struct cfq_data *cfqd) +{ + cfqdd->sibling_tree = RB_ROOT; + cfqdd->siblings = 0; +} + +static void cfq_driver_sibling_tree_add(struct cfq_driver_data *cfqdd, + struct cfq_data *cfqd) +{ + struct rb_node **p; + struct rb_node *parent = NULL; + + BUG_ON(!RB_EMPTY_NODE(&cfqd->sib_node)); + + p = &cfqdd->sibling_tree.rb_node; + + while (*p) { + struct cfq_data *__cfqd; + struct rb_node **n; + + parent = *p; + __cfqd = rb_entry(parent, struct cfq_data, sib_node); + + if (cfqd < __cfqd) + n = &(*p)->rb_left; + else + n = &(*p)->rb_right; + p = n; + } + + rb_link_node(&cfqd->sib_node, parent, p); + rb_insert_color(&cfqd->sib_node, &cfqdd->sibling_tree); + cfqdd->siblings++; + cfqd->cfqdd = cfqdd; +} + +static struct cfq_data * +__cfq_cgroup_init_queue(struct request_queue *q, struct cfq_driver_data *cfqdd) +{ + struct cfq_data *cfqd = cfq_init_cfq_data(q, cfqdd, &cfq_cgroup_op); + + if (!cfqd) + return NULL; + + RB_CLEAR_NODE(&cfqd->sib_node); + + cfq_driver_sibling_tree_add(cfqd->cfqdd, cfqd); + + return cfqd; +} + +static void *cfq_cgroup_init_queue(struct request_queue *q) +{ + struct cfq_data *cfqd = NULL; + + cfqd = __cfq_cgroup_init_queue(q, NULL); + + return cfqd; +} + static struct cgroup_subsys_state * cfq_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) { @@ -56,11 +121,53 @@ cfq_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) return &cfqc->css; } + +/* + * Remove device or cgroup data functions. 
+ */ +static void cfq_cgroup_erase_driver_siblings(struct cfq_driver_data *cfqdd, + struct cfq_data *cfqd) +{ + rb_erase(&cfqd->sib_node, &cfqdd->sibling_tree); + cfqdd->siblings--; +} + +static void cfq_exit_device_group(struct cfq_driver_data *cfqdd) +{ + struct rb_node *p, *n; + struct cfq_data *cfqd; + + p = rb_first(&cfqdd->sibling_tree); + + while (p) { + n = rb_next(p); + cfqd = rb_entry(p, struct cfq_data, sib_node); + + cfq_cgroup_erase_driver_siblings(cfqdd, cfqd); + cfq_free_cfq_data(cfqd); + + p = n; + } +} + +static void cfq_cgroup_exit_queue(elevator_t *e) +{ + struct cfq_data *cfqd = e->elevator_data; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; + + cfq_exit_device_group(cfqdd); + kfree(cfqdd); +} + static void cfq_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cont) { kfree(cgroup_to_cfq_cgroup(cont)); } + +/* + * cgroupfs parts below --> + */ static ssize_t cfq_cgroup_read(struct cgroup *cont, struct cftype *cft, struct file *file, char __user *userbuf, size_t nbytes, loff_t *ppos) @@ -154,6 +261,7 @@ static int cfq_cgroup_populate(struct cgroup_subsys *ss, struct cgroup *cont) return cgroup_add_files(cont, ss, files, ARRAY_SIZE(files)); } + struct cgroup_subsys cfq_subsys = { .name = "cfq", .create = cfq_cgroup_create, @@ -280,9 +388,15 @@ static struct elevator_type iosched_cfq_cgroup = { .elevator_owner = THIS_MODULE, }; +static struct cfq_ops cfq_cgroup_op = { + .cfq_init_driver_data_opt_fn = cfq_cgroup_init_driver_data_opt, +}; + static int __init cfq_cgroup_init(void) { iosched_cfq_cgroup.ops = iosched_cfq.ops; + iosched_cfq_cgroup.ops.elevator_init_fn = cfq_cgroup_init_queue; + iosched_cfq_cgroup.ops.elevator_exit_fn = cfq_cgroup_exit_queue; elv_register(&iosched_cfq_cgroup); diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index e105827..fd1ed0c 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -62,6 +62,8 @@ static DEFINE_SPINLOCK(ioc_gone_lock); #define sample_valid(samples) ((samples) > 80) +static struct cfq_ops cfq_op; + /* * Per process-grouping structure */ @@ -2133,9 +2135,8 @@ static void cfq_put_async_queues(struct cfq_data *cfqd) cfq_put_queue(cfqd->async_idle_cfqq); } -static void cfq_exit_queue(elevator_t *e) +void cfq_free_cfq_data(struct cfq_data *cfqd) { - struct cfq_data *cfqd = e->elevator_data; struct cfq_driver_data *cfqdd = cfqd->cfqdd; struct request_queue *q = cfqdd->queue; @@ -2160,12 +2161,21 @@ static void cfq_exit_queue(elevator_t *e) cfq_shutdown_timer_wq(cfqdd); - kfree(cfqdd); kfree(cfqd); } +static void cfq_exit_queue(elevator_t *e) +{ + struct cfq_data *cfqd = e->elevator_data; + struct cfq_driver_data *cfqdd = cfqd->cfqdd; + + cfq_free_cfq_data(cfqd); + kfree(cfqdd); +} + static struct cfq_driver_data * -cfq_init_driver_data(struct request_queue *q, struct cfq_data *cfqd) +cfq_init_driver_data(struct request_queue *q, struct cfq_data *cfqd, + struct cfq_ops *op) { struct cfq_driver_data *cfqdd; @@ -2186,10 +2196,16 @@ cfq_init_driver_data(struct request_queue *q, struct cfq_data *cfqd) cfqdd->hw_tag = 1; + /* module dependend initialization */ + cfqdd->op = op; + if (op->cfq_init_driver_data_opt_fn) + op->cfq_init_driver_data_opt_fn(cfqdd, cfqd); + return cfqdd; } -static void *cfq_init_queue(struct request_queue *q) +struct cfq_data *cfq_init_cfq_data(struct request_queue *q, + struct cfq_driver_data *cfqdd, struct cfq_ops *op) { struct cfq_data *cfqd; @@ -2197,10 +2213,14 @@ static void *cfq_init_queue(struct request_queue *q) if (!cfqd) return NULL; - cfqd->cfqdd = cfq_init_driver_data(q, cfqd); - if 
(!cfqd->cfqdd) { - kfree(cfqd); - return NULL; + if (cfqdd) + cfqd->cfqdd = cfqdd; + else { + cfqd->cfqdd = cfq_init_driver_data(q, cfqd, op); + if (!cfqd->cfqdd) { + kfree(cfqd); + return NULL; + } } cfqd->service_tree = CFQ_RB_ROOT; @@ -2219,6 +2239,15 @@ static void *cfq_init_queue(struct request_queue *q) return cfqd; } +static void *cfq_init_queue(struct request_queue *q) +{ + struct cfq_data *cfqd = NULL; + + cfqd = cfq_init_cfq_data(q, NULL, &cfq_op); + + return cfqd; +} + static void cfq_slab_kill(void) { /* @@ -2358,6 +2387,9 @@ struct elevator_type iosched_cfq = { .elevator_owner = THIS_MODULE, }; +static struct cfq_ops cfq_op = { +}; + static int __init cfq_init(void) { /* diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h index a28ef00..22d1aed 100644 --- a/include/linux/cfq-iosched.h +++ b/include/linux/cfq-iosched.h @@ -6,6 +6,7 @@ struct request_queue; struct cfq_io_context; +struct cfq_ops; /* * Most of our rbtree usage is for sorting with min extraction, so @@ -46,6 +47,14 @@ struct cfq_driver_data { sector_t last_position; unsigned long last_end_request; + + struct cfq_ops *op; + +#ifdef CONFIG_IOSCHED_CFQ_CGROUP + /* device siblings */ + struct rb_root sibling_tree; + unsigned int siblings; +#endif }; /* @@ -80,8 +89,26 @@ struct cfq_data { struct list_head cic_list; struct cfq_driver_data *cfqdd; + +#ifdef CONFIG_IOSCHED_CFQ_CGROUP + /* sibling_tree member for cfq_meta_data */ + struct rb_node sib_node; +#endif +}; + +/* + * Module depended optional operations. + */ +typedef void (cfq_init_driver_data_opt_fn)(struct cfq_driver_data *, + struct cfq_data *); +struct cfq_ops { + cfq_init_driver_data_opt_fn *cfq_init_driver_data_opt_fn; }; + extern struct elevator_type iosched_cfq; +extern struct cfq_data *cfq_init_cfq_data(struct request_queue *, + struct cfq_driver_data *, struct cfq_ops *); +extern void cfq_free_cfq_data(struct cfq_data *cfqd); #endif /* _LINUX_CFQ_IOSCHED_H */ -- 1.5.6.5
Satoshi UCHIDA
2008-Nov-12 08:28 UTC
[PATCH][cfq-cgroups][07/12] Add sibling tree control for group data (cfq_cgroup).
This patch adds a tree control for siblings of group data(cfq_cgroup). This tree controls cfq data(cfq_data) for same group, and is used mainly when a new group is registerecd or a existed group is unregistered. Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com> --- block/cfq-cgroup.c | 126 +++++++++++++++++++++++++++++++++++++++++++ include/linux/cfq-iosched.h | 5 ++ include/linux/cgroup.h | 1 + kernel/cgroup.c | 5 ++ 4 files changed, 137 insertions(+), 0 deletions(-) diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c index ce35af2..25da08e 100644 --- a/block/cfq-cgroup.c +++ b/block/cfq-cgroup.c @@ -22,8 +22,12 @@ static struct cfq_ops cfq_cgroup_op; struct cfq_cgroup { struct cgroup_subsys_state css; unsigned int ioprio; + + struct rb_root sibling_tree; + unsigned int siblings; }; + static inline struct cfq_cgroup *cgroup_to_cfq_cgroup(struct cgroup *cont) { return container_of(cgroup_subsys_state(cont, cfq_subsys_id), @@ -77,6 +81,68 @@ static void cfq_driver_sibling_tree_add(struct cfq_driver_data *cfqdd, cfqd->cfqdd = cfqdd; } +static void cfq_cgroup_sibling_tree_add(struct cfq_cgroup *cfqc, + struct cfq_data *cfqd) +{ + struct rb_node **p; + struct rb_node *parent = NULL; + + BUG_ON(!RB_EMPTY_NODE(&cfqd->group_node)); + + p = &cfqc->sibling_tree.rb_node; + + while (*p) { + struct cfq_data *__cfqd; + struct rb_node **n; + + parent = *p; + __cfqd = rb_entry(parent, struct cfq_data, group_node); + + if (cfqd->cfqdd < __cfqd->cfqdd) + n = &(*p)->rb_left; + else + n = &(*p)->rb_right; + p = n; + } + + rb_link_node(&cfqd->group_node, parent, p); + rb_insert_color(&cfqd->group_node, &cfqc->sibling_tree); + cfqc->siblings++; + cfqd->cfqc = cfqc; +} + +static struct cfq_data * +__cfq_cgroup_init_queue(struct request_queue *, struct cfq_driver_data *); + +static void *cfq_cgroup_init_cfq_data(struct cfq_cgroup *cfqc, + struct cfq_data *cfqd) +{ + struct cgroup *child; + + /* setting cfq_data for cfq_cgroup */ + if (!cfqc) { + cfqc = cgroup_to_cfq_cgroup(get_root_subsys(&cfq_subsys)); + cfq_cgroup_sibling_tree_add(cfqc, cfqd); + } else { + struct cfq_data *__cfqd; + __cfqd = __cfq_cgroup_init_queue(cfqd->cfqdd->queue, + cfqd->cfqdd); + if (!__cfqd) + return NULL; + cfq_cgroup_sibling_tree_add(cfqc, __cfqd); + } + + /* check and create cfq_data for children */ + if (cfqc->css.cgroup) + list_for_each_entry(child, &cfqc->css.cgroup->children, + sibling){ + cfq_cgroup_init_cfq_data(cgroup_to_cfq_cgroup(child), + cfqd); + } + + return cfqc; +} + static struct cfq_data * __cfq_cgroup_init_queue(struct request_queue *q, struct cfq_driver_data *cfqdd) { @@ -86,9 +152,13 @@ __cfq_cgroup_init_queue(struct request_queue *q, struct cfq_driver_data *cfqdd) return NULL; RB_CLEAR_NODE(&cfqd->sib_node); + RB_CLEAR_NODE(&cfqd->group_node); cfq_driver_sibling_tree_add(cfqd->cfqdd, cfqd); + if (!cfqdd) + cfq_cgroup_init_cfq_data(NULL, cfqd); + return cfqd; } @@ -101,6 +171,28 @@ static void *cfq_cgroup_init_queue(struct request_queue *q) return cfqd; } +static void *cfq_cgroup_init_cgroup(struct cfq_cgroup *cfqc, + struct cgroup *parent) +{ + struct rb_node *p; + if (parent) { + struct cfq_cgroup *cfqc_p = cgroup_to_cfq_cgroup(parent); + + p = rb_first(&cfqc_p->sibling_tree); + while (p) { + struct cfq_data *__cfqd; + __cfqd = rb_entry(p, struct cfq_data, group_node); + + cfq_cgroup_init_cfq_data(cfqc, __cfqd); + + p = rb_next(p); + } + } + + return cfqc; +} + + static struct cgroup_subsys_state * cfq_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) { @@ -118,6 +210,12 @@ 
cfq_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cont) cfqc->ioprio = 3; + cfqc->sibling_tree = RB_ROOT; + cfqc->siblings = 0; + + if (!cfq_cgroup_init_cgroup(cfqc, cont->parent)) + return ERR_PTR(-ENOMEM); + return &cfqc->css; } @@ -132,6 +230,13 @@ static void cfq_cgroup_erase_driver_siblings(struct cfq_driver_data *cfqdd, cfqdd->siblings--; } +static void cfq_cgroup_erase_cgroup_siblings(struct cfq_cgroup *cfqc, + struct cfq_data *cfqd) +{ + rb_erase(&cfqd->group_node, &cfqc->sibling_tree); + cfqc->siblings--; +} + static void cfq_exit_device_group(struct cfq_driver_data *cfqdd) { struct rb_node *p, *n; @@ -144,6 +249,7 @@ static void cfq_exit_device_group(struct cfq_driver_data *cfqdd) cfqd = rb_entry(p, struct cfq_data, sib_node); cfq_cgroup_erase_driver_siblings(cfqdd, cfqd); + cfq_cgroup_erase_cgroup_siblings(cfqd->cfqc, cfqd); cfq_free_cfq_data(cfqd); p = n; @@ -159,8 +265,28 @@ static void cfq_cgroup_exit_queue(elevator_t *e) kfree(cfqdd); } +static void cfq_exit_cgroup(struct cfq_cgroup *cfqc) +{ + struct rb_node *p, *n; + struct cfq_data *cfqd; + + p = rb_first(&cfqc->sibling_tree); + + while (p) { + n = rb_next(p); + cfqd = rb_entry(p, struct cfq_data, group_node); + + cfq_cgroup_erase_driver_siblings(cfqd->cfqdd, cfqd); + cfq_cgroup_erase_cgroup_siblings(cfqc, cfqd); + cfq_free_cfq_data(cfqd); + + p = n; + } +} + static void cfq_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cont) { + cfq_exit_cgroup(cgroup_to_cfq_cgroup(cont)); kfree(cgroup_to_cfq_cgroup(cont)); } diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h index 22d1aed..382fc0a 100644 --- a/include/linux/cfq-iosched.h +++ b/include/linux/cfq-iosched.h @@ -93,6 +93,11 @@ struct cfq_data { #ifdef CONFIG_IOSCHED_CFQ_CGROUP /* sibling_tree member for cfq_meta_data */ struct rb_node sib_node; + + /* cfq_cgroup attribute */ + struct cfq_cgroup *cfqc; + /* group_tree member for cfq_cgroup */ + struct rb_node group_node; #endif }; diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 8b00f66..4bfd815 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -402,6 +402,7 @@ struct task_struct *cgroup_iter_next(struct cgroup *cgrp, void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it); int cgroup_scan_tasks(struct cgroup_scanner *scan); int cgroup_attach_task(struct cgroup *, struct task_struct *); +struct cgroup *get_root_subsys(struct cgroup_subsys *css); void cgroup_mm_owner_callbacks(struct task_struct *old, struct task_struct *new); diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 35eebd5..71bb335 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -1316,6 +1316,11 @@ static int cgroup_tasks_write(struct cgroup *cgrp, struct cftype *cft, u64 pid) return ret; } +struct cgroup *get_root_subsys(struct cgroup_subsys *css) +{ + return &css->root->top_cgroup; +} + /* The various types of files and directories in a cgroup file system */ enum cgroup_filetype { FILE_ROOT, -- 1.5.6.5
Satoshi UCHIDA
2008-Nov-12 08:29 UTC
[PATCH][cfq-cgroups][08/12] Interfaces to the new cfq data structure in the cfq-cgroup module.
This patch modifies the interfaces to the new cfq_data structure in the
cfq_cgroup module. With this patch an interface can take two parameters,
so a cfq data can be set for a particular device within a particular group.

In cgroupfs, a single argument is a value that is applied to all devices;
with two arguments, the first is the value to set and the second is the
name of the device to set it on. When the second argument is "default",
the first argument becomes the default parameter of the group.

In sysfs, a single argument is a value that is applied to all groups;
with two arguments, the first is the value to set and the second is the
name of the group to set it on.

Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com>

---
 block/cfq-cgroup.c          |  192 +++++++++++++++++++++++++++++++++++++++---
 include/linux/cfq-iosched.h |    2 +
 2 files changed, 180 insertions(+), 14 deletions(-)

diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c
index 25da08e..99f3d94 100644
--- a/block/cfq-cgroup.c
+++ b/block/cfq-cgroup.c
@@ -123,6 +123,7 @@ static void *cfq_cgroup_init_cfq_data(struct cfq_cgroup *cfqc,
 	if (!cfqc) {
 		cfqc = cgroup_to_cfq_cgroup(get_root_subsys(&cfq_subsys));
 		cfq_cgroup_sibling_tree_add(cfqc, cfqd);
+		cfqd->ioprio = cfqc->ioprio;
 	} else {
 		struct cfq_data *__cfqd;
 		__cfqd = __cfq_cgroup_init_queue(cfqd->cfqdd->queue,
@@ -130,6 +131,7 @@
 		if (!__cfqd)
 			return NULL;
 		cfq_cgroup_sibling_tree_add(cfqc, __cfqd);
+		__cfqd->ioprio = cfqc->ioprio;
 	}

 	/* check and create cfq_data for children */
@@ -294,6 +296,35 @@ static void cfq_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cont)
 /*
  * cgroupfs parts below -->
  */
+static void
+param_separate(const char *master, char *valbuf, char *pathbuf, int size)
+{
+	int i;
+	char *pc1 = (char *) master, *pc2;
+
+	pc2 = valbuf;
+	for (i = 0 ; i < (size - 1) && (*pc1 != ' ') &&
+		     (*pc1 != '\n') && (*pc1 != '\0') ; i++) {
+		*pc2 = *pc1;
+		pc2++;
+		pc1++;
+	}
+	*pc2 = '\n'; pc2++; *pc2 = '\0';
+
+	for ( ; (i < (PAGE_SIZE - 1)) && (*pc1 == ' ') &&
+	       (*pc1 != '\n') && (*pc1 != '\0') ; i++)
+		pc1++;
+
+	pc2 = pathbuf;
+	for ( ; i < (PAGE_SIZE - 1) && (*pc1 != ' ') &&
+	       (*pc1 != '\n') && (*pc1 != '\0') ; i++) {
+		*pc2 = *pc1;
+		pc2++;
+		pc1++;
+	}
+	*pc2 = '\0';
+}
+
 static ssize_t cfq_cgroup_read(struct cgroup *cont, struct cftype *cft,
 			       struct file *file, char __user *userbuf,
 			       size_t nbytes, loff_t *ppos)
@@ -301,6 +332,7 @@ static ssize_t cfq_cgroup_read(struct cgroup *cont, struct cftype *cft,
 	struct cfq_cgroup *cfqc;
 	char *page;
 	ssize_t ret;
+	struct rb_node *p;

 	page = (char *)__get_free_page(GFP_TEMPORARY);
 	if (!page)
@@ -318,7 +350,20 @@ static ssize_t cfq_cgroup_read(struct cgroup *cont, struct cftype *cft,
 	cgroup_unlock();

 	/* print priority */
-	ret = snprintf(page, PAGE_SIZE, "%d \n", cfqc->ioprio);
+	ret = snprintf(page, PAGE_SIZE, "default priority: %d\n", cfqc->ioprio);
+
+	p = rb_first(&cfqc->sibling_tree);
+	while (p) {
+		struct cfq_data *__cfqd;
+
+		__cfqd = rb_entry(p, struct cfq_data, group_node);
+
+		ret += snprintf(page + ret, PAGE_SIZE - ret, " %s %d\n",
+				__cfqd->cfqdd->queue->kobj.parent->name,
+				__cfqd->ioprio);
+
+		p = rb_next(p);
+	}

 	ret = simple_read_from_buffer(userbuf, nbytes, ppos, page, ret);

@@ -334,8 +379,10 @@ static ssize_t cfq_cgroup_write(struct cgroup *cont, struct cftype *cft,
 	struct cfq_cgroup *cfqc;
 	ssize_t ret;
 	long new_prio;
-	int err;
+	int err, sn;
 	char *buffer = NULL;
+	char *valbuf = NULL, *pathbuf = NULL;
+	struct rb_node *p;

 	cgroup_lock();
 	if (cgroup_is_removed(cont)) {
@@ -354,23 +401,64 @@ static ssize_t cfq_cgroup_write(struct cgroup *cont, struct cftype *cft,
 	if (copy_from_user(buffer, userbuf, nbytes)) {
 		ret = -EFAULT;
-		goto out;
+		goto free_buf;
 	}
 	buffer[nbytes] = 0;

-	err = strict_strtoul(buffer, 10, &new_prio);
+	valbuf = kmalloc(nbytes + 1, GFP_KERNEL);
+	if (!valbuf) {
+		ret = -ENOMEM;
+		goto free_buf;
+	}
+
+	pathbuf = kmalloc(nbytes + 1, GFP_KERNEL);
+	if (!pathbuf) {
+		ret = -ENOMEM;
+		goto free_val;
+	}
+
+	param_separate(buffer, valbuf, pathbuf, nbytes);
+
+	err = strict_strtoul(valbuf, 10, &new_prio);
 	if ((err) || ((new_prio < 0) || (new_prio > CFQ_CGROUP_MAX_IOPRIO))) {
 		ret = -EINVAL;
-		goto out;
+		goto free_path;
 	}

-	cfqc->ioprio = new_prio;
+	sn = strlen(pathbuf);
+
+	p = rb_first(&cfqc->sibling_tree);
+	while (p) {
+		struct cfq_data *__cfqd;
+		const char *namep;
+
+		__cfqd = rb_entry(p, struct cfq_data, group_node);
+		namep = __cfqd->cfqdd->queue->kobj.parent->name;
+
+		if (sn == 0) {
+			__cfqd->ioprio = new_prio;
+		} else if ((sn == strlen(namep)) &&
+			   (strncmp(pathbuf, namep, sn) == 0)) {
+			__cfqd->ioprio = new_prio;
+			break;
+		}
+
+		p = rb_next(p);
+	}
+
+	if ((sn == 0) ||
+	    ((sn == 7) && (strncmp(pathbuf, "default", 7) == 0)))
+		cfqc->ioprio = new_prio;

 	ret = nbytes;
-out:
+free_path:
+	kfree(pathbuf);
+free_val:
+	kfree(valbuf);
+free_buf:
 	kfree(buffer);
-
+out:
 	return ret;
 }

@@ -404,11 +492,33 @@ static ssize_t
 cfq_cgroup_var_show(char *page, struct cfq_data *cfqd,
 		    int (func)(struct cfq_data *))
 {
-	int val, retval = 0;
+	int err, val, retval = 0;
+	char *pathbuf = NULL;
+	struct rb_node *p;
+
+	pathbuf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!pathbuf)
+		return 0;
+
+	p = rb_first(&cfqd->cfqdd->sibling_tree);
+	while (p) {
+		struct cfq_data *__cfqd;
+		struct cgroup *cgrp;

-	val = func(cfqd);
+		__cfqd = rb_entry(p, struct cfq_data, sib_node);
+		cgrp = __cfqd->cfqc->css.cgroup;

-	retval = snprintf(page, PAGE_SIZE, "%d\n", val);
+		err = cgroup_path(cgrp, pathbuf, PAGE_SIZE);
+		if (err)
+			break;
+		val = func(__cfqd);
+
+		retval += snprintf(page + retval, PAGE_SIZE - retval,
+				   "%s %d\n", pathbuf, val);
+		p = rb_next(p);
+	}
+
+	kfree(pathbuf);

 	return retval;
 }
@@ -437,21 +547,73 @@ SHOW_FUNCTION(cfq_cgroup_slice_idle_show, cfq_slice_idle, 1);
 SHOW_FUNCTION(cfq_cgroup_slice_sync_show, cfq_slice[1], 1);
 SHOW_FUNCTION(cfq_cgroup_slice_async_show, cfq_slice[0], 1);
 SHOW_FUNCTION(cfq_cgroup_slice_async_rq_show, cfq_slice_async_rq, 0);
+SHOW_FUNCTION(cfq_cgroup_ioprio_show, ioprio, 0);
 #undef SHOW_FUNCTION

 static ssize_t
 cfq_cgroup_var_store(const char *page, size_t count, struct cfq_data *cfqd,
 		     void (func)(struct cfq_data *, unsigned int))
 {
-	int err;
+	int err, sn;
 	unsigned long val;
+	char *valbuf = NULL, *setpathbuf = NULL, *pathbuf = NULL;
+	struct rb_node *p;
+
+	valbuf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!valbuf) {
+		count = 0;
+		goto out;
+	}

-	err = strict_strtoul(page, 10, &val);
+	setpathbuf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!setpathbuf) {
+		count = 0;
+		goto free_val;
+	}
+
+	pathbuf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (!pathbuf) {
+		count = 0;
+		goto free_setpath;
+	}
+
+	param_separate(page, valbuf, setpathbuf, PAGE_SIZE);
+
+	err = strict_strtoul(valbuf, 10, &val);
 	if (err)
 		return 0;

-	func(cfqd, val);
+	sn = strlen(setpathbuf);
+
+	p = rb_first(&cfqd->cfqdd->sibling_tree);
+	while (p) {
+		struct cfq_data *__cfqd;
+		struct cgroup *cgrp;
+		__cfqd = rb_entry(p, struct cfq_data, sib_node);
+		cgrp = __cfqd->cfqc->css.cgroup;
+
+		err = cgroup_path(cgrp, pathbuf, PAGE_SIZE);
+		if (err)
+			break;
+
+		if (sn == 0) {
+			func(__cfqd, val);
+		} else if ((sn == strlen(pathbuf)) &&
+			   (strncmp(setpathbuf, pathbuf, sn) == 0)) {
+			func(__cfqd, val);
+			break;
+		}
+
+		p = rb_next(p);
+	}
+
+	kfree(pathbuf);
+free_setpath:
+	kfree(setpathbuf);
+free_val:
+	kfree(valbuf);
+out:
 	return count;
 }

@@ -489,6 +651,7 @@ STORE_FUNCTION(cfq_cgroup_slice_sync_store, cfq_slice[1], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_cgroup_slice_async_store, cfq_slice[0], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_cgroup_slice_async_rq_store, cfq_slice_async_rq,
 	       1, UINT_MAX, 0);
+STORE_FUNCTION(cfq_cgroup_ioprio_store, ioprio, 0, CFQ_CGROUP_MAX_IOPRIO, 0);
 #undef STORE_FUNCTION

 #define CFQ_CGROUP_ATTR(name) \
@@ -505,6 +668,7 @@ static struct elv_fs_entry cfq_cgroup_attrs[] = {
 	CFQ_CGROUP_ATTR(slice_async),
 	CFQ_CGROUP_ATTR(slice_async_rq),
 	CFQ_CGROUP_ATTR(slice_idle),
+	CFQ_CGROUP_ATTR(ioprio),
 	__ATTR_NULL
 };

diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h
index 382fc0a..b58d476 100644
--- a/include/linux/cfq-iosched.h
+++ b/include/linux/cfq-iosched.h
@@ -91,6 +91,8 @@ struct cfq_data {
 	struct cfq_driver_data *cfqdd;

 #ifdef CONFIG_IOSCHED_CFQ_CGROUP
+	unsigned int ioprio;
+
 	/* sibling_tree member for cfq_meta_data */
 	struct rb_node sib_node;

--
1.5.6.5
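For example, following the usage conventions in the introduction (the group
'test1' and the device 'sda' are illustrative names, not part of the patch):

  ex. Set priority 3 for all devices seen by group test1:
        echo 3 > /dev/cgroup/test1/cfq.ioprio
      Set priority 3 only for device sda:
        echo "3 sda" > /dev/cgroup/test1/cfq.ioprio
      Set the group's default priority:
        echo "3 default" > /dev/cgroup/test1/cfq.ioprio
      From the sysfs side, set a parameter for group /test1 only
      (group names are cgroup paths as returned by cgroup_path()):
        echo "100 /test1" > /sys/block/sda/queue/iosched/slice_idle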
Satoshi UCHIDA
2008-Nov-12 08:29 UTC
[PATCH][cfq-cgroups][09/12] Develop service tree control.
This patch introduces a service tree for cfq data and the code that
controls it, namely the group-layer control. These functions extend the
IOPRIO_BE class handling of the traditional CFQ scheduler.

Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com>

---
 block/cfq-cgroup.c          |  266 +++++++++++++++++++++++++++++++++++++++++++
 block/cfq-iosched.c         |   32 ++++-
 include/linux/cfq-iosched.h |   32 +++++-
 3 files changed, 323 insertions(+), 7 deletions(-)

diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c
index 99f3d94..ff652fe 100644
--- a/block/cfq-cgroup.c
+++ b/block/cfq-cgroup.c
@@ -15,8 +15,11 @@
 #include <linux/cgroup.h>
 #include <linux/cfq-iosched.h>

+#define CFQ_CGROUP_SLICE_SCALE		(5)
 #define CFQ_CGROUP_MAX_IOPRIO		(7)

+static const int cfq_cgroup_slice = HZ / 10;
+
 static struct cfq_ops cfq_cgroup_op;

 struct cfq_cgroup {
@@ -27,6 +30,28 @@ struct cfq_cgroup {
 	unsigned int siblings;
 };

+enum cfqd_state_flags {
+	CFQ_CFQD_FLAG_on_rr = 0,	/* on round-robin busy list */
+	CFQ_CFQD_FLAG_slice_new,	/* no requests dispatched in slice */
+};
+
+#define CFQ_CFQD_FNS(name)						\
+static inline void cfq_mark_cfqd_##name(struct cfq_data *cfqd)		\
+{									\
+	(cfqd)->flags |= (1 << CFQ_CFQD_FLAG_##name);			\
+}									\
+static inline void cfq_clear_cfqd_##name(struct cfq_data *cfqd)	\
+{									\
+	(cfqd)->flags &= ~(1 << CFQ_CFQD_FLAG_##name);			\
+}									\
+static inline int cfq_cfqd_##name(const struct cfq_data *cfqd)		\
+{									\
+	return ((cfqd)->flags & (1 << CFQ_CFQD_FLAG_##name)) != 0;	\
+}
+
+CFQ_CFQD_FNS(on_rr);
+CFQ_CFQD_FNS(slice_new);
+#undef CFQ_CFQD_FNS
+
 static inline struct cfq_cgroup *cgroup_to_cfq_cgroup(struct cgroup *cont)
 {
@@ -49,6 +74,11 @@ static void cfq_cgroup_init_driver_data_opt(struct cfq_driver_data *cfqdd,
 {
 	cfqdd->sibling_tree = RB_ROOT;
 	cfqdd->siblings = 0;
+
+	cfqdd->service_tree = CFQ_RB_ROOT;
+	cfqdd->busy_data = 0;
+
+	cfqdd->cfq_cgroup_slice = cfq_cgroup_slice;
 }

 static void cfq_driver_sibling_tree_add(struct cfq_driver_data *cfqdd,
@@ -155,6 +185,8 @@ __cfq_cgroup_init_queue(struct request_queue *q, struct cfq_driver_data *cfqdd)

 	RB_CLEAR_NODE(&cfqd->sib_node);
 	RB_CLEAR_NODE(&cfqd->group_node);
+	RB_CLEAR_NODE(&cfqd->rb_node);
+	cfqd->rb_key = 0;

 	cfq_driver_sibling_tree_add(cfqd->cfqdd, cfqd);

@@ -294,6 +326,237 @@ static void cfq_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cont)

 /*
+ * service tree control.
+ */
+static inline int cfq_cgroup_slice_used(struct cfq_data *cfqd)
+{
+	if (cfq_cfqd_slice_new(cfqd))
+		return 0;
+	if (time_before(jiffies, cfqd->slice_end))
+		return 0;
+
+	return 1;
+}
+
+static inline int
+cfq_cgroup_prio_slice(struct cfq_data *cfqd, unsigned short prio)
+{
+	const int base_slice = cfqd->cfqdd->cfq_cgroup_slice;
+
+	WARN_ON(prio >= IOPRIO_BE_NR);
+
+	return base_slice + (base_slice/CFQ_CGROUP_SLICE_SCALE *
+			     (CFQ_CGROUP_MAX_IOPRIO / 2 - prio));
+}
+
+static inline void
+cfq_cgroup_set_prio_slice(struct cfq_data *cfqd)
+{
+	cfqd->slice_end = cfq_cgroup_prio_slice(cfqd, cfqd->ioprio) +
+			  jiffies;
+}
+
+static unsigned long cfq_cgroup_slice_offset(struct cfq_data *cfqd)
+{
+	return (cfqd->cfqdd->busy_data - 1) *
+		(cfq_cgroup_prio_slice(cfqd, 0) -
+		 cfq_cgroup_prio_slice(cfqd, cfqd->ioprio));
+}
+
+static void cfq_cgroup_service_tree_add(struct cfq_data *cfqd, int add_front)
+{
+	struct rb_node **p, *parent;
+	struct cfq_data *__cfqd;
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	unsigned long rb_key;
+	int left;
+
+	if (!add_front) {
+		rb_key = cfq_cgroup_slice_offset(cfqd) + jiffies;
+		rb_key += cfqd->slice_resid;
+		cfqd->slice_resid = 0;
+	} else
+		rb_key = 0;
+
+	if (!RB_EMPTY_NODE(&cfqd->rb_node)) {
+		if (rb_key == cfqd->rb_key)
+			return;
+		cfq_rb_erase(&cfqd->rb_node, &cfqdd->service_tree);
+	}
+
+	left = 1;
+	parent = NULL;
+	p = &cfqdd->service_tree.rb.rb_node;
+	while (*p) {
+		struct rb_node **n;
+
+		parent = *p;
+		__cfqd = rb_entry(parent, struct cfq_data, rb_node);
+
+		if (rb_key < __cfqd->rb_key)
+			n = &(*p)->rb_left;
+		else
+			n = &(*p)->rb_right;
+
+		if (n == &(*p)->rb_right)
+			left = 0;
+
+		p = n;
+	}
+
+	if (left)
+		cfqdd->service_tree.left = &cfqd->rb_node;
+
+	cfqd->rb_key = rb_key;
+	rb_link_node(&cfqd->rb_node, parent, p);
+	rb_insert_color(&cfqd->rb_node, &cfqdd->service_tree.rb);
+}
+
+static void __cfq_cgroup_slice_expired(struct cfq_driver_data *cfqdd,
+				       struct cfq_data *cfqd, int timed_out)
+{
+	if (timed_out && !cfq_cfqd_slice_new(cfqd))
+		cfqd->slice_resid = cfqd->slice_end - jiffies;
+
+	if (cfq_cfqd_on_rr(cfqd))
+		cfq_cgroup_service_tree_add(cfqd, 0);
+
+	if (cfqd == cfqdd->active_data)
+		cfqdd->active_data = NULL;
+}
+
+static inline void
+cfq_cgroup_slice_expired(struct cfq_driver_data *cfqdd, int timed_out)
+{
+	struct cfq_data *cfqd = cfqdd->active_data;
+
+	if (cfqd) {
+		cfq_slice_expired(cfqd, 1);
+		__cfq_cgroup_slice_expired(cfqdd, cfqd, timed_out);
+	}
+}
+
+static struct cfq_data *cfq_cgroup_rb_first(struct cfq_rb_root *root)
+{
+	if (!root->left)
+		root->left = rb_first(&root->rb);
+
+	if (root->left)
+		return rb_entry(root->left, struct cfq_data, rb_node);
+
+	return NULL;
+}
+
+static struct cfq_data *cfq_cgroup_get_next_data(struct cfq_driver_data *cfqdd)
+{
+	if (RB_EMPTY_ROOT(&cfqdd->service_tree.rb))
+		return NULL;
+
+	return cfq_cgroup_rb_first(&cfqdd->service_tree);
+}
+
+static void __cfq_cgroup_set_active_data(struct cfq_driver_data *cfqdd,
+					 struct cfq_data *cfqd)
+{
+	if (cfqd) {
+		cfqd->slice_end = 0;
+		cfq_mark_cfqd_slice_new(cfqd);
+	}
+
+	cfqdd->active_data = cfqd;
+}
+
+static struct cfq_data *
+cfq_cgroup_set_active_data(struct cfq_driver_data *cfqdd)
+{
+	struct cfq_data *cfqd;
+
+	cfqd = cfq_cgroup_get_next_data(cfqdd);
+	__cfq_cgroup_set_active_data(cfqdd, cfqd);
+
+	return cfqd;
+}
+
+struct cfq_data *cfq_cgroup_select_data(struct cfq_driver_data *cfqdd)
+{
+	struct cfq_data *cfqd;
+
+	cfqd = cfqdd->active_data;
+	if (!cfqd)
+		goto new_data;
+
+	if (cfq_cgroup_slice_used(cfqd))
+		goto expire;
+
+	if (!RB_EMPTY_ROOT(&cfqd->service_tree.rb))
+		goto keep_data;
+
+	if (wait_request_checker(cfqd))
+		goto keep_data;
+
+expire:
+	cfq_cgroup_slice_expired(cfqdd, 0);
+new_data:
+	cfqd = cfq_cgroup_set_active_data(cfqdd);
+keep_data:
+	return cfqd;
+}
+
+int cfq_cgroup_forced_dispatch(struct cfq_data *cfqd)
+{
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	int dispatched = 0;
+
+	while ((cfqd = cfq_cgroup_rb_first(&cfqdd->service_tree)) != NULL)
+		dispatched += cfq_forced_dispatch(cfqd);
+
+	cfq_cgroup_slice_expired(cfqdd, 0);
+
+	BUG_ON(cfqdd->busy_data);
+
+	return dispatched;
+}
+
+int cfq_cgroup_dispatch_requests(struct request_queue *q, int force)
+{
+	struct cfq_data *cfqd = q->elevator->elevator_data;
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	int dispatched;
+
+	if (!cfqdd->busy_data)
+		return 0;
+
+	if (unlikely(force))
+		return cfq_cgroup_forced_dispatch(cfqd);
+
+	dispatched = 0;
+	cfqd = cfq_cgroup_select_data(cfqdd);
+
+	if (cfqd)
+		dispatched = cfq_queue_dispatch_requests(cfqd, force);
+
+	return dispatched;
+}
+
+int cfq_cgroup_completed_request_opt(struct cfq_data *cfqd)
+{
+	if (cfqd->cfqdd->active_data == cfqd) {
+		if (cfq_cfqd_slice_new(cfqd)) {
+			cfq_cgroup_set_prio_slice(cfqd);
+			cfq_clear_cfqd_slice_new(cfqd);
+		}
+		if (cfq_cgroup_slice_used(cfqd)) {
+			cfq_cgroup_slice_expired(cfqd->cfqdd, 1);
+			return 0;
+		}
+		return 1;
+	}
+
+	return 0;
+}
+
+/*
  * cgroupfs parts below -->
  */
 static void
@@ -680,6 +943,7 @@ static struct elevator_type iosched_cfq_cgroup = {

 static struct cfq_ops cfq_cgroup_op = {
 	.cfq_init_driver_data_opt_fn = cfq_cgroup_init_driver_data_opt,
+	.cfq_completed_request_opt_fn = cfq_cgroup_completed_request_opt,
 };

 static int __init cfq_cgroup_init(void)
@@ -687,6 +951,8 @@ static int __init cfq_cgroup_init(void)
 	iosched_cfq_cgroup.ops = iosched_cfq.ops;
 	iosched_cfq_cgroup.ops.elevator_init_fn = cfq_cgroup_init_queue;
 	iosched_cfq_cgroup.ops.elevator_exit_fn = cfq_cgroup_exit_queue;
+	iosched_cfq_cgroup.ops.elevator_dispatch_fn =
+		cfq_cgroup_dispatch_requests;

 	elv_register(&iosched_cfq_cgroup);

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index fd1ed0c..5fbef85 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -354,7 +354,7 @@ static struct cfq_queue *cfq_rb_first(struct cfq_rb_root *root)
 	return NULL;
 }

-static void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)
+void cfq_rb_erase(struct rb_node *n, struct cfq_rb_root *root)
 {
 	if (root->left == n)
 		root->left = NULL;
@@ -751,7 +751,7 @@ __cfq_slice_expired(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	}
 }

-static inline void cfq_slice_expired(struct cfq_data *cfqd, int timed_out)
+inline void cfq_slice_expired(struct cfq_data *cfqd, int timed_out)
 {
 	struct cfq_queue *cfqq = cfqd->active_queue;

@@ -932,6 +932,16 @@ cfq_prio_to_maxrq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	return 2 * (base_rq + base_rq * (CFQ_PRIO_LISTS - 1 - cfqq->ioprio));
 }

+int wait_request_checker(struct cfq_data *cfqd)
+{
+	struct cfq_queue *cfqq = cfqd->active_queue;
+	if (cfqq)
+		return timer_pending(&cfqd->cfqdd->idle_slice_timer)
+			|| (cfqq->dispatched && cfq_cfqq_idle_window(cfqq));
+	else
+		return 0;
+}
+
 /*
  * Select a queue for service. If we have a current active queue,
  * check whether to continue servicing it, or retrieve and set a new one.
@@ -1047,7 +1057,7 @@ static int __cfq_forced_dispatch_cfqq(struct cfq_queue *cfqq)
  * Drain our current requests. Used for barriers and when switching
  * io schedulers on-the-fly.
  */
-static int cfq_forced_dispatch(struct cfq_data *cfqd)
+int cfq_forced_dispatch(struct cfq_data *cfqd)
 {
 	struct cfq_queue *cfqq;
 	int dispatched = 0;
@@ -1063,9 +1073,8 @@ static int cfq_forced_dispatch(struct cfq_data *cfqd)
 	return dispatched;
 }

-static int cfq_dispatch_requests(struct request_queue *q, int force)
+int cfq_queue_dispatch_requests(struct cfq_data *cfqd, int force)
 {
-	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq;
 	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
 	int dispatched;
@@ -1105,6 +1114,13 @@ static int cfq_dispatch_requests(struct request_queue *q, int force)
 	return dispatched;
 }

+static int cfq_dispatch_requests(struct request_queue *q, int force)
+{
+	struct cfq_data *cfqd = q->elevator->elevator_data;
+
+	return cfq_queue_dispatch_requests(cfqd, force);
+}
+
 /*
  * task holds one reference to the queue, dropped when task exits. each rq
  * in-flight on this queue also holds a reference, dropped when rq is freed.
@@ -1876,6 +1892,7 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
 	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
 	const int sync = rq_is_sync(rq);
 	unsigned long now;
+	int flag = 1;

 	now = jiffies;
 	cfq_log_cfqq(cfqd, cfqq, "complete");
@@ -1900,7 +1917,10 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
 	 * If this is the active queue, check if it needs to be expired,
 	 * or if we want to idle in case it has no pending requests.
 	 */
-	if (cfqd->active_queue == cfqq) {
+	if (cfqd->cfqdd->op->cfq_completed_request_opt_fn)
+		flag = cfqd->cfqdd->op->cfq_completed_request_opt_fn(cfqd);
+
+	if ((flag) && (cfqd->active_queue == cfqq)) {
 		if (cfq_cfqq_slice_new(cfqq)) {
 			cfq_set_prio_slice(cfqd, cfqq);
 			cfq_clear_cfqq_slice_new(cfqq);

diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h
index b58d476..30702c9 100644
--- a/include/linux/cfq-iosched.h
+++ b/include/linux/cfq-iosched.h
@@ -54,6 +54,15 @@ struct cfq_driver_data {
 	/* device siblings */
 	struct rb_root sibling_tree;
 	unsigned int siblings;
+
+	/*
+	 * rr list of cfq_data with requests and the count of them
+	 */
+	struct cfq_rb_root service_tree;
+	unsigned int busy_data;
+	struct cfq_data *active_data;
+
+	unsigned int cfq_cgroup_slice;
 #endif
 };

@@ -100,6 +109,20 @@ struct cfq_data {
 	struct cfq_cgroup *cfqc;
 	/* group_tree member for cfq_cgroup */
 	struct rb_node group_node;
+
+	/* service_tree member */
+	struct rb_node rb_node;
+	/* service_tree key */
+	unsigned long rb_key;
+
+	/*
+	 * slice parameter
+	 */
+	unsigned long slice_end;
+	long slice_resid;
+
+	/* various state flags, see below */
+	unsigned int flags;
 #endif
 };

@@ -108,14 +131,21 @@ struct cfq_data {
  */
 typedef void (cfq_init_driver_data_opt_fn)(struct cfq_driver_data *,
 					   struct cfq_data *);
+typedef int (cfq_completed_request_opt_fn)(struct cfq_data *);

 struct cfq_ops {
 	cfq_init_driver_data_opt_fn *cfq_init_driver_data_opt_fn;
+	cfq_completed_request_opt_fn *cfq_completed_request_opt_fn;
 };

 extern struct elevator_type iosched_cfq;

 extern struct cfq_data *cfq_init_cfq_data(struct request_queue *,
 					  struct cfq_driver_data *,
 					  struct cfq_ops *);
-extern void cfq_free_cfq_data(struct cfq_data *cfqd);
+extern void cfq_free_cfq_data(struct cfq_data *);
+extern void cfq_rb_erase(struct rb_node *, struct cfq_rb_root *);
+extern void cfq_slice_expired(struct cfq_data *, int);
+extern int wait_request_checker(struct cfq_data *cfqd);
+extern int cfq_forced_dispatch(struct cfq_data *);
+extern int cfq_queue_dispatch_requests(struct cfq_data *, int);

 #endif  /* _LINUX_CFQ_IOSCHED_H */
--
1.5.6.5
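To see how a group's time slice scales with its cfq.ioprio, the following
stand-alone program recomputes the arithmetic of cfq_cgroup_prio_slice()
in user space. It is only an illustration; HZ=1000 is an assumption here,
giving a base slice (cfq_cgroup_slice = HZ / 10) of 100 jiffies.

  #include <stdio.h>

  #define HZ                     1000  /* assumption for this example */
  #define CFQ_CGROUP_SLICE_SCALE    5
  #define CFQ_CGROUP_MAX_IOPRIO     7

  /* same arithmetic as cfq_cgroup_prio_slice() in the patch above */
  static int cfq_cgroup_prio_slice(int base_slice, int prio)
  {
          return base_slice + (base_slice / CFQ_CGROUP_SLICE_SCALE *
                               (CFQ_CGROUP_MAX_IOPRIO / 2 - prio));
  }

  int main(void)
  {
          int prio;
          int base_slice = HZ / 10;    /* cfq_cgroup_slice */

          for (prio = 0; prio <= CFQ_CGROUP_MAX_IOPRIO; prio++)
                  printf("ioprio %d -> slice %d jiffies\n",
                         prio, cfq_cgroup_prio_slice(base_slice, prio));
          return 0;
  }

With these assumptions the slice runs from 160 jiffies at ioprio 0 down to
20 jiffies at ioprio 7, so each priority step shifts the slice by
base_slice / CFQ_CGROUP_SLICE_SCALE = 20 jiffies.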
Satoshi UCHIDA
2008-Nov-12 08:30 UTC
[PATCH][cfq-cgroups][10/12] Introduce request control for two layer.
This patch controls requests according to the two-layer mechanism.
Concretely, the new functions look up the cfq data that corresponds to the
submitting task's group and device, and activate or deactivate that cfq
data as its queues become busy or idle.

Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com>

---
 block/cfq-cgroup.c          |   71 +++++++++++++++++++++++++++++++++++++++++++
 block/cfq-iosched.c         |   57 +++++++++++++++++++++++++++--------
 include/linux/cfq-iosched.h |    9 +++++
 3 files changed, 125 insertions(+), 12 deletions(-)

diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c
index ff652fe..f3e9f40 100644
--- a/block/cfq-cgroup.c
+++ b/block/cfq-cgroup.c
@@ -557,6 +557,72 @@ int cfq_cgroup_completed_request_opt(struct cfq_data *cfqd)
 }

 /*
+ * optional functions for two layers
+ */
+struct cfq_data *cfq_cgroup_search_data(void *data,
+					struct task_struct *tsk)
+{
+	struct cfq_data *cfqd = (struct cfq_data *)data;
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	struct cfq_cgroup *cont = task_to_cfq_cgroup(tsk);
+	struct rb_node *p = cont->sibling_tree.rb_node;
+
+	while (p) {
+		struct cfq_data *__cfqd;
+		__cfqd = rb_entry(p, struct cfq_data, group_node);
+
+		if (cfqdd < __cfqd->cfqdd)
+			p = p->rb_left;
+		else if (cfqdd > __cfqd->cfqdd)
+			p = p->rb_right;
+		else
+			return __cfqd;
+	}
+
+	return NULL;
+}
+
+static int cfq_cgroup_queue_empty(struct request_queue *q)
+{
+	struct cfq_data *cfqd = q->elevator->elevator_data;
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+
+	return !cfqdd->busy_data;
+}
+
+static void cfq_cgroup_add_cfqd_rr(struct cfq_data *cfqd)
+{
+	if (!cfq_cfqd_on_rr(cfqd)) {
+		cfq_mark_cfqd_on_rr(cfqd);
+		cfqd->cfqdd->busy_data++;
+
+		cfq_cgroup_service_tree_add(cfqd, 0);
+	}
+}
+
+static void cfq_cgroup_del_cfqd_rr(struct cfq_data *cfqd)
+{
+	if (RB_EMPTY_ROOT(&cfqd->service_tree.rb)) {
+		struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+		BUG_ON(!cfq_cfqd_on_rr(cfqd));
+		cfq_clear_cfqd_on_rr(cfqd);
+		if (!RB_EMPTY_NODE(&cfqd->rb_node)) {
+			cfq_rb_erase(&cfqd->rb_node,
+				     &cfqdd->service_tree);
+		}
+		BUG_ON(!cfqdd->busy_data);
+		cfqdd->busy_data--;
+	}
+}
+
+static int cfq_cgroup_is_active_data(struct cfq_data *cfqd)
+{
+	return cfqd->cfqdd->active_data == cfqd;
+}
+
+
+/*
  * cgroupfs parts below -->
  */
 static void
@@ -944,6 +1010,10 @@ static struct elevator_type iosched_cfq_cgroup = {
 static struct cfq_ops cfq_cgroup_op = {
 	.cfq_init_driver_data_opt_fn = cfq_cgroup_init_driver_data_opt,
 	.cfq_completed_request_opt_fn = cfq_cgroup_completed_request_opt,
+	.cfq_search_data_fn = cfq_cgroup_search_data,
+	.cfq_add_cfqq_opt_fn = cfq_cgroup_add_cfqd_rr,
+	.cfq_del_cfqq_opt_fn = cfq_cgroup_del_cfqd_rr,
+	.cfq_is_active_data_fn = cfq_cgroup_is_active_data,
 };

 static int __init cfq_cgroup_init(void)
@@ -953,6 +1023,7 @@ static int __init cfq_cgroup_init(void)
 	iosched_cfq_cgroup.ops.elevator_exit_fn = cfq_cgroup_exit_queue;
 	iosched_cfq_cgroup.ops.elevator_dispatch_fn =
 		cfq_cgroup_dispatch_requests;
+	iosched_cfq_cgroup.ops.elevator_queue_empty_fn = cfq_cgroup_queue_empty;

 	elv_register(&iosched_cfq_cgroup);

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 5fbef85..5fe0551 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -181,26 +181,30 @@ static inline int cfq_bio_sync(struct bio *bio)
 	return 0;
 }

+static int cfq_queue_empty(struct request_queue *q)
+{
+	struct cfq_data *cfqd = q->elevator->elevator_data;
+
+	return !cfqd->busy_queues;
+}
+
 /*
  * scheduler run of queue, if there are requests pending and no one in the
  * driver that will restart queueing
  */
-static inline void cfq_schedule_dispatch(struct cfq_data *cfqd)
+inline void cfq_schedule_dispatch(struct cfq_data *cfqd)
 {
 	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
-	if (cfqd->busy_queues) {
+	struct elevator_ops *ops = cfqdd->queue->elevator->ops;
+
+	if (!ops->elevator_queue_empty_fn(cfqdd->queue)) {
 		cfq_log(cfqd, "schedule dispatch");
 		kblockd_schedule_work(cfqdd->queue, &cfqdd->unplug_work);
 	}
 }

-static int cfq_queue_empty(struct request_queue *q)
-{
-	struct cfq_data *cfqd = q->elevator->elevator_data;
-
-	return !cfqd->busy_queues;
-}
-
 /*
  * Scale schedule slice based on io priority. Use the sync time slice only
@@ -503,6 +507,9 @@ static void cfq_add_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 	cfqd->busy_queues++;

 	cfq_resort_rr_list(cfqd, cfqq);
+
+	if (cfqd->cfqdd->op->cfq_add_cfqq_opt_fn)
+		cfqd->cfqdd->op->cfq_add_cfqq_opt_fn(cfqd);
 }

 /*
@@ -520,6 +527,9 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq)

 	BUG_ON(!cfqd->busy_queues);
 	cfqd->busy_queues--;
+
+	if (cfqd->cfqdd->op->cfq_del_cfqq_opt_fn)
+		cfqd->cfqdd->op->cfq_del_cfqq_opt_fn(cfqd);
 }

 /*
@@ -639,6 +649,9 @@ static int cfq_merge(struct request_queue *q, struct request **req,
 	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct request *__rq;

+	if (cfqd->cfqdd->op->cfq_search_data_fn)
+		cfqd = cfqd->cfqdd->op->cfq_search_data_fn(cfqd, current);
+
 	__rq = cfq_find_rq_fmerge(cfqd, bio);
 	if (__rq && elv_rq_merge_ok(__rq, bio)) {
 		*req = __rq;
@@ -679,6 +692,9 @@ static int cfq_allow_merge(struct request_queue *q, struct request *rq,
 	struct cfq_io_context *cic;
 	struct cfq_queue *cfqq;

+	if (cfqd->cfqdd->op->cfq_search_data_fn)
+		cfqd = cfqd->cfqdd->op->cfq_search_data_fn(cfqd, current);
+
 	/*
 	 * Disallow merge of a sync bio into an async request.
 	 */
@@ -882,8 +898,8 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
  */
 static void cfq_dispatch_insert(struct request_queue *q, struct request *rq)
 {
-	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
+	struct cfq_data *cfqd = cfqq->cfqd;

 	cfq_log_cfqq(cfqd, cfqq, "dispatch_insert");

@@ -1739,6 +1755,13 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 		   struct request *rq)
 {
 	struct cfq_queue *cfqq;
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	int flag = 1;
+
+	if (cfqdd->op->cfq_is_active_data_fn)
+		flag = cfqdd->op->cfq_is_active_data_fn(cfqd);
+	if (!flag)
+		return 0;

 	cfqq = cfqd->active_queue;
 	if (!cfqq)
@@ -1767,7 +1790,7 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 	if (rq_is_meta(rq) && !cfqq->meta_pending)
 		return 1;

-	if (!cfqd->cfqdd->active_cic || !cfq_cfqq_wait_request(cfqq))
+	if (!cfqdd->active_cic || !cfq_cfqq_wait_request(cfqq))
 		return 0;

 	/*
@@ -1811,6 +1834,7 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 {
 	struct cfq_io_context *cic = RQ_CIC(rq);
 	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	int flag = 1;

 	cfqdd->rq_queued++;
 	if (rq_is_meta(rq))
@@ -1822,7 +1846,10 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,

 	cic->last_request_pos = rq->sector + rq->nr_sectors;

-	if (cfqq == cfqd->active_queue) {
+	if (cfqdd->op->cfq_is_active_data_fn)
+		flag = cfqdd->op->cfq_is_active_data_fn(cfqd);
+
+	if ((flag) && (cfqq == cfqd->active_queue)) {
 		/*
 		 * if we are waiting for a request for this queue, let it rip
 		 * immediately and flag that we must not expire this queue
@@ -1847,8 +1874,8 @@ cfq_rq_enqueued(struct cfq_data *cfqd, struct cfq_queue *cfqq,

 static void cfq_insert_request(struct request_queue *q, struct request *rq)
 {
-	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
+	struct cfq_data *cfqd = cfqq->cfqd;

 	cfq_log_cfqq(cfqd, cfqq, "insert_request");
 	cfq_init_prio_data(cfqq, RQ_CIC(rq)->ioc);
@@ -1979,6 +2006,9 @@ static int cfq_may_queue(struct request_queue *q, int rw)
 	struct cfq_io_context *cic;
 	struct cfq_queue *cfqq;

+	if (cfqd->cfqdd->op->cfq_search_data_fn)
+		cfqd = cfqd->cfqdd->op->cfq_search_data_fn(cfqd, current);
+
 	/*
 	 * don't force setup of a queue from here, as a call to may_queue
 	 * does not necessarily imply that a request actually will be queued.
@@ -2035,6 +2065,9 @@ cfq_set_request(struct request_queue *q, struct request *rq, gfp_t gfp_mask)
 	struct cfq_queue *cfqq;
 	unsigned long flags;

+	if (cfqd->cfqdd->op->cfq_search_data_fn)
+		cfqd = cfqd->cfqdd->op->cfq_search_data_fn(cfqd, current);
+
 	might_sleep_if(gfp_mask & __GFP_WAIT);

 	cic = cfq_get_io_context(cfqd, gfp_mask);

diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h
index 30702c9..7287186 100644
--- a/include/linux/cfq-iosched.h
+++ b/include/linux/cfq-iosched.h
@@ -132,9 +132,18 @@ struct cfq_data {
 typedef void (cfq_init_driver_data_opt_fn)(struct cfq_driver_data *,
 					   struct cfq_data *);
 typedef int (cfq_completed_request_opt_fn)(struct cfq_data *);
+typedef struct cfq_data *(cfq_search_data_fn)(void *, struct task_struct *);
+typedef void (cfq_add_cfqq_opt_fn)(struct cfq_data *);
+typedef void (cfq_del_cfqq_opt_fn)(struct cfq_data *);
+typedef int (cfq_is_active_data_fn)(struct cfq_data *);

 struct cfq_ops {
 	cfq_init_driver_data_opt_fn *cfq_init_driver_data_opt_fn;
 	cfq_completed_request_opt_fn *cfq_completed_request_opt_fn;
+	cfq_search_data_fn *cfq_search_data_fn;
+	cfq_add_cfqq_opt_fn *cfq_add_cfqq_opt_fn;
+	cfq_del_cfqq_opt_fn *cfq_del_cfqq_opt_fn;
+	cfq_is_active_data_fn *cfq_is_active_data_fn;
 };
--
1.5.6.5
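As a walk-through of the two-layer path (the group 'test1' and the device
'sda' are illustrative names): when a task in group test1 issues a request
to sda, cfq_search_data_fn walks the group's sibling_tree, which is keyed
by the cfq_driver_data pointer of each device, and returns the cfq data
belonging to the (test1, sda) pair. cfq_add_cfqq_opt_fn then marks that
cfq data busy and inserts it into the device's group service tree, so a
later dispatch can pick an active cfq data from that tree and service its
queues as in conventional CFQ.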
Satoshi UCHIDA
2008-Nov-12 08:31 UTC
[PATCH][cfq-cgroups][11/12] Expand idle slice timer function.
This patch expands the function of the idle slice timer. The timer body
is split out as __cfq_idle_slice_timer(), and under cfq-cgroups the timer
is armed per device (cfq_driver_data): when it fires,
cfq_cgroup_idle_slice_timer() first expires the active group if its slice
is used up, and otherwise delegates to the queue-level check before
kicking dispatch.

Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com>

---
 block/cfq-cgroup.c          |   45 +++++++++++++++++++++++++++++++++++++++
 block/cfq-iosched.c         |   29 ++++++++++++++++++++++-----
 include/linux/cfq-iosched.h |    3 ++
 3 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c
index f3e9f40..4938fa0 100644
--- a/block/cfq-cgroup.c
+++ b/block/cfq-cgroup.c
@@ -69,9 +69,16 @@ static inline struct cfq_cgroup *task_to_cfq_cgroup(struct task_struct *tsk)
 /*
  * Add device or cgroup data functions.
  */
+static void cfq_cgroup_idle_slice_timer(unsigned long data);
+
 static void cfq_cgroup_init_driver_data_opt(struct cfq_driver_data *cfqdd,
 					    struct cfq_data *cfqd)
 {
+	cfqdd->elv_data = cfqd;
+
+	cfqdd->idle_slice_timer.function = cfq_cgroup_idle_slice_timer;
+	cfqdd->idle_slice_timer.data = (unsigned long) cfqdd;
+
 	cfqdd->sibling_tree = RB_ROOT;
 	cfqdd->siblings = 0;

@@ -623,6 +630,44 @@ static int cfq_cgroup_is_active_data(struct cfq_data *cfqd)

 /*
+ * Timer running if the active_queue is currently idling inside its time slice
+ */
+static void cfq_cgroup_idle_slice_timer(unsigned long data)
+{
+	struct cfq_driver_data *cfqdd = (struct cfq_driver_data *) data;
+	struct cfq_data *cfqd;
+	int timed_out = 1;
+	unsigned long flags;
+
+	spin_lock_irqsave(cfqdd->queue->queue_lock, flags);
+
+	cfqd = cfqdd->active_data;
+	if (cfqd) {
+		timed_out = 0;
+
+		if (cfq_cgroup_slice_used(cfqd))
+			goto expire_cgroup;
+
+		if (!cfqdd->busy_data)
+			goto out_cont;
+
+		if (__cfq_idle_slice_timer(cfqd))
+			goto out_cont;
+		else
+			goto out_kick;
+	}
+expire_cgroup:
+	cfq_cgroup_slice_expired(cfqdd, timed_out);
+out_kick:
+	cfq_schedule_dispatch(cfqdd->elv_data);
+out_cont:
+	spin_unlock_irqrestore(cfqdd->queue->queue_lock,
+			       flags);
+}
+
+
+/*
  * cgroupfs parts below -->
  */
 static void

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 5fe0551..edc23e5 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -2122,18 +2122,13 @@ static void cfq_kick_queue(struct work_struct *work)
 /*
  * Timer running if the active_queue is currently idling inside its time slice
  */
-static void cfq_idle_slice_timer(unsigned long data)
+inline int __cfq_idle_slice_timer(struct cfq_data *cfqd)
 {
-	struct cfq_data *cfqd = (struct cfq_data *) data;
 	struct cfq_queue *cfqq;
-	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
-	unsigned long flags;
 	int timed_out = 1;

 	cfq_log(cfqd, "idle timer fired");

-	spin_lock_irqsave(cfqdd->queue->queue_lock, flags);
-
 	cfqq = cfqd->active_queue;
 	if (cfqq) {
 		timed_out = 0;
@@ -2163,7 +2158,21 @@ expire:
 	cfq_slice_expired(cfqd, timed_out);
 out_kick:
 	cfq_schedule_dispatch(cfqd);
+	return 1;
 out_cont:
+	return 0;
+}
+
+static void cfq_idle_slice_timer(unsigned long data)
+{
+	struct cfq_data *cfqd = (struct cfq_data *) data;
+	struct cfq_driver_data *cfqdd = cfqd->cfqdd;
+	unsigned long flags;
+
+	spin_lock_irqsave(cfqdd->queue->queue_lock, flags);
+
+	__cfq_idle_slice_timer(cfqd);
+
 	spin_unlock_irqrestore(cfqdd->queue->queue_lock, flags);
 }

@@ -2226,6 +2235,13 @@ static void cfq_exit_queue(elevator_t *e)
 	kfree(cfqdd);
 }

+static void
+cfq_init_driver_data_opt(struct cfq_driver_data *cfqdd, struct cfq_data *cfqd)
+{
+	cfqdd->idle_slice_timer.function = cfq_idle_slice_timer;
+	cfqdd->idle_slice_timer.data = (unsigned long) cfqd;
+}
+
 static struct cfq_driver_data *
 cfq_init_driver_data(struct request_queue *q, struct cfq_data *cfqd,
 		     struct cfq_ops *op)
@@ -2441,6 +2457,7 @@ struct elevator_type iosched_cfq = {
 };

 static struct cfq_ops cfq_op = {
+	.cfq_init_driver_data_opt_fn = cfq_init_driver_data_opt,
 };

 static int __init cfq_init(void)

diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h
index 7287186..920bcb5 100644
--- a/include/linux/cfq-iosched.h
+++ b/include/linux/cfq-iosched.h
@@ -24,6 +24,7 @@ struct cfq_rb_root {
  * Driver unique data structure
  */
 struct cfq_driver_data {
+	struct cfq_data *elv_data;
 	struct request_queue *queue;

 	int rq_in_driver;
@@ -156,5 +157,7 @@ extern void cfq_slice_expired(struct cfq_data *, int);
 extern int wait_request_checker(struct cfq_data *cfqd);
 extern int cfq_forced_dispatch(struct cfq_data *);
 extern int cfq_queue_dispatch_requests(struct cfq_data *, int);
+extern int __cfq_idle_slice_timer(struct cfq_data *cfqd);
+extern void cfq_schedule_dispatch(struct cfq_data *cfqd);

 #endif  /* _LINUX_CFQ_IOSCHED_H */
--
1.5.6.5
Satoshi UCHIDA
2008-Nov-12 08:31 UTC
[PATCH][cfq-cgroups][12/12] Interface for parameter of cfq driver data
[This email is either empty or too large to be displayed at this time]
Satoshi UCHIDA
2008-Nov-12 08:37 UTC
[PATCH][cfq-cgroups][Option 1] Introduce a think time valid entry.
This patch introduces a think time valid entry.

Think time is effective when a queue issues few I/O requests, because such
a queue is handled as idle class and the next queue can start dispatching
right after it. However, when many tasks are running, the think time of
each grows, so many queues end up handled as idle class. Those queues
dispatch only a few requests (often just one) before their slice expires;
in other words, ioprio control over them becomes ineffective.

The think time valid entry decides how think time is checked. The value 0
means queues are always handled as idle class. The value 1 behaves the
same as traditional CFQ. The value 2 makes the think time check invalid.

Signed-off-by: Satoshi UCHIDA <s-uchida at ap.jp.nec.com>

---
 block/cfq-cgroup.c          |    2 ++
 block/cfq-iosched.c         |    9 ++++++++-
 include/linux/cfq-iosched.h |    1 +
 3 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/block/cfq-cgroup.c b/block/cfq-cgroup.c
index 776874d..b407768 100644
--- a/block/cfq-cgroup.c
+++ b/block/cfq-cgroup.c
@@ -922,6 +922,7 @@ SHOW_FUNCTION(cfq_cgroup_slice_sync_show, cfq_slice[1], 1);
 SHOW_FUNCTION(cfq_cgroup_slice_async_show, cfq_slice[0], 1);
 SHOW_FUNCTION(cfq_cgroup_slice_async_rq_show, cfq_slice_async_rq, 0);
 SHOW_FUNCTION(cfq_cgroup_ioprio_show, ioprio, 0);
+SHOW_FUNCTION(cfq_cgroup_ttime_valid_show, cfq_ttime_valid, 0);
 #undef SHOW_FUNCTION

 static ssize_t
@@ -1026,6 +1027,7 @@ STORE_FUNCTION(cfq_cgroup_slice_async_store, cfq_slice[0], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_cgroup_slice_async_rq_store, cfq_slice_async_rq,
 	       1, UINT_MAX, 0);
 STORE_FUNCTION(cfq_cgroup_ioprio_store, ioprio, 0, CFQ_CGROUP_MAX_IOPRIO, 0);
+STORE_FUNCTION(cfq_cgroup_ttime_valid_store, cfq_ttime_valid, 0, 2, 0);
 #undef STORE_FUNCTION

 static ssize_t
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index edc23e5..51dccad 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -28,6 +28,8 @@ static const int cfq_slice_sync = HZ / 10;
 static int cfq_slice_async = HZ / 25;
 static const int cfq_slice_async_rq = 2;
 static int cfq_slice_idle = HZ / 125;
+/* think time valid flag */
+static int cfq_ttime_valid = 1;

 /*
  * offset from end of service tree
@@ -1731,7 +1733,8 @@ cfq_update_idle_window(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	    (cfqd->cfqdd->hw_tag && CIC_SEEKY(cic)))
 		enable_idle = 0;
 	else if (sample_valid(cic->ttime_samples)) {
-		if (cic->ttime_mean > cfqd->cfq_slice_idle)
+		if (cic->ttime_mean >
+		    cfqd->cfq_slice_idle * cfqd->cfq_ttime_valid)
 			enable_idle = 0;
 		else
 			enable_idle = 1;
@@ -2304,6 +2307,7 @@ struct cfq_data *cfq_init_cfq_data(struct request_queue *q,
 	cfqd->cfq_slice[1] = cfq_slice_sync;
 	cfqd->cfq_slice_async_rq = cfq_slice_async_rq;
 	cfqd->cfq_slice_idle = cfq_slice_idle;
+	cfqd->cfq_ttime_valid = cfq_ttime_valid;

 	return cfqd;
 }
@@ -2381,6 +2385,7 @@ SHOW_FUNCTION(cfq_slice_idle_show, cfqd->cfq_slice_idle, 1);
 SHOW_FUNCTION(cfq_slice_sync_show, cfqd->cfq_slice[1], 1);
 SHOW_FUNCTION(cfq_slice_async_show, cfqd->cfq_slice[0], 1);
 SHOW_FUNCTION(cfq_slice_async_rq_show, cfqd->cfq_slice_async_rq, 0);
+SHOW_FUNCTION(cfq_ttime_valid_show, cfqd->cfq_ttime_valid, 0);
 #undef SHOW_FUNCTION

 #define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV)			\
@@ -2412,6 +2417,7 @@ STORE_FUNCTION(cfq_slice_sync_store, &cfqd->cfq_slice[1], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_slice_async_store, &cfqd->cfq_slice[0], 1, UINT_MAX, 1);
 STORE_FUNCTION(cfq_slice_async_rq_store, &cfqd->cfq_slice_async_rq,
 	       1, UINT_MAX, 0);
+STORE_FUNCTION(cfq_ttime_valid_store, &cfqd->cfq_ttime_valid, 0, 2, 0);
 #undef STORE_FUNCTION

 #define CFQ_ATTR(name) \
@@ -2427,6 +2433,7 @@ static struct elv_fs_entry cfq_attrs[] = {
 	CFQ_ATTR(slice_async),
 	CFQ_ATTR(slice_async_rq),
 	CFQ_ATTR(slice_idle),
+	CFQ_ATTR(ttime_valid),
 	__ATTR_NULL
 };

diff --git a/include/linux/cfq-iosched.h b/include/linux/cfq-iosched.h
index 920bcb5..8fd4b59 100644
--- a/include/linux/cfq-iosched.h
+++ b/include/linux/cfq-iosched.h
@@ -95,6 +95,7 @@ struct cfq_data {
 	unsigned int cfq_slice[2];
 	unsigned int cfq_slice_async_rq;
 	unsigned int cfq_slice_idle;
+	unsigned int cfq_ttime_valid;

 	struct list_head cic_list;

--
1.5.6.5
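The new entry appears beside the other CFQ tunables in sysfs. A usage
sketch (the device name 'sda' is illustrative; values outside 0-2 are
clamped by the STORE_FUNCTION macro):

  ex. cat /sys/block/sda/queue/iosched/ttime_valid
      echo 2 > /sys/block/sda/queue/iosched/ttime_valid

      0: queues are always handled as idle class
      1: same behavior as traditional CFQ (the default)
      2: the think time check is disabled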
Satoshi UCHIDA
2008-Nov-12 08:37 UTC
[PATCH][cfq-cgroups][Option 2] Introduce ioprio class for top layer.
[This email is either empty or too large to be displayed at this time]