Ian, et al.,
I've been doing some thinking about what taskomatic needs to do in its next
incarnation, along with ways of doing it.
WHAT:
1) Taskomatic needs to be able to run on multiple machines at the same time,
accessing a central database
2) Taskomatic needs to be able to fire off tasks relating to different VMs (or
storage pools) concurrently (whether it's just run on one machine or many).
HOW:
1) I think we should actually have two modes for taskomatic: standalone (i.e. I
am the only taskomatic) and multi-host (there are other taskomatics). The
reason for this is that in the standalone case, we probably want to fork one
taskomatic process for each VM (or storage pool) we want to perform actions on.
In the multi-host case, we don't know how many other taskomatics might be out
there doing tasks, so we keep one process per machine (this should be a
command-line option/config file option).
2) We need to lock rows in the database as each taskomatic wakes up and finds
work to do. Luckily, both Postgres and ActiveRecord support row locking, so the
underlying infrastructure is there.
In the standalone case, taskomatic should wake up, look at how many different
VMs (or storage pools) currently have tasks queued, and fork off that many
workers to do the work (i.e. if you have start_vm 1, start_vm 2, stop_vm 1 in
the queue, you would fork off two workers). Each worker would lock all of the
rows of the database corresponding to its VM (i.e. the first worker would lock
all rows having to do with VM 1), and then busy itself with executing the
actions for that VM serially. I guess the locking isn't strictly necessary
here, since we can tell each worker which VM or storage ID it should work on,
but it makes it more like the multi-host case.
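The fork-per-VM dispatch described above can be sketched in plain Ruby. This is only an illustration: task rows are stubbed as structs, and names like `partition_queue` and `run_worker` are invented here; a real taskomatic would load the rows via ActiveRecord and fork() actual worker processes.

```ruby
# Minimal sketch of the standalone dispatch logic. Task rows are stubbed;
# real taskomatic would pull them from the DB and fork a worker per VM.
QueuedTask = Struct.new(:action, :vm_id)

def partition_queue(tasks)
  # One work list per VM: start_vm 1, start_vm 2, stop_vm 1 => two workers
  tasks.group_by(&:vm_id)
end

def run_worker(vm_id, vm_tasks)
  # Each worker executes its VM's actions serially, in queue order.
  vm_tasks.map { |t| "#{t.action}(#{vm_id})" }
end

queue = [QueuedTask.new("start_vm", 1),
         QueuedTask.new("start_vm", 2),
         QueuedTask.new("stop_vm", 1)]
workers = partition_queue(queue)
results = workers.map { |vm_id, ts| run_worker(vm_id, ts) }
```

The example queue above yields exactly the two workers from the text: one handling both VM 1 actions in order, one handling VM 2.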
In the multi-host case, things are a bit simpler; the taskomatic running on
each individual machine would just wake up, find the first task that is not in
progress and not locked, and lock all task rows having to do with that VM. Then
it would execute those tasks and go back to looking for more tasks.
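The claim step can be simulated with an in-memory table to show the intended semantics. In the real system the find-and-lock would happen inside a transaction using Postgres row locks (`SELECT ... FOR UPDATE`, which ActiveRecord exposes via its pessimistic locking support); the field names here are hypothetical.

```ruby
# Sketch of the multihost claim step, simulated with an in-memory table.
# A real taskomatic would do this inside a DB transaction with row locks
# so that two hosts can't claim the same VM's rows.
def claim_next_vm!(tasks)
  first = tasks.find { |t| t[:state] == "queued" && !t[:locked] }
  return nil unless first
  claimed = tasks.select { |t| t[:vm_id] == first[:vm_id] && t[:state] == "queued" }
  # Lock all rows for that VM and mark them in progress.
  claimed.each { |t| t[:locked] = true; t[:state] = "in_progress" }
  claimed
end

table = [
  { vm_id: 1, action: "start_vm", state: "queued", locked: false },
  { vm_id: 2, action: "start_vm", state: "queued", locked: false },
  { vm_id: 1, action: "stop_vm",  state: "queued", locked: false },
]
batch = claim_next_vm!(table)   # claims both VM 1 tasks at once
```

A second caller (another host's taskomatic) would then claim VM 2's task, and a third would find nothing left to do.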
Note that in both standalone and multihost case, it's OK for multiple
taskomatics to be sending commands to identical managed nodes. Libvirtd itself
is serial, so commands might get intertwined, but that's OK since we are
explicitly making sure our taskomatics work on different VMs or storage pools.
3) Transaction support in taskomatic (hi slinaberry!). I'm not sure about this
one; we are modifying state external to the database, so I'm not sure
"rolling back" a transaction means a whole hill of beans to us. In fact, I
might argue that rolling back is worse in this case; if you modified external
state and then crashed, when you come back you might "roll back" your VM state
to something that's totally invalid, and it'll need to be corrected by
host-status anyway. Does anyone have further thoughts here?
THOUGHTS:
Interestingly, I think we can evolve the current taskomatic to do this, rather
than rewriting the thing from scratch. Since we cleaned up error handling and
reporting, I actually feel a lot better about the state of taskomatic. It
really just needs corner/error cases better handled, and then
introducing some of the above concepts one at a time. Is there anything in
taskomatic right now that people are particularly unhappy about that might
warrant a re-write?
Chris Lalancette
On Mon, Jun 23, 2008 at 4:06 AM, Chris Lalancette <clalance at redhat.com> wrote:
> 3) Transaction support in taskomatic (hi slinaberry!). I'm not sure about this
> one; we are modifying state external to the database, so I'm not sure
> "rolling-back" a transaction means a whole hill of beans to us. In fact, I
> might argue that rolling back is worse in this case; if you modified external
> state, and then crashed, when you come back you might "roll-back" your VM state
> to something that's totally invalid, and you'll need to be corrected by
> host-status anyway. Does anyone have further thoughts here?

Only that this is the tough nut I was unable to crack. And...

> THOUGHTS:
> Interestingly, I think we can evolve the current taskomatic to do this, rather
> than re-writing the thing from scratch. <snip>

...that the 2PC necessity (or not?) in taskomatic was the only issue that
warranted a rewrite in my mind. I've been away from taskomatic for a while
now; I will take a look at it again. I am glad to read that the exception
handling/reporting has been worked on; that was another thing that was an
issue.

Steve
Throwing in my thoughts/concerns (though I don't know taskomatic well, so
shoot them down if they are invalid).

Chris Lalancette wrote:
> <snip>
> 3) Transaction support in taskomatic (hi slinaberry!). I'm not sure about this
> one; we are modifying state external to the database, so I'm not sure
> "rolling-back" a transaction means a whole hill of beans to us. <snip>
> Does anyone have further thoughts here?

I actually thought I had more problems with this when I first read it, but
mostly this all seems very reasonable to me on second pass. I think if the
external state has been modified but the task failed or did not complete as
expected, instead of rolling back it would make more sense to _change_
whatever fields are associated with the changed state, so when the user sees
the WUI they do not think nothing has happened, but can instead see a message
showing what the real state is. As an example, this could be something like
'VM restart failed, trying again', or some such.

> THOUGHTS:
> <snip> Is there anything in
> taskomatic right now that people are particularly unhappy about that might
> warrant a re-write?

I mentioned this in IRC, but for those who didn't see it: what if, instead of
reading/writing directly to the database, taskomatic _and_ the WUI
communicate using an AMQP queue?
Let me attempt an example of how this might work, and how it could be useful.
I am creating a new VM in the WUI. I type in new information to start the
process, and the info is saved to the database. Now I need to wait for
taskomatic to do tasks x, y, and z before I can go on to do whatever else I
want to do (let's say I want to reboot the VM for some post-install config).

If we use queues/messaging, I as the user could immediately say I want to
reboot. This request is added to the list after x, y, and z (this list could
even be shown in the WUI, if appropriate). Taskomatic listens for new events
to be published on whatever channel it is subscribed to, and starts doing the
work as it receives notice. As it completes a task, it fires off a message and
any database changes are saved (this could either still be done by taskomatic,
or by something else that listens for completion messages and takes care of
these updates). It then grabs the next task, and so on. Meanwhile, the WUI
gets notified as each task is completed, so the user can see where they are in
the queue while continuing to do whatever else they might need to do.

A side benefit of this approach is that we may not have a locking issue at
all, since the apps work strictly from a queue instead of directly against the
db. A last thought is that this might also obviate the need for two different
modes (multi-machine or single), though I am not positive on that point.

-j
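The flow described above can be made concrete with a small Ruby sketch, using Ruby's thread-safe `Queue` as a stand-in for an AMQP broker (qpid). The WUI side publishes task messages; a taskomatic worker consumes them in order and publishes completion events the WUI could subscribe to. All names here are illustrative, not an actual qpid API.

```ruby
# Schematic of the queue-based WUI <-> taskomatic flow. Thread::Queue
# stands in for AMQP channels; no real broker is involved.
task_queue  = Queue.new   # WUI -> taskomatic
event_queue = Queue.new   # taskomatic -> WUI (completion notices)

worker = Thread.new do
  while (task = task_queue.pop)
    break if task == :shutdown
    # ... the real worker would perform the task via libvirt here ...
    event_queue << "completed: #{task}"   # fire a completion message
  end
end

# The user queues x, y, z and can immediately queue the reboot too.
%w[create_vm install_vm reboot_vm].each { |t| task_queue << t }
task_queue << :shutdown
worker.join

# The WUI drains completion events to show queue progress.
events = []
events << event_queue.pop until event_queue.empty?
```

Because the worker consumes strictly in order, the reboot only runs after the earlier tasks, which is exactly the ordering benefit being argued for.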
On Mon, Jun 23, 2008 at 11:06:08AM +0200, Chris Lalancette wrote:
> WHAT:
> 1) Taskomatic needs to be able to run on multiple machines at the same time,
> accessing a central database

This is an over-specialization - basically taskomatic needs to be parallelized
whether it runs on one or many machines.

> 2) Taskomatic needs to be able to fire off tasks relating to different VMs (or
> storage pools) concurrently (whether it's just run on one machine or many).

It strikes me that this is avoiding the more general problem - that there are
explicit dependencies between tasks. Serializing tasks per-VM is not
expressing this concept of dependencies directly.

So, as an example, a task starting a VM may have a dependency on a task to
start a storage pool (or refresh the volume list in an existing pool). Now,
while these 2 tasks are pending, another VM start task is scheduled which has
a dependency on the same storage task.

Or the admin may have some runtime policy to the effect that during the hours
9-5 they want VM 'x' to be running on a machine, and then at 5pm shutdown 'x'
and startup 'y' in its place. This has a strict ordering requirement between
the 2 VMs - they can't be scheduled independently because there won't be RAM
for 'y' until 'x' is shutdown.

> HOW:
> 1) I think we should actually have two modes for taskomatic: standalone (i.e. I
> am the only taskomatic), and multi-host (there are other taskomatics). The
> reason for this is in the standalone case, we probably want to fork one
> taskomatic process for each VM (or storage pool) we want to perform actions on.
> In the multi-host case, we don't know how many other taskomatics might be out
> there doing tasks, so we keep one process per machine (this should be a
> command-line option/config file option)

Having two modes is inserting an artificial distinction that really doesn't
exist. Even if there is only a single instance of taskomatic running on a
single machine in the data center, there is going to be parallelization
because the world has gone heavily SMP, whether multi-socket or multi-core or
both. By the very nature of its work, taskomatic is not going to be
bottlenecked on CPU, instead spending a lot of time waiting on results from
operations. To maximise utilization of a single node, taskomatic will want to
be heavily parallelized, whether fork() based or thread based, on some
multiple of the number of CPUs. On a 4 logical CPU machine we perhaps want 16
taskomatic threads running. So whether those 16 threads are on a single 4 cpu
machine, or a pair of 2 cpu machines, is not a distinction we need to
consider. We just scale horizontally to add capacity as required.

> 2) We need to lock rows in the database as each taskomatic wakes up and finds
> work to do. Luckily both postgres and activerecord support row locking, so the
> underlying infrastructure is there.

We only need row locking if you're working on the model where you keep the
transaction open for the duration of taskomatic's processing for that
particular job and commit/rollback on completion. It may be that you simply
immediately mark a task as 'in progress' and commit that change right at the
start. Then later have a second transaction where you fill in the result of
the task, whether success or failure.

> In the standalone case, taskomatic should wake up, look at how many different
> VMs (or storage pools) there are currently tasks queued for, and fork off that
> many workers to do work (i.e. if you have start_vm 1, start_vm 2, stop_vm 1 in
> the queue, you would fork off two workers).
> Each worker would lock all of the
> rows of the database corresponding with their VM (i.e. the first worker would
> lock all rows having to do with VM 1), and then busy themselves with executing
> the actions for that VM serially. I guess the locking isn't strictly necessary
> here, since we can tell each worker which VM or storage ID it should work on,
> but it makes it more like the multihost case.

Forking a thread per VM doesn't work because there can be ordering
requirements between tasks on different VMs, and/or storage. Explicit task
dependencies need to be tracked. At which point, each taskomatic
process/thread in existence simply waits for a task to arrive which has no
pending dependent tasks, claims it and goes to work on it. Completing the task
will then satisfy dependent tasks, allowing them to be processed, and so on.
No need to specialize a particular worker process to a particular object.

> Note that in both standalone and multihost case, it's OK for multiple
> taskomatics to be sending commands to identical managed nodes. Libvirtd itself
> is serial, so commands might get intertwined, but that's OK since we are
> explicitly making sure our taskomatics work on different VMs or storage pools.

Don't rely on libvirtd being serial - we may well find ourselves making it
fully parallelized, allowing operations to be made & executed concurrently. At
the very least we'll have concurrent execution when adding async background
jobs.

> 3) Transaction support in taskomatic (hi slinaberry!). I'm not sure about this
> one; we are modifying state external to the database, so I'm not sure
> "rolling-back" a transaction means a whole hill of beans to us. In fact, I
> might argue that rolling back is worse in this case; if you modified external
> state, and then crashed, when you come back you might "roll-back" your VM state
> to something that's totally invalid, and you'll need to be corrected by
> host-status anyway.
> Does anyone have further thoughts here?

I agree that life probably isn't going to be as simple as just rolling back. I
think it's much more likely we'll need to explicitly track the failure against
the task. So, similar to the example I mentioned earlier, we mark the task as
in progress in the DB, and then later update it with the outcome of the task.
If a task failed, you'd then want to fail any tasks depending on it, and tasks
depending on those, etc, etc. This gives oVirt the ability to track the
failures and automatically re-schedule new tasks to try again, or let the
admin choose a different action. Simply rolling back the transaction means
you're not capturing any of this and just re-trying over and over without
necessarily solving the problem.

Regards,
Daniel.
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
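The dependency-tracking model Daniel describes (claim only tasks whose dependencies have succeeded; record outcomes explicitly; cascade failures to dependents) can be sketched in plain Ruby. The task structure and helper names are invented for illustration; real state would live in the DB.

```ruby
# Sketch of dependency-aware scheduling with explicit outcome tracking.
# state is one of: "queued", "succeeded", "failed".
DepTask = Struct.new(:id, :deps, :state)

# A task is runnable once every dependency has succeeded.
def ready_tasks(tasks)
  by_id = tasks.to_h { |t| [t.id, t] }
  tasks.select do |t|
    t.state == "queued" && t.deps.all? { |d| by_id[d].state == "succeeded" }
  end
end

# Record the outcome; a failure cascades to all transitive dependents,
# which oVirt could then surface to the admin or re-schedule.
def record_outcome!(tasks, task, ok)
  task.state = ok ? "succeeded" : "failed"
  return if ok
  tasks.each do |t|
    record_outcome!(tasks, t, false) if t.state == "queued" && t.deps.include?(task.id)
  end
end

tasks = [
  DepTask.new(:start_pool, [], "queued"),
  DepTask.new(:start_vm_x, [:start_pool], "queued"),
  DepTask.new(:start_vm_y, [:start_pool], "queued"),
]
```

Here only `:start_pool` is initially runnable; if it fails, both VM starts are failed rather than left hanging, matching the "fail tasks depending on it" behaviour above.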
On Mon, 2008-06-23 at 11:06 +0200, Chris Lalancette wrote:
> 3) Transaction support in taskomatic (hi slinaberry!). I'm not sure about this
> one; we are modifying state external to the database, so I'm not sure
> "rolling-back" a transaction means a whole hill of beans to us. <snip>
> Does anyone have further thoughts here?

I wouldn't even worry about rolling back transactions - it seems like a lot of
pain for very dubious gain.

A slightly related question is: how do you deal with user actions that require
multiple tasks to be performed? You'd need to keep track of some higher-level
object that groups related tasks together, and when one of them fails give the
user an option to retry (e.g. when creating a StoragePool failed, the user
manually adds more storage and then restarts that whole task group, which will
only redo the tasks that failed or blocked on failure).

David
So I've been doing some thinking on this, and here's what I've come
up with to date. As usual any input is appreciated.
I wanted to make sure we had some basic requirements we could discuss.
Taskomatic should:
- Execute many tasks in a timely manner (presumably via a distributed,
multi-threaded setup)
- Execute tasks in the correct order.
- Implement transaction support (if we can/want it)
- Have good error reporting.
- Include authentication and encryption.
- It would be nice to be able to see the state of the queues in the WUI.
- Implement at least the following tasks:
- start/stop/create/destroy VMs
- clear host functionality. Destroy all VMs/pools.
- Migrate vms.
- Storage pool creation/destruction/management
Now, if we break the system down into basic components, I'm thinking:
- Task producer: This is the WUI now but could also be some other script in the
future. Creates tasks and adds them to the queue.
- Task ordering system: At some point I think a single process needs to separate
out the tasks and order them. I think it would be useful to actually move them
into separate independent queues that can be executed in parallel. This would
be done such that each action in a given queue would need to be done in order,
but each queue will not have dependencies on anything in another queue and so
they can be worked on in parallel. While this could be a bottleneck I think
that the logic required to order the tasks will not be onerous and should keep
up with high loads.
- Task implementation system: This would take a given queue and implement the
tasks within it, dispatching requests to hosts as needed. Any errors occurring
in this system will be reported and possibly we could implement rollback.
- Host/vm/pool State: In addition to the above, in order to implement queue
ordering and determine for certain that a given task succeeded, we'll
require host/vm/storage pool state information that is as up to date as
possible.
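The task ordering component above can be sketched in Ruby: split the incoming task list into independent queues, where two tasks land in the same queue iff they touch a common object (VM, host, or pool), found here with a tiny union-find. The task shape (`action:`, `objects:`) is hypothetical.

```ruby
# Sketch of the ordering step: partition tasks into queues that can be
# worked in parallel; within each queue, original order is preserved.
def independent_queues(tasks)
  parent = Hash.new { |h, k| h[k] = k }
  find = ->(x) { parent[x] == x ? x : (parent[x] = find.(parent[x])) }
  tasks.each do |t|
    # Union all objects a task touches into one group.
    roots = t[:objects].map { |o| find.(o) }
    roots.each { |r| parent[r] = roots.first }
  end
  tasks.group_by { |t| find.(t[:objects].first) }.values
end

tasks = [
  { action: "start_pool", objects: ["pool1"] },
  { action: "start_vm",   objects: ["vm1", "pool1"] },  # shares pool1
  { action: "start_vm",   objects: ["vm2"] },           # unrelated
]
queues = independent_queues(tasks)
```

With this input, the pool creation and the VM that uses it end up serialized in one queue, while the unrelated VM start goes into its own queue and can proceed in parallel.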
So in terms of implementing this, a lot of it comes down to technology
selection.
Queues could continue to be implemented in postgresql. It would be nice however
to have something that was event driven and did not require polling. It is
possible we could use a python stored procedure to alert the consumer but that
is postgresql specific and may have its own problems. We may also consider
using qpid for this, as it can do 'durable' queues which are stored on
disk and survive across reboots. Somewhere, however, something needs to have a
complete view of all queues so it can keep track of things.
I think a single ruby process could be used to order the tasks and place them in
per-thread/process queues. If using a DB I think we could either migrate the
entries to a new 'in-progress' table, or update the row with the ID of
the process/thread and possibly the sequence number to be used in implementing
the queue. Again however, this is another polling point and another point where
we could shift to qpid msg queues. Actually, we may be best off getting
commands from the wui via qpid and then keeping the separate queues in the
database, as that would allow the WUI to easily display the status of the work
queues. Jobs could be marked as 'completed' rather than removed until the
queue is complete. Individual processes could be awoken using one of various
methods.
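The "mark completed rather than remove" idea can be sketched as a small queue object: finished jobs stay visible (so the WUI can show progress) until the whole queue drains. The class and field names are illustrative only.

```ruby
# Sketch of a work queue whose jobs stay visible after completion, so a
# WUI could render per-queue progress; the queue is discarded only once
# every job in it has completed.
class WorkQueue
  attr_reader :jobs

  def initialize(names)
    @jobs = names.map { |n| { name: n, state: "queued" } }
  end

  def complete!(name)
    job = @jobs.find { |j| j[:name] == name }
    job[:state] = "completed" if job
    # Drop the queue once everything in it is done.
    @jobs.clear if @jobs.all? { |j| j[:state] == "completed" }
  end

  def status
    @jobs.map { |j| "#{j[:name]}: #{j[:state]}" }
  end
end
```

Partially finished queues keep showing both completed and pending jobs; a fully finished queue disappears from the status view.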
One possibility is that the task implementers be a mix of ruby processes running
on the wui and C or C++ applications running on the node that use qpid modeling
to represent the host, VMs, and storage pools. The managed node daemon would
model the host/vm/storage pools as objects and the wui-side process could call
higher level functions to get things done. eg vm->start() kind of thing
where the daemon would then implement this using the libvirt C api. State
information for the host, VMs and storage pools would also be made available to
the wui side and could be subscribed to by the implementers and the task
ordering system.
Alternatively at this point, we could stick with libvirt. The only problems I
see with this are the libvirt backport issue and the potentially long timeout
issue (it looks like amqp can specify timeouts, or one could go async). It would be
nice to have up to date status information available from the managed nodes as
well. It would also give us the opportunity to clean up a good deal of the
daemons we have currently implementing various things.
Anyway, that's what I'm thinking for now. I left out a lot of details
for the sake of brevity.
Thoughts?
Ian