myLC@gmx.net
2008-Jan-23 00:39 UTC
[Btrfs-devel] 3 thoughts about important and outstanding features - cont.
First of all, thank you for your helpful replies! =)
The idea of creating a modified copy-on-write duplicate is
not bad. For videos this would mean that you could, for
instance, skip over the ads, while in practice they still
remain inside the file.
I believe this is already a preferred way of handling
things, since it is easy to implement within the
application.
Video editing, however, often means freeing yourself of
a few gigabytes or splicing pieces together (which can take
hours, when it could & should take only a few seconds).
From a programmer's point of view the present situation
seems rather frustrating. Things haven't changed that much
since the old Datasette went out of order...
This will become even more uncomfortable in the near future,
with hard drives slowly being replaced by solid-state memory
(Flash, FeRAM, MRAM, etc.). In the case of Flash, the "art"
of reading everything in and writing it back out instead of
making in-place modifications also means a reduced lifetime
of the media...
As for XFS, I'm currently using it - as it makes life
somewhat easier (xfsdump and such).
XFS knows "sparse files". This is mostly useful for
databases or for mmap'ed applications firing their
calculation results into vast (and mostly empty) areas -
which is very simple and yet efficient...
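The sparse-file behavior described above is easy to demonstrate; here is a minimal sketch in Python (the 64 MiB offset is an arbitrary choice for illustration):

```python
import os
import tempfile

# Create a file with a large hole: seek far past the start, then
# write a single byte. On filesystems with sparse-file support
# (XFS, ext4, btrfs, ...) blocks are allocated only for the
# region actually written.
fd, path = tempfile.mkstemp()
try:
    os.lseek(fd, 64 * 1024 * 1024, os.SEEK_SET)  # 64 MiB hole
    os.write(fd, b"x")
    st = os.stat(path)
    print("apparent size:  ", st.st_size)          # 64 MiB + 1 byte
    print("allocated bytes:", st.st_blocks * 512)  # far less than that
finally:
    os.close(fd)
    os.remove(path)
```

The mmap'ed-calculation case works the same way: pages that are never dirtied never consume disk blocks.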
However, that's no help when it comes to the problems
mentioned above. >:-[
(Although XFS has the nifty feature of guaranteed I/O rates,
which could come in handy for DVB-devices - but that's a
different story.)
> Different disks do different things (including inverting
> where block 0 lies).
Yes, of course - I know that. However, it is rather easy to
find out which tracks are the faster ones and where it
becomes slower. Usually you can also tell by the model
string. There are only a few possibilities anyhow.
My point was that the difference is SO IMMENSE that it
should be worth thinking about it. We're not talking about
something around 5% or so...
For instance, I made the mistake of defragmenting my
Windows XP partition (NTFS) - utilizing a real
defragmentation program of course (XP's own leaves the files
fragmented in a mostly arbitrary fashion ;-).
The program shoved the hibernation file (which has to be on
the system partition) towards the hub, i.e. the slow inner
tracks, the result being a suspend-to-disk/resume now taking
twice as long as before (big p.i.t.a.).
The filesystem is free to put the files where it wants to.
Even the most sophisticated algorithm probably still fades
compared to the sheer knowledge of an educated biped.
My guess is that you already have various "special" files
and hence attributes to indicate such things. Adding
attributes indicating a certain preference (location) of a
file shouldn't be that difficult if I'm not mistaken. The
yield can be enormous though - point being: it should be
worth a thought...
Again, thanks for your expert notions! >:O}
LC (myLC@gmx.net)
Chris Mason
2008-Jan-23 06:02 UTC
[Btrfs-devel] 3 thoughts about important and outstanding features - cont.
On Wednesday 23 January 2008, myLC@gmx.net wrote:
> First of all, thank you for your helpful replies! =)
>
> The idea of creating a modified copy-on-write duplicate is
> not bad. For videos this would mean that you could, for
> instance, skip over the ads, while in practice they still
> remain inside the file.
> I believe this is already a preferred way of handling
> things, since it is easy to implement within the
> application.
>
> Video editing, however, often means to liberate yourself of
> a few gigabytes or putting them together (which can take
> hours, were it could & should take only a few seconds).
> From a programmer's point of view the present situation
> seems rather frustrating. Things haven't changed that much
> since the old Datasette went out of order...
> This will even be more uncomfortable in the near future,
> with harddrives being slowly replaced by RAM (Flash, FE, M-
> RAM, etc.). In case of Flash-RAMs, the "art" of reading in
> and writing it back out instead of making in-place
> modifications, also means a reduced lifetime of the media...

I think you're saying that a copy-on-write duplicate is not
sufficient for video editing. The thing to keep in mind is
that with btrfs COW is done on an extent basis, and extents
can be small (1MB is small for these files). So, you can do
the video editing part via cow, slice out the parts you
don't want, and you'll get that storage back. It is just a
matter of setting the max extent size on the file.

> > Different disks do different things (including inverting
> > where block 0 lies).
>
> Yes, of course - I know that. However, it is rather easy to
> find out which tracks are the faster ones and where it
> becomes slower. Usually you can also tell it by the model-
> string. There are only a few possibilities anyhow.
> My point was that the difference is SO IMMENSE that it
> should be worth thinking about it. We're not talking about
> something around 5% or so...
> For instance I made the mistake of defragmentating my
> Windows-XP partition (NTFS) - utilizing a real
> defragmentation program of course (XP's leaves the files
> fragmented in a mostly arbitrary fashion;-).
> The program shoved the hibernation file (which has to be on
> the system partition) towards the hub, the result being a
> suspend to disk/reawake now taking twice as long as before
> (big p.i.t.a.).

I don't think you can conclude moving the hibernation file
is the cause of the performance problem. XP probably frees
as much file cache as it can before suspend to disk, which
means that when you resume you have to seek all over the
drive to load files back in. It is entirely possible the new
layout is less optimal for this workload.

-chris
myLC@gmx.net
2008-Jan-25 02:00 UTC
[Btrfs-devel] Re: 3 thoughts about important and outstanding features - cont.
Chris Mason wrote:
> I think you're saying that a copy-on-write duplicate is not
> sufficient for video editing. The thing to keep in mind is
> that with btrfs COW is done on an extent basis, and extents
> can be small (1MB is small for these files). So, you can do
> the video editing part via cow, slice out the parts you
> don't want, and you'll get that storage back. It is just a
> matter of setting the max extent size on the file.
Maybe I'm missing a point here - the COW file is still
nothing without the original, right?
If so, then how fast is it when it comes to throwing away
the original? Provided that the former works reasonably
fast, then you would have the inserting into/removing from a
file problem solved (1 MB is indeed becoming small, even for
embedded systems and certainly for harddrives:-).
You stated before that it wouldn't be "that fast". I'm
somewhat curious about that - it should be somewhat faster
than copying the whole (or even half of the) damn thing
(say 10 GB)...
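For what it's worth, the extent-based COW scheme Chris describes can be illustrated with a toy model (pure Python; all names and the three-extent layout are hypothetical, not the actual btrfs data structures). A "file" is just a list of references into shared extents, so slicing out a range means dropping references, and an extent's storage comes back once no file references it:

```python
# Toy model of extent-based copy-on-write (illustrative only).
# A "file" is a list of (extent_id, offset, length) references
# into shared, refcounted extents.

class ExtentPool:
    def __init__(self):
        self.refcount = {}   # extent_id -> number of referencing files

    def hold(self, eid):
        self.refcount[eid] = self.refcount.get(eid, 0) + 1

    def release(self, eid):
        self.refcount[eid] -= 1
        if self.refcount[eid] == 0:
            del self.refcount[eid]   # storage reclaimed here

def cow_clone(pool, file_refs):
    """Snapshot a file: copy only the reference list, not the data."""
    for eid, _, _ in file_refs:
        pool.hold(eid)
    return list(file_refs)

def delete_file(pool, file_refs):
    for eid, _, _ in file_refs:
        pool.release(eid)

pool = ExtentPool()
# Original video laid out in three extents; extent "ad" holds
# the commercials.
original = [("intro", 0, 1), ("ad", 0, 1), ("main", 0, 1)]
for eid, _, _ in original:
    pool.hold(eid)

# Edited copy: same data minus the ad extent -- no bytes copied.
edited = cow_clone(pool, original)
edited.remove(("ad", 0, 1))
pool.release("ad")

# Throwing away the original is cheap: just drop its references.
# The "ad" extent then has no referers, so its space is freed.
delete_file(pool, original)
print(sorted(pool.refcount))   # ['intro', 'main'] -- 'ad' reclaimed
```

So both the edit and the deletion of the original are metadata operations proportional to the number of extents, not to the gigabytes involved.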
> I don't think you can conclude moving the hibernation file
> is the cause of the performance problem.
Trust me, it is (and I oughta kick myself for that one;-).
> XP probably frees as much file cache as it can before
> suspend to disk, which means that when you resume you have
> to seek all over the drive to load files back in...
Nope. XP is rather primitive in (not only) that matter.
The size of the hibernation file always matches the amount
of installed RAM. The memory simply gets written into the
file and read back upon awakening. There is a simple
progress bar, now indicating that the whole operation takes
a lot longer than before. After the memory is read back it
reinitializes a few devices and such (1-3 seconds)...
By coincidence I have the interior of an old harddisk right
next to me (pinned to a wall). A disk measures about 9.5cm
in diameter on the outer rim and 2.5 on the inner.
The difference between inner and outer rim is usually left
up to the manufacturers (the smaller the inner rim, the more
they can fit on the disk without any marketing problems -
i.e., the bigger the profits).
Now, with that old drive you can easily see that:
- the outer rim measures about 9.5*pi = ~30cm in circumference
- the inner: ~8cm
Since the speed of rotation remains constant, it is
relatively safe to conclude (and computer magazines testing
harddrives confirm it) that there is a HUGE difference in
performance between outer and inner tracks (how much of a
difference is left up to the manufacturer, as previously
stated).
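A quick back-of-the-envelope check of the geometry quoted above, assuming constant rotational speed and roughly constant linear bit density across zones (so sequential throughput scales with track circumference; real drives reserve part of the platter, so the usable ratio is smaller):

```python
from math import pi

# Dimensions of the wall-mounted drive described above.
outer_d_cm = 9.5   # outer rim diameter
inner_d_cm = 2.5   # inner rim diameter

outer_circ = pi * outer_d_cm   # ~30 cm, as stated
inner_circ = pi * inner_d_cm   # ~8 cm, as stated

# At constant RPM, the outer track sweeps past the head this many
# times more data per revolution than the inner track:
ratio = outer_circ / inner_circ
print(f"outer {outer_circ:.1f} cm, inner {inner_circ:.1f} cm, "
      f"ratio ~{ratio:.1f}x")   # ratio ~3.8x
```

That is nowhere near a 5% effect, which is the point being made here.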
Why not make use of that and boost performance?
XFS already does a good job in keeping files within a
directory together. This way, for instance, the include
files get read in a chunk (read-ahead) with a bit of luck.
That feature combined with a preference indicator for files
(priority) would already be very powerful. It would enable
you to keep files relevant to the system's performance that
tend to get read together grouped and in the faster zones
(and much more). As previously stated, this can very much
double the performance at very little cost.
Yes, I have slaughtered harddrives - I admit it. X-|
But I was young and needed no money...
LC (myLC@gmx.net)