myLC@gmx.net
2008-Jan-23 00:39 UTC
[Btrfs-devel] 3 thoughts about important and outstanding features - cont.
First of all, thank you for your helpful replies! =)

The idea of creating a modified copy-on-write duplicate is not bad. For videos this would mean that you could, for instance, skip over the ads, while in practice they still remain inside the file. I believe this is already a preferred way of handling things, since it is easy to implement within the application.

Video editing, however, often means cutting a few gigabytes out of a file or splicing files together (which can take hours, where it could and should take only a few seconds). From a programmer's point of view the present situation seems rather frustrating. Things haven't changed that much since the old Datasette went out of order... This will become even more uncomfortable in the near future, with hard drives slowly being replaced by solid-state memory (Flash, FeRAM, MRAM, etc.). In the case of Flash, the "art" of reading a file in and writing it back out, instead of making in-place modifications, also means a reduced lifetime of the media...

As for XFS, I'm currently using it, as it makes life somewhat easier (xfsdump and such). XFS knows "sparse files". This is mostly useful for databases or for mmap'ed applications firing their calculation results into vast (and mostly empty) areas - which is very simple and yet efficient... However, that's no help when it comes to the problems mentioned above. >:-[ (Although XFS has the nifty feature of guaranteed I/O rates, which could come in handy for DVB devices - but that's a different story.)

> Different disks do different things (including inverting
> where block 0 lies).

Yes, of course - I know that. However, it is rather easy to find out which tracks are the faster ones and where it becomes slower. Usually you can also tell by the model string. There are only a few possibilities anyhow. My point was that the difference is SO IMMENSE that it should be worth thinking about. We're not talking about something around 5% or so...
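The sparse-file trick mentioned earlier (mmap'ed applications firing results into vast, mostly empty areas) can be demonstrated in a few lines. This is a generic POSIX/Linux sketch, not XFS-specific; any filesystem with hole support behaves the same way:

```python
import os
import tempfile

# Minimal sketch of a sparse file: seeking far past EOF and writing
# allocates blocks only for the data actually written, not for the
# hole in between.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.seek(1 << 30)      # jump 1 GiB past the start of the file
    f.write(b"end")      # write 3 bytes; the 1 GiB hole stays unallocated

st = os.stat(path)
logical = st.st_size             # 1 GiB + 3 bytes of logical size
physical = st.st_blocks * 512    # physical allocation: a few KiB at most
os.unlink(path)
```

The gap between `logical` and `physical` is the whole point: the database or mmap'ed application pays only for the blocks it actually touches.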
For instance, I made the mistake of defragmenting my Windows XP partition (NTFS) - utilizing a real defragmentation program, of course (XP's leaves the files fragmented in a mostly arbitrary fashion ;-). The program shoved the hibernation file (which has to be on the system partition) towards the hub, the result being that a suspend-to-disk/reawake now takes twice as long as before (big p.i.t.a.).

The filesystem is free to put the files wherever it wants to. Even the most sophisticated algorithm probably still fades compared to the sheer knowledge of an educated biped. My guess is that you already have various "special" files and hence attributes to indicate such things. Adding attributes indicating a certain preference (location) for a file shouldn't be that difficult, if I'm not mistaken. The yield can be enormous, though - point being: it should be worth a thought...

Again, thanks for your expert notions! >:O}

LC (myLC@gmx.net)
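To make the attribute idea concrete: no filesystem (btrfs or XFS) actually honours a placement hint like this today - the attribute name below is invented purely for illustration - but extended attributes in the generic `user.` namespace would be a natural place to store one:

```python
import os
import tempfile

# Hypothetical sketch: "user.placement" is an invented attribute name,
# not a real btrfs/XFS feature. It only shows how a placement preference
# could be tagged onto a file for an allocator/defragmenter to read.
fd, path = tempfile.mkstemp(dir=".")
os.close(fd)

os.setxattr(path, b"user.placement", b"fast-zone")  # mark as performance-relevant
hint = os.getxattr(path, b"user.placement")         # what the allocator would see
os.unlink(path)
```

A defragmenter or block allocator that understood such a tag could group `fast-zone` files in the outer tracks - which is exactly the "preference indicator" being proposed.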
Chris Mason
2008-Jan-23 06:02 UTC
[Btrfs-devel] 3 thoughts about important and outstanding features - cont.
On Wednesday 23 January 2008, myLC@gmx.net wrote:
> First of all, thank you for your helpful replies! =)
>
> The idea of creating a modified copy-on-write duplicate is
> not bad. For videos this would mean that you could, for
> instance, skip over the ads, while in practice they still
> remain inside the file.
> I believe this is already a preferred way of handling
> things, since it is easy to implement within the
> application.
>
> Video editing, however, often means to liberate yourself of
> a few gigabytes or putting them together (which can take
> hours, where it could & should take only a few seconds).
> From a programmer's point of view the present situation
> seems rather frustrating. Things haven't changed that much
> since the old Datasette went out of order...
> This will even be more uncomfortable in the near future,
> with harddrives being slowly replaced by RAM (Flash, FE,
> M-RAM, etc.). In case of Flash-RAMs, the "art" of reading in
> and writing it back out instead of making in-place
> modifications, also means a reduced lifetime of the media...

I think you're saying that a copy-on-write duplicate is not sufficient for video editing. The thing to keep in mind is that with btrfs COW is done on an extent basis, and extents can be small (1MB is small for these files). So, you can do the video editing part via COW, slice out the parts you don't want, and you'll get that storage back. It is just a matter of setting the max extent size on the file.

> > Different disks do different things (including inverting
> > where block 0 lies).
>
> Yes, of course - I know that. However, it is rather easy to
> find out which tracks are the faster ones and where it
> becomes slower. Usually you can also tell it by the
> model-string. There are only a few possibilities anyhow.
> My point was that the difference is SO IMMENSE that it
> should be worth thinking about it. We're not talking about
> something around 5% or so...
> For instance I made the mistake of defragmentating my
> Windows-XP partition (NTFS) - utilizing a real
> defragmentation program of course (XP's leaves the files
> fragmented in a mostly arbitrary fashion;-).
> The program shoved the hibernation file (which has to be on
> the system partition) towards the hub, the result being a
> suspend to disk/reawake now taking twice as long as before
> (big p.i.t.a.).

I don't think you can conclude moving the hibernation file is the cause of the performance problem. XP probably frees as much file cache as it can before suspend to disk, which means that when you resume you have to seek all over the drive to load files back in. It is entirely possible the new layout is less optimal for this workload.

-chris
myLC@gmx.net
2008-Jan-25 02:00 UTC
[Btrfs-devel] Re: 3 thoughts about important and outstanding features - cont.
Chris Mason wrote:
> I think you're saying that a copy-on-write duplicate is not
> sufficient for video editing. The thing to keep in mind is
> that with btrfs COW is done on an extent basis, and extents
> can be small (1MB is small for these files). So, you can do
> the video editing part via cow, slice out the parts you
> don't want, and you'll get that storage back. It is just a
> matter of setting the max extent size on the file.

Maybe I'm missing a point here - the COW file is still nothing without the original, right? If so, then how fast is it when it comes to throwing away the original? Provided that the former works reasonably fast, you would have the inserting-into/removing-from-a-file problem solved (1 MB is indeed becoming small, even for embedded systems and certainly for hard drives ;-). You stated before that it wouldn't be "that fast". I'm somewhat curious about that - it should still be somewhat faster than copying the whole damn (or even half of the damn, say 10 GB) thing...

> I don't think you can conclude moving the hibernation file
> is the cause of the performance problem.

Trust me, it is (and I oughta kick myself for that one ;-).

> XP probably frees as much file cache as it can before
> suspend to disk, which means that when you resume you have
> to seek all over the drive to load files back in...

Nope. XP is rather primitive in (not only) that matter. The size of the hibernation file always matches the amount of installed RAM. The memory simply gets written into the file and read back upon awakening. There is a simple progress bar, now indicating that the whole operation takes a lot longer than before. After the memory is read back, it reinitializes a few devices and such (1-3 seconds)...

By coincidence I have the interior of an old hard disk right next to me (pinned to a wall). A platter measures about 9.5 cm in diameter at the outer rim and 2.5 cm at the inner.
The difference between the inner and outer rim is usually left up to the manufacturers (the smaller the inner rim, the more they can fit on the disk without any problems in marketing - i.e.: the bigger the profits). Now, with that old drive you can easily see that:

- the outer rim measures about 9.5 * pi = ~30 cm in circumference
- the inner: ~8 cm

Since the speed of rotation remains constant, it is relatively safe to conclude (and computer magazines testing hard drives confirm it) that there is a HUGE difference in performance between outer and inner tracks (how much of a difference is left up to the manufacturer, as previously stated). Why not make use of that and boost performance?

XFS already does a good job of keeping files within a directory together. This way, for instance, the include files get read in a chunk (read-ahead) with a bit of luck. That feature, combined with a preference indicator for files (priority), would already be very powerful. It would enable you to keep files relevant to the system's performance that tend to get read together grouped and in the faster zones (and much more). As previously stated, this can very nearly double performance at very little cost.

Yes, I have slaughtered hard drives - I admit it. X-| But I was young and needed no money...

LC (myLC@gmx.net)
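The circumference arithmetic above works out as follows (diameters taken from the drive on the wall):

```python
import math

outer_d = 9.5   # outer platter diameter in cm (measured above)
inner_d = 2.5   # innermost track diameter in cm

outer_c = math.pi * outer_d   # ~29.8 cm passes under the head per revolution
inner_c = math.pi * inner_d   # ~7.9 cm per revolution
ratio = outer_c / inner_c     # = outer_d / inner_d = 3.8
```

At constant rotational speed and constant linear bit density, 3.8x would be the raw transfer-rate gap between the outermost and innermost tracks. Real drives use zoned recording, which narrows the gap somewhat, but the outer-vs-inner throughput difference measured in reviews is still roughly a factor of two - consistent with the "nearly double performance" claim above.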