A few wild ideas/questions:

1) Is there a way to check the size of the journal of an ext3 filesystem?
   I mean the size actually in use, not the total size of the journal.

2) Would it be difficult to implement "freeze" of an ext3 filesystem, that
   is, blocking all I/O to the filesystem until it's "unfrozen" (XFS can do
   that), for two purposes:
   A/ allowing "freezing" in a clean state, to allow clean snapshotting
   B/ allowing "freezing" while moving a SCSI disk or a network-connected
      disk without unmounting the filesystem
   A/ would require some work at the FS layer I guess, but B/ might be
   doable at the device-mapper layer or something like that.

3) Is it possible to allow data to stay in the journal for a very long time?
   Rationale: for laptops with a lot of memory and some solid-state memory,
   this would allow shutting down the hard disk (if all read data is in the
   cache, and all written data goes to the log on the solid-state disk).
On Mar 10, 2005 12:10 +0100, Jérôme Petazzoni wrote:
> 1) Is there a way to check the size of the journal of an ext3 filesystem ?
> I mean - the actually used size ; not the total size of the journal.

There are currently no statistics on journal usage (though it would be
nice to have them). It would be useful to know how much space is currently
free in the journal, some sort of average of the free journal space (e.g.
abs(head-tail) sampled as each new handle is started), how often the
journal was totally full and had to be flushed, etc. This would go a long
way toward telling a user and the ext3 developers how large a journal is
needed under their workload.

Currently Lustre just creates very large (400MB) journals on all of its
filesystems because we know that a large journal improves performance
dramatically, but we have never done the trial-and-error approach of
finding the "optimal" size.

> 2) Would it be difficult to implement "freeze" of ext3 filesystem - that
> is, blocking all I/O to the filesystem until it's "unfrozen" (XFS can do
> that), for two purposes :
> A/ allowing "freezing" in a clean state, to allow clean snapshotting
> B/ allowing "freezing" while moving a SCSI disk or a network-connected
> disk without umounting filesystem

This is already done, and is used by the LVM/device mapper subsystem to do
snapshots of the filesystem. However, I'm not sure if there is a user-space
API to access this.

> 3) Is it possible to allow data to stay in the journal for a very long
> time ?
> Rationale : for laptops with a lot of memory and some solid-state
> memory, this would allow to shutdown the hard disk (if all read data is
> in the cache, and all written data goes to the log on the solid-state disk).

Yes, this can be done (I think) by tuning the journal flush time and having
a large enough journal to avoid filling it up. However, I don't think this
would be practical, because the only common way to do this would be with
e.g. flash memory, and the heavy usage of the journal would quickly wear
out such devices. It would also be slow.

Cheers, Andreas
--
Andreas Dilger
http://members.shaw.ca/adilger/
http://members.shaw.ca/golinux/
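To make the statistics Andreas describes a little more concrete, here is a minimal user-space sketch of how "used journal space" and its running average could be computed for a circular log. Everything in it is an assumption for illustration: the struct and function names are hypothetical and do not mirror the kernel's journal_t, and it treats head == tail as an empty journal. It uses modular arithmetic rather than a plain abs(head-tail) so that the wrap-around case is counted correctly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of a circular journal (names are illustrative). */
struct journal_snapshot {
    uint32_t first; /* first usable block of the log area */
    uint32_t last;  /* one past the last usable block */
    uint32_t head;  /* next block to be written */
    uint32_t tail;  /* oldest block still needed for recovery */
};

/* Blocks occupied between tail and head, allowing for wrap-around.
 * Assumption: head == tail means the journal is empty. */
uint32_t journal_used_blocks(const struct journal_snapshot *j)
{
    uint32_t capacity;

    assert(j->last > j->first);
    assert(j->head >= j->first && j->head < j->last);
    assert(j->tail >= j->first && j->tail < j->last);

    capacity = j->last - j->first;
    return (uint32_t)(((uint64_t)j->head + capacity - j->tail) % capacity);
}

/* Running statistics, sampled e.g. each time a new handle is started. */
struct usage_stats {
    uint64_t samples;    /* number of samples recorded */
    uint64_t total_used; /* sum of all sampled usage values */
    uint32_t peak;       /* largest usage seen so far */
};

void usage_record(struct usage_stats *s, uint32_t used)
{
    s->samples++;
    s->total_used += used;
    if (used > s->peak)
        s->peak = used;
}

/* Average used blocks per sample; 0.0 if nothing was recorded yet. */
double usage_average(const struct usage_stats *s)
{
    if (s->samples == 0)
        return 0.0;
    return (double)s->total_used / (double)s->samples;
}
```

Comparing the peak and the average against the journal capacity is one way to judge whether a journal is oversized or too small for a workload, which is the tuning question raised above.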
Jérôme Petazzoni wrote:
> A few wild ideas/questions :
>
> 1) Is there a way to check the size of the journal of an ext3 filesystem ?
> I mean - the actually used size ; not the total size of the journal.

Perhaps "logdump -ac" (within debugfs) will help, if you can tell from its
output what parts are "used".

> 2) Would it be difficult to implement "freeze" of ext3 filesystem - that
> is, blocking all I/O to the filesystem until it's "unfrozen" (XFS can do
> that), for two purposes :
> A/ allowing "freezing" in a clean state, to allow clean snapshotting

Would "remount,ro" be sufficient?

> B/ allowing "freezing" while moving a SCSI disk or a network-connected
> disk without umounting filesystem

Err, "unplug the cable without unmounting the filesystem"?? You'd have to
hold the entire fs in RAM for the "move", or I don't understand what you
mean.

> 3) Is it possible to allow data to stay in the journal for a very long
> time ?
> Rationale : for laptops with a lot of memory and some solid-state
> memory, this would allow to shutdown the hard disk (if all read data is
> in the cache, and all written data goes to the log on the solid-state
> disk).

The only tunable which comes to my mind right now is the "commit"
parameter for mount:

    commit=nrsec
        Sync all data and metadata every nrsec seconds. The default
        value is 5 seconds. Zero means default.

And there are the laptop-mode-tools [1], which use some kernel hacks to
spin down disks and continue working.

Christian.

[1] http://www.xs4all.nl/~bsamwel/laptop_mode/
--
BOFH excuse #375: Root name servers corrupted.
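As an illustration of the "commit" tunable, the following is a rough sketch of changing it on an already mounted filesystem from C via mount(2) with MS_REMOUNT. The function name and its parameters are hypothetical. It also assumes the caller passes in whatever mount flags should stay in effect (a remount replaces the flag set), and it needs root privileges to succeed.

```c
#include <stdio.h>
#include <sys/mount.h>

/* Remount `mountpoint` with commit=<seconds>.
 * `keep_flags` are the existing mount flags to preserve (an assumption:
 * the caller knows them, e.g. MS_NOATIME).
 * Returns 0 on success, -1 on failure with errno set by mount(2). */
int ext3_set_commit_interval(const char *mountpoint,
                             unsigned long keep_flags,
                             unsigned int seconds)
{
    char options[32];
    int n = snprintf(options, sizeof options, "commit=%u", seconds);

    if (n < 0 || (size_t)n >= sizeof options)
        return -1;

    /* For MS_REMOUNT the source and filesystem type are ignored. */
    return mount("none", mountpoint, NULL, MS_REMOUNT | keep_flags, options);
}
```

From a shell the same thing is normally done with something like "mount -o remount,commit=60 /mountpoint". A longer interval means more unwritten data is at risk if the machine loses power.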
Hi folks,

>> 2) Would it be difficult to implement "freeze" of
>> ext3 filesystem - that is, blocking all I/O to the
>> filesystem until it's "unfrozen" (XFS can do
>> that), for two purposes :
>> A/ allowing "freezing" in a clean state, to allow clean snapshotting
>> B/ allowing "freezing" while moving a SCSI disk or a network-connected
>> disk without umounting filesystem
>
> This is already done, and is used by the LVM/device
> mapper subsystem to do snapshots of the filesystem.
> However, I'm not sure if there is a user-space API
> to access this.

Yes, there is a function freeze_bdev() in fs/buffer.c which freezes the
file system on the specified block device without unmounting it. If the
file system is ext3, it calls journal_lock_updates() to ensure that no
more transactions take place. thaw_bdev() is its counterpart, which lets
operations continue. You can provide an ioctl in fs/ext3/ioctl.c which
will look like:

    {
            /* block new updates and get the frozen superblock */
            sb = freeze_bdev(bdev);
            /* do your stuff */
            /* let I/O to the filesystem continue */
            thaw_bdev(bdev, sb);
            return 0;
    }

From user land you can always call this ioctl routine.

>> 3) Is it possible to allow data to stay in the
>> journal for a very long time ?
>
> Yes, this can be done (I think) by tuning the journal
> flush time and having a large enough journal to avoid
> filling it up. However, I don't think this would be
> practical because the only common way to do
> this would be e.g. flash memory and the heavy usage
> of the journal would quickly wear out such devices,
> and it would also be slow.

This can be done by changing the commit interval of the journaling thread,
viz. kjournald. By default it is 5 seconds, but you can change it by
changing JBD_DEFAULT_MAX_COMMIT_AGE. But if an inode is being used as the
journal log, there is a chance of the journal running out of blocks. So it
is better to experiment with this using an auxiliary device as an external
journal log.

- Nitin
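For the user-land side of freezing, here is a minimal sketch that assumes a kernel newer than the ones discussed in this thread. Later Linux kernels expose generic FIFREEZE and FITHAW ioctls in <linux/fs.h> (the fsfreeze utility uses them), so no ext3-specific ioctl has to be written. The helper names below are hypothetical, and the calls need root privileges and a filesystem that supports freezing.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <sys/ioctl.h>
#include <unistd.h>

/* Issue a freeze or thaw request on the filesystem mounted at `mountpoint`.
 * Returns 0 on success, -1 on failure with errno from open() or ioctl(). */
int fs_freeze_request(const char *mountpoint, unsigned long request)
{
    int fd, rc, saved_errno;

    fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -1;

    rc = ioctl(fd, request, 0);

    /* Preserve the ioctl's errno across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;

    return rc == 0 ? 0 : -1;
}

/* Block new writes until fs_thaw() is called. */
int fs_freeze(const char *mountpoint)
{
    return fs_freeze_request(mountpoint, FIFREEZE);
}

/* Let I/O to the filesystem continue. */
int fs_thaw(const char *mountpoint)
{
    return fs_freeze_request(mountpoint, FITHAW);
}
```

A snapshot tool would call fs_freeze(), take its snapshot at the block layer, then call fs_thaw(). While the filesystem is frozen, any process that writes to it will block, so the window should be kept short.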