Hello,

since we are using rsync to back up millions of files in a virtual environment, and most of the virtual machines run on SSD-cached storage, I'd be curious how badly this affects the lifetime of the SSDs when we run rsync every night for backup.

My question: does rsync's normal file comparison run (the one that determines whether anything has changed) change the atime of any files?

To me it seems that rsync's stat/lstat calls do NOT modify atime, but I'm not sure under which conditions atime is changed.

Grepping the source for O_NOATIME, I found this in rsync3.txt:

  - Propagate atimes and do not modify them. This is very ugly on Unix. It might be better to try to add O_NOATIME to kernels, and call that.

Furthermore, apparently there _IS_ an O_NOATIME in Linux kernels, and has been for a while:

http://man7.org/linux/man-pages/man2/open.2.html

  O_NOATIME (since Linux 2.6.8)
      Do not update the file last access time (st_atime in the inode) when the file is read(2).

      This flag can be employed only if one of the following conditions is true:

      *  The effective UID of the process matches the owner UID of the file.

      *  The calling process has the CAP_FOWNER capability in its user namespace and the owner UID of the file has a mapping in the namespace.

      This flag is intended for use by indexing or backup programs, where its use can significantly reduce the amount of disk activity. This flag may not be effective on all filesystems. One example is NFS, where the server maintains the access time.

So, maybe someone would like to comment on O_NOATIME? Maybe it could be useful to make rsync honour O_NOATIME?

regards
roland
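For anyone who wants to check this on their own filesystem, here is a minimal Python sketch. It pins a temporary file's atime to a fixed value, does a metadata-only stat/lstat, and then reads the file through a descriptor opened with O_NOATIME. The helper names read_noatime and demo are made up for this illustration and have nothing to do with rsync's code. Whether the read leaves atime alone depends on the platform and filesystem, so the sketch only reports what it observes.

```python
import os
import tempfile


def read_noatime(path):
    """Read a file while asking the kernel not to update its atime."""
    # os.O_NOATIME only exists on Linux; fall back to 0 (no flag) elsewhere.
    noatime = getattr(os, "O_NOATIME", 0)
    try:
        fd = os.open(path, os.O_RDONLY | noatime)
    except PermissionError:
        # EPERM: we neither own the file nor hold CAP_FOWNER, so retry plainly.
        fd = os.open(path, os.O_RDONLY)
    with os.fdopen(fd, "rb") as handle:
        return handle.read()


def demo():
    """Report whether stat and an O_NOATIME read changed a temp file's atime."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"payload")
        path = tmp.name
    try:
        # Pin atime and mtime to a fixed point in the past (nanoseconds).
        os.utime(path, ns=(10**18, 10**18))
        before = os.stat(path).st_atime_ns
        os.lstat(path)  # metadata-only access; the file data is not read
        after_stat = os.stat(path).st_atime_ns
        data = read_noatime(path)
        after_read = os.stat(path).st_atime_ns
    finally:
        os.unlink(path)
    return {
        "stat_changed_atime": after_stat != before,
        "read_changed_atime": after_read != before,
        "data": data,
    }


if __name__ == "__main__":
    print(demo())
```

On a local Linux filesystem where you own the file, both "changed" values would be expected to be False; on NFS the server decides, as the man page excerpt says.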
On Wed 26 Oct 2016, devzero at web.de wrote:
>
> since we are using rsync for backing up millions of files in a virtual environment, and most of the virtual machines run on SSD cached storage, i`d be curious how that negatively impacts lifetime of the SSD`s when we do rsync run every night for backup
>
> my question:
> does rsync normal file comparison run to determine if anything has changed change atime of any files ?
>
> for me it seems, stat/lstat calls of rsync do NOT modify atime, but i`m not sure under which conditions atime is changed.

Most filesystems on modern Linux systems should be mounted with the relatime option. The atime will then only be updated if either the mtime or ctime is newer than the atime, or if the atime is older than a defined interval.

Note that simply using stat will not update the atime, as the *file* itself has not been accessed, only its metadata.

There's also the nodiratime option, which can be useful because directories are read to find out which files exist in them, and it's seldom useful to maintain the atime of a directory.

You can also use noatime as a mount option, but then be sure that no application relies on the atimes of files; e.g. something like mutt uses the atime and mtime to determine whether a mail file contains unread mail.

Paul
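The relatime rule described above can be written down as a small decision function. This is only a simplified Python model, not kernel code: the name relatime_needs_update is invented here, and the one-day default for the "defined interval" is an assumption taken from the mount(8) man page. The man page also says an atime equal to the mtime or ctime still triggers an update, which is why the comparisons use >=.

```python
DAY = 24 * 60 * 60  # assumed relatime interval in seconds (see mount(8))


def relatime_needs_update(atime, mtime, ctime, now, interval=DAY):
    """Simplified model of whether a read would update atime under relatime.

    All arguments are timestamps in seconds.
    """
    if mtime >= atime:
        return True  # file content changed since the last recorded access
    if ctime >= atime:
        return True  # inode metadata changed since the last recorded access
    return now - atime >= interval  # recorded atime is too stale


if __name__ == "__main__":
    # File modified after its last recorded access: atime gets updated.
    print(relatime_needs_update(atime=100, mtime=200, ctime=50, now=300))
    # File read again shortly after a previous read: atime is left alone.
    print(relatime_needs_update(atime=300, mtime=100, ctime=100, now=400))
```

For a nightly backup this means a file that has not changed since it was last read gets at most one atime write per interval, rather than one per read.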
On Wed, Oct 26, 2016 at 11:46:50AM +0200, Paul Slootman wrote:
> Most filesystems on modern linux systems should be mounted with the
> relatime option.

This is the default...

> You can also use noatime as a mount option, but then be sure that no
> application uses the atimes of files; e.g. something like mutt use the
> atime and mtime to determine whether there is a mail file with unread
> mail.

With the latest Linux kernels (4.0+), there is also the "lazytime" mount option. This causes the kernel to avoid writing back inodes that only have "dirty timestamps", until either

  (a) some other change is made to the inode,
  (b) fsync(), syncfs(), or sync() is called,
  (c) an undeleted inode is evicted from memory or the file system is unmounted,
  (d) 24 hours have gone by, or
  (e) in the case of ext4, some other change is made to an inode in the same inode table block as an inode with a dirty timestamp (so we need to do the disk I/O anyway).

With lazytime, the timestamp is updated in memory, so stat(2) will always return the correct timestamp, and in normal practice the timestamps will (eventually) be updated on disk. However, if a timestamp gets updated multiple times (for example, a database file being updated with O_DIRECT writes), then instead of the mtime field being written out every 30 seconds, we can defer the timestamp updates and collapse them into a much smaller set of inode table block writes.

Cheers,

- Ted
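To see which of the options mentioned in this thread are actually in effect on a machine, one can look at the options column of /proc/self/mounts. The following is a rough Python sketch; the function name atime_options and the sample text are made up for illustration, and it assumes the usual "device mountpoint fstype options dump pass" layout of that file. As far as I know, a mount that shows none of these options is doing full (strict) atime updates.

```python
import os

# Mount options from this thread that affect timestamp writes.
TIMESTAMP_OPTIONS = ("noatime", "nodiratime", "relatime", "lazytime")


def atime_options(mounts_text):
    """Map each mount point to the timestamp-related options it uses."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        result[mount_point] = [o for o in options if o in TIMESTAMP_OPTIONS]
    return result


if __name__ == "__main__":
    mounts_path = "/proc/self/mounts"  # Linux only
    if os.path.exists(mounts_path):
        with open(mounts_path) as handle:
            for mount_point, opts in atime_options(handle.read()).items():
                print(mount_point, opts or "(none of the listed options)")
```

Keeping the parser separate from the file read makes it easy to try on a pasted line from another machine.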