Hey everyone,

I am a final semester MCA student. I've chosen rsync as the subject of my project for my graduation. Hence I would appreciate it if someone could guide me with some ideas on how I can contribute to rsync. I will work hard to implement whatever suggestions you can all give me.

I would definitely like to know what some of the issues concerning rsync are. Maybe I can then select one or two issues and try to solve them. I have 5 months for this project.

Thanks in advance, folks.

zahed
On Mon, 2008-02-11 at 00:16 -0800, zahed wrote:
> I am a final semester MCA student. I've chosen rsync as the subject of my
> project for my graduation. Hence I would appreciate it if someone could
> guide me with some ideas on how I can contribute to rsync. I will work hard
> to implement whatever suggestions that you can all give me.
> I would definitely like to know what are some of the issues concerning
> rsync. May be I can then select one or two issues and try to solve them. I
> have 5 months for this project.

Wow! The bug database at https://bugzilla.samba.org/ has a bunch of things that need fixing and some enhancements that might be worth doing, but with five months, I think you could do something much more significant.

A while ago, Wayne Davison and Martin Pool came up with some ideas for "superlifter", a next-generation file copying tool that would overcome many of the architectural limitations of rsync, but no work has been done on superlifter for a few years. You could look at what has been proposed so far, continue working toward a design, and possibly make a superlifter prototype that people could start to play with. Perhaps Wayne will have some further guidance (or will even help you now that rsync 3.0.0 work is winding down).

Matt
On Mon, Feb 11, 2008 at 12:16:46AM -0800, zahed wrote:
> I would definitely like to know what are some of the issues concerning
> rsync. May be I can then select one or two issues and try to solve
> them.

Here are some ideas I came up with off the top of my head:

 - Look into MS Windows ACLs (which are non-Posix) and see if they can be supported. I added support for OS X's non-Posix ACLs, so it might be possible.

 - Getting TLS support on the daemon sockets finished (see enhancement bug in bugzilla).

 - I was imagining a way to get remote-to-remote transfers working. It should be possible if the client rsync starts up two remote rsyncs in --server mode (with appropriate options sent to each one), handles the initial handshaking, sends any other initial info that is needed to each (e.g. it may need to send excludes to both), and then acts as a go-between for the data that is flowing. For the most part it would need to just select on the incoming & outgoing file handles and pass the data through, but it would also need to recognize client-oriented things (such as messages) and output them rather than forwarding them. Other complications include command-line options that can be ambiguous (e.g. --rsync-path, --rsh), so there is a lot to figure out for this.

 - I have a DB proposal I mentioned a while back that I haven't had time to work on yet (though I will hopefully get to it sometime soon, since the 3.0.0 release is wrapping up): http://article.gmane.org/gmane.network.rsync.general/15847 This is still in the design stage. Comments, design suggestions, code all gratefully accepted. However, if you're looking for solo work (as opposed to something that I'd be sticking my nose into), this isn't it.

As for Matt's superlifter suggestion, that's quite a long-term project. Anytime that you start a code-base over again, you will spend a lot of time working through areas that were already solved in the old code, and revisiting the bugs of the past (since all new code is buggy, and the reasons for why some things are done in certain ways are not always obvious or well documented). It would certainly be a fun thing to work on, though.

..wayne..
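For anyone wanting to experiment with the remote-to-remote idea, the "select on the file handles and pass the data through" part could be sketched roughly as below. This is only an illustration in Python, not rsync code: the function name relay is made up, it works on two already-connected sockets rather than on real --server processes, and it does none of the handshaking, exclude forwarding, or message interception described above.

```python
import selectors
import socket


def relay(a: socket.socket, b: socket.socket, bufsize: int = 65536) -> None:
    """Copy bytes in both directions between a and b until both reach EOF.

    Hypothetical helper: a and b stand in for the connections to the two
    remote rsync processes.
    """
    sel = selectors.DefaultSelector()
    # Attach to each watched socket the peer that should receive its data.
    sel.register(a, selectors.EVENT_READ, data=b)
    sel.register(b, selectors.EVENT_READ, data=a)
    open_sources = 2
    while open_sources:
        for key, _ in sel.select():
            src, dst = key.fileobj, key.data
            chunk = src.recv(bufsize)
            if chunk:
                # A real go-between would have to parse the stream here and
                # print client-oriented messages instead of forwarding them.
                dst.sendall(chunk)
            else:
                # EOF on this side: stop watching it and half-close the peer.
                sel.unregister(src)
                try:
                    dst.shutdown(socket.SHUT_WR)
                except OSError:
                    pass  # the peer may already be gone
                open_sources -= 1
    sel.close()
```

Note that the blocking sendall keeps the sketch short but could stall if one side stops reading; a serious version would probably buffer the data and also select for writability.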
On 11.02.2008 00:16, zahed wrote:
>
> Hey everyone,
> I am a final semester MCA student. I've chosen rsync as the subject of my
> project for my graduation. Hence I would appreciate it if someone could
> guide me with some ideas on how I can contribute to rsync. I will work hard
> to implement whatever suggestions that you can all give me.
> I would definitely like to know what are some of the issues concerning
> rsync. May be I can then select one or two issues and try to solve them. I
> have 5 months for this project.

As a little "appetizer" you could add an option that passes the O_NOATIME flag (available since Linux kernel 2.6.8) to all opens. That way rsync wouldn't screw with the atimes of everything anymore. It would be useful for filesystems mounted with "atime" or "relatime". Of course, on filesystems already mounted "noatime" there would be no effect. :-)

Here is what Matt McCutchen had to say when I asked about how to do it myself.

- snip -
Add the option (--o_noatime or whatever you want to call it) in all the appropriate places in options.c, including a variable to store whether the option is on. Then declare the variable in sender.c and modify the do_open call in send_files to pass O_NOATIME if the variable is true. There are a number of do_open calls, but the one in send_files is the most important (it opens the source files).

Additionally, if you care about listing source directories without hitting their atimes, there seems to be information about how to do it at http://www.cygwin.com/ml/libc-alpha/2005-09/msg00104.html ; the relevant opendir is in send_directory in flist.c .
- snip -

See you

-- 
Real Programmers consider "what you see is what you get" to be just as bad a concept in Text Editors as it is in women. No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous.
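To get a feel for how O_NOATIME behaves before touching options.c and sender.c, a small experiment along these lines might help. It is a Python sketch, not the rsync patch itself; the helper name open_noatime is made up, and the fallback on a permission error is my assumption about what a careful caller would want, since the kernel rejects the flag unless the caller owns the file or is privileged.

```python
import os


def open_noatime(path: str) -> int:
    """Open path read-only, asking the kernel not to update its atime.

    Hypothetical helper for experimenting; returns a raw file descriptor.
    """
    # O_NOATIME is Linux-specific; use 0 (no extra flag) where it is missing.
    noatime = getattr(os, "O_NOATIME", 0)
    try:
        return os.open(path, os.O_RDONLY | noatime)
    except PermissionError:
        # The flag is refused (EPERM) for files the caller does not own,
        # so retry as a plain read-only open.
        return os.open(path, os.O_RDONLY)
```

Usage would look like fd = open_noatime("/some/file"), followed by os.read(fd, ...) and os.close(fd). Comparing os.stat(path).st_atime before and after a read on an "atime" or "relatime" mount should show whether the flag took effect.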
On Mon, 11 Feb 2008 09:02:14 -0500, Matt McCutchen <matt@mattmccutchen.net> wrote:
> A while ago, Wayne Davison and Martin Pool came up with
> some ideas for "superlifter", a next-generation file copying tool that
> would overcome many of the architectural limitations of rsync, but no
> work has been done on superlifter for a few years.

FWIW, the BEEP peer-to-peer protocol [http://www.beepcore.org/] might be a good foundation for an rsyncNG.

-- 
Bob Bagwill
Here's another idea that I've been meaning to investigate:

Modify the checksum algorithm to make the rolling checksum stronger, and eliminate the strong checksum. This would mean that the generator would only send the new rolling checksum data to the sender, and that the sender would only need to cache (and check) this new checksum (instead of checking for a match in the weak checksum and then verifying it by computing and comparing the strong checksum on the match).

Some suggested algorithms can be found in chapter 4 of Tridge's PhD thesis. I'd suggest testing an algorithm for speed, size of checksum data, and number of resend failures in a large amount of file data. Compare it against the regular rsync algorithm and other alternates.

..wayne..
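As a starting point for such experiments, here is a simplified Python sketch of the kind of weak rolling checksum the rsync algorithm is built on: two 16-bit running sums that can be updated in constant time as the window slides. It is not rsync's actual C code and not one of the stronger candidates from the thesis; the function names are made up, and any replacement algorithm would need the same "roll" property to be usable.

```python
M = 1 << 16  # both running sums are kept modulo 2^16


def weak_checksum(block: bytes):
    """Return the (a, b) sums for a block, computed from scratch."""
    n = len(block)
    a = sum(block) % M
    # The first byte gets weight n, the last byte weight 1.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide an n-byte window one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def digest(a: int, b: int) -> int:
    """Pack the two 16-bit sums into a single 32-bit value."""
    return a | (b << 16)
```

A test harness could roll this (and each candidate replacement) across a large body of file data, counting false matches against a from-scratch strong hash and timing each variant, which would give the speed and failure numbers suggested above.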