Dan Bretherton
2010-Apr-16 14:00 UTC
[Gluster-users] DHT, pre-existing data unevenly distributed
I have been using DHT to join together two large filesystems (2.5TB) containing pre-existing data. I solved the problem of ls not seeing all the files by doing "rsync --dry-run" from the individual brick directories to the glusterfs-mounted volume. I am using glusterfs-3.0.2 and "option lookup-unhashed yes" for DHT on the client.

All seemed to be well until the volume started to get nearly full. Despite my also using "option min-free-disk 10%", one of the bricks became 100% full, preventing any further writes to the whole volume. I managed to get going again by manually transferring some data from one server to the other, making the two more evenly balanced, but I would like to find a more permanent solution. I would also like to know if this sort of thing is supposed to happen with DHT when pre-existing data is not evenly distributed across the bricks. I have included my client volume file at the bottom of this message.

I tried using the unify translator instead, even though it is supposedly now obsolete, but glusterfs crashed (segfault) when I tried to mount the volume. I thought perhaps unify was no longer supported in 3.0.2, so I didn't pursue that option any further. However, if unify turns out to be better than DHT for pre-existing data, I will have to find out what went wrong.

My questions are:

1. Should I be using the unify translator instead of DHT for pre-existing data that is unevenly distributed across bricks?
2. If I can continue with DHT, can I stop using "option lookup-unhashed yes" at some point?
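The manual rebalancing described above comes down to a small calculation. What follows is a minimal sketch in Python, assuming only two bricks whose used and total sizes are known in bytes. The function names are hypothetical and nothing here is part of GlusterFS; it only works out how much data would need to move for both bricks to reach the same fill fraction, and whether a brick is below a free-space threshold such as the 10% used here.

```python
def bytes_to_move(used_a, total_a, used_b, total_b):
    """Bytes to move from brick A to brick B so that both bricks end up
    with the same fill fraction. A negative result means the data should
    move from B to A instead. (Hypothetical helper, not a GlusterFS API.)"""
    # Fill fraction both bricks would share if the data were spread evenly.
    target = (used_a + used_b) / (total_a + total_b)
    # Whatever A holds above its share of the target has to move to B.
    return round(used_a - target * total_a)


def below_min_free(used, total, min_free_pct=10.0):
    """True if the brick's free space is below min_free_pct percent.
    The 10% default mirrors "option min-free-disk 10%"; it is only an
    illustration of the threshold, not how DHT itself evaluates it."""
    free_pct = 100.0 * (total - used) / total
    return free_pct < min_free_pct
```

The usage numbers can be taken from "df" on each server's brick directory. Which files to move, and how, remains a manual decision; the sketch only gives the amount.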
Regards,
Dan Bretherton

####
## Client vol file

volume romulus
  type protocol/client
  option transport-type tcp
  option remote-host romulus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume perseus
  type protocol/client
  option transport-type tcp
  option remote-host perseus
  option remote-port 6996
  option remote-subvolume brick1
end-volume

volume distribute
  type cluster/distribute
  option min-free-disk 10%
  option lookup-unhashed yes
  subvolumes romulus perseus
end-volume

volume io-threads
  type performance/io-threads
  #option thread-count 8 # default is 16
  subvolumes distribute
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 1GB
  subvolumes io-threads
end-volume

volume main
  type performance/stat-prefetch
  subvolumes io-cache
end-volume

-- 
Mr. D.A. Bretherton
Reading e-Science Centre
Environmental Systems Science Centre
Harry Pitt Building
3 Earley Gate
University of Reading
Reading, RG6 6AL
UK
Tel. +44 118 378 7722
Fax: +44 118 378 6413