sievers at math2.nat.tu-bs.de
2012-Mar-23 15:02 UTC
[Gluster-users] losing file ownership in replicated volume
Dear list,

I found a case where in a replicated volume, when one brick becomes
unreachable and gets back (either clean reboot or temporarily dropping
all packets with iptables), a file that was written to while the brick
went off forgets its owner and is then owned by root.
At first, this only happens in the brick that was down. Then an ls -l
of the file (not the directory it is in) still shows the correct user,
but triggers a self heal that actually copies ownership in the wrong
direction. Now all further calls to ls -l show root as owner.

All this does not happen when I just copy a file to the volume, or write
to it from a program just using fwrite(), but happens always with the
following test program which uses fflush():

----------------------------------------
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[]){
  int c=0;
  FILE *fd;
  fd=fopen(argv[1],"w");
  while(1) {
    fprintf(fd, "%i\n", ++c);
    fflush(fd);
    printf("%i\n",c);
    sleep(1);
  }
}
----------------------------------------

So to reproduce, compile the program, start it:

$ ./testprogram /replicatedvolume/foo

reboot or unplug one brick,
stop the program (doesn't matter if brick is already back).
When the brick is back, the file on it will have root as owner.
To see the corruption at the client do:

$ ls -l /replicatedvolume/foo    # 1st time gives correct answer
$ ls -l /replicatedvolume/foo    # shows root as owner

I tried this with debian systems with 64bit servers and a 32bit client,
gluster versions 3.2.4 from squeeze-backports and 3.2.6 directly from
gluster.org, the deb-file for the servers and self compiled from the
tar-file for the client, with underlying filesystems ext4 and xfs.

I hope this is enough information to reproduce and fix the issue.
(I won't be online till Wednesday.)

All the best,
Christian Sievers
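
[Editorial sketch, not part of the original message.] For anyone
reproducing this, it may help to read the numeric owner of the file
directly on each brick instead of going through ls -l on the client.
The following is a minimal sketch using POSIX stat(); the helper names
file_owner() and print_owner() are hypothetical, and the location of the
file inside a brick's backend directory is not given in the post, so any
such path is an assumption. Whether a stat() through the client mount
triggers the same self heal as the first ls -l has not been verified
here.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: store the numeric owner and group of `path`
 * in *uid and *gid. Returns 0 on success, -1 if stat() fails
 * (errno is left as set by stat). */
int file_owner(const char *path, uid_t *uid, gid_t *gid)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *uid = st.st_uid;
    *gid = st.st_gid;
    return 0;
}

/* Hypothetical helper: print "path: uid=N gid=M" for one file,
 * or an error message on stderr. Returns 0 on success, -1 on error. */
int print_owner(const char *path)
{
    uid_t uid;
    gid_t gid;

    if (file_owner(path, &uid, &gid) != 0) {
        perror(path);
        return -1;
    }
    printf("%s: uid=%lu gid=%lu\n", path,
           (unsigned long)uid, (unsigned long)gid);
    return 0;
}
```

Calling print_owner() on the same file on both bricks, from a small
main() of your own, would show uid=0 only on the brick that was down if
the behaviour described above occurs.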
Alex Chekholko
2012-Mar-23 17:46 UTC
[Gluster-users] losing file ownership in replicated volume
We saw the same thing on 3.2.5 on Debian (deb from glusterfs directly).

Regards,
Alex

--
Alex Chekholko chekh at stanford.edu