Bernhard Dübi
2018-Apr-12 10:32 UTC
[Gluster-users] issues with replicating data to a new brick
Hello everybody,

I have a bit of a situation here. I want to move some volumes to new hosts. The idea is to add the new bricks to the volume, let them sync, and then drop the old bricks.

The starting point is:

Volume Name: Server_Monthly_02
Type: Replicate
Volume ID: 0ada8e12-15f7-42e9-9da3-2734b04e04e9
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: chastcvtprd04:/data/glusterfs/Server_Monthly/2I-1-40/brick
Brick2: chglbcvtprd04:/data/glusterfs/Server_Monthly/2I-1-40/brick
Options Reconfigured:
features.scrub: Inactive
features.bitrot: off
nfs.disable: on
auth.allow: 127.0.0.1,10.30.28.43,10.30.28.44,10.30.28.17,10.30.28.18,10.8.13.132,10.30.28.30,10.30.28.31
performance.readdir-ahead: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on

root@chastcvtprd04:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

root@chastcvtprd04:~# uname -a
Linux chastcvtprd04 4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

root@chastcvtprd04:~# dpkg -l | grep gluster
ii  glusterfs-client  3.8.15-ubuntu1~xenial1  amd64  clustered file-system (client package)
ii  glusterfs-common  3.8.15-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.8.15-ubuntu1~xenial1  amd64  clustered file-system (server package)

root@chastcvtprd04:~# df -h /data/glusterfs/Server_Monthly/2I-1-40/brick
Filesystem      Size  Used Avail Use% Mounted on
/dev/bcache47   7.3T  7.3T   45G 100% /data/glusterfs/Server_Monthly/2I-1-40

Then I add the new brick:

Volume Name: Server_Monthly_02
Type: Replicate
Volume ID: 0ada8e12-15f7-42e9-9da3-2734b04e04e9
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: chastcvtprd04:/data/glusterfs/Server_Monthly/2I-1-40/brick
Brick2: chglbcvtprd04:/data/glusterfs/Server_Monthly/2I-1-40/brick
Brick3: chglbglsprd02:/data/glusterfs/Server_Monthly/1I-1-51/brick
Options Reconfigured:
features.scrub: Inactive
features.bitrot: off
nfs.disable: on
auth.allow: 127.0.0.1,10.30.28.43,10.30.28.44,10.30.28.17,10.30.28.18,10.8.13.132,10.30.28.30,10.30.28.31
performance.readdir-ahead: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on

root@chglbglsprd02:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.4 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.4 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial

root@chglbglsprd02:~# uname -a
Linux chglbglsprd02 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

root@chglbglsprd02:~# dpkg -l | grep gluster
ii  glusterfs-client  3.8.15-ubuntu1~xenial1  amd64  clustered file-system (client package)
ii  glusterfs-common  3.8.15-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  glusterfs-server  3.8.15-ubuntu1~xenial1  amd64  clustered file-system (server package)

Then healing kicks in and the cluster starts copying data to the new brick. Unfortunately, after a while it starts complaining:

[2018-04-10 14:39:32.057443] E [MSGID: 113072] [posix.c:3457:posix_writev] 0-Server_Monthly_02-posix: write failed: offset 0, [No space left on device]
[2018-04-10 14:39:32.057538] E [MSGID: 115067] [server-rpc-fops.c:1346:server_writev_cbk] 0-Server_Monthly_02-server: 22835126: WRITEV 0 (48949669-ba1c-4735-b83c-71340f1bb64f) ==> (No space left on device) [No space left on device]

root@chglbglsprd02:~# df -h /data/glusterfs/Server_Monthly/1I-1-51/brick
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdaq       7.3T  7.3T   20K 100% /data/glusterfs/Server_Monthly/1I-1-51

There's no other I/O going on on this volume, so the copy process should be straightforward. BUT I noticed that there are a lot of sparse files on this volume.

Any ideas on how to make it work? If you need more details, please let me know and I'll try to make them available.

Kind Regards
Bernhard
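
For reference, the sparse-file observation can be checked by comparing each file's apparent size with the space actually allocated for it on the brick. The following is only a minimal sketch, assuming GNU find and GNU stat as shipped with Ubuntu 16.04. The function name list_sparse_files is made up for illustration, and file names containing newlines are not handled. It skips the brick's internal .glusterfs directory so that hard-linked files are not listed twice.

```shell
#!/bin/sh
# Sketch: list regular files whose allocated space is smaller than their
# apparent size, i.e. files that contain holes (sparse files).
# Assumes GNU find and GNU stat; list_sparse_files is a hypothetical helper.
list_sparse_files() {
    dir=$1
    # Skip the brick-internal .glusterfs tree (hard links to the same data).
    find "$dir" -path "$dir/.glusterfs" -prune -o -type f \
        -exec stat -c '%s %b %B %n' {} + |
    while read -r size blocks blksize name; do
        # %b is the number of allocated blocks, %B the size of one such block.
        allocated=$((blocks * blksize))
        if [ "$allocated" -lt "$size" ]; then
            printf '%s\tapparent=%s\tallocated=%s\n' "$name" "$size" "$allocated"
        fi
    done
}
```

Run against a brick, for example list_sparse_files /data/glusterfs/Server_Monthly/2I-1-40/brick, this prints one line per file that has holes. Comparing du -sh --apparent-size with plain du -sh on the old and the new brick gives the same picture in aggregate: if the new brick's allocated usage is much closer to the apparent size than the old brick's, the holes were probably filled in during the copy.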