Rama Shenai
2016-Oct-21 18:33 UTC
[Gluster-users] EBADF after add-brick/self-heal operation in Gluster 3.7.15
Hi Gluster Team,

We saw a number of intermittent EBADF errors on clients immediately after an add-brick operation followed by a self-heal of that volume. We are wondering whether these errors mean that file descriptors are being closed prematurely, causing problems for files we had open or memory-mapped. The errors we saw are below.

Any thoughts on this, as well as input on avoiding it when we do live add-brick operations in the future, would be much appreciated.

[2016-10-19 15:11:53.372930] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c003e7c inode-gfid:b1e19a5b-7867-4cb3-8bf0-545df3c5d556) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373058] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c003e7c inode-gfid:b1e19a5b-7867-4cb3-8bf0-545df3c5d556) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373105] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0065b8 inode-gfid:9231d39d-d88c-4a62-b25d-fad232ec9b98) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373121] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0065b8 inode-gfid:9231d39d-d88c-4a62-b25d-fad232ec9b98) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373138] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c005ac0 inode-gfid:a0b02209-59c9-418b-bff0-fb31be01b3e8) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373155] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c005ac0 inode-gfid:a0b02209-59c9-418b-bff0-fb31be01b3e8) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373172] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004e18 inode-gfid:c490e1fe-3ac8-4c11-9c62-fc8672a27737) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373199] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004e18 inode-gfid:c490e1fe-3ac8-4c11-9c62-fc8672a27737) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373231] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0037bc inode-gfid:1aab19de-f4b1-47bf-8216-d174797ae64d) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373245] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c0037bc inode-gfid:1aab19de-f4b1-47bf-8216-d174797ae64d) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373271] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004b90 inode-gfid:5c3d5a39-26f0-4211-a2f0-59de33ea5ade) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)
[2016-10-19 15:11:53.373287] W [fuse-resolve.c:556:fuse_resolve_fd] 0-fuse-resolve: migration of basefd (ptr:0x7f6d5c004b90 inode-gfid:5c3d5a39-26f0-4211-a2f0-59de33ea5ade) did not complete, failing fop with EBADF (old-subvolume:meta-autoload-0 new-subvolume:meta-autoload-2)

Volume information:

$ sudo gluster volume info
Volume Name: volume1
Type: Replicate
Volume ID: 3bcca83e-2be5-410c-9a23-b159f570ee7e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ip-172-25-2-91.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
Brick2: ip-172-25-2-206.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick
Brick3: ip-172-25-33-75.us-west-1.compute.internal:/data/glusterfs/volume1/brick1/brick <-- brick added
Options Reconfigured:
cluster.quorum-type: fixed
cluster.quorum-count: 2

Client translator configuration (from the client log):

volume volume1-client-0
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-2-91.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-client-1
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-2-206.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-client-2
    type protocol/client
    option ping-timeout 42
    option remote-host ip-172-25-33-75.us-west-1.compute.internal
    option remote-subvolume /data/glusterfs/volume1/brick1/brick
    option transport-type socket
    option send-gids true
end-volume

volume volume1-replicate-0
    type cluster/replicate
    option quorum-type fixed
    option quorum-count 2
    subvolumes volume1-client-0 volume1-client-1 volume1-client-2
end-volume

volume volume1-dht
    type cluster/distribute
    subvolumes volume1-replicate-0
end-volume

volume volume1-write-behind
    type performance/write-behind
    subvolumes volume1-dht
end-volume

volume volume1-read-ahead
    type performance/read-ahead
    subvolumes volume1-write-behind
end-volume

volume volume1-io-cache
    type performance/io-cache
    subvolumes volume1-read-ahead
end-volume

volume volume1-quick-read
    type performance/quick-read
    subvolumes volume1-io-cache
end-volume

volume volume1-open-behind
    type performance/open-behind
    subvolumes volume1-quick-read
end-volume

volume volume1-md-cache
    type performance/md-cache
    option cache-posix-acl true
    subvolumes volume1-open-behind
end-volume

volume volume1
    type debug/io-stats
    option log-level INFO
    option latency-measurement off
    option count-fop-hits off
    subvolumes volume1-md-cache
end-volume

volume posix-acl-autoload
    type system/posix-acl
    subvolumes volume1
end-volume

volume meta-autoload
    type meta
    subvolumes posix-acl-autoload
end-volume

Thanks,
Rama