Guillermo Alvarado
2018-Dec-06 20:43 UTC
[Gluster-users] Heketi error: Server busy. Retry operation later.
Hello,

Yesterday I tweeted my frustration and @YanivKaul <https://twitter.com/YanivKaul> suggested that I write to this list.

I installed OpenShift and I am using GlusterFS in independent mode to provide persistent and dynamic storage. These are the vars I am using in the openshift-ansible inventory:

openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=600
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
openshift_storage_glusterfs_is_native=false
openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=ocpadmin
openshift_storage_glusterfs_heketi_ssh_sudo=true
openshift_storage_glusterfs_heketi_ssh_keyfile="/home/ocpadmin/.ssh/id_rsa"
openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_size=600
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=true
openshift_storage_glusterfs_registry_is_native=false
openshift_storage_glusterfs_registry_heketi_is_native=true
openshift_storage_glusterfs_registry_heketi_executor=ssh
openshift_storage_glusterfs_registry_heketi_ssh_port=22
openshift_storage_glusterfs_registry_heketi_ssh_user=ocpadmin
openshift_storage_glusterfs_registry_heketi_ssh_sudo=true
openshift_storage_glusterfs_registry_heketi_ssh_keyfile="/home/ocpadmin/.ssh/id_rsa"

When I try to create a PVC with the following storage class:

$ oc describe sc glusterfs-storage
Name: glusterfs-storage
IsDefaultClass: No
Annotations: <none>
Provisioner: kubernetes.io/glusterfs
Parameters: resturl=http://heketi-storage.app-storage.svc:8080,restuser=admin,secretName=heketi-storage-admin-secret,secretNamespace=app-storage
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>

I am able to create it, but with these:

$ oc describe sc glusterfs-registry-block
Name: glusterfs-registry-block
IsDefaultClass: Yes
Annotations: storageclass.kubernetes.io/is-default-class=true
Provisioner: gluster.org/glusterblock
Parameters: chapauthenabled=true,hacount=3,restsecretname=heketi-registry-admin-secret-block,restsecretnamespace=infra-storage,resturl=http://heketi-registry.infra-storage.svc:8080,restuser=admin
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>

and this:

$ oc describe sc glusterfs-storage-block
Name: glusterfs-storage-block
IsDefaultClass: No
Annotations: <none>
Provisioner: gluster.org/glusterblock
Parameters: chapauthenabled=true,hacount=3,restsecretname=heketi-storage-admin-secret-block,restsecretnamespace=app-storage,resturl=http://heketi-storage.app-storage.svc:8080,restuser=admin
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>

I am getting this error message when I try to create a PVC:

*Failed to provision volume with StorageClass "glusterfs-registry-block": failed to create volume: heketi block volume creation failed: [heketi] failed to create volume: Server busy. Retry operation later.*

Heketi is containerized and I am only trying to create one volume at a time, so I do not understand why I am getting that message.

Thanks in advance
John Mulligan
2018-Dec-20 14:18 UTC
[Gluster-users] Heketi error: Server busy. Retry operation later.
Hi,

The "Server busy" error is typically returned in one of two scenarios:

* The server really is busy processing multiple parallel requests.
* Heketi needs a block hosting volume to satisfy a block volume request, but the only block hosting volumes are pending (still being created).

In your case I suspect it's the latter. The most definitive way to determine whether this is the case is to use the 'heketi-cli db dump' command and examine the JSON db dump it produces. If there are items under the "pendingoperations" key that correspond to block volume creation and block hosting volume creation, it would seem you are in that situation. You can look for pending operations with a Type of 4 (OperationCreateBlockVolume) that contain an action with a change type of 2 (OpAddVolume) in the dump. You can also put your db dump in a pastebin if you are comfortable sharing it.

If the heketi server was stopped at any time while working on this operation, heketi considers it a "stale" operation, and this can block some other operations until it is resolved.

If we can confirm this is the case, we will need to clean up the situation. In all recent releases of Heketi you can stop the server, export the db to JSON, edit the JSON, and import it back into the db. I even have tools to help with the editing process posted in a PR here: https://github.com/heketi/heketi/pull/1413

Alternatively, if you are willing to use the "master" branch of heketi, we have recently (this week!) completed a feature that will automatically clean up some stale and failed pending operations. If you want to try out this code you should be able to avoid doing any extra manual steps. Simply upgrade the server and wait a bit. Use `heketi-cli server operations info` to watch the count go to zero and then attempt to create a new volume. If you do try this and something does not work, please file an issue and you'll get extra special attention from me, because I have been working a lot on this feature. :-)

If you do _not_ see a pending operation matching the description in the db dump, please get back to us, as we will need to examine the heketi logs from around when the error occurred.

PS. We tend to check our github issues more frequently than these lists. If you have any future issues you may get faster turnaround by asking there.
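
For anyone who would rather not read the dump by eye, the check described in the reply (pending operations with a Type of 4 that contain an action with a change type of 2) might be sketched roughly as below. This is only an illustration under assumptions: the "pendingoperations" key and the two numeric codes come from the message, but the field names "Type", "Actions" and "Change", and the idea that "pendingoperations" maps operation ids to entries, are guesses about the dump layout and may need adjusting to match your actual JSON. The function names are hypothetical.

```python
import json

# Numeric codes quoted in the reply above.
OPERATION_CREATE_BLOCK_VOLUME = 4  # OperationCreateBlockVolume
OP_ADD_VOLUME = 2                  # OpAddVolume


def load_dump(path):
    """Load a db dump that was saved to a file as JSON."""
    with open(path) as f:
        return json.load(f)


def find_block_ops_adding_volume(dump):
    """Return (op_id, op) pairs for pending block volume creations
    that also contain an 'add volume' action.

    ASSUMPTION: "pendingoperations" is a mapping of operation id to an
    entry with "Type" and "Actions" fields, and each action has a
    "Change" field. These names are not confirmed by the thread.
    """
    pending = dump.get("pendingoperations") or {}
    matches = []
    for op_id, op in pending.items():
        if op.get("Type") != OPERATION_CREATE_BLOCK_VOLUME:
            continue
        actions = op.get("Actions") or []
        if any(a.get("Change") == OP_ADD_VOLUME for a in actions):
            matches.append((op_id, op))
    return matches


# Example use, assuming the dump was saved to a file named dump.json:
#   for op_id, op in find_block_ops_adding_volume(load_dump("dump.json")):
#       print(op_id, op.get("Actions"))
```

If this prints any entries, the situation matches the second scenario in the reply; if it prints nothing, check the field names against the dump before concluding that no such pending operation exists.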