Kaushal M
2014-Nov-26 11:14 UTC
[Gluster-users] [ovirt-users] Gluster command [<UNKNOWN>] failed on server...
Based on the logs, my guess is that glusterd is being started before the network has come up, and that the addresses given to the bricks do not exactly match the addresses used during peer probe.

The gluster_after_reboot log has the line

"[2014-11-25 06:46:09.972113] E [glusterd-store.c:2632:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore".

Brick resolution fails when glusterd cannot match the address of a brick with one of the peers. Brick resolution happens in two phases:

1. We first try to identify the peer by performing string comparisons between the brick address and the peer addresses. (The peer names will be the names/addresses that were given when the peer was probed.)
2. If we don't find a match in step 1, we resolve the brick address and all the peer addresses into addrinfo structs, and then compare these structs to find a match. This process should generally find a match if one is available. It will fail only if the network is not up yet, as we cannot resolve addresses.

The above steps are applicable only to glusterfs versions >= 3.6. They were introduced to reduce problems with peer identification, like the one you encountered.

Since both steps failed to find a match in one run but succeeded later, we can conclude that
a) the bricks don't have the exact same string used in peer probe for their addresses, as step 1 failed, and
b) the network was not up in the initial run, as step 2 failed during the initial run but passed in the second run.

Please let me know if my conclusion is correct.

If it is, you can solve your problem in two ways:

1. Use the same string for the peer probe and for the brick address during volume create/add-brick. Ideally, we suggest you use properly resolvable FQDNs everywhere. If that is not possible, then use only IP addresses. Try to avoid short names.
2. During boot up, make sure to launch glusterd only after the network is up.
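To make the two-phase lookup described above a bit more concrete, here is a rough Python sketch of the idea. It is not glusterd's actual code (glusterd is written in C and compares addrinfo structs); the names find_peer and resolve, the example host names, and the exact comparison rules are assumptions made purely for illustration.

```python
import socket


def resolve(host):
    """Return the set of IP addresses that `host` resolves to.

    An empty set stands in for "resolution failed", which is what
    happens when the network (or DNS) is not up yet.
    """
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def find_peer(brick_host, peer_names, resolver=resolve):
    """Hypothetical two-phase match of a brick address against known peers."""
    # Phase 1: plain string comparison with the names used at peer probe.
    for name in peer_names:
        if name == brick_host:
            return name

    # Phase 2: resolve both sides and look for a common address.
    brick_addrs = resolver(brick_host)
    if not brick_addrs:
        return None  # cannot resolve anything, e.g. network not up
    for name in peer_names:
        if brick_addrs & resolver(name):
            return name
    return None


if __name__ == "__main__":
    # Made-up name table standing in for DNS once the network is up.
    table = {
        "gfs1": {"10.10.10.6"},
        "gfs1.example.com": {"10.10.10.6"},
        "gfs2.example.com": {"10.10.10.7"},
    }
    network_up = lambda host: table.get(host, set())
    network_down = lambda host: set()
    peers = ["gfs2.example.com", "gfs1.example.com"]

    # Brick uses a short name, peers were probed by FQDN.
    print(find_peer("gfs1", peers, network_up))    # gfs1.example.com
    print(find_peer("gfs1", peers, network_down))  # None
    # Same string everywhere: phase 1 matches even without the network.
    print(find_peer("gfs1.example.com", peers, network_down))  # gfs1.example.com
```

The resolver is passed in as a parameter only so that the "network not up" case can be simulated without touching real DNS. The last call shows why using the same string for peer probe and brick address avoids the problem even when glusterd starts too early.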
These changes will allow the new peer identification mechanism to do its job correctly.

If you have already followed these steps and still hit the problem, then please provide more information (setup, logs, etc.). It could be a much different problem that you are facing.

~kaushal

On Wed, Nov 26, 2014 at 4:01 PM, Punit Dambiwal <hypunit at gmail.com> wrote:
> Is there any one can help on this ??
>
> Thanks,
> punit
>
> On Wed, Nov 26, 2014 at 9:42 AM, Punit Dambiwal <hypunit at gmail.com> wrote:
>>
>> Hi,
>>
>> My Glusterfs version is :- glusterfs-3.6.1-1.el7
>>
>> On Wed, Nov 26, 2014 at 1:59 AM, Kanagaraj Mayilsamy <kmayilsa at redhat.com>
>> wrote:
>>>
>>> [+Gluster-users at gluster.org]
>>>
>>> "Initialization of volume 'management' failed, review your volfile
>>> again", glusterd throws this error when the service is started automatically
>>> after the reboot. But the service is successfully started later manually by
>>> the user.
>>>
>>> can somebody from gluster-users please help on this?
>>>
>>> glusterfs version: 3.5.1
>>>
>>> Thanks,
>>> Kanagaraj
>>>
>>> ----- Original Message -----
>>> > From: "Punit Dambiwal" <hypunit at gmail.com>
>>> > To: "Kanagaraj" <kmayilsa at redhat.com>
>>> > Cc: users at ovirt.org
>>> > Sent: Tuesday, November 25, 2014 7:24:45 PM
>>> > Subject: Re: [ovirt-users] Gluster command [<UNKNOWN>] failed on
>>> > server...
>>> >
>>> > Hi Kanagraj,
>>> >
>>> > Please check the attached log files....i didn't find any thing
>>> > special....
>>> >
>>> > On Tue, Nov 25, 2014 at 12:12 PM, Kanagaraj <kmayilsa at redhat.com>
>>> > wrote:
>>> >
>>> > > Do you see any errors in
>>> > > /var/log/glusterfs/etc-glusterfs-glusterd.vol.log or vdsm.log when
>>> > > the
>>> > > service is trying to start automatically after the reboot?
>>> > > >>> > > Thanks, >>> > > Kanagaraj >>> > > >>> > > >>> > > On 11/24/2014 08:13 PM, Punit Dambiwal wrote: >>> > > >>> > > Hi Kanagaraj, >>> > > >>> > > Yes...once i will start the gluster service and then vdsmd ...the >>> > > host >>> > > can connect to cluster...but the question is why it's not started >>> > > even it >>> > > has chkconfig enabled... >>> > > >>> > > I have tested it in two host cluster environment...(Centos 6.6 and >>> > > centos 7.0) on both hypervisior cluster..it's failed to reconnect in >>> > > to >>> > > cluster after reboot.... >>> > > >>> > > In both the environment glusterd enabled for next boot....but it's >>> > > failed with the same error....seems it's bug in either gluster or >>> > > Ovirt ?? >>> > > >>> > > Please help me to find the workaround here if can not resolve >>> > > it...as >>> > > without this the Host machine can not connect after reboot....that >>> > > means >>> > > engine will consider it as down and every time need to manually start >>> > > the >>> > > gluster service and vdsmd... ?? >>> > > >>> > > Thanks, >>> > > Punit >>> > > >>> > > On Mon, Nov 24, 2014 at 10:20 PM, Kanagaraj <kmayilsa at redhat.com> >>> > > wrote: >>> > > >>> > >> From vdsm.log "error: Connection failed. Please check if gluster >>> > >> daemon >>> > >> is operational." >>> > >> >>> > >> Starting glusterd service should fix this issue. 'service glusterd >>> > >> start' >>> > >> But i am wondering why the glusterd was not started automatically >>> > >> after >>> > >> the reboot. 
>>> > >> >>> > >> Thanks, >>> > >> Kanagaraj >>> > >> >>> > >> >>> > >> >>> > >> On 11/24/2014 07:18 PM, Punit Dambiwal wrote: >>> > >> >>> > >> Hi Kanagaraj, >>> > >> >>> > >> Please find the attached VDSM logs :- >>> > >> >>> > >> ---------------- >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:17,182::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >>> > >> Owner.cancelAll requests {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:17,182::task::993::Storage.TaskManager.Task::(_decref) >>> > >> Task=`1691d409-9b27-4585-8281-5ec26154367a`::ref 0 aborting False >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:32,393::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state init >>> > >> -> >>> > >> state preparing >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:32,393::logUtils::44::dispatcher::(wrapper) Run and protect: >>> > >> repoStats(options=None) >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:32,393::logUtils::47::dispatcher::(wrapper) Run and protect: >>> > >> repoStats, Return response: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:32,393::task::1191::Storage.TaskManager.Task::(prepare) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::finished: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:32,394::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state >>> > >> preparing >>> > >> -> >>> > >> state finished >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:32,394::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) >>> > >> Owner.releaseAll requests {} resources {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:32,394::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >>> > >> Owner.cancelAll requests {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 
21:41:32,394::task::993::Storage.TaskManager.Task::(_decref) >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::ref 0 aborting False >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,550::BindingXMLRPC::1132::vds::(wrapper) client >>> > >> [10.10.10.2]::call >>> > >> getCapabilities with () {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,553::utils::738::root::(execCmd) >>> > >> /sbin/ip route show to 0.0.0.0/0 table all (cwd None) >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,560::utils::758::root::(execCmd) >>> > >> SUCCESS: <err> = ''; <rc> = 0 >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,588::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,592::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-object',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,593::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-plugin',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-account',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-proxy',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-doc',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('gluster-swift-container',) not found >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package >>> > >> ('glusterfs-geo-replication',) not found >>> > >> Thread-13::DEBUG::2014-11-24 21:41:41,600::caps::646::root::(get) >>> > >> VirtioRNG DISABLED: libvirt version 0.10.2-29.el6_5.9 required >>>> > >> 0.10.2-31 >>> > >> 
Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,603::BindingXMLRPC::1139::vds::(wrapper) return >>> > >> getCapabilities >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': >>> > >> {'HBAInventory': >>> > >> {'iSCSI': [{'InitiatorName': >>> > >> 'iqn.1994-05.com.redhat:32151ce183c8'}], >>> > >> 'FC': >>> > >> []}, 'packages2': {'kernel': {'release': '431.el6.x86_64', >>> > >> 'buildtime': >>> > >> 1385061309.0, 'version': '2.6.32'}, 'glusterfs-rdma': {'release': >>> > >> '1.el6', >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'glusterfs-fuse': >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': '3.5.1'}, >>> > >> 'spice-server': {'release': '6.el6_5.2', 'buildtime': 1402324637L, >>> > >> 'version': '0.12.4'}, 'vdsm': {'release': '1.gitdb83943.el6', >>> > >> 'buildtime': >>> > >> 1412784567L, 'version': '4.16.7'}, 'qemu-kvm': {'release': >>> > >> '2.415.el6_5.10', 'buildtime': 1402435700L, 'version': '0.12.1.2'}, >>> > >> 'qemu-img': {'release': '2.415.el6_5.10', 'buildtime': 1402435700L, >>> > >> 'version': '0.12.1.2'}, 'libvirt': {'release': '29.el6_5.9', >>> > >> 'buildtime': >>> > >> 1402404612L, 'version': '0.10.2'}, 'glusterfs': {'release': '1.el6', >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'mom': {'release': >>> > >> '2.el6', >>> > >> 'buildtime': 1403794344L, 'version': '0.4.1'}, 'glusterfs-server': >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': '3.5.1'}}, >>> > >> 'numaNodeDistance': {'1': [20, 10], '0': [10, 20]}, 'cpuModel': >>> > >> 'Intel(R) >>> > >> Xeon(R) CPU X5650 @ 2.67GHz', 'liveMerge': 'false', >>> > >> 'hooks': >>> > >> {}, >>> > >> 'cpuSockets': '2', 'vmTypes': ['kvm'], 'selinux': {'mode': '1'}, >>> > >> 'kdumpStatus': 0, 'supportedProtocols': ['2.2', '2.3'], 'networks': >>> > >> {'ovirtmgmt': {'iface': u'bond0.10', 'addr': '43.252.176.16', >>> > >> 'bridged': >>> > >> False, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': '1500', >>> > >> 'bootproto4': 'none', 
'netmask': '255.255.255.0', 'ipv4addrs': [' >>> > >> 43.252.176.16/24' <http://43.252.176.16/24%27>], 'interface': >>> > >> u'bond0.10', 'ipv6gateway': '::', 'gateway': '43.25.17.1'}, >>> > >> 'Internal': >>> > >> {'iface': 'Internal', 'addr': '', 'cfg': {'DEFROUTE': 'no', >>> > >> 'HOTPLUG': >>> > >> 'no', 'MTU': '9000', 'DELAY': '0', 'NM_CONTROLLED': 'no', >>> > >> 'BOOTPROTO': >>> > >> 'none', 'STP': 'off', 'DEVICE': 'Internal', 'TYPE': 'Bridge', >>> > >> 'ONBOOT': >>> > >> 'no'}, 'bridged': True, 'ipv6addrs': >>> > >> ['fe80::210:18ff:fecd:daac/64'], >>> > >> 'gateway': '', 'bootproto4': 'none', 'netmask': '', 'stp': 'off', >>> > >> 'ipv4addrs': [], 'mtu': '9000', 'ipv6gateway': '::', 'ports': >>> > >> ['bond1.100']}, 'storage': {'iface': u'bond1', 'addr': '10.10.10.6', >>> > >> 'bridged': False, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], >>> > >> 'mtu': >>> > >> '9000', 'bootproto4': 'none', 'netmask': '255.255.255.0', >>> > >> 'ipv4addrs': [' >>> > >> 10.10.10.6/24' <http://10.10.10.6/24%27>], 'interface': u'bond1', >>> > >> 'ipv6gateway': '::', 'gateway': ''}, 'VMNetwork': {'iface': >>> > >> 'VMNetwork', >>> > >> 'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': >>> > >> '1500', >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': >>> > >> 'off', >>> > >> 'DEVICE': 'VMNetwork', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, 'bridged': >>> > >> True, >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'gateway': '', >>> > >> 'bootproto4': >>> > >> 'none', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], 'mtu': '1500', >>> > >> 'ipv6gateway': '::', 'ports': ['bond0.36']}}, 'bridges': >>> > >> {'Internal': >>> > >> {'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': >>> > >> '9000', >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': >>> > >> 'off', >>> > >> 'DEVICE': 'Internal', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, >>> > >> 'ipv6addrs': >>> > >> ['fe80::210:18ff:fecd:daac/64'], 'mtu': '9000', 'netmask': 
'', >>> > >> 'stp': >>> > >> 'off', 'ipv4addrs': [], 'ipv6gateway': '::', 'gateway': '', 'opts': >>> > >> {'topology_change_detected': '0', 'multicast_last_member_count': >>> > >> '2', >>> > >> 'hash_elasticity': '4', 'multicast_query_response_interval': '999', >>> > >> 'multicast_snooping': '1', 'multicast_startup_query_interval': >>> > >> '3124', >>> > >> 'hello_timer': '31', 'multicast_querier_interval': '25496', >>> > >> 'max_age': >>> > >> '1999', 'hash_max': '512', 'stp_state': '0', 'root_id': >>> > >> '8000.001018cddaac', 'priority': '32768', >>> > >> 'multicast_membership_interval': >>> > >> '25996', 'root_path_cost': '0', 'root_port': '0', >>> > >> 'multicast_querier': >>> > >> '0', >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', >>> > >> 'topology_change': '0', 'bridge_id': '8000.001018cddaac', >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': >>> > >> '31', >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', >>> > >> 'multicast_query_interval': '12498', >>> > >> 'multicast_last_member_interval': >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': >>> > >> ['bond1.100']}, 'VMNetwork': {'addr': '', 'cfg': {'DEFROUTE': 'no', >>> > >> 'HOTPLUG': 'no', 'MTU': '1500', 'DELAY': '0', 'NM_CONTROLLED': 'no', >>> > >> 'BOOTPROTO': 'none', 'STP': 'off', 'DEVICE': 'VMNetwork', 'TYPE': >>> > >> 'Bridge', >>> > >> 'ONBOOT': 'no'}, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], >>> > >> 'mtu': >>> > >> '1500', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], 'ipv6gateway': >>> > >> '::', >>> > >> 'gateway': '', 'opts': {'topology_change_detected': '0', >>> > >> 'multicast_last_member_count': '2', 'hash_elasticity': '4', >>> > >> 'multicast_query_response_interval': '999', 'multicast_snooping': >>> > >> '1', >>> > >> 'multicast_startup_query_interval': '3124', 'hello_timer': '131', >>> > >> 'multicast_querier_interval': '25496', 'max_age': '1999', >>> > >> 'hash_max': >>> > >> '512', 'stp_state': '0', 
'root_id': '8000.60eb6920b46c', 'priority': >>> > >> '32768', 'multicast_membership_interval': '25996', 'root_path_cost': >>> > >> '0', >>> > >> 'root_port': '0', 'multicast_querier': '0', >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', >>> > >> 'topology_change': '0', 'bridge_id': '8000.60eb6920b46c', >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': >>> > >> '31', >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', >>> > >> 'multicast_query_interval': '12498', >>> > >> 'multicast_last_member_interval': >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': >>> > >> ['bond0.36']}}, 'uuid': '44454C4C-4C00-1057-8053-B7C04F504E31', >>> > >> 'lastClientIface': 'bond1', 'nics': {'eth3': {'permhwaddr': >>> > >> '00:10:18:cd:da:ae', 'addr': '', 'cfg': {'SLAVE': 'yes', >>> > >> 'NM_CONTROLLED': >>> > >> 'no', 'MTU': '9000', 'HWADDR': '00:10:18:cd:da:ae', 'MASTER': >>> > >> 'bond1', >>> > >> 'DEVICE': 'eth3', 'ONBOOT': 'no'}, 'ipv6addrs': [], 'mtu': '9000', >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '00:10:18:cd:da:ac', >>> > >> 'speed': >>> > >> 1000}, 'eth2': {'permhwaddr': '00:10:18:cd:da:ac', 'addr': '', >>> > >> 'cfg': >>> > >> {'SLAVE': 'yes', 'NM_CONTROLLED': 'no', 'MTU': '9000', 'HWADDR': >>> > >> '00:10:18:cd:da:ac', 'MASTER': 'bond1', 'DEVICE': 'eth2', 'ONBOOT': >>> > >> 'no'}, >>> > >> 'ipv6addrs': [], 'mtu': '9000', 'netmask': '', 'ipv4addrs': [], >>> > >> 'hwaddr': >>> > >> '00:10:18:cd:da:ac', 'speed': 1000}, 'eth1': {'permhwaddr': >>> > >> '60:eb:69:20:b4:6d', 'addr': '', 'cfg': {'SLAVE': 'yes', >>> > >> 'NM_CONTROLLED': >>> > >> 'no', 'MTU': '1500', 'HWADDR': '60:eb:69:20:b4:6d', 'MASTER': >>> > >> 'bond0', >>> > >> 'DEVICE': 'eth1', 'ONBOOT': 'yes'}, 'ipv6addrs': [], 'mtu': '1500', >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', >>> > >> 'speed': >>> > >> 1000}, 'eth0': {'permhwaddr': '60:eb:69:20:b4:6c', 'addr': '', >>> > >> 'cfg': >>> > >> {'SLAVE': 'yes', 
'NM_CONTROLLED': 'no', 'MTU': '1500', 'HWADDR': >>> > >> '60:eb:69:20:b4:6c', 'MASTER': 'bond0', 'DEVICE': 'eth0', 'ONBOOT': >>> > >> 'yes'}, >>> > >> 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'ipv4addrs': [], >>> > >> 'hwaddr': >>> > >> '60:eb:69:20:b4:6c', 'speed': 1000}}, 'software_revision': '1', >>> > >> 'clusterLevels': ['3.0', '3.1', '3.2', '3.3', '3.4', '3.5'], >>> > >> 'cpuFlags': >>> > >> >>> > >> u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,popcnt,aes,lahf_lm,tpr_shadow,vnmi,flexpriority,ept,vpid,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270', >>> > >> 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:32151ce183c8', >>> > >> 'netConfigDirty': 'False', 'supportedENGINEs': ['3.0', '3.1', '3.2', >>> > >> '3.3', >>> > >> '3.4', '3.5'], 'autoNumaBalancing': 2, 'reservedMem': '321', >>> > >> 'bondings': >>> > >> {'bond4': {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', >>> > >> 'slaves': >>> > >> [], 'hwaddr': '00:00:00:00:00:00'}, 'bond0': {'addr': '', 'cfg': >>> > >> {'HOTPLUG': 'no', 'MTU': '1500', 'NM_CONTROLLED': 'no', >>> > >> 'BONDING_OPTS': >>> > >> 'mode=4 miimon=100', 'DEVICE': 'bond0', 'ONBOOT': 'yes'}, >>> > >> 'ipv6addrs': >>> > >> ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': '1500', 'netmask': '', >>> > >> 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', 'slaves': ['eth0', >>> > >> 'eth1'], >>> > >> 'opts': {'miimon': '100', 'mode': '4'}}, 'bond1': {'addr': >>> > >> '10.10.10.6', >>> > >> 'cfg': {'DEFROUTE': 'no', 'IPADDR': '10.10.10.6', 'HOTPLUG': 'no', >>> > >> 'MTU': >>> > >> '9000', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', >>> > >> 'BOOTPROTO': >>> > >> 'none', 'BONDING_OPTS': 'mode=4 miimon=100', 'DEVICE': 'bond1', >>> 
> >> 'ONBOOT': >>> > >> 'no'}, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'mtu': '9000', >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['10.10.10.6/24' >>> > >> <http://10.10.10.6/24%27>], 'hwaddr': '00:10:18:cd:da:ac', 'slaves': >>> > >> ['eth2', 'eth3'], 'opts': {'miimon': '100', 'mode': '4'}}, 'bond2': >>> > >> {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', 'slaves': [], >>> > >> 'hwaddr': '00:00:00:00:00:00'}, 'bond3': {'addr': '', 'cfg': {}, >>> > >> 'mtu': >>> > >> '1500', 'netmask': '', 'slaves': [], 'hwaddr': >>> > >> '00:00:00:00:00:00'}}, >>> > >> 'software_version': '4.16', 'memSize': '24019', 'cpuSpeed': >>> > >> '2667.000', >>> > >> 'numaNodes': {u'1': {'totalMemory': '12288', 'cpus': [6, 7, 8, 9, >>> > >> 10, 11, >>> > >> 18, 19, 20, 21, 22, 23]}, u'0': {'totalMemory': '12278', 'cpus': [0, >>> > >> 1, 2, >>> > >> 3, 4, 5, 12, 13, 14, 15, 16, 17]}}, 'version_name': 'Snow Man', >>> > >> 'vlans': >>> > >> {'bond0.10': {'iface': 'bond0', 'addr': '43.25.17.16', 'cfg': >>> > >> {'DEFROUTE': >>> > >> 'yes', 'VLAN': 'yes', 'IPADDR': '43.25.17.16', 'HOTPLUG': 'no', >>> > >> 'GATEWAY': >>> > >> '43.25.17.1', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', >>> > >> 'BOOTPROTO': 'none', 'DEVICE': 'bond0.10', 'MTU': '1500', 'ONBOOT': >>> > >> 'yes'}, >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 10, 'mtu': >>> > >> '1500', >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['43.25.17.16/24'] >>> > >> <http://43.25.17.16/24%27%5D>}, 'bond0.36': {'iface': 'bond0', >>> > >> 'addr': >>> > >> '', 'cfg': {'BRIDGE': 'VMNetwork', 'VLAN': 'yes', 'HOTPLUG': 'no', >>> > >> 'MTU': >>> > >> '1500', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond0.36', 'ONBOOT': >>> > >> 'no'}, >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 36, 'mtu': >>> > >> '1500', >>> > >> 'netmask': '', 'ipv4addrs': []}, 'bond1.100': {'iface': 'bond1', >>> > >> 'addr': >>> > >> '', 'cfg': {'BRIDGE': 'Internal', 'VLAN': 'yes', 'HOTPLUG': 'no', >>> > >> 'MTU': 
>>> > >> '9000', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond1.100', 'ONBOOT': >>> > >> 'no'}, >>> > >> 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'vlanid': 100, 'mtu': >>> > >> '9000', >>> > >> 'netmask': '', 'ipv4addrs': []}}, 'cpuCores': '12', 'kvmEnabled': >>> > >> 'true', >>> > >> 'guestOverhead': '65', 'cpuThreads': '24', 'emulatedMachines': >>> > >> [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', >>> > >> u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', >>> > >> u'rhel5.4.0'], >>> > >> 'operatingSystem': {'release': '5.el6.centos.11.1', 'version': '6', >>> > >> 'name': >>> > >> 'RHEL'}, 'lastClient': '10.10.10.2'}} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,620::BindingXMLRPC::1132::vds::(wrapper) client >>> > >> [10.10.10.2]::call >>> > >> getHardwareInfo with () {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,621::BindingXMLRPC::1139::vds::(wrapper) return >>> > >> getHardwareInfo >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': >>> > >> {'systemProductName': 'CS24-TY', 'systemSerialNumber': '7LWSPN1', >>> > >> 'systemFamily': 'Server', 'systemVersion': 'A00', 'systemUUID': >>> > >> '44454c4c-4c00-1057-8053-b7c04f504e31', 'systemManufacturer': >>> > >> 'Dell'}} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:41,733::BindingXMLRPC::1132::vds::(wrapper) client >>> > >> [10.10.10.2]::call >>> > >> hostsList with () {} flowID [222e8036] >>> > >> Thread-13::ERROR::2014-11-24 >>> > >> 21:41:44,753::BindingXMLRPC::1148::vds::(wrapper) vdsm exception >>> > >> occured >>> > >> Traceback (most recent call last): >>> > >> File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in wrapper >>> > >> res = f(*args, **kwargs) >>> > >> File "/usr/share/vdsm/gluster/api.py", line 54, in wrapper >>> > >> rv = func(*args, **kwargs) >>> > >> File "/usr/share/vdsm/gluster/api.py", line 251, in hostsList >>> > >> return {'hosts': self.svdsmProxy.glusterPeerStatus()} >>> > >> File "/usr/share/vdsm/supervdsm.py", 
line 50, in __call__ >>> > >> return callMethod() >>> > >> File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda> >>> > >> **kwargs) >>> > >> File "<string>", line 2, in glusterPeerStatus >>> > >> File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, >>> > >> in >>> > >> _callmethod >>> > >> raise convert_to_error(kind, result) >>> > >> GlusterCmdExecFailedException: Command execution failed >>> > >> error: Connection failed. Please check if gluster daemon is >>> > >> operational. >>> > >> return code: 1 >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,949::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state init >>> > >> -> >>> > >> state preparing >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:50,950::logUtils::44::dispatcher::(wrapper) Run and protect: >>> > >> repoStats(options=None) >>> > >> Thread-13::INFO::2014-11-24 >>> > >> 21:41:50,950::logUtils::47::dispatcher::(wrapper) Run and protect: >>> > >> repoStats, Return response: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,950::task::1191::Storage.TaskManager.Task::(prepare) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::finished: {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,950::task::595::Storage.TaskManager.Task::(_updateState) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state >>> > >> preparing >>> > >> -> >>> > >> state finished >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:50,951::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) >>> > >> Owner.releaseAll requests {} resources {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> >>> > >> 21:41:50,951::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >>> > >> Owner.cancelAll requests {} >>> > >> Thread-13::DEBUG::2014-11-24 >>> > >> 21:41:50,951::task::993::Storage.TaskManager.Task::(_decref) >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::ref 0 
aborting False >>> > >> ------------------------------- >>> > >> >>> > >> [root at compute4 ~]# service glusterd status >>> > >> glusterd is stopped >>> > >> [root at compute4 ~]# chkconfig --list | grep glusterd >>> > >> glusterd 0:off 1:off 2:on 3:on 4:on 5:on >>> > >> 6:off >>> > >> [root at compute4 ~]# >>> > >> >>> > >> Thanks, >>> > >> Punit >>> > >> >>> > >> On Mon, Nov 24, 2014 at 6:36 PM, Kanagaraj <kmayilsa at redhat.com> >>> > >> wrote: >>> > >> >>> > >>> Can you send the corresponding error in vdsm.log from the host? >>> > >>> >>> > >>> Also check if glusterd service is running. >>> > >>> >>> > >>> Thanks, >>> > >>> Kanagaraj >>> > >>> >>> > >>> >>> > >>> On 11/24/2014 03:39 PM, Punit Dambiwal wrote: >>> > >>> >>> > >>> Hi, >>> > >>> >>> > >>> After reboot my Hypervisior host can not activate again in the >>> > >>> cluster >>> > >>> and failed with the following error :- >>> > >>> >>> > >>> Gluster command [<UNKNOWN>] failed on server... >>> > >>> >>> > >>> Engine logs :- >>> > >>> >>> > >>> 2014-11-24 18:05:28,397 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-64) START, >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 5f251c90 >>> > >>> 2014-11-24 18:05:30,609 INFO >>> > >>> >>> > >>> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >>> > >>> (DefaultQuartzScheduler_Worker-64) FINISH, >>> > >>> GlusterVolumesListVDSCommand, >>> > >>> return: >>> > >>> >>> > >>> {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at d95203e0}, >>> > >>> log id: 5f251c90 >>> > >>> 2014-11-24 18:05:33,768 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (ajp--127.0.0.1-8702-8) >>> > >>> [287d570d] Lock Acquired to object EngineLock [exclusiveLocks= key: >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: 
VDS >>> > >>> , sharedLocks= ] >>> > >>> 2014-11-24 18:05:33,795 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Running command: >>> > >>> ActivateVdsCommand internal: false. Entities affected : ID: >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDSAction group >>> > >>> MANIPULATE_HOST >>> > >>> with role type ADMIN >>> > >>> 2014-11-24 18:05:33,796 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Before acquiring >>> > >>> lock in >>> > >>> order to prevent monitoring for host Compute5 from data-center >>> > >>> SV_WTC >>> > >>> 2014-11-24 18:05:33,797 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Lock acquired, from >>> > >>> now a >>> > >>> monitoring of host will be skipped for host Compute5 from >>> > >>> data-center >>> > >>> SV_WTC >>> > >>> 2014-11-24 18:05:33,817 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] START, >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=Unassigned, >>> > >>> nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: >>> > >>> 1cbc7311 >>> > >>> 2014-11-24 18:05:33,820 INFO >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] FINISH, >>> > >>> SetVdsStatusVDSCommand, log id: 1cbc7311 >>> > >>> 2014-11-24 18:05:34,086 INFO >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >>> > >>> (org.ovirt.thread.pool-8-thread-45) Activate finished. Lock >>> > >>> released. 
>>> > >>> Monitoring can run now for host Compute5 from data-center SV_WTC
>>> > >>> 2014-11-24 18:05:34,088 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-45) Correlation ID: 287d570d, Job ID: 5ef8e4d6-b2bc-469e-8e81-7ef74b2a001a, Call Stack: null, Custom Event ID: -1, Message: Host Compute5 was activated by admin.
>>> > >>> 2014-11-24 18:05:34,090 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (org.ovirt.thread.pool-8-thread-45) Lock freed to object EngineLock [exclusiveLocks= key: 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: VDS , sharedLocks= ]
>>> > >>> 2014-11-24 18:05:35,792 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-55) [3706e836] START, GlusterVolumesListVDSCommand(HostName = Compute4, HostId 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 48a0c832
>>> > >>> 2014-11-24 18:05:37,064 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (DefaultQuartzScheduler_Worker-69) START, GetHardwareInfoVDSCommand(HostName = Compute5, HostId 0bf6b00f-7947-4411-b55a-cc5eea2b381a, vds=Host[Compute5,0bf6b00f-7947-4411-b55a-cc5eea2b381a]), log id: 6d560cc2
>>> > >>> 2014-11-24 18:05:37,074 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] (DefaultQuartzScheduler_Worker-69) FINISH, GetHardwareInfoVDSCommand, log id: 6d560cc2
>>> > >>> 2014-11-24 18:05:37,093 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-69) Host Compute5 is running with disabled SELinux.
>>> > >>> 2014-11-24 18:05:37,127 INFO [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (DefaultQuartzScheduler_Worker-69) [2b4a51cf] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS
>>> > >>> 2014-11-24 18:05:37,147 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler_Worker-69) [2b4a51cf] START, GlusterServersListVDSCommand(HostName = Compute5, HostId 0bf6b00f-7947-4411-b55a-cc5eea2b381a), log id: 4faed87
>>> > >>> 2014-11-24 18:05:37,164 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] (DefaultQuartzScheduler_Worker-69) [2b4a51cf] FINISH, GlusterServersListVDSCommand, log id: 4faed87
>>> > >>> 2014-11-24 18:05:37,189 INFO [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Running command: SetNonOperationalVdsCommand internal: true. Entities affected : ID: 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS
>>> > >>> 2014-11-24 18:05:37,206 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (DefaultQuartzScheduler_Worker-69) [4a84c4e5] START, SetVdsStatusVDSCommand(HostName = Compute5, HostId 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=NonOperational, nonOperationalReason=GLUSTER_COMMAND_FAILED, stopSpmFailureLogged=false), log id: fed5617
>>> > >>> 2014-11-24 18:05:37,209 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (DefaultQuartzScheduler_Worker-69) [4a84c4e5] FINISH, SetVdsStatusVDSCommand, log id: fed5617
>>> > >>> 2014-11-24 18:05:37,223 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: 4a84c4e5, Job ID: 4bfd4a6d-c3ef-468f-a40e-a3a6ca13011b, Call Stack: null, Custom Event ID: -1, Message: Gluster command [<UNKNOWN>] failed on server Compute5.
>>> > >>> 2014-11-24 18:05:37,243 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Status of host Compute5 was set to NonOperational.
>>> > >>> 2014-11-24 18:05:37,272 INFO [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Running command: HandleVdsVersionCommand internal: true. Entities affected : ID: 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS
>>> > >>> 2014-11-24 18:05:37,274 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Host 0bf6b00f-7947-4411-b55a-cc5eea2b381a : Compute5 is already in NonOperational status for reason GLUSTER_COMMAND_FAILED. SetNonOperationalVds command is skipped.
>>> > >>> 2014-11-24 18:05:38,065 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-55) [3706e836] FINISH, GlusterVolumesListVDSCommand, return: {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity@4e72a1b1}, log id: 48a0c832
>>> > >>> 2014-11-24 18:05:43,243 INFO [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] (DefaultQuartzScheduler_Worker-35) START, GlusterVolumesListVDSCommand(HostName = Compute4, HostId 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 3ce13ebc
>>> > >>> ^C
>>> > >>> [root@ccr01 ~]#
>>> > >>>
>>> > >>> Thanks,
>>> > >>> Punit
>>> > >>>
>>> > >>> _______________________________________________
>>> > >>> Users mailing list
>>> > >>> Users at ovirt.org
>>> > >>> http://lists.ovirt.org/mailman/listinfo/users
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
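
Editor's sketch: the two-phase brick resolution Kaushal describes in this message can be illustrated roughly as follows. This is not glusterd's actual code (which is C, in glusterd-store.c and related files); it is a simplified Python illustration. The names `resolve` and `match_brick_to_peer`, and the shape of the `peers` mapping, are assumptions made up for this example.

```python
import socket


def resolve(host):
    """Resolve a hostname or IP to a set of address strings.

    Returns an empty set if resolution fails, which is what happens
    when the network or DNS is not up yet.
    """
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def match_brick_to_peer(brick_host, peers, resolver=resolve):
    """Find the peer that owns brick_host, or None.

    peers is a hypothetical mapping: peer id -> list of names/addresses
    that were used when the peer was probed.
    """
    # Phase 1: plain string comparison against the probed names.
    for peer_id, names in peers.items():
        if brick_host in names:
            return peer_id

    # Phase 2: resolve both sides and compare the resulting addresses.
    brick_addrs = resolver(brick_host)
    if not brick_addrs:
        return None  # cannot resolve: e.g. network not up at boot
    for peer_id, names in peers.items():
        for name in names:
            if brick_addrs & resolver(name):
                return peer_id
    return None
```

With a brick address that differs from the probed name, phase 1 misses and the outcome depends entirely on whether resolution works at that moment, which matches the "failed at boot, worked later" behaviour reported in this thread.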
Punit Dambiwal
2014-Nov-27 01:29 UTC
[Gluster-users] [ovirt-users] Gluster command [<UNKNOWN>] failed on server...
Hi Kaushal,

Thanks for the detailed reply. Let me explain my setup first:

1. oVirt Engine
2. 4 hosts that also act as storage machines (host and Gluster combined)
3. Every host has 24 bricks

Whenever a host machine reboots, it comes up but cannot join the cluster again, and throws the following error: "Gluster command [<UNKNOWN>] failed on server.."

Please see my comments inline:

1. Use the same string for doing the peer probe and for the brick address during volume create/add-brick. Ideally, we suggest you use properly resolvable FQDNs everywhere. If that is not possible, then use only IP addresses. Try to avoid short names.

---------------
[root@cpu05 ~]# gluster peer status
Number of Peers: 3

Hostname: cpu03.stack.com
Uuid: 5729b8c4-e80d-4353-b456-6f467bddbdfb
State: Peer in Cluster (Connected)

Hostname: cpu04.stack.com
Uuid: d272b790-c4b2-4bed-ba68-793656e6d7b0
State: Peer in Cluster (Connected)
Other names:
10.10.0.8

Hostname: cpu02.stack.com
Uuid: 8d8a7041-950e-40d0-85f9-58d14340ca25
State: Peer in Cluster (Connected)
[root@cpu05 ~]#
----------------

2. During boot up, make sure to launch glusterd only after the network is up. This will allow the new peer identification mechanism to do its job correctly.

>> I think the service itself is already doing the same job:

[root@cpu05 ~]# cat /usr/lib/systemd/system/glusterd.service
[Unit]
Description=GlusterFS, a clustered file-system server
After=network.target rpcbind.service
Before=network-online.target

[Service]
Type=forking
PIDFile=/var/run/glusterd.pid
LimitNOFILE=65536
ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid
KillMode=process

[Install]
WantedBy=multi-user.target
[root@cpu05 ~]#
--------------------

gluster logs:

[2014-11-24 09:22:22.147471] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.1 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
[2014-11-24 09:22:22.151565] I [glusterd.c:1214:init] 0-management: Maximum allowed open file descriptors set to 65536
[2014-11-24 09:22:22.151599] I [glusterd.c:1259:init] 0-management: Using /var/lib/glusterd as working directory
[2014-11-24 09:22:22.155216] W [rdma.c:4195:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
[2014-11-24 09:22:22.155264] E [rdma.c:4483:init] 0-rdma.management: Failed to initialize IB Device
[2014-11-24 09:22:22.155285] E [rpc-transport.c:333:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2014-11-24 09:22:22.155354] W [rpcsvc.c:1524:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2014-11-24 09:22:22.156290] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
[2014-11-24 09:22:22.161318] I [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
[2014-11-24 09:22:22.821800] I [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2014-11-24 09:22:22.825810] I [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2014-11-24 09:22:22.828705] I [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2014-11-24 09:22:22.828771] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-11-24 09:22:22.832670] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-11-24 09:22:22.835919] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-11-24 09:22:22.840209] E [glusterd-store.c:4248:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2014-11-24 09:22:22.840233] E [xlator.c:425:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2014-11-24 09:22:22.840245] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2014-11-24 09:22:22.840264] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
[2014-11-24 09:22:22.840754] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (0), shutting down

Thanks,
Punit

On Wed, Nov 26, 2014 at 7:14 PM, Kaushal M <kshlmster at gmail.com> wrote:
> Based on the logs I can guess that glusterd is being started before the network has come up and that the addresses given to bricks do not directly match the addresses used in during peer probe.
>
> The gluster_after_reboot log has the line "[2014-11-25 06:46:09.972113] E [glusterd-store.c:2632:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore".
>
> Brick resolution fails when glusterd cannot match the address for the brick, with one of the peers. Brick resolution happens in two phases,
> 1. We first try to identify the peer by performing string comparisions with the brick address and the peer addresses (The peer names will be the names/addresses that were given when the peer was probed).
> 2. If we don't find a match from step 1, we will then resolve all the brick address and the peer addresses into addrinfo structs, and then compare these structs to find a match. This process should generally find a match if available. This will fail only if the network is not up yet as we cannot resolve addresses.
>
> The above steps are applicable only to glusterfs versions >=3.6. They were introduced to reduce problems with peer identification, like the one you encountered
>
> Since both of the steps failed to find a match in one run, but succeeded later, we can come to the conclusion that,
> a) the bricks don't have the exact same string used in peer probe for their addresses as step 1 failed, and
> b) the network was not up in the initial run, as step 2 failed during the initial run, but passed in the second run.
>
> Please let me know if my conclusion is correct.
>
> If it is, you can solve your problem in two ways.
> 1. Use the same string for doing the peer probe and for the brick address during volume create/add-brick. Ideally, we suggest you use properly resolvable FQDNs everywhere. If that is not possible, then use only IP addresses. Try to avoid short names.
> 2. During boot up, make sure to launch glusterd only after the network is up. This will allow the new peer identification mechanism to do its job correctly.
>
> If you have already followed these steps and yet still hit the problem, then please provide more information (setup, logs, etc.). It could be much different problem that you are facing.
>
> ~kaushal
>
> On Wed, Nov 26, 2014 at 4:01 PM, Punit Dambiwal <hypunit at gmail.com> wrote:
> > Is there any one can help on this ??
> >
> > Thanks,
> > punit
> >
> > On Wed, Nov 26, 2014 at 9:42 AM, Punit Dambiwal <hypunit at gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> My Glusterfs version is :- glusterfs-3.6.1-1.el7
> >>
> >> On Wed, Nov 26, 2014 at 1:59 AM, Kanagaraj Mayilsamy <kmayilsa at redhat.com> wrote:
> >>>
> >>> [+Gluster-users at gluster.org]
> >>>
> >>> "Initialization of volume 'management' failed, review your volfile again", glusterd throws this error when the service is started automatically after the reboot. But the service is successfully started later manually by the user.
> >>>
> >>> can somebody from gluster-users please help on this?
> >>>
> >>> glusterfs version: 3.5.1
> >>>
> >>> Thanks,
> >>> Kanagaraj
> >>>
> >>> ----- Original Message -----
> >>> > From: "Punit Dambiwal" <hypunit at gmail.com>
> >>> > To: "Kanagaraj" <kmayilsa at redhat.com>
> >>> > Cc: users at ovirt.org
> >>> > Sent: Tuesday, November 25, 2014 7:24:45 PM
> >>> > Subject: Re: [ovirt-users] Gluster command [<UNKNOWN>] failed on server...
> >>> >
> >>> > Hi Kanagraj,
> >>> >
> >>> > Please check the attached log files....i didn't find any thing special....
> >>> >
> >>> > On Tue, Nov 25, 2014 at 12:12 PM, Kanagaraj <kmayilsa at redhat.com> wrote:
> >>> >
> >>> > > Do you see any errors in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log or vdsm.log when the service is trying to start automatically after the reboot?
> >>> > >
> >>> > > Thanks,
> >>> > > Kanagaraj
> >>> > >
> >>> > > On 11/24/2014 08:13 PM, Punit Dambiwal wrote:
> >>> > >
> >>> > > Hi Kanagaraj,
> >>> > >
> >>> > > Yes...once i will start the gluster service and then vdsmd ...the host can connect to cluster...but the question is why it's not started even it has chkconfig enabled...
> >>> > >
> >>> > > I have tested it in two host cluster environment...(Centos 6.6 and centos 7.0) on both hypervisior cluster..it's failed to reconnect in to cluster after reboot....
> >>> > >
> >>> > > In both the environment glusterd enabled for next boot....but it's failed with the same error....seems it's bug in either gluster or Ovirt ??
> >>> > >
> >>> > > Please help me to find the workaround here if can not resolve it...as without this the Host machine can not connect after reboot....that means engine will consider it as down and every time need to manually start the gluster service and vdsmd... ??
> >>> > >
> >>> > > Thanks,
> >>> > > Punit
> >>> > >
> >>> > > On Mon, Nov 24, 2014 at 10:20 PM, Kanagaraj <kmayilsa at redhat.com> wrote:
> >>> > >
> >>> > >> From vdsm.log "error: Connection failed. Please check if gluster daemon is operational."
> >>> > >>
> >>> > >> Starting glusterd service should fix this issue. 'service glusterd start'
> >>> > >> But i am wondering why the glusterd was not started automatically after the reboot.
> >>> > >> > >>> > >> Thanks, > >>> > >> Kanagaraj > >>> > >> > >>> > >> > >>> > >> > >>> > >> On 11/24/2014 07:18 PM, Punit Dambiwal wrote: > >>> > >> > >>> > >> Hi Kanagaraj, > >>> > >> > >>> > >> Please find the attached VDSM logs :- > >>> > >> > >>> > >> ---------------- > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> > >>> > >> > 21:41:17,182::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) > >>> > >> Owner.cancelAll requests {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:17,182::task::993::Storage.TaskManager.Task::(_decref) > >>> > >> Task=`1691d409-9b27-4585-8281-5ec26154367a`::ref 0 aborting False > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:32,393::task::595::Storage.TaskManager.Task::(_updateState) > >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state > init > >>> > >> -> > >>> > >> state preparing > >>> > >> Thread-13::INFO::2014-11-24 > >>> > >> 21:41:32,393::logUtils::44::dispatcher::(wrapper) Run and protect: > >>> > >> repoStats(options=None) > >>> > >> Thread-13::INFO::2014-11-24 > >>> > >> 21:41:32,393::logUtils::47::dispatcher::(wrapper) Run and protect: > >>> > >> repoStats, Return response: {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:32,393::task::1191::Storage.TaskManager.Task::(prepare) > >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::finished: {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:32,394::task::595::Storage.TaskManager.Task::(_updateState) > >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state > >>> > >> preparing > >>> > >> -> > >>> > >> state finished > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> > >>> > >> > 21:41:32,394::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) > >>> > >> Owner.releaseAll requests {} resources {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> > >>> > >> > 21:41:32,394::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) > >>> > >> Owner.cancelAll 
requests {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:32,394::task::993::Storage.TaskManager.Task::(_decref) > >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::ref 0 aborting False > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,550::BindingXMLRPC::1132::vds::(wrapper) client > >>> > >> [10.10.10.2]::call > >>> > >> getCapabilities with () {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,553::utils::738::root::(execCmd) > >>> > >> /sbin/ip route show to 0.0.0.0/0 table all (cwd None) > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,560::utils::758::root::(execCmd) > >>> > >> SUCCESS: <err> = ''; <rc> = 0 > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,588::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,592::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift-object',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,593::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift-plugin',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift-account',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift-proxy',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift-doc',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('gluster-swift-container',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package > >>> > >> ('glusterfs-geo-replication',) not found > >>> > >> Thread-13::DEBUG::2014-11-24 
21:41:41,600::caps::646::root::(get) > >>> > >> VirtioRNG DISABLED: libvirt version 0.10.2-29.el6_5.9 required >> >>> > >> 0.10.2-31 > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,603::BindingXMLRPC::1139::vds::(wrapper) return > >>> > >> getCapabilities > >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': > >>> > >> {'HBAInventory': > >>> > >> {'iSCSI': [{'InitiatorName': > >>> > >> 'iqn.1994-05.com.redhat:32151ce183c8'}], > >>> > >> 'FC': > >>> > >> []}, 'packages2': {'kernel': {'release': '431.el6.x86_64', > >>> > >> 'buildtime': > >>> > >> 1385061309.0, 'version': '2.6.32'}, 'glusterfs-rdma': {'release': > >>> > >> '1.el6', > >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'glusterfs-fuse': > >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': > '3.5.1'}, > >>> > >> 'spice-server': {'release': '6.el6_5.2', 'buildtime': 1402324637L, > >>> > >> 'version': '0.12.4'}, 'vdsm': {'release': '1.gitdb83943.el6', > >>> > >> 'buildtime': > >>> > >> 1412784567L, 'version': '4.16.7'}, 'qemu-kvm': {'release': > >>> > >> '2.415.el6_5.10', 'buildtime': 1402435700L, 'version': > '0.12.1.2'}, > >>> > >> 'qemu-img': {'release': '2.415.el6_5.10', 'buildtime': > 1402435700L, > >>> > >> 'version': '0.12.1.2'}, 'libvirt': {'release': '29.el6_5.9', > >>> > >> 'buildtime': > >>> > >> 1402404612L, 'version': '0.10.2'}, 'glusterfs': {'release': > '1.el6', > >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'mom': {'release': > >>> > >> '2.el6', > >>> > >> 'buildtime': 1403794344L, 'version': '0.4.1'}, 'glusterfs-server': > >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': > '3.5.1'}}, > >>> > >> 'numaNodeDistance': {'1': [20, 10], '0': [10, 20]}, 'cpuModel': > >>> > >> 'Intel(R) > >>> > >> Xeon(R) CPU X5650 @ 2.67GHz', 'liveMerge': 'false', > >>> > >> 'hooks': > >>> > >> {}, > >>> > >> 'cpuSockets': '2', 'vmTypes': ['kvm'], 'selinux': {'mode': '1'}, > >>> > >> 'kdumpStatus': 0, 'supportedProtocols': ['2.2', '2.3'], 
> 'networks': > >>> > >> {'ovirtmgmt': {'iface': u'bond0.10', 'addr': '43.252.176.16', > >>> > >> 'bridged': > >>> > >> False, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': > '1500', > >>> > >> 'bootproto4': 'none', 'netmask': '255.255.255.0', 'ipv4addrs': [' > >>> > >> 43.252.176.16/24' <http://43.252.176.16/24%27>], 'interface': > >>> > >> u'bond0.10', 'ipv6gateway': '::', 'gateway': '43.25.17.1'}, > >>> > >> 'Internal': > >>> > >> {'iface': 'Internal', 'addr': '', 'cfg': {'DEFROUTE': 'no', > >>> > >> 'HOTPLUG': > >>> > >> 'no', 'MTU': '9000', 'DELAY': '0', 'NM_CONTROLLED': 'no', > >>> > >> 'BOOTPROTO': > >>> > >> 'none', 'STP': 'off', 'DEVICE': 'Internal', 'TYPE': 'Bridge', > >>> > >> 'ONBOOT': > >>> > >> 'no'}, 'bridged': True, 'ipv6addrs': > >>> > >> ['fe80::210:18ff:fecd:daac/64'], > >>> > >> 'gateway': '', 'bootproto4': 'none', 'netmask': '', 'stp': 'off', > >>> > >> 'ipv4addrs': [], 'mtu': '9000', 'ipv6gateway': '::', 'ports': > >>> > >> ['bond1.100']}, 'storage': {'iface': u'bond1', 'addr': > '10.10.10.6', > >>> > >> 'bridged': False, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], > >>> > >> 'mtu': > >>> > >> '9000', 'bootproto4': 'none', 'netmask': '255.255.255.0', > >>> > >> 'ipv4addrs': [' > >>> > >> 10.10.10.6/24' <http://10.10.10.6/24%27>], 'interface': u'bond1', > >>> > >> 'ipv6gateway': '::', 'gateway': ''}, 'VMNetwork': {'iface': > >>> > >> 'VMNetwork', > >>> > >> 'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': > >>> > >> '1500', > >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': > >>> > >> 'off', > >>> > >> 'DEVICE': 'VMNetwork', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, > 'bridged': > >>> > >> True, > >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'gateway': '', > >>> > >> 'bootproto4': > >>> > >> 'none', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], 'mtu': > '1500', > >>> > >> 'ipv6gateway': '::', 'ports': ['bond0.36']}}, 'bridges': > >>> > >> {'Internal': > >>> > >> {'addr': '', 'cfg': 
{'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': > >>> > >> '9000', > >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': > >>> > >> 'off', > >>> > >> 'DEVICE': 'Internal', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, > >>> > >> 'ipv6addrs': > >>> > >> ['fe80::210:18ff:fecd:daac/64'], 'mtu': '9000', 'netmask': '', > >>> > >> 'stp': > >>> > >> 'off', 'ipv4addrs': [], 'ipv6gateway': '::', 'gateway': '', > 'opts': > >>> > >> {'topology_change_detected': '0', 'multicast_last_member_count': > >>> > >> '2', > >>> > >> 'hash_elasticity': '4', 'multicast_query_response_interval': > '999', > >>> > >> 'multicast_snooping': '1', 'multicast_startup_query_interval': > >>> > >> '3124', > >>> > >> 'hello_timer': '31', 'multicast_querier_interval': '25496', > >>> > >> 'max_age': > >>> > >> '1999', 'hash_max': '512', 'stp_state': '0', 'root_id': > >>> > >> '8000.001018cddaac', 'priority': '32768', > >>> > >> 'multicast_membership_interval': > >>> > >> '25996', 'root_path_cost': '0', 'root_port': '0', > >>> > >> 'multicast_querier': > >>> > >> '0', > >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', > >>> > >> 'topology_change': '0', 'bridge_id': '8000.001018cddaac', > >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': > >>> > >> '31', > >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', > >>> > >> 'multicast_query_interval': '12498', > >>> > >> 'multicast_last_member_interval': > >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': > >>> > >> ['bond1.100']}, 'VMNetwork': {'addr': '', 'cfg': {'DEFROUTE': > 'no', > >>> > >> 'HOTPLUG': 'no', 'MTU': '1500', 'DELAY': '0', 'NM_CONTROLLED': > 'no', > >>> > >> 'BOOTPROTO': 'none', 'STP': 'off', 'DEVICE': 'VMNetwork', 'TYPE': > >>> > >> 'Bridge', > >>> > >> 'ONBOOT': 'no'}, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], > >>> > >> 'mtu': > >>> > >> '1500', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], > 'ipv6gateway': > >>> > >> '::', > >>> > >> 'gateway': '', 'opts': 
{'topology_change_detected': '0', > >>> > >> 'multicast_last_member_count': '2', 'hash_elasticity': '4', > >>> > >> 'multicast_query_response_interval': '999', 'multicast_snooping': > >>> > >> '1', > >>> > >> 'multicast_startup_query_interval': '3124', 'hello_timer': '131', > >>> > >> 'multicast_querier_interval': '25496', 'max_age': '1999', > >>> > >> 'hash_max': > >>> > >> '512', 'stp_state': '0', 'root_id': '8000.60eb6920b46c', > 'priority': > >>> > >> '32768', 'multicast_membership_interval': '25996', > 'root_path_cost': > >>> > >> '0', > >>> > >> 'root_port': '0', 'multicast_querier': '0', > >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', > >>> > >> 'topology_change': '0', 'bridge_id': '8000.60eb6920b46c', > >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': > >>> > >> '31', > >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', > >>> > >> 'multicast_query_interval': '12498', > >>> > >> 'multicast_last_member_interval': > >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': > >>> > >> ['bond0.36']}}, 'uuid': '44454C4C-4C00-1057-8053-B7C04F504E31', > >>> > >> 'lastClientIface': 'bond1', 'nics': {'eth3': {'permhwaddr': > >>> > >> '00:10:18:cd:da:ae', 'addr': '', 'cfg': {'SLAVE': 'yes', > >>> > >> 'NM_CONTROLLED': > >>> > >> 'no', 'MTU': '9000', 'HWADDR': '00:10:18:cd:da:ae', 'MASTER': > >>> > >> 'bond1', > >>> > >> 'DEVICE': 'eth3', 'ONBOOT': 'no'}, 'ipv6addrs': [], 'mtu': '9000', > >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '00:10:18:cd:da:ac', > >>> > >> 'speed': > >>> > >> 1000}, 'eth2': {'permhwaddr': '00:10:18:cd:da:ac', 'addr': '', > >>> > >> 'cfg': > >>> > >> {'SLAVE': 'yes', 'NM_CONTROLLED': 'no', 'MTU': '9000', 'HWADDR': > >>> > >> '00:10:18:cd:da:ac', 'MASTER': 'bond1', 'DEVICE': 'eth2', > 'ONBOOT': > >>> > >> 'no'}, > >>> > >> 'ipv6addrs': [], 'mtu': '9000', 'netmask': '', 'ipv4addrs': [], > >>> > >> 'hwaddr': > >>> > >> '00:10:18:cd:da:ac', 'speed': 1000}, 'eth1': 
{'permhwaddr': > >>> > >> '60:eb:69:20:b4:6d', 'addr': '', 'cfg': {'SLAVE': 'yes', > >>> > >> 'NM_CONTROLLED': > >>> > >> 'no', 'MTU': '1500', 'HWADDR': '60:eb:69:20:b4:6d', 'MASTER': > >>> > >> 'bond0', > >>> > >> 'DEVICE': 'eth1', 'ONBOOT': 'yes'}, 'ipv6addrs': [], 'mtu': > '1500', > >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', > >>> > >> 'speed': > >>> > >> 1000}, 'eth0': {'permhwaddr': '60:eb:69:20:b4:6c', 'addr': '', > >>> > >> 'cfg': > >>> > >> {'SLAVE': 'yes', 'NM_CONTROLLED': 'no', 'MTU': '1500', 'HWADDR': > >>> > >> '60:eb:69:20:b4:6c', 'MASTER': 'bond0', 'DEVICE': 'eth0', > 'ONBOOT': > >>> > >> 'yes'}, > >>> > >> 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'ipv4addrs': [], > >>> > >> 'hwaddr': > >>> > >> '60:eb:69:20:b4:6c', 'speed': 1000}}, 'software_revision': '1', > >>> > >> 'clusterLevels': ['3.0', '3.1', '3.2', '3.3', '3.4', '3.5'], > >>> > >> 'cpuFlags': > >>> > >> > >>> > >> > u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,popcnt,aes,lahf_lm,tpr_shadow,vnmi,flexpriority,ept,vpid,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270', > >>> > >> 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:32151ce183c8', > >>> > >> 'netConfigDirty': 'False', 'supportedENGINEs': ['3.0', '3.1', > '3.2', > >>> > >> '3.3', > >>> > >> '3.4', '3.5'], 'autoNumaBalancing': 2, 'reservedMem': '321', > >>> > >> 'bondings': > >>> > >> {'bond4': {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', > >>> > >> 'slaves': > >>> > >> [], 'hwaddr': '00:00:00:00:00:00'}, 'bond0': {'addr': '', 'cfg': > >>> > >> {'HOTPLUG': 'no', 'MTU': '1500', 'NM_CONTROLLED': 'no', > >>> > >> 'BONDING_OPTS': > >>> > >> 'mode=4 miimon=100', 'DEVICE': 'bond0', 'ONBOOT': 
'yes'}, > >>> > >> 'ipv6addrs': > >>> > >> ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': '1500', 'netmask': '', > >>> > >> 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', 'slaves': ['eth0', > >>> > >> 'eth1'], > >>> > >> 'opts': {'miimon': '100', 'mode': '4'}}, 'bond1': {'addr': > >>> > >> '10.10.10.6', > >>> > >> 'cfg': {'DEFROUTE': 'no', 'IPADDR': '10.10.10.6', 'HOTPLUG': 'no', > >>> > >> 'MTU': > >>> > >> '9000', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', > >>> > >> 'BOOTPROTO': > >>> > >> 'none', 'BONDING_OPTS': 'mode=4 miimon=100', 'DEVICE': 'bond1', > >>> > >> 'ONBOOT': > >>> > >> 'no'}, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'mtu': > '9000', > >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['10.10.10.6/24' > >>> > >> <http://10.10.10.6/24%27>], 'hwaddr': '00:10:18:cd:da:ac', > 'slaves': > >>> > >> ['eth2', 'eth3'], 'opts': {'miimon': '100', 'mode': '4'}}, > 'bond2': > >>> > >> {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', 'slaves': > [], > >>> > >> 'hwaddr': '00:00:00:00:00:00'}, 'bond3': {'addr': '', 'cfg': {}, > >>> > >> 'mtu': > >>> > >> '1500', 'netmask': '', 'slaves': [], 'hwaddr': > >>> > >> '00:00:00:00:00:00'}}, > >>> > >> 'software_version': '4.16', 'memSize': '24019', 'cpuSpeed': > >>> > >> '2667.000', > >>> > >> 'numaNodes': {u'1': {'totalMemory': '12288', 'cpus': [6, 7, 8, 9, > >>> > >> 10, 11, > >>> > >> 18, 19, 20, 21, 22, 23]}, u'0': {'totalMemory': '12278', 'cpus': > [0, > >>> > >> 1, 2, > >>> > >> 3, 4, 5, 12, 13, 14, 15, 16, 17]}}, 'version_name': 'Snow Man', > >>> > >> 'vlans': > >>> > >> {'bond0.10': {'iface': 'bond0', 'addr': '43.25.17.16', 'cfg': > >>> > >> {'DEFROUTE': > >>> > >> 'yes', 'VLAN': 'yes', 'IPADDR': '43.25.17.16', 'HOTPLUG': 'no', > >>> > >> 'GATEWAY': > >>> > >> '43.25.17.1', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', > >>> > >> 'BOOTPROTO': 'none', 'DEVICE': 'bond0.10', 'MTU': '1500', > 'ONBOOT': > >>> > >> 'yes'}, > >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 10, > 
'mtu': > >>> > >> '1500', > >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['43.25.17.16/24'] > >>> > >> <http://43.25.17.16/24%27%5D>}, 'bond0.36': {'iface': 'bond0', > >>> > >> 'addr': > >>> > >> '', 'cfg': {'BRIDGE': 'VMNetwork', 'VLAN': 'yes', 'HOTPLUG': 'no', > >>> > >> 'MTU': > >>> > >> '1500', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond0.36', 'ONBOOT': > >>> > >> 'no'}, > >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 36, > 'mtu': > >>> > >> '1500', > >>> > >> 'netmask': '', 'ipv4addrs': []}, 'bond1.100': {'iface': 'bond1', > >>> > >> 'addr': > >>> > >> '', 'cfg': {'BRIDGE': 'Internal', 'VLAN': 'yes', 'HOTPLUG': 'no', > >>> > >> 'MTU': > >>> > >> '9000', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond1.100', 'ONBOOT': > >>> > >> 'no'}, > >>> > >> 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'vlanid': 100, > 'mtu': > >>> > >> '9000', > >>> > >> 'netmask': '', 'ipv4addrs': []}}, 'cpuCores': '12', 'kvmEnabled': > >>> > >> 'true', > >>> > >> 'guestOverhead': '65', 'cpuThreads': '24', 'emulatedMachines': > >>> > >> [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', > >>> > >> u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', > >>> > >> u'rhel5.4.0'], > >>> > >> 'operatingSystem': {'release': '5.el6.centos.11.1', 'version': > '6', > >>> > >> 'name': > >>> > >> 'RHEL'}, 'lastClient': '10.10.10.2'}} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,620::BindingXMLRPC::1132::vds::(wrapper) client > >>> > >> [10.10.10.2]::call > >>> > >> getHardwareInfo with () {} > >>> > >> Thread-13::DEBUG::2014-11-24 > >>> > >> 21:41:41,621::BindingXMLRPC::1139::vds::(wrapper) return > >>> > >> getHardwareInfo > >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': > >>> > >> {'systemProductName': 'CS24-TY', 'systemSerialNumber': '7LWSPN1', > >>> > >> 'systemFamily': 'Server', 'systemVersion': 'A00', 'systemUUID': > >>> > >> '44454c4c-4c00-1057-8053-b7c04f504e31', 'systemManufacturer': > >>> > >> 'Dell'}} > >>> > >> 
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:41,733::BindingXMLRPC::1132::vds::(wrapper) client [10.10.10.2]::call hostsList with () {} flowID [222e8036]
> >>> > >> Thread-13::ERROR::2014-11-24 21:41:44,753::BindingXMLRPC::1148::vds::(wrapper) vdsm exception occured
> >>> > >> Traceback (most recent call last):
> >>> > >>   File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in wrapper
> >>> > >>     res = f(*args, **kwargs)
> >>> > >>   File "/usr/share/vdsm/gluster/api.py", line 54, in wrapper
> >>> > >>     rv = func(*args, **kwargs)
> >>> > >>   File "/usr/share/vdsm/gluster/api.py", line 251, in hostsList
> >>> > >>     return {'hosts': self.svdsmProxy.glusterPeerStatus()}
> >>> > >>   File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
> >>> > >>     return callMethod()
> >>> > >>   File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
> >>> > >>     **kwargs)
> >>> > >>   File "<string>", line 2, in glusterPeerStatus
> >>> > >>   File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
> >>> > >>     raise convert_to_error(kind, result)
> >>> > >> GlusterCmdExecFailedException: Command execution failed
> >>> > >> error: Connection failed. Please check if gluster daemon is operational.
> >>> > >> return code: 1
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:50,949::task::595::Storage.TaskManager.Task::(_updateState) Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state init -> state preparing
> >>> > >> Thread-13::INFO::2014-11-24 21:41:50,950::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
> >>> > >> Thread-13::INFO::2014-11-24 21:41:50,950::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {}
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:50,950::task::1191::Storage.TaskManager.Task::(prepare) Task=`c9042986-c978-4b08-adb2-616f5299e115`::finished: {}
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:50,950::task::595::Storage.TaskManager.Task::(_updateState) Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state preparing -> state finished
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:50,951::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:50,951::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:50,951::task::993::Storage.TaskManager.Task::(_decref) Task=`c9042986-c978-4b08-adb2-616f5299e115`::ref 0 aborting False
> >>> > >> -------------------------------
> >>> > >>
> >>> > >> [root@compute4 ~]# service glusterd status
> >>> > >> glusterd is stopped
> >>> > >> [root@compute4 ~]# chkconfig --list | grep glusterd
> >>> > >> glusterd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
> >>> > >> [root@compute4 ~]#
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Punit
> >>> > >>
> >>> > >> On Mon, Nov 24, 2014 at 6:36 PM, Kanagaraj
<kmayilsa at redhat.com> > >>> > >> wrote: > >>> > >> > >>> > >>> Can you send the corresponding error in vdsm.log from the host? > >>> > >>> > >>> > >>> Also check if glusterd service is running. > >>> > >>> > >>> > >>> Thanks, > >>> > >>> Kanagaraj > >>> > >>> > >>> > >>> > >>> > >>> On 11/24/2014 03:39 PM, Punit Dambiwal wrote: > >>> > >>> > >>> > >>> Hi, > >>> > >>> > >>> > >>> After reboot my Hypervisior host can not activate again in the > >>> > >>> cluster > >>> > >>> and failed with the following error :- > >>> > >>> > >>> > >>> Gluster command [<UNKNOWN>] failed on server... > >>> > >>> > >>> > >>> Engine logs :- > >>> > >>> > >>> > >>> 2014-11-24 18:05:28,397 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-64) START, > >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId > >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 5f251c90 > >>> > >>> 2014-11-24 18:05:30,609 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-64) FINISH, > >>> > >>> GlusterVolumesListVDSCommand, > >>> > >>> return: > >>> > >>> > >>> > >>> > {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at d95203e0 > }, > >>> > >>> log id: 5f251c90 > >>> > >>> 2014-11-24 18:05:33,768 INFO > >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] > >>> > >>> (ajp--127.0.0.1-8702-8) > >>> > >>> [287d570d] Lock Acquired to object EngineLock [exclusiveLocks> key: > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: VDS > >>> > >>> , sharedLocks= ] > >>> > >>> 2014-11-24 18:05:33,795 INFO > >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Running command: > >>> > >>> ActivateVdsCommand internal: false. 
Entities affected : ID: > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDSAction group > >>> > >>> MANIPULATE_HOST > >>> > >>> with role type ADMIN > >>> > >>> 2014-11-24 18:05:33,796 INFO > >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Before acquiring > >>> > >>> lock in > >>> > >>> order to prevent monitoring for host Compute5 from data-center > >>> > >>> SV_WTC > >>> > >>> 2014-11-24 18:05:33,797 INFO > >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Lock acquired, > from > >>> > >>> now a > >>> > >>> monitoring of host will be skipped for host Compute5 from > >>> > >>> data-center > >>> > >>> SV_WTC > >>> > >>> 2014-11-24 18:05:33,817 INFO > >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] START, > >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=Unassigned, > >>> > >>> nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: > >>> > >>> 1cbc7311 > >>> > >>> 2014-11-24 18:05:33,820 INFO > >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] FINISH, > >>> > >>> SetVdsStatusVDSCommand, log id: 1cbc7311 > >>> > >>> 2014-11-24 18:05:34,086 INFO > >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) Activate finished. Lock > >>> > >>> released. 
> >>> > >>> Monitoring can run now for host Compute5 from data-center SV_WTC > >>> > >>> 2014-11-24 18:05:34,088 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) Correlation ID: 287d570d, Job > >>> > >>> ID: > >>> > >>> 5ef8e4d6-b2bc-469e-8e81-7ef74b2a001a, Call Stack: null, Custom > >>> > >>> Event ID: > >>> > >>> -1, Message: Host Compute5 was activated by admin. > >>> > >>> 2014-11-24 18:05:34,090 INFO > >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] > >>> > >>> (org.ovirt.thread.pool-8-thread-45) Lock freed to object > EngineLock > >>> > >>> [exclusiveLocks= key: 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: > >>> > >>> VDS > >>> > >>> , sharedLocks= ] > >>> > >>> 2014-11-24 18:05:35,792 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-55) [3706e836] START, > >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId > >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 48a0c832 > >>> > >>> 2014-11-24 18:05:37,064 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) START, > >>> > >>> GetHardwareInfoVDSCommand(HostName = Compute5, HostId > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, > >>> > >>> vds=Host[Compute5,0bf6b00f-7947-4411-b55a-cc5eea2b381a]), log id: > >>> > >>> 6d560cc2 > >>> > >>> 2014-11-24 18:05:37,074 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) FINISH, > >>> > >>> GetHardwareInfoVDSCommand, log > >>> > >>> id: 6d560cc2 > >>> > >>> 2014-11-24 18:05:37,093 WARN > >>> > >>> [org.ovirt.engine.core.vdsbroker.VdsManager] > >>> > >>> (DefaultQuartzScheduler_Worker-69) Host Compute5 is running with > >>> > >>> disabled > >>> > >>> SELinux. 
> >>> > >>> 2014-11-24 18:05:37,127 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] Running command: > >>> > >>> HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities > >>> > >>> affected > >>> > >>> : ID: 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS > >>> > >>> 2014-11-24 18:05:37,147 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] START, > >>> > >>> GlusterServersListVDSCommand(HostName = Compute5, HostId > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a), log id: 4faed87 > >>> > >>> 2014-11-24 18:05:37,164 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] FINISH, > >>> > >>> GlusterServersListVDSCommand, log id: 4faed87 > >>> > >>> 2014-11-24 18:05:37,189 INFO > >>> > >>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Running command: > >>> > >>> SetNonOperationalVdsCommand internal: true. 
Entities affected : > >>> > >>> ID: > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS > >>> > >>> 2014-11-24 18:05:37,206 INFO > >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] START, > >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=NonOperational, > >>> > >>> nonOperationalReason=GLUSTER_COMMAND_FAILED, > >>> > >>> stopSpmFailureLogged=false), > >>> > >>> log id: fed5617 > >>> > >>> 2014-11-24 18:05:37,209 INFO > >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] FINISH, > >>> > >>> SetVdsStatusVDSCommand, log id: fed5617 > >>> > >>> 2014-11-24 18:05:37,223 ERROR > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: > >>> > >>> 4a84c4e5, > >>> > >>> Job > >>> > >>> ID: 4bfd4a6d-c3ef-468f-a40e-a3a6ca13011b, Call Stack: null, > Custom > >>> > >>> Event > >>> > >>> ID: -1, Message: Gluster command [<UNKNOWN>] failed on server > >>> > >>> Compute5. > >>> > >>> 2014-11-24 18:05:37,243 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: > null, > >>> > >>> Call > >>> > >>> Stack: null, Custom Event ID: -1, Message: Status of host > Compute5 > >>> > >>> was > >>> > >>> set > >>> > >>> to NonOperational. > >>> > >>> 2014-11-24 18:05:37,272 INFO > >>> > >>> [org.ovirt.engine.core.bll.HandleVdsVersionCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Running command: > >>> > >>> HandleVdsVersionCommand internal: true. 
Entities affected : ID: > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS > >>> > >>> 2014-11-24 18:05:37,274 INFO > >>> > >>> [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] > >>> > >>> (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Host > >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a : Compute5 is already in > >>> > >>> NonOperational status for reason GLUSTER_COMMAND_FAILED. > >>> > >>> SetNonOperationalVds command is skipped. > >>> > >>> 2014-11-24 18:05:38,065 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-55) [3706e836] FINISH, > >>> > >>> GlusterVolumesListVDSCommand, return: > >>> > >>> > >>> > >>> > {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at 4e72a1b1 > }, > >>> > >>> log id: 48a0c832 > >>> > >>> 2014-11-24 18:05:43,243 INFO > >>> > >>> > >>> > >>> > [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] > >>> > >>> (DefaultQuartzScheduler_Worker-35) START, > >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId > >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 3ce13ebc > >>> > >>> ^C > >>> > >>> [root at ccr01 ~]# > >>> > >>> > >>> > >>> Thanks, > >>> > >>> Punit
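The two-phase brick resolution that Kaushal describes earlier in this thread can be sketched roughly as follows. This is only an illustrative Python sketch of the idea, not glusterd's actual C code: the function names `resolve` and `find_peer`, the `peers` table layout, and the phase labels are all hypothetical and were chosen just for this example.

```python
import socket


def resolve(host):
    """Resolve a name to a set of address strings; empty set if lookup fails."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        # Typical when the network or DNS is not up yet.
        return set()
    return {info[4][0] for info in infos}


def find_peer(brick_host, peers, lookup=resolve):
    """Match a brick's host against known peers.

    peers maps a (hypothetical) peer id to the list of names it was probed with.
    Returns (peer_id, phase), or (None, "unresolved") if nothing matches.
    """
    # Phase 1: plain string comparison against the probed names.
    for peer_id, names in peers.items():
        if brick_host in names:
            return peer_id, "string"

    # Phase 2: resolve both sides and compare the resulting addresses.
    brick_addrs = lookup(brick_host)
    for peer_id, names in peers.items():
        for name in names:
            if brick_addrs & lookup(name):
                return peer_id, "resolved"

    return None, "unresolved"
```

If the brick address differs from the probed name and the lookup returns nothing (as it would before the network is up), the sketch reports "unresolved", which corresponds to the "resolve brick failed in restore" situation discussed above. Passing a custom `lookup` makes it possible to try the logic without touching real DNS.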
Punit Dambiwal
2014-Dec-01 03:35 UTC
[Gluster-users] [ovirt-users] Gluster command [<UNKNOWN>] failed on server...
Hi,

Can anybody help me with this?

On Thu, Nov 27, 2014 at 9:29 AM, Punit Dambiwal <hypunit at gmail.com> wrote:

> Hi Kaushal,
>
> Thanks for the detailed reply....let me explain my setup first :-
>
> 1. Ovirt Engine
> 2. 4* host as well as storage machine (Host and gluster combined)
> 3. Every host has 24 bricks...
>
> Now whenever the host machine reboot...it can come up but can not join the
> cluster again and through the following error "Gluster command [<UNKNOWN>]
> failed on server.."
>
> Please check my comment in line :-
>
> 1. Use the same string for doing the peer probe and for the brick address
> during volume create/add-brick. Ideally, we suggest you use properly
> resolvable FQDNs everywhere. If that is not possible, then use only IP
> addresses. Try to avoid short names.
> ---------------
> [root at cpu05 ~]# gluster peer status
> Number of Peers: 3
>
> Hostname: cpu03.stack.com
> Uuid: 5729b8c4-e80d-4353-b456-6f467bddbdfb
> State: Peer in Cluster (Connected)
>
> Hostname: cpu04.stack.com
> Uuid: d272b790-c4b2-4bed-ba68-793656e6d7b0
> State: Peer in Cluster (Connected)
> Other names:
> 10.10.0.8
>
> Hostname: cpu02.stack.com
> Uuid: 8d8a7041-950e-40d0-85f9-58d14340ca25
> State: Peer in Cluster (Connected)
> [root at cpu05 ~]#
> ----------------
> 2. During boot up, make sure to launch glusterd only after the network is
> up. This will allow the new peer identification mechanism to do its
> job correctly.
> >> I think the service itself doing the same job....
>
> [root at cpu05 ~]# cat /usr/lib/systemd/system/glusterd.service
> [Unit]
> Description=GlusterFS, a clustered file-system server
> After=network.target rpcbind.service
> Before=network-online.target
>
> [Service]
> Type=forking
> PIDFile=/var/run/glusterd.pid
> LimitNOFILE=65536
> ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid
> KillMode=process
>
> [Install]
> WantedBy=multi-user.target
> [root at cpu05 ~]#
> --------------------
>
> gluster logs :-
>
> [2014-11-24 09:22:22.147471] I [MSGID: 100030] [glusterfsd.c:2018:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.1
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
> [2014-11-24 09:22:22.151565] I [glusterd.c:1214:init] 0-management:
> Maximum allowed open file descriptors set to 65536
> [2014-11-24 09:22:22.151599] I [glusterd.c:1259:init] 0-management: Using
> /var/lib/glusterd as working directory
> [2014-11-24 09:22:22.155216] W [rdma.c:4195:__gf_rdma_ctx_create]
> 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
> [2014-11-24 09:22:22.155264] E [rdma.c:4483:init] 0-rdma.management:
> Failed to initialize IB Device
> [2014-11-24 09:22:22.155285] E [rpc-transport.c:333:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2014-11-24 09:22:22.155354] W [rpcsvc.c:1524:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2014-11-24 09:22:22.156290] I
> [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication
> module not installed in the system
> [2014-11-24 09:22:22.161318] I
> [glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 30600
> [2014-11-24 09:22:22.821800] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management:
> connect returned 0
> [2014-11-24 09:22:22.825810] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management:
> connect returned 0
> [2014-11-24 09:22:22.828705] I
> [glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management:
> connect returned 0
> [2014-11-24 09:22:22.828771] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-24 09:22:22.832670] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-24 09:22:22.835919] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-11-24 09:22:22.840209] E
> [glusterd-store.c:4248:glusterd_resolve_all_bricks] 0-glusterd: resolve
> brick failed in restore
> [2014-11-24 09:22:22.840233] E [xlator.c:425:xlator_init] 0-management:
> Initialization of volume 'management' failed, review your volfile again
> [2014-11-24 09:22:22.840245] E [graph.c:322:glusterfs_graph_init]
> 0-management: initializing translator failed
> [2014-11-24 09:22:22.840264] E [graph.c:525:glusterfs_graph_activate]
> 0-graph: init failed
> [2014-11-24 09:22:22.840754] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> 0-: received signum (0), shutting down
>
> Thanks,
> Punit
>
> On Wed, Nov 26, 2014 at 7:14 PM, Kaushal M <kshlmster at gmail.com> wrote:
>
>> Based on the logs I can guess that glusterd is being started before
>> the network has come up and that the addresses given to bricks do not
>> directly match the addresses used in during peer probe.
>>
>> The gluster_after_reboot log has the line "[2014-11-25
>> 06:46:09.972113] E [glusterd-store.c:2632:glusterd_resolve_all_bricks]
>> 0-glusterd: resolve brick failed in restore".
>>
>> Brick resolution fails when glusterd cannot match the address for the
>> brick, with one of the peers. Brick resolution happens in two phases,
>> 1. We first try to identify the peer by performing string comparisions
>> with the brick address and the peer addresses (The peer names will be
>> the names/addresses that were given when the peer was probed).
>> 2. If we don't find a match from step 1, we will then resolve all the
>> brick address and the peer addresses into addrinfo structs, and then
>> compare these structs to find a match. This process should generally
>> find a match if available. This will fail only if the network is not
>> up yet as we cannot resolve addresses.
>>
>> The above steps are applicable only to glusterfs versions >=3.6. They
>> were introduced to reduce problems with peer identification, like the
>> one you encountered
>>
>> Since both of the steps failed to find a match in one run, but
>> succeeded later, we can come to the conclusion that,
>> a) the bricks don't have the exact same string used in peer probe for
>> their addresses as step 1 failed, and
>> b) the network was not up in the initial run, as step 2 failed during
>> the initial run, but passed in the second run.
>>
>> Please let me know if my conclusion is correct.
>>
>> If it is, you can solve your problem in two ways.
>> 1. Use the same string for doing the peer probe and for the brick
>> address during volume create/add-brick. Ideally, we suggest you use
>> properly resolvable FQDNs everywhere. If that is not possible, then
>> use only IP addresses. Try to avoid short names.
>> 2. During boot up, make sure to launch glusterd only after the network
>> is up. This will allow the new peer identification mechanism to do its
>> job correctly.
>>
>> If you have already followed these steps and yet still hit the
>> problem, then please provide more information (setup, logs, etc.). It
>> could be much different problem that you are facing.
>>
>> ~kaushal
>>
>> On Wed, Nov 26, 2014 at 4:01 PM, Punit Dambiwal <hypunit at gmail.com>
>> wrote:
>> > Is there any one can help on this ??
>> > >> > Thanks, >> > punit >> > >> > On Wed, Nov 26, 2014 at 9:42 AM, Punit Dambiwal <hypunit at gmail.com> >> wrote: >> >> >> >> Hi, >> >> >> >> My Glusterfs version is :- glusterfs-3.6.1-1.el7 >> >> >> >> On Wed, Nov 26, 2014 at 1:59 AM, Kanagaraj Mayilsamy < >> kmayilsa at redhat.com> >> >> wrote: >> >>> >> >>> [+Gluster-users at gluster.org] >> >>> >> >>> "Initialization of volume 'management' failed, review your volfile >> >>> again", glusterd throws this error when the service is started >> automatically >> >>> after the reboot. But the service is successfully started later >> manually by >> >>> the user. >> >>> >> >>> can somebody from gluster-users please help on this? >> >>> >> >>> glusterfs version: 3.5.1 >> >>> >> >>> Thanks, >> >>> Kanagaraj >> >>> >> >>> ----- Original Message ----- >> >>> > From: "Punit Dambiwal" <hypunit at gmail.com> >> >>> > To: "Kanagaraj" <kmayilsa at redhat.com> >> >>> > Cc: users at ovirt.org >> >>> > Sent: Tuesday, November 25, 2014 7:24:45 PM >> >>> > Subject: Re: [ovirt-users] Gluster command [<UNKNOWN>] failed on >> >>> > server... >> >>> > >> >>> > Hi Kanagraj, >> >>> > >> >>> > Please check the attached log files....i didn't find any thing >> >>> > special.... >> >>> > >> >>> > On Tue, Nov 25, 2014 at 12:12 PM, Kanagaraj <kmayilsa at redhat.com> >> >>> > wrote: >> >>> > >> >>> > > Do you see any errors in >> >>> > > /var/log/glusterfs/etc-glusterfs-glusterd.vol.log or vdsm.log when >> >>> > > the >> >>> > > service is trying to start automatically after the reboot? >> >>> > > >> >>> > > Thanks, >> >>> > > Kanagaraj >> >>> > > >> >>> > > >> >>> > > On 11/24/2014 08:13 PM, Punit Dambiwal wrote: >> >>> > > >> >>> > > Hi Kanagaraj, >> >>> > > >> >>> > > Yes...once i will start the gluster service and then vdsmd ...the >> >>> > > host >> >>> > > can connect to cluster...but the question is why it's not started >> >>> > > even it >> >>> > > has chkconfig enabled... 
>> >>> > > >> >>> > > I have tested it in two host cluster environment...(Centos 6.6 >> and >> >>> > > centos 7.0) on both hypervisior cluster..it's failed to reconnect >> in >> >>> > > to >> >>> > > cluster after reboot.... >> >>> > > >> >>> > > In both the environment glusterd enabled for next boot....but >> it's >> >>> > > failed with the same error....seems it's bug in either gluster or >> >>> > > Ovirt ?? >> >>> > > >> >>> > > Please help me to find the workaround here if can not resolve >> >>> > > it...as >> >>> > > without this the Host machine can not connect after reboot....that >> >>> > > means >> >>> > > engine will consider it as down and every time need to manually >> start >> >>> > > the >> >>> > > gluster service and vdsmd... ?? >> >>> > > >> >>> > > Thanks, >> >>> > > Punit >> >>> > > >> >>> > > On Mon, Nov 24, 2014 at 10:20 PM, Kanagaraj <kmayilsa at redhat.com> >> >>> > > wrote: >> >>> > > >> >>> > >> From vdsm.log "error: Connection failed. Please check if gluster >> >>> > >> daemon >> >>> > >> is operational." >> >>> > >> >> >>> > >> Starting glusterd service should fix this issue. 'service >> glusterd >> >>> > >> start' >> >>> > >> But i am wondering why the glusterd was not started automatically >> >>> > >> after >> >>> > >> the reboot. 
>> >>> > >> >> >>> > >> Thanks, >> >>> > >> Kanagaraj >> >>> > >> >> >>> > >> >> >>> > >> >> >>> > >> On 11/24/2014 07:18 PM, Punit Dambiwal wrote: >> >>> > >> >> >>> > >> Hi Kanagaraj, >> >>> > >> >> >>> > >> Please find the attached VDSM logs :- >> >>> > >> >> >>> > >> ---------------- >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> >> >>> > >> >> 21:41:17,182::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >> >>> > >> Owner.cancelAll requests {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:17,182::task::993::Storage.TaskManager.Task::(_decref) >> >>> > >> Task=`1691d409-9b27-4585-8281-5ec26154367a`::ref 0 aborting False >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:32,393::task::595::Storage.TaskManager.Task::(_updateState) >> >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state >> init >> >>> > >> -> >> >>> > >> state preparing >> >>> > >> Thread-13::INFO::2014-11-24 >> >>> > >> 21:41:32,393::logUtils::44::dispatcher::(wrapper) Run and >> protect: >> >>> > >> repoStats(options=None) >> >>> > >> Thread-13::INFO::2014-11-24 >> >>> > >> 21:41:32,393::logUtils::47::dispatcher::(wrapper) Run and >> protect: >> >>> > >> repoStats, Return response: {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:32,393::task::1191::Storage.TaskManager.Task::(prepare) >> >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::finished: {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:32,394::task::595::Storage.TaskManager.Task::(_updateState) >> >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::moving from state >> >>> > >> preparing >> >>> > >> -> >> >>> > >> state finished >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> >> >>> > >> >> 21:41:32,394::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) >> >>> > >> Owner.releaseAll requests {} resources {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> >> >>> > >> >> 
21:41:32,394::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >> >>> > >> Owner.cancelAll requests {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:32,394::task::993::Storage.TaskManager.Task::(_decref) >> >>> > >> Task=`994c7bc3-a236-4d03-a732-e068c7ed9ed4`::ref 0 aborting False >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,550::BindingXMLRPC::1132::vds::(wrapper) client >> >>> > >> [10.10.10.2]::call >> >>> > >> getCapabilities with () {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,553::utils::738::root::(execCmd) >> >>> > >> /sbin/ip route show to 0.0.0.0/0 table all (cwd None) >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,560::utils::758::root::(execCmd) >> >>> > >> SUCCESS: <err> = ''; <rc> = 0 >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,588::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,592::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift-object',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,593::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift-plugin',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift-account',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift-proxy',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,598::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift-doc',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,599::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('gluster-swift-container',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 
21:41:41,599::caps::728::root::(_getKeyPackages) rpm package >> >>> > >> ('glusterfs-geo-replication',) not found >> >>> > >> Thread-13::DEBUG::2014-11-24 21:41:41,600::caps::646::root::(get) >> >>> > >> VirtioRNG DISABLED: libvirt version 0.10.2-29.el6_5.9 required >>> >>> > >> 0.10.2-31 >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,603::BindingXMLRPC::1139::vds::(wrapper) return >> >>> > >> getCapabilities >> >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': >> >>> > >> {'HBAInventory': >> >>> > >> {'iSCSI': [{'InitiatorName': >> >>> > >> 'iqn.1994-05.com.redhat:32151ce183c8'}], >> >>> > >> 'FC': >> >>> > >> []}, 'packages2': {'kernel': {'release': '431.el6.x86_64', >> >>> > >> 'buildtime': >> >>> > >> 1385061309.0, 'version': '2.6.32'}, 'glusterfs-rdma': {'release': >> >>> > >> '1.el6', >> >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'glusterfs-fuse': >> >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': >> '3.5.1'}, >> >>> > >> 'spice-server': {'release': '6.el6_5.2', 'buildtime': >> 1402324637L, >> >>> > >> 'version': '0.12.4'}, 'vdsm': {'release': '1.gitdb83943.el6', >> >>> > >> 'buildtime': >> >>> > >> 1412784567L, 'version': '4.16.7'}, 'qemu-kvm': {'release': >> >>> > >> '2.415.el6_5.10', 'buildtime': 1402435700L, 'version': >> '0.12.1.2'}, >> >>> > >> 'qemu-img': {'release': '2.415.el6_5.10', 'buildtime': >> 1402435700L, >> >>> > >> 'version': '0.12.1.2'}, 'libvirt': {'release': '29.el6_5.9', >> >>> > >> 'buildtime': >> >>> > >> 1402404612L, 'version': '0.10.2'}, 'glusterfs': {'release': >> '1.el6', >> >>> > >> 'buildtime': 1403622628L, 'version': '3.5.1'}, 'mom': {'release': >> >>> > >> '2.el6', >> >>> > >> 'buildtime': 1403794344L, 'version': '0.4.1'}, >> 'glusterfs-server': >> >>> > >> {'release': '1.el6', 'buildtime': 1403622628L, 'version': >> '3.5.1'}}, >> >>> > >> 'numaNodeDistance': {'1': [20, 10], '0': [10, 20]}, 'cpuModel': >> >>> > >> 'Intel(R) >> >>> > >> Xeon(R) CPU X5650 @ 2.67GHz', 
'liveMerge': 'false', >> >>> > >> 'hooks': >> >>> > >> {}, >> >>> > >> 'cpuSockets': '2', 'vmTypes': ['kvm'], 'selinux': {'mode': '1'}, >> >>> > >> 'kdumpStatus': 0, 'supportedProtocols': ['2.2', '2.3'], >> 'networks': >> >>> > >> {'ovirtmgmt': {'iface': u'bond0.10', 'addr': '43.252.176.16', >> >>> > >> 'bridged': >> >>> > >> False, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': >> '1500', >> >>> > >> 'bootproto4': 'none', 'netmask': '255.255.255.0', 'ipv4addrs': [' >> >>> > >> 43.252.176.16/24' <http://43.252.176.16/24%27>], 'interface': >> >>> > >> u'bond0.10', 'ipv6gateway': '::', 'gateway': '43.25.17.1'}, >> >>> > >> 'Internal': >> >>> > >> {'iface': 'Internal', 'addr': '', 'cfg': {'DEFROUTE': 'no', >> >>> > >> 'HOTPLUG': >> >>> > >> 'no', 'MTU': '9000', 'DELAY': '0', 'NM_CONTROLLED': 'no', >> >>> > >> 'BOOTPROTO': >> >>> > >> 'none', 'STP': 'off', 'DEVICE': 'Internal', 'TYPE': 'Bridge', >> >>> > >> 'ONBOOT': >> >>> > >> 'no'}, 'bridged': True, 'ipv6addrs': >> >>> > >> ['fe80::210:18ff:fecd:daac/64'], >> >>> > >> 'gateway': '', 'bootproto4': 'none', 'netmask': '', 'stp': 'off', >> >>> > >> 'ipv4addrs': [], 'mtu': '9000', 'ipv6gateway': '::', 'ports': >> >>> > >> ['bond1.100']}, 'storage': {'iface': u'bond1', 'addr': >> '10.10.10.6', >> >>> > >> 'bridged': False, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], >> >>> > >> 'mtu': >> >>> > >> '9000', 'bootproto4': 'none', 'netmask': '255.255.255.0', >> >>> > >> 'ipv4addrs': [' >> >>> > >> 10.10.10.6/24' <http://10.10.10.6/24%27>], 'interface': >> u'bond1', >> >>> > >> 'ipv6gateway': '::', 'gateway': ''}, 'VMNetwork': {'iface': >> >>> > >> 'VMNetwork', >> >>> > >> 'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': >> >>> > >> '1500', >> >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': >> >>> > >> 'off', >> >>> > >> 'DEVICE': 'VMNetwork', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, >> 'bridged': >> >>> > >> True, >> >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'gateway': 
'', >> >>> > >> 'bootproto4': >> >>> > >> 'none', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], 'mtu': >> '1500', >> >>> > >> 'ipv6gateway': '::', 'ports': ['bond0.36']}}, 'bridges': >> >>> > >> {'Internal': >> >>> > >> {'addr': '', 'cfg': {'DEFROUTE': 'no', 'HOTPLUG': 'no', 'MTU': >> >>> > >> '9000', >> >>> > >> 'DELAY': '0', 'NM_CONTROLLED': 'no', 'BOOTPROTO': 'none', 'STP': >> >>> > >> 'off', >> >>> > >> 'DEVICE': 'Internal', 'TYPE': 'Bridge', 'ONBOOT': 'no'}, >> >>> > >> 'ipv6addrs': >> >>> > >> ['fe80::210:18ff:fecd:daac/64'], 'mtu': '9000', 'netmask': '', >> >>> > >> 'stp': >> >>> > >> 'off', 'ipv4addrs': [], 'ipv6gateway': '::', 'gateway': '', >> 'opts': >> >>> > >> {'topology_change_detected': '0', 'multicast_last_member_count': >> >>> > >> '2', >> >>> > >> 'hash_elasticity': '4', 'multicast_query_response_interval': >> '999', >> >>> > >> 'multicast_snooping': '1', 'multicast_startup_query_interval': >> >>> > >> '3124', >> >>> > >> 'hello_timer': '31', 'multicast_querier_interval': '25496', >> >>> > >> 'max_age': >> >>> > >> '1999', 'hash_max': '512', 'stp_state': '0', 'root_id': >> >>> > >> '8000.001018cddaac', 'priority': '32768', >> >>> > >> 'multicast_membership_interval': >> >>> > >> '25996', 'root_path_cost': '0', 'root_port': '0', >> >>> > >> 'multicast_querier': >> >>> > >> '0', >> >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', >> >>> > >> 'topology_change': '0', 'bridge_id': '8000.001018cddaac', >> >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': >> >>> > >> '31', >> >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', >> >>> > >> 'multicast_query_interval': '12498', >> >>> > >> 'multicast_last_member_interval': >> >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': >> >>> > >> ['bond1.100']}, 'VMNetwork': {'addr': '', 'cfg': {'DEFROUTE': >> 'no', >> >>> > >> 'HOTPLUG': 'no', 'MTU': '1500', 'DELAY': '0', 'NM_CONTROLLED': >> 'no', >> >>> > >> 'BOOTPROTO': 'none', 'STP': 'off', 
'DEVICE': 'VMNetwork', 'TYPE': >> >>> > >> 'Bridge', >> >>> > >> 'ONBOOT': 'no'}, 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], >> >>> > >> 'mtu': >> >>> > >> '1500', 'netmask': '', 'stp': 'off', 'ipv4addrs': [], >> 'ipv6gateway': >> >>> > >> '::', >> >>> > >> 'gateway': '', 'opts': {'topology_change_detected': '0', >> >>> > >> 'multicast_last_member_count': '2', 'hash_elasticity': '4', >> >>> > >> 'multicast_query_response_interval': '999', 'multicast_snooping': >> >>> > >> '1', >> >>> > >> 'multicast_startup_query_interval': '3124', 'hello_timer': '131', >> >>> > >> 'multicast_querier_interval': '25496', 'max_age': '1999', >> >>> > >> 'hash_max': >> >>> > >> '512', 'stp_state': '0', 'root_id': '8000.60eb6920b46c', >> 'priority': >> >>> > >> '32768', 'multicast_membership_interval': '25996', >> 'root_path_cost': >> >>> > >> '0', >> >>> > >> 'root_port': '0', 'multicast_querier': '0', >> >>> > >> 'multicast_startup_query_count': '2', 'hello_time': '199', >> >>> > >> 'topology_change': '0', 'bridge_id': '8000.60eb6920b46c', >> >>> > >> 'topology_change_timer': '0', 'ageing_time': '29995', 'gc_timer': >> >>> > >> '31', >> >>> > >> 'group_addr': '1:80:c2:0:0:0', 'tcn_timer': '0', >> >>> > >> 'multicast_query_interval': '12498', >> >>> > >> 'multicast_last_member_interval': >> >>> > >> '99', 'multicast_router': '1', 'forward_delay': '0'}, 'ports': >> >>> > >> ['bond0.36']}}, 'uuid': '44454C4C-4C00-1057-8053-B7C04F504E31', >> >>> > >> 'lastClientIface': 'bond1', 'nics': {'eth3': {'permhwaddr': >> >>> > >> '00:10:18:cd:da:ae', 'addr': '', 'cfg': {'SLAVE': 'yes', >> >>> > >> 'NM_CONTROLLED': >> >>> > >> 'no', 'MTU': '9000', 'HWADDR': '00:10:18:cd:da:ae', 'MASTER': >> >>> > >> 'bond1', >> >>> > >> 'DEVICE': 'eth3', 'ONBOOT': 'no'}, 'ipv6addrs': [], 'mtu': >> '9000', >> >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '00:10:18:cd:da:ac', >> >>> > >> 'speed': >> >>> > >> 1000}, 'eth2': {'permhwaddr': '00:10:18:cd:da:ac', 'addr': '', >> >>> > >> 'cfg': >> >>> > >> 
{'SLAVE': 'yes', 'NM_CONTROLLED': 'no', 'MTU': '9000', 'HWADDR': >> >>> > >> '00:10:18:cd:da:ac', 'MASTER': 'bond1', 'DEVICE': 'eth2', >> 'ONBOOT': >> >>> > >> 'no'}, >> >>> > >> 'ipv6addrs': [], 'mtu': '9000', 'netmask': '', 'ipv4addrs': [], >> >>> > >> 'hwaddr': >> >>> > >> '00:10:18:cd:da:ac', 'speed': 1000}, 'eth1': {'permhwaddr': >> >>> > >> '60:eb:69:20:b4:6d', 'addr': '', 'cfg': {'SLAVE': 'yes', >> >>> > >> 'NM_CONTROLLED': >> >>> > >> 'no', 'MTU': '1500', 'HWADDR': '60:eb:69:20:b4:6d', 'MASTER': >> >>> > >> 'bond0', >> >>> > >> 'DEVICE': 'eth1', 'ONBOOT': 'yes'}, 'ipv6addrs': [], 'mtu': >> '1500', >> >>> > >> 'netmask': '', 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', >> >>> > >> 'speed': >> >>> > >> 1000}, 'eth0': {'permhwaddr': '60:eb:69:20:b4:6c', 'addr': '', >> >>> > >> 'cfg': >> >>> > >> {'SLAVE': 'yes', 'NM_CONTROLLED': 'no', 'MTU': '1500', 'HWADDR': >> >>> > >> '60:eb:69:20:b4:6c', 'MASTER': 'bond0', 'DEVICE': 'eth0', >> 'ONBOOT': >> >>> > >> 'yes'}, >> >>> > >> 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'ipv4addrs': [], >> >>> > >> 'hwaddr': >> >>> > >> '60:eb:69:20:b4:6c', 'speed': 1000}}, 'software_revision': '1', >> >>> > >> 'clusterLevels': ['3.0', '3.1', '3.2', '3.3', '3.4', '3.5'], >> >>> > >> 'cpuFlags': >> >>> > >> >> >>> > >> >> u'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,popcnt,aes,lahf_lm,tpr_shadow,vnmi,flexpriority,ept,vpid,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270', >> >>> > >> 'ISCSIInitiatorName': 'iqn.1994-05.com.redhat:32151ce183c8', >> >>> > >> 'netConfigDirty': 'False', 'supportedENGINEs': ['3.0', '3.1', >> '3.2', >> >>> > >> '3.3', >> >>> > >> '3.4', '3.5'], 'autoNumaBalancing': 2, 'reservedMem': '321', >> 
>>> > >> 'bondings': >> >>> > >> {'bond4': {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', >> >>> > >> 'slaves': >> >>> > >> [], 'hwaddr': '00:00:00:00:00:00'}, 'bond0': {'addr': '', 'cfg': >> >>> > >> {'HOTPLUG': 'no', 'MTU': '1500', 'NM_CONTROLLED': 'no', >> >>> > >> 'BONDING_OPTS': >> >>> > >> 'mode=4 miimon=100', 'DEVICE': 'bond0', 'ONBOOT': 'yes'}, >> >>> > >> 'ipv6addrs': >> >>> > >> ['fe80::62eb:69ff:fe20:b46c/64'], 'mtu': '1500', 'netmask': '', >> >>> > >> 'ipv4addrs': [], 'hwaddr': '60:eb:69:20:b4:6c', 'slaves': >> ['eth0', >> >>> > >> 'eth1'], >> >>> > >> 'opts': {'miimon': '100', 'mode': '4'}}, 'bond1': {'addr': >> >>> > >> '10.10.10.6', >> >>> > >> 'cfg': {'DEFROUTE': 'no', 'IPADDR': '10.10.10.6', 'HOTPLUG': >> 'no', >> >>> > >> 'MTU': >> >>> > >> '9000', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', >> >>> > >> 'BOOTPROTO': >> >>> > >> 'none', 'BONDING_OPTS': 'mode=4 miimon=100', 'DEVICE': 'bond1', >> >>> > >> 'ONBOOT': >> >>> > >> 'no'}, 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'mtu': >> '9000', >> >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['10.10.10.6/24' >> >>> > >> <http://10.10.10.6/24%27>], 'hwaddr': '00:10:18:cd:da:ac', >> 'slaves': >> >>> > >> ['eth2', 'eth3'], 'opts': {'miimon': '100', 'mode': '4'}}, >> 'bond2': >> >>> > >> {'addr': '', 'cfg': {}, 'mtu': '1500', 'netmask': '', 'slaves': >> [], >> >>> > >> 'hwaddr': '00:00:00:00:00:00'}, 'bond3': {'addr': '', 'cfg': {}, >> >>> > >> 'mtu': >> >>> > >> '1500', 'netmask': '', 'slaves': [], 'hwaddr': >> >>> > >> '00:00:00:00:00:00'}}, >> >>> > >> 'software_version': '4.16', 'memSize': '24019', 'cpuSpeed': >> >>> > >> '2667.000', >> >>> > >> 'numaNodes': {u'1': {'totalMemory': '12288', 'cpus': [6, 7, 8, 9, >> >>> > >> 10, 11, >> >>> > >> 18, 19, 20, 21, 22, 23]}, u'0': {'totalMemory': '12278', 'cpus': >> [0, >> >>> > >> 1, 2, >> >>> > >> 3, 4, 5, 12, 13, 14, 15, 16, 17]}}, 'version_name': 'Snow Man', >> >>> > >> 'vlans': >> >>> > >> {'bond0.10': {'iface': 'bond0', 'addr': 
'43.25.17.16', 'cfg': >> >>> > >> {'DEFROUTE': >> >>> > >> 'yes', 'VLAN': 'yes', 'IPADDR': '43.25.17.16', 'HOTPLUG': 'no', >> >>> > >> 'GATEWAY': >> >>> > >> '43.25.17.1', 'NM_CONTROLLED': 'no', 'NETMASK': '255.255.255.0', >> >>> > >> 'BOOTPROTO': 'none', 'DEVICE': 'bond0.10', 'MTU': '1500', >> 'ONBOOT': >> >>> > >> 'yes'}, >> >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 10, >> 'mtu': >> >>> > >> '1500', >> >>> > >> 'netmask': '255.255.255.0', 'ipv4addrs': ['43.25.17.16/24'] >> >>> > >> <http://43.25.17.16/24%27%5D>}, 'bond0.36': {'iface': 'bond0', >> >>> > >> 'addr': >> >>> > >> '', 'cfg': {'BRIDGE': 'VMNetwork', 'VLAN': 'yes', 'HOTPLUG': >> 'no', >> >>> > >> 'MTU': >> >>> > >> '1500', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond0.36', 'ONBOOT': >> >>> > >> 'no'}, >> >>> > >> 'ipv6addrs': ['fe80::62eb:69ff:fe20:b46c/64'], 'vlanid': 36, >> 'mtu': >> >>> > >> '1500', >> >>> > >> 'netmask': '', 'ipv4addrs': []}, 'bond1.100': {'iface': 'bond1', >> >>> > >> 'addr': >> >>> > >> '', 'cfg': {'BRIDGE': 'Internal', 'VLAN': 'yes', 'HOTPLUG': 'no', >> >>> > >> 'MTU': >> >>> > >> '9000', 'NM_CONTROLLED': 'no', 'DEVICE': 'bond1.100', 'ONBOOT': >> >>> > >> 'no'}, >> >>> > >> 'ipv6addrs': ['fe80::210:18ff:fecd:daac/64'], 'vlanid': 100, >> 'mtu': >> >>> > >> '9000', >> >>> > >> 'netmask': '', 'ipv4addrs': []}}, 'cpuCores': '12', 'kvmEnabled': >> >>> > >> 'true', >> >>> > >> 'guestOverhead': '65', 'cpuThreads': '24', 'emulatedMachines': >> >>> > >> [u'rhel6.5.0', u'pc', u'rhel6.4.0', u'rhel6.3.0', u'rhel6.2.0', >> >>> > >> u'rhel6.1.0', u'rhel6.0.0', u'rhel5.5.0', u'rhel5.4.4', >> >>> > >> u'rhel5.4.0'], >> >>> > >> 'operatingSystem': {'release': '5.el6.centos.11.1', 'version': >> '6', >> >>> > >> 'name': >> >>> > >> 'RHEL'}, 'lastClient': '10.10.10.2'}} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,620::BindingXMLRPC::1132::vds::(wrapper) client >> >>> > >> [10.10.10.2]::call >> >>> > >> getHardwareInfo with () {} >> >>> > >> 
Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,621::BindingXMLRPC::1139::vds::(wrapper) return >> >>> > >> getHardwareInfo >> >>> > >> with {'status': {'message': 'Done', 'code': 0}, 'info': >> >>> > >> {'systemProductName': 'CS24-TY', 'systemSerialNumber': '7LWSPN1', >> >>> > >> 'systemFamily': 'Server', 'systemVersion': 'A00', 'systemUUID': >> >>> > >> '44454c4c-4c00-1057-8053-b7c04f504e31', 'systemManufacturer': >> >>> > >> 'Dell'}} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:41,733::BindingXMLRPC::1132::vds::(wrapper) client >> >>> > >> [10.10.10.2]::call >> >>> > >> hostsList with () {} flowID [222e8036] >> >>> > >> Thread-13::ERROR::2014-11-24 >> >>> > >> 21:41:44,753::BindingXMLRPC::1148::vds::(wrapper) vdsm exception >> >>> > >> occured >> >>> > >> Traceback (most recent call last): >> >>> > >> File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in >> wrapper >> >>> > >> res = f(*args, **kwargs) >> >>> > >> File "/usr/share/vdsm/gluster/api.py", line 54, in wrapper >> >>> > >> rv = func(*args, **kwargs) >> >>> > >> File "/usr/share/vdsm/gluster/api.py", line 251, in hostsList >> >>> > >> return {'hosts': self.svdsmProxy.glusterPeerStatus()} >> >>> > >> File "/usr/share/vdsm/supervdsm.py", line 50, in __call__ >> >>> > >> return callMethod() >> >>> > >> File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda> >> >>> > >> **kwargs) >> >>> > >> File "<string>", line 2, in glusterPeerStatus >> >>> > >> File "/usr/lib64/python2.6/multiprocessing/managers.py", line >> 740, >> >>> > >> in >> >>> > >> _callmethod >> >>> > >> raise convert_to_error(kind, result) >> >>> > >> GlusterCmdExecFailedException: Command execution failed >> >>> > >> error: Connection failed. Please check if gluster daemon is >> >>> > >> operational. 
>> >>> > >> return code: 1 >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:50,949::task::595::Storage.TaskManager.Task::(_updateState) >> >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state >> init >> >>> > >> -> >> >>> > >> state preparing >> >>> > >> Thread-13::INFO::2014-11-24 >> >>> > >> 21:41:50,950::logUtils::44::dispatcher::(wrapper) Run and >> protect: >> >>> > >> repoStats(options=None) >> >>> > >> Thread-13::INFO::2014-11-24 >> >>> > >> 21:41:50,950::logUtils::47::dispatcher::(wrapper) Run and >> protect: >> >>> > >> repoStats, Return response: {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:50,950::task::1191::Storage.TaskManager.Task::(prepare) >> >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::finished: {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:50,950::task::595::Storage.TaskManager.Task::(_updateState) >> >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::moving from state >> >>> > >> preparing >> >>> > >> -> >> >>> > >> state finished >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> >> >>> > >> >> 21:41:50,951::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) >> >>> > >> Owner.releaseAll requests {} resources {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> >> >>> > >> >> 21:41:50,951::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) >> >>> > >> Owner.cancelAll requests {} >> >>> > >> Thread-13::DEBUG::2014-11-24 >> >>> > >> 21:41:50,951::task::993::Storage.TaskManager.Task::(_decref) >> >>> > >> Task=`c9042986-c978-4b08-adb2-616f5299e115`::ref 0 aborting False >> >>> > >> ------------------------------- >> >>> > >> >> >>> > >> [root at compute4 ~]# service glusterd status >> >>> > >> glusterd is stopped >> >>> > >> [root at compute4 ~]# chkconfig --list | grep glusterd >> >>> > >> glusterd 0:off 1:off 2:on 3:on 4:on 5:on >> >>> > >> 6:off >> >>> > >> [root at compute4 ~]# >> >>> > >> >> >>> > >> Thanks, >> >>> > >> Punit >> >>> > >> 
>> >>> > >> On Mon, Nov 24, 2014 at 6:36 PM, Kanagaraj <kmayilsa at redhat.com> >> >>> > >> wrote: >> >>> > >> >> >>> > >>> Can you send the corresponding error in vdsm.log from the host? >> >>> > >>> >> >>> > >>> Also check if glusterd service is running. >> >>> > >>> >> >>> > >>> Thanks, >> >>> > >>> Kanagaraj >> >>> > >>> >> >>> > >>> >> >>> > >>> On 11/24/2014 03:39 PM, Punit Dambiwal wrote: >> >>> > >>> >> >>> > >>> Hi, >> >>> > >>> >> >>> > >>> After reboot my Hypervisior host can not activate again in the >> >>> > >>> cluster >> >>> > >>> and failed with the following error :- >> >>> > >>> >> >>> > >>> Gluster command [<UNKNOWN>] failed on server... >> >>> > >>> >> >>> > >>> Engine logs :- >> >>> > >>> >> >>> > >>> 2014-11-24 18:05:28,397 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-64) START, >> >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId >> >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 5f251c90 >> >>> > >>> 2014-11-24 18:05:30,609 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-64) FINISH, >> >>> > >>> GlusterVolumesListVDSCommand, >> >>> > >>> return: >> >>> > >>> >> >>> > >>> >> {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at d95203e0 >> }, >> >>> > >>> log id: 5f251c90 >> >>> > >>> 2014-11-24 18:05:33,768 INFO >> >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >> >>> > >>> (ajp--127.0.0.1-8702-8) >> >>> > >>> [287d570d] Lock Acquired to object EngineLock [exclusiveLocks>> key: >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a value: VDS >> >>> > >>> , sharedLocks= ] >> >>> > >>> 2014-11-24 18:05:33,795 INFO >> >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Running command: >> 
>>> > >>> ActivateVdsCommand internal: false. Entities affected : ID: >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDSAction group >> >>> > >>> MANIPULATE_HOST >> >>> > >>> with role type ADMIN >> >>> > >>> 2014-11-24 18:05:33,796 INFO >> >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Before acquiring >> >>> > >>> lock in >> >>> > >>> order to prevent monitoring for host Compute5 from data-center >> >>> > >>> SV_WTC >> >>> > >>> 2014-11-24 18:05:33,797 INFO >> >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] Lock acquired, >> from >> >>> > >>> now a >> >>> > >>> monitoring of host will be skipped for host Compute5 from >> >>> > >>> data-center >> >>> > >>> SV_WTC >> >>> > >>> 2014-11-24 18:05:33,817 INFO >> >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] START, >> >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=Unassigned, >> >>> > >>> nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: >> >>> > >>> 1cbc7311 >> >>> > >>> 2014-11-24 18:05:33,820 INFO >> >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) [287d570d] FINISH, >> >>> > >>> SetVdsStatusVDSCommand, log id: 1cbc7311 >> >>> > >>> 2014-11-24 18:05:34,086 INFO >> >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) Activate finished. Lock >> >>> > >>> released. 
>> >>> > >>> Monitoring can run now for host Compute5 from data-center SV_WTC >> >>> > >>> 2014-11-24 18:05:34,088 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) Correlation ID: 287d570d, >> Job >> >>> > >>> ID: >> >>> > >>> 5ef8e4d6-b2bc-469e-8e81-7ef74b2a001a, Call Stack: null, Custom >> >>> > >>> Event ID: >> >>> > >>> -1, Message: Host Compute5 was activated by admin. >> >>> > >>> 2014-11-24 18:05:34,090 INFO >> >>> > >>> [org.ovirt.engine.core.bll.ActivateVdsCommand] >> >>> > >>> (org.ovirt.thread.pool-8-thread-45) Lock freed to object >> EngineLock >> >>> > >>> [exclusiveLocks= key: 0bf6b00f-7947-4411-b55a-cc5eea2b381a >> value: >> >>> > >>> VDS >> >>> > >>> , sharedLocks= ] >> >>> > >>> 2014-11-24 18:05:35,792 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-55) [3706e836] START, >> >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId >> >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 48a0c832 >> >>> > >>> 2014-11-24 18:05:37,064 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) START, >> >>> > >>> GetHardwareInfoVDSCommand(HostName = Compute5, HostId >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, >> >>> > >>> vds=Host[Compute5,0bf6b00f-7947-4411-b55a-cc5eea2b381a]), log >> id: >> >>> > >>> 6d560cc2 >> >>> > >>> 2014-11-24 18:05:37,074 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHardwareInfoVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) FINISH, >> >>> > >>> GetHardwareInfoVDSCommand, log >> >>> > >>> id: 6d560cc2 >> >>> > >>> 2014-11-24 18:05:37,093 WARN >> >>> > >>> [org.ovirt.engine.core.vdsbroker.VdsManager] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) Host Compute5 is 
running with >> >>> > >>> disabled >> >>> > >>> SELinux. >> >>> > >>> 2014-11-24 18:05:37,127 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] Running command: >> >>> > >>> HandleVdsCpuFlagsOrClusterChangedCommand internal: true. >> Entities >> >>> > >>> affected >> >>> > >>> : ID: 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS >> >>> > >>> 2014-11-24 18:05:37,147 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] START, >> >>> > >>> GlusterServersListVDSCommand(HostName = Compute5, HostId >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a), log id: 4faed87 >> >>> > >>> 2014-11-24 18:05:37,164 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [2b4a51cf] FINISH, >> >>> > >>> GlusterServersListVDSCommand, log id: 4faed87 >> >>> > >>> 2014-11-24 18:05:37,189 INFO >> >>> > >>> [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Running command: >> >>> > >>> SetNonOperationalVdsCommand internal: true. 
Entities affected : >> >>> > >>> ID: >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS >> >>> > >>> 2014-11-24 18:05:37,206 INFO >> >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] START, >> >>> > >>> SetVdsStatusVDSCommand(HostName = Compute5, HostId >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a, status=NonOperational, >> >>> > >>> nonOperationalReason=GLUSTER_COMMAND_FAILED, >> >>> > >>> stopSpmFailureLogged=false), >> >>> > >>> log id: fed5617 >> >>> > >>> 2014-11-24 18:05:37,209 INFO >> >>> > >>> [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] FINISH, >> >>> > >>> SetVdsStatusVDSCommand, log id: fed5617 >> >>> > >>> 2014-11-24 18:05:37,223 ERROR >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: >> >>> > >>> 4a84c4e5, >> >>> > >>> Job >> >>> > >>> ID: 4bfd4a6d-c3ef-468f-a40e-a3a6ca13011b, Call Stack: null, >> Custom >> >>> > >>> Event >> >>> > >>> ID: -1, Message: Gluster command [<UNKNOWN>] failed on server >> >>> > >>> Compute5. >> >>> > >>> 2014-11-24 18:05:37,243 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [4a84c4e5] Correlation ID: >> null, >> >>> > >>> Call >> >>> > >>> Stack: null, Custom Event ID: -1, Message: Status of host >> Compute5 >> >>> > >>> was >> >>> > >>> set >> >>> > >>> to NonOperational. >> >>> > >>> 2014-11-24 18:05:37,272 INFO >> >>> > >>> [org.ovirt.engine.core.bll.HandleVdsVersionCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Running command: >> >>> > >>> HandleVdsVersionCommand internal: true. 
Entities affected : ID: >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a Type: VDS >> >>> > >>> 2014-11-24 18:05:37,274 INFO >> >>> > >>> [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] >> >>> > >>> (DefaultQuartzScheduler_Worker-69) [a0c8a7f] Host >> >>> > >>> 0bf6b00f-7947-4411-b55a-cc5eea2b381a : Compute5 is already in >> >>> > >>> NonOperational status for reason GLUSTER_COMMAND_FAILED. >> >>> > >>> SetNonOperationalVds command is skipped. >> >>> > >>> 2014-11-24 18:05:38,065 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-55) [3706e836] FINISH, >> >>> > >>> GlusterVolumesListVDSCommand, return: >> >>> > >>> >> >>> > >>> >> {26ae1672-ee09-4a38-8fd2-72dd9974cc2b=org.ovirt.engine.core.common.businessentities.gluster.GlusterVolumeEntity at 4e72a1b1 >> }, >> >>> > >>> log id: 48a0c832 >> >>> > >>> 2014-11-24 18:05:43,243 INFO >> >>> > >>> >> >>> > >>> >> [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand] >> >>> > >>> (DefaultQuartzScheduler_Worker-35) START, >> >>> > >>> GlusterVolumesListVDSCommand(HostName = Compute4, HostId >> >>> > >>> 33648a90-200c-45ca-89d5-1ce305d79a6a), log id: 3ce13ebc >> >>> > >>> ^C >> >>> > >>> [root at ccr01 ~]# >> >>> > >>> >> >>> > >>> Thanks, >> >>> > >>> Punit >> >>> > >>> >> >>> > >>> >> >>> > >>> _______________________________________________ >> >>> > >>> Users mailing >> >>> > >>> listUsers at ovirt.orghttp:// >> lists.ovirt.org/mailman/listinfo/users >> >>> > >>> >> >>> > >>> >> >>> > >>> >> >>> > >> >> >>> > >> >> >>> > > >> >>> > > >> >>> > >> >> >> >> >> > >> > >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > http://supercolony.gluster.org/mailman/listinfo/gluster-users >> > >-------------- next part -------------- An HTML attachment was scrubbed... 