thr3ads.net - similar to: "Large Corosync/Pacemaker clusters"

Displaying 20 results from an estimated 6000 matches similar to: "Large Corosync/Pacemaker clusters"

2011 Nov 23

Corosync init-script broken on CentOS6

Hello all, I am trying to create a corosync/pacemaker cluster using CentOS 6.0. However, I'm having a great deal of difficulty doing so. Corosync has a valid configuration file and an authkey has been generated. When I run /etc/init.d/corosync I see that only corosync is started. From experience working with corosync/pacemaker before, I know that this is not enough to have a functioning

2012 Nov 02

lctl ping of Pacemaker IP

Greetings! I am working with Lustre-2.1.2 on RHEL 6.2. First I configured it using the standard defaults over TCP/IP. Everything worked very nicely using a real, static --mgsnode=a.b.c.x value which was the actual IP of the MGS/MDS system1 node. I am now trying to integrate it with Pacemaker-1.1.7. I believe I have most of the set-up completed with a particular exception. The "lctl

2011 May 10

DRBD, Xen, HVM and live migration

Hi, I want to combine all the above mentioned technologies. The Linbit pages warn not to use the drbd: VBD with HVM DomUs. This page however: http://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en (thank you Jean), simply puts two DRBD devices in dual primary mode and starts Xen DomUs while pointing to the DRBD devices with phy: in the DomU config files.

2011 Jan 19

Xen on two node DRBD cluster with Pacemaker

Hi all, could somebody point me to what is considered a sound way to offer Xen guests on a two node DRBD cluster in combination with Pacemaker? I prefer block devices over images for the DomUs. I understand that for live migration DRBD 8.3 is needed, but I'm not sure as to what kind of resource agents/technologies are advised (LVM, cLVM, ...) and what kind of DRBD config

2007 Nov 16

Lustre Debug level

Hi, Lustre manual 1.6 v18 says that in production the Lustre debug level should be set fairly low. The manual also says that I can verify the level by running the following commands: # sysctl portals.debug This gives the following error: error: 'portals.debug' is an unknown key cat /proc/sys/lnet/debug gives output: ioctl neterror warning error emerg ha config console cat

2012 Mar 05

Cluster xen

Hello, I would like to set up a cluster under Xen or XenServer with 2 Dell R710 servers. I would like to build a cluster that uses the entire combined disk space of the 2 servers as well as the memory. What are your experiences and your configurations? Thanks in advance. Regards, Mat

2007 Mar 20

How to bypass failed OST without blocking?

Hi, I want Lustre to behave as follows when an OST fails: any operation on a file that has stripe data on the failed OST should return an I/O error without blocking, and meanwhile I should still be able to create and read/write new files, or read/write files that have no stripe data on the failed OST, without blocking. What should I do? How do I configure this? Thanks! swin

2011 Sep 29

CentOS 6: corosync and pacemaker won't stop (patch)

Hi, I cannot 'halt' my CentOS 6 servers while running corosync+pacemaker. I believe the runlevels used to stop corosync and pacemaker are not in the correct order and create the infinite "Waiting for corosync services to unload..." loop thing. This is my first time with this cluster technology but apparently pacemaker has to be stopped /before/ corosync. Applying the following

2007 Oct 15

iptables rules for lustre 1.6.x and MGS recovery procedures

Hi, I would like to know which TCP/UDP ports I should keep open in the firewall policies on my MGS server so that I can firewall it. Also, in the event of the loss of the MGT, would it be possible to recreate the MGT without losing data or bringing the filesystem down (i.e. by using cached information from the MDTs and OSTs)? Thanks, Anand

2010 Aug 17

write RPC & congestion

Hi, thanks for the previous help. I have some questions about Lustre RPC and the sequence of events that occur during large concurrent write() calls involving many processes and a large data size per process. I understand there is a flow-control mechanism based on credits, but I'm a little unclear on how it works in general after reading the "networking & io protocol" white paper. Is

2008 Mar 07

Multihomed question: want Lustre over IB and Ethernet

Chris, Perhaps you need to perform some write_conf like command. I'm not sure if this is needed in 1.6 or not. Shane ----- Original Message ----- From: lustre-discuss-bounces at lists.lustre.org <lustre-discuss-bounces at lists.lustre.org> To: lustre-discuss <lustre-discuss at lists.lustre.org> Sent: Fri Mar 07 12:03:17 2008 Subject: Re: [Lustre-discuss] Multihomed

2012 Dec 11

Configuring Xen + DRBD + Corosync + Pacemaker

Hi everyone, I need some help setting up my failover configuration. My goal is to have a redundant system using Xen + DRBD + Corosync + Pacemaker. On Xen I will have one virtual machine. When this computer's network goes down, I will do a live migration to the second computer. The first thing I will need is a crossover cable, won't I? Is it really necessary? Ok, I did it. eth0

2017 Feb 10

NUT configuration complicated by Stonith/Fencing cabling

Roger, Thanks for your reply. As I understand it, for reliable fencing a node cannot be responsible for fencing itself, as it may not be functioning properly. Hence my "cross over" setup. The direct USB connection from Webserver1 to UPS-Webserver2 means that Webserver1 can fence (cut the power to) Webserver2 if the cluster software decides that it is necessary. If my UPSes were able to

2008 Feb 14

how do you mount mountconf (i.e. 1.6) lustre on your servers?

As any of you using version 1.6 of Lustre know, Lustre servers can now be started simply by mounting the devices they use. Even an /etc/fstab entry can be used if you can have the mount delayed until the network is started. Given this change, you will also have noticed that we have eliminated the initscript for Lustre that existed in releases prior to 1.6. I'd like to take a

2018 Jul 05

two 2-node clusters or one 4-node cluster?

Hello, I'm planning the migration of two current clusters based on CentOS 6.x with Cman/Rgmanager to CentOS 7.x and Corosync/Pacemaker. As the clusters and their services are on the same subnet, and there are no particular security concerns differentiating them, I'm also evaluating the option of merging the two clusters into a single 4-node one during the upgrade. Currently I'm

2008 Feb 04

Lustre clients getting evicted

on our cluster that has been running Lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OSTs. Our cluster uses GigE throughout and has about 608 nodes and 1854 cores. We have a lot of jobs that die and/or go into high IO wait; strace shows processes stuck in fstat(). The big problem (I think, and I would like some feedback on it) is that of these 608 nodes, 209 have in dmesg

2012 Nov 26

Status of STONITH support in the puppetlabs corosync module?

Greetings - Hoping to hear from hunner or one of the other maintainers of the puppetlabs corosync module - there is a note on the git project page that there is currently no way to configure STONITH. Is this information current? If so, has anybody come up with a simple method of managing STONITH with corosync via puppet?

2013 Mar 18

lustre showing inactive devices

I installed 1 MDS, 2 OSS/OSTs and 2 Lustre clients. My MDS shows: [code] [root@MDS ~]# lctl list_nids 10.94.214.185@tcp [root@MDS ~]# [/code] On Lustre Client1: [code] [root@lustreclient1 lustre]# lfs df -h UUID bytes Used Available Use% Mounted on lustre-MDT0000_UUID 4.5G 274.3M 3.9G 6% /mnt/lustre[MDT:0]

2013 Mar 18

OST0006 : inactive device

2010 Jul 08

No space left on device on not full filesystem

Hello, We are running Lustre 1.8.1 and have hit a "No space left on device" error when uploading 500 GB of small files (less than 100 KB each). The problem seems to depend on the number of files. If we remove one file, we can create one new file, even one of GB size; but if we haven't removed anything, we can't create even a very small file, for example using touch
