thr3ads.net - Gluster users - [Gluster-users] Clarification on common tasks [Aug 2016]

If this information is useful, please help other people find it:
Share via:

Lindsay Mathieson

2016-Aug-11 11:08 UTC

[Gluster-users] Clarification on common tasks

On 11/08/2016 7:13 PM, Gandalf Corvotempesta wrote:> 1) kill the brick process (how can I ensure which is the brick process
> to kill)?

glusterfsd is the prick status

Also "gluster volume status" lists the pid's of all the bricks
processes.
> 2) unmount the brick, in example:
> unmount /dev/sdc
>
> 3) remove the failed disk
>
> 4) insert the new disk
> 5) create an XFS filesystem on the new disk
> 6) mount the new disk where the previous one was
Yes to all that.

> 7) add the new brick to the gluster. How ?
No need. New brick is mounted where the old one was.
> 8) run "gluster v start force".
Yes.
> Why should I need the step 8? If the volume is already started and
> working (remember that I would like to change disk with no downtime,
> thus i can't stop the volume), why should I "start" it again
?

This forces a restart of the glusterfsd process you killed earlier.

Next you do a :

   "gluster heal <VOLUME NAME> full"

That causes the files on the other bricks to be healed to the new brick.


> B) let's assume I would like to add a bounch of new bricks on existing
> servers. Which is the proper procedure to do so?
Different process altogether.

> Ceph has a good documentation page where some common tasks are explained:
> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
> i've not found anything similiar in gluster.

That would be good.


-- 
Lindsay Mathieson

Gandalf Corvotempesta

2016-Aug-11 13:43 UTC

head link

[Gluster-users] Clarification on common tasks

2016-08-11 13:08 GMT+02:00 Lindsay Mathieson <lindsay.mathieson at
gmail.com>:> Also "gluster volume status" lists the pid's of all the
bricks processes.
Ok, let's break everything., just to try.

This is a working cluster. I have 3 server with 1 brick each, in
replica 3, thus, all files are replicated on all hosts.

# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 2a36dc0f-1d9b-469c-82de-9d8d98321b83
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 1.2.3.112:/export/sdb1/brick
Brick2: 1.2.3.113:/export/sdb1/brick
Brick3: 1.2.3.114:/export/sdb1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.shard: off
features.shard-block-size: 10MB
performance.write-behind-window-size: 1GB
performance.cache-size: 1GB


I did this on a client:

# echo 'hello world' > hello
# md5sum hello
6f5902ac237024bdd0c176cb93063dc4  hello

Obviously, on node 1.2.3.112 I have it:

# cat /export/sdb1/brick/hello
hello world
# md5sum /export/sdb1/brick/hello
6f5902ac237024bdd0c176cb93063dc4  /export/sdb1/brick/hello



Let's break everything, this is funny.
I take the brick pid from here:
# gluster volume status | grep 112
Brick 1.2.3.112:/export/sdb1/brick      49152     0          Y       14315


# kill -9 14315

# gluster volume status | grep 112
Brick 1.2.3.112:/export/sdb1/brick      N/A       N/A        N       N/A

this should be like a dregraded cluster, right ?

Now I add a new file from the client:
echo "hello world, i'm degraded" > degraded

Obviously, this file is not replicated on node 1.2.3.112

# gluster volume heal gv0 info
Brick 1.2.3.112:/export/sdb1/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick 1.2.3.113:/export/sdb1/brick
/degraded
/
Status: Connected
Number of entries: 2

Brick 1.2.3.114:/export/sdb1/brick
/degraded
/
Status: Connected
Number of entries: 2



This means that "/" dir and "/degraded" file should be
healed from
.113 and .114 ?

Let's format the disk on .112
# umount /dev/sdb1
# mkfs.xfs /dev/sdb1 -f
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=122094597 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=488378385, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=238466, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


Now I mount it again on the old place:

# mount /dev/sdb1 /export/sdb1

it's empty:
# ls /export/sdb1/ -la
total 4
drwxr-xr-x 2 root root    6 Aug 11 15:37 .
drwxr-xr-x 3 root root 4096 Jul  5 17:03 ..

I create the "brick" directory used by gluster:

# mkdir /export/sdb1/brick

Now I run the volume start force:

# gluster volume start gv0 force
volume start: gv0: success

But brick process is still down:

# gluster volume status | grep 112
Brick 1.2.3.112:/export/sdb1/brick      N/A       N/A        N       N/A



And now ?

What I really don't like is the use of "force" in "gluster
volume start"
Usually (in all software) force is used when "bad things" are needed.
In this case, the volume start is mandatory, thus why I have to use
the force?
If the volume is already started, gluster should be smart enough to
start only the missing processes, without force, or, better, another
command should be created, something like: "gluster bricks start"
using the force means running dangerous operation, not a common
administrative task.

Gluster users - Aug 2016 - Clarification on common tasks

[Gluster-users] Clarification on common tasks

[Gluster-users] Clarification on common tasks