Hello,

I have several problems with quota on our test cluster: when I set the quota for a user to a given value (e.g. the values provided in the operations manual), I am able to write exactly the amount set with setquota. But when I delete the file(s), I am not able to use this space again.

Here is what I did in detail:

lfs quotacheck -ug /mnt/testfs
lfs setquota -u winnie 307200 309200 10000 11000 /mnt/testfs

Then I wrote one single big file with dd:

dd if=/dev/zero of=/mnt/testfs/test

As expected, dd stops writing once the file is ~300 MB large. Removing the file and restarting dd only produces a zero-sized file, because the disk quota is still reported as exceeded.

Has anybody seen this behaviour, and does anyone know what is wrong here? (I guess some values are cached.)

Thanks in advance!
Patrick Winnertz

--
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
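For reference, here is the whole reproduction condensed into one shell session. This is only a sketch: the mount point, user and limits are the ones from above, while bs=1M and the extra lfs quota check after the rm are added here purely for illustration.

lfs quotacheck -ug /mnt/testfs                                  # (re)build user/group quota accounting
lfs setquota -u winnie 307200 309200 10000 11000 /mnt/testfs    # ~300 MB block soft/hard limit, 10000/11000 inode limit
dd if=/dev/zero of=/mnt/testfs/test bs=1M                       # stops near the ~300 MB limit, as expected
rm /mnt/testfs/test
lfs quota -u winnie /mnt/testfs                                 # usage should drop back after the unlink
dd if=/dev/zero of=/mnt/testfs/test bs=1M                       # observed problem: fails immediately, zero-sized file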
Hello,

we had the same problem with our Lustre software from HP (HP SFS). HP opened CFS bug 12431, which is not visible to the public, nor to us, so I am not sure which Lustre version includes the corresponding fix. HP provided a fix on top of their newest SFS version which solved the problem.

Here is part of the explanation of the problem: files which did not decrease the quota when they were deleted had inode->i_dquot set to NULL, which should not happen. The root cause was in filter_destroy() and filter_commitrw_commit().

Regards,
Roland

--
Roland Laifer
Rechenzentrum, Universitaet Karlsruhe (TH), D-76128 Karlsruhe, Germany
Email: Roland.Laifer at rz.uni-karlsruhe.de, Phone: +49 721 608 4861,
Fax: +49 721 32550, Web: www.rz.uni-karlsruhe.de/personen/roland.laifer

On Wed, Jan 02, 2008 at 11:27:56AM +0100, Patrick Winnertz wrote:
> [full quote removed]
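One way to watch for this symptom from a client is to check the per-OST accounting around an unlink (a sketch only; the user, mount point and the lfs quota calls are just an example, not part of the HP fix description):

lfs quota -u winnie /mnt/testfs    # note the per-OST kbytes while the big file still exists
rm /mnt/testfs/test
sync                               # give the OSTs time to commit the object destroy
lfs quota -u winnie /mnt/testfs    # with this bug, the OST kbytes never drop back after the delete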
The fix Roland mentions is included in Lustre 1.4.10, or you can also find it here: https://bugzilla.lustre.org/attachment.cgi?id=8709

-therese (HP SFS Support)

Postal Address: Hewlett Packard Galway Ltd., Ballybrit Business Park, Galway, Ireland
Registered Office: 63-74 Sir John Rogerson's Quay, Dublin 2, Ireland. Registered Number: 361933
On Wed, Jan 02, 2008 at 01:39:06PM +0000, McHale, Therese wrote:
> The fix Roland mentions is included in Lustre 1.4.10 or you can also find it here https://bugzilla.lustre.org/attachment.cgi?id=8709

For the record, the original bugzilla ticket is in fact 11073 and, as Therese pointed out, the patch is included in Lustre 1.4.10.

Johann
On Wednesday, 2 January 2008 15:45:48, Johann Lombardi wrote:
> On Wed, Jan 02, 2008 at 01:39:06PM +0000, McHale, Therese wrote:
> > The fix Roland mentions is included in Lustre 1.4.10 or you can also
> > find it here https://bugzilla.lustre.org/attachment.cgi?id=8709
>
> For the record, the original bugzilla ticket is in fact 11073 and as
> Therese pointed out, the patch is included in Lustre 1.4.10.

What about the 1.6.x tree? Is this fix included in a 1.6.x version, too?

Greetings
Winnie

--
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
> What about the 1.6.x tree? Is this fix included in a 1.6.x version, too?

Yes. It went into 1.5 and is therefore included in 1.6.x.

-therese

Postal Address: Hewlett Packard Galway Ltd., Ballybrit Business Park, Galway, Ireland
Registered Office: 63-74 Sir John Rogerson's Quay, Dublin 2, Ireland. Registered Number: 361933
On Monday, 7 January 2008 14:05:52, McHale, Therese wrote:
> > What about the 1.6.x tree? Is this fix included in a 1.6.x version, too?
>
> yes. It went into 1.5 and is therefore included in 1.6.x

This is strange. I have seen this error on a 1.6.4.1 Lustre cluster (and before that on 1.6.3), so that does not fit with your statement that this should be fixed in every 1.6.x version. And why is bug ticket 11073 for this problem not viewable by the public?

[removed fullquote]

Greetings
Patrick Winnertz

--
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
On Monday, 7 January 2008 14:05:52, McHale, Therese wrote:
> > What about the 1.6.x tree? Is this fix included in a 1.6.x version, too?
>
> yes. It went into 1.5 and is therefore included in 1.6.x

Mh, yes, as you said that fix is included in 1.6.4.1. But the error I mentioned exists in 1.6.4.1, so maybe it is a new bug?

Here again is the description of the error: when I set the quota to a given value, I am able to write exactly the amount which is set with setquota. After deleting the files it is not possible to rewrite them; this only produces zero-sized files.

In order to use this space again I have to reset the quota to 0 0 0 0, write a file, and then reset the quota to the values I want. Then I can work with that quota again. If I do not write a file while the quota is set to 0 0 0 0, it does not work.

Greetings
Patrick Winnertz

PS: I attached two files with a comparison of these two scenarios:

Scenario 1:
setquota $ourvalues
write file
delete file
write file        <-- this won't work (dd reports that the disk quota is exceeded)
setquota 0 0 0 0
write file        } necessary in order to get the last dd to work
delete file       }
setquota $ourvalues
write file

Scenario 2:
setquota $ourvalues
write file
delete file
write file        <-- again this won't work (zero-sized file)
setquota 0 0 0 0
setquota $ourvalues
write file        <-- will also not work (ends up again as a zero-sized file)

--
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz

-------------- next part --------------
scenario1.txt:

debian:~# lfs setquota -u credativ 10000 20000 100000 200000 /mnt/lustre_client/
debian:~# su credativ
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
dd: writing to '/mnt/lustre_client/credativ/testfile': Disk quota exceeded
4113+0 records in
4112+0 records out
2105344 bytes (2.1 MB) copied, 0.327338 seconds, 6.4 MB/s
credativ at debian:/root$ rm /mnt/lustre_client/credativ/testfile
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
dd: writing to '/mnt/lustre_client/credativ/testfile': Disk quota exceeded
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00455348 seconds, 0.0 kB/s
credativ at debian:/root$ exit
exit
debian:~# lfs setquota -u credativ 0 0 0 0 /mnt/lustre_client/
debian:~# su credativ
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
180225+0 records in
180225+0 records out
92275200 bytes (92 MB) copied, 6.17184 seconds, 15.0 MB/s
credativ at debian:/root$ rm /mnt/lustre_client/credativ/testfile
credativ at debian:/root$ exit
exit
debian:~# lfs setquota -u credativ 10000 20000 100000 200000 /mnt/lustre_client/
debian:~# su credativ
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
dd: writing to '/mnt/lustre_client/credativ/testfile': Disk quota exceeded
7089+0 records in
7088+0 records out
3629056 bytes (3.6 MB) copied, 0.442293 seconds, 8.2 MB/s
credativ at debian:/root$

-------------- next part --------------
scenario2.txt:

debian:~# lfs setquota -u credativ 10000 20000 100000 200000 /mnt/lustre_client/
debian:~# su credativ
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
dd: writing to '/mnt/lustre_client/credativ/testfile': Disk quota exceeded
7113+0 records in
7112+0 records out
3641344 bytes (3.6 MB) copied, 0.439291 seconds, 8.3 MB/s
credativ at debian:/root$ rm /mnt/lustre_client/credativ/testfile
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
dd: writing to '/mnt/lustre_client/credativ/testfile': Disk quota exceeded
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00436864 seconds, 0.0 kB/s
credativ at debian:/root$ exit
exit
debian:~# lfs setquota -u credativ 0 0 0 0 /mnt/lustre_client/
debian:~# lfs setquota -u credativ 10000 20000 100000 200000 /mnt/lustre_client/
debian:~# su credativ
credativ at debian:/root$ dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile
dd: writing to '/mnt/lustre_client/credativ/testfile': Disk quota exceeded
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00344826 seconds, 0.0 kB/s
credativ at debian:/root$
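The workaround from scenario 1, condensed into one sketch (the user, mount point and limits are the ones used in the transcripts above; bs=1M count=10 for the intermediate write is an arbitrary value added here for brevity):

lfs setquota -u credativ 0 0 0 0 /mnt/lustre_client/                      # clear the limits
su credativ -c 'dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile bs=1M count=10'
su credativ -c 'rm /mnt/lustre_client/credativ/testfile'                  # a write+delete with no limit set frees the stale usage
lfs setquota -u credativ 10000 20000 100000 200000 /mnt/lustre_client/    # restore the intended limits
su credativ -c 'dd if=/dev/zero of=/mnt/lustre_client/credativ/testfile'  # writes again until the quota is hit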
Hi,

We see a similar problem on our 1.6.3 Lustre installation. I will open a bugzilla ticket for that issue.

cheers
Wojciech Turek

On 9 Jan 2008, at 12:42, Patrick Winnertz wrote:
> [full quote removed]
Hi,

I followed your scenario on my test Lustre 1.6.3 installation.

1) Set a quota limit:

lfs setquota -u test 0 512000 0 0 /home

2) Create files:

for i in 1 2 3 4 ; do dd if=/dev/zero of=dummy.$i bs=102400K count=1 & done

The files are created:

lfs quota -u test /home
Disk quotas for user test (uid 627):
     Filesystem  kbytes  quota   limit  grace  files  quota  limit  grace
          /home  409620      0  512000             6      0      0
ddn_home-MDT0000_UUID        4      0  102400      6      0      0
ddn_home-OST0000_UUID  102404*     0  102400
ddn_home-OST0001_UUID  102404*     0  102400
ddn_home-OST0002_UUID  102404*     0  102400
ddn_home-OST0003_UUID  102404*     0  102400

3) Delete the files:

rm -rf dummy.*

lfs quota -u test /home
Disk quotas for user test (uid 627):
     Filesystem  kbytes  quota   limit  grace  files  quota  limit  grace
          /home       4      0  512000             2      0      0
ddn_home-MDT0000_UUID        4      0  102400      2      0      0
ddn_home-OST0000_UUID        0      0  102400
ddn_home-OST0001_UUID        0      0  102400
ddn_home-OST0002_UUID        0      0  102400
ddn_home-OST0003_UUID        0      0  102400

4) Create the files again:

for i in 1 2 3 4 ; do dd if=/dev/zero of=dummy.$i bs=102400K count=1 & done

ls -l
total 409616
-rw-rw-r-- 1 test test 104857600 Jan  9 16:28 dummy.1
-rw-rw-r-- 1 test test 104857600 Jan  9 16:28 dummy.2
-rw-rw-r-- 1 test test 104857600 Jan  9 16:28 dummy.3
-rw-rw-r-- 1 test test 104857600 Jan  9 16:28 dummy.4

lfs quota -u test /home
Disk quotas for user test (uid 627):
     Filesystem  kbytes  quota   limit  grace  files  quota  limit  grace
          /home  409620      0  512000             6      0      0
ddn_home-MDT0000_UUID        4      0  102400      6      0      0
ddn_home-OST0000_UUID  102404*     0  102400
ddn_home-OST0001_UUID  102404*     0  102400
ddn_home-OST0002_UUID  102404*     0  102400
ddn_home-OST0003_UUID  102404*     0  102400

As you can see, I could not reproduce your scenarios, but I think the source of my problem is similar to yours. I created a bugzilla ticket for my problem: https://bugzilla.lustre.org/show_bug.cgi?id=14619

cheers
Wojciech Turek

On 9 Jan 2008, at 13:51, Patrick Winnertz wrote:
> Okay, fine. Can you please put me on the cc of this ticket?
>
> Greetings
> Patrick
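For anyone who wants to repeat this, here is the same test as one sequence (a sketch: it assumes the test user and the /home Lustre mount from above, is run from a directory on /home, and the added wait only ensures the background dd processes finish before each quota check):

lfs setquota -u test 0 512000 0 0 /home                                    # 500 MB block hard limit, no inode limit
for i in 1 2 3 4 ; do dd if=/dev/zero of=dummy.$i bs=102400K count=1 & done ; wait
lfs quota -u test /home                                                    # ~400 MB used, spread over four OSTs
rm -rf dummy.*
lfs quota -u test /home                                                    # usage drops back to a few kbytes
for i in 1 2 3 4 ; do dd if=/dev/zero of=dummy.$i bs=102400K count=1 & done ; wait
lfs quota -u test /home                                                    # on this 1.6.3 setup the rewrite succeeded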