Goswin von Brederlow
2006-Nov-24 08:12 UTC
[Lustre-discuss] LustreError: lov_update_create_set()
Hi,
I'm running Lustre 1.4.6 on a 2.6.15.7 vanilla kernel and am trying to
decipher some lustre error messages. The system has 4 systems with 2
2TB OSTs each and default of 4 stripes for files. The lustre is 83%
full so there is over 1TB free space.
On the server I get:
[612824.958378] LustreError: 17495:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x6a79400 sub-object on OST idx 5/4: rc = -28
[612824.973243] LustreError: 17495:0:(lov_request.c:621:lov_update_create_set())
previously skipped 200 similar messages
[613607.743311] LustreError: 17510:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4cfef sub-object on OST idx 5/4: rc = -28
[613607.758653] LustreError: 17510:0:(lov_request.c:621:lov_update_create_set())
previously skipped 47 similar messages
[616495.778648] LustreError: 17503:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4d013 sub-object on OST idx 5/4: rc = -28
[616495.793510] LustreError: 17503:0:(lov_request.c:621:lov_update_create_set())
previously skipped 71 similar messages
[617068.096215] LustreError: 17509:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4d014 sub-object on OST idx 3/4: rc = -28
[617068.111559] LustreError: 17509:0:(lov_request.c:621:lov_update_create_set())
previously skipped 2 similar messages
[617271.813408] LustreError: 17485:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5ada805 sub-object on OST idx 3/4: rc = -28
[617271.828346] LustreError: 17485:0:(lov_request.c:621:lov_update_create_set())
previously skipped 139 similar messages
[617640.257686] LustreError: 17486:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4d05d sub-object on OST idx 3/4: rc = -28
[617640.272617] LustreError: 17486:0:(lov_request.c:621:lov_update_create_set())
previously skipped 7 similar messages
[618312.126431] LustreError: 17486:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4d15a sub-object on OST idx 5/4: rc = -28
[618312.141778] LustreError: 17486:0:(lov_request.c:621:lov_update_create_set())
previously skipped 504 similar messages
[618927.592338] LustreError: 17503:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4d1c2 sub-object on OST idx 3/4: rc = -28
[618927.607685] LustreError: 17503:0:(lov_request.c:621:lov_update_create_set())
previously skipped 211 similar messages
[619532.920758] LustreError: 17506:0:(lov_request.c:621:lov_update_create_set())
error creating fid 0x5d4d21d sub-object on OST idx 3/4: rc = -28
[619532.935787] LustreError: 17506:0:(lov_request.c:621:lov_update_create_set())
previously skipped 183 similar messages
And on the client:
[13038.954468] LustreError:
1737:0:(openiblnd_cb.c:1982:kibnal_active_conn_callback()) Connection
ffff81002fdf44c0 -> 172.17.3.253@openib IDLE
[13038.971282] Lustre: 9:0:(openiblnd_cb.c:1975:kibnal_active_conn_callback())
Connection ffff8100530af680 -> 172.17.3.253@openib ESTABLISHED
[13039.815435] Lustre: 9:0:(openiblnd_cb.c:1975:kibnal_active_conn_callback())
Connection ffff810054239680 -> 172.17.3.21@openib ESTABLISHED
[13850.263066] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810048271800 x611/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[13850.297240] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810048271800 x612/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[13887.466952] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810044331600 x626/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[14246.923972] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff81005a57cc00 x767/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[14246.952805] LustreError: 10462:0:(client.c:577:ptlrpc_check_status())
previously skipped 1 similar messages
[14391.012533] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810070ee0a00 x821/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[14571.014542] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810044331600 x882/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[14744.830493] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff81007ac52e00 x953/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[14784.793279] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810048271400 x971/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[18776.892294] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810014b9e200 x3036/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[18875.752660] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff81001f263000 x3090/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[19569.616318] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff81007b3fc000 x3376/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[19705.117998] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff81007ac52800 x3421/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[19850.093564] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff810060405c00 x3474/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[21025.911361] LustreError: 10462:0:(client.c:577:ptlrpc_check_status()) @@@
type == PTL_RPC_MSG_ERR, err == -5 req@ffff81000516a800 x3965/t0
o3->ost1-1_UUID@sn-03-1_UUID:28 lens 328/280 ref 2 fl Rpc:R/0/0 rc 0/-5
[23563.464898] LustreError: 15351:0:(lov_obd.c:1236:lov_punch()) error: punch
objid 0x6ad2af3 subobj 0x8b111c on OST idx 0: rc = -30
[24644.095006] LustreError: 15983:0:(lov_obd.c:1236:lov_punch()) error: punch
objid 0x6ad2afb subobj 0x8b1948 on OST idx 7: rc = -30
Any ideas what is going wrong, or tips on how to decipher what those errors mean?
MfG
Goswin
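
The server and client messages above end in the same pattern, "OST idx N: rc = E", so a first step in reading them is to count which OST index and return code appear together. The following is a minimal sketch, not part of the original thread; the function name `tally_ost_errors` and the regular expression are illustrative assumptions based only on the lines quoted here.

```python
import re
from collections import Counter

# Matches the tail shared by the lov_update_create_set() and lov_punch()
# messages, e.g. "on OST idx 5/4: rc = -28" or "on OST idx 0: rc = -30".
# The optional "/4" suffix is skipped, not interpreted.
OST_RC = re.compile(r"OST idx (\d+)(?:/\d+)?: rc = (-?\d+)")


def tally_ost_errors(lines):
    """Count (ost_index, rc) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = OST_RC.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts


# A few lines copied from the logs in this message.
sample = [
    "error creating fid 0x6a79400 sub-object on OST idx 5/4: rc = -28",
    "error creating fid 0x5d4d014 sub-object on OST idx 3/4: rc = -28",
    "error creating fid 0x5d4d15a sub-object on OST idx 5/4: rc = -28",
    "objid 0x6ad2af3 subobj 0x8b111c on OST idx 0: rc = -30",
]

for (ost, rc), seen in sorted(tally_ost_errors(sample).items()):
    print(f"OST idx {ost}: rc = {rc} seen {seen} time(s)")
```

Because the kernel log also prints "previously skipped N similar messages", such a tally shows which OSTs are affected rather than how often the error really occurred.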
Andreas Dilger
2006-Nov-24 12:00 UTC
[Lustre-discuss] LustreError: lov_update_create_set()
On Nov 24, 2006 16:12 +0100, Goswin von Brederlow wrote:
> I'm running Lustre 1.4.6 on a 2.6.15.7 vanilla kernel and am trying to
> decipher some lustre error messages. The system has 4 systems with 2
> 2TB OSTs each and default of 4 stripes for files. The lustre is 83%
> full so there is over 1TB free space.

Try "lfs df" on a client (if this is in 1.4.6, not sure), or alternatively
"grep '[0-9]' /proc/fs/lustre/osc/*/kbytes*" to see free space per OST.

Also check "lfs df -i" or "grep '[0-9]' /proc/fs/lustre/osc/*/files*" to
see free inodes per OST.

> On the server I get:
>
> [612824.958378] LustreError: 17495:0:(lov_request.c:621:lov_update_create_set()) error creating fid 0x6a79400 sub-object on OST idx 5/4: rc = -28

/usr/include/asm/errno.h says -28 is "No space left on device", and the
message reports that OST idx 5 is the one out of space.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
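
The errno.h lookup described in this reply can also be done from Python's standard library. This is a minimal sketch, assuming the rc values in the logs are negated Linux errno numbers, as the reply indicates for -28; the helper name `describe_rc` is hypothetical.

```python
import errno
import os


def describe_rc(rc):
    """Map a negative kernel-style return code to (symbolic name, message)."""
    code = abs(rc)
    # errno.errorcode maps numbers to names such as "ENOSPC"; the mapping
    # is that of the host running this script, assumed here to be Linux.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The three return codes that appear in the logs of this thread.
for rc in (-28, -5, -30):
    name, message = describe_rc(rc)
    print(f"rc = {rc}: {name} ({message})")
```

On a Linux host the three codes resolve to ENOSPC, EIO, and EROFS respectively. The message text comes from the local C library and may vary by system and locale.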
Goswin von Brederlow
2006-Nov-27 05:43 UTC
[Lustre-discuss] LustreError: lov_update_create_set()
Andreas Dilger <adilger@clusterfs.com> writes:
> On Nov 24, 2006 16:12 +0100, Goswin von Brederlow wrote:
>> I'm running Lustre 1.4.6 on a 2.6.15.7 vanilla kernel and am trying to
>> decipher some lustre error messages. The system has 4 systems with 2
>> 2TB OSTs each and default of 4 stripes for files. The lustre is 83%
>> full so there is over 1TB free space.
>
> Try on a client "lfs df" (if this is in 1.4.6, not sure), or alternately
> "grep '[0-9]' /proc/fs/lustre/osc/*/kbytes*" to see free space per OST.
>
> Also check "lfs df -i" or "grep '[0-9]' /proc/fs/lustre/osc/*/files*" to
> see free inodes per OST.

There is sufficient space now. The OSTs are slightly different in size,
but none has less than 70G free now. Plenty of inodes too.

Our guess, after your info, is that at the time of the error a big job
must have been using up all the space, and that upon failing it cleaned
up, freeing the space again.

>> On the server I get:
>>
>> [612824.958378] LustreError: 17495:0:(lov_request.c:621:lov_update_create_set()) error creating fid 0x6a79400 sub-object on OST idx 5/4: rc = -28
>
> /usr/include/asm/errno.h says -28 is "No space left on device" and the
> message reports OST idx 5 is the one out of space.

Thanks. So the error numbers do correlate. I wasn't sure about that,
given the amount of free space left in total.

MfG
Goswin
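
The point of the exchange is that total free space can look healthy while a single OST is full, and an object that must be created on that OST then fails with -28. The sketch below illustrates this with made-up numbers; the OST names, the sizes, and the function `summarize_free_space` are hypothetical and not taken from the thread.

```python
def summarize_free_space(free_kb_by_ost, min_free_kb):
    """Return (total free KB, sorted names of OSTs below min_free_kb)."""
    total_free = sum(free_kb_by_ost.values())
    nearly_full = sorted(
        name for name, free_kb in free_kb_by_ost.items() if free_kb < min_free_kb
    )
    return total_free, nearly_full


GB = 1024 * 1024  # kilobytes per gigabyte

# Hypothetical per-OST free space for 8 OSTs: more than 1 TB free in
# total, yet two OSTs have little or nothing left.
example = {
    "ost0": 300 * GB, "ost1": 250 * GB, "ost2": 200 * GB, "ost3": 1 * GB,
    "ost4": 150 * GB, "ost5": 0, "ost6": 100 * GB, "ost7": 100 * GB,
}

total, low = summarize_free_space(example, min_free_kb=10 * GB)
print(f"total free: {total // GB} GB, OSTs below threshold: {low}")
```

In practice the per-OST values would come from "lfs df" or the /proc files mentioned above; the dictionary here only stands in for that output.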