Norman P. B. Joseph
2010-Nov-23 19:04 UTC
[Ocfs2-users] Understanding debugfs.ocfs2 output
This is related to the "No space on OCFS2 volume" error discussed here this past Sep/Oct. Our Oracle support rep pointed us to Metalink note #1232702.1 and suggested we should script something up to periodically check the free contiguous blocks in the group chains for the volume in question.

Reading the note, I get how to get Clusters per Group X Bits per Cluster from the "stat //extent_alloc:NNNN". What I'm confused about is parsing the "stat //global_bitmap" section, and I thought I might have a better chance of getting an explanation from the audience here.

I have 2 basic questions:

1) The note above says that a "highly fragmented volume will have many" Group Chain entries in the //global_bitmap section, but gives no guidance as to what constitutes "many". I see ~240 in the 56 GB OCFS2 partition in question. Is that Low? High? Just Right?

2) The note also says to check the "Contig" values in the long list of Group Chains that follows in the //global_bitmap section. Specifically, "If none are higher than the sum[sic] of "Clusters per Group * Bits per Cluster" the metadata extent file cannot be expanded..."
Here is a sample output below:

    Group Chain: 8   Parent Inode: 11  Generation: 1861766630
    CRC32: 00000000   ECC: 0000
    ##   Block#     Total    Used     Free     Contig   Size
    0    258048     32256    1026     31230    28159    4032
    1    8096256    32256    1        32255    32255    4032

    Group Chain: 9   Parent Inode: 11  Generation: 1861766630
    CRC32: 00000000   ECC: 0000
    ##   Block#     Total    Used     Free     Contig   Size
    0    290304     32256    30721    1535     1535     4032
    1    8128512    32256    1        32255    32255    4032

    Group Chain: 10   Parent Inode: 11  Generation: 1861766630
    CRC32: 00000000   ECC: 0000
    ##   Block#     Total    Used     Free     Contig   Size
    0    322560     32256    31745    511      511      4032
    1    8160768    32256    1        32255    32255    4032

    Group Chain: 11   Parent Inode: 11  Generation: 1861766630
    CRC32: 00000000   ECC: 0000
    ##   Block#     Total    Used     Free     Contig   Size
    0    354816     32256    30721    1535     1535     4032
    1    8193024    32256    1        32255    32255    4032

Should I be considering -all- the Contig values from -all- the Group Chains listed when looking for issues, or should I be considering each chain individually? IOW, is it only a problem when -all- the Contig values from -all- the chains are below the clusters X bits value, or is it only a problem when all of the Contig values for a -single- chain are below the cluster X bits value?

Any pointers on valid interpretation of the output is appreciated.

-Norm

--
Norman Joseph, Senior System Engineer          joseph at ctc.com
Concurrent Technologies Corporation            814.269.2633
Information Systems Management Office (ISMO)
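For anyone wanting to script the periodic check the support rep suggested, here is a minimal sketch in Python. It assumes the "stat //global_bitmap" text has already been captured into a string (for example by running debugfs.ocfs2 by hand and saving the output) and that the group rows look like the sample above: seven whitespace-separated integers under each "Group Chain:" header. The function names and the THRESHOLD value are hypothetical; the real threshold would come from the "stat //extent_alloc:NNNN" output, which is not shown in this thread. Because the thread leaves open whether the Contig values should be judged per chain or across all chains, the sketch reports both views rather than picking one.

```python
import re
from collections import defaultdict

# Sample text copied from the message above (chains 8 through 11).
SAMPLE = """
Group Chain: 8   Parent Inode: 11  Generation: 1861766630
CRC32: 00000000   ECC: 0000
##   Block#     Total    Used     Free     Contig   Size
0    258048     32256    1026     31230    28159    4032
1    8096256    32256    1        32255    32255    4032
Group Chain: 9   Parent Inode: 11  Generation: 1861766630
CRC32: 00000000   ECC: 0000
##   Block#     Total    Used     Free     Contig   Size
0    290304     32256    30721    1535     1535     4032
1    8128512    32256    1        32255    32255    4032
Group Chain: 10   Parent Inode: 11  Generation: 1861766630
CRC32: 00000000   ECC: 0000
##   Block#     Total    Used     Free     Contig   Size
0    322560     32256    31745    511      511      4032
1    8160768    32256    1        32255    32255    4032
Group Chain: 11   Parent Inode: 11  Generation: 1861766630
CRC32: 00000000   ECC: 0000
##   Block#     Total    Used     Free     Contig   Size
0    354816     32256    30721    1535     1535     4032
1    8193024    32256    1        32255    32255    4032
"""

COLUMNS = ("index", "block", "total", "used", "free", "contig", "size")


def parse_group_chains(text):
    """Map each chain number to the list of its group rows (as dicts)."""
    chains = defaultdict(list)
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r"Group Chain:\s*(\d+)", line)
        if header:
            current = int(header.group(1))
            continue
        fields = line.split()
        # A group row is exactly seven integers; header lines fail this test.
        if current is not None and len(fields) == 7 and all(f.isdigit() for f in fields):
            chains[current].append(dict(zip(COLUMNS, map(int, fields))))
    return dict(chains)


def contig_summary(chains):
    """Return (largest Contig per chain, largest Contig over all chains)."""
    per_chain = {c: max(row["contig"] for row in rows) for c, rows in chains.items()}
    return per_chain, max(per_chain.values())


def chains_below(chains, threshold):
    """List chains in which no group has a Contig value of at least threshold."""
    per_chain, _ = contig_summary(chains)
    return sorted(c for c, best in per_chain.items() if best < threshold)


if __name__ == "__main__":
    THRESHOLD = 2048  # hypothetical placeholder, not a value from the note
    chains = parse_group_chains(SAMPLE)
    per_chain, overall = contig_summary(chains)
    print("largest Contig per chain:", per_chain)
    print("largest Contig overall:", overall)
    print("chains entirely below threshold:", chains_below(chains, THRESHOLD))
```

Run against the sample, every chain has at least one nearly empty group, so neither reading of the note would flag this excerpt. Only four of the ~240 chains are shown here, so that says nothing about the rest of the volume.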
The length of allocator chains in the global bitmap depends on the size of the volume and block/cluster sizes. It is created during format and only grows if the volume is grown. That's it.

On 11/23/2010 11:04 AM, Norman P. B. Joseph wrote:
> This is related to the "No space on OCFS2 volume" error discussed here
> this past Sep/Oct. Our Oracle support rep pointed us to Metalink note
> #1232702.1 and suggested we should script something up to periodically
> check the free contiguous blocks in the group chains for the volume in
> question.
>
> Reading the note, I get how to get Clusters per Group X Bits per Cluster
> from the "stat //extent_alloc:NNNN". What I'm confused about is parsing
> the "stat //global_bitmap" section, and I thought I might have a better
> chance of getting an explanation from the audience here.
>
> I have 2 basic questions:
>
> 1) The note above says that a "highly fragmented volume will have many"
> Group Chain entries in the //global_bitmap section, but gives no
> guidance as to what constitutes "many". I see ~240 in the 56 GB OCFS2
> partition in question. Is that Low? High? Just Right?
>
> 2) The note also says to check the "Contig" values in the long list of
> Group Chains that follows in the //global_bitmap section. Specifically,
> "If none are higher than the sum[sic] of "Clusters per Group * Bits per
> Cluster" the metadata extent file cannot be expanded..." Here is a
> sample output below:
>
> Group Chain: 8   Parent Inode: 11  Generation: 1861766630
> CRC32: 00000000   ECC: 0000
> ##   Block#     Total    Used     Free     Contig   Size
> 0    258048     32256    1026     31230    28159    4032
> 1    8096256    32256    1        32255    32255    4032
>
> Group Chain: 9   Parent Inode: 11  Generation: 1861766630
> CRC32: 00000000   ECC: 0000
> ##   Block#     Total    Used     Free     Contig   Size
> 0    290304     32256    30721    1535     1535     4032
> 1    8128512    32256    1        32255    32255    4032
>
> Group Chain: 10   Parent Inode: 11  Generation: 1861766630
> CRC32: 00000000   ECC: 0000
> ##   Block#     Total    Used     Free     Contig   Size
> 0    322560     32256    31745    511      511      4032
> 1    8160768    32256    1        32255    32255    4032
>
> Group Chain: 11   Parent Inode: 11  Generation: 1861766630
> CRC32: 00000000   ECC: 0000
> ##   Block#     Total    Used     Free     Contig   Size
> 0    354816     32256    30721    1535     1535     4032
> 1    8193024    32256    1        32255    32255    4032
>
> Should I be considering -all- the Contig values from -all- the Group
> Chains listed when looking for issues, or should I be considering each
> chain individually? IOW, is it only a problem when -all- the Contig
> values from -all- the chains are below the clusters X bits value, or is
> it only a problem when all of the Contig values for a -single- chain are
> below the cluster X bits value?
>
> Any pointers on valid interpretation of the output is appreciated.
>
> -Norm
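The reply's point that chain length follows from volume size and cluster size can be checked with some rough arithmetic. The sketch below is an illustration only and rests on several assumptions not stated in the thread: a 4 KiB cluster size, "56 GB" read as 56 GiB, the Total column (32256) taken as the number of clusters tracked by each group, and exactly 240 chains standing in for the "~240" reported. The function name is hypothetical.

```python
import math

CLUSTER_SIZE = 4096          # bytes; assumed, the thread does not state it
VOLUME_BYTES = 56 * 2**30    # "56 GB" read as 56 GiB; assumed
CLUSTERS_PER_GROUP = 32256   # "Total" column in the sample output
CHAIN_COUNT = 240            # stands in for the "~240" in the question


def groups_needed(volume_bytes, cluster_size, clusters_per_group):
    """Number of allocation groups needed to cover every cluster on the volume."""
    clusters = volume_bytes // cluster_size
    return math.ceil(clusters / clusters_per_group)


if __name__ == "__main__":
    groups = groups_needed(VOLUME_BYTES, CLUSTER_SIZE, CLUSTERS_PER_GROUP)
    print("groups on the volume:", groups)
    print("average groups per chain:", round(groups / CHAIN_COUNT, 2))
```

Under those assumptions the volume needs a few hundred groups, or roughly two per chain, which is in line with the two rows listed under each Group Chain in the sample. With a different cluster size the numbers change accordingly.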