jeff.liu
2011-Apr-18 07:22 UTC
[zfs-discuss] SEEK_HOLE returns the whole sparse file size?
Hello List,

I am trying to fetch the data/hole layout of a sparse file through lseek(SEEK_HOLE/SEEK_DATA). fpathconf(..., _PC_MIN_HOLE_SIZE) returns a valid result, so I think this interface is supported on my test ZFS, but SEEK_HOLE always returns the total size of the sparse file instead of the expected start offset of the first hole.

The whole process is shown below. Could anyone give me any hints?

Create a sparse file ("sparse2") as below. IMHO, SEEK_HOLE should return *ZERO* and SEEK_DATA should return 40960:

    bash-3.00@ python -c "f=open('sparse2', 'w'); f.seek(40960); f.write('BYE'); f.close()"

A tiny program to examine the hole start offset of file "sparse2":

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <errno.h>

    int
    main(int argc, char *argv[])
    {
        int ret = 0, fd = -1;
        off_t hole_pos;
        char *filename = NULL;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s file\n", argv[0]);
            return 1;
        }

        filename = strdup(argv[1]);
        if (!filename) {
            perror("strdup");
            return -1;
        }

        fd = open(filename, O_RDONLY);
        if (fd < 0) {
            perror("open");
            ret = -1;
            goto out;
        }

        if (fpathconf(fd, _PC_MIN_HOLE_SIZE) < 0) {
            fprintf(stderr, "The underlying filesystem does not support SEEK_HOLE.\n");
            goto out;
        }

        /* Find the first hole at or after offset 0. */
        hole_pos = lseek(fd, (off_t)0, SEEK_HOLE);
        if (hole_pos < 0) {
            if (errno == EINVAL || errno == ENOTSUP) {
                fprintf(stderr, "SEEK_HOLE is not supported by the OS or filesystem.\n");
                goto out;
            }

            perror("lseek");
            ret = -1;
            goto out;
        }

        if (hole_pos == 0)
            fprintf(stderr, "Oh, no!! hole start at offset 0?\n");

        /* off_t may be wider than int, so cast explicitly for printing. */
        if (hole_pos > 0)
            fprintf(stderr, "detected a real hole at: %lld.\n", (long long)hole_pos);

    out:
        free(filename);
        if (fd >= 0)
            close(fd);

        return ret;
    }

My test env:
===========
bash-3.00# uname -a
SunOS unknown 5.10 Generic_142910-17 i86pc i386 i86pc

man zfs(1):
SunOS 5.10    Last change: 11 Jun 2010

bash-3.00# zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
...
...
rpool/export/home   120K   139G   120K  /export/home
...

"sparse2" is located at "/export/home":

bash-3.00# zdb -dddddd rpool/export/home
    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       104    1    16K  40.5K  40.5K  40.5K  100.00  ZFS plain file
                                        264   bonus  ZFS znode
    dnode flags: USED_BYTES USERUSED_ACCOUNTED
    dnode maxblkid: 0
    path    /sparse2
    uid     0
    gid     0
    atime   Mon Apr 18 14:50:46 2011
    mtime   Mon Apr 18 14:50:46 2011
    ctime   Mon Apr 18 14:50:46 2011
    crtime  Mon Apr 18 14:50:46 2011
    gen     497
    mode    100600
    size    40963
    parent  3
    links   1
    xattr   0
    rdev    0x0000000000000000
Indirect blocks:
    0 L0 0:1960ce000:a200 a200L/a200P F=1 B=497/497

    segment [0000000000000000, 000000000000a200) size 40.5K

bash-3.00# zdb -R rpool/export/home 0:1960ce000:a200
.....
009ff0:  0000000000000000  0000000000000000  ................
00a000:  0000000000455942  0000000000000000  BYE.............
00a010:  0000000000000000  0000000000000000  ................
....
...

Any comments are appreciated!

Thanks,
-Jeff
Victor Latushkin
2011-Apr-18 07:37 UTC
[zfs-discuss] SEEK_HOLE returns the whole sparse file size?
On Apr 18, 2011, at 11:22 AM, jeff.liu wrote:

> Create a sparse file ("sparse2") as below. IMHO, SEEK_HOLE should return *ZERO* and SEEK_DATA should
> return 40960:
> bash-3.00@ python -c "f=open('sparse2', 'w'); f.seek(40960); f.write('BYE'); f.close()"

With default settings you'll get a single-block file without any holes from the ZFS perspective.

Try a somewhat bigger sparse file, like this:

    dd if=/dev/urandom of=test.file count=1 bs=128k oseek=1024
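To see the effect Victor describes without a separate C build, the same check can be sketched in Python. This is a minimal sketch, assuming Python 3.3 or later on a platform where `os.SEEK_HOLE` and `os.SEEK_DATA` are available; the helper names `make_sparse` and `first_hole_and_data` are made up for illustration. What it prints depends on the filesystem: per the reply above, a file that fits in a single ZFS block is expected to report no hole before end of file, while the larger file should report a hole at offset 0.

```python
import os
import tempfile


def make_sparse(path, data_offset, payload=b"BYE"):
    """Write payload at data_offset, leaving everything before it unwritten."""
    with open(path, "wb") as f:
        f.seek(data_offset)
        f.write(payload)


def first_hole_and_data(path):
    """Return (first hole offset, first data offset) as reported by lseek.

    The data offset is None when no data exists at or after offset 0.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        hole = os.lseek(fd, 0, os.SEEK_HOLE)
        try:
            data = os.lseek(fd, 0, os.SEEK_DATA)
        except OSError:
            data = None  # typically ENXIO: nothing but a hole up to EOF
        return hole, data
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # 40960 is the offset from the original post; 1024 * 128 KiB matches
        # the dd command suggested in the reply.
        for offset in (40960, 1024 * 128 * 1024):
            path = os.path.join(d, "sparse.%d" % offset)
            make_sparse(path, offset)
            size = os.path.getsize(path)
            hole, data = first_hole_and_data(path)
            print("size=%d first_hole=%d first_data=%s" % (size, hole, data))
```

A first hole equal to the file size means the filesystem reported no hole inside the file, which is the result described in the original post.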
jeff.liu
2011-Apr-18 07:49 UTC
[zfs-discuss] SEEK_HOLE returns the whole sparse file size?
Victor Latushkin wrote:

> With default settings you'll get a single-block file without any holes from the ZFS perspective.
>
> Try a somewhat bigger sparse file, like this:
>
> dd if=/dev/urandom of=test.file count=1 bs=128k oseek=1024

Thanks for your quick response! It works for me now.

Regards,
-Jeff
Joerg Schilling
2011-Apr-18 10:03 UTC
[zfs-discuss] SEEK_HOLE returns the whole sparse file size?
"jeff.liu" <jeff.liu at oracle.com> wrote:

> I am trying to fetch the data/hole info of a sparse file through the lseek(SEEK_HOLE/SEEK_DATA)
> stuff, the result of fpathconf(..., _PC_MIN_HOLE_SIZE) is ok, so I think this interface is supported
> on my testing ZFS, but SEEK_HOLE always return the sparse file total size instead of the desired
> first hole start offset.

Maybe you did not create the file correctly.

If you would like to create a file of a specific size that contains only a single hole, there is a way to do this using star:

    mkfile <size> some-file              # create a file of the desired size
    star -c f=xx.tar -meta some-file     # archive the meta data only
    rm some-file                         # remove original file
    star -x -xmeta -force-hole f=xx.tar  # let star create an empty file of the
                                         # same size

This will try to create a file that has one hole but no data, provided the filesystem supports this.

For UFS, for example, such a file needs to be a multiple of 8k in size. This is because holes in UFS need to be aligned at 8k boundaries and need to be a multiple of 8k in size.

A recent star can be found in the schily source consolidation:

    ftp://ftp.berlios.de/pub/schily/

star is part of Schillix-ON (a free OpenSolaris fork).

Jörg

--
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       joerg.schilling at fokus.fraunhofer.de (work)
Blog:  http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
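The 8k rule above comes down to simple rounding. The following is a small sketch of that arithmetic only, not of how UFS or star actually decide where holes go; the function name `aligned_hole` and its default of 8192 bytes are assumptions chosen for illustration. Given a byte range that contains no data, it returns the largest sub-range that starts and ends on an alignment boundary.

```python
def aligned_hole(start, end, align=8192):
    """Return the largest (lo, hi) inside [start, end) whose ends both sit on
    align boundaries, or None if no whole aligned block fits."""
    lo = -(-start // align) * align  # round start up to the next boundary
    hi = (end // align) * align      # round end down to the previous boundary
    return (lo, hi) if hi > lo else None
```

For the 40960-byte gap from the original post, `aligned_hole(0, 40960)` covers the whole range, because 40960 is exactly five 8k blocks. A gap shorter than one aligned block yields `None`.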
Jeff liu
2011-Apr-18 13:38 UTC
[zfs-discuss] SEEK_HOLE returns the whole sparse file size?
On 2011-4-18, at 6:03 PM, Joerg Schilling wrote:

> Maybe you did not create the file correctly.
>
> If you like to create a file of a specific size that only contains one single
> hole, there is a way to do this using star:

Thanks for the info!

Actually, I'd like to create a sparse file with both data and holes to test a patch I am working on. It was written for Coreutils to optimize sparse file reading, especially for cp(1).

I had taken a look at the source code of star before, and at that time I was wondering about this comment in star/hole.c, sparse_file(fp, info), line 1240:

    if (pos >= info->f_size) { /* Definitely not sparse */

I am still not sure why the returned 'pos' could be larger than 'info->f_size' in some cases. I guess it should be equal to f_size if the target file is not sparse. Am I missing something here?

As for the sparse file I created via

    python -c "f=open('sparse', 'w'); f.seek(40960); f.write('BYE'); f.close()"

IMHO, that is the conventional way to create a sparse file, but the file is too small to span multiple blocks, so ZFS allocates it as a non-sparse single-block file, just as Victor said in his response.

Regards,
-Jeff
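The kind of sparse-aware reading discussed in this thread can be sketched as a loop that alternates SEEK_DATA and SEEK_HOLE and copies only the regions reported as data. This is a minimal sketch under stated assumptions, not the Coreutils patch and not star's logic: it assumes Python 3.3 or later with `os.SEEK_DATA` and `os.SEEK_HOLE` available, and the names `data_extents` and `sparse_copy` are hypothetical. On a file with no holes, SEEK_HOLE from offset 0 reports the end of the file, so the loop yields a single extent covering everything.

```python
import errno
import os


def data_extents(fd, size):
    """Yield (start, end) for each data region the filesystem reports."""
    pos = 0
    while pos < size:
        try:
            start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # only a hole remains up to EOF
                return
            raise
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:  # defensive: the file changed underneath us
            return
        yield start, end
        pos = end


def sparse_copy(src, dst, bufsize=1 << 16):
    """Copy src to dst, reading only the reported data extents."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        size = os.fstat(in_fd).st_size
        with open(dst, "wb") as out:
            for start, end in data_extents(in_fd, size):
                os.lseek(in_fd, start, os.SEEK_SET)
                out.seek(start)  # skipping ahead leaves a gap in dst
                remaining = end - start
                while remaining > 0:
                    chunk = os.read(in_fd, min(bufsize, remaining))
                    if not chunk:
                        break
                    out.write(chunk)
                    remaining -= len(chunk)
            out.truncate(size)  # extend dst to cover a trailing hole, if any
    finally:
        os.close(in_fd)
```

Whether the gaps left in the destination end up as real holes is up to the destination filesystem; the sketch only avoids reading and writing the regions the source reports as holes.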