Barry Bouwsma
2004-Sep-06 10:08 UTC
[Vorbis-dev] Fixing libvorbisfile to handle largefiles
[I'm not online regularly, so don't include me in any replies. Looks like the archives are working at present, so I'll catch up from them when I get a chance. Thanks...] (Trying out vorbis-dev@ instead of vorbis@ where I sent my previous messages; if this is the wrong place, correct me) Greetings. Some weeks ago, I submitted several hacks which gave me the ability to play Ogg Vorbis files with size larger than 2GB on a BSD platform where `long' is a 32-bit quantity. Now I've decided to look at this again, and see if I can patch things in a more logical way, and ask about possible problems that may be introduced. First of all, libvorbisfile. ov_callbacks contains four functions. The seek_func already uses a 64-bit type for the offset, so no change needs to be made there. However, when accessing large files, the tell_function will need to return a 64-bit type as well. This is going to change the API, isn't it? Anyway, here's the needed patch to accomplish this, based on source updated the weekend of 22.Aug: --- include/vorbis/vorbisfile.h-ORIG Sun Aug 22 20:49:53 2004 +++ include/vorbis/vorbisfile.h Fri Aug 27 00:39:56 2004 @@ -40,7 +40,7 @@ size_t (*read_func) (void *ptr, size_t size, size_t nmemb, void *datasource); int (*seek_func) (void *datasource, ogg_int64_t offset, int whence); int (*close_func) (void *datasource); - long (*tell_func) (void *datasource); + ogg_int64_t (*tell_func) (void *datasource); } ov_callbacks; #define NOTOPEN 0 I suspect there are a number of other files elsewhere in the vorbis source that I haven't searched for (like in win32) that need tweaking as a result of this. I've only looked in the files that get built on my FreeBSD system. That would be lib/vorbisfile.c. This includes a wrapper for seek_func which uses fseek() but alludes to the use of a 64-bit safe function where appropriate. I would say the same thing needs to be done for tell_func. Now, I need to ask about this _fseek64_wrap. Is it intended that this select the appropriate one of fseek64/fseeko/fseek/ whatever, based on what the OS supports? I'm assuming that it is, but that it isn't complete yet. Under this assumption, let me ask, how do you plan to select between the available 64-bit-safe functions for the particular platform? Would the source be littered with #ifdef (OS) to select the supported function, or would it be littered with #ifdef (HAVE_FSEEKfoo) as determined by `configure' in the future, or might `configure' define an FSEEK_FUNC to be used, or something different entirely? I've taken the easiest way (the first OS-specific one that I mentioned) in my hacks below. Anyway, here are some of my hacks to lib/vorbisfile.c in vorbis (these are just the ones needed for 64-bit tell/seek): --- lib/vorbisfile.c-ORIG Sun Aug 22 20:50:55 2004 +++ lib/vorbisfile.c Fri Aug 27 08:21:33 2004 @@ -84,6 +84,18 @@ } } +/* XXX HACK */ +/* save a tiny smidge of verbosity to make the code more readable */ +ogg_int64_t _tell_helper(OggVorbis_File *vf){ + if(vf->datasource){ + (vf->callbacks.tell_func)(vf->datasource); + ogg_sync_reset(&vf->oy); + }else{ + /* shouldn't happen unless someone writes a broken callback */ + return; + } +} + /* The read/seek functions track absolute position within the stream */ /* from the head of the stream, get the next page. boundary specifies Let me comment here: This is a duplicate of _seek_helper, and I'm not sure it's correct. _seek_helper is declared static (which I also changed for my convenience to match the above), and is not available to be used elsewhere. However, in the process of building `ogg123', I found that I needed to replace references to fseek()/ftell() with something comparable to the above -- otherwise I'd need to duplicate the below in order to use the proper function. It may be that the above should not be available to be used elsewhere, in which case the above should be `static' and I'll need to work on later hacks. Come to think of it, I think I wasn't able to use the above in an external program, but the below _fseek64_wrap insted, which is similarly changed not to be static -- just so that I don't have to duplicate the code elsewhere. Anyway, the above is probably unnecessary, since I didn't know what I was doing or why. @@ -625,9 +637,27 @@ /* if, eg, 64 bit stdio is configured by default, this will build with fseek64 */ -static int _fseek64_wrap(FILE *f,ogg_int64_t off,int whence){ +int _fseek64_wrap(FILE *f,ogg_int64_t off,int whence){ if(f==NULL)return(-1); + /* XXX HACK return fseek(f,off,whence); */ +/* There must be a better way to do this, no? */ +#if defined(__FreeBSD__) /* BSD */ + return fseeko(f,off,whence); +#else return fseek(f,off,whence); +#endif +} + +/* if, eg, 64 bit stdio is configured by default, this will build with + ftell64 */ +ogg_int64_t _ftell64_wrap(FILE *f){ + if(f==NULL)return(-1); +/* There must be a better way to do this, no? */ +#if defined(__FreeBSD__) /* BSD */ + return ftello(f); +#else + return ftell(f); +#endif } static int _ov_open1(void *f,OggVorbis_File *vf,char *initial, More comments: As per the code comment, I'm not at all sure how this is supposed to build with ftell64() or anything else. All the BSDen have had native 64-bit stdio but which use a slightly different name. I'm building for FreeBSD, but the above __FreeBSD__ check, if correct, could be extended to cover __NetBSD__ and __OpenBSD__ and __DragonFly__ and any others. However, as I commented earlier, I'm not sure if this is the way to do it, or via some `configure' magic, or something else entirely. Your input will help me re-do this part of the hack in a more acceptable way. And now the meat of the matter: @@ -738,7 +768,7 @@ (size_t (*)(void *, size_t, size_t, void *)) fread, (int (*)(void *, ogg_int64_t, int)) _fseek64_wrap, (int (*)(void *)) fclose, - (long (*)(void *)) ftell + (ogg_int64_t (*)(void *)) _ftell64_wrap }; return ov_open_callbacks((void *)f, vf, initial, ibytes, callbacks); @@ -788,7 +818,7 @@ (size_t (*)(void *, size_t, size_t, void *)) fread, (int (*)(void *, ogg_int64_t, int)) _fseek64_wrap, (int (*)(void *)) fclose, - (long (*)(void *)) ftell + (ogg_int64_t (*)(void *)) _ftell64_wrap }; return ov_test_callbacks((void *)f, vf, initial, ibytes, callbacks); The most important part of this hack from an API perspective; without this I stand no chance of playing a large size file on a 32-bit system. (The rest of the hack is another attempt to work around an overflow in bisect that I posted earlier, and which I'll post and possibly re-do sometime later.) Glancing for the first time ;-) at the doc directory, it also seems like ov_raw_seek() and seek_lap are possible candidates for a change from long to 64-bit as well, but I'm not sure how/where those are used. As the return from ov_raw_total() is already 64-bit, it seems one would want to change the seek functions to match for consistency. As noted, the doc, possibly examples, and maybe things elsewhere in vorbis/ will need to be changed to match, if I'm on the right track. I've only cared that I'm able to play large files as proof of concept. My biggest concern is the change to the API, and how that would affect the chances of these sorts of changes making it into the code. thanks barry bouwsma
On Sat, Sep 04, 2004 at 09:16:21PM +0200, Barry Bouwsma wrote:> This is going to change the API, isn't it?Yep, that's the fundamental problem, and the main reason that glaring oversight hasn't been corrected yet. ...but it should. Hm, OK, time to think about the plan... Monty