> Define the file format at the byte level and then make sure C structures
> can map it accurately on known machines. Suitable testing during
> configuration will ascertain whether the library can be compiled
> conformantly on new machines.

Correct me if I'm wrong, but the whole idea of the cache is to save it locally, so the endianness should not matter, as you won't exchange your cache between different systems.

Kris
On Wednesday 07 May 2003 06:28 pm, Keith Packard wrote:
> Around 11 o'clock on May 7, Krzysztof Dabrowski wrote:
>
> > FcNameParse who is responsible for 22.84% of total execution time..
>
> I'm interested in getting a replacement file structure for the current
> fonts.cache-1 files; those are now text files and parsing that is just too
> slow.
>
> Because the files are versioned, we really can completely redesign the
> format.

I'm thinking about something more radical. I thought about serialising this struct to disk:

    struct _FcConfig {
        FcStrSet    *configDirs;    /* directories to scan for fonts */
        FcChar8     *cache;         /* name of per-user cache file */
        FcBlanks    *blanks;
        FcStrSet    *fontDirs;
        FcStrSet    *configFiles;   /* config files loaded */
        FcSubst     *substPattern;  /* substitutions for patterns */
        FcSubst     *substFont;     /* substitutions for fonts */
        int         maxObjects;     /* maximum number of tests in all substs */
        FcFontSet   *fonts[FcSetApplication + 1];
        time_t      rescanTime;     /* last time information was scanned */
        int         rescanInterval; /* interval between scans */
    };

I'm more into Java programming these days and my C is a little rusty, so the question is: is there any automatic way to serialise a struct in C (or does a helper library exist)? If not, it should still be possible to dump it wholesale somehow.

And since font configuration rarely changes, it could be loaded only once, then saved to disk for other applications to use, with a reasonable TTL (a day/week/whatever).

What do you think about it? If you could suggest the right way of serialising it to disk, I could even try to program it in my spare time, if I can manage to find any.

Kris
OK, here comes my idea (the format will be structure-independent and ultra fast to load). Saving the cache to disk will require a significant amount of work, but loading and re-using it should be blazing fast.

The whole idea is as follows.

SAVE:
a) compute the total memory consumption of a given structure, including all its members, the members' members, etc.
b) allocate a single block of memory of the desired length.
c) copy the structure into this block, changing it to occupy the linear space without gaps.
d) scan the structure for ALL POINTERS, convert them to be relative to the block's beginning, and output each pointer's location to a list.
e) save the area, together with the list of pointer addresses, to a binary file.

LOAD:
a) read the cache file header.
b) allocate the structure area with one single malloc.
c) read the structure into this area.
d) read the pointers' addresses from the file.
e) perform address fixing (convert the pointers from relative to absolute).
f) and here we have the structure as it was before.

The SAVE part looks a bit complicated, but points a, b and c could be made transparent by using a malloc replacement that works in a static pre-allocated buffer (there is something called amalloc on some Unixes).

I also think that if the first element of the structure is the structure's own address, and we make sure that the main structure is the first thing allocated in the buffer (so structure address == buffer address), then we could even skip d) and e): a loader that knows the structure format would be able to recursively traverse it and "fix" all the pointers. This would slow loading a bit, though.

About versioning: something like this cache can, in my opinion, really be library-version dependent. If the cache is from an older version, re-creating it takes just a second or two.

Now I'm waiting for your input. If somebody knows how to do point d) of SAVE easily, then I think the whole thing can make sense.

Kris
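[Editorial sketch] SAVE step d) and LOAD step e) above can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not fontconfig code: `Node`, `pointers_to_offsets`, `offsets_to_pointers` and `load_block` are hypothetical names, the list of pointer locations is assumed to be already known, and a data pointer is assumed to have the same size as `uintptr_t`. Offsets are stored as offset + 1 so that 0 can stand for NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cached structure; not a real Fc type. */
typedef struct Node {
    char        *name;
    struct Node *next;
} Node;

/* Assumption: a data pointer fits exactly in a uintptr_t. */
static_assert(sizeof(uintptr_t) == sizeof(void *), "pointer size mismatch");

/* SAVE step d: rewrite every pointer slot listed in ptr_locs as
 * (offset from block start) + 1; the value 0 is reserved for NULL. */
static void pointers_to_offsets(unsigned char *block, size_t size,
                                const size_t *ptr_locs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        void *p;
        uintptr_t off = 0;

        memcpy(&p, block + ptr_locs[i], sizeof p);
        if (p != NULL) {
            off = (uintptr_t)p - (uintptr_t)block;
            assert(off < size);      /* pointer must target the block */
            off += 1;
        }
        memcpy(block + ptr_locs[i], &off, sizeof off);
    }
}

/* LOAD step e: turn the stored offsets back into absolute pointers. */
static void offsets_to_pointers(unsigned char *block,
                                const size_t *ptr_locs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uintptr_t off;
        void *p = NULL;

        memcpy(&off, block + ptr_locs[i], sizeof off);
        if (off != 0)
            p = block + (off - 1);
        memcpy(block + ptr_locs[i], &p, sizeof p);
    }
}

/* LOAD steps b, c, e: one malloc, one copy, then the fix-up pass.
 * `saved` stands for the bytes read from the cache file. */
static void *load_block(const unsigned char *saved, size_t size,
                        const size_t *ptr_locs, size_t n)
{
    unsigned char *block = malloc(size);

    if (block == NULL)
        return NULL;
    memcpy(block, saved, size);
    offsets_to_pointers(block, ptr_locs, n);
    return block;
}
```

The fix-up cost is proportional to the number of pointers, not to the size of the data, which is what would make loading fast. Collecting `ptr_locs` in the first place still needs knowledge of every structure layout, which is the open question raised in the message.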
Around 11 o'clock on May 8, Owen Taylor wrote:

> But there are various difficulties; for instance the format of
> this memmapped would have either be independent of endianess and
> alignment, or saved per-machine.

A brute-force approach would be to define structures to pad appropriately (as the X headers do) and then permit entries for both byte orders.

Define the file format at the byte level and then make sure C structures can map it accurately on known machines. Suitable testing during configuration will ascertain whether the library can be compiled conformantly on new machines.

One possible problem here is that FcFontSort currently returns only references to the font patterns. I think the FcPattern structure will need some rework to make sure this is still possible. Thank goodness for opaque data structures.

> (BTW - I seem to recall some significant speedups that could be made
> in the parsing code when I looked at it.)

There's a patch in fontconfig bugzilla which has some useful optimizations, but I felt it would be better to have a wholesale redesign of the file format to wring out as much improvement as possible.

-keith
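[Editorial sketch] The "pad appropriately and permit both byte orders" idea could look roughly like this. Everything here is hypothetical: `CacheHeader`, `CACHE_MAGIC`, `swap32`, `cache_byte_order` and `cache_entry_count` are illustrative names, not part of any fontconfig file format. The static assertions play the role of the configure-time test: compilation fails on a machine where the structure does not match the byte-level layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_MAGIC 0x46434301u   /* hypothetical magic number */

/* Hypothetical header with explicit padding, in the style of X headers. */
typedef struct {
    uint32_t magic;     /* written in the writer's native byte order */
    uint16_t version;
    uint16_t pad;       /* explicit padding keeps `count` 4-byte aligned */
    uint32_t count;     /* number of entries that follow */
} CacheHeader;

/* Build-time check that this compiler lays the struct out byte for byte. */
static_assert(sizeof(CacheHeader) == 12, "unexpected padding in CacheHeader");
static_assert(offsetof(CacheHeader, version) == 4, "version misplaced");
static_assert(offsetof(CacheHeader, count) == 8, "count misplaced");

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Returns 0 if the file matches the reader's byte order, 1 if it was
 * written in the opposite order, -1 if the magic number is unknown. */
static int cache_byte_order(const unsigned char *bytes)
{
    uint32_t magic;

    memcpy(&magic, bytes, sizeof magic);
    if (magic == CACHE_MAGIC)
        return 0;
    if (swap32(magic) == CACHE_MAGIC)
        return 1;
    return -1;
}

/* Read the entry count, swapping it when the file is in the other order. */
static uint32_t cache_entry_count(const unsigned char *bytes, int swapped)
{
    uint32_t count;

    memcpy(&count, bytes + offsetof(CacheHeader, count), sizeof count);
    return swapped ? swap32(count) : count;
}
```

A reader would call `cache_byte_order` once on the first bytes of the file and then either use the entries written for its own byte order or swap fields on access.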
Around 11 o'clock on May 7, Krzysztof Dabrowski wrote:

> FcNameParse who is responsible for 22.84% of total execution time..

I'm interested in getting a replacement file structure for the current fonts.cache-1 files; those are now text files and parsing that is just too slow.

Because the files are versioned, we really can completely redesign the format.

-keith
> And since font configuration rarely changes, then it could be loaded only
> one, then saved to disk for other applications to use and have a reasonable
> TTL (a day/week/whatever).

Or the whole configuration could be put into shared memory and then used directly by other instances of the library. Correct me if I'm wrong, but this seems to be the fastest option.

Kris
On Thu, 2003-05-08 at 08:23, Krzysztof Dabrowski wrote:
> > And since font configuration rarely changes, then it could be loaded only
> > one, then saved to disk for other applications to use and have a reasonable
> > TTL (a day/week/whatever).
>
> Or the whole configuration could be put into shared memory and then used
> directly by other instances of the library. Correct me, but this seems to be
> the fastest option...

Some sort of shared-memory scheme could certainly also help cut down the memory usage of fontconfig, which, while it is much better than it used to be, is still far from tiny.

IMO, probably better than using SysV shared memory (which would require a daemon) is to write a file to disk that can be mem-mapped in a shared fashion by the various processes.

But there are various difficulties; for instance, the format of this mem-mapped file would have to either be independent of endianness and alignment, or be saved per-machine.

An intermediate possibility might be to try to keep things like strings and coverage tables in a shared segment, but create real FcPattern structures in non-shared memory that point to the shared segment.

Regards,
Owen

(BTW - I seem to recall some significant speedups that could be made in the parsing code when I looked at it.)
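[Editorial sketch] Mapping a cache file "in a shared fashion" can be sketched with the POSIX `mmap` call as below. This is a minimal illustration, not fontconfig code: `map_cache_file` and `unmap_cache_file` are hypothetical helper names, and the file is assumed to be non-empty and read-only for all processes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only and shared. Every process that maps the same
 * file this way shares the same physical pages through the page cache.
 * Returns NULL on any failure, including an empty file. */
static const unsigned char *map_cache_file(const char *path, size_t *size_out)
{
    struct stat st;
    void *addr;
    int fd;

    assert(path != NULL && size_out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED)
        return NULL;
    *size_out = (size_t)st.st_size;
    return addr;
}

/* Release a mapping obtained from map_cache_file(). */
static void unmap_cache_file(const unsigned char *addr, size_t size)
{
    if (addr != NULL)
        munmap((void *)addr, size);
}
```

Because each process may see the file at a different address, data inside the mapping cannot hold absolute pointers; it would have to use offsets, which is one reason the format needs care.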
> Some sort of shared-memory scheme could certainly be helpful in
> also cutting down the memory usage of fontconfig, which, while it
> is much better than it used to be, is still far from tiny.

> IMO, probably better than using SysV shared memory (which would
> require a daemon) is to write a file to disk that can be mem-mapped
> in a shared fashion by the various processes.

After such a mapping you would have to "parse" it somehow anyway, otherwise you will have really slow access to the whole thing in memory.

I think that serialisation to a binary format on disk (something resembling the IFF format would suffice) would be OK, since this can be loaded and converted to real structures/lists relatively quickly.

But such serialisation would be really dependent on the structure format - any change in the structure would require a new serialiser.

> An intermediate possibility might be to try to keep things like strings
> and coverage tables in a shared segment, but create real FcPattern
> structures in non-shared memory that point to the shared segment.

> (BTW - I seem to recall some significant speedups that could be made
> in the parsing code when I looked at it.)

This could be beneficial, but it still won't solve the problem that every KDE/Gnome/Xft2 application started on your system will have to re-parse everything again, and this is a waste of time.

I'm really not against the daemon thing. It could be really transparent: if you have it up and running, fontconfig could obtain its data from it. If it's not up and running, it can fall back to the old behaviour. 30% of an app's startup time is worth it in my opinion.

best regards,
Kris
Around 17 o'clock on May 8, Krzysztof Dabrowski wrote:

> After such mapping you would have to "parse" it somehow anyway otherwise you
> will have realy slow access to the whole thing in the memory.

A carefully designed format should be interpretable by the library with minimal parsing -- reformatting the data would increase memory usage, which obviates a lot of the utility of a mapped file.

> But such serialisation would be realy dependant on the structure format - any
> change in the structure would require a new serialiser.

That's a significant issue. We have versioned cache file names, but it would be best if the format could remain compatible across several versions of the library.

> I'm realy not against the daemon thing. It could be realy transparent - if
> you have it up and running - fontconfig could obtain it's data from it. If
> it's not up and running it can fall back to the old behaviour. 30% of app's
> startup time is worth it in my opinion.

A solution which doesn't depend on shared memory among processes will reduce security and stability issues. Let's see how far we can get without that.

-keith
Hello,

I've been profiling some KDE applications recently (namely KMail, Konqueror and Konsole) using valgrind. I've noticed that a significant amount of startup time is devoted to Fontconfig-related activities.

For example, for Konsole (I started it and quit immediately to measure just the startup time), a call to FcInit consumes 33.49% of the total execution time.

FcInit calls 5 different functions, but 4 of them are negligible and only one stands out: FcLoadConfigAndFonts. It takes 33.49% (i.e. it does most of the job in FcInit).

So I dug further. Again, one call inside FcLoadConfigAndFonts consumes most of the time: it's FcConfigBuildFonts, and inside of it FcDirScan takes 32.78% of total time.

And we are close to the end now: in FcDirScan, the function FcDirCacheReadDir takes 32.78% of total time (total time means total application execution time).

Inside this function there is a lot of action, but unfortunately I'm missing debugging symbols, so I cannot see all the function names. The first function, however, mostly calls FcNameParse, which is responsible for 22.84% of total execution time.

If you are interested, I can send you my valgrind output files so you can all see it for yourselves.

Something is wrong with fontconfig in the sense that it still takes too long just to load the config at startup, despite the cache.

Can't we just dump the serialized binary config structures from memory to disk, and then only compare file dates and do the parsing only when necessary?

Or maybe have a daemon holding the configuration in memory and serving it to interested applications?

What do you think about it, guys? 1/3 of the startup time of a KDE application is certainly worth the effort (and all GTK2 apps will also benefit).

Congratulations on a great project, by the way,
Krzysztof Dabrowski
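[Editorial sketch] The "compare file dates and parse only when necessary" check could be as small as the following, assuming POSIX `stat`. The function names are hypothetical and the rule (cache at least as new as the font directory) is an assumption for illustration, not a description of what fontconfig does. Note that `st_mtime` has one-second resolution, so a change made within the same second as the cache write would go unnoticed.

```c
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* A cache counts as current if it is at least as new as the directory. */
static int cache_is_current(time_t cache_mtime, time_t dir_mtime)
{
    return cache_mtime >= dir_mtime;
}

/* Returns 1 if cache_path exists and is at least as new as font_dir,
 * 0 if either cannot be examined or the cache is older. */
static int cache_file_is_current(const char *cache_path, const char *font_dir)
{
    struct stat cache_st, dir_st;

    assert(cache_path != NULL && font_dir != NULL);

    if (stat(cache_path, &cache_st) != 0 || stat(font_dir, &dir_st) != 0)
        return 0;
    return cache_is_current(cache_st.st_mtime, dir_st.st_mtime);
}
```

Only when this check returns 0 would the library need to rescan the directory and rewrite the cache; otherwise the binary cache could be loaded directly.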