On Saturday 26 March 2005 12:49, Owen Taylor wrote:> On Sat, 2005-03-26 at 10:10 +0100, Lars Knoll wrote: > > On Friday 25 March 2005 22:25, Owen Taylor wrote: > > > On Fri, 2005-03-25 at 01:53 -0500, Patrick Lam wrote: > > > > BTW, Owen''s suggested struct: > > > > struct FcPattern { > > > > int refcount; > > > > FcPatternStorage storage; /* dynamic, static */ > > > > > > > > union { > > > > FcPatternDynamic *dynamic; > > > > FcPatternStatic *static; > > > > } u; > > > > }; > > > > > > > > still seems to be a struct, and thus requiring mallocness and a > > > > pointer. > > > > > > The point of my suggestion is that we need a single *public* structure > > > for: > > > > > > - Patterns retrieved from the database of fonts on disk > > > - Patterns created on the fly through the FcPattern API > > > > > > But that public structure can be small, lightweight, and created on > > > the fly. > > > > Do you really? FcPattern is a completely opaque type. For patterns on the > > disk, you can just return a pointer to them. All that should be needed is > > one bitflag in the pattern structure itself marking it a begin readonly, > > and using the same data structure for dynamic patterns. > > > > This requires using the same endianness and packing for dynamic and > > static patterns and thus makes building up dynamic pattern a little more > > complex, but I don''t think the overhead would be noticable (and the code > > is needed for the cache building anyway). The advantage would the that > > retrieving of properties from the pattern could use the same code. > > My thought that it would be hard to create a structure that both worked > on disk and was suitable for efficient implementation operations like > FcPatternAdd/FcPatternDel. The fundamental problem I see is that > FcPatternAdd requires allocating more storage, but you can''t call > realloc on the FcPattern pointer.Hmmm... true. The need to allocate more space spoils the whole possibility of static and dynamic patterns using exactly the same structure. I still think that you could probably avoid the overhead of allocating anything for static patterns with a pattern as below: struct FcPattern { int Type; // Dynamic or Static }; struct FcPatternStatic { FcPattern p; FcPatternData data; }; struct FcPatternDynamic { FcPattern p; int refcount; FcPatternData *data; }; That way you have almost the same data structure for both types. The only difference would be the way to get a pointer to the pattern data.> I could be missing some tricky way to set things up, however. > > Changing things so that a static FcPattern was only one memory block > would be easier, though if you don''t reference count static patterns, > you may have trouble with font database changes,Won''t font database changes require mmaping the new cache file? In that case you probably need a global refcount for all patterns in the mmaped file, so you know when nothing references the old cache file anymore. Cheers, Lars
On Sat, 2005-03-26 at 10:10 +0100, Lars Knoll wrote:> On Friday 25 March 2005 22:25, Owen Taylor wrote: > > On Fri, 2005-03-25 at 01:53 -0500, Patrick Lam wrote: > > > BTW, Owen''s suggested struct: > > > struct FcPattern { > > > int refcount; > > > FcPatternStorage storage; /* dynamic, static */ > > > > > > union { > > > FcPatternDynamic *dynamic; > > > FcPatternStatic *static; > > > } u; > > > }; > > > > > > still seems to be a struct, and thus requiring mallocness and a pointer. > > > > The point of my suggestion is that we need a single *public* structure > > for: > > > > - Patterns retrieved from the database of fonts on disk > > - Patterns created on the fly through the FcPattern API > > > > But that public structure can be small, lightweight, and created on > > the fly. > > Do you really? FcPattern is a completely opaque type. For patterns on the > disk, you can just return a pointer to them. All that should be needed is one > bitflag in the pattern structure itself marking it a begin readonly, and > using the same data structure for dynamic patterns. > > This requires using the same endianness and packing for dynamic and static > patterns and thus makes building up dynamic pattern a little more complex, > but I don''t think the overhead would be noticable (and the code is needed for > the cache building anyway). The advantage would the that retrieving of > properties from the pattern could use the same code.My thought that it would be hard to create a structure that both worked on disk and was suitable for efficient implementation operations like FcPatternAdd/FcPatternDel. The fundamental problem I see is that FcPatternAdd requires allocating more storage, but you can''t call realloc on the FcPattern pointer. I could be missing some tricky way to set things up, however. Changing things so that a static FcPattern was only one memory block would be easier, though if you don''t reference count static patterns, you may have trouble with font database changes, Owen -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050326/860af19f/attachment.pgp
Around 6 o''clock on Mar 26, Owen Taylor wrote:> Changing things so that a static FcPattern was only one memory block > would be easier, though if you don''t reference count static patterns, > you may have trouble with font database changes,Static FcPatterns are already allocated completely differently than dynamic ones, they remain compatible at the data structure level. This allows them to share lots of pieces across patterns saving huge amounts of memory. -keith -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 228 bytes Desc: not available Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050326/5d9299a2/attachment.pgp
Around 13 o''clock on Mar 26, Lars Knoll wrote:> Won''t font database changes require mmaping the new cache file? In that case > you probably need a global refcount for all patterns in the mmaped file, so > you know when nothing references the old cache file anymore.Except for the listing API, this is pretty easy. I fear that applications will hold FcFontList results for lengthy periods though, essentially locking down old cache files forever. We need a better listing API in any case, to permit faster matching, so perhaps we can fix both problems at the same time. -keith -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 228 bytes Desc: not available Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050326/3a0cb72d/attachment.pgp
On Mon, 2005-03-21 at 01:01 -0500, Patrick Lam wrote:> Hi. > > I''ve been hacking on fontconfig a bit to make it possible to mmap the > FcInit and related data structures. There is some discussion at > > http://bugzilla.gnome.org/show_bug.cgi?id=169345 > > In short, I think that one would need to replace pointers with array > indices; to do otherwise seems to be asking for trouble. I''ve started a > patch that does this; my current patch is available at > > http://plam.csail.mit.edu/~plam/tmp/fontconfig-strings-to-indices.diff > > I''m also willing to replace FcPatternIndex with a void * datatype that > would be cast to ints when needed; this would avoid breaking current > APIs but is less aesthetically pleasing. > > Does this seem to be a promising approach, and is something like this > commitable when complete?The fontconfig caches live in the font directories, which may be shared between machines with different architectures. For that reason, even if you remove pointers, you can''t simply put a structure into the file... the endianess and structure packing may differ between machines. Rather, you need to a design a binary file format that can referenced by a FcPattern. The icon cache format that GTK+ uses is an example of this approach. http://lists.freedesktop.org/archives/xdg/2004-October/005140.html To deal with dynamically allocated patterns, you might have: struct FcPattern { int refcount; FcPatternStorage storage; /* dynamic, static */ union { FcPatternDynamic *dynamic; FcPatternStatic *static; } u; }; You might also want to think about the ability to put hash-table indices into the mmap''ed file ... fontconfig programs can spend a lot of time scanning through a large list of patterns linearly to find a matching family name. Regards, Owen -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050321/3f346af4/attachment.pgp
Owen Taylor wrote:> The fontconfig caches live in the font directories, which may be > shared between machines with different architectures. For that reason, > even if you remove pointers, you can''t simply put a structure into > the file... the endianess and structure packing may differ between > machines.I was thinking that it would be quicker to create a file on a per-user basis, but I suppose that users might still log on using different archs. In fact FcPatterns themselves don''t take up much room (using an FcPattern struct as you propose would probably double the amount of room taken). However, the FcValues contained within the FcPatternElts do take up a lot of room, I think. And if we put the FcConfig in an mmap''ed file, we can''t put any pointers to FcPatterns into the FcConfig, only indices into an array.> struct FcPattern { > int refcount; > FcPatternStorage storage; /* dynamic, static */ > > union { > FcPatternDynamic *dynamic; > FcPatternStatic *static; > } u; > }; > > You might also want to think about the ability to put hash-table indices > into the mmap''ed file ... fontconfig programs can spend > a lot of time scanning through a large list of patterns linearly > to find a matching family name.Seems like a good thing to work on after I have an initial version with the mmapable config. I think someone''s recently added some hashing, too. pat
Around 1 o''clock on Mar 21, Patrick Lam wrote:> In short, I think that one would need to replace pointers with array > indices; to do otherwise seems to be asking for trouble. I''ve started a > patch that does this; my current patch is available atYes, of course the structures need to be address independent. I was thinking that at the same time we''d also investigate changing the search algorithm and data structure to reduce the cost of searching and matching fonts. Right now, the cost of pattern matching scales linearaly with the number of fonts, when you have thousands of fonts, this isn''t acceptable. -keith -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 228 bytes Desc: not available Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050321/9a7f04be/attachment.pgp
On Tue, 2005-03-22 at 01:05 -0500, Patrick Lam wrote:> Owen Taylor wrote: > > The fontconfig caches live in the font directories, which may be > > shared between machines with different architectures. For that reason, > > even if you remove pointers, you can''t simply put a structure into > > the file... the endianess and structure packing may differ between > > machines. > > I was thinking that it would be quicker to create a file on a per-user > basis, but I suppose that users might still log on using different archs. > > In fact FcPatterns themselves don''t take up much room (using an > FcPattern struct as you propose would probably double the amount of room > taken). However, the FcValues contained within the FcPatternElts do > take up a lot of room, I think.I don''t think there is a useful distinction there :-). A FcPattern is more or less a simply a refcounted list of key/value pairs. The amount of memory we care about in the end is the amount of memory it takes to store that key value pair list, not the amount of memory that struct FcPattern consumes by itself. Regards, Owen -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050322/fc09ce60/attachment.pgp
Keith Packard wrote:> Around 1 o''clock on Mar 21, Patrick Lam wrote: > > >>In short, I think that one would need to replace pointers with array >>indices; to do otherwise seems to be asking for trouble. I''ve started a >>patch that does this; my current patch is available at > > > Yes, of course the structures need to be address independent. I was > thinking that at the same time we''d also investigate changing the search > algorithm and data structure to reduce the cost of searching and matching > fonts. > > Right now, the cost of pattern matching scales linearaly with the number of > fonts, when you have thousands of fonts, this isn''t acceptable.I agree. It seems that these things can be done independently, though, and since I have little domain knowledge of fontconfig, I could probably contribute most effectively to making things address-independent (which is progressing slowly, but surely, in my spare time -- I hope to have a patch ready in the near future.) These current changes don''t seem to yet require a change to the cache format; that would come later. BTW, Owen''s suggested struct: struct FcPattern { int refcount; FcPatternStorage storage; /* dynamic, static */ union { FcPatternDynamic *dynamic; FcPatternStatic *static; } u; }; still seems to be a struct, and thus requiring mallocness and a pointer. My current patch allocates all FcPatterns in an array, which is probably all right for FcPatterns, but not all right for FcValueLists. Is there some safe way to use such a struct without having to malloc it? I don''t think there is. pat
On Fri, 2005-03-25 at 01:53 -0500, Patrick Lam wrote:> BTW, Owen''s suggested struct: > struct FcPattern { > int refcount; > FcPatternStorage storage; /* dynamic, static */ > > union { > FcPatternDynamic *dynamic; > FcPatternStatic *static; > } u; > }; > > still seems to be a struct, and thus requiring mallocness and a pointer.The point of my suggestion is that we need a single *public* structure for: - Patterns retrieved from the database of fonts on disk - Patterns created on the fly through the FcPattern API But that public structure can be small, lightweight, and created on the fly. Regards, Owen -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050325/46336fda/attachment.pgp
On Friday 25 March 2005 22:25, Owen Taylor wrote:> On Fri, 2005-03-25 at 01:53 -0500, Patrick Lam wrote: > > BTW, Owen''s suggested struct: > > struct FcPattern { > > int refcount; > > FcPatternStorage storage; /* dynamic, static */ > > > > union { > > FcPatternDynamic *dynamic; > > FcPatternStatic *static; > > } u; > > }; > > > > still seems to be a struct, and thus requiring mallocness and a pointer. > > The point of my suggestion is that we need a single *public* structure > for: > > - Patterns retrieved from the database of fonts on disk > - Patterns created on the fly through the FcPattern API > > But that public structure can be small, lightweight, and created on > the fly.Do you really? FcPattern is a completely opaque type. For patterns on the disk, you can just return a pointer to them. All that should be needed is one bitflag in the pattern structure itself marking it a begin readonly, and using the same data structure for dynamic patterns. This requires using the same endianness and packing for dynamic and static patterns and thus makes building up dynamic pattern a little more complex, but I don''t think the overhead would be noticable (and the code is needed for the cache building anyway). The advantage would the that retrieving of properties from the pattern could use the same code. Cheers, Lars
Around 1 o''clock on Mar 22, Patrick Lam wrote:> Seems like a good thing to work on after I have an initial version with > the mmapable config. I think someone''s recently added some hashing, too.Without a specification for how matching and sorting can work without looking at every font, there''s little point in starting to work on an mmap''able data structure. In fact, I suspect that to get reasonable performance, we will have to add a new API that allows for incremental sorting where you don''t receive all of the fonts at once but rather need to ask for the ''next'' font in line. I''d rather change the cache file format only once; it''s an expensive operation to consider as we will have old versions on disk for some time after any switch. So, first we need a specification for font matching which doesn''t require examining each font, then some idea of how to extend that into a sorting API. Looking at the CSS2 specification and matching that more closely may actually permit some of these optimizations. -keith -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 228 bytes Desc: not available Url : http://lists.freedesktop.org/archives/fontconfig/attachments/20050322/2ee4f222/attachment.pgp
Hi. I''ve been hacking on fontconfig a bit to make it possible to mmap the FcInit and related data structures. There is some discussion at http://bugzilla.gnome.org/show_bug.cgi?id=169345 In short, I think that one would need to replace pointers with array indices; to do otherwise seems to be asking for trouble. I''ve started a patch that does this; my current patch is available at http://plam.csail.mit.edu/~plam/tmp/fontconfig-strings-to-indices.diff I''m also willing to replace FcPatternIndex with a void * datatype that would be cast to ints when needed; this would avoid breaking current APIs but is less aesthetically pleasing. Does this seem to be a promising approach, and is something like this commitable when complete? Thanks, pat