Currently mailbox names are stored in IMAP's modified-UTF-7 format in filesystem. I was wondering about changing this in v2.0. The default would still be to use mUTF-7 in filesystem, but just adding :UTF8 or something to mail_location could enable UTF-8.

Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of weird characters, perhaps no one really wants to see them on filesystem since there's no way to type the characters? But for small systems this probably isn't a problem.
Quoting Timo Sirainen <tss at iki.fi>:

> Currently mailbox names are stored in IMAP's modified-UTF-7 format in
> filesystem. I was wondering about changing this in v2.0. The default
> would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
> something to mail_location could enable UTF-8.
>
> Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
> weird characters, perhaps no one really wants to see them on filesystem
> since there's no way to type the characters? But for small systems this
> probably isn't a problem.

I would personally find it useful. I use accented and Chinese characters, and I've worked in environments where they were common as well. Having a common name between MUA and FS would certainly be nice.

As for the risks, maybe some Unicode ranges could be restricted to avoid control characters and such? Or limit the use to given subsets?

It might be useful as well to be able to enable it on a per-user basis. Would that add too much complexity?

I think of it as a nice feature, but not a critical one.

Laurent
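
To make the "restrict some Unicode ranges" idea concrete, here is a minimal Python sketch. It is not Dovecot code: the function name `is_safe_mailbox_name` and the choice of rejected categories are assumptions made for illustration. It rejects names that contain control, format, surrogate, private-use, or unassigned code points, based on the Unicode general category.

```python
import unicodedata

# Assumption: these general categories are the ones to keep off disk.
# Cc = control, Cf = format (e.g. zero-width space), Cs = surrogate,
# Co = private use, Cn = unassigned.
REJECTED_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn"}


def is_safe_mailbox_name(name: str) -> bool:
    """Return True if every character falls outside the rejected categories."""
    if not name:
        return False  # an empty name is never acceptable
    return all(
        unicodedata.category(ch) not in REJECTED_CATEGORIES for ch in name
    )


if __name__ == "__main__":
    for candidate in ["Entwürfe", "收件箱", "bad\x07name", "zero\u200bwidth"]:
        print(repr(candidate), is_safe_mailbox_name(candidate))
```

Rejecting all of Cf is a blunt choice, since it also blocks the zero-width joiners that some scripts legitimately use. A real policy would probably need a narrower list or a per-script exception.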
On Mon, Nov 09, 2009 at 09:11:23PM -0500, Timo Sirainen wrote:

> Currently mailbox names are stored in IMAP's modified-UTF-7 format in
> filesystem. I was wondering about changing this in v2.0. The default
> would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
> something to mail_location could enable UTF-8.
>
> Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
> weird characters, perhaps no one really wants to see them on filesystem
> since there's no way to type the characters? But for small systems this
> probably isn't a problem.

What's the advantage?

Geert

--
Geert Hendrickx -=- ghen at telenet.be -=- PGP: 0xC4BB9E9F
This e-mail was composed using 100% recycled spam messages!
On Mon, 09 Nov 2009 21:11:23 -0500 Timo Sirainen <tss at iki.fi> wrote:

> Currently mailbox names are stored in IMAP's modified-UTF-7 format in
> filesystem. I was wondering about changing this in v2.0. The default
> would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
> something to mail_location could enable UTF-8.
>
> Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
> weird characters, perhaps no one really wants to see them on
> filesystem since there's no way to type the characters? But for small
> systems this probably isn't a problem.

A while ago, I was playing around with the idea of encoded '/'s in Maildir names since many people have asked for a way to use them. UTF-7 does not require that each character be representable in only 1 way like UTF-8 does, so it's possible to encode US-ASCII characters and put them into the folder name; however, I found that most clients decode any mUTF-7 in folder names while parsing LIST/LSUB replies and then discard the name given by the server (expecting that they can just re-encode any non-ASCII characters and still arrive at the correct folder name.) While I would argue that these clients are buggy, the bug seems to be so common that encoding characters this way isn't practical.

With that in mind, you do lose the ability to encode characters like this if the folder names on disk are UTF8, but that's not much of a loss anyway if UTF8 encoding is optional.

So far as UTF-8 on the filesystem is concerned, I've been using UTF-8 in filenames on my personal systems for years now without any real issues.

--
Ben Winslow <rain at bluecherry.net>
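
The round-trip problem described above can be reproduced with a small Python sketch of modified UTF-7 as used for IMAP mailbox names. This is an illustrative implementation written for this example, not code from Dovecot or from any mail client, and the names `encode_mutf7` and `decode_mutf7` are hypothetical. It shows that `&AC8-` decodes to `/`, and that a client which decodes a name and then re-encodes it ends up with a different string than the one the server sent.

```python
import base64


def encode_mutf7(name: str) -> str:
    """Encode the usual way: printable ASCII stays literal, the rest is shifted."""
    out = []
    pending = []  # run of characters waiting to be base64-encoded

    def flush() -> None:
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no '=' padding, ',' instead of '/'.
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")  # a literal ampersand is written as "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


def decode_mutf7(encoded: str) -> str:
    """Decode; raises ValueError if a shifted sequence has no closing '-'."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])
            i += 1
            continue
        end = encoded.index("-", i)
        chunk = encoded[i + 1:end]
        if chunk == "":
            out.append("&")
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


if __name__ == "__main__":
    print(encode_mutf7("Entwürfe"))              # Entw&APw-rfe
    server_name = "a&AC8-b"                      # '/' hidden inside mUTF-7
    decoded = decode_mutf7(server_name)          # a/b
    print(decoded, encode_mutf7(decoded) == server_name)  # a/b False
```

With UTF-8 on disk there is only one byte sequence for `/`, so this kind of alternate spelling is not available at all, which is the loss mentioned above.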