I recently had an issue where a user couldn't correctly store an email from his Outlook on a Windows 11 PC to a Samba share.

It was something about illegal characters (sorry, I don't have the message at hand right now).

The Samba version is 4.22.8 on Debian 13, and the installation has been upgraded in place over the years.

The former admin configured:

    dos charset = CP850
    unix charset = iso8859-15

I kept it that way because I didn't want to break things (and it hasn't been a problem so far).

Is changing this to UTF-8 problematic?
Does that break filenames or the displayed names?

We can't back up and restore all the files, so I have to research how to solve this.

Thanks for any advice!
On Tue, 19 May 2026 10:23:40 +0200 "Stefan G. Weichinger via samba" <samba at lists.samba.org> wrote:

> The former admin configured:
>
>     dos charset = CP850
>     unix charset = iso8859-15
>
> And I kept it that way because I didn't want to break things (and it
> wasn't a problem so far ...).
>
> Is changing this to UTF-8 problematic?
> Does that break filenames or the displayed names?

It shouldn't. As far as I am aware, iso8859-15 is a subset of the more modern UTF-8, so you should not notice any real difference. Try just commenting out those lines and see what happens. If anyone screams, uncomment them again, but I do not think anyone will.

> We can't backup/restore all the files or so .. so I have to research
> how to maybe solve this.

Oh, you really should have backups, unless you can afford to lose things.

Rowland
Hi Stefan,

You will need to convert the filenames. Have a look at the convmv tool, which converts filenames from one encoding to another:

https://www.j3e.de/linux/convmv/man/

On 19/05/2026 at 10:23, Stefan G. Weichinger via samba wrote:

> The former admin configured:
>
>     dos charset = CP850
>     unix charset = iso8859-15
>
> Is changing this to UTF-8 problematic?
> Does that break filenames or the displayed names?

-- 
Arnaud FLORENT
IRIS Technologies
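Before renaming anything, it may help to see which names on the share would be affected at all. The following is a minimal Python sketch, not a replacement for convmv: it only lists names that are not valid UTF-8 and shows what they would become, and it renames nothing. It assumes the existing names really are ISO8859-15 (matching the old "unix charset"); the function names and the command-line usage are made up for illustration.

```python
import os
import sys

# Assumption: existing non-ASCII names were written as ISO8859-15.
SOURCE_ENCODING = "iso8859_15"


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes already decode as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def converted_name(name: bytes) -> bytes:
    """Re-encode raw ISO8859-15 filename bytes as UTF-8 bytes."""
    return name.decode(SOURCE_ENCODING).encode("utf-8")


def plan_renames(root: bytes):
    """Yield (old_path, new_path) pairs for names that are not valid UTF-8.

    Walks bottom-up so that, if the plan were ever applied, children would
    be handled before their parent directories. Nothing is renamed here.
    """
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield (os.path.join(dirpath, name),
                       os.path.join(dirpath, converted_name(name)))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 plan_renames.py /path/to/share
    for old, new in plan_renames(os.fsencode(sys.argv[1])):
        print(old, "->", new)
```

Pure ASCII names pass the UTF-8 check and are skipped, so the output shows only the names that would need conversion. The check is a heuristic: a name that happens to be valid UTF-8 is assumed to be UTF-8 already.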
On 2026-05-19 01:23, Stefan G. Weichinger via samba wrote:

> The former admin configured:
>
>     dos charset = CP850
>     unix charset = iso8859-15
>
> And I kept it that way because I didn't want to break things (and it
> wasn't a problem so far ...).
>
> Is changing this to UTF-8 problematic?
> Does that break filenames or the displayed names?

With all due respect to the expert and unendingly helpful Rowland Penny, changing a charset from iso8859-15 to utf-8 might very well invalidate some stored filenames and configuration strings.

ISO8859-15[1] is a character set for Western European languages written using the Latin script. Every character occupies one byte. UTF-8[2] is a way of storing Unicode character values in 1, 2, 3, or 4-byte sequences.

It is correct that every character in ISO8859-15 can be represented in UTF-8. It is also correct that every character in ISO8859-15 which has a byte value of 127 or less is represented in UTF-8 by a single byte with the same byte value. However, those characters in ISO8859-15 which have a byte value of 128 or greater are represented by multi-byte sequences in UTF-8 (two bytes for most of them, three for the euro sign).

So, text data (for instance a file name) which was stored as bytes with values of 128 or greater, encoded in ISO8859-15, will not be legible if read back as UTF-8.

I do not know how Samba uses those admin settings of "dos charset" and "unix charset". I do not know how they affect the interpretation of byte values in configuration strings or filenames. Thus I can't advise you what else you need to convert when you change these settings.
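To make the byte-level difference concrete, here is a minimal Python sketch. The sample filenames are invented for illustration. It only compares the two encodings and says nothing about how Samba itself applies "unix charset".

```python
# Hypothetical filenames, chosen only to illustrate the two encodings.
SAMPLES = ["report.msg", "Gespräch.msg", "Preis 5€.msg"]


def describe(name: str) -> dict:
    """Encode a name both ways and test whether the ISO8859-15 bytes
    can be read back as UTF-8."""
    latin9 = name.encode("iso8859_15")
    utf8 = name.encode("utf-8")
    try:
        latin9.decode("utf-8")
        readable_as_utf8 = True
    except UnicodeDecodeError:
        readable_as_utf8 = False
    return {"latin9": latin9, "utf8": utf8,
            "readable_as_utf8": readable_as_utf8}


for sample in SAMPLES:
    info = describe(sample)
    print(sample, info["latin9"], info["utf8"], info["readable_as_utf8"])
```

Pure ASCII names produce identical bytes in both encodings, which would explain why a share can appear fine until a name containing an umlaut or a euro sign turns up.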
For what it's worth, I have a suspicion that those settings may affect how the byte sequences of filenames are interpreted, and if your filenames include accented Latin characters and such, those filenames will need to be converted from ISO8859-15 to UTF-8 form separately from Samba. Arnaud Florent's suggestion of `convmv` to convert filenames from ISO8859-15 encoding to UTF-8 encoding seems like it would be helpful.

> We can't backup/restore all the files or so .. so I have to research
> how to maybe solve this.
>
> Thanks for any advice!

Best regards,
     Jim DeLaHunt, who knows little about Samba, but a lot about Unicode and UTF-8.

[1] <https://en.wikipedia.org/wiki/ISO/IEC_8859-15>
[2] <https://en.wikipedia.org/wiki/UTF-8>

-- 
--Jim DeLaHunt
http://blog.jdlh.com/ (http://jdlh.com/)
Vancouver, B.C., Canada