Thomas Schmitt
2013-Mar-28 21:15 UTC
[syslinux] Rock Ridge. Was: Allowed code pages and encodings to write f0.txt through f1.txt?
Hi,

i began to implement a common lookup function for SUSP and Rock Ridge
entries:

    /* Obtain the payload bytes of all SUSP entries with a given signature.
       @param fs        The data source from which to read CE blocks.
       @param dir_rec   Memory containing the whole ISO 9660 directory record.
       @param sig       Two characters of SUSP signature. E.g. "NM", "ER", ...
       @param data      Returns allocated memory with the found payload.
       @param len_data  Returns the number of valid bytes in *data.
                        (Does not include the trailing 0-byte.)
       @return          1= Success. *data and *len_data are valid.
                            A trailing 0-byte was added to *data for
                            convenience.
                        0= Desired signature not found.
                            *data and *len_data are NULL and 0, respectively.
                       -1= Error. Something is wrong with the ISO 9660 or
                            SUSP data in the image.
    */
    static int susp_get_entry(struct fs_info *fs, char *dir_rec, char *sig,
                              char **data, int *len_data);

The function and its subordinates already have more than 300 lines, i fear.
And still some necessary features are missing.

Some open questions:

How do i check for success or failure of get_cache() ?

Shall the new code become part of iso9660/iso9660.c
or shall i start a new source file iso9660/susp.c ?

Would it be ok to extend struct iso_sb_info ?
(I should implement the non-zero skip length that might be defined by
the SUSP SP entry. Probably i will find more uses for global-ish
parameters.)

Have a nice day :)

Thomas
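The lookup described above can be illustrated with a minimal sketch that scans a single System Use area already held in memory. It is not the function from the mail: the name susp_collect() is hypothetical, CE continuation blocks are not followed, and the choice to stop at a zero byte (padding) is an assumption. It relies only on the SUSP entry layout: two signature bytes, one length byte counting the whole entry, one version byte, then the payload.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: collect the payload of all SUSP entries with the
   signature sig from one System Use area of area_len bytes.
   Entry layout: 2 bytes signature, 1 byte total length, 1 byte version,
   then (length - 4) payload bytes.
   Returns 1 = found, 0 = not found, -1 = malformed data or out of memory. */
static int susp_collect(const unsigned char *area, size_t area_len,
                        const char *sig, char **data, int *len_data)
{
    size_t pos = 0, total = 0;
    char *buf = NULL, *tmp;

    *data = NULL;
    *len_data = 0;
    while (pos + 4 <= area_len) {
        size_t entry_len = area[pos + 2];

        if (area[pos] == 0)         /* assumption: zero padding ends the list */
            break;
        if (entry_len < 4 || pos + entry_len > area_len) {
            free(buf);
            return -1;              /* entry is too short or overruns the area */
        }
        if (area[pos] == (unsigned char) sig[0] &&
            area[pos + 1] == (unsigned char) sig[1]) {
            /* Append the payload and keep a trailing 0-byte for convenience */
            tmp = realloc(buf, total + (entry_len - 4) + 1);
            if (tmp == NULL) {
                free(buf);
                return -1;
            }
            buf = tmp;
            memcpy(buf + total, area + pos + 4, entry_len - 4);
            total += entry_len - 4;
            buf[total] = 0;
        }
        pos += entry_len;
    }
    if (buf == NULL)
        return 0;
    *data = buf;
    *len_data = (int) total;
    return 1;
}
```

The caller frees *data after a return value of 1. Note that the payload is returned raw, so for an NM entry it still begins with the flags byte that precedes the name characters.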
H. Peter Anvin
2013-Mar-28 21:34 UTC
[syslinux] Rock Ridge. Was: Allowed code pages and encodings to write f0.txt through f1.txt?
On 03/28/2013 02:15 PM, Thomas Schmitt wrote:
>
> Some open questions:
>
> How do i check for success or failure of get_cache() ?
>

get_cache() doesn't return on failure; instead it triggers the boot
failure path. (Actually, it looks like it doesn't right now, but it
should.) It might be wise to check for a NULL pointer in case we
eventually implement a failure path.

> Shall the new code become part of iso9660/iso9660.c
> or shall i start a new source file iso9660/susp.c ?

A new source file is probably better.

> Would it be ok to extend struct iso_sb_info ?
> (I should implement non-zero skip length that might be defined by
> the SUSP SP entry. Probably i find more use for global-ish parameters.)

Yes, that is just fine.

-hpa
H. Peter Anvin
2013-Mar-31 21:59 UTC
[syslinux] Rock Ridge. Was: Allowed code pages and encodings to write f0.txt through f1.txt?
Stupid question: what do these limitations mean in practice?

+ Shortcomings / Future improvements:
+ (XXX): Avoid memcpy() with Continuation Areas which span over more than one
+        block ? (Will then need memcpy() with entries which are hit by a
+        block boundary.) (Questionable whether the effort is worth it.)
+ (XXX): Take into respect ES entries ? (Hardly anybody does this.)
+

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Hi,

> Stupid question: what do these limitations mean in practice?

Now you triggered a lengthy answer. :))
I also want to take the opportunity to warn of case-sensitivity.

> + (XXX): Avoid memcpy() with Continuation Areas

This is some waste of computing time, because i did not make the
iterator smart enough to do its job without copying data under certain
circumstances. Those circumstances are quite rare.

The problem is that get_cache() does not guarantee that two blocks with
adjacent LBA will be held as adjacent copies in the cache. So i allocate
memory and copy the single blocks in order to get them as one contiguous
memory area.

This happens only if a CE entry involves more than one block.
CE is rare except for the root directory records "." and "..", which
host the fat Rock Ridge ER entries. Even if CE announces a Continuation
Area, this often fits into a single block.

You need about 220 characters in a file name to surely provoke the
production of CE entries for normal directory records.
Adding fat AAIP info by libisofs has the same effect, i.e. if a file on
Linux has lots of Extended Attributes or a long ACL.
  http://libburnia-project.org/wiki/AAIP
  xorriso -as mkisofs --xattr --acl ...

> + (XXX): Take into respect ES entries ? (Hardly anybody does this.)

ES entries distinguish different SUSP extensions like Rock Ridge or my
AAIP. They are necessary only if two such extensions use the same entry
signatures with different meaning, e.g. if both have an NM entry.

There are not many SUSP extensions around. Those of which i know
(Rock Ridge, zisofs, Apple, Amiga, AAIP) do not collide by entry
signatures.

Linux seems to ignore ES. It does not appear in its isofs/rock.c.
libisofs produces ES if SUSP/RRIP 1.12 is selected. Default is 1.10
without ES.

-----------------------------------------------------------------

In both (XXX) cases i doubt that it is worthwhile to invest complexity.
The benefit is small and the risk of bugs increases.
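The block-gathering step described for Continuation Areas can be sketched as follows. This is only an illustration under stated assumptions: gather_blocks(), block_reader_t, struct toy_cache and toy_read() are hypothetical names, and the callback stands in for get_cache(), whose real signature is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical block source standing in for get_cache(): returns a pointer
   to the cached copy of one block, or NULL on failure. */
typedef const void *(*block_reader_t)(void *handle, uint32_t lba);

/* Copy `count` blocks starting at `lba` into one contiguous buffer.
   This is needed because blocks with adjacent LBA need not be adjacent
   in the cache. Returns malloc()ed memory of count * block_size bytes,
   or NULL on failure. */
static unsigned char *gather_blocks(block_reader_t read_block, void *handle,
                                    uint32_t lba, uint32_t count,
                                    size_t block_size)
{
    unsigned char *buf;
    uint32_t i;

    if (count == 0 || block_size == 0)
        return NULL;
    buf = malloc((size_t) count * block_size);
    if (buf == NULL)
        return NULL;
    for (i = 0; i < count; i++) {
        const void *blk = read_block(handle, lba + i);

        if (blk == NULL) {          /* defensive NULL check on the block source */
            free(buf);
            return NULL;
        }
        memcpy(buf + (size_t) i * block_size, blk, block_size);
    }
    return buf;
}

/* Toy cache for demonstration only: block n lives wherever slots[n] points. */
struct toy_cache {
    const unsigned char *slots[4];
};

static const void *toy_read(void *handle, uint32_t lba)
{
    struct toy_cache *c = handle;

    return lba < 4 ? c->slots[lba] : NULL;
}
```

With a Continuation Area that fits into a single block, the copy could be skipped and the cached block used in place, which is the optimization the first (XXX) note refers to.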
-----------------------------------------------------------------

What about this one:

  XXX: Is there already a reader for 32-bit MSB or LSB in the syslinux
       code ? (iso9660.c seems to flatly assume that it runs on
       little-endian int.)

I re-invented byte-to-word translation in susp_rr.c.
iso9660.c offered no such gesture as an example. It rather picks the
little-endian byte strings of the 32-bit numbers from the ISO and uses
them directly as integers.
I understand syslinux is x86-only. Nevertheless, my hair stands on end
when i see that gesture.

One could replace all calls of susp_rr_read_msb() by the direct use of
the bytes which are four byte positions before the big-endian byte
strings which i submit to susp_rr_read_msb(). E.g.:

    iter->next_lba = susp_rr_read_msb(u_entry + 8, 4);

to

    iter->next_lba = *((block_t *) (u_entry + 4));

But as it is now, susp_rr.c should be easily portable to big-endian
processors.

-----------------------------------------------------------------

I see you committed the code.

This will make the file names in Rock Ridge enhanced ISO images
case-sensitive.

E.g. Apple HFS is addicted to case-insensitivity because the users
learned not to care for uppercase or lowercase.
Is it possible that the syslinux community already suffers from a
similar addiction to case-less ISO 9660 ?

Is there the need for an option that disables Rock Ridge
interpretation ? (Like with mount -o norock)

-----------------------------------------------------------------

Have a nice day :)

Thomas
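The byte-to-word translation discussed above can be sketched independently of the host byte order. The names read_lsb() and read_msb() are hypothetical, chosen for this illustration; they are not the susp_rr.c functions. The sketch assumes the ISO 9660 both-endian layout mentioned in the mail, where the 4 little-endian bytes come directly before the 4 big-endian bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Read an unsigned little-endian number of `len` bytes (at most 4 are used).
   Works the same on little-endian and big-endian processors. */
static uint32_t read_lsb(const unsigned char *buf, size_t len)
{
    uint32_t value = 0;
    size_t i;

    for (i = 0; i < len && i < 4; i++)
        value |= (uint32_t) buf[i] << (8 * i);
    return value;
}

/* Read an unsigned big-endian number of `len` bytes (at most 4 are used). */
static uint32_t read_msb(const unsigned char *buf, size_t len)
{
    uint32_t value = 0;
    size_t i;

    for (i = 0; i < len && i < 4; i++)
        value = (value << 8) | buf[i];
    return value;
}
```

For a both-endian field starting at u_entry + 4, read_lsb(u_entry + 4, 4) and read_msb(u_entry + 8, 4) should yield the same number on a correctly mastered image. Neither call depends on pointer casts or on the alignment of the buffer.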
On 04/01/2013 12:40 PM, Thomas Schmitt wrote:
> Hi,
>
>>> The temptation to make use of the integer size waste in ISO 9660
>>> should be rejected.
>
>> You mean the dual-endian numbers? Yes, it is well known that the
>> bigendian stuff in ISO 9660 is broken on too many mastering platforms.
>
> Indeed ?
> In that case we should let susp_rr.c interpret the little endian ones.
> Currently it reads the big endians (out of old habit).
> On the other hand, are the broken ones supposed to produce Rock Ridge ?
>
> Shall i make a patch ?
>

Yes, please use the littleendian ones. Syslinux *does* provide the
standard hton/ntoh macros for bigendian access, but there is no accessor
for littleendian at the moment (we really should just adopt the Linux
conventions.)

> (Actually i meant the waste to store both byte sexes. Obviously a
> lame compromise so that no operating system has a disadvantage
> by having to convert.)
>

Quite. It made sense except so many mastering utilities were only tested
on DOS/Windows that Apple started byte-swapping the littleendian bits
even back in the PowerPC days. So the bigendian stuff is generally
considered unusable -- Linux uses the littleendian data even on
bigendian platforms.

The other bit is that I/O is so much slower than CPU that it never made
any sense in the first place.

-hpa
Hi,

it seems i need some git tutorial for dummies.

I tried to get the current state of the rockridge branch:

  git clone git://git.kernel.org/pub/scm/boot/syslinux/syslinux.git
  cd syslinux
  git checkout rockridge

But susp_rr.c is obviously in the state before the commits of
yesterday.
This is also shown by
  http://git.kernel.org/cgit/boot/syslinux/syslinux.git/tree/core/fs/iso9660/susp_rr.c?h=rockridge

Which one would be the right command for trying to get a newer
state some time later ? git pull ? git fetch ?
(The nomenclature used in git man pages just riddles me.)

Have a nice day :)

Thomas
On 04/03/2013 03:48 AM, Thomas Schmitt wrote:
> Hi,
>
> it seems i need some git tutorial for dummies.
>
> I tried to get the current state of the rockridge branch:
>
> git clone git://git.kernel.org/pub/scm/boot/syslinux/syslinux.git
> cd syslinux
> git checkout rockridge
>
> But susp_rr.c is obviously in the state before the commits of
> yesterday.
> This is also shown by
> http://git.kernel.org/cgit/boot/syslinux/syslinux.git/tree/core/fs/iso9660/susp_rr.c?h=rockridge
>
> Which one would be the right command for trying to get a newer
> state some time later ? git pull ? git fetch ?
> (The nomenclature used in git man pages just riddles me.)
>

Sorry, fixed. The problem is that there are two upstream repos for
Syslinux, and I had mistakenly only pushed one.

-hpa
Hi,

i have tested the current state of iso9660/susp_rr.c after adding
to the test mock-up a substitute for get_le32:

  #define get_le32(x) iso_read_lsb(x, 4)

It still passes the test. :))

Shall i submit a patch so that others can do the same tests with
their images (after enabling a macro in libisofs/fs_image.c) ?

Strangely, git pull to the clone of yesterday failed with

  CONFLICT (content): Merge conflict in core/fs/fs.c

So i had to clone again. The only git gesture for which i seem apt.

Have a nice day :)

Thomas
Hi Thomas,

Since I am not a developer, the following comments may or may not be
relevant to the current RR addition to Syslinux.

Recently, Pete Batard, developer of RUFUS, found some issue(s) between
RUFUS and ArchLinux ISO images. It turned out to be related to some
exception regarding Rock Ridge / ISO9660 translations in libcdio.

FWIW, Pete pointed to:
  http://git.savannah.gnu.org/gitweb/?p=libcdio.git;a=blob;f=src/iso-info.c;#l242

and his workaround / solution for RUFUS regarding this issue was:
  https://github.com/pbatard/rufus/commit/97576d79cbc180dd9509b1d7144107261c6d58a3

In a nutshell (quoting Pete), when using libcdio and RR, you should
not call on iso9660_name_translate_ext() on the filename attribute of
the iso9660_stat_t struct you process, but instead use the name
returned as is.

Thanks go to Pete Batard.

This issue may or may not be relevant to RR in Syslinux. My apologies
if it is not.

Regards,
Ady.
Hi,

> https://github.com/pbatard/rufus/commit/97576d79cbc180dd9509b1d7144107261c6d58a3
> (quoting Pete), when using libcdio and RR, you should
> not call on iso9660_name_translate_ext() on the filename attribute of
> the iso9660_stat_t struct you process, but instead use the name
> returned as is.

Sounds rather like a libcdio-specific bug.

If i understand the patch correctly, then libcdio applied the ISO 9660
name back-translation to Rock Ridge names.
That would be wrong, because Rock Ridge bears plain POSIX names,
whereas the names of ISO 9660 are combined of user-visible name,
mandatory dot (if missing in name), and version number. Everything
is mapped to upper case.
"foo" becomes "FOO.;1" in ISO 9660.
This has to be compensated by some back-translation. (As good as is
possible with a non-injective forward translation.)

To see real ISO 9660 names, you may mount an ISO image on Linux with
  -o loop,norock,nojoliet,map=off

The changeset to syslinux/core/fs/iso9660/iso9660.c is supposed to
handle Rock Ridge names correctly in this aspect:
  http://git.kernel.org/cgit/boot/syslinux/syslinux.git/commit/?h=rockridge&id=556ccf02efe3eae833bcc82c4764179580ba6361

It compares them with the user-visible name by

    if (strcmp(rr_name, dname) == 0) {

rather than by

    if (iso_compare_name(de_name, de_name_len, dname)) {

as is done with ISO 9660 names.

It returns obtained Rock Ridge names without conversion by

    memcpy(dirent->d_name, rr_name, name_len);

rather than

    name_len = iso_convert_name(dirent->d_name, de->name, de->name_len);

as done with ISO 9660 names.

Well, the code is now waiting for its first tester.

A benefit of interpreting Rock Ridge will be that ISO 9660 name mapping
cannot alter your POSIX file names. So you do not have to care about
possible name collisions in the ISO 9660 name space. (Those get resolved
by "mangling" which alters the ISO names in a quite unpredictable way.)

Another benefit (and reason of this thread) is the wish to use UTF-8
as character set for filenames. A no-go for ISO 9660, no problem for
Rock Ridge.

Have a nice day :)

Thomas
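The difference between the two kinds of names can be illustrated with a small sketch. The helpers iso_name_to_user() and rr_name_matches() are hypothetical and are not the syslinux functions iso_convert_name() or iso_compare_name(). Mapping to lower case is an assumption made for this illustration, one possible back-translation among several.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical back-translation of an ISO 9660 name such as "FOO.;1":
   drop the version suffix, drop a trailing dot, map to lower case.
   Writes a 0-terminated result to out and returns its length. */
static size_t iso_name_to_user(const char *iso, size_t iso_len,
                               char *out, size_t out_size)
{
    size_t len = iso_len, i;

    if (out_size == 0)
        return 0;
    for (i = 0; i < len; i++) {
        if (iso[i] == ';') {        /* version number starts here */
            len = i;
            break;
        }
    }
    if (len > 0 && iso[len - 1] == '.')
        len--;                      /* dot that was added to a dot-less name */
    if (len > out_size - 1)
        len = out_size - 1;         /* truncate to fit the output buffer */
    for (i = 0; i < len; i++)
        out[i] = (char) tolower((unsigned char) iso[i]);
    out[len] = 0;
    return len;
}

/* Rock Ridge names are plain POSIX names: compare them byte for byte,
   which makes the lookup case-sensitive. */
static int rr_name_matches(const char *rr_name, const char *wanted)
{
    return strcmp(rr_name, wanted) == 0;
}
```

The back-translation cannot recover the original spelling: "Foo", "FOO" and "foo." would all come back as "foo". The Rock Ridge comparison needs no such guesswork, at the price of treating "Foo" and "foo" as different names.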