thr3ads.net - llvm dev - [llvm-dev] RFC: Adding support for the z/OS platform to LLVM and clang [Jun 2020]

Corentin via llvm-dev

2020-Jun-11 20:13 UTC

[llvm-dev] RFC: Adding support for the z/OS platform to LLVM and clang

Hello.
> 2) Add patches to Clang to allow EBCDIC and ASCII (ISO-8859-1) encoded
> input source files. This would be done at the file open time to allow the
> rest of Clang to operate as if the source was UTF-8 and so require no
> changes downstream. Feedback on this plan is welcome from the Clang
> community.

Would it be correct to assume that this EBCDIC -> UTF-8 mapping would be as
prescribed by UTF-EBCDIC / IBM CDRA, notably for the control characters that
do not map exactly?
For example, if the execution encoding is EBCDIC, is '0x06' equivalent to
'0086', etc.?
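
To make the question concrete, here is a minimal sketch of the kind of mapping being asked about. It assumes Python's built-in `cp037` codec as a stand-in for the EBCDIC code page (the code page actually used on z/OS may differ, and this is not what Clang or z/OS does internally). The helper name `ebcdic_to_utf8` is hypothetical.

```python
# Sketch only: cp037 is an assumed stand-in for the platform's EBCDIC code page.
def ebcdic_to_utf8(raw: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC bytes to Unicode, then re-encode them as UTF-8."""
    return raw.decode(codepage).encode("utf-8")


# The EBCDIC control byte 0x06 has no ASCII counterpart with the same meaning.
control = bytes([0x06])
code_point = ord(control.decode("cp037"))
print(f"EBCDIC 0x06 -> U+{code_point:04X} -> UTF-8 {ebcdic_to_utf8(control)!r}")

# Ordinary letters map to their usual ASCII/UTF-8 bytes.
print(ebcdic_to_utf8(bytes([0xC1, 0xC2, 0xC3])))
```

With this codec the control byte lands in the C1 control range rather than in the Private Use Area, which is the behaviour the question hopes for.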

The question "Is Unicode sufficient to represent all characters present in
the input source without using the Private Use Area?" is one that is
relevant to both Clang and the C/C++ standard. (I do hope that it is the
case!)

Thanks,

Corentin

Kai Peter Nacke via llvm-dev

2020-Jun-16 12:50 UTC

> > 2) Add patches to Clang to allow EBCDIC and ASCII (ISO-8859-1) encoded
> > input source files. This would be done at the file open time to allow
> > the rest of Clang to operate as if the source was UTF-8 and so require
> > no changes downstream. Feedback on this plan is welcome from the Clang
> > community.
>
> Would it be correct to assume that this EBCDIC -> UTF-8 mapping
> would be as prescribed by
> UTF-EBCDIC / IBM CDRA, notably for the control characters that do
> not map exactly?
> Notably, if the execution encoding is EBCDIC, is '0x06' equivalent
> to '0086', etc?
>
> The question "Is Unicode sufficient to represent all characters
> present in the input source without using the Private Use Area?" is
> one that is relevant to both Clang and the C/C++ standard. ( I do hope
> that it is the case!)

The current goal is to make only minimal changes to the frontend to enable
reading of EBCDIC encoded files. For this, we use the auto-conversion
service of z/OS UNIX System Services
(https://www.ibm.com/support/knowledgecenter/SSLTBW_2.4.0/com.ibm.zos.v2r4.bpxb200/xpascii.htm),
together with file tagging and setting the CCSID for the program and for
opened files. The auto-conversion service supports round-trip conversion
between EBCDIC and Enhanced ASCII. With it, bootstrapping with EBCDIC
source files is possible.
Of course, more complete UTF-8 support is a valid implementation
alternative.
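
As a rough illustration of what "round-trip conversion" means here, the following sketch emulates it in plain Python. It does not call the z/OS auto-conversion service. The codec names are assumptions: `cp037` stands in for the EBCDIC CCSID and `latin-1` for Enhanced ASCII (ISO-8859-1). The function names are hypothetical.

```python
# Sketch only: emulates a lossless EBCDIC <-> ISO-8859-1 conversion with
# Python codecs. The codec choices are assumptions, not the z/OS service.
EBCDIC = "cp037"
ENHANCED_ASCII = "latin-1"


def to_ascii(raw: bytes) -> bytes:
    """Convert EBCDIC bytes to their ISO-8859-1 equivalents."""
    return raw.decode(EBCDIC).encode(ENHANCED_ASCII)


def to_ebcdic(raw: bytes) -> bytes:
    """Convert ISO-8859-1 bytes back to EBCDIC."""
    return raw.decode(ENHANCED_ASCII).encode(EBCDIC)


# Every one of the 256 byte values should survive the trip in both directions.
all_bytes = bytes(range(256))
round_trip_ok = (
    to_ebcdic(to_ascii(all_bytes)) == all_bytes
    and to_ascii(to_ebcdic(all_bytes)) == all_bytes
)
print("round trip lossless:", round_trip_ok)
```

Because the two single-byte encodings are permutations of each other in this sketch, no byte is lost. That property is what lets a compiler built from converted sources read its own EBCDIC sources again.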

Best regards,
Kai Nacke
IT Architect

