thr3ads.net - llvm dev - [LLVMdev] Inconsistencies or intended behaviour of LLVM IR? [Jan 2015]

Robin Eklind

2015-Jan-28 01:49 UTC

[LLVMdev] Inconsistencies or intended behaviour of LLVM IR?

Hello everyone!

I've recently had a chance to familiarize myself with the nitty-gritty 
details of LLVM IR. It has been a great learning experience, sometimes 
frustrating or confusing but mostly rewarding.

There are a few cases I've come across which seem odd to me. I've tried
to cross reference with the language specification and the source code 
to the best of my abilities, but would like to reach out to an 
experienced crowd with a few questions.

Could you help me out by taking a look at these examples? To my novice 
eyes they seem to highlight inconsistencies in LLVM IR (or the reference 
implementation), but it is quite likely that I've overlooked something. 
Please help me out.

Note: the example source files have been attached and a copy is made 
available at https://github.com/mewplay/ll

* Item 1 - named pointer types

It is possible to create a named array pointer type (and many others), 
but not a named structure pointer type. E.g.

%x = type [1 x i32]* ; valid.
%x = type {i32}*     ; invalid.

Is this the intended behaviour? Attaching a.ll, b.ll, c.ll and d.ll for
reference. All files except d.ll compile without error using clang
version 3.5.1 (tags/RELEASE_351/final).

 > $ clang d.ll
 > d.ll:3:16: error: expected top-level entity
 > %x = type {i32}*
 >                ^
 > 1 error generated.

Does it have anything to do with type equality? (just a hunch)

* Item 2 - equality of named types

A named integer type is equivalent to its literal type counterpart, but 
the same is not true for named and literal structures. I am certain that 
I've read about this before, but can't seem to locate the right section 
of the language specification; could anyone point me in the right 
direction? Also, what is the motivation behind this decision? I've 
skimmed over the code which handles named structure types (in 
lib/IR/core.cpp), but would love to hear the high level idea.

Attaching e.ll, f.ll, g.ll and h.ll for reference. All compile just fine
except h.ll, which produces the following error message (using the same
version of clang as above):

 > $ clang h.ll
 > h.ll:10:23: error: argument is not of expected type '%x = type { i32 }'
 >         call void (%x)* @foo({i32} {i32 0})
 >                              ^
 > 1 error generated.

* Item 3 - zero initialized common linkage variables

According to the language specification common linkage variables are 
required to have a zero initializer [1]. If so, why are they also 
required to provide an initial value?

Attaching i.ll and j.ll for reference. Both compile just fine and once
executed i.ll returns 37 and j.ll returns 0. If the common linkage
variable @x was not initialized to 0, j.ll would have returned 42.

* Item 4 - constant common linkage variables

The language specification states that common linkage variables may not 
be marked as constant [1]. The parser doesn't seem to enforce this 
restriction. Would doing so cause any problems?

Attaching k.ll and l.ll for reference. Both compile just fine, but once
executed k.ll returns 37 (i.e. the constant variable was overwritten)
while l.ll segfaults as expected when it tries to overwrite a read-only
memory location.

* Item 5 - appending linkage restrictions

An extract from the language specification [1]:

 > "appending" linkage may only be applied to global variables of
 > pointer to array type.

Similarly to item 4 this restriction isn't enforced by the parser. Would 
it make sense doing so, or is there any problem with such an approach?

* Item 6 - hash token

The hash token (#) is defined in lib/AsmParser/LLToken.h (release 
version 3.5.0 of the LLVM source code) but doesn't seem to be used 
anywhere else in the source tree. Is this token a historical artefact or 
does it serve a purpose?

* Item 7 - backslash token

Similarly to item 6 the backslash token doesn't seem to serve a purpose
(with regards to release version 3.5.0 of the LLVM source code). Is it 
used somewhere?

* Item 8 - quoted labels

A comment in lib/AsmParser/LLLexer.cpp (once again, release version
3.5.0 of the LLVM source code) describes quoted labels using the
following regexp (i.e. at least one character between the double quotes):

 > ///   QuoteLabel        "[^"]+":

In contrast the reference implementation accepts quoted labels with zero 
or more characters between the double quotes. Which is to be trusted? 
The comment makes more sense as the variable name would effectively be 
blank otherwise.

* Item 9 - undocumented calling conventions

The following calling conventions are valid tokens but not described in
the language reference as of revision 223189:

intel_ocl_bicc, x86_stdcallcc, x86_fastcallcc, x86_thiscallcc,
x86_vectorcallcc, arm_apcscc, arm_aapcscc, arm_aapcs_vfpcc,
msp430_intrcc, ptx_kernel, ptx_device, spir_kernel, spir_func,
x86_64_sysvcc, x86_64_win64cc, ghccc



Lastly I'd just like to thank the LLVM developers for all the time and 
hard work they've put into this project. I'd especially like to thank
you for providing a language specification alongside the reference
implementation! Keeping it up to date is a huge task, but also hugely
important. Thank you!

Kind regards
/Robin Eklind

[1]: http://llvm.org/docs/LangRef.html#linkage-types
-------------- next part (a.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

define void @foo([1 x i32]*) {
	ret void
}

define i32 @main() {
	call void ([1 x i32]*)* @foo([1 x i32]* null)
	ret i32 0
}
-------------- next part (b.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

%x = type [1 x i32]*

define void @foo(%x) {
	ret void
}

define i32 @main() {
	call void (%x)* @foo(%x null)
	ret i32 0
}
-------------- next part (c.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

define void @foo({i32}*) {
	ret void
}

define i32 @main() {
	call void ({i32}*)* @foo({i32}* null)
	ret i32 0
}
-------------- next part (d.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

%x = type {i32}*

define void @foo(%x) {
	ret void
}

define i32 @main() {
	call void (%x)* @foo(%x null)
	ret i32 0
}
-------------- next part (e.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

%x = type i32

define void @foo(%x) {
	ret void
}

define i32 @main() {
	call void (%x)* @foo(%x 0)
	ret i32 0
}
-------------- next part (f.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

%x = type i32

define void @foo(%x) {
	ret void
}

define i32 @main() {
	call void (%x)* @foo(i32 0)
	ret i32 0
}
-------------- next part (g.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

%x = type {i32}

define void @foo(%x) {
	ret void
}

define i32 @main() {
	call void (%x)* @foo(%x {i32 0})
	ret i32 0
}
-------------- next part (h.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

%x = type {i32}

define void @foo(%x) {
	ret void
}

define i32 @main() {
	call void (%x)* @foo({i32} {i32 0})
	ret i32 0
}
-------------- next part (i.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

@x = common global i32 42

define i32 @main() {
	store i32 37, i32* @x
	%foo = load i32* @x
	ret i32 %foo
}
-------------- next part (j.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

@x = common global i32 42

define i32 @main() {
	%foo = load i32* @x
	ret i32 %foo
}
-------------- next part (k.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

@x = common constant i32 42

define i32 @main() {
	store i32 37, i32* @x
	%foo = load i32* @x
	ret i32 %foo
}
-------------- next part (l.ll) --------------
target triple = "x86_64-unknown-linux-gnu"

@x = constant i32 42

define i32 @main() {
	store i32 37, i32* @x
	%foo = load i32* @x
	ret i32 %foo
}
-------------- next part --------------
target triple = "x86_64-unknown-linux-gnu"

@x = appending global i32 2

define i32 @main() {
	%foo = load i32* @x
	ret i32 %foo
}

Sean Silva

2015-Jan-28 14:45 UTC

A couple quick comments inline (didn't touch on all points):

On Wed, Jan 28, 2015 at 1:49 AM, Robin Eklind <carl.eklind at myport.ac.uk>
wrote:
> * Item 6 - hash token
>
> The hash token (#) is defined in lib/AsmParser/LLToken.h (release version
> 3.5.0 of the LLVM source code) but doesn't seem to be used anywhere else in
> the source tree. Is this token a historical artefact or does it serve a
> purpose?
>
Try deleting it. If the tests pass send a patch. Same for item 7.

>
> * Item 7 - backslash token
>
> Similarly to item 6 the backslash token doesn't seem to serve a purpose
> (with regards to release version 3.5.0 of the LLVM source code). Is it used
> somewhere?
>
> * Item 8 - quoted labels
>
> A comment in lib/AsmParser/LLLexer.cpp (once again, release version 3.5.0
> of the LLVM source code) describes quoted labels using the following regexp
> (e.g. at least one character between the double quotes):
>
> > ///   QuoteLabel        "[^"]+":
>
> In contrast the reference implementation accepts quoted labels with zero
> or more characters between the double quotes. Which is to be trusted? The
> comment makes more sense as the variable name would effectively be blank
> otherwise.
>
Looks like an empty name just results in the thing becoming unnamed. That's
sort of confusing, but probably not harmful. Maybe we use an empty name as a
sentinel for "unnamed", so it was sort of just an accident of the
implementation.

>
> * Item 9 - undocumented calling conventions
>
> The following calling conventions are valid tokens but not described in
> the language references as of revision 223189:
>
> intel_ocl_bicc, x86_stdcallcc, x86_fastcallcc, x86_thiscallcc,
> x86_vectorcallcc, arm_apcscc, arm_aapcscc, arm_aapcs_vfpcc,
> msp430_intrcc, ptx_kernel, ptx_device, spir_kernel, spir_func,
> x86_64_sysvcc, x86_64_win64cc, ghccc
>
This is just bitrot.

-- Sean Silva

Robin Eklind

2015-Jan-28 18:28 UTC

Hello Sean,

Thank you for your reply. I'll give your suggestion for items 6 and 7 a
try tonight. I'll start a compilation and let it run throughout the
night. My laptop (x61s) is 8 years old by now, so compiling LLVM takes
a little time :)

Regarding item 8. I don't know if anyone is using "": in the wild, so
fixing the implementation might make sense. If not, the documentation
(e.g. the QuoteLabel comment) should be updated to be in line with the
implementation.

I only included item 9 since I stumbled upon it while cross-referencing
the source code with the language specification. Bitrot for a project of
this size is to be expected.

I'm still very interested to hear about the items related to types, e.g. 
item 1 and 2. Is there a good reference which describes how type 
equality works in LLVM IR? If the source code is the reference, could 
someone with the high level knowledge get me up to speed?

Item 1 still confuses me, so I'd be very happy if someone with more 
insight could clarify if this is the intended behaviour and if so the 
motivation behind it.

As it so happens, I forgot to include item 10 :)

* Item 10 - lli vs. clang output

Using the same source files as before, it seems like lli and clang
treat common linkage and constant variables differently. The following
session shows the return value after executing i.ll, j.ll, k.ll
and l.ll with clang and lli respectively:

 > $ clang i.ll && ./a.out ; echo $?
 > 37
 >
 > $ lli i.ll ; echo $?
 > 37
 >
 >
 > $ clang j.ll && ./a.out ; echo $?
 > 0
 >
 > $ lli j.ll ; echo $?
 > 42
 >
 >
 > $ clang k.ll && ./a.out ; echo $?
 > 37
 >
 > $ lli k.ll ; echo $?
 > 37
 >
 >
 > $ clang l.ll && ./a.out ; echo $?
 > Segmentation fault
 > 139
 >
 > $ lli l.ll ; echo $?
 > 37

Looking forward to hearing more about type equality, or getting a pointer
as to where I can read up about it.

Cheers /Robin Eklind


Nick Lewycky

2015-Feb-11 03:46 UTC

On 27 January 2015 at 17:49, Robin Eklind <carl.eklind at myport.ac.uk>
wrote:
> * Item 1 - named pointer types
>
> It is possible to create a named array pointer type (and many others), but
> not a named structure pointer type. E.g.
>
> %x = type [1 x i32]* ; valid.
> %x = type {i32}*     ; invalid.
>
> Is this the intended behaviour? Attaching a.ll, b.ll, c.ll and d.ll for
> reference. All files except d.ll compiles without error using clang version
> 3.5.1 (tags/RELEASE_351/final).
>
Only struct types may be named. What you're seeing is an artifact of the
.ll parser supporting the old (LLVM 2.x) syntax for compatibility. In the
array case, the resulting llvm::Module does not have any type named %x. In
the struct case, it's a hard error, as you noticed. LLVM 2.x used to permit
all types to have names.

> > $ clang d.ll
> > d.ll:3:16: error: expected top-level entity
> > %x = type {i32}*
> >                ^
> > 1 error generated.
>
> Does it have anything to do with type equality? (just a hunch)
>
> * Item 2 - equality of named types
>
> A named integer type is equivalent to its literal type counterpart, but
> the same is not true for named and literal structures.

Right. Since named non-struct types don't exist, what's really going on is
that the .ll parser remembers the %name to Type* mapping and uses it
everywhere. Hence they're pointer equivalent. For structs this is not so:
structs with identical contents but different names are different types.
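
To make the distinction concrete, here is a toy Python model of the behaviour described above. It is only a sketch and not LLVM's implementation: the class and method names (TypeContext, alias, named_struct, literal_struct) are made up for illustration, and object identity stands in for Type* pointer equality.

```python
# Toy model only; all names here are hypothetical, not LLVM APIs.

class IntType:
    def __init__(self, bits):
        self.bits = bits

class StructType:
    def __init__(self, fields, name=None):
        self.fields = tuple(fields)
        self.name = name  # None means a literal struct such as {i32}

class TypeContext:
    def __init__(self):
        self._ints = {}              # bit width -> IntType
        self._literal_structs = {}   # field identities -> StructType
        self.parser_names = {}       # "%x" -> type object (parser-side map)

    def int_type(self, bits):
        # Integer types are uniqued: asking twice yields the same object.
        return self._ints.setdefault(bits, IntType(bits))

    def literal_struct(self, fields):
        # Literal structs are uniqued by their contents.
        key = tuple(id(f) for f in fields)
        return self._literal_structs.setdefault(key, StructType(fields))

    def named_struct(self, name, fields):
        # Every named struct is a fresh type, whatever its contents are.
        ty = StructType(fields, name)
        self.parser_names[name] = ty
        return ty

    def alias(self, name, ty):
        # "%x = type i32": the parser only remembers name -> existing type.
        self.parser_names[name] = ty
        return ty

ctx = TypeContext()
i32 = ctx.int_type(32)

x_int = ctx.alias("%x", i32)                  # like e.ll and f.ll
alias_is_i32 = x_int is i32

x_struct = ctx.named_struct("%x", [i32])      # like g.ll and h.ll
literal = ctx.literal_struct([i32])
named_is_literal = x_struct is literal
literal_is_uniqued = ctx.literal_struct([i32]) is literal
```

In this model the alias for i32 is the very same object as i32, which is why f.ll is accepted, while the named struct and the literal {i32} are distinct objects, which mirrors the error reported for h.ll.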

> I am certain that I've read about this before, but can't seem to locate
> the right section of the language specification; could anyone point me in
> the right direction? Also, what is the motivation behind this decision?
> I've skimmed over the code which handles named structure types (in
> lib/IR/core.cpp), but would love to hear the high level idea.
>
>
> Attaching e.ll, f.ll, g.ll and h.ll for reference. All compile just fine
> except h.ll, which produces the following error message (using the same
> version of clang as above):
>
> > $ clang h.ll
> > h.ll:10:23: error: argument is not of expected type '%x = type { i32 }'
> >         call void (%x)* @foo({i32} {i32 0})
> >                              ^
> > 1 error generated.
>
> * Item 3 - zero initialized common linkage variables
>
> According to the language specification common linkage variables are
> required to have a zero initializer [1]. If so, why are they also required
> to provide an initial value?
>
I don't know but I can guess. We want code that checks for an initial value
(via the C++ API) to only look in one place, GV->getInitializer(), instead
of adding a check for isCommon() at each call site.

Of course we could make the .ll text for this whatever we want, but having
a zero initializer requirement more closely matches what's going on with
the objects in memory.

> Attaching i.ll and j.ll for reference. Both compile just fine and once
> executed i.ll returns 37 and j.ll returns 0. If the common linkage variable
> @x was not initialized to 0, j.ll would have returned 42.
>
> * Item 4 - constant common linkage variables
>
> The language specification states that common linkage variables may not be
> marked as constant [1]. The parser doesn't seem to enforce this
> restriction. Would doing so cause any problems?
>
In general, restrictions are enforced by the verifier, not the .ll parser.
The verifier operates on the in-memory model and is the source of truth for
validity of IR.

$ cat a.ll
@x = common global i32 1
$ llvm-as a.ll
llvm-as: assembly parsed, but does not verify as correct!
'common' global must have a zero initializer!
i32* @x

All passes are expected to assume that their inputs pass the verifier, and
are permitted to execute undefined behaviour if they do not. All passes
are expected to leave the IR in a state where the verifier passes (on the
assumption that the input did). The same goes for the bitcode reader and
writer. There are some utility functions that are used during the execution
of a pass which cannot make this assumption, since the IR may be invalid
during a larger transformation.
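
As a rough illustration of checks living in a verifier that runs on the in-memory model, rather than in the parser, here is a small Python sketch. It is not LLVM code: GlobalVar and verify_global are hypothetical names, the type is kept as a plain string, and only the first message is taken from the llvm-as output above; the other two are illustrative wording.

```python
from dataclasses import dataclass

@dataclass
class GlobalVar:
    # Hypothetical in-memory model of a global variable.
    name: str
    linkage: str        # e.g. "external", "common", "appending"
    is_constant: bool
    type_str: str       # value type as text, e.g. "i32" or "[2 x i32]"
    init_is_zero: bool

def verify_global(gv):
    """Return a list of messages; an empty list means the global verifies."""
    errors = []
    if gv.linkage == "common":
        if not gv.init_is_zero:
            # Message text as printed by llvm-as in the transcript above.
            errors.append("'common' global must have a zero initializer!")
        if gv.is_constant:
            # Illustrative wording for the restriction from item 4.
            errors.append("'common' global may not be marked constant")
    if gv.linkage == "appending" and not gv.type_str.startswith("["):
        # Illustrative wording for the restriction from item 5.
        errors.append("appending linkage requires an array type")
    return errors

# The parser would happily build all of these; the verifier rejects three.
j_like = verify_global(GlobalVar("@x", "common", False, "i32", False))
k_like = verify_global(GlobalVar("@x", "common", True, "i32", False))
appending_like = verify_global(GlobalVar("@x", "appending", False, "i32", False))
valid = verify_global(GlobalVar("@x", "common", False, "i32", True))
```

The point of the split is that every producer of IR (the .ll parser, the bitcode reader, passes) goes through the same set of checks, so the rules are written once against the in-memory form.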

> Attaching k.ll and l.ll for reference. Both compile just fine, but once
> executed k.ll returns 37 (i.e. the constant variable was overwritten) while
> l.ll segfaults as expected when it tries to overwrite a read-only memory
> location.
>
> * Item 5 - appending linkage restrictions
>
> An extract from the language specification [1]:
>
> > "appending" linkage may only be applied to global variables of pointer
> > to array type.
>
> Similarly to item 4 this restriction isn't enforced by the parser. Would
> it make sense doing so, or is there any problem with such an approach?
>
Same as above, it's in the verifier.

> * Item 6 - hash token
>
> The hash token (#) is defined in lib/AsmParser/LLToken.h (release version
> 3.5.0 of the LLVM source code) but doesn't seem to be used anywhere else in
> the source tree. Is this token a historical artefact or does it serve a
> purpose?
>
It's gone! This was removed in r227442.

> * Item 7 - backslash token
>
> Similarly to item 6 the backslash token doesn't seem to serve a purpose
> (with regards to release version 3.5.0 of the LLVM source code). Is it used
> somewhere?
>
Yep, again.

> * Item 8 - quoted labels
>
> A comment in lib/AsmParser/LLLexer.cpp (once again, release version 3.5.0
> of the LLVM source code) describes quoted labels using the following regexp
> (e.g. at least one character between the double quotes):
>
> > ///   QuoteLabel        "[^"]+":
>
> In contrast the reference implementation accepts quoted labels with zero
> or more characters between the double quotes. Which is to be trusted? The
> comment makes more sense as the variable name would effectively be blank
> otherwise.
>
I think this is a bug. Well, two bugs:

$ cat a.ll
@"" = internal constant i32 0
@0 = internal constant i32 0
$ llvm-as a.ll
llvm-as: a.ll:2:1: error: variable expected to be numbered '%1'
@0 = internal constant i32 0
^

Anonymous values are numbered, one set of numberings for local variables
(including arguments) and one for globals. I think that @"" should not be
anonymous, but llvm-as clearly thinks it is. If you check llvm::Value's
getValueName() method, we clearly support a distinction between an empty
string and no string.

The other bug is in the error message. The variable should be numbered '@1'
not '%1'.
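
The numbering rule behind that error can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the parser's code: check_global_names is a hypothetical helper, names are given without the leading '@', an all-digit name stands for an explicit number such as @0, and an empty name is treated as unnamed because that is how llvm-as appears to behave in the transcript above. The message uses '@', as suggested above.

```python
def check_global_names(names):
    """Check explicit global numbers against the implicit numbering.

    names: global names in definition order, without the leading '@'.
    Returns an error string for the first mismatch, or None if all is well.
    """
    next_id = 0
    for name in names:
        if name == "":
            # Treated as unnamed: silently takes the next number.
            next_id += 1
        elif name.isdigit():
            # Explicitly numbered: must match the next free number.
            if int(name) != next_id:
                return f"variable expected to be numbered '@{next_id}'"
            next_id += 1
        # Any other name is a real name and consumes no number.
    return None

as_in_transcript = check_global_names(["", "0"])   # @"" then @0
renumbered = check_global_names(["", "1"])         # @"" then @1
mixed = check_global_names(["0", "foo", "1"])      # @0, @foo, @1
```

If @"" were kept as a real (empty) name instead of being treated as unnamed, the first case would pass, which is the behaviour argued for above.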

> * Item 9 - undocumented calling conventions
>
> The following calling conventions are valid tokens but not described in
> the language reference as of revision 223189:
>
> intel_ocl_bicc, x86_stdcallcc, x86_fastcallcc, x86_thiscallcc,
> x86_vectorcallcc, arm_apcscc, arm_aapcscc, arm_aapcs_vfpcc,
> msp430_intrcc, ptx_kernel, ptx_device, spir_kernel, spir_func,
> x86_64_sysvcc, x86_64_win64cc, ghccc
>
Ooh. Yes, these should be documented!

Nick
