thr3ads.net - llvm dev - [LLVMdev] IR type safety [Sep 2010]

Renato Golin

2010-Sep-21 16:26 UTC

[LLVMdev] IR type safety

Hi folks,

I have a few questions I was saving for later and never got around to
asking, so I'll send a few emails to the list, one per question, to
keep any follow-up discussions separate...

The first question is:

According to the language reference, LLVM IR is type safe. It means,
for instance, that you won't be able to perform ADD operations on two
different types or call functions with the wrong arguments, etc.

But, when declaring two types that happen to (supposedly) have the
same layout, LLVM ignores the second type and uses the first's name
instead.

In one module, it doesn't matter, but once you join different modules
with, possibly, different data layouts, the data types are not the
same any more.

Does this mean that you are never supposed to join two IRs with
different data layouts (enforced with an error message, assert or
whatever)? Or was it never considered that you could mix them?

In my view, that is the precise reason why we have the data layout.
Unions can't rely on them (why we don't have unions any more) and
compiler data (RTTI, VT, VTT, etc) are all statically created with the
correct size.


-- 
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

John Criswell

2010-Sep-21 16:40 UTC

[LLVMdev] IR type safety

Renato Golin wrote:
> Hi folks,
>
> I have a few questions I was saving for later and never got around to
> asking, so I'll send a few emails to the list, one per question, to
> keep any follow-up discussions separate...
>
> The first question is:
>
> According to the language reference, LLVM IR is type safe. It means,
> for instance, that you won't be able to perform ADD operations on two
> different types or call functions with the wrong arguments, etc.
>   
First, this is only partially correct.  LLVM IR is typed, and most 
operations are type-safe.  However, LLVM can represent type-unsafe code 
through at least the following:

1) LLVM has a cast instruction (and cast constant expression) that can 
cast one type to another.  It's possible to take a float, cast it to an 
int, and add it to another int.

2) LLVM does not require garbage collection or region-based memory 
management.  You can get implicit casting of values if you dereference a 
dangling pointer.

3) LLVM does not prevent a function from returning a pointer to 
stack-allocated memory.  Dangling pointers to stack-allocated objects are 
possible.

That said, you can generate type-safe LLVM IR, and if you force your 
front-end to generate IR with certain restrictions, you can probably 
prove that it is type-safe.
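
As a rough illustration of point 1, here is a small Python model of what reinterpreting a float as an integer and then adding to it does. It assumes the cast in question is a bit-level reinterpretation (what the IR calls `bitcast`) rather than a value conversion; the helper names `bitcast_float_to_i32` and `add_i32` are made up for this sketch and are not LLVM APIs.

```python
import struct


def bitcast_float_to_i32(value: float) -> int:
    """Reinterpret the 32-bit IEEE-754 pattern of `value` as a signed 32-bit int."""
    return struct.unpack("<i", struct.pack("<f", value))[0]


def add_i32(a: int, b: int) -> int:
    """Two's-complement 32-bit addition that wraps on overflow."""
    total = (a + b) & 0xFFFFFFFF
    return total - 0x100000000 if total >= 0x80000000 else total


# The float 1.0 has the bit pattern 0x3F800000.
bits = bitcast_float_to_i32(1.0)

# Nothing stops integer arithmetic on those bits, although the result
# has no meaningful relationship to the original float value.
result = add_i32(bits, 1)
```

The addition itself is well typed (both operands are 32-bit integers); the unsafety comes from the cast that produced one of them.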

> But, when declaring two types that happen to (supposedly) have the
> same layout, LLVM ignores the second type and uses the first's name
> instead.
>
> In one module, it doesn't matter, but once you join different modules
> with, possibly, different data layouts, the data types are not the
> same any more.
>
> Does this mean that you are never supposed to join two IRs with
> different data layouts (enforced with an error message, assert or
> whatever)? Or was it never considered that you could mix them?
>   
I think linking two LLVM bitcode files with different data layouts would 
be hard (especially given different endianness); I think LLVM 2.7 prints a 
warning when the data layouts don't match.  However, I'll let people more 
knowledgeable about LLVM data layout answer this part of your question.

-- John T.
> In my view, that is the precise reason why we have the data layout.
> Unions can't rely on them (why we don't have unions any more) and
> compiler data (RTTI, VT, VTT, etc) are all statically created with the
> correct size.
>
>
>

Devang Patel

2010-Sep-21 16:48 UTC

[LLVMdev] IR type safety

On Sep 21, 2010, at 9:26 AM, Renato Golin wrote:
> But, when declaring two types that happen to (supposedly) have the
> same layout, LLVM ignores the second type and uses the first's name
> instead.
> 
> In one module, it doesn't matter, but once you join different modules
> with, possibly, different data layouts, the data types are not the
> same any more.
Try linking the following two modules using llvm-ld and see what happens.

--- x.ll ---
%struct.x = type { i32, i32 }
%struct.y = type { i32, i32 }

@p = common global %struct.x zeroinitializer, align 4
@p2 = common global %struct.x zeroinitializer, align 4
---

--- y.ll ---
%struct.x = type { i32, i32, i32 }

@p3 = common global %struct.x zeroinitializer, align 4
---

In the combined LLVM IR, @p3 and @p won't match as expected.
-
Devang

Renato Golin

2010-Sep-21 17:17 UTC

[LLVMdev] IR type safety

On 21 September 2010 17:40, John Criswell <criswell at illinois.edu> wrote:
> That said, you can generate type-safe LLVM IR, and if you force your
> front-end to generate IR with certain restrictions, you can probably prove
> that it is type-safe.
Indeed, I was referring to that kind of type safety... ;)


-- 
cheers,
--renato

Renato Golin

2010-Sep-21 17:27 UTC

[LLVMdev] IR type safety

On 21 September 2010 17:48, Devang Patel <dpatel at apple.com> wrote:
> In the combined llvm IR, @p3 and @p won't match as expected.
Hi Devang,

That's not quite what I was thinking... Maybe I explained it badly...

Imagine this:

-- a.ll --
%struct.x = type { i32, i32 }

%a = call i32 @func (%struct.x %b)

-- b.ll --
%struct.y = type { i32, i32 }

declare i32 @func (%struct.y)

Now, imagine that X and Y are completely different structures: they
don't represent the same type in the source code, but in the IR they
got flattened out, so the modules can't distinguish between X and Y.

If I distribute IR (with the same data layout, target triple, etc),
and you try to link against it, it will allow you to put apples in
place of bananas...
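
To make the concern concrete, here is a toy Python model. It is not LLVM's implementation; it only assumes, as described above, that struct types are identified purely by their structure and that names are just labels. The function `check_call` and the tuple encoding of types are invented for this sketch.

```python
# Toy model: a struct type is a tuple of field type names.
# The struct's own name is deliberately not part of the type.
struct_x = ("i32", "i32")  # stands in for %struct.x in a.ll
struct_y = ("i32", "i32")  # stands in for %struct.y in b.ll


def check_call(param_types, arg_types):
    """Accept a call when every argument is structurally equal to its parameter."""
    return len(param_types) == len(arg_types) and all(
        param == arg for param, arg in zip(param_types, arg_types)
    )


# b.ll declares @func as taking %struct.y; a.ll calls it with a %struct.x.
func_params = [struct_y]
call_ok = check_call(func_params, [struct_x])

# A structurally different type is still rejected.
struct_z = ("i32", "i32", "i32")
mismatch_ok = check_call(func_params, [struct_z])
```

Under this model the call from a.ll is accepted because X and Y have identical layouts, while a three-field struct is rejected, which is exactly the apples-for-bananas situation.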

Does it make sense?

-- 
cheers,
--renato
