So just to compare, here are two different typesafe phantom type implementations. One is bottom down, the other bottom up. This is an example of the following functions: string Value::getName() bool Constant::isNull() bool GlobalValue::isDeclaration() bool GlobalVariable::isGlobalConstant() bool Function::isVarArg() Driver code: val make_constant : unit -> llconstant t val make_global_variable : unit -> llglobalvariable t val make_function : unit -> llfunction t (* typesafe *) value_name (make_constant ());; value_name (make_global_variable ());; value_name (make_function ());; is_null (make_constant ());; is_declaration (make_global_variable ());; is_declaration (make_function ());; is_global_constant (make_global_variable ());; is_var_arg (make_function ());; (* type errors *) is_null (make_global_variable ());; is_null (make_function ());; is_declaration (make_constant ());; is_global_constant (make_constant ());; is_var_arg (make_constant ());; is_var_arg (make_global_variable ());; Bottom up. The advantage of this one is that we'll always be typesafe since we're saying the arguments must be a subset of the specified variants, and thus all the variants are closed. The disadvantage is that if we want to add another type we have to edit all the type definitions. This also means that we can't add another library that subclasses from something and still use these functions: type 'a t type llfunction = [ `Function ] type llglobalvariable = [ `GlobalVariable ] type llglobalvalue = [ llfunction | llglobalvariable ] type llconstant = [ `Constant ] type llvalue = [ llconstant | llglobalvalue ] val value_name : [< llvalue] t -> string val is_null : [< llconstant] t -> bool val is_declaration : [< llglobalvalue] t -> bool val is_global_constant : [< llglobalvariable] t -> bool val is_var_arg : [< llfunction] t -> bool The other way is top down. This lets us extend our types, but sacrifices some type safety, as we're saying that the arguments are a superset of the variants. We can control this by limiting who can create 't's: type 'a t type llvalue = [ `Value ] type llconstant = [ llvalue | `Constant ] type llglobalvalue = [ llvalue | `GlobalValue ] type llglobalvariable = [ llglobalvalue | `GlobalVariable ] type llfunction = [ llglobalvalue | `Function ] val value_name : [> `Value] t -> string val is_null : [> `Constant] t -> bool val is_declaration : [> `GlobalValue] t -> bool val is_global_constant : [> `GlobalVariable] t -> bool val is_var_arg : [> `Function] t -> bool So what's better to use?
Gordon Henriksen
2008-Mar-15 22:56 UTC
[LLVMdev] improving the ocaml binding's type safety
On Mar 15, 2008, at 17:49, Erick Tryzelaar wrote:> The other way is top down. This lets us extend our types, but > sacrifices some type safety, as we're saying that the arguments are > a superset of the variants. We can control this by limiting who can > create 't's: > > type 'a t > > type llvalue = [ `Value ] > type llconstant = [ llvalue | `Constant ] > type llglobalvalue = [ llvalue | `GlobalValue ] > type llglobalvariable = [ llglobalvalue | `GlobalVariable ] > type llfunction = [ llglobalvalue | `Function ] > > val value_name : [> `Value] t -> string > val is_null : [> `Constant] t -> bool > val is_declaration : [> `GlobalValue] t -> bool > val is_global_constant : [> `GlobalVariable] t -> bool > val is_var_arg : [> `Function] t -> boolAh, I get it now. type b = [ a | `B ] is idiomatic type theoretician for b t is a subclass of a t. This seems slightly a hack, using sum types to do exactly the opposite of what they're designed for. :) I think you want to use the sum types for parameter typing; [> `GlobalValue ] is not compatible with llconstant.> Bottom up. The advantage of this one is that we'll always be > typesafe since we're saying the arguments must be a subset of the > specified variants, and thus all the variants are closed. The > disadvantage is that if we want to add another type we have to edit > all the type definitions. This also means that we can't add another > library that subclasses from something and still use these functions: > > type 'a t > type llfunction = [ `Function ] > type llglobalvariable = [ `GlobalVariable ] > type llglobalvalue = [ llfunction | llglobalvariable ] > type llconstant = [ `Constant ] > type llvalue = [ llconstant | llglobalvalue ] > > val value_name : [< llvalue] t -> string > val is_null : [< llconstant] t -> bool > val is_declaration : [< llglobalvalue] t -> bool > val is_global_constant : [< llglobalvariable] t -> bool > val is_var_arg : [< llfunction] t -> boolI find this intensely more intuitive. The LLVM IR is closed and self- contained within the VMCore library (modulo a couple of private User subclasses in the bitcode libraries), so that's not a problem. Clearly, these are syntactically very similar. With my above comment, the difference is literally > vs. <. Since the IR is closed, they're semantically similar as well. The extensible method is clearly preferable for extensible hierarchies, should we have cause to expose any methods on them. Passes are an example of this. What do the respective type errors look like? — Gordon -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080315/08a6f9a5/attachment.html>
On Sat, Mar 15, 2008 at 4:00 PM, Gordon Henriksen <gordonhenriksen at mac.com> wrote:> What do the respective type errors look like?Top down looks like: File "foo.ml", line 21, characters 11-36: This expression has type Bar.llglobalvariable Bar.t but is here used with type ([> `Function ] as 'a) Bar.t Type Bar.llglobalvariable = [ `GlobalValue | `GlobalVariable | `Value ] is not compatible with type 'a The first variant type does not allow tag(s) `Function Bottom up looks like: File "foo.ml", line 21, characters 11-36: This expression has type Bar.llglobalvariable Bar.t but is here used with type ([< Bar.llfunction ] as 'a) Bar.t Type Bar.llglobalvariable = [ `GlobalVariable ] is not compatible with type 'a = [< `Function ] These two variant types have no intersection
Gordon Henriksen
2008-Mar-16 02:33 UTC
[LLVMdev] improving the ocaml binding's type safety
Erick, After some experimentation, I'd prefer the closed system. LLVM has some type peculiarities like the commonality between CallInst and InvokeInst. I find that the closed type system lets me express such constraints more naturally. Expressing these constraints explicitly in the open system involves annotating the C++ class hierarchy with extra variants which are unnecessary in the closed model. Please use 'a Llvm.ty for Type and 'a Llvm.v for Value to save typing. These choices avoid conflicting with the common type binding t and the language keyword val, but promote these important types to the type names into the Llvm module (likely open'd) for brevity's sake. I don't have a better suggestion than just naming the variant sum types Llvm.ll_____. I considered some other options, but decided I'm not fond of them in practice. Thanks, Gordon -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080315/7c624f16/attachment.html>