Eric Schweitz via llvm-dev
2018-Nov-01 20:29 UTC
[llvm-dev] Fwd: RFC: Adding debug information to LLVM to support Fortran
From: flang-dev <flang-dev-bounces at lists.flang-compiler.org> On Behalf Of Eric Schweitz (PGI)
Sent: Thursday, November 01, 2018 1:02 PM
To: flang-dev at lists.flang-compiler.org
Subject: [Flang-dev] RFC: Adding debug information to LLVM to support Fortran

In order to support debugging in the Flang project, work has been done to extend LLVM debug information for the Fortran language. The changes are currently available at https://github.com/flang-compiler/llvm.

In order to upstream these changes into LLVM itself, three smaller changesets, described below, will be uploaded to https://reviews.llvm.org for code review.

1. Elemental, Pure, and Recursive Procedures

DWARF 4 defines attributes for these Fortran procedure specifiers: DW_AT_elemental, DW_AT_pure, and DW_AT_recursive, respectively. LLVM has a way of informing the DWARF generator of simple boolean attributes in the metadata via the flags parameter. We have added these new values to the existing collection of flags.

    !60 = !DISubprogram(…, flags: DIFlagElemental)
    !61 = !DISubprogram(…, flags: DIFlagPure)
    !62 = !DISubprogram(…, flags: DIFlagRecursive)

2. Fortran Type Support

2.1 CHARACTER Intrinsic Type

There is no analog in C for the Fortran CHARACTER type. The Fortran CHARACTER type maps to the DWARF tag DW_TAG_string_type. We have added a new named DI node, !DIStringType, to LLVM to generate this DWARF information.

    !21 = !DIStringType(name: "character(5)", size: 40)

This produces the following DWARF information.

    DW_TAG_string_type:
      DW_AT_name: "character(5)"
      DW_AT_byte_size: 5

CHARACTER types can also have deferred length. This is supported in the new metadata as follows.

    !22 = !DIStringType(name: "character(*)!1", size: 32, stringLength: !23, stringLengthExpression: !DIExpression())
    !23 = !DILocalVariable(scope: !3, arg: 4, file: !4, type: !5, flags: DIFlagArtificial)

This will generate the following DWARF information.
    DW_TAG_string_type:
      DW_AT_name: character(*)!1
      DW_AT_string_length: 0x9b (location list)
      DW_AT_byte_size: 4

2.2 Fortran Array Types and Bounds

In this section we refer to the DWARF tag DW_TAG_array_type, which is used to describe Fortran arrays.

However, in Fortran, arrays are not types but rather runtime data objects: a multidimensional rectangular set of scalar data of homogeneous type. An array object has dimensions (rank and corank) and extents in those dimensions. The rank and ranges of the extents of an array may not be known until runtime. Arrays may be reshaped, acted upon in whole or in part, or otherwise be referenced (perhaps even in reverse order) non-contiguously. Furthermore, arrays may be allocated and deallocated at runtime and aliased through other POINTER objects. In short, Fortran array objects are not readily mappable to the array model of the C family of languages, and more expressive DWARF information is required.

2.2.1 Explicit array dimensions

An array may be given a constant size as in the following example. The example shows a two-dimensional array, named array, that has indices from 1 to 10 for the rows and 2 to 11 for the columns.

    TYPE(t) :: array(10,2:11)

For this declaration, the compiler generates the following LLVM metadata.

    !100 = !DIFortranArrayType(baseType: !7, elements: !101)
    !101 = !{ !102, !103 }
    !102 = !DIFortranSubrange(constLowerBound: 1, constUpperBound: 10)
    !103 = !DIFortranSubrange(constLowerBound: 2, constUpperBound: 11)

The DWARF generated for this is as follows. (DWARF asserts in the standard that arrays are interpreted as column-major.)

    DW_TAG_array_type:
      DW_AT_name: array
      DW_AT_type: 4d08 ;TYPE(t)
      DW_TAG_subrange_type:
        DW_AT_type: int
        DW_AT_lower_bound: 1
        DW_AT_upper_bound: 10
      DW_TAG_subrange_type:
        DW_AT_type: int
        DW_AT_lower_bound: 2
        DW_AT_upper_bound: 11

2.2.2 Adjustable arrays

By adjustable arrays, we mean that an array may have its size passed explicitly as another argument.
    SUBROUTINE subr2(array2,N)
      INTEGER :: N
      TYPE(t) :: array2(N)

In this case, the compiler expresses the upper bound of the !DIFortranSubrange as an expression that references the dummy argument, N.

    call void @llvm.dbg.declare(metadata i64* %N, metadata !113, metadata !DIExpression())
    …
    !110 = !DIFortranArrayType(baseType: !7, elements: !111)
    !111 = !{ !112 }
    !112 = !DIFortranSubrange(lowerBound: 1, upperBound: !113, upperBoundExpression: !DIExpression(DW_OP_deref))
    !113 = !DILocalVariable(scope: !2, name: "zb1", file: !3, type: !4, flags: DIFlagArtificial)

It turned out that gdb didn’t properly interpret location lists or variable references in the DW_AT_lower_bound and DW_AT_upper_bound attribute forms, so the compiler must generate either a constant or a block with the DW_OP operations for each of them.

    DW_TAG_array_type:
      DW_AT_name: array2
      DW_AT_type: 4d08 ;TYPE(t)
      DW_TAG_subrange_type:
        DW_AT_type: int
        DW_AT_lower_bound: 1
        DW_AT_upper_bound: 2 byte block: 91 70

2.2.3 Assumed size arrays

An assumed size array leaves the last dimension of the array unspecified.

    SUBROUTINE subr3(array3)
      TYPE(t) :: array3(*)

The compiler generates DWARF information without an upper bound, such as in this snippet.

    DW_TAG_array_type:
      DW_AT_name: array3
      DW_TAG_subrange_type:
        DW_AT_type: int
        DW_AT_lower_bound: 1

This DWARF is produced by omitting the upper bound information from the metadata.

    !122 = !DIFortranSubrange(lowerBound: 1)

2.2.4 Assumed shape arrays

Fortran also has assumed shape arrays, which allow extra state to be passed into the procedure to describe the shape of the array dummy argument. This extra information is the array descriptor, which is generated by the compiler and passed as a hidden argument.

    SUBROUTINE subr4(array4)
      TYPE(t) :: array4(:,:)

In this case, the compiler generates DWARF expressions to access the results of the procedure’s usage of the array descriptor argument when it computes the lower bound (DW_AT_lower_bound) and upper bound (DW_AT_upper_bound).
    …
    call void @llvm.dbg.declare(metadata i64* %4, metadata !134, metadata !DIExpression())
    call void @llvm.dbg.declare(metadata i64* %8, metadata !136, metadata !DIExpression())
    call void @llvm.dbg.declare(metadata i64* %9, metadata !137, metadata !DIExpression())
    call void @llvm.dbg.declare(metadata i64* %13, metadata !139, metadata !DIExpression())
    …
    !130 = !DIFortranArrayType(baseType: !80, elements: !131)
    !131 = !{ !132, !133 }
    !132 = !DISubrange(lowerBound: !134, lowerBoundExpression: !DIExpression(DW_OP_deref), upperBound: !136, upperBoundExpression: !DIExpression(DW_OP_deref))
    !133 = !DISubrange(lowerBound: !137, lowerBoundExpression: !DIExpression(DW_OP_deref), upperBound: !139, upperBoundExpression: !DIExpression(DW_OP_deref))
    !134 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
    !136 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
    !137 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
    !139 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)

The DWARF generated for this is as follows.

    DW_TAG_array_type:
      DW_AT_name: array4
      DW_AT_type: 4d08 ;TYPE(t)
      DW_TAG_subrange_type:
        DW_AT_type: int
        DW_AT_lower_bound: 2 byte block: 91 78
        DW_AT_upper_bound: 2 byte block: 91 70
      DW_TAG_subrange_type:
        DW_AT_type: int
        DW_AT_lower_bound: 2 byte block: 91 68
        DW_AT_upper_bound: 2 byte block: 91 60

2.2.5 Assumed rank arrays and coarrays

This changeset does not address DWARF 5 extensions to support assumed rank arrays or coarrays.

3. Fortran COMMON Block

COMMON blocks are a feature of Fortran that has no direct analog in the C family of languages, but they are similar to data sections in assembly language programming. A COMMON block is a named area of memory that holds a collection of variables. Fortran subprograms may map the COMMON block memory area to their own, possibly distinct, non-empty list of variables. A Fortran COMMON block might look like the following example.
    COMMON /ALPHA/ I, J

For this construct, the compiler generates a new scope-like DI construct (!DICommonBlock) into which variables (see I, J above) can be placed. As the common block implies a range of storage with global lifetime, the !DICommonBlock refers to a !DIGlobalVariable. The Fortran variables that comprise the COMMON block are also linked via metadata to offsets within the global variable that stands for the entire common block.

    @alpha_ = common global %alphabytes_ zeroinitializer, align 64, !dbg !27, !dbg !30, !dbg !33
    !14 = distinct !DISubprogram(…)
    !20 = distinct !DICommonBlock(scope: !14, declaration: !25, name: "alpha")
    !25 = distinct !DIGlobalVariable(scope: !20, name: "common alpha", type: !24)
    !27 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression())
    !29 = distinct !DIGlobalVariable(scope: !20, name: "i", file: !3, type: !28)
    !30 = !DIGlobalVariableExpression(var: !29, expr: !DIExpression())
    !31 = distinct !DIGlobalVariable(scope: !20, name: "j", file: !3, type: !28)
    !32 = !DIExpression(DW_OP_plus_uconst, 4)
    !33 = !DIGlobalVariableExpression(var: !31, expr: !32)

The DWARF generated for this is as follows.

    DW_TAG_common_block:
      DW_AT_name: alpha
      DW_AT_location: @alpha_+0
      DW_TAG_variable:
        DW_AT_name: common alpha
        DW_AT_type: array of 8 bytes
        DW_AT_location: @alpha_+0
      DW_TAG_variable:
        DW_AT_name: i
        DW_AT_type: integer*4
        DW_AT_location: @alpha_+0
      DW_TAG_variable:
        DW_AT_name: j
        DW_AT_type: integer*4
        DW_AT_location: @alpha_+4

--
Eric
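The DWARF listings in sections 2.2.2 and 2.2.4 show bounds as two-byte blocks such as 91 70. The sketch below decodes them under one assumption: that the first byte, 0x91, is DW_OP_fbreg (its value in the DWARF standard) and that the remaining byte is its signed LEB128 operand, an offset from the frame base. The helper names decode_sleb128 and decode_fbreg_block are made up for this illustration and are not part of LLVM or gdb.

```python
DW_OP_FBREG = 0x91  # opcode value defined by the DWARF standard


def decode_sleb128(data: bytes) -> int:
    """Decode one signed LEB128 value from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):  # high bit clear: this is the last byte
            if byte & 0x40:  # bit 6 of the last byte is the sign bit
                result -= 1 << shift
            return result
    raise ValueError("truncated SLEB128 value")


def decode_fbreg_block(block: bytes) -> int:
    """Return the frame-base offset encoded in a DW_OP_fbreg block."""
    if not block or block[0] != DW_OP_FBREG:
        raise ValueError("not a DW_OP_fbreg block")
    return decode_sleb128(block[1:])


if __name__ == "__main__":
    # Byte blocks copied from the DWARF listings in 2.2.2 and 2.2.4.
    for text in ("91 70", "91 78", "91 68", "91 60"):
        offset = decode_fbreg_block(bytes.fromhex(text))
        print(f"{text} -> DW_OP_fbreg {offset}")
```

Read this way, each bound is an 8-byte slot at a negative offset from the frame base, which matches the i64 values named in the @llvm.dbg.declare calls.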
Adrian Prantl via llvm-dev
2018-Nov-01 22:12 UTC
[llvm-dev] RFC: Adding debug information to LLVM to support Fortran
Thanks for sharing your plans. I made a few inline comments on things I noticed during my first quick read-through.

-- adrian

> On Nov 1, 2018, at 1:29 PM, Eric Schweitz via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> 1. Elemental, Pure, and Recursive Procedures
>
> DWARF 4 defines attributes for these Fortran procedure specifiers: DW_AT_elemental, DW_AT_pure, DW_AT_recursive, resp. LLVM has a way of informing the DWARF generator of simple boolean attributes in the metadata via the flags parameter. We have added these new values to the existing collection of flags.
>
> !60 = !DISubprogram(…, flags: DIFlagElemental)
> !61 = !DISubprogram(…, flags: DIFlagPure)
> !62 = !DISubprogram(…, flags: DIFlagRecursive)

That change should be fairly straightforward, although we are starting to run out of bits in DIFlags, so we may need to come up with a clever encoding (such as reusing bits that aren't used in a DISubprogram context).

> 2.1 CHARACTER Intrinsic Type
>
> CHARACTER types can also have deferred length. This is supported in the new metadata as follows.
>
> !22 = !DIStringType(name: "character(*)!1", size: 32, stringLength: !23, stringLengthExpression: !DIExpression())
> !23 = !DILocalVariable(scope: !3, arg: 4, file: !4, type: !5, flags: DIFlagArtificial)

Can you take a look at how variable-length arrays in C99 are implemented in Clang at the moment? It would be nice to use a similar scheme here.

> 2.2.1 Explicit array dimensions
>
> For this declaration, the compiler generates the following LLVM metadata.
>
> !100 = !DIFortranArrayType(baseType: !7, elements: !101)

Since the DI* hierarchy really just is the DWARF type hierarchy, I don't think we will need to introduce any Fortran-specific names for arrays.

> 2.2.2 Adjustable arrays
>
> !112 = !DIFortranSubrange(lowerBound: 1, upperBound: !113, upperBoundExpression: !DIExpression(DW_OP_deref))

It would be better (and much more robust in the presence of optimizations) if the DIExpression were part of a @llvm.dbg.declare / value intrinsic tying the DILocalVariable to an LLVM SSA value.

> 2.2.4 Assumed shape arrays
>
> !132 = !DISubrange(lowerBound: !134, lowerBoundExpression: !DIExpression(DW_OP_deref), upperBound: !136, upperBoundExpression: !DIExpression(DW_OP_deref))
> !133 = !DISubrange(lowerBound: !137, lowerBoundExpression: !DIExpression(DW_OP_deref), upperBound: !139, upperBoundExpression: !DIExpression(DW_OP_deref))

Same here.
via llvm-dev
2018-Nov-01 22:27 UTC
[llvm-dev] RFC: Adding debug information to LLVM to support Fortran
Regarding flags, I was just thinking that maybe we should invent a new DISubprogramFlags type. DISubprogram already has a few bitfields for subprogram-specific things, Fortran will want 3 more, and there's no reason to fill up the generic DIFlags with more bits that are used in only one class.

I agree that the array stuff needs to be designed with an eye to handling how other languages do arrays, and leverage the common aspects. Several languages have runtime-sized arrays and it would be nice to handle them all the same way.

However, the CHARACTER type probably does want to be DW_TAG_string_type rather than an array. COBOL also has strings as a fundamental type.

I guess we'll have to learn what all the Fortran array stuff actually means now…

--paulr
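As a rough illustration of the separate-flags idea, the sketch below models a subprogram-only flag set that maps onto the three DWARF attributes from section 1 without consuming bits in a generic flag word. It is a Python model, not LLVM code: the class name echoes the DISubprogramFlags name suggested above, and the bit positions are assumptions chosen for the example.

```python
from enum import IntFlag


class DISubprogramFlags(IntFlag):
    """Hypothetical subprogram-only flags; bit positions are illustrative."""

    ELEMENTAL = 1 << 0
    PURE = 1 << 1
    RECURSIVE = 1 << 2


# DWARF 4 attribute emitted for each flag (attribute names from section 1).
_DWARF_ATTRIBUTE = {
    DISubprogramFlags.ELEMENTAL: "DW_AT_elemental",
    DISubprogramFlags.PURE: "DW_AT_pure",
    DISubprogramFlags.RECURSIVE: "DW_AT_recursive",
}


def dwarf_attributes(flags: DISubprogramFlags) -> list:
    """List the DWARF attributes implied by a set of subprogram flags."""
    return [name for flag, name in _DWARF_ATTRIBUTE.items() if flag in flags]


if __name__ == "__main__":
    flags = DISubprogramFlags.PURE | DISubprogramFlags.RECURSIVE
    print(dwarf_attributes(flags))
```

Because the flags live in their own type, adding a fourth subprogram-specific flag later would not compete with type or variable flags for the same bits.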
Eric Schweitz (PGI) via llvm-dev
2018-Nov-01 23:26 UTC
[llvm-dev] RFC: Adding debug information to LLVM to support Fortran
Hi Adrian,

Thank you for the quick reply.

(1) I agree. We’re running out of bits there. Furthermore, I have encoded these flags in a recycle-minded way to skirt the issue.

(2) I will take a look. Note that Fortran debuggers expect DW_TAG_string_type for CHARACTER; and, they are distinct from arrays of INTEGER*1. (Yes, the conflation of string and CHARACTER terms is unfortunate.)

(3) The Fortran DI name choice was to deliberately keep things separated. That may not be necessary as we move forward, but it was good at the time for sanity’s sake.

(4) My understanding is the @llvm.dbg.declare !DIExpression is to track the location of a local variable. The !DIExpression in the lower (or upper) bound is intended as a way to describe how to find or compute the bound’s value from the array descriptor “variable” (computed from a member of a hidden argument to the function). The distinction isn’t entirely clear in the example.

I’ll work on getting the patches put up. Thanks again.

--
Eric

From: Adrian Prantl <aprantl at apple.com>
Date: Thu, Nov 1, 2018 at 3:12 PM
Subject: Re: [llvm-dev] RFC: Adding debug information to LLVM to support Fortran
To: Eric Schweitz <eric.schweitz2 at gmail.com>
Cc: <llvm-dev at lists.llvm.org>

Thanks for sharing your plans, I made a few comments inline I noticed on my first quick read-through.
-- adrian

On Nov 1, 2018, at 1:29 PM, Eric Schweitz via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> From: flang-dev <flang-dev-bounces at lists.flang-compiler.org> On Behalf Of Eric Schweitz (PGI)
> Sent: Thursday, November 01, 2018 1:02 PM
> To: flang-dev at lists.flang-compiler.org
> Subject: [Flang-dev] RFC: Adding debug information to LLVM to support Fortran
>
> In order to support debugging in the Flang project, work has been done to extend LLVM debug information for the Fortran language. The changes are currently available at https://github.com/flang-compiler/llvm.
>
> In order to upstream these changes into LLVM itself, three smaller changesets, described below, will be uploaded to https://reviews.llvm.org for code review.
>
> 1. Elemental, Pure, and Recursive Procedures
>
> DWARF 4 defines attributes for these Fortran procedure specifiers: DW_AT_elemental, DW_AT_pure, DW_AT_recursive, resp. LLVM has a way of informing the DWARF generator of simple boolean attributes in the metadata via the flags parameter. We have added these new values to the existing collection of flags.
>
>   !60 = !DISubprogram(…, flags: DIFlagElemental)
>   !61 = !DISubprogram(…, flags: DIFlagPure)
>   !62 = !DISubprogram(…, flags: DIFlagRecursive)

That change should be fairly straightforward, although we are starting to run out of bits in DIFlags, so we may need to come up with a clever encoding (such as reusing bits that aren't used in a DISubprogram context).

> 2. Fortran Type Support
>
> 2.1 CHARACTER Intrinsic Type
>
> There is no analog in C for the Fortran CHARACTER type. The Fortran CHARACTER type maps to the DWARF tag, DW_TAG_string_type. We have added a new named DI to LLVM to generate this DWARF information.
>
>   !21 = !DIStringType(name: "character(5)", size: 40)
>
> This produces the following DWARF information.
>   DW_TAG_string_type:
>     DW_AT_name: "character(5)"
>     DW_AT_byte_size: 5
>
> CHARACTER types can also have deferred length. This is supported in the new metadata as follows.
>
>   !22 = !DIStringType(name: "character(*)!1", size: 32, stringLength: !23, stringLengthExpression: !DIExpression())
>   !23 = !DILocalVariable(scope: !3, arg: 4, file: !4, type: !5, flags: DIFlagArtificial)

Can you take a look at how variable-length arrays in C99 are implemented in Clang at the moment? It would be nice to use a similar scheme here.

> This will generate the following DWARF information.
>
>   DW_TAG_string_type:
>     DW_AT_name: character(*)!1
>     DW_AT_string_length: 0x9b (location list)
>     DW_AT_byte_size: 4
>
> 2.2 Fortran Array Types and Bounds
>
> In this section we refer to the DWARF tag, DW_TAG_array_type, which is used to describe Fortran arrays. However, in Fortran, arrays are not types but are rather runtime data objects, a multidimensional rectangular set of scalar data of homogeneous type. An array object has dimensions (rank and corank) and extents in those dimensions. The rank and ranges of the extents of an array may not be known until runtime. Arrays may be reshaped, acted upon in whole or in part, or otherwise be referenced (perhaps even in reverse order) non-contiguously. Furthermore, arrays may be allocated and deallocated at runtime and aliased through other POINTER objects. In short, Fortran array objects are not readily mappable to the array model of the C family of languages, and more expressive DWARF information is required.
>
> 2.2.1 Explicit array dimensions
>
> An array may be given a constant size as in the following example. The example shows a two-dimensional array, named array, that has indices from 1 to 10 for the rows and 2 to 11 for the columns.
>
>   TYPE(t) :: array(10,2:11)
>
> For this declaration, the compiler generates the following LLVM metadata.
>   !100 = !DIFortranArrayType(baseType: !7, elements: !101)

Since the DI* hierarchy really just is the DWARF type hierarchy, I don't think we will need to introduce any Fortran-specific names for arrays.

>   !101 = !{ !102, !103 }
>   !102 = !DIFortranSubrange(constLowerBound: 1, constUpperBound: 10)
>   !103 = !DIFortranSubrange(constLowerBound: 2, constUpperBound: 11)
>
> The DWARF generated for this is as follows. (DWARF asserts in the standard that arrays are interpreted as column-major.)
>
>   DW_TAG_array_type:
>     DW_AT_name: array
>     DW_AT_type: 4d08 ;TYPE(t)
>     DW_TAG_subrange_type:
>       DW_AT_type: int
>       DW_AT_lower_bound: 1
>       DW_AT_upper_bound: 10
>     DW_TAG_subrange_type:
>       DW_AT_type: int
>       DW_AT_lower_bound: 2
>       DW_AT_upper_bound: 11
>
> 2.2.2 Adjustable arrays
>
> By adjustable arrays, we mean that an array may have its size passed explicitly as another argument.
>
>   SUBROUTINE subr2(array2,N)
>     INTEGER :: N
>     TYPE(t) :: array2(N)
>
> In this case, the compiler expresses the !DISubrange as an expression that references the dummy argument, N.
>
>   call void @llvm.dbg.declare(metadata i64* %N, metadata !113, metadata !DIExpression())
>   …
>   !110 = !DIFortranArrayType(baseType: !7, elements: !111)
>   !111 = !{ !112 }
>   !112 = !DIFortranSubrange(lowerBound: 1, upperBound: !113, upperBoundExpression: !DIExpression(DW_OP_deref))

It would be better (and much more robust in presence of optimizations) if the DIExpression were part of a @llvm.dbg.declare / value intrinsic tying the DILocalVariable to an LLVM SSA value.

>   !113 = !DILocalVariable(scope: !2, name: "zb1", file: !3, type: !4, flags: DIFlagArtificial)
>
> It turned out that gdb didn’t properly interpret location lists or variable references in the DW_AT_lower_bound and DW_AT_upper_bound attribute forms, so the compiler must generate either a constant or a block with the DW_OP operations for each of them.
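For readers unfamiliar with the raw block notation used in the DWARF listings of this message (for example "2 byte block: 91 70"), here is a minimal Python sketch that decodes such a block. It is not part of the proposed patch. It assumes the block holds a single DW_OP_fbreg operation (opcode 0x91 in the DWARF specification) followed by a signed LEB128 offset from the frame base; the helper names decode_sleb128 and decode_fbreg_block are hypothetical.

```python
DW_OP_FBREG = 0x91  # DW_OP_fbreg opcode from the DWARF specification


def decode_sleb128(data):
    """Decode a signed LEB128 value from the start of `data`.

    Returns (value, number_of_bytes_consumed).
    """
    result = 0
    shift = 0
    for count, byte in enumerate(data, start=1):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:          # high bit clear: this is the last byte
            if byte & 0x40:          # sign bit of the final 7-bit group
                result -= 1 << shift
            return result, count
    raise ValueError("truncated SLEB128 value")


def decode_fbreg_block(block):
    """Return the frame-base offset of a block holding one DW_OP_fbreg."""
    if not block or block[0] != DW_OP_FBREG:
        raise ValueError("this sketch only handles DW_OP_fbreg")
    offset, used = decode_sleb128(block[1:])
    if used != len(block) - 1:
        raise ValueError("unexpected trailing bytes")
    return offset


print(decode_fbreg_block(bytes.fromhex("9170")))  # prints -16
```

Read this way, the block 91 70 names a stack slot 16 bytes below the frame base, which would be where the compiler keeps the bound's value.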
>   DW_TAG_array_type:
>     DW_AT_name: array2
>     DW_AT_type: 4d08 ;TYPE(t)
>     DW_TAG_subrange_type:
>       DW_AT_type: int
>       DW_AT_lower_bound: 1
>       DW_AT_upper_bound: 2 byte block: 91 70
>
> 2.2.3 Assumed size arrays
>
> An assumed size array leaves the last dimension of the array unspecified.
>
>   SUBROUTINE subr3(array3)
>     TYPE(t) :: array3(*)
>
> The compiler generates DWARF information without an upper bound, such as in this snippet.
>
>   DW_TAG_array_type
>     DW_AT_name: array3
>     DW_TAG_subrange_type
>       DW_AT_type = int
>       DW_AT_lower_bound = 1
>
> This DWARF is produced by omission of the upper bound information.
>
>   !122 = !DIFortranSubrange(lowerBound: 1)
>
> 2.2.4 Assumed shape arrays
>
> Fortran also has assumed shape arrays, which allow extra state to be passed into the procedure to describe the shape of the array dummy argument. This extra information is the array descriptor, generated by the compiler, and passed as a hidden argument.
>
>   SUBROUTINE subr4(array4)
>     TYPE(t) :: array4(:,:)
>
> In this case, the compiler generates DWARF expressions to access the results of the procedure’s usage of the array descriptor argument when it computes the lower bound (DW_AT_lower_bound) and upper bound (DW_AT_upper_bound).
>
>   …
>   call void @llvm.dbg.declare(metadata i64* %4, metadata !134, metadata !DIExpression())
>   call void @llvm.dbg.declare(metadata i64* %8, metadata !136, metadata !DIExpression())
>   call void @llvm.dbg.declare(metadata i64* %9, metadata !137, metadata !DIExpression())
>   call void @llvm.dbg.declare(metadata i64* %13, metadata !139, metadata !DIExpression())
>   …
>   !130 = !DIFortranArrayType(baseType: !80, elements: !131)
>   !131 = !{ !132, !133 }
>   !132 = !DISubrange(lowerBound: !134, lowerBoundExpression: !DIExpression(DW_OP_deref), upperBound: !136, upperBoundExpression: !DIExpression(DW_OP_deref))
>   !133 = !DISubrange(lowerBound: !137, lowerBoundExpression: !DIExpression(DW_OP_deref), upperBound: !139, upperBoundExpression: !DIExpression(DW_OP_deref))

same here.
>   !134 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
>   !136 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
>   !137 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
>   !139 = !DILocalVariable(scope: !2, file: !3, type: !9, flags: DIFlagArtificial)
>
> The DWARF generated for this is as follows.
>
>   DW_TAG_array_type:
>     DW_AT_name: array4
>     DW_AT_type: 4d08 ;TYPE(t)
>     DW_TAG_subrange_type:
>       DW_AT_type: int
>       DW_AT_lower_bound: 2 byte block: 91 78
>       DW_AT_upper_bound: 2 byte block: 91 70
>     DW_TAG_subrange_type:
>       DW_AT_type: int
>       DW_AT_lower_bound: 2 byte block: 91 68
>       DW_AT_upper_bound: 2 byte block: 91 60
>
> 2.2.5 Assumed rank arrays and coarrays
>
> This changeset does not address DWARF 5 extensions to support assumed rank arrays or coarrays.
>
> 3. Fortran COMMON Block
>
> COMMON blocks are a feature of Fortran that has no direct analog in the C family of languages, but they are similar to data sections in assembly language programming. A COMMON block is a named area of memory that holds a collection of variables. Fortran subprograms may map the COMMON block memory area to their own, possibly distinct, non-empty list of variables. A Fortran COMMON block might look like the following example.
>
>   COMMON /ALPHA/ I, J
>
> For this construct, the compiler generates a new scope-like DI construct (!DICommonBlock) into which variables (see I, J above) can be placed. As the common block implies a range of storage with global lifetime, the !DICommonBlock refers to a !DIGlobalVariable. The Fortran variables that comprise the COMMON block are also linked via metadata to offsets within the global variable that stands for the entire common block.
Adrian Prantl via llvm-dev
2018-Nov-01 23:39 UTC
[llvm-dev] RFC: Adding debug information to LLVM to support Fortran
> On Nov 1, 2018, at 4:26 PM, Eric Schweitz (PGI) <eric.schweitz at pgroup.com> wrote:
>
> Hi Adrian,
>
> Thank you for the quick reply.
>
> (1) I agree. We’re running out of bits there. Furthermore, I have encoded these flags in a recycle-minded way to skirt the issue.
>
> (2) I will take a look. Note that Fortran debuggers expect DW_TAG_string_type for CHARACTER; and, they are distinct from arrays of INTEGER*1. (Yes, the conflation of string and CHARACTER terms is unfortunate.)
>
> (3) The Fortran DI name choice was to deliberately keep things separated. That may not be necessary as we move forward, but it was good at the time for sanity’s sake.
>
> (4) My understanding is the @llvm.dbg.declare !DIExpression is to track the location of a local variable. The !DIExpression in the lower (or upper) bound is intended as a way to describe how to find or compute the bound’s value from the array descriptor “variable” (computed from a member of a hidden argument to the function). The distinction isn’t entirely clear in the example.

There probably should be one artificial variable per computed property of the array; the respective bounds then refer to their own variables, and if an expression is necessary, it should be tied to the DIVariable using dbg.value/dbg.declare (which I think you'll need anyway).

-- adrian

>
> I’ll work on getting the patches put up. Thanks again.
>
> --