thr3ads.net - llvm dev - [llvm-dev] [llvm-pdbutil] : merge not working properly [Jan 2019]

Vivien Millet via llvm-dev

2019-Jan-17 18:23 UTC

[llvm-dev] [llvm-pdbutil] : merge not working properly

OK, I see.
What do you mean by “making sure to de-duplicate records as necessary”?

On Thu, Jan 17, 2019 at 19:09, Zachary Turner <zturner at google.com> wrote:
> It's possible in theory to support incremental updates to a PDB (the file
> format is designed specifically with that in mind).  But this functionality
> was never added to the PDB library; since lld doesn't support incremental
> linking, we never really needed it.
>
> The "dumb" way would be to just create a new PDB file and build it using
> the old contents and the new contents (making sure to de-duplicate records
> as necessary).
>
> Supporting incremental updates should be possible, but most of LLVM's file
> I/O abstractions are based around mmapping a file and writing to it, which
> doesn't work when you don't know the file size in advance.  So there would
> be some interesting problems to solve here.
>
> On Thu, Jan 17, 2019 at 10:03 AM Vivien Millet <vivien.millet at gmail.com>
> wrote:
>
>> Hi Zachary!
>> Is there a way to easily create a new PDBFileBuilder from an existing
>> PDBFile, or can/should I do the translation myself?
>> I would like to start from a builder filled with the EXE PDB data and
>> then complete its DBI stream with the JIT module/symbols.
>>
>> Thanks!
>>
>>
>> On Wed, Jan 16, 2019 at 23:41, Vivien Millet <vivien.millet at gmail.com>
>> wrote:
>>
>>> Thank you Zachary!
>>> I will have some soon, I think.
>>> I first need to explore the llvm-pdbutil code more because I don't even
>>> know where to start with the PDB API.
>>>
>>> On Wed, Jan 16, 2019 at 22:51, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Sure. Along the way I’m happy to answer any specific questions you
>>>> might have too, even if it’s for your downstream project.
>>>> On Wed, Jan 16, 2019 at 1:38 PM Vivien Millet <vivien.millet at gmail.com>
>>>> wrote:
>>>>
>>>>> I would be up for improving pdbutil, but I doubt I have enough knowledge
>>>>> or time to provide the complete merge feature, and it would still be a
>>>>> very specific kind of merge as you describe it. Anyway, I could start by
>>>>> trying to do it in my JIT compiler and then, once I get something working
>>>>> (if that happens :)), I can come back to you with the piece of code and
>>>>> see whether it is worth integrating into pdbutil, and how?
>>>>>
>>>>> On Wed, Jan 16, 2019 at 22:12, Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>>
>>>>>> Well, that’s certainly possible, but improving llvm-pdbutil is
>>>>>> another possibility. Doing it directly in your JIT compiler will probably
>>>>>> save you time though, since you won’t have to worry about writing tests
>>>>>> and going through code review.
>>>>>> On Wed, Jan 16, 2019 at 1:01 PM Vivien Millet <
>>>>>> vivien.millet at gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the tips!
>>>>>>> When you talk about doing all of this, I suppose you mean using
>>>>>>> llvm/debuginfo/pdb, picking code here and there to generate the PDB in
>>>>>>> memory, reading the executable's PDB, and performing the merge directly
>>>>>>> in my JIT compiler, right? Not using pdbutil?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 15, 2019 at 22:49, Zachary Turner <zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jan 15, 2019 at 2:50 AM Vivien Millet <
>>>>>>>> vivien.millet at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello Zachary!
>>>>>>>>> Thanks for your time!
>>>>>>>>> So you are one of the happy guys who suffered from the lack of PDB
>>>>>>>>> format information :)
>>>>>>>>>
>>>>>>>> Yes, that would be me :)
>>>>>>>>
>>>>>>>>
>>>>>>>>> To be honest, I'm really a beginner with PDB; I just read some LLVM
>>>>>>>>> documentation to understand what went wrong when merging my PDBs.
>>>>>>>>> In my case, what my team and I are trying to achieve is this:
>>>>>>>>> - Run our application under the Visual Studio debugger
>>>>>>>>> - Generate JIT code (using LLVM MCJIT)
>>>>>>>>> - Then, either:
>>>>>>>>>    - export a COFF obj file with DWARF information and then
>>>>>>>>>      convert it with cv2pdb to obtain a PDB of my JIT symbols
>>>>>>>>>      (what I do now)
>>>>>>>>>    - export my JIT debug info directly to PDB (what I would like
>>>>>>>>>      to do, if you have an idea how)
>>>>>>>>> - Detach the Visual Studio debugger
>>>>>>>>> - Merge my JIT PDB into a copy of the executable PDB (where things
>>>>>>>>>   start to go bad)
>>>>>>>>> - Replace the original executable with the copy (creating a backup
>>>>>>>>>   of the original)
>>>>>>>>> - Reattach the Visual Studio debugger to my executable (loading
>>>>>>>>>   the new PDB version)
>>>>>>>>> - Debug the JIT code with Visual Studio
>>>>>>>>> - On each JIT rebuild, restart these steps from the original
>>>>>>>>>   native executable PDB to avoid merge conflicts between the
>>>>>>>>>   multiple JIT iterations
>>>>>>>>>
>>>>>>>> Yea, it's an interesting use case.  It makes me think it would be
>>>>>>>> nice if the PDB format supported some way of having a symbol which simply
>>>>>>>> refers to another PDB file, that way you could re-write that PDB file at
>>>>>>>> runtime once all your code is jitted, and when the debugger tries to look
>>>>>>>> up that symbol, it finds a record that tells it to go check the other PDB
>>>>>>>> file.
>>>>>>>>
>>>>>>>> So, here are the things I think you would need to do:
>>>>>>>>
>>>>>>>> 1) Create a JIT module in the module list with a unique name.  All
>>>>>>>> symbols will go here.  llvm-pdbutil dump -modules shows you the list.  Be
>>>>>>>> careful about putting it at the end though, because there's already one at
>>>>>>>> the end called * LINKER * that is kind of special.  On the other hand, you
>>>>>>>> don't want to put it first because it means you will have to do lots of
>>>>>>>> fixups on the EXE PDB.  It's probably best to add it right before the
>>>>>>>> linker module; this has the least chance of breaking anything.
>>>>>>>>
>>>>>>>> 2) In the debug stream for this module, add all symbols.  You will
>>>>>>>> need to fix up their type indices.  As you noticed, llvm-pdbutil already
>>>>>>>> merges type information from the JIT PDB, so after merging, the type
>>>>>>>> indices in the EXE PDB will be different than they were in the JIT PDB,
>>>>>>>> but the symbol records will still refer to the JIT PDB type indices.  So
>>>>>>>> these need to be fixed up.  LLD already has code to do this; you can
>>>>>>>> probably borrow a similar algorithm with some slight modifications
>>>>>>>> (lld/COFF/PDB.cpp, search for mergeSymbolRecords).
>>>>>>>>
>>>>>>>> 3) Merge in the new section contributions and section map.  See LLD
>>>>>>>> again for how to modify these.  Hopefully the object file you exported
>>>>>>>> contains relocated symbol addresses so you don't have to do any fixups
>>>>>>>> here.
>>>>>>>>
>>>>>>>> 4) Merge in the publics and globals.  This shouldn't be too hard; I
>>>>>>>> think you can just iterate over them in the JIT PDB and add them to the
>>>>>>>> new EXE PDB.
>>>>>>>>
>>>>>>>> You're kind of in uncharted territory here, so this is just a rough
>>>>>>>> idea of what needs to be done.  There may be other issues that you don't
>>>>>>>> encounter until you actually try it out.
>>>>>>>>
>>>>>>>> Unfortunately I don't personally have the time to work on this, but
>>>>>>>> it sounds neat, and I'm happy to help if you run into questions or
>>>>>>>> problems along the way.
>>>>>>>>
>>>>>>>>
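
The module-placement advice in step 1 of the quoted reply can be sketched roughly as follows. This is only a toy model of the module list as a vector of names, not the LLVM PDB API: `jitModuleInsertPos` and `insertJitModule` are hypothetical helpers, and the linker module name is passed in as a parameter because the exact spelling should be checked against the output of `llvm-pdbutil dump -modules`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): position at which a new module
// should be inserted so that it lands immediately before a trailing linker
// module. If the last entry is not the linker module, the position is the end.
std::size_t jitModuleInsertPos(const std::vector<std::string> &modules,
                               const std::string &linkerName) {
  if (!modules.empty() && modules.back() == linkerName)
    return modules.size() - 1;
  return modules.size();
}

// Inserts jitName into the list and returns the module index it received.
// Modules before that position keep their indices, so only references to
// the linker module would need adjusting in a real PDB.
std::size_t insertJitModule(std::vector<std::string> &modules,
                            const std::string &jitName,
                            const std::string &linkerName) {
  const std::size_t pos = jitModuleInsertPos(modules, linkerName);
  modules.insert(modules.begin() + static_cast<std::ptrdiff_t>(pos), jitName);
  return pos;
}
```

Inserting just before the last entry leaves every existing module index except the linker module's unchanged, which is the reason given in the reply for preferring that position over the front of the list.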

Zachary Turner via llvm-dev

2019-Jan-17 18:36 UTC

Well, for example, the TPI stream is just one big collection of types.
Presumably your JIT code will reuse some of the same types (perhaps
std::string, for example) as your non-jitted code.  Your jitted symbol
records in the object file (for example, a local variable of type
std::string in your jitted code) will refer to the type for std::string by
a TypeIndex, and your original PDB will also refer to std::string by a
different TypeIndex.

In LLD, when we merge in types and symbols from each object file, we keep a
hash table of which types have already been seen, so that if we see the
same type again, we can just use the TypeIndex that we wrote for a previous
object file.  Then, when we add symbol records, we have to update their
fields that used the old TypeIndex to use the new TypeIndex instead.
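
As a rough illustration of that hash-table idea (a simplified model, not LLD's actual code), the sketch below merges the type records of several inputs into one table. `TypeRecord` and `MergedTypeTable` are made-up names; a record is reduced to an opaque payload plus the type indices it refers to, and the sketch assumes each record only refers to records that appear earlier in the same input. The value 0x1000 is the first CodeView index that is not a built-in simple type.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-in for a type record (assumption, not LLVM's CVType).
struct TypeRecord {
  std::string payload;             // everything except type references
  std::vector<std::uint32_t> refs; // TypeIndices this record refers to
};

// Indices below this value denote simple (built-in) types; never remapped.
constexpr std::uint32_t kFirstNonSimple = 0x1000;

class MergedTypeTable {
public:
  // Merges one input's types (in stream order). Element i of the result is
  // the destination TypeIndex of source index kFirstNonSimple + i.
  std::vector<std::uint32_t> merge(const std::vector<TypeRecord> &src) {
    std::vector<std::uint32_t> remap;
    remap.reserve(src.size());
    for (TypeRecord rec : src) {
      // Rewrite references first, so equal types produce equal keys.
      // at() throws if a record refers forward, violating the assumption.
      for (std::uint32_t &ref : rec.refs)
        if (ref >= kFirstNonSimple)
          ref = remap.at(ref - kFirstNonSimple);
      std::string key = makeKey(rec);
      auto it = seen.find(key);
      if (it == seen.end()) {
        const std::uint32_t idx =
            kFirstNonSimple + static_cast<std::uint32_t>(records.size());
        it = seen.emplace(std::move(key), idx).first;
        records.push_back(std::move(rec));
      }
      remap.push_back(it->second);
    }
    return remap;
  }

  std::size_t size() const { return records.size(); }

private:
  // Length-prefixed payload followed by the remapped references.
  static std::string makeKey(const TypeRecord &rec) {
    std::string key = std::to_string(rec.payload.size()) + ":" + rec.payload;
    for (std::uint32_t ref : rec.refs)
      key += "," + std::to_string(ref);
    return key;
  }

  std::vector<TypeRecord> records;
  std::unordered_map<std::string, std::uint32_t> seen;
};
```

The returned map is what a symbol fixup pass would consult: a symbol whose type field holds source index `0x1000 + i` gets rewritten to element `i` of the map, while simple indices are left alone.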

De-duplicating though, I suppose, is not strictly necessary, it will just
keep your PDB size down.  But you *will* need to at least re-write the
TypeIndices from the jitted code.  For example, you may decide that instead
of de-duplicating, you just append them all to the end of the TPI stream
(where all the types go in PDB) to keep things simple.  Since they were in
a different position before, they now have different TypeIndices.  So you
will need to re-write all TypeIndices so that they are correct after the
merge.   Both types and symbols can refer to types, so you will need to do
this both for the types of the jitted code as well as the symbols of the
jitted code.
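
A minimal sketch of the append-without-de-duplicating variant, under the assumption that the JIT types are appended unchanged and in order after the existing EXE types: every non-simple JIT TypeIndex then moves up by the number of EXE type records. `JitType`, `JitSymbol`, `shiftIndex` and `rewriteForAppend` are illustrative names, not LLVM APIs, and real records hold their type references at record-specific offsets rather than in a plain list.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Indices below this value denote simple (built-in) types and stay as-is.
constexpr std::uint32_t kFirstNonSimpleIndex = 0x1000;

// Simplified records (assumption): only the type references matter here.
struct JitType {
  std::string payload;
  std::vector<std::uint32_t> refs; // references to other types
};
struct JitSymbol {
  std::string name;
  std::uint32_t type; // TypeIndex of the symbol's type
};

// JIT record k had index 0x1000 + k; after appending behind exeTypeCount
// existing records it sits at 0x1000 + exeTypeCount + k.
std::uint32_t shiftIndex(std::uint32_t jitIndex, std::uint32_t exeTypeCount) {
  return jitIndex < kFirstNonSimpleIndex ? jitIndex : jitIndex + exeTypeCount;
}

// Rewrites both kinds of references: types referring to types, and symbols
// referring to types.
void rewriteForAppend(std::vector<JitType> &types,
                      std::vector<JitSymbol> &symbols,
                      std::uint32_t exeTypeCount) {
  for (JitType &t : types)
    for (std::uint32_t &ref : t.refs)
      ref = shiftIndex(ref, exeTypeCount);
  for (JitSymbol &s : symbols)
    s.type = shiftIndex(s.type, exeTypeCount);
}
```

This trades PDB size for simplicity: no hashing or comparison of records is needed, only a constant offset applied everywhere a JIT TypeIndex appears.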

Let me know if that makes sense.

Vivien Millet via llvm-dev

2019-Jan-17 18:52 UTC

OK, I understand better what you meant. In fact I don’t care about the PDB
size, at least as a first step, so it won’t be a problem for me to have
duplicated symbols. Concerning TypeIndices, my plan, if possible, is not to
generate a PDB for my JIT code and merge it, but instead to extract the debug
info directly from a DwarfContext just after the llvm::object::ObjectFile is
emitted by the JIT engine, and use it to complete the EXE PDB I rebuilt with
PDBFileBuilder. Does that sound like a good bet to you? If I succeed in doing
that, I think it could be a good extension to the debugging possibilities of
MCJIT, if not an extension to pdbutil.
