I'm CC'ing the llvm-dev list because other people are more knowledgeable about the bytecode format/encoding than I am. Please send follow-up replies to the list.

On Wed, Oct 20, 2004 at 11:27:53AM -0700, Yiping Fan wrote:
> We also want to extend the llvm instructions/intrinsic
> functions/types/passes to support our high-level synthesis for
> hardware. First of all, we want to enhance the Bytecode/Asm Writer
> and reader to support many attributes for every instruction, basic
> block, function, and module. Basically, we want many extra fields to
> be written out and read in. However, I cannot find an obvious way to do
> this in current LLVM. Do you have any suggestion about this?

We have a document describing the bytecode format here:
[1] http://llvm.cs.uiuc.edu/docs/BytecodeFormat.html
Also, a document on how to add new instructions and intrinsic functions:
[2] http://llvm.cs.uiuc.edu/docs/ExtendingLLVM.html
([2] isn't exactly what you're asking for, but it is related.)

The question is, do you really want extra fields in the instructions? What is it that you need to represent that the current system does not allow you to do?

If you want multiple passes to communicate some information about the LLVM bytecode, perhaps it is better to keep a map/vector/etc. on the side for this "side-band" information?

You should realize that as soon as you change instructions and/or their meaning, you may prevent the current set of analyzers and optimizations from working as they do now.

--
Misha Brukman :: http://misha.brukman.net :: http://llvm.cs.uiuc.edu
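A minimal sketch of the "map on the side" idea, written as standalone C++ so that it compiles without LLVM headers. The `Instruction` struct is only a stand-in for the real class, and `ScheduleInfo` and `ScheduleTable` are hypothetical names invented for this illustration; nothing here is an existing LLVM API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for llvm::Instruction so the sketch compiles on its own.
struct Instruction {
  std::string Name;
};

// Hypothetical per-instruction annotation for high-level synthesis.
struct ScheduleInfo {
  unsigned ControlStep;       // cycle the instruction is scheduled in
  std::string FunctionalUnit; // unit the instruction is bound to
};

// Side table keyed by instruction address. The instruction class itself
// is left untouched, so code that knows nothing about scheduling is
// unaffected.
class ScheduleTable {
  std::map<const Instruction *, ScheduleInfo> Table;

public:
  void set(const Instruction *I, const ScheduleInfo &Info) {
    assert(I && "cannot annotate a null instruction");
    Table[I] = Info;
  }

  // Returns the annotation, or nullptr if none was recorded.
  const ScheduleInfo *lookup(const Instruction *I) const {
    auto It = Table.find(I);
    return It == Table.end() ? nullptr : &It->second;
  }

  // Call this when an instruction is deleted; otherwise the entry would
  // be keyed by a dangling pointer.
  void erase(const Instruction *I) { Table.erase(I); }
};
```

The trade-off of this design is that the table has to be kept in sync by hand: any transformation that deletes or replaces an instruction must also update the table.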
Yes. We need extra fields in the instruction. For example, during high-level synthesis, we must schedule an instruction to a certain control step (or cycle) and bind it to execute on a certain functional unit, etc.

Besides in-memory exchange of this information, we also want on-disk exchange. That introduces the write-out/parse-in problem.

Thanks

----- Original Message -----
From: "Misha Brukman" <brukman at uiuc.edu>
To: "Yiping Fan" <fanyp at CS.UCLA.EDU>; "'Zhiru Zhang'" <zhiruz at CS.UCLA.EDU>; "Guoling Han" <leohgl at CS.UCLA.EDU>
Cc: <llvmdev at cs.uiuc.edu>
Sent: Wednesday, October 20, 2004 11:43 AM
Subject: Re: LLVM Compiler Infrastructure Tutorial
Yiping,

The LLVM IR is a mid-level representation, and it is inadvisable to alter the definition of an instruction. What you're trying to do seems to be related to code generation (scheduling and binding to a functional unit). So, I would suggest that you write an analysis pass to compute and provide the information you need out of band. Then, you will need to provide a code generation pass that uses the analysis information to perform the necessary scheduling and functional unit assignment. I believe this approach will allow existing transforms and analyses to continue to work while providing you with the information you need.

You should review the code in the Analysis library for examples of how to write a function pass and associate its results with instructions. If you need to store this information with the bytecode (ill-advised), the bcreader and bcwriter libraries could be augmented to also store your computed information.

Without more details on what you're trying to accomplish (the ultimate goal), it's hard to be more specific about what you need to do.

Reid.
On Wed, Oct 20, 2004 at 11:59:45AM -0700, Yiping Fan wrote:
> Yeah. We need to have more extra fields in the instruction. For
> example, during high-level synthesis, we must schedule an instruction
> to a certain control step (or cycle), and bind it to be executed on a
> certain functional unit, etc.

Since we're talking about "execution" and "scheduling", you are creating a back-end for LLVM, correct? In that case, we're talking about code generation.

LLVM is a target-independent representation. Note that with LLVM, we are trying to separate WHAT the program is doing (the meaning of the program) from HOW it does it (which specific instructions get executed and when, and this includes scheduling). What you are trying to add to it is target-dependent (e.g. scheduling). That is not advisable on several levels, one of which is breaking the target abstraction that we have tried hard to maintain.

Take a look at the X86, PowerPC, and Sparc target code generators (llvm/lib/Target/*). They use a different representation, specifically the MachineInstr, MachineBasicBlock, and MachineFunction classes, which are target-dependent (for example, they include asm opcodes and target registers). Something target-dependent such as scheduling and assignment to functional units would be done in this representation, after code generation (LLVM -> machine code).

Presumably, this (e.g. scheduling) information is not provided by the C/C++ front-end, but computed by a pass that you would write, correct? Then you can always compute this information on the fly, before any pass that needs it runs. As Reid mentioned, take a look at the Analysis interfaces and see if you can implement this as an Analysis that could be required by a pass and transparently run for you by the PassManager.

> Besides the in-memory exchange of the information, we also want
> on-disk exchange. That introduces the write-out/parse-in problem.

Again, if this is information that's computable from bytecode alone, you do not need to store it every time: an analysis pass can compute it dynamically.

Also, as a reminder, if you change the LLVM representation, your new version may or may not be able to use the current set of analyses and optimizations, thus forcing you to "reinvent the wheel" in that respect.

--
Misha Brukman :: http://misha.brukman.net :: http://llvm.cs.uiuc.edu
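To illustrate what "compute it on the fly" could look like, here is a small standalone sketch of an analysis that derives a control step for each operation from its data dependencies alone, without modifying the operations. It is not written against the LLVM pass interfaces: `Op` and `AsapScheduleAnalysis` are hypothetical stand-ins, and the as-soon-as-possible policy with unit latency is just one simple scheduling rule chosen for the example, not something proposed in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an instruction: only its data dependencies matter here.
// Operands are indices of earlier ops in the same list (definition order).
struct Op {
  std::vector<std::size_t> Operands;
};

// Hypothetical analysis: assigns each op the earliest control step at
// which all of its operands are available, assuming every op takes one
// step. The ops themselves are never modified.
class AsapScheduleAnalysis {
  std::vector<unsigned> Step;

public:
  void run(const std::vector<Op> &Ops) {
    Step.assign(Ops.size(), 0);
    for (std::size_t I = 0; I < Ops.size(); ++I) {
      unsigned Earliest = 0;
      for (std::size_t Dep : Ops[I].Operands) {
        assert(Dep < I && "operands must be defined before use");
        Earliest = std::max(Earliest, Step[Dep] + 1);
      }
      Step[I] = Earliest;
    }
  }

  // Control step assigned to op I by the last call to run().
  unsigned controlStep(std::size_t I) const { return Step.at(I); }

  // Number of control steps used by the whole schedule.
  unsigned length() const {
    return Step.empty() ? 0 : *std::max_element(Step.begin(), Step.end()) + 1;
  }
};
```

Because the result depends only on the operations and their dependencies, it can be recomputed whenever a later consumer asks for it, which is the property that makes storing it on disk unnecessary in this simple case.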
Perhaps I am missing some link. I need a bit of clarification.

For the C language, I want to access the LLVM code immediately generated by llvmgcc (cfrontend/bin/gcc) before it undergoes any further transformation or optimization.

1) Are there any libraries that enable me to parse C code and create a Module instance?

2) If the answer to 1) is no, is there some way to create a Module instance from C code other than issuing the llvmgcc -S command from my program, parsing the output through the AsmParser, and creating the Module instance from the result?

I believe there should also be a frontend directory in the llvm source that provides libraries for parsing different source languages and translating them into LLVM bytecode.
On Sun, Nov 07, 2004 at 01:19:07AM +0000, Umar Janjua wrote:
> Perhaps I am missing some link. Need a bit clarification.
>
> For the C language, I want to access the LLVM code immediately
> generated by llvmgcc(cfrontend/bin/gcc) before it undergoes any
> further transformation or optimization.

    llvm-gcc -Wa,-disable-opt -Wl,-disable-opt file.c file2.c -o stuff.o

This will run gccas and gccld without running the optimizations that they would run by default.

> 1) Are there any libraries that enable me to parse C code and create
> the Module instance.

This is all part of llvm-gcc, and tied to the way GCC does things. There are no LLVM-style libraries for llvm-gcc.

> 2) If answer to 1) is no, then is there some other way to create
> Module instance from C code than issuing llvmgcc -S command in my
> program, then parsing the output through Asmparser, and then creating
> Module instance as a result.

There is currently no such library because the GCC front-end is completely disjoint from the LLVM libraries. To change this would require a significant job of restructuring llvm-gcc (which is to say GCC itself), which is problematic in its own ways -- we are trying to have minimal differences with the GCC CVS tree so that we can update to get their bugfixes easily. If we switch to something which is wildly different from the main GCC tree, we have to maintain it ourselves, and no one has volunteered to handle this task (this is, again, non-trivial).

> I believe there should also be a directory frontend in the llvm
> source, that provides libraries for parsing different source languages
> and translating these into LLVM bytecode.

Sure, that would be nice to have, but again, it involves a significant amount of effort that no one has volunteered to provide. Patches accepted. :)

--
Misha Brukman :: http://misha.brukman.net :: http://llvm.cs.uiuc.edu
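Since the front-end has to be driven as an external process, a program that wants unoptimized output would have to assemble the command line shown above itself. The helper below is a rough sketch of that step only; `unoptimizedCompileCommand` is a hypothetical name, the flags are copied from the command in this message, and the sketch assumes file names without spaces or shell metacharacters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the llvm-gcc invocation that disables the default gccas and
// gccld optimizations. No shell quoting is done, so paths containing
// spaces or shell metacharacters are not handled.
std::string unoptimizedCompileCommand(const std::vector<std::string> &Sources,
                                      const std::string &Output) {
  assert(!Sources.empty() && "need at least one source file");
  std::string Cmd = "llvm-gcc -Wa,-disable-opt -Wl,-disable-opt";
  for (const std::string &Src : Sources)
    Cmd += " " + Src;
  Cmd += " -o " + Output;
  return Cmd;
}
```

The resulting string could then be handed to something like `std::system`, after which the output file is read back in by whatever means the program already uses.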