We are going to use LLVM in a compiler project for transport triggered processors. See Wikipedia for more on transport triggering: <http://en.wikipedia.org/wiki/Transport_Triggered_Architectures>. One thing we need is some sort of libc. We are targeting embedded systems, and I have been looking at things like newlib. Are there people out there doing something similar? Or any advice or opinions as to how to go about the whole thing? -- Pertti
On Thu, 2 Nov 2006, Pertti Kellomäki wrote:
> We are going to use LLVM in a compiler project for transport
> triggered processors. See Wikipedia for more on transport triggering:
> <http://en.wikipedia.org/wiki/Transport_Triggered_Architectures>.

Cool.

> One thing we need is some sort of libc. We are targeting embedded
> systems, and I have been looking at things like newlib. Are there
> people out there doing something similar? Or any advice or opinions
> as to how to go about the whole thing?

Using newlib makes sense!

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
I have been browsing through the newlib documentation at <http://sources.redhat.com/newlib/> and pondering how newlib relates to LLVM. Comments welcome, again.

As I see it, there are basically two parts of libc that need to be considered. Much of libc is stuff like atoi(), isalpha(), etc., which are just convenience routines written in ANSI C. For these, it should be sufficient to compile them with llvm-gcc and link them at the LLVM bytecode level. This is basically what is done in $(LLVM)/runtime/GCCLibraries/libc.

The target-specific stuff has mostly to do with system calls. Many system calls do not make sense in an embedded context, so newlib includes a suite of no-op stubs that one can use. For minimal functionality, one can define read() and write() to operate on, e.g., the serial port.

To me, the most sensible approach would be to port newlib to the LLVM virtual machine, and use LLVM intrinsic functions as placeholders for the actual system calls. The back end would then emit target-specific code for the intrinsics.

What I don't quite see is how crt0 fits into the picture. Am I right in assuming that when llvm-gcc emits LLVM bytecode, there is no crt0 involved? I haven't checked how the other back ends do it, but I assume that they rely on the host libc and crt0. In my case, I envision libc being linked in at the bytecode level, and crt0 being linked in by the back end.

Does this sound like a sensible approach?

-- Pertti
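As an illustration of the read()/write() idea in the message above, here is a minimal sketch in C. It is not taken from newlib or from the thread: the names tta_read, tta_write, uart_putc and uart_getc are hypothetical, and the "serial port" is simulated with in-memory buffers so that the sketch can be compiled and run on a host. On real hardware the two uart_* helpers would presumably poke memory-mapped registers instead.

```c
#include <stddef.h>

/* Hypothetical serial port, simulated with buffers (an assumption made
   so the sketch runs on a host; real code would access device registers). */
#define UART_CAPACITY 256
static char uart_tx[UART_CAPACITY];   /* bytes "sent" so far */
static size_t uart_tx_len = 0;
static const char *uart_rx = "";      /* bytes waiting to be "received" */

/* Send one byte; returns 1 on success, 0 if the simulated port is full. */
static int uart_putc(char c)
{
    if (uart_tx_len >= UART_CAPACITY)
        return 0;
    uart_tx[uart_tx_len++] = c;
    return 1;
}

/* Receive one byte; returns -1 when no input is pending. */
static int uart_getc(void)
{
    if (*uart_rx == '\0')
        return -1;
    return (unsigned char)*uart_rx++;
}

/* write()-like stub: every file descriptor goes to the serial port.
   Returns the number of bytes actually sent. */
int tta_write(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t i;
    (void)fd;                          /* descriptor is ignored in this sketch */
    for (i = 0; i < len; i++) {
        if (!uart_putc(p[i]))
            break;
    }
    return (int)i;
}

/* read()-like stub: reads whatever is pending, up to len bytes.
   Returns the number of bytes read, 0 if nothing is pending. */
int tta_read(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t i = 0;
    (void)fd;
    while (i < len) {
        int c = uart_getc();
        if (c < 0)
            break;
        p[i++] = (char)c;
    }
    return (int)i;
}
```

In an actual port these two functions would take the names and signatures that the libc expects for its low-level I/O hooks; the hypothetical tta_ prefix is used here only to avoid clashing with the host libc.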
On Mon, 6 Nov 2006, Pertti Kellomäki wrote:
> What I don't quite see is how crt0 fits in the picture. Am I
> right in assuming that when llvm-gcc emits LLVM byte code, there
> is no crt0 involved? I haven't checked how the other back ends
> do it, but I assume that they rely on the host libc and crt0.
> In my case, I envision libc being linked in at the bytecode
> level, and crt0 being linked in by the back end.
>
> Does this sound like a sensible approach?

Yep, makes sense.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
I managed to compile newlib with llvm-gcc yesterday. That is, the machine-independent part is now basically done, and the syscall part contains no-op stubs provided by libgloss. I haven't tested the port yet, but since newlib has already been ported to many architectures, I would be pretty surprised if there were any major problems.

I noticed a couple of things when configuring newlib for LLVM. First, I did not find any preprocessor symbol that I could use to identify that we are compiling to LLVM bytecode. If there is one, I'd be happy to hear about it, but if not, it might be a good idea to have llvm-gcc define __LLVM__ or something like that.

Another related thing is that even when I added -emit-llvm to what I thought would be a global CFLAGS for all of newlib, it did not get propagated to all subdirectories.

I solved both of these issues by creating a shell script that is just a pass-through to llvm-gcc, but adds "-emit-llvm -D__LLVM__" to the arguments. It might be worthwhile to have a similar thing in the LLVM distribution, that is, a compiler that would identify the target as LLVM and produce bytecode by default.

There was very little to do in terms of porting. Basically the only thing I needed to tweak in the source code was to define floating-point endianness, which I arbitrarily picked to be __IEEE_BIG_ENDIAN. Hopefully someone can confirm or correct my choice.

The next task is to go for the system calls. As I said earlier, I plan to use intrinsic functions as placeholders. Any opinions on how to name them? Currently there are a few intrinsics that have to do with libc, like llvm.memcpy and llvm.memmove. However, I would personally prefer less pollution in the intrinsic name space, so I would propose naming the intrinsics with an llvm.libc prefix, e.g. llvm.libc.open and so forth. Any strong opinions on this?

-- Pertti
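One way to sanity-check the floating-point endianness choice mentioned above is to inspect how a double is laid out on a machine that can run the code. The sketch below is only an illustration and rests on several assumptions: __LLVM__ is the macro proposed in the message (passed via -D__LLVM__ by the wrapper script), not something llvm-gcc is known to define; TARGET_IS_LLVM, double_is_big_endian and ieee_word_order are hypothetical names; and double is assumed to be IEEE 754 binary64. Since LLVM bytecode is not executed directly on the final target, a check like this says something only about the machine it actually runs on.

```c
#include <string.h>

/* Compile-time detection, assuming the hypothetical __LLVM__ macro
   is supplied on the command line (e.g. -D__LLVM__). */
#if defined(__LLVM__)
#define TARGET_IS_LLVM 1
#else
#define TARGET_IS_LLVM 0
#endif

/* Returns 1 if the most significant byte of a double is stored first.
   Assumes IEEE 754 binary64, where 1.0 is 0x3FF0000000000000. */
int double_is_big_endian(void)
{
    double one = 1.0;
    unsigned char bytes[sizeof one];
    memcpy(bytes, &one, sizeof one);   /* inspect the raw representation */
    return bytes[0] == 0x3F;
}

/* Names the newlib macro that matches the layout observed at run time. */
const char *ieee_word_order(void)
{
    return double_is_big_endian() ? "__IEEE_BIG_ENDIAN"
                                  : "__IEEE_LITTLE_ENDIAN";
}
```

Printing ieee_word_order() from a small test program on the machine in question would show which of the two newlib macros matches its layout.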