> My $0.02 worth on this topic ..

...and again 0.02 of mine :-)

> However, I find it unreasonable to expect LLVM to provide
> any features in this area. In order to do anything meaningful,
> LLVM would have to have some kind of awareness of networks
> (typically an operating system concern).
> That seems at odds with the "low level" principles of LLVM.

When I look at what we have today, I agree. But when I think about what
we *should* have, I don't agree.

To begin with, think of a host with multiple CPUs. I could formulate my
proposal even for this non-networked case. There should be an engine and
a layer for making dispatching optimizations at run time. If one CPU is
loaded and the code is "parallelizable", why not send some part of the
calculation to another CPU? This kind of on-the-fly decision will one day
be incorporated into something like LLVM.

> Valery, could you be more explicit about what kind of
> features in LLVM would support distributed computing?

OK, no problem, I will try. Consider this Fibonacci function as the model
for our use case:

int f(int n) {
  if (n < 2) return 1;
  return f(n-1) + f(n-2);
}

The complexity of this non-optimal version of the Fibonacci function is
O(2^n). The number of calls grows exponentially right from the start. That
means the CPU gets loaded very quickly, and we have a lot of calls that
could be "outsourced" to other CPUs.

Is it OK up to here, so that I can continue using this model example?

> How would code evaluation
> distributed to multiple hosts benefit anyone?

To me this sounds like "do we need distributed computations at all?"
Oh... could we take this as an axiom? :-) Anyway, if you accept the model
example above, we can see how it could really work.

> The two programs only communicate via a network connection.
> The only places you can optimize that network connection are
> in (a) the kernel and (b) the application.
> Case (a) is outside the scope of LLVM and case (b) is supported by LLVM
> today. I assume this is obvious so what else were you thinking?

Let's consider cases (a) and (b) in more detail. Think of the Fibonacci
example as the application from your case (b). We translate the Fibonacci
module into LLVM bytecode without optimization. Now I run this application
under some dispatcher (someday it will be part of the kernel of an
LLVM-based OS). The dispatcher applies some reasonable optimizations to
the LLVM bytecode and starts the Fibonacci code with the argument provided
by the user. If the argument is small enough, the function gives us a
result quickly, of course. The other case is more interesting: if the
argument is big enough, we load the CPU very quickly and have to spill the
work out to any other available CPUs.

Before the LLVM project I would never have discussed anything like this at
all, because such a thing was simply not feasible. I do believe that with
LLVM such a future is really possible. I also believe that OSes with LLVM
in the kernel will appear. Chris, you and others may find these statements
funny today, but tomorrow you may find more reason in these "strange"
statements :) (Who knows, maybe Chris is laughing now, because it is more
than clear to him.)

What should I expand/reformulate?

Valery.
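A minimal sketch of what this "outsourcing" looks like when written by hand
today, assuming POSIX threads and an arbitrary cutoff of 30 (neither of
which is in the original mail); the run-time dispatcher described above
would make this placement decision automatically instead of having it
hard-coded in the source:

#include <pthread.h>

/* The naive O(2^n) Fibonacci from the example above. */
static long fib(int n) {
    if (n < 2) return 1;
    return fib(n - 1) + fib(n - 2);
}

struct task { int n; long result; };

static void *fib_task(void *arg) {
    struct task *t = arg;
    t->result = fib(t->n);
    return NULL;
}

/* A hand-written version of the decision a dispatcher could make at run
 * time: evaluate f(n-1) on another CPU while this CPU evaluates f(n-2).
 * The cutoff of 30 is an arbitrary illustration, not a tuned value. */
static long fib_parallel(int n) {
    if (n < 30) return fib(n);
    struct task t = { n - 1, 0 };
    pthread_t worker;
    if (pthread_create(&worker, NULL, fib_task, &t) != 0)
        return fib(n);                 /* fall back to sequential execution */
    long rest = fib_parallel(n - 2);
    pthread_join(worker, NULL);
    return t.result + rest;
}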
Interesting email address there :)

On Thu, 2004-01-08 at 01:18, =?koi8-r?Q?=22?=Valery A.Khamenya=?koi8-r?Q?=22=20?= wrote:

> To begin with, think of a host with multiple CPUs. I could formulate
> my proposal even for this non-networked case.

On the same machine, LLVM definitely needs to support both symmetric and
asymmetric multi-processing. I believe some primitives to support this are
being worked on now by Misha. However, I don't really consider this
"distributed systems" because distribution by a few inches doesn't really
amount to much. In my way of thinking, distributed computing *always*
involves a network.

> There should be an engine and a layer for making dispatching
> optimizations at run time. If one CPU is loaded and the code is
> "parallelizable", why not send some part of the calculation to another
> CPU? This kind of on-the-fly decision will one day be incorporated into
> something like LLVM.

Okay, this kind of "distribution" (what I call parallel computing) should
also definitely be supported by LLVM. There are several primitives that
could be added to LLVM to enable compiler writers to more easily write
parallel programs. However, I don't think LLVM should get involved in
parallel programs that run on multiple computers, only on a single
computer (possibly with multiple CPUs).

> > Valery, could you be more explicit about what kind of
> > features in LLVM would support distributed computing?
>
> OK, no problem, I will try. Consider this Fibonacci function as the
> model for our use case:
>
> int f(int n) {
>   if (n < 2) return 1;
>   return f(n-1) + f(n-2);
> }
>
> The complexity of this non-optimal version of the Fibonacci function is
> O(2^n). The number of calls grows exponentially right from the start.
> That means the CPU gets loaded very quickly, and we have a lot of calls
> that could be "outsourced" to other CPUs.
>
> Is it OK up to here, so that I can continue using this model example?

Yes, but I think the confusion was simply one of terminology. What you're
talking about is what I call parallel computing or multi-processing
(symmetric or asymmetric). This isn't really distributed computing,
although one could think of the operations being "distributed" across
several CPUs on the _same_ computer.

> > How would code evaluation
> > distributed to multiple hosts benefit anyone?

Okay, now you're talking about "hosts", which I take to mean separate
physical computers where the only way for the "hosts" to communicate is
via a network. Is this what you mean by "host"?

> To me this sounds like "do we need distributed computations at all?"

No, we don't. Distributed computing would be built on top of LLVM (in my
opinion). But we _do_ need support for parallel or multi-processing
computation as you described above (again, on a single physical computer).

> Oh... could we take this as an axiom? :-)

I would accept "we need parallel computation support" as an axiom. I don't
think Chris will disagree, but I'll let him chime in on that.

> Anyway, if you accept the model example above, we can see how it could
> really work.

Sure...

> > The two programs only communicate via a network connection.
> > The only places you can optimize that network connection are
> > in (a) the kernel and (b) the application.
> > Case (a) is outside the scope of LLVM and case (b) is supported by
> > LLVM today. I assume this is obvious so what else were you thinking?
>
> Let's consider cases (a) and (b) in more detail. Think of the Fibonacci
> example as the application from your case (b). We translate the
> Fibonacci module into LLVM bytecode without optimization. Now I run
> this application under some dispatcher (someday it will be part of the
> kernel of an LLVM-based OS). The dispatcher applies some reasonable
> optimizations to the LLVM bytecode and starts the Fibonacci code with
> the argument provided by the user. If the argument is small enough, the
> function gives us a result quickly, of course. The other case is more
> interesting: if the argument is big enough, we load the CPU very
> quickly and have to spill the work out to any other available CPUs.

Okay... again, this isn't "distributed", just "parallel", and I agree it's
needed.

> Before the LLVM project I would never have discussed anything like this
> at all, because such a thing was simply not feasible. I do believe that
> with LLVM such a future is really possible. I also believe that OSes
> with LLVM in the kernel will appear. Chris, you and others may find
> these statements funny today, but tomorrow you may find more reason in
> these "strange" statements :) (Who knows, maybe Chris is laughing now,
> because it is more than clear to him.)

I would tend to agree. I fully expect LLVM to fill the mission that Java
started: ubiquitous computing. I would expect to see LLVM programs running
on everything from a Game Boy to a supercomputer, possibly with kernel
support.

I've toyed with the idea of a Linux kernel module for LLVM already. It
would allow a bytecode file to be executed like any other program, with
the JIT essentially inside the kernel. Weird idea, but it could be
beneficial in terms of security. The module could be written in such a way
that it examines the bytecode before executing anything, making sure the
bytecode isn't doing anything "weird". Since only the kernel can create
the actual executable for the bytecode, no malicious bytecode program
could run. This assumes that "malicious" is detectable :)

> What should I expand/reformulate?

Nothing at this point. I think I realize where you're coming from now and
agree that _parallel_ computing is a very important next step for LLVM.
Let's see what happens with the current work on synchronization and
threading primitives. These things will be needed to support the kind of
parallelization you're talking about.

> Valery.

Best Regards,

Reid.
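For concreteness, here is one rough sketch, in C, of the shape such a
primitive could take from a front end's point of view. The names
llvm_spawn and llvm_join are invented for this illustration and are not an
existing LLVM API; the stand-in "runtime" below simply runs the task
sequentially at join time, whereas a real dispatcher would hand it to
another CPU:

#include <stdlib.h>

/* Hypothetical primitives (NOT part of LLVM): they only show the kind of
 * call a compiler could lower a parallelizable call into, leaving the
 * "which CPU runs this?" decision to a dispatcher/runtime. */
typedef struct llvm_task { long (*fn)(int); int arg; } llvm_task;

static llvm_task *llvm_spawn(long (*fn)(int), int arg) {
    llvm_task *t = malloc(sizeof *t);   /* error handling omitted */
    t->fn = fn;
    t->arg = arg;                       /* a real runtime would enqueue it */
    return t;
}

static long llvm_join(llvm_task *t) {   /* degenerate sequential "runtime" */
    long r = t->fn(t->arg);
    free(t);
    return r;
}

static long fib(int n) { return n < 2 ? 1 : fib(n - 1) + fib(n - 2); }

/* What a compiler-visible lowering of the parallel Fibonacci might be. */
static long fib_lowered(int n) {
    if (n < 2) return 1;
    llvm_task *left = llvm_spawn(fib, n - 1);   /* could go to another CPU */
    long right = fib(n - 2);                    /* computed here meanwhile */
    return llvm_join(left) + right;
}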
> Interesting email address there :)
>
> On Thu, 2004-01-08 at 01:18, =?koi8-r?Q?=22?=Valery
> A.Khamenya=?koi8-r?Q?=22=20?= wrote:

Unfortunately, some email parsers and email clients refuse to handle the
international conventions correctly :( Follow this URL for more details:
http://www.python.org/doc/current/lib/module-email.Header.html

> On the same machine, LLVM definitely needs to support both symmetric
> and asymmetric multi-processing. I believe some primitives to support
> this are being worked on now by Misha. However, I don't really consider
> this "distributed systems" because distribution by a few inches doesn't
> really amount to much. In my way of thinking, distributed computing
> *always* involves a network.

You are right, but shouldn't bigger steps come after smaller ones? Let's
agree on the simple things first, and then we can proceed (and probably
die) with the networking case.

> Okay, this kind of "distribution" (what I call parallel computing)

(see above)

> should also definitely be supported by LLVM. There are several
> primitives that could be added to LLVM to enable compiler writers to
> more easily write parallel programs. However, I don't think LLVM should
> get involved in parallel programs that run on multiple computers, only
> on a single computer (possibly with multiple CPUs).

Oh, have we agreed to restrict ourselves to a single host so soon? :)

> > Is it OK up to here, so that I can continue using this model example?
>
> Yes, but I think the confusion was simply one of terminology. What
> you're talking about is what I call parallel computing or
> multi-processing (symmetric or asymmetric).

(see above again)

> This isn't really distributed computing, although one could think of
> the operations being "distributed" across several CPUs on the _same_
> computer.

(People speak about distributed computations in even more boring contexts:
"someone transferred a stand-alone application to a remote PC and executed
it there, and this already counts as distributed computation".) My idea
was to consider the Fibonacci example for a single PC with several CPUs,
and once you agree that it makes sense to bring the notion of a CPU into
the LLVM layer, to ask you: "Reid, why should we restrict ourselves to a
_single_ PC only?!"

> Okay, now you're talking about "hosts", which I take to mean separate
> physical computers where the only way for the "hosts" to communicate is
> via a network. Is this what you mean by "host"?

We could restrict ourselves to the notion of a host as in
http://www.faqs.org/rfcs/rfc1.html and the later RFCs for TCP/IP.

> No, we don't. Distributed computing would be built on top of LLVM (in
> my opinion). But we _do_ need support for parallel or multi-processing
> computation as you described above (again, on a single physical
> computer).

Wait, let's pin down your earlier statement. As far as I understood, you
do agree that the case of multiple CPUs on a _single_ host should be
supported at the LLVM layer, right? As I see from here:

> Okay... again, this isn't "distributed", just "parallel", and I agree
> it's needed.

you call this case just "parallel" and agree.

> I would tend to agree. I fully expect LLVM to fill the mission that
> Java started: ubiquitous computing. I would expect to see LLVM programs
> running on everything from a Game Boy to a supercomputer, possibly with
> kernel support.

"Possibly with kernel support" is a kind of crutch for getting a
supercomputer out of a networked architecture ;) Reid, couldn't you agree
that networking is only one particular interface for getting access to
other CPUs? Why, then, should LLVM be abstracted away from supercomputers
whose CPUs are distributed over a network?

All the benefits one could obtain from "LLVM supporting multiple CPUs on a
single host" one might also obtain from "LLVM supporting multiple CPUs on
multiple hosts". Isn't that logical?

> I've toyed with the idea of a Linux kernel module for LLVM already.
> [...]

Then it is even easier to talk :)

> Nothing at this point. I think I realize where you're coming from now
> and agree that _parallel_ computing is a very important next step for
> LLVM. Let's see what happens with the current work on synchronization
> and threading primitives. These things will be needed to support the
> kind of parallelization you're talking about.

Right.

...I can almost hear Chris thinking: "guys, instead of trolling about cool
things, make something cool!" ;)

--
Valery
Hello Sébastien,

> I'm not sure I correctly understand what you mean, but I interpret it
> as LLVM deciding where the code should be executed, like some
> load-balancing strategy.

In this particular example it was really like that. However, I also tried
to emphasize that the decision of "where to execute" is strongly connected
with LLVM optimizations, which become either applicable or not applicable
depending on the result of that decision.

> In this perspective, I think this should be left up to the higher-level
> language, or even to the application developer: I don't think
> incorporating load-balancing strategies directly into LLVM would be
> interesting, because strategies are high-level patterns.

1. The balancing strategy does not belong entirely to the application.

2. Even though I am not an LLVM developer, I do like the minimalist
approach to LLVM's design: "throw out everything that can be thrown away",
but "don't throw out more!" :) How could you express an "optimization for
a remote call" in LLVM if "remote" does not fit into LLVM at all?

> To me this appears more as an algorithmic design issue: this function
> could be rewritten in "continuation passing style", and each
> continuation could be distributed by a load-balancing strategy to the
> computers sharing CPU resources. Using mechanisms such as "futures" (as
> in Mozart) makes this easy...

1. It looks like you mean here that one *must* change the code of the
Fibonacci function in order to control the parallelization, right?

2. Do you propose to ignore languages that do not support "continuation
passing style"?

> but I don't think these features belong to the set of languages'
> low-level primitives and constructs.

If you don't pollute the Fibonacci function with explicit optimizations
and you use a language like C, then at what level should "parallel
optimization" be brought to your Fibonacci code? Don't forget, I'd like to
parallelize any parallelizable code, like this Fibonacci example, written
in C.

> I personally don't think that automating the conversion of an algorithm
> from non-distributed to distributed execution can be done at a low
> level, mainly because it involves many high-level constructs.

Well, don't kill me, but I don't share your religion on this point :) More
practically: let's take the Fibonacci example and see how it could be
parallelized in terms of LLVM. The example is simple enough!

> On the other hand, I have heard of languages that try to implement
> primitives for easy distributed computing, like "Unified Parallel C"
> (see http://www.gwu.edu/~upc/documentation.html and
> http://ludo.humanoidz.org/doc/report-en.pdf), which may help to figure
> out what kind of primitives would be useful for adding distributed
> computing support to LLVM.

I agree. (And thank you for the nice links.)

> Maybe you were thinking of something similar to UPC primitives added to
> LLVM?

Rather "yes".
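To make the disagreement concrete, here is roughly what the futures-style
rewrite looks like if the programmer does it by hand in plain C (a sketch
only, assuming POSIX threads rather than Mozart as the backing mechanism).
Note that the source of the algorithm does have to change, which is
exactly the objection raised above:

#include <pthread.h>

/* A hand-rolled "future": start a computation now, claim its value later. */
struct future { pthread_t thread; int arg; long value; };

static long fib(int n) { return n < 2 ? 1 : fib(n - 1) + fib(n - 2); }

static void *future_body(void *p) {
    struct future *f = p;
    f->value = fib(f->arg);
    return NULL;
}

static void future_start(struct future *f, int arg) {
    f->arg = arg;
    pthread_create(&f->thread, NULL, future_body, f);  /* errors ignored */
}

static long future_get(struct future *f) {
    pthread_join(f->thread, NULL);
    return f->value;
}

/* The caller, not the compiler, decides what becomes a future here. */
static long fib_with_futures(int n) {
    if (n < 2) return 1;
    struct future left;
    future_start(&left, n - 1);     /* f(n-1) may run on another CPU */
    long right = fib(n - 2);
    return future_get(&left) + right;
}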
Hello Valery,

I have some comments regarding your thoughts on LLVM support for
distributed computing.

Valery A.Khamenya wrote:

> There should be an engine and a layer for making dispatching
> optimizations at run time. If one CPU is loaded and the code is
> "parallelizable", why not send some part of the calculation to another
> CPU? This kind of on-the-fly decision will one day be incorporated into
> something like LLVM.

I'm not sure I correctly understand what you mean, but I interpret it as
LLVM deciding where the code should be executed, like some load-balancing
strategy. In this perspective, I think this should be left up to the
higher-level language, or even to the application developer: I don't think
incorporating load-balancing strategies directly into LLVM would be
interesting, because strategies are high-level patterns.

> Consider this Fibonacci function as the model for our use case:
>
> int f(int n) {
>   if (n < 2) return 1;
>   return f(n-1) + f(n-2);
> }
>
> The complexity of this non-optimal version of the Fibonacci function is
> O(2^n). The number of calls grows exponentially right from the start.
> That means the CPU gets loaded very quickly, and we have a lot of calls
> that could be "outsourced" to other CPUs.

To me this appears more as an algorithmic design issue: this function
could be rewritten in "continuation passing style", and each continuation
could be distributed by a load-balancing strategy to the computers sharing
CPU resources. Using mechanisms such as "futures" (as in Mozart) makes
this easy... but I don't think these features belong to the set of
languages' low-level primitives and constructs.

I personally don't think that automating the conversion of an algorithm
from non-distributed to distributed execution can be done at a low level,
mainly because it involves many high-level constructs. On the other hand,
I have heard of languages that try to implement primitives for easy
distributed computing, like "Unified Parallel C" (see
http://www.gwu.edu/~upc/documentation.html and
http://ludo.humanoidz.org/doc/report-en.pdf), which may help to figure out
what kind of primitives would be useful for adding distributed computing
support to LLVM.

Maybe you were thinking of something similar to UPC primitives added to
LLVM?

--
Sébastien