I'm trying to figure out how to map more complex CISC instructions now. For
example on the 68000, you have things like --

    add.w (a0)+,(a1)+

So that equates to:

    temp1 = load a0
    add 2, a0
    temp2 = load a1
    temp1 = add temp1, temp2
    store temp1, a1
    add 2, a1

How do I express that in a form for LLVM? I see things like pre_store and
post_store, but I can't find anything in the way of documentation about this.
And there doesn't appear to be a pre_load and post_load matching pair or
anything like that...

Thanks!
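For reference, the six-step expansion above can be written as a small executable model. This is only a sketch in Python of the sequence exactly as posted, not a statement of how the 68000 or LLVM defines the operation. The names `add_w_postinc`, `mem` and `regs` are hypothetical, and memory is modelled as a dictionary of 16-bit words keyed by address.

```python
def add_w_postinc(mem, regs, src="a0", dst="a1"):
    """Apply the posted six-step expansion to a toy machine state."""
    temp1 = mem[regs[src]]            # temp1 = load a0
    regs[src] += 2                    # add 2, a0
    temp2 = mem[regs[dst]]            # temp2 = load a1
    temp1 = (temp1 + temp2) & 0xFFFF  # temp1 = add temp1, temp2 (16-bit wrap)
    mem[regs[dst]] = temp1            # store temp1, a1
    regs[dst] += 2                    # add 2, a1


# Toy state: one word at each of two addresses (assumed values).
regs = {"a0": 0x100, "a1": 0x200}
mem = {0x100: 5, 0x200: 7}
add_w_postinc(mem, regs)
```

After the call, the destination word holds the sum and both address registers have advanced by one word, which is the combined effect any LLVM-side representation would need to preserve.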
On 9 Jul 2015, at 10:41, James Boulton <eiconic at googlemail.com> wrote:
>
> I'm trying to figure out how to map more complex CISC instructions now. For
> example on the 68000, you have things like --
>
> add.w (a0)+,(a1)+
>
> So that equates to:
>
> temp1 = load a0
> add 2, a0
> temp2 = load a1
> temp1 = add temp1, temp2
> store temp1, a1
> add 2, a1
>
> How do I express that in a form for LLVM?

The simple answer is: not very easily. I’d be inclined to treat instructions
like these as optimisations and ignore them (aside from integrated assembler
support) until the rest of the back end is working. Once that’s working, take
a look at how the ARM back end uses ldm and stm instructions - basically
pattern matching on the MachineInstrs after code generation to fold them
together.

Your other alternative in this case is to model this as an instruction that
does a gather load of a two-element vector and then two extract elements. You
might be able to get the vectoriser to generate these sequences and then match
them, but I suspect that you’ll then have to define a load of pseudos for
vector ops (type legalisation happens before instruction selection, so it’s
difficult to have types that are only valid for a few complex IR / DAG
sequences).

David
On 9 July 2015 at 10:58, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> The simple answer is: not very easily. I’d be inclined to treat
> instructions like these as optimisations and ignore them (aside from
> integrated assembler support) until the rest of the back end is working.
> Once that’s working, take a look at how the ARM back end uses ldm and stm
> instructions - basically pattern matching on the MachineInstrs after code
> generation to fold them together.

If you want to match the expanded pattern and merge into an add.w, then you
can use table-gen pseudo instruction patterns.

If the pattern is not simple enough, or generally comes in a random sequence,
or needs additional checks (for example, "store t1, a1" must come *before*
"add 2,a1" but "add 2,a0" can come at any time after "load a0"), you can do
like ARM's LoadStoreOptimizer and fold it after instruction selection.

> Your other alternative in this case is to model this as an instruction that
> does a gather load of a two-element vector and then two extract elements.
> You might be able to get the vectoriser to generate these sequences and then
> match them, but I suspect that you’ll then have to define a load of pseudos
> for vector ops (type legalisation happens before instruction selection, so
> it’s difficult to have types that are only valid for a few complex IR / DAG
> sequences).

Yeah, you'll end up with a huge and complex list of pseudos that could bite
you in the rear if you're not careful.

--renato
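The "fold it after instruction selection" idea can be illustrated with a toy peephole pass. This is a minimal sketch in Python over a made-up tuple encoding of instructions. It does not use any LLVM API, and `regs_used`, `find_increment`, `fold_postinc` and the `*_postinc` opcode names are all hypothetical. It folds a load or store with a later `addi ptr, 2` into a post-increment form only when nothing in between touches the pointer register, which is the kind of ordering check described above.

```python
def regs_used(ins):
    """Register names an instruction mentions (its string operands)."""
    return {x for x in ins[1:] if isinstance(x, str)}


def find_increment(block, start, ptr, size):
    """Index of the first 'addi ptr, size' at or after start, or None if
    ptr is read or written by another instruction first."""
    for j in range(start, len(block)):
        if block[j] == ("addi", ptr, size):
            return j
        if ptr in regs_used(block[j]):
            return None  # an intervening use of ptr makes folding unsafe
    return None


def fold_postinc(block, size=2):
    """Fold (load|store) + later pointer increment into a post-increment op."""
    block = list(block)
    out = []
    i = 0
    while i < len(block):
        ins = block[i]
        j = None
        # Conservatively skip ops whose data register is also the pointer.
        if ins[0] in ("load", "store") and ins[1] != ins[2]:
            j = find_increment(block, i + 1, ins[2], size)
        if j is not None:
            del block[j]  # the increment is absorbed into the memory op
            ins = (ins[0] + "_postinc", ins[1], ins[2])
        out.append(ins)
        i += 1
    return out


# The expanded sequence from the original question, in the toy encoding.
block = [
    ("load", "t1", "a0"),
    ("addi", "a0", 2),
    ("load", "t2", "a1"),
    ("add", "t1", "t1", "t2"),
    ("store", "t1", "a1"),
    ("addi", "a1", 2),
]
folded = fold_postinc(block)
```

On this input the load from a0 and the store to a1 each absorb their increment, while the load from a1 stays plain because the store uses a1 before it is incremented. Merging the result further into a single add.w would be a separate matching step that this sketch does not attempt.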
What about things like pre_store and post_store, though? If there was a
pre_load and post_load this would largely solve the problem. Of course there
are a wealth of addressing modes for the 68k, but they should be able to be
dealt with like this I think?

-----Original Message-----
From: Dr D. Chisnall [mailto:dc552 at hermes.cam.ac.uk] On Behalf Of David Chisnall
Sent: 09 July 2015 10:58
To: James Boulton
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] New backend help request.

The simple answer is: not very easily. I’d be inclined to treat instructions
like these as optimisations and ignore them (aside from integrated assembler
support) until the rest of the back end is working. Once that’s working, take
a look at how the ARM back end uses ldm and stm instructions - basically
pattern matching on the MachineInstrs after code generation to fold them
together.

Your other alternative in this case is to model this as an instruction that
does a gather load of a two-element vector and then two extract elements. You
might be able to get the vectoriser to generate these sequences and then match
them, but I suspect that you’ll then have to define a load of pseudos for
vector ops (type legalisation happens before instruction selection, so it’s
difficult to have types that are only valid for a few complex IR / DAG
sequences).

David