Displaying 7 results from an estimated 7 matches for "m256".
2012 Jun 27 · 1 · [PATCH] x86/hvm: increase struct hvm_vcpu_io's mmio_large_read
....com>
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -59,13 +59,13 @@ struct hvm_vcpu_io {
unsigned long mmio_gva;
unsigned long mmio_gpfn;
- /* We may read up to m128 as a number of device-model transactions. */
+ /* We may read up to m256 as a number of device-model transactions. */
paddr_t mmio_large_read_pa;
- uint8_t mmio_large_read[16];
+ uint8_t mmio_large_read[32];
unsigned int mmio_large_read_bytes;
- /* We may write up to m128 as a number of device-model transactions. */
- paddr_t mmio_large_write_pa;
+...
2011 Nov 30 · 0 · [PATCH 2/4] x86/emulator: add emulation of SIMD FP moves
...) \
+ (dst) = sse_prefix[(vex_pfx) - 1]; \
+} while (0)
+
union vex {
uint8_t raw[2];
struct {
@@ -3850,6 +3860,76 @@ x86_emulate(
case 0x19 ... 0x1f: /* nop (amd-defined) */
break;
+ case 0x2b: /* {,v}movntp{s,d} xmm,m128 */
+ /* vmovntp{s,d} ymm,m256 */
+ fail_if(ea.type != OP_MEM);
+ /* fall through */
+ case 0x28: /* {,v}movap{s,d} xmm/m128,xmm */
+ /* vmovap{s,d} ymm/m256,ymm */
+ case 0x29: /* {,v}movap{s,d} xmm,xmm/m128 */
+ /* vmovap{s,d} ymm,ymm/m256 */
+ fail_if(vex.pfx & VEX_PR...
2012 Nov 07 · 1 · [LLVMdev] AVX broadcast Vs. vector constant pool load
...ad a constant
vector
// from the constant pool and not to broadcast it from a scalar.
Would anyone be able to explain why it is better to load a vector from the
constant pool rather than broadcast a scalar?
I checked out Agner Fog's tables, but it wasn't so obvious to me...
vmovaps y, m256:      uops 1, latency 4, throughput 1
vbroadcastsd y, m64:  uops 2, latency not measured, throughput 1
Thanks in advance,
Cameron
2012 Jul 27 · 0 · [LLVMdev] X86 FMA4
Hey Michael,
Thanks for the legwork!
It appears that the stats you listed are for movaps [SSE], not vmovaps
[AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256),
since they are both AVX instructions. Although, yes, I agree that this is
not clear from Agner's report. Please correct me if I am misunderstanding.
As I am sure you are aware, we cannot use SSE (movaps) instructions in an
AVX context, or else we'll pay the context switch penalty. It mig...
2012 Jul 27 · 2 · [LLVMdev] X86 FMA4
Just looked up the numbers from Agner Fog for Sandy Bridge for vmovaps/etc for loading/storing from memory.
vmovaps - load takes 1 load µop, latency 3, with a reciprocal throughput of 0.5.
vmovaps - store takes 1 store µop plus 1 load µop for address calculation, latency 3, with a reciprocal throughput of 1.
He does not list vmovsd, but movsd has the same stats as vmovaps, so I feel it is a
2012 Jul 27 · 3 · [LLVMdev] X86 FMA4
> It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding.
You are misunderstanding [no worries, happens to everyone = )]. The timings I listed were for vmovaps of the form,
vmovaps %xmm0, (mem)
i.e.,...
2010 May 19 · 8 · Generating all possible models from full model
...37<-glm.convert(glm.nb(mantas~year+cosmonth+plankton,data=mydata))
m245<-glm.convert(glm.nb(mantas~year+sinmonth+coslunar,data=mydata))
m246<-glm.convert(glm.nb(mantas~year+sinmonth+sinlunar,data=mydata))
m247<-glm.convert(glm.nb(mantas~year+sinmonth+plankton,data=mydata))
m256<-glm.convert(glm.nb(mantas~year+coslunar+sinlunar,data=mydata))
m257<-glm.convert(glm.nb(mantas~year+coslunar+plankton,data=mydata))
m267<-glm.convert(glm.nb(mantas~year+sinlunar+plankton,data=mydata))
m345<-glm.convert(glm.nb(mantas~cosmonth+sinmonth+coslunar,data=mydata))...