Krzysztof Parzyszek via llvm-dev
2015-Nov-11 15:54 UTC
[llvm-dev] SROA and volatile memcpy/memset
On 11/11/2015 9:36 AM, Hal Finkel wrote:
> ----- Original Message -----
>> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
>>
>> Yeah, the remark about devices I made in my post was a result of a
>> "last-minute" thought to add some rationale. It doesn't actually apply
>> to SROA, since there are no devices that are mapped to the stack, which
>> is what SROA is interested in.
>>
>> The concern with the testcase I attached is really about performance.
>> Would it be reasonable to control the splitting in SROA via TTI?
>
> How so?

I'm not sure which part you are referring to. The "volatileness" of the structure in question does not place the same restrictions on how we can access it as it would be in the case of a device access. The broken up loads and stores are legal in the sense that they won't cause any hardware issues, however they would take longer to execute because the resulting instructions would be marked as volatile and thus "non-optimizable".

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
----- Original Message -----
> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-dev at lists.llvm.org, "Chandler Carruth" <chandlerc at gmail.com>
> Sent: Wednesday, November 11, 2015 9:54:54 AM
> Subject: Re: [llvm-dev] SROA and volatile memcpy/memset
>
> On 11/11/2015 9:36 AM, Hal Finkel wrote:
> > ----- Original Message -----
> >> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
> >>
> >> Yeah, the remark about devices I made in my post was a result of a
> >> "last-minute" thought to add some rationale. It doesn't actually apply
> >> to SROA, since there are no devices that are mapped to the stack, which
> >> is what SROA is interested in.
> >>
> >> The concern with the testcase I attached is really about performance.
> >> Would it be reasonable to control the splitting in SROA via TTI?
> >
> > How so?
>
> I'm not sure which part you are referring to. The "volatileness" of the
> structure in question does not place the same restrictions on how we can
> access it as it would be in the case of a device access. The broken up
> loads and stores are legal in the sense that they won't cause any
> hardware issues, however they would take longer to execute because the
> resulting instructions would be marked as volatile and thus
> "non-optimizable".

How would TTI be used? DataLayout, for example, already has information on legal/preferred integer sizes. Would that help?

-Hal

>
> -Krzysztof
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
Krzysztof Parzyszek via llvm-dev
2015-Nov-11 16:02 UTC
[llvm-dev] SROA and volatile memcpy/memset
On 11/11/2015 9:57 AM, Hal Finkel wrote:
> ----- Original Message -----
>> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
>>
>> I'm not sure which part you are referring to. The "volatileness" of the
>> structure in question does not place the same restrictions on how we can
>> access it as it would be in the case of a device access. The broken up
>> loads and stores are legal in the sense that they won't cause any
>> hardware issues, however they would take longer to execute because the
>> resulting instructions would be marked as volatile and thus
>> "non-optimizable".
>
> How would TTI be used? DataLayout, for example, already has information on legal/preferred integer sizes. Would that help?

I was thinking about having some information about the preferred treatment of volatile memory intrinsics. The problem in this case is that the aggregate is a union, which would make it harder to decide how the accesses should be formed.

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
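
[Illustrative note] The situation discussed in this thread can be sketched in C. This is a hypothetical example, not the testcase attached to the original post: the names `payload` and `copy_low_half` are invented here, and the exact IR a compiler emits for it is an assumption. The idea is that a volatile union lives on the stack, and a whole-aggregate copy into it is something a front end such as Clang would be expected to lower to a volatile memcpy intrinsic, which SROA may then consider splitting into several volatile loads and stores.

```c
#include <stdint.h>

/* Hypothetical aggregate: the members overlap, so there is no single
   obvious element type to use when breaking up a copy of the union. */
union payload {
    uint64_t whole;
    uint32_t halves[2];
    uint8_t  bytes[8];
};

/* Hypothetical function: copies a union into a volatile stack object
   and reads one member back. */
uint32_t copy_low_half(uint64_t value) {
    union payload src;
    src.whole = 0;                    /* initialize all eight bytes */
    src.halves[0] = (uint32_t)value;  /* then write through one member */

    /* The object is volatile but lives on the stack, so no device is
       involved. The aggregate assignment below is the kind of copy that
       can become a volatile memcpy (assumption about the front end). */
    volatile union payload tmp;
    tmp = src;

    /* Volatile read of the same member that was written above. */
    return tmp.halves[0];
}
```

Because `tmp` is a stack object, splitting the copy is harmless for correctness; the question raised in the thread is only whether the resulting volatile pieces cost more than a single wide access, and which member layout should guide the split.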