thr3ads.net - llvm dev - [llvm-dev] TSAN hack on AArch64 for Android [Aug 2015]

If this information is useful, please help other people find it:
Share via:

Dan Albert via llvm-dev

2015-Aug-28 16:37 UTC

[llvm-dev] TSAN hack on AArch64 for Android

IMO having to disable 2/3 of the tests means the patch isn't ready yet.

On Fri, Aug 28, 2015 at 9:31 AM, Jason Kim <jasonk at codeaurora.org>
wrote:
>
>
>  > -----Original Message-----
>  > From: Renato Golin [mailto:renato.golin at linaro.org]
> > > TESTS!
>  > >  Currently, about 2/3 tests for tsan fail/flake on
android+aarch64.
>  > > They aren't xfailed, because some of them are flaky, and
thus
>  > unexpectedly pass at times.
>  > > So I decided to mark them as unsupported on android+aarch64, and
thus
>  > is "green".
>  >
>  > So, if they currently pass on the production buildbots, you can't
mark
> them
>  > as unstable because of your patch, as that will mean your patch is
> making
>  > them unstable, and we can't just make dozens of tests unstable
for any
>  > amount of features.
>
> [ jasonk --> ] The tests, AFAIK, currently "pass" for a
somewhat squiggly
> definition of "pass".
> I get occasional fails on a local x86_64 build, w/o my patches.
>
>  > If your patch introduces the instability, you'll have to fix them
before
>
> [ jasonk --> ] The instabilities were there to begin with, AFAIK. Some
of
> the tests are run on a "deflake" regimen where they are repeated
10x,
> looking for a specific output.
>
>  > commit. That's why I suggested splitting it into many small
chunks,
> none of
>  > which will leave the tree in an unstable state, because you'll
have
> time to
>  > look at each instability individually, and fix them before commit.
The
> bots
>  > will also be happier, and every one wins.
>
> [ jasonk --> ]  Sorry, but I don't understand what you want here.
The
> tests I touched are marked unsupported (i.e. won't be run at all) ONLY
for
> aarch64-linux-androideabi. Is there something else you want? It should not
> impact any other configuration of TSAN/compiler-rt as-is. I also guard TSAN
> with a CMAKE variable so that it only gets built for aarch64 iff
> COMPILER_RT_FORCE_TSAN_AARCH64=1
>
>
>
>  > cheers,
>  > -renato
>
> Thanks!
> -Jason
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150828/4c45101f/attachment.html>

Jason Kim via llvm-dev

2015-Aug-28 17:02 UTC

head link

[llvm-dev] TSAN hack on AArch64 for Android

IMO having to disable 2/3 of the tests means the patch isn't ready yet.

[ jasonk --> ]  OK, that’s a fair view.  However, please keep in mind the
follwing:

(*) The tests weren’t all stable to begin with. 

(*) I am perfectly willing to triage the failures, but please keep in mind that
any “threshold” that we cross from NOT testing TSAN on aarch64+android to
actually testing on it is going to be a large hurdle.

(*) BIONIC is NOT glibc!

(*) Its pretty darn near impossible to  have the perfect patch, which is [a]
small [b] doesn’t break anything [c] adds major new features.

(*) What I have currently is: [a] not very big [b] doesn’t break anything
VISIBLY [c] adds major new features

(*) The new feature (TSAN on aarch64-linux-androideabi) is currently turned OFF
by default

(*) There are plenty of cases in prior history in llvm where a set of tests were
turned off temporarily, especially when a major new platform is being
introduced. I don’t think its unreasonable to have that as a starting point.

On Fri, Aug 28, 2015 at 9:31 AM, Jason Kim <jasonk at codeaurora.org
<mailto:jasonk at codeaurora.org> > wrote:

 > -----Original Message-----
 > From: Renato Golin [mailto:renato.golin at linaro.org
<mailto:renato.golin at linaro.org> ]> > TESTS! > >  Currently, about 2/3 tests for tsan fail/flake on android+aarch64.
 > > They aren't xfailed, because some of them are flaky, and thus
 > unexpectedly pass at times.
 > > So I decided to mark them as unsupported on android+aarch64, and thus
 > is "green".
 >
 > So, if they currently pass on the production buildbots, you can't mark
them
 > as unstable because of your patch, as that will mean your patch is making
 > them unstable, and we can't just make dozens of tests unstable for any
 > amount of features.

[ jasonk --> ] The tests, AFAIK, currently "pass" for a somewhat
squiggly definition of "pass".
I get occasional fails on a local x86_64 build, w/o my patches.

 > If your patch introduces the instability, you'll have to fix them
before

[ jasonk --> ] The instabilities were there to begin with, AFAIK. Some of the
tests are run on a "deflake" regimen where they are repeated 10x,
looking for a specific output.

 > commit. That's why I suggested splitting it into many small chunks,
none of
 > which will leave the tree in an unstable state, because you'll have
time to
 > look at each instability individually, and fix them before commit. The
bots
 > will also be happier, and every one wins.

[ jasonk --> ]  Sorry, but I don't understand what you want here. The
tests I touched are marked unsupported (i.e. won't be run at all) ONLY for
aarch64-linux-androideabi. Is there something else you want? It should not
impact any other configuration of TSAN/compiler-rt as-is. I also guard TSAN with
a CMAKE variable so that it only gets built for aarch64 iff
COMPILER_RT_FORCE_TSAN_AARCH64=1

 > cheers,
 > -renato

Thanks!
-Jason

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150828/c011c514/attachment.html>

Jason Kim via llvm-dev

2015-Aug-28 17:06 UTC

head link

[llvm-dev] TSAN hack on AArch64 for Android

From: Jason Kim [mailto:jasonk at codeaurora.org] 
Sent: Friday, August 28, 2015 10:03 AM
To: 'Dan Albert'
Cc: 'Renato Golin'; 'Dmitry Vyukov'; 'Kostya
Serebryany'; 'Tamas Berghammer'; 'Stephen Hines'; 'Tim
Northover'; 'kanheim at a-bix.com'; 'Chad Rosier';
'Evgenii Stepanov'; 'Adhemerval Zanella'; 'LLVM Dev';
'Jason Kim'; 'Kristof Beyls'; 'James Molloy'
Subject: RE: TSAN hack on AArch64 for Android

IMO having to disable 2/3 of the tests means the patch isn't ready yet.

[ jasonk --> ]  OK, that’s a fair view.  However, please keep in mind the
follwing:

(*) The tests weren’t all stable to begin with. 

(*) I am perfectly willing to triage the failures, but please keep in mind that
any “threshold” that we cross from NOT testing TSAN on aarch64+android to
actually testing on it is going to be a large hurdle.

(*) BIONIC is NOT glibc!

(*) Its pretty darn near impossible to  have the perfect patch, which is [a]
small [b] doesn’t break anything [c] adds major new features.

(*) What I have currently is: [a] not very big [b] doesn’t break anything
VISIBLY [c] adds major new features

(*) The new feature (TSAN on aarch64-linux-androideabi) is currently turned OFF
by default

(*) There are plenty of cases in prior history in llvm where a set of tests were
turned off temporarily, especially when a major new platform is being
introduced. I don’t think its unreasonable to have that as a starting point.

[ jasonk --> ] I don’t know if this is getting through, but the tests I
marked UNSUPPORTED, are ONLY unsupported on aarch64-linux-androideabi. They are
run, as is, on other configurations of compiler-rt.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150828/8a4f258e/attachment-0001.html>

Jason Kim via llvm-dev

2015-Aug-28 17:49 UTC

head link

[llvm-dev] TSAN hack on AArch64 for Android

+thread-sanitizer (which was left off for some reason)


On Fri, Aug 28, 2015 at 10:06 AM, Jason Kim <jasonk at codeaurora.org>
wrote:
>
>
>
>
> *From:* Jason Kim [mailto:jasonk at codeaurora.org]
> *Sent:* Friday, August 28, 2015 10:03 AM
> *To:* 'Dan Albert'
> *Cc:* 'Renato Golin'; 'Dmitry Vyukov'; 'Kostya
Serebryany'; 'Tamas
> Berghammer'; 'Stephen Hines'; 'Tim Northover';
'kanheim at a-bix.com'; 'Chad
> Rosier'; 'Evgenii Stepanov'; 'Adhemerval Zanella';
'LLVM Dev'; 'Jason Kim';
> 'Kristof Beyls'; 'James Molloy'
> *Subject:* RE: TSAN hack on AArch64 for Android
>
>
>
>
>
> IMO having to disable 2/3 of the tests means the patch isn't ready yet.
>
> *[ jasonk --> ]  OK, that’s a fair view.  However, please keep in mind
the
> follwing:*
>
>
>
> *(*) The tests weren’t all stable to begin with. *
>
> *(*) I am perfectly willing to triage the failures, but please keep in
> mind that any “threshold” that we cross from NOT testing TSAN on
> aarch64+android to actually testing on it is going to be a large hurdle. *
>
> *(*) BIONIC is NOT glibc!*
>
> *(*) Its pretty darn near impossible to  have the perfect patch, which is
> [a] small [b] doesn’t break anything [c] adds major new features. *
>
> *(*) What I have currently is: [a] not very big [b] doesn’t break anything
> VISIBLY [c] adds major new features*
>
> *(*) The new feature (TSAN on aarch64-linux-androideabi) is currently
> turned OFF by default*
>
> *(*) There are plenty of cases in prior history in llvm where a set of
> tests were turned off temporarily, especially when a major new platform is
> being introduced. I don’t think its unreasonable to have that as a starting
> point.*
>
>
>
>
>
> *[ jasonk --> ] I don’t know if this is getting through, but the tests I
> marked UNSUPPORTED, are ONLY unsupported on aarch64-linux-androideabi. They
> are run, as is, on other configurations of compiler-rt.*
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150828/903e9424/attachment.html>

Renato Golin via llvm-dev

2015-Aug-28 20:01 UTC

head link

[llvm-dev] TSAN hack on AArch64 for Android

On 28 August 2015 at 18:02, Jason Kim <jasonk at codeaurora.org>
wrote:> IMO having to disable 2/3 of the tests means the patch isn't ready yet.
Dan is absolutely right.

> (*) The tests weren’t all stable to begin with.
That's not a reason to break them more. If you can't prove they're
correct, you can't prove you broken them more, especially on other
arches.

> (*) I am perfectly willing to triage the failures, but please keep in mind
> that any “threshold” that we cross from NOT testing TSAN on aarch64+android
> to actually testing on it is going to be a large hurdle.
I'm glad you are, but please do it off trunk.

Your changes touch other OSs (ex. AArch64 Linux) and also other Arches
(see some comments to that effect on the patch). You can't just push
something that isn't ready like that.

> (*) BIONIC is NOT glibc!
I missed the point here...

> (*) Its pretty darn near impossible to  have the perfect patch, which is
[a]
> small [b] doesn’t break anything [c] adds major new features.
> (*) What I have currently is: [a] not very big [b] doesn’t break anything
> VISIBLY [c] adds major new features
It's hard, I give you that. But that's why most of us chose compilers
to work with. It's also a lot of fun.

I've seen lots of people provide lots of ideas on how to do that
slowly, and re-use changes in Android, RT, etc.

> (*) The new feature (TSAN on aarch64-linux-androideabi) is currently turned
> OFF by default
You're disabling the tests on Android/TSAN/AArch64, but you're
touching other arches.

Linaro has added *a lot* of sanitizers to AArch64. The way we did was
to have all changes local until *all* the tests pass. If *one or two*
tests were flaky, we marked them as "REQUIRES: stable-runtime" and
investigated it.

Never 2/3 of the tests. Never when touching other arches / OSs.

Have you tested your changes in x86_64 Android/Linux? AArch64 Linux?
What other arches would it touch? Mips64?

If your changes are *that* unstable in Android, I'm very worried that
it will destabilise the whole AArch64 implementation, which we just
pushed in and it is working.

One good example is ASAN/AArch64. When Christophe implemented it last
year, it took us 2 months to commit it to the tree, but we left the
build and *all* tests disabled because it passed 100% on our boxes,
but not on the public buildbot. It was only recently, many months
later, that we enabled the build/tests. In your case, it doesn't even
pass on your box.

> (*) There are plenty of cases in prior history in llvm where a set of tests
> were turned off temporarily, especially when a major new platform is being
> introduced. I don’t think its unreasonable to have that as a starting
point.
A major new platform, ie. a new back-end, doesn't normally break
anything elsewhere. And when it touches generic parts, tests are
carried out to prove that it didn't break anything.

If your code was 100% new files/functions/macros on a new architecture
that no one uses, it'd be ok. But that's not the case.

A big feature is hard to implement. You have to carry along your
branch, re-base, adapt, and you have to add tests, and test on other
architectures and OSs to make sure everything is good. For that big a
change, you cannot rely on the buildbots to tell you that, especially
not on the architecture you're targeting, and the base architecture,
which is x86_64.

If the tests are unstable even on x86_64 (I know they are, more
unstable than on ARM/AArch64), you have to be extra *extra* careful.
We cannot simply disable everything.

cheers,
--renato

llvm dev - Aug 2015 - TSAN hack on AArch64 for Android

[llvm-dev] TSAN hack on AArch64 for Android

[llvm-dev] TSAN hack on AArch64 for Android

[llvm-dev] TSAN hack on AArch64 for Android

[llvm-dev] TSAN hack on AArch64 for Android

[llvm-dev] TSAN hack on AArch64 for Android