thr3ads.net - llvm dev - [llvm-dev] StringSwitch class [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Anupama Chandrasekhar via llvm-dev

2016-Feb-05 22:43 UTC

[llvm-dev] StringSwitch class

Hi:

I have a question about the llvm StringSwitch class. Why is this more
efficient than comparing the hashes of the strings or just using a bunch of
if statements.

Anupama
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160205/9be0fcc8/attachment.html>

Robinson, Paul via llvm-dev

2016-Feb-06 00:28 UTC

head link

[llvm-dev] StringSwitch class

I don't know about "more efficient" but a StringSwitch is
certainly more concise and readable than a series of if statements with similar
functionality (such as doing length checks before the string compare).
Hashes always run the risk of collisions (false matches).
--paulr

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Anupama
Chandrasekhar via llvm-dev
Sent: Friday, February 05, 2016 2:44 PM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] StringSwitch class

Hi:

I have a question about the llvm StringSwitch class. Why is this more efficient
than comparing the hashes of the strings or just using a bunch of if statements.

Anupama
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160206/d4c7e288/attachment.html>

Mehdi Amini via llvm-dev

2016-Feb-06 00:42 UTC

head link

[llvm-dev] StringSwitch class

> On Feb 5, 2016, at 2:43 PM, Anupama Chandrasekhar via llvm-dev <llvm-dev
at lists.llvm.org> wrote:
> 
> Hi:
> 
> I have a question about the llvm StringSwitch class. Why is this more
efficient than comparing the hashes of the strings or just using a bunch of if
statements.
I don't know about hashes performance (comparing a small string is probably
more efficient than comparing hashes, and I'd be reluctant to collisions
anyway). Now since you mentioned a "bunch of if statements" you can
get very close.
StringSwitch is caching the length of the string to compare and first checking
the length before doing the actual comparison using memcmp. So the two
constructs below it should be equivalent:

int i = StringSwitch<int>("abc").case("de",
1).case("fghi", 2).case("jkl", 3).default(-1);

and:

int i;
const char *str = "abc";
int len = strlen(str);
if(len == 2 && std::memcmp(str, "de", 2) {
   i = 1;
} else if(len == 4 && std::memcmp(str, "fghi", 4) {
   i = 2;
} else if(len == 3 && std::memcmp(str, "jkl", 3) {
   i = 3;
} else {
   i = -1
}


I left up to you to appreciate the readability :)

-- 
Mehdi

Anupama Chandrasekhar via llvm-dev

2016-Feb-06 01:47 UTC

head link

[llvm-dev] StringSwitch class

ok, I understand hashes might not be the best.

I agree that the StringSwitch is concise and easy to read, but aren't we
doing a bunch of extra work even if a match is found in the very first
case, while in the if we would early exit.

Anupama

On Fri, Feb 5, 2016 at 4:42 PM, Mehdi Amini <mehdi.amini at apple.com>
wrote:
>
> > On Feb 5, 2016, at 2:43 PM, Anupama Chandrasekhar via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > Hi:
> >
> > I have a question about the llvm StringSwitch class. Why is this more
> efficient than comparing the hashes of the strings or just using a bunch of
> if statements.
>
> I don't know about hashes performance (comparing a small string is
> probably more efficient than comparing hashes, and I'd be reluctant to
> collisions anyway). Now since you mentioned a "bunch of if
statements" you
> can get very close.
> StringSwitch is caching the length of the string to compare and first
> checking the length before doing the actual comparison using memcmp. So the
> two constructs below it should be equivalent:
>
> int i = StringSwitch<int>("abc").case("de",
1).case("fghi", 2).case("jkl",
> 3).default(-1);
>
> and:
>
> int i;
> const char *str = "abc";
> int len = strlen(str);
> if(len == 2 && std::memcmp(str, "de", 2) {
>    i = 1;
> } else if(len == 4 && std::memcmp(str, "fghi", 4) {
>    i = 2;
> } else if(len == 3 && std::memcmp(str, "jkl", 3) {
>    i = 3;
> } else {
>    i = -1
> }
>
>
> I left up to you to appreciate the readability :)
>
> --
> Mehdi
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160205/807a2dc1/attachment.html>

Chris Lattner via llvm-dev

2016-Feb-08 18:05 UTC

head link

[llvm-dev] StringSwitch class

> On Feb 5, 2016, at 4:42 PM, Mehdi Amini via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> 
>> On Feb 5, 2016, at 2:43 PM, Anupama Chandrasekhar via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>> 
>> Hi:
>> 
>> I have a question about the llvm StringSwitch class. Why is this more
efficient than comparing the hashes of the strings or just using a bunch of if
statements.
> 
> I don't know about hashes performance (comparing a small string is
probably more efficient than comparing hashes, and I'd be reluctant to
collisions anyway). Now since you mentioned a "bunch of if statements"
you can get very close.
> StringSwitch is caching the length of the string to compare and first
checking the length before doing the actual comparison using memcmp. So the two
constructs below it should be equivalent:
> 
> int i = StringSwitch<int>("abc").case("de",
1).case("fghi", 2).case("jkl", 3).default(-1);
> 
> and:
> 
> int i;
> const char *str = "abc";
> int len = strlen(str);
> if(len == 2 && std::memcmp(str, "de", 2) {
>   i = 1;
> } else if(len == 4 && std::memcmp(str, "fghi", 4) {
>   i = 2;
> } else if(len == 3 && std::memcmp(str, "jkl", 3) {
>   i = 3;
> } else {
>   i = -1
> }
Also, note that this sequence is specifically intended to be jump threadable and
recognized by the compiler’s optimizer.  This allows the compiler to turn it
into:


int len = strlen(str);
switch (len) {
case 2:
   if (std::memcmp(str, "de", 2)
     i = 1;
   else (std::memcmp(str, “qr", 2)
     i = 1;
   else
     i = -1;
  case 4:
..

Which isn’t optimal perhaps, but isn’t bad either.  The compiler should
theoretically be able to do good things with back-to-back memcmps, but probably
isn’t doing anything good there for long ones.  Short memcmps should be turned
into a load+compare, meaning they become another switch.

-Chris

Seemingly Similar Threads

Search for more apparently analagous threads

llvm dev - Feb 2016 - StringSwitch class

[llvm-dev] StringSwitch class

[llvm-dev] StringSwitch class

[llvm-dev] StringSwitch class

[llvm-dev] StringSwitch class

[llvm-dev] StringSwitch class

Seemingly Similar Threads