> On Feb 5, 2016, at 4:42 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > >> On Feb 5, 2016, at 2:43 PM, Anupama Chandrasekhar via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> Hi: >> >> I have a question about the llvm StringSwitch class. Why is this more efficient than comparing the hashes of the strings or just using a bunch of if statements. > > I don't know about hashes performance (comparing a small string is probably more efficient than comparing hashes, and I'd be reluctant to collisions anyway). Now since you mentioned a "bunch of if statements" you can get very close. > StringSwitch is caching the length of the string to compare and first checking the length before doing the actual comparison using memcmp. So the two constructs below it should be equivalent: > > int i = StringSwitch<int>("abc").case("de", 1).case("fghi", 2).case("jkl", 3).default(-1); > > and: > > int i; > const char *str = "abc"; > int len = strlen(str); > if(len == 2 && std::memcmp(str, "de", 2) { > i = 1; > } else if(len == 4 && std::memcmp(str, "fghi", 4) { > i = 2; > } else if(len == 3 && std::memcmp(str, "jkl", 3) { > i = 3; > } else { > i = -1 > }Also, note that this sequence is specifically intended to be jump threadable and recognized by the compiler’s optimizer. This allows the compiler to turn it into: int len = strlen(str); switch (len) { case 2: if (std::memcmp(str, "de", 2) i = 1; else (std::memcmp(str, “qr", 2) i = 1; else i = -1; case 4: .. Which isn’t optimal perhaps, but isn’t bad either. The compiler should theoretically be able to do good things with back-to-back memcmps, but probably isn’t doing anything good there for long ones. Short memcmps should be turned into a load+compare, meaning they become another switch. -Chris
The point I was wondering about is, say in the example, say the input string is "de" int i = StringSwitch<int>("de") .case("de", 1) .case("fghi", 2) .case("jkl", 3) .default(-1); will cause the 3 function calls to "Case()" and 1 to "Default()", Even if the functions were inlined I would perform an if(!Result) check though Result has been found, however if I were to code the same as: int len = strlen(str); if(len == 2 && std::memcmp(str, "de", 2) { i = 1; } else if(len == 4 && std::memcmp(str, "fghi", 4) { i = 2; } else if(len == 3 && std::memcmp(str, "jkl", 3) { i = 3; } else { i = -1 } I would be performing only one if check. On Mon, Feb 8, 2016 at 10:05 AM, Chris Lattner <clattner at apple.com> wrote:> > > On Feb 5, 2016, at 4:42 PM, Mehdi Amini via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > > >> On Feb 5, 2016, at 2:43 PM, Anupama Chandrasekhar via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> > >> Hi: > >> > >> I have a question about the llvm StringSwitch class. Why is this more > efficient than comparing the hashes of the strings or just using a bunch of > if statements. > > > > I don't know about hashes performance (comparing a small string is > probably more efficient than comparing hashes, and I'd be reluctant to > collisions anyway). Now since you mentioned a "bunch of if statements" you > can get very close. > > StringSwitch is caching the length of the string to compare and first > checking the length before doing the actual comparison using memcmp. So the > two constructs below it should be equivalent: > > > > int i = StringSwitch<int>("abc").case("de", 1).case("fghi", > 2).case("jkl", 3).default(-1); > > > > and: > > > > int i; > > const char *str = "abc"; > > int len = strlen(str); > > if(len == 2 && std::memcmp(str, "de", 2) { > > i = 1; > > } else if(len == 4 && std::memcmp(str, "fghi", 4) { > > i = 2; > > } else if(len == 3 && std::memcmp(str, "jkl", 3) { > > i = 3; > > } else { > > i = -1 > > } > > Also, note that this sequence is specifically intended to be jump > threadable and recognized by the compiler’s optimizer. This allows the > compiler to turn it into: > > > int len = strlen(str); > switch (len) { > case 2: > if (std::memcmp(str, "de", 2) > i = 1; > else (std::memcmp(str, “qr", 2) > i = 1; > else > i = -1; > case 4: > .. > > Which isn’t optimal perhaps, but isn’t bad either. The compiler should > theoretically be able to do good things with back-to-back memcmps, but > probably isn’t doing anything good there for long ones. Short memcmps > should be turned into a load+compare, meaning they become another switch. > > -Chris > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160208/5954ad22/attachment.html>
Sorry for belaboring on a possibly minor point here, but is my understanding correct that even assuming that the case function is always inlined so we don't have extra function call overhead, we have the redundant if (!Result) checks when we use StringSwitch as opposed to a bunch of if- elses. Thanks. On Mon, Feb 8, 2016 at 12:00 PM, Anupama Chandrasekhar < anupama.lists at gmail.com> wrote:> The point I was wondering about is, say in the example, say the input > string is "de" > > int i = StringSwitch<int>("de") > .case("de", 1) > .case("fghi", 2) > .case("jkl", 3) > .default(-1); > > will cause the 3 function calls to "Case()" and 1 to "Default()", Even if > the functions were inlined I would perform an if(!Result) check though > Result has been found, however if I were to code the same as: > > int len = strlen(str); > if(len == 2 && std::memcmp(str, "de", 2) { > i = 1; > } else if(len == 4 && std::memcmp(str, "fghi", 4) { > i = 2; > } else if(len == 3 && std::memcmp(str, "jkl", 3) { > i = 3; > } else { > i = -1 > } > > I would be performing only one if check. > > On Mon, Feb 8, 2016 at 10:05 AM, Chris Lattner <clattner at apple.com> wrote: > >> >> > On Feb 5, 2016, at 4:42 PM, Mehdi Amini via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> > >> > >> >> On Feb 5, 2016, at 2:43 PM, Anupama Chandrasekhar via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> >> >> Hi: >> >> >> >> I have a question about the llvm StringSwitch class. Why is this more >> efficient than comparing the hashes of the strings or just using a bunch of >> if statements. >> > >> > I don't know about hashes performance (comparing a small string is >> probably more efficient than comparing hashes, and I'd be reluctant to >> collisions anyway). Now since you mentioned a "bunch of if statements" you >> can get very close. >> > StringSwitch is caching the length of the string to compare and first >> checking the length before doing the actual comparison using memcmp. So the >> two constructs below it should be equivalent: >> > >> > int i = StringSwitch<int>("abc").case("de", 1).case("fghi", >> 2).case("jkl", 3).default(-1); >> > >> > and: >> > >> > int i; >> > const char *str = "abc"; >> > int len = strlen(str); >> > if(len == 2 && std::memcmp(str, "de", 2) { >> > i = 1; >> > } else if(len == 4 && std::memcmp(str, "fghi", 4) { >> > i = 2; >> > } else if(len == 3 && std::memcmp(str, "jkl", 3) { >> > i = 3; >> > } else { >> > i = -1 >> > } >> >> Also, note that this sequence is specifically intended to be jump >> threadable and recognized by the compiler’s optimizer. This allows the >> compiler to turn it into: >> >> >> int len = strlen(str); >> switch (len) { >> case 2: >> if (std::memcmp(str, "de", 2) >> i = 1; >> else (std::memcmp(str, “qr", 2) >> i = 1; >> else >> i = -1; >> case 4: >> .. >> >> Which isn’t optimal perhaps, but isn’t bad either. The compiler should >> theoretically be able to do good things with back-to-back memcmps, but >> probably isn’t doing anything good there for long ones. Short memcmps >> should be turned into a load+compare, meaning they become another switch. >> >> -Chris >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160210/ef78b353/attachment.html>