Robinson, Paul
2013-Jun-19 22:39 UTC
[LLVMdev] How to deal with potentially unlimited count/length symbol names?
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Sean Silva > Sent: Wednesday, June 19, 2013 11:45 AM > To: edA-qa mort-ora-y > Cc: <llvmdev at cs.uiuc.edu> > Subject: Re: [LLVMdev] How to deal with potentially unlimited count/length symbol names? > > On Wed, Jun 19, 2013 at 1:04 AM, edA-qa mort-ora-y <eda-qa at disemia.com> wrote: > > > The problem is that if I derive the name from what the type contains the > > length of that name is essential unbound. So how does one generate > > names? I'm thinking of just using a long hash and hoping I don't get > > accidental collisions. Surely there must be a better way? > > Just a cryptographic hash (e.g. SHA1) to avoid the need to "hope" that there are no collisions. > > -- Sean SilvaCryptographic hashes don't guarantee you get no accidental collisions; their goal is to make it super hard to produce a collision _on purpose_. What you need is an algorithm designed for string inputs, with good uniformity, and an adequate output size; there are many. Accidental collisions are essentially the Birthday Problem: http://en.wikipedia.org/wiki/Birthday_problem See particularly the end of the "Square approximation" section, which specifically discusses the application to hashes, relating probability of collision to bit-width and number of inputs. For example, I worked out that with a 128-bit hash you need around 2^50 inputs before you get a collision probability of 1-in-a-billion. (For comparison, a 64-bit hash gets you about 185,000 inputs for the same collision probability.) If you want a "name-brand" algorithm, I'd suggest MD5 over SHA-1. It produces 128-bit output (versus 160-bit) and the fact that it is "cryptographically broken" is irrelevant to your use-case. --paulr
Sean Silva
2013-Jun-19 23:19 UTC
[LLVMdev] How to deal with potentially unlimited count/length symbol names?
On Wed, Jun 19, 2013 at 3:39 PM, Robinson, Paul < Paul_Robinson at playstation.sony.com> wrote:> > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Sean Silva > > Sent: Wednesday, June 19, 2013 11:45 AM > > To: edA-qa mort-ora-y > > Cc: <llvmdev at cs.uiuc.edu> > > Subject: Re: [LLVMdev] How to deal with potentially unlimited > count/length symbol names? > > > > On Wed, Jun 19, 2013 at 1:04 AM, edA-qa mort-ora-y <eda-qa at disemia.com> > wrote: > > > > > The problem is that if I derive the name from what the type contains > the > > > length of that name is essential unbound. So how does one generate > > > names? I'm thinking of just using a long hash and hoping I don't get > > > accidental collisions. Surely there must be a better way? > > > > Just a cryptographic hash (e.g. SHA1) to avoid the need to "hope" that > there are no collisions. > > > > -- Sean Silva > > Cryptographic hashes don't guarantee you get no accidental collisions; > their goal is to make it super hard to produce a collision _on purpose_. > What you need is an algorithm designed for string inputs, with good > uniformity, and an adequate output size; there are many. >It's obvious by the pigeonhole principle that there must be collisions. The point of my statement that OP doesn't have to "hope" that it avoids collisions: crypto hashes are designed (and depended on across the globe) to make collisions vanishingly unlikely; it's orders of magnitude more likely that a cosmic ray will silently corrupt the hash computation than for two strings to collide (except possibly for maliciously-chosen strings for weaker hashes). -- Sean Silva -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130619/dabc7e5a/attachment.html>
罗勇刚(Yonggang Luo)
2013-Jun-20 01:03 UTC
[LLVMdev] How to deal with potentially unlimited count/length symbol names?
if youd don't care the readabilit, you can compress the function name.... 在 2013-6-20 上午7:22,"Sean Silva" <silvas at purdue.edu>写道:> > > > On Wed, Jun 19, 2013 at 3:39 PM, Robinson, Paul < > Paul_Robinson at playstation.sony.com> wrote: > >> > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] >> On Behalf Of Sean Silva >> > Sent: Wednesday, June 19, 2013 11:45 AM >> > To: edA-qa mort-ora-y >> > Cc: <llvmdev at cs.uiuc.edu> >> > Subject: Re: [LLVMdev] How to deal with potentially unlimited >> count/length symbol names? >> > >> > On Wed, Jun 19, 2013 at 1:04 AM, edA-qa mort-ora-y <eda-qa at disemia.com> >> wrote: >> > >> > > The problem is that if I derive the name from what the type contains >> the >> > > length of that name is essential unbound. So how does one generate >> > > names? I'm thinking of just using a long hash and hoping I don't get >> > > accidental collisions. Surely there must be a better way? >> > >> > Just a cryptographic hash (e.g. SHA1) to avoid the need to "hope" that >> there are no collisions. >> > >> > -- Sean Silva >> >> Cryptographic hashes don't guarantee you get no accidental collisions; >> their goal is to make it super hard to produce a collision _on purpose_. >> What you need is an algorithm designed for string inputs, with good >> uniformity, and an adequate output size; there are many. >> > > It's obvious by the pigeonhole principle that there must be collisions. > The point of my statement that OP doesn't have to "hope" that it avoids > collisions: crypto hashes are designed (and depended on across the globe) > to make collisions vanishingly unlikely; it's orders of magnitude more > likely that a cosmic ray will silently corrupt the hash computation than > for two strings to collide (except possibly for maliciously-chosen strings > for weaker hashes). > > -- Sean Silva > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130620/03991647/attachment.html>
Possibly Parallel Threads
- [LLVMdev] How to deal with potentially unlimited count/length symbol names?
- [LLVMdev] How to deal with potentially unlimited count/length symbol names?
- [LLVMdev] How to deal with potentially unlimited count/length symbol names?
- [LLVMdev] How to deal with potentially unlimited count/length symbol names?
- [LLVMdev] How to deal with potentially unlimited count/length symbol names?