Hi guys, I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS implementation to supplant the currently utilized sha256. On modern 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much slower than many of the SHA-3 candidates, so I went out and did some testing (details attached) on a possible new hash algorithm that might improve on this situation. However, before I start out on a pointless endeavor, I wanted to probe the field of ZFS users, especially those using dedup, on whether their workloads would benefit from a faster hash algorithm (and hence, lower CPU utilization). Developments of late have suggested to me three possible candidates: * SHA-512: simplest to implement (since the code is already in the kernel) and provides a modest performance boost of around 60%. * Skein-512: overall fastest of the SHA-3 finalists and much faster than SHA-512 (around 120-150% faster than the current sha256). * Edon-R-512: probably the fastest general purpose hash algorithm I''ve ever seen (upward of 300% speedup over sha256) , but might have potential security problems (though I don''t think this is of any relevance to ZFS, as it doesn''t use the hash for any kind of security purposes, but only for data integrity & dedup). My testing procedure: nothing sophisticated, I took the implementation of sha256 from the Illumos kernel and simply ran it on a dedicated psrset (where possible with a whole CPU dedicated, even if only to a single thread) - I tested both the generic C implementation and the Intel assembly implementation. The Skein and Edon-R implementations are in C optimized for 64-bit architectures from the respective authors (the most up to date versions I could find). All code has been compiled using GCC 3.4.3 from the repos (the same that can be used for building Illumos). Sadly, I don''t have access to Sun Studio. Cheers, -- Saso -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: hash_benchmark.txt URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/8ec04611/attachment.txt>
On 07/10/12 19:56, Sa?o Kiselkov wrote:> Hi guys, > > I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS > implementation to supplant the currently utilized sha256. On modern > 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much > slower than many of the SHA-3 candidates, so I went out and did some > testing (details attached) on a possible new hash algorithm that might > improve on this situation.Is the intent to store the 512 bit hash or truncate to 256 bit?
Edward Ned Harvey
2012-Jul-11 03:20 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > > I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS > implementation to supplant the currently utilized sha256. On modern > 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much > slower than many of the SHA-3 candidates, so I went out and did some > testing (details attached) on a possible new hash algorithm that might > improve on this situation.As coincidence would have it, I recently benchmarked md5 hashing and AES encryption on systems with and without AES-NI. Theoretically, hashing should be much faster because it''s asymmetric, while symmetric encryption has much less speed potential. I found md5 could hash at most several hundred MB/sec, and AES was about half to quarter of that speed ... Which is consistent with the theory. But if I had AES-NI, then AES was about 1.1 GB/sec. Which means we have much *much* more speed potential available untapped in terms of hashing. Now, when you consider that a single disk typically is able to sustain 1.0Gbit (128 MB) per second, it means, very quickly the CPU can become the bottleneck for sustained disk reads in a large raid system. I think a lot of the time, people with a bunch of disks in a raid configuration are able to neglect the CPU load, just because they''re using fletcher. Of the SHA3 finalists, I was pleased to see that only one of them was based on AES-NI, and the others are actually faster. So in vague hand-waving terms, yes I believe the stronger & faster hash algorithm, in time will be a big win for zfs performance. But only in situations where people have a sufficiently large number of disks and sufficiently high expectation for IO performance. CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahead of that curve.
On 07/11/2012 02:18 AM, John Martin wrote:> On 07/10/12 19:56, Sa?o Kiselkov wrote: >> Hi guys, >> >> I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS >> implementation to supplant the currently utilized sha256. On modern >> 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much >> slower than many of the SHA-3 candidates, so I went out and did some >> testing (details attached) on a possible new hash algorithm that might >> improve on this situation. > > Is the intent to store the 512 bit hash or truncate to 256 bit? >The intent is to truncate. I know ZFS can only store up to 32 bytes in the checksum. Cheers, -- Saso
On 07/11/2012 05:20 AM, Edward Ned Harvey wrote:>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- >> bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov >> >> I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS >> implementation to supplant the currently utilized sha256. On modern >> 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much >> slower than many of the SHA-3 candidates, so I went out and did some >> testing (details attached) on a possible new hash algorithm that might >> improve on this situation. > > As coincidence would have it, I recently benchmarked md5 hashing and AES encryption on systems with and without AES-NI. Theoretically, hashing should be much faster because it''s asymmetric, while symmetric encryption has much less speed potential. I found md5 could hash at most several hundred MB/sec, and AES was about half to quarter of that speed ... Which is consistent with the theory. But if I had AES-NI, then AES was about 1.1 GB/sec. Which means we have much *much* more speed potential available untapped in terms of hashing.MD5 is a painfully slow hash compared to the SHA-3 candidates, or even SHA-512. The candidates I tested produced the following throughputs (a simple reversal of the cycles/byte metric for each CPU): Opteron 4234 (3.1 GHz): Skein-512: 355 MB/s Edon-R: 643 MB/s AMD Athlon II Neo N36L (1.3 GHz): Skein-512: 213 MB/s Edon-R: 364 MB/s Intel Xeon E5645 (2.4 GHz): Skein-512: 357 MB/s Edon-R: 734 MB/s Intel Xeon E5405 (2.0 GHz): Skein-512: 280 MB/s Edon-R: 531 MB/s Intel Xeon E5450 (3.0 GHz): Skein-512: 416 MB/s Edon-R: 792 MB/s Keep in mind that this is single-threaded on a pure-C implementation. During my tests I used GCC 3.4.3 in order to be able to assess speed improvements should the code be folded into Illumos (since that''s one compiler Illumos can be built with), but GCC 3.4.3 is seriously stone-age. Compiling with GCC 4.6.2 I got a further speed boost of around 10-20%, so even in pure C, Edon-R is probably able to breathe down the neck of the AES-NI optimized implementation you mentioned.> Now, when you consider that a single disk typically is able to sustain 1.0Gbit (128 MB) per second, it means, very quickly the CPU can become the bottleneck for sustained disk reads in a large raid system. I think a lot of the time, people with a bunch of disks in a raid configuration are able to neglect the CPU load, just because they''re using fletcher.Yes, that''s exactly what I''m getting at. It would be great to have a hash that you could enable with significantly more peace of mind than sha256 - with sha256, you always need to keep in mind that the hashing is going to be super-expensive (even for reads). My testing with a "small" JBOD from Supermicro showed that I was easily able to exceed 2 GB/s of reads off of just 45 7k2 SAS drives.> Of the SHA3 finalists, I was pleased to see that only one of them was based on AES-NI, and the others are actually faster. So in vague hand-waving terms, yes I believe the stronger & faster hash algorithm, in time will be a big win for zfs performance. But only in situations where people have a sufficiently large number of disks and sufficiently high expectation for IO performance.My whole reason for starting this exercise is that RAM is getting dirt cheap nowadays, so a reasonably large and fast dedup setup can be had for relatively little money. However, I think that the sha256 hash is really ill suited to this application and even if it isn''t a critical issue, I think it''s really inexcusable we are using something worse than the best of breed here (especially considering how ZFS was always designed to be easily extensible with new algorithms).> CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahead of that curve.As I said above, RAM is getting cheaper much faster than CPU performance is. Nowadays you can get 128 GB of server-grade RAM for around $1000, so equipping a machine with half a terabyte of RAM or more is getting commonplace. By a first degree approximation (assuming 200 B per block in the DDT) this would allow one to store upward of 80TB of unique 128K blocks in 128 GB of RAM - that''s easily above 100 TB of attached raw storage, so we''re looking at things like this: http://dataonstorage.com/dataon-products/6g-sas-jbod/dns-1660-4u-60-bay-6g-35inch-sassata-jbod.html These things come with eight 4-wide SAS 2.0 ports and enough bandwidth to saturate a QDR InfiniBand port. Another thing to consider are SSDs. A single 2U server with eight or even sixteen 2.5'''' SATA-III SSDs can achieve even higher throughput. So I fully agree with you, we need to stay ahead of the curve, however, I think the curve is much closer than we think! Cheers, -- Saso
Ferenc-Levente Juhos
2012-Jul-11 07:58 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Hello all, what about the fletcher2 and fletcher4 algorithms? According to the zfs man page on oracle, fletcher4 is the current default. Shouldn''t the fletcher algorithms be much faster then any of the SHA algorithms? On Wed, Jul 11, 2012 at 9:19 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote:> On 07/11/2012 05:20 AM, Edward Ned Harvey wrote: > >> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > >> bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > >> > >> I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS > >> implementation to supplant the currently utilized sha256. On modern > >> 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much > >> slower than many of the SHA-3 candidates, so I went out and did some > >> testing (details attached) on a possible new hash algorithm that might > >> improve on this situation. > > > > As coincidence would have it, I recently benchmarked md5 hashing and AES > encryption on systems with and without AES-NI. Theoretically, hashing > should be much faster because it''s asymmetric, while symmetric encryption > has much less speed potential. I found md5 could hash at most several > hundred MB/sec, and AES was about half to quarter of that speed ... Which > is consistent with the theory. But if I had AES-NI, then AES was about 1.1 > GB/sec. Which means we have much *much* more speed potential available > untapped in terms of hashing. > > MD5 is a painfully slow hash compared to the SHA-3 candidates, or even > SHA-512. The candidates I tested produced the following throughputs (a > simple reversal of the cycles/byte metric for each CPU): > > Opteron 4234 (3.1 GHz): > Skein-512: 355 MB/s > Edon-R: 643 MB/s > > AMD Athlon II Neo N36L (1.3 GHz): > Skein-512: 213 MB/s > Edon-R: 364 MB/s > > Intel Xeon E5645 (2.4 GHz): > Skein-512: 357 MB/s > Edon-R: 734 MB/s > > Intel Xeon E5405 (2.0 GHz): > Skein-512: 280 MB/s > Edon-R: 531 MB/s > > Intel Xeon E5450 (3.0 GHz): > Skein-512: 416 MB/s > Edon-R: 792 MB/s > > Keep in mind that this is single-threaded on a pure-C implementation. > During my tests I used GCC 3.4.3 in order to be able to assess speed > improvements should the code be folded into Illumos (since that''s one > compiler Illumos can be built with), but GCC 3.4.3 is seriously > stone-age. Compiling with GCC 4.6.2 I got a further speed boost of > around 10-20%, so even in pure C, Edon-R is probably able to breathe > down the neck of the AES-NI optimized implementation you mentioned. > > > Now, when you consider that a single disk typically is able to sustain > 1.0Gbit (128 MB) per second, it means, very quickly the CPU can become the > bottleneck for sustained disk reads in a large raid system. I think a lot > of the time, people with a bunch of disks in a raid configuration are able > to neglect the CPU load, just because they''re using fletcher. > > Yes, that''s exactly what I''m getting at. It would be great to have a > hash that you could enable with significantly more peace of mind than > sha256 - with sha256, you always need to keep in mind that the hashing > is going to be super-expensive (even for reads). My testing with a > "small" JBOD from Supermicro showed that I was easily able to exceed 2 > GB/s of reads off of just 45 7k2 SAS drives. > > > Of the SHA3 finalists, I was pleased to see that only one of them was > based on AES-NI, and the others are actually faster. So in vague > hand-waving terms, yes I believe the stronger & faster hash algorithm, in > time will be a big win for zfs performance. But only in situations where > people have a sufficiently large number of disks and sufficiently high > expectation for IO performance. > > My whole reason for starting this exercise is that RAM is getting dirt > cheap nowadays, so a reasonably large and fast dedup setup can be had > for relatively little money. However, I think that the sha256 hash is > really ill suited to this application and even if it isn''t a critical > issue, I think it''s really inexcusable we are using something worse than > the best of breed here (especially considering how ZFS was always > designed to be easily extensible with new algorithms). > > > CPU''s are not getting much faster. But IO is definitely getting faster. > It''s best to keep ahead of that curve. > > As I said above, RAM is getting cheaper much faster than CPU performance > is. Nowadays you can get 128 GB of server-grade RAM for around $1000, so > equipping a machine with half a terabyte of RAM or more is getting > commonplace. By a first degree approximation (assuming 200 B per block > in the DDT) this would allow one to store upward of 80TB of unique 128K > blocks in 128 GB of RAM - that''s easily above 100 TB of attached raw > storage, so we''re looking at things like this: > > > http://dataonstorage.com/dataon-products/6g-sas-jbod/dns-1660-4u-60-bay-6g-35inch-sassata-jbod.html > > These things come with eight 4-wide SAS 2.0 ports and enough bandwidth > to saturate a QDR InfiniBand port. Another thing to consider are SSDs. A > single 2U server with eight or even sixteen 2.5'''' SATA-III SSDs can > achieve even higher throughput. So I fully agree with you, we need to > stay ahead of the curve, however, I think the curve is much closer than > we think! > > Cheers, > -- > Saso > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/387230a2/attachment.html>
On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote:> Hello all, > > what about the fletcher2 and fletcher4 algorithms? According to the zfs man > page on oracle, fletcher4 is the current default. > Shouldn''t the fletcher algorithms be much faster then any of the SHA > algorithms? > On Wed, Jul 11, 2012 at 9:19 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote:Fletcher is a checksum, not a hash. It can and often will produce collisions, so you need to set your dedup to verify (do a bit-by-bit comparison prior to deduplication) which can result in significant write amplification (every write is turned into a read and potentially another write in case verify finds the blocks are different). With hashes, you can leave verify off, since hashes are extremely unlikely (~10^-77) to produce collisions. -- Saso
Ferenc-Levente Juhos
2012-Jul-11 08:41 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
I was under the impression that the hash (or checksum) used for data integrity is the same as the one used for deduplication, but now I see that they are different. On Wed, Jul 11, 2012 at 10:23 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote:> On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote: > > Hello all, > > > > what about the fletcher2 and fletcher4 algorithms? According to the zfs > man > > page on oracle, fletcher4 is the current default. > > Shouldn''t the fletcher algorithms be much faster then any of the SHA > > algorithms? > > On Wed, Jul 11, 2012 at 9:19 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com > >wrote: > > Fletcher is a checksum, not a hash. It can and often will produce > collisions, so you need to set your dedup to verify (do a bit-by-bit > comparison prior to deduplication) which can result in significant write > amplification (every write is turned into a read and potentially another > write in case verify finds the blocks are different). With hashes, you > can leave verify off, since hashes are extremely unlikely (~10^-77) to > produce collisions. > > -- > Saso >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/7f136a6b/attachment-0001.html>
On 07/11/2012 10:41 AM, Ferenc-Levente Juhos wrote:> I was under the impression that the hash (or checksum) used for data > integrity is the same as the one used for deduplication, > but now I see that they are different.They are the same "in use", i.e. once you switch dedup on, that implies checksum=sha256. However, if you want to you, you can force ZFS to use fletcher even with dedup by setting dedup=verify and checksum=fletcher4 (setting "dedup=on" with fletcher4 is not advised due to fletcher''s possibility of producing collisions and subsequent silent corruption). It''s also possible to set "dedup=verify" with "checksum=sha256", however, that makes little sense (as the chances of getting a random hash collision are essentially nil). Cheers, -- Saso
Joerg Schilling
2012-Jul-11 08:47 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Sa??o Kiselkov <skiselkov.ml at gmail.com> wrote:> write in case verify finds the blocks are different). With hashes, you > can leave verify off, since hashes are extremely unlikely (~10^-77) to > produce collisions.This is how a lottery works. the chance is low but some people still win. q~A
Ferenc-Levente Juhos
2012-Jul-11 08:50 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Actually although as you pointed out that the chances to have an sha256 collision is minimal, but still it can happen, that would mean that the dedup algorithm discards a block that he thinks is a duplicate. Probably it''s anyway better to do a byte to byte comparison if the hashes match to be sure that the blocks are really identical. The funny thing here is that ZFS tries to solve all sorts of data integrity issues with checksumming and healing, etc., and on the other hand a hash collision in the dedup algorithm can cause loss of data if wrongly configured. Anyway thanks that you have brought up the subject, now I know if I will enable the dedup feature I must set it to sha256,verify. On Wed, Jul 11, 2012 at 10:41 AM, Ferenc-Levente Juhos <feci1024 at gmail.com>wrote:> I was under the impression that the hash (or checksum) used for data > integrity is the same as the one used for deduplication, > but now I see that they are different. > > > On Wed, Jul 11, 2012 at 10:23 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote: > >> On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote: >> > Hello all, >> > >> > what about the fletcher2 and fletcher4 algorithms? According to the zfs >> man >> > page on oracle, fletcher4 is the current default. >> > Shouldn''t the fletcher algorithms be much faster then any of the SHA >> > algorithms? >> > On Wed, Jul 11, 2012 at 9:19 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com >> >wrote: >> >> Fletcher is a checksum, not a hash. It can and often will produce >> collisions, so you need to set your dedup to verify (do a bit-by-bit >> comparison prior to deduplication) which can result in significant write >> amplification (every write is turned into a read and potentially another >> write in case verify finds the blocks are different). With hashes, you >> can leave verify off, since hashes are extremely unlikely (~10^-77) to >> produce collisions. >> >> -- >> Saso >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/8bb9d39e/attachment.html>
Ferenc-Levente Juhos
2012-Jul-11 08:53 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
I''m pushing the send button too often, but yes, considering what said before, byte-to-byte comparison should be mandatory when deduplicating, and therefore a "lighter" hash or checksum algorithm, would suffice to reduce the number of dedup candidates. And overall deduping would be "bulletproof" and faster. On Wed, Jul 11, 2012 at 10:50 AM, Ferenc-Levente Juhos <feci1024 at gmail.com>wrote:> Actually although as you pointed out that the chances to have an sha256 > collision is minimal, but still it can happen, that would mean > that the dedup algorithm discards a block that he thinks is a duplicate. > Probably it''s anyway better to do a byte to byte comparison > if the hashes match to be sure that the blocks are really identical. > > The funny thing here is that ZFS tries to solve all sorts of data > integrity issues with checksumming and healing, etc., > and on the other hand a hash collision in the dedup algorithm can cause > loss of data if wrongly configured. > > Anyway thanks that you have brought up the subject, now I know if I will > enable the dedup feature I must set it to sha256,verify. > On Wed, Jul 11, 2012 at 10:41 AM, Ferenc-Levente Juhos < > feci1024 at gmail.com> wrote: > >> I was under the impression that the hash (or checksum) used for data >> integrity is the same as the one used for deduplication, >> but now I see that they are different. >> >> >> On Wed, Jul 11, 2012 at 10:23 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote: >> >>> On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote: >>> > Hello all, >>> > >>> > what about the fletcher2 and fletcher4 algorithms? According to the >>> zfs man >>> > page on oracle, fletcher4 is the current default. >>> > Shouldn''t the fletcher algorithms be much faster then any of the SHA >>> > algorithms? >>> > On Wed, Jul 11, 2012 at 9:19 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com >>> >wrote: >>> >>> Fletcher is a checksum, not a hash. It can and often will produce >>> collisions, so you need to set your dedup to verify (do a bit-by-bit >>> comparison prior to deduplication) which can result in significant write >>> amplification (every write is turned into a read and potentially another >>> write in case verify finds the blocks are different). With hashes, you >>> can leave verify off, since hashes are extremely unlikely (~10^-77) to >>> produce collisions. >>> >>> -- >>> Saso >>> >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/65b49a2b/attachment.html>
On 07/11/2012 10:47 AM, Joerg Schilling wrote:> Sa??o Kiselkov <skiselkov.ml at gmail.com> wrote: > >> write in case verify finds the blocks are different). With hashes, you >> can leave verify off, since hashes are extremely unlikely (~10^-77) to >> produce collisions. > > This is how a lottery works. the chance is low but some people still win. > > q~AYou do realize that the age of the universe is only on the order of around 10^18 seconds, do you? Even if you had a trillion CPUs each chugging along at 3.0 GHz for all this time, the number of processor cycles you will have executed cumulatively is only on the order 10^40, still 37 orders of magnitude lower than the chance for a random hash collision. Cheers, -- Saso
Darren J Moffat
2012-Jul-11 09:02 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On 07/11/12 00:56, Sa?o Kiselkov wrote:> * SHA-512: simplest to implement (since the code is already in the > kernel) and provides a modest performance boost of around 60%.FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/256. http://csrc.nist.gov/publications/fips/fips180-4/fips-180-4.pdf Note this is NOT a simple truncation of SHA-512 since when using SHA-512/t the initial value H(0) is different. See sections 5.3.6.2 and 6.7. I recommend the checksum value for this be checksum=sha512/256 A / in the value doesn''t cause any problems and it is the official NIST name of that hash. With the internal enum being: ZIO_CHECKSUM_SHA512_256 CR 7020616 already exists for adding this in Oracle Solaris. -- Darren J Moffat
On 07/11/2012 11:02 AM, Darren J Moffat wrote:> On 07/11/12 00:56, Sa?o Kiselkov wrote: >> * SHA-512: simplest to implement (since the code is already in the >> kernel) and provides a modest performance boost of around 60%. > > FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/256. > > http://csrc.nist.gov/publications/fips/fips180-4/fips-180-4.pdf > > Note this is NOT a simple truncation of SHA-512 since when using > SHA-512/t the initial value H(0) is different.Yes, I know that. In my original post I was only trying to probe the landscape for whether there is interest in this in the community. Of course for the implementation I''d go with the standardized truncation function. Skein-512 already includes a truncation function. For Edon-R, I''d have to devise one (probably based around SHA-512/256 or Skein).> See sections 5.3.6.2 and 6.7. > > I recommend the checksum value for this be > checksum=sha512/256 > > A / in the value doesn''t cause any problems and it is the official NIST > name of that hash. > > With the internal enum being: ZIO_CHECKSUM_SHA512_256 > > CR 7020616 already exists for adding this in Oracle Solaris.Okay, if I''ll implement it identically in Illumos then. However, given that I plan to use featureflags to advertise the feature to the outside world, I doubt interoperability will be possible (even though the on-disk format could be identical). Cheers, -- Saso
On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote:> Actually although as you pointed out that the chances to have an sha256 > collision is minimal, but still it can happen, that would mean > that the dedup algorithm discards a block that he thinks is a duplicate. > Probably it''s anyway better to do a byte to byte comparison > if the hashes match to be sure that the blocks are really identical. > > The funny thing here is that ZFS tries to solve all sorts of data integrity > issues with checksumming and healing, etc., > and on the other hand a hash collision in the dedup algorithm can cause > loss of data if wrongly configured. > > Anyway thanks that you have brought up the subject, now I know if I will > enable the dedup feature I must set it to sha256,verify.Oh jeez, I can''t remember how many times this flame war has been going on on this list. Here''s the gist: SHA-256 (or any good hash) produces a near uniform random distribution of output. Thus, the chances of getting a random hash collision are around 2^-256 or around 10^-77. If I asked you to pick two atoms at random *from the entire observable universe*, your chances of hitting on the same atom are higher than the chances of that hash collision. So leave dedup=on with sha256 and move on. Cheers, -- Saso
Joerg Schilling
2012-Jul-11 09:44 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Sa?o Kiselkov <skiselkov.ml at gmail.com> wrote:> On 07/11/2012 10:47 AM, Joerg Schilling wrote: > > Sa??o Kiselkov <skiselkov.ml at gmail.com> wrote: > > > >> write in case verify finds the blocks are different). With hashes, you > >> can leave verify off, since hashes are extremely unlikely (~10^-77) to > >> produce collisions. > > > > This is how a lottery works. the chance is low but some people still win. > > > > q~A > > You do realize that the age of the universe is only on the order of > around 10^18 seconds, do you? Even if you had a trillion CPUs each > chugging along at 3.0 GHz for all this time, the number of processor > cycles you will have executed cumulatively is only on the order 10^40, > still 37 orders of magnitude lower than the chance for a random hash > collision.This is not how chance works. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
On 11 July, 2012 - Sa??o Kiselkov sent me these 1,4K bytes:> On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote: > > Actually although as you pointed out that the chances to have an sha256 > > collision is minimal, but still it can happen, that would mean > > that the dedup algorithm discards a block that he thinks is a duplicate. > > Probably it''s anyway better to do a byte to byte comparison > > if the hashes match to be sure that the blocks are really identical. > > > > The funny thing here is that ZFS tries to solve all sorts of data integrity > > issues with checksumming and healing, etc., > > and on the other hand a hash collision in the dedup algorithm can cause > > loss of data if wrongly configured. > > > > Anyway thanks that you have brought up the subject, now I know if I will > > enable the dedup feature I must set it to sha256,verify. > > Oh jeez, I can''t remember how many times this flame war has been going > on on this list. Here''s the gist: SHA-256 (or any good hash) produces a > near uniform random distribution of output. Thus, the chances of getting > a random hash collision are around 2^-256 or around 10^-77. If I asked > you to pick two atoms at random *from the entire observable universe*, > your chances of hitting on the same atom are higher than the chances of > that hash collision. So leave dedup=on with sha256 and move on.So in ZFS, which normally uses 128kB blocks, you can instead store them 100% uniquely into 32 bytes.. A nice 4096x compression rate.. decompression is a bit slower though.. /Tomas -- Tomas Forsman, stric at acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Ume? `- Sysadmin at {cs,acc}.umu.se
Casper.Dik at oracle.com
2012-Jul-11 10:00 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>You do realize that the age of the universe is only on the order of >around 10^18 seconds, do you? Even if you had a trillion CPUs each >chugging along at 3.0 GHz for all this time, the number of processor >cycles you will have executed cumulatively is only on the order 10^40, >still 37 orders of magnitude lower than the chance for a random hash >collision. >Suppose you find a weakness in a specific hash algorithm; you use this to create hash collisions and now imagined you store the hash collisions in a zfs dataset with dedup enabled using the same hash algorithm..... Casper
Justin Stringfellow
2012-Jul-11 10:24 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>>You do realize that the age of the universe is only on the order of >>around 10^18 seconds, do you? Even if you had a trillion CPUs each >>chugging along at 3.0 GHz for all this time, the number of processor >>cycles you will have executed cumulatively is only on the order 10^40, >>still 37 orders of magnitude lower than the chance for a random hash >>collision.Here we go, boiling the oceans again :)>Suppose you find a weakness in a specific hash algorithm; you use this >to create hash collisions and now imagined you store the hash collisions >in a zfs dataset with dedup enabled using the same hash algorithm.....Sorry, but isn''t this what dedup=verify solves? I don''t see the problem here. Maybe all that''s needed is a comment in the manpage saying hash algorithms aren''t perfect. ? cheers, --justin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/3fe9e680/attachment.html>
Casper.Dik at oracle.com
2012-Jul-11 10:27 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>Sorry, but isn''t this what dedup=verify solves? I don''t see the problem here. >Maybe all that''s needed is a comment in the manpage saying hash algorithms >aren''t perfect.The point is that hash functions are many to one and I think the point was about that verify wasn''t really needed if the hash function is good enough. Casper
Ferenc-Levente Juhos
2012-Jul-11 10:32 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Saso, I''m not flaming at all, I happen to disagree, but still I understand that chances are very very very slim, but as one poster already said, this is how the lottery works. I''m not saying one should make an exhaustive search with trillions of computers just to produce a sha256 collision. If I wanted an exhaustive search I would generate all the numbers from 0 to 2**256 and I would definitely get at least 1 collision. If you formulate it in another way, by generating all the possible 256 bit (32 byte) blocks + 1 you will definitely get a collision. This is much more credible than the analogy with the age of the universe and atoms picked at random, etc. The fact is it can happen, it''s entirely possible that there are two jpg''s in the universe with different content and they have the same hash. I can''t prove the existence of those, but you can''t deny it. The fact is, that zfs and everyone using it trys to correct data degradation e.g. cause by cosmic rays, and on the other hand their using probability calculation (no matter how slim the chances are) to potentially discard valid data. You can come with other universe and atom theories and with the age of the universe, etc. The fact remains the same. And each generation was convinced that their current best checksum or hash algorithm is the best and will be the best forever. MD5 has demonstrated that it''s not the case. Time will tell what becomes of SHA256, but why take any chances. On Wed, Jul 11, 2012 at 11:10 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote:> On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote: > > Actually although as you pointed out that the chances to have an sha256 > > collision is minimal, but still it can happen, that would mean > > that the dedup algorithm discards a block that he thinks is a duplicate. > > Probably it''s anyway better to do a byte to byte comparison > > if the hashes match to be sure that the blocks are really identical. > > > > The funny thing here is that ZFS tries to solve all sorts of data > integrity > > issues with checksumming and healing, etc., > > and on the other hand a hash collision in the dedup algorithm can cause > > loss of data if wrongly configured. > > > > Anyway thanks that you have brought up the subject, now I know if I will > > enable the dedup feature I must set it to sha256,verify. > > Oh jeez, I can''t remember how many times this flame war has been going > on on this list. Here''s the gist: SHA-256 (or any good hash) produces a > near uniform random distribution of output. Thus, the chances of getting > a random hash collision are around 2^-256 or around 10^-77. If I asked > you to pick two atoms at random *from the entire observable universe*, > your chances of hitting on the same atom are higher than the chances of > that hash collision. So leave dedup=on with sha256 and move on. > > Cheers, > -- > Saso >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/d504bfca/attachment-0001.html>
Ferenc-Levente Juhos
2012-Jul-11 10:37 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Precisely, I said the same thing a few posts before: dedup=verify solves that. And as I said, one could use dedup=<hash algorithm>,verify with an inferior hash algorithm (that is much faster) with the purpose of reducing the number of dedup candidates. For that matter one could use a trivial CRC32, if the two blocks have the same CRC you do anyway a byte-to-byte comparison, but if they have differenct CRC''s you don''t need to do the actual comparison. On Wed, Jul 11, 2012 at 12:24 PM, Justin Stringfellow < justin.stringfellow at btinternet.com> wrote:> >>You do realize that the age of the universe is only on the order of > >>around 10^18 seconds, do you? Even if you had a trillion CPUs each > >>chugging along at 3.0 GHz for all this time, the number of processor > >>cycles you will have executed cumulatively is only on the order 10^40, > >>still 37 orders of magnitude lower than the chance for a random hash > >>collision. > > Here we go, boiling the oceans again :) > > > >Suppose you find a weakness in a specific hash algorithm; you use this > >to create hash collisions and now imagined you store the hash collisions > >in a zfs dataset with dedup enabled using the same hash algorithm..... > > Sorry, but isn''t this what dedup=verify solves? I don''t see the problem > here. Maybe all that''s needed is a comment in the manpage saying hash > algorithms aren''t perfect. > > cheers, > --justin > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/13519fc6/attachment.html>
On 07/11/2012 11:53 AM, Tomas Forsman wrote:> On 11 July, 2012 - Sa??o Kiselkov sent me these 1,4K bytes: >> Oh jeez, I can''t remember how many times this flame war has been going >> on on this list. Here''s the gist: SHA-256 (or any good hash) produces a >> near uniform random distribution of output. Thus, the chances of getting >> a random hash collision are around 2^-256 or around 10^-77. If I asked >> you to pick two atoms at random *from the entire observable universe*, >> your chances of hitting on the same atom are higher than the chances of >> that hash collision. So leave dedup=on with sha256 and move on. > > So in ZFS, which normally uses 128kB blocks, you can instead store them > 100% uniquely into 32 bytes.. A nice 4096x compression rate.. > decompression is a bit slower though..I really mean no disrespect, but this comment is so dumb I could swear my IQ dropped by a few tenths of a point just by reading. Cheers, -- Saso
On 07/11/2012 12:00 PM, Casper.Dik at oracle.com wrote:> > >> You do realize that the age of the universe is only on the order of >> around 10^18 seconds, do you? Even if you had a trillion CPUs each >> chugging along at 3.0 GHz for all this time, the number of processor >> cycles you will have executed cumulatively is only on the order 10^40, >> still 37 orders of magnitude lower than the chance for a random hash >> collision. >> > > Suppose you find a weakness in a specific hash algorithm; you use this > to create hash collisions and now imagined you store the hash collisions > in a zfs dataset with dedup enabled using the same hash algorithm.....That is one possibility I considered when evaluating whether to implement the Edon-R hash algorithm. It''s security margin is somewhat questionable, so then the next step is to determine whether this can be used in any way to implement a practical data-corruption attack. I guess is that it can''t, but I''m open to debate on this issue, since I''m not a security expert myself. The reason why I don''t think this can be used to implement a practical attack is that in order to generate a collision, you first have to know the disk block that you want to create a collision on (or at least the checksum), i.e. the original block is already in the pool. At that point, you could write a colliding block which would get de-dup''d, but that doesn''t mean you''ve corrupted the original data, only that you referenced it. So, in a sense, you haven''t corrupted the original block, only your own collision block (since that''s the copy doesn''t get written). Cheers, -- Saso
On 07/11/2012 12:24 PM, Justin Stringfellow wrote:>> Suppose you find a weakness in a specific hash algorithm; you use this >> to create hash collisions and now imagined you store the hash collisions >> in a zfs dataset with dedup enabled using the same hash algorithm..... > > Sorry, but isn''t this what dedup=verify solves? I don''t see the problem here. Maybe all that''s needed is a comment in the manpage saying hash algorithms aren''t perfect.It does solve it, but at a cost to normal operation. Every write gets turned into a read. Assuming a big enough and reasonably busy dataset, this leads to tremendous write amplification. Cheers, -- Saso
Casper.Dik at oracle.com
2012-Jul-11 11:00 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>On 07/11/2012 12:24 PM, Justin Stringfellow wrote: >>> Suppose you find a weakness in a specific hash algorithm; you use this >>> to create hash collisions and now imagined you store the hash collisions >>> in a zfs dataset with dedup enabled using the same hash algorithm..... >> >> Sorry, but isn''t this what dedup=verify solves? I don''t see the problem here. Maybe all that''s needed is a comment in the manpage saying hash algorithms aren''t perfect.> >It does solve it, but at a cost to normal operation. Every write gets >turned into a read. Assuming a big enough and reasonably busy dataset, >this leads to tremendous write amplification.If and only if the block is being dedup''ed. (In that case, you''re just changing the write of a whole block into one read (of the block) and an update in the dedup date (the whole block isn''t written) Casper
Justin Stringfellow
2012-Jul-11 11:09 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> The point is that hash functions are many to one and I think the point > was about that verify wasn''t really needed if the hash function is good > enough.This is a circular argument really, isn''t it? Hash algorithms are never perfect, but we''re trying to build a perfect one? ? It seems to me the obvious fix is to use hash to identify candidates for dedup, and then do the actual verify and dedup asynchronously. Perhaps a worker thread doing this at low priority? Did anyone consider this? ? cheers, --justin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/832ef189/attachment.html>
On 07/11/2012 12:32 PM, Ferenc-Levente Juhos wrote:> Saso, I''m not flaming at all, I happen to disagree, but still I understand > that > chances are very very very slim, but as one poster already said, this is > how > the lottery works. I''m not saying one should make an exhaustive search with > trillions of computers just to produce a sha256 collision. > If I wanted an exhaustive search I would generate all the numbers from > 0 to 2**256 and I would definitely get at least 1 collision. > If you formulate it in another way, by generating all the possible 256 bit > (32 byte) > blocks + 1 you will definitely get a collision. This is much more credible > than the analogy with the age of the universe and atoms picked at random, > etc.First of all, I never said that the chance is zero. It''s definitely non-zero, but claiming that is analogous to the lottery is just not appreciating the scale of the difference. Next, your proposed method of finding hash collisions is naive in that you assume that you merely need generate 256-bit numbers. First of all, the smallest blocks in ZFS are 1k (IIRC), i.e. 8192 bits. Next, you fail to appreciate the difficulty of even generating 2^256 256-bit numbers. Here''s how much memory you''d need: 2^256 * 32 ~= 2^261 bytes Memory may be cheap, but not that cheap...> The fact is it can happen, it''s entirely possible that there are two jpg''s > in the universe with different content and they have the same hash. > I can''t prove the existence of those, but you can''t deny it.I''m not denying that. Read what I said.> The fact is, that zfs and everyone using it trys to correct data > degradation e.g. cause by cosmic rays, and on the other hand their > using probability calculation (no matter how slim the chances are) to > potentially discard valid data. > You can come with other universe and atom theories and with the > age of the universe, etc. The fact remains the same.This is just a profoundly naive statement. You always need to make trade-offs between practicality and performance. How "slim" the chances are has a very real impact on engineering decisions.> And each generation was convinced that their current best checksum or > hash algorithm is the best and will be the best forever. MD5 has > demonstrated that it''s not the case. Time will tell what becomes of > SHA256, but why take any chances.You really don''t understand much about hash algorithms, do you? MD5 has *very* good safety against random collisions - that''s why it''s still used in archive integrity checking (ever wonder what algorithm Debian and RPM packages use?). The reason why it''s been replaced by newer algorithms has nothing to do with its chance of random hash collisions, but with deliberate collision attacks on security algorithms built on top of it. Also, the reason why we don''t use it in ZFS is because SHA-256 is faster and it has a larger pattern space (thus further lowering the odds of random collisions). -- Saso
On 07/11/2012 12:37 PM, Ferenc-Levente Juhos wrote:> Precisely, I said the same thing a few posts before: > dedup=verify solves that. And as I said, one could use dedup=<hash > algorithm>,verify with > an inferior hash algorithm (that is much faster) with the purpose of > reducing the number of dedup candidates. > For that matter one could use a trivial CRC32, if the two blocks have the > same CRC you do anyway a > byte-to-byte comparison, but if they have differenct CRC''s you don''t need > to do the actual comparison.Yes, and that''s exactly what the Fletcher4+verify combo does (Fletcher is faster than CRC32). That''s why I started this thread - to assess whether a better hash algorithm is needed and/or desired. -- Saso
On 07/11/2012 01:09 PM, Justin Stringfellow wrote:>> The point is that hash functions are many to one and I think the point >> was about that verify wasn''t really needed if the hash function is good >> enough. > > This is a circular argument really, isn''t it? Hash algorithms are never perfect, but we''re trying to build a perfect one? > > It seems to me the obvious fix is to use hash to identify candidates for dedup, and then do the actual verify and dedup asynchronously. Perhaps a worker thread doing this at low priority? > Did anyone consider this?This assumes you have low volumes of deduplicated data. As your dedup ratio grows, so does the performance hit from dedup=verify. At, say, dedupratio=10.0x, on average, every write results in 10 reads. -- Saso
Casper.Dik at oracle.com
2012-Jul-11 11:36 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>This assumes you have low volumes of deduplicated data. As your dedup >ratio grows, so does the performance hit from dedup=verify. At, say, >dedupratio=10.0x, on average, every write results in 10 reads.I don''t follow. If dedupratio == 10, it means that each item is *referenced* 10 times but it is only stored *once*. Only when you have hash collisions then multiple reads would be needed. Only one read is needed except in the case of hash collisions. Casper
Justin Stringfellow
2012-Jul-11 11:42 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> This assumes you have low volumes of deduplicated data. As your dedup > ratio grows, so does the performance hit from dedup=verify. At, say, > dedupratio=10.0x, on average, every write results in 10 reads.Well you can''t make an omelette without breaking eggs! Not a very nice one, anyway. ? Yes dedup is expensive but much like using O_SYNC,?it''s a?conscious decision here to?take a performance hit?in order to be sure about our data. Moving the actual reads to a async thread as I suggested?should improve things. ? cheers, --justin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/120cc4b6/attachment.html>
On 07/11/2012 01:36 PM, Casper.Dik at oracle.com wrote:> > >> This assumes you have low volumes of deduplicated data. As your dedup >> ratio grows, so does the performance hit from dedup=verify. At, say, >> dedupratio=10.0x, on average, every write results in 10 reads. > > I don''t follow. > > If dedupratio == 10, it means that each item is *referenced* 10 times > but it is only stored *once*. Only when you have hash collisions then > multiple reads would be needed. > > Only one read is needed except in the case of hash collisions.No, *every* dedup write will result in a block read. This is how: 1) ZIO gets block X and computes HASH(X) 2) ZIO looks up HASH(X) in DDT 2a) HASH(X) not in DDT -> unique write; exit 2b) HASH(X) in DDT; continue 3) Read original disk block Y with HASH(Y) = HASH(X) <--here''s the read 4) Verify X == Y 4a) X == Y; increment refcount 4b) X != Y; hash collision; write new block <-- here''s the collision So in other words, by the time you figure out you''ve got a hash collision, you already did the read, ergo, every dedup write creates a read! -- Saso
On 07/11/2012 01:42 PM, Justin Stringfellow wrote:>> This assumes you have low volumes of deduplicated data. As your dedup >> ratio grows, so does the performance hit from dedup=verify. At, say, >> dedupratio=10.0x, on average, every write results in 10 reads. > > Well you can''t make an omelette without breaking eggs! Not a very nice one, anyway. > > Yes dedup is expensive but much like using O_SYNC, it''s a conscious decision here to take a performance hit in order to be sure about our data. Moving the actual reads to a async thread as I suggested should improve things.And my point here is that the expense is unnecessary and can be omitted if we choose our algorithms and settings carefully. "Async" here won''t help, you''ll still get equal write amplification, only it''s going to be spread in between txg''s. -- Saso
On Wed, July 11, 2012 04:50, Ferenc-Levente Juhos wrote:> Actually although as you pointed out that the chances to have an sha256 > collision is minimal, but still it can happen, that would mean > that the dedup algorithm discards a block that he thinks is a duplicate. > Probably it''s anyway better to do a byte to byte comparison > if the hashes match to be sure that the blocks are really identical. > > The funny thing here is that ZFS tries to solve all sorts of data > integrity > issues with checksumming and healing, etc., > and on the other hand a hash collision in the dedup algorithm can cause > loss of data if wrongly configured. > > Anyway thanks that you have brought up the subject, now I know if I will > enable the dedup feature I must set it to sha256,verify.There was a long-ish thread on the topic last year ("(Fletcher+Verification) versus (Sha256+No Verification) "): http://mail.opensolaris.org/pipermail/zfs-discuss/2011-January/thread.html#46864
On Tue, July 10, 2012 19:56, Sa??o Kiselkov wrote:> However, before I start out on a pointless endeavor, I wanted to probe > the field of ZFS users, especially those using dedup, on whether their > workloads would benefit from a faster hash algorithm (and hence, lower > CPU utilization). Developments of late have suggested to me three > possible candidates:[...] I''d wait until SHA-3 is announced. It''s supposed to happen this year, of which only six months are left: http://csrc.nist.gov/groups/ST/hash/timeline.html http://en.wikipedia.org/wiki/NIST_hash_function_competition It was actually supposed to happen to "2Q", so they''re running a little late it seems.
On 07/11/2012 03:39 PM, David Magda wrote:> On Tue, July 10, 2012 19:56, Sa??o Kiselkov wrote: >> However, before I start out on a pointless endeavor, I wanted to probe >> the field of ZFS users, especially those using dedup, on whether their >> workloads would benefit from a faster hash algorithm (and hence, lower >> CPU utilization). Developments of late have suggested to me three >> possible candidates: > [...] > > I''d wait until SHA-3 is announced. It''s supposed to happen this year, of > which only six months are left: > > http://csrc.nist.gov/groups/ST/hash/timeline.html > http://en.wikipedia.org/wiki/NIST_hash_function_competition > > It was actually supposed to happen to "2Q", so they''re running a little > late it seems.I''m not convinced waiting makes much sense. The SHA-3 standardization process'' goals are different from "ours". SHA-3 can choose to go with something that''s slower, but has a higher security margin. I think that absolute super-tight security isn''t all that necessary for ZFS, since the hash isn''t used for security purposes. We only need something that''s fast and has a good pseudo-random output distribution. That''s why I looked toward Edon-R. Even though it might have security problems in itself, it''s by far the fastest algorithm in the entire competition. Cheers, -- Saso
Gregg Wonderly
2012-Jul-11 13:57 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn''t that be the win? Gregg Wonderly On Jul 11, 2012, at 5:56 AM, Sa?o Kiselkov wrote:> On 07/11/2012 12:24 PM, Justin Stringfellow wrote: >>> Suppose you find a weakness in a specific hash algorithm; you use this >>> to create hash collisions and now imagined you store the hash collisions >>> in a zfs dataset with dedup enabled using the same hash algorithm..... >> >> Sorry, but isn''t this what dedup=verify solves? I don''t see the problem here. Maybe all that''s needed is a comment in the manpage saying hash algorithms aren''t perfect. > > It does solve it, but at a cost to normal operation. Every write gets > turned into a read. Assuming a big enough and reasonably busy dataset, > this leads to tremendous write amplification. > > Cheers, > -- > Saso > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Edward Ned Harvey
2012-Jul-11 13:58 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > > On 07/11/2012 11:53 AM, Tomas Forsman wrote: > > On 11 July, 2012 - Sa??o Kiselkov sent me these 1,4K bytes: > >> Oh jeez, I can''t remember how many times this flame war has been going > >> on on this list. Here''s the gist: SHA-256 (or any good hash) produces a > >> near uniform random distribution of output. Thus, the chances ofgetting> >> a random hash collision are around 2^-256 or around 10^-77. If I asked > >> you to pick two atoms at random *from the entire observable universe*, > >> your chances of hitting on the same atom are higher than the chances of > >> that hash collision. So leave dedup=on with sha256 and move on. > > > > So in ZFS, which normally uses 128kB blocks, you can instead store them > > 100% uniquely into 32 bytes.. A nice 4096x compression rate.. > > decompression is a bit slower though.. > > I really mean no disrespect, but this comment is so dumb I could swear > my IQ dropped by a few tenths of a point just by reading.Cool it please. You say "I mean no disrespect" and then say something which is clearly disrespectful. Tomas''s point is to illustrate that hashing is a many-to-one function. If it were possible to rely on the hash to always be unique, then you could use it as a compression algorithm. He''s pointing out that''s insane. His comment was not in the slightest bit dumb; if anything, it seems like maybe somebody (or some people) didn''t get his point.
On 07/11/2012 03:57 PM, Gregg Wonderly wrote:> Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn''t that be the win?Don''t think that, if you can think of this procedure, that the crypto security guys at universities haven''t though about it as well? Of course they have. No, simply generating a sequence of random patterns and hoping to hit a match won''t do the trick. P.S. I really don''t mean to sound smug or anything, but I know one thing for sure: the crypto researchers who propose these algorithms are some of the brightest minds on this topic on the planet, so I would hardly think they didn''t consider trivial problems. Cheers, -- Saso
Edward Ned Harvey
2012-Jul-11 14:03 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > > As your dedup > ratio grows, so does the performance hit from dedup=verify. At, say, > dedupratio=10.0x, on average, every write results in 10 reads.Why? If you intend to write a block, and you discover it has a matching checksum, you only need to verify it matches one block. You don''t need to verify it matches all the other blocks that have previously been verified to match each other.
Bob Friesenhahn
2012-Jul-11 14:05 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Tue, 10 Jul 2012, Edward Ned Harvey wrote:> > CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahead of that curve.It seems that per-socket CPU performance is doubling every year. That seems like faster to me. If server CPU chipsets offer accelleration for some type of standard encryption, then that needs to be considered. The CPU might not need to do the encryption the hard way. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Edward Ned Harvey
2012-Jul-11 14:13 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Gregg Wonderly > > Since there is a finite number of bit patterns per block, have you triedto just> calculate the SHA-256 or SHA-512 for every possible bit pattern to see ifthere> is ever a collision? If you found an algorithm that produced nocollisions for> any possible block bit pattern, wouldn''t that be the win?Maybe I misunderstand what you''re saying, but if I got it right, what you''re saying is physically impossible to do in the time of the universe... And guaranteed to fail even if you had all the computational power of God. I think you''re saying ... In a block of 128k, sequentially step through all the possible values ... starting with 0, 1, 2, ... 2^128k ... and compute the hashes of each value, and see if you ever find a hash collision. If this is indeed what you''re saying, recall, the above operation will require on order 2^128k operations to complete. But present national security standards accept 2^256 operations as satisfactory to protect data from brute force attacks over the next 30 years. Furthermore, in a 128k block, there exist 2^128k possible values, while in a 512bit hash, there exist only 2^512 possible values (which is still a really huge number.) This means there will exist at least 2^127.5k collisions. However, these numbers are so astronomically universally magnanimously huge, it will still take more than a lifetime to find any one of those collisions. So it''s impossible to perform such a computation, and if you could, you would be guaranteed to find a LOT of collisions.
On 07/11/2012 03:58 PM, Edward Ned Harvey wrote:>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- >> bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov >> >> I really mean no disrespect, but this comment is so dumb I could swear >> my IQ dropped by a few tenths of a point just by reading. > > Cool it please. You say "I mean no disrespect" and then say something which > is clearly disrespectful.I sort of flew off the handle there, and I shouldn''t have. It felt like Tomas was misrepresenting my position and putting words in my mouth I didn''t say. I certainly didn''t mean to diminish the validity of an honest question.> Tomas''s point is to illustrate that hashing is a many-to-one function. If > it were possible to rely on the hash to always be unique, then you could use > it as a compression algorithm. He''s pointing out that''s insane. His > comment was not in the slightest bit dumb; if anything, it seems like maybe > somebody (or some people) didn''t get his point.I understood his point very well and I never argued that hashing always results in unique hash values, which is why I thought he was misrepresenting what I said. So for a full explanation of why hashes aren''t usable for compression: 1) they are one-way (kind of bummer for decompression) 2) they operate far below the Shannon limit (i.e. unusable for lossless compression) 3) their output is pseudo-random, so even if we find collisions, we have no way to distinguish which input was the most likely one meant for a given hash value (all are equally probable) A formal proof would of course take longer to construct and would take time that I feel is best spent writing code. Cheers, -- Saso
Joerg Schilling
2012-Jul-11 14:17 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:> On Tue, 10 Jul 2012, Edward Ned Harvey wrote: > > > > CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahead of that curve. > > It seems that per-socket CPU performance is doubling every year. > That seems like faster to me.This would only apply, if you implement a multi threaded hash. The single threaded performance win is less that doubling each year. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Gregg Wonderly
2012-Jul-11 14:19 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
But this is precisely the kind of "observation" that some people seem to miss out on the importance of. As Tomas suggested in his post, if this was true, then we could have a huge compression ratio as well. And even if there was 10% of the bit patterns that created non-unique hashes, you could use the fact that a block hashed to a known bit pattern that didn''t have collisions, to compress the other 90% of your data. I''m serious about this from a number of perspectives. We worry about the time it would take to reverse SHA or RSA hashes to passwords, not even thinking that what if someone has been quietly computing all possible hashes for the past 10-20 years into a database some where, with every 5-16 character password, and now has an instantly searchable hash-to-password database. Sometimes we ignore the scale of time, thinking that only the immediately visible details are what we have to work with. If no one has computed the hashes for every single 4K and 8K block, then fine. But, if that was done, and we had that data, we''d know for sure which algorithm was going to work the best for the number of bits we are considering. Speculating based on the theory of the algorithms for "random" number of bits is just silly. Where''s the real data that tells us what we need to know? Gregg Wonderly On Jul 11, 2012, at 9:02 AM, Sa?o Kiselkov wrote:> On 07/11/2012 03:57 PM, Gregg Wonderly wrote: >> Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn''t that be the win? > > Don''t think that, if you can think of this procedure, that the crypto > security guys at universities haven''t though about it as well? Of course > they have. No, simply generating a sequence of random patterns and > hoping to hit a match won''t do the trick. > > P.S. I really don''t mean to sound smug or anything, but I know one thing > for sure: the crypto researchers who propose these algorithms are some > of the brightest minds on this topic on the planet, so I would hardly > think they didn''t consider trivial problems. > > Cheers, > -- > Saso
Bob Friesenhahn
2012-Jul-11 14:22 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Wed, 11 Jul 2012, Sa?o Kiselkov wrote:> the hash isn''t used for security purposes. We only need something that''s > fast and has a good pseudo-random output distribution. That''s why I > looked toward Edon-R. Even though it might have security problems in > itself, it''s by far the fastest algorithm in the entire competition.If an algorithm is not ''secure'' and zfs is not set to verify, doesn''t that mean that a knowledgeable user will be able to cause intentional data corruption if deduplication is enabled? A user with very little privilege might be able to cause intentional harm by writing the magic data block before some other known block (which produces the same hash) is written. This allows one block to substitute for another. It does seem that security is important because with a human element, data is not necessarily random. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Ferenc-Levente Juhos
2012-Jul-11 14:22 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
You don''t need to reproduce all possible blocks. 1. SHA256 produces a 256 bit hash 2. That means it produces a value on 256 bits, in other words a value between 0..2^256 - 1 3. If you start counting from 0 to 2^256 and for each number calculate the SHA256 you will get at least one hash collision (if the hash algortihm is prefectly distributed) 4. Counting from 0 to 2^256, is nothing else but reproducing all possible bit pattern on 32 bytes It''s not about whether one computer is capable of producing the above hashes or not, or whether there are actually that many unique 32 byte bit patterns in the universe. A collision can happen. On Wed, Jul 11, 2012 at 3:57 PM, Gregg Wonderly <gregg at wonderly.org> wrote:> Since there is a finite number of bit patterns per block, have you tried > to just calculate the SHA-256 or SHA-512 for every possible bit pattern to > see if there is ever a collision? If you found an algorithm that produced > no collisions for any possible block bit pattern, wouldn''t that be the win? > > Gregg Wonderly > > On Jul 11, 2012, at 5:56 AM, Sa?o Kiselkov wrote: > > > On 07/11/2012 12:24 PM, Justin Stringfellow wrote: > >>> Suppose you find a weakness in a specific hash algorithm; you use this > >>> to create hash collisions and now imagined you store the hash > collisions > >>> in a zfs dataset with dedup enabled using the same hash algorithm..... > >> > >> Sorry, but isn''t this what dedup=verify solves? I don''t see the problem > here. Maybe all that''s needed is a comment in the manpage saying hash > algorithms aren''t perfect. > > > > It does solve it, but at a cost to normal operation. Every write gets > > turned into a read. Assuming a big enough and reasonably busy dataset, > > this leads to tremendous write amplification. > > > > Cheers, > > -- > > Saso > > _______________________________________________ > > zfs-discuss mailing list > > zfs-discuss at opensolaris.org > > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/08d32d3a/attachment-0001.html>
Casper.Dik at oracle.com
2012-Jul-11 14:23 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>On Tue, 10 Jul 2012, Edward Ned Harvey wrote: >> >> CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahead of that curve.> >It seems that per-socket CPU performance is doubling every year. >That seems like faster to me.I think that I/O isn''t getting as fast as CPU is; memory capacity and bandwith and CPUs are getting faster. I/O, not so much. (Apart from the one single step from harddisk to SSD; but note that I/O is limited to standard interfaces and as such it is likely be helddown by requiring a new standard. Casper
Edward Ned Harvey
2012-Jul-11 14:26 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: Bob Friesenhahn [mailto:bfriesen at simple.dallas.tx.us] > Sent: Wednesday, July 11, 2012 10:06 AM > > On Tue, 10 Jul 2012, Edward Ned Harvey wrote: > > > > CPU''s are not getting much faster. But IO is definitely getting faster.It''s> best to keep ahead of that curve. > > It seems that per-socket CPU performance is doubling every year. > That seems like faster to me.It''s a good point. It''s only *serial* computation that isn''t increasing much anymore. But computing hashes for a bunch of independent blocks is a parallel computation operation. For now, they''re able to keep parallelizing via more cores. Still, I wonder how many more years this can continue? The reason the serial computation isn''t increasing much anymore is simply because the transistors etc have gotten small enough that they can''t really increase clock speed much more. If you have a 3.0 Ghz clock, how far does light travel in one hertz? About 10cm. All the transistors, capacitors, wires etc need time to react and stabilize before the next clock cycle... For now, they''re still able to keep making larger dies, with more cores on them, and they''re certainly developing layering and 3-D technologies, which will certainly allow the number of cores to keep growing for some time to come. But there is a limit somewhere.
Gregg Wonderly
2012-Jul-11 14:27 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. People are hooking up 4 GPUs in their computers and getting huge performance gains. 5-6 char password space covered in a few days. 12 or so chars would take one machine a couple of years if I recall. So, if we had 20 people with that class of machine, we''d be down to a few months. I''m just suggesting that while the compute space is still huge, it''s not actually undoable, it just requires some thought into how to approach the problem, and then some time to do the computations. Huge space, but still finite? Gregg Wonderly On Jul 11, 2012, at 9:13 AM, Edward Ned Harvey wrote:>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- >> bounces at opensolaris.org] On Behalf Of Gregg Wonderly >> >> Since there is a finite number of bit patterns per block, have you tried > to just >> calculate the SHA-256 or SHA-512 for every possible bit pattern to see if > there >> is ever a collision? If you found an algorithm that produced no > collisions for >> any possible block bit pattern, wouldn''t that be the win? > > Maybe I misunderstand what you''re saying, but if I got it right, what you''re > saying is physically impossible to do in the time of the universe... And > guaranteed to fail even if you had all the computational power of God. > > I think you''re saying ... In a block of 128k, sequentially step through all > the possible values ... starting with 0, 1, 2, ... 2^128k ... and compute > the hashes of each value, and see if you ever find a hash collision. If > this is indeed what you''re saying, recall, the above operation will require > on order 2^128k operations to complete. But present national security > standards accept 2^256 operations as satisfactory to protect data from brute > force attacks over the next 30 years. Furthermore, in a 128k block, there > exist 2^128k possible values, while in a 512bit hash, there exist only 2^512 > possible values (which is still a really huge number.) This means there > will exist at least 2^127.5k collisions. However, these numbers are so > astronomically universally magnanimously huge, it will still take more than > a lifetime to find any one of those collisions. So it''s impossible to > perform such a computation, and if you could, you would be guaranteed to > find a LOT of collisions. > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On 07/11/2012 04:19 PM, Gregg Wonderly wrote:> But this is precisely the kind of "observation" that some people seem to miss out on the importance of. As Tomas suggested in his post, if this was true, then we could have a huge compression ratio as well. And even if there was 10% of the bit patterns that created non-unique hashes, you could use the fact that a block hashed to a known bit pattern that didn''t have collisions, to compress the other 90% of your data. > > I''m serious about this from a number of perspectives. We worry about the time it would take to reverse SHA or RSA hashes to passwords, not even thinking that what if someone has been quietly computing all possible hashes for the past 10-20 years into a database some where, with every 5-16 character password, and now has an instantly searchable hash-to-password database.This is something very well known in the security community as "rainbow tables" and a common method to protect against it is via salting. Never use a password hashing scheme which doesn''t use salts for exactly the reason you outlined above.> Sometimes we ignore the scale of time, thinking that only the immediately visible details are what we have to work with. > > If no one has computed the hashes for every single 4K and 8K block, then fine. But, if that was done, and we had that data, we''d know for sure which algorithm was going to work the best for the number of bits we are considering.Do you even realize how many 4K or 8K blocks there are?!?! Exactly 2^32768 or 2^65536 respectively. I wouldn''t worry about somebody having those pre-hashed ;-) Rainbow tables only work for a very limited subset of data.> Speculating based on the theory of the algorithms for "random" number of bits is just silly. Where''s the real data that tells us what we need to know?If you don''t trust math, then I there''s little I can do to convince you. But remember our conversation the next time you step into a car or get on an airplane. The odds that you''ll die on that ride are far higher than that you''ll find a random hash collision in a 256-bit hash algorithm... -- Saso
Gregg Wonderly
2012-Jul-11 14:30 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
This is exactly the issue for me. It''s vital to always have verify on. If you don''t have the data to prove that every possible block combination possible, hashes uniquely for the "small" bit space we are talking about, then how in the world can you say that "verify" is not necessary? That just seems ridiculous to propose. Gregg Wonderly On Jul 11, 2012, at 9:22 AM, Bob Friesenhahn wrote:> On Wed, 11 Jul 2012, Sa?o Kiselkov wrote: >> the hash isn''t used for security purposes. We only need something that''s >> fast and has a good pseudo-random output distribution. That''s why I >> looked toward Edon-R. Even though it might have security problems in >> itself, it''s by far the fastest algorithm in the entire competition. > > If an algorithm is not ''secure'' and zfs is not set to verify, doesn''t that mean that a knowledgeable user will be able to cause intentional data corruption if deduplication is enabled? A user with very little privilege might be able to cause intentional harm by writing the magic data block before some other known block (which produces the same hash) is written. This allows one block to substitute for another. > > It does seem that security is important because with a human element, data is not necessarily random. > > Bob > -- > Bob Friesenhahn > bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ > GraphicsMagick Maintainer, http://www.GraphicsMagick.org/_______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On 07/11/2012 04:22 PM, Bob Friesenhahn wrote:> On Wed, 11 Jul 2012, Sa?o Kiselkov wrote: >> the hash isn''t used for security purposes. We only need something that''s >> fast and has a good pseudo-random output distribution. That''s why I >> looked toward Edon-R. Even though it might have security problems in >> itself, it''s by far the fastest algorithm in the entire competition. > > If an algorithm is not ''secure'' and zfs is not set to verify, doesn''t > that mean that a knowledgeable user will be able to cause intentional > data corruption if deduplication is enabled? A user with very little > privilege might be able to cause intentional harm by writing the magic > data block before some other known block (which produces the same hash) > is written. This allows one block to substitute for another. > > It does seem that security is important because with a human element, > data is not necessarily random.Theoretically yes, it is possible, but the practicality of such an attack is very much in doubt. In case this is a concern, however, one can always switch to a more secure hash function (e.g. Skein-512). Cheers, -- Saso
Bob Friesenhahn
2012-Jul-11 14:32 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Wed, 11 Jul 2012, Joerg Schilling wrote:> Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote: > >> On Tue, 10 Jul 2012, Edward Ned Harvey wrote: >>> >>> CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahead of that curve. >> >> It seems that per-socket CPU performance is doubling every year. >> That seems like faster to me. > > This would only apply, if you implement a multi threaded hash.While it is true that the per-block hash latency does not improve much with new CPUs (and may even regress), given multiple I/Os at once, hashes may be be computed by different cores and so it seems that total system performance will scale with per-socket CPU performance. Even with a single stream of I/O, multiple zfs blocks will be read or written so mutiple block hashes may be computed at once on different cores. Server OSs like Solaris have been focusing on improving total system throughput rather than single-threaded bandwidth. I don''t mean to discount the importance of this effort though. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On 07/11/2012 04:23 PM, Casper.Dik at oracle.com wrote:> >> On Tue, 10 Jul 2012, Edward Ned Harvey wrote: >>> >>> CPU''s are not getting much faster. But IO is definitely getting faster. It''s best to keep ahea > d of that curve. >> >> It seems that per-socket CPU performance is doubling every year. >> That seems like faster to me. > > I think that I/O isn''t getting as fast as CPU is; memory capacity and > bandwith and CPUs are getting faster. I/O, not so much. > (Apart from the one single step from harddisk to SSD; but note that > I/O is limited to standard interfaces and as such it is likely be > helddown by requiring a new standard.Have you seen one of those SSDs made by FusionIO? Those things fit in a single PCI-e x8 slot and can easily push a sustained rate upward of several GB/s. Do not expect that drives are the be-all end-all to storage. Hybrid storage invalidated the traditional "CPU & memory fast, disks slow" wisdom years ago. Cheers, -- Saso
Justin Stringfellow
2012-Jul-11 14:36 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision?? If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn''t that be the win?? Perhaps I''ve missed something, but if there was *never* a collision, you''d have stumbled across a rather impressive lossless compression algorithm. I''m pretty sure there''s some Big Mathematical Rules (Shannon?) that mean this cannot be. ? cheers, --justin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/555e4f27/attachment.html>
Ferenc-Levente Juhos
2012-Jul-11 14:39 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
As I said several times before, to produce hash collisions. Or to calculate rainbow tables (as a previous user theorized it) you only need the following. You don''t need to reproduce all possible blocks. 1. SHA256 produces a 256 bit hash 2. That means it produces a value on 256 bits, in other words a value between 0..2^256 - 1 3. If you start counting from 0 to 2^256 and for each number calculate the SHA256 you will get at least one hash collision (if the hash algortihm is prefectly distributed) 4. Counting from 0 to 2^256, is nothing else but reproducing all possible bit pattern on 32 bytes It''s not about whether one computer is capable of producing the above hashes or not, or whether there are actually that many unique 32 byte bit patterns in the universe. A collision can happen. On Wed, Jul 11, 2012 at 4:34 PM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote:> On 07/11/2012 04:23 PM, Casper.Dik at oracle.com wrote: > > > >> On Tue, 10 Jul 2012, Edward Ned Harvey wrote: > >>> > >>> CPU''s are not getting much faster. But IO is definitely getting > faster. It''s best to keep ahea > > d of that curve. > >> > >> It seems that per-socket CPU performance is doubling every year. > >> That seems like faster to me. > > > > I think that I/O isn''t getting as fast as CPU is; memory capacity and > > bandwith and CPUs are getting faster. I/O, not so much. > > (Apart from the one single step from harddisk to SSD; but note that > > I/O is limited to standard interfaces and as such it is likely be > > helddown by requiring a new standard. > > Have you seen one of those SSDs made by FusionIO? Those things fit in a > single PCI-e x8 slot and can easily push a sustained rate upward of > several GB/s. Do not expect that drives are the be-all end-all to > storage. Hybrid storage invalidated the traditional "CPU & memory fast, > disks slow" wisdom years ago. > > Cheers, > -- > Saso > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/9f9ad21c/attachment.html>
On 07/11/2012 04:27 PM, Gregg Wonderly wrote:> Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. People are hooking up 4 GPUs in their computers and getting huge performance gains. 5-6 char password space covered in a few days. 12 or so chars would take one machine a couple of years if I recall. So, if we had 20 people with that class of machine, we''d be down to a few months. I''m just suggesting that while the compute space is still huge, it''s not actually undoable, it just requires some thought into how to approach the problem, and then some time to do the computations. > > Huge space, but still finite?There are certain physical limits which one cannot exceed. For instance, you cannot store 2^256 units of 32-byte quantities in Earth. Even if you used proton spin (or some other quantum property) to store a bit, there simply aren''t enough protons in the entire visible universe to do it. You will never ever be able to search a 256-bit memory space using a simple exhaustive search. The reason why our security hashes are so long (256-bits, 512-bits, more...) is because attackers *don''t* do an exhaustive search. -- Saso
On 07/11/2012 04:30 PM, Gregg Wonderly wrote:> This is exactly the issue for me. It''s vital to always have verify on. If you don''t have the data to prove that every possible block combination possible, hashes uniquely for the "small" bit space we are talking about, then how in the world can you say that "verify" is not necessary? That just seems ridiculous to propose.Do you need assurances that in the next 5 seconds a meteorite won''t fall to Earth and crush you? No. And yet, the Earth puts on thousands of tons of weight each year from meteoric bombardment and people have been hit and killed by them (not to speak of mass extinction events). Nobody has ever demonstrated of being able to produce a hash collision in any suitably long hash (128-bits plus) using a random search. All hash collisions have been found by attacking the weaknesses in the mathematical definition of these functions (i.e. some part of the input didn''t get obfuscated well in the hash function machinery and spilled over into the result, resulting in a slight, but usable non-randomness). Cheers, -- Saso
On 07/11/2012 04:36 PM, Justin Stringfellow wrote:> > >> Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn''t that be the win? > > Perhaps I''ve missed something, but if there was *never* a collision, you''d have stumbled across a rather impressive lossless compression algorithm. I''m pretty sure there''s some Big Mathematical Rules (Shannon?) that mean this cannot be.Do you realize how big your lookup dictionary would have to be? -- Saso
Edward Ned Harvey
2012-Jul-11 14:47 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Gregg Wonderly > > But this is precisely the kind of "observation" that some people seem tomiss> out on the importance of. As Tomas suggested in his post, if this wastrue,> then we could have a huge compression ratio as well. And even if therewas> 10% of the bit patterns that created non-unique hashes, you could use the > fact that a block hashed to a known bit pattern that didn''t havecollisions, to> compress the other 90% of your data.In fact, if you were to assume hash value X corresponded to data block Y, you could use the hash function as a form of compression, but for decompression you would need a lookup table. So if you were going to compress Y once, you would not gain anything. But if you had a data stream with a bunch of duplicates in it, you might hash Y many times, discovering X is already in your lookup table, you only need to store another copy of X. So in fact, dedup is a form of hash-based compression. But we know it''s not a lossless compression, so the decision to verify or not to verify is a matter of probability of collision. Many of us have done the math on this probability, and many of us are comfortable with neglecting the verify, because we know the probability of collision is so small. But the decision to verify or not verify is very much dependent on your intended purpose (the type of data you''re storing, and what its significance is). More importantly, the decision to verify or not verify depends on who you would need to justify your decision to. Nobody will ever get in trouble for enabling verify. But if you have to convince your CEO that you made the right choice by disabling verify ... You might just choose to enable verify for the sake of not explaining your decision.> I''m serious about this from a number of perspectives. We worry about the > time it would take to reverse SHA or RSA hashes to passwords, not even > thinking that what if someone has been quietly computing all possiblehashes> for the past 10-20 years into a database some where, with every 5-16 > character password, and now has an instantly searchable hash-to-password > database.That is called a rainbow table, and it''s commonly used for cracking poorly implemented password schemes. Even with massively powerful computers and a whole lot of storage, you only have enough time and space to compute a rainbow table for commonly used (easily guessable) passwords. To prevent attackers from successfully using rainbow tables, even when users might choose bad passwords, a security conscious developer will use salting and stretching. This increases the randomness and the time to compute the password hashes, thwarting rainbow tables.> If no one has computed the hashes for every single 4K and 8K block, then > fine. But, if that was done, and we had that data, we''d know for surewhich> algorithm was going to work the best for the number of bits we are > considering.Let''s imagine computing a 256-bit hash (32 bytes = 2^5 bytes) for every 1K (8Kbit) block. Storage for this table would require 2^5 * 2^8192 bytes 2^8197 bytes = 3.5 * 10^2467 bytes. Recall, a petabyte is 10^15 bytes. So ... Storing the rainbow table for merely 1K possibilities ... I don''t know if it would fit in all the storage on planet Earth. Maybe. But having the rainbow table for *precisely* 1K block size isn''t useful unless you happen to have precisely a 1K block you want to lookup... If you happen to be looking up a 1025 byte data hash, or a 4K data hash ... You''re out of luck. This table won''t help you. It''s physically impossible to do, even on the supremely simplified problem of 1K block size and 256-bit hash.
Casper.Dik at oracle.com
2012-Jul-11 14:48 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>Unfortunately, the government imagines that people are using their home com>puters to compute hashes and try and decrypt stuff. Look at what is happen>ing with GPUs these days. People are hooking up 4 GPUs in their computers >and getting huge performance gains. 5-6 char password space covered in a f>ew days. 12 or so chars would take one machine a couple of years if I reca>ll. So, if we had 20 people with that class of machine, we''d be down to a >few months. I''m just suggesting that while the compute space is still hug>e, it''s not actually undoable, it just requires some thought into how to ap>proach the problem, and then some time to do the computations. > >Huge space, but still finite=85Dan Brown seems to think so in "Digital Fortress" but it just means he has no grasp on "big numbers". 2^128 is a huge space, finite *but* beyond brute force *forever*. Cconsidering that we have nearly 10billion people and if you give them all of them 1 billion computers all being able to compute 1 billion checks per second, how many years does it take before we get the solution? Did you realize that that number is *twice* the number of the years needed for a *single* computer with the same specification to solve this problem for 64 bits? There are two reasons for finding a new hash alrgorithm: - a faster one on current hardware - a better one with a larger output But bruteforce is not what we are defending against: we''re trying to defend against bugs in the hash algorithm. In the case of md5 and the related hash algorithm, a new attack method was discovered and it made many hash algorithms obsolete/broken. When a algorithm is broken, the "work factor" needed for a successful attack depends in part of the hash, e.g., you may left with 64 bits of effective has and that would be brute forcible. Casper Casper
On 07/11/2012 04:39 PM, Ferenc-Levente Juhos wrote:> As I said several times before, to produce hash collisions. Or to calculate > rainbow tables (as a previous user theorized it) you only need the > following. > > You don''t need to reproduce all possible blocks. > 1. SHA256 produces a 256 bit hash > 2. That means it produces a value on 256 bits, in other words a value > between 0..2^256 - 1 > 3. If you start counting from 0 to 2^256 and for each number calculate the > SHA256 you will get at least one hash collision (if the hash algortihm is > prefectly distributed) > 4. Counting from 0 to 2^256, is nothing else but reproducing all possible > bit pattern on 32 bytes > > It''s not about whether one computer is capable of producing the above > hashes or not, or whether there are actually that many unique 32 byte bit > patterns in the universe. > A collision can happen.It''s actually not that simple, because in hash collision attacks you''re not always afforded the luxury of being able to define your input block. More often than not, you want to modify a previously hashed block in such a fashion that it carries your intended modifications while hashing to the same original value. Say for instance you want to modify a 512-byte message (e.g. an SSL certificate) to point to your own CN. Here your rainbow table, even if you could store it somewhere (you couldn''t, btw), would do you little good here. -- Saso
Gregg Wonderly
2012-Jul-11 14:49 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Yes, but from the other angle, the number of unique 128K blocks that you can store on your ZFS pool, is actually finitely small, compared to the total space. So the patterns you need to actually consider is not more than the physical limits of the universe. Gregg Wonderly On Jul 11, 2012, at 9:39 AM, Sa?o Kiselkov wrote:> On 07/11/2012 04:27 PM, Gregg Wonderly wrote: >> Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. People are hooking up 4 GPUs in their computers and getting huge performance gains. 5-6 char password space covered in a few days. 12 or so chars would take one machine a couple of years if I recall. So, if we had 20 people with that class of machine, we''d be down to a few months. I''m just suggesting that while the compute space is still huge, it''s not actually undoable, it just requires some thought into how to approach the problem, and then some time to do the computations. >> >> Huge space, but still finite? > > There are certain physical limits which one cannot exceed. For instance, > you cannot store 2^256 units of 32-byte quantities in Earth. Even if you > used proton spin (or some other quantum property) to store a bit, there > simply aren''t enough protons in the entire visible universe to do it. > You will never ever be able to search a 256-bit memory space using a > simple exhaustive search. The reason why our security hashes are so long > (256-bits, 512-bits, more...) is because attackers *don''t* do an > exhaustive search. > > -- > Saso
Edward Ned Harvey
2012-Jul-11 14:50 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: Gregg Wonderly [mailto:gregg at wonderly.org] > Sent: Wednesday, July 11, 2012 10:28 AM > > Unfortunately, the government imagines that people are using their home > computers to compute hashes and try and decrypt stuff. Look at what is > happening with GPUs these days.heheheh. I guess the NSA didn''t think of that. ;-) (That''s sarcasm, in case anyone didn''t get it.)
Ferenc-Levente Juhos
2012-Jul-11 14:54 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
You don''t have to store all hash values: a. Just memorize the first one SHA256(0) b. start cointing c. bang: by the time you get to 2^256 you get at least a collision. (do this using BOINC, you dont have to wait for the last hash to be calculated, I''m pretty sure a collision will occur sooner) 1. SHA256 produces a 256 bit hash 2. That means it produces a value on 256 bits, in other words a value between 0..2^256 - 1 3. If you start counting from 0 to 2^256 and for each number calculate the SHA256 you will get at least one hash collision (if the hash algortihm is prefectly distributed) 4. Counting from 0 to 2^256, is nothing else but reproducing all possible bit pattern on 32 bytes And nobody is trying to proove that SHA256 is bad, and I don''t want to do compression either. But it''s unrealistic to think that a SHA256 collision wont occur just because it would be impossible to find it via brute-force. On Wed, Jul 11, 2012 at 4:49 PM, Sa?o Kiselkov <skiselkov.ml at gmail.com>wrote:> On 07/11/2012 04:39 PM, Ferenc-Levente Juhos wrote: > > As I said several times before, to produce hash collisions. Or to > calculate > > rainbow tables (as a previous user theorized it) you only need the > > following. > > > > You don''t need to reproduce all possible blocks. > > 1. SHA256 produces a 256 bit hash > > 2. That means it produces a value on 256 bits, in other words a value > > between 0..2^256 - 1 > > 3. If you start counting from 0 to 2^256 and for each number calculate > the > > SHA256 you will get at least one hash collision (if the hash algortihm is > > prefectly distributed) > > 4. Counting from 0 to 2^256, is nothing else but reproducing all possible > > bit pattern on 32 bytes > > > > It''s not about whether one computer is capable of producing the above > > hashes or not, or whether there are actually that many unique 32 byte bit > > patterns in the universe. > > A collision can happen. > > It''s actually not that simple, because in hash collision attacks you''re > not always afforded the luxury of being able to define your input block. > More often than not, you want to modify a previously hashed block in > such a fashion that it carries your intended modifications while hashing > to the same original value. Say for instance you want to modify a > 512-byte message (e.g. an SSL certificate) to point to your own CN. Here > your rainbow table, even if you could store it somewhere (you couldn''t, > btw), would do you little good here. > > -- > Saso >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/af558bf4/attachment-0001.html>
Casper.Dik at oracle.com
2012-Jul-11 14:54 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>Do you need assurances that in the next 5 seconds a meteorite won''t fall >to Earth and crush you? No. And yet, the Earth puts on thousands of tons >of weight each year from meteoric bombardment and people have been hit >and killed by them (not to speak of mass extinction events). Nobody has >ever demonstrated of being able to produce a hash collision in any >suitably long hash (128-bits plus) using a random search. All hash >collisions have been found by attacking the weaknesses in the >mathematical definition of these functions (i.e. some part of the input >didn''t get obfuscated well in the hash function machinery and spilled >over into the result, resulting in a slight, but usable non-randomness).The reason why we don''t protect against such event because it would be extremely expensive with a very small chance of it being needed. "verify" doesn''t cost much so even if the risk as infinitesimal as a direct meteorite hit, it may still be cost effective. (Just like we''d better be off preparing for the climate changing (rather cheap) rather then trying to keep the climate from changing (impossible and still extremely expensive) Casper
Gregg Wonderly
2012-Jul-11 14:56 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
So, if I had a block collision on my ZFS pool that used dedup, and it had my bank balance of $3,212.20 on it, and you tried to write your bank balance of $3,292,218.84 and got the same hash, no verify, and thus you got my block/balance and now your bank balance was reduced by 3 orders of magnitude, would you be okay with that? What assurances would you be content with using my ZFS pool? Gregg Wonderly On Jul 11, 2012, at 9:43 AM, Sa?o Kiselkov wrote:> On 07/11/2012 04:30 PM, Gregg Wonderly wrote: >> This is exactly the issue for me. It''s vital to always have verify on. If you don''t have the data to prove that every possible block combination possible, hashes uniquely for the "small" bit space we are talking about, then how in the world can you say that "verify" is not necessary? That just seems ridiculous to propose. > > Do you need assurances that in the next 5 seconds a meteorite won''t fall > to Earth and crush you? No. And yet, the Earth puts on thousands of tons > of weight each year from meteoric bombardment and people have been hit > and killed by them (not to speak of mass extinction events). Nobody has > ever demonstrated of being able to produce a hash collision in any > suitably long hash (128-bits plus) using a random search. All hash > collisions have been found by attacking the weaknesses in the > mathematical definition of these functions (i.e. some part of the input > didn''t get obfuscated well in the hash function machinery and spilled > over into the result, resulting in a slight, but usable non-randomness). > > Cheers, > -- > Saso
On 07/11/2012 04:54 PM, Ferenc-Levente Juhos wrote:> You don''t have to store all hash values: > a. Just memorize the first one SHA256(0) > b. start cointing > c. bang: by the time you get to 2^256 you get at least a collision.Just one question: how long do you expect this going to take on average? Come on, do the math! -- Saso
On 07/11/2012 04:56 PM, Gregg Wonderly wrote:> So, if I had a block collision on my ZFS pool that used dedup, and it had my bank balance of $3,212.20 on it, and you tried to write your bank balance of $3,292,218.84 and got the same hash, no verify, and thus you got my block/balance and now your bank balance was reduced by 3 orders of magnitude, would you be okay with that? What assurances would you be content with using my ZFS pool?I''d feel entirely safe. There, I said it. -- Saso
Gregg Wonderly
2012-Jul-11 15:03 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
I''m just suggesting that the "time frame" of when 256-bits or 512-bits is less safe, is closing faster than one might actually think, because social elements of the internet allow a lot more effort to be focused on a single "problem" than one might consider. Gregg Wonderly On Jul 11, 2012, at 9:50 AM, Edward Ned Harvey wrote:>> From: Gregg Wonderly [mailto:gregg at wonderly.org] >> Sent: Wednesday, July 11, 2012 10:28 AM >> >> Unfortunately, the government imagines that people are using their home >> computers to compute hashes and try and decrypt stuff. Look at what is >> happening with GPUs these days. > > heheheh. I guess the NSA didn''t think of that. ;-) > (That''s sarcasm, in case anyone didn''t get it.) > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On Wed, July 11, 2012 09:45, Sa?o Kiselkov wrote:> I''m not convinced waiting makes much sense. The SHA-3 standardization > process'' goals are different from "ours". SHA-3 can choose to go with > something that''s slower, but has a higher security margin. I think that > absolute super-tight security isn''t all that necessary for ZFS, since > the hash isn''t used for security purposes. We only need something that''s > fast and has a good pseudo-random output distribution. That''s why I > looked toward Edon-R. Even though it might have security problems in > itself, it''s by far the fastest algorithm in the entire competition.Fair enough, though I think eventually the SHA-3 winner will be incorporated into hardware (or at least certain instructions used in the algorithm will). I think waiting a few more weeks/months shouldn''t be a big deal, as the winner should be announced Real Soon Now, and then a more informed decision can probably be made.
On 07/11/2012 05:10 PM, David Magda wrote:> On Wed, July 11, 2012 09:45, Sa?o Kiselkov wrote: > >> I''m not convinced waiting makes much sense. The SHA-3 standardization >> process'' goals are different from "ours". SHA-3 can choose to go with >> something that''s slower, but has a higher security margin. I think that >> absolute super-tight security isn''t all that necessary for ZFS, since >> the hash isn''t used for security purposes. We only need something that''s >> fast and has a good pseudo-random output distribution. That''s why I >> looked toward Edon-R. Even though it might have security problems in >> itself, it''s by far the fastest algorithm in the entire competition. > > Fair enough, though I think eventually the SHA-3 winner will be > incorporated into hardware (or at least certain instructions used in the > algorithm will). I think waiting a few more weeks/months shouldn''t be a > big deal, as the winner should be announced Real Soon Now, and then a more > informed decision can probably be made.The AES process winner had been announced in October 2000. Considering AES-NI was proposed in March 2008 and first silicon for it appeared around January 2010, I wouldn''t hold my breath hoping for hardware SHA-3-specific acceleration getting a widespread foothold for at least another 5-10 years (around 2-3 technology generations). That being said, a lot can be achieved using SIMD instructions, but that doesn''t depend on the SHA-3 process in any way. -- Saso
On Wed, July 11, 2012 10:23, Casper.Dik at oracle.com wrote:> I think that I/O isn''t getting as fast as CPU is; memory capacity and > bandwith and CPUs are getting faster. I/O, not so much. > (Apart from the one single step from harddisk to SSD; but note that > I/O is limited to standard interfaces and as such it is likely be > helddown by requiring a new standard.The only other thing on the (theoretical) horizon would be things based on the memristor.
Bob Friesenhahn
2012-Jul-11 15:33 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Wed, 11 Jul 2012, Sa?o Kiselkov wrote:> > The reason why I don''t think this can be used to implement a practical > attack is that in order to generate a collision, you first have to know > the disk block that you want to create a collision on (or at least the > checksum), i.e. the original block is already in the pool. At that > point, you could write a colliding block which would get de-dup''d, but > that doesn''t mean you''ve corrupted the original data, only that you > referenced it. So, in a sense, you haven''t corrupted the original block, > only your own collision block (since that''s the copy doesn''t get written).This is not correct. If you know the well-known block to be written, then you can arrange to write your collision block prior to when the well-known block is written. Therefore, it is imperative that the hash algorithm make it clearly impractical to take a well-known block and compute a collision block. For example, the well-known block might be part of a Windows anti-virus package, or a Windows firewall configuration, and corrupting it might leave a Windows VM open to malware attack. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Wed, Jul 11, 2012 at 9:48 AM, <Casper.Dik at oracle.com> wrote:>>Huge space, but still finite=85 > > Dan Brown seems to think so in "Digital Fortress" but it just means he > has no grasp on "big numbers".I couldn''t get past that. I had to put the book down. I''m guessing it was as awful as it threatened to be. IMO, FWIW, yes, do add SHA-512 (truncated to 256 bits, of course). Nico --
Casper.Dik at oracle.com
2012-Jul-11 15:40 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
>On Wed, Jul 11, 2012 at 9:48 AM, <Casper.Dik at oracle.com> wrote: >>>Huge space, but still finite=85 >> >> Dan Brown seems to think so in "Digital Fortress" but it just means he >> has no grasp on "big numbers". > >I couldn''t get past that. I had to put the book down. I''m guessing >it was as awful as it threatened to be.It is *fiction*. So just read it as if it is "magical", like Harry Potter. It''s just like "well researched" in fiction means exactly the same as "well researched" in journalism: the facts aren''t actually facts but could pass as facts to those who hasn''t had a proper science education (which unfortunately includes a large part of the population and 99.5% of all politicians) Casper
On 07/11/2012 05:33 PM, Bob Friesenhahn wrote:> On Wed, 11 Jul 2012, Sa?o Kiselkov wrote: >> >> The reason why I don''t think this can be used to implement a practical >> attack is that in order to generate a collision, you first have to know >> the disk block that you want to create a collision on (or at least the >> checksum), i.e. the original block is already in the pool. At that >> point, you could write a colliding block which would get de-dup''d, but >> that doesn''t mean you''ve corrupted the original data, only that you >> referenced it. So, in a sense, you haven''t corrupted the original block, >> only your own collision block (since that''s the copy doesn''t get >> written). > > This is not correct. If you know the well-known block to be written, > then you can arrange to write your collision block prior to when the > well-known block is written. Therefore, it is imperative that the hash > algorithm make it clearly impractical to take a well-known block and > compute a collision block. > > For example, the well-known block might be part of a Windows anti-virus > package, or a Windows firewall configuration, and corrupting it might > leave a Windows VM open to malware attack.True, but that may not be enough to produce a practical collision for the reason that while you know which bytes you want to attack, these might not line up with ZFS disk blocks (especially the case with Windows VMs which are store in large opaque zvols) - such an attack would require physical access to the machine (at which point you can simply manipulate the blocks directly). -- Saso
Gregg Wonderly
2012-Jul-11 15:58 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
You''re entirely sure that there could never be two different blocks that can hash to the same value and have different content? Wow, can you just send me the cash now and we''ll call it even? Gregg On Jul 11, 2012, at 9:59 AM, Sa?o Kiselkov wrote:> On 07/11/2012 04:56 PM, Gregg Wonderly wrote: >> So, if I had a block collision on my ZFS pool that used dedup, and it had my bank balance of $3,212.20 on it, and you tried to write your bank balance of $3,292,218.84 and got the same hash, no verify, and thus you got my block/balance and now your bank balance was reduced by 3 orders of magnitude, would you be okay with that? What assurances would you be content with using my ZFS pool? > > I''d feel entirely safe. There, I said it. > > -- > Saso
On 07/11/2012 05:58 PM, Gregg Wonderly wrote:> You''re entirely sure that there could never be two different blocks that can hash to the same value and have different content? > > Wow, can you just send me the cash now and we''ll call it even?You''re the one making the positive claim and I''m calling bullshit. So the onus is on you to demonstrate the collision (and that you arrived at it via your brute force method as described). Until then, my money stays safely on my bank account. Put up or shut up, as the old saying goes. -- Saso
Gregg Wonderly
2012-Jul-11 16:23 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
What I''m saying is that I am getting conflicting information from your rebuttals here. I (and others) say there will be collisions that will cause data loss if verify is off. You say it would be so rare as to be impossible from your perspective. Tomas says, well then lets just use the hash value for a 4096X compression. You fluff around his argument calling him names. I say, well then compute all the possible hashes for all possible bit patterns and demonstrate no dupes. You say it''s not possible to do that. I illustrate a way that loss of data could cost you money. You say it''s impossible for there to be a chance of me constructing a block that has the same hash but different content. Several people have illustrated that 128K to 32bits is a huge and lossy ratio of compression, yet you still say it''s viable to leave verify off. I say, in fact that the total number of unique patterns that can exist on any pool is small, compared to the total, illustrating that I understand how the key space for the algorithm is small when looking at a ZFS pool, and thus could have a non-collision opportunity. So I can see what perspective you are drawing your confidence from, but I, and others, are not confident that the risk has zero probability. I''m pushing you to find a way to demonstrate that there is zero risk because if you do that, then you''ve, in fact created the ultimate compression factor (but enlarged the keys that could collide because the pool is now virtually larger), to date for random bit patterns, and you''ve also demonstrated that the particular algorithm is very good for dedup. That would indicate to me, that you can then take that algorithm, and run it inside of ZFS dedup to automatically manage when verify is necessary by detecting when a collision occurs. I appreciate the push back. I''m trying to drive thinking about this into the direction of what is known and finite, away from what is infinitely complex and thus impossible to explore. Maybe all the work has already been done? Gregg On Jul 11, 2012, at 11:02 AM, Sa?o Kiselkov wrote:> On 07/11/2012 05:58 PM, Gregg Wonderly wrote: >> You''re entirely sure that there could never be two different blocks that can hash to the same value and have different content? >> >> Wow, can you just send me the cash now and we''ll call it even? > > You''re the one making the positive claim and I''m calling bullshit. So > the onus is on you to demonstrate the collision (and that you arrived at > it via your brute force method as described). Until then, my money stays > safely on my bank account. Put up or shut up, as the old saying goes. > > -- > Saso
On Wed, July 11, 2012 11:58, Gregg Wonderly wrote:> You''re entirely sure that there could never be two different blocks that > can hash to the same value and have different content?[...] The odds of being hit by lighting (at least in the US) are about 1 in 700,000. I don''t worry about that happening, so personally I don''t see the point in worry something a few more orders of magnitude less likely to happen. When I get hit by lightning (or win the lottery), I''ll start worrying about hash collisions. Until then: meh. :)
Bob Friesenhahn
2012-Jul-11 16:37 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Wed, 11 Jul 2012, Sa?o Kiselkov wrote:>> >> For example, the well-known block might be part of a Windows anti-virus >> package, or a Windows firewall configuration, and corrupting it might >> leave a Windows VM open to malware attack. > > True, but that may not be enough to produce a practical collision for > the reason that while you know which bytes you want to attack, these > might not line up with ZFS disk blocks (especially the case with Windows > VMs which are store in large opaque zvols) - such an attack would > require physical access to the machine (at which point you can simply > manipulate the blocks directly).I think that well-known blocks are much easier to predict than you say because operating systems, VMs, and application software behave in predictable patterns. However, deriving another useful block which hashes the same should be extremely difficult and any block hashing algorithm needs to assure that. Having an excellent random distribution property is not sufficient if it is relatively easy to compute some other block producing the same hash. It may be useful to compromise a known block even if the compromized result is complete garbage. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Richard Elling
2012-Jul-11 16:58 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Thanks Sa?o! Comments below... On Jul 10, 2012, at 4:56 PM, Sa?o Kiselkov wrote:> Hi guys, > > I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS > implementation to supplant the currently utilized sha256.No need to supplant, there are 8 bits for enumerating hash algorithms, so adding another is simply a matter of coding. With the new feature flags, it is almost trivial to add new algorithms without causing major compatibility headaches. Darren points out that Oracle is considering doing the same, though I do not expect Oracle to pick up the feature flags.> On modern > 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much > slower than many of the SHA-3 candidates, so I went out and did some > testing (details attached) on a possible new hash algorithm that might > improve on this situation. > > However, before I start out on a pointless endeavor, I wanted to probe > the field of ZFS users, especially those using dedup, on whether their > workloads would benefit from a faster hash algorithm (and hence, lower > CPU utilization). Developments of late have suggested to me three > possible candidates: > > * SHA-512: simplest to implement (since the code is already in the > kernel) and provides a modest performance boost of around 60%. > > * Skein-512: overall fastest of the SHA-3 finalists and much faster > than SHA-512 (around 120-150% faster than the current sha256). > > * Edon-R-512: probably the fastest general purpose hash algorithm I''ve > ever seen (upward of 300% speedup over sha256) , but might have > potential security problems (though I don''t think this is of any > relevance to ZFS, as it doesn''t use the hash for any kind of security > purposes, but only for data integrity & dedup). > > My testing procedure: nothing sophisticated, I took the implementation > of sha256 from the Illumos kernel and simply ran it on a dedicated > psrset (where possible with a whole CPU dedicated, even if only to a > single thread) - I tested both the generic C implementation and the > Intel assembly implementation. The Skein and Edon-R implementations are > in C optimized for 64-bit architectures from the respective authors (the > most up to date versions I could find). All code has been compiled using > GCC 3.4.3 from the repos (the same that can be used for building > Illumos). Sadly, I don''t have access to Sun Studio.The last studio release suitable for building OpenSolaris is available in the repo. See the instructions at http://wiki.illumos.org/display/illumos/How+To+Build+illumos I''d be curious about whether you see much difference based on studio 12.1, gcc 3.4.3 and gcc 4.4 (or even 4.7) -- richard -- ZFS Performance and Training Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/aacae22a/attachment.html>
On 07/11/2012 06:23 PM, Gregg Wonderly wrote:> What I''m saying is that I am getting conflicting information from your rebuttals here.Well, let''s address that then:> I (and others) say there will be collisions that will cause data loss if verify is off.Saying that "there will be" without any supporting evidence to back it up amounts to a prophecy.> You say it would be so rare as to be impossible from your perspective.Correct.> Tomas says, well then lets just use the hash value for a 4096X compression. > You fluff around his argument calling him names.Tomas'' argument was, as I understood later, an attempt at sarcasm. Nevertheless, I later explained exactly why I consider the hash-compression claim total and utter bunk: "So for a full explanation of why hashes aren''t usable for compression: 1) they are one-way (kind of bummer for decompression) 2) they operate far below the Shannon limit (i.e. unusable for lossless compression) 3) their output is pseudo-random, so even if we find collisions, we have no way to distinguish which input was the most likely one meant for a given hash value (all are equally probable)"> I say, well then compute all the possible hashes for all possible bit patterns and demonstrate no dupes.This assumes it''s possible to do so. Frenc made a similar claim and I responded with this question: "how long do you expect this going to take on average? Come on, do the math!". I pose the same to you. Find the answer and you''ll understand exactly why what you''re proposing is impossible.> You say it''s not possible to do that.Please go on and compute a reduced size of the problem for, say, 2^64 32-byte values (still a laughably small space for the problem, but I''m feeling generous). Here''s the amount of storage you''ll need: 2^64 * 32 = 524288 Exabytes And that''s for a problem that I''ve reduced for you by 192 orders of magnitude. You see, only when you do the math you realize how off base you are in claiming that pre-computation of hash rainbow tables for generic bit patterns is doable.> I illustrate a way that loss of data could cost you money.That''s merely an emotional argument where you are trying to attack me by trying to invoke an emotional response from when "my ass" is on the line. Sorry, that doesn''t invalidate the original argument that you can''t do rainbow table pre-computation for long bit patterns.> You say it''s impossible for there to be a chance of me constructing a block that has the same hash but different content.To make sure we''re not using ambiguous rhetoric here, allow me to summarize my position: you cannot produce, in practical terms, a hash collision on a 256-bit secure hash algorithm using a brute-force algorithm.> Several people have illustrated that 128K to 32bits is a huge and lossy ratio of compression, yet you still say it''s viable to leave verify off.Except that we''re not talking 128K to 32b, but 128K to 256b. Also, only once you appreciate the mathematics behind the size of the 256-bit pattern space can you understand why leaving verify off is okay.> I say, in fact that the total number of unique patterns that can exist on any pool is small, compared to the total, illustrating that I understand how the key space for the algorithm is small when looking at a ZFS pool, and thus could have a non-collision opportunity.This is so profoundly wrong that it leads me to suspect you never took courses on cryptography and/or information theory. The size of your storage pool DOESN''T MATTER ONE BIT to the size of the key space. Even if your pool were the size of a single block, we''re talking here about the *mathematical* possibility of hitting on a random block that hashes to the same value. Given a stream of random data blocks (thus simulating an exhaustive brute-force search) and a secure pseudo-random hash function (which has a roughly equal chance of producing any output value for a given input block), you''ve got only a 10^-77 chance of getting a hash collision. If you don''t understand how this works, read a book on digital coding theory.> So I can see what perspective you are drawing your confidence from, but I, and others, are not confident that the risk has zero probability.I never said the risk is zero. The risk non-zero, but is so close to zero, that you may safely ignore it (since we take much greater risks on a daily basis without so much as a blink of an eye).> I''m pushing you to find a way to demonstrate that there is zero risk because if you do that, then you''ve, in fact created the ultimate compression factor (but enlarged the keys that could collide because the pool is now virtually larger), to date for random bit patterns, and you''ve also demonstrated that the particular algorithm is very good for dedup. > That would indicate to me, that you can then take that algorithm, and run it inside of ZFS dedup to automatically manage when verify is necessary by detecting when a collision occurs.Do you know what a dictionary is in compression algorithms? Do you even know how things like Huffman coding or LZW work, at least in principle? If not, then I can see why you didn''t understand my earlier explanations of why hashes aren''t usable for compression.> I appreciate the push back. I''m trying to drive thinking about this into the direction of what is known and finite, away from what is infinitely complex and thus impossible to explore.If you don''t understand the mathematics behind my arguments, just say so. -- Saso
Bob Friesenhahn
2012-Jul-11 17:11 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Wed, 11 Jul 2012, Richard Elling wrote:> > The last studio release suitable for building OpenSolaris is available in the repo. > See the instructions at?http://wiki.illumos.org/display/illumos/How+To+Build+illumosNot correct as far as I can tell. You should re-read the page you referenced. Oracle recinded (or lost) the special Studio releases needed to build the OpenSolaris kernel. The only way I can see to obtain these releases is illegally. However, Studio 12.3 (free download) produces user-space executables which run fine under Illumos. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Hi Richard, On 07/11/2012 06:58 PM, Richard Elling wrote:> Thanks Sa?o! > Comments below... > > On Jul 10, 2012, at 4:56 PM, Sa?o Kiselkov wrote: > >> Hi guys, >> >> I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS >> implementation to supplant the currently utilized sha256. > > No need to supplant, there are 8 bits for enumerating hash algorithms, so > adding another is simply a matter of coding. With the new feature flags, it is > almost trivial to add new algorithms without causing major compatibility > headaches. Darren points out that Oracle is considering doing the same, > though I do not expect Oracle to pick up the feature flags.I meant in the functional sense, not in the technical - of course, my changes would be implemented as a feature flags add-on.>> On modern >> 64-bit CPUs SHA-256 is actually much slower than SHA-512 and indeed much >> slower than many of the SHA-3 candidates, so I went out and did some >> testing (details attached) on a possible new hash algorithm that might >> improve on this situation. >> >> However, before I start out on a pointless endeavor, I wanted to probe >> the field of ZFS users, especially those using dedup, on whether their >> workloads would benefit from a faster hash algorithm (and hence, lower >> CPU utilization). Developments of late have suggested to me three >> possible candidates: >> >> * SHA-512: simplest to implement (since the code is already in the >> kernel) and provides a modest performance boost of around 60%. >> >> * Skein-512: overall fastest of the SHA-3 finalists and much faster >> than SHA-512 (around 120-150% faster than the current sha256). >> >> * Edon-R-512: probably the fastest general purpose hash algorithm I''ve >> ever seen (upward of 300% speedup over sha256) , but might have >> potential security problems (though I don''t think this is of any >> relevance to ZFS, as it doesn''t use the hash for any kind of security >> purposes, but only for data integrity & dedup). >> >> My testing procedure: nothing sophisticated, I took the implementation >> of sha256 from the Illumos kernel and simply ran it on a dedicated >> psrset (where possible with a whole CPU dedicated, even if only to a >> single thread) - I tested both the generic C implementation and the >> Intel assembly implementation. The Skein and Edon-R implementations are >> in C optimized for 64-bit architectures from the respective authors (the >> most up to date versions I could find). All code has been compiled using >> GCC 3.4.3 from the repos (the same that can be used for building >> Illumos). Sadly, I don''t have access to Sun Studio. > > The last studio release suitable for building OpenSolaris is available in the repo. > See the instructions at http://wiki.illumos.org/display/illumos/How+To+Build+illumos > > I''d be curious about whether you see much difference based on studio 12.1, > gcc 3.4.3 and gcc 4.4 (or even 4.7) > -- richardI''ll try Sun Studio, but the difference between GCC 3.4.3 and GCC 4.6.2 is quite profound. GCC 4.6.2 is consistently producing hashing code that runs faster (though by much depends on algo). Here''s a sample for on an Opteron 4234 machine (each run hashing 10G of data): # psrset -e 1 time ./edon-r-gcc3.4 real 17.1 user 17.1 sys 0.0 # psrset -e 1 time ./edon-r-gcc4.6 real 12.3 user 12.2 sys 0.0 # psrset -e 1 time ./skein-gcc3.4 real 30.8 user 30.8 sys 0.0 # psrset -e 1 time ./skein-gcc4.6 real 29.2 user 29.2 sys 0.0 Attached are my test sources for Skein and Edon-R. Cheers -- Saso -------------- next part -------------- A non-text attachment was scrubbed... Name: edon-r.zip Type: application/zip Size: 6281 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/825f76c9/attachment-0002.zip> -------------- next part -------------- A non-text attachment was scrubbed... Name: skein.zip Type: application/zip Size: 20737 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/825f76c9/attachment-0003.zip>
Richard Elling
2012-Jul-11 17:47 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Jul 11, 2012, at 10:11 AM, Bob Friesenhahn wrote:> On Wed, 11 Jul 2012, Richard Elling wrote: >> The last studio release suitable for building OpenSolaris is available in the repo. >> See the instructions at http://wiki.illumos.org/display/illumos/How+To+Build+illumos > > Not correct as far as I can tell. You should re-read the page you referenced. Oracle recinded (or lost) the special Studio releases needed to build the OpenSolaris kernel. The only way I can see to obtain these releases is illegally.In the US, the term "illegal" is most often used for criminal law. Contracts between parties are covered under civil law. It is the responsibility of the parties to agree to and enforce civil contracts. This includes you, dear reader. -- richard -- ZFS Performance and Training Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/ecc04813/attachment.html>
Richard Elling
2012-Jul-11 17:49 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Jul 11, 2012, at 10:23 AM, Sa?o Kiselkov wrote:> Hi Richard, > > On 07/11/2012 06:58 PM, Richard Elling wrote: >> Thanks Sa?o! >> Comments below... >> >> On Jul 10, 2012, at 4:56 PM, Sa?o Kiselkov wrote: >> >>> Hi guys, >>> >>> I''m contemplating implementing a new fast hash algorithm in Illumos'' ZFS >>> implementation to supplant the currently utilized sha256. >> >> No need to supplant, there are 8 bits for enumerating hash algorithms, so >> adding another is simply a matter of coding. With the new feature flags, it is >> almost trivial to add new algorithms without causing major compatibility >> headaches. Darren points out that Oracle is considering doing the same, >> though I do not expect Oracle to pick up the feature flags. > > I meant in the functional sense, not in the technical - of course, my > changes would be implemented as a feature flags add-on.Great! Let''s do it! -- richard -- ZFS Performance and Training Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/f244c504/attachment.html>
Hung-Sheng Tsao (LaoTsao) Ph.D
2012-Jul-11 19:01 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
Sent from my iPad On Jul 11, 2012, at 13:11, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:> On Wed, 11 Jul 2012, Richard Elling wrote: >> The last studio release suitable for building OpenSolaris is available in the repo. >> See the instructions at http://wiki.illumos.org/display/illumos/How+To+Build+illumos > > Not correct as far as I can tell. You should re-read the page you referenced. Oracle recinded (or lost) the special Studio releases needed to build the OpenSolaris kernel.hi you can still download 12 12.1 12.2, AFAIK through OTN> The only way I can see to obtain these releases is illegally. > > However, Studio 12.3 (free download) produces user-space executables which run fine under Illumos. > > Bob > -- > Bob Friesenhahn > bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ > GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On Wed, Jul 11, 2012 at 3:45 AM, Sa?o Kiselkov <skiselkov.ml at gmail.com> wrote:> It''s also possible to set "dedup=verify" with "checksum=sha256", > however, that makes little sense (as the chances of getting a random > hash collision are essentially nil).IMO dedup should always verify. Nico --
Bob Friesenhahn
2012-Jul-11 19:16 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Wed, 11 Jul 2012, Hung-Sheng Tsao (LaoTsao) Ph.D wrote:>> >> Not correct as far as I can tell. You should re-read the page you referenced. Oracle recinded (or lost) the special Studio releases needed to build the OpenSolaris kernel. > > you can still download 12 12.1 12.2, AFAIK through OTNThat is true (and I have done so). Unfortunately the versions offered are not the correct ones to build the OpenSolaris kernel. Special patched versions with particular date stamps are required. The only way that I see to obtain these files any more is via distribution channels primarily designed to perform copyright violations. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Hung-Sheng Tsao Ph.D.
2012-Jul-11 19:54 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On 7/11/2012 3:16 PM, Bob Friesenhahn wrote:> On Wed, 11 Jul 2012, Hung-Sheng Tsao (LaoTsao) Ph.D wrote: >>> >>> Not correct as far as I can tell. You should re-read the page you >>> referenced. Oracle recinded (or lost) the special Studio releases >>> needed to build the OpenSolaris kernel. >> >> you can still download 12 12.1 12.2, AFAIK through OTN > > That is true (and I have done so). Unfortunately the versions offered > are not the correct ones to build the OpenSolaris kernel. Special > patched versions with particular date stamps are required. The only > way that I see to obtain these files any more is via distribution > channels primarily designed to perform copyright violations.one can still download the patches through MOS, but one need to pay the support for development tolls:-(> > Bob-- -------------- next part -------------- A non-text attachment was scrubbed... Name: laotsao.vcf Type: text/x-vcard Size: 600 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/3d7b4c60/attachment.vcf>
Bill Sommerfeld
2012-Jul-11 20:06 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On 07/11/12 02:10, Sa?o Kiselkov wrote:> Oh jeez, I can''t remember how many times this flame war has been going > on on this list. Here''s the gist: SHA-256 (or any good hash) produces a > near uniform random distribution of output. Thus, the chances of getting > a random hash collision are around 2^-256 or around 10^-77.I think you''re correct that most users don''t need to worry about this -- sha-256 dedup without verification is not going to cause trouble for them. But your analysis is off. You''re citing the chance that two blocks picked at random will have the same hash. But that''s not what dedup does; it compares the hash of a new block to a possibly-large population of other hashes, and that gets you into the realm of "birthday problem" or "birthday paradox". See http://en.wikipedia.org/wiki/Birthday_problem for formulas. So, maybe somewhere between 10^-50 and 10^-55 for there being at least one collision in really large collections of data - still not likely enough to worry about. Of course, that assumption goes out the window if you''re concerned that an adversary may develop practical ways to find collisions in sha-256 within the deployment lifetime of a system. sha-256 is, more or less, a scaled-up sha-1, and sha-1 is known to be weaker than the ideal 2^80 strength you''d expect from 2^160 bits of hash; the best credible attack is somewhere around 2^57.5 (see http://en.wikipedia.org/wiki/SHA-1#SHA-1). on a somewhat less serious note, perhaps zfs dedup should contain "chinese lottery" code (see http://tools.ietf.org/html/rfc3607 for one explanation) which asks the sysadmin to report a detected sha-256 collision to eprint.iacr.org or the like...
On 07/11/2012 10:06 PM, Bill Sommerfeld wrote:> On 07/11/12 02:10, Sa?o Kiselkov wrote: >> Oh jeez, I can''t remember how many times this flame war has been going >> on on this list. Here''s the gist: SHA-256 (or any good hash) produces a >> near uniform random distribution of output. Thus, the chances of getting >> a random hash collision are around 2^-256 or around 10^-77. > > I think you''re correct that most users don''t need to worry about this -- > sha-256 dedup without verification is not going to cause trouble for them. > > But your analysis is off. You''re citing the chance that two blocks picked at > random will have the same hash. But that''s not what dedup does; it compares > the hash of a new block to a possibly-large population of other hashes, and > that gets you into the realm of "birthday problem" or "birthday paradox". > > See http://en.wikipedia.org/wiki/Birthday_problem for formulas. > > So, maybe somewhere between 10^-50 and 10^-55 for there being at least one > collision in really large collections of data - still not likely enough to > worry about.Yeah, I know, I did this as a quick first-degree approximation. However, the provided range is still very far above the chances of getting a random bit-rot error that even Fletcher won''t catch.> Of course, that assumption goes out the window if you''re concerned that an > adversary may develop practical ways to find collisions in sha-256 within the > deployment lifetime of a system. sha-256 is, more or less, a scaled-up sha-1, > and sha-1 is known to be weaker than the ideal 2^80 strength you''d expect from > 2^160 bits of hash; the best credible attack is somewhere around 2^57.5 (see > http://en.wikipedia.org/wiki/SHA-1#SHA-1).Of course, this is theoretically possible, however, I do not expect such an attack to be practical within any reasonable time frame of the deployment. In any case, should a realistic need to solve this arise, we can always simply switch hashes (I''m also planning to implement Skein-512/256) and do a recv/send to rewrite everything on disk. PITA? Yes. Serious problem? Don''t think so.> on a somewhat less serious note, perhaps zfs dedup should contain "chinese > lottery" code (see http://tools.ietf.org/html/rfc3607 for one explanation) > which asks the sysadmin to report a detected sha-256 collision to > eprint.iacr.org or the like...How about we ask them to report to me instead, like so: 1) Detect collision 2) Report to me 3) ??? 4) Profit! Cheers, -- Saso
Richard Elling
2012-Jul-11 21:39 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Jul 11, 2012, at 1:06 PM, Bill Sommerfeld wrote:> on a somewhat less serious note, perhaps zfs dedup should contain "chinese > lottery" code (see http://tools.ietf.org/html/rfc3607 for one explanation) > which asks the sysadmin to report a detected sha-256 collision to > eprint.iacr.org or the like...Agree. George was in that section of the code a few months ago (zio.c) and I asked him to add a kstat, at least. I''ll follow up with him next week, or get it done some other way. -- richard -- ZFS Performance and Training Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/4f3a0ec3/attachment-0001.html>
On Wed, Jul 11, 2012 at 7:48 AM, Casper Dik wrote:> Dan Brown seems to think so in "Digital Fortress" but it just means he > has no grasp on "big numbers". >Or little else, for that matter. I seem to recall one character in the book that would routinely slide under a "mainframe" on his back as if on a mechanics dolly, solder CPUs to the motherboard above his face, and perform all manner of bullshit on the fly repairs that never even existed back in the earliest days of mid-20th century computing. I don''t recall anything else of a technical nature that made a lick of sense and the story was only further insulting by the mass of alleged super geniuses that could barely tie their own shoelaces, etc. etc. Reading the one star reviews of this book on Amazon is far more enlightening & entertaining than reading the actual book. I found it so insulting that I couldn''t finish the last 70 pages of the paperback. -Gary -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120711/f099e33e/attachment.html>
Gregg Wonderly
2012-Jul-12 03:56 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
On Jul 11, 2012, at 12:06 PM, Sa?o Kiselkov wrote:>> I say, in fact that the total number of unique patterns that can exist on any pool is small, compared to the total, illustrating that I understand how the key space for the algorithm is small when looking at a ZFS pool, and thus could have a non-collision opportunity. > > This is so profoundly wrong that it leads me to suspect you never took > courses on cryptography and/or information theory. The size of your > storage pool DOESN''T MATTER ONE BIT to the size of the key space. Even > if your pool were the size of a single block, we''re talking here about > the *mathematical* possibility of hitting on a random block that hashes > to the same value. Given a stream of random data blocks (thus simulating > an exhaustive brute-force search) and a secure pseudo-random hash > function (which has a roughly equal chance of producing any output value > for a given input block), you''ve got only a 10^-77 chance of getting a > hash collision. If you don''t understand how this works, read a book on > digital coding theory.The size of the pool does absolutely matter, because it represents the total number of possible bit patterns you can involve in the mapping (through the math). If the size of the ZFS pool is limited, the total number of "unique" blocks is in fact limited by the size of the pool. This affects how many collisions are possible, and thus how effective dedup can be. Overtime, if the bit patterns can change on each block, at some point, you can arrive at one of the collisions. Yes, it''s rare, I''m not disputing that, I am disputing that the risk is discardable in computer applications where data integrity matters. For example, losing money as the example I used.>> I''m pushing you to find a way to demonstrate that there is zero risk because if you do that, then you''ve, in fact created the ultimate compression factor (but enlarged the keys that could collide because the pool is now virtually larger), to date for random bit patterns, and you''ve also demonstrated that the particular algorithm is very good for dedup. >> That would indicate to me, that you can then take that algorithm, and run it inside of ZFS dedup to automatically manage when verify is necessary by detecting when a collision occurs. > > Do you know what a dictionary is in compression algorithms?Yes I am familiar with this kind of compression> Do you even > know how things like Huffman coding or LZW work, at least in principle?Yes> If not, then I can see why you didn''t understand my earlier explanations > of why hashes aren''t usable for compression.With zero collisions in a well defined key space, they work perfectly for compression. To whit, you are saying that you are comfortable enough using them for dedup, which is exactly a form of compression. I''m agreeing that the keyspace is huge, but the collision possibilities mean I''m not comfortable with verify=no If there wasn''t a sufficiently "small" keyspace in a ZFS pool, then dedup would never succeed. There are some block contents that are recurring. Usually blocks filled with 00, FF, or some pattern from a power up memory pattern etc. So those few common patterns are easily dedup''d out.> >> I appreciate the push back. I''m trying to drive thinking about this into the direction of what is known and finite, away from what is infinitely complex and thus impossible to explore. > > If you don''t understand the mathematics behind my arguments, just say so.I understand the math. I''m not convinced it''s nothing to worry about, because my data is valuable enough to me that I am using ZFS. If I was using dedup, I''d for sure turn on verify? Gregg
You can treat whatever hash function as an idealized one, but actual hash functions aren''t. There may well be as-yet-undiscovered input bit pattern ranges where there''s a large density of collisions in some hash function, and indeed, since our hash functions aren''t ideal, there must be. We just don''t know where these potential collisions are -- for cryptographically secure hash functions that''s enough (plus 2nd pre-image and 1st pre-image resistance, but allow me to handwave), but for dedup? *shudder*. Now, for some content types collisions may not be a problem at all. Think of security camera recordings: collisions will show up as bad frames in a video stream that no one is ever going to look at, and if they should need it, well, too bad. And for other content types collisions can be horrible. Us ZFS lovers love to talk about how silent bit rot means you may never know about serious corruption in other filesystems until it''s too late. Now, if you disable verification in dedup, what do you get? The same situation as other filesystems are in relative to bit rot, only with different likelihoods. Disabling verification is something to do after careful deliberation, not something to do by default. Nico --
Edward Ned Harvey
2012-Jul-12 13:37 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > > On 07/11/2012 05:58 PM, Gregg Wonderly wrote: > > You''re entirely sure that there could never be two different blocks thatcan> hash to the same value and have different content? > > > > Wow, can you just send me the cash now and we''ll call it even? > > You''re the one making the positive claim and I''m calling bullshit. So > the onus is on you to demonstrate the collision (and that you arrived at > it via your brute force method as described). Until then, my money stays > safely on my bank account. Put up or shut up, as the old saying goes.Come on dude. Chill out.
Edward Ned Harvey
2012-Jul-12 13:44 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Nico Williams > > IMO dedup should always verify.IMO, it should be a decision left to the user or admin, with the default being verify. (Exactly as it is now.) Because there is a performance-to-reliability tradeoff, and there is a very solid argument that the probability of collision causing data loss is so small that you should instead be more concerned with metoer strike and other improbable failures causing data loss. For some people and some situations, that argument will be good enough - they would opt for the performance increase. And for others, that argument is not good enough - they would opt for verification.
Edward Ned Harvey
2012-Jul-12 13:59 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > > This is so profoundly wrong that it leads me to suspect you never took > courses on cryptography and/or information theory.More inflammatory commentary, and personal comments? Saso, seriously, knock it off.> The size of your > storage pool DOESN''T MATTER ONE BIT to the size of the key space.Are we still talking about SHA256 collisions in a zpool? If so, the size of the storage pool (at least, the used blocks inside the pool) definitely has an effect on the probability of a collision. The first block written to pool, gets a checksum. There is a zero-probability of collision, because it''s the first one. The second block written gets another checksum. It has a 2^-256 probability of collision with the first one. The 3rd block has a 2*2^-256 probability... The 4th block has 3*2^-256 probability... This is the birthday problem. If you don''t know it, look it up on wikipedia. Each block written has a higher (but still very low) probability of collision with any other block that was previously written, until ... When you write your approximately 2^128th block, you have a 50/50 chance of collision with any other block. Fortunately, this amount of data is far greater than all the storage in the world combined, so this probability can never be reached with earthly means. For all Earthly data sets, assuming no SHA256 weaknesses are being exploited, the probability of a collision is still extremely small, but the more data blocks in the data set, the higher the probability is.
Edward Ned Harvey
2012-Jul-12 14:14 UTC
[zfs-discuss] New fast hash algorithm - is it needed?
> From: Jim Klimov [mailto:jimklimov at cos.ru] > Sent: Thursday, July 12, 2012 8:42 AM > To: Edward Ned Harvey > Subject: Re: [zfs-discuss] New fast hash algorithm - is it needed? > > 2012-07-11 18:03, Edward Ned Harvey ?????: > >> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > >> bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > >> > >> As your dedup > >> ratio grows, so does the performance hit from dedup=verify. At, say, > >> dedupratio=10.0x, on average, every write results in 10 reads. > > > > Why? > > If you intend to write a block, and you discover it has a matching checksum, > > you only need to verify it matches one block. You don''t need to verify it > > matches all the other blocks that have previously been verified to match > > each other. > > As Saso explained, if you wrote the same block 10 times > and detected it was already deduped, then by verifying > this detection 10 times you get about 10 extra reads.(In this case, Jim, you wrote me off-list and I replied on-list, but in this case, I thought it would be ok because this message doesn''t look private or exclusive to me. I apologize if I was wrong.) I get the miscommunication now - When you write the duplicate block for the 10th time, we all understand you''re not going to go back and verify 10 blocks. (It seemed, at least to me, that''s what Saso was saying. Which is why I told him, No you don''t.) You''re saying, that when you wrote the duplicate block the 2nd time, you verified... And then when you wrote it the 3rd time, you verified... And the 4th time, and the 5th time... By the time you write the 10th time, you have already verified 9 previous times, but you''re still going to verify again. Normally you would expect writing duplicate blocks dedup''d to be faster than writing the non-dedup''d data, because you get to skip the actual write. (This idealistic performance gain might be pure vapor due to need to update metadata, but ignoring that technicality, continue hypothetically...) When you verify, supposedly you''re giving up that hypothetical performance gain, because you have to do a read instead of a write. So at first blush, it seems like no net-gain for performance. But if the duplicate block gets written frequently (for example a block of all zeros) then there''s a high probability the duplicate block is already in ARC, so you can actually skip reading from disk and just read from RAM. So, the "10 extra reads" will sometimes be true - if the duplicate block doesn''t already exist in ARC. And the "10 extra reads" will sometimes be false - if the duplicate block is already in ARC.
On Thu, Jul 12, 2012 at 9:14 AM, Edward Ned Harvey < opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:> > From: Jim Klimov [mailto:jimklimov at cos.ru] > > Sent: Thursday, July 12, 2012 8:42 AM > > To: Edward Ned Harvey > > Subject: Re: [zfs-discuss] New fast hash algorithm - is it needed? > > > > 2012-07-11 18:03, Edward Ned Harvey ?????: > > >> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss- > > >> bounces at opensolaris.org] On Behalf Of Sa?o Kiselkov > > >> > > >> As your dedup > > >> ratio grows, so does the performance hit from dedup=verify. At, say, > > >> dedupratio=10.0x, on average, every write results in 10 reads. > > > > > > Why? > > > If you intend to write a block, and you discover it has a matching > checksum, > > > you only need to verify it matches one block. You don''t need to > verify it > > > matches all the other blocks that have previously been verified to > match > > > each other. > > > > As Saso explained, if you wrote the same block 10 times > > and detected it was already deduped, then by verifying > > this detection 10 times you get about 10 extra reads. > > (In this case, Jim, you wrote me off-list and I replied on-list, but in > this case, I thought it would be ok because this message doesn''t look > private or exclusive to me. I apologize if I was wrong.) > > I get the miscommunication now - > > When you write the duplicate block for the 10th time, we all understand > you''re not going to go back and verify 10 blocks. (It seemed, at least to > me, that''s what Saso was saying. Which is why I told him, No you don''t.) > > You''re saying, that when you wrote the duplicate block the 2nd time, you > verified... And then when you wrote it the 3rd time, you verified... And > the 4th time, and the 5th time... By the time you write the 10th time, > you have already verified 9 previous times, but you''re still going to > verify again. > > Normally you would expect writing duplicate blocks dedup''d to be faster > than writing the non-dedup''d data, because you get to skip the actual > write. (This idealistic performance gain might be pure vapor due to need > to update metadata, but ignoring that technicality, continue > hypothetically...) When you verify, supposedly you''re giving up that > hypothetical performance gain, because you have to do a read instead of a > write. So at first blush, it seems like no net-gain for performance. But > if the duplicate block gets written frequently (for example a block of all > zeros) then there''s a high probability the duplicate block is already in > ARC, so you can actually skip reading from disk and just read from RAM. > > So, the "10 extra reads" will sometimes be true - if the duplicate block > doesn''t already exist in ARC. And the "10 extra reads" will sometimes be > false - if the duplicate block is already in ARC.Sasso: yes, it''s absolutely worth implementing a higher performing hashing algorithm. I''d suggest simply ignoring the people that aren''t willing to acknowledge basic mathematics rather than lashing out. No point in feeding the trolls. The PETABYTES of data Quantum and DataDomain have out there are proof positive that complex hashes get the job done without verify, even if you don''t want to acknowledge the math behind it. --Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20120712/e04a3e1f/attachment.html>
On 07/12/2012 07:16 PM, Tim Cook wrote:> Sasso: yes, it''s absolutely worth implementing a higher performing hashing > algorithm. I''d suggest simply ignoring the people that aren''t willing to > acknowledge basic mathematics rather than lashing out. No point in feeding > the trolls. The PETABYTES of data Quantum and DataDomain have out there > are proof positive that complex hashes get the job done without verify, > even if you don''t want to acknowledge the math behind it.That''s what I deduced as well. I have far too much time to explain statistics, coding theory and compression algorithms to random people on the Internet. Code talks. I''m almost done implementing SHA-512/256 and Edon-R-512/256. I''ve spent most of the time learning how the ZFS code is structured and I now have most of the pieces in place, so expect the completed code drop with (SHA-512, Edon-R and Skein) by this weekend. Cheers, -- Saso
On 07/12/2012 09:52 PM, Sa?o Kiselkov wrote:> I have far too much time to explainP.S. that should have read "I have taken far too much time explaining". Men are crap at multitasking... Cheers, -- Saso