Here''s a slightly updated version of the BTRFS snappy interface. snappy is a faster compression algorithm that provides similar compression as LZO, but generally better performance. This version has: - Ported to latest tree - Extensive testing and benchmarking (in its 3.0 based incarnation) - A few cleanups - Adds a minor performance improvement Should be ready for pulling into the btrfs tree now. The crypto parts have been acked by Herbert. The following changes since commit 2485a4b610171f4e1c4ab0d053569747795c1bbe: Merge branch ''for-linus'' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input (2012-01-12 12:40:41 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc snappy Andi Kleen (3): Add the snappy-c compressor to lib v2 BTRFS: Add snappy support v2 Add snappy interface to crypto API crypto/Kconfig | 9 + crypto/Makefile | 1 + crypto/snappy.c | 99 ++++ fs/btrfs/Kconfig | 1 + fs/btrfs/Makefile | 2 +- fs/btrfs/compression.c | 1 + fs/btrfs/compression.h | 1 + fs/btrfs/ctree.h | 9 +- fs/btrfs/disk-io.c | 2 + fs/btrfs/ioctl.c | 4 + fs/btrfs/snappy.c | 432 ++++++++++++++++ fs/btrfs/super.c | 9 +- include/linux/snappy.h | 26 + lib/Kconfig | 6 + lib/Makefile | 4 + lib/snappy.c | 1300 ++++++++++++++++++++++++++++++++++++++++++++++++ 16 files changed, 1901 insertions(+), 5 deletions(-) create mode 100644 crypto/snappy.c create mode 100644 fs/btrfs/snappy.c create mode 100644 include/linux/snappy.h create mode 100644 lib/snappy.c -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Andi Kleen <ak@linux.intel.com> This is a C port of the google snappy compressor. It has roughly comparable compression to LZO, but is significantly faster on many file types. For example it beats all other compressors on already compressed data. I ported the original C++ code over to C and did some changes to make it better fit into the kernel. It preallocates the worst case memory consumption now. While the code being larger than lzo it is still reasonable (about 5K on x86). Decompression needs very little memory, Compression currently 192K on 64bit and 128K on 32bit. For comparison LZO compression needs 128K on 64bit and 64K on 32bit. [This could be lowered significantly by not preallocating for most use cases, typically the footprint is much lower. The original C++ version only allocated most of this when (rarely) needed, but this is more problematic in the kernel] There are some minor divergences from the Linux coding standards: in particular I kept the C++/C99 style mixed statement/declarations. This was mainly to not diverge too much from the reference C++ source, so that improvements from there can be easily ported. There are some other left overs from the google style, but very little now. Performance: The compressor performs best on 64bit-LE systems, but is also quite good on 32bit. I haven''t tested BE, but I don''t expect that to add a lot of overhead. Here is some performance data (32bit, Nehalem): c/b = cycles/byte; lower numbers are better. x86-64 executable: (compression minimally slower than qlz, but much better at decompression, lzo is left in the dust): snappy: emacs-gtk: 11007968 b: ratio 0.38: comp 8.13 uncomp 2.65 c/b lzo : emacs-gtk: 11007968 b: ratio 0.33: comp 12.74 uncomp 4.70 c/b zlib1 : emacs-gtk: 11007968 b: ratio 0.27: comp 49.96 uncomp 13.14 c/b zlib3 : emacs-gtk: 11007968 b: ratio 0.26: comp 64.17 uncomp 12.33 c/b lzf : emacs-gtk: 11007968 b: ratio 0.37: comp 9.85 uncomp 4.33 c/b qlz : emacs-gtk: 11007968 b: ratio 0.34: comp 7.51 uncomp 6.28 c/b fastlz: emacs-gtk: 11007968 b: ratio 0.37: comp 10.73 uncomp 4.97 c/b Compressed data (beats everything else): snappy: udev-151.tar.gz: 634842 b: ratio 1.00: comp 0.99 uncomp 0.33 c/b lzo : udev-151.tar.gz: 634842 b: ratio 1.00: comp 41.44 uncomp 0.66 c/b zlib1 : udev-151.tar.gz: 634842 b: ratio 1.00: comp 116.99 uncomp 3.94 c/b zlib3 : udev-151.tar.gz: 634842 b: ratio 1.00: comp 117.68 uncomp 3.94 c/b lzf : udev-151.tar.gz: 634842 b: ratio 1.03: comp 16.32 uncomp 1.14 c/b qlz : udev-151.tar.gz: 634842 b: ratio 1.00: comp 10.42 uncomp 0.42 c/b fastlz: udev-151.tar.gz: 634842 b: ratio 1.03: comp 19.35 uncomp 2.07 c/b Text file (compression somewhat slower than qlz, but decompression much better, lzo is much worse): snappy: manual.txt: 445343 b: ratio 0.47: comp 12.01 uncomp 3.12 c/b lzo : manual.txt: 445343 b: ratio 0.44: comp 16.32 uncomp 7.53 c/b zlib1 : manual.txt: 445343 b: ratio 0.35: comp 56.37 uncomp 15.59 c/b zlib3 : manual.txt: 445343 b: ratio 0.31: comp 73.45 uncomp 13.99 c/b lzf : manual.txt: 445343 b: ratio 0.46: comp 13.43 uncomp 5.47 c/b qlz : manual.txt: 445343 b: ratio 0.44: comp 9.16 uncomp 8.19 c/b fastlz: manual.txt: 445343 b: ratio 0.46: comp 14.22 uncomp 7.28 c/b As you can see snappy is a good all-around compressor. On 64bit the compression is even faster and beats everything else easily: snappy: emacs-gtk: 11007968 b: ratio 0.38: comp 4.90 uncomp 2.65 c/b lzo : emacs-gtk: 11007968 b: ratio 0.33: comp 11.24 uncomp 4.46 c/b zlib1 : emacs-gtk: 11007968 b: ratio 0.27: comp 41.67 uncomp 11.13 c/b zlib3 : emacs-gtk: 11007968 b: ratio 0.26: comp 51.80 uncomp 10.54 c/b lzf : emacs-gtk: 11007968 b: ratio 0.37: comp 8.79 uncomp 4.05 c/b qlz : emacs-gtk: 11007968 b: ratio 0.34: comp 5.44 uncomp 5.46 c/b fastlz: emacs-gtk: 11007968 b: ratio 0.37: comp 9.91 uncomp 4.77 c/b On 64bit it''s now nearly as fast as qlz on the text file too: snappy: manual.txt: 445343 b: ratio 0.47: comp 7.79 uncomp 3.47 c/b lzo : manual.txt: 445343 b: ratio 0.44: comp 15.46 uncomp 7.27 c/b zlib1 : manual.txt: 445343 b: ratio 0.35: comp 45.79 uncomp 12.78 c/b zlib3 : manual.txt: 445343 b: ratio 0.31: comp 60.52 uncomp 11.72 c/b lzf : manual.txt: 445343 b: ratio 0.46: comp 12.62 uncomp 5.30 c/b qlz : manual.txt: 445343 b: ratio 0.44: comp 6.81 uncomp 7.65 c/b fastlz: manual.txt: 445343 b: ratio 0.46: comp 13.75 uncomp 6.52 c/b Overall it''s a good alternative to lzo, with the only drawback being the somewhat higher memory use. v2: Some minor performance improvements and cleanups. 32bit compression should be a few percent faster now. Signed-off-by: Andi Kleen <ak@linux.intel.com> --- include/linux/snappy.h | 26 + lib/Kconfig | 6 + lib/Makefile | 4 + lib/snappy.c | 1300 ++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 1336 insertions(+), 0 deletions(-) create mode 100644 include/linux/snappy.h create mode 100644 lib/snappy.c diff --git a/include/linux/snappy.h b/include/linux/snappy.h new file mode 100644 index 0000000..4119803 --- /dev/null +++ b/include/linux/snappy.h @@ -0,0 +1,26 @@ +#ifndef _LINUX_SNAPPY_H +#define _LINUX_SNAPPY_H 1 + +#include <linux/types.h> + +/* Only needed for compression. This preallocates the worst case */ +struct snappy_env { + u16 *hash_table; + void *scratch; + void *scratch_output; +}; + +int snappy_init_env(struct snappy_env *env); +void snappy_free_env(struct snappy_env *env); +bool snappy_uncompress(const char *compressed, size_t n, char *uncompressed); +int snappy_compress(struct snappy_env *env, + const char *input, + size_t input_length, + char *compressed, + size_t *compressed_length); +bool snappy_uncompressed_length(const char *buf, size_t len, size_t *result); +size_t snappy_max_compressed_length(size_t source_len); + + + +#endif diff --git a/lib/Kconfig b/lib/Kconfig index 201e1b3..719e4f2 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -310,4 +310,10 @@ config DIGSIG Digital signature verification. Currently only RSA is supported. Implementation is done using GnuPG MPI library +config SNAPPY + tristate "Snappy compressor" + help + Add the snappy compressor. This is a reasonable compressor that + compresses and decompresses extremly fast. + endmenu diff --git a/lib/Makefile b/lib/Makefile index dace162..2f5f86a 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -121,6 +121,10 @@ obj-$(CONFIG_DQL) += dynamic_queue_limits.o obj-$(CONFIG_MPILIB) += mpi/ obj-$(CONFIG_DIGSIG) += digsig.o +CFLAGS_snappy.o += $(call cc-disable-warning, declaration-after-statement) \ + -DNDEBUG=1 +obj-$(CONFIG_SNAPPY) += snappy.o + hostprogs-y := gen_crc32table clean-files := crc32table.h diff --git a/lib/snappy.c b/lib/snappy.c new file mode 100644 index 0000000..0b39e07 --- /dev/null +++ b/lib/snappy.c @@ -0,0 +1,1300 @@ +/* + * C port of the snappy compressor from Google. + * This is a very fast compressor with comparable compression to lzo. + * Works best on 64bit little-endian, but should be good on others too. + * Ported by Andi Kleen. + * Based on snappy 1.0.3 plus some selected changes from SVN. + */ + +/* + * Copyright 2005 Google Inc. All Rights Reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following disclaimer + * in the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Google Inc. nor the names of its + * contributors may be used to endorse or promote products derived from + * this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/slab.h> +#include <linux/string.h> +#include <linux/snappy.h> +#include <asm/unaligned.h> + +#define CRASH_UNLESS(x) BUG_ON(!(x)) +#define CHECK(cond) CRASH_UNLESS(cond) +#define CHECK_LE(a, b) CRASH_UNLESS((a) <= (b)) +#define CHECK_GE(a, b) CRASH_UNLESS((a) >= (b)) +#define CHECK_EQ(a, b) CRASH_UNLESS((a) == (b)) +#define CHECK_NE(a, b) CRASH_UNLESS((a) != (b)) +#define CHECK_LT(a, b) CRASH_UNLESS((a) < (b)) +#define CHECK_GT(a, b) CRASH_UNLESS((a) > (b)) + +#define UNALIGNED_LOAD16(_p) get_unaligned((u16 *)(_p)) +#define UNALIGNED_LOAD32(_p) get_unaligned((u32 *)(_p)) +#define UNALIGNED_LOAD64(_p) get_unaligned((u64 *)(_p)) + +#define UNALIGNED_STORE16(_p, _val) put_unaligned(_val, (u16 *)(_p)) +#define UNALIGNED_STORE32(_p, _val) put_unaligned(_val, (u32 *)(_p)) +#define UNALIGNED_STORE64(_p, _val) put_unaligned(_val, (u64 *)(_p)) + +#ifdef NDEBUG + +#define DCHECK(cond) do {} while(0) +#define DCHECK_LE(a, b) do {} while(0) +#define DCHECK_GE(a, b) do {} while(0) +#define DCHECK_EQ(a, b) do {} while(0) +#define DCHECK_NE(a, b) do {} while(0) +#define DCHECK_LT(a, b) do {} while(0) +#define DCHECK_GT(a, b) do {} while(0) + +#else + +#define DCHECK(cond) CHECK(cond) +#define DCHECK_LE(a, b) CHECK_LE(a, b) +#define DCHECK_GE(a, b) CHECK_GE(a, b) +#define DCHECK_EQ(a, b) CHECK_EQ(a, b) +#define DCHECK_NE(a, b) CHECK_NE(a, b) +#define DCHECK_LT(a, b) CHECK_LT(a, b) +#define DCHECK_GT(a, b) CHECK_GT(a, b) + +#endif + +static inline bool is_little_endian(void) +{ +#ifdef __LITTLE_ENDIAN__ + return true; +#endif + return false; +} + +static inline int log2_floor(u32 n) +{ + return n == 0 ? -1 : 31 ^ __builtin_clz(n); +} + +static inline int find_lsb_set_non_zero(u32 n) +{ + return __builtin_ctz(n); +} + +static inline int find_lsb_set_non_zero64(u64 n) +{ + if (sizeof(long) == 4) { + if (n & 0xffffffff) + return __builtin_ctz(n & 0xffffffff); + return 32 + __builtin_ctz(n >> 32); + } + return __builtin_ctzll(n); +} + +#define kmax32 5 + +/* + * Attempts to parse a varint32 from a prefix of the bytes in [ptr,limit-1]. + * Never reads a character at or beyond limit. If a valid/terminated varint32 + * was found in the range, stores it in *OUTPUT and returns a pointer just + * past the last byte of the varint32. Else returns NULL. On success, + * "result <= limit". + */ +static inline const char *varint_parse32_with_limit(const char *p, + const char *l, + u32 * OUTPUT) +{ + const unsigned char *ptr = (const unsigned char *)(p); + const unsigned char *limit = (const unsigned char *)(l); + u32 b, result; + + if (ptr >= limit) + return NULL; + b = *(ptr++); + result = b & 127; + if (b < 128) + goto done; + if (ptr >= limit) + return NULL; + b = *(ptr++); + result |= (b & 127) << 7; + if (b < 128) + goto done; + if (ptr >= limit) + return NULL; + b = *(ptr++); + result |= (b & 127) << 14; + if (b < 128) + goto done; + if (ptr >= limit) + return NULL; + b = *(ptr++); + result |= (b & 127) << 21; + if (b < 128) + goto done; + if (ptr >= limit) + return NULL; + b = *(ptr++); + result |= (b & 127) << 28; + if (b < 16) + goto done; + return NULL; /* Value is too long to be a varint32 */ +done: + *OUTPUT = result; + return (const char *)(ptr); +} + +/* + * REQUIRES "ptr" points to a buffer of length sufficient to hold "v". + * EFFECTS Encodes "v" into "ptr" and returns a pointer to the + * byte just past the last encoded byte. + */ +static inline char *varint_encode32(char *sptr, u32 v) +{ + /* Operate on characters as unsigneds */ + unsigned char *ptr = (unsigned char *)(sptr); + static const int B = 128; + + if (v < (1 << 7)) { + *(ptr++) = v; + } else if (v < (1 << 14)) { + *(ptr++) = v | B; + *(ptr++) = v >> 7; + } else if (v < (1 << 21)) { + *(ptr++) = v | B; + *(ptr++) = (v >> 7) | B; + *(ptr++) = v >> 14; + } else if (v < (1 << 28)) { + *(ptr++) = v | B; + *(ptr++) = (v >> 7) | B; + *(ptr++) = (v >> 14) | B; + *(ptr++) = v >> 21; + } else { + *(ptr++) = v | B; + *(ptr++) = (v >> 7) | B; + *(ptr++) = (v >> 14) | B; + *(ptr++) = (v >> 21) | B; + *(ptr++) = v >> 28; + } + return (char *)(ptr); +} + +struct source { + const char *ptr; + size_t left; +}; + +static inline int available(struct source *s) +{ + return s->left; +} + +static inline const char *peek(struct source *s, size_t * len) +{ + *len = s->left; + return s->ptr; +} + +static inline void skip(struct source *s, size_t n) +{ + s->left -= n; + s->ptr += n; +} + +struct sink { + char *dest; +}; + +static inline void append(struct sink *s, const char *data, size_t n) +{ + if (data != s->dest) + memcpy(s->dest, data, n); + s->dest += n; +} + +static inline void *sink_peek(struct sink *s, size_t n) +{ + return s->dest; +} + +struct writer { + char *base; + char *op; + char *op_limit; +}; + +/* Called before decompression */ +static inline void writer_set_expected_length(struct writer *w, size_t len) +{ + w->op_limit = w->op + len; +} + +/* Called after decompression */ +static inline bool writer_check_length(struct writer *w) +{ + return w->op == w->op_limit; +} + +/* + * Copy "len" bytes from "src" to "op", one byte at a time. Used for + * handling COPY operations where the input and output regions may + * overlap. For example, suppose: + * src == "ab" + * op == src + 2 + * len == 20 + * After IncrementalCopy(src, op, len), the result will have + * eleven copies of "ab" + * ababababababababababab + * Note that this does not match the semantics of either memcpy() + * or memmove(). + */ +static inline void incremental_copy(const char *src, char *op, int len) +{ + DCHECK_GT(len, 0); + do { + *op++ = *src++; + } while (--len > 0); +} + +/* + * Equivalent to IncrementalCopy except that it can write up to ten extra + * bytes after the end of the copy, and that it is faster. + * + * The main part of this loop is a simple copy of eight bytes at a time until + * we''ve copied (at least) the requested amount of bytes. However, if op and + * src are less than eight bytes apart (indicating a repeating pattern of + * length < 8), we first need to expand the pattern in order to get the correct + * results. For instance, if the buffer looks like this, with the eight-byte + * <src> and <op> patterns marked as intervals: + * + * abxxxxxxxxxxxx + * [------] src + * [------] op + * + * a single eight-byte copy from <src> to <op> will repeat the pattern once, + * after which we can move <op> two bytes without moving <src>: + * + * ababxxxxxxxxxx + * [------] src + * [------] op + * + * and repeat the exercise until the two no longer overlap. + * + * This allows us to do very well in the special case of one single byte + * repeated many times, without taking a big hit for more general cases. + * + * The worst case of extra writing past the end of the match occurs when + * op - src == 1 and len == 1; the last copy will read from byte positions + * [0..7] and write to [4..11], whereas it was only supposed to write to + * position 1. Thus, ten excess bytes. + */ + +#define kmax_increment_copy_overflow 10 + +static inline void incremental_copy_fast_path(const char *src, char *op, + int len) +{ + while (op - src < 8) { + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(src)); + len -= op - src; + op += op - src; + } + while (len > 0) { + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(src)); + src += 8; + op += 8; + len -= 8; + } +} + +static inline bool writer_append_from_self(struct writer *w, u32 offset, + u32 len) +{ + char *op = w->op; + const int space_left = w->op_limit - op; + + if (op - w->base <= offset - 1u) /* -1u catches offset==0 */ + return false; + if (len <= 16 && offset >= 8 && space_left >= 16) { + /* Fast path, used for the majority (70-80%) of dynamic + * invocations. */ + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(op - offset)); + UNALIGNED_STORE64(op + 8, UNALIGNED_LOAD64(op - offset + 8)); + } else { + if (space_left >= len + kmax_increment_copy_overflow) { + incremental_copy_fast_path(op - offset, op, len); + } else { + if (space_left < len) { + return false; + } + incremental_copy(op - offset, op, len); + } + } + + w->op = op + len; + return true; +} + +static inline bool writer_append(struct writer *w, const char *ip, u32 len, + bool allow_fast_path) +{ + char *op = w->op; + const int space_left = w->op_limit - op; + if (allow_fast_path && len <= 16 && space_left >= 16) { + /* Fast path, used for the majority (about 90%) of dynamic + * invocations. */ + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(ip)); + UNALIGNED_STORE64(op + 8, UNALIGNED_LOAD64(ip + 8)); + } else { + if (space_left < len) + return false; + memcpy(op, ip, len); + } + w->op = op + len; + return true; +} + +/* + * Any hash function will produce a valid compressed bitstream, but a good + * hash function reduces the number of collisions and thus yields better + * compression for compressible input, and more speed for incompressible + * input. Of course, it doesn''t hurt if the hash function is reasonably fast + * either, as it gets called a lot. + */ +static inline u32 hash_bytes(u32 bytes, int shift) +{ + u32 kmul = 0x1e35a7bd; + return (bytes * kmul) >> shift; +} + +static inline u32 hash(const char *p, int shift) +{ + return hash_bytes(UNALIGNED_LOAD32(p), shift); +} + +/* + * Compressed data can be defined as: + * compressed := item* literal* + * item := literal* copy + * + * The trailing literal sequence has a space blowup of at most 62/60 + * since a literal of length 60 needs one tag byte + one extra byte + * for length information. + * + * Item blowup is trickier to measure. Suppose the "copy" op copies + * 4 bytes of data. Because of a special check in the encoding code, + * we produce a 4-byte copy only if the offset is < 65536. Therefore + * the copy op takes 3 bytes to encode, and this type of item leads + * to at most the 62/60 blowup for representing literals. + * + * Suppose the "copy" op copies 5 bytes of data. If the offset is big + * enough, it will take 5 bytes to encode the copy op. Therefore the + * worst case here is a one-byte literal followed by a five-byte copy. + * I.e., 6 bytes of input turn into 7 bytes of "compressed" data. + * + * This last factor dominates the blowup, so the final estimate is: + */ +size_t snappy_max_compressed_length(size_t source_len) +{ + return 32 + source_len + source_len / 6; +} +EXPORT_SYMBOL(snappy_max_compressed_length); + +enum { + LITERAL = 0, + COPY_1_BYTE_OFFSET = 1, /* 3 bit length + 3 bits of offset in opcode */ + COPY_2_BYTE_OFFSET = 2, + COPY_4_BYTE_OFFSET = 3 +}; + +static inline char *emit_literal(char *op, + const char *literal, + int len, bool allow_fast_path) +{ + int n = len - 1; /* Zero-length literals are disallowed */ + + if (n < 60) { + /* Fits in tag byte */ + *op++ = LITERAL | (n << 2); + +/* + * The vast majority of copies are below 16 bytes, for which a + * call to memcpy is overkill. This fast path can sometimes + * copy up to 15 bytes too much, but that is okay in the + * main loop, since we have a bit to go on for both sides: + * + * - The input will always have kInputMarginBytes = 15 extra + * available bytes, as long as we''re in the main loop, and + * if not, allow_fast_path = false. + * - The output will always have 32 spare bytes (see + * MaxCompressedLength). + */ + if (allow_fast_path && len <= 16) { + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(literal)); + UNALIGNED_STORE64(op + 8, + UNALIGNED_LOAD64(literal + 8)); + return op + len; + } + } else { + /* Encode in upcoming bytes */ + char *base = op; + int count = 0; + op++; + while (n > 0) { + *op++ = n & 0xff; + n >>= 8; + count++; + } + DCHECK(count >= 1); + DCHECK(count <= 4); + *base = LITERAL | ((59 + count) << 2); + } + memcpy(op, literal, len); + return op + len; +} + +static inline char *emit_copy_less_than64(char *op, int offset, int len) +{ + DCHECK_LE(len, 64); + DCHECK_GE(len, 4); + DCHECK_LT(offset, 65536); + + if ((len < 12) && (offset < 2048)) { + int len_minus_4 = len - 4; + DCHECK(len_minus_4 < 8); /* Must fit in 3 bits */ + *op++ + COPY_1_BYTE_OFFSET | ((len_minus_4) << 2) | ((offset >> 8) + << 5); + *op++ = offset & 0xff; + } else { + *op++ = COPY_2_BYTE_OFFSET | ((len - 1) << 2); + put_unaligned_le16(offset, op); + op += 2; + } + return op; +} + +static inline char *emit_copy(char *op, int offset, int len) +{ + /* + * Emit 64 byte copies but make sure to keep at least four bytes + * reserved + */ + while (len >= 68) { + op = emit_copy_less_than64(op, offset, 64); + len -= 64; + } + + /* + * Emit an extra 60 byte copy if have too much data to fit in + * one copy + */ + if (len > 64) { + op = emit_copy_less_than64(op, offset, 60); + len -= 60; + } + + /* Emit remainder */ + op = emit_copy_less_than64(op, offset, len); + return op; +} + +/** + * snappy_uncompressed_length - return length of uncompressed output. + * @start: compressed buffer + * @n: length of compressed buffer. + * @result: Write the length of the uncompressed output here. + * + * Returns true when successfull, otherwise false. + */ +bool snappy_uncompressed_length(const char *start, size_t n, size_t * result) +{ + u32 v = 0; + const char *limit = start + n; + if (varint_parse32_with_limit(start, limit, &v) != NULL) { + *result = v; + return true; + } else { + return false; + } +} +EXPORT_SYMBOL(snappy_uncompressed_length); + +#define kblock_log 15 +#define kblock_size (1 << kblock_log) + +#define kmax_hash_table_bits 14 +#define kmax_hash_table_size (1 << kmax_hash_table_bits) + +/* + * Use smaller hash table when input.size() is smaller, since we + * fill the table, incurring O(hash table size) overhead for + * compression, and if the input is short, we won''t need that + * many hash table entries anyway. + */ +static u16 *get_hash_table(struct snappy_env *env, size_t input_size, + int *table_size) +{ + int htsize = 256; + + DCHECK(kmax_hash_table_size >= 256); + while (htsize < kmax_hash_table_size && htsize < input_size) + htsize <<= 1; + CHECK_EQ(0, htsize & (htsize - 1)); + CHECK_LE(htsize, kmax_hash_table_size); + + u16 *table; + table = env->hash_table; + + *table_size = htsize; + memset(table, 0, htsize * sizeof(*table)); + return table; +} + +/* + * Return the largest n such that + * + * s1[0,n-1] == s2[0,n-1] + * and n <= (s2_limit - s2). + * + * Does not read *s2_limit or beyond. + * Does not read *(s1 + (s2_limit - s2)) or beyond. + * Requires that s2_limit >= s2. + * + * Separate implementation for x86_64, for speed. Uses the fact that + * x86_64 is little endian. + */ +#if defined(__LITTLE_ENDIAN__) +static inline int find_match_length(const char *s1, + const char *s2, const char *s2_limit) +{ + int matched = 0; + + DCHECK_GE(s2_limit, s2); + /* + * Find out how long the match is. We loop over the data 64 bits at a + * time until we find a 64-bit block that doesn''t match; then we find + * the first non-matching bit and use that to calculate the total + * length of the match. + */ + while (likely(s2 <= s2_limit - 8)) { + if (unlikely + (UNALIGNED_LOAD64(s2) == UNALIGNED_LOAD64(s1 + matched))) { + s2 += 8; + matched += 8; + } else { + /* + * On current (mid-2008) Opteron models there + * is a 3% more efficient code sequence to + * find the first non-matching byte. However, + * what follows is ~10% better on Intel Core 2 + * and newer, and we expect AMD''s bsf + * instruction to improve. + */ + u64 x + UNALIGNED_LOAD64(s2) ^ UNALIGNED_LOAD64(s1 + + matched); + int matching_bits = find_lsb_set_non_zero64(x); + matched += matching_bits >> 3; + return matched; + } + } + while (likely(s2 < s2_limit)) { + if (likely(s1[matched] == *s2)) { + ++s2; + ++matched; + } else { + return matched; + } + } + return matched; +} +#else +static inline int find_match_length(const char *s1, + const char *s2, const char *s2_limit) +{ + /* Implementation based on the x86-64 version, above. */ + DCHECK_GE(s2_limit, s2); + int matched = 0; + + while (s2 <= s2_limit - 4 && + UNALIGNED_LOAD32(s2) == UNALIGNED_LOAD32(s1 + matched)) { + s2 += 4; + matched += 4; + } + if (is_little_endian() && s2 <= s2_limit - 4) { + u32 x + UNALIGNED_LOAD32(s2) ^ UNALIGNED_LOAD32(s1 + matched); + int matching_bits = find_lsb_set_non_zero(x); + matched += matching_bits >> 3; + } else { + while ((s2 < s2_limit) && (s1[matched] == *s2)) { + ++s2; + ++matched; + } + } + return matched; +} +#endif + +/* + * For 0 <= offset <= 4, GetU32AtOffset(UNALIGNED_LOAD64(p), offset) will + * equal UNALIGNED_LOAD32(p + offset). Motivation: On x86-64 hardware we have + * empirically found that overlapping loads such as + * UNALIGNED_LOAD32(p) ... UNALIGNED_LOAD32(p+1) ... UNALIGNED_LOAD32(p+2) + * are slower than UNALIGNED_LOAD64(p) followed by shifts and casts to u32. + */ +static inline u32 get_u32_at_offset(u64 v, int offset) +{ + DCHECK(0 <= offset && offset <= 4); + return v >> (is_little_endian()? 8 * offset : 32 - 8 * offset); +} + +/* + * Flat array compression that does not emit the "uncompressed length" + * prefix. Compresses "input" string to the "*op" buffer. + * + * REQUIRES: "input" is at most "kBlockSize" bytes long. + * REQUIRES: "op" points to an array of memory that is at least + * "MaxCompressedLength(input.size())" in size. + * REQUIRES: All elements in "table[0..table_size-1]" are initialized to zero. + * REQUIRES: "table_size" is a power of two + * + * Returns an "end" pointer into "op" buffer. + * "end - op" is the compressed size of "input". + */ + +static char *compress_fragment(const char *const input, + const size_t input_size, + char *op, u16 * table, const int table_size) +{ + /* "ip" is the input pointer, and "op" is the output pointer. */ + const char *ip = input; + CHECK_LE(input_size, kblock_size); + CHECK_EQ(table_size & (table_size - 1), 0); + const int shift = 32 - log2_floor(table_size); + DCHECK_EQ(UINT_MAX >> shift, table_size - 1); + const char *ip_end = input + input_size; + const char *baseip = ip; + /* + * Bytes in [next_emit, ip) will be emitted as literal bytes. Or + * [next_emit, ip_end) after the main loop. + */ + const char *next_emit = ip; + + const int kinput_margin_bytes = 15; + + if (likely(input_size >= kinput_margin_bytes)) { + const char *ip_limit = input + input_size - + kinput_margin_bytes; + + u32 next_hash; + for (next_hash = hash(++ip, shift);;) { + DCHECK_LT(next_emit, ip); +/* + * The body of this loop calls EmitLiteral once and then EmitCopy one or + * more times. (The exception is that when we''re close to exhausting + * the input we goto emit_remainder.) + * + * In the first iteration of this loop we''re just starting, so + * there''s nothing to copy, so calling EmitLiteral once is + * necessary. And we only start a new iteration when the + * current iteration has determined that a call to EmitLiteral will + * precede the next call to EmitCopy (if any). + * + * Step 1: Scan forward in the input looking for a 4-byte-long match. + * If we get close to exhausting the input then goto emit_remainder. + * + * Heuristic match skipping: If 32 bytes are scanned with no matches + * found, start looking only at every other byte. If 32 more bytes are + * scanned, look at every third byte, etc.. When a match is found, + * immediately go back to looking at every byte. This is a small loss + * (~5% performance, ~0.1% density) for lcompressible data due to more + * bookkeeping, but for non-compressible data (such as JPEG) it''s a huge + * win since the compressor quickly "realizes" the data is incompressible + * and doesn''t bother looking for matches everywhere. + * + * The "skip" variable keeps track of how many bytes there are since the + * last match; dividing it by 32 (ie. right-shifting by five) gives the + * number of bytes to move ahead for each iteration. + */ + u32 skip = 32; + + const char *next_ip = ip; + const char *candidate; + do { + ip = next_ip; + u32 hval = next_hash; + DCHECK_EQ(hval, hash(ip, shift)); + u32 bytes_between_hash_lookups = skip++ >> 5; + next_ip = ip + bytes_between_hash_lookups; + if (unlikely(next_ip > ip_limit)) { + goto emit_remainder; + } + next_hash = hash(next_ip, shift); + candidate = baseip + table[hval]; + DCHECK_GE(candidate, baseip); + DCHECK_LT(candidate, ip); + + table[hval] = ip - baseip; + } while (likely(UNALIGNED_LOAD32(ip) !+ UNALIGNED_LOAD32(candidate))); + +/* + * Step 2: A 4-byte match has been found. We''ll later see if more + * than 4 bytes match. But, prior to the match, input + * bytes [next_emit, ip) are unmatched. Emit them as "literal bytes." + */ + DCHECK_LE(next_emit + 16, ip_end); + op = emit_literal(op, next_emit, ip - next_emit, true); + +/* + * Step 3: Call EmitCopy, and then see if another EmitCopy could + * be our next move. Repeat until we find no match for the + * input immediately after what was consumed by the last EmitCopy call. + * + * If we exit this loop normally then we need to call EmitLiteral next, + * though we don''t yet know how big the literal will be. We handle that + * by proceeding to the next iteration of the main loop. We also can exit + * this loop via goto if we get close to exhausting the input. + */ + u64 input_bytes = 0; + u32 candidate_bytes = 0; + + do { +/* + * We have a 4-byte match at ip, and no need to emit any + * "literal bytes" prior to ip. + */ + const char *base = ip; + int matched = 4 + + find_match_length(candidate + 4, ip + 4, + ip_end); + ip += matched; + int offset = base - candidate; + DCHECK_EQ(0, memcmp(base, candidate, matched)); + op = emit_copy(op, offset, matched); +/* + * We could immediately start working at ip now, but to improve + * compression we first update table[Hash(ip - 1, ...)]. + */ + const char *insert_tail = ip - 1; + next_emit = ip; + if (unlikely(ip >= ip_limit)) { + goto emit_remainder; + } + input_bytes = UNALIGNED_LOAD64(insert_tail); + u32 prev_hash + hash_bytes(get_u32_at_offset + (input_bytes, 0), shift); + table[prev_hash] = ip - baseip - 1; + u32 cur_hash + hash_bytes(get_u32_at_offset + (input_bytes, 1), shift); + candidate = baseip + table[cur_hash]; + candidate_bytes = UNALIGNED_LOAD32(candidate); + table[cur_hash] = ip - baseip; + } while (get_u32_at_offset(input_bytes, 1) =+ candidate_bytes); + + next_hash + hash_bytes(get_u32_at_offset(input_bytes, 2), + shift); + ++ip; + } + } + +emit_remainder: + /* Emit the remaining bytes as a literal */ + if (next_emit < ip_end) + op = emit_literal(op, next_emit, ip_end - next_emit, false); + + return op; +} + +/* + * ----------------------------------------------------------------------- + * Lookup table for decompression code. Generated by ComputeTable() below. + * ----------------------------------------------------------------------- + */ + +/* Mapping from i in range [0,4] to a mask to extract the bottom 8*i bits */ +static const u32 wordmask[] = { + 0u, 0xffu, 0xffffu, 0xffffffu, 0xffffffffu +}; + +/* + * Data stored per entry in lookup table: + * Range Bits-used Description + * ------------------------------------ + * 1..64 0..7 Literal/copy length encoded in opcode byte + * 0..7 8..10 Copy offset encoded in opcode byte / 256 + * 0..4 11..13 Extra bytes after opcode + * + * We use eight bits for the length even though 7 would have sufficed + * because of efficiency reasons: + * (1) Extracting a byte is faster than a bit-field + * (2) It properly aligns copy offset so we do not need a <<8 + */ +static const u16 char_table[256] = { + 0x0001, 0x0804, 0x1001, 0x2001, 0x0002, 0x0805, 0x1002, 0x2002, + 0x0003, 0x0806, 0x1003, 0x2003, 0x0004, 0x0807, 0x1004, 0x2004, + 0x0005, 0x0808, 0x1005, 0x2005, 0x0006, 0x0809, 0x1006, 0x2006, + 0x0007, 0x080a, 0x1007, 0x2007, 0x0008, 0x080b, 0x1008, 0x2008, + 0x0009, 0x0904, 0x1009, 0x2009, 0x000a, 0x0905, 0x100a, 0x200a, + 0x000b, 0x0906, 0x100b, 0x200b, 0x000c, 0x0907, 0x100c, 0x200c, + 0x000d, 0x0908, 0x100d, 0x200d, 0x000e, 0x0909, 0x100e, 0x200e, + 0x000f, 0x090a, 0x100f, 0x200f, 0x0010, 0x090b, 0x1010, 0x2010, + 0x0011, 0x0a04, 0x1011, 0x2011, 0x0012, 0x0a05, 0x1012, 0x2012, + 0x0013, 0x0a06, 0x1013, 0x2013, 0x0014, 0x0a07, 0x1014, 0x2014, + 0x0015, 0x0a08, 0x1015, 0x2015, 0x0016, 0x0a09, 0x1016, 0x2016, + 0x0017, 0x0a0a, 0x1017, 0x2017, 0x0018, 0x0a0b, 0x1018, 0x2018, + 0x0019, 0x0b04, 0x1019, 0x2019, 0x001a, 0x0b05, 0x101a, 0x201a, + 0x001b, 0x0b06, 0x101b, 0x201b, 0x001c, 0x0b07, 0x101c, 0x201c, + 0x001d, 0x0b08, 0x101d, 0x201d, 0x001e, 0x0b09, 0x101e, 0x201e, + 0x001f, 0x0b0a, 0x101f, 0x201f, 0x0020, 0x0b0b, 0x1020, 0x2020, + 0x0021, 0x0c04, 0x1021, 0x2021, 0x0022, 0x0c05, 0x1022, 0x2022, + 0x0023, 0x0c06, 0x1023, 0x2023, 0x0024, 0x0c07, 0x1024, 0x2024, + 0x0025, 0x0c08, 0x1025, 0x2025, 0x0026, 0x0c09, 0x1026, 0x2026, + 0x0027, 0x0c0a, 0x1027, 0x2027, 0x0028, 0x0c0b, 0x1028, 0x2028, + 0x0029, 0x0d04, 0x1029, 0x2029, 0x002a, 0x0d05, 0x102a, 0x202a, + 0x002b, 0x0d06, 0x102b, 0x202b, 0x002c, 0x0d07, 0x102c, 0x202c, + 0x002d, 0x0d08, 0x102d, 0x202d, 0x002e, 0x0d09, 0x102e, 0x202e, + 0x002f, 0x0d0a, 0x102f, 0x202f, 0x0030, 0x0d0b, 0x1030, 0x2030, + 0x0031, 0x0e04, 0x1031, 0x2031, 0x0032, 0x0e05, 0x1032, 0x2032, + 0x0033, 0x0e06, 0x1033, 0x2033, 0x0034, 0x0e07, 0x1034, 0x2034, + 0x0035, 0x0e08, 0x1035, 0x2035, 0x0036, 0x0e09, 0x1036, 0x2036, + 0x0037, 0x0e0a, 0x1037, 0x2037, 0x0038, 0x0e0b, 0x1038, 0x2038, + 0x0039, 0x0f04, 0x1039, 0x2039, 0x003a, 0x0f05, 0x103a, 0x203a, + 0x003b, 0x0f06, 0x103b, 0x203b, 0x003c, 0x0f07, 0x103c, 0x203c, + 0x0801, 0x0f08, 0x103d, 0x203d, 0x1001, 0x0f09, 0x103e, 0x203e, + 0x1801, 0x0f0a, 0x103f, 0x203f, 0x2001, 0x0f0b, 0x1040, 0x2040 +}; + +struct snappy_decompressor { + struct source *reader; /* Underlying source of bytes to decompress */ + const char *ip; /* Points to next buffered byte */ + const char *ip_limit; /* Points just past buffered bytes */ + u32 peeked; /* Bytes peeked from reader (need to skip) */ + bool eof; /* Hit end of input without an error? */ + char scratch[5]; /* Temporary buffer for peekfast boundaries */ +}; + +static void +init_snappy_decompressor(struct snappy_decompressor *d, struct source *reader) +{ + d->reader = reader; + d->ip = NULL; + d->ip_limit = NULL; + d->peeked = 0; + d->eof = false; +} + +static void exit_snappy_decompressor(struct snappy_decompressor *d) +{ + skip(d->reader, d->peeked); +} + +/* + * Read the uncompressed length stored at the start of the compressed data. + * On succcess, stores the length in *result and returns true. + * On failure, returns false. + */ +static bool read_uncompressed_length(struct snappy_decompressor *d, + u32 * result) +{ + DCHECK(d->ip == NULL); /* + * Must not have read anything yet + * Length is encoded in 1..5 bytes + */ + *result = 0; + u32 shift = 0; + while (true) { + if (shift >= 32) + return false; + size_t n; + const char *ip = peek(d->reader, &n); + if (n == 0) + return false; + const unsigned char c = *(const unsigned char *)(ip); + skip(d->reader, 1); + *result |= (u32) (c & 0x7f) << shift; + if (c < 128) { + break; + } + shift += 7; + } + return true; +} + +static bool refill_tag(struct snappy_decompressor *d); + +/* + * Process the next item found in the input. + * Returns true if successful, false on error or end of input. + */ +static void decompress_all_tags(struct snappy_decompressor *d, + struct writer *writer) +{ + const char *ip = d->ip; + + for (;;) { + if (d->ip_limit - ip < 5) { + d->ip = ip; + if (!refill_tag(d)) + return; + ip = d->ip; + } + + const unsigned char c = *(const unsigned char *)(ip++); + + if ((c & 0x3) == LITERAL) { + u32 literal_length = c >> 2; + if (unlikely(literal_length >= 60)) { + /* Long literal */ + const u32 literal_ll = literal_length - 59; + literal_length = get_unaligned_le32(ip) & + wordmask[literal_ll]; + ip += literal_ll; + } + ++literal_length; + + u32 avail = d->ip_limit - ip; + while (avail < literal_length) { + if (!writer_append(writer, ip, avail, false)) + return; + literal_length -= avail; + skip(d->reader, d->peeked); + size_t n; + ip = peek(d->reader, &n); + avail = n; + d->peeked = avail; + if (avail == 0) + return; /* Premature end of input */ + d->ip_limit = ip + avail; + } + bool allow_fast_path = (avail >= 16); + if (!writer_append(writer, ip, literal_length, + allow_fast_path)) + return; + ip += literal_length; + } else { + const u32 entry = char_table[c]; + const u32 trailer = get_unaligned_le32(ip) & + wordmask[entry >> 11]; + const u32 length = entry & 0xff; + ip += entry >> 11; + + /* + * copy_offset/256 is encoded in bits 8..10. + * By just fetching those bits, we get + * copy_offset (since the bit-field starts at + * bit 8). + */ + const u32 copy_offset = entry & 0x700; + if (!writer_append_from_self(writer, + copy_offset + trailer, + length)) + return; + } + } +} + +static bool refill_tag(struct snappy_decompressor *d) +{ + const char *ip = d->ip; + + if (ip == d->ip_limit) { + size_t n; + /* Fetch a new fragment from the reader */ + skip(d->reader, d->peeked); /* All peeked bytes are used up */ + ip = peek(d->reader, &n); + d->peeked = n; + if (n == 0) { + d->eof = true; + return false; + } + d->ip_limit = ip + n; + } + + /* Read the tag character */ + DCHECK_LT(ip, d->ip_limit); + const unsigned char c = *(const unsigned char *)(ip); + const u32 entry = char_table[c]; + const u32 needed = (entry >> 11) + 1; /* +1 byte for ''c'' */ + DCHECK_LE(needed, sizeof(d->scratch)); + + /* Read more bytes from reader if needed */ + u32 nbuf = d->ip_limit - ip; + + if (nbuf < needed) { + /* + * Stitch together bytes from ip and reader to form the word + * contents. We store the needed bytes in "scratch". They + * will be consumed immediately by the caller since we do not + * read more than we need. + */ + memmove(d->scratch, ip, nbuf); + skip(d->reader, d->peeked); /* All peeked bytes are used up */ + d->peeked = 0; + while (nbuf < needed) { + size_t length; + const char *src = peek(d->reader, &length); + if (length == 0) + return false; + u32 to_add = min_t(u32, needed - nbuf, length); + memcpy(d->scratch + nbuf, src, to_add); + nbuf += to_add; + skip(d->reader, to_add); + } + DCHECK_EQ(nbuf, needed); + d->ip = d->scratch; + d->ip_limit = d->scratch + needed; + } else if (nbuf < 5) { + /* + * Have enough bytes, but move into scratch so that we do not + * read past end of input + */ + memmove(d->scratch, ip, nbuf); + skip(d->reader, d->peeked); /* All peeked bytes are used up */ + d->peeked = 0; + d->ip = d->scratch; + d->ip_limit = d->scratch + nbuf; + } else { + /* Pass pointer to buffer returned by reader. */ + d->ip = ip; + } + return true; +} + +static int internal_uncompress(struct source *r, + struct writer *writer, u32 max_len) +{ + struct snappy_decompressor decompressor; + u32 uncompressed_len = 0; + + init_snappy_decompressor(&decompressor, r); + + if (!read_uncompressed_length(&decompressor, &uncompressed_len)) + return -EIO; + /* Protect against possible DoS attack */ + if ((u64) (uncompressed_len) > max_len) + return -EIO; + + writer_set_expected_length(writer, uncompressed_len); + + /* Process the entire input */ + decompress_all_tags(&decompressor, writer); + + exit_snappy_decompressor(&decompressor); + return (decompressor.eof && writer_check_length(writer)) ? 0 : -EIO; +} + +static inline int compress(struct snappy_env *env, struct source *reader, + struct sink *writer) +{ + int err; + size_t written = 0; + int N = available(reader); + char ulength[kmax32]; + char *p = varint_encode32(ulength, N); + + append(writer, ulength, p - ulength); + written += (p - ulength); + + while (N > 0) { + /* Get next block to compress (without copying if possible) */ + size_t fragment_size; + const char *fragment = peek(reader, &fragment_size); + if (fragment_size == 0) { + err = -EIO; + goto out; + } + const int num_to_read = min_t(int, N, kblock_size); + size_t bytes_read = fragment_size; + + int pending_advance = 0; + if (bytes_read >= num_to_read) { + /* Buffer returned by reader is large enough */ + pending_advance = num_to_read; + fragment_size = num_to_read; + } +#ifdef SCATHER_GATHER + else { + memcpy(env->scratch, fragment, bytes_read); + skip(reader, bytes_read); + + while (bytes_read < num_to_read) { + fragment = peek(reader, &fragment_size); + size_t n + min_t(size_t, fragment_size, + num_to_read - bytes_read); + memcpy(env->scratch + bytes_read, fragment, n); + bytes_read += n; + skip(reader, n); + } + DCHECK_EQ(bytes_read, num_to_read); + fragment = env->scratch; + fragment_size = num_to_read; + } +#endif + if (fragment_size < num_to_read) + return -EIO; + + /* Get encoding table for compression */ + int table_size; + u16 *table = get_hash_table(env, num_to_read, &table_size); + + /* Compress input_fragment and append to dest */ + const int max_output + snappy_max_compressed_length(num_to_read); + + char *dest; + dest = sink_peek(writer, max_output); +#ifdef SCATHER_GATHER + if (!dest) { + /* + * Need a scratch buffer for the output, + * because the byte sink doesn''t have enough + * in one piece. + */ + dest = env->scratch_output; + } +#endif + char *end = compress_fragment(fragment, fragment_size, + dest, table, table_size); + append(writer, dest, end - dest); + written += (end - dest); + + N -= num_to_read; + skip(reader, pending_advance); + } + + err = 0; +out: + return err; +} + +/** + * snappy_compress - Compress a buffer using the snappy compressor. + * @env: Preallocated environment + * @input: Input buffer + * @input_length: Length of input_buffer + * @compressed: Output buffer for compressed data + * @compressed_length: The real length of the output written here. + * + * Return 0 on success, otherwise an negative error code. + * + * The output buffer must be at least + * snappy_max_compressed_length(input_length) bytes long. + * + * Requires a preallocated environment from snappy_init_env. + * The environment does not keep state over individual calls + * of this function, just preallocates the memory. + */ +int snappy_compress(struct snappy_env *env, + const char *input, + size_t input_length, + char *compressed, size_t *compressed_length) +{ + struct source reader = { + .ptr = input, + .left = input_length + }; + struct sink writer = { + .dest = compressed, + }; + int err = compress(env, &reader, &writer); + + /* Compute how many bytes were added */ + *compressed_length = (writer.dest - compressed); + return err; +} +EXPORT_SYMBOL(snappy_compress); + +/** + * snappy_uncompress - Uncompress a snappy compressed buffer + * @compressed: Input buffer with compressed data + * @n: length of compressed buffer + * @uncompressed: buffer for uncompressed data + * + * The uncompressed data buffer must be at least + * snappy_uncompressed_length(compressed) bytes long. + * + * Returns true when successfull, otherwise false. + */ +bool snappy_uncompress(const char *compressed, size_t n, char *uncompressed) +{ + struct source reader = { + .ptr = compressed, + .left = n + }; + struct writer output = { + .base = uncompressed, + .op = uncompressed + }; + return internal_uncompress(&reader, &output, 0xffffffff); +} +EXPORT_SYMBOL(snappy_uncompress); + +/** + * snappy_init_env - Allocate snappy compression environment + * @env: Environment to preallocate + * + * Returns 0 on success, otherwise negative errno. + * Must run in process context. + */ +int snappy_init_env(struct snappy_env *env) +{ + env->hash_table = vmalloc(sizeof(u16) * kmax_hash_table_size); + if (!env->hash_table) + goto error; +#ifdef SCATHER_GATHER + env->scratch = vmalloc(kblock_size); + env->scratch_output + vmalloc(snappy_max_compressed_length(kblock_size)); + if (!env->scratch || !env->scratch_output) + goto error; +#endif + return 0; +error: + snappy_free_env(env); + return -ENOMEM; +} +EXPORT_SYMBOL(snappy_init_env); + +/** + * snappy_free_env - Free an snappy compression environment + * @env: Environment to free. + * + * Must run in process context. + */ +void snappy_free_env(struct snappy_env *env) +{ + vfree(env->hash_table); +#ifdef SCATHER_GATHER + vfree(env->scratch); + vfree(env->scratch_output); +#endif + memset(env, 0, sizeof(struct snappy_env)); +} +EXPORT_SYMBOL(snappy_free_env); -- 1.7.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Andi Kleen <ak@linux.intel.com> Add support in btrfs for snappy compression. This is generally a faster compressor than LZO (except for higher memory usage) with comparable compression and should imho replace it. This is based on the lzo code with minor modifications. This has been tested with various test workloads and also in a real distribution. One challenge with the benchmarks is that a lot of of benchmarks only read/write zeroes, which are a uncommon good case for IO compression. However even with modified workloads that use different patterns gains are seen. In general snappy performs better with a 64bit CPU, but also generally outperforms LZO on a 32bit Atom system on IO workloads. The disk space usage is similar to LZO (within 0.2%) On a distribution boot test on a netboot snappy is about 11.5% faster than LZO or 15.5% faster than an uncompressed btrfs. Open: implement scatter-gather support and get rid of the temporary buffers. This would only work in some special cases, but should improve performance further by eliminating copies. Thanks to Jacob Sowles for extensive testing and benchmarking. v2: Some cleanups, port to latest tree. Signed-off-by: Andi Kleen <ak@linux.intel.com> --- fs/btrfs/Kconfig | 1 + fs/btrfs/Makefile | 2 +- fs/btrfs/compression.c | 1 + fs/btrfs/compression.h | 1 + fs/btrfs/ctree.h | 9 +- fs/btrfs/disk-io.c | 2 + fs/btrfs/ioctl.c | 4 + fs/btrfs/snappy.c | 432 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/super.c | 9 +- 9 files changed, 456 insertions(+), 5 deletions(-) create mode 100644 fs/btrfs/snappy.c diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig index ecb9fd3..d55df9c 100644 --- a/fs/btrfs/Kconfig +++ b/fs/btrfs/Kconfig @@ -6,6 +6,7 @@ config BTRFS_FS select ZLIB_DEFLATE select LZO_COMPRESS select LZO_DECOMPRESS + select SNAPPY help Btrfs is a new filesystem with extents, writable snapshotting, support for multiple devices and many more features. diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index c0ddfd2..b93ab52 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -8,6 +8,6 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \ export.o tree-log.o free-space-cache.o zlib.o lzo.o \ compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \ - reada.o backref.o + reada.o backref.o snappy.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 14f1c5a..f85f7fd 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -730,6 +730,7 @@ static wait_queue_head_t comp_workspace_wait[BTRFS_COMPRESS_TYPES]; struct btrfs_compress_op *btrfs_compress_op[] = { &btrfs_zlib_compress, &btrfs_lzo_compress, + &btrfs_snappy_compress, }; int __init btrfs_init_compress(void) diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h index a12059f..971a425 100644 --- a/fs/btrfs/compression.h +++ b/fs/btrfs/compression.h @@ -79,5 +79,6 @@ struct btrfs_compress_op { extern struct btrfs_compress_op btrfs_zlib_compress; extern struct btrfs_compress_op btrfs_lzo_compress; +extern struct btrfs_compress_op btrfs_snappy_compress; #endif diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 6738503..a06a642 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -458,6 +458,7 @@ struct btrfs_super_block { #define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL (1ULL << 1) #define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS (1ULL << 2) #define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO (1ULL << 3) +#define BTRFS_FEATURE_INCOMPAT_COMPRESS_SNAPPY (1ULL << 4) #define BTRFS_FEATURE_COMPAT_SUPP 0ULL #define BTRFS_FEATURE_COMPAT_RO_SUPP 0ULL @@ -465,7 +466,8 @@ struct btrfs_super_block { (BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF | \ BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL | \ BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS | \ - BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO) + BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO | \ + BTRFS_FEATURE_INCOMPAT_COMPRESS_SNAPPY) /* * A leaf is full of items. offset and size tell us where to find @@ -621,8 +623,9 @@ enum btrfs_compression_type { BTRFS_COMPRESS_NONE = 0, BTRFS_COMPRESS_ZLIB = 1, BTRFS_COMPRESS_LZO = 2, - BTRFS_COMPRESS_TYPES = 2, - BTRFS_COMPRESS_LAST = 3, + BTRFS_COMPRESS_SNAPPY = 3, + BTRFS_COMPRESS_TYPES = 3, + BTRFS_COMPRESS_LAST = 4, }; struct btrfs_inode_item { diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index f99a099..925dd0a 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2113,6 +2113,8 @@ struct btrfs_root *open_ctree(struct super_block *sb, features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF; if (tree_root->fs_info->compress_type & BTRFS_COMPRESS_LZO) features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO; + if (tree_root->fs_info->compress_type & BTRFS_COMPRESS_SNAPPY) + features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_SNAPPY; btrfs_set_super_incompat_flags(disk_super, features); features = btrfs_super_compat_ro_flags(disk_super) & diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 5441ff1..2cc2b76 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1173,6 +1173,10 @@ int btrfs_defrag_file(struct inode *inode, struct file *file, features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO; btrfs_set_super_incompat_flags(disk_super, features); } + if (range->compress_type == BTRFS_COMPRESS_SNAPPY) { + features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_SNAPPY; + btrfs_set_super_incompat_flags(disk_super, features); + } ret = defrag_count; diff --git a/fs/btrfs/snappy.c b/fs/btrfs/snappy.c new file mode 100644 index 0000000..d6bd607 --- /dev/null +++ b/fs/btrfs/snappy.c @@ -0,0 +1,432 @@ +/* + * Copyright (C) 2011 Intel Corporation + * Author: Andi Kleen + * Copyright (C) 2008 Oracle + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place - Suite 330, + * Boston, MA 021110-1307, USA. + */ + +/* largely copy''n''pasted from lzo.c Unify? */ + +/* XXX: could use snappy''s fragments to avoid the working buffer? */ + +#include <linux/kernel.h> +#include <linux/slab.h> +#include <linux/vmalloc.h> +#include <linux/init.h> +#include <linux/err.h> +#include <linux/sched.h> +#include <linux/pagemap.h> +#include <linux/bio.h> +#include <linux/snappy.h> +#include "compression.h" + +#define SNAPPY_LEN 4 + +struct workspace { + void *buf; /* where compressed data goes */ + void *cbuf; /* where decompressed data goes */ + struct snappy_env env; + struct list_head list; +}; + +static void snappy_free_workspace(struct list_head *ws) +{ + struct workspace *workspace = list_entry(ws, struct workspace, list); + + snappy_free_env(&workspace->env); + vfree(workspace->buf); + vfree(workspace->cbuf); + kfree(workspace); +} + +static struct list_head *snappy_alloc_workspace(void) +{ + struct workspace *workspace; + + workspace = kzalloc(sizeof(*workspace), GFP_NOFS); + if (!workspace) + return ERR_PTR(-ENOMEM); + + workspace->buf = vmalloc(PAGE_CACHE_SIZE); + workspace->cbuf = vmalloc(snappy_max_compressed_length(PAGE_CACHE_SIZE)); + if (!workspace->buf || !workspace->cbuf) + goto fail; + + if (snappy_init_env(&workspace->env) < 0) + goto fail; + + INIT_LIST_HEAD(&workspace->list); + + return &workspace->list; +fail: + snappy_free_workspace(&workspace->list); + return ERR_PTR(-ENOMEM); +} + +static inline void write_compress_length(char *buf, size_t len) +{ + __le32 dlen; + + dlen = cpu_to_le32(len); + memcpy(buf, &dlen, SNAPPY_LEN); +} + +static inline size_t read_compress_length(char *buf) +{ + __le32 dlen; + + memcpy(&dlen, buf, SNAPPY_LEN); + return le32_to_cpu(dlen); +} + +static int snappy_compress_pages(struct list_head *ws, + struct address_space *mapping, + u64 start, unsigned long len, + struct page **pages, + unsigned long nr_dest_pages, + unsigned long *out_pages, + unsigned long *total_in, + unsigned long *total_out, + unsigned long max_out) +{ + struct workspace *workspace = list_entry(ws, struct workspace, list); + int ret = 0; + char *data_in; + char *cpage_out; + int nr_pages = 0; + struct page *in_page = NULL; + struct page *out_page = NULL; + unsigned long bytes_left; + + size_t in_len; + size_t out_len; + char *buf; + unsigned long tot_in = 0; + unsigned long tot_out = 0; + unsigned long pg_bytes_left; + unsigned long out_offset; + unsigned long bytes; + + *out_pages = 0; + *total_out = 0; + *total_in = 0; + + in_page = find_get_page(mapping, start >> PAGE_CACHE_SHIFT); + data_in = kmap(in_page); + + /* + * store the size of all chunks of compressed data in + * the first 4 bytes + */ + out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM); + if (out_page == NULL) { + ret = -ENOMEM; + goto out; + } + cpage_out = kmap(out_page); + out_offset = SNAPPY_LEN; + tot_out = SNAPPY_LEN; + pages[0] = out_page; + nr_pages = 1; + pg_bytes_left = PAGE_CACHE_SIZE - SNAPPY_LEN; + + /* compress at most one page of data each time */ + in_len = min(len, PAGE_CACHE_SIZE); + while (tot_in < len) { + ret = snappy_compress(&workspace->env, + data_in, in_len, workspace->cbuf, + &out_len); + if (ret != 0) { + printk(KERN_DEBUG "btrfs uncompression in loop returned %d\n", + ret); + ret = -1; + goto out; + } + + /* store the size of this chunk of compressed data */ + write_compress_length(cpage_out + out_offset, out_len); + tot_out += SNAPPY_LEN; + out_offset += SNAPPY_LEN; + pg_bytes_left -= SNAPPY_LEN; + + tot_in += in_len; + tot_out += out_len; + + /* copy bytes from the working buffer into the pages */ + buf = workspace->cbuf; + while (out_len) { + bytes = min_t(unsigned long, pg_bytes_left, out_len); + + memcpy(cpage_out + out_offset, buf, bytes); + + out_len -= bytes; + pg_bytes_left -= bytes; + buf += bytes; + out_offset += bytes; + + /* + * we need another page for writing out. + * + * Note if there''s less than 4 bytes left, we just + * skip to a new page. + */ + if ((out_len == 0 && pg_bytes_left < SNAPPY_LEN) || + pg_bytes_left == 0) { + if (pg_bytes_left) { + memset(cpage_out + out_offset, 0, + pg_bytes_left); + tot_out += pg_bytes_left; + } + + /* we''re done, don''t allocate new page */ + if (out_len == 0 && tot_in >= len) + break; + + kunmap(out_page); + if (nr_pages == nr_dest_pages) { + out_page = NULL; + ret = -1; + goto out; + } + + out_page = alloc_page(GFP_NOFS | __GFP_HIGHMEM); + if (out_page == NULL) { + ret = -ENOMEM; + goto out; + } + cpage_out = kmap(out_page); + pages[nr_pages++] = out_page; + + pg_bytes_left = PAGE_CACHE_SIZE; + out_offset = 0; + } + } + + /* we''re making it bigger, give up */ + if (tot_in > 8192 && tot_in < tot_out) + goto out; + + /* we''re all done */ + if (tot_in >= len) + break; + + if (tot_out > max_out) + break; + + bytes_left = len - tot_in; + kunmap(in_page); + page_cache_release(in_page); + + start += PAGE_CACHE_SIZE; + in_page = find_get_page(mapping, start >> PAGE_CACHE_SHIFT); + data_in = kmap(in_page); + in_len = min(bytes_left, PAGE_CACHE_SIZE); + } + + if (tot_out > tot_in) + goto out; + + /* store the size of all chunks of compressed data */ + cpage_out = kmap(pages[0]); + write_compress_length(cpage_out, tot_out); + + kunmap(pages[0]); + + ret = 0; + *total_out = tot_out; + *total_in = tot_in; +out: + *out_pages = nr_pages; + if (out_page) + kunmap(out_page); + + if (in_page) { + kunmap(in_page); + page_cache_release(in_page); + } + + return ret; +} + +static int snappy_decompress_biovec(struct list_head *ws, + struct page **pages_in, + u64 disk_start, + struct bio_vec *bvec, + int vcnt, + size_t srclen) +{ + struct workspace *workspace = list_entry(ws, struct workspace, list); + int ret = 0, ret2; + char *data_in; + unsigned long page_in_index = 0; + unsigned long page_out_index = 0; + unsigned long total_pages_in = (srclen + PAGE_CACHE_SIZE - 1) / + PAGE_CACHE_SIZE; + unsigned long buf_start; + unsigned long buf_offset = 0; + unsigned long bytes; + unsigned long working_bytes; + unsigned long pg_offset; + + size_t in_len; + size_t out_len; + unsigned long in_offset; + unsigned long in_page_bytes_left; + unsigned long tot_in; + unsigned long tot_out; + unsigned long tot_len; + char *buf; + bool may_late_unmap, need_unmap; + + data_in = kmap(pages_in[0]); + tot_len = read_compress_length(data_in); + + tot_in = SNAPPY_LEN; + in_offset = SNAPPY_LEN; + tot_len = min_t(size_t, srclen, tot_len); + in_page_bytes_left = PAGE_CACHE_SIZE - SNAPPY_LEN; + + tot_out = 0; + pg_offset = 0; + + while (tot_in < tot_len) { + in_len = read_compress_length(data_in + in_offset); + in_page_bytes_left -= SNAPPY_LEN; + in_offset += SNAPPY_LEN; + tot_in += SNAPPY_LEN; + + tot_in += in_len; + working_bytes = in_len; + may_late_unmap = need_unmap = false; + + /* fast path: avoid using the working buffer */ + if (in_page_bytes_left >= in_len) { + buf = data_in + in_offset; + bytes = in_len; + may_late_unmap = true; + goto cont; + } + + /* copy bytes from the pages into the working buffer */ + buf = workspace->cbuf; + buf_offset = 0; + while (working_bytes) { + bytes = min(working_bytes, in_page_bytes_left); + + memcpy(buf + buf_offset, data_in + in_offset, bytes); + buf_offset += bytes; +cont: + working_bytes -= bytes; + in_page_bytes_left -= bytes; + in_offset += bytes; + + /* check if we need to pick another page */ + if ((working_bytes == 0 && in_page_bytes_left < SNAPPY_LEN) + || in_page_bytes_left == 0) { + tot_in += in_page_bytes_left; + + if (working_bytes == 0 && tot_in >= tot_len) + break; + + if (page_in_index + 1 >= total_pages_in) { + ret = -1; + goto done; + } + + if (may_late_unmap) + need_unmap = true; + else + kunmap(pages_in[page_in_index]); + + data_in = kmap(pages_in[++page_in_index]); + + in_page_bytes_left = PAGE_CACHE_SIZE; + in_offset = 0; + } + } + + ret = -1; + if (snappy_uncompressed_length(buf, in_len, &out_len)) + ret = snappy_uncompress(buf, in_len, workspace->buf); + if (need_unmap) + kunmap(pages_in[page_in_index - 1]); + if (ret != 0) { + printk(KERN_WARNING "btrfs decompress failed\n"); + ret = -1; + break; + } + + buf_start = tot_out; + tot_out += out_len; + + ret2 = btrfs_decompress_buf2page(workspace->buf, buf_start, + tot_out, disk_start, + bvec, vcnt, + &page_out_index, &pg_offset); + if (ret2 == 0) + break; + } +done: + kunmap(pages_in[page_in_index]); + return ret; +} + +static int btrfs_snappy_decompress(struct list_head *ws, unsigned char *data_in, + struct page *dest_page, + unsigned long start_byte, + size_t srclen, size_t destlen) +{ + struct workspace *workspace = list_entry(ws, struct workspace, list); + size_t in_len; + size_t out_len; + int ret = 0; + char *kaddr; + unsigned long bytes; + + BUG_ON(srclen < SNAPPY_LEN); + + in_len = read_compress_length(data_in); + data_in += SNAPPY_LEN; + + ret = -1; + if (snappy_uncompressed_length(data_in, in_len, &out_len)) + ret = snappy_uncompress(data_in, in_len, workspace->buf); + if (ret != 0) { + printk(KERN_WARNING "btrfs decompress failed!\n"); + ret = -1; + goto out; + } + if (out_len < start_byte) { + ret = -1; + goto out; + } + + bytes = min_t(unsigned long, destlen, out_len - start_byte); + + kaddr = kmap_atomic(dest_page, KM_USER0); + memcpy(kaddr, workspace->buf + start_byte, bytes); + kunmap_atomic(kaddr, KM_USER0); +out: + return ret; +} + +struct btrfs_compress_op btrfs_snappy_compress = { + .alloc_workspace = snappy_alloc_workspace, + .free_workspace = snappy_free_workspace, + .compress_pages = snappy_compress_pages, + .decompress_biovec = snappy_decompress_biovec, + .decompress = btrfs_snappy_decompress, +}; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index ae488aa..3240473 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -276,6 +276,9 @@ int btrfs_parse_options(struct btrfs_root *root, char *options) } else if (strcmp(args[0].from, "lzo") == 0) { compress_type = "lzo"; info->compress_type = BTRFS_COMPRESS_LZO; + } else if (strcmp(args[0].from, "snappy") == 0) { + compress_type = "snappy"; + info->compress_type = BTRFS_COMPRESS_SNAPPY; } else { ret = -EINVAL; goto out; @@ -687,8 +690,12 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry) if (btrfs_test_opt(root, COMPRESS)) { if (info->compress_type == BTRFS_COMPRESS_ZLIB) compress_type = "zlib"; - else + else if (info->compress_type == BTRFS_COMPRESS_SNAPPY) + compress_type = "snappy"; + else if (info->compress_type == BTRFS_COMPRESS_LZO) compress_type = "lzo"; + else + compress_type = "?"; if (btrfs_test_opt(root, FORCE_COMPRESS)) seq_printf(seq, ",compress-force=%s", compress_type); else -- 1.7.7.4
From: Andi Kleen <ak@linux.intel.com> Mainly so that ubifs can use it. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andi Kleen <ak@linux.intel.com> --- crypto/Kconfig | 9 +++++ crypto/Makefile | 1 + crypto/snappy.c | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 109 insertions(+), 0 deletions(-) create mode 100644 crypto/snappy.c diff --git a/crypto/Kconfig b/crypto/Kconfig index e6cfe1a..768943dc 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -924,6 +924,15 @@ config CRYPTO_LZO help This is the LZO algorithm. +config CRYPTO_SNAPPY + tristate "Snappy compression algorithm" + select CRYPTO_ALGAPI + select SNAPPY + help + snappy is a faster alternative to the lzo compression algorithm + with comparable compression. It is very fast on 64bit systems, but also + good on 32bit systems. It especially excels at already compressed data. + comment "Random Number Generation" config CRYPTO_ANSI_CPRNG diff --git a/crypto/Makefile b/crypto/Makefile index f638063..8a0e6c6 100644 --- a/crypto/Makefile +++ b/crypto/Makefile @@ -82,6 +82,7 @@ obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o obj-$(CONFIG_CRYPTO_CRC32C) += crc32c.o obj-$(CONFIG_CRYPTO_AUTHENC) += authenc.o authencesn.o obj-$(CONFIG_CRYPTO_LZO) += lzo.o +obj-$(CONFIG_CRYPTO_SNAPPY) += snappy.o obj-$(CONFIG_CRYPTO_RNG2) += rng.o obj-$(CONFIG_CRYPTO_RNG2) += krng.o obj-$(CONFIG_CRYPTO_ANSI_CPRNG) += ansi_cprng.o diff --git a/crypto/snappy.c b/crypto/snappy.c new file mode 100644 index 0000000..43a64b9 --- /dev/null +++ b/crypto/snappy.c @@ -0,0 +1,99 @@ +/* + * Cryptographic API. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 51 + * Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + */ + +#include <linux/init.h> +#include <linux/module.h> +#include <linux/crypto.h> +#include <linux/vmalloc.h> +#include <linux/snappy.h> + +struct snappy_ctx { + struct snappy_env env; +}; + +/* Only needed for compression actually */ +static int snappy_init(struct crypto_tfm *tfm) +{ + struct snappy_ctx *ctx = crypto_tfm_ctx(tfm); + + return snappy_init_env(&ctx->env); +} + +static void snappy_exit(struct crypto_tfm *tfm) +{ + struct snappy_ctx *ctx = crypto_tfm_ctx(tfm); + + snappy_free_env(&ctx->env); +} + +static int snp_compress(struct crypto_tfm *tfm, const u8 *src, + unsigned int slen, u8 *dst, unsigned int *dlen) +{ + struct snappy_ctx *ctx = crypto_tfm_ctx(tfm); + size_t olen; + int err; + + /* XXXX very pessimistic. check in snappy? */ + if (*dlen < snappy_max_compressed_length(*dlen)) + return -EINVAL; + err = snappy_compress(&ctx->env, src, slen, dst, &olen); + *dlen = olen; + return err; +} + +static int snp_decompress(struct crypto_tfm *tfm, const u8 *src, + unsigned int slen, u8 *dst, unsigned int *dlen) +{ + size_t ulen; + + if (!snappy_uncompressed_length(src, slen, &ulen)) + return -EIO; + if (*dlen < ulen) + return -EINVAL; + *dlen = ulen; + return snappy_uncompress(src, slen, dst) ? 0 : -EIO; +} + +static struct crypto_alg alg = { + .cra_name = "snappy", + .cra_flags = CRYPTO_ALG_TYPE_COMPRESS, + .cra_ctxsize = sizeof(struct snappy_ctx), + .cra_module = THIS_MODULE, + .cra_list = LIST_HEAD_INIT(alg.cra_list), + .cra_init = snappy_init, + .cra_exit = snappy_exit, + .cra_u = { .compress = { + .coa_compress = snp_compress, + .coa_decompress = snp_decompress } } +}; + +static int __init snappy_mod_init(void) +{ + return crypto_register_alg(&alg); +} + +static void __exit snappy_mod_fini(void) +{ + crypto_unregister_alg(&alg); +} + +module_init(snappy_mod_init); +module_exit(snappy_mod_fini); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("Snappy Compression Algorithm"); -- 1.7.7.4
Jaromir Zdrazil
2012-Jan-13 02:40 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
Hi. Thank you for the job. Could you please share the output of the testing and give us more info about the crypto? Thank you! Jaromir> ------------ Původní zpráva ------------ > Od: Andi Kleen <andi@firstfloor.org> > Předmět: Updated btrfs/crypto snappy interface ready for merging > Datum: 13.1.2012 01:30:21 > ---------------------------------------- > Here''s a slightly updated version of the BTRFS snappy interface. > snappy is a faster compression algorithm that provides similar > compression as LZO, but generally better performance. > > This version has: > > - Ported to latest tree > - Extensive testing and benchmarking (in its 3.0 based incarnation) > - A few cleanups > - Adds a minor performance improvement > > Should be ready for pulling into the btrfs tree now. > The crypto parts have been acked by Herbert. > > The following changes since commit 2485a4b610171f4e1c4ab0d053569747795c1bbe: > > Merge branch ''for-linus'' of > git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input (2012-01-12 12:40:41 > -0800) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc snappy > > Andi Kleen (3): > Add the snappy-c compressor to lib v2 > BTRFS: Add snappy support v2 > Add snappy interface to crypto API > > crypto/Kconfig | 9 + > crypto/Makefile | 1 + > crypto/snappy.c | 99 ++++ > fs/btrfs/Kconfig | 1 + > fs/btrfs/Makefile | 2 +- > fs/btrfs/compression.c | 1 + > fs/btrfs/compression.h | 1 + > fs/btrfs/ctree.h | 9 +- > fs/btrfs/disk-io.c | 2 + > fs/btrfs/ioctl.c | 4 + > fs/btrfs/snappy.c | 432 ++++++++++++++++ > fs/btrfs/super.c | 9 +- > include/linux/snappy.h | 26 + > lib/Kconfig | 6 + > lib/Makefile | 4 + > lib/snappy.c | 1300 ++++++++++++++++++++++++++++++++++++++++++++++++ > 16 files changed, 1901 insertions(+), 5 deletions(-) > create mode 100644 crypto/snappy.c > create mode 100644 fs/btrfs/snappy.c > create mode 100644 include/linux/snappy.h > create mode 100644 lib/snappy.c > > -Andi > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > >Jaromir Zdrazil jaromir.zdrazil@email.cz -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2012-Jan-16 13:54 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
On Thu, Jan 12, 2012 at 04:28:47PM -0800, Andi Kleen wrote:> Here''s a slightly updated version of the BTRFS snappy interface. > snappy is a faster compression algorithm that provides similar > compression as LZO, but generally better performance.Thanks Andi, I''ve queued this up to a btrfs snappy branch. I put the commits against 3.2 instead of 3.3 though so people could try it out. The only problem I''ve hit is that snappy is failing xfstests number 135. It''s reproducible on an empty FS. Could you please take a look? -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Zefan
2012-Jan-17 08:44 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
21:54, Chris Mason wrote:> On Thu, Jan 12, 2012 at 04:28:47PM -0800, Andi Kleen wrote: >> Here''s a slightly updated version of the BTRFS snappy interface. >> snappy is a faster compression algorithm that provides similar >> compression as LZO, but generally better performance. > > Thanks Andi, I''ve queued this up to a btrfs snappy branch. I put the > commits against 3.2 instead of 3.3 though so people could try it out. > > The only problem I''ve hit is that snappy is failing xfstests number 135. > It''s reproducible on an empty FS. Could you please take a look? >It''s because decompressing inline extents always fails. I''ve fixed it and will send the patch out in a new mail thread. But seems there''s bug in lib snappy code, which makes the decompressed data doesn''t quite match the original data. Simply copy a file to a btrfs filesystem with snappy enabled, and clear page cache, and check the file: * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public - * License as published by the Free Software Foundation; - * version 2.1 of the License (not later!) + * License as published by the Free ftwaware Foundation; + * version 2.1 of the censnse (not later!) * - * This program is distributed in the hope that it will be useful, + * is program i is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU Lesser General Public License for more details. + * GNU Lesser General Publicicenense for more details. * * You should have received a copy of the GNU Lesser General Public - * License along with this program; if not, write to the Free Software + * License along withhis progogram; if not, write to theree e Software * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA * * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Andi Kleen
2012-Jan-17 08:46 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
> It''s because decompressing inline extents always fails. I''ve fixed it > and will send the patch out in a new mail thread.Thanks for fixing.> > But seems there''s bug in lib snappy code, which makes the decompressed > data doesn''t quite match the original data. > > Simply copy a file to a btrfs filesystem with snappy enabled, and clear > page cache, and check the file:Hmm weird, I have never seen this. Do you have a reproducer? The basic compression code is quite well tested, I have a reasonable unit test. -Andi
Li Zefan
2012-Jan-17 08:56 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
Andi Kleen wrote:>> It''s because decompressing inline extents always fails. I''ve fixed it >> and will send the patch out in a new mail thread. > > Thanks for fixing. > >> >> But seems there''s bug in lib snappy code, which makes the decompressed >> data doesn''t quite match the original data. >> >> Simply copy a file to a btrfs filesystem with snappy enabled, and clear >> page cache, and check the file: > > Hmm weird, I have never seen this. Do you have a reproducer? >Just use some randomly chosen text files. For example: # mount -t btrfs -o compress=snappy /dev/xxx /mnt # cp -r btrfs-progs-unstable/ /mnt # sync # echo 3 > /proc/sys/vm/drop_caches # diff -Nurp btrfs-progs-unstable /mnt/btrfs-progs-unstable I''ve tested on both x86_32 and x86_64.> The basic compression code is quite well tested, I have a reasonable > unit test. >
David Sterba
2012-Jan-17 09:07 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
On Thu, Jan 12, 2012 at 04:28:47PM -0800, Andi Kleen wrote:> Here''s a slightly updated version of the BTRFS snappy interface. > snappy is a faster compression algorithm that provides similar > compression as LZO, but generally better performance.Recently the LZ4 method showed up on the real-time compression scene http://fastcompression.blogspot.com/p/lz4.html (homepage) http://code.google.com/p/lz4/ (source repo) it has comparable performance and compression ratio to snappy. Quoting from the source repo main page: Name Ratio C.speed D.speed LZ4 (r41) 2.08 319 1070 LZO 2.05 1x_1 2.07 318 466 Snappy 1.0.4 2.02 242 683 My own benchmarking confirms that lz4 is has a bit faster decompression, but what is a big difference from snappy is it''s memory consumption for compression: * 32kb for unbounded chunksize * 16kb for chunksize < 64k (a tuned compression) No additional memory is needed for decompression in both snappy and lz4. With a minor tweak and chunksize < 4G the context size could be reduced to 16k as well. There is also LZ4HC, "high compression" mode, which maintains same binary format, but the compression ratio is better. http://code.google.com/p/lz4hc/ Usecase: There could be the fast version used transparently and the -hc version could be allowed for the ''fi defrag'' command in order to recompress selected files. LZ4 is written in C and it''s BSD, LZ4HC is L-GPL. There is some space for improvements in the code, but as it''s good already as it stands now. david -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Li Zefan
2012-Jan-17 09:27 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
Andi Kleen wrote:>> It''s because decompressing inline extents always fails. I''ve fixed it >> and will send the patch out in a new mail thread. > > Thanks for fixing. > >> >> But seems there''s bug in lib snappy code, which makes the decompressed >> data doesn''t quite match the original data. >> >> Simply copy a file to a btrfs filesystem with snappy enabled, and clear >> page cache, and check the file: > > Hmm weird, I have never seen this. Do you have a reproducer? > > The basic compression code is quite well tested, I have a reasonable > unit test. >At first I saved emails and patched them in linux-btrfs git tree, and I got weired result. Then I pulled the snappy branch directly, and now nothing is wrong.. Sorry for the noise.
Andi Kleen
2012-Jan-17 09:51 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
> At first I saved emails and patched them in linux-btrfs git tree, and I > got weired result. Then I pulled the snappy branch directly, and now > nothing is wrong.. Sorry for the noise.np. Thanks for testing. -Andi -- ak@linux.intel.com -- Speaking for myself only.
evergreen
2012-Jan-18 15:02 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
> > it has comparable performance and compression ratio to snappy. Quoting > from the source repo main page: > > Name Ratio C.speed D.speed > LZ4 (r41) 2.08 319 1070 > LZO 2.05 1x_1 2.07 318 466 > Snappy 1.0.4 2.02 242 683 >Although there is LessFS''s Mark Ruijter : http://www.lessfs.com/wordpress/?p=684&cpage=1#comment-3114 "On average the speeds appear to be 48% faster then snappy." Sounds promising... -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Markus F.X.J. Oberhumer
2012-Jan-18 15:05 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
On 2012-01-13 01:28, Andi Kleen wrote:> Here''s a slightly updated version of the BTRFS snappy interface. > snappy is a faster compression algorithm that provides similar > compression as LZO, but generally better performance.I''d like to note that the LZO version in the current Linux kernel is rather outdated - it seems to be based on the 2005 release. In fact the latest version LZO 2.06 does compress both slightly faster and better than snappy 1.0.4 when benchmarking the Calgary and Silesia compression corpus (tested with gcc 4.6 on Nehalem & Sandy Bridge). Furthermore please be aware that from a pure compression point of view snappy et al. are very close cousins of LZO (strictly byte-aligned LZ77) that mainly differ in implementation issues like using a table to number of branches - and indeed similar optimizations could be applied to any version. I''m not sure if there is an official kernel maintainer of LZO, but I''d offer to assist you updating to the latest version and eliminating any possible performance issues. Cheers, Markus -- Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/
Markus F.X.J. Oberhumer
2012-Jan-23 16:19 UTC
ANN: linux-kernel-lzo-2.06.20120123 - update LZO to v2.06
Hi, I''ve prepared a small package that updates the LZO version in the Linux kernel to LZO v2.06. Please get it from: http://www.oberhumer.com/opensource/lzo/download/Testing/linux-kernel-lzo-2.06.20120123.tar.gz As stated in the README, its main purpose is to allow easy benchmarking of the latest LZO versions - these do feature some nice speed improvements, and while I have done a lot of synthetic benchmarking I''m really very curious and appreciate feedback on "real-world" performance numbers like usage in btrfs and zram. Share and enjoy, Markus http://www.oberhumer.com/opensource/lzo/ On 2012-01-18 16:05, Markus F.X.J. Oberhumer wrote:> On 2012-01-13 01:28, Andi Kleen wrote: >> Here''s a slightly updated version of the BTRFS snappy interface. >> snappy is a faster compression algorithm that provides similar >> compression as LZO, but generally better performance. > > I''d like to note that the LZO version in the current Linux kernel is > rather outdated - it seems to be based on the 2005 release. > > In fact the latest version LZO 2.06 does compress both slightly faster and > better than snappy 1.0.4 when benchmarking the Calgary and Silesia > compression corpus (tested with gcc 4.6 on Nehalem & Sandy Bridge). > > Furthermore please be aware that from a pure compression point of view > snappy et al. are very close cousins of LZO (strictly byte-aligned LZ77) > that mainly differ in implementation issues like using a table to > number of branches - and indeed similar optimizations could be applied > to any version. > > I''m not sure if there is an official kernel maintainer of LZO, but I''d > offer to assist you updating to the latest version and eliminating > any possible performance issues.-- Markus Oberhumer, <markus@oberhumer.com>, http://www.oberhumer.com/ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Nitin Gupta
2012-Jan-23 19:12 UTC
Re: ANN: linux-kernel-lzo-2.06.20120123 - update LZO to v2.06
Hi Markus, Thanks for the patches! On 01/23/2012 11:19 AM, Markus F.X.J. Oberhumer wrote:> Hi, > > I''ve prepared a small package that updates the LZO version in the Linux > kernel to LZO v2.06. > > Please get it from: > > http://www.oberhumer.com/opensource/lzo/download/Testing/linux-kernel-lzo-2.06.20120123.tar.gz > > As stated in the README, its main purpose is to allow easy benchmarking of the > latest LZO versions - these do feature some nice speed improvements, and while > I have done a lot of synthetic benchmarking I''m really very curious and > appreciate feedback on "real-world" performance numbers like usage in > btrfs and zram. >I will soon integrate them with zram and get some performance numbers. Thanks, Nitin> > http://www.oberhumer.com/opensource/lzo/ > > On 2012-01-18 16:05, Markus F.X.J. Oberhumer wrote: >> On 2012-01-13 01:28, Andi Kleen wrote: >>> Here''s a slightly updated version of the BTRFS snappy interface. >>> snappy is a faster compression algorithm that provides similar >>> compression as LZO, but generally better performance. >> >> I''d like to note that the LZO version in the current Linux kernel is >> rather outdated - it seems to be based on the 2005 release. >> >> In fact the latest version LZO 2.06 does compress both slightly faster and >> better than snappy 1.0.4 when benchmarking the Calgary and Silesia >> compression corpus (tested with gcc 4.6 on Nehalem & Sandy Bridge). >> >> Furthermore please be aware that from a pure compression point of view >> snappy et al. are very close cousins of LZO (strictly byte-aligned LZ77) >> that mainly differ in implementation issues like using a table to >> number of branches - and indeed similar optimizations could be applied >> to any version. >> >> I''m not sure if there is an official kernel maintainer of LZO, but I''d >> offer to assist you updating to the latest version and eliminating >> any possible performance issues. >
Hugo Chevrain
2012-Jan-24 17:03 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
> evergreen writes: > > > > Although there is LessFS''s Mark Ruijter : > http://www.lessfs.com/wordpress/?p=684&cpage=1#comment-3114 > > "On average the speeds appear to be 48% faster then snappy." > Sounds promising... > > --Well, it seems that the performance is really good. Apparently, LZ4 has displaced Snappy as speed champion for LessFS. http://www.lessfs.com/wordpress/?p=688 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Andi Kleen
2012-Jan-25 01:36 UTC
Re: ANN: linux-kernel-lzo-2.06.20120123 - update LZO to v2.06
On Mon, Jan 23, 2012 at 05:19:40PM +0100, Markus F.X.J. Oberhumer wrote:> Hi, > > I''ve prepared a small package that updates the LZO version in the Linux > kernel to LZO v2.06.I ran benchmarks on the new miniLZO and LZ4 on 64bit. LZ4 is generally slower than snappy/lzo in the micro benchmarks. The new LZO is better than the old one, but still loses to snappy most of the time (but often by very small amounts only) Will be worth checking the new LZO will the full distribution boot test. I agree it''s definitely a good idea to update the kernel version. However I must say it would be a major project to bring it up to kernel coding standards. snappy is still interesting, but much less so than it was before. -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba
2012-Jan-25 11:30 UTC
Re: Updated btrfs/crypto snappy interface ready for merging
On Tue, Jan 24, 2012 at 05:03:16PM +0000, Hugo Chevrain wrote:> > evergreen writes: > > > > > > > Although there is LessFS''s Mark Ruijter : > > http://www.lessfs.com/wordpress/?p=684&cpage=1#comment-3114 > > > > "On average the speeds appear to be 48% faster then snappy." > > Sounds promising... > > > > -- > > Well, it seems that the performance is really good. > Apparently, LZ4 has displaced Snappy as speed champion for LessFS. > > http://www.lessfs.com/wordpress/?p=688I''m preparing a LZ4 on top of Chris'' snappy branch to have some testbed for comparing and benchmarking. I''ll send it to list after a few tests. david -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 12, 2012 at 6:28 PM, Andi Kleen <andi@firstfloor.org> wrote:> From: Andi Kleen <ak@linux.intel.com> > > This is a C port of the google snappy compressor. It has roughly > comparable compression to LZO, but is significantly faster on many file > types. For example it beats all other compressors on already > compressed data. > > I ported the original C++ code over to C and did some changes > to make it better fit into the kernel. It preallocates the worst > case memory consumption now. While the code being larger > than lzo it is still reasonable (about 5K on x86). > > Decompression needs very little memory, Compression > currently 192K on 64bit and 128K on 32bit. For comparison > LZO compression needs 128K on 64bit and 64K on 32bit. > > [This could be lowered significantly by not preallocating > for most use cases, typically the footprint is much lower. > The original C++ version only allocated most of this > when (rarely) needed, but this is more problematic in the kernel] > > There are some minor divergences from the Linux coding standards: > in particular I kept the C++/C99 style mixed statement/declarations. > This was mainly to not diverge too much from the reference C++ > source, so that improvements from there can be easily ported. > There are some other left overs from the google style, but very > little now. > > Performance: > > The compressor performs best on 64bit-LE systems, > but is also quite good on 32bit. I haven''t tested BE, but > I don''t expect that to add a lot of overhead. > > Here is some performance data (32bit, Nehalem): > c/b = cycles/byte; lower numbers are better. > > x86-64 executable: (compression minimally slower than qlz, but > much better at decompression, lzo is left in the dust): > > snappy: emacs-gtk: 11007968 b: ratio 0.38: comp 8.13 uncomp 2.65 c/b > lzo : emacs-gtk: 11007968 b: ratio 0.33: comp 12.74 uncomp 4.70 c/b > zlib1 : emacs-gtk: 11007968 b: ratio 0.27: comp 49.96 uncomp 13.14 c/b > zlib3 : emacs-gtk: 11007968 b: ratio 0.26: comp 64.17 uncomp 12.33 c/b > lzf : emacs-gtk: 11007968 b: ratio 0.37: comp 9.85 uncomp 4.33 c/b > qlz : emacs-gtk: 11007968 b: ratio 0.34: comp 7.51 uncomp 6.28 c/b > fastlz: emacs-gtk: 11007968 b: ratio 0.37: comp 10.73 uncomp 4.97 c/b > > Compressed data (beats everything else): > > snappy: udev-151.tar.gz: 634842 b: ratio 1.00: comp 0.99 uncomp 0.33 c/b > lzo : udev-151.tar.gz: 634842 b: ratio 1.00: comp 41.44 uncomp 0.66 c/b > zlib1 : udev-151.tar.gz: 634842 b: ratio 1.00: comp 116.99 uncomp 3.94 c/b > zlib3 : udev-151.tar.gz: 634842 b: ratio 1.00: comp 117.68 uncomp 3.94 c/b > lzf : udev-151.tar.gz: 634842 b: ratio 1.03: comp 16.32 uncomp 1.14 c/b > qlz : udev-151.tar.gz: 634842 b: ratio 1.00: comp 10.42 uncomp 0.42 c/b > fastlz: udev-151.tar.gz: 634842 b: ratio 1.03: comp 19.35 uncomp 2.07 c/b > > Text file (compression somewhat slower than qlz, but decompression > much better, lzo is much worse): > > snappy: manual.txt: 445343 b: ratio 0.47: comp 12.01 uncomp 3.12 c/b > lzo : manual.txt: 445343 b: ratio 0.44: comp 16.32 uncomp 7.53 c/b > zlib1 : manual.txt: 445343 b: ratio 0.35: comp 56.37 uncomp 15.59 c/b > zlib3 : manual.txt: 445343 b: ratio 0.31: comp 73.45 uncomp 13.99 c/b > lzf : manual.txt: 445343 b: ratio 0.46: comp 13.43 uncomp 5.47 c/b > qlz : manual.txt: 445343 b: ratio 0.44: comp 9.16 uncomp 8.19 c/b > fastlz: manual.txt: 445343 b: ratio 0.46: comp 14.22 uncomp 7.28 c/b > > As you can see snappy is a good all-around compressor. > > On 64bit the compression is even faster and beats everything else easily: > > snappy: emacs-gtk: 11007968 b: ratio 0.38: comp 4.90 uncomp 2.65 c/b > lzo : emacs-gtk: 11007968 b: ratio 0.33: comp 11.24 uncomp 4.46 c/b > zlib1 : emacs-gtk: 11007968 b: ratio 0.27: comp 41.67 uncomp 11.13 c/b > zlib3 : emacs-gtk: 11007968 b: ratio 0.26: comp 51.80 uncomp 10.54 c/b > lzf : emacs-gtk: 11007968 b: ratio 0.37: comp 8.79 uncomp 4.05 c/b > qlz : emacs-gtk: 11007968 b: ratio 0.34: comp 5.44 uncomp 5.46 c/b > fastlz: emacs-gtk: 11007968 b: ratio 0.37: comp 9.91 uncomp 4.77 c/b > > On 64bit it''s now nearly as fast as qlz on the text file too: > > snappy: manual.txt: 445343 b: ratio 0.47: comp 7.79 uncomp 3.47 c/b > lzo : manual.txt: 445343 b: ratio 0.44: comp 15.46 uncomp 7.27 c/b > zlib1 : manual.txt: 445343 b: ratio 0.35: comp 45.79 uncomp 12.78 c/b > zlib3 : manual.txt: 445343 b: ratio 0.31: comp 60.52 uncomp 11.72 c/b > lzf : manual.txt: 445343 b: ratio 0.46: comp 12.62 uncomp 5.30 c/b > qlz : manual.txt: 445343 b: ratio 0.44: comp 6.81 uncomp 7.65 c/b > fastlz: manual.txt: 445343 b: ratio 0.46: comp 13.75 uncomp 6.52 c/b > > Overall it''s a good alternative to lzo, with the only > drawback being the somewhat higher memory use. > > v2: Some minor performance improvements and cleanups. > 32bit compression should be a few percent faster now. > Signed-off-by: Andi Kleen <ak@linux.intel.com> > --- > include/linux/snappy.h | 26 + > lib/Kconfig | 6 + > lib/Makefile | 4 + > lib/snappy.c | 1300 ++++++++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 1336 insertions(+), 0 deletions(-) > create mode 100644 include/linux/snappy.h > create mode 100644 lib/snappy.c > > diff --git a/include/linux/snappy.h b/include/linux/snappy.h > new file mode 100644 > index 0000000..4119803 > --- /dev/null > +++ b/include/linux/snappy.h > @@ -0,0 +1,26 @@ > +#ifndef _LINUX_SNAPPY_H > +#define _LINUX_SNAPPY_H 1 > + > +#include <linux/types.h> > + > +/* Only needed for compression. This preallocates the worst case */ > +struct snappy_env { > + u16 *hash_table; > + void *scratch; > + void *scratch_output; > +}; > + > +int snappy_init_env(struct snappy_env *env); > +void snappy_free_env(struct snappy_env *env); > +bool snappy_uncompress(const char *compressed, size_t n, char *uncompressed); > +int snappy_compress(struct snappy_env *env, > + const char *input, > + size_t input_length, > + char *compressed, > + size_t *compressed_length); > +bool snappy_uncompressed_length(const char *buf, size_t len, size_t *result); > +size_t snappy_max_compressed_length(size_t source_len); > + > + > + > +#endif > diff --git a/lib/Kconfig b/lib/Kconfig > index 201e1b3..719e4f2 100644 > --- a/lib/Kconfig > +++ b/lib/Kconfig > @@ -310,4 +310,10 @@ config DIGSIG > Digital signature verification. Currently only RSA is supported. > Implementation is done using GnuPG MPI library > > +config SNAPPY > + tristate "Snappy compressor" > + help > + Add the snappy compressor. This is a reasonable compressor that > + compresses and decompresses extremly fast. > + > endmenu > diff --git a/lib/Makefile b/lib/Makefile > index dace162..2f5f86a 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -121,6 +121,10 @@ obj-$(CONFIG_DQL) += dynamic_queue_limits.o > obj-$(CONFIG_MPILIB) += mpi/ > obj-$(CONFIG_DIGSIG) += digsig.o > > +CFLAGS_snappy.o += $(call cc-disable-warning, declaration-after-statement) \ > + -DNDEBUG=1 > +obj-$(CONFIG_SNAPPY) += snappy.o > + > hostprogs-y := gen_crc32table > clean-files := crc32table.h > > diff --git a/lib/snappy.c b/lib/snappy.c > new file mode 100644 > index 0000000..0b39e07 > --- /dev/null > +++ b/lib/snappy.c > @@ -0,0 +1,1300 @@ > +/* > + * C port of the snappy compressor from Google. > + * This is a very fast compressor with comparable compression to lzo. > + * Works best on 64bit little-endian, but should be good on others too. > + * Ported by Andi Kleen. > + * Based on snappy 1.0.3 plus some selected changes from SVN. > + */ > + > +/* > + * Copyright 2005 Google Inc. All Rights Reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions are > + * met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above > + * copyright notice, this list of conditions and the following disclaimer > + * in the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of Google Inc. nor the names of its > + * contributors may be used to endorse or promote products derived from > + * this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + */ > + > +#include <linux/kernel.h> > +#include <linux/module.h> > +#include <linux/slab.h> > +#include <linux/string.h> > +#include <linux/snappy.h> > +#include <asm/unaligned.h> > + > +#define CRASH_UNLESS(x) BUG_ON(!(x)) > +#define CHECK(cond) CRASH_UNLESS(cond) > +#define CHECK_LE(a, b) CRASH_UNLESS((a) <= (b)) > +#define CHECK_GE(a, b) CRASH_UNLESS((a) >= (b)) > +#define CHECK_EQ(a, b) CRASH_UNLESS((a) == (b)) > +#define CHECK_NE(a, b) CRASH_UNLESS((a) != (b)) > +#define CHECK_LT(a, b) CRASH_UNLESS((a) < (b)) > +#define CHECK_GT(a, b) CRASH_UNLESS((a) > (b)) > + > +#define UNALIGNED_LOAD16(_p) get_unaligned((u16 *)(_p)) > +#define UNALIGNED_LOAD32(_p) get_unaligned((u32 *)(_p)) > +#define UNALIGNED_LOAD64(_p) get_unaligned((u64 *)(_p)) > + > +#define UNALIGNED_STORE16(_p, _val) put_unaligned(_val, (u16 *)(_p)) > +#define UNALIGNED_STORE32(_p, _val) put_unaligned(_val, (u32 *)(_p)) > +#define UNALIGNED_STORE64(_p, _val) put_unaligned(_val, (u64 *)(_p)) > + > +#ifdef NDEBUG > + > +#define DCHECK(cond) do {} while(0) > +#define DCHECK_LE(a, b) do {} while(0) > +#define DCHECK_GE(a, b) do {} while(0) > +#define DCHECK_EQ(a, b) do {} while(0) > +#define DCHECK_NE(a, b) do {} while(0) > +#define DCHECK_LT(a, b) do {} while(0) > +#define DCHECK_GT(a, b) do {} while(0) > + > +#else > + > +#define DCHECK(cond) CHECK(cond) > +#define DCHECK_LE(a, b) CHECK_LE(a, b) > +#define DCHECK_GE(a, b) CHECK_GE(a, b) > +#define DCHECK_EQ(a, b) CHECK_EQ(a, b) > +#define DCHECK_NE(a, b) CHECK_NE(a, b) > +#define DCHECK_LT(a, b) CHECK_LT(a, b) > +#define DCHECK_GT(a, b) CHECK_GT(a, b) > + > +#endif > + > +static inline bool is_little_endian(void) > +{ > +#ifdef __LITTLE_ENDIAN__ > + return true; > +#endif > + return false; > +} > + > +static inline int log2_floor(u32 n) > +{ > + return n == 0 ? -1 : 31 ^ __builtin_clz(n); > +} > + > +static inline int find_lsb_set_non_zero(u32 n) > +{ > + return __builtin_ctz(n); > +} > + > +static inline int find_lsb_set_non_zero64(u64 n) > +{ > + if (sizeof(long) == 4) { > + if (n & 0xffffffff) > + return __builtin_ctz(n & 0xffffffff); > + return 32 + __builtin_ctz(n >> 32); > + } > + return __builtin_ctzll(n); > +} > + > +#define kmax32 5 > + > +/* > + * Attempts to parse a varint32 from a prefix of the bytes in [ptr,limit-1]. > + * Never reads a character at or beyond limit. If a valid/terminated varint32 > + * was found in the range, stores it in *OUTPUT and returns a pointer just > + * past the last byte of the varint32. Else returns NULL. On success, > + * "result <= limit". > + */ > +static inline const char *varint_parse32_with_limit(const char *p, > + const char *l, > + u32 * OUTPUT) > +{ > + const unsigned char *ptr = (const unsigned char *)(p); > + const unsigned char *limit = (const unsigned char *)(l); > + u32 b, result; > + > + if (ptr >= limit) > + return NULL; > + b = *(ptr++); > + result = b & 127; > + if (b < 128) > + goto done; > + if (ptr >= limit) > + return NULL; > + b = *(ptr++); > + result |= (b & 127) << 7; > + if (b < 128) > + goto done; > + if (ptr >= limit) > + return NULL; > + b = *(ptr++); > + result |= (b & 127) << 14; > + if (b < 128) > + goto done; > + if (ptr >= limit) > + return NULL; > + b = *(ptr++); > + result |= (b & 127) << 21; > + if (b < 128) > + goto done; > + if (ptr >= limit) > + return NULL; > + b = *(ptr++); > + result |= (b & 127) << 28; > + if (b < 16) > + goto done; > + return NULL; /* Value is too long to be a varint32 */ > +done: > + *OUTPUT = result; > + return (const char *)(ptr); > +} > + > +/* > + * REQUIRES "ptr" points to a buffer of length sufficient to hold "v". > + * EFFECTS Encodes "v" into "ptr" and returns a pointer to the > + * byte just past the last encoded byte. > + */ > +static inline char *varint_encode32(char *sptr, u32 v) > +{ > + /* Operate on characters as unsigneds */ > + unsigned char *ptr = (unsigned char *)(sptr); > + static const int B = 128; > + > + if (v < (1 << 7)) { > + *(ptr++) = v; > + } else if (v < (1 << 14)) { > + *(ptr++) = v | B; > + *(ptr++) = v >> 7; > + } else if (v < (1 << 21)) { > + *(ptr++) = v | B; > + *(ptr++) = (v >> 7) | B; > + *(ptr++) = v >> 14; > + } else if (v < (1 << 28)) { > + *(ptr++) = v | B; > + *(ptr++) = (v >> 7) | B; > + *(ptr++) = (v >> 14) | B; > + *(ptr++) = v >> 21; > + } else { > + *(ptr++) = v | B; > + *(ptr++) = (v >> 7) | B; > + *(ptr++) = (v >> 14) | B; > + *(ptr++) = (v >> 21) | B; > + *(ptr++) = v >> 28; > + } > + return (char *)(ptr); > +} > + > +struct source { > + const char *ptr; > + size_t left; > +}; > + > +static inline int available(struct source *s) > +{ > + return s->left; > +} > + > +static inline const char *peek(struct source *s, size_t * len) > +{ > + *len = s->left; > + return s->ptr; > +} > + > +static inline void skip(struct source *s, size_t n) > +{ > + s->left -= n; > + s->ptr += n; > +} > + > +struct sink { > + char *dest; > +}; > + > +static inline void append(struct sink *s, const char *data, size_t n) > +{ > + if (data != s->dest) > + memcpy(s->dest, data, n); > + s->dest += n; > +} > + > +static inline void *sink_peek(struct sink *s, size_t n) > +{ > + return s->dest; > +} > + > +struct writer { > + char *base; > + char *op; > + char *op_limit; > +}; > + > +/* Called before decompression */ > +static inline void writer_set_expected_length(struct writer *w, size_t len) > +{ > + w->op_limit = w->op + len; > +} > + > +/* Called after decompression */ > +static inline bool writer_check_length(struct writer *w) > +{ > + return w->op == w->op_limit; > +} > + > +/* > + * Copy "len" bytes from "src" to "op", one byte at a time. Used for > + * handling COPY operations where the input and output regions may > + * overlap. For example, suppose: > + * src == "ab" > + * op == src + 2 > + * len == 20 > + * After IncrementalCopy(src, op, len), the result will have > + * eleven copies of "ab" > + * ababababababababababab > + * Note that this does not match the semantics of either memcpy() > + * or memmove(). > + */ > +static inline void incremental_copy(const char *src, char *op, int len) > +{ > + DCHECK_GT(len, 0); > + do { > + *op++ = *src++; > + } while (--len > 0); > +} > + > +/* > + * Equivalent to IncrementalCopy except that it can write up to ten extra > + * bytes after the end of the copy, and that it is faster. > + * > + * The main part of this loop is a simple copy of eight bytes at a time until > + * we''ve copied (at least) the requested amount of bytes. However, if op and > + * src are less than eight bytes apart (indicating a repeating pattern of > + * length < 8), we first need to expand the pattern in order to get the correct > + * results. For instance, if the buffer looks like this, with the eight-byte > + * <src> and <op> patterns marked as intervals: > + * > + * abxxxxxxxxxxxx > + * [------] src > + * [------] op > + * > + * a single eight-byte copy from <src> to <op> will repeat the pattern once, > + * after which we can move <op> two bytes without moving <src>: > + * > + * ababxxxxxxxxxx > + * [------] src > + * [------] op > + * > + * and repeat the exercise until the two no longer overlap. > + * > + * This allows us to do very well in the special case of one single byte > + * repeated many times, without taking a big hit for more general cases. > + * > + * The worst case of extra writing past the end of the match occurs when > + * op - src == 1 and len == 1; the last copy will read from byte positions > + * [0..7] and write to [4..11], whereas it was only supposed to write to > + * position 1. Thus, ten excess bytes. > + */ > + > +#define kmax_increment_copy_overflow 10 > + > +static inline void incremental_copy_fast_path(const char *src, char *op, > + int len) > +{ > + while (op - src < 8) { > + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(src)); > + len -= op - src; > + op += op - src; > + } > + while (len > 0) { > + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(src)); > + src += 8; > + op += 8; > + len -= 8; > + } > +} > + > +static inline bool writer_append_from_self(struct writer *w, u32 offset, > + u32 len) > +{ > + char *op = w->op; > + const int space_left = w->op_limit - op; > + > + if (op - w->base <= offset - 1u) /* -1u catches offset==0 */ > + return false; > + if (len <= 16 && offset >= 8 && space_left >= 16) { > + /* Fast path, used for the majority (70-80%) of dynamic > + * invocations. */ > + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(op - offset)); > + UNALIGNED_STORE64(op + 8, UNALIGNED_LOAD64(op - offset + 8)); > + } else { > + if (space_left >= len + kmax_increment_copy_overflow) { > + incremental_copy_fast_path(op - offset, op, len); > + } else { > + if (space_left < len) { > + return false; > + } > + incremental_copy(op - offset, op, len); > + } > + } > + > + w->op = op + len; > + return true; > +} > + > +static inline bool writer_append(struct writer *w, const char *ip, u32 len, > + bool allow_fast_path) > +{ > + char *op = w->op; > + const int space_left = w->op_limit - op; > + if (allow_fast_path && len <= 16 && space_left >= 16) { > + /* Fast path, used for the majority (about 90%) of dynamic > + * invocations. */ > + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(ip)); > + UNALIGNED_STORE64(op + 8, UNALIGNED_LOAD64(ip + 8)); > + } else { > + if (space_left < len) > + return false; > + memcpy(op, ip, len); > + } > + w->op = op + len; > + return true; > +} > + > +/* > + * Any hash function will produce a valid compressed bitstream, but a good > + * hash function reduces the number of collisions and thus yields better > + * compression for compressible input, and more speed for incompressible > + * input. Of course, it doesn''t hurt if the hash function is reasonably fast > + * either, as it gets called a lot. > + */ > +static inline u32 hash_bytes(u32 bytes, int shift) > +{ > + u32 kmul = 0x1e35a7bd; > + return (bytes * kmul) >> shift; > +} > + > +static inline u32 hash(const char *p, int shift) > +{ > + return hash_bytes(UNALIGNED_LOAD32(p), shift); > +} > + > +/* > + * Compressed data can be defined as: > + * compressed := item* literal* > + * item := literal* copy > + * > + * The trailing literal sequence has a space blowup of at most 62/60 > + * since a literal of length 60 needs one tag byte + one extra byte > + * for length information. > + * > + * Item blowup is trickier to measure. Suppose the "copy" op copies > + * 4 bytes of data. Because of a special check in the encoding code, > + * we produce a 4-byte copy only if the offset is < 65536. Therefore > + * the copy op takes 3 bytes to encode, and this type of item leads > + * to at most the 62/60 blowup for representing literals. > + * > + * Suppose the "copy" op copies 5 bytes of data. If the offset is big > + * enough, it will take 5 bytes to encode the copy op. Therefore the > + * worst case here is a one-byte literal followed by a five-byte copy. > + * I.e., 6 bytes of input turn into 7 bytes of "compressed" data. > + * > + * This last factor dominates the blowup, so the final estimate is: > + */ > +size_t snappy_max_compressed_length(size_t source_len) > +{ > + return 32 + source_len + source_len / 6; > +} > +EXPORT_SYMBOL(snappy_max_compressed_length); > + > +enum { > + LITERAL = 0, > + COPY_1_BYTE_OFFSET = 1, /* 3 bit length + 3 bits of offset in opcode */ > + COPY_2_BYTE_OFFSET = 2, > + COPY_4_BYTE_OFFSET = 3 > +}; > + > +static inline char *emit_literal(char *op, > + const char *literal, > + int len, bool allow_fast_path) > +{ > + int n = len - 1; /* Zero-length literals are disallowed */ > + > + if (n < 60) { > + /* Fits in tag byte */ > + *op++ = LITERAL | (n << 2); > + > +/* > + * The vast majority of copies are below 16 bytes, for which a > + * call to memcpy is overkill. This fast path can sometimes > + * copy up to 15 bytes too much, but that is okay in the > + * main loop, since we have a bit to go on for both sides: > + * > + * - The input will always have kInputMarginBytes = 15 extra > + * available bytes, as long as we''re in the main loop, and > + * if not, allow_fast_path = false. > + * - The output will always have 32 spare bytes (see > + * MaxCompressedLength). > + */ > + if (allow_fast_path && len <= 16) { > + UNALIGNED_STORE64(op, UNALIGNED_LOAD64(literal)); > + UNALIGNED_STORE64(op + 8, > + UNALIGNED_LOAD64(literal + 8)); > + return op + len; > + } > + } else { > + /* Encode in upcoming bytes */ > + char *base = op; > + int count = 0; > + op++; > + while (n > 0) { > + *op++ = n & 0xff; > + n >>= 8; > + count++; > + } > + DCHECK(count >= 1); > + DCHECK(count <= 4); > + *base = LITERAL | ((59 + count) << 2); > + } > + memcpy(op, literal, len); > + return op + len; > +} > + > +static inline char *emit_copy_less_than64(char *op, int offset, int len) > +{ > + DCHECK_LE(len, 64); > + DCHECK_GE(len, 4); > + DCHECK_LT(offset, 65536); > + > + if ((len < 12) && (offset < 2048)) { > + int len_minus_4 = len - 4; > + DCHECK(len_minus_4 < 8); /* Must fit in 3 bits */ > + *op++ > + COPY_1_BYTE_OFFSET | ((len_minus_4) << 2) | ((offset >> 8) > + << 5); > + *op++ = offset & 0xff; > + } else { > + *op++ = COPY_2_BYTE_OFFSET | ((len - 1) << 2); > + put_unaligned_le16(offset, op); > + op += 2; > + } > + return op; > +} > + > +static inline char *emit_copy(char *op, int offset, int len) > +{ > + /* > + * Emit 64 byte copies but make sure to keep at least four bytes > + * reserved > + */ > + while (len >= 68) { > + op = emit_copy_less_than64(op, offset, 64); > + len -= 64; > + } > + > + /* > + * Emit an extra 60 byte copy if have too much data to fit in > + * one copy > + */ > + if (len > 64) { > + op = emit_copy_less_than64(op, offset, 60); > + len -= 60; > + } > + > + /* Emit remainder */ > + op = emit_copy_less_than64(op, offset, len); > + return op; > +} > + > +/** > + * snappy_uncompressed_length - return length of uncompressed output. > + * @start: compressed buffer > + * @n: length of compressed buffer. > + * @result: Write the length of the uncompressed output here. > + * > + * Returns true when successfull, otherwise false. > + */ > +bool snappy_uncompressed_length(const char *start, size_t n, size_t * result) > +{ > + u32 v = 0; > + const char *limit = start + n; > + if (varint_parse32_with_limit(start, limit, &v) != NULL) { > + *result = v; > + return true; > + } else { > + return false; > + } > +} > +EXPORT_SYMBOL(snappy_uncompressed_length); > + > +#define kblock_log 15 > +#define kblock_size (1 << kblock_log) > + > +#define kmax_hash_table_bits 14 > +#define kmax_hash_table_size (1 << kmax_hash_table_bits) > + > +/* > + * Use smaller hash table when input.size() is smaller, since we > + * fill the table, incurring O(hash table size) overhead for > + * compression, and if the input is short, we won''t need that > + * many hash table entries anyway. > + */ > +static u16 *get_hash_table(struct snappy_env *env, size_t input_size, > + int *table_size) > +{ > + int htsize = 256; > + > + DCHECK(kmax_hash_table_size >= 256); > + while (htsize < kmax_hash_table_size && htsize < input_size) > + htsize <<= 1; > + CHECK_EQ(0, htsize & (htsize - 1)); > + CHECK_LE(htsize, kmax_hash_table_size); > + > + u16 *table; > + table = env->hash_table; > + > + *table_size = htsize; > + memset(table, 0, htsize * sizeof(*table)); > + return table; > +} > + > +/* > + * Return the largest n such that > + * > + * s1[0,n-1] == s2[0,n-1] > + * and n <= (s2_limit - s2). > + * > + * Does not read *s2_limit or beyond. > + * Does not read *(s1 + (s2_limit - s2)) or beyond. > + * Requires that s2_limit >= s2. > + * > + * Separate implementation for x86_64, for speed. Uses the fact that > + * x86_64 is little endian. > + */ > +#if defined(__LITTLE_ENDIAN__) > +static inline int find_match_length(const char *s1, > + const char *s2, const char *s2_limit) > +{ > + int matched = 0; > + > + DCHECK_GE(s2_limit, s2); > + /* > + * Find out how long the match is. We loop over the data 64 bits at a > + * time until we find a 64-bit block that doesn''t match; then we find > + * the first non-matching bit and use that to calculate the total > + * length of the match. > + */ > + while (likely(s2 <= s2_limit - 8)) { > + if (unlikely > + (UNALIGNED_LOAD64(s2) == UNALIGNED_LOAD64(s1 + matched))) { > + s2 += 8; > + matched += 8; > + } else { > + /* > + * On current (mid-2008) Opteron models there > + * is a 3% more efficient code sequence to > + * find the first non-matching byte. However, > + * what follows is ~10% better on Intel Core 2 > + * and newer, and we expect AMD''s bsf > + * instruction to improve. > + */ > + u64 x > + UNALIGNED_LOAD64(s2) ^ UNALIGNED_LOAD64(s1 + > + matched); > + int matching_bits = find_lsb_set_non_zero64(x); > + matched += matching_bits >> 3; > + return matched; > + } > + } > + while (likely(s2 < s2_limit)) { > + if (likely(s1[matched] == *s2)) { > + ++s2; > + ++matched; > + } else { > + return matched; > + } > + } > + return matched; > +} > +#else > +static inline int find_match_length(const char *s1, > + const char *s2, const char *s2_limit) > +{ > + /* Implementation based on the x86-64 version, above. */ > + DCHECK_GE(s2_limit, s2); > + int matched = 0; > + > + while (s2 <= s2_limit - 4 && > + UNALIGNED_LOAD32(s2) == UNALIGNED_LOAD32(s1 + matched)) { > + s2 += 4; > + matched += 4; > + } > + if (is_little_endian() && s2 <= s2_limit - 4) { > + u32 x > + UNALIGNED_LOAD32(s2) ^ UNALIGNED_LOAD32(s1 + matched); > + int matching_bits = find_lsb_set_non_zero(x); > + matched += matching_bits >> 3; > + } else { > + while ((s2 < s2_limit) && (s1[matched] == *s2)) { > + ++s2; > + ++matched; > + } > + } > + return matched; > +} > +#endif > + > +/* > + * For 0 <= offset <= 4, GetU32AtOffset(UNALIGNED_LOAD64(p), offset) will > + * equal UNALIGNED_LOAD32(p + offset). Motivation: On x86-64 hardware we have > + * empirically found that overlapping loads such as > + * UNALIGNED_LOAD32(p) ... UNALIGNED_LOAD32(p+1) ... UNALIGNED_LOAD32(p+2) > + * are slower than UNALIGNED_LOAD64(p) followed by shifts and casts to u32. > + */ > +static inline u32 get_u32_at_offset(u64 v, int offset) > +{ > + DCHECK(0 <= offset && offset <= 4); > + return v >> (is_little_endian()? 8 * offset : 32 - 8 * offset); > +} > + > +/* > + * Flat array compression that does not emit the "uncompressed length" > + * prefix. Compresses "input" string to the "*op" buffer. > + * > + * REQUIRES: "input" is at most "kBlockSize" bytes long. > + * REQUIRES: "op" points to an array of memory that is at least > + * "MaxCompressedLength(input.size())" in size. > + * REQUIRES: All elements in "table[0..table_size-1]" are initialized to zero. > + * REQUIRES: "table_size" is a power of two > + * > + * Returns an "end" pointer into "op" buffer. > + * "end - op" is the compressed size of "input". > + */ > + > +static char *compress_fragment(const char *const input, > + const size_t input_size, > + char *op, u16 * table, const int table_size) > +{ > + /* "ip" is the input pointer, and "op" is the output pointer. */ > + const char *ip = input; > + CHECK_LE(input_size, kblock_size); > + CHECK_EQ(table_size & (table_size - 1), 0); > + const int shift = 32 - log2_floor(table_size); > + DCHECK_EQ(UINT_MAX >> shift, table_size - 1); > + const char *ip_end = input + input_size; > + const char *baseip = ip; > + /* > + * Bytes in [next_emit, ip) will be emitted as literal bytes. Or > + * [next_emit, ip_end) after the main loop. > + */ > + const char *next_emit = ip; > + > + const int kinput_margin_bytes = 15; > + > + if (likely(input_size >= kinput_margin_bytes)) { > + const char *ip_limit = input + input_size - > + kinput_margin_bytes; > + > + u32 next_hash; > + for (next_hash = hash(++ip, shift);;) { > + DCHECK_LT(next_emit, ip); > +/* > + * The body of this loop calls EmitLiteral once and then EmitCopy one or > + * more times. (The exception is that when we''re close to exhausting > + * the input we goto emit_remainder.) > + * > + * In the first iteration of this loop we''re just starting, so > + * there''s nothing to copy, so calling EmitLiteral once is > + * necessary. And we only start a new iteration when the > + * current iteration has determined that a call to EmitLiteral will > + * precede the next call to EmitCopy (if any). > + * > + * Step 1: Scan forward in the input looking for a 4-byte-long match. > + * If we get close to exhausting the input then goto emit_remainder. > + * > + * Heuristic match skipping: If 32 bytes are scanned with no matches > + * found, start looking only at every other byte. If 32 more bytes are > + * scanned, look at every third byte, etc.. When a match is found, > + * immediately go back to looking at every byte. This is a small loss > + * (~5% performance, ~0.1% density) for lcompressible data due to more > + * bookkeeping, but for non-compressible data (such as JPEG) it''s a huge > + * win since the compressor quickly "realizes" the data is incompressible > + * and doesn''t bother looking for matches everywhere. > + * > + * The "skip" variable keeps track of how many bytes there are since the > + * last match; dividing it by 32 (ie. right-shifting by five) gives the > + * number of bytes to move ahead for each iteration. > + */ > + u32 skip = 32; > + > + const char *next_ip = ip; > + const char *candidate; > + do { > + ip = next_ip; > + u32 hval = next_hash; > + DCHECK_EQ(hval, hash(ip, shift)); > + u32 bytes_between_hash_lookups = skip++ >> 5; > + next_ip = ip + bytes_between_hash_lookups; > + if (unlikely(next_ip > ip_limit)) { > + goto emit_remainder; > + } > + next_hash = hash(next_ip, shift); > + candidate = baseip + table[hval]; > + DCHECK_GE(candidate, baseip); > + DCHECK_LT(candidate, ip); > + > + table[hval] = ip - baseip; > + } while (likely(UNALIGNED_LOAD32(ip) !> + UNALIGNED_LOAD32(candidate))); > + > +/* > + * Step 2: A 4-byte match has been found. We''ll later see if more > + * than 4 bytes match. But, prior to the match, input > + * bytes [next_emit, ip) are unmatched. Emit them as "literal bytes." > + */ > + DCHECK_LE(next_emit + 16, ip_end); > + op = emit_literal(op, next_emit, ip - next_emit, true); > + > +/* > + * Step 3: Call EmitCopy, and then see if another EmitCopy could > + * be our next move. Repeat until we find no match for the > + * input immediately after what was consumed by the last EmitCopy call. > + * > + * If we exit this loop normally then we need to call EmitLiteral next, > + * though we don''t yet know how big the literal will be. We handle that > + * by proceeding to the next iteration of the main loop. We also can exit > + * this loop via goto if we get close to exhausting the input. > + */ > + u64 input_bytes = 0; > + u32 candidate_bytes = 0; > + > + do { > +/* > + * We have a 4-byte match at ip, and no need to emit any > + * "literal bytes" prior to ip. > + */ > + const char *base = ip; > + int matched = 4 + > + find_match_length(candidate + 4, ip + 4, > + ip_end); > + ip += matched; > + int offset = base - candidate; > + DCHECK_EQ(0, memcmp(base, candidate, matched)); > + op = emit_copy(op, offset, matched); > +/* > + * We could immediately start working at ip now, but to improve > + * compression we first update table[Hash(ip - 1, ...)]. > + */ > + const char *insert_tail = ip - 1; > + next_emit = ip; > + if (unlikely(ip >= ip_limit)) { > + goto emit_remainder; > + } > + input_bytes = UNALIGNED_LOAD64(insert_tail); > + u32 prev_hash > + hash_bytes(get_u32_at_offset > + (input_bytes, 0), shift); > + table[prev_hash] = ip - baseip - 1; > + u32 cur_hash > + hash_bytes(get_u32_at_offset > + (input_bytes, 1), shift); > + candidate = baseip + table[cur_hash]; > + candidate_bytes = UNALIGNED_LOAD32(candidate); > + table[cur_hash] = ip - baseip; > + } while (get_u32_at_offset(input_bytes, 1) => + candidate_bytes); > + > + next_hash > + hash_bytes(get_u32_at_offset(input_bytes, 2), > + shift); > + ++ip; > + } > + } > + > +emit_remainder: > + /* Emit the remaining bytes as a literal */ > + if (next_emit < ip_end) > + op = emit_literal(op, next_emit, ip_end - next_emit, false); > + > + return op; > +} > + > +/* > + * ----------------------------------------------------------------------- > + * Lookup table for decompression code. Generated by ComputeTable() below. > + * ----------------------------------------------------------------------- > + */ > + > +/* Mapping from i in range [0,4] to a mask to extract the bottom 8*i bits */ > +static const u32 wordmask[] = { > + 0u, 0xffu, 0xffffu, 0xffffffu, 0xffffffffu > +}; > + > +/* > + * Data stored per entry in lookup table: > + * Range Bits-used Description > + * ------------------------------------ > + * 1..64 0..7 Literal/copy length encoded in opcode byte > + * 0..7 8..10 Copy offset encoded in opcode byte / 256 > + * 0..4 11..13 Extra bytes after opcode > + * > + * We use eight bits for the length even though 7 would have sufficed > + * because of efficiency reasons: > + * (1) Extracting a byte is faster than a bit-field > + * (2) It properly aligns copy offset so we do not need a <<8 > + */ > +static const u16 char_table[256] = { > + 0x0001, 0x0804, 0x1001, 0x2001, 0x0002, 0x0805, 0x1002, 0x2002, > + 0x0003, 0x0806, 0x1003, 0x2003, 0x0004, 0x0807, 0x1004, 0x2004, > + 0x0005, 0x0808, 0x1005, 0x2005, 0x0006, 0x0809, 0x1006, 0x2006, > + 0x0007, 0x080a, 0x1007, 0x2007, 0x0008, 0x080b, 0x1008, 0x2008, > + 0x0009, 0x0904, 0x1009, 0x2009, 0x000a, 0x0905, 0x100a, 0x200a, > + 0x000b, 0x0906, 0x100b, 0x200b, 0x000c, 0x0907, 0x100c, 0x200c, > + 0x000d, 0x0908, 0x100d, 0x200d, 0x000e, 0x0909, 0x100e, 0x200e, > + 0x000f, 0x090a, 0x100f, 0x200f, 0x0010, 0x090b, 0x1010, 0x2010, > + 0x0011, 0x0a04, 0x1011, 0x2011, 0x0012, 0x0a05, 0x1012, 0x2012, > + 0x0013, 0x0a06, 0x1013, 0x2013, 0x0014, 0x0a07, 0x1014, 0x2014, > + 0x0015, 0x0a08, 0x1015, 0x2015, 0x0016, 0x0a09, 0x1016, 0x2016, > + 0x0017, 0x0a0a, 0x1017, 0x2017, 0x0018, 0x0a0b, 0x1018, 0x2018, > + 0x0019, 0x0b04, 0x1019, 0x2019, 0x001a, 0x0b05, 0x101a, 0x201a, > + 0x001b, 0x0b06, 0x101b, 0x201b, 0x001c, 0x0b07, 0x101c, 0x201c, > + 0x001d, 0x0b08, 0x101d, 0x201d, 0x001e, 0x0b09, 0x101e, 0x201e, > + 0x001f, 0x0b0a, 0x101f, 0x201f, 0x0020, 0x0b0b, 0x1020, 0x2020, > + 0x0021, 0x0c04, 0x1021, 0x2021, 0x0022, 0x0c05, 0x1022, 0x2022, > + 0x0023, 0x0c06, 0x1023, 0x2023, 0x0024, 0x0c07, 0x1024, 0x2024, > + 0x0025, 0x0c08, 0x1025, 0x2025, 0x0026, 0x0c09, 0x1026, 0x2026, > + 0x0027, 0x0c0a, 0x1027, 0x2027, 0x0028, 0x0c0b, 0x1028, 0x2028, > + 0x0029, 0x0d04, 0x1029, 0x2029, 0x002a, 0x0d05, 0x102a, 0x202a, > + 0x002b, 0x0d06, 0x102b, 0x202b, 0x002c, 0x0d07, 0x102c, 0x202c, > + 0x002d, 0x0d08, 0x102d, 0x202d, 0x002e, 0x0d09, 0x102e, 0x202e, > + 0x002f, 0x0d0a, 0x102f, 0x202f, 0x0030, 0x0d0b, 0x1030, 0x2030, > + 0x0031, 0x0e04, 0x1031, 0x2031, 0x0032, 0x0e05, 0x1032, 0x2032, > + 0x0033, 0x0e06, 0x1033, 0x2033, 0x0034, 0x0e07, 0x1034, 0x2034, > + 0x0035, 0x0e08, 0x1035, 0x2035, 0x0036, 0x0e09, 0x1036, 0x2036, > + 0x0037, 0x0e0a, 0x1037, 0x2037, 0x0038, 0x0e0b, 0x1038, 0x2038, > + 0x0039, 0x0f04, 0x1039, 0x2039, 0x003a, 0x0f05, 0x103a, 0x203a, > + 0x003b, 0x0f06, 0x103b, 0x203b, 0x003c, 0x0f07, 0x103c, 0x203c, > + 0x0801, 0x0f08, 0x103d, 0x203d, 0x1001, 0x0f09, 0x103e, 0x203e, > + 0x1801, 0x0f0a, 0x103f, 0x203f, 0x2001, 0x0f0b, 0x1040, 0x2040 > +}; > + > +struct snappy_decompressor { > + struct source *reader; /* Underlying source of bytes to decompress */ > + const char *ip; /* Points to next buffered byte */ > + const char *ip_limit; /* Points just past buffered bytes */ > + u32 peeked; /* Bytes peeked from reader (need to skip) */ > + bool eof; /* Hit end of input without an error? */ > + char scratch[5]; /* Temporary buffer for peekfast boundaries */ > +}; > + > +static void > +init_snappy_decompressor(struct snappy_decompressor *d, struct source *reader) > +{ > + d->reader = reader; > + d->ip = NULL; > + d->ip_limit = NULL; > + d->peeked = 0; > + d->eof = false; > +} > + > +static void exit_snappy_decompressor(struct snappy_decompressor *d) > +{ > + skip(d->reader, d->peeked); > +} > + > +/* > + * Read the uncompressed length stored at the start of the compressed data. > + * On succcess, stores the length in *result and returns true. > + * On failure, returns false. > + */ > +static bool read_uncompressed_length(struct snappy_decompressor *d, > + u32 * result) > +{ > + DCHECK(d->ip == NULL); /* > + * Must not have read anything yet > + * Length is encoded in 1..5 bytes > + */ > + *result = 0; > + u32 shift = 0; > + while (true) { > + if (shift >= 32) > + return false; > + size_t n; > + const char *ip = peek(d->reader, &n); > + if (n == 0) > + return false; > + const unsigned char c = *(const unsigned char *)(ip); > + skip(d->reader, 1); > + *result |= (u32) (c & 0x7f) << shift; > + if (c < 128) { > + break; > + } > + shift += 7; > + } > + return true; > +} > + > +static bool refill_tag(struct snappy_decompressor *d); > + > +/* > + * Process the next item found in the input. > + * Returns true if successful, false on error or end of input. > + */ > +static void decompress_all_tags(struct snappy_decompressor *d, > + struct writer *writer) > +{ > + const char *ip = d->ip; > + > + for (;;) { > + if (d->ip_limit - ip < 5) { > + d->ip = ip; > + if (!refill_tag(d)) > + return; > + ip = d->ip; > + } > + > + const unsigned char c = *(const unsigned char *)(ip++); > + > + if ((c & 0x3) == LITERAL) { > + u32 literal_length = c >> 2; > + if (unlikely(literal_length >= 60)) { > + /* Long literal */ > + const u32 literal_ll = literal_length - 59; > + literal_length = get_unaligned_le32(ip) & > + wordmask[literal_ll]; > + ip += literal_ll; > + } > + ++literal_length; > + > + u32 avail = d->ip_limit - ip; > + while (avail < literal_length) { > + if (!writer_append(writer, ip, avail, false)) > + return; > + literal_length -= avail; > + skip(d->reader, d->peeked); > + size_t n; > + ip = peek(d->reader, &n); > + avail = n; > + d->peeked = avail; > + if (avail == 0) > + return; /* Premature end of input */ > + d->ip_limit = ip + avail; > + } > + bool allow_fast_path = (avail >= 16); > + if (!writer_append(writer, ip, literal_length, > + allow_fast_path)) > + return; > + ip += literal_length; > + } else { > + const u32 entry = char_table[c]; > + const u32 trailer = get_unaligned_le32(ip) & > + wordmask[entry >> 11]; > + const u32 length = entry & 0xff; > + ip += entry >> 11; > + > + /* > + * copy_offset/256 is encoded in bits 8..10. > + * By just fetching those bits, we get > + * copy_offset (since the bit-field starts at > + * bit 8). > + */ > + const u32 copy_offset = entry & 0x700; > + if (!writer_append_from_self(writer, > + copy_offset + trailer, > + length)) > + return; > + } > + } > +} > + > +static bool refill_tag(struct snappy_decompressor *d) > +{ > + const char *ip = d->ip; > + > + if (ip == d->ip_limit) { > + size_t n; > + /* Fetch a new fragment from the reader */ > + skip(d->reader, d->peeked); /* All peeked bytes are used up */ > + ip = peek(d->reader, &n); > + d->peeked = n; > + if (n == 0) { > + d->eof = true; > + return false; > + } > + d->ip_limit = ip + n; > + } > + > + /* Read the tag character */ > + DCHECK_LT(ip, d->ip_limit); > + const unsigned char c = *(const unsigned char *)(ip); > + const u32 entry = char_table[c]; > + const u32 needed = (entry >> 11) + 1; /* +1 byte for ''c'' */ > + DCHECK_LE(needed, sizeof(d->scratch)); > + > + /* Read more bytes from reader if needed */ > + u32 nbuf = d->ip_limit - ip; > + > + if (nbuf < needed) { > + /* > + * Stitch together bytes from ip and reader to form the word > + * contents. We store the needed bytes in "scratch". They > + * will be consumed immediately by the caller since we do not > + * read more than we need. > + */ > + memmove(d->scratch, ip, nbuf); > + skip(d->reader, d->peeked); /* All peeked bytes are used up */ > + d->peeked = 0; > + while (nbuf < needed) { > + size_t length; > + const char *src = peek(d->reader, &length); > + if (length == 0) > + return false; > + u32 to_add = min_t(u32, needed - nbuf, length); > + memcpy(d->scratch + nbuf, src, to_add); > + nbuf += to_add; > + skip(d->reader, to_add); > + } > + DCHECK_EQ(nbuf, needed); > + d->ip = d->scratch; > + d->ip_limit = d->scratch + needed; > + } else if (nbuf < 5) { > + /* > + * Have enough bytes, but move into scratch so that we do not > + * read past end of input > + */ > + memmove(d->scratch, ip, nbuf); > + skip(d->reader, d->peeked); /* All peeked bytes are used up */ > + d->peeked = 0; > + d->ip = d->scratch; > + d->ip_limit = d->scratch + nbuf; > + } else { > + /* Pass pointer to buffer returned by reader. */ > + d->ip = ip; > + } > + return true; > +} > + > +static int internal_uncompress(struct source *r, > + struct writer *writer, u32 max_len) > +{ > + struct snappy_decompressor decompressor; > + u32 uncompressed_len = 0; > + > + init_snappy_decompressor(&decompressor, r); > + > + if (!read_uncompressed_length(&decompressor, &uncompressed_len)) > + return -EIO; > + /* Protect against possible DoS attack */ > + if ((u64) (uncompressed_len) > max_len) > + return -EIO; > + > + writer_set_expected_length(writer, uncompressed_len); > + > + /* Process the entire input */ > + decompress_all_tags(&decompressor, writer); > + > + exit_snappy_decompressor(&decompressor); > + return (decompressor.eof && writer_check_length(writer)) ? 0 : -EIO; > +} > + > +static inline int compress(struct snappy_env *env, struct source *reader, > + struct sink *writer) > +{ > + int err; > + size_t written = 0; > + int N = available(reader); > + char ulength[kmax32]; > + char *p = varint_encode32(ulength, N); > + > + append(writer, ulength, p - ulength); > + written += (p - ulength); > + > + while (N > 0) { > + /* Get next block to compress (without copying if possible) */ > + size_t fragment_size; > + const char *fragment = peek(reader, &fragment_size); > + if (fragment_size == 0) { > + err = -EIO; > + goto out; > + } > + const int num_to_read = min_t(int, N, kblock_size); > + size_t bytes_read = fragment_size; > + > + int pending_advance = 0; > + if (bytes_read >= num_to_read) { > + /* Buffer returned by reader is large enough */ > + pending_advance = num_to_read; > + fragment_size = num_to_read; > + } > +#ifdef SCATHER_GATHER > + else { > + memcpy(env->scratch, fragment, bytes_read); > + skip(reader, bytes_read); > + > + while (bytes_read < num_to_read) { > + fragment = peek(reader, &fragment_size); > + size_t n > + min_t(size_t, fragment_size, > + num_to_read - bytes_read); > + memcpy(env->scratch + bytes_read, fragment, n); > + bytes_read += n; > + skip(reader, n); > + } > + DCHECK_EQ(bytes_read, num_to_read); > + fragment = env->scratch; > + fragment_size = num_to_read; > + } > +#endif > + if (fragment_size < num_to_read) > + return -EIO; > + > + /* Get encoding table for compression */ > + int table_size; > + u16 *table = get_hash_table(env, num_to_read, &table_size); > + > + /* Compress input_fragment and append to dest */ > + const int max_output > + snappy_max_compressed_length(num_to_read); > + > + char *dest; > + dest = sink_peek(writer, max_output); > +#ifdef SCATHER_GATHER > + if (!dest) { > + /* > + * Need a scratch buffer for the output, > + * because the byte sink doesn''t have enough > + * in one piece. > + */ > + dest = env->scratch_output; > + } > +#endif > + char *end = compress_fragment(fragment, fragment_size, > + dest, table, table_size); > + append(writer, dest, end - dest); > + written += (end - dest); > + > + N -= num_to_read; > + skip(reader, pending_advance); > + } > + > + err = 0; > +out: > + return err; > +} > + > +/** > + * snappy_compress - Compress a buffer using the snappy compressor. > + * @env: Preallocated environment > + * @input: Input buffer > + * @input_length: Length of input_buffer > + * @compressed: Output buffer for compressed data > + * @compressed_length: The real length of the output written here. > + * > + * Return 0 on success, otherwise an negative error code. > + * > + * The output buffer must be at least > + * snappy_max_compressed_length(input_length) bytes long. > + * > + * Requires a preallocated environment from snappy_init_env. > + * The environment does not keep state over individual calls > + * of this function, just preallocates the memory. > + */ > +int snappy_compress(struct snappy_env *env, > + const char *input, > + size_t input_length, > + char *compressed, size_t *compressed_length) > +{ > + struct source reader = { > + .ptr = input, > + .left = input_length > + }; > + struct sink writer = { > + .dest = compressed, > + }; > + int err = compress(env, &reader, &writer); > + > + /* Compute how many bytes were added */ > + *compressed_length = (writer.dest - compressed); > + return err; > +} > +EXPORT_SYMBOL(snappy_compress); > + > +/** > + * snappy_uncompress - Uncompress a snappy compressed buffer > + * @compressed: Input buffer with compressed data > + * @n: length of compressed buffer > + * @uncompressed: buffer for uncompressed data > + * > + * The uncompressed data buffer must be at least > + * snappy_uncompressed_length(compressed) bytes long. > + * > + * Returns true when successfull, otherwise false. > + */ > +bool snappy_uncompress(const char *compressed, size_t n, char *uncompressed) > +{ > + struct source reader = { > + .ptr = compressed, > + .left = n > + }; > + struct writer output = { > + .base = uncompressed, > + .op = uncompressed > + }; > + return internal_uncompress(&reader, &output, 0xffffffff); > +} > +EXPORT_SYMBOL(snappy_uncompress); > + > +/** > + * snappy_init_env - Allocate snappy compression environment > + * @env: Environment to preallocate > + * > + * Returns 0 on success, otherwise negative errno. > + * Must run in process context. > + */ > +int snappy_init_env(struct snappy_env *env) > +{ > + env->hash_table = vmalloc(sizeof(u16) * kmax_hash_table_size); > + if (!env->hash_table) > + goto error; > +#ifdef SCATHER_GATHER > + env->scratch = vmalloc(kblock_size); > + env->scratch_output > + vmalloc(snappy_max_compressed_length(kblock_size)); > + if (!env->scratch || !env->scratch_output) > + goto error; > +#endif > + return 0; > +error: > + snappy_free_env(env); > + return -ENOMEM; > +} > +EXPORT_SYMBOL(snappy_init_env); > + > +/** > + * snappy_free_env - Free an snappy compression environment > + * @env: Environment to free. > + * > + * Must run in process context. > + */ > +void snappy_free_env(struct snappy_env *env) > +{ > + vfree(env->hash_table); > +#ifdef SCATHER_GATHER > + vfree(env->scratch); > + vfree(env->scratch_output); > +#endif > + memset(env, 0, sizeof(struct snappy_env)); > +} > +EXPORT_SYMBOL(snappy_free_env); > -- > 1.7.7.4I''ve run into one of those x86_64/x86 errors (I think x86_64 has different implicit includes). (BTW: If you''re ever reworking this patch set, I''d like to make an ad hoc request for slightly different names for fs/btrfs/snappy.c and lib/snappy.c) When building a x86 kernel, I get the following errors: CC [M] lib/snappy.o lib/snappy.c: In function ''snappy_init_env'': lib/snappy.c:1268:2: error: implicit declaration of function ''vmalloc'' CC [M] fs/btrfs/free-space-cache.o lib/snappy.c:1268:18: warning: assignment makes pointer from integer without a cast lib/snappy.c: In function ''snappy_free_env'': lib/snappy.c:1293:2: error: implicit declaration of function ''vfree'' make[1]: *** [lib/snappy.o] Error 1 make: *** [lib] Error 2 The error clears with this patch: diff --git a/lib/snappy.c b/lib/snappy.c index 3848c6c..a25b2a4 100644 --- a/lib/snappy.c +++ b/lib/snappy.c @@ -41,6 +41,7 @@ #include <linux/slab.h> #include <linux/string.h> #include <linux/snappy.h> +#include <linux/vmalloc.h> #include <asm/unaligned.h> #define CRASH_UNLESS(x) BUG_ON(!(x)) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> (BTW: If you''re ever reworking this patch set, I''d like to make an ad > hoc request for slightly different names for fs/btrfs/snappy.c and > lib/snappy.c)Why?> When building a x86 kernel, I get the following errors: > CC [M] lib/snappy.o > lib/snappy.c: In function ''snappy_init_env'': > lib/snappy.c:1268:2: error: implicit declaration of function ''vmalloc'' > CC [M] fs/btrfs/free-space-cache.o > lib/snappy.c:1268:18: warning: assignment makes pointer from integer > without a cast > lib/snappy.c: In function ''snappy_free_env'': > lib/snappy.c:1293:2: error: implicit declaration of function ''vfree'' > make[1]: *** [lib/snappy.o] Error 1 > make: *** [lib] Error 2 > > The error clears with this patch:Added thanks. -Andi -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 14, 2012 at 1:52 PM, Andi Kleen <ak@linux.intel.com> wrote:> >> (BTW: If you''re ever reworking this patch set, I''d like to make an ad >> hoc request for slightly different names for fs/btrfs/snappy.c and >> lib/snappy.c) > > > Why? >It''s not a big deal, I just found it confusing at first to see "snappy.c" being a different file in two places.
richard -rw- weinberger
2012-Jul-19 20:58 UTC
Re: ANNOUNCE: linux-kernel-lzo-20120716 - update LZO
Hi! On Mon, Jul 16, 2012 at 8:30 PM, Markus F.X.J. Oberhumer <markus@oberhumer.com> wrote:> I encourage all compression users to test and benchmark this new version, > and I also would ask some official LZO maintainer to convert the updated > source files into a GIT commit and possibly push it to Linus or linux-next.You could also send a patch yourself. :-) -- Thanks, //richard -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba
2012-Oct-09 08:39 UTC
Re: ANN: linux-kernel-lzo-2.06.20120123 - update LZO to v2.06
On Wed, Jan 25, 2012 at 02:36:18AM +0100, Andi Kleen wrote:> I ran benchmarks on the new miniLZO and LZ4 on 64bit. LZ4 is generally slower > than snappy/lzo in the micro benchmarks.And the reason why you measured worse speed for LZ4 although (AFAICT) everybody else''s measurements claim the opposite is quite simple: likely due to a copy&paste error you did not benchmark LZ4 at all: https://github.com/andikleen/snappy-c/blob/master/glue.c#L282 274 void test_lz4(char *map, size_t size, char *fn) 275 { 276 int i; 277 int err; 278 size_t outlen = size * 2; 279 char *out = xmalloc(outlen); 280 char *buf2 = xmalloc(size); 281 282 BENCH(fastlz, "lz4", fn, NULL); ^^^^^^ 283 284 free(out); 285 free(buf2); 286 } (LZ4 is on it''s track towards kernel) david