snappy-java [![Build Status](https://travis-ci.org/xerial/snappy-java.svg?branch=master)](https://travis-ci.org/xerial/snappy-java) [![Maven Central](https://maven-badges.herokuapp.com/maven-central/org.xerial.snappy/snappy-java/badge.svg)](https://maven-badges.herokuapp.com/maven-central/org.xerial.snappy/snappy-java/) [![Javadoc](https://javadoc.io/badge2/org.xerial.snappy/snappy-java/javadoc.svg)](https://javadoc.io/doc/org.xerial.snappy/snappy-java)
===
Snappy compressor/decompressor for Java.

snappy-java is a Java port of [snappy](https://github.com/google/snappy), a fast C++ compressor/decompressor developed by Google.

## Features
* Fast compression/decompression, around 200~400MB/sec.
* Low memory usage. `SnappyOutputStream` uses only 32KB+ by default.
* JNI-based implementation to achieve performance comparable to the native C++ version.
  * Although snappy-java uses JNI, it can be used safely with multiple class loaders (e.g. Tomcat).
* Compression/decompression of Java primitive arrays (`float[]`, `double[]`, `int[]`, `short[]`, `long[]`, etc.)
  * To improve the compression ratios of these arrays, you can apply a fast data-rearrangement implementation ([`BitShuffle`](https://oss.sonatype.org/service/local/repositories/releases/archive/org/xerial/snappy/snappy-java/1.1.8/snappy-java-1.1.8-javadoc.jar/!/org/xerial/snappy/BitShuffle.html)) before compression.
* Portable across various operating systems. snappy-java contains native libraries built for Windows/Mac/Linux, etc., and loads one of them according to your machine environment (it looks at the system properties `os.name` and `os.arch`).
* Simple usage. Add the snappy-java-(version).jar file to your classpath, then call the compression/decompression methods in `org.xerial.snappy.Snappy`.
* [Framing-format support](https://github.com/google/snappy/blob/master/framing_format.txt) (since version 1.1.0)
* OSGi support
* [Apache License Version 2.0](http://www.apache.org/licenses/LICENSE-2.0). Free for both commercial and non-commercial use.

## Performance
* Snappy's main target is very high-speed compression/decompression with a reasonable compression size. The compression ratio of snappy-java is therefore modest, about the same as `LZF` (ranging 20%-100% according to the dataset).
* Here are some [benchmark results](https://github.com/ning/jvm-compressor-benchmark/wiki) comparing snappy-java with other compressors: `LZO-java`/`LZF`/`QuickLZ`/`Gzip`/`Bzip2`. Thanks to [Tatu Saloranta @cowtowncoder](http://twitter.com/#!/cowtowncoder) for providing the benchmark suite.
  * The benchmark result indicates snappy-java is the fastest compressor/decompressor in Java: https://ning.github.io/jvm-compressor-benchmark/results/canterbury-roundtrip-2011-07-28/index.html
  * The decompression speed is twice as fast as the others: https://ning.github.io/jvm-compressor-benchmark/results/canterbury-uncompress-2011-07-28/index.html

## Download
[![Maven Central](https://maven-badges.herokuapp.com/maven-central/org.xerial.snappy/snappy-java/badge.svg)](https://maven-badges.herokuapp.com/maven-central/org.xerial.snappy/snappy-java/) [![Javadoc](https://javadoc.io/badge2/org.xerial.snappy/snappy-java/javadoc.svg)](https://javadoc.io/doc/org.xerial.snappy/snappy-java)

* [Release Notes](Milestone.md)

The current stable version is available from here:
* Release version: https://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/
* Snapshot version (the latest beta version): https://oss.sonatype.org/content/repositories/snapshots/org/xerial/snappy/snappy-java/

### Using with Maven
Snappy-java is available from Maven's central repository.
Add the following dependency to your pom.xml:

    <dependency>
      <groupId>org.xerial.snappy</groupId>
      <artifactId>snappy-java</artifactId>
      <version>(version)</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>

### Using with sbt

```
libraryDependencies += "org.xerial.snappy" % "snappy-java" % "(version)"
```

## Usage
First, import `org.xerial.snappy.Snappy` in your Java code:

```java
import org.xerial.snappy.Snappy;
```

Then use `Snappy.compress(byte[])` and `Snappy.uncompress(byte[])`:

```java
String input = "Hello snappy-java! Snappy-java is a JNI-based wrapper of "
     + "Snappy, a fast compressor/decompressor.";
// Compress the UTF-8 bytes, then restore them
byte[] compressed = Snappy.compress(input.getBytes("UTF-8"));
byte[] uncompressed = Snappy.uncompress(compressed);

String result = new String(uncompressed, "UTF-8");
System.out.println(result);
```

In addition, high-level methods (`Snappy.compress(String)`, `Snappy.compress(float[] ..)`, etc.) and low-level ones (e.g. `Snappy.rawCompress(..)`, `Snappy.rawUncompress(..)`), which minimize memory copies, can be used.

### Stream-based API
The stream-based compressor/decompressor classes `SnappyOutputStream`/`SnappyInputStream` are also available for reading/writing large data sets. `SnappyFramedOutputStream`/`SnappyFramedInputStream` can be used for the [framing format](https://github.com/google/snappy/blob/master/framing_format.txt).

* See also [Javadoc API](https://oss.sonatype.org/service/local/repositories/releases/archive/org/xerial/snappy/snappy-java/1.1.3-M1/snappy-java-1.1.3-M1-javadoc.jar/!/index.html)

#### Compatibility Notes

The original Snappy format definition did not define a file format. A "framing" format was added later to define one, but by that point major software was already using an industry standard instead, represented in this library by the `SnappyOutputStream` and `SnappyInputStream` classes.

For interoperability with other libraries, check that compatible formats are used. Note that not all libraries support all variants.
* `SnappyOutputStream` and `SnappyInputStream` use the `[magic header:16 bytes]([block size:int32][compressed data:byte array])*` format. You can read the result of `Snappy.compress` with `SnappyInputStream`, but you cannot read the compressed data generated by `SnappyOutputStream` with `Snappy.uncompress`.
* `SnappyHadoopCompatibleOutputStream` does not emit a file header but writes out the current block size as a preamble to each block.

#### Data format compatibility matrix:

| Write\Read | `Snappy.uncompress` | `SnappyInputStream` | `SnappyFramedInputStream` | `org.apache.hadoop.io.compress.SnappyCodec` |
| --------------- |:-------------------:|:------------------:|:-----------------------:|:-------------------------------------------:|
| `Snappy.compress` | ok | ok | x | x |
| `SnappyOutputStream` | x | ok | x | x |
| `SnappyFramedOutputStream` | x | x | ok | x |
| `SnappyHadoopCompatibleOutputStream` | x | x | x | ok |

### BitShuffle API (Since 1.1.3-M2)

BitShuffle is an algorithm that reorders data bits (shuffle) for efficient compression (e.g., a sequence of integers, float values, etc.). To use BitShuffle routines, import `org.xerial.snappy.BitShuffle`:

```java
import java.util.Arrays;
import org.xerial.snappy.BitShuffle;
import org.xerial.snappy.Snappy;

int[] data = new int[] {1, 3, 34, 43, 34};
// Shuffle the bits first, then compress the shuffled bytes
byte[] shuffledByteArray = BitShuffle.shuffle(data);
byte[] compressed = Snappy.compress(shuffledByteArray);
// Reverse the steps: uncompress, then unshuffle back to int[]
byte[] uncompressed = Snappy.uncompress(compressed);
int[] result = BitShuffle.unshuffleIntArray(uncompressed);

System.out.println(Arrays.toString(result));
```

Shuffling and unshuffling of primitive arrays (e.g., `short[]`, `long[]`, `float[]`, `double[]`, etc.) is supported. See [Javadoc](http://static.javadoc.io/org.xerial.snappy/snappy-java/1.1.3-M1/org/xerial/snappy/BitShuffle.html) for the details.
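
To make the idea of reordering bits concrete, here is a minimal plain-Java sketch of a bit transpose over an `int[]`: it stores bit 0 of every value first, then bit 1 of every value, and so on. It is an illustration only and not the implementation behind `org.xerial.snappy.BitShuffle`; the class `BitTransposeSketch` and its methods are hypothetical names, and the bytes it produces are not assumed to be compatible with `BitShuffle.shuffle`.

```java
import java.util.Arrays;

// Hypothetical illustration of bit reordering; NOT snappy-java's BitShuffle.
final class BitTransposeSketch {

    /** Writes bit 0 of every value, then bit 1 of every value, up to bit 31. */
    static byte[] shuffle(int[] values) {
        byte[] out = new byte[values.length * 4];
        int pos = 0; // index of the next output bit
        for (int bit = 0; bit < 32; bit++) {
            for (int value : values) {
                if (((value >>> bit) & 1) != 0) {
                    out[pos >>> 3] |= (byte) (1 << (pos & 7));
                }
                pos++;
            }
        }
        return out;
    }

    /** Inverse of shuffle: rebuilds each int from the 32 bit planes. */
    static int[] unshuffle(byte[] shuffled) {
        int count = shuffled.length / 4;
        int[] values = new int[count];
        int pos = 0; // index of the next input bit
        for (int bit = 0; bit < 32; bit++) {
            for (int i = 0; i < count; i++) {
                if ((shuffled[pos >>> 3] & (1 << (pos & 7))) != 0) {
                    values[i] |= 1 << bit;
                }
                pos++;
            }
        }
        return values;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 34, 43, 34};
        byte[] shuffled = shuffle(data);
        // prints [1, 3, 34, 43, 34]
        System.out.println(Arrays.toString(unshuffle(shuffled)));
    }
}
```

For small integers such as the sample above, the high bit planes are all zero, so the shuffled bytes end in a long run of zeros. That kind of repetitive input is what a byte-oriented compressor such as Snappy handles well.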
### Setting classpath
If you have snappy-java-(VERSION).jar in the current directory, use the `-classpath` option as follows:

    $ javac -classpath ".;snappy-java-(VERSION).jar" Sample.java  # in Windows

or

    $ javac -classpath ".:snappy-java-(VERSION).jar" Sample.java  # in Mac or Linux

## Public discussion group
Post bug reports or feature requests to the Issue Tracker: <https://github.com/xerial/snappy-java/issues>

Public discussion forum is here: [Xerial Public Discussion Group](http://groups.google.com/group/xerial?hl=en)

## For developers

snappy-java uses sbt (simple build tool for Scala) as a build tool. Here is a simple usage:

    $ ./sbt            # enter sbt console
    > ~test            # run tests upon source code change
    > ~testOnly        # run tests that match a given name pattern
    > publishM2        # publish jar to $HOME/.m2/repository
    > package          # create jar file
    > findbugs         # produce findbugs report in target/findbugs
    > jacoco:cover     # report the code coverage of tests to target/jacoco folder

If you need to see detailed debug messages, launch sbt with the `-Dloglevel=debug` option:

```
$ ./sbt -Dloglevel=debug
```

For the details of sbt usage, see my blog post: [Building Java Projects with sbt](http://xerial.org/blog/2014/03/24/sbt/)

### Building from the source code
See the [build instructions](https://github.com/xerial/snappy-java/blob/master/BUILD.md). Building from the source code is an option when your OS platform and CPU architecture are not supported. To build snappy-java, you need Git, JDK (1.6 or higher), a g++ compiler (mingw on Windows), etc.

    $ git clone https://github.com/xerial/snappy-java.git
    $ cd snappy-java
    $ make

When building on Solaris, use `gmake`:

    $ gmake

The resulting file `target/snappy-java-$(version).jar` additionally contains the native library built for your platform.

### Creating a new release

The GitHub Action [.github/workflows/release.yml](https://github.com/xerial/snappy-java/blob/master/.github/workflows/release.yml) publishes a new release to Maven Central (Sonatype) when a new tag vX.Y.Z is pushed.
## Miscellaneous Notes
### Using snappy-java with Tomcat 6 (or higher) Web Server

Simply put the snappy-java jar in the WEB-INF/lib folder of your web application. The usual JNI-library-specific problem no longer exists, since snappy-java version 1.0.3 or higher can be loaded by multiple class loaders.

### Configure snappy-java using property file

Prepare an org-xerial-snappy.properties file (under the root path of your library) in Java's property file format.
Here is a list of the available properties:

* org.xerial.snappy.lib.path (directory containing snappy-java's native library)
* org.xerial.snappy.lib.name (library file name)
* org.xerial.snappy.tempdir (temporary directory to extract a native library bundled in snappy-java)
* org.xerial.snappy.use.systemlib (if this value is true, use the system-installed libsnappyjava.so, looking in the path specified by java.library.path)

----
Snappy-java is developed by [Taro L. Saito](http://www.xerial.org/leo). Twitter [@taroleo](http://twitter.com/#!/taroleo)