Gzip: better compression ratio, higher CPU usage. Good for cold (rarely accessed) data
Snappy, LZO: lower compression ratio, lower CPU usage. Good for hot (frequently accessed) data. Snappy usually performs better than LZO
bzip2: can compress better than gzip on some files, at the cost of higher CPU usage. Not supported by HBase
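lz4 and zstd are splittable only inside a container format (Parquet, ORC, Avro, SequenceFile); standalone .lz4 and .zst files are not split by the stock Hadoop codecs. bzip2 is splittable natively, and LZO is splittable once indexed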
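
The ratio versus CPU trade-off in the notes above can be checked on your own data. The following is a minimal sketch using only the Python standard library, so it covers gzip and bzip2 but not Snappy, LZO, lz4, or zstd (those need third-party packages). The function name `compare_codecs` and the sample log lines are hypothetical, and timings will vary by machine and input.

```python
import bz2
import gzip
import time


def compare_codecs(data: bytes) -> dict:
    """Compress `data` with gzip and bzip2; report size, ratio, and time."""
    codecs = {"gzip": gzip.compress, "bzip2": bz2.compress}
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {
            "bytes": len(compressed),
            "ratio": len(data) / len(compressed),
            "seconds": elapsed,
        }
    return results


# Hypothetical sample: repetitive, log-like text (compresses well).
sample = b"".join(
    b"2024-01-01 INFO request id=%d status=200\n" % i for i in range(20000)
)

results = compare_codecs(sample)
for name, stats in results.items():
    print(f"{name}: {stats['bytes']} bytes, "
          f"ratio {stats['ratio']:.1f}x, {stats['seconds']:.4f}s")
```

Running this on a representative slice of real data is a quick way to see whether the extra CPU cost of bzip2 buys a meaningfully smaller file for that data.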
When saving as normal (plain) files
Supported Codecs: bzip2, deflate, uncompressed, lz4, gzip, snappy, none
Default: none
When saving as Parquet or ORC
Supported Codecs (Parquet): brotli, uncompressed, lz4, gzip, lzo, snappy, none, zstd
Supported Codecs (ORC): uncompressed, lzo, snappy, zlib, none
Default: snappy
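
The codec lists and defaults above can be captured in a small lookup so that an unsupported choice fails early. This is only a sketch: the helper `resolve_codec`, the format key `"plain"`, and the dictionary names are hypothetical; the codec names and defaults are copied from the notes as written.

```python
# Codec lists and defaults as given in the notes above.
SUPPORTED_CODECS = {
    "plain": {"bzip2", "deflate", "uncompressed", "lz4", "gzip", "snappy", "none"},
    "parquet": {"brotli", "uncompressed", "lz4", "gzip", "lzo", "snappy", "none", "zstd"},
    "orc": {"uncompressed", "lzo", "snappy", "zlib", "none"},
}
DEFAULT_CODEC = {"plain": "none", "parquet": "snappy", "orc": "snappy"}


def resolve_codec(file_format, codec=None):
    """Return a validated codec name, or the format's default if none is given."""
    fmt = file_format.lower()
    if fmt not in SUPPORTED_CODECS:
        raise ValueError(f"unknown file format: {file_format!r}")
    if codec is None:
        return DEFAULT_CODEC[fmt]
    name = codec.lower()
    if name not in SUPPORTED_CODECS[fmt]:
        allowed = ", ".join(sorted(SUPPORTED_CODECS[fmt]))
        raise ValueError(f"{codec!r} is not supported for {fmt}; use one of: {allowed}")
    return name


print(resolve_codec("parquet"))      # falls back to the Parquet default
print(resolve_codec("orc", "ZLIB"))  # codec names are matched case-insensitively
```

Assuming these notes describe Spark's writers (the notes do not name the engine), the resolved name would typically be passed along as `df.write.option("compression", name)` before calling the format-specific save method.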
Reference: Choosing a Data Compression Format | 5.6.x | Cloudera Documentation