Save to a file: B64E, LZ4 compressed MessagePack.


ASTER::ACTION::JSON::Serialize w/ MessagePack & Save to a file

JSON data currently in memory is stored in a file.

The JSON is first serialized to MessagePack, the resulting binary is compressed with LZ4, and the compressed bytes are then stored as Base64-encoded text. Compressing before the Base64 step reduces the size of the stored file.
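Base64 writes every 3 bytes of binary input as 4 text characters, so the stored text is about one third larger than the bytes it encodes. The following is a minimal sketch of that arithmetic, using the two sizes quoted in the LZ4 acceleration settings section of this document (6,469 bytes before compression, 3,834 bytes after). The helper name base64_length is made up for illustration and is not part of ASTER.

```python
import base64
import math


def base64_length(n_bytes: int) -> int:
    """Number of characters in the padded Base64 encoding of n_bytes bytes."""
    return 4 * math.ceil(n_bytes / 3)


# Sizes taken from the acceleration=1 measurement quoted in this document.
sizes = {"before LZ4": 6469, "after LZ4 (acceleration=1)": 3834}

for label, n_bytes in sizes.items():
    # Cross-check the formula against the standard library encoder.
    # The payload is dummy zero bytes; only its length matters here.
    encoded = base64.b64encode(bytes(n_bytes))
    assert len(encoded) == base64_length(n_bytes)
    print(f"{label}: {n_bytes} bytes -> {base64_length(n_bytes)} Base64 characters")
```

Because the Base64 overhead is proportional to the input size, every byte saved by LZ4 saves roughly 1.33 characters in the file.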

Parameter.1: Destination path of a file

ASTER rev.1.0.5 allows users to freely configure the data storage location. However, due to security concerns, future updates may change the specifications and restrict file storage destinations.

"./aster/savedata/b64eMsgPack.txt"

JSON is a text-based data format, whereas MessagePack represents the same data in a compact binary form.

The conversion to MessagePack uses nlohmann/json’s to_msgpack.

Relevant URL :

Even without compression, MessagePack output is generally smaller than the equivalent JSON text.

Data that contains a large number of numerical values benefits from binary representation, allowing for efficient reduction of data size.

For example, nlohmann/json stores every floating-point number as a double by default (NumberFloatType = double), and in JSON text each of those values is written out as a decimal string.

Relevant URLs
/// a class to store JSON values
/// @sa https://json.nlohmann.me/api/basic_json/
template<template<typename U, typename V, typename... Args> class ObjectType =
         std::map,
         template<typename U, typename... Args> class ArrayType = std::vector,
         class StringType = std::string, class BooleanType = bool,
         class NumberIntegerType = std::int64_t,
         class NumberUnsignedType = std::uint64_t,
         class NumberFloatType = double, // <-- floating-point numbers default to double
         template<typename U> class AllocatorType = std::allocator,
         template<typename T, typename SFINAE = void> class JSONSerializer =
         adl_serializer,
         class BinaryType = std::vector<std::uint8_t>, // cppcheck-suppress syntaxError
         class CustomBaseClass = void>
class basic_json;

When converting to binary, MessagePack stores each number in a compact integer or floating-point encoding instead of a decimal string. This alone helps reduce data size.

JSON inherently includes essential characters such as " (double quotes), : (colon), and {} (curly brackets) to define its data structure. MessagePack replaces these structural elements with binary identifiers and tags, effectively reducing redundant notation.
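The size difference described above can be checked with a small experiment. The sketch below is a hand-written encoder for a subset of MessagePack (None, bool, int, float, str, list, dict), written only so that the example runs with the Python standard library. It is not nlohmann/json's to_msgpack and not the msgpack package. One simplification is an assumption of this sketch: it always writes floats as 64-bit values, whereas real encoders may choose a shorter float format when no precision is lost. The record value mirrors the sample payload embedded in this document's Python script.

```python
import json
import struct


def _length_header(n: int, fix_base: int, fix_limit: int, tag16: int) -> bytes:
    """Header for a str/array/map holding n items (tag16 + 1 is the 32-bit tag)."""
    if n < fix_limit:
        return bytes([fix_base | n])
    if n < 1 << 16:
        return bytes([tag16]) + struct.pack(">H", n)
    return bytes([tag16 + 1]) + struct.pack(">I", n)


def pack(value) -> bytes:
    """Encode None, bool, int, float, str, list and dict as MessagePack."""
    if value is None:
        return b"\xc0"
    if isinstance(value, bool):  # must be tested before int
        return b"\xc3" if value else b"\xc2"
    if isinstance(value, int):
        if -32 <= value <= 127:  # fixint: the value itself is the single byte
            return struct.pack("b", value)
        if value >= 0:
            formats = ((0xCC, ">B", 8), (0xCD, ">H", 16), (0xCE, ">I", 32), (0xCF, ">Q", 64))
            for tag, fmt, bits in formats:
                if value < (1 << bits):
                    return bytes([tag]) + struct.pack(fmt, value)
        else:
            formats = ((0xD0, ">b", 8), (0xD1, ">h", 16), (0xD2, ">i", 32), (0xD3, ">q", 64))
            for tag, fmt, bits in formats:
                if value >= -(1 << (bits - 1)):
                    return bytes([tag]) + struct.pack(fmt, value)
        raise OverflowError("integer does not fit in 64 bits")
    if isinstance(value, float):
        return b"\xcb" + struct.pack(">d", value)  # always float 64 in this sketch
    if isinstance(value, str):
        data = value.encode("utf-8")
        if 32 <= len(data) < 256:  # str 8 has no array/map counterpart
            return b"\xd9" + bytes([len(data)]) + data
        return _length_header(len(data), 0xA0, 32, 0xDA) + data
    if isinstance(value, list):
        return _length_header(len(value), 0x90, 16, 0xDC) + b"".join(map(pack, value))
    if isinstance(value, dict):
        body = b"".join(pack(k) + pack(v) for k, v in value.items())
        return _length_header(len(value), 0x80, 16, 0xDE) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")


def sizes(value):
    """Return (compact JSON size, MessagePack size) in bytes."""
    text = json.dumps(value, ensure_ascii=False, separators=(",", ":"))
    return len(text.encode("utf-8")), len(pack(value))


numbers = [1234567890, 3.141592653589793, 2.718281828459045, 65535]
record = {"string": "日本語表現", "algorithm": ["LZ4", "Compress", "Decompress"]}

for name, value in (("numbers", numbers), ("record", record)):
    json_size, msgpack_size = sizes(value)
    print(f"{name}: JSON {json_size} bytes, MessagePack {msgpack_size} bytes")
```

The gain depends on the data. Long decimal values such as 3.141592653589793 shrink considerably, while a short value such as 0.1 takes more room as a 64-bit float than as three characters of text.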


Comparison of Compression Technologies

Compression Method   Compression Ratio   Compression Speed   Decompression Speed   Characteristics
zlib                 High                Medium              Medium                Versatile, nearly standard in PHP8
gzip                 High                Slow                Medium                Similar to zlib, file-oriented
ZIP                  Medium              Medium              Medium                Good for multiple files
zstd                 High                Fast                Fast                  High-speed archiving
LZ4                  Low                 Very Fast           Very Fast             Speed-focused, lower ratio

Starting from ASTER rev.1.0.5, data compression and decompression support zlib and LZ4. Because the decompression procedure is published as a Python 3 implementation, stored data can also be read in other environments.

Python3 Implementation: LZ4 version

import sys
import traceback
import binascii
import base64
import lz4.block
import msgpack
import json

# Usage (PowerShell): python base64_lz4_msgPack_python3_.py | Out-Host

# Function to decode byte string values in JSON (keys & values)
def decode_bytes(obj):
    if isinstance(obj, dict):
        return {k.decode('utf-8') if isinstance(k, bytes) else k: decode_bytes(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [decode_bytes(v) for v in obj]
    elif isinstance(obj, bytes):
        try:
            return obj.decode('utf-8')  # Decode byte strings as UTF-8.
        except UnicodeDecodeError:
            return obj.hex()  # Return in hexadecimal notation upon failure.
    else:
        return obj

# Base64-encoded input data. Replace with your own data; it must be LZ4-compressed MessagePack.
base64_encoded_data = "8CyCpnN0cmluZ6/ml6XmnKzoqp7ooajnj76pYWxnb3JpdGhtk6NMWjSoQ29tcHJlc3OqRGVjb21wcmVzcw=="

# Debug: Raise an error if the Base64 data is empty
if not base64_encoded_data:
    raise ValueError("Error: base64_encoded_data is empty. Provide a valid Base64 string.")

# Base64 decoding
binary_data = base64.b64decode(base64_encoded_data)

# Debug: HEX Dump of Compressed Data
print(f"Binary Data (HEX dump, first bytes): {binascii.hexlify(binary_data[:20]).decode()}")
print(f"Binary Data (HEX dump, last bytes): {binascii.hexlify(binary_data[-20:]).decode()}")

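# Decompression using LZ4
# The block has no size header, so uncompressed_size is only an upper bound for the
# output buffer. Start at twice the compressed size and double it on failure,
# up to LZ4's maximum expansion of roughly 255x.
buffer_size = max(len(binary_data) << 1, 64)
max_buffer_size = len(binary_data) * 255 + 64
while True:
    try:
        decompressed_data = lz4.block.decompress(binary_data, uncompressed_size=buffer_size)
        break
    except lz4.block.LZ4BlockError as e:
        if buffer_size < max_buffer_size:
            buffer_size <<= 1  # The buffer may have been too small; retry with a larger one.
            continue
        print(f"Error: LZ4 decompression failed - {e}")
        traceback.print_exc()  # Display detailed error messages.
        sys.exit(1)
    except Exception as e:
        print(f"Unexpected Error: {e}")
        traceback.print_exc()  # Display the stack trace when an unexpected error occurs.
        sys.exit(1)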

# Decoding using MessagePack
try:
    original_data = msgpack.unpackb(decompressed_data, raw=False)
except Exception as e:
    print(f"Error: MessagePack decoding failed - {e}")
    sys.exit(1)

# Decode and format JSON for display
decoded_data = decode_bytes(original_data)
print(json.dumps(decoded_data, ensure_ascii=False, indent=2))
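The script above depends on the third-party lz4 and msgpack packages. Where lz4 is not available, the raw LZ4 block format is simple enough to decode by hand. The following is a minimal sketch of such a decoder using only the standard library. It is written for illustration, is not the code ASTER uses, and assumes the stored data is a single raw LZ4 block without a size header or frame, which is what the sample string in the script contains. The function name lz4_block_decompress is hypothetical.

```python
import base64


def lz4_block_decompress(block: bytes) -> bytes:
    """Decode one raw LZ4 block (no size header, no frame)."""
    out = bytearray()
    pos = 0
    while pos < len(block):
        token = block[pos]
        pos += 1

        # Literal run: the length is the high nibble of the token, extended
        # by following bytes while each of them equals 255.
        literal_len = token >> 4
        if literal_len == 15:
            while True:
                extra = block[pos]
                pos += 1
                literal_len += extra
                if extra != 255:
                    break
        if pos + literal_len > len(block):
            raise ValueError("truncated literal run")
        out += block[pos:pos + literal_len]
        pos += literal_len
        if pos >= len(block):
            break  # the last sequence carries literals only

        # Match: a 2-byte little-endian offset pointing back into the output.
        offset = block[pos] | (block[pos + 1] << 8)
        pos += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = block[pos]
                pos += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4  # the format's minimum match length

        # Copy byte by byte because a match may overlap the bytes being written.
        start = len(out) - offset
        for i in range(match_len):
            out.append(out[start + i])
    return bytes(out)


# The sample string from the script above.
sample = "8CyCpnN0cmluZ6/ml6XmnKzoqp7ooajnj76pYWxnb3JpdGhtk6NMWjSoQ29tcHJlc3OqRGVjb21wcmVzcw=="
payload = lz4_block_decompress(base64.b64decode(sample))
print(f"{len(payload)} bytes of MessagePack, starting with {payload[:8]!r}")
```

The returned bytes are the MessagePack payload and can be passed to msgpack.unpackb exactly as in the script. The sketch does not harden against malformed input beyond the checks shown, so the lz4 package remains the better choice where it can be installed.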

Since the data is not encrypted, developers can still decode and edit it. For regular users, however, modifying the stored data is harder than editing plain-text JSON, which makes it somewhat more resistant to casual tampering.

LZ4 acceleration settings :

ASTER rev.1.0.5 selects acceleration=1 as the default setting.

int compressedSize = LZ4_compress_fast(
	reinterpret_cast<const char*>(inputData.data()), 
	reinterpret_cast<char*>(compressedData.data()), 
	static_cast<int>(inputData.size()), 
	static_cast<int>(compressedData.size()), 
	1 // acceleration
	// acceleration=1: original 6,469 bytes -> compressed 3,834 bytes
	// acceleration=5: original 6,469 bytes -> compressed 4,425 bytes
);
  • acceleration=1 is the LZ4 default (equivalent to LZ4_compress_default) and gives the highest compression ratio of the available settings.
  • acceleration=5 favors processing speed at the cost of a lower compression ratio (4,425 bytes instead of 3,834 for the same 6,469-byte input).