Encoding Encrypted Messages Inside Legal Chess Games Using PGN

Steganography focuses on concealing the existence of communication rather than only protecting its contents. While encryption makes data unreadable, steganography aims to make communication itself unremarkable. Most modern steganographic systems rely on digital media—images, audio, or video—as carrier formats. These approaches, however, face recurring challenges: statistical detectability, format fragility, compression artifacts, and strict capacity limits imposed by the medium.
This work explores an alternative direction: symbolic, rule-constrained steganography using legal chess notation (PGN). By combining deterministic chess move generation with AES-256-GCM authenticated encryption, the system produces standard, valid chess games that can securely encode encrypted messages while remaining compatible with existing chess tools and PGN validators.
The full technical design and evaluation are presented in my research paper. This article explains the motivation, design decisions, and practical implications of the project.
Why Chess PGN Is a Viable Steganographic Medium
Chess PGN is a structured, globally standardized textual format used to represent games and analysis. It offers several properties that make it particularly suitable for steganography:
Every move is constrained by formal chess rules
Long games and deep analysis trees are common and expected
PGN files are widely shared, archived, and parsed across platforms
Variations are a native feature of the format
Unlike media-based carriers, PGN does not degrade under copying or reformatting, and its validity can be mechanically verified. A syntactically correct PGN that follows chess rules does not appear anomalous simply because it is long or complex.
This project leverages those properties to create a rule-validated carrier, where illegal or malformed outputs are structurally impossible.
System Overview
The system operates as a deterministic pipeline:
Plaintext → Authenticated Encryption → Integer Conversion → Chess Encoding → PGN Output
The input message is optionally protected with a password.
The message is encrypted using AES-256-GCM, providing confidentiality and built-in integrity verification.
The encrypted payload is converted into a large integer.
That integer is encoded into chess moves by selecting legal moves based on its value.
When linear encoding capacity is exhausted, chess variations are introduced to extend capacity.
Decoding reverses the process. Any modification to the PGN—whether accidental or intentional—results in authentication failure during decryption.
Cryptographic Design and Security Model
The security of the system is grounded in established cryptographic primitives rather than obscurity.
Authenticated Encryption
AES-256-GCM is used to ensure both confidentiality and integrity.
Any alteration to the encoded data results in authentication failure.
No separate checksum or signature is required.
Optional Password Protection
Password-based encryption uses PBKDF2-HMAC-SHA256 with a high iteration count.
This adds resistance against offline brute-force attacks.
A truncated password hash allows fast rejection of incorrect passwords without full decryption.
The system does not claim information-theoretic steganographic security. Indistinguishability claims are empirical and based on observed behavior, not formal proofs. This limitation is explicitly acknowledged in the research.
Deterministic Encoding Using Legal Chess Moves
At the core of the system is a deterministic mapping between integers and legal chess moves.
For any given board position:
All legal moves are generated
Moves are sorted deterministically
The encrypted integer selects a move using modular arithmetic
The remaining value continues into the next position
Each position effectively acts as a base-N encoding system, where N equals the number of legal moves available in that position. On average, this yields approximately 4.3 bits per chess ply, consistent with experimental results.
Because the data is encrypted before encoding, the move selection appears statistically similar to arbitrary play rather than reflecting properties of the original message.
Capacity Expansion Through Variations
A defining contribution of this work is the systematic use of chess variations to expand encoding capacity.
Earlier symbolic or text-based steganography approaches typically encode data along a single linear structure. Once exhausted, encoding must stop. Chess PGN naturally avoids this limitation.
When the mainline game can no longer encode additional data:
Variations are added at positions with multiple legal alternatives
Each variation becomes an additional encoding path
Encoding proceeds across the resulting move tree
This approach provides theoretically unbounded capacity, constrained primarily by practical PGN size limits rather than the encoding model itself. Importantly, such variation-heavy PGNs are common in real chess analysis, making this expansion method structurally consistent with legitimate content.
Integrity and Tamper Detection
Because encryption uses AES-GCM:
Any modification to moves, order, or structure is detected
Insertions, deletions, or reordering of moves break authentication
Tampering cannot silently alter the decoded message
This makes the system tamper-evident by design, a property that many traditional steganographic techniques lack.
Comparison With Prior Research
Compared to traditional image or audio steganography:
No manipulation of pixel or frequency distributions
No vulnerability to compression or transcoding
No fixed carrier size limits
Compared to text-based steganography:
No synonym substitution or linguistic artifacts
No dependence on natural language generation
No semantic distortion
Compared to earlier game-based ideas:
Uses standard PGN rather than custom formats
Integrates authenticated encryption, not just concealment
Introduces structured variation-based capacity scaling
The system is compatible with existing chess tooling while providing stronger integrity guarantees than most steganographic methods.
Live Implementation and Demo
A public demo of the project is available at:
For demonstration purposes:
A 10,000-character input limit is currently enforced
This limit applies only to the demo environment
The underlying system supports significantly larger payloads
The demo also includes optional password protection, allowing users to observe authenticated encryption behavior in practice.
Practical Limitations
The system is not intended to hide arbitrarily large datasets without scrutiny. Extremely long PGNs with excessive branching may attract attention, and encoding time increases linearly with message size.
The design prioritizes correctness, verifiability, and security over maximal concealment.
Closing Remarks
This project demonstrates that rule-constrained symbolic systems, such as chess, can serve as reliable steganographic channels when combined with modern cryptography. By respecting both cryptographic and domain-specific constraints, the system produces outputs that are valid, portable, and tamper-evident.
Rather than replacing existing steganographic approaches, this work introduces a complementary direction—one where structure, rules, and determinism form the foundation of secure hidden communication.
Note: The header image for this article is AI-generated and is used solely for visual illustration.

