# Word of Wisdom Protocol Specification
## Overview
The **Word of Wisdom** protocol is a TCP-based challenge-response protocol designed to mitigate DDoS attacks by requiring clients to solve a **Proof-of-Work (PoW)** puzzle before accessing protected resources (quotes).
The protocol uses **HMAC-signed challenges** so that the server stays **stateless** (no challenge storage is needed), aggressive timeouts to prevent slowloris attacks, and simple binary framing with JSON payloads.
## Proof of Work Algorithm Choice
### Selected Algorithm: SHA-256 Hashcash with HMAC Authentication
The protocol uses **SHA-256 based Hashcash** with **HMAC-signed challenges** for secure, stateless operation.
For detailed analysis of alternative PoW algorithms and comprehensive justification of this choice, see [POW_ANALYSIS.md](./POW_ANALYSIS.md).
### Key Benefits
- **Proven Security**: Battle-tested in Bitcoin with 15+ years of production experience
- **Stateless Server**: HMAC signatures eliminate challenge storage, enabling horizontal scaling
- **Universal Compatibility**: No memory requirements ensure compatibility across all client types
- **Fast Verification**: Near-instant verification enables high server throughput under attack
- **Adjustable Difficulty**: Fine-grained control through leading zero bits (3-10 bits practical range)
### Difficulty Scaling Strategy
- **Normal Operations**: 4-bit difficulty (average 16 hash attempts)
- **Load-Based Adjustment**: +1 bit difficulty when server load exceeds threshold
- **Failure-Based Penalty**: +2 bits per 5 failed attempts in 2-minute window (capped at +6 extra bits)
- **Success Reset**: Failure counter resets to zero after successful solution
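The scaling rules above can be combined into a single calculation. The following is a minimal sketch, not the project's implementation: the function name and constants are hypothetical, and clamping the result to the 3-10 bit range from the Security Considerations section is an assumption about how the rules interact.

```python
BASE_DIFFICULTY = 4      # normal operations
MIN_DIFFICULTY = 3       # lower bound (assumed to apply to the final value)
MAX_DIFFICULTY = 10      # upper bound (assumed to apply to the final value)
FAILURES_PER_STEP = 5    # every 5 failures in the window...
BITS_PER_STEP = 2        # ...add 2 bits
MAX_PENALTY_BITS = 6     # cap on the failure penalty


def effective_difficulty(recent_failures: int, under_load: bool) -> int:
    """Combine base difficulty with the load and failure adjustments."""
    steps = recent_failures // FAILURES_PER_STEP
    penalty = min(steps * BITS_PER_STEP, MAX_PENALTY_BITS)
    load_bits = 1 if under_load else 0
    total = BASE_DIFFICULTY + load_bits + penalty
    return max(MIN_DIFFICULTY, min(total, MAX_DIFFICULTY))
```

With these numbers, a client with 5 recent failures on an unloaded server would be asked for 6 bits, and the penalty alone can never push the difficulty past 10.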
## Protocol Flow
### Successful Flow
```
Client Server
| |
|-------- CHALLENGE_REQUEST ------------->|
| |
|<------- CHALLENGE_RESPONSE -------------| (HMAC-signed)
| |
|-------- SOLUTION_REQUEST -------------->|
| |
|<------- QUOTE_RESPONSE -----------------| (if solution valid)
| |
```
### Error Flow
```
Client Server
|-------- CHALLENGE_REQUEST ------------->|
|<------- CHALLENGE_RESPONSE -------------|
|-------- SOLUTION_REQUEST (invalid) ---->|
|<------- ERROR_RESPONSE -----------------| (if solution invalid)
```
## Message Format
All protocol messages use a binary format with the following structure:
```
+------------------+------------------+------------------+
| Message Type | Length | Payload |
| (1 byte) | (4 bytes) | (N bytes) |
+------------------+------------------+------------------+
```
- **Message Type**: Single byte indicating message type (see table below)
- **Length**: 32-bit big-endian integer indicating payload length in bytes
- **Payload**: Variable-length payload (can be empty, maximum 8KB for security)
### Encoding Details
- **Endianness**: All multi-byte integers use big-endian encoding
- **JSON Format**: UTF-8 encoding, compact format (no pretty-printing)
- **Size Limits**: Maximum 8KB payload to prevent memory exhaustion attacks
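As an illustration of the framing described above, the sketch below packs and unpacks a frame with Python's `struct` module. The function names are hypothetical; only the 1-byte type, 4-byte big-endian length, and 8KB limit come from this specification.

```python
import struct

HEADER = struct.Struct(">BI")   # 1-byte type, 4-byte big-endian length (5 bytes total)
MAX_PAYLOAD = 8 * 1024          # 8KB payload limit


def encode_frame(msg_type: int, payload: bytes = b"") -> bytes:
    """Build a frame: header followed by the raw payload bytes."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 8KB limit")
    return HEADER.pack(msg_type, len(payload)) + payload


def decode_frame(data: bytes) -> tuple[int, bytes]:
    """Parse one complete frame from the start of a buffer."""
    if len(data) < HEADER.size:
        raise ValueError("incomplete header")
    msg_type, length = HEADER.unpack_from(data)
    # Reject oversized lengths before trusting them for any allocation.
    if length > MAX_PAYLOAD:
        raise ValueError("declared length exceeds 8KB limit")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("incomplete payload")
    return msg_type, payload
```

Checking the declared length before reading the payload is what makes the size limit effective against memory exhaustion: the receiver never allocates based on an untrusted length.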
## Message Types
| Value | Name | Direction | Description |
|-------|------|-----------|-------------|
| 0x01 | CHALLENGE_REQUEST | Client → Server | Client requests a new PoW challenge |
| 0x02 | CHALLENGE_RESPONSE | Server → Client | Server issues HMAC-signed challenge |
| 0x03 | SOLUTION_REQUEST | Client → Server | Client submits challenge + nonce |
| 0x04 | QUOTE_RESPONSE | Server → Client | Server sends quote (if solution valid) |
| 0x05 | ERROR_RESPONSE | Server → Client | Server reports an error |
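One way to represent this table in code is an enumeration plus a direction check. This is a sketch only; `MessageType` and `parse_type` are hypothetical names, while the values and directions are taken from the table.

```python
from enum import IntEnum


class MessageType(IntEnum):
    CHALLENGE_REQUEST = 0x01
    CHALLENGE_RESPONSE = 0x02
    SOLUTION_REQUEST = 0x03
    QUOTE_RESPONSE = 0x04
    ERROR_RESPONSE = 0x05


# Types that travel Client -> Server; everything else is Server -> Client.
CLIENT_TO_SERVER = frozenset(
    {MessageType.CHALLENGE_REQUEST, MessageType.SOLUTION_REQUEST}
)


def parse_type(value: int, from_client: bool) -> MessageType:
    """Map a type byte to MessageType and check it fits the sender's role."""
    msg_type = MessageType(value)  # raises ValueError for unknown values
    if (msg_type in CLIENT_TO_SERVER) != from_client:
        raise ValueError(f"{msg_type.name} sent in the wrong direction")
    return msg_type
```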
## Message Payloads
### CHALLENGE_REQUEST (0x01)
- **Payload**: Empty
- **Description**: Client requests a new challenge from the server
- **Usage**: First message in the protocol flow
### CHALLENGE_RESPONSE (0x02)
- **Payload**: JSON-encoded challenge object
- **Description**: Server provides HMAC-signed challenge for PoW computation
- **Format**:
```json
{
"id": "challenge_unique_id",
"timestamp": 1640995200,
"difficulty": 4,
"resource": "192.168.1.100:8080",
"random": "a1b2c3d4e5f6",
"hmac": "base64url_encoded_signature"
}
```
**Field Descriptions**:
- **id**: Unique identifier for this challenge
- **timestamp**: Unix timestamp when challenge was created
- **difficulty**: Number of leading zero bits required in solution hash
- **resource**: Server resource identifier (typically IP:port)
- **random**: Random hex string for challenge uniqueness
- **hmac**: HMAC-SHA256 signature of canonical challenge fields
**Security Notes**:
- Server is **stateless**: no need to store challenges locally
- HMAC signature prevents challenge forgery and tampering
- Timestamp enables TTL validation without server-side storage
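The sketch below shows how a challenge might be signed and checked. This specification does not define the canonical form fed to HMAC-SHA256, so the code assumes compact JSON with sorted keys over every field except `hmac`; the function names are also hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


def canonical_bytes(challenge: dict) -> bytes:
    # Assumed canonical form: sorted-key compact JSON of all fields but "hmac".
    fields = {k: v for k, v in challenge.items() if k != "hmac"}
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_challenge(challenge: dict, secret: bytes) -> str:
    digest = hmac.new(secret, canonical_bytes(challenge), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def verify_challenge(challenge: dict, secret: bytes) -> bool:
    expected = sign_challenge(challenge, secret).encode("utf-8")
    supplied = str(challenge.get("hmac", "")).encode("utf-8")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, supplied)


def new_challenge(resource: str, difficulty: int, secret: bytes) -> dict:
    challenge = {
        "id": secrets.token_hex(8),
        "timestamp": int(time.time()),
        "difficulty": difficulty,
        "resource": resource,
        "random": secrets.token_hex(6),  # 12 hex characters
    }
    challenge["hmac"] = sign_challenge(challenge, secret)
    return challenge
```

Because the signature covers `difficulty` and `timestamp`, a client cannot lower the difficulty or extend the lifetime of a challenge without invalidating the HMAC.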
### SOLUTION_REQUEST (0x03)
- **Payload**: JSON-encoded solution object
- **Description**: Client submits PoW solution with original challenge
- **Format**:
```json
{
"challenge": {
"id": "challenge_unique_id",
"timestamp": 1640995200,
"difficulty": 4,
"resource": "192.168.1.100:8080",
"random": "a1b2c3d4e5f6",
"hmac": "base64url_encoded_signature"
},
"nonce": "solution_nonce_value"
}
```
**Requirements**:
- Client must echo the complete original challenge object
- Nonce must produce a valid PoW hash with required difficulty
- Challenge must not be expired (within TTL window)
### QUOTE_RESPONSE (0x04)
- **Payload**: JSON-encoded quote object
- **Description**: Server sends inspirational quote after successful PoW verification
- **Format**:
```json
{
"text": "The only way to do great work is to love what you do.",
"author": "Steve Jobs",
"category": "motivation"
}
```
**Field Descriptions**:
- **text**: The inspirational quote text
- **author**: Attribution for the quote
- **category**: Thematic category (motivation, wisdom, success, etc.)
### ERROR_RESPONSE (0x05)
- **Payload**: JSON-encoded error object
- **Description**: Server reports errors in client requests or server state
- **Format**:
```json
{
"code": "INVALID_SOLUTION",
"message": "The provided PoW solution is incorrect",
"retry_after": 30
}
```
**Field Descriptions**:
- **code**: Machine-readable error code (see Error Codes section)
- **message**: Human-readable error description
- **retry_after**: Optional delay in seconds before client should retry
## Error Codes
| Code | Description | Client Action | Server Action |
|------|-------------|---------------|---------------|
| **MALFORMED_MESSAGE** | Invalid frame format or JSON parsing error | Disconnect and retry with correct format | Log error and close connection |
| **INVALID_CHALLENGE** | Challenge HMAC signature verification failed | Request new challenge from server | Generate new valid challenge |
| **INVALID_SOLUTION** | PoW hash verification failed for submitted nonce | Retry with correct nonce computation | Log failed attempt for rate limiting |
| **EXPIRED_CHALLENGE** | Challenge timestamp exceeds TTL window | Request fresh challenge from server | Generate new challenge with current timestamp |
| **RATE_LIMITED** | Client exceeds request rate limits | Wait for `retry_after` seconds before retry | Apply temporary throttling to client IP |
| **SERVER_ERROR** | Internal server error or temporary unavailability | Retry connection after delay | Log error and investigate system health |
| **TOO_MANY_CONNECTIONS** | Server at maximum connection capacity | Retry connection later | Reject new connections until capacity available |
| **DIFFICULTY_TOO_HIGH** | Adaptive difficulty exceeds client capabilities | Request new challenge or give up | May reduce difficulty if appropriate |
### Error Response Format
All errors use the same ERROR_RESPONSE format. The `retry_after` and `details` fields are optional:
```json
{
"code": "ERROR_CODE_NAME",
"message": "Human readable description of the error",
"retry_after": 30,
"details": {
"additional_context": "Optional additional error context"
}
}
```
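A server might build this payload with a small helper like the one below. It is a sketch under assumptions: `build_error` is a hypothetical name, and omitting absent optional fields (rather than sending `null`) is a choice this specification does not mandate.

```python
import json
from typing import Optional

ERROR_CODES = frozenset({
    "MALFORMED_MESSAGE", "INVALID_CHALLENGE", "INVALID_SOLUTION",
    "EXPIRED_CHALLENGE", "RATE_LIMITED", "SERVER_ERROR",
    "TOO_MANY_CONNECTIONS", "DIFFICULTY_TOO_HIGH",
})


def build_error(code: str, message: str,
                retry_after: Optional[int] = None,
                details: Optional[dict] = None) -> bytes:
    """Encode an ERROR_RESPONSE payload as compact UTF-8 JSON."""
    if code not in ERROR_CODES:
        raise ValueError(f"unknown error code: {code}")
    body: dict = {"code": code, "message": message}
    if retry_after is not None:
        body["retry_after"] = retry_after
    if details:
        body["details"] = details
    return json.dumps(body, separators=(",", ":")).encode("utf-8")
```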
### Error Handling Strategy
- **Client Errors**: Provide specific actionable error codes
- **Server Errors**: Log detailed information server-side, return generic client errors
- **Rate Limiting**: Include retry timing information in error responses
- **Security**: Avoid exposing internal system details in error messages
## Hashcash Challenge Format
The server uses **SHA-256 based Hashcash** with **HMAC authentication** for Proof of Work challenges.
### Challenge String Structure
```
resource:timestamp:difficulty:random
```
**Example**:
```
192.168.1.100:8080:1640995200:4:a1b2c3d4e5f6
```
### Solution Process
1. **Receive**: Client receives HMAC-signed challenge from server
2. **Extract**: Client extracts challenge fields to construct challenge string
3. **Iterate**: Client appends different nonce values to challenge string
4. **Hash**: Client computes SHA-256 hash of `challenge_string:nonce`
5. **Check**: Client checks if hash has required number of leading zero bits
6. **Repeat**: If not valid, increment nonce and repeat from step 4
7. **Submit**: When valid nonce found, submit solution to server
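The steps above can be sketched as a brute-force loop. The code assumes the nonce is a decimal counter rendered as a string, which this specification does not fix; the helper names are hypothetical.

```python
import hashlib
from itertools import count


def challenge_string(ch: dict) -> str:
    """Rebuild resource:timestamp:difficulty:random from the challenge fields."""
    return f"{ch['resource']}:{ch['timestamp']}:{ch['difficulty']}:{ch['random']}"


def meets_difficulty(digest: bytes, bits: int) -> bool:
    """True if the top `bits` bits of the digest are all zero."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve(ch: dict) -> str:
    """Try nonces 0, 1, 2, ... until the hash meets the difficulty."""
    prefix = challenge_string(ch)
    for nonce in count():
        digest = hashlib.sha256(f"{prefix}:{nonce}".encode("utf-8")).digest()
        if meets_difficulty(digest, ch["difficulty"]):
            return str(nonce)
```

At difficulty 4 each attempt succeeds with probability 1/16, so the loop is expected to finish after about 16 hashes.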
### Verification Process
Server verifies solutions through the following steps:
1. **HMAC Verification**: Verify challenge HMAC signature against server secret
2. **TTL Check**: Verify challenge timestamp is within TTL window (5 minutes)
3. **Reconstruction**: Reconstruct challenge string from submitted challenge fields
4. **Hash Computation**: Compute SHA-256 hash of `challenge_string:nonce`
5. **Difficulty Check**: Verify hash has required number of leading zero bits
6. **Success**: If all checks pass, grant access to quote resource
### Difficulty Examples
| Difficulty | Leading Zero Bits | Average Attempts | Example Hash (hex) |
|------------|-------------------|------------------|---------------|
| 3 | 3 bits | 8 | `1a2b3c4d...` |
| 4 | 4 bits | 16 | `0a1b2c3d...` |
| 5 | 5 bits | 32 | `07a1b2c3...` |
| 6 | 6 bits | 64 | `03a1b2c3...` |
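Difficulty is counted in bits, while hashes are usually printed in hex, where each digit covers four bits. The sketch below (hypothetical helper names) counts leading zero bits in a hex string and gives the expected number of attempts, 2^difficulty, since each hash passes with probability 2^-difficulty.

```python
def leading_zero_bits(hex_digest: str) -> int:
    """Count zero bits before the first set bit of a hex-encoded hash."""
    total_bits = len(hex_digest) * 4
    return total_bits - int(hex_digest, 16).bit_length()


def expected_attempts(difficulty: int) -> int:
    """Average number of hashes needed to find a valid nonce."""
    return 2 ** difficulty
```

A hash satisfies difficulty `d` whenever `leading_zero_bits(hash) >= d`, so a hash starting with three zero hex digits (12 zero bits) passes every difficulty up to 12.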
## Connection Management
### Connection Lifecycle
1. **Connect**: Client establishes TCP connection to server
2. **Challenge**: Client requests and receives HMAC-signed challenge
3. **Solve**: Client solves PoW challenge offline (can take time)
4. **Submit**: Client submits solution with challenge proof
5. **Receive**: Client receives quote (if valid) or error (if invalid)
6. **Disconnect**: Connection closes automatically after response
### Timeouts and Limits
| Parameter | Value | Purpose |
|-----------|-------|----------|
| **Challenge TTL** | 5 minutes | Prevents stale challenge reuse |
| **Solution Timeout** | 5 seconds | Prevents slowloris attacks |
| **Connection Timeout** | 15 seconds | Limits connection holding time |
| **Message Size Limit** | 8KB | Prevents memory exhaustion |
| **Max Connections** | 1000 | Global server capacity limit |
### Timeout Behavior
- **Challenge Expiry**: Challenges become invalid after 5 minutes from timestamp
- **Solution Window**: Client has 5 seconds to submit solution after challenge
- **Connection Limits**: Connections auto-close after 15 seconds of inactivity
- **Resource Protection**: Aggressive timeouts prevent resource exhaustion attacks
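One way to enforce such limits is an overall deadline per message rather than a per-read timeout, since a slowloris client can keep a per-read timer alive by trickling one byte at a time. The sketch below is an assumption about implementation, not part of the protocol; `recv_exact` is a hypothetical helper.

```python
import socket
import time

SOLUTION_TIMEOUT = 5.0  # seconds, from the table above


def recv_exact(sock: socket.socket, n: int, deadline: float) -> bytes:
    """Read exactly n bytes before the monotonic deadline, or raise."""
    buf = bytearray()
    while len(buf) < n:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("deadline exceeded")
        # Each recv may wait only for the time left on the overall deadline.
        sock.settimeout(remaining)
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf.extend(chunk)
    return bytes(buf)
```

A server would compute `deadline = time.monotonic() + SOLUTION_TIMEOUT` once after sending the challenge and pass the same value to every read of the solution frame.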
## Rate Limiting & DDoS Protection
### Connection-Level Protection (HAProxy/Envoy)
Handled **before application layer** by reverse proxy:
| Metric | Limit | Purpose |
|--------|-------|----------|
| **New Connections/sec** | ≤10 per IP | Prevents connection flooding |
| **Concurrent Connections** | ≤20 per IP | Limits resource usage per client |
| **Burst Allowance** | 30 connections | Handles legitimate traffic spikes |
| **Global Connection Cap** | 1000 total | Protects server capacity |
### Application-Level Protection
#### Failed Solution Tracking
- **Counter**: Track invalid solution attempts per client IP/identifier
- **Window**: Rolling 2-minute time window for failure counting
- **Penalty**: Each group of 5 failures increases difficulty by +2 bits
- **Cap**: Maximum +6 additional difficulty bits, so the penalty cannot become a denial of service against the client
- **Reset**: Successful solution resets failure counter to zero
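The tracking rules above could be kept in a small in-memory structure like the following. This is a sketch only: `FailureTracker` is a hypothetical class, it keys clients by an opaque string, and it holds state in a single process, which a horizontally scaled deployment would need to share or approximate.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class FailureTracker:
    def __init__(self, window: float = 120.0):
        self.window = window                  # rolling 2-minute window
        self._failures = defaultdict(deque)   # client -> failure timestamps

    def _prune(self, client: str, now: float) -> deque:
        q = self._failures[client]
        while q and now - q[0] > self.window:
            q.popleft()                       # drop failures outside the window
        return q

    def record_failure(self, client: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._prune(client, now).append(now)

    def record_success(self, client: str) -> None:
        self._failures.pop(client, None)      # success resets the counter

    def penalty_bits(self, client: str, now: Optional[float] = None) -> int:
        now = time.monotonic() if now is None else now
        failures = len(self._prune(client, now))
        return min((failures // 5) * 2, 6)    # +2 bits per 5 failures, cap +6
```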
#### Adaptive Difficulty Scaling
- **Load-Based**: +1 difficulty bit when server CPU/memory exceeds threshold
- **Attack Response**: Automatic difficulty increase during detected attacks
- **Recovery**: Gradual difficulty reduction as attack subsides
- **Monitoring**: Continuous monitoring of success/failure ratios
### Rate Limiting Rules
| Rule | Limit | Action |
|------|-------|--------|
| **Challenge Requests** | 10 per minute per IP | Temporary IP throttling |
| **Solution Attempts** | 5 per minute per IP | Increased difficulty penalty |
| **Invalid Solutions** | 5 per 2 minutes | +2 difficulty bits |
| **Connection Frequency** | 10 per second per IP | Connection rejection |
## Security Considerations
### PoW Security
- **Minimum Difficulty**: 3 leading zero bits (prevents trivial bypass attempts)
- **Maximum Difficulty**: 10 leading zero bits (prevents excessive work being forced on clients)
- **Dynamic Scaling**: Adjusts automatically based on server load and attack patterns
- **CPU-Bound Work**: Memory-independent computation ensures fairness across hardware
### Challenge Security
- **Uniqueness**: Each challenge includes timestamp and cryptographic random data
- **Expiration**: Challenges automatically expire after 5-minute TTL window
- **HMAC Authentication**: Prevents challenge forgery and tampering
- **Stateless Verification**: No server-side storage required for validation
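- **Replay Protection**: The signed timestamp limits replay of a solved challenge to the 5-minute TTL window; because verification is stateless, reuse within that window is not detected by the HMAC alone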
### Input Validation
- **Message Size**: Strict 8KB maximum per message (prevents memory exhaustion)
- **JSON Schema**: All JSON payloads validated against strict schemas
- **Challenge Format**: Rigorous validation of challenge structure and fields
- **Nonce Validation**: Proper integer bounds checking for nonce values
- **Encoding Validation**: UTF-8 encoding validation for all text fields
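The checks above might look like the following for a SOLUTION_REQUEST payload. This is a hand-rolled sketch rather than a schema library, and it assumes the nonce arrives as a JSON string, as in the example payload; `parse_solution` is a hypothetical name.

```python
import json

MAX_PAYLOAD = 8 * 1024
REQUIRED_FIELDS = {
    "id": str, "timestamp": int, "difficulty": int,
    "resource": str, "random": str, "hmac": str,
}


def parse_solution(payload: bytes) -> dict:
    """Validate size, encoding, and structure; raise ValueError on any problem."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 8KB limit")
    try:
        data = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError) as exc:
        raise ValueError("payload is not valid UTF-8 JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    challenge = data.get("challenge")
    if not isinstance(challenge, dict) or not isinstance(data.get("nonce"), str):
        raise ValueError("missing challenge object or nonce string")
    for name, kind in REQUIRED_FIELDS.items():
        value = challenge.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, kind) or isinstance(value, bool):
            raise ValueError(f"invalid or missing field: {name}")
    if not 3 <= challenge["difficulty"] <= 10:
        raise ValueError("difficulty outside the 3-10 bit range")
    return data
```

A server would typically map any `ValueError` raised here to a `MALFORMED_MESSAGE` error response.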
### Network Security
- **Connection Limits**: Per-IP and global connection rate limiting
- **Timeout Protection**: Aggressive timeouts prevent slowloris attacks
- **Resource Binding**: Challenges tied to client connection context
- **Error Information**: Limited error details to prevent information disclosure
### Operational Security
- **HMAC Secret**: Server maintains secret key for challenge signing
- **Logging**: Comprehensive attack detection and monitoring
- **Metrics**: Real-time visibility into attack patterns and system health
- **Graceful Degradation**: System remains functional under attack conditions
## Implementation Notes
### Protocol Implementation
- **Endianness**: All multi-byte integers use big-endian encoding for consistency
- **JSON Encoding**: UTF-8 encoding for all text, compact format (no pretty-printing)
- **Required Fields**: All JSON fields marked as required must be present
- **Optional Fields**: Handle optional fields gracefully with sensible defaults
### Server Implementation
- **HMAC Secret**: Server maintains cryptographically secure secret key
- **Challenge Generation**: Use cryptographically secure random number generator
- **Quote Storage**: Preload quotes from file/database on startup
- **Concurrent Handling**: Support for multiple simultaneous client connections
- **Resource Management**: Proper cleanup of connections and temporary resources
### Client Implementation
- **PoW Computation**: Efficient nonce iteration and hash computation
- **Connection Management**: Proper TCP connection lifecycle handling
- **Error Handling**: Graceful handling of all error conditions
- **Retry Logic**: Intelligent retry with exponential backoff
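For the retry logic, a client could compute its wait time as sketched below. The base delay, cap, and jitter range are illustrative assumptions, not values defined by this protocol; the only protocol-derived rule is that a server-supplied `retry_after` is honored as a minimum.

```python
import random
from typing import Callable, Optional


def retry_delay(attempt: int,
                retry_after: Optional[float] = None,
                base: float = 0.5,
                cap: float = 30.0,
                jitter: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    backoff = min(cap, base * (2 ** attempt))   # exponential growth, capped
    delay = backoff * (0.5 + 0.5 * jitter())    # scale into [0.5, 1.0) of backoff
    if retry_after is not None:
        delay = max(delay, retry_after)         # never retry sooner than asked
    return delay
```

The jitter spreads out clients that failed at the same moment, so they do not all reconnect in the same instant after a `RATE_LIMITED` or `TOO_MANY_CONNECTIONS` error.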
### Error Handling
- **Server Errors**: Always send ERROR_RESPONSE for client-detectable errors
- **Logging**: Comprehensive server-side logging for debugging and monitoring
- **Connection Termination**: Graceful connection closure on errors
- **Client Recovery**: Clients should handle errors and retry appropriately
### Performance Considerations
- **Keep-Alive**: Not supported (one quote per connection for simplicity)
- **Connection Pooling**: Server supports concurrent connection handling
- **Memory Efficiency**: Minimal memory footprint per connection
- **CPU Efficiency**: Optimized hash computation and verification
- **Scalability**: Stateless design enables horizontal scaling
### Future Extensions
- **Resource Types**: Protocol designed to support resources beyond quotes
- **Authentication**: Framework supports future authentication mechanisms
- **Compression**: Payload compression can be added without protocol changes
- **Encryption**: TLS termination recommended at load balancer level