commit eda2028cbeed8c9222fd34c7a3210e25bc4a6128
Author: Savely Krendelhoff
Date:   Fri Aug 22 13:31:49 2025 +0700

    Organize documentation structure

diff --git a/docs/IMPLEMENTATION.md b/docs/IMPLEMENTATION.md
new file mode 100644
index 0000000..b614357
--- /dev/null
+++ b/docs/IMPLEMENTATION.md
@@ -0,0 +1,187 @@
+# Word of Wisdom Server - Implementation Plan
+
+## Phase 1: Proof of Work Package Implementation
+**Goal**: Create standalone, testable PoW package with HMAC-signed stateless challenges
+
+- [ ] **Project Setup**
+  - [ ] Initialize Go module and basic project structure
+  - [ ] Create PoW challenge structure and types
+  - [ ] Set up testing framework and utilities
+
+- [ ] **Challenge Generation & HMAC Security**
+  - [ ] Implement HMAC-signed challenge generation (stateless)
+  - [ ] Create challenge authenticity verification
+  - [ ] Add timestamp validation for replay protection (5-minute TTL)
+  - [ ] Implement canonical challenge field ordering for HMAC
+  - [ ] Add Base64URL encoding for HMAC signatures
+  - [ ] Implement challenge string construction (`quotes:timestamp:difficulty:random`)
+
+- [ ] **PoW Algorithm Implementation**
+  - [ ] Implement SHA-256 based PoW solution algorithm
+  - [ ] Implement leading zero bit counting for difficulty
+  - [ ] Create nonce iteration and solution finding
+  - [ ] Add difficulty scaling (3-10 bits range)
+  - [ ] Create challenge string format: `quotes:timestamp:difficulty:random:nonce`
+  - [ ] Implement hash verification for submitted solutions
+
+- [ ] **Verification & Validation**
+  - [ ] Create challenge verification logic with HMAC validation
+  - [ ] Add solution validation against original challenge
+  - [ ] Test HMAC tamper detection and validation
+  - [ ] Add difficulty adjustment mechanisms
+
+- [ ] **Testing & Performance**
+  - [ ] Unit tests for challenge generation and verification
+  - [ ] Unit tests for HMAC signing and validation
+  - [ ] Unit tests for PoW solution finding and verification
+  - [ ] Benchmark tests for different difficulty levels
+  - [ ] Test edge cases (expired challenges, invalid HMAC, wrong difficulty)
+  - [ ] Performance tests for concurrent challenge operations
+
+## Phase 2: Basic Server Architecture
+- [ ] Set up dependency injection framework (wire/dig)
+- [ ] Create core interfaces and contracts
+- [ ] Set up structured logging (zerolog/logrus)
+- [ ] Set up metrics collection (prometheus)
+- [ ] Create configuration management
+- [ ] Integrate PoW package into server architecture
+
+## Phase 3: Quote Management System
+- [ ] Define quote storage interface
+- [ ] Implement in-memory quote repository (fake)
+- [ ] Create quote selection service (random)
+- [ ] Load initial quote collection from file/config
+- [ ] Add quote validation and sanitization
+- [ ] Write unit tests for quote management
+
+## Phase 4: TCP Protocol Implementation
+- [ ] Implement binary message protocol codec
+- [ ] Create protocol message types and structures
+- [ ] Implement connection handler with proper error handling
+- [ ] Add message serialization/deserialization (JSON)
+- [ ] Create protocol state machine
+- [ ] Implement connection lifecycle management
+- [ ] Write unit tests for protocol components
+
+## Phase 5: Server Core & Request Handling
+- [ ] Implement TCP server with connection pooling
+- [ ] Create request router and handler dispatcher
+- [ ] Add connection timeout and lifecycle management
+- [ ] Implement graceful shutdown mechanism
+- [ ] Add request/response logging middleware
+- [ ] Create health check endpoints
+- [ ] Write integration tests for server core
+
+## Phase 6: DDoS Protection & Rate Limiting
+- [ ] Implement IP-based connection limiting
+- [ ] Create rate limiting service with time windows
+- [ ] Add automatic difficulty adjustment based on load
+- [ ] Implement temporary IP blacklisting
+- [ ] Create circuit breaker for overload protection
+- [ ] Add monitoring for attack detection
+- [ ] Write tests for protection mechanisms
+
+## Phase 7: Observability & Monitoring
+- [ ] Add structured logging throughout application
+- [ ] Implement metrics for key performance indicators:
+  - [ ] Active connections count
+  - [ ] Challenge generation rate
+  - [ ] Solution verification rate
+  - [ ] Success/failure ratios
+  - [ ] Response time histograms
+- [ ] Create logging middleware for request tracing
+- [ ] Add error categorization and reporting
+- [ ] Implement health check endpoints
+
+## Phase 8: Configuration & Environment Setup
+- [ ] Create configuration structure with validation
+- [ ] Support environment variables and config files
+- [ ] Add configuration for different environments (dev/prod)
+- [ ] Implement feature flags for protection levels
+- [ ] Create deployment configuration templates
+- [ ] Add configuration validation and defaults
+
+## Phase 9: Client Implementation
+- [ ] Create client application structure
+- [ ] Implement PoW solver algorithm
+- [ ] Create client-side protocol implementation
+- [ ] Add retry logic and error handling
+- [ ] Implement connection management
+- [ ] Create CLI interface for client
+- [ ] Add client metrics and logging
+- [ ] Write client unit and integration tests
+
+## Phase 10: Docker & Deployment
+- [ ] Create multi-stage Dockerfile for server
+- [ ] Create Dockerfile for client
+- [ ] Create docker-compose.yml for local development
+- [ ] Add docker-compose for production deployment
+- [ ] Create health check scripts for containers
+- [ ] Add environment-specific configurations
+- [ ] Create deployment documentation
+
+## Phase 11: Testing & Quality Assurance
+- [ ] Write comprehensive unit tests (>80% coverage):
+  - [ ] PoW algorithm tests
+  - [ ] Protocol handler tests
+  - [ ] Rate limiting tests
+  - [ ] Quote service tests
+  - [ ] Configuration tests
+- [ ] Create integration tests:
+  - [ ] End-to-end client-server communication
+  - [ ] Load testing scenarios
+  - [ ] Failure recovery tests
+  - [ ] DDoS protection validation
+- [ ] Add benchmark tests for performance validation
+- [ ] Create stress testing scenarios
+
+## Phase 12: Documentation & Final Polish
+- [ ] Write comprehensive README with setup instructions
+- [ ] Create API documentation for all interfaces
+- [ ] Add inline code documentation
+- [ ] Create deployment guide
+- [ ] Write troubleshooting guide
+- [ ] Add performance tuning recommendations
+- [ ] Create monitoring and alerting guide
+
+## Phase 13: Production Readiness Checklist
+- [ ] Security audit of all components
+- [ ] Performance benchmarking and optimization
+- [ ] Memory leak detection and prevention
+- [ ] Resource cleanup validation
+- [ ] Error handling coverage review
+- [ ] Logging security (no sensitive data exposure)
+- [ ] Configuration security (secrets management)
+- [ ] Container security hardening
+
+## Directory Structure
+```
+/
+├── cmd/
+│   ├── server/          # Server application entry point
+│   └── client/          # Client application entry point
+├── internal/
+│   ├── server/          # Server core logic
+│   ├── protocol/        # Protocol implementation
+│   ├── pow/             # Proof of Work implementation
+│   ├── quotes/          # Quote management
+│   ├── ratelimit/       # Rate limiting & DDoS protection
+│   ├── config/          # Configuration management
+│   ├── metrics/         # Metrics collection
+│   └── logger/          # Structured logging
+├── pkg/                 # Public packages
+├── test/                # Integration tests
+├── docker/              # Docker configurations
+├── deployments/         # Deployment configurations
+└── docs/                # Additional documentation
+```
+
+## Success Criteria
+- [ ] Server handles 1000+ concurrent connections
+- [ ] PoW protection prevents DDoS attacks effectively
+- [ ] All tests pass with >80% code coverage
+- [ ] Docker containers build and run successfully
+- [ ] Client successfully solves challenges and receives quotes
+- [ ] Comprehensive logging and metrics in place
+- [ ] Production-ready error handling and recovery
+- [ ] Clear documentation for deployment and operation
diff --git a/docs/POW_ANALYSIS.md b/docs/POW_ANALYSIS.md
new file mode 100644
index 0000000..a446903
--- /dev/null
+++ b/docs/POW_ANALYSIS.md
@@ -0,0 +1,372 @@
+# Proof of Work Algorithm Analysis
+
+## Overview
+
+This document analyzes various Proof of Work (PoW) algorithms considered for the Word of Wisdom protocol and provides detailed justification for the chosen approach.
+
+## PoW Algorithm Alternatives
+
+### 1. SHA-256 Hashcash (CHOSEN)
+
+**Description**: Bitcoin-style hashcash requiring a hash output with a specified number of leading zero bits.
+
+**Pros**:
+- Widely tested and battle-proven in Bitcoin
+- Simple to implement and verify
+- CPU-bound computation (fair across hardware)
+- Adjustable difficulty through leading zero bits
+- Fast verification (single SHA-256 hash)
+- No memory requirements
+- Deterministic verification time
+
+**Cons**:
+- Vulnerable to ASIC mining (specialized hardware advantage)
+- Power consumption scales with difficulty
+- Brute force approach (no early termination)
+
+**Our Mitigation**:
+- DDoS protection doesn't require ASIC resistance (temporary challenges)
+- Difficulty kept low (3-6 bits) to minimize power consumption
+- Server-controlled difficulty prevents client-side optimization attacks
+
+### 2. Scrypt
+
+**Description**: Memory-hard function designed to resist ASIC mining.
+
+**Pros**:
+- ASIC-resistant design
+- Memory-hard computation
+- Battle-tested in Litecoin
+- Configurable memory and time parameters
+
+**Cons**:
+- Complex implementation
+- Memory requirements may disadvantage mobile clients
+- Slower verification than simple hashing
+- Parameter tuning complexity
+- Potential denial-of-service on client memory
+
+**Why Not Chosen**:
+- Unnecessary complexity for DDoS protection use case
+- Memory requirements create client hardware discrimination
+- Verification overhead impacts server performance under attack
+
+### 3. Equihash
+
+**Description**: Memory-hard algorithm based on the birthday paradox, used in Zcash.
+
+**Pros**:
+- ASIC-resistant through memory requirements
+- Mathematically elegant approach
+- Proven in production cryptocurrency
+- Configurable memory parameters
+
+**Cons**:
+- Very complex implementation
+- High memory requirements (150+ MB typically)
+- Slow verification process
+- Not suitable for lightweight clients
+- Complex parameter selection
+
+**Why Not Chosen**:
+- Excessive complexity and resource requirements
+- Poor fit for anti-DDoS use case requiring quick challenges
+- Would exclude mobile and embedded clients
+
+### 4. Argon2
+
+**Description**: Winner of the Password Hashing Competition; a memory-hard function with multiple variants.
+
+**Pros**:
+- Designed for modern security requirements
+- Multiple variants (Argon2i, Argon2d, Argon2id)
+- Configurable time/memory trade-offs
+- Resistant to side-channel attacks (Argon2i and Argon2id variants)
+- ASIC-resistant design
+
+**Cons**:
+- Primarily designed for password hashing, not PoW
+- Memory requirements create client inequality
+- Complex implementation with many parameters
+- Slower than simple hashing functions
+- Not optimized for network protocols
+
+**Why Not Chosen**:
+- Over-engineered for DDoS protection scenario
+- Memory requirements discriminate against resource-constrained clients
+- Verification overhead impacts server scalability
+
+### 5. CryptoNight
+
+**Description**: Memory-hard algorithm designed for CPU mining, used in Monero.
+
+**Pros**:
+- CPU-optimized design
+- ASIC-resistant through memory requirements
+- Ring buffer memory pattern
+- Proven in cryptocurrency applications
+
+**Cons**:
+- 2 MB memory requirement per instance
+- Complex implementation
+- Slower verification than simple hashing
+- Memory requirements exclude lightweight clients
+- Frequent algorithm updates needed for ASIC resistance
+
+**Why Not Chosen**:
+- Memory requirements too high for general clients
+- Complexity outweighs benefits for anti-DDoS use case
+- Algorithm instability due to frequent updates
+
+### 6. Cuckoo Cycle
+
+**Description**: Graph-theoretic PoW based on finding cycles in random graphs.
+
+**Pros**:
+- Memory-hard through graph traversal
+- Mathematically interesting approach
+- ASIC-resistant design
+- Adjustable memory requirements
+
+**Cons**:
+- Very complex implementation
+- High memory requirements
+- Slow verification process
+- Limited production experience
+- Complex parameter tuning
+
+**Why Not Chosen**:
+- Excessive complexity for simple DDoS protection
+- Memory requirements create client barriers
+- Unproven in high-throughput server scenarios
+
+### 7. X11/X16R (Multi-Hash)
+
+**Description**: Combines multiple hash functions in sequence or rotation.
+
+**Pros**:
+- ASIC-resistant through algorithm diversity
+- Harder to optimize with specialized hardware
+- Proven in some cryptocurrencies
+
+**Cons**:
+- Complex implementation requiring multiple hash functions
+- Slower than single-hash approaches
+- More attack surface (multiple algorithms)
+- Difficult to verify implementation correctness
+- Higher CPU usage for verification
+
+**Why Not Chosen**:
+- Unnecessary complexity for anti-DDoS use case
+- Multiple algorithms increase implementation risk
+- Verification overhead impacts server performance
+
+## Decision Matrix
+
+| Algorithm | Complexity | Speed | Memory | ASIC Resistance | Implementation Risk | Server Impact |
+|-----------|------------|-------|--------|-----------------|-------------------|---------------|
+| **SHA-256 Hashcash** | Low | High | None | Low | Low | Low |
+| Scrypt | Medium | Medium | High | High | Medium | Medium |
+| Equihash | High | Low | Very High | High | High | High |
+| Argon2 | High | Low | High | High | Medium | Medium |
+| CryptoNight | High | Low | High | High | High | High |
+| Cuckoo Cycle | Very High | Low | High | High | Very High | High |
+| X11/X16R | High | Medium | Low | Medium | High | Medium |
+
+## Final Decision: SHA-256 Hashcash
+
+### Primary Justifications
+
+1. **Simplicity**: Single, well-understood hash function with minimal implementation complexity
+2. **Performance**: Fast computation and near-instant verification enable high server throughput
+3. **Universality**: No memory requirements ensure compatibility across all client types
+4. **Proven Reliability**: Battle-tested in Bitcoin with 15+ years of production experience
+5. **Adjustable Difficulty**: Fine-grained control through leading zero bits (3-10 bits practical range)
+
+### ASIC Resistance Not Required
+
+For DDoS protection, ASIC resistance is unnecessary because:
+
+- **Temporary Challenges**: Each challenge is unique and expires within minutes
+- **Cost vs. Benefit**: ASIC development cost far exceeds potential attack value
+- **Dynamic Difficulty**: Server can adjust difficulty faster than ASIC deployment
+- **Legitimate Use**: No financial incentive for specialized hardware development
+
+### Enhanced Security Through HMAC Integration
+
+Our implementation addresses SHA-256 hashcash limitations:
+
+- **Stateless Operation**: HMAC signatures eliminate server storage requirements
+- **Replay Protection**: Timestamp + HMAC limits challenge reuse to the TTL window
+- **Forgery Prevention**: Server secret prevents clients from forging challenges
+- **Scalability**: No server state enables horizontal scaling without coordination
+
+## HMAC Stateless Design Analysis
+
+### HMAC vs Storage-Based Challenge Management
+
+#### Stateless Approach (HMAC) - CHOSEN
+
+**How it works**:
+1. Server generates challenge data with timestamp and random values
+2. Server computes HMAC signature: `hmac = HMAC-SHA256(secret_key, challenge_fields)`
+3. Server sends challenge + HMAC signature (stores nothing)
+4. Client submits solution with original challenge + HMAC
+5. Server recomputes HMAC to verify challenge authenticity
+6. If HMAC valid and timestamp within TTL, verify PoW solution
+
+**Advantages**:
+- **Zero Storage**: No database or memory requirements for active challenges
+- **Horizontal Scaling**: Any server instance can validate any challenge
+- **High Performance**: No database lookups during attack conditions
+- **Fault Tolerance**: Server restarts don't invalidate active challenges
+- **Simple Architecture**: No challenge cleanup or garbage collection needed
+- **Cost Effective**: Minimal infrastructure requirements for scaling
+
+**Security Guarantees**:
+- **Authenticity**: Only server with secret key can create valid challenges
+- **Integrity**: Any tampering with challenge fields invalidates HMAC
+- **Expiration**: Timestamp validation prevents stale challenge reuse
+- **Origin Verification**: Server can confirm that it issued a specific challenge
+
+#### Stateful Approach (Storage-Based) - REJECTED
+
+**How it would work**:
+1. Server generates challenge and stores in database/memory
+2. Server sends challenge to client (no HMAC needed)
+3. Client submits solution with challenge ID
+4. Server looks up stored challenge to verify solution
+5. Server removes challenge from storage after use
+
+**Advantages**:
+- **Perfect Replay Protection**: Each challenge used exactly once
+- **Granular Control**: Individual challenge revocation possible
+- **Rich Analytics**: Track per-challenge usage patterns
+- **Complex Rate Limiting**: Per-challenge attempt limits
+- **Audit Trail**: Complete challenge lifecycle logging
+
+**Disadvantages**:
+- **Storage Overhead**: Database/memory requirements grow with concurrent users
+- **Scaling Complexity**: Challenge synchronization across server instances
+- **Performance Impact**: Database queries during high-load attack scenarios
+- **Single Point of Failure**: Database outage invalidates all active challenges
+- **Cleanup Complexity**: Expired challenge garbage collection required
+- **Infrastructure Cost**: Database clustering for high availability
+
+### Security Trade-offs Analysis
+
+#### Replay Attack Comparison
+
+**HMAC Approach**:
+- **Risk**: Same challenge reusable within 5-minute TTL window
+- **Mitigation**: Short TTL limits replay window
+- **Impact**: Acceptable for DDoS protection use case
+- **Real Risk**: Low (attacker gains no significant advantage)
+
+**Storage Approach**:
+- **Risk**: Zero replay attacks (perfect single-use enforcement)
+- **Cost**: Significant infrastructure and complexity overhead
+- **Impact**: Minimal benefit for DDoS protection scenario
+
+#### Rate Limiting Capabilities
+
+**HMAC Limitations**:
+- Cannot track per-challenge attempt counts
+- Cannot revoke specific problematic challenges
+- Fixed TTL for all challenges regardless of client behavior
+
+**HMAC Mitigations**:
+- IP-based rate limiting remains fully functional
+- Failed solution tracking per IP provides abuse detection
+- Adaptive difficulty scaling responds to attack patterns
+- Connection-level limits prevent resource exhaustion
+
+**Storage Benefits**:
+- Per-challenge attempt counting
+- Individual challenge blacklisting
+- Dynamic TTL adjustment per client
+- Rich forensic analysis capabilities
+
+### Design Decision Rationale
+
+For the Word of Wisdom DDoS protection system, **HMAC stateless design** is optimal because:
+
+#### Primary Requirements Met
+- **High Availability**: System remains functional during database outages
+- **Horizontal Scaling**: Simple load balancer distribution without coordination
+- **Attack Resistance**: Performance doesn't degrade with concurrent challenges
+- **Operational Simplicity**: Minimal infrastructure for production deployment
+
+#### Acceptable Trade-offs
+- **Replay Window**: 5-minute replay risk acceptable vs infrastructure complexity
+- **Rate Limiting**: IP-based limits sufficient for DDoS protection use case
+- **Analytics**: Basic failure tracking adequate for adaptive difficulty
+- **Forensics**: Connection-level logging provides sufficient attack analysis
+
+#### Use Case Alignment
+- **DDoS Protection Focus**: System optimized for availability under attack
+- **Quote Service**: Low-value target doesn't justify complex security measures
+- **Temporary Challenges**: Short-lived challenges reduce replay attack value
+- **Cost Sensitivity**: Minimal infrastructure preferred for educational/demo system
+
+### When Storage-Based Would Be Better
+
+**High-Security Applications**:
+- Financial services requiring perfect replay protection
+- Authentication systems with strict single-use requirements
+- High-value resource protection where replay attacks have significant impact
+
+**Complex Rate Limiting Needs**:
+- Per-user challenge quotas required
+- Sophisticated abuse pattern detection needed
+- Real-time challenge analytics essential for operations
+
+**Rich Monitoring Requirements**:
+- Detailed forensic analysis of attack patterns
+- Challenge lifecycle tracking for security auditing
+- Complex rate limiting with per-challenge granularity
+
+### Implementation Validation
+
+The HMAC approach successfully addresses all primary security concerns:
+
+1. **Challenge Authenticity**: HMAC prevents forged challenges
+2. **Data Integrity**: Tampering detection through signature validation
+3. **Replay Protection**: Timestamp + TTL limits reuse window
+4. **Server Scalability**: No coordination required between instances
+5. **Attack Resilience**: Performance maintained under high challenge volume
+
+The trade-offs (limited replay protection, reduced analytics) are acceptable given the system's primary goal of DDoS mitigation for a quote service.
+
+### Addressing SHA-256 Cons
+
+1. **ASIC Advantage**: Mitigated by low difficulty and temporary challenges
+2. **Power Consumption**: Limited by max 6-bit difficulty (average 64 attempts)
+3. **Brute Force**: Acceptable for anti-DDoS where proof of work is the goal
+
+### Alternative Deployment Scenarios
+
+If future requirements change, our architecture supports algorithm swapping:
+
+- **Mobile-Heavy Clients**: Reduce difficulty to 2-3 bits
+- **High-Security Environments**: Increase difficulty to 8-10 bits
+- **Algorithm Migration**: Protocol supports algorithm field for future updates
+- **Hybrid Approach**: Different algorithms per client capability
+
+## Implementation Recommendations
+
+### Difficulty Scaling Strategy
+
+- **Start Conservative**: Begin with 4-bit difficulty (16 attempts average)
+- **Monitor Performance**: Track client success rates and completion times
+- **Adjust Dynamically**: Increase difficulty under attack, decrease during normal operation
+- **Client Feedback**: Monitor error rates to avoid excluding legitimate clients
+
+### Future Evolution Path
+
+1. **Phase 1**: SHA-256 hashcash with HMAC (current)
+2. **Phase 2**: Optional client capability negotiation
+3. **Phase 3**: Multi-algorithm support for different client classes
+4. **Phase 4**: Machine learning-based difficulty adjustment
+
+The chosen SHA-256 hashcash approach provides the optimal balance of simplicity, performance, and security for our anti-DDoS use case while maintaining flexibility for future enhancement.
\ No newline at end of file
diff --git a/docs/PROTOCOL.md b/docs/PROTOCOL.md
new file mode 100644
index 0000000..1c27038
--- /dev/null
+++ b/docs/PROTOCOL.md
@@ -0,0 +1,390 @@
+# Word of Wisdom Protocol Specification
+
+## Overview
+
+The **Word of Wisdom** protocol is a TCP-based challenge-response protocol designed to mitigate DDoS attacks by requiring clients to solve a **Proof-of-Work (PoW)** puzzle before accessing protected resources (quotes).
+
+The protocol uses **HMAC-signed challenges** so the server stays **stateless** (no challenge storage is needed), aggressive timeouts to prevent slowloris attacks, and simple binary framing with JSON payloads.
+
+## Proof of Work Algorithm Choice
+
+### Selected Algorithm: SHA-256 Hashcash with HMAC Authentication
+
+The protocol uses **SHA-256 based Hashcash** with **HMAC-signed challenges** for secure, stateless operation.
+
+For detailed analysis of alternative PoW algorithms and comprehensive justification of this choice, see [POW_ANALYSIS.md](./POW_ANALYSIS.md).
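As a rough illustration of how SHA-256 Hashcash and an HMAC-signed challenge fit together, here is a minimal Go sketch. It is not the project's implementation: the helper names (`sign`, `verifyTag`, `solve`, `check`, `leadingZeroBits`), the demo secret, and the decimal nonce encoding are all assumptions made for this example. The `challenge:nonce` hashing layout and the Base64URL-encoded HMAC-SHA256 signature follow what these documents describe.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"math/bits"
	"strconv"
)

// leadingZeroBits counts the zero bits at the start of a SHA-256 digest.
func leadingZeroBits(sum [32]byte) int {
	n := 0
	for _, b := range sum {
		if b == 0 {
			n += 8
			continue
		}
		n += bits.LeadingZeros8(b)
		break
	}
	return n
}

// sign returns a Base64URL-encoded HMAC-SHA256 tag over the challenge string.
func sign(secret []byte, challenge string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(challenge))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// verifyTag recomputes the tag and compares it in constant time.
func verifyTag(secret []byte, challenge, tag string) bool {
	return hmac.Equal([]byte(sign(secret, challenge)), []byte(tag))
}

// check reports whether SHA-256("challenge:nonce") has enough leading zero bits.
func check(challenge, nonce string, difficulty int) bool {
	sum := sha256.Sum256([]byte(challenge + ":" + nonce))
	return leadingZeroBits(sum) >= difficulty
}

// solve tries decimal nonces 0, 1, 2, ... until one satisfies the difficulty.
// (Decimal nonces are an assumption of this sketch.)
func solve(challenge string, difficulty int) string {
	for i := 0; ; i++ {
		nonce := strconv.Itoa(i)
		if check(challenge, nonce, difficulty) {
			return nonce
		}
	}
}

func main() {
	secret := []byte("demo-secret") // placeholder; a real server would load this from config
	challenge := "192.168.1.100:8080:1640995200:4:a1b2c3d4e5f6"

	tag := sign(secret, challenge)  // server side: issue the challenge
	nonce := solve(challenge, 4)    // client side: brute-force a nonce
	fmt.Println(verifyTag(secret, challenge, tag), check(challenge, nonce, 4))
}
```

The server needs only the secret to validate what the client echoes back, which is why no per-challenge storage is required. A full verifier would also reject timestamps outside the TTL window before checking the hash.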
+
+### Key Benefits
+
+- **Proven Security**: Battle-tested in Bitcoin with 15+ years of production experience
+- **Stateless Server**: HMAC signatures eliminate challenge storage, enabling horizontal scaling
+- **Universal Compatibility**: No memory requirements ensure compatibility across all client types
+- **Fast Verification**: Near-instant verification enables high server throughput under attack
+- **Adjustable Difficulty**: Fine-grained control through leading zero bits (3-10 bits practical range)
+
+### Difficulty Scaling Strategy
+
+- **Normal Operations**: 4-bit difficulty (average 16 hash attempts)
+- **Load-Based Adjustment**: +1 bit difficulty when server load exceeds threshold
+- **Failure-Based Penalty**: +2 bits per 5 failed attempts in 2-minute window (capped at +6 extra bits)
+- **Success Reset**: Failure counter resets to zero after successful solution
+
+## Protocol Flow
+
+### Successful Flow
+```
+Client                                      Server
+  |                                         |
+  |-------- CHALLENGE_REQUEST ------------->|
+  |                                         |
+  |<------- CHALLENGE_RESPONSE -------------| (HMAC-signed)
+  |                                         |
+  |-------- SOLUTION_REQUEST -------------->|
+  |                                         |
+  |<------- QUOTE_RESPONSE -----------------| (if solution valid)
+  |                                         |
+```
+
+### Error Flow
+```
+Client                                      Server
+  |-------- CHALLENGE_REQUEST ------------->|
+  |<------- CHALLENGE_RESPONSE -------------|
+  |-------- SOLUTION_REQUEST (invalid) ---->|
+  |<------- ERROR_RESPONSE -----------------| (if solution invalid)
+```
+
+## Message Format
+
+All protocol messages use a binary format with the following structure:
+
+```
++------------------+------------------+------------------+
+|   Message Type   |      Length      |     Payload      |
+|     (1 byte)     |    (4 bytes)     |    (N bytes)     |
++------------------+------------------+------------------+
+```
+
+- **Message Type**: Single byte indicating message type (see table below)
+- **Length**: 32-bit big-endian integer indicating payload length in bytes
+- **Payload**: Variable-length payload (can be empty, maximum 8KB for security)
+
+### Encoding Details
+- **Endianness**: All multi-byte integers use big-endian encoding
+- **JSON Format**: UTF-8 encoding, compact format (no pretty-printing)
+- **Size Limits**: Maximum 8KB payload to prevent memory exhaustion attacks
+
+## Message Types
+
+| Value | Name | Direction | Description |
+|-------|------|-----------|-------------|
+| 0x01 | CHALLENGE_REQUEST | Client → Server | Client requests a new PoW challenge |
+| 0x02 | CHALLENGE_RESPONSE | Server → Client | Server issues HMAC-signed challenge |
+| 0x03 | SOLUTION_REQUEST | Client → Server | Client submits challenge + nonce |
+| 0x04 | QUOTE_RESPONSE | Server → Client | Server sends quote (if solution valid) |
+| 0x05 | ERROR_RESPONSE | Server → Client | Server reports an error |
+
+## Message Payloads
+
+### CHALLENGE_REQUEST (0x01)
+- **Payload**: Empty
+- **Description**: Client requests a new challenge from the server
+- **Usage**: First message in the protocol flow
+
+### CHALLENGE_RESPONSE (0x02)
+- **Payload**: JSON-encoded challenge object
+- **Description**: Server provides HMAC-signed challenge for PoW computation
+- **Format**:
+```json
+{
+  "id": "challenge_unique_id",
+  "timestamp": 1640995200,
+  "difficulty": 4,
+  "resource": "192.168.1.100:8080",
+  "random": "a1b2c3d4e5f6",
+  "hmac": "base64url_encoded_signature"
+}
+```
+
+**Field Descriptions**:
+- **id**: Unique identifier for this challenge
+- **timestamp**: Unix timestamp when challenge was created
+- **difficulty**: Number of leading zero bits required in solution hash
+- **resource**: Server resource identifier (typically IP:port)
+- **random**: Random hex string for challenge uniqueness
+- **hmac**: HMAC-SHA256 signature of canonical challenge fields
+
+**Security Notes**:
+- Server is **stateless**: no need to store challenges locally
+- HMAC signature prevents challenge forgery and tampering
+- Timestamp enables TTL validation without server-side storage
+
+### SOLUTION_REQUEST (0x03)
+- **Payload**: JSON-encoded solution object
+- **Description**: Client submits PoW solution with original challenge
+- **Format**:
+```json
+{
+  "challenge": {
+    "id": "challenge_unique_id",
+    "timestamp": 1640995200,
+    "difficulty": 4,
+    "resource": "192.168.1.100:8080",
+    "random": "a1b2c3d4e5f6",
+    "hmac": "base64url_encoded_signature"
+  },
+  "nonce": "solution_nonce_value"
+}
+```
+
+**Requirements**:
+- Client must echo the complete original challenge object
+- Nonce must produce a valid PoW hash with required difficulty
+- Challenge must not be expired (within TTL window)
+
+### QUOTE_RESPONSE (0x04)
+- **Payload**: JSON-encoded quote object
+- **Description**: Server sends inspirational quote after successful PoW verification
+- **Format**:
+```json
+{
+  "text": "The only way to do great work is to love what you do.",
+  "author": "Steve Jobs",
+  "category": "motivation"
+}
+```
+
+**Field Descriptions**:
+- **text**: The inspirational quote text
+- **author**: Attribution for the quote
+- **category**: Thematic category (motivation, wisdom, success, etc.)
+ +### ERROR_RESPONSE (0x05) +- **Payload**: JSON-encoded error object +- **Description**: Server reports errors in client requests or server state +- **Format**: +```json +{ + "code": "INVALID_SOLUTION", + "message": "The provided PoW solution is incorrect", + "retry_after": 30 +} +``` + +**Field Descriptions**: +- **code**: Machine-readable error code (see Error Codes section) +- **message**: Human-readable error description +- **retry_after**: Optional delay in seconds before client should retry + +## Error Codes + +| Code | Description | Client Action | Server Action | +|------|-------------|---------------|---------------| +| **MALFORMED_MESSAGE** | Invalid frame format or JSON parsing error | Disconnect and retry with correct format | Log error and close connection | +| **INVALID_CHALLENGE** | Challenge HMAC signature verification failed | Request new challenge from server | Generate new valid challenge | +| **INVALID_SOLUTION** | PoW hash verification failed for submitted nonce | Retry with correct nonce computation | Log failed attempt for rate limiting | +| **EXPIRED_CHALLENGE** | Challenge timestamp exceeds TTL window | Request fresh challenge from server | Generate new challenge with current timestamp | +| **RATE_LIMITED** | Client exceeds request rate limits | Wait for `retry_after` seconds before retry | Apply temporary throttling to client IP | +| **SERVER_ERROR** | Internal server error or temporary unavailability | Retry connection after delay | Log error and investigate system health | +| **TOO_MANY_CONNECTIONS** | Server at maximum connection capacity | Retry connection later | Reject new connections until capacity available | +| **DIFFICULTY_TOO_HIGH** | Adaptive difficulty exceeds client capabilities | Request new challenge or give up | May reduce difficulty if appropriate | + +### Error Response Format + +All errors follow the consistent ERROR_RESPONSE format: +```json +{ + "code": "ERROR_CODE_NAME", + "message": "Human readable description of 
the error", + "retry_after": 30, + "details": { + "additional_context": "Optional additional error context" + } +} +``` + +### Error Handling Strategy +- **Client Errors**: Provide specific actionable error codes +- **Server Errors**: Log detailed information server-side, return generic client errors +- **Rate Limiting**: Include retry timing information in error responses +- **Security**: Avoid exposing internal system details in error messages + +## Hashcash Challenge Format + +The server uses **SHA-256 based Hashcash** with **HMAC authentication** for Proof of Work challenges. + +### Challenge String Structure +``` +resource:timestamp:difficulty:random +``` + +**Example**: +``` +192.168.1.100:8080:1640995200:4:a1b2c3d4e5f6 +``` + +### Solution Process +1. **Receive**: Client receives HMAC-signed challenge from server +2. **Extract**: Client extracts challenge fields to construct challenge string +3. **Iterate**: Client appends different nonce values to challenge string +4. **Hash**: Client computes SHA-256 hash of `challenge_string:nonce` +5. **Check**: Client checks if hash has required number of leading zero bits +6. **Repeat**: If not valid, increment nonce and repeat from step 4 +7. **Submit**: When valid nonce found, submit solution to server + +### Verification Process + +Server verifies solutions through the following steps: + +1. **HMAC Verification**: Verify challenge HMAC signature against server secret +2. **TTL Check**: Verify challenge timestamp is within TTL window (5 minutes) +3. **Reconstruction**: Reconstruct challenge string from submitted challenge fields +4. **Hash Computation**: Compute SHA-256 hash of `challenge_string:nonce` +5. **Difficulty Check**: Verify hash has required number of leading zero bits +6. 
**Success**: If all checks pass, grant access to quote resource + +### Difficulty Examples + +Example hashes are shown in hexadecimal, where each hex digit encodes 4 bits; each example has exactly the required number of leading zero bits. + +| Difficulty | Leading Zero Bits | Average Attempts | Example Hash | +|------------|-------------------|------------------|---------------| +| 3 | 3 bits | 8 | `1a2b3c4d...` | +| 4 | 4 bits | 16 | `0a1b2c3d...` | +| 5 | 5 bits | 32 | `05a1b2c3...` | +| 6 | 6 bits | 64 | `03a1b2c3...` | + +## Connection Management + +### Connection Lifecycle +1. **Connect**: Client establishes TCP connection to server +2. **Challenge**: Client requests and receives HMAC-signed challenge +3. **Solve**: Client solves PoW challenge offline (can take time) +4. **Submit**: Client submits solution with challenge proof +5. **Receive**: Client receives quote (if valid) or error (if invalid) +6. **Disconnect**: Connection closes automatically after response + +### Timeouts and Limits + +| Parameter | Value | Purpose | +|-----------|-------|----------| +| **Challenge TTL** | 5 minutes | Prevents stale challenge reuse | +| **Solution Timeout** | 5 seconds | Prevents slowloris attacks | +| **Connection Timeout** | 15 seconds | Limits connection holding time | +| **Message Size Limit** | 8KB | Prevents memory exhaustion | +| **Max Connections** | 1000 | Global server capacity limit | + +### Timeout Behavior +- **Challenge Expiry**: Challenges become invalid after 5 minutes from timestamp +- **Solution Window**: Client has 5 seconds to submit solution after challenge +- **Connection Limits**: Connections auto-close after 15 seconds of inactivity +- **Resource Protection**: Aggressive timeouts prevent resource exhaustion attacks + +## Rate Limiting & DDOS Protection + +### Connection-Level Protection (HAProxy/Envoy) + +Handled **before application layer** by reverse proxy: + +| Metric | Limit | Purpose | +|--------|-------|----------| +| **New Connections/sec** | ≤10 per IP | Prevents connection flooding | +| **Concurrent Connections** | ≤20 per IP | Limits resource usage per client | +| **Burst
Allowance** | 30 connections | Handles legitimate traffic spikes | +| **Global Connection Cap** | 1000 total | Protects server capacity | + +### Application-Level Protection + +#### Failed Solution Tracking +- **Counter**: Track invalid solution attempts per client IP/identifier +- **Window**: Rolling 2-minute time window for failure counting +- **Penalty**: Each group of 5 failures increases difficulty by +2 bits +- **Cap**: Maximum +6 additional difficulty bits to prevent client DOS +- **Reset**: Successful solution resets failure counter to zero + +#### Adaptive Difficulty Scaling +- **Load-Based**: +1 difficulty bit when server CPU/memory exceeds threshold +- **Attack Response**: Automatic difficulty increase during detected attacks +- **Recovery**: Gradual difficulty reduction as attack subsides +- **Monitoring**: Continuous monitoring of success/failure ratios + +### Rate Limiting Rules + +| Rule | Limit | Action | +|------|-------|--------| +| **Challenge Requests** | 10 per minute per IP | Temporary IP throttling | +| **Solution Attempts** | 5 per minute per IP | Increased difficulty penalty | +| **Invalid Solutions** | 5 per 2 minutes | +2 difficulty bits | +| **Connection Frequency** | 10 per second per IP | Connection rejection | + +## Security Considerations + +### PoW Security +- **Minimum Difficulty**: 3 leading zero bits (prevents trivial bypass attempts) +- **Maximum Difficulty**: 10 leading zero bits (prevents excessive client DOS) +- **Dynamic Scaling**: Adjusts automatically based on server load and attack patterns +- **CPU-Bound Work**: Memory-independent computation ensures fairness across hardware + +### Challenge Security +- **Uniqueness**: Each challenge includes timestamp and cryptographic random data +- **Expiration**: Challenges automatically expire after 5-minute TTL window +- **HMAC Authentication**: Prevents challenge forgery and tampering +- **Stateless Verification**: No server-side storage required for validation +- **Replay 
Protection**: Timestamp and HMAC combination prevents replay attacks + +### Input Validation +- **Message Size**: Strict 8KB maximum per message (prevents memory exhaustion) +- **JSON Schema**: All JSON payloads validated against strict schemas +- **Challenge Format**: Rigorous validation of challenge structure and fields +- **Nonce Validation**: Proper integer bounds checking for nonce values +- **Encoding Validation**: UTF-8 encoding validation for all text fields + +### Network Security +- **Connection Limits**: Per-IP and global connection rate limiting +- **Timeout Protection**: Aggressive timeouts prevent slowloris attacks +- **Resource Binding**: Challenges tied to client connection context +- **Error Information**: Limited error details to prevent information disclosure + +### Operational Security +- **HMAC Secret**: Server maintains secret key for challenge signing +- **Logging**: Comprehensive attack detection and monitoring +- **Metrics**: Real-time visibility into attack patterns and system health +- **Graceful Degradation**: System remains functional under attack conditions + +## Implementation Notes + +### Protocol Implementation +- **Endianness**: All multi-byte integers use big-endian encoding for consistency +- **JSON Encoding**: UTF-8 encoding for all text, compact format (no pretty-printing) +- **Required Fields**: All JSON fields marked as required must be present +- **Optional Fields**: Handle optional fields gracefully with sensible defaults + +### Server Implementation +- **HMAC Secret**: Server maintains cryptographically secure secret key +- **Challenge Generation**: Use cryptographically secure random number generator +- **Quote Storage**: Preload quotes from file/database on startup +- **Concurrent Handling**: Support for multiple simultaneous client connections +- **Resource Management**: Proper cleanup of connections and temporary resources + +### Client Implementation +- **PoW Computation**: Efficient nonce iteration and hash 
computation +- **Connection Management**: Proper TCP connection lifecycle handling +- **Error Handling**: Graceful handling of all error conditions +- **Retry Logic**: Intelligent retry with exponential backoff + +### Error Handling +- **Server Errors**: Always send ERROR_RESPONSE for client-detectable errors +- **Logging**: Comprehensive server-side logging for debugging and monitoring +- **Connection Termination**: Graceful connection closure on errors +- **Client Recovery**: Clients should handle errors and retry appropriately + +### Performance Considerations +- **Keep-Alive**: Not supported (one quote per connection for simplicity) +- **Connection Pooling**: Server supports concurrent connection handling +- **Memory Efficiency**: Minimal memory footprint per connection +- **CPU Efficiency**: Optimized hash computation and verification +- **Scalability**: Stateless design enables horizontal scaling + +### Future Extensions +- **Resource Types**: Protocol designed to support resources beyond quotes +- **Authentication**: Framework supports future authentication mechanisms +- **Compression**: Payload compression can be added without protocol changes +- **Encryption**: TLS termination recommended at load balancer level
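
### Reference Sketch: Nonce Search and Difficulty Check

The Solution Process and Verification Process steps in the Hashcash Challenge Format section can be sketched in Go as follows. This is a minimal illustration, not the project's implementation: the function names `leadingZeroBits`, `verify`, and `solve` are hypothetical, the sample challenge string is an illustrative value following the `resource:timestamp:difficulty:random` structure, and encoding the nonce as a decimal string is an assumption. HMAC verification and the TTL check are deliberately left out.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
	"strconv"
)

// leadingZeroBits counts the zero bits at the start of a digest.
func leadingZeroBits(digest []byte) int {
	count := 0
	for _, b := range digest {
		if b == 0 {
			count += 8
			continue
		}
		// First non-zero byte: add its leading zero bits and stop.
		return count + bits.LeadingZeros8(b)
	}
	return count
}

// verify reports whether SHA-256 of "challenge:nonce" has at least
// `difficulty` leading zero bits. The decimal nonce encoding is an assumption.
func verify(challenge string, nonce uint64, difficulty int) bool {
	sum := sha256.Sum256([]byte(challenge + ":" + strconv.FormatUint(nonce, 10)))
	return leadingZeroBits(sum[:]) >= difficulty
}

// solve tries nonces 0, 1, 2, ... until one satisfies the difficulty.
func solve(challenge string, difficulty int) uint64 {
	for nonce := uint64(0); ; nonce++ {
		if verify(challenge, nonce, difficulty) {
			return nonce
		}
	}
}

func main() {
	// Hypothetical sample challenge string with difficulty 4.
	challenge := "quotes:1640995200:4:a1b2c3d4e5f6"
	nonce := solve(challenge, 4)
	fmt.Println("nonce:", nonce, "valid:", verify(challenge, nonce, 4))
}
```

Counting bits rather than hex digits matters here: a difficulty of 4 is satisfied by any digest whose first hex digit is `0`, which is why the expected number of attempts is about 2^4 = 16.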