Architecture Choices
This document explains the key architectural decisions made in the Hash of Wisdom project and the reasoning behind them.
Overall Architecture
Clean Architecture
We follow Clean Architecture principles with clear layer separation:
┌─────────────────────────────────────┐
│        Infrastructure Layer         │ ← cmd/, internal/server, internal/protocol
├─────────────────────────────────────┤
│          Application Layer          │ ← internal/application (message handling)
├─────────────────────────────────────┤
│            Domain Layer             │ ← internal/service, internal/pow (business logic)
├─────────────────────────────────────┤
│           External Layer            │ ← internal/quotes (external APIs)
└─────────────────────────────────────┘
Benefits:
- Testability: Each layer can be unit tested independently
- Maintainability: Changes in one layer don't cascade
- Flexibility: Easy to swap implementations (e.g., different quote sources)
- Domain Focus: Core business rules are isolated and protected
Protocol Design
Binary Protocol with JSON Payloads
Choice: Custom binary protocol with JSON-encoded message bodies
Why Binary Protocol:
- Performance: Efficient framing and length prefixes
- Reliability: Clear message boundaries prevent parsing issues
- Extensibility: Easy to add message types and versions
Why JSON Payloads:
- Simplicity: Standard library support, easy debugging
- Flexibility: Schema evolution without breaking compatibility
- Tooling: Excellent tooling and human readability
Alternative Considered: Pure binary (Protocol Buffers)
- Rejected Because: Added complexity without significant benefit for our use case
- Trade-off: Slightly larger payload size for much simpler implementation
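The framing idea above can be sketched as follows. This is a minimal illustration, not the project's actual wire format: it assumes a hypothetical layout of one message-type byte, a 4-byte big-endian length, and a JSON body. The names `writeFrame`, `readFrame`, `roundTrip`, and the `maxPayload` limit are all invented for the example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// maxPayload is a hypothetical cap so a peer cannot force a huge allocation.
const maxPayload = 8 * 1024

var errTooLarge = errors.New("payload exceeds maxPayload")

// writeFrame encodes v as JSON and writes: type byte, 4-byte length, body.
func writeFrame(w io.Writer, msgType byte, v any) error {
	body, err := json.Marshal(v)
	if err != nil {
		return err
	}
	if len(body) > maxPayload {
		return errTooLarge
	}
	header := make([]byte, 5)
	header[0] = msgType
	binary.BigEndian.PutUint32(header[1:], uint32(len(body)))
	if _, err := w.Write(header); err != nil {
		return err
	}
	_, err = w.Write(body)
	return err
}

// readFrame reads exactly one frame and returns its type and raw JSON body.
func readFrame(r io.Reader) (byte, []byte, error) {
	header := make([]byte, 5)
	if _, err := io.ReadFull(r, header); err != nil {
		return 0, nil, err
	}
	n := binary.BigEndian.Uint32(header[1:])
	if n > maxPayload {
		return 0, nil, errTooLarge
	}
	body := make([]byte, n)
	if _, err := io.ReadFull(r, body); err != nil {
		return 0, nil, err
	}
	return header[0], body, nil
}

// roundTrip writes one frame into a buffer and reads it back.
func roundTrip(msgType byte, v any) (byte, string, error) {
	var buf bytes.Buffer
	if err := writeFrame(&buf, msgType, v); err != nil {
		return 0, "", err
	}
	gotType, body, err := readFrame(&buf)
	return gotType, string(body), err
}

func main() {
	gotType, body, err := roundTrip(0x01, map[string]int{"difficulty": 4})
	if err != nil {
		panic(err)
	}
	fmt.Printf("type=%d body=%s\n", gotType, body)
}
```

Because the reader knows the body length before it parses anything, message boundaries never depend on the JSON content itself, and the length cap is checked before any buffer is allocated.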
Stateless Challenge Design
Choice: HMAC-signed challenges with all state embedded
type Challenge struct {
Target string `json:"target"` // "quotes"
Timestamp int64 `json:"timestamp"` // Unix timestamp
Difficulty int `json:"difficulty"` // Leading zero bits
Random string `json:"random"` // Entropy
Signature string `json:"signature"` // HMAC-SHA256
}
Benefits:
- Scalability: No server-side session storage required
- Reliability: Challenges survive server restarts
- Security: HMAC prevents tampering, and the signed timestamp bounds how long a challenge can be replayed
- Simplicity: No cache management or cleanup needed
Alternative Considered: Session-based challenges
- Rejected Because: Requires distributed session management for horizontal scaling
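A minimal sketch of how signing and verifying such a challenge might look with the standard library. The `Challenge` fields come from the struct above; the `canonical` encoding (pipe-separated fields), the function names, and the secret are assumptions made for illustration and may differ from the project's implementation.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Challenge mirrors the struct shown above.
type Challenge struct {
	Target     string `json:"target"`
	Timestamp  int64  `json:"timestamp"`
	Difficulty int    `json:"difficulty"`
	Random     string `json:"random"`
	Signature  string `json:"signature"`
}

// canonical builds the bytes that get signed. The "|" separator is a
// hypothetical choice; it is only unambiguous if fields never contain "|".
func canonical(c Challenge) []byte {
	return []byte(fmt.Sprintf("%s|%d|%d|%s", c.Target, c.Timestamp, c.Difficulty, c.Random))
}

// sign returns the hex-encoded HMAC-SHA256 over every field except Signature.
func sign(c Challenge, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(canonical(c))
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the MAC and compares it in constant time.
func verify(c Challenge, secret []byte) bool {
	got, err := hex.DecodeString(c.Signature)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(canonical(c))
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret") // placeholder, not a real key
	c := Challenge{Target: "quotes", Timestamp: 1700000000, Difficulty: 4, Random: "abc123"}
	c.Signature = sign(c, secret)
	fmt.Println("valid:", verify(c, secret))

	c.Difficulty = 1 // a client lowering the difficulty invalidates the signature
	fmt.Println("tampered:", verify(c, secret))
}
```

Since the server can recompute the signature from the fields the client sends back, it needs no stored copy of the challenge, only the shared secret.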
Proof-of-Work Algorithm
SHA-256 with Leading Zero Bits
Choice: SHA-256 hashing with difficulty measured as leading zero bits
Why SHA-256:
- Security: Cryptographically secure, extensively tested
- Performance: Hardware-optimized on most platforms
- Standardization: Well-known algorithm with predictable properties
Why Leading Zero Bits:
- Exponential Scaling: Each additional bit doubles the expected work (about 2^n attempts for n bits)
- Simplicity: Easy to verify and understand
- Flexibility: Fine-grained difficulty adjustment
Alternative Considered: Scrypt/Argon2 (memory-hard functions)
- Rejected Because: Excessive complexity for DDoS protection use case
- Trade-off: ASIC resistance not needed for temporary challenges
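The leading-zero-bits check can be sketched like this. It is an illustration under assumptions: the way the nonce is appended to the challenge bytes (`hashWithNonce`) and all function names are hypothetical, not taken from the project.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// leadingZeroBits counts zero bits at the start of the hash.
func leadingZeroBits(hash [32]byte) int {
	n := 0
	for _, b := range hash {
		if b == 0 {
			n += 8
			continue
		}
		n += bits.LeadingZeros8(b)
		break
	}
	return n
}

// hashWithNonce hashes challenge || nonce (8 bytes, big-endian).
// This input layout is an assumption made for the example.
func hashWithNonce(challenge []byte, nonce uint64) [32]byte {
	buf := make([]byte, len(challenge)+8)
	copy(buf, challenge)
	binary.BigEndian.PutUint64(buf[len(challenge):], nonce)
	return sha256.Sum256(buf)
}

// verify costs one hash, no matter how hard the puzzle was to solve.
func verify(challenge []byte, nonce uint64, difficulty int) bool {
	return leadingZeroBits(hashWithNonce(challenge, nonce)) >= difficulty
}

// solve tries nonces in order until one meets the difficulty.
func solve(challenge []byte, difficulty int) uint64 {
	for nonce := uint64(0); ; nonce++ {
		if verify(challenge, nonce, difficulty) {
			return nonce
		}
	}
}

func main() {
	challenge := []byte("example-challenge")
	nonce := solve(challenge, 12)
	fmt.Println("nonce:", nonce, "valid:", verify(challenge, nonce, 12))
}
```

The asymmetry is the point of the design: the client performs many hashes on average, while the server verifies with a single one.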
Difficulty Range: 4-30 Bits
Choice: Configurable difficulty with reasonable bounds
- Minimum (4 bits): ~16 attempts average, sub-second solve time
- Maximum (30 bits): ~1 billion attempts on average (2^30); solve time depends heavily on hardware, from several seconds on many fast cores to minutes on a single core
- Default (4 bits): Balance between protection and user experience
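The attempt counts above follow from each hash having a 2^-n chance of meeting an n-bit target, so the expected number of attempts is 2^n. A small sketch of that arithmetic is shown here; the hash rate in `main` is a made-up figure for illustration, not a measurement of this project.

```go
package main

import "fmt"

// expectedAttempts returns 2^bits, the mean number of hashes needed.
func expectedAttempts(bits int) uint64 {
	return uint64(1) << uint(bits)
}

// estimateSeconds divides the expected attempts by an assumed hash rate.
func estimateSeconds(bits int, hashesPerSecond float64) float64 {
	return float64(expectedAttempts(bits)) / hashesPerSecond
}

func main() {
	const assumedRate = 5_000_000 // hypothetical hashes per second
	for _, bits := range []int{4, 16, 20, 24, 30} {
		fmt.Printf("%2d bits: %10d attempts, ~%.4f s\n",
			bits, expectedAttempts(bits), estimateSeconds(bits, assumedRate))
	}
}
```

Measuring the real hash rate on the target hardware and plugging it in gives a quick way to choose a difficulty that matches the intended solve time.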
Server Architecture
TCP Server with Per-Connection Goroutines
Choice: Custom TCP server with one goroutine per connection
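func (s *TCPServer) Start(ctx context.Context) error {
	// Start listener and store it so acceptLoop and shutdown can use it
	listener, err := net.Listen("tcp", s.config.Address)
	if err != nil {
		return err
	}
	s.listener = listener
	// Start accept loop in goroutine
	go s.acceptLoop(ctx)
	return nil // Returns immediately
}
func (s *TCPServer) acceptLoop(ctx context.Context) {
	for {
		conn, err := s.listener.Accept()
		if err != nil {
			// Accept fails once the listener is closed during shutdown
			return
		}
		// ctx.Err() is non-nil only after cancellation
		// (ctx.Done() returns a channel, so comparing it to nil is not a cancellation check)
		if ctx.Err() != nil {
			conn.Close()
			return
		}
		// Launch handler in goroutine with WaitGroup tracking
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			s.handleConnection(ctx, conn)
		}()
	}
}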
Benefits:
- Concurrency: Each connection handled in separate goroutine
- Non-blocking Start: Server starts in background, returns immediately
- Graceful Shutdown: WaitGroup ensures all connections finish before stop
- Context Cancellation: Proper cleanup when context is cancelled
- Resource Control: Connection timeouts prevent resource exhaustion
Alternative Considered: HTTP/REST API
- Rejected Because: The test task requirements call for a TCP server
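The graceful shutdown mentioned in the benefits can be sketched as a stop method that closes the listener and then waits on the WaitGroup. This is a self-contained toy server, not the project's `TCPServer`: the `server` type, `start`, `stop`, and `ping` are hypothetical names, and the handler only writes a fixed line.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sync"
)

type server struct {
	listener net.Listener
	wg       sync.WaitGroup
}

// start listens on addr and runs the accept loop in the background.
func start(addr string) (*server, error) {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	s := &server{listener: l}
	s.wg.Add(1) // tracks the accept loop itself
	go s.acceptLoop()
	return s, nil
}

func (s *server) acceptLoop() {
	defer s.wg.Done()
	for {
		conn, err := s.listener.Accept()
		if err != nil {
			return // Accept fails once stop closes the listener
		}
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			defer conn.Close()
			fmt.Fprintln(conn, "hello") // stand-in for the real handler
		}()
	}
}

// stop unblocks Accept, then waits for the loop and every handler to finish.
func (s *server) stop() {
	s.listener.Close()
	s.wg.Wait()
}

// ping connects to addr and returns the first line the server sends.
func ping(addr string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	return bufio.NewReader(conn).ReadString('\n')
}

func main() {
	s, err := start("127.0.0.1:0") // port 0 lets the OS pick a free port
	if err != nil {
		panic(err)
	}
	reply, err := ping(s.listener.Addr().String())
	if err != nil {
		panic(err)
	}
	fmt.Print(reply)
	s.stop()
}
```

Counting the accept loop in the same WaitGroup means `stop` cannot return while a new handler is still being registered.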
Connection Security: Multi-Level Timeouts
Choice: Layered timeout protection against various attacks
- Connection Timeout (15s): Maximum total connection lifetime
- Read Timeout (5s): Maximum time between incoming bytes
- Write Timeout (5s): Maximum time to send response
Protects Against:
- Slowloris: Read timeout drops clients that trickle data too slowly
- Slow POST: Connection timeout caps the total time any single request can take
- Resource Exhaustion: Automatic cleanup of stale connections
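One way to layer these timeouts is to give every read a deadline that is the earlier of the per-read timeout and the overall connection deadline. The sketch below uses only `net.Conn.SetReadDeadline`; `readOnce`, `isTimeout`, and `silentPeerTimesOut` are hypothetical helpers, and the demo shortens the read timeout to 50ms so it finishes quickly.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// readOnce performs one read bounded by both the per-read timeout and the
// overall connection deadline, whichever comes first.
func readOnce(conn net.Conn, connDeadline time.Time, readTimeout time.Duration, buf []byte) (int, error) {
	deadline := time.Now().Add(readTimeout)
	if connDeadline.Before(deadline) {
		deadline = connDeadline
	}
	if err := conn.SetReadDeadline(deadline); err != nil {
		return 0, err
	}
	return conn.Read(buf)
}

// isTimeout reports whether err was caused by an expired deadline.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// silentPeerTimesOut simulates a client that connects and never sends data.
func silentPeerTimesOut() bool {
	client, srv := net.Pipe() // in-memory connection pair
	defer client.Close()
	defer srv.Close()

	connDeadline := time.Now().Add(15 * time.Second) // connection timeout from the list above
	_, err := readOnce(srv, connDeadline, 50*time.Millisecond, make([]byte, 64))
	return isTimeout(err)
}

func main() {
	fmt.Println("silent peer timed out:", silentPeerTimesOut())
}
```

Resetting the read deadline before each read bounds the gap between bytes, while the fixed connection deadline bounds the total lifetime regardless of how active the client is.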
Configuration Management
cleanenv with YAML + Environment Variables
Choice: File-based configuration with environment variable overrides
# config.yaml
server:
address: ":8080"
pow:
difficulty: 4
# Environment override
export POW_DIFFICULTY=8
Benefits:
- Development: Easy configuration files for local development
- Production: Environment variables for containerized deployments
- Validation: Built-in validation and type conversion
- Documentation: Self-documenting with struct tags
Alternative Considered: Pure environment variables
- Rejected Because: Harder to manage complex configurations
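The override precedence (file values first, environment variables on top) can be illustrated with the standard library alone. This sketch deliberately does not use cleanenv's API; it only mimics the behaviour described above. `POW_DIFFICULTY` comes from the example, while `SERVER_ADDRESS`, the `Config` shape, and `applyEnv` are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds the two settings from the YAML example above.
type Config struct {
	Address    string
	Difficulty int
}

// applyEnv overlays environment values on top of file-based values.
// lookup has the same shape as os.LookupEnv so tests can substitute it.
func applyEnv(cfg Config, lookup func(string) (string, bool)) (Config, error) {
	if v, ok := lookup("SERVER_ADDRESS"); ok { // hypothetical variable name
		cfg.Address = v
	}
	if v, ok := lookup("POW_DIFFICULTY"); ok {
		n, err := strconv.Atoi(v)
		if err != nil {
			return cfg, fmt.Errorf("POW_DIFFICULTY: %w", err)
		}
		cfg.Difficulty = n
	}
	return cfg, nil
}

func main() {
	fromFile := Config{Address: ":8080", Difficulty: 4} // as if parsed from config.yaml
	cfg, err := applyEnv(fromFile, os.LookupEnv)
	if err != nil {
		panic(err)
	}
	fmt.Printf("address=%s difficulty=%d\n", cfg.Address, cfg.Difficulty)
}
```

Running it with `POW_DIFFICULTY=8` set in the environment should report a difficulty of 8 while the address keeps its file value.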
Observability Architecture
Prometheus Metrics
Choice: Prometheus format metrics with essential measurements
Application Metrics:
- wisdom_requests_total: All incoming requests
- wisdom_request_errors_total{error_type}: Errors by type
- wisdom_request_duration_seconds: Request processing time
- wisdom_quotes_served_total: Successfully served quotes
Go Runtime Metrics (automatically exported):
- go_memstats_*: Memory allocation and GC statistics
- go_goroutines: Current number of goroutines
- go_gc_duration_seconds: Garbage collection duration
- process_*: Process-level CPU, memory, and file descriptor stats
Design Principle: Simple metrics that provide actionable insights
- Avoided: Complex multi-dimensional metrics
- Focus: Essential health and performance indicators
- Runtime Visibility: Go collector provides deep runtime observability
Metrics at Infrastructure Layer
Choice: Collect metrics in TCP server, not business logic
// In TCP server (infrastructure)
metrics.RequestsTotal.Inc()
start := time.Now()
response, err := s.wisdomApplication.HandleMessage(ctx, msg)
metrics.RequestDuration.Observe(time.Since(start).Seconds())
Benefits:
- Separation of Concerns: Business logic stays pure
- Consistency: All requests measured the same way
- Performance: Minimal overhead in critical path
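The same idea can be expressed as a wrapper around the handler, so the business logic never imports a metrics package. The sketch below avoids the Prometheus client and uses a hypothetical `Recorder` interface with an in-memory implementation; `Handler`, `instrument`, and the sample handlers are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Handler is a stand-in for the application's message handler.
type Handler func(msg string) (string, error)

// Recorder is a hypothetical metrics sink; a real one could wrap Prometheus collectors.
type Recorder interface {
	IncRequests()
	IncErrors()
	ObserveDuration(seconds float64)
}

// counters is an in-memory Recorder, not safe for concurrent use.
type counters struct {
	requests, errors int
	totalSeconds     float64
}

func (c *counters) IncRequests()              { c.requests++ }
func (c *counters) IncErrors()                { c.errors++ }
func (c *counters) ObserveDuration(s float64) { c.totalSeconds += s }

// instrument measures every call the same way, outside the business logic.
func instrument(next Handler, rec Recorder) Handler {
	return func(msg string) (string, error) {
		rec.IncRequests()
		start := time.Now()
		resp, err := next(msg)
		rec.ObserveDuration(time.Since(start).Seconds())
		if err != nil {
			rec.IncErrors()
		}
		return resp, err
	}
}

func echoHandler(msg string) (string, error) { return msg, nil }

func failingHandler(string) (string, error) { return "", errors.New("invalid solution") }

func main() {
	rec := &counters{}
	instrument(echoHandler, rec)("hello")
	instrument(failingHandler, rec)("bad")
	fmt.Printf("requests=%d errors=%d\n", rec.requests, rec.errors)
}
```

Keeping the recorder behind an interface also makes the instrumentation itself easy to test with a fake.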
Design Patterns
Dependency Injection
All major components use constructor injection:
server := server.NewTCPServer(wisdomApplication, config, options...)
service := service.NewWisdomService(generator, verifier, quoteService)
Benefits:
- Testing: Easy to inject mocks and stubs
- Configuration: Runtime assembly of components
- Decoupling: Components don't know about concrete implementations
Interface Segregation
Small, focused interfaces for easy testing:
type ChallengeGenerator interface {
GenerateChallenge(ctx context.Context) (*Challenge, error)
}
type QuoteService interface {
GetQuote(ctx context.Context) (string, error)
}
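A short sketch of why a one-method interface helps: any type with a matching `GetQuote` can be injected, including trivial stubs. `QuoteService` is taken from above; `staticQuotes`, `brokenQuotes`, `quoteHandler`, and `fetch` are hypothetical and much simpler than the project's `WisdomService`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// QuoteService matches the interface shown above.
type QuoteService interface {
	GetQuote(ctx context.Context) (string, error)
}

// staticQuotes is a stub that always returns the same quote.
type staticQuotes struct{ quote string }

func (s staticQuotes) GetQuote(context.Context) (string, error) { return s.quote, nil }

// brokenQuotes simulates an unavailable external API.
type brokenQuotes struct{}

func (brokenQuotes) GetQuote(context.Context) (string, error) {
	return "", errors.New("quote source unavailable")
}

// quoteHandler depends only on the interface, never on a concrete source.
type quoteHandler struct{ quotes QuoteService }

func newQuoteHandler(q QuoteService) *quoteHandler { return &quoteHandler{quotes: q} }

func (h *quoteHandler) handle(ctx context.Context) (string, error) {
	q, err := h.quotes.GetQuote(ctx)
	if err != nil {
		return "", fmt.Errorf("get quote: %w", err)
	}
	return q, nil
}

// fetch wires a handler to the given source and runs it once.
func fetch(q QuoteService) (string, error) {
	return newQuoteHandler(q).handle(context.Background())
}

func main() {
	fmt.Println(fetch(staticQuotes{quote: "Stay curious."}))
	fmt.Println(fetch(brokenQuotes{}))
}
```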
Functional Options
Flexible configuration with sensible defaults:
server := NewTCPServer(application, config,
WithLogger(logger),
)
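For readers unfamiliar with the pattern, here is a minimal sketch of how functional options are typically implemented. The `Server` fields, the defaults, and the option names `WithReadTimeout` and `WithLogPrefix` are assumptions for the example; only the overall shape mirrors the call above.

```go
package main

import (
	"fmt"
	"time"
)

// Server holds settings that options may override.
type Server struct {
	addr        string
	readTimeout time.Duration
	logPrefix   string
}

// Option mutates a Server during construction.
type Option func(*Server)

// WithReadTimeout overrides the default read timeout.
func WithReadTimeout(d time.Duration) Option {
	return func(s *Server) { s.readTimeout = d }
}

// WithLogPrefix overrides the default log prefix.
func WithLogPrefix(p string) Option {
	return func(s *Server) { s.logPrefix = p }
}

// NewServer applies defaults first, then each option in order.
func NewServer(addr string, opts ...Option) *Server {
	s := &Server{addr: addr, readTimeout: 5 * time.Second, logPrefix: "server"}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	plain := NewServer(":8080")
	tuned := NewServer(":8080", WithReadTimeout(2*time.Second), WithLogPrefix("wisdom"))
	fmt.Println(plain.readTimeout, plain.logPrefix)
	fmt.Println(tuned.readTimeout, tuned.logPrefix)
}
```

New options can be added later without changing the constructor signature or breaking existing callers.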
Clean Architecture Implementation
See the layer diagram in the Overall Architecture section above for package organization.
Testing Architecture
Layered Testing Strategy
- Unit Tests: Each package tested independently with mocks
- Integration Tests: End-to-end tests with real TCP connections
- Benchmark Tests: Performance validation for PoW algorithms
// Unit test with mocks
func TestWisdomService_HandleMessage(t *testing.T) {
mockGenerator := &MockGenerator{}
mockVerifier := &MockVerifier{}
mockQuotes := &MockQuoteService{}
service := NewWisdomService(mockGenerator, mockVerifier, mockQuotes)
// Test business logic in isolation
}
// Integration test with real components
func TestTCPServer_SlowlorisProtection(t *testing.T) {
// Start real server, make slow connection
// Verify server doesn't hang
}
Security Architecture
Defense in Depth
Multiple security layers working together:
- HMAC Authentication: Prevents challenge tampering
- Timestamp Validation: Limits replay attacks to a 5-minute TTL window
- Connection Timeouts: Prevents resource exhaustion
- Proof-of-Work: Rate limiting through computational cost
- Input Validation: All protocol messages validated
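The timestamp check in the list above might look like the following sketch. The 5-minute TTL is from the list; the clock-skew tolerance and the `isFresh` name are assumptions. In a full verifier this check would run only after the HMAC signature has been validated, since an unsigned timestamp cannot be trusted.

```go
package main

import (
	"fmt"
	"time"
)

const (
	ttlSeconds     = 5 * 60 // 5-minute TTL from the list above
	maxSkewSeconds = 30     // hypothetical tolerance for clock drift between servers
)

// isFresh reports whether a challenge issued at issuedUnix is still
// acceptable at nowUnix. Both values are Unix timestamps in seconds.
func isFresh(issuedUnix, nowUnix int64) bool {
	if issuedUnix > nowUnix+maxSkewSeconds {
		return false // issued too far in the future: reject
	}
	return nowUnix-issuedUnix <= ttlSeconds
}

func main() {
	now := time.Now().Unix()
	fmt.Println("1 minute old:", isFresh(now-60, now))
	fmt.Println("10 minutes old:", isFresh(now-600, now))
}
```

A solved challenge can still be reused inside the TTL window unless the server also tracks used challenges, which is the trade-off of keeping verification stateless.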
Threat Model
Primary Threats Addressed:
- DDoS Attacks: PoW makes attacks expensive
- Resource Exhaustion: Connection timeouts and limits
- Protocol Attacks: Binary framing prevents confusion
- Replay Attacks: Timestamp validation in challenges
Threats NOT Addressed (by design):
- Authentication: Public service, no user accounts
- Authorization: All valid solutions get quotes
- Data Confidentiality: Quotes are public information
Trade-offs Made
Simplicity vs Performance
- Chose: Simple JSON payloads over binary serialization
- Trade-off: ~30% larger messages for easier debugging and maintenance
Memory vs CPU
- Chose: Stateless challenges requiring CPU verification
- Trade-off: More CPU per request for better scalability
Flexibility vs Optimization
- Chose: Interface-based design with dependency injection
- Trade-off: Small runtime overhead for much better testability
Features vs Complexity
- Chose: Essential features only (no per-IP rate limiting, user accounts, etc.)
- Benefit: Clean, focused implementation that does one thing well
Future Architecture Considerations
For production scaling, consider:
- Quote Service Enhancement: Caching, fallback quotes, multiple API sources
- Load Balancing: Multiple server instances behind load balancer
- Rate Limiting: Per-IP request limiting for additional protection
- Monitoring: Full observability stack (Prometheus, Grafana, alerting)
- Security: TLS encryption for sensitive deployments
The current architecture provides a solid foundation for these enhancements while maintaining simplicity and focus.