# Redis Integration: Questions & Answers

## Q1: Why cache locations/items if they're already in memory?

**Short Answer**: You're absolutely right - we should **NOT** cache static data that's already loaded in memory!

**Revised Approach**:

### What to Cache in Redis:
1. ✅ **Player sessions** (dynamic, needs cross-worker sharing)
2. ✅ **Location player registry** (who's where, changes constantly)
3. ✅ **Player inventory** (reduce DB queries for frequently accessed data)
4. ✅ **Active combat states** (for cross-worker coordination)
5. ✅ **Dropped items per location** (dynamic world state)

### What NOT to Cache:
1. ❌ **Locations** - Already in `LOCATIONS` dict from `world_loader.py`
2. ❌ **Items** - Already in `ITEMS_MANAGER.items` from `items.py`
3. ❌ **NPCs** - Already in `NPCS` dict from `npcs.py`
4. ❌ **Interactables** - Already in each `Location.interactables` list

**Why This Matters**:
- Each worker runs `load_world()` on startup → all static data is already in memory
- Duplicating it in Redis wastes memory and adds latency
- Redis should only store **dynamic, cross-worker state**

---

## Q2: How do unique items work?

**Database Structure**:

```python
# unique_items table (single source of truth)
unique_items = Table(
    "unique_items",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("item_id", String),     # Template reference (e.g., "iron_sword")
    Column("durability", Integer),
    Column("max_durability", Integer),
    Column("tier", Integer, default=1),
    Column("unique_stats", JSON),  # Custom stats
    Column("created_at", Float)
)

# inventory table (references unique_items)
inventory = Table(
    "inventory",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("character_id", Integer),
    Column("item_id", String),     # Template ID
    Column("quantity", Integer),   # Always 1 for unique items
    Column("unique_item_id", Integer, ForeignKey("unique_items.id")),  # Link
    Column("is_equipped", Boolean)
)
```

**Flow**:
1. **Creation**: NPC drops weapon → `create_unique_item()` → insert into `unique_items`
2. **Pickup**: Player picks up → insert into `inventory` with `unique_item_id` reference
3. **Equip**: Player equips → queries join `inventory ⋈ unique_items` to get stats
4. **Drop**: Player drops → move to `dropped_items` (keeping `unique_item_id` link)
5. **Deletion**: Item despawns → CASCADE delete removes it from `inventory`/`dropped_items`

**Redis Caching Strategy**:

```python
# Cache unique item data when equipped/viewed
key = f"unique_item:{unique_item_id}"
value = {
    "item_id": "iron_sword",
    "durability": 85,
    "max_durability": 100,
    "tier": 2,
    "unique_stats": {"damage_bonus": 5}
}
# TTL: 5 minutes (invalidate on durability change)
```

---

## Q3: How do enemies work with custom stats?

**Combat Initialization**: When combat starts, the NPC gets **randomized HP**:

```python
# NPCDefinition in npcs.py
@dataclass
class NPCDefinition:
    hp_min: int   # e.g., 80
    hp_max: int   # e.g., 120
    damage_min: int
    damage_max: int
    defense: int
    # ... other stats

# When combat starts (in game_logic.py or main.py)
import random

npc_def = NPCS.get("raider")                              # Load from memory
npc_hp = random.randint(npc_def.hp_min, npc_def.hp_max)   # Random HP

# Store in database
await db.create_combat(
    player_id=player_id,
    npc_id="raider",
    npc_hp=npc_hp,       # Randomized
    npc_max_hp=npc_hp,
    location_id=location_id
)
```

**Redis Caching for Active Combat**:

```python
# Cache active combat state (avoid repeated DB queries)
key = f"player:{character_id}:combat"
value = {
    "npc_id": "raider",
    "npc_hp": 95,
    "npc_max_hp": 115,
    "turn": "player",
    "npc_damage_min": 8,
    "npc_damage_max": 15,
    "npc_defense": 3
}
# TTL: none (deleted when combat ends)
```

**Combat Flow**:
1. Player attacks → check Redis cache for combat state
2. On a cache miss → query DB → cache in Redis
3. Calculate damage, update NPC HP
4. Update Redis cache + publish `combat_update` to player channel
5. NPC turn → repeat
6. Combat ends → delete Redis cache + publish `combat_over`

---

## Q4: How is everything loaded on server startup?
**Current Flow** (per worker):

```python
# api/main.py - Lifespan startup
@asynccontextmanager
async def lifespan(app: FastAPI):
    # 1. Database
    await db.init_db()  # Connect to PostgreSQL

    # 2. Load static data into memory (THIS PART)
    WORLD: World = load_world()  # Load locations from gamedata/locations.json
    LOCATIONS: Dict[str, Location] = WORLD.locations
    ITEMS_MANAGER = ItemsManager()  # Load items from gamedata/items.json
    # NPCs loaded in data/npcs.py module (imported on demand)

    # 3. Start background tasks (single worker via file lock)
    tasks = await background_tasks.start_background_tasks(manager, LOCATIONS)
    yield
```

**With Redis Integration**:

```python
@asynccontextmanager
async def lifespan(app: FastAPI):
    # 1. Database
    await db.init_db()

    # 2. Redis connection
    await redis_manager.connect()

    # 3. Load static data (STAYS IN MEMORY - NO REDIS CACHING)
    WORLD: World = load_world()
    LOCATIONS: Dict[str, Location] = WORLD.locations
    ITEMS_MANAGER = ItemsManager()

    # 4. Subscribe to Redis Pub/Sub channels
    location_channels = [f"location:{loc_id}" for loc_id in LOCATIONS.keys()]
    await redis_manager.subscribe_to_channels(location_channels + ['game:broadcast'])

    # 5. Start Redis message listener (background task)
    asyncio.create_task(redis_manager.listen_for_messages(manager.handle_redis_message))

    # 6. Register this worker in Redis
    await redis_manager.redis_client.sadd('active_workers', redis_manager.worker_id)

    # 7. Start background tasks (distributed via Redis locks)
    tasks = await background_tasks.start_background_tasks(manager, LOCATIONS)
    yield

    # Cleanup
    await redis_manager.redis_client.srem('active_workers', redis_manager.worker_id)
    await redis_manager.disconnect()
```

---

## Q5: How many channels can exist?

**Redis Pub/Sub Channels**:

### Fixed Channels (Always Active):
1. `game:broadcast` - Global announcements (1 channel)
2. `game:workers` - Worker coordination (1 channel)

### Dynamic Channels (Created on Demand):

**Location Channels** (14 currently):
- `location:start_point`
- `location:overpass`
- `location:gas_station`
- ... (one per location in `locations.json`)

**Player Channels** (one per connected player):
- `player:1` (character_id=1)
- `player:2`
- `player:5`
- ... (created on WebSocket connect, destroyed on disconnect)

**Total Active Channels**:
- **Minimum**: 16 (2 fixed + 14 locations)
- **With 100 players**: 116 (2 + 14 + 100)
- **With 1000 players**: 1016 (2 + 14 + 1000)

**Redis Limits**:
- Redis supports **millions** of channels simultaneously
- Each channel has minimal memory overhead (~100 bytes)
- 1000 channels ≈ 100 KB of memory (negligible)

**Subscription Strategy**:
- All workers subscribe to `game:broadcast` + all location channels
- Each worker subscribes to only its own connected players' channels
- When a player connects → the worker subscribes to `player:{id}`
- When a player disconnects → the worker unsubscribes from `player:{id}`

---

## Q6: How does client update data in the UI?

**Current Flow** (without Redis):

```
1. User clicks "Attack" button
   ↓
2. Client: POST /api/game/combat/action {"action": "attack"}
   ↓
3. Server: Process attack, update DB
   ↓
4. Server: Send WebSocket message to player
   ↓
5. Server: Query DB for other players in location
   ↓
6. Server: Send WebSocket messages to location
   ↓
7. Client: Receives WebSocket "combat_update"
   ↓
8. Client: Updates UI (HP bar, combat log)
   ↓
9. Client: GET /api/game/state (refresh full state)
   ↓
10. Server: Query DB for player, inventory, combat, etc.
   ↓
11. Client: Re-render entire game UI
```

**With Redis** (optimized):

```
1. User clicks "Attack" button
   ↓
2. Client: POST /api/game/combat/action {"action": "attack"}
   ↓
3. Server: Process attack, update DB + Redis cache
   ↓
4. Server: Publish to Redis channel "player:{id}" (personal message)
   ↓
5. Worker handling that player: Receives Redis message
   ↓
6.
Worker: Send WebSocket to local connection
   ↓
7. Client: Receives WebSocket "combat_update" with ALL needed data
   ↓
8. Client: Updates UI directly from WebSocket payload (NO API CALL)
   ↓
9. Server: Publish to Redis channel "location:{id}" (broadcast)
   ↓
10. All workers: Receive location broadcast
   ↓
11. Workers: Send WebSocket to their local connections in that location
   ↓
12. Other players: UI updates with "Jocaru is in combat"
```

**Key Changes**:
- ✅ **No more `GET /api/game/state` after actions** - WebSocket payload contains everything
- ✅ **Cross-worker broadcasts** - Redis pub/sub ensures all workers relay messages
- ✅ **Reduced DB queries** - Combat state cached in Redis
- ✅ **Faster UI updates** - WebSocket messages delivered in < 2 ms via Redis

**WebSocket Message Format** (enhanced):

```json
{
  "type": "combat_update",
  "data": {
    "message": "You dealt 12 damage!",
    "log_entry": "You dealt 12 damage!",
    "combat_over": false,
    "combat": {
      "npc_id": "raider",
      "npc_hp": 85,
      "npc_max_hp": 115,
      "turn": "npc"
    },
    "player": {
      "hp": 78,
      "stamina": 42,
      "xp": 1250,
      "level": 5
    }
  },
  "timestamp": "2025-11-09T18:00:00Z"
}
```

The client receives this → updates the HP bar, combat log, and turn indicator **WITHOUT** calling `/api/game/state`.

---

## Q7: Disconnected players staying in location?

**Excellent Gameplay Mechanic!** This adds risk/consequence to disconnecting in dangerous areas.

### Implementation:

**When Player Disconnects**:

```python
# ConnectionManager.disconnect()
async def disconnect(self, player_id: int):
    # 1. Remove local WebSocket connection
    if player_id in self.active_connections:
        del self.active_connections[player_id]

    # 2. Update Redis session (mark as disconnected)
    session = await redis_manager.get_player_session(player_id)
    if session:
        session['websocket_connected'] = 'false'
        session['disconnect_time'] = str(time.time())
        await redis_manager.set_player_session(player_id, session, ttl=3600)  # Keep for 1 hour

    # 3. KEEP player in location registry (don't remove)
    # await redis_manager.remove_player_from_location(...)  # DON'T DO THIS

    # 4. Broadcast to location
    await redis_manager.publish_to_location(
        session['location_id'],
        {
            "type": "player_status_change",
            "data": {
                "player_id": player_id,
                "username": session['username'],
                "status": "disconnected",
                "message": f"{session['username']} has disconnected (vulnerable)"
            }
        }
    )
```

**When Other Players Query Location**:

```python
# GET /api/game/location endpoint
@app.get("/api/game/location")
async def get_current_location(current_user: dict = Depends(get_current_user)):
    location_id = current_user['location_id']

    # Get players in location from Redis
    player_ids = await redis_manager.get_players_in_location(location_id)

    other_players = []
    for pid in player_ids:
        if pid == current_user['id']:
            continue

        # Get player session
        session = await redis_manager.get_player_session(pid)
        if session:
            other_players.append({
                "id": pid,
                "username": session['username'],
                "level": int(session['level']),
                "hp": int(session['hp']),
                "is_connected": session['websocket_connected'] == 'true',
                "can_attack": True  # Always true, even if disconnected!
            })

    return {
        "id": location_id,
        "other_players": other_players  # Includes disconnected players
    }
```

**Combat with Disconnected Player**:

```python
# POST /api/game/pvp/initiate
@app.post("/api/game/pvp/initiate")
async def initiate_pvp(target_id: int, current_user: dict = Depends(get_current_user)):
    # Check target session
    target_session = await redis_manager.get_player_session(target_id)
    if not target_session:
        raise HTTPException(400, detail="Target player not found")

    # Allow combat even if disconnected
    is_connected = target_session['websocket_connected'] == 'true'

    # Create PvP combat
    pvp_combat = await db.create_pvp_combat(
        attacker_id=current_user['id'],
        defender_id=target_id,
        location_id=current_user['location_id']
    )

    if is_connected:
        # Target is online → Send WebSocket notification
        await redis_manager.publish_to_player(target_id, {
            "type": "pvp_challenge",
            "data": {
                "attacker": current_user['name'],
                "attacker_level": current_user['level']
            }
        })
    else:
        # Target is offline → Auto-acknowledge, they can't respond
        await db.acknowledge_pvp_combat(pvp_combat['id'], target_id)
        # Attacker gets free first strike advantage
        return {
            "message": f"{target_session['username']} is disconnected - you get first strike!",
            "pvp_combat": pvp_combat,
            "target_vulnerable": True
        }
```

**Cleanup Policy** (optional):

```python
# Background task: Remove disconnected players after 1 hour
async def cleanup_disconnected_players():
    while True:
        await asyncio.sleep(300)  # Every 5 minutes

        # Get all player sessions
        keys = await redis_manager.redis_client.keys("player:*:session")
        for key in keys:
            session = await redis_manager.redis_client.hgetall(key)
            if session['websocket_connected'] == 'false':
                disconnect_time = float(session['disconnect_time'])

                # If disconnected for > 1 hour
                if time.time() - disconnect_time > 3600:
                    character_id = int(key.split(':')[1])
                    location_id = session['location_id']

                    # Remove from location registry
                    await redis_manager.remove_player_from_location(character_id, location_id)

                    # Delete session
                    await redis_manager.delete_player_session(character_id)

                    print(f"🧹 Cleaned up disconnected player {character_id}")
```

**UI Display** (sketch - element names are illustrative):

```tsx
// Frontend: Show disconnected status
{otherPlayers.map(player => (
  <div key={player.id}>
    <span>{player.username} Lv. {player.level}</span>
    {!player.is_connected && (
      <span>⚠️ Disconnected (Vulnerable)</span>
    )}
    {player.can_attack && (
      <button>Attack</button>
    )}
  </div>
))}
```

---

## Q8: RDB vs AOF - Code changes needed?

**Short Answer**: No code changes required, only Redis configuration.

### Redis Persistence Options:

**RDB (Snapshotting)**:
- Periodic snapshots to disk
- Fast restarts, smaller files
- May lose the last few seconds of data

**AOF (Append-Only File)**:
- Logs every write operation
- More durable, little to no data loss
- Slower restarts, larger files

**Recommended Configuration** (for your use case):

```yaml
# docker-compose.yml
echoes_redis:
  command: >
    redis-server
    --appendonly yes
    --appendfsync everysec
    --save 900 1
    --save 300 10
    --save 60 10000
    --maxmemory 512mb
    --maxmemory-policy allkeys-lru
```

- `--appendonly yes` - Enable AOF
- `--appendfsync everysec` - Sync every second (good balance)
- `--save 900 1` - RDB snapshot every 15 min if 1+ key changed
- `--save 300 10` - RDB snapshot every 5 min if 10+ keys changed
- `--save 60 10000` - RDB snapshot every 1 min if 10k+ keys changed
- `--maxmemory 512mb` - Max memory usage
- `--maxmemory-policy allkeys-lru` - Evict least recently used keys

**What This Gives You**:
- ✅ **AOF for durability**: Every write logged (max 1 second of data loss)
- ✅ **RDB for fast recovery**: Snapshots for quick restarts
- ✅ **Memory protection**: Won't crash if memory is full (evicts old caches)

**Application Code**: No changes needed! Redis handles persistence transparently.

**Testing Persistence**:

```bash
# 1. Add some data
docker exec echoes_of_the_ashes_redis redis-cli SET test:key "hello"

# 2. Restart Redis
docker restart echoes_of_the_ashes_redis

# 3. Check if the data persisted
docker exec echoes_of_the_ashes_redis redis-cli GET test:key
# Should return: "hello"
```

---

## Q9: What if cache invalidation isn't aggressive enough?

**Potential Problems**:

### 1. Stale Player Stats

**Scenario**: Player levels up, but the Redis cache shows the old level

```
1. Player gains XP → DB updated (level 6)
2. Redis cache still shows level 5
3. Other players see "Lv. 5" instead of "Lv. 6"
```

**Solution**: Invalidate on every stat change

```python
async def update_character_stats(character_id: int, **kwargs):
    # Update DB
    await db.update_character(character_id, **kwargs)

    # Invalidate Redis cache
    await redis_manager.delete_player_session(character_id)

    # Or update the cache directly instead
    session = await redis_manager.get_player_session(character_id)
    if session:
        session.update(kwargs)
        await redis_manager.set_player_session(character_id, session)
```

### 2. Ghost Items in Inventory

**Scenario**: Player drops an item, but the cache shows they still have it

```
1. Player drops "Iron Sword"
2. DB updated (inventory row deleted)
3. Redis cache still shows the sword in inventory
4. Player sees the sword in the UI, tries to equip it → Error!
```

**Solution**: Invalidate the inventory cache on add/remove/use

```python
async def remove_item_from_inventory(character_id: int, item_id: str):
    # Update DB
    await db.delete_inventory_item(character_id, item_id)

    # Invalidate cache (force reload next time)
    await redis_manager.invalidate_inventory(character_id)
```

### 3. Wrong Player Count in Location

**Scenario**: Player moves, but the old location still shows them

```
1. Player moves overpass → gas_station
2. Redis location registry not updated
3. Other players in overpass still see them
4. Broadcasts sent to the wrong location
```

**Solution**: Atomic location updates

```python
async def move_player(character_id: int, from_loc: str, to_loc: str):
    # Use a Redis transaction (atomic)
    async with redis_manager.redis_client.pipeline() as pipe:
        pipe.srem(f"location:{from_loc}:players", character_id)
        pipe.sadd(f"location:{to_loc}:players", character_id)
        await pipe.execute()
```

### 4. Combat State Desync

**Scenario**: Combat ends, but the cache shows the player still in combat

```
1. Player defeats enemy
2. DB: active_combats row deleted
3. Redis: combat cache still exists
4. Player sees combat UI, can't move
```

**Solution**: Explicit cache deletion on combat end

```python
async def end_combat(character_id: int):
    # Delete from DB
    await db.end_combat(character_id)

    # Delete Redis cache
    await redis_manager.redis_client.delete(f"player:{character_id}:combat")

    # Update player session
    session = await redis_manager.get_player_session(character_id)
    if session:
        session['in_combat'] = 'false'
        await redis_manager.set_player_session(character_id, session)
```

**General Strategy**:

```python
# PATTERN 1: Write-Through Cache (recommended for critical data)
async def update_data(key, value):
    await db.update(key, value)            # Write to DB first
    await redis_manager.cache(key, value)  # Update cache immediately

# PATTERN 2: Cache Invalidation (simpler, slight delay)
async def update_data(key, value):
    await db.update(key, value)            # Write to DB
    await redis_manager.delete_cache(key)  # Delete cache (reload on next access)

# PATTERN 3: TTL Fallback (for non-critical data)
# Set short TTLs (e.g., 30 seconds) so the cache self-expires if not invalidated
await redis_manager.cache(key, value, ttl=30)
```

**For Your Game**:
- ✅ **Aggressive invalidation** for: inventory, combat state, player stats
- ✅ **Write-through cache** for: player sessions, location registry
- ✅ **TTL fallback** for: dropped items list, interactable cooldowns

---

## Q10: No feature flags needed (dev only)

**Agreed!** Since you're the only tester, we can implement directly without feature flags.
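Without flags, the only knob worth keeping is the Redis connection string itself, so dev and Docker can differ without any code branches. A minimal stdlib sketch - the `REDIS_URL` variable name and the localhost default are assumptions, not existing project conventions:

```python
import os

def get_redis_url() -> str:
    # One env var, sensible local default - no feature flags, no if/else trees.
    # "REDIS_URL" and the default below are illustrative choices.
    return os.environ.get("REDIS_URL", "redis://localhost:6379/0")
```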
### Simplified Rollout:

**Phase 1: Redis Infrastructure (Week 1)**
- Add Redis to docker-compose
- Create `redis_manager.py`
- Test connection/pub-sub

**Phase 2: Pub/Sub Only (Week 2)**
- Update ConnectionManager to use Redis pub/sub
- Keep all other logic the same (no caching yet)
- Test cross-worker broadcasts

**Phase 3: Add Caching (Week 3)**
- Add player session cache
- Add inventory cache
- Add combat state cache
- Test performance improvements

**Phase 4: Multi-Worker (Week 4)**
- Increase workers to 2
- Test load balancing
- Monitor for race conditions

**Simplified Implementation** (no toggles):

```python
# Just implement Redis directly
async def lifespan(app: FastAPI):
    await db.init_db()
    await redis_manager.connect()  # No if/else, just do it
    # ... rest of startup
```

---

## Updated Implementation Priority

Based on your feedback, here's what we'll actually implement:

### Phase 1: Redis Pub/Sub (Core Multi-Worker Support)

**Goal**: Enable cross-worker broadcasts

**Changes**:
1. Add Redis container
2. Create `redis_manager.py` with pub/sub only
3. Update ConnectionManager:
   - Keep local WebSocket storage
   - Change `send_personal_message()` → publish to Redis
   - Change `send_to_location()` → publish to Redis
   - Add `handle_redis_message()` → send to local WebSockets
4. Subscribe to location channels on startup

**What We DON'T Cache**:
- ❌ Locations (already in memory)
- ❌ Items (already in memory)
- ❌ NPCs (already in memory)

### Phase 2: Dynamic State Caching (Performance)

**Goal**: Reduce database queries for frequently accessed data

**What We DO Cache**:
1. ✅ Player sessions (location, HP, level, stats)
2. ✅ Location player registry (set of character IDs per location)
3. ✅ Player inventory (with aggressive invalidation)
4. ✅ Active combat state (with explicit deletion)
5. ✅ Dropped items per location (with TTL)

### Phase 3: Multi-Worker Deployment

**Goal**: Horizontal scaling

**Changes**:
1. Update docker-compose for 4 workers
2. Test load distribution
3. Implement distributed background task locks
4. Monitor performance

---

## Next Steps

Ready to implement? Here's what I'll do:

1. **Create `redis_manager.py`** - Simplified version (no static data caching)
2. **Update `docker-compose.yml`** - Add Redis container
3. **Update `ConnectionManager`** - Integrate pub/sub
4. **Update endpoints** - Add cache invalidation where needed
5. **Implement disconnected players** - Keep them in location, marked as vulnerable
6. **Test suite** - Verify cross-worker communication

Do you want me to proceed with implementation?
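As a preview of step 3, the routing that `handle_redis_message()` would perform can be sketched without Redis at all: a worker receives a channel message and forwards it only to its locally connected players. The class below stands in for the real ConnectionManager - `asyncio.Queue` objects stand in for WebSockets, and the registry shapes are assumptions, not existing code:

```python
import asyncio
import json
from typing import Dict, List

class ConnectionManagerSketch:
    """Routing logic only; real WebSockets and Redis are stubbed out."""

    def __init__(self) -> None:
        # player_id -> outbound queue (stands in for a WebSocket connection)
        self.active_connections: Dict[int, asyncio.Queue] = {}
        # location_id -> player_ids connected to THIS worker
        self.local_locations: Dict[str, List[int]] = {}

    async def handle_redis_message(self, channel: str, payload: str) -> None:
        """Called for every pub/sub message; deliver only to local sockets."""
        message = json.loads(payload)
        if channel.startswith("player:"):
            player_id = int(channel.split(":", 1)[1])
            ws = self.active_connections.get(player_id)
            if ws:  # player is connected to this worker
                await ws.put(message)
        elif channel.startswith("location:"):
            location_id = channel.split(":", 1)[1]
            for pid in self.local_locations.get(location_id, []):
                ws = self.active_connections.get(pid)
                if ws:
                    await ws.put(message)

async def demo() -> list:
    mgr = ConnectionManagerSketch()
    mgr.active_connections = {1: asyncio.Queue(), 2: asyncio.Queue()}
    mgr.local_locations = {"overpass": [1, 2]}
    # Personal message reaches only player 1; location broadcast reaches both.
    await mgr.handle_redis_message("player:1", json.dumps({"type": "combat_update"}))
    await mgr.handle_redis_message("location:overpass", json.dumps({"type": "player_moved"}))
    return [mgr.active_connections[1].qsize(), mgr.active_connections[2].qsize()]
```

Messages published to a `player:{id}` channel are silently ignored by workers that don't hold that connection, which is exactly what makes the multi-worker fan-out safe.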