Documentation:
- Created docs/DEFENSIVE_LAYERS_MAPPING.md documenting all 5 defensive layers
- Maps which symptoms each layer suppresses
- Defines safe removal order after root cause fix
- Includes test methods for each layer removal
Diagnostic Logging Enhancements (ChatGPT work):
- TLS_SLL_HEAD_SET log with count and backtrace for NORMALIZE_USERPTR
- tiny_next_store_log with filtering capability
- Environment variables for log filtering:
- HAKMEM_TINY_SLL_NEXTCLS: class filter for next store (-1 disables)
- HAKMEM_TINY_SLL_NEXTTAG: tag filter (substring match)
- HAKMEM_TINY_SLL_HEADCLS: class filter for head trace
Current Investigation Status:
- sh8bench 60/120s: crash-free, zero NEXT_INVALID/HDR_RESET/SANITIZE
- BUT: shot limit (256) exhausted by class3 tls_push before class1/drain
- Need: Add tags to pop/clear paths, or increase shot limit for class1
Purpose of this commit:
- Document defensive layers for safe removal later
- Enable targeted diagnostic logging
- Prepare for final root cause identification
Next Steps:
1. Add tags to tls_sll_pop tiny_next_write (e.g., "tls_pop_clear")
2. Re-run with HAKMEM_TINY_SLL_NEXTTAG=tls_pop
3. Capture class1 writes that lead to corruption
🤖 Generated with Claude Code (https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>