feat: Add ACE allocation failure tracing and debug hooks

This commit adds tracing for allocation failures in the Adaptive Cache Engine (ACE). When tracing is enabled, each failed ACE allocation is logged together with its cause, so Out-Of-Memory (OOM) reports can be traced back to a specific failure path.

Key changes include:
- **ACE Tracing Implementation**:
  - Added an environment variable to enable or disable detailed logging of allocation failures.
  - Instrumented , , and  to distinguish between "Threshold" (size class mismatch), "Exhaustion" (pool depletion), and "MapFail" (OS memory allocation failure).
- **Build System Fixes**:
  - Corrected  to ensure  is properly linked into , resolving an  error.
- **LD_PRELOAD Wrapper Adjustments**:
  - Investigated and understood the  wrapper's behavior under , particularly its interaction with  and  checks.
  - Enabled debugging flags for  environment to prevent unintended fallbacks to 's  for non-tiny allocations, allowing comprehensive testing of the  allocator.
- **Debugging & Verification**:
  - Introduced temporary verbose logging to pinpoint execution flow issues within  interception and  routing. These temporary logs have been removed.
  - Created  to facilitate testing of the tracing features.
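
The three failure categories and the environment-variable gate described above can be sketched roughly as follows. This is an illustration only, not the code in this commit: `EXAMPLE_ACE_TRACE`, `ace_fail_reason`, `ace_fail_name`, `ace_format_fail`, and `ace_trace_fail` are hypothetical names, and the sketch reads the environment on every call for simplicity, whereas the diff checks a cached flag (`g_hakem_config.ace_trace`). Only the `[ACE-FAIL] <reason>: class=... size=... (...)` line format is taken from the diff.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical failure categories, named after the commit message. */
typedef enum {
    ACE_FAIL_THRESHOLD,   /* request does not match a size class */
    ACE_FAIL_EXHAUSTION,  /* pool for the class is depleted */
    ACE_FAIL_MAPFAIL      /* the OS refused to map more memory */
} ace_fail_reason;

const char *ace_fail_name(ace_fail_reason r) {
    switch (r) {
    case ACE_FAIL_THRESHOLD:  return "Threshold";
    case ACE_FAIL_EXHAUSTION: return "Exhaustion";
    case ACE_FAIL_MAPFAIL:    return "MapFail";
    }
    return "Unknown";
}

/* Tracing is on when the (placeholder) variable is set, non-empty, and not "0". */
int ace_trace_enabled(void) {
    const char *v = getenv("EXAMPLE_ACE_TRACE");
    return v != NULL && v[0] != '\0' && strcmp(v, "0") != 0;
}

/* Formats one trace line into buf. Returns 0 when tracing is off,
 * otherwise the snprintf result (length of the formatted line). */
int ace_format_fail(char *buf, size_t cap, ace_fail_reason r,
                    int class_idx, size_t size, const char *where) {
    if (!ace_trace_enabled()) return 0;
    return snprintf(buf, cap, "[ACE-FAIL] %s: class=%d size=%zu (%s)\n",
                    ace_fail_name(r), class_idx, size, where);
}

/* Writes the trace line to stderr, mirroring the fprintf calls in the diff. */
void ace_trace_fail(ace_fail_reason r, int class_idx, size_t size,
                    const char *where) {
    char line[160];
    if (ace_format_fail(line, sizeof line, r, class_idx, size, where) > 0) {
        fputs(line, stderr);
    }
}
```

A failing path would call something like `ace_trace_fail(ACE_FAIL_MAPFAIL, class_idx, run_bytes, "LargePool")` just before returning its error value, so the log line and the early return stay next to each other.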

This feature will significantly aid in diagnosing and resolving allocation-related OOM issues in  by providing clear insights into the failure pathways.
Moe Charm (CI)
2025-12-01 16:37:59 +09:00
parent 2bd8da9267
commit 4ef0171bc0
85 changed files with 5930 additions and 479 deletions

@@ -349,7 +349,12 @@ static inline int l25_alloc_new_run(int class_idx) {
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     }
-    if (raw == MAP_FAILED || raw == NULL) return 0;
+    if (raw == MAP_FAILED || raw == NULL) {
+        if (g_hakem_config.ace_trace) {
+            fprintf(stderr, "[ACE-FAIL] MapFail: class=%d size=%zu (LargePool)\n", class_idx, run_bytes);
+        }
+        return 0;
+    }
     L25ActiveRun* ar = &g_l25_active[class_idx];
     ar->base = (char*)raw;
     ar->cursor = (char*)raw;
@@ -663,6 +668,9 @@ static int refill_freelist(int class_idx, int shard_idx) {
         }
         if (!raw) {
+            if (g_hakem_config.ace_trace) {
+                fprintf(stderr, "[ACE-FAIL] MapFail: class=%d size=%zu (LargePool Refill)\n", class_idx, bundle_size);
+            }
             if (ok_any) break; else return 0;
         }