CreditScope β€” Hybrid Risk Decision Engine

Three-layer runtime safety classifier wired into the CreditScope chat pipeline.

Layers

Layer File What it does
Feature circuit_integration.py β†’ _classify_safety() LogReg on 4K SAE features (layer 39)
Intent backend/agent/risk/pattern_rules.py Rule-based evasion / roleplay / exploit-tail / pure-financial detectors
Response backend/agent/risk/response_guard.py Pattern banks for bypass, concealment-fraud, procedural steps, secrets, injection
Blend backend/agent/risk/risk_engine.py Weighted sum + four rule overrides β†’ 0-1 score + categorical verdict

Verdict categories

CLEAN Β· BORDERLINE Β· SUSPICIOUS Β· HIGH_RISK

Wiring

circuit_integration.py::_run_fast_analysis() calls _classify_safety() (LogReg), then immediately calls compute_final_risk() to produce the hybrid verdict. The result replaces safety.verdict and safety.adversarial_probability in the response the frontend reads β€” no frontend changes needed.

Latency

< 1 ms overhead (pure stdlib regex + weighted arithmetic).
Model inference performance is unaffected β€” the hook runs post-inference on CPU.

Related repos

  • sarel/credit-cyber-4k-features β€” SAE checkpoints + trained LogReg classifier
  • sarel/creditscope-circuit-models β€” circuit tracer models
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support