
AADC Quick Start: 3-Step Integration in Weeks

Get started with AADC in 3 simple steps: see code examples, timelines, and best practices for rapid deployment. No templates required, minimal code changes, and results in weeks, not months.

Integration in Weeks, Not Months

Unlike tape-based AAD tools that require months or years of integration effort, AADC can be deployed in weeks with minimal code changes. This quick start guide walks you through the three-step process.

What Makes AADC Integration Fast?

  • No templates required
  • Automated migration scripts
  • Comprehensive debugging tools
  • CI/CD compatible
  • Incremental integration
  • No changes to architecture

The 3-Step Process

AADC integration follows a straightforward workflow that works with C++, C#, and Python codebases:

Step 1: Replace double with idouble

Integration time: less than a month

What Happens

Use MatLogica's automated integration scripts to replace the standard double type with idouble (AADC's active type) for every variable involved in the computational flow. The provided scripts make the transformation safe and repeatable.

❌ Before

double price = monteCarloPrice(
    spot,
    vol,
    rate
);

✓ After

idouble price = monteCarloPrice(
    spot,  // idouble
    vol,   // idouble
    rate   // idouble
);
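
To keep the plain-double build available while the idouble version is validated, many teams route the change through a single type alias. A minimal sketch, assuming an AADC_ENABLED build flag and an idouble header path that are purely illustrative, not MatLogica conventions:

// real_type.h: single switch point for the active type (illustrative sketch)
#pragma once

#ifdef AADC_ENABLED
#include <aadc/idouble.h>     // header name assumed; use the one shipped with your AADC distribution
using real_t = idouble;       // AADC active type
#else
using real_t = double;        // plain build, numerically identical to the original
#endif

// Pricing code written once against the alias compiles in both modes:
inline real_t forwardPrice(real_t spot, real_t rate, real_t time) {
    return spot * (1.0 + rate * time);   // simple compounding, plain arithmetic
}

Turning the flag off restores the original double numerics, which makes the "identical numerical results" check below straightforward to demonstrate.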

Results:

  • ✓ Maintains identical numerical results
  • ✓ Code compiles and runs at full speed
  • ✓ No slowdown during development (unlike tape-based AAD)
  • ✓ Ready for Step 2

Step 2: Record Your Function

Integration time: Days

What Happens

Identify the pricing/simulation function to optimize, mark inputs and outputs, and instruct AADC to record the computational graph. This generates an optimized binary kernel.

Basic Recording Pattern

// 1. Create AADC context
AADC::Context ctx;

// 2. Mark inputs
idouble spot = 100.0;
idouble vol = 0.2;
ctx.markInput(spot);
ctx.markInput(vol);

// 3. Start recording
ctx.startRecording();

// 4. Execute function once
idouble price = myPricingFunction(spot, vol);

// 5. Mark output
ctx.markOutput(price);

// 6. Generate kernel
auto kernel = ctx.stopRecording();

Key Considerations

  • Branch handling: Make sure the inputs used for recording exercise every branch that production inputs can take
  • Typically 10-20 lines added: Minimal changes to your codebase
  • One-time cost: Recording happens once and the kernel is reused on every call (see the caching sketch after this list)
  • Fast compilation: Kernel generation takes milliseconds to seconds
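
Because recording is a one-time cost, a common pattern is to record lazily on first use and cache the resulting kernel. The sketch below reuses the simplified AADC::Context calls shown above; the lazy-caching wrapper, the representative recording inputs, and the assumption that the kernel object returned by stopRecording() can be stored and reused are illustrative additions, not part of the AADC documentation:

// Record myPricingFunction once, then hand back the cached kernel on every
// subsequent call; the static local keeps the one-time recording cost out of
// the hot path.
auto& getPricingKernel() {
    static auto kernel = [] {
        AADC::Context ctx;

        // Representative inputs used only while recording; production inputs
        // are supplied later through kernel.execute().
        idouble spot = 100.0;
        idouble vol  = 0.2;
        ctx.markInput(spot);
        ctx.markInput(vol);

        ctx.startRecording();
        idouble price = myPricingFunction(spot, vol);   // executed exactly once
        ctx.markOutput(price);

        return ctx.stopRecording();                     // optimized binary kernel
    }();
    return kernel;
}

The branch-handling caveat above still applies: if production inputs can take branches the representative inputs did not, the function needs to be re-recorded.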

Step 3: Execute the Kernel

Integration time: Days

What Happens

Use the generated kernel in place of the original function. It computes valuations and all derivatives simultaneously, with a 6-100x performance improvement.

Execution Pattern

// Process portfolio
for (auto& trade : portfolio) {
    // Set inputs
    std::vector<double> inputs = {
        trade.spot, trade.vol, trade.rate
    };
    
    // Execute kernel - computes price + all derivatives
    auto result = kernel.execute(inputs);
    
    // Extract results
    trade.price = result.value();        // Original value
    auto greeks = result.adjoints();     // All derivatives
    
    trade.delta = greeks[0];  // ∂price/∂spot
    trade.vega = greeks[1];   // ∂price/∂vol
    trade.rho = greeks[2];    // ∂price/∂rate
}
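
During the parallel-testing phase it helps to cross-check the kernel against the untouched double code path: the price should agree with the original to rounding, and each adjoint can be compared with a finite-difference bump. A sketch under stated assumptions: the Trade struct and tolerances are illustrative, and for a Monte Carlo pricer the comparison only makes sense with fixed random seeds:

// validate_kernel: parallel-testing sketch (illustrative types and tolerances)
#include <cassert>
#include <cmath>
#include <vector>

struct Trade { double spot, vol, rate; };                 // minimal illustrative trade

// Original double implementation from Step 1 ("Before"), assumed still linked in.
double monteCarloPrice(double spot, double vol, double rate);

template <class Kernel>
void validateTrade(const Trade& trade, Kernel& kernel) {
    std::vector<double> inputs = { trade.spot, trade.vol, trade.rate };

    auto result = kernel.execute(inputs);
    double kernelPrice = result.value();
    double kernelDelta = result.adjoints()[0];             // d price / d spot

    // 1) Price must match the untouched code path to rounding.
    double refPrice = monteCarloPrice(trade.spot, trade.vol, trade.rate);
    assert(std::fabs(kernelPrice - refPrice) < 1e-10);

    // 2) Adjoint delta cross-checked against a central finite-difference bump.
    const double h = 1e-4 * trade.spot;
    double fdDelta = (monteCarloPrice(trade.spot + h, trade.vol, trade.rate)
                    - monteCarloPrice(trade.spot - h, trade.vol, trade.rate)) / (2.0 * h);
    assert(std::fabs(kernelDelta - fdDelta) < 1e-4);        // tolerance is illustrative
}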

AADC Toolkit

  • AVX2 (256-bit): processes 4 double-precision values per CPU cycle
  • AVX512 (512-bit): processes 8 double-precision values per cycle (~1.7x faster than AVX2)
  • NUMA awareness: minimizes expensive remote memory access
  • Automated generation of the adjoint code, with automated multithreading even when the original code is not thread-safe (see the sketch after this list)
  • Code compression to keep the generated kernels compact
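
One common way to exploit the multithreading point above is to drive the recorded kernel from a pool of worker threads in the calling code, even though the original library was written single-threaded. The sketch assumes, purely for illustration, that kernel.execute() may be called concurrently with per-thread inputs; a small Trade struct mirrors the Step 3 example:

// Shard the portfolio across hardware threads; every thread calls the same
// recorded kernel with its own inputs.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct Trade { double spot, vol, rate; double price = 0.0, delta = 0.0; };

template <class Kernel>
void priceInParallel(std::vector<Trade>& portfolio, Kernel& kernel) {
    const unsigned nThreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < nThreads; ++t) {
        workers.emplace_back([&portfolio, &kernel, t, nThreads] {
            // Strided partition: thread t handles trades t, t + nThreads, ...
            for (std::size_t i = t; i < portfolio.size(); i += nThreads) {
                auto& trade = portfolio[i];
                std::vector<double> inputs = { trade.spot, trade.vol, trade.rate };
                auto result = kernel.execute(inputs);    // price + all Greeks in one call
                trade.price = result.value();
                trade.delta = result.adjoints()[0];
            }
        });
    }
    for (auto& w : workers) w.join();
}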

Typical Integration Timeline

  • Week 1: Setup and idouble integration (5 days). Deliverable: idouble produces identical numerical results.
  • Weeks 2-3: Identify the target function, record it, and use the recording (5-10 days). Deliverable: one function accelerated, validated, and ready for production.
  • Weeks 5-12: Expand to core pricing functions, parallel testing (8 weeks). Deliverable: core functions accelerated.
  • Weeks 12-18: Full library integration, comprehensive testing (6 weeks). Deliverable: complete integration tested.
  • Weeks 18-24: Production rollout, monitoring, and optimization (6 weeks). Deliverable: production deployment.

Total timeline (typical) for large legacy systems (10M+ LOC): 6 months.

Learn More

Explore independent benchmarks, whitepapers, and validation studies demonstrating MatLogica AADC's performance across diverse workloads. From Intel-backed research showing 1770x speedups to open-source comparisons with JAX and PyTorch, see the evidence.

Highlights:

  • Binary kernels ready to deploy
  • Execution scales elastically
  • Results returned instantly
  • 50-99% cost reduction

See how our clients transformed their quant libraries:

  • Revolutionary adjoint factor below 1: all derivatives are computed in less time than a single original valuation
  • 6-1000x performance improvements
  • Minimal code changes required
  • Production-proven at major banks