MatLogica Benchmark Results

Benchmark Results: AADC is 10x+ faster than JAX, PyTorch, and TensorFlow.

A benchmark comparing JAX, PyTorch, and TensorFlow against MatLogica AADC for quantitative applications, and a look at how AADC can be used in ML applications.


In recent years, popular tools like JAX, TensorFlow, and PyTorch have made significant strides in machine learning applications, particularly with small-scale graphs that feature large nodes. ML applications such as LLMs, or YOLOv8 for real-time object detection, use large parameter matrices (tens of millions of parameters) but typically have fewer than 100 nodes.

Quant finance models are fundamentally different: they often involve over 1,000 nodes (and, in cases such as XVA, millions), with each node performing scalar operations, such as discounting individual cashflows for specific trades. This means the valuation graph in a quant finance application is significantly larger than in a typical ML model.

Both the compilation time of the valuation graph and the performance of the resulting code (the kernel) are integral parts of the execution process in computational finance. Graph compilation time is often neglected: an approach that looks promising in a test environment can prove unviable when applied to a real problem.

MatLogica AADC is a pioneering framework initially designed for quantitative finance workloads which also excels for certain ML use cases, as discussed below. AADC leverages advanced graph compilation techniques and enables automatic differentiation (backpropagation) to achieve remarkable performance gains, as evidenced by the benchmark results.

  • The Code Generation approach to Automated Adjoint Differentiation (AAD) was first introduced by Dmitri Goloubentsev and Evgeny Lakshtanov in Wilmott magazine in 2019.

In this benchmark, we look at the performance of a “down-and-out” European call option priced with 1,000 Monte Carlo paths and 500 time steps, on a 6-core/12-thread AMD Ryzen 5 7600X CPU with AVX512 vector extensions.
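The instrument used in the benchmark can be sketched in plain NumPy. The sketch below is illustrative only: the spot, strike, barrier level, rate, and volatility are assumed values, not the parameters behind the published numbers:

```python
import numpy as np

def down_and_out_call_mc(s0=100.0, k=100.0, barrier=80.0, r=0.02,
                         sigma=0.2, t=1.0, n_paths=1000, n_steps=500,
                         seed=42):
    """Monte Carlo price of a down-and-out European call under GBM.

    A path's payoff is knocked out (set to zero) if the simulated
    spot touches or crosses the barrier at any monitoring date.
    Parameters are illustrative assumptions, not the benchmark's.
    """
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    # Simulate log-spot increments for all paths at once (vectorised).
    z = rng.standard_normal((n_paths, n_steps))
    increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    paths = np.exp(np.log(s0) + np.cumsum(increments, axis=1))
    alive = np.all(paths > barrier, axis=1)   # knock-out flag per path
    payoff = np.where(alive, np.maximum(paths[:, -1] - k, 0.0), 0.0)
    return np.exp(-r * t) * payoff.mean()

price = down_and_out_call_mc()
```

Each of the 500 time steps contributes its own scalar operations to the valuation graph, which is exactly the many-small-nodes shape discussed above.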

When graph compilation time is considered, MatLogica AADC delivers execution times of just 0.074 seconds, while the next best performer is PyTorch at 0.172s, over 2x slower.

In finance, tasks such as Pricing, Live Risk, Stress Testing, VaR, and others allow the compiled graph to be re-used, as the calculations remain the same and only input parameters change.
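The re-use pattern looks roughly like this. The NumPy function below is only a stand-in for a compiled kernel (AADC's actual recording/compilation API is not shown); the point is that the kernel is built once and only its inputs change between scenarios:

```python
import numpy as np

# Stand-in for a compiled pricing kernel: with AADC the graph would be
# recorded and compiled to machine code once; here a plain NumPy
# function plays that role to illustrate the re-use pattern.
def pricing_kernel(spot, vol, rate, strike=100.0, t=1.0,
                   n_paths=1000, n_steps=500, seed=7):
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    drift = (rate - 0.5 * vol**2) * dt
    paths = spot * np.exp(np.cumsum(drift + vol * np.sqrt(dt) * z, axis=1))
    return np.exp(-rate * t) * np.maximum(paths[:, -1] - strike, 0.0).mean()

# The same kernel is re-used across stress scenarios: only the inputs
# change, so no re-compilation is needed.
base = pricing_kernel(spot=100.0, vol=0.2, rate=0.02)
stressed = [pricing_kernel(spot=100.0 * (1 + shock), vol=0.2, rate=0.02)
            for shock in (-0.10, -0.05, 0.05, 0.10)]
```

Because compilation happens once, its cost is amortised over every scenario, bump, and live-risk tick that follows.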

In scenarios where the valuation graph is re-used, MatLogica outperforms its closest competitor, TensorFlow, by more than 16x. This performance is particularly beneficial for financial simulations that involve many nodes with smaller computations, in contrast to typical ML workloads comprising fewer, larger nodes.

Below, we provide the results of this benchmark for 1,000 paths: (1) including kernel-generation time, and (2) re-using the kernels. All measurements are in seconds, and the full benchmark is available for download here. Source code can be provided on demand.

Applications in Machine Learning

The differences between various types of neural networks (NNs) can be illustrated by examining their structure and application. For instance, GPT-3, a prominent model in the realm of natural language processing, contains about 175 billion parameters across 96 self-attention layers.

In contrast, time-series prediction or optimal-control problems typically require far fewer parameters (often up to 50,000). AADC plays a crucial role in enhancing the performance of such neural networks: it translates functional programs representing neurons into optimized machine code, significantly accelerating execution compared to leading neural-network frameworks such as JAX.
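For a sense of scale, a recurrent neuron of the kind used in such problems can be written in a few lines of NumPy. The sizes and weights below are illustrative assumptions, not a neuron from the cited work:

```python
import numpy as np

def rnn_forward(x, w_in, w_rec, w_out, b):
    """Forward pass of a minimal tanh recurrent neuron over a series.

    x: (t, d_in) time series; hidden size inferred from w_rec.
    The sequential, scalar-heavy loop is the kind of small-node
    graph that a framework can compile to machine code.
    """
    h = np.zeros(w_rec.shape[0])
    for x_t in x:
        h = np.tanh(w_in @ x_t + w_rec @ h + b)
    return w_out @ h

rng = np.random.default_rng(0)
d_in, d_h = 3, 16          # 48 + 256 + 16 + 16 = 336 parameters in total
params = dict(
    w_in=rng.standard_normal((d_h, d_in)) * 0.1,
    w_rec=rng.standard_normal((d_h, d_h)) * 0.1,
    w_out=rng.standard_normal(d_h) * 0.1,
    b=np.zeros(d_h),
)
series = rng.standard_normal((500, d_in))   # 500 time steps
y = rnn_forward(series, **params)
```

A network this small is dominated by per-step overhead rather than matrix arithmetic, which is why compiling it to native code pays off when millions of candidate neurons must be screened.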

MatLogica AADC’s superiority in this respect is well demonstrated in papers by Prof. Roland Olsson, where he shows how automatic programming can synthesize novel recurrent neurons for specific tasks such as time-series analysis, oil-well event prediction, and imitation learning for CNC machines. These papers demonstrate AADC’s advantage in evaluating large numbers of candidate neurons (typically around ten million per dataset), allowing researchers to automatically discover new neurons that deliver better accuracy than Transformers or LSTMs. AADC also provides very rapid screening of candidate neurons, at a rate that cannot be achieved using JAX or PyTorch.

Conclusions

As the demand for fast computation grows in quantitative finance and ML applications, AADC provides a solution that outperforms popular best-in-class ML frameworks by orders of magnitude without requiring users to learn new languages.

This only scratches the surface for quant models. Where more advanced techniques, such as stochastic local volatility, are used, the difference between generic ML toolkits operating on large tensors and AADC, which optimises workloads at the scalar-operation level, becomes even more pronounced.

AADC fully exploits AVX512 hardware capabilities and enables multithreading, making it a robust solution for the unique challenges presented by the quantitative finance sector. It supports well-known NumPy ufuncs and functions, allowing for seamless integration with existing Python libraries, and it can compile graphs written in a mix of C++ and Python. This combination of speed and flexibility makes AADC a game-changer for executing large-scale computational graphs, ensuring that performance is no longer a hindrance to innovation in financial analysis.

AADC both enhances the speed and efficiency of quantitative workloads and enables the design of new Neural Network architectures, making it an excellent tool for machine learning practitioners.

Please contact dmitri@matlogica.com if you require more information.