30 May 2025
Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning

Vasilije Marković¹, Lazar Obradović¹, Laszlo Hajdu¹,²,³, and Jovan Pavlović³
In this paper, we present systematic hyperparameter optimization results using cognee's modular framework across multiple QA benchmarks.

AI Memory Benchmark Results

Understanding how well different AI memory systems retain and utilize context across interactions is crucial for enhancing LLM performance.
We have updated our benchmark to include a comprehensive evaluation of the cognee AI memory system against other leading tools, including LightRAG, Mem0, and Graphiti.
This analysis provides a detailed comparison of performance metrics, helping developers select the best AI memory solution for their applications.

Key performance metrics

The evaluation results are based on the following metrics. Cognee's optimized configuration scored:

- Human-like correctness: 0.93
- DeepEval correctness: 0.85
- DeepEval F1: 0.84
- DeepEval EM: 0.69
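For readers unfamiliar with the F1 and EM metrics, they are typically computed SQuAD-style over normalized answer tokens: EM checks for an exact normalized match, while F1 measures token overlap between the predicted and reference answers. The sketch below illustrates these standard definitions; it is not DeepEval's exact implementation.

```python
import re
from collections import Counter

def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split into tokens."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()

def exact_match(prediction: str, reference: str) -> float:
    """EM: 1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """F1: harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    common = Counter(pred) & Counter(ref)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("the Eiffel Tower in Paris", "Eiffel Tower")` gives 4/7 ≈ 0.57 (all reference tokens recalled, but only two of five predicted tokens match), while `exact_match` on the same pair is 0.0. This is why EM scores sit well below F1 on most QA benchmarks.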

Benchmark comparison

Optimized Cognee configurations

Cognee Graph Completion with Chain-of-Thought (CoT) shows significant performance improvements over the previous non-optimized version:
- Human-like correctness: +25% (0.738 → 0.925)
- DeepEval correctness: +49% (0.569 → 0.846)
- DeepEval F1: +314% (0.203 → 0.841)
- DeepEval EM: +1618% (0.04 → 0.687)
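The percentages above are relative gains over the previous scores, i.e. (after − before) / before. A minimal sketch reproducing them (the `pct_gain` helper is hypothetical, not part of cognee's codebase):

```python
def pct_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100

# (before, after) scores from the comparison above
scores = {
    "Human-like correctness": (0.738, 0.925),  # ≈ +25%
    "DeepEval correctness":   (0.569, 0.846),  # ≈ +49%
    "DeepEval F1":            (0.203, 0.841),  # ≈ +314%
    "DeepEval EM":            (0.04,  0.687),  # ≈ +1618%
}

for metric, (before, after) in scores.items():
    print(f"{metric}: +{pct_gain(before, after):.1f}%")
```

Note how a low baseline inflates the relative gain: EM improved by 0.647 absolute points, but because the starting score was only 0.04, the relative improvement exceeds 1600%.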

Comprehensive Metrics Comparison

Looking for a custom deployment? Chat with our engineers!
