30 May 2025
Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning*
Vasilije Marković1, Lazar Obradović1, Laszlo Hajdu1,2,3, and Jovan Pavlović3
In this paper, we present systematic hyperparameter optimization results using cognee's modular framework across multiple QA benchmarks.
AI Memory Benchmark Results
Understanding how well different AI memory systems retain and utilize context across interactions is crucial for enhancing LLM performance. We have updated our benchmark to include a comprehensive evaluation of the cognee AI memory system against other leading tools, including LightRAG, Mem0, and Graphiti (previous result). This analysis provides a detailed comparison of performance metrics, helping developers select the best AI memory solution for their applications.

Key performance metrics
The evaluation results are based on the following metrics:
Human-like correctness
DeepEval correctness
DeepEval f1
DeepEval EM
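DeepEval's own scorers are the source of truth for the numbers below, but as a rough orientation, EM and token-level F1 in QA evaluation conventionally follow SQuAD-style scoring. A minimal sketch (not DeepEval's actual implementation, which adds answer normalization and LLM-based judging for correctness) looks like this:

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the answers match exactly after trivial normalization, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)  # per-token overlap counts
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

F1 gives partial credit for verbose but correct answers (e.g. "the capital is Paris" vs. "Paris" scores 0.4), which is why F1 and EM can diverge sharply on the same system.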
Benchmark comparison
Optimized Cognee configurations
Cognee Graph Completion with Chain-of-Thought (CoT) shows significant performance improvements over the previous non-optimized version:
Human-like Correctness: +25% (0.738 → 0.925)
DeepEval Correctness: +49% (0.569 → 0.846)
DeepEval F1: +314% (0.203 → 0.841)
DeepEval EM: +1618% (0.04 → 0.687)
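The percentage gains above are relative improvements over the baseline score. A quick script can recompute them from the before/after values (Decimal avoids binary floating-point rounding artifacts at the .5 boundary):

```python
from decimal import Decimal

def relative_improvement(before: str, after: str) -> Decimal:
    """Relative change in percent, rounded to the nearest whole number."""
    b, a = Decimal(before), Decimal(after)
    return ((a - b) / b * 100).quantize(Decimal("1"))

scores = {
    "Human-like Correctness": ("0.738", "0.925"),
    "DeepEval Correctness": ("0.569", "0.846"),
    "DeepEval F1": ("0.203", "0.841"),
    "DeepEval EM": ("0.04", "0.687"),
}
for name, (before, after) in scores.items():
    print(f"{name}: +{relative_improvement(before, after)}%")
# Reproduces the deltas quoted above: +25%, +49%, +314%, +1618%
```

The very large F1 and EM deltas reflect a low baseline more than anything else: EM starting at 0.04 means almost any absolute gain looks enormous in relative terms.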
Comprehensive Metrics Comparison
Looking for a custom deployment? Chat with our engineers!
