📄 Our paper is out: Optimizing Knowledge Graph-LLM Interface
We've published our findings on optimizing the interface between Knowledge Graphs and LLMs for complex reasoning. In this paper, we present systematic hyperparameter optimization results using cognee's modular framework across multiple QA benchmarks.
AI Memory Benchmark Results
Understanding how well different AI memory systems retain and utilize context across interactions is crucial for enhancing LLM performance.
We have updated our benchmark to include a comprehensive evaluation of the cognee AI memory system against other leading tools, including LightRAG, Mem0, and Graphiti (results carried over from our previous run).
This analysis provides a detailed comparison of performance metrics, helping developers select the best AI memory solution for their applications.
The evaluation results are based on the following metrics:
Key Performance Metrics
Results for cognee
  • Human-like Correctness: 0.93
  • DeepEval Correctness: 0.85
  • DeepEval F1: 0.84
  • DeepEval EM: 0.69
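For readers interpreting these numbers: EM (exact match) checks whether the normalized prediction matches the gold answer exactly, while F1 measures token-level overlap between prediction and gold answer. A minimal sketch of the conventional SQuAD-style definitions (this reflects the standard convention for these metric names, not necessarily DeepEval's exact normalization rules):

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized strings match exactly, else 0.0."""
    return float(pred.strip().lower() == gold.strip().lower())

def f1_score(pred: str, gold: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred_tokens = pred.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Paris", "paris"))                      # → 1.0
print(round(f1_score("the city of Paris", "Paris"), 2))   # → 0.4
```

A partially correct answer therefore earns F1 credit while scoring 0 on EM, which is why EM is typically the hardest of the four metrics to move.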
Benchmark Comparison
Optimized Cognee Configurations
Cognee Graph Completion with Chain-of-Thought (CoT) shows significant performance improvements over the previous non-optimized version:
  • Human-like Correctness: +25% (0.738 → 0.925)
  • DeepEval Correctness: +49% (0.569 → 0.846)
  • DeepEval F1: +314% (0.203 → 0.841)
  • DeepEval EM: +1618% (0.04 → 0.687)
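The relative gains listed above follow directly from the before/after scores; a quick sanity check in Python (the dictionary keys are just labels for this snippet):

```python
# Before/after scores as reported for the optimized CoT configuration.
baseline  = {"Human-like Correctness": 0.738, "DeepEval Correctness": 0.569,
             "DeepEval F1": 0.203, "DeepEval EM": 0.04}
optimized = {"Human-like Correctness": 0.925, "DeepEval Correctness": 0.846,
             "DeepEval F1": 0.841, "DeepEval EM": 0.687}

for metric, before in baseline.items():
    after = optimized[metric]
    gain = (after - before) / before * 100  # relative improvement in percent
    print(f"{metric}: {before} -> {after} (+{gain:.0f}%)")
```

Note that the EM gain looks dramatic mostly because the baseline (0.04) is near zero; the absolute jump to 0.687 is the more informative figure.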
Comprehensive Metrics Comparison
Dive Deeper
What's Next?
Continuous improvement is key. We are actively enhancing our benchmarks, integrating new metrics, and evaluating additional AI memory solutions. Stay tuned for updates and more detailed analysis.
Have questions or want help optimizing your AI system?
Last updated: August 4th, 2025