Research Project: DynHyperRAG | Original Repository
This is an enhanced fork of the official HyperGraphRAG implementation, serving as the foundation for DynHyperRAG - a novel quality-aware dynamic hypergraph RAG system for doctoral research.
Original Paper: "HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation" Haoran Luo, Haihong E, Guanting Chen, et al. NeurIPS 2025
Research Project: DynHyperRAG: Quality-Aware Dynamic Hypergraph for Efficient Retrieval-Augmented Generation (PhD Thesis, Expected 2025-06)
DynHyperRAG extends static HyperGraphRAG with three major innovations:
- Graph-Structure-Based Quality Assessment - Automatically evaluate hyperedge quality using 5 structural features (degree centrality, betweenness, clustering coefficient, hyperedge coherence, text quality)
- Quality-Aware Dynamic Weight Update - Dynamically adjust hyperedge weights based on retrieval feedback and quality scores using EMA/additive/multiplicative strategies
- Efficient Retrieval with Entity Type Filtering - Optimize retrieval speed by 30%+ through entity type filtering and quality-aware ranking
- How to automatically assess hyperedge quality using graph structural properties?
- How to dynamically adjust hyperedge weights based on retrieval feedback and quality scores?
- How to optimize retrieval efficiency while maintaining or improving accuracy?
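As a rough illustration of the second question, the three update strategies named above (EMA, additive, multiplicative) might look like the sketch below. The function name `update_weight`, the way feedback is scaled by the quality score, the 0.5 neutral point, and the clamping to [0, 1] are all assumptions made for this example; they are not taken from the DynHyperRAG code. The defaults mirror the `update_alpha` and `decay_factor` values from this README's configuration example.

```python
def update_weight(weight, feedback, quality, strategy="ema", alpha=0.1, decay=0.99):
    """Return an updated hyperedge weight (illustrative sketch only).

    weight   -- current weight, assumed to lie in [0, 1]
    feedback -- retrieval feedback signal, assumed to lie in [0, 1]
    quality  -- quality score of the hyperedge, assumed to lie in [0, 1]
    """
    # Assumption: the feedback signal is scaled by the quality score.
    target = feedback * quality
    if strategy == "ema":
        # Exponential moving average toward the target.
        new_weight = (1 - alpha) * weight + alpha * target
    elif strategy == "additive":
        # Shift the weight up or down around a neutral point of 0.5.
        new_weight = weight + alpha * (target - 0.5)
    elif strategy == "multiplicative":
        # Scale the weight up or down around a neutral point of 0.5.
        new_weight = weight * (1 + alpha * (target - 0.5))
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Assumption: a decay factor is applied, then the result is clamped to [0, 1].
    return min(1.0, max(0.0, new_weight * decay))
```

With `decay=1.0`, a weight of 0.5 that receives perfect feedback on a perfect-quality hyperedge moves to 0.55 under `ema` and `additive`, and to 0.525 under `multiplicative`.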
Academic:
- Novel graph-structure-based quality assessment algorithm
- New hyperedge coherence metric
- Quality-aware dynamic update mechanism
- Feature importance analysis with SHAP
Experimental:
- Validation on CAIL2019 (legal) and PubMed/AMiner (academic) datasets
- Retrieval accuracy improvement (MRR +X%)
- Retrieval time reduction (30%+)
- Statistical significance testing
Engineering:
- Complete open-source implementation
- Reproducible experiment pipeline
- Lightweight variant (Dyn-Hyper-RAG-Lite) for production
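For readers unfamiliar with the retrieval metrics mentioned under the experimental contributions, here is a minimal sketch of MRR and Recall@K using their standard definitions. The function names and input shapes are choices made for this example and do not describe the project's evaluation module.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """MRR over queries: mean of 1/rank of the first relevant item (0 if none)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & set(relevant))
    return hits / len(relevant)
```

For example, if the first relevant result appears at rank 2 for one query and rank 3 for another, the MRR is (1/2 + 1/3) / 2.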
Full thesis overview: docs/THESIS_OVERVIEW.md
Project specs: .kiro/specs/dynhyperrag-quality-aware/
| Phase | Weeks | Deliverables |
|---|---|---|
| Phase 1: Quality Assessment | 1-4 | Graph feature extraction, coherence metric, quality scorer, SHAP analysis |
| Phase 2: Dynamic Update | 5-7 | Weight updater (EMA/additive/multiplicative), feedback extractor, hyperedge refiner |
| Phase 3: Efficient Retrieval | 8-10 | Entity type filter, quality-aware ranker, Dyn-Hyper-RAG-Lite |
| Phase 4: Data Preparation | 11-12 | CAIL2019 loader, PubMed/AMiner loader, annotation interface, gold standard datasets |
| Phase 5: Evaluation | 13-14 | Metrics implementation, baseline methods, statistical tests |
| Phase 6: Experiments | 15-16 | Full pipeline, ablation studies, result generation |
| Phase 7: Documentation | 17-18 | API docs, user guide, thesis writing |
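One simple way to picture the Phase 1 quality scorer is as a weighted average of the five structural features. The sketch below assumes each feature has already been normalized to [0, 1] and reuses the example weights from this README's configuration snippet; the function `quality_score` and the weighted-average form are assumptions for illustration, not the project's actual scoring algorithm.

```python
# Example weights taken from the configuration snippet in this README.
FEATURE_WEIGHTS = {
    "degree_centrality": 0.2,
    "betweenness": 0.15,
    "clustering": 0.15,
    "coherence": 0.3,
    "text_quality": 0.2,
}


def quality_score(features, weights=FEATURE_WEIGHTS):
    """Weighted average of normalized feature values (hypothetical helper).

    Missing features are treated as 0.0; dividing by the total weight keeps
    the score in [0, 1] even if the weights do not sum exactly to 1.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(w * features.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight
```

Under these weights, a hyperedge with perfect coherence and no other signal would score about 0.3.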
- Flexible Configuration System
  - Environment-based configuration via `.env` files
  - Support for multiple OpenAI-compatible API providers
  - Easy deployment without code modification
  - See: `config.py` | Setup Guide
- Critical Bug Fixes
  - Fixed `AttributeError: 'function' object has no attribute 'embedding_dim'`
  - Proper handling of `EmbeddingFunc` with `functools.partial`
  - Improved error handling and retry mechanisms
- Comprehensive Documentation
  - Quick Start Guide - Get started in 5 minutes (Chinese)
  - Setup Guide - Complete setup instructions (Chinese)
  - API Reference - Complete API documentation for DynHyperRAG modules
  - User Guide - Step-by-step tutorials and examples
  - Configuration Guide - Detailed configuration options
  - FAQ - Frequently asked questions
  - Performance Analysis - HyperGraphRAG advantages & benchmarks
  - Troubleshooting Guide - Common issues & solutions
- Enhanced Example Scripts
  - Improved `script_construct.py` with progress indicators
  - Enhanced `script_query.py` with configurable query parameters
  - Better error messages and user feedback
- Interactive Web Visualization
  - Modern React-based web UI for graph exploration
  - Real-time search and filtering capabilities
  - Query path visualization with animation
  - Export functionality (PNG, SVG, JSON)
  - See: Web UI Documentation
- Comprehensive Testing Suite
  - 91+ unit and integration tests (88.3% pass rate)
  - Component tests for all UI elements
  - Service and store integration tests
  - E2E test framework with Playwright
  - See: Testing Guide
- Production-Ready Features
  - Comprehensive logging
  - Automatic retry with exponential backoff
  - Configurable query modes (local/global/hybrid/naive)
  - Validated on real-world medical datasets
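The "automatic retry with exponential backoff" feature follows a common pattern, sketched generically below. The helper `retry_with_backoff`, its parameters, and the 10% jitter are assumptions chosen for this example; the fork's own retry logic may be structured differently.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The wait doubles after each failure (base_delay, 2x, 4x, ...), capped at
    max_delay, with up to 10% random jitter added to spread out retries.
    The `sleep` parameter is injectable so the helper can be tested quickly.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

In practice the wrapped callable would be an API request, for example `retry_with_backoff(lambda: client_call(prompt))`, where `client_call` stands in for whatever function performs the request.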
HyperGraphRAG represents a significant advancement in RAG systems by using hyperedges to model n-ary relationships between entities, going beyond the binary relationships of traditional GraphRAG.
| Metric | vs Standard RAG | vs GraphRAG |
|---|---|---|
| Accuracy | +15.4% | +8.2% |
| Hallucination Rate | -27% | -18% |
| Retrieval Time | Faster | -28% (9.5s vs 13.3s) |
| Cost | Similar | -35% ($0.0032 vs $0.0049) |
Why Hypergraphs?
- Traditional GraphRAG: One edge connects only 2 entities (binary relation)
- HyperGraphRAG: One hyperedge connects N entities (n-ary relation)
- Better models real-world complex relationships (medical, legal, scientific domains)
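A small sketch can make the difference concrete. It represents one n-ary fact as a single hyperedge and then decomposes it into the binary edges a traditional graph would need. The entity names are invented for illustration and do not come from the project's datasets.

```python
from itertools import combinations

# Hypothetical n-ary fact: one clinical observation linking four entities.
hyperedge = frozenset({"patient_42", "aspirin", "warfarin", "bleeding_risk"})


def to_pairwise(edge):
    """Decompose one hyperedge into all unordered entity pairs (binary edges)."""
    return {frozenset(pair) for pair in combinations(sorted(edge), 2)}


pairwise_edges = to_pairwise(hyperedge)

print(len(hyperedge), "entities in one hyperedge")
print(len(pairwise_edges), "binary edges after decomposition")
```

The decomposition turns one fact about four entities into six separate binary edges, and nothing in those six edges records that they originated from a single joint statement. Keeping the hyperedge intact preserves that grouping.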
Detailed analysis: Performance Analysis Report
- Python 3.11+
- OpenAI API key (or compatible API endpoint)
# Clone this fork
git clone https://github.com/tao-hpu/HyperGraphRAG.git
cd HyperGraphRAG
# Create environment
conda create -n hypergraphrag python=3.11
conda activate hypergraphrag
# Install dependencies
pip install -r requirements.txt
- Copy the example configuration:
cp .env.example .env
- Edit `.env` with your API credentials:
# Required
OPENAI_API_KEY=your_api_key_here
# Optional (use OpenAI-compatible endpoints)
OPENAI_BASE_URL=https://api.openai.com/v1
# Models
EMBEDDING_MODEL=text-embedding-3-small
LLM_MODEL=gpt-4o-mini
python script_construct.py
Output: Constructs a knowledge hypergraph from `example_contexts.json`
- Extracts entities and hyperedges
- Generates embeddings for entities, hyperedges, and text chunks
- Saves to `expr/example/`
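For orientation, the environment variables from the `.env` step earlier in this section could be read along the following lines. This is a hedged sketch, not the contents of `config.py`: the `Settings` class and `load_settings` function are hypothetical, and it assumes the variables are already present in the process environment (for instance, exported by the shell or loaded by a dotenv-style library).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables shown in the .env example."""
    api_key: str
    base_url: str
    embedding_model: str
    llm_model: str


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    Defaults for the optional values mirror the .env example in this README.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return Settings(
        api_key=api_key,
        base_url=env.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
    )
```

Failing fast on a missing key gives a clearer error than a later authentication failure from the API provider.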
python script_query.py
Output: Runs a sample complex query using hybrid mode (local + global retrieval)
# Start backend API
python3 -m uvicorn api.main:app --host 0.0.0.0 --port 3401
# In another terminal, start frontend
cd web_ui
pnpm install
pnpm dev
Access: Open http://localhost:3400 in your browser
Features:
- Interactive graph visualization with Cytoscape.js
- Real-time node/edge search and filtering
- Query execution with path visualization
- Export graphs as PNG, SVG, or JSON
- Responsive design for mobile/tablet/desktop
Full guide: Web UI Documentation
HyperGraphRAG/
├── .env.example                    # Configuration template
├── config.py                       # Configuration management system
├── script_construct.py             # Enhanced construction script
├── script_query.py                 # Enhanced query script
│
├── hypergraphrag/                  # Core library
│   ├── base.py                     # Base classes and interfaces
│   ├── hypergraphrag.py            # Main HyperGraphRAG implementation
│   ├── operate.py                  # Core operations (extract, query)
│   ├── llm.py                      # LLM interface
│   ├── storage.py                  # Storage implementations
│   ├── prompt.py                   # Prompt templates
│   ├── utils.py                    # Utility functions
│   │
│   ├── quality/                    # Quality assessment module (DynHyperRAG)
│   │   ├── __init__.py             # Module exports
│   │   ├── scorer.py               # Quality scoring algorithm
│   │   ├── features.py             # Graph structure feature extraction
│   │   ├── coherence.py            # Hyperedge coherence metric
│   │   └── analyzer.py             # Feature importance analysis (SHAP)
│   │
│   ├── dynamic/                    # Dynamic update module (DynHyperRAG)
│   │   ├── __init__.py             # Module exports
│   │   ├── weight_updater.py       # Weight update mechanism
│   │   ├── feedback_extractor.py   # Feedback signal extraction
│   │   └── refiner.py              # Hyperedge refinement
│   │
│   ├── retrieval/                  # Efficient retrieval module (DynHyperRAG)
│   │   ├── __init__.py             # Module exports
│   │   ├── entity_filter.py        # Entity type filtering
│   │   ├── quality_ranker.py       # Quality-aware ranking
│   │   └── lite_retriever.py       # Lightweight retriever
│   │
│   ├── evaluation/                 # Evaluation framework (DynHyperRAG)
│   │   ├── __init__.py             # Module exports
│   │   ├── metrics.py              # Evaluation metrics
│   │   ├── baselines.py            # Baseline methods
│   │   └── pipeline.py             # Experiment pipeline
│   │
│   └── data/                       # Data processing (DynHyperRAG)
│       ├── __init__.py             # Module exports
│       ├── cail2019_loader.py      # CAIL2019 legal dataset
│       ├── academic_loader.py      # PubMed/AMiner academic dataset
│       └── annotator.py            # Annotation interface
│
├── api/                            # FastAPI backend for visualization
│   ├── main.py                     # API entry point
│   ├── routes/                     # API endpoints
│   ├── services/                   # Business logic
│   └── models/                     # Data models
│
├── web_ui/                         # React-based web visualization
│   ├── src/                        # Frontend source code
│   ├── e2e/                        # E2E tests (Playwright)
│   ├── TESTING_GUIDE.md            # Testing documentation
│   └── TEST_SUMMARY.md             # Test results
│
├── docs/                           # Comprehensive documentation
│   ├── THESIS_OVERVIEW.md          # DynHyperRAG thesis overview
│   ├── QUICKSTART.md               # Quick start guide (Chinese)
│   ├── SETUP.md                    # Setup guide (Chinese)
│   ├── performance-analysis.md     # Performance & advantage analysis
│   ├── troubleshooting.md          # Troubleshooting guide
│   ├── architecture.md             # Architecture & design
│   └── visualization/              # Visualization documentation
│
├── .kiro/specs/                    # Research specifications
│   └── dynhyperrag-quality-aware/
│       ├── requirements.md         # Research requirements
│       ├── design.md               # System design
│       └── tasks.md                # Implementation tasks
│
├── expr/                           # Experiment data
│   ├── example/                    # Original medical dataset
│   ├── cail2019/                   # Legal dataset (planned)
│   └── pubmed/                     # Academic dataset (planned)
│
└── evaluation/                     # Evaluation scripts
| Feature | Static HyperGraphRAG | DynHyperRAG (This Research) |
|---|---|---|
| Relationship Type | n-ary | n-ary |
| Graph Structure | Static | Dynamic |
| Quality Assessment | None | Graph structure features |
| Weight Update | None | Feedback-driven |
| Retrieval Optimization | Graph traversal | Type filtering + quality ranking |
| Efficiency | Slow | Optimized (retrieval time -30%) |
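The "Type filtering + quality ranking" row can be illustrated with a short sketch: candidates whose entity types fall outside the domain taxonomy are dropped first, and the remainder are ranked by a blend of similarity and quality. The `Hyperedge` dataclass, the `filter_and_rank` function, and the linear blend with `quality_weight` are assumptions for this example, not the project's retrieval API.

```python
from dataclasses import dataclass


@dataclass
class Hyperedge:
    """Hypothetical retrieval candidate used only for this illustration."""
    id: str
    entity_types: frozenset
    similarity: float  # query similarity, assumed in [0, 1]
    quality: float     # quality score, assumed in [0, 1]


def filter_and_rank(candidates, allowed_types, quality_weight=0.3, top_k=5):
    """Keep candidates that touch an allowed entity type, then rank them.

    The ranking score is a linear blend of similarity and quality; the blend
    and its default weight are assumptions made for this sketch.
    """
    allowed = frozenset(allowed_types)
    kept = [h for h in candidates if h.entity_types & allowed]
    kept.sort(
        key=lambda h: (1 - quality_weight) * h.similarity + quality_weight * h.quality,
        reverse=True,
    )
    return kept[:top_k]
```

Filtering before scoring is what would save time in such a design: candidates outside the domain taxonomy never reach the ranking step.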
- `local`: Entity-centric retrieval (focused)
- `global`: Graph-wide retrieval (comprehensive)
- `hybrid`: Combines local + global (recommended)
- `naive`: Simple text retrieval (baseline)
from hypergraphrag import QueryParam
result = rag.query(
query_text,
param=QueryParam(
mode="hybrid",
top_k=60,
max_token_for_text_unit=4000,
max_token_for_local_context=4000,
max_token_for_global_context=4000
)
)
from hypergraphrag.quality import QualityScorer
from hypergraphrag.dynamic import WeightUpdater
from hypergraphrag.retrieval import EntityTypeFilter
# Quality assessment
scorer = QualityScorer(graph, config={
'feature_weights': {
'degree_centrality': 0.2,
'betweenness': 0.15,
'clustering': 0.15,
'coherence': 0.3,
'text_quality': 0.2
}
})
# Dynamic weight update
updater = WeightUpdater(graph, config={
'update_alpha': 0.1,
'decay_factor': 0.99,
'strategy': 'ema' # ema, additive, multiplicative
})
# Entity type filtering
entity_filter = EntityTypeFilter(graph, config={  # named entity_filter to avoid shadowing the built-in filter()
'domain': 'legal', # legal, academic
'entity_taxonomy': {
'legal': ['law', 'article', 'court', 'party', 'crime', 'penalty']
}
})
Tested on 3-document medical dataset:
- Entities extracted: 129
- Hyperedges extracted: 84
- Graph size: 213 nodes, 145 edges
- Construction time: ~85 seconds
- Query time: ~3 seconds (hybrid mode)
- API success rate: >95%
Full analysis: docs/performance-analysis.md
This is a research fork. Contributions are welcome!
- Quality assessment module (Week 1-4) ✅
- Dynamic weight update module (Week 5-7) ✅
- Efficient retrieval module (Week 8-10) ✅
- CAIL2019 legal dataset preparation (Week 11-12) ✅
- PubMed/AMiner academic dataset preparation (Week 11-12) ✅
- Evaluation framework (metrics, baselines, pipeline) (Week 13-14) ✅
- Ablation studies framework (Week 15-16) ✅
- Full experiments and result generation (Week 15-16)
- Thesis writing (Week 17-18)
- Visualization tools for hypergraph exploration ✅
- Comprehensive testing infrastructure ✅
- Production-ready configuration system ✅
- Web UI with interactive graph exploration ✅
- CAIL2019 legal dataset loader and processor ✅
- Academic dataset (PubMed/AMiner) loader ✅
- Evaluation metrics (MRR, Recall@K, Precision@K, F1@K, NDCG@K) ✅
- Baseline methods (BM25, TF-IDF, Dense Retrieval, GraphRAG) ✅
- Experiment pipeline with statistical testing ✅
- Ablation studies framework ✅
- Optimization for large-scale documents
- Advanced hyperedge extraction algorithms
- Multi-language support
- Integration with other vector databases
- Performance optimization for web UI
If you use this fork or DynHyperRAG in your research, please cite:
@misc{luo2025hypergraphrag,
title={HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation},
author={Haoran Luo and Haihong E and Guanting Chen and Yandan Zheng and Xiaobao Wu and Yikai Guo and Qika Lin and Yu Feng and Zemin Kuang and Meina Song and Yifan Zhu and Luu Anh Tuan},
year={2025},
eprint={2503.21322},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2503.21322}
}
@phdthesis{liu2025dynhyperrag,
title={DynHyperRAG: Quality-Aware Dynamic Hypergraph for Efficient Retrieval-Augmented Generation},
author={Hao Liu and Tao An},
year={2025},
school={[University Name]},
note={PhD Thesis, Expected June 2025. Core algorithm by Hao Liu, programming and code review by Tao An},
url={https://github.com/HaoLiu923/DynHyperRAG}
}
@misc{liu2025hypergraphrag-enhanced,
title={HyperGraphRAG: Enhanced Implementation with Production Features},
author={Hao Liu and Tao An},
year={2025},
url={https://github.com/HaoLiu923/DynHyperRAG},
note={Enhanced fork with configuration system, visualization, and DynHyperRAG research implementation. Core algorithm by Hao Liu, programming and code review by Tao An}
}
Core Algorithm Developer: Hao Liu (HaoLiu923)
- Homepage: haoliu923.github.io
- ORCID: 0009-0001-9948-8409
Programming & Code Review: Tao An
- Homepage: tao-hpu.github.io
- ORCID: 0009-0006-2933-0320
Location: Beijing
For questions about this fork: Open an issue on GitHub
For questions about the original implementation: Contact the original authors at [email protected]
This fork is based on the excellent work by Haoran Luo, Haihong E, and the LHRLAB team. Their groundbreaking research on hypergraph-structured knowledge representation (NeurIPS 2025) laid the foundation for this enhanced implementation.
Special thanks to:
- Haoran Luo and the research team for their innovative HyperGraphRAG paper
- LHRLAB for the original implementation and continuous maintenance
- The open-source community for valuable feedback and contributions
This fork also benefits from related work:
- CHDA - Clinical Hypergraph Data Analysis framework
- cognitive-workspace - Active memory management for LLMs
This project builds upon excellent open-source projects:
- LightRAG - RAG framework foundation
- Text2NKG - Knowledge graph extraction
- HAHE - Hypergraph algorithms
- Claude Code - Assisted in documentation, bug fixes, and analysis
This fork maintains the same license as the original repository. Please refer to the original repository for license details.
- Original Repository: LHRLAB/HyperGraphRAG
- NeurIPS 2025 Paper: arXiv:2503.21322
- Original Author: Haoran Luo
- Fork Repository: HaoLiu923/DynHyperRAG
- Development Repository: tao-hpu/HyperGraphRAG
- Thesis Overview: docs/THESIS_OVERVIEW.md
- Research Specs: .kiro/specs/dynhyperrag-quality-aware/
- Core Algorithm: Hao Liu | ORCID: 0009-0001-9948-8409
- Programming & Review: Tao An | ORCID: 0009-0006-2933-0320
- cognitive-workspace - Active memory management for LLMs
- CHDA - Clinical Hypergraph Data Analysis
- CAIL2019: Chinese AI and Law Challenge 2019 (Legal domain)
- PubMed Knowledge Graph: Academic papers and citations
- AMiner: Academic social network dataset
⭐ If you find this fork useful, please consider giving it a star!
Made with ❤️ in Beijing
