
πŸ” GraphRAG Inference Hackathon β€” 3-Pipeline Benchmarking System

TigerGraph · 3 Pipelines · 14 Novelties · 12 LLMs · 12 Papers · 55 Tests

One query in → three pipelines run → side-by-side responses + metrics out.

Proving that graphs make LLM inference faster, cheaper, and smarter — backed by 12 research papers, 6 novel graph-retrieval techniques (of 14 novelties total), and the full hackathon evaluation stack.

3-Pipeline Architecture · TG GraphRAG Integration · Novelties · Evaluation · Quick Start


🎯 What This Is

A 3-pipeline GraphRAG benchmarking system built on top of the TigerGraph GraphRAG repo, with 14 novel techniques from 2024–2025 research, 12 LLM providers, and a production dashboard showing all three pipelines side-by-side with LLM-as-a-Judge + BERTScore evaluation.

| Pipeline 1: LLM-Only | Pipeline 2: Basic RAG | Pipeline 3: GraphRAG |
|---|---|---|
| Query → LLM → Answer | Query → Embed → Top-K Chunks → LLM | Query → TG GraphRAG Service → NoveltyEngine → LLM |
| No retrieval. Worst-case baseline. | Vector embeddings. Industry standard. | Built on tigergraph/graphrag + 6 novelties. |
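
For orientation, here is a minimal sketch of the three pipeline shapes. The llm, vector_top_k, and graph_retrieve callables are placeholders, not the actual orchestration_layer.py API.

# Minimal sketch only; dependencies are passed in as plain callables.

def answer_llm_only(query, llm):
    # Pipeline 1: no retrieval, worst-case baseline
    return llm(query)

def answer_basic_rag(query, llm, vector_top_k, k=5):
    # Pipeline 2: embed the query, fetch the top-k chunks, prepend them to the prompt
    context = "\n".join(vector_top_k(query, k=k))
    return llm(f"Context:\n{context}\n\nQuestion: {query}")

def answer_graphrag(query, llm, graph_retrieve):
    # Pipeline 3: graph retrieval (entities + relationships + passages), then LLM
    ctx = graph_retrieve(query)
    return llm(f"Entities: {ctx['entities']}\nRelationships: {ctx['relationships']}\n"
               f"Passages: {ctx['passages']}\n\nQuestion: {query}")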

🐯 TigerGraph GraphRAG Integration

Pipeline 3 is built on top of the official TigerGraph GraphRAG repo (Path B: customize). The integration layer (tg_graphrag_client.py) wraps the official service:

from graphrag.layers.tg_graphrag_client import TGGraphRAGClient

client = TGGraphRAGClient(service_url="http://localhost:8000")
client.connect()

# Official retrievers: Hybrid Search, Community, Sibling
result = client.retrieve(query="What did Einstein discover?",
                         retriever="hybrid", top_k=5, num_hops=2)
result = client.retrieve(query="Main themes?",
                         retriever="community", community_level=2)

Modes: REST API (official service) → Direct pyTigerGraph (fallback) → Offline (passage-based).
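
A short illustration of how a caller might lean on that fallback order, reusing connect() and retrieve() from the snippet above; the exception handling and the offline return shape are assumptions, since the client resolves its mode internally.

from graphrag.layers.tg_graphrag_client import TGGraphRAGClient

def retrieve_any_mode(query: str):
    client = TGGraphRAGClient(service_url="http://localhost:8000")
    try:
        client.connect()  # REST API against the official service
        return client.retrieve(query=query, retriever="hybrid", top_k=5)
    except Exception:
        # Direct pyTigerGraph, then offline passage-based retrieval, would be tried
        # next; this stub is shown only to make the ordering explicit.
        return {"mode": "offline", "passages": []}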

# Deploy official TG GraphRAG + point our system at it
git clone https://github.com/tigergraph/graphrag && cd graphrag && docker-compose up -d
export GRAPHRAG_SERVICE_URL=http://localhost:8000
python -m graphrag.main benchmark --samples 50

πŸ—οΈ 3-Pipeline Architecture

┌───────────────────────────────────────────────────────────────────────────┐
│  LAYER 4: EVALUATION                                                      │
│  LLM-as-a-Judge (PASS/FAIL, ≥90%) │ BERTScore F1 (≥0.55) │ RAGAS │ F1/EM  │
├───────────────────────────────────────────────────────────────────────────┤
│  LAYER 3: UNIVERSAL LLM (12 Providers)                                    │
├───────────────────────────────────────────────────────────────────────────┤
│  LAYER 2: 3-PIPELINE ORCHESTRATION + NOVELTY ENGINE                       │
│  Pipeline 1: LLM-Only │ Pipeline 2: Basic RAG │ Pipeline 3: GraphRAG      │
│  NoveltyEngine: PolyG Router → PPR → Spreading Activation → Token Budget  │
├───────────────────────────────────────────────────────────────────────────┤
│  LAYER 1: GRAPH                                                           │
│  TG GraphRAG Service (official repo) ←→ Direct pyTigerGraph (fallback)    │
│  Retrievers: Hybrid, Community, Sibling │ GSQL: PPR, Paths, Activation    │
└───────────────────────────────────────────────────────────────────────────┘

Pipeline 3 Flow

Query → keyword extraction → TG GraphRAG Service (hybrid retriever)
      → NoveltyEngine: PolyG Router → PPR → Spreading Activation → Token Budget
      → Structured context (entities + relationships + passages) → LLM → Answer
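
A compressed sketch of that flow, reusing the client from the integration section; the novelty stages are passed in as plain callables because their real signatures live in novelties.py.

def graphrag_pipeline(query, client, novelty_stages, llm):
    # 1. Graph retrieval via the official TG GraphRAG service (hybrid retriever)
    context = client.retrieve(query=query, retriever="hybrid", top_k=5, num_hops=2)

    # 2. NoveltyEngine stages: PolyG Router -> PPR -> Spreading Activation -> Token Budget
    for stage in novelty_stages:
        context = stage(query, context)

    # 3. Structured context -> LLM -> answer
    prompt = (f"Entities: {context.get('entities')}\n"
              f"Relationships: {context.get('relationships')}\n"
              f"Passages: {context.get('passages')}\n\nQuestion: {query}")
    return llm(prompt)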

🌟 14 Novel Techniques

Graph Retrieval (6 papers, wired into Pipeline 3 via NoveltyEngine)

| # | Technique | Paper | Result | Code |
|---|---|---|---|---|
| 1 | PPR Confidence Retrieval | CatRAG | Best reasoning on 4 benchmarks | PPRConfidenceScorer |
| 2 | Spreading Activation | SA-RAG | +39% correctness | SpreadingActivation |
| 3 | Flow-Pruned Paths | PathRAG | 62–65% win rate | PathPruner |
| 4 | Token Budget Controller | TERAG | 97% token reduction | TokenBudgetController |
| 5 | PolyG Hybrid Router | RAGRouter-Bench | Adaptive > fixed | PolyGRouter |
| 6 | Incremental Updates | TG-RAG | O(new) cost | IncrementalGraphUpdater |
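
To give a feel for novelty #1 outside the repo, here is a standalone toy of PPR-style confidence scoring using networkx personalized PageRank; the actual implementation is PPRConfidenceScorer backed by GSQL on TigerGraph, so everything below is illustrative only.

import networkx as nx

# Toy knowledge graph; in the real system the graph lives in TigerGraph
G = nx.Graph()
G.add_edges_from([
    ("Einstein", "relativity"), ("Einstein", "photoelectric effect"),
    ("relativity", "spacetime"), ("Newton", "gravity"),
])

seeds = {"Einstein": 1.0}  # entities matched from the query seed the random walk
scores = nx.pagerank(G, alpha=0.85, personalization=seeds)
top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)  # the nodes most confidently reachable from the query entities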

Architecture + System (#7–14)

Schema-bounded extraction, dual-level keywords, adaptive routing, graph reasoning explanation, 12-provider LLM, OpenClaw agent, live 3-pipeline dashboard, advanced GSQL queries.


📊 Evaluation Framework

All hackathon-required metrics implemented in evaluation_layer.py:

| Metric | Target | Implementation |
|---|---|---|
| LLM-as-a-Judge (PASS/FAIL) | ≥ 90% pass rate | compute_llm_judge() — reference-guided, CoT, JSON output |
| BERTScore F1 | ≥ 0.55 rescaled / ≥ 0.88 raw | compute_bertscore() — roberta-large with rescaling |
| F1 / Exact Match | — | SQuAD/HotpotQA standard |
| RAGAS | — | Faithfulness, Relevancy, Context Precision/Recall |
| Token Efficiency | — | Per-pipeline, per-query tracking |
| Cost per Query | — | tokens × provider_pricing |
| Latency | — | End-to-end ms |

from graphrag.layers.evaluation_layer import compute_llm_judge, compute_bertscore

# LLM-as-a-Judge
result = compute_llm_judge(question, reference, candidate, llm_fn)
# → {"verdict": "PASS", "feedback": "..."}

# BERTScore
results = compute_bertscore(predictions, references, rescale=True)
# → {"mean_f1": 0.62, "pass_rate": 0.85}
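
The cost metric is just tokens × provider pricing; a minimal sketch with placeholder per-1K-token prices (not the pricing table the repo actually ships):

# Placeholder prices in USD per 1K tokens; swap in the real provider_pricing table
PRICE_PER_1K = {"gpt-4o-mini": {"prompt": 0.00015, "completion": 0.0006}}

def cost_per_query(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    p = PRICE_PER_1K[model]
    return prompt_tokens / 1000 * p["prompt"] + completion_tokens / 1000 * p["completion"]

print(cost_per_query("gpt-4o-mini", 1800, 250))  # ~0.00042 USD for this query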

🚀 Quick Start

git clone https://huggingface.co/muthuk1/graphrag-inference-hackathon
cd graphrag-inference-hackathon && cp .env.example .env
pip install -r requirements.txt

# Setup TigerGraph (schema + core + advanced GSQL queries)
python graphrag/setup_tigergraph.py

# 3-pipeline benchmark
python -m graphrag.main benchmark --samples 50 --output results.json

# 3-column Gradio dashboard
python -m graphrag.main dashboard

# Next.js dashboard
cd web && npm install && npm run dev

# Docker
docker build -t graphrag . && docker run -p 3000:3000 -p 7860:7860 --env-file .env graphrag

# Free (Ollama)
ollama pull llama3.2 && python -m graphrag.main demo

πŸ“ Project Structure

graphrag/layers/
  tg_graphrag_client.py       # 🆕 Official TG GraphRAG service integration
  orchestration_layer.py      # 🆕 3-pipeline + NoveltyEngine wiring
  evaluation_layer.py         # 🆕 LLM-Judge + BERTScore + RAGAS + F1/EM
  novelties.py                # 6 novel techniques (PPR, activation, paths, budget, router, incremental)
  graph_layer.py              # TigerGraph GSQL + schema
  gsql_advanced.py            # Advanced GSQL (PPR, paths, activation)
  llm_layer.py / universal_llm.py  # 12-provider LLM
graphrag/
  benchmark.py                # 🆕 3-pipeline HotpotQA benchmark
  dashboard.py                # 🆕 3-column Gradio dashboard
  setup_tigergraph.py         # 🆕 Schema + core + advanced query install
  ingestion.py / main.py
web/src/app/api/compare/      # 🆕 3-pipeline Next.js API
openclaw/                     # Agent skills
tests/                        # 55 tests

📚 References (12 Papers)

Implemented: CatRAG, SA-RAG, PathRAG, TERAG, RAGRouter-Bench, TG-RAG

Architecture: Microsoft GraphRAG, LightRAG, Youtu-GraphRAG, HippoRAG 2

Evaluation: LLM-as-a-Judge (NeurIPS 2023), BERTScore (ICLR 2020)


🔗 Links

TigerGraph GraphRAG · TigerGraph Savanna · TigerGraph MCP · TigerGraph Docs


πŸ† Built for the GraphRAG Inference Hackathon by TigerGraph

3 Pipelines · 14 Novelties · 12 Papers · 12 LLMs · 55 Tests · LLM-Judge + BERTScore · Docker

Build it. Benchmark it. Prove graph beats tokens.
