Járókelő RAG System - Performance Tracking

Welcome to my Járókelő RAG System's Performance Tracker Dashboard. This platform collects and visualizes civic issues reported across Budapest, helping you explore, analyze, and understand urban challenges. Dive into embeddings, RAG evaluations, and other experiments to see how AI aids civic engagement.

About This Project

The Járókelő RAG System builds a comprehensive Retrieval-Augmented Generation (RAG) pipeline for data from Járókelő.hu, a Hungarian civic platform for reporting and tracking public issues across Budapest and other Hungarian cities.

This project came from a personal passion for civic engagement. As someone who regularly bikes around Budapest, I've witnessed the disrepair, vandalism, and neglect of our public spaces. Instead of falling into apathy, I began using the Járókelő platform and discovered hope that even in challenging political and financial climates, issues can and will be resolved when properly reported and tracked.

Key Features

The system enables users to browse issues by district, category, and status while generating summaries and insights from reports using AI-powered semantic search and pattern detection.

My Experiments & Tools

Embedding Comparison

Compare different embedding models to see how they represent issue descriptions, enabling better semantic search and clustering of similar reports. The comparison benchmarks Hungarian, English, and multilingual models on real civic data.
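To make the idea of semantic search over embeddings concrete, here is a minimal sketch that ranks issue descriptions by cosine similarity to a query vector. The vectors and issue IDs are toy values invented for illustration, and cosine similarity is an assumption about the scoring function; the project's actual models and similarity measure may differ.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional "embeddings" (hypothetical; real models emit hundreds of dimensions).
toy_issues = {
    "pothole": [0.9, 0.1],
    "graffiti": [0.0, 1.0],
}
ranking = rank_by_similarity([1.0, 0.0], toy_issues)
```

Running the same ranking with vectors from two different embedding models, then comparing which reports land near the top, is one simple way to compare how the models represent the same descriptions.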

View Embeddings

RAG Evaluation

Automated evaluation of the Retrieval-Augmented Generation pipeline using standard IR metrics (hit rate, recall@k, precision@k). Tracks performance across different models, top-k values, and languages.
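For readers unfamiliar with these metrics, the sketch below shows how hit rate, recall@k, and precision@k are commonly computed for a single query. The function names and the example document IDs are illustrative assumptions, not the project's evaluation code.

```python
from typing import Sequence, Set


def hit_rate_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in retrieved[:k]) else 0.0


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k
```

For example, with retrieved results `["a", "b", "c", "d"]` and relevant set `{"b", "d", "e"}`, the top 3 contain one relevant document, so recall@3 and precision@3 are both 1/3 and the hit rate at 3 is 1.0. Averaging each metric over all evaluation queries gives the figures reported per model and per top-k value.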

View RAG Eval

Streamlit UI

Interactive web interface for querying the RAG system with debug capabilities. Features query input, real-time processing logs, and detailed response analysis for civic issue exploration.

Learn More

Embeddings Visualizations

Interactive 2D scatter plots of civic issues mapped by content similarity using t-SNE dimensionality reduction. Explore patterns and clusters colored by district, status, category, or institution.

View Visualizations

Power BI Dashboard

Interactive dashboard visualizing civic issues across Budapest by district, category, status, and temporal trends. Currently in development, with an automated CSV export pipeline.

Coming Soon

Performance Optimizations

Performance engineering work that made startup 17x faster through URL index caching. Includes a benchmarking analysis of async vs sync approaches, real-world performance testing, and scalability projections to 1M+ records.
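The general idea behind a URL index cache can be sketched as follows: build the set of known issue URLs once, persist it, and load the saved copy on later startups instead of rebuilding it. This is a minimal illustration under stated assumptions; the JSON file format, the `load_url_index` name, and the `rebuild` callback are hypothetical and not taken from the project.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, Set


def load_url_index(cache_path: Path, rebuild: Callable[[], Iterable[str]]) -> Set[str]:
    """Return the set of known URLs, using the on-disk cache when it exists.

    `rebuild` is a hypothetical callback that performs the slow full scan;
    it is only called when no cache file is present.
    """
    if cache_path.exists():
        # Fast path: read the previously saved index.
        return set(json.loads(cache_path.read_text(encoding="utf-8")))

    # Slow path: rebuild the index, then save it for the next startup.
    urls = set(rebuild())
    cache_path.write_text(json.dumps(sorted(urls)), encoding="utf-8")
    return urls
```

A real cache would also need an invalidation rule (for example, rebuilding when the underlying data files change), which this sketch leaves out.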

View Performance Report

Optimized Status Pipeline

A roughly 24x speedup for status updates: a timeout-prone 6-hour process became a 4-job pipeline that completes in 10-15 minutes (6 hours divided by 15 minutes is 24). Features targeted URL detection and a parallel processing architecture.
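The two ideas named here, checking only the URLs that need it and spreading the work over parallel jobs, can be sketched in a few lines. This is not the project's pipeline: the `select_targets` rule (skip issues already in a final status), the status names, and the use of a thread pool are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical set of statuses that no longer change and need no re-check.
FINAL_STATUSES = {"resolved"}


def select_targets(known_status: Dict[str, str]) -> List[str]:
    """Targeted detection: keep only URLs whose status may still change."""
    return [url for url, status in known_status.items() if status not in FINAL_STATUSES]


def split_into_jobs(urls: List[str], n_jobs: int = 4) -> List[List[str]]:
    """Distribute URLs round-robin across n_jobs batches."""
    return [urls[i::n_jobs] for i in range(n_jobs)]


def run_jobs(urls: List[str], check_status: Callable[[str], str], n_jobs: int = 4) -> Dict[str, str]:
    """Check each batch in its own worker thread and merge the results."""

    def process(batch: List[str]) -> Dict[str, str]:
        return {url: check_status(url) for url in batch}

    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        for batch_result in pool.map(process, split_into_jobs(urls, n_jobs)):
            results.update(batch_result)
    return results
```

Here `check_status` stands in for whatever function fetches the current status of one issue URL. Skipping finished issues shrinks the workload, and the batches then run side by side rather than one after another.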

View Pipeline Optimization