About this project
MarocRail Optimizer: Intelligent Railway Schedule Management System
Project Type: Full-Stack Machine Learning Application | Internship Project (3rd year) at ONCF

Overview
MarocRail Optimizer is a data-driven train scheduling system that predicts delays and optimizes railway operations using machine learning. This project demonstrates an end-to-end implementation of predictive analytics in transportation logistics.

Technical Implementation

Backend Architecture:
• Python/Flask REST API serving 10+ endpoints for schedule management, delay analytics, and predictions
• SQLite relational database (8.75 MB) with a normalized schema handling 34,160 delay records and 2,112 weekly schedules
• Modular script architecture for synthetic data generation, database management, and model training

Machine Learning Pipeline:
• Random Forest classifier achieving 79-80% accuracy in delay prediction using 31 engineered features (the 30-day project window was the limiting factor; accuracy could likely be improved with more time)
• Dual-model approach: classification for delay probability and regression for duration estimation
• Feature engineering incorporating temporal patterns (hour, day, season), weather conditions, route history, and cascade effects
• Trained on 6 months of realistic synthetic data spanning 10 stations, 25 routes, and 80 trains

Schedule Optimization Engine:
• Heuristic-based conflict detection algorithm identifying platform conflicts, turnaround time violations, and maintenance windows
• Automated resolution system adjusting departure times, reassigning platforms, and redistributing train capacity
• Performance tracking showing a 15%+ reduction in predicted delays post-optimization

Frontend Interface:
• Responsive bilingual dashboard (English/French) built with vanilla JavaScript and Chart.js
• Five core modules: system overview, schedule viewer, visual analytics, ML prediction interface, and optimization controls
• Real-time data visualization displaying delay patterns by cause, time, weather, and route performance

System Capabilities
• Processes and analyzes 54,284 passenger flow records for demand-based scheduling
• Handles complex railway network simulation with realistic constraints (train types: Al Boraq high-speed, TNR express, Regular)
• Generates actionable insights through interactive charts showing delay distributions, hourly patterns, and weather correlations
• Provides instant delay risk assessment for any train configuration with configurable parameters

Data Engineering
• Fully synthetic dataset generation mimicking real-world railway operations
• Realistic delay distribution: 45% passenger-related, 24% cascade effects, 16% weather, 11% technical, 4% maintenance
• Temporal modeling incorporating peak hours (6-9 AM, 5-8 PM), seasonal variations, and day-of-week patterns
• Complete data pipeline from JSON storage through SQLite indexing to model-ready feature matrices

Technical Stack
• Backend: Python 3.9+, Flask 3.0, Pandas, NumPy, Scikit-learn
• Database: SQLite with optimized indexing
• Frontend: HTML5, CSS3, JavaScript (ES6+), Chart.js
• ML Models: Random Forest (Classifier + Regressor), serialized with Joblib

Key Achievements
• Implemented a production-ready ML model with 79% prediction accuracy and a 10.86-minute MAE (mean absolute error)
• Designed a RESTful API architecture supporting concurrent requests and session management
• Created a scalable database schema handling 100,000+ records with sub-second query response
• Developed a bilingual user interface supporting French/English language switching
• Delivered complete documentation including architecture diagrams, API specs, and deployment guides

Engineering Highlights
This project demonstrates proficiency in full-stack development, machine learning deployment, database design, and transportation logistics optimization. The modular architecture allows easy extension for real-time tracking, predictive maintenance scheduling, and passenger notification systems.
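
To illustrate the heuristic platform-conflict detection mentioned under Schedule Optimization Engine, here is a minimal sketch. It is not the project's actual code: the `PlatformSlot` record, the `find_platform_conflicts` function, and the 5-minute buffer between trains are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby


@dataclass(frozen=True)
class PlatformSlot:
    """One train occupying one platform (hypothetical record layout)."""
    train_id: str
    station: str
    platform: int
    arrival: datetime
    departure: datetime


def find_platform_conflicts(slots, buffer=timedelta(minutes=5)):
    """Return (earlier_train, later_train) pairs that overlap on a platform.

    The buffer is an assumed minimum gap between consecutive trains.
    """
    def platform_key(slot):
        return (slot.station, slot.platform)

    # Sort so that slots on the same platform are adjacent and time-ordered.
    ordered = sorted(slots, key=lambda s: (s.station, s.platform, s.arrival))
    conflicts = []
    for _, group in groupby(ordered, key=platform_key):
        group = list(group)
        for i, first in enumerate(group):
            for second in group[i + 1:]:
                if second.arrival >= first.departure + buffer:
                    break  # sorted by arrival: later slots cannot overlap either
                conflicts.append((first.train_id, second.train_id))
    return conflicts


if __name__ == "__main__":
    day = datetime(2024, 3, 4)
    slots = [
        PlatformSlot("T101", "STATION_A", 1, day.replace(hour=8), day.replace(hour=8, minute=10)),
        PlatformSlot("T202", "STATION_A", 1, day.replace(hour=8, minute=12), day.replace(hour=8, minute=20)),
        PlatformSlot("T303", "STATION_A", 2, day.replace(hour=8, minute=5), day.replace(hour=8, minute=15)),
    ]
    print(find_platform_conflicts(slots))  # [('T101', 'T202')]
```

Grouping by station and platform before comparing keeps the check local to each platform. A resolution step could then shift a departure or reassign a platform for each reported pair.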
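
The temporal features described above (hour, day, season, and the 6-9 AM / 5-8 PM peak windows) could be derived from a departure timestamp roughly as follows. This is a sketch under assumptions: the feature names, the exclusive end hour of each peak window, the Saturday/Sunday weekend, and the month-to-season mapping are illustrative and not taken from the project.

```python
from datetime import datetime

# Peak windows quoted in the text; the end hour is treated as exclusive (assumption).
PEAK_WINDOWS = ((6, 9), (17, 20))

# Meteorological seasons for the northern hemisphere (assumed mapping).
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def temporal_features(departure: datetime) -> dict:
    """Build a small dict of time-based features for one departure."""
    hour = departure.hour
    weekday = departure.weekday()  # Monday = 0, Sunday = 6
    return {
        "hour": hour,
        "day_of_week": weekday,
        "is_weekend": int(weekday >= 5),
        "is_peak_hour": int(any(start <= hour < end for start, end in PEAK_WINDOWS)),
        "season": SEASON_BY_MONTH[departure.month],
    }


if __name__ == "__main__":
    # Monday 15 January 2024, 07:30: a weekday morning peak in winter.
    print(temporal_features(datetime(2024, 1, 15, 7, 30)))
```

The categorical `season` value would still need encoding (for example one-hot) before being passed to a Random Forest.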
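
The delay-cause distribution listed under Data Engineering (45% passenger-related, 24% cascade, 16% weather, 11% technical, 4% maintenance) lends itself to weighted random sampling when generating synthetic records. The sketch below shows one way this might be done with the standard library; the `sample_delay_causes` function and the cause labels are hypothetical, and the project's own generator may work differently.

```python
import random
from collections import Counter

# Cause shares quoted in the text, in percent.
DELAY_CAUSE_WEIGHTS = {
    "passenger": 45,
    "cascade": 24,
    "weather": 16,
    "technical": 11,
    "maintenance": 4,
}


def sample_delay_causes(n, seed=None):
    """Draw n delay causes according to the quoted shares."""
    rng = random.Random(seed)  # local generator so runs can be reproduced
    causes = list(DELAY_CAUSE_WEIGHTS)
    weights = list(DELAY_CAUSE_WEIGHTS.values())
    return rng.choices(causes, weights=weights, k=n)


if __name__ == "__main__":
    counts = Counter(sample_delay_causes(34_160, seed=0))
    for cause, count in counts.most_common():
        print(f"{cause:12s} {count / 34_160:.1%}")
```

With tens of thousands of draws, the empirical shares should land close to the target percentages, which gives a quick sanity check on the generated dataset.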