Designing Enterprise RAG with Governance
How to build grounded RAG systems using metadata-aware retrieval, reranking, and source-backed responses while reducing hallucinations in executive-facing workflows.
I build scalable backend platforms and production-grade AI agents that drive business outcomes.
I design end-to-end AI systems from ingestion and retrieval to inference, evaluation, and monitoring in production environments.
An agentic AI architecture focused on reliable orchestration, evaluation loops, and production constraints
Goal: Build a reliable career-intelligence agent that plans, critiques, and iterates before returning user-facing recommendations.
Supervisor/Worker architecture where the supervisor decomposes tasks and routes specialized prompts to worker agents for retrieval, synthesis, and validation.
Two-LLM review loop using a generator plus evaluator model to score groundedness, factual consistency, and actionability before final output.
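The generator-plus-evaluator loop above can be sketched as a small control loop. Everything here is illustrative: `fake_generate` and `fake_evaluate` are stand-ins for the real LLM calls, and the 0.8 approval threshold and 3-round cap are assumed values, not the production configuration.

```python
from typing import Callable

def review_loop(
    generate: Callable[[str, str], str],           # (task, feedback) -> draft
    evaluate: Callable[[str], tuple[float, str]],  # draft -> (score, critique)
    task: str,
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> str:
    """Generate, score, and revise until the evaluator approves the draft."""
    feedback = ""
    draft = ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        score, critique = evaluate(draft)
        if score >= threshold:
            return draft       # evaluator approved: groundedness/consistency pass
        feedback = critique    # feed the critique into the next attempt
    return draft               # best effort after max_rounds

# Stub models that show the control flow; real ones would call an LLM API.
def fake_generate(task, feedback):
    return task + (" [cited]" if "add citations" in feedback else "")

def fake_evaluate(draft):
    return (0.9, "") if "[cited]" in draft else (0.4, "add citations")

result = review_loop(fake_generate, fake_evaluate, "Summarize roadmap")
print(result)  # -> Summarize roadmap [cited]
```

The key design choice is that the critique text, not just the score, is routed back to the generator, so each round is a targeted revision rather than a blind retry.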
Resolved memory pressure on Ubuntu-hosted infrastructure using batch-limited embeddings, bounded context windows, and staged retrieval for stable latency.
Stack: Python • FastAPI • LangChain • ChromaDB • Docker • Ubuntu • OpenRouter/OpenAI • Structured evaluation prompts
I am a Senior Software & AI Systems Architect with 8+ years engineering high-stakes distributed systems. I bridge robust backend infrastructure and frontier AI to drive measurable business ROI.
Architected high-throughput microservices (Java/Spring Boot, FastAPI) for 35+ financial institutions, managing 8M+ monthly transactions with 99.8% uptime. Specialist in PCI-DSS aligned transaction engines, multi-tenant SaaS, event-driven architectures (Kafka), and containerized orchestration (Docker/K8s).
Designing production-grade RAG systems for grounded conversational intelligence. Expert in metadata-aware retrieval, vector databases (ChromaDB/FAISS), multi-LLM orchestration (OpenRouter/Ollama), and advanced reranking/grounding to mitigate hallucinations in enterprise environments.
Building full-lifecycle MLOps pipelines—from feature engineering to ensemble deployment. Engineered real-time transcription and intent detection with Faster-Whisper + Redis Streams, achieving sub-100ms latency for live analysis with explainability and compliance.
Engineering Philosophy: I build for scale, security, and explainability. My focus is creating resilient, production-grade systems that solve complex business challenges.
Building enterprise-grade RAG systems and real-time AI processing solutions, with production deployments leveraging Redis Streams, Faster-Whisper, ChromaDB, LangChain, and OpenRouter.
Completed intensive AI engineering program focused on production MLOps, NLP, and analytics leadership. Delivered 20+ end-to-end systems across fintech, insurance, and renewable energy.
Systems Built: Enterprise USSD platform • Multi-tenant microservices architecture • Credit scoring API • KYC data pipeline
Scale: 35+ financial institutions • 8M+ monthly transactions • 99.8% uptime • Thousands of concurrent sessions
Impact: 60% faster API response times • Automated credit risk assessment • Secure transaction processing • Mentored junior engineers
Systems Built: Multi-provider payment platform • Enterprise APIs (REST/SOAP) • Event-driven architecture • Monitoring dashboards
Scale: 5 telecom providers • 10,000+ daily transactions • High-volume processing
Impact: 40% latency reduction • Secure authentication systems • Automated CI/CD pipelines • Containerized deployments
Systems Built: Clinic management system • University registrar system • Full-stack institutional platforms
Scale: Thousands of students • Patient records digitization
Impact: 60% improved administrative efficiency • Secure RESTful APIs • Transaction integrity
Grouped by Domain Expertise — Senior-level organization
High-scale infrastructure & backend systems
AI/MLOps & production-grade ML systems
DevOps, cloud & deployment orchestration
Architecting high-availability payment gateways and multi-tenant financial platforms that process millions of transactions with 99.8% uptime, enabling secure integrations for 35+ financial institutions.
Building production-grade MLOps pipelines for credit scoring, fraud detection, and risk analytics that enhance financial inclusion and operational efficiency while maintaining regulatory compliance.
Designing resilient microservices architectures with automated monitoring, CI/CD pipelines, and disaster recovery strategies that ensure mission-critical systems remain operational.
Implementing secure API gateways, OAuth2/JWT authentication, and encryption protocols that protect sensitive financial data while enabling seamless cross-platform connectivity.
Transforming raw data into actionable business intelligence through advanced EDA, statistical modeling, and interactive dashboards that drive data-driven decision-making.
Deploying containerized applications with Docker/Kubernetes, establishing cloud-native architectures, and optimizing system performance to handle exponential growth.
Challenge-led case studies showing architecture, execution, and measurable impact
Challenge: Manual analysis of 464K+ customer complaints was time-consuming, preventing proactive issue resolution for 500,000+ users.
Solution: Production RAG chatbot enabling autonomous question-answering over complaint data, reducing analysis time from days to minutes.
Role: Lead Data & AI Engineer—architected end-to-end RAG system. Technical decisions: sentence-transformers embeddings, ChromaDB vector store, optimized chunking (500/75), LLM integration.
Tech Stack: ChromaDB • LangChain • sentence-transformers • Gradio • 464K+ records processed • Semantic search (top-k=5) • Streaming responses with source citations
Impact: 10x faster analysis • Self-service analytics for non-technical teams • Proactive fraud detection • Full traceability
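The chunking strategy above (500/75) can be sketched as a sliding window with overlap. This is a simplified sketch assuming the 500/75 figures mean window size and overlap in characters or tokens; the production pipeline would chunk on token boundaries via the embedding model's tokenizer.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 75) -> list[str]:
    """Split text into overlapping windows (stride = size - overlap) so that
    context spanning a chunk boundary is still retrievable."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    stride = size - overlap
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

parts = chunk_text("x" * 1000)
print(len(parts), len(parts[0]), len(parts[-1]))  # -> 3 500 150
```

Each chunk repeats the last 75 units of its predecessor, which is what keeps answers grounded when the relevant passage straddles a boundary.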
Program: Andela AI Engineering Bootcamp — Enterprise RAG & Knowledge Synthesis
Challenge: Scattered documentation across policies, technical docs, and legal templates makes information retrieval time-consuming.
Solution: Enterprise RAG system transforming static docs into conversational AI assistant. Privacy-first architecture with on-premises storage.
Role: Lead AI Engineer—architected end-to-end RAG system. Technical decisions: ChromaDB vector store, embedding optimization, chunking strategy, OpenRouter LLM orchestration.
Tech Stack: ChromaDB • LangChain • OpenRouter • Gradio • Modular ingestion pipeline • Semantic search • Streaming responses • Metadata-aware retrieval
Impact: Minutes to seconds retrieval • Self-service knowledge access • Data sovereignty • Scales to thousands of documents
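Metadata-aware retrieval, as used above, means filtering on document metadata before ranking by embedding similarity, the same shape as a ChromaDB `where=` query. The sketch below is a pure-Python stand-in with made-up document IDs and 2-D embeddings, not the production index.

```python
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, docs, where, top_k=5):
    """Metadata-aware retrieval: filter by exact metadata match first,
    then rank the survivors by embedding similarity."""
    candidates = [
        d for d in docs
        if all(d["metadata"].get(k) == v for k, v in where.items())
    ]
    candidates.sort(key=lambda d: cosine(query_vec, d["embedding"]), reverse=True)
    return candidates[:top_k]

docs = [
    {"id": "pol-1", "metadata": {"type": "policy"}, "embedding": [1.0, 0.0]},
    {"id": "leg-1", "metadata": {"type": "legal"},  "embedding": [0.9, 0.1]},
    {"id": "pol-2", "metadata": {"type": "policy"}, "embedding": [0.0, 1.0]},
]
hits = retrieve([1.0, 0.0], docs, where={"type": "policy"}, top_k=2)
print([d["id"] for d in hits])  # -> ['pol-1', 'pol-2']
```

Note that `leg-1` is excluded despite being the second-most-similar vector: the metadata filter enforces scope (and data governance) before similarity ever gets a vote.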
Program: Andela AI Engineering Bootcamp — Real-Time AI & Distributed Systems
Challenge: Manual note-taking and delayed insights during live customer interactions reduce agent effectiveness.
Solution: Real-time AI system transcribing, analyzing, and summarizing live conversations with instant insights, sentiment analysis, and action item extraction.
Role: Lead AI Engineer—architected end-to-end real-time system. Technical decisions: Redis Streams architecture, Faster-Whisper ASR, multi-LLM orchestration (Ollama, OpenRouter, OpenAI, Hugging Face), WebSocket implementation.
Tech Stack: Redis Streams • Faster-Whisper • FastAPI • React • WebSockets • PostgreSQL • 200ms audio chunking • Real-time transcription • Speaker diarization • Multi-LLM support
Impact: Eliminated manual note-taking • Real-time decision-making • Scales to hundreds of concurrent calls • Extensible to healthcare, legal, education
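The 200 ms audio chunking mentioned in the stack can be sketched as fixed-size framing of a PCM stream. The 16 kHz sample rate is an assumption (it is Whisper's native rate); in the real system each frame would be published to a Redis Stream for a Faster-Whisper worker to consume.

```python
def frame_audio(samples: list[int], sample_rate: int = 16_000,
                chunk_ms: int = 200) -> list[list[int]]:
    """Split a PCM sample stream into fixed 200 ms frames for streaming ASR.
    At 16 kHz, each frame is 3200 samples."""
    frame_len = sample_rate * chunk_ms // 1000
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]

one_second = [0] * 16_000
frames = frame_audio(one_second)
print(len(frames), len(frames[0]))  # -> 5 3200
```

Fixed-size frames are what make back-pressure control tractable: the consumer group can measure lag in frames and shed or batch work deterministically.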
Challenge: Bati Bank needed BNPL service but lacked credit history data for online-first customers. Traditional models require credit bureau data unavailable for this segment.
Solution: Production FastAPI microservice automating credit risk assessment using alternative transaction data (RFM metrics). Hybrid system with K-Means clustering and XGBoost, compliant with Basel II requirements.
Role: Lead Analytics Engineer—owned end-to-end development from Basel II compliance research to production. Designed RFM-based proxy target methodology, selected champion model, established MLOps framework.
Tech Stack: FastAPI • XGBoost • MLflow • Docker • RFM metrics • K-Means clustering • WoE transformations • GitHub Actions CI/CD • Basel II compliant
Impact: 60% market reach expansion • Weeks to days deployment cycle • Basel II compliant • Full experiment lineage for regulatory transparency
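The RFM features behind the proxy-target methodology can be sketched directly from raw transactions. Field names (`customer`, `date`, `amount`) and the toy data are illustrative, not the bank's schema.

```python
from datetime import date

def rfm(transactions, today):
    """Compute Recency/Frequency/Monetary features per customer --
    the alternative-data signals used in place of credit bureau history."""
    acc = {}
    for t in transactions:
        f = acc.setdefault(t["customer"],
                           {"last": date.min, "freq": 0, "monetary": 0.0})
        f["last"] = max(f["last"], t["date"])   # most recent transaction
        f["freq"] += 1                          # transaction count
        f["monetary"] += t["amount"]            # total spend
    return {
        c: {"recency_days": (today - f["last"]).days,
            "frequency": f["freq"],
            "monetary": f["monetary"]}
        for c, f in acc.items()
    }

txns = [
    {"customer": "A", "date": date(2024, 1, 10), "amount": 40.0},
    {"customer": "A", "date": date(2024, 1, 20), "amount": 60.0},
    {"customer": "B", "date": date(2023, 12, 1), "amount": 15.0},
]
feats = rfm(txns, today=date(2024, 2, 1))
print(feats["A"])  # -> {'recency_days': 12, 'frequency': 2, 'monetary': 100.0}
```

In the full pipeline these features feed K-Means to cluster customers into risk segments, which in turn define the proxy target that XGBoost learns to predict.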
Challenge: Fraud losses were rising while existing rule-based systems generated too many false positives and missed sophisticated fraud patterns. The bank needed real-time detection (<100ms) with high accuracy and explainability.
Solution: Stacking Ensemble (XGBoost + LightGBM) achieving 95%+ recall with 40% false positive reduction. Integrated SHAP explainability, handled class imbalance with SMOTE, deployed containerized FastAPI microservice with sub-100ms latency.
Role: Lead Data Scientist—owned complete system architecture. Technical decisions: Stacking Ensemble model, feature engineering, MLOps framework. Ensured 95%+ recall, <100ms latency, 40% false positive reduction.
Tech Stack: XGBoost • LightGBM • SMOTE • SHAP • MLflow • FastAPI • Docker • GitHub Actions • Geolocation integration • Transaction velocity features
Impact: 95%+ fraud recall • 40% false positive reduction • 25% faster alert investigation • Sub-100ms latency • Regulatory compliance
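The stacking idea above reduces to: base model scores become the features of a meta-learner. The sketch below uses stub lambdas in place of the trained XGBoost/LightGBM models and a fixed logistic meta-model with made-up weights; the production meta-learner is trained on held-out base predictions.

```python
import math

def stacked_score(transaction, base_models, meta_weights, bias):
    """Stacking ensemble sketch: base model scores feed a logistic
    meta-learner that produces the final fraud probability."""
    z = bias + sum(w * m(transaction) for m, w in zip(base_models, meta_weights))
    return 1 / (1 + math.exp(-z))

# Stub base learners standing in for the trained XGBoost and LightGBM models.
xgb_like  = lambda t: 0.9 if t["velocity"] > 5 else 0.2
lgbm_like = lambda t: 0.8 if t["amount"] > 1000 else 0.1

score = stacked_score({"velocity": 7, "amount": 2500},
                      [xgb_like, lgbm_like], meta_weights=[3.0, 3.0], bias=-3.0)
print(round(score, 3))  # high fraud probability for a high-velocity, large txn
```

Because the meta-layer is a single dot product plus a sigmoid, it adds microseconds to inference, which is how an ensemble can still clear a sub-100ms latency budget.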
Backend and platform projects focused on scale, reliability, and secure integrations
Challenge: Enterprise clients need scalable, secure bulk SMS solutions with multi-tenant architecture and reliable telecom provider integrations.
Solution: High-volume multi-tenant SaaS platform using Spring Boot 3. Hybrid multi-tenancy (row/schema/database-level isolation), API gateway with JWT authentication, event-driven billing, intelligent routing with failover logic achieving 99%+ reliability.
Role: Lead Architect & Developer—designed complete platform. Architectural decisions: multi-tenancy model, API gateway design, billing engine, telecom provider integration. Achieved 99%+ service reliability.
Tech Stack: Java 17 • Spring Boot 3 • Multi-tenancy • API Gateway • JWT • Event-driven architecture • Telecom integrations • Intelligent routing • Failover logic
Impact: Secure multi-tenant architecture • 99%+ service reliability • Enterprise-grade data security • Automated billing and rate-limiting
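The intelligent routing with failover can be sketched as an ordered-provider loop. Provider names, the error type, and the `send` callables are all illustrative; real code would narrow the exception handling and add per-provider retries and health scoring.

```python
def send_with_failover(message, providers):
    """Try providers in priority order; fall back to the next on failure.
    Raises only when every provider has failed."""
    errors = {}
    for name, send in providers:
        try:
            return name, send(message)
        except Exception as exc:  # illustrative: production code narrows this
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")

def flaky(msg):
    raise TimeoutError("gateway timeout")

def healthy(msg):
    return "queued"

route, status = send_with_failover("hello",
                                   [("provider_a", flaky), ("provider_b", healthy)])
print(route, status)  # -> provider_b queued
```

Collecting every provider's error before raising matters operationally: the final exception tells on-call engineers whether one route degraded or the whole corridor is down.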
Challenge: The business needed a production-grade RESTful API with secure authentication, multi-currency support, and automated deployment.
Solution: FastAPI and PostgreSQL service with OAuth2 password flow, Argon2 hashing, and JWT tokens; CI/CD via Jenkins, Docker, and Kubernetes. Achieved 90% test coverage.
Role: Full-stack Developer—built production-ready API with comprehensive security and automated deployment pipeline.
Tech Stack: FastAPI • PostgreSQL • OAuth2 • Argon2 • JWT • Jenkins • Docker • Kubernetes • Multi-currency support
Impact: Production-ready API • 90% test coverage • Automated CI/CD • Comprehensive security
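The JWT leg of the auth flow above can be shown with the standard library alone. This is a minimal HS256 sketch for illustration only: production code would use a vetted library (e.g. a JOSE implementation) and enforce `exp`/`aud` claims, and the secret here is a placeholder.

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = b64url(hmac.new(secret, f"{header}.{body}".encode(),
                          hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str, secret: bytes) -> dict:
    """Recompute the signature and compare in constant time before trusting claims."""
    header, body, sig = token.split(".")
    expected = b64url(hmac.new(secret, f"{header}.{body}".encode(),
                               hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    padded = body + "=" * (-len(body) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

token = sign_jwt({"sub": "user-1"}, b"secret")
print(verify_jwt(token, b"secret"))  # -> {'sub': 'user-1'}
```

The constant-time comparison (`hmac.compare_digest`) is the non-obvious detail: a naive `==` leaks timing information an attacker can use to forge signatures byte by byte.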
A compact list of additional implementations across AI/ML and software engineering.
System thinking, architectural patterns, and technical design decisions
flowchart LR
A[Data Ingestion] --> B[Preprocessing & Chunking]
B --> C[Embeddings & Feature Store]
C --> D[Inference / RAG Retrieval]
D --> E[Evaluation Loop]
E --> F[Monitoring & Feedback]
Pattern: Production-first AI lifecycle with explicit retrieval, evaluation, and monitoring stages to improve reliability and reduce drift.
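The lifecycle in the flowchart can be expressed as composed stages. The stage functions below are trivial stand-ins (the names and transforms are illustrative); the point is the shape: each stage is a pure function, so evaluation and monitoring hooks can wrap any step without touching the others.

```python
from functools import reduce

def pipeline(*stages):
    """Compose lifecycle stages left-to-right into one callable."""
    return lambda x: reduce(lambda acc, stage: stage(acc), stages, x)

run = pipeline(
    lambda doc: doc.strip(),                 # ingestion/preprocessing
    lambda doc: doc.lower(),                 # normalization before embedding
    lambda doc: {"text": doc, "chunks": 1},  # stand-in for chunk/embed stage
)
print(run("  Quarterly Report  "))  # -> {'text': 'quarterly report', 'chunks': 1}
```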
Design Decisions: Privacy-first architecture (documents remain local) • Modular ingestion pipeline • Metadata-aware retrieval • Multi-model LLM orchestration
Design Decisions: Distributed architecture for horizontal scaling • WebSocket for real-time communication • Multi-LLM orchestration • Event-driven processing
Design Decisions: Service discovery (Eureka) • Event-driven communication (Kafka) • Independent scaling • Multi-tenant isolation
Quick access to core repositories and production implementations
Architecture thinking, implementation lessons, and production AI engineering practices
How to build grounded RAG systems using metadata-aware retrieval, reranking, and source-backed responses while reducing hallucinations in executive-facing workflows.
A practical pattern for live transcription and analysis: stream processing, back-pressure control, WebSocket delivery, and multi-LLM orchestration at low latency.
A production blueprint for credit scoring and fraud detection: experiment tracking, model registry, explainability, compliance, and automated deployment.
I'm always open to discussing new opportunities, challenging projects, or strategic collaborations. Whether you're seeking a Senior AI Engineer & MLOps Specialist or exploring enterprise RAG, agentic orchestration, and production AI systems, feel free to reach out.
Open to collaboration, mentorship, and impact-driven opportunities.