Abdelkareem Elkhateb

What I’ve Built
BertHash-Femto
Arabic embedding model that’s 113x smaller than AraBERTv2 (1.2M vs 135M parameters) while achieving 94% of its performance. Runs on edge devices.
Why it matters: Production-ready Arabic AI without expensive GPU infrastructure.
Zarra & Bojji
Arabic AI models that are 10x smaller than competitors but just as accurate. They run on phones, not just expensive servers.
كم كالوري
Ask “how many calories in koshari?” in Arabic and get real answers. Semantic search for nutrition.
GPUVec
Developer toolkit for ML infrastructure decisions. Compare GPU pricing across 50+ cloud providers, benchmark LLMs and embedding models, and calculate Qdrant vector database specs.
Where I Work
Xbites 2025 - now
Building the AI backend for Darin, a smart real estate platform. I designed its semantic search (raising relevance from 30% to 92%), built an AI agent factory that generates custom agents in under 2 minutes, and automated data pipelines covering 150+ developers and 500 projects.
Result: Reduced broker workflow time by 83%. Enabled deals with major real estate companies.
Hamza Salem Lab 2024 - now
Research group focused on advancing Arabic AI. I build embedding models and contribute to academic publications. Our mission is to make cutting-edge Arabic AI accessible beyond big tech.
Result: Multiple papers published, 54 citations, models used internationally.
NAMMA Nov 2024 - now
Part-time NLP Engineer and open-source maintainer. Contributing to state-of-the-art Arabic LLMs, embeddings, OCR, and ASR systems.
Result: Multiple SOTA Arabic models released openly, used by thousands of developers across the Arab world.
Freelance 2021 - now
Search engines, APIs, ML pipelines, and web applications for clients worldwide. Top-rated freelancer on Upwork with 20+ completed projects.
Result: Repeat clients from 5+ countries.
What I’m Working On Now
- Better Arabic embeddings — Current models struggle with dialects. Fixing that.
- Arabic OCR — Reading old Arabic manuscripts and printed books automatically.
- Multimodal Arabic AI — Models that understand both Arabic text and images together.
- CPU-efficient Transformers — Master’s thesis research on making models faster without GPUs.
7 Papers
54 Citations
962 Papers Read
Work With Me
Need Arabic AI, production ML systems, or efficient model architectures?