Abdelkareem Elkhateb

I build Arabic AI that actually works.

500M+ Arabic speakers deserve better than translated English models.

I create efficient NLP systems that run anywhere — from cloud servers to cheap phones.


What I’ve Built

Zarra & Bojji

Arabic AI models that are 10x smaller than competitors but just as accurate. They run on phones, not just expensive servers.

Read more →

كم كالوري (“How Many Calories”)

Ask “how many calories in koshari?” in Arabic and get real answers. Semantic search for nutrition.

Try it →

GPUVec

Developer toolkit for ML infrastructure decisions. Compare GPU pricing across 50+ cloud providers, benchmark LLMs and embedding models, and calculate Qdrant vector database specs.

Check it out →

Arabic GLiNER

Extract names, places, and organizations from Arabic text automatically.

GitHub →


Where I Work

Xbites 2025 - now

Building the AI backend for Darin, a smart real estate platform. I designed its semantic search (raising relevance from 30% to 92%), built an AI agent factory that generates custom agents in under 2 minutes, and automated data pipelines covering 150+ developers and 500 projects.

Result: Reduced broker workflow time by 83%. Enabled deals with major real estate companies.

Hamza Salem Lab 2024 - now

Research group focused on advancing Arabic AI. I build embedding models and contribute to academic publications. Our mission is to make cutting-edge Arabic AI accessible beyond big tech.

Result: Multiple papers published, 54 citations, models used internationally.

NAMMA Nov 2024 - now

Part-time NLP Engineer and open-source maintainer. Contributing to state-of-the-art Arabic LLMs, embeddings, OCR, and ASR systems.

Result: Multiple SOTA Arabic models released openly, used by thousands of developers across the Arab world.

Freelance 2021 - now

Search engines, APIs, ML pipelines, and web applications for clients worldwide. Top-rated freelancer on Upwork with 20+ completed projects.

Result: Repeat clients from 5+ countries.


What I’m Working On Now

  • Better Arabic embeddings — Current models struggle with dialects. Fixing that.
  • Arabic OCR — Reading old Arabic manuscripts and printed books automatically.
  • Multimodal Arabic AI — Models that understand both Arabic text and images together.
  • CPU-efficient Transformers — Master’s thesis research on making models faster without GPUs.

7 Papers

54 Citations

962 Papers Read


Work With Me

Need Arabic AI, production ML systems, or efficient model architectures?

