Abdelkareem Elkhateb
Arabic NLP Researcher & AI Engineer
I build efficient AI models, semantic search systems, and production ML pipelines for Arabic and multilingual applications.
اللغة ليست عِلمًا .. بل هي شيء فوق العلم
لغتنا في خطر داهم .. ونحن أيضًا
“Language is not a science; it is something above science. Our language is in danger, and so are we.”
Who am I
I’m an NLP Engineer at Xbites, building the AI backend for Darin.
I also research with Hamza Salem Lab on Arabic embeddings and contribute to NAMMA for open-source Arabic AI.
My thesis focuses on making transformers run efficiently on CPUs. I write about Arabic NLP, embedding models, and building production AI systems.
Projects
BertHash-Femto
113× smaller than AraBERTv2, 94% accuracy. Runs on edge devices. → GitHub
Zarra & Bojji
Tiny Arabic models for phones. → Read more
GPUVec
GPU pricing & ML benchmarks. → Check it
SEO Rat
Free, open-source SEO tool for static sites. Uses real Google Search Console data, runs locally. → GitHub
Recent Writing
My Journey Building a PC for AI Research in Egypt
A personal journey of building a high-performance PC for AI research in Egypt, covering hardware choices, cost comparisons, and local challenges.
LiteRT & Qualcomm AI Hub: On-Device ML Without the Cloud
A practical guide on converting PyTorch models to LiteRT, benchmarking on real Android devices, and using Qualcomm AI Hub for NPU compilation.
Tarteel & Muaalem El Quran: Optimizing Quranic AI
Optimizing the Muaalem El Quran model using TensorRT and ONNX — a deep dive into Quranic speech recognition.
HyperRun + ColGrep: A Self-Hosted Alternative to RunLLM
Add an AI-powered “Ask AI” chat widget to any docs site using ColGrep semantic code search and FastHTML. Self-hosted, BYOK, multi-provider.
View all posts → · Embedding series →
فقالَ: إن تَصدُقِ اللَّهَ يَصدُقْكَ
“He said: If you are truthful with God, He will be truthful with you.”
Qabilah