kareem's Blog

Connecting with ideas in ML

kareem — Thu, 05 Jun 2025 21:00:00 GMT

Here is the corrected version with grammar errors fixed and improved clarity, while preserving the original meaning and tone:

Get paid for your knowledge!
I’m not interested in all this talk about being replaced by AI. If AI can replace an engineer’s mind, it will replace everything in the world.
When this happens, it won’t really matter!
I got my first job one year after graduation with a good salary for my experience and the local currency.
الحمد والشكر والفضل لله وحده.
Actually, I got two jobs, not one!
I landed a full-time AI Engineer position at a startup that is already operational with its own customers, plus a part-time role with flexible meetings. I’m the wildcard—maybe even the AI—in this startup, which is set to receive funding in the coming weeks.
This might give the impression that I’m a great engineer, but I’m not!
I’m not good at programming; I’m below average and can barely do cool stuff on my own. I only know some solutions to try, and I’m familiar with multiple resources and places to get help. I love asking more experienced people for feedback.
I can only say that I love this field a lot! It’s very interesting. With just a laptop and my mind, I can create things the world needs and will pay for!
This may sound silly, but it’s an amazing idea to think about, especially for someone like me who doesn’t enjoy the outside world or dealing with people in real life.
What I like is that the world is connected, and you can gain recognition quickly if you’re doing real work and building connections with people in your field.
I’m interested in Machine Learning and focus on niche areas where not many people are working. In my language, there’s little competition, which gives me a unique edge! But in reality, I’m below average; the basics I explore in these areas are enough for now.
I don’t advise anyone to do this, but I want to say that when you try your best, reflect on your goals, start crafting your ideas, and engage with others’ ideas, your influence and impact will grow significantly.
I’ve met many ordinary people who simply read documentation, ask questions about what they don’t know, and are beginners, yet others look at them and think, “Wow, they must be geniuses!”
But they’re not! You may have more knowledge and experience than them, but business, your birthplace, and the college you graduated from are strong factors in determining your path.
You can reach great places without these, but they accelerate your progress!
We all know the story of the child who learned programming at 11 years old and now, at 25, codes as easily as walking.
What I’ve found is that you can still build great connections with someone who has 10 years more experience than you, and they might even say, “You’re a smart person! I want you to work with me.”
We’re all limited, and there will always be gaps that require other minds to fill. The great thing is that we grow faster when we connect with such minds!
I really admire the work from the AnswerDotAI team. I find that Jeremy Howard has a profound impact on how I think about AI and learning.
The courses I’ve taken with him, the community, and the tools make me feel ahead of the curve and capable of creating things!
When you start following people like Omar Khatab, Benjamin, Antoine, Tom Arson, and many others, and interact with their work, you begin to gain weight in these spaces. I’m not just talking about getting a job—I’m talking about the level of ideas.
There are people watching, people building tools around concepts, and people creating new concepts!
I wish I could reach the cutting edge of knowledge. This may not be precise, but I see it as a journey I want to pursue.
There’s more to say, but this world has immense potential for smart people—not in terms of marketing or being a shallow influencer.
I want to say that I have many ideas I want to create and share with other minds. I believe I’ll be able to make a positive impact in the areas I’m passionate about and improve Islamic tech and Arabic NLP.

Pylate Day 1

kareem — Tue, 27 May 2025 21:00:00 GMT

PyLate & ColBERT Evaluation - Complete Learning Guide

Core Concepts

What is PyLate?

Q: What is PyLate and what does it do? A: PyLate is a Python library for vector retrieval and search, specifically designed for ColBERT models. It provides tools for indexing documents, encoding queries, and performing similarity search.

What is ColBERT?

Q: How does ColBERT work? A: ColBERT (Contextualized Late Interaction over BERT) creates dense vector representations for documents and queries, then uses late interaction (token-level matching) for retrieval instead of single vector similarity.

Dataset Formats

Format 1: Triplet/Multi-negative Format

Q: What does triplet format look like? A: Each row contains: - query: The search query - positive: One relevant document
- negative1, negative2, …: Multiple irrelevant documents

Pros: Good for training with hard negatives Cons: No separate corpus, harder to evaluate

Format 2: Structured Retrieval Format

Q: What does structured format contain? A: Three separate components: - corpus: All documents with IDs - queries: All queries with IDs - qrels: Relevance judgments (query-doc pairs)

Pros: Standard IR evaluation format, works with PyLate directly Cons: More complex structure

Format 3: Passage Ranking Format

Q: How does passage ranking format work? A: Each row has: - query_id, query: Query information - positive_passages: List of relevant documents - negative_passages: List of irrelevant documents

Pros: Multiple positives/negatives per query, rich annotations Cons: Requires extraction to create corpus

Data Conversion Strategies

Memory-Efficient File Writing

Q: How do you handle large datasets efficiently? A: Stream processing with direct file writing:

with open('corpus.jsonl', 'w') as f:
    for el in dataset['corpus']:
        if el['corpus-id'] and el['text']:
            json.dump({"_id": el['corpus-id'], "text": el['text']}, f)
            f.write('\n')

Pros: Low memory usage, handles any dataset size Cons: Requires file I/O, slightly slower

PyLate File Requirements

Q: What files does BEIR datasets is ? A: - corpus.jsonl: {"_id": "doc1", "text": "document text"} - queries.jsonl: {"_id": "q1", "text": "query text"} - qrels/split.tsv: query_id\tdoc_id\tscore

Critical: File names and folder structure must match exactly

Direct Dictionary Conversion

Q: How do you convert without files? A: Transform to PyLate’s expected return format:

documents = [{"id": doc_id, "text": text} for doc_id, text in corpus.items()]
queries = list(queries.values())
# qrels stays as dictionary

Pros: Faster, no file operations Cons: Must match exact format, harder to debug

Evaluation Approaches

Custom Evaluation with ranx

Q: When do you use ranx for evaluation? A: When you have non-standard formats or want custom metrics:

qrels = Qrels(qrels_dict)
run = Run(run_dict)
metrics = evaluate(qrels, run, ["ndcg@5", "map@5"])

Pros: Flexible, works with any format Cons: Manual setup required

PyLate Built-in Evaluation

Q: When do you use PyLate’s evaluation? A: When data is in standard BEIR format with proper file structure

Pros: Standardized, less code Cons: Strict format requirements

Indexing Strategies

PLAID Index

Q: What is PLAID indexing? A: PyLate’s efficient indexing method for ColBERT embeddings, supporting fast similarity search

Key Parameters: - index_folder: Where to store index - index_name: Identifier for the index
- override=True: Overwrites existing index

Document Processing

Q: How do you prepare documents for indexing? A: 1. Extract unique documents from all sources 2. Create document IDs and embeddings 3. Add to index with add_documents()

Important: Use is_query=False for documents, is_query=True for queries

Common Pitfalls & Solutions

File Format Issues

Q: What are common file format mistakes? A: - Wrong file extensions (.csv instead of .tsv) - Incorrect folder structure (missing qrels folder) - Wrong field names (id vs _id)

Data Extraction Problems

Q: How do you handle variable-length lists? A: Use nested loops for negative passages:

for row in dataset:
    for neg_doc in row['negative_passages']:
        # process each negative document

Memory Management

Q: How do you avoid memory issues? A: - Process datasets in chunks - Use generators instead of lists - Write to files incrementally - Use dictionaries to avoid duplicates

Performance Considerations

Batch Processing

Q: How do you optimize encoding speed? A: Use appropriate batch sizes: - Larger batches: Faster but more memory - Smaller batches: Slower but memory-safe - Typical: batch_size=32 or batch_size=64

Index Management

Q: How do you manage multiple indexes? A: Use descriptive names and separate folders: - index_folder="arabic_index" - index_name="gte-multilingual-base"

Evaluation Metrics

Standard IR Metrics

Q: What metrics should you track? A: - NDCG@k: Normalized discounted cumulative gain - MAP@k: Mean average precision
- Recall@k: Proportion of relevant docs retrieved - Precision@k: Proportion of retrieved docs that are relevant

Interpreting Results

Q: What makes good retrieval performance? A: - NDCG@5 > 0.7: Excellent - NDCG@5 > 0.5: Good
- NDCG@5 > 0.3: Acceptable - NDCG@5 < 0.3: Needs improvement

This comprehensive guide covers all the key concepts, trade-offs, and practical considerations for working with PyLate and ColBERT evaluation pipelines.

RAG RAPTOR, DFloat11 and pyrefly

kareem — Sat, 24 May 2025 21:00:00 GMT

the last 2 days i was doing some writing some more well crafted articles they will publish later and they were 2 amazing and hard days ..let’s go into the details. 1. RAPTOR RAG 2. DFloat 1 3. pyrefly 4. fastembed

What is RAPTOR RAG?

raptor rag

It’s a technique to imrpove they way you RAG above your documents and they say it will improve the performance by 20% ..but it don’t!

What is RAPTOR ? “It’s a Tree constructed from your chunks by recursively cluster chunks of text based on their vector embeddings and generates text summaries of those cluster, constructing a tree from the bottom up.”

They try to solve the problem of retrieve only a few short, contiguous text chunks, which limits their ability to represent and levevarge large-scale discourse structure. This is particularly relevant for thematic questions that require integrating knowledge from multiple parts of a text, such as understanding an entire book, as in the NarrativeQA dataset (Kočiskỳ et al., 2018).

Consider the fairy tale of Cinderella, and the question “How did Cinderella reach her happy ending?”. The top-k retrieved short contiguous texts will not contain enough context to answer the question.

They solve this by making the indexing and retrieval uses a tree structure to capture both high-level and low-level details about a text. the steps are the following: 1. cluster chunks of text 2. generates summary for those clusters and then repeats 3. generating a tree from the bottom up.

the model based summarization

After clustering the nodes using Gaussian Mixture Models, the nodes in each cluster are sent to a language model for summarization. This step allows the model to transform large chunks of text into concise, coherent summaries of the selected nodes. For our experiments, we use gpt-3.5-turbo to generate the summaries. The summarization step condenses the potentially large volume of retrieved information into a manageable size.

Querying RAPTOR

1. Tree traversal 2. collapsed tree retreival

to undersand how they work the picture above is very clear.

what i want to say is the results are not fair and the claim of 20% is baisd. if you look at accuracy between raptor and the sbert solution you will find that it’s not much better while still very expensive to create and retrieve this if for question answering and flow of the information is very reptitive.

what is cool about RAPTOR technique is using it for summary.

having details, medium and high information for document will help you create better summary because you divide the task of generting a summary to a single LLM in one show which will not be much accurate. you know divided the task some of the summary already done and create in the vector store so the task is more easier for the final llm model.

DFloat11: Efficient, Lossless Compression for LLMs

DFloat11 (DF11) is a novel, mathematically lossless compression technique that reduces large language model (LLM) memory usage by about 30%—with zero loss in accuracy. Unlike traditional quantization, which can degrade model quality, DF11 uses Huffman coding to compress only the predictable exponent bits of model weights, preserving all original information.

How DF11 Works

DFloat11 encoding

Sign and Fraction Bits: Kept unchanged, as they contain high-entropy information.
Exponent Bits: Compressed using a precomputed Huffman tree, replacing the fixed 8-bit exponent with a variable-length code. This saves about 5 bits per weight on average.

Storage and Decoding

Storage: The sign/fraction and exponent bits are stored separately, along with a small header for the Huffman codebook.
Decoding: At runtime, the original weights are quickly reconstructed by combining the sign/fraction block with the decoded exponent, enabling fast, parallel processing on GPUs.

Key Benefits

30% reduction in model size compared to bf16
100% identical accuracy to the original model
Applicable to any transformer-based LLM ## Pyrefly with vscode Pyrefly it’s a pyton type check written in Rust. they I installed into vscode and tried it with llamaindex and typercodebase it was very nice expirence.

references

Brave Leo AI with Ollama, vllm, and any huggingface llm locally

kareem — Thu, 22 May 2025 21:00:00 GMT

What’s Brave Leo AI All About?

Brave Leo AI is a super handy, privacy-first AI assistant built right into the Brave browser.

It’s there to help you out with all sorts of tasks, and it works on your computer (macOS, Windows, Linux) or phone (Android and iOS).

The best part? You don’t need to sign up or log in to use it for free, and it’s designed to keep your data private.

Brave Leo with Ollama or VLLM

What Can Brave Leo AI Do?

Leo’s got a lot of tricks up its sleeve: - Summarize Stuff Instantly: It can give you quick summaries of webpages, PDFs, Google Docs, Google Sheets, or even YouTube videos by reading their transcripts.

Answer Questions: Whether it’s about a webpage or just something you’re curious about, Leo can explain things clearly and even offer different perspectives.
Write and Create: Need an article, email, essay, or some code? Leo can whip it up for you.
Translate and Code: It can translate text into different languages or help with coding by suggesting or generating code snippets.
Custom AI Models: With the “Bring Your Own Model” feature, you can plug in your own local or remote AI models for a personalized experience.

Is Brave Leo Safe to Use?

Privacy is Leo’s middle name! Here’s why it’s safe:

Anonymized Requests: Leo uses a reverse proxy, so Brave can’t tie your requests to your IP address.
No Chat Storage: Your conversations aren’t saved on Brave’s servers or used to train AI models.
No Sign-Up Needed: You can use it for free without an account. Even the premium version uses anonymous tokens to keep things private.
Local Storage: Your chat history stays on your device, and you can clear it anytime through the browser settings.
Heads-Up on Third-Party Models: If you use external AI models (like Anthropic’s Claude), their data policies might differ (Claude keeps chats for 30 days, for
example). Always check the privacy terms if you go that route.

What About Your Chat History with Leo?

If you’re using Brave version 1.75 or higher on desktop or Android (not in Incognito or Tor mode), you can keep track of your chats with Leo.

They’re stored locally on your device, not on any server, so you’re in control. You can revisit, continue, or delete them from the Leo full-page view (brave://leo-ai) or the browser’s sidebar.

Just note that clearing your browsing history will also wipe out any webpage-related content in your chats. Easy peasy!

Bring Your Own Model (BYOM) with Brave Leo

With BYOM , you can connect your own AI models to Leo for a custom experience. You can use platforms like vLLM, SGLang, or any inefernce engines with any Hugging Face Transformers model, as long as it follows the OpenAI Chat Protocol.

For example, you can run a model like Qwen2.5-VL-3B-Instruct locally with this command:


python -m sglang.launch_server --port 7501 --model-path Qwen/Qwen2.5-VL-3B-Instruct

This sets up a server for SGLang (or you can use vLLM with a similar command).

Then, in Brave Leo’s BYOM settings, add your model with these details:

Label: Qwen2.5-VL-3B-Instruct

Model Request Name: Qwen2.5

Server Endpoint: http://127.0.0.1:7501/v1/chat/completions

Context Size: 4000

API Key: local

System Prompt: A custom prompt like, “You are Leo, a helpful AI assistant by Brave. Provide clear, concise, polite responses under 80 words. Use a neutral tone, clarify if needed, and ensure accuracy.”

Brave doesn’t proxy these requests, so check the privacy terms of your chosen provider. Once set up, your model integrates with Leo, letting you use it directly in the browser for tailored, private AI chats. It’s like giving Leo your own custom brain!

البوصلة الحقيقة

kareem — Tue, 20 May 2025 21:00:00 GMT

السعي الي الكمال

في الوقت الحالي .. أنا اقوم بالكثير من المشاريع ومنها الواعد الذي سيكون سبب في إدخال دخل مادي كبير و علمي إن شاء الله.

حاليا أعمل علي رسالة الماستر + عضو فعال في مجتمعين لتطوير البرمجيات والذكاء الإصطناعي في العالم العربي والإسلامي ولله الحمد والفضل وحده + أعمل في معمل باحثي تابع لاحد الأشخاص المميزة جدا و لدي أعمال باحثيه ومشاريع خاصه من نواحي متعددة.

أستشعر حاليا كثرة المهام والتشتت الكبير و قلة الوقت والمجهود والكثير ..لكن هذا ليس المشكلة. ما أستشعره هو زيادة الإعجاب بالنفس والبعد عن العبادة من أجل لذة العلم و حظ النفس.

و من داخلي لا أقنع بكل ما أقدم ..واري انه هناك المزيد من العلم والمعرفة يجب تحصليها و عدم الرضا بكوني مجرد مستقبل ومعدل لما أستقبله. مجال دراستي هو الذكاء والإصطناعي بتخصصاته و علوم الحاسب.

ما يزعجني الكثير ..منها لماذا أقوم باستقبال العلم من الخارج ولست أتحدث عن مجرد الشروحات الأجنبية و الأوراق والمشاريع البرمجية وكل هذا ولكن أتحدث عن الريادة في العلم ذاته. أتحدث عن أمثال جيفري هينتون وما بعده ..الرغبة في الوصول الي حافة العلم والإبداع وليس مجرد التعديل وفهم ما يصدرون.

هل أكتفي بهذا !

في الحقيقة لا اتذكر عندما تم إطلاق deepseek وقرات عن الشركة التي تقف ورائها..لماذا لم تكرر التجربة عربية إسلامية خالصة! ليس مستحيل تقنيا ونحن أولي بلغتنا وتراثنا من غيرنا من الشعوب خصوصا ..أن تكون هذه الشعوب لديها كراهية متشعبه خاصة لدينك وشعبك ولنا في فلسطين عبرة فيما يحدث فيها من تنكيل و جرائم لا احب حتي التفكر فيها. لكن ما يزيد الحزن والضعف والهم والغمة ..أن تجد من الريادة في مجالك و تخصصك أغلبهم من هولاء الصنف الذي يقتل إخوتك بدم بارد!

كيف تسعد نفسك وانت تابع لهولاء تنتظر منهم أن يعطوك بعض الفتتات من العلم لتذهب لتنتفخ به نفسك وتظن أن لديك شي من علم وهو فتات.

بداية من الباحثين لأصحاب الشركات للمواقع التي أقوم بالاستضافة عليها إلي الكورسات و مجتمعات التقنية..بل تزيد الحسرة عندي بحثي في بيانات لتطوير نموذج صغير لمهمة من مهام اللغة العربية أجد البيانات أين !ومن يعمل عليها !

الشعور بالعجز

يقومون بتجميع و تطوير نماذج تخدم فهم اللهجة الشامية تساعدهم في كل أعمالهم..وليس هذا وقت الحديث..

العوائق كثير مادية و معرفية وكثير لكن سقف الممكن مذهل..و هناك أمل و إن لم يكن يكفي أن تموت وانت تحاول وتنكر بقلبك وهذا أضعف الإيمان.

البداية

لدي من العمر الآن 22 عام لدي الكثير من الندم والحسرة علي العمر الفائت في غير الطاعة والعمل الصالح ووددت لو عندي فرصة في الرجوع للخلف كثيرا والبداية مجددا..لكننا لسنا في فيلم أنمي بل واقع…عزائي إنك إن تستدرك، تستدرك عند رب كريم قادر علي كل شئ. يبدل لك سيئاتك.

أعاني من الكثير من عسر الفهم في كل شئ تقريبا..خصوصا اللغة العربية لدي عسر قراءة و كتابة و الي الصف الثالث الثانوي لدي مشاكل فادحه في الكتابة..الإ أنني كنت احب الرياضيات برغم فشلي فيها ..واحب الإنجيليزي والحال أحسن من العربية بكثير ..احبهم ولكنهم لا يحبوني والحمد لله.

بدات أحاول جاهدا في الصف الثاني الثانوي محاولات تعتبر جاده الي الثالث الثانوي وكنت فاشله وتجاوزت الامتحانات بالغش ودخلت الجامعة ولا استحق هذا المكان ولا ذلك المجموع.

في كل تلك السنوات الفائته كنت تعيش في عالم المقارانات وتنتظر من الأهل و الأصحاب الإعتراف والتقدير ..برغم كل هذا فكنت أتلقي الكثير من الحب والإيمان بي من كل من درس لي تقريبا. فجزاهم الله خيرا.

من داخلي لأ افكر في هذه الأفكار كثيرا..لكني في بداية الجامعة تبدل الحال و أصبح كل وقتي تعلم ودراسة الي أن تخرجت العام الفائت و مررت بكثير من الصعاب وتجاوزتها و الحمد لله و الي الأن أقترب من اللحظة التي سيتغير فيها كل شئ ويتبدل كل الحال..من فرص السفر والمنح و نجاح الأعمال والحرية المادة و الكثير ثم قبلها يوم الحصاد أو يومها مثل أخر مره..أفقد كل هذا و أبدا من جديد..تشعر بالحزن كثيرا..لكن بعد فترة تجد رحمات ربك و تقديره في كل هذا و هناك أوقات لا تعرف ماهي الحكمة لكن تبدا في رؤية تقصيرك و تعيد حسابات وتتوجه الي الله مجددا.

التحرر من الدنس

بعد هذه السنين تنظر لنفسك و تبدا في الغوص في أعماك الذات لماذا أقوم بهذا وماهي الرغبة والهدف !

تجد أنك تم حبسك باستخدام معتقدات و أفكار سامة و فاسدة طيلة حياتك و القليل القلليل من الأفكار الصالحة من التربية و المجتمع…تأتي هذه الأفكار بعد أن تفككت الحياة وتبدلت وهناك الكثير من الموتي في حياتك وأنت مازلت

تعيش في سعيك للاعتراف بك…تعيش أسير للماضي فتظلم بذلك الحاضر والمستقبل.

الأنانية ولذة العلم

كان هناك أوقات يجب أن اختار المال أم العلم!وكنت بكل بساطة العلم ..فالمال سيأتي مستقبلا بعد العلم ..وهذا ليس صحيحا 100% ولكن الواقع مختلف..

لماذا تريد العلم!! أقوم بسرد المائات من الشعارات الجذابة عن العلم والتعلم والهمة ورفع الحرج عن الأمة والثغر وكلام يجعلك تقول ياه بارك الله فيك…

معظمه كلام ..لحظات الحقيقة هي المحن ..اللحظة التي سوف تتخلي عن عمل معين وسعي و لن يذكر اسمك ولكنه سيخرج باسم كلي وليس اسم يخصك فتشعر من داخلك ماذا لو كنت انا من قام به وحدي وتم ذكري اسمي وحدي علي منصاتي سوف أصبح معروفا في مجالي و احصل علي فرصة عمل جيدة….

لن أخبر هذا كيفية عمل النموذج هذا او أن هناك طريقة أسهل الي ان اقوم انا بها حتي يكون لي السبق!

والكثير من لحظات الانانية وحب الذات ولذة العلم تأتي علي حساب الغاية الاولي و مرضاة الله ، و حسن العلم والإخلاص في العمل والكثير من الثمرات التي أضيعها فأصبح عبدا للعبادة (طلب العلم ) ولست عبدا لله.

هل هذا هو الحق؟

الحمد لله الذي أراني بعض من أدناس القلب وفساد النية، هل هذا هو الحق! بالتاكيد لا.

ماهو الحق إذا…هناك الكثير من الإجابات لكن ما يحضر في قلبي حاليا بعد الحصول علي كثير من ما أحب من علم ومال و علاقات و غيره..أجد الإجابة القرانية هي الإجابة الوحيده الشافيه لتعبر عن ما أمر به من أزمة هوية حقيقة.

{ قُلْ هَلْ نُنَبِّئُكُمْ بِالأَخْسَرِينَ أَعْمَالا * الَّذِينَ ضَلَّ سَعْيُهُمْ فِي الْحَيَاةِ الدُّنْيَا وَهُمْ يَحْسَبُونَ أَنَّهُمْ يُحْسِنُونَ صُنْعًا } سورة الكهف.
سورة الحديد
أرْوَى. · سورة الحديد - البراء بصفر

وليس عند ما يقال بعد سورة الحديد ومافيها من خطاب يحرك الصخر و يشفي الصدر و يعلي اللهمة ويعطي للعبد النظارة الحقيقة لرؤية الحق.

طلب التوفيق و الصلاح

في الحقيقة كل ما في الأعلي هو تمهيد أن كل أمالي و طموحاتي هي رغبة في المنافسة و الحصول علي التقدير والرفعة في الدينا…وهذا خسران مبين.

أرغب فعلا أن تكون تتحول هذه الرغبة الي رغبة فيما عند الله و محبتة و رضاه فهذا هو التوفيق و أن ترتفع درجاتي في الدنيا بأكبر قدر قبل فوات الأوان عندما تنقلب كل أعمالي الي حسرة و قبل أن يضل سعي و أنا أحسب أني احسن العمل.

بعد كل المصاعب التي تقابلني من حاجة مادة وحاجة للفهم والوقت والتوفيق والعلم و غير هذا..أجد أن الامر كله بيد الله و قدرته.

فلا حول ولا قوة الإبالله و حسبي الله ونعم الوكيل.

يجب علي تقبل المحدودية و الضعف و اللجوء الي من بيده ملكوت السموات و الأرض القادر علي كل شئ. سبحان ربي و تعالي و صلي الله وسلم علي سيدنا محمد.

Domains Day: Connecting Railway with Vercel, Cloudflare, and Hostinger

Kareem — Mon, 19 May 2025 21:00:00 GMT

Moving My Domains

I decided to transfer all 11 of my domains from Hostinger to other providers. I consulted two friends—a team lead in Canada and an SEO specialist in Turkey—and both recommended Porkbun.

I registered with Porkbun and was required to verify my identity. They offered multiple verification options, and I chose to use my government ID. However, I faced issues: I uploaded my ID about 10 times, and it failed every time, even though the verification platform supported my language. I reached out to their help center, and the fastest response came via email.

It took about 12 hours to receive a reply, which felt like a long time. The good news? They instantly verified my ID with no further issues. Later, when I asked another question, their response again took several hours, unlike Hostinger, which typically responds within 1 to 30 minutes (and at most within 2 hours). However, I found Hostinger’s customer service stricter and sometimes less friendly.

I’ll discuss my reasons for moving later, but for now, let’s focus on the process.

Moving from Hostinger to Porkbun

I had never transferred a domain before, so I expected it to be challenging. However, Hostinger made the process straightforward without asking why I was transferring.

Here’s how to do it:

Go to your Hostinger dashboard.
Disable the Privacy Protection (WHOIS Privacy Protection) button.
Disable the Transfer Lock button. This may take up to 12 hours to update.
Obtain the Authorization Code (EPP code) from the dashboard and save it, as you’ll need to provide it to Porkbun for the domain transfer.

On Porkbun’s website: - Enter your domain name in the Domain Name field under the transfer section. - Copy the authorization code from Hostinger and paste it into the Auth Code field. - Click Submit.

The transfer(s) will be added to your cart. From there, click Continue to Billing to pay for the transfer. It’s that simple!

If your domain is older than 60 days, the transfer typically takes 5 to 7 days. Some domains may transfer in as little as 2 days, but Hostinger will send a verification request via email to confirm the transfer.

Note: If your domain is less than 60 days old, you’ll need to wait until it passes the 60-day mark. There’s talk of this being reduced to 30 days, but as of this writing (May 2025), the 60-day rule applies, per ICANN regulations, not Hostinger or Porkbun.

Connecting Railway and Vercel with Porkbun

Connecting a domain to Vercel is straightforward. Follow the instructions provided by Vercel, which are similar to those for any registrar. Add the necessary DNS records in Porkbun’s DNS Management section, accessible from the domain management dashboard under your account menu.

Connecting to Railway was trickier. I needed an intermediary, and I chose Cloudflare to manage my Porkbun domain’s DNS. Here’s how it worked:

Configure Cloudflare to manage your Porkbun domain.
Add the necessary Railway DNS records to Cloudflare.
Wait approximately 1 to 2 days for the DNS changes to propagate.

Once the DNS updates are complete, your domain should work seamlessly with Railway.

Starting Substack as a AI Researcher

kareem — Sun, 18 May 2025 21:00:00 GMT

What is Substack?

From their about page: “On Substack, writers and creators can publish their work and make money from paid subscriptions while supporters can directly sustain the work they deeply value.”

It’s a platform where you can publish what you want in an organized way, with great analytics tools and a powerful recommendation and search system.

I like to think of it as a nicer version of Medium, with more than just SEO dumped from Google.

I really love the network from a design perspective. It’s an organized recommendation system based on multiple aspects.

Why use Substack as an AI Researcher?

I used LinkedIn and Medium, and I don’t find them very useful for engagement or building a real network.

My gravity is very tiny in these networks and I can’t get bigger for multiple reasons I will discuss later.

The same is true for X, but it’s a very good place for the ML community.

My reasons:

Improve my writing style. I am not good at writing or English, but I am trying to do my best. Communication is a very important skill for my work as a researcher.
I build products I want people to learn more about…more later.
I have services I want to market…more later.

Did you forget something?

^_^

Die, empty, and publish your knowledge!

Actually, the main reason for me to write is that I feel lonely in my journey of learning about DL and software engineering.

And I feel bored most of the time.

I love the feeling that I am writing and archiving my life. This reminds me that I am doing hard work, learning new things, and my life will not end without doing great things from my point of view.

Recapping what I learned when I write about it makes the concepts stick in your mind, and you get more insights from people’s comments. This is very helpful for me.

For readers, you can see what I do, and you may be interested in it, be inspired, and learn from my mistakes.

Following the Giants?

When I started learning about DL, I searched for a great book about DL and PyTorch that is well explained in both code and theory.

I found his Substack where he publishes what he learns. At the same time, I found similar people like

These people I follow and appreciate what they provide!

So I started to ask, why do they use Substack?

The answer is very clear if you know them ^_^

Quick analysis

The UI/UX, speed of the website, and animations are very cool.

Let’s look at the level I want to reach :)

Ahead of AI

On 28/04/2023

It reached 15,028 free subscribers and only 69 paid ones!

This is very disappointing for me.

You don’t know if these are monthly or yearly paid and how much they give.

Of course, I don’t know how much he earns or others. I want to motivate myself and not create high expectations. He is doing great stuff, really great. His content is in the top 5 for me for AI and real content.

No scams, no “read this paper,” “look at this book,” “here are the top 1000 chatbots that are better than 10 who are better than GPT!!”

Oh my god, QwenClaudeMixtral just released a model that will convert space into water in the year 1021932103120931920.

This is very silly if you respect your readers’ minds!

Ahead of AI subscribers in 2024

What about now?

He has more than 105k+ and is #71 Rising in Technology. I don’t know why 71!! He must be top 5.

Jay Alammar => 23k+ subscribers

Maarten Grootendorst => 20k+ subscribers
Paul Iusztin => 25k+ subscribers

Most likes per post are around 10 to 20 only!

The comments are around 0 to 5 :)

My Goal in the coming 6 months?

I am doing multiple things at the current time.

I am on 4 projects and doing my Master Thesis and a lot of projects and websites I am developing.

I will not be able to provide much, but I will try my best.

I don’t care about the number of subscribers, I care about real ones, who will comment and read the content. If I can get 30 people to read it, this is great. Really, if you imagine 30 people in your room watching what you did, it’s mind-blowing. Imagine 300 or 3,000…Wow.

For money, if I can get only $200 per month, this is very great as a start.

Disease from other websites?

I noticed the following issues on the first day:

Sh*** shorts and copyright issues: By copyright, I mean some people take others’ content from outside Substack and post it as if it is their own and get followers for this! This is very bad. I don’t like strict copyright like “you copied this sentence” or “used a logo,” but stealing the whole content!!!
Very short articles compared to normal tweets! I found a lot of articles that appeared to me are like tweets…I thought I was in a place where people write in-depth content, not clickbait.

I don’t talk about the post but the long-form ones.

Drop your Substack fake numbers: A lot of people post “drop your Substack below and whatever it’s about I will exchange” and multiple similar things!! What is the
purpose then… you have 1 million subscribers, then what?

There is no official API for Substack

The most annoying thing in Substack is there is no current API to develop apps, create extensions, and automate a lot of things.

What things can I write about?

You seem to love to talk a lot? What will you write then?

It’s very obvious…

I want to talk about the following:

Compute world with AI, fine-tuning/training, and cloud instances, etc.
Products I use and tried, not product reviews.
Books and papers I read and summarized.
TILs, what I learned today!
Novel writing about the AI world like vector databases, embedding, and searching, etc.
Applications I build and services I provide.
My workflow and software i use.
Substack tips and analysis.

you can find links here for now: 1. Gpuvec publications

Kareem’s TILs

Explore the Qdrant Blog core concepts | Part 1

kareem — Sat, 17 May 2025 21:00:00 GMT

Overview of Qdrant features and Concepts

I will divide the blogs into 3 types:

Startups using Qdrant: why and comparision you get read nice usecases with some numbers of comparision and how qdrant is amazing
Startup using Qdrant++: The same but with code snippets and System Design discussion..very useful for me as a developer
Qdrant releases: New features in Qdrant and how to use them.

Remember this when i will build Agentic RAG in the new job

users tend to ask more structured, analytical questions when they know a database is involved—queries better suited to SQL than vector search. This prompted the team to pair Qdrant with a text-to-SQL system, blending unstructured and structured query capabilities for a more versatile agent.

Hotel Search with Vectors

Superlinked enhances search by embedding each attribute (text, numbers, categories) into specialized spaces, enabling nuanced, multi-attribute queries.

An LLM interprets user intent, assigning weights to preferences (e.g., price, rating), allowing flexible, business-driven ranking without system redesign.

Hard filters narrow results, while weighted nearest neighbor search ranks them by user preferences.

This unified approach supports multimodal search—combining semantic text, scaled numerical, and categorical data—preserving relationships and preference strengths.

Unlike traditional systems that separate or flatten data, Superlinked enables simultaneous, weighted consideration of all attributes, solving challenges like reconciling

results across types and capturing nuanced user intent.

Reciprocal Rank Rusion (RRF)

Qdrant’s native support for Reciprocal Rank Fusion (RRF) streamlined their retriever implementations, reducing hybrid search code by 80%. The multi-vector capabilities also enabled more sophisticated retrieval methods that better captured semantic relationships.

Qdrant 1.13 GPU Indexing

Here is summary of these new features

GPU Accelerated Indexing with Qdrant

You can Index over all majro GPU vendors including NVIDIA,AMD and Intel that support Vulkan API to get speeds up to 10x faster than CPU-based methods*

As of right now this solution supports only on-premises deployments, but they will introduce support for Qdrant Cloud shortly.

Additional benefits:

Multi-GPU support
GPU indexing supports all quantization options and datatypes in Qdrant

Strict mode for Opertional Control

Strict Mode enforces operational controls in distributed Qdrant deployments. It limits resource-intensive operations (like unindexed filtering and large batch sizes), sets boundaries on search parameters, and adds safeguards for payload sizes and timeouts. This prevents system overload, solves the “noisy neighbor” problem, and ensures reliable performance—especially in multi-tenant or serverless environments.

HNSW Graph Compression

Make search lighter on memory wihtout sacrificing speed with Delta Encoding.

Delta Encoding is a clever way to compress data by storing only the differences (or “deltas”) between values. It’s commonly used in search engines (for the classical inverted index) to save space and improve performance. I think i have read this with Colbertv2 using similar techniques to reduce the siz it’s called residual compression mechanism needs more searching

It’s now used for HNSW graph structure that powers Qdrant’s search.

Static Embedding with Qdrant and Model2vec

Static embedding from minishLab reduce the model size with 15x reduction and up to 500x speed increase while the maintain more than 85% of the performance levels. it’s like our zaraah model for arabic.

Static embedding are dense embedding so you can also use with qdrant collections. The retrieval is not going to be any faster becuase static embeddings. but the speedup is in creating the vectors from your data and encoding the queries.

If you want to make the retrieval faster use the following: 1. Matryoshka Embeddings 2. Quantization methods like (Scalar and Binary Quantization) ### When to use Static Embeddings ?

Mobile applications - although many smartphones have powerful CPUs or even GPUs, the battery life is still a concern, and the static embeddings might be a good compromise between the quality and the power consumption. Moreover, the static embeddings can be used in the applications that require offline mode.
Web browser extensions - running a transformer-based model in a web browser is usually not quite an option, but static embeddings might be a good choice, as they have fewer parameters and are faster to encode.
Embedded systems - the static embeddings might be a good choice for the devices with limited computational power, such as IoT devices or microcontrollers.

References:

There is more text in here from Qdrant not me..you can continue reading here

TIL ? Tody I lernt to create TIL

kareem — Fri, 16 May 2025 21:00:00 GMT

Why TIL ?

It helps you overcome the perfacationsim in writting. you don’t need to create great article in depth about things you want to share or learn about..etc, all what you want it to write a thing you were trying to learn or solve today and how you solve it

which will help me more focused and make the process of learning more easier and useful for me and others

create your own gravitey

sometimes you want to reach and communicate with people in the same space of problems you are solving, search engines are very bad in provide the information you want..this is related to how these engines work and other SEO stuff. but for me, i can’t reach people and help them or get benefit from them because they simple don’t know about me!!

TIL will decreases this spaces and daily TIl about the things i am learning which are alot will start to give me nice SEO and unique because i am talking about things i don’t know and i am interested in and there will be much people in the same boat this will increase my X account and linkedin and this is very useful in the current time.

TILs level up

I also want to think in way to extract more crafted blogs from my TILs. but just start making it habbit and will see who will it comes in the end.

initial thoughts: 1. weekly recap from my TILs and SEO optimization for the keywords that is increasing and i am interested in 2. ML to extract related stuff, for example i will want to take about the folloinwg topics: - Late interaciton (Pylate, Colbertv2, ColPali) - Searching - Model2vec - Visino Language models..etc collecting them and start writing about them and adding internal links for them will be very useful to improve my blog system

Things i learned the last 2 days

The best way to increase traffic is comment with valuble knowledge.

I was scorlling on X and found some popular account tweets about Harvard CS197 AI Research and i had create a review year ago about it I add the link in the comment and just slept. Boom i found reply analytics links opened wihtout annoying anyone!! - 501 Impressions - 77 Engagements - 2 profile visits - 74 clicks

What is PALID index?

PLAID index is for indexing very large datasets with Later interaction models like (Colbertv2 & ColPali) IT Solves the storage footprint and allow you to scale to infinity

They swapped the faiss from facebook what is nice thing because i have multiple bad time to install it especially the GPU version.

and used fastkeamns from the amazing @bclaive which i really like his work on embeddings

It’s replacement for Voyager-based HNSW index which was very bad for scaling Late-Interaction retrieval models

Prime Intellect VS Gpuvec

I was creating a website to find and compare the cloud compute instances from all cloud providers in easy and interactive way.

I started to work on it the last month, but stoped for other work.

suddenly i found this website which is called Prime intellect.

And i just want to say,woooooow. it’s a piece of art. The design and information on it and how fast, accurate is very embrassing for my poor gpuvec.com

should i continue improve my website?

Actually yes, we share similar goals but there is multiple chances to compete or even collaborite!! who knows!

They are more than just listing and compare prices, then enable you to use these GPUs from their website.

Also they create decentralized models and have a mutliple expirened engineers.