AI & ML interests

None defined yet.

Recent Activity

IlyasMoutawwakilย  updated a Space about 2 months ago
optimum-intel/benchmark-openvino
IlyasMoutawwakilย  updated a Space about 2 months ago
optimum-intel/benchmark-openvino
IlyasMoutawwakilย  updated a Space 7 months ago
optimum-intel/fastrag-e2e
View all activity

optimum-intel's activity

jeffboudierย 
posted an update about 1 month ago
view post
Post
597
NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos
  • 1 reply
ยท
jeffboudierย 
posted an update 3 months ago
jeffboudierย 
posted an update 4 months ago
jeffboudierย 
posted an update 5 months ago
view post
Post
459
Inference Endpoints got a bunch of cool updates yesterday, this is my top 3
jeffboudierย 
posted an update 5 months ago
view post
Post
4043
Pro Tip - if you're a Firefox user, you can set up Hugging Chat as integrated AI Assistant, with contextual links to summarize or simplify any text - handy!

In this short video I show how to set it up
ยท
IlyasMoutawwakilย 
posted an update 8 months ago
view post
Post
4031
Last week, Intel's new Xeon CPUs, Sapphire Rapids (SPR), landed on Inference Endpoints and I think they got the potential to reduce the cost of your RAG pipelines ๐Ÿ’ธ

Why ? Because they come with Intelยฎ AMX support, which is a set of instructions that support and accelerate BF16 and INT8 matrix multiplications on CPU โšก

I went ahead and built a Space to showcase how to efficiently deploy embedding models on SPR for both Retrieving and Ranking documents, with Haystack compatible components: https://huggingface.co/spaces/optimum-intel/haystack-e2e

Here's how it works:

- Document Store: A FAISS document store containing the seven-wonders dataset, embedded, indexed and stored on the Space's persistent storage to avoid unnecessary re-computation of embeddings.

- Retriever: It embeds the query at runtime and retrieves from the dataset N documents that are most semantically similar to the query's embedding.
We use the small variant of the BGE family here because we want a model that's fast to run on the entire dataset and has a small embedding space for fast similarity search. Specifically we use an INT8 quantized bge-small-en-v1.5, deployed on an Intel Sapphire Rapids CPU instance.

- Ranker: It re-embeds the retrieved documents at runtime and re-ranks them based on semantic similarity to the query's embedding. We use the large variant of the BGE family here because it's optimized for accuracy allowing us to filter the most relevant k documents that we'll use in the LLM prompt. Specifically we use an INT8 quantized bge-large-en-v1.5, deployed on an Intel Sapphire Rapids CPU instance.

Space: https://huggingface.co/spaces/optimum-intel/haystack-e2e
Retriever IE: optimum-intel/fastrag-retriever
Ranker IE: optimum-intel/fastrag-ranker
jeffboudierย 
posted an update 9 months ago
jeffboudierย 
posted an update 10 months ago