Stella embeddings don't match the official example (w/ sentence transformers) #764

@girishponkiya

System Info

I started a service with the following command:

sudo docker run -d --gpus all \
    -p 8087:80 \
    --memory 8g \
    --cpus 1 \
    -v embeddings-test:/data \
    -e TRUST_REMOTE_CODE="true" \
    --name embeddings-stella-1.5b \
    --pull always ghcr.io/huggingface/text-embeddings-inference:cuda-1.8.3 \
    --model-id NovaSearch/stella_en_1.5B_v5 \
    --hf-token [MY_TOKEN]

I then computed similarities for the example queries and documents listed in the model's README. However, the similarity values differ from the ones I got through the SentenceTransformers library.

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

  1. Start an embedding service with:
sudo docker run -d --gpus all \
    -p 8087:80 \
    --memory 8g \
    --cpus 1 \
    -v embeddings-test:/data \
    -e TRUST_REMOTE_CODE="true" \
    --name embeddings-stella-1.5b \
    --pull always ghcr.io/huggingface/text-embeddings-inference:cuda-1.8.3 \
    --model-id NovaSearch/stella_en_1.5B_v5 \
    --hf-token [MY_TOKEN]
  2. Compute similarity for the following queries and documents:
query_prompt = "Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: "
queries = [
    "What are some ways to reduce stress?",
    "What are the benefits of drinking green tea?",
]

# docs do not need any prompts
docs = [
    "There are many effective ways to reduce stress. Some common techniques include deep breathing, meditation, and physical activity. Engaging in hobbies, spending time in nature, and connecting with loved ones can also help alleviate stress. Additionally, setting boundaries, practicing self-care, and learning to say no can prevent stress from building up.",
    "Green tea has been consumed for centuries and is known for its potential health benefits. It contains antioxidants that may help protect the body against damage caused by free radicals. Regular consumption of green tea has been associated with improved heart health, enhanced cognitive function, and a reduced risk of certain types of cancer. The polyphenols in green tea may also have anti-inflammatory and weight loss properties.",
]

Script to compute the similarity:
sanity-check.py
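
For readers who cannot open the attachment, a check along these lines might look like the sketch below. It is not the attached sanity-check.py. It assumes the container from step 1 is reachable at localhost:8087 and serves TEI's `POST /embed` route. The names `embed`, `cosine_similarity_matrix`, and `compare` are hypothetical helpers introduced only for illustration.

```python
import json
import math
import urllib.request

# Assumption: the container from step 1 is reachable on localhost:8087
# and serves TEI's POST /embed route.
TEI_URL = "http://localhost:8087/embed"

QUERY_PROMPT = (
    "Instruct: Given a web search query, retrieve relevant passages "
    "that answer the query.\nQuery: "
)


def embed(texts, url=TEI_URL):
    """POST a list of texts to /embed and return one vector per text."""
    payload = json.dumps({"inputs": texts}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def cosine_similarity_matrix(a, b):
    """Return the len(a) x len(b) matrix of cosine similarities."""

    def unit(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    a_unit = [unit(v) for v in a]
    b_unit = [unit(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b_unit] for u in a_unit]


def compare(queries, docs, url=TEI_URL):
    """Embed prompted queries and raw docs, then score every pair."""
    # Only the queries get the instruction prompt; docs are sent unchanged.
    query_vectors = embed([QUERY_PROMPT + q for q in queries], url)
    doc_vectors = embed(docs, url)
    return cosine_similarity_matrix(query_vectors, doc_vectors)
```

Calling `compare(queries, docs)` with the two lists above returns a 2x2 matrix that can be set side by side with the SentenceTransformers result.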

Expected behavior

Expected similarity:

# [[0.8178789  0.2958377 ]
#  [0.31938642 0.7853526 ]]
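
To put a number on the mismatch, one option is to report the largest element-wise deviation from the expected matrix. The sketch below assumes the service result is available as a nested list; `max_abs_diff` is a hypothetical helper, and what counts as an acceptable deviation is left to the reader.

```python
# Expected similarities, copied from the matrix above.
EXPECTED = [
    [0.8178789, 0.2958377],
    [0.31938642, 0.7853526],
]


def max_abs_diff(actual, expected=EXPECTED):
    """Return the largest absolute element-wise difference."""
    return max(
        abs(a - e)
        for row_actual, row_expected in zip(actual, expected)
        for a, e in zip(row_actual, row_expected)
    )
```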
