Integrate with Sentence Transformers v5.4
Hello!
Pull Request overview
- Integrate nomic-embed-vision-v1.5 with Sentence Transformers v5.4+
Details
The integration uses a Transformer -> Pooling(cls) -> Normalize pipeline with `modality_config` set to `{"image": {"method": "forward", "method_output_name": "last_hidden_state"}}` so the model accepts image inputs. A `processor_config.json` was added to ensure `AutoProcessor` loads the `CLIPImageProcessor` instead of falling back to a tokenizer (since `model_type: nomic_bert` would otherwise resolve to `BertTokenizer`).
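For reference, the two configuration pieces described above might look roughly like this. The `modality_config` value is quoted from the description; the surrounding key layout, and the use of a `processor_class` key to redirect `AutoProcessor`, are assumptions about the file contents, not verbatim copies of the PR's files.

`sentence_bert_config.json` (sketch):

```json
{
  "modality_config": {
    "image": {
      "method": "forward",
      "method_output_name": "last_hidden_state"
    }
  }
}
```

`processor_config.json` (sketch):

```json
{
  "processor_class": "CLIPImageProcessor"
}
```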
Note: this model requires https://huggingface.co/nomic-ai/nomic-bert-2048/discussions/23 to fix three transformers v5 compatibility issues in the shared modeling code:
- Adding `self.post_init()` to `NomicVisionModel.__init__` (required for `all_tied_weights_keys`)
- Lazy recomputation of rotary position embeddings in `NomicVisionRotaryEmbeddingCat.get_embed` (non-persistent buffers are not materialized when `from_pretrained` initializes on `torch.device("meta")` in v5)
- Replacing the `self.norm_factor` buffer with inline `math.sqrt(self.head_dim)` in `NomicAttentionPooling` and `NomicBertAttention` (same meta-device issue)
Added files:
- `modules.json`: Defines the Transformer -> Pooling -> Normalize pipeline
- `config_sentence_transformers.json`: ST model config with cosine similarity
- `sentence_bert_config.json`: Transformer config with image `modality_config`
- `1_Pooling/config.json`: CLS pooling mode, 768-dim embeddings
- `processor_config.json`: Ensures `AutoProcessor` loads `CLIPImageProcessor`
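As an illustration of what the pipeline files contain, here is a sketch of `modules.json` and `1_Pooling/config.json` following the standard Sentence Transformers layout. The module types and the CLS/768-dim settings come from the description above; the exact field values are illustrative, not copied from the PR's files.

```json
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"}
]
```

```json
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false
}
```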
Modified files:
- `config.json`: Fixed `n_inner` from `2048.0` (float) to `2048` (int) for transformers v5 strict validation
- `README.md`: Added `sentence-transformers` library tag, and a "Using Sentence Transformers" usage section
Here's a script that uses both this PR and the companion PR:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "nomic-ai/nomic-embed-vision-v1.5",
    revision="refs/pr/10",
    model_kwargs={"code_revision": "refs/pr/23"},
    trust_remote_code=True,
)
embeddings = model.encode("http://images.cocodataset.org/val2017/000000039769.jpg")
print(embeddings.shape)
# (768,)
```
Once both PRs are merged, the `revision` and `model_kwargs` arguments can be omitted.
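As a side note on the cosine-similarity setting in `config_sentence_transformers.json`: because the pipeline ends with a `Normalize` module, the embeddings come out unit-length, so cosine similarity reduces to a plain dot product. A minimal sketch with made-up vectors (not real model outputs):

```python
import math

# Two made-up raw embedding vectors, standing in for model outputs.
a = [3.0, 4.0, 0.0]
b = [0.0, 4.0, 3.0]

def l2(v):
    return math.sqrt(sum(x * x for x in v))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# What the Normalize module does: scale each vector to unit L2 norm.
a_n = [x / l2(a) for x in a]
b_n = [x / l2(b) for x in b]

# Cosine similarity of the raw vectors equals the dot product of the
# normalized vectors.
cos = dot(a, b) / (l2(a) * l2(b))
print(round(cos, 6), round(dot(a_n, b_n), 6))  # 0.64 0.64
```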
Note that none of the existing behaviour is affected or changed; this only adds an additional way to run this model in a familiar and common format.
- Tom Aarsen
Hello!
This is what I get, for reference:
```python
>>> from sentence_transformers import SentenceTransformer
>>> model = SentenceTransformer("nomic-ai/nomic-embed-vision-v1.5", revision="refs/pr/10", model_kwargs={"code_revision": "refs/pr/23"}, trust_remote_code=True)
>>> embeddings = model.encode("http://images.cocodataset.org/val2017/000000039769.jpg")
>>> print(embeddings)
Loading weights: 100%|██████████████████████████████████████████████████████| 211/211 [00:00<00:00, 5870.40it/s]
[ 4.71330713e-03 -2.53534522e-02  6.63616322e-03 -2.95666978e-02
 -4.34983559e-02 -1.22364080e-02  2.38989759e-03 -3.60762812e-02
 . . .
 -4.39791530e-02 -3.05440221e-02 -1.93784963e-02 -1.76065695e-02
 -3.54587808e-02 -4.97163460e-02  7.33873341e-03 -3.87372449e-02]
```
Can you share your torch and transformers versions?
I'm using torch 2.10.0+cu128 and transformers 5.5.0. I know some older torch versions had issues producing NaN outputs here, although you shouldn't need a version as new as 2.10.0.
- Tom Aarsen
