
Add JinaBERT model #35320


Draft · wants to merge 27 commits into main
Conversation

@joelpaulkoch (Contributor) commented Dec 18, 2024

What does this PR do?

This PR adds JinaBERT to transformers.
This enables running jinaai/jina-embeddings-v2-base-code without trust_remote_code.
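For context, a minimal sketch (not part of the PR) of what this enables, assuming the checkpoint resolves to the new in-library JinaBERT classes via the auto mappings:

import torch
from transformers import AutoModel, AutoTokenizer

# Today this checkpoint needs trust_remote_code=True; once JinaBERT is part of
# transformers, the plain calls below should be enough.
tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-code")
model = AutoModel.from_pretrained("jinaai/jina-embeddings-v2-base-code")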

The relevant issue and discussion can be found here: #27035.

Note that there are two implementations in use for Jina Embeddings v2:
This PR covers jinaai/jina-embeddings-v2-base-code, which uses this implementation.
Additionally, there is jinaai/jina-embeddings-v2-base-en (and the variants small-en, base-zh, base-de, base-es), which uses a different implementation.

I'm not sure whether we can make a single JinaBERT implementation that works for both base-code and base-en.
If not, we would probably want two JinaBERT implementations, for instance JinaBERT for base-en and its variants and JinaBERTv2 for base-code.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Tests

Quite a few of the generated tests fail, and I'd need help there to judge what we need or what must be updated.

I've updated the test_inference_no_head_absolute_embedding integration test to assert on the output that I get from the original implementation.
Moreover, I added a test_encode integration test to assert that we get the same results as in the example provided by Jina AI.
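Roughly, such an integration test could look like the sketch below. This is illustrative only: the test name test_encode and model id come from the PR, but the example sentence, the mean-pooling step, and the shape check are assumptions, and the real test would compare against reference values from the original implementation rather than the shape alone.

import torch
from transformers import AutoModel, AutoTokenizer


def test_encode():
    model_id = "jinaai/jina-embeddings-v2-base-code"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)

    inputs = tokenizer(
        ["How do I access the index in a for loop?"],
        return_tensors="pt",
        padding=True,
    )
    with torch.no_grad():
        last_hidden_state = model(**inputs).last_hidden_state

    # Mean pooling over non-padding tokens, as in the Jina AI usage example.
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

    assert embeddings.shape == (1, model.config.hidden_size)
    # The real test would assert a slice of `embeddings` against reference values
    # produced by the original (trust_remote_code) implementation, e.g.
    # torch.testing.assert_close(embeddings[0, :3], expected_slice, atol=1e-4, rtol=1e-4)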

Who can review?

@ArthurZucker

@bwanglzu might be interested too

@@ -147,6 +147,7 @@
("instructblipvideo", "InstructBlipVideoConfig"),
("jamba", "JambaConfig"),
("jetmoe", "JetMoeConfig"),
("jina_bert", "JinaBertConfig"),
@joelpaulkoch (Contributor, Author) commented on this diff:
Looking at the naming of other models, we could rename this to jinabert?

@joelpaulkoch (Contributor, Author):
I followed this guide and first ran transformers-cli add-new-model-like, then created modular_jina_bert.py to structure the code according to the modular transformers concept.

I didn't delete any of the files generated in the first step, so I think there is a lot to clean up.

Also, modular_jina_bert.py is mostly copy-paste from the original implementation.
I only included functions that changed in comparison to BERT.
There are probably things that could be improved or removed.

Moreover, I left some TODOs in modular_jina_bert.py for open points.
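For readers unfamiliar with the modular concept, a very rough sketch of the shape of such a file follows. This is not the actual content of modular_jina_bert.py; the class bodies and the ALiBi-style bias mention are assumptions, and only the idea of inheriting from the BERT classes and redefining what differs is what matters here.

from transformers.models.bert.configuration_bert import BertConfig
from transformers.models.bert.modeling_bert import BertModel, BertSelfAttention


class JinaBertConfig(BertConfig):
    model_type = "jina_bert"


class JinaBertSelfAttention(BertSelfAttention):
    # Only the pieces that differ from BERT (e.g. an ALiBi-style attention bias)
    # would be redefined here; everything else is inherited, and the full
    # modeling_jina_bert.py is generated from this modular file.
    ...


class JinaBertModel(BertModel):
    config_class = JinaBertConfig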

@joelpaulkoch (Contributor, Author) commented Jan 3, 2025

I've updated the PR where I was somewhat certain. I would still need help, especially regarding the tests and other checks.

One thing I've noticed is that quite a few tests fail with AttributeError: Not needed for JinaBert, which is a result of what I did here (following the modular transformers guide, since JinaBertLMPredictionHead does not define _tie_weights but BertLMPredictionHead does):

class JinaBertLMPredictionHead(BertLMPredictionHead):
    def _tie_weights(self):
        # Per the modular transformers guide, raising AttributeError is how an
        # inherited method is marked as removed in the generated modeling file.
        raise AttributeError("Not needed for JinaBert")

@joelpaulkoch joelpaulkoch marked this pull request as ready for review January 10, 2025 10:03
@joelpaulkoch joelpaulkoch marked this pull request as draft January 10, 2025 10:05
Prevents this error: TypeError: unsupported operand type(s) for +: 'Tensor' and 'NoneType'

This occurred on line 251 in modeling_jina_bert.py:
`attention_probs = nn.functional.softmax(attention_scores + bias, dim=-1)`
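For illustration, the kind of guard that avoids this error is sketched below; the helper name softmax_with_optional_bias is hypothetical and the PR's actual change may look different.

from typing import Optional

import torch
from torch import nn


def softmax_with_optional_bias(attention_scores: torch.Tensor, bias: Optional[torch.Tensor]) -> torch.Tensor:
    # Only add the (ALiBi-style) bias when it is actually provided, so a missing
    # bias no longer triggers `Tensor + None` at the softmax call.
    if bias is not None:
        attention_scores = attention_scores + bias
    return nn.functional.softmax(attention_scores, dim=-1)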
@stevhliu (Member) left a comment:

Thanks! It would also be helpful to add a code snippet in the docstrings showing how to generate the embeddings

joelpaulkoch and others added 4 commits January 28, 2025 20:35
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@joelpaulkoch (Contributor, Author):

Thanks for your review! I took the docs from the original repository but still applied your suggestions.
I will add a snippet to the docstrings.

@ArthurZucker (Collaborator):

Hey, sorry for the delay! Having a look in a bit!

@ArthurZucker (Collaborator) left a comment:

Wow, the "in a bit" became 3 months...
My main comment is that the tokenizer should not be in the modeling file; this never happens in transformers!

Apart from that, for the modular file a lot of blocks seem to be very similar, so inheritance should help you!
