Support intern-s1 #14875

Open: wants to merge 5 commits into base: master

Conversation

RunningLeon
Contributor

@github-actions github-actions bot added the python python script changes label Jul 25, 2025
name = name.replace(r".lambda_2", r".ls2")
name = name.replace(r".layernorm_before.", r".norm1.")
name = name.replace(r".layernorm_after.", r".norm2.")
return name
Collaborator

Use self.map_tensor_name(name) with a proper mapping instead.

Contributor Author

This name remapping is only needed for Intern-S1, so I believe it is better to keep the logic in a function that maps the names to the InternVLChat model weight names.

Collaborator

Please at least add the vision_tower/vision_model mappings to gguf-py/gguf/tensor_mapping.py.

Contributor Author

OK, will update later.
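
For readers unfamiliar with this kind of request, a template-based name map can be sketched roughly as follows. This is a standalone toy, not the actual gguf-py code: `TOY_TENSOR_MAP`, `build_lookup`, `map_name`, the `toy.*` target names, and the `encoder.layer(s)` path segments are all assumptions made for illustration. Only the `{bid}` placeholder style and the `vision_tower`/`vision_model` prefixes come from this thread.

```python
# Toy illustration of a template-based tensor name map (hypothetical names,
# NOT the real contents of gguf-py/gguf/tensor_mapping.py).
TOY_TENSOR_MAP: dict[str, tuple[str, ...]] = {
    # target name template -> accepted source name templates
    "toy.blk.{bid}.norm1": (
        "vision_model.encoder.layers.{bid}.norm1",            # assumed InternVL-style name
        "vision_tower.encoder.layer.{bid}.layernorm_before",  # assumed Intern-S1-style name
    ),
}


def build_lookup(n_blocks: int) -> dict[str, str]:
    """Expand every {bid} template into a flat source -> target dictionary."""
    lookup: dict[str, str] = {}
    for target, sources in TOY_TENSOR_MAP.items():
        for bid in range(n_blocks):
            for source in sources:
                lookup[source.format(bid=bid)] = target.format(bid=bid)
    return lookup


def map_name(name: str, lookup: dict[str, str]) -> str:
    """Map a checkpoint tensor name (ending in .weight or .bias) to its target name."""
    base, _, suffix = name.rpartition(".")
    if suffix not in ("weight", "bias") or base not in lookup:
        raise ValueError(f"cannot map tensor {name!r}")
    return f"{lookup[base]}.{suffix}"
```

With a table like this, supporting a new checkpoint layout only requires adding one more source template per tensor, which is presumably why the reviewer prefers it over ad hoc string replacement.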

@CISC
Collaborator

CISC commented Jul 29, 2025

The Python Type-Check CI needs to be resolved.

@RunningLeon
Contributor Author

> The Python Type-Check CI needs to be resolved.

@CISC Hi, could you tell me how to fix this error? It does not seem reasonable to me.

/home/runner/work/llama.cpp/llama.cpp/convert_hf_to_gguf.py:3219:23 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
Error: Object of type "None" is not subscriptable (reportOptionalSubscript)
/home/runner/work/llama.cpp/llama.cpp/convert_hf_to_gguf.py:3220:13 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
Error: Object of type "None" is not subscriptable (reportOptionalSubscript)
/home/runner/work/llama.cpp/llama.cpp/convert_hf_to_gguf.py:3220:49 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
Error: Object of type "None" is not subscriptable (reportOptionalSubscript)
/home/runner/work/llama.cpp/llama.cpp/convert_hf_to_gguf.py:3221:23 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
Error: Object of type "None" is not subscriptable (reportOptionalSubscript)
/home/runner/work/llama.cpp/llama.cpp/convert_hf_to_gguf.py:3222:13 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
Error: Object of type "None" is not subscriptable (reportOptionalSubscript)
/home/runner/work/llama.cpp/llama.cpp/convert_hf_to_gguf.py:3222:49 - error: Object of type "None" is not subscriptable (reportOptionalSubscript)
Error: Object of type "None" is not subscriptable (reportOptionalSubscript)
6 errors, 0 warnings, 14 informations
Error: 6 errors

@CISC
Collaborator

CISC commented Jul 30, 2025

> The Python Type-Check CI needs to be resolved.
>
> @CISC Hi, could you tell me how to fix this error? It does not seem reasonable to me.

Running pyright locally helps. The line numbers are wrong for some reason; this is the actual erroneous code block:

if isinstance(self.hparams_vision['image_size'], list):
    self.hparams_vision['image_size'] = self.hparams_vision['image_size'][0]
if isinstance(self.hparams_vision['patch_size'], list):
    self.hparams_vision['patch_size'] = self.hparams_vision['patch_size'][0]

@@ -2998,7 +2999,12 @@ def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iter
@ModelBase.register("InternVisionModel")
class InternVisionModel(MmprojModel):
    def set_gguf_parameters(self):
        if isinstance(self.hparams_vision['image_size'], list):
Collaborator

Suggested change
-        if isinstance(self.hparams_vision['image_size'], list):
+        assert self.hparams_vision is not None
+        if isinstance(self.hparams_vision['image_size'], list):
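
Pyright flags the block because the attribute is presumably declared as an optional dictionary, so subscripting it without a check triggers reportOptionalSubscript. The narrowing that the suggestion relies on can be reproduced in a minimal standalone sketch. `ToyVisionModel` and `normalize_sizes` are hypothetical stand-ins, not the real converter classes, and the sample values are made up:

```python
from __future__ import annotations

from typing import Any


class ToyVisionModel:
    """Hypothetical stand-in for a converter class; not the real MmprojModel."""

    def __init__(self, hparams_vision: dict[str, Any] | None) -> None:
        # Declared optional, which is what makes an unchecked subscript an error.
        self.hparams_vision: dict[str, Any] | None = hparams_vision

    def normalize_sizes(self) -> None:
        # The assert narrows `dict | None` to `dict` for the rest of the method,
        # so the subscripts below should no longer be reported by pyright.
        assert self.hparams_vision is not None
        for key in ("image_size", "patch_size"):
            value = self.hparams_vision[key]
            if isinstance(value, list):
                # Keep only the first entry, as in the snippet under review.
                self.hparams_vision[key] = value[0]


model = ToyVisionModel({"image_size": [448, 448], "patch_size": 14})
model.normalize_sizes()
print(model.hparams_vision)
```

An explicit `if self.hparams_vision is None: raise ...` would narrow the type in the same way; the assert is simply the shorter form used in the suggestion.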

Comment on lines +3032 to +3039
names_map = {
    "model.multi_modal_projector.layer_norm.bias": "mlp1.0.bias",
    "model.multi_modal_projector.layer_norm.weight": "mlp1.0.weight",
    "model.multi_modal_projector.linear_1.bias": "mlp1.1.bias",
    "model.multi_modal_projector.linear_1.weight": "mlp1.1.weight",
    "model.multi_modal_projector.linear_2.bias": "mlp1.3.bias",
    "model.multi_modal_projector.linear_2.weight": "mlp1.3.weight",
}
Collaborator

@ngxson ngxson Jul 30, 2025

Hmm, OK. I think the mapping for these 6 tensors can't be added to tensor_mapping.py, as it would mess up conversion for other models. So it's OK to keep these 6 tensors here for now.

But there is one thing I'm not sure about: why is the mapped name you are using mlp1.%d.%s? I think it should be mm.model.mlp.%d.%s to match the original InternVL model.
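
The approach under discussion (an exact-name lookup for the six projector tensors plus substring rewrites for the remaining names) could be sketched as below. The helper name `remap_intern_s1_name` and the order of the two steps are assumptions; the dictionary entries and the three `replace` rules are the ones quoted in this review.

```python
# Exact renames for the six projector tensors quoted in the review.
NAMES_MAP = {
    "model.multi_modal_projector.layer_norm.bias": "mlp1.0.bias",
    "model.multi_modal_projector.layer_norm.weight": "mlp1.0.weight",
    "model.multi_modal_projector.linear_1.bias": "mlp1.1.bias",
    "model.multi_modal_projector.linear_1.weight": "mlp1.1.weight",
    "model.multi_modal_projector.linear_2.bias": "mlp1.3.bias",
    "model.multi_modal_projector.linear_2.weight": "mlp1.3.weight",
}

# Substring rewrites quoted earlier in the diff.
REPLACEMENTS = (
    (".lambda_2", ".ls2"),
    (".layernorm_before.", ".norm1."),
    (".layernorm_after.", ".norm2."),
)


def remap_intern_s1_name(name: str) -> str:
    """Hypothetical helper: translate an Intern-S1 tensor name to InternVL style."""
    # Exact matches take priority (assumed ordering).
    if name in NAMES_MAP:
        return NAMES_MAP[name]
    # Otherwise apply each substring rewrite; unmatched names pass through unchanged.
    for old, new in REPLACEMENTS:
        name = name.replace(old, new)
    return name
```

Keeping the rewrite in one small function confines the Intern-S1 special case to a single place, so the shared mapping table is not affected for other models.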

Contributor Author

It just maps the Intern-S1 weight names to the InternVL weight names:

"mlp1.{bid}", # InternVL

@@ -1190,6 +1205,7 @@ class TensorNameMap:

        MODEL_TENSOR.V_MM_INP_NORM: (
            "multi_modal_projector.norm",
            "model.multi_modal_projector.layer_norm", # Intern-S1
Collaborator

This should be removed as the norm is already mapped to mlp.0. This is an artifact from InternVL: https://huggingface.co/OpenGVLab/InternVL3-8B-Instruct/blob/a34d3e4e129a5856abfd6aa6de79776484caa14e/modeling_internvl_chat.py#L79

Suggested change
-            "model.multi_modal_projector.layer_norm", # Intern-S1

Labels
python python script changes
3 participants