Conversation

Contributor

@vasqu commented Dec 5, 2025

As per the title. Discovered while working on Ernie VL.

@vasqu requested a review from ArthurZucker December 5, 2025 13:57
Comment on lines +169 to +171
+mapping["ernie4_5_moe"] += [
+    WeightRenaming("mlp.moe_statics.e_score_correction_bias", "mlp.gate.moe_statics.e_score_correction_bias")
+]
Contributor Author

This weight was missing
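
For readers unfamiliar with the rename, here is a minimal sketch of its effect on checkpoint keys. It uses a plain substring replacement over a dict. The helper `rename_keys` and the `model.layers.1.` prefix are hypothetical, and this is not a description of how `WeightRenaming` is implemented in the library.

```python
def rename_keys(state_dict, source, target):
    """Return a copy of state_dict with `source` replaced by `target` in every key."""
    return {key.replace(source, target): value for key, value in state_dict.items()}


# Hypothetical checkpoint entry; the layer prefix and values are made up for illustration.
checkpoint = {"model.layers.1.mlp.moe_statics.e_score_correction_bias": [0.0, 0.1]}

renamed = rename_keys(
    checkpoint,
    "mlp.moe_statics.e_score_correction_bias",
    "mlp.gate.moe_statics.e_score_correction_bias",
)
print(list(renamed))
```

Without such a mapping, a checkpoint that stores the bias under the old name would presumably leave the parameter under `mlp.gate` unloaded, which matches the "missing" weight described above.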

Comment on lines -375 to 381
 router_logits = F.linear(hidden_states.float(), self.weight)
-router_logits = F.softmax(router_logits, dim=1, dtype=torch.float)
-router_top_value, router_indices = torch.topk(self.moe_statics(router_logits), self.top_k, dim=-1)
-router_top_value = router_top_value / torch.clamp(
-    router_top_value.sum(dim=-1, keepdim=True), min=self.norm_min
+routing_weights = F.softmax(router_logits, dim=1, dtype=torch.float)
+_, selected_experts = torch.topk(self.moe_statics(routing_weights), self.top_k, dim=-1)
+routing_weights = torch.gather(routing_weights, dim=-1, index=selected_experts)
+routing_weights = routing_weights / torch.clamp(
+    routing_weights.sum(dim=-1, keepdim=True), min=self.norm_min
 )
Contributor Author

This is the core of what I messed up previously. Generations are now similar again, and I could also generate the same outputs with VL using this change.
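
To make the difference concrete, here is a dependency-free sketch of both routing variants for a single token. It assumes that `moe_statics` simply adds `e_score_correction_bias` to its input. That assumption, the function names, and the default `norm_min` are illustrative and not taken from the model code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(scores, top_k):
    """Indices of the top_k largest scores, in descending order of score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]


def route_old(logits, bias, top_k, norm_min=1e-12):
    """Pre-fix variant: the weights are the bias-corrected scores themselves."""
    probs = softmax(logits)
    corrected = [p + b for p, b in zip(probs, bias)]  # assumed moe_statics behavior
    experts = top_k_indices(corrected, top_k)
    values = [corrected[i] for i in experts]
    denom = max(sum(values), norm_min)
    return experts, [v / denom for v in values]


def route_new(logits, bias, top_k, norm_min=1e-12):
    """Post-fix variant: the bias only selects experts; weights come from the softmax."""
    probs = softmax(logits)
    corrected = [p + b for p, b in zip(probs, bias)]  # assumed moe_statics behavior
    experts = top_k_indices(corrected, top_k)
    values = [probs[i] for i in experts]  # gather from the uncorrected probabilities
    denom = max(sum(values), norm_min)
    return experts, [v / denom for v in values]


logits = [2.0, 1.0, 0.0]
bias = [0.0, 0.0, 1.0]  # made-up bias that promotes the last expert
old_experts, old_weights = route_old(logits, bias, top_k=2)
new_experts, new_weights = route_new(logits, bias, top_k=2)
print(old_experts, old_weights)
print(new_experts, new_weights)
```

Both variants pick the same experts, but only the second keeps the correction bias out of the final mixing weights, which is what the `torch.gather` line in the diff achieves.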

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker left a comment

Thanks 😢

 @slow
 def test_model_21b_a3b_generation(self):
-    EXPECTED_TEXT_COMPLETION = "User: Hey, are you conscious? Can you talk to me?\nAssistant: I don't have consciousness in the way humans do. I'm a text-based AI created to process and generate responses based on patterns in data." # fmt: skip
+    EXPECTED_TEXT_COMPLETION = "User: Hey, are you conscious? Can you talk to me?\nAssistant: \nI don't have consciousness in the way humans do. I don't feel emotions, have thoughts, or experience awareness. However, I'm" # fmt: skip
Collaborator

Can you add a fast test for this one as well? We seem to break it often.

Contributor

github-actions bot commented Dec 5, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: ernie4_5_moe
