-
Notifications
You must be signed in to change notification settings - Fork 31.3k
Only default rope_parameters to empty dict if there is something to put in it
#42651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…to put in it Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
| return | ||
| # Case 1: RoPE param keys do not intersect with possible `layer_types` -> one global dict | ||
| if getattr(self, "layer_types", None) is None or not set(rope_parameters.keys()).issubset(self.layer_types): | ||
| elif getattr(self, "layer_types", None) is None or not set(rope_parameters.keys()).issubset(self.layer_types): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the custom model has no rope parameters (i.e. it is {}), we will get triggered by the original issue of evaluting to True, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that if rope_parameters is explicitly set to {} we'll still get here. But if rope_parameters is unset, it will remain None and this method will exit early with a warning
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
zucchini-nlp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect
set(rope_parameters.keys()).issubset(self.layer_types)will evaluate toTrueifrope_parametersis an emptydict.This is a problem if
standardise_rope_paramsis called on a config for a model which do not have RoPE because it will triggerCase 2. This method might be called on models which do not have RoPE in third party libraries (e.g. vLLM) in order to "upgrade" configs of custom models they use.This PR makes
standardise_rope_paramsharmless to call on configs with no RoPE at all.