-
Notifications
You must be signed in to change notification settings - Fork 64
Add multi-container support #880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
🤖 Hi @zxhe-sean, I've received your request, and I'm working on it now! You can track my progress in the logs for more details. |
|
@zxhe-sean why do we need this change? Could you cover it with tests? |
a62db5b to
f21d198
Compare
0e8f567 to
cfb8a07
Compare
jamOne-
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes and the unit tests!
I left a couple of comments, PTAL.
48b43f5 to
0f07392
Compare
0f07392 to
1fe5b6d
Compare
jamOne-
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, looks good! Left one more comment
| tpu_type_requires_workload_policy=True, | ||
| supports_accelerator_network_profile=False, | ||
| docker_platform=AMD_PLATFORM, | ||
| parallel_containers=2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the FR, Victor said that not all tpu7x should support this, for instance 2x2x1 should not? Or am I wrong?
Could we add a support for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2x2x1 is 4 chips on a full host. I think it is the smallest topology where we can start using 2 containers @Obliviour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 chips can support it but less than 4 can't
|
@Obliviour @FIoannides PTAL |
|
So this PR provides us the ability from the workload to enable two container workloads. There are two other dimensions that need to be covered that I think we need xpk team's help in:
|
Description
Adding a flag --multi-container for supporting multi-container workloads.
With this flag enabled, running workload create will duplicate the
containerpart in the final yaml file so that the resulting workload will be running in two containers running on one host.Issue
b/459820949
Testing
https://cloudlogging.app.goo.gl/F81QQQ4Ca9Gew7hs8