forked from deepspeedai/DeepSpeed
Open
Labels: bug (Something isn't working)
Description
Error log:
=========================== short test summary info ============================
FAILED tests/unit/test_checkpointing.py::test_checkpoint_moe[4]
FAILED tests/unit/test_checkpointing.py::test_checkpoint_moe_and_zero[4-True]
FAILED tests/unit/test_checkpointing.py::test_checkpoint_moe_and_zero[2-True]
FAILED tests/unit/test_configurable_parallel.py::TestConfigurableMP::test_gpt2_basic
====== 4 failed, 581 passed, 58 skipped, 1 warning in 3850.22s (1:04:10) =======
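To iterate on just these four failures instead of repeating the full run of about an hour, the FAILED lines of the summary can be turned into a targeted pytest invocation. The following is a minimal sketch, not part of the original report: the helper name `failed_test_ids` is hypothetical, and it assumes each FAILED line carries a node ID without embedded spaces, as in the log above.

```python
import re
import shlex

# Matches lines such as "FAILED tests/unit/test_x.py::test_name[param]".
FAILED_LINE = re.compile(r"^FAILED\s+(\S+)")


def failed_test_ids(summary_text):
    """Return pytest node IDs from the FAILED lines of a short test summary.

    Hypothetical helper for illustration; assumes node IDs contain no spaces.
    """
    ids = []
    for line in summary_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            ids.append(match.group(1))
    return ids


# The FAILED lines copied from the summary in this report.
summary = """\
FAILED tests/unit/test_checkpointing.py::test_checkpoint_moe[4]
FAILED tests/unit/test_checkpointing.py::test_checkpoint_moe_and_zero[4-True]
FAILED tests/unit/test_checkpointing.py::test_checkpoint_moe_and_zero[2-True]
FAILED tests/unit/test_configurable_parallel.py::TestConfigurableMP::test_gpt2_basic
"""

failed = failed_test_ids(summary)

# Quote each node ID so the shell does not interpret the square brackets.
rerun_command = "DEEPSPEED_TEST_WITH_ROCM=1 pytest --forked " + " ".join(
    shlex.quote(node_id) for node_id in failed
)
print(rerun_command)
```

The printed command keeps the same environment variable and `--forked` flag as the full run, so the only difference is the list of tests selected.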
Steps to reproduce:
- Follow the steps in this PR to install PyTorch with hipify_torch as a submodule.
- After building and installing PyTorch from source, clone DeepSpeed from upstream, do a JIT build, and run the unit tests:
  - git clone https://github.com/microsoft/DeepSpeed.git
  - Remove #include <THC/THCGeneral.h> from csrc/lamb/fused_lamb_cuda_kernel.cu before building.
  - ./install.sh (JIT build)
  - DEEPSPEED_TEST_WITH_ROCM=1 pytest --forked tests/unit/test_* 2>&1 | tee deepspeed_unit_test
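The manual edit to csrc/lamb/fused_lamb_cuda_kernel.cu is the one step that is easy to get wrong when reproducing, so it may help to script it. The sketch below is an assumption about how that could be done, not the procedure used for this report: the helper names `strip_thc_include` and `patch_file` are hypothetical, and the demonstration runs on a temporary file with made-up contents rather than the real kernel source.

```python
import re
import tempfile
from pathlib import Path

# Matches "#include <THC/THCGeneral.h>" with or without spaces around "include".
THC_INCLUDE = re.compile(r"^\s*#\s*include\s*<THC/THCGeneral\.h>\s*$")


def strip_thc_include(source_text):
    """Return source_text without any line that includes THC/THCGeneral.h."""
    kept = [
        line
        for line in source_text.splitlines(keepends=True)
        if not THC_INCLUDE.match(line)
    ]
    return "".join(kept)


def patch_file(path):
    """Rewrite the file in place; return True if a line was removed."""
    path = Path(path)
    original = path.read_text()
    patched = strip_thc_include(original)
    if patched != original:
        path.write_text(patched)
        return True
    return False


# Demonstration on a temporary file (placeholder contents, not the real source).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "fused_lamb_cuda_kernel.cu"
    demo.write_text(
        "#include <cuda.h>\n"
        "#include<THC/THCGeneral.h>\n"
        "__global__ void placeholder_kernel() {}\n"
    )
    changed = patch_file(demo)
    demo_result = demo.read_text()

print(changed)
print(demo_result)
```

In a real checkout, `patch_file("csrc/lamb/fused_lamb_cuda_kernel.cu")` would be called from the DeepSpeed repository root after cloning and before running ./install.sh.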