Open
Description
(Just keeping notes for myself. Maybe someone else will find these useful as well.)
Running the script on alihlt
Version with virtualenv
Prerequisites - to be installed by admin:
- python3-virtualenv
- graphviz
- ROOT prerequisites
Running
1. Install ROOT 6. Add to ~/.bashrc:
export PATH=/opt/rocm/bin:$PATH
export ALIBUILD_WORK_DIR="$HOME/alice/sw"
eval "`alienv shell-helper`"
Reload shell.
2. Add PYTHONPATH=/home/${LOGNAME}/.virtualenvs/tpcwithdnn/lib/python3.6/site-packages/:$PYTHONPATH to load.sh:89 and comment out the LD_LIBRARY_PATH line.
3. Copy input data from aliceml and change paths in database*.yml (/home/mkabus/data/...).
4. Run:
alienv enter ROOT/latest
source load.sh
pip uninstall tf-nightly-gpu
pip install tensorflow-rocm
5. In utilities_dnn.py:58 replace pool_type with 1 (forcing AveragePooling3D; MaxPooling3D causes the error "3D pooling doesn't support workspace index mask mode").
6. Change run_parallel to true in database*.yml.
7. In dnn_optimiser.py:58 set devices explicitly, for 6 devices: self.strategy = MirroredStrategy(devices=["/gpu:0", "/gpu:1", "/gpu:2", "/gpu:3", "/gpu:4", "/gpu:5"])
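To avoid typing the device list by hand, it could be generated from the number of GPUs. This is only a sketch: gpu_device_names is a made-up helper, not something from the repository, and the MirroredStrategy call is left as a comment so the snippet runs without TensorFlow installed.

```python
def gpu_device_names(n_gpus):
    """Build TensorFlow-style device strings for the first n_gpus GPUs.

    Hypothetical helper, not part of the repository.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return ["/gpu:{}".format(i) for i in range(n_gpus)]


# For the 6-device case from the notes above:
devices = gpu_device_names(6)
print(devices)

# In dnn_optimiser.py the list would then be passed on, e.g.:
# self.strategy = MirroredStrategy(devices=devices)
```

The number of GPUs is still hard-coded here (6); it has to match what is actually available on the machine.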
Debugging
Comments to a tensorflow issue
ROCM guide on HIP debugging
ROCM guide on system-level debugging
Profiling