
Commit c4b6e8b

[Docs] Move GPU troubleshooting section to RUN_DEBUG.md for better organization
- Removed detailed GPU memory troubleshooting from `README.md` to avoid redundancy.
- Added and expanded GPU troubleshooting section in `RUN_DEBUG.md` for clearer guidance.
1 parent: 5887b41

File tree

2 files changed: +25 −26 lines


README.md

Lines changed: 0 additions & 25 deletions
@@ -312,31 +312,6 @@ docker run --rm -it --gpus all \
     --model /data/Llama-3.2-1B-Instruct.FP16.gguf \
     --prompt "Tell me a joke"
 ```
------------
-
-## Troubleshooting GPU Memory Issues
-
-### Out of Memory Error
-
-You may encounter an out-of-memory error like:
-```
-Exception in thread "main" uk.ac.manchester.tornado.api.exceptions.TornadoOutOfMemoryException: Unable to allocate 100663320 bytes of memory.
-To increase the maximum device memory, use -Dtornado.device.memory=<X>GB
-```
-
-This indicates that the default GPU memory allocation (7GB) is insufficient for your model.
-
-### Solution
-
-First, check your GPU specifications. If your GPU has high memory capacity, you can increase the GPU memory allocation using the `--gpu-memory` flag:
-
-```bash
-# For 3B models, try increasing to 15GB
-./llama-tornado --gpu --model beehive-llama-3.2-3b-instruct-fp16.gguf --prompt "Tell me a joke" --gpu-memory 15GB
-
-# For 8B models, you may need even more (20GB or higher)
-./llama-tornado --gpu --model beehive-llama-3.2-8b-instruct-fp16.gguf --prompt "Tell me a joke" --gpu-memory 20GB
-```
 
 -----------

docs/RUN_DEBUG.md

Lines changed: 25 additions & 1 deletion
@@ -1,4 +1,28 @@
-## GPU Memory Requirements by Model Size
+## Troubleshooting GPU Memory Issues
+
+### Out of Memory Error
+
+You may encounter an out-of-memory error like:
+```
+Exception in thread "main" uk.ac.manchester.tornado.api.exceptions.TornadoOutOfMemoryException: Unable to allocate 100663320 bytes of memory.
+To increase the maximum device memory, use -Dtornado.device.memory=<X>GB
+```
+
+This indicates that the default GPU memory allocation (7GB) is insufficient for your model.
+
+### Solution
+
+First, check your GPU specifications. If your GPU has high memory capacity, you can increase the GPU memory allocation using the `--gpu-memory` flag:
+
+```bash
+# For 3B models, try increasing to 15GB
+./llama-tornado --gpu --model beehive-llama-3.2-3b-instruct-fp16.gguf --prompt "Tell me a joke" --gpu-memory 15GB
+
+# For 8B models, you may need even more (20GB or higher)
+./llama-tornado --gpu --model beehive-llama-3.2-8b-instruct-fp16.gguf --prompt "Tell me a joke" --gpu-memory 20GB
+```
+
+### GPU Memory Requirements by Model Size
 
 | Model Size | Recommended GPU Memory |
 |-------------|------------------------|

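Not part of this commit, but as background for picking a `--gpu-memory` value: FP16 weights take two bytes per parameter, so an N-billion-parameter model needs roughly 2N GB for the weights alone, plus headroom for activations, KV cache, and runtime overhead. A minimal sizing sketch; the function name and the 1.5× headroom factor are illustrative assumptions, not part of llama-tornado:

```python
import math

def estimate_gpu_memory_gb(params_billions: float,
                           bytes_per_param: float = 2.0,
                           headroom: float = 1.5) -> int:
    """Rough GPU memory estimate (GB) for running an FP16 model.

    FP16 stores ~2 bytes per parameter; `headroom` is an assumed
    multiplier covering activations, KV cache, and runtime overhead.
    """
    weights_gb = params_billions * bytes_per_param  # 1e9 params * bytes / 1e9
    return math.ceil(weights_gb * headroom)

# A 3B FP16 model comes out to ~9 GB by this estimate; the docs'
# suggested 15GB leaves an extra safety margin on top of that.
print(estimate_gpu_memory_gb(3))   # -> 9
print(estimate_gpu_memory_gb(8))   # -> 24
```

Rounding up generously, as the docs do, avoids a second `TornadoOutOfMemoryException` from allocations the weight count alone does not account for.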