Commit 817e63b

doc: benchmark description
1 parent 2a28c61 commit 817e63b

1 file changed

+4 -0 lines changed


README.md

Lines changed: 4 additions & 0 deletions
@@ -48,6 +48,10 @@
 
 BigCodeBench is an **_easy-to-use_** benchmark for solving **_practical_** and **_challenging_** tasks via code. It aims to evaluate the true programming capabilities of large language models (LLMs) in a more realistic setting. The benchmark is designed for HumanEval-like function-level code generation tasks, but with much more complex instructions and diverse function calls.
 
+There are two splits in BigCodeBench:
+- `Complete`: This split is designed for code completion based on the comprehensive docstrings.
+- `Instruct`: The split works for the instruction-tuned and chat models only, where the models are asked to generate a code snippet based on the natural language instructions. The instructions only contain necessary information, and require more complex reasoning.
+
 ### Why BigCodeBench?
 
 BigCodeBench focuses on task automation via code generation with *diverse function calls* and *complex instructions*, with:
