Commit 9bcdbc7

Add note about label noise in CIFAR-10 dataset documentation (#21855)
The CIFAR-10 dataset is known to contain a small percentage of mislabeled samples, which can affect model training and evaluation. This note helps users understand that some label inconsistencies are expected and inherent to the original dataset.

Fixes #21631

Signed-off-by: Samaresh Kumar Singh <ssam3003@gmail.com>
1 parent 529e162 commit 9bcdbc7

1 file changed: +5 -0 lines changed

keras/src/datasets/cifar10.py

Lines changed: 5 additions & 0 deletions
@@ -59,6 +59,11 @@ def load_data():
     assert y_train.shape == (50000, 1)
     assert y_test.shape == (10000, 1)
     ```
+
+    **Note**: The CIFAR-10 dataset is known to have a small percentage of
+    mislabeled samples, which is inherent to the original dataset. This label
+    noise may impact training and evaluation. For more details, refer to
+    discussions in the research literature on CIFAR-10 label quality.
     """
     dirname = "cifar-10-batches-py-target"
     origin = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
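
The effect described in the added note can be sketched with synthetic labels. The snippet below is an illustration only, not part of this commit: `flip_labels` is a hypothetical helper, the labels are randomly generated stand-ins with the same `(10000, 1)` shape as `y_test`, and the `noise_rate` of 0.01 is an arbitrary assumption rather than a measured property of CIFAR-10. It shows that even a classifier that always predicts the true class scores below 100% when the recorded labels contain errors.

```python
import numpy as np


def flip_labels(labels, noise_rate, num_classes=10, seed=0):
    """Return a copy of `labels` with a fraction `noise_rate` moved to a different class.

    Hypothetical helper for illustration; it is not a Keras API.
    """
    rng = np.random.default_rng(seed)
    flat = labels.ravel().copy()
    n_flip = int(round(noise_rate * flat.size))
    idx = rng.choice(flat.size, size=n_flip, replace=False)
    # An offset in [1, num_classes - 1] modulo num_classes always yields a different class.
    offsets = rng.integers(1, num_classes, size=n_flip)
    flat[idx] = (flat[idx] + offsets) % num_classes
    return flat.reshape(labels.shape)


# Synthetic stand-in for the test labels; these are NOT the real CIFAR-10 labels.
rng = np.random.default_rng(42)
true_labels = rng.integers(0, 10, size=(10000, 1))

noise_rate = 0.01  # assumed value, chosen only for illustration
recorded_labels = flip_labels(true_labels, noise_rate)

# A classifier that always predicts the true class still disagrees with the flipped labels.
perfect_predictions = true_labels
measured_accuracy = float(np.mean(perfect_predictions == recorded_labels))
print(f"measured accuracy of a perfect classifier: {measured_accuracy:.4f}")
```

Under these assumptions the measured accuracy is capped at roughly `1 - noise_rate`, which is why small differences between models near that ceiling should be interpreted with care.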
