diff --git a/datasets/index.md b/datasets/index.md
index d2ac4f3..5cb34c9 100644
--- a/datasets/index.md
+++ b/datasets/index.md
@@ -34,7 +34,7 @@ The datasets here should not require sign-up for web services or writing emails

Java GitHub corpus

-This dataset includes about 14'000 Java files from GitHub, split into training and test set.
+This dataset includes about 14'000 Java projects from GitHub, split into training and test set.
The files are from open source projects that have been forked at least once.
[download dataset]