It would be really cool to sample the Wikipedia hyperlink graph.
However, Wikipedia asks: "Please do not use a web crawler to download large numbers of articles. Aggressive crawling of the server can cause a dramatic slow-down of Wikipedia."
https://en.wikipedia.org/wiki/Wikipedia:Database_download#Please_do_not_use_a_web_crawler
So, would it be OK if we limited the number of pages downloaded? I don't know what a good limit is. Is 50k too high?
Alternatively, the page linked above describes how to download the data in bulk.
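
For the capped option, here is a rough sketch of what I have in mind: a breadth-first sample that stops after a fixed number of page fetches and pauses between requests. Everything here is hypothetical and only for illustration: the names `sample_link_graph` and `fetch_links`, the `max_pages` default, and the one-second delay are my assumptions, not anything Wikipedia recommends. The function that actually retrieves a page's links is passed in, so the same loop would work against the live site or a local dump.

```python
import time
from collections import deque
from typing import Callable, Dict, Iterable, List


def sample_link_graph(
    seed: str,
    fetch_links: Callable[[str], Iterable[str]],  # hypothetical: returns titles linked from a page
    max_pages: int = 1000,   # assumed cap on the number of pages fetched
    delay_s: float = 1.0,    # assumed pause between fetches, to stay polite
) -> Dict[str, List[str]]:
    """Breadth-first sample of a link graph, fetching at most max_pages pages."""
    graph: Dict[str, List[str]] = {}
    queue = deque([seed])
    seen = {seed}
    while queue and len(graph) < max_pages:
        title = queue.popleft()
        links = list(fetch_links(title))
        graph[title] = links
        # Queue each newly discovered page exactly once.
        for target in links:
            if target not in seen:
                seen.add(target)
                queue.append(target)
        if delay_s > 0:
            time.sleep(delay_s)
    return graph
```

To try it without touching the network, `fetch_links` can be a lookup into a small dict, e.g. `sample_link_graph("A", lambda t: toy.get(t, []), max_pages=3, delay_s=0)`. For real data, `fetch_links` could presumably wrap the MediaWiki API's `prop=links` query or read from a downloaded dump; I have not settled on either. Note that a breadth-first sample stays close to the seed page, so it will not be a uniform sample of the graph.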