Heykuki News
The Netflix Prize (www.netflixprize.com) is one obvious one, but it's only one dataset.
Project Gutenberg is another, where you can get large amounts of text: http://www.gutenberg.org/wiki/Main_Page
Datasets for medicine would be greatly appreciated.