Ethan Zuckerman and his student Matt Stempeck at MIT's Center for Civic Media are undertaking this massive project themselves, using a combination of human and computer labor to categorize many terabytes of news stories. Zuckerman hopes to eventually build a product that lets users track their "news nutrition."
I post this not to be self-promotional but because I think it would be of interest to this community... and I'm curious to hear how you all would attack this problem.