I can think of a few things to look at (grammatical constructs, cliques, semantic graphs), but I'm curious to see what fellow HNers might come up with.
What would you investigate if you had the full text and metadata for every Wikipedia page, and a lot of time on your hands?