curl "http://news.ycombinator.com/item?id=837790" | grep -o "http[^\"]*" | grep wsj
and hence this: curl "http://online.wsj.com/article/SB125354895752528171.html#mod=WSJ_hps_sections_smallbusiness" | less
I could then read the text, but it struck me just how much other crap there was, so I saved it: curl "http://online.wsj.com/article/SB125354895752528171.html#mod=WSJ_hps_sections_smallbusiness" > wsj0
and edited out everything except the byline, text, and legal phrasing at the end. I stored the result in file wsj1. The result:
# ll wsj?
-rw-r--r-- 1 RiderOfGiraffes users 126701 2009-09-22 21:41 wsj0
-rw-r--r-- 1 RiderOfGiraffes users 5042 2009-09-22 21:44 wsj1
Yup. 5K of text in 125K of page. I'm glad I don't use dialup anymore.
And the article wasn't worth reading anyway.
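For anyone who wants to put a number on it, here is a rough sketch of the arithmetic. The byte counts are copied from the listing above; the variable names are just mine, and with the files on disk you could fill them in with `wc -c < wsj0` and `wc -c < wsj1` instead.

```shell
# Share of the downloaded page that is actual article text.
# Byte counts are hard-coded from the directory listing above.
page_bytes=126701   # size of wsj0 (the full page)
text_bytes=5042     # size of wsj1 (byline, text, legal phrasing)

# Shell arithmetic is integer-only, so let awk do the division.
ratio=$(awk -v t="$text_bytes" -v p="$page_bytes" 'BEGIN { printf "%.1f", 100 * t / p }')

echo "${ratio}% of the page is article text"
```

So roughly one byte in twenty-five is something I actually wanted to read.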