- DB Size: 543 million rows
- Data Size: 173GB (uncompressed)
- Stored in MySQL
- 200+ million tweets from 13+ million users
- Collected in 1 week
- Operating costs: $100+
- Rackspace Cloud: 1 CentOS server with 8GB RAM
- Java, memcached, MySQL and Perl for core processing
- JavaScript and PHP for analytics & visualization
- Download the data at http://www.archive.org/details/2011-06-calufa-twitter-sql