This morning around 8:00 UTC Nagios alerted me that the thread count on a number of Apache Tomcat servers I support for a client had started to rise dramatically. I discovered that a library that is part of the application was trying to fetch DTDs such as http://java.sun.com/dtd/properties.dtd from java.sun.com. That host is currently unreachable, so each request thread was taking 30+ seconds to time out. I worked around the problem by adding an iptables rule for 192.9.162.55 port 80 that rejects the connection immediately, and told the client to look for a permanent solution.
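A rule along these lines would have the effect described. Treat it as a sketch rather than the exact rule in use: the helper name block_dtd_host is made up for illustration, and it assumes the connections originate on the Tomcat host itself (hence the OUTPUT chain) and that it is run as root.

```shell
# Hypothetical helper: make outbound connections to the DTD host fail fast.
# Assumes the traffic originates on this machine, so the rule goes in OUTPUT.
block_dtd_host() {
    ip="${1:-192.9.162.55}"
    # REJECT with a TCP reset makes connect() fail at once instead of
    # waiting for the SYN to time out.
    iptables -I OUTPUT -p tcp -d "$ip" --dport 80 -j REJECT --reject-with tcp-reset
}

# Usage (as root): block_dtd_host 192.9.162.55
```

Rejecting rather than dropping is the point here: a DROP rule would leave the threads waiting just as they do now.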
In case the library can't be fixed (I don't know who developed it), I'm planning to put the relevant DTD files on a local HTTP server and redirect all requests for java.sun.com to that virtual host (e.g. through relevant entries in the hosts file).
If your application servers or other software process XML documents, check whether they are affected. One way to check:
$ netstat -ant | grep -E '192\.9\.162\.55:80[[:space:]]+SYN_SENT' | wc -l
184
$
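A variant that tallies connections to that address by state may be handy, since it keeps working whatever state the connections end up in. This is only a sketch: count_dtd_conns is a hypothetical helper name, and it assumes Linux-style `netstat -ant` output on stdin, with the foreign address in column 5 and the state in column 6.

```shell
# Hypothetical helper: count TCP connections to a given address:port by state.
# Reads `netstat -ant` style output on stdin (Linux column layout assumed:
# column 5 = foreign address, column 6 = state).
count_dtd_conns() {
    target="${1:-192.9.162.55:80}"
    awk -v target="$target" '
        # Only look at TCP rows whose foreign address matches exactly.
        $1 ~ /^tcp/ && $5 == target { count[$6]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage: netstat -ant | count_dtd_conns 192.9.162.55:80
```

Matching the address with an exact string comparison also avoids the regex treating the dots in the IP as wildcards.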
SYN_SENT is because 192.9.162.55 (java.sun.com) is not responding at the moment. If it becomes reachable again, just s/[[:space:]]+SYN_SENT// .
Relevant URLs for more information:
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/
http://www.oasis-open.org/committees/entity/spec-2001-08-06.html
And a previous discussion here about the same problem:
http://news.ycombinator.com/item?id=3094075