Or looking for known, fixed vulnerabilities on servers that should know better (and several that shouldn't)
 

First Steps

We now have a primitive spider working! Our code, which uses the Scrapy toolkit, is at https://github.com/tbook/comp527-serversurvey (pretty ugly at this point, but hopefully we will have something nice by the end of the semester).

We made an initial attempt to crawl the Alexa top 500 and gather some basic data from the server headers.  You can see a survey of our initial results here.

The variety of interesting headers sent back by the servers we encountered prompted a slight change in our methodology: we will now log all server headers, which will let us assemble a fairly complete directory of the headers in use and use them to classify servers. We will also examine the date stamps to survey how many servers have their date correctly configured.
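As a rough illustration of that revised approach (a sketch, not the code in the repository), the snippet below tallies which header names appear across a set of responses and checks whether a Date header is close to the crawler's own clock. The function names, the dict-per-response input shape, and the five-minute tolerance are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def tally_header_names(responses):
    """Count how many responses carry each header name (case-insensitive).

    `responses` is assumed to be an iterable of {header name: value} dicts.
    """
    counts = Counter()
    for headers in responses:
        # Use a set so a header is counted once per response.
        counts.update({name.lower() for name in headers})
    return counts


def date_skew(date_header, now=None):
    """Return server time minus our time, or None if the header is unparseable."""
    now = now or datetime.now(timezone.utc)
    try:
        server_time = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return None
    if server_time.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption).
        server_time = server_time.replace(tzinfo=timezone.utc)
    return server_time - now


def date_is_plausible(date_header, now=None, tolerance=timedelta(minutes=5)):
    """True if the Date header is within `tolerance` of our clock."""
    skew = date_skew(date_header, now)
    return skew is not None and abs(skew) <= tolerance
```

The tolerance is arbitrary. A real survey would probably record the raw skew and pick a threshold afterwards, since network latency and the crawler's own clock both add noise.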

Some initial insights:

  • It seems that many servers are (understandably) guarded about sharing version information.  Many servers don’t give the version, and some don’t even share the name of the server.  Several return the helpful string “server” or “confidential”.
  • There is quite a variety of servers “in the wild.”  Apache has the largest share, but we also observed aris, BWS, GSE, gws, IBM, lighttpd, Microsoft-IIS, nginx, Netscape, PWS, Sun-Java-System-Web-Server, Tengine, and others.
  • Reddit.com seems to be trying a SQL exploit.  Their server string is: “'; DROP TABLE servertypes; --”
  • Roughly 3/4 of servers provide charset information, and the values vary widely.  UTF-8 is the most common, but ISO-8859-1, GB2312, GBK, windows-1251, windows-1256, EUC-JP, EUC-KR, Shift_JIS, and Big5 also appear.  gsmarena.com uses “None” for its charset, apparently giving you the freedom to interpret their content in the way that you find personally most satisfying.
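The observations above come down to pulling a few fields out of raw header values. Here is a minimal sketch of how that could be done with the Python standard library. The helper names `parse_server` and `parse_charset` are hypothetical, and the regular expression only looks at the first product token of the Server header, which is a simplification.

```python
import re
from email.message import Message

# First "name/version" token of a Server header; the version is optional.
_PRODUCT = re.compile(r"^\s*([^\s/()]+)(?:/(\S+))?")


def parse_server(value):
    """Split a Server header into (name, version); either may be None."""
    if not value:
        return (None, None)
    match = _PRODUCT.match(value)
    if not match:
        return (None, None)
    return (match.group(1), match.group(2))


def parse_charset(content_type):
    """Return the lowercased charset from a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() handles quoting and lowercases the result.
    return msg.get_content_charset()


if __name__ == "__main__":
    for raw in ["Apache/2.2.22 (Ubuntu)", "nginx", "Microsoft-IIS/7.5", "server"]:
        print(raw, "->", parse_server(raw))
    print(parse_charset("text/html; charset=UTF-8"))
```

Note that a parser like this will happily return “none” for a header such as `text/html; charset=None`, so any classification step still has to decide which values count as real charsets.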

That’s where we are right now.  Look for more updates in the coming weeks!
