Or looking for known, fixed vulnerabilities on servers that should know better (and several that shouldn't)
 

Some real progress

Over the weekend, Martha and I added features to our spider so that it gathers the information we need for our analysis. The spider now issues a variety of different requests to each server, which lets us compare responses across servers and compute a unique “fingerprint” for each server configuration. I am currently running the spider against the Alexa Top 10,000 to build a dataset large enough for some initial analysis and to begin identifying potential security weaknesses that may be out there “in the wild.”

Here are the requests that the current spider makes against each server:

  • An ordinary GET request against the root URL
  • A partial GET request of 50 bytes against the root URL
  • A conditional GET request for pages modified after a future date against the root URL
  • A HEAD request against the root URL
  • An OPTIONS request
  • A TRACE request against the root URL
  • A request for the root URL as a CSS stylesheet
  • A request for robots.txt as a CSS stylesheet
  • A request for a relative URL below the root directory
  • A request for the favicon
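
To make the list above concrete, here is a minimal sketch of how the ten probes might be described as plain data before they are sent. It is not our spider's actual code. The function name `build_probes`, the labels, the far-future date, the use of an `Accept: text/css` header to ask for a resource "as a CSS stylesheet", the `/../` relative path, the `/favicon.ico` location, and sending OPTIONS to the root URL are all assumptions made for illustration.

```python
def build_probes(root_url, relative_path="/../"):
    """Return a list of (label, method, url, headers) tuples for one server.

    relative_path is a hypothetical stand-in for the relative URL probe.
    """
    base = root_url.rstrip("/")
    root = base + "/"
    # Any date safely in the future works; this particular one is an assumption.
    future = "Fri, 01 Jan 2100 00:00:00 GMT"
    # Assumed way of requesting a resource "as a CSS stylesheet".
    css = {"Accept": "text/css"}
    return [
        ("get", "GET", root, {}),
        # bytes=0-49 is an inclusive range, i.e. exactly 50 bytes.
        ("partial", "GET", root, {"Range": "bytes=0-49"}),
        ("conditional", "GET", root, {"If-Modified-Since": future}),
        ("head", "HEAD", root, {}),
        ("options", "OPTIONS", root, {}),
        ("trace", "TRACE", root, {}),
        ("root-as-css", "GET", root, dict(css)),
        ("robots-as-css", "GET", base + "/robots.txt", dict(css)),
        ("relative", "GET", base + relative_path, {}),
        ("favicon", "GET", base + "/favicon.ico", {}),
    ]


if __name__ == "__main__":
    for label, method, url, headers in build_probes("http://example.com"):
        print(label, method, url, headers)
```

Keeping the probes as data rather than as ten separate functions means the same sending and recording code can handle every request.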

For each of these requests, we record the following fields:

  • The server version string
  • The response content type
  • The date of the response
  • The request method
  • The request URL
  • The response URL (often different due to redirects)
  • The request headers
  • The complete response headers
  • The reported content length
  • The actual length of the body (although there may be some character encoding issues that make our representation inaccurate)
  • The response status code
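
The fields above could be held in a record along the following lines. This is a sketch rather than our actual storage format: the class name `ProbeRecord` and its attribute names are hypothetical. Storing the body as raw bytes is one possible way to avoid the character encoding problem mentioned above, since the byte count can then be compared directly with the reported content length.

```python
from dataclasses import dataclass


@dataclass
class ProbeRecord:
    request_method: str
    request_url: str
    response_url: str        # may differ from request_url after redirects
    status_code: int
    request_headers: dict
    response_headers: dict   # the complete response headers
    body: bytes              # raw bytes, so the length is not affected by decoding

    def _header(self, name):
        # HTTP header names are case-insensitive.
        for key, value in self.response_headers.items():
            if key.lower() == name.lower():
                return value
        return None

    @property
    def server(self):
        return self._header("Server")

    @property
    def content_type(self):
        return self._header("Content-Type")

    @property
    def date(self):
        return self._header("Date")

    @property
    def reported_length(self):
        value = self._header("Content-Length")
        if value is not None and value.strip().isdigit():
            return int(value)
        return None

    @property
    def actual_length(self):
        return len(self.body)
```

A mismatch between `reported_length` and `actual_length` is then easy to flag for a closer look.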

We have also written some analysis software in Python that catalogs the different responses from servers reporting the same server version string. This may help us identify servers that send a false version string, as well as distinct configurations among servers that report the same software.
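
As a rough illustration of that cataloging step, the sketch below groups servers by their reported version string and counts how many distinct response patterns appear within each group. The fingerprint used here (status code plus the set of response header names for each probe) is only an assumption for the example, not necessarily what our analysis computes, and the names `fingerprint` and `catalog` are hypothetical.

```python
from collections import Counter, defaultdict


def fingerprint(responses):
    """Reduce one server's responses to a hashable, order-independent value.

    responses maps a probe label to (status_code, list_of_header_names).
    """
    return tuple(sorted(
        (label, status, tuple(sorted(name.lower() for name in header_names)))
        for label, (status, header_names) in responses.items()
    ))


def catalog(servers):
    """Group fingerprints by reported server string.

    servers is an iterable of (server_string, responses) pairs.
    Returns {server_string: Counter({fingerprint: count})}.
    """
    groups = defaultdict(Counter)
    for server_string, responses in servers:
        groups[server_string][fingerprint(responses)] += 1
    return groups


if __name__ == "__main__":
    sample = [
        ("Apache", {"trace": (200, ["Server", "Date"])}),
        ("Apache", {"trace": (405, ["Server", "Date"])}),
        ("nginx", {"trace": (405, ["Server"])}),
    ]
    for server_string, counts in catalog(sample).items():
        print(server_string, "distinct fingerprints:", len(counts))
```

A version string whose group contains a rare fingerprint that is common under some other version string would be a candidate for a falsified banner.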

We are looking forward to continuing with our analysis, and seeing what meaningful data we are able to extract from our results.
