Or looking for known, fixed vulnerabilities on servers that should know better (and several that shouldn't)
 

Ongoing Analysis

In addition to accomplishing our stated goal of eating turkey, Martha and I made some real progress last week. After a futile struggle to make use of the Weka data mining software, I returned to Python and produced a script that uses Bayesian probability to calculate the likelihood of each server type given a particular response.
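
For readers curious about the calculation, here is a minimal sketch of the idea, not the actual script. It assumes we already have a list of (server type, response) observations; the function name, the sample data, and the response labels are all made up for illustration. Bayes' rule then gives P(type | response) = P(response | type) × P(type) / P(response).

```python
from collections import Counter

def posterior(observations, response):
    """Return P(server_type | response) for every server type, via Bayes' rule.

    observations: list of (server_type, response) pairs (hypothetical data).
    """
    type_counts = Counter(server for server, _ in observations)
    joint_counts = Counter(observations)
    total = len(observations)

    # Unnormalized posterior for each type: P(response | type) * P(type)
    scores = {}
    for server, n in type_counts.items():
        prior = n / total
        likelihood = joint_counts[(server, response)] / n
        scores[server] = likelihood * prior

    evidence = sum(scores.values())  # P(response)
    if evidence == 0:
        return {}  # this response was never observed
    return {server: score / evidence for server, score in scores.items()}

# Made-up observations, purely for illustration
observations = [
    ("Apache", "400"), ("Apache", "400"), ("Apache", "501"),
    ("IIS", "400"), ("IIS", "404"), ("IIS", "404"),
]
result = posterior(observations, "400")
print(result)
```

With this toy data, a "400" response makes Apache about twice as likely as IIS (roughly 0.67 versus 0.33), because Apache produced that response twice as often.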

Building on that, we completed an analysis program that predicts the server type of any given site based on the responses that the server gives to our queries. Although more analysis remains to be done, the initial results look promising. In most cases our software agrees with the reported server type, but in a sizable number of cases the results differ. For example, most servers identifying themselves as IBM_HTTP_SERVER were recognized as Apache. At first this appeared to be a weakness in the software, but as it turns out, IBM HTTP Server is in fact rebranded Apache, so this result showed that the program was behaving correctly.
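
One way to combine the responses to several queries into a single prediction is a naive Bayes classifier. The sketch below is an illustration under stated assumptions rather than our actual program: it treats the responses to different queries as independent given the server type, uses add-one smoothing so an unseen response does not zero out a candidate, and the probe names (`bad_method`, `long_uri`) and training samples are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (server_type, {probe_name: response}) pairs."""
    type_counts = Counter()
    response_counts = defaultdict(Counter)  # (server_type, probe) -> response counts
    vocab = defaultdict(set)                # probe -> set of responses seen
    for server, responses in samples:
        type_counts[server] += 1
        for probe, resp in responses.items():
            response_counts[(server, probe)][resp] += 1
            vocab[probe].add(resp)
    return type_counts, response_counts, vocab

def predict(model, responses):
    """Return the server type with the highest posterior log-probability."""
    type_counts, response_counts, vocab = model
    total = sum(type_counts.values())
    best, best_score = None, -math.inf
    for server, n in type_counts.items():
        score = math.log(n / total)  # log prior
        for probe, resp in responses.items():
            seen = response_counts[(server, probe)]
            observed = sum(seen.values())
            # Add-one smoothing; the extra +1 reserves mass for unseen responses
            k = len(vocab[probe]) + 1
            score += math.log((seen[resp] + 1) / (observed + k))
        if score > best_score:
            best, best_score = server, score
    return best

# Hypothetical training data: labels and the responses each server gave
samples = [
    ("Apache", {"bad_method": "501", "long_uri": "414"}),
    ("Apache", {"bad_method": "501", "long_uri": "414"}),
    ("IIS", {"bad_method": "400", "long_uri": "404"}),
    ("IIS", {"bad_method": "400", "long_uri": "414"}),
]
model = train(samples)
guess = predict(model, {"bad_method": "501", "long_uri": "414"})
print(guess)
```

Because the prediction depends only on how the server behaves, a site that labels itself IBM_HTTP_SERVER but responds the way Apache does would come out as Apache under a scheme like this.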

Sites that refused to provide a server type were generally identified with one of the major server vendors. More interesting is a significant minority of sites that report using one of the major servers, but which our program identifies as using another. Do these cases represent errors on the part of our analysis, or do they represent a pattern of some site administrators intentionally providing false version strings? Given the number of obviously false version strings, it seems likely that, at least in some cases, the latter is taking place.
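
To see where the reported and predicted types disagree, the results can be tallied into a simple cross-tabulation. This is only a sketch: the function name, the site names, and the rows are invented for illustration, and it assumes each result is a (site, reported type, predicted type) triple with `None` for sites that reported nothing.

```python
from collections import Counter

def cross_tabulate(results):
    """Count (reported, predicted) pairs and collect sites where they differ.

    results: iterable of (site, reported_type, predicted_type);
             reported_type is None when the site withheld its server type.
    """
    table = Counter()
    disagreements = []
    for site, reported, predicted in results:
        table[(reported or "(none)", predicted)] += 1
        # Sites that reported nothing are not counted as disagreements
        if reported is not None and reported != predicted:
            disagreements.append(site)
    return table, disagreements

# Invented rows, purely for illustration
results = [
    ("a.example", "Apache", "Apache"),
    ("b.example", "IIS", "Apache"),
    ("c.example", None, "IIS"),
    ("d.example", "Apache", "Apache"),
]
table, disagreements = cross_tabulate(results)
for (reported, predicted), count in sorted(table.items()):
    print(f"{reported:10} -> {predicted:10} {count}")
```

The off-diagonal cells of such a table are the interesting ones: each is either a classifier error or a site reporting a server type it is not actually running.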

This week, we will continue our analysis. Stay tuned for more results!
