A significant spike in web traffic today, a lot of it down to requests like this one (the URI is good, so why the 404?):
61.175.139.6 - - [02/Oct/2007:00:44:59 -0700] "GET /wordpress/index.php/archives/2005/04/11/your-wish-is-my-command/#comment-332774 HTTP/1.1" 404 32880 "http://www.paulbeard.org/wordpress/index.php/archives/2005/04/11/your-wish-is-my-command/" "Opera/7.02 Bork-edition (Windows NT 5.0; U) [en]"
And these, about 9,000 of them, from some rogue web crawler:
75.126.214.21 - - [02/Oct/2007:18:35:14 -0700] "GET /wordpress/index.php/page/54/ HTTP/1.1" 200 40581 "-" "Jakarta Commons-HttpClient/3.0.1"
Annoyances. I don’t get the first one at all. And I don’t get why I have 12,000 lines of data but fewer than 200 of them are recognized by this regexp:
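counts = Hash.new(0)  # tally of requests per path; unseen keys start at 0
ARGF.each_line do |line|
  # capture the request path that follows "GET "
  if line =~ %r{GET ([^ .]+) }
    puts $1
    counts[$1] += 1
  end
end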
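A likely culprit, for what it's worth: the character class [^ .] excludes periods as well as spaces, so any path with a dot in it (which is everything under /wordpress/index.php/) can never reach the trailing space and the match fails. Here is a minimal sketch of a looser pattern, assuming the only goal is to grab whatever sits between "GET " and the next space. The count_paths helper and the REQUEST_PATH constant are names made up for this illustration, not part of the original script.

```ruby
# Sketch only: count_paths and REQUEST_PATH are hypothetical names.
# \S+ matches any run of non-whitespace, so the dot in "index.php"
# no longer stops the match the way [^ .]+ does.
REQUEST_PATH = %r{GET (\S+) }

def count_paths(lines)
  counts = Hash.new(0)
  lines.each do |line|
    m = REQUEST_PATH.match(line)
    counts[m[1]] += 1 if m
  end
  counts
end

# One of the crawler lines from above, with plain ASCII quotes and hyphens.
sample = [
  '75.126.214.21 - - [02/Oct/2007:18:35:14 -0700] "GET /wordpress/index.php/page/54/ HTTP/1.1" 200 40581 "-" "Jakarta Commons-HttpClient/3.0.1"'
]

old_hits = sample.grep(%r{GET ([^ .]+) }).size  # the original pattern
new_counts = count_paths(sample)
puts "old pattern matched #{old_hits}, new pattern matched #{new_counts.size}"
```

On that sample line the original pattern finds nothing and the looser one picks up the full path, which would fit with only the dot-free requests being counted.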