Beautiful Code is a new book from O’Reilly that illustrates how programmers solve problems. When you think about it, computers are really just fast calculators or adding machines, so how do programmers make them edit photographs or run web servers?
Perhaps the rest of the book covers that, but Tim Bray’s chapter on search is available for download and I took a look at his examples. Hey, I actually learned something.
Like him, I wanted to run a quick report on the most popular nuggets I have put up. For the year so far, here’s the top ten (with timings):
2374: /wordpress/index.php/archives/2006/11/01/links-for-2006-11-02/
2074: /wordpress/index.php/archives/2004/05/29/abu-ghraib-scandal-rooted-in-porn/
1557: /wordpress/index.php/archives/2003/12/06/a-compromise-on-who-gets-their-likeness-on-the-dime
1428: /wordpress/index.php/archives/2005/04/11/your-wish-is-my-command/
1009: /wordpress/index.php/archives/2006/05/03/post-op/
774: /wordpress/index.php/archives/2005/03/17/killing-your-own-baby/
727: /wordpress/index.php/archives/2006/02/26/top-ten-hits-c/
667: /wordpress/index.php/archives/2004/08/16/solved-migrating-data-from-yahoo-calendar-to-ical/feed/
635: /wordpress/index.php/archives/2006/08/05/320-gb-drive-for-90/
570: /wordpress/index.php/archives/2002/07/12/windows-xp-bah/
2.67 real 2.02 user 0.08 sys
And for only stuff generated in 2007:
465: /wordpress/index.php/archives/2007/02/01/links-for-2007-02-01/
268: /wordpress/index.php/archives/2007/01/18/links-for-2007-01-18/
209: /wordpress/index.php/archives/2007/03/10/bleg-os-x-drivers-for-issc-bluetooth-dongle/
198: /wordpress/index.php/archives/2007/01/13/making-dashboard-widgets-more-useful/
174: /wordpress/index.php/archives/2007/04/17/omnibus-i-got-run-over-by-the/
163: /wordpress/index.php/archives/2007/03/01/inspiration/
151: /wordpress/index.php/archives/2007/01/26/restoring-a-verizon-v3cs-obexbluetooth-functionality/
145: /wordpress/index.php/archives/2007/05/02/necessary-evils-backups/
134: /wordpress/index.php/archives/2007/05/25/xml-rpc-server-accepts-post-requests-only/
129: /wordpress/index.php/archives/2007/04/28/links-for-2007-04-28/
2.03 real 1.40 user 0.06 sys
These are all in Ruby, and the last example runs over almost 350,000 lines of data. Not bad at 2 seconds.
The code is copied and pasted, with the regex modified to match what I use:
counts = {}
counts.default = 0   # unseen keys start at zero
# Tally every 2007 archive URL requested in the log
ARGF.each_line do |line|
  if line =~ %r{GET (/wordpress/index.php/archives/2007/\d\d/\d\d[^ .]+) }
    counts[$1] += 1
  end
end
# Sort the URLs by descending hit count and print the top ten
keys_by_count = counts.keys.sort { |a, b| counts[b] <=> counts[a] }
keys_by_count[0 .. 9].each do |key|
  puts "#{counts[key]}: #{key}"
end
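Since I ran the report twice, once for everything and once for 2007 only, the same logic could probably be wrapped in a small method that takes the year as an argument. What follows is a rough sketch, not anything from the chapter: the method name top_paths, its year and limit parameters, and the four-digit pattern used when no year is given are all my own assumptions.

```ruby
# Hypothetical helper (not from the book): count hits per archive URL,
# optionally restricted to a single year.
def top_paths(lines, year: nil, limit: 10)
  # Assumption: with no year given, any four-digit year is accepted.
  year_pattern = year ? Regexp.escape(year.to_s) : '\d\d\d\d'
  # Unlike the script above, the dot in index.php is escaped here.
  pattern = %r{GET (/wordpress/index\.php/archives/#{year_pattern}/\d\d/\d\d[^ .]+) }

  counts = Hash.new(0)            # missing keys start at zero
  lines.each do |line|
    path = line[pattern, 1]       # first capture group, or nil if no match
    counts[path] += 1 if path
  end

  # max_by(n) returns the n largest [path, count] pairs, largest first.
  counts.max_by(limit) { |_path, count| count }
end

# Possible usage against a log file named on the command line:
#   top_paths(ARGF.each_line, year: 2007).each do |path, count|
#     puts "#{count}: #{path}"
#   end
```

Passing year: nil (the default) would cover the first report and year: 2007 the second, so the regex only has to be written once.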
He explains the magic better than I can, but Ruby looks like an elegant mixture of power and ease of use. I have tried to pick it up before, but programming is not where whatever talents I have lie. Still, this kind of problem has always interested me, so perhaps I’ll explore it a bit further.