treehugging

Some environmentalists gave a presentation at my kids’ school today and we have been talking about the Things We Can Do to Make a Difference, to help preserve the salmon and the orcas (the two totem animals of our region).

Most of the suggestions are things we already do, and have done since before our inquisitors arrived. But it’s interesting to go over it all.

The only real 24/7 usage I can see cutting down on is the CPUs churning away downstairs: I’m not necessarily using them, but they represent my small contribution to the conversation, and I suspect they cost less to run than other devices.

broadband » SETI@Home FAQ 1.3 Advanced Information:

300W PSU
Assumed maximum demand due to losses: 375W
In one calendar month this would consume 270.90 kWh; at a cost of $0.06 per kWh, the maximum cost is $16.25

SCL (my local utility) says it charges 8.05¢/kWh, so a PC with a 300W power supply would cost about $22 a month here (270.9 kWh at 8.05¢ works out to roughly $21.80): call it 70 cents a day for a system that’s not all that loaded.


Now playing: Keito by Ali Farka Touré with Ry Cooder from the album “Talking Timbuktu”

worth a read, if you think a retired General knows anything about war and its consequences

Nieman Watchdog > Commentary > Iraq through the prism of Vietnam:

Those who say Iraq is nothing like Vietnam have another guess coming, says retired Gen. William Odom. He lists striking similarities and asserts that only after it pulls out of Iraq can the U.S. hope for international support to deal with anti-Western forces.

Of course, he cites a lot of pesky facts that contradict the reality we have been told so much about these past five years.

resolution

This seems to work for what I have in mind: subsecond returns (though with a pretty small dataset).

DROP TABLE IF EXISTS `access_log`;
-- FULLTEXT indexes need MyISAM, so the engine is explicit
CREATE TABLE `access_log` (
  id mediumint NOT NULL AUTO_INCREMENT,
  agent varchar(255),
  bytes_sent int(10),
  child_pid smallint(5) unsigned,
  cookie varchar(255),
  machine_id varchar(25),
  request_file varchar(255),
  referer varchar(255),
  remote_host varchar(50),
  remote_logname varchar(50),
  remote_user varchar(50),
  request_duration smallint(5) unsigned,
  request_line varchar(255),
  request_method varchar(10),
  request_protocol varchar(10),
  request_time varchar(28),
  request_uri varchar(255),
  request_args varchar(255),
  server_port smallint(5) unsigned,
  ssl_cipher varchar(25),
  ssl_keysize smallint(5) unsigned,
  ssl_maxkeysize smallint(5) unsigned,
  status smallint(5) unsigned,
  time_stamp timestamp,
  virtual_host varchar(255),
  PRIMARY KEY (id),
  INDEX ip_timestamp (remote_host, time_stamp),
  FULLTEXT KEY request_uri (request_uri)
) ENGINE=MyISAM;

So the ultimate goal of this is to be able to pull out the top 10 or 25 most-requested URLs. I’m guessing I need to grab the URLs (more likely the entry numbers) and then query the wp database for the title, then write out my list of URLs, each linked by title, with its count.
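Something like this is probably the shape of the eventual query (a sketch only: it assumes WordPress-style ?p=123 permalinks and the stock wp_posts table, neither of which I’ve checked against my install):

-- hypothetical: top 25 URIs with their WordPress titles,
-- assuming the post ID turns up in request_args as p=NNN
select p.post_title, a.request_uri, count(*) as hits
from access_log a
join wp_posts p on p.ID = substring_index(a.request_args, 'p=', -1)
group by a.request_uri, p.post_title
order by hits desc
limit 25;

Or the title lookup could happen as a second step from PHP, as described above.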

Right now, I am loading the data for 2006 for some more tests. Then I’ll pick this up again.

Thanks for all your help.
<update> I’m not sure this is doable, given the constraints of my systems here. Once the dataset gets to a meaningful size (like 300k records, this year’s volume so far), queries are quite slow. It shouldn’t take 30+ seconds to boil a 300k-row table down to 10 rows. I’m sure it’s tunable, or something a competent database designer would fix; I’ll get back to it after a while.

index, from the creation

Second whack at this. Suggestions welcome . . .

  1. Made the id an index
  2. Removed the NULLs
  3. Removed the extraneous tables I don’t use
-- MySQL dump 10.9
--
-- Host: localhost    Database: apachelogs
-- ------------------------------------------------------
-- Server version       4.1.18-log

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `access_log`
--

DROP TABLE IF EXISTS `access_log`;
CREATE TABLE `access_log` (
  `id` mediumint NOT NULL AUTO_INCREMENT,
  `agent` varchar(255) NOT NULL default '',
  `bytes_sent` int(10) unsigned NOT NULL default '0',
  `child_pid` smallint(5) unsigned NOT NULL default '0',
  `cookie` varchar(255) NOT NULL default '',
  `machine_id` varchar(25) NOT NULL default '',
  `request_file` varchar(255) NOT NULL default '',
  `referer` varchar(255) NOT NULL default '',
  `remote_host` varchar(50) NOT NULL default '',
  `remote_logname` varchar(50) NOT NULL default '',
  `remote_user` varchar(50) NOT NULL default '',
  `request_duration` smallint(5) unsigned NOT NULL default '0',
  `request_line` varchar(255) NOT NULL default '',
  `request_method` varchar(10) NOT NULL default '',
  `request_protocol` varchar(10) NOT NULL default '',
  `request_time` varchar(28) NOT NULL default '',
  `request_uri` varchar(255) NOT NULL default '',
  `request_args` varchar(255) NOT NULL default '',
  `server_port` smallint(5) unsigned NOT NULL default '0',
  `ssl_cipher` varchar(25) NOT NULL default '',
  `ssl_keysize` smallint(5) unsigned NOT NULL default '0',
  `ssl_maxkeysize` smallint(5) unsigned NOT NULL default '0',
  `status` smallint(5) unsigned NOT NULL default '0',
  `time_stamp` int(10) unsigned NOT NULL default '0',
  `virtual_host` varchar(255) NOT NULL default '',
  PRIMARY KEY (`id`),
  FULLTEXT KEY `request_uri` (`request_uri`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
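Once that loads, a quick sanity check that the keys came through (nothing clever, just MySQL’s SHOW INDEX):

show index from access_log;

With the table above it should list the PRIMARY key on id and the FULLTEXT key on request_uri.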

indexes

I have been playing with indexes in MySQL today, and found a good example of their benefits:

select request_uri, count(request_uri) as count from access_log group by request_uri order by count desc limit 25;

coughed up results thusly:

25 rows in set (10 min 38.19 sec)

This is against a table with 329,000 rows.

With the addition of this:

alter table access_log add index request_uri_index (request_uri);

we get these results:

25 rows in set (26.98 sec)

Not blindingly fast, but a considerable improvement.
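If I pick at this some more, EXPLAIN should say whether the group-by is walking the new index or still scanning the whole table (a diagnostic, not a fix):

explain select request_uri, count(request_uri) as count from access_log group by request_uri order by count desc limit 25;

If the Extra column mentions “Using index”, the counts are coming straight off the index rather than the full table.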

is the Mac crackable?

Do your worst . . .

Mac OS X Security Challenge:

The challenge is as follows: simply alter the web page on this machine, test.doit.wisc.edu. The machine is a Mac mini (PowerPC) running Mac OS X 10.4.5 with Security Update 2006-001, has two local accounts, and has ssh and http open – a lot more than most Mac OS X machines will ever have open. Email das@doit.wisc.edu if you feel you have met the requirements, along with the mechanism used. The mechanism will then be reported to Apple and/or the entities responsible for the component(s). Going after other hosts/devices on the network is out of bounds.

So you can get the source to the OS and the daemons running on it (ssh and http), which is more of a head start than you get with Windows. Let’s see if anyone can do it.

[tip]

what makes an SF book an SF book?

Apparently, all it takes is for a rabid SF fan to like it.

Science Fiction Books – A Reading List by Dave Itzkoff – New York Times:

Following is a list of favorites, with commentary, by the writer of the Book Review’s new science fiction column.

Again with the genrefication. A Clockwork Orange is science fiction? Looking for Jake?

I don’t think so, but as the reviewer/genrefier says of his classification of The Crying of Lot 49 as SciFi:

Due to space limitations, I can’t offer my complete explanation of why this is a science-fiction book, so for the sake of efficiency let me simply say to anyone who disagrees with my classification of it as such: You’re wrong.

Ok, then.

I think the concepts of dystopian futures and stories where technology is either a prominent piece of the staging or the moral equivalent of a character are being conflated; either that, or some readers are assuming that since their preferred genre is SF, anything that engages them as deeply must also be SF. Rather than admit their tastes extend beyond the genre, they pull other works into its canon.