experiments

I have the RSS feed for new titles at Project Gutenberg in my aggregator and noticed that Flatland was in the collection.

I read that when I was a boy and thought it would make for a good read. So starting Feb 1, I’ll be putting up one section a day here: they’re already posted into WordPress and will be made available at 4 AM on each day.

This is the best-known of these projects, I suppose: I think it would be interesting to see more of these. I see that Ulysses is in the Gutenberg repository, possibly incorrectly, considering all the fuss that Joyce’s remaining heir made at the prospect of public readings for the centenary this past summer.

What book would you read if it were serialized?

Now playing: I Shot The Sheriff by Bob Marley & The Wailers from the album “Legend”

Brad Delong on Chuq Van Rospach on Comment Spam

Chuq’s comments make sense, but Brad takes the next step: I think Google would do well to try that.

Chuq Van Rospach on Comment Spam:

If I were google, I would have taken a different approach. I would not have said, “if you make sure links have a ‘nofollow’ tag we won’t include them in our index.” Instead, I would have said, “Links that are created by somebody other than the page author — i.e., links inside blocks labeled as comments — will *not* be included in our index unless they have a ‘follow’ tag inside them.” Given where we are now, I think the second approach would have done more good and also improved the quality of google’s indexes.

apologies for intermittent service

Things have been a little flaky today: MySQL may have been down from 9-12 AM today, and snmpd went offline over the same span (how do you know what happens when your monitoring tools die?).

And then I decided to upgrade MySQL from 3.23.58 to 4.x and, for some reason, it took me more than an hour to figure out which of the various shims and enhancements needed to be rebuilt (I thought I did them all, and I know I did some of them more than once).

The list of packages/ports that got touched in the process:

  • php4-snmp-4.3.10_1
  • rc_subr-1.31
  • png-1.2.8
  • pkgconfig-0.15.0_1
  • freetype2-2.1.7_4
  • php4-zlib-4.3.10_1
  • php4-xmlrpc-4.3.10_1
  • perl-5.8.5
  • openssl-0.9.7e_1
  • mod_php4-4.3.10_1,1
  • libiconv-1.9.2_1
  • expat-1.95.8
  • apache-2.0.52_4
  • php4-xml-4.3.10_1
  • php4-tokenizer-4.3.10_1
  • php4-session-4.3.10_1
  • php4-posix-4.3.10_1
  • php4-pcre-4.3.10_1
  • php4-overload-4.3.10_1
  • php4-mysql-4.3.10_1
  • mysql-client-4.0.23a
  • gettext-0.14.1
  • fontconfig-2.2.3,1
  • t1lib-5.0.1,1
  • php4-gettext-4.3.10_1
  • php4-ctype-4.3.10_1
  • php4-calendar-4.3.10_1
  • php4-bz2-4.3.10_1
  • jpeg-6b_3
  • XFree86-libraries-4.4.0_3
  • php4-gd-4.3.10_1
  • php4-bcmath-4.3.10_1
  • php4-extensions-1.0

Not all of them seem relevant but dependencies can be an ugly business.

from my local FreeCycle list

Sounds like a big request, but here’s the deal: I drive for Metro and I am constantly seeing the homeless lugging around tons of blankets trying to keep warm. From my old “backpacking” days I remember having a down sleeping bag that I could stuff into a bag the size of a pillow case or smaller and it kept me toasty warm. If you can part with that down sleeping bag that you swear you are going to go camping with but haven’t in like forever…………..I will along with my 10 year old grandson go downtown and distribute them. Last year this time it was sandwiches and socks………I would like to bump it up a bit. I can drive around and pick up this weekend. Peace and thanks!

Not a call to action, just an observation.

and Windows is popular, why?

Microsoft RC4 Flaw:

One of the most important rules of stream ciphers is to never use the same keystream to encrypt two different documents. If someone does, you can break the encryption by XORing the two ciphertext streams together. The keystream drops out, and you end up with plaintext XORed with plaintext — and you can easily recover the two plaintexts using letter frequency analysis and other basic techniques.

It’s an amateur crypto mistake. The easy way to prevent this attack is to use a unique initialization vector (IV) in addition to the key whenever you encrypt a document.

Microsoft uses the RC4 stream cipher in both Word and Excel. And they make this mistake. Hongjun Wu has details (link is a PDF).

In this report, we point out a serious security flaw in Microsoft Word and Excel. The stream cipher RC4 [9] with key length up to 128 bits is used in Microsoft Word and Excel to protect the documents. But when an encrypted document gets modified and saved, the initialization vector remains the same and thus the same keystream generated from RC4 is applied to encrypt the different versions of that document. The consequence is disastrous since a lot of information of the document could be recovered easily.

This isn’t new. Microsoft made the same mistake in 1999 with RC4 in WinNT Syskey. Five years later, Microsoft has the same flaw in other products.

Hmm, does this — repeating the same mistake — happen in open source implementations?
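To see why keystream reuse is so damaging, here is a minimal sketch in Python. It is a textbook RC4, not Microsoft's actual document format, and the key and both plaintexts are made up for illustration. The point is only that XORing two ciphertexts made with the same keystream cancels the keystream completely.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Generate n bytes of RC4 keystream (textbook key scheduling + output loop)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def rc4_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XORing the plaintext with the RC4 keystream."""
    return xor_bytes(plaintext, rc4_keystream(key, len(plaintext)))


# Hypothetical key and two "versions" of a document, encrypted with the
# same key and no fresh IV, so both use the same keystream.
key = b"hypothetical key"
version1 = b"Meet me at the usual place at noon."
version2 = b"The budget figures are attached."

c1 = rc4_encrypt(key, version1)
c2 = rc4_encrypt(key, version2)

# The keystream drops out: ciphertext XOR ciphertext == plaintext XOR plaintext.
mixed = xor_bytes(c1, c2)
assert mixed == xor_bytes(version1, version2)

# Anyone who knows (or guesses) one version recovers the other without the key.
recovered = xor_bytes(mixed, version1)
assert recovered == version2[: len(recovered)]
```

In a real attack nobody hands you one of the plaintexts, but frequency analysis and guessed words do the same job on the XOR of the two.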

if you care about the health of kids in today’s world, read this book

This may just corroborate some of what you already know, but for me it quantified some things and made me aware of some things I hadn’t considered.


“Born to Buy : The Commercialized Child and the New Consumer Culture” (Juliet B. Schor)

What I found most disturbing was the amount of money and effort going into exploiting kids and making them into passive, and later aggressive, consumers. The central idea of the book is that kids are exposed to too many messages and exhortations to buy as a result of unsupervised or unregulated advertising. The amount of time kids watch TV is up, but the amount of advertising, especially for non-nutritional foods, violent games, and adult products like alcohol, is way up. Schor presents some frightening details on the link between exposure to Hollywood films and the increased use of alcohol and tobacco through the use of paid product placements in the films.

And she discovered, contrary to her own assumptions, a link between troubled kids — poor self-esteem, poor decision-making skills, and unhealthy attitudes toward family and society — and TV watching/ad exposure. The initial assumption was that troubled kids watched TV, ate poorly, and scratched their materialistic itch to soothe their feelings about themselves, but a study demonstrated that the media messages were causing the other problems.

The last part of the book has a call to action for legislation and policy initiatives we’re unlikely to see, but along with that are some simple steps that every household can take without waiting for government ‘help.’

  • Turn off the TV and put it somewhere that makes it inconvenient for everyone to watch it
  • Model good behavior — don’t tell the kid he can’t have expensive things while you shop at Tiffany & Co.
  • Go outside: it will do you both good
  • If you have to be inside, play boardgames, paint, draw, read, write, tell stories: live

One of my favorite arguments (the need for Slow Food, among other things) makes an appearance as well, as we see a generation of kids who have never tasted anything but sugary/salty snacks and who would rather eat a blue snack than a green fruit or vegetable.

Right book, wrong time: as the author notes, this is not a great climate for arguments against the power of big business in favor of the individual, even if the individual is 6. But don’t let that stop you.

on preventing referer/comment spam

Following up on Tim Bray’s comments on a recent flood of referer spam, I just left this comment at Gary’s site:

What puzzles me is how colocation and broadband providers never seem to monitor their networks well enough to see this: if individual sites can see these storms, I imagine they are even easier to see on the sending side.

I suppose the only recourse is to ensure no reputable business uses shoddy hosting providers: perhaps we need to start publishing a score card that tracks what provider networks are responsible for the most outbound crap.

Would “shoddyhosts.org” be worth doing? The allegations would need to be documented — you wouldn’t want someone’s gripes over a billing error getting added into the mix — but that seems to happen most of the time anyway.

It’s not like hosting and broadband providers are unaware of the problem. Why should their customers have to spend their time concocting new solutions when the people who run the networks could stop a lot of this at the source?

fun with spammers

Thoughts on Spam:

Some poor schmuck looking for an open formmail.pl relay on my server. I honestly felt pity for the guy; 403 and 404 errors all day long can really take the wind out of your sails. So I decided to give the troller something useful to read (repetition is the key to learning).

This brief Perl script has brought me joy; my desire is that it brings formmail.pl trollers joy also. The script, when completely finished, returns an HTML page approximately 10,000,000 bytes long (10MB) and it takes about 2 minutes to do that (the select statement slows things down, so all the joy does not come at once).

I was surprised to see I am getting these as well: anyone who’s still looking for (or hosting) formmail.pl (which wasn’t Y2K-safe) deserves whatever they get.
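The original script is Perl and isn't reproduced here, but the idea (a large page dribbled out slowly) can be sketched in Python. Everything below is my own guess at the shape of it: the function name, the chunk size, and the delay are assumptions, and `time.sleep` stands in for the Perl `select` pause the author mentions.

```python
import time


def slow_page(total_bytes=10_000_000, chunk_size=4096, delay=0.05, sleep=time.sleep):
    """Yield roughly total_bytes of repetitive HTML, pausing after each chunk.

    With the defaults this is about 2,400 chunks at 0.05 s each,
    or roughly two minutes for the whole 10 MB.
    """
    filler = b"<p>Repetition is the key to learning.</p>\n"
    # Build one chunk of exactly chunk_size bytes out of the filler line.
    chunk = (filler * (chunk_size // len(filler) + 1))[:chunk_size]
    sent = 0
    while sent < total_bytes:
        piece = chunk[: total_bytes - sent]  # trim the last chunk to fit
        yield piece
        sent += len(piece)
        sleep(delay)  # the pause is what keeps the client waiting
```

A CGI script or WSGI app could write each yielded piece to the response as it arrives. Passing `delay=0` makes it easy to check the byte count without waiting two minutes.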

how to render web traffic logging useless

Referrer Spamstorm:

Near as I can tell, pretty well every somewhat-visible website in the world is seeing its logfiles fill up with bogus page fetches there only as a vehicle for a spammish “referer” field; whether or not the site posts referrer data. This high-volume flood is a fairly recent phenomenon, and what makes it weird is that the vast majority of the bogus referrer sites are off the air due to some terms-of-service violation. It would appear that a sleazebag somewhere launched a really ambitious assault on the whole world — using, I can only assume, a few zillion zombified drone machines — only to be found out and have their hosting yanked while their mindless slaves continue to spew vacuous venom into logfiles everywhere. Damn, the Internet is a weird place.

This has been commented on in this space before, but what I find most annoying is that it means my own logging is useless for all but the most raw of statistics (bandwidth used), and I track that with mrtg anyway. So I use the Google AdSense reports to see how many actual page views they log. Yesterday, I recorded more than 7,000 page views (I drop requests for css files, images, and javascript on the floor), while Google only recorded 500 actual requests.
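For the curious, the "drop css, images, and javascript on the floor" counting looks roughly like this. It is a sketch, not my actual setup: it assumes Apache-style access log lines, and the extension list and sample lines are invented for illustration.

```python
import re

# Assumed list of extensions treated as page assets rather than page views.
ASSET_EXTENSIONS = (".css", ".js", ".png", ".gif", ".jpg", ".jpeg", ".ico")

# Pull the request path out of the quoted request field, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"GET (\S+) [^"]*"')


def count_page_views(log_lines):
    """Count GET requests whose path does not look like a static asset."""
    views = 0
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a GET request, or not a parseable line
        path = match.group(1).split("?", 1)[0].lower()
        if path.endswith(ASSET_EXTENSIONS):
            continue  # dropped on the floor
        views += 1
    return views


# Invented sample lines in Apache combined log format.
sample = [
    '10.0.0.1 - - [25/Jan/2005:04:00:00 -0800] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [25/Jan/2005:04:00:01 -0800] "GET /style.css HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [25/Jan/2005:04:00:02 -0800] "GET /archives/?p=12 HTTP/1.1" 200 7300 "http://spam.example/" "MSIE"',
    '10.0.0.2 - - [25/Jan/2005:04:00:03 -0800] "GET /images/header.JPG HTTP/1.1" 200 20480 "-" "MSIE"',
]

page_views = count_page_views(sample)
```

Note that the third sample line, with its bogus referer, still counts as a page view here. That is exactly the problem: a filter like this can't tell a spam fetch from a real reader, which is why my number and Google's are so far apart.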

And how surprised would you be to learn that all the bogus referer spammers report themselves as MSIE/Windows?

Rebecca may not be seeing this as much as I am, but these jerks make it hard to know how much good could be done if we can’t reliably count our audience.

remembering rebecca :: january 2005:

You’ll never see a pledge drive on this site, and those of you who have visited my wishlist know that I do this because I love it, not because I hope for some material gain. But if every individual who visits this site gave just $10, we would have raised over $65,000 in just the first two weeks of January. So now I am asking for your financial support. Please, if you can spare just $5 or $10, donate right now to help rebuild these lives.

Now playing: Northern Boy by Randy Newman & others from the album “Randy Newman’s Faust”

<update> it seems Tim updated his earlier post with some user-submitted research and a call to the home phone of the perpetrator of this global irritation.