network tuning, OS X Leopard edition

I had occasion to fire up an old PPC iMac G5 (OS X 10.5.8) the other week and was appalled at how slow its network access was. So here’s what I did to fix it.

Per Scott Rolande, there are tunable values for many aspects of the TCP stack. Handily, they live in a text file and can be tinkered with interactively.

kern.ipc.maxsockbuf=4194304
kern.ipc.somaxconn=512
kern.ipc.nmbclusters=2048
net.inet.tcp.rfc1323=1
net.inet.tcp.win_scale_factor=3
net.inet.tcp.sockthreshold=16
net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144
net.inet.tcp.mssdflt=1440
net.inet.tcp.msl=15000
net.inet.tcp.always_keepalive=0
net.inet.tcp.delayed_ack=0
net.inet.tcp.slowstart_flightsize=4
net.inet.tcp.blackhole=2
net.inet.udp.blackhole=1
net.inet.icmp.icmplim=50

This machine didn’t have a sysctl.conf file, so I copied his and used it to pull out the current values.

for i in `cut -d= -f1 sysctl.conf`; do sysctl $i; done
kern.ipc.maxsockbuf: 8388608
kern.ipc.somaxconn: 128
kern.ipc.nmbclusters: 32768
net.inet.tcp.rfc1323: 1
net.inet.tcp.win_scale_factor: 3
net.inet.tcp.sockthreshold: 64
net.inet.tcp.sendspace: 65536
net.inet.tcp.recvspace: 65536
net.inet.tcp.mssdflt: 512
net.inet.tcp.msl: 15000
net.inet.tcp.always_keepalive: 0
net.inet.tcp.delayed_ack: 3
net.inet.tcp.slowstart_flightsize: 1
net.inet.tcp.blackhole: 0
net.inet.udp.blackhole: 0
net.inet.icmp.icmplim: 250
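
A quick way to line the two lists up is a small script along these lines. This is only a sketch: the helper names (parse_settings, diff_settings) are made up for illustration, and it repeats just a handful of the values above.

```python
def parse_settings(text):
    """Parse 'key=value' (sysctl.conf) or 'key: value' (sysctl output) lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # sysctl.conf uses '=', while sysctl prints 'key: value'
        sep = "=" if "=" in line else ":"
        key, _, value = line.partition(sep)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(recommended, current):
    """Return {key: (recommended, current)} for every key whose values differ."""
    return {
        key: (value, current.get(key))
        for key, value in recommended.items()
        if current.get(key) != value
    }


# A few of the values from the two listings above.
recommended = parse_settings("""
kern.ipc.maxsockbuf=4194304
net.inet.tcp.rfc1323=1
net.inet.tcp.mssdflt=1440
net.inet.tcp.delayed_ack=0
""")
current = parse_settings("""
kern.ipc.maxsockbuf: 8388608
net.inet.tcp.rfc1323: 1
net.inet.tcp.mssdflt: 512
net.inet.tcp.delayed_ack: 3
""")

for key, (want, have) in diff_settings(recommended, current).items():
    print(f"{key}: recommended {want}, current {have}")
```

Only the keys that differ are printed, so matching entries such as net.inet.tcp.rfc1323 drop out.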

A little different. Not sure why kern.ipc.maxsockbuf is so much higher on an old machine that maxes out at 2 GB of RAM…

To test throughput, I needed a test file.
hdiutil create -size 1g test.dmg
.....................................................................................................................................
created: /Users/paul/test.dmg

Over wireless G on a mixed wireless N/G network to a wired 100 Mbit host on a Gigabit switch, it managed a stately 12 Mbits/second.

Twelve minutes (12m19.024s) later:
sent 1073872981 bytes received 42 bytes 1452160.95 bytes/sec
total size is 1073741824 speedup is 1.00
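
For the record, the bytes/sec figure rsync reports lines up with the 12 Mbits/second above. A minimal sketch of the conversion, assuming decimal megabits (10^6 bits); the function name is just for illustration:

```python
def bytes_per_sec_to_mbit(rate):
    """Convert a bytes/sec figure, as rsync reports it, to megabits per second."""
    return rate * 8 / 1_000_000


rsync_rate = 1452160.95  # bytes/sec, from the rsync summary above
print(f"{bytes_per_sec_to_mbit(rsync_rate):.1f} Mbit/s")
```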

Oy. Now to try it to a wireless destination, a 10.8.3 machine.

Hmm, interestingly, OS X handles rsync transfers a little differently: it blocks out space equivalent to the size of the file. Checking the temporary file twice during the transfer, du tells a different story than ls: as you can see, the size du reports never changes.
du -h .test.dmg.GsCjdW; sleep 10 ; du -h .test.dmg.GsCjdW
1.0G .test.dmg.GsCjdW
1.0G .test.dmg.GsCjdW

Using ls -l shows the actual size of the file, not the disk space set aside for it.
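
The same two numbers can be read programmatically. This is a sketch for Unix-like systems, where st_blocks is available and counted in 512-byte units; the sizes helper is a made-up name:

```python
import os
import tempfile


def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file."""
    st = os.stat(path)
    # st_size is the length ls -l prints; st_blocks counts the 512-byte
    # units actually allocated on disk, which is what du adds up.
    return st.st_size, st.st_blocks * 512


# Demonstrate on a small throwaway file.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 1000)
    f.flush()
    apparent, allocated = sizes(f.name)
    print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

On a preallocated file like the rsync temporary above, the allocated figure would sit at the full size while the apparent one grows.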

Still slow: sent 1073872981 bytes received 42 bytes 1428972.75 bytes/sec

Took 12m30.961s, the difference being because it went to sleep (out of boredom?).

After changing the various sysctl OIDs, things got much worse.

This is what I have on the 10.8.3 system:
kern.ipc.maxsockbuf: 4194304
kern.ipc.somaxconn: 1024
kern.ipc.nmbclusters: 32768
net.inet.tcp.rfc1323: 1
net.inet.tcp.win_scale_factor: 3
net.inet.tcp.sockthreshold is not implemented
net.inet.tcp.sendspace: 2097152
net.inet.tcp.recvspace: 2097152
net.inet.tcp.mssdflt: 1460
net.inet.tcp.msl: 15000
net.inet.tcp.always_keepalive: 0
net.inet.tcp.delayed_ack: 0
net.inet.tcp.slowstart_flightsize: 1
net.inet.tcp.blackhole: 0
net.inet.udp.blackhole: 0
net.inet.icmp.icmplim: 250

A 1 GB transfer takes too long (which of course is the problem), so I made a couple of small changes and tried a 100 MB file. Down to 13 seconds. Hmm, not bad. The changes:
sysctl -w net.inet.tcp.sendspace=4194304
sysctl -w net.inet.tcp.recvspace=4194304
sysctl -w net.inet.tcp.mssdflt=1460

I set net.inet.tcp.sendspace and net.inet.tcp.recvspace to half of kern.ipc.maxsockbuf and made net.inet.tcp.mssdflt match the receiving system.
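
That rule of thumb is easy to write down. A sketch, assuming the 8388608 maxsockbuf reported on the 10.5.8 machine and the 1460 mssdflt from the 10.8.3 listing; tuning_commands is a hypothetical helper that only builds the command strings and runs nothing:

```python
def tuning_commands(maxsockbuf, peer_mss):
    """Build sysctl commands: buffers at half of maxsockbuf, MSS matching the peer."""
    half = maxsockbuf // 2
    return [
        f"sysctl -w net.inet.tcp.sendspace={half}",
        f"sysctl -w net.inet.tcp.recvspace={half}",
        f"sysctl -w net.inet.tcp.mssdflt={peer_mss}",
    ]


for cmd in tuning_commands(maxsockbuf=8388608, peer_mss=1460):
    print(cmd)
```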

Now a 1 GB test file takes 53.287s. Copying from 10.8.3 to 10.5.8 took just 31.215s. After synchronizing the net.inet.tcp.mssdflt on the system I first tested with, transfers to there are down to 1m47.471s.

So some big improvements for not much effort. I’m sure there are lots of other tweaks but given the relatively little need for more improvement and the limited potential (the old 10.5 system on wireless G is frozen in time while the newer wireless N machines will see further improvements), I don’t know that I’ll bother. Going by the timings above, that is roughly a fourteen-fold increase in one direction and a 24-fold boost going the other way, which is pretty good. If I really cared, i.e., this was something I expected to do regularly, I’d run a Cat5 cable to it and call it done.
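
The arithmetic behind those ratios, as a sketch. Two assumptions are baked in: the 53.287s figure is taken to be the 10.5.8 to 10.8.3 direction, and the untuned 12m30.961s run is used as the baseline for both directions, since there was no untuned measurement going the other way. On those assumptions the ratios come out at roughly 14x and 24x.

```python
def seconds(minutes, secs):
    """Convert a 'time'-style minutes/seconds pair to seconds."""
    return minutes * 60 + secs


baseline = seconds(12, 30.961)  # untuned transfer to the 10.8.3 machine
to_newer = 53.287               # tuned, 10.5.8 -> 10.8.3 (assumed direction)
from_newer = 31.215             # tuned, 10.8.3 -> 10.5.8

print(f"10.5.8 -> 10.8.3: {baseline / to_newer:.1f}x faster")
print(f"10.8.3 -> 10.5.8: {baseline / from_newer:.1f}x faster")
```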

After a reboot to ensure the values stay put, I tested different copy methods as well, all with the same 1 GB file.

from the 100Mbit wired machine using rsync: 0m56.349s

same to/from, using scp -C for compression (since I used rsync -z): 1m40.794s

from the 10.8.3 system to the 10.5 system with scp -C: 1m35.228s

from the 10.8.3 system to the 10.5 system with rsync -z: 0m24.734s (!!)

from the 10.5 system to 10.8.3 with rsync -z: 0m38.861s

So even better after the reboot. Could be other variables in there as well. I’m calling it done.

UPDATE: the morning after shows a different story. I was puzzled that SNMP monitoring wasn’t working, so I took a look this morning and things are slow again, down to 5 Mbits/second from the 12 I considered poky. At this point, I’m not sure how reliable the benchmark data was, or at least how I was evaluating it.

I’ll have to investigate further. I created some more test media by splitting up the 1 GB file into smaller ones, so I have a pile of 100 MB and 10 MB files as well. Part of the optimization I am looking for is good throughput for large files as well as being able to handle smaller files quickly. Large buffers and access to a good-sized reserve of connections, in other words.

ORLY?

Here’s what I would recommend: If you still use “admin” as a username on your blog, change it

ORLY?
[screenshot, 2013-04-13]

I’m just lucky that way, I guess.

I changed it in the database as there was no way to change it in the UI. But if I’m not the only WordPress user with this problem, then what?

PS: 50 lockouts on wp-login since I installed that additional layer of security.

forensics

These two user agents make up pretty much all the brute force attempts to access the admin pages here for one day.

"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 GTB5"
"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0"

The full list from just one sample — yesterday — looks like this:
"Mozilla/5.0 (Windows NT 6.1; AMD64; en:16.0) Gecko/20100101 Firefox/16.0"
"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0"
"Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.15 (KHTML, like Gecko) Chrome/24.0.1295.0 Chrome/24.0.1295.0 Safari/537.15"
"Mozilla/5.0 (Windows; U; MSIE 9.0; WIndows NT 9.0; en-US))"
"Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 Firefox/3.5.3 GTB5"
"Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"
"Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"
"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 7.1; Trident/5.0)"
"Mozilla/6.0 (Windows NT 6.2; WOW64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1"

So I can’t really block by user agent, as I had hoped.

Notice anything in common between all of them?

None of them claim to be anything but Windows. For all I know these are actual browsers on someone’s PC running a background process, not a script with a bogus presentation. The UAs seem legit from a cursory Google search.
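
Pulling that list out of the log can be done with a few lines of script. This sketch assumes Apache's combined log format, where the user agent is the last quoted field; the sample log lines, their addresses and timestamps are made up, and only the user-agent strings come from the list above.

```python
import re
from collections import Counter

# In the combined log format the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def user_agents(lines, path="/wp-login.php"):
    """Count user agents on requests that mention the given path."""
    counts = Counter()
    for line in lines:
        if path not in line:
            continue
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines using user agents from the list above.
sample = [
    '192.0.2.10 - - [12/Apr/2013:10:00:00 -0700] "POST /wp-login.php HTTP/1.1" 200 1234 "-" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0"',
    '198.51.100.7 - - [12/Apr/2013:10:00:01 -0700] "POST /wp-login.php HTTP/1.1" 200 1234 "-" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0"',
    '203.0.113.5 - - [12/Apr/2013:10:00:02 -0700] "POST /wp-login.php HTTP/1.1" 200 1234 "-" '
    '"Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"',
    '192.0.2.99 - - [12/Apr/2013:10:00:03 -0700] "GET /index.php HTTP/1.1" 200 5678 "-" '
    '"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 7.1; Trident/5.0)"',
]

counts = user_agents(sample)
for agent, hits in counts.most_common():
    print(hits, agent)
print("all claim Windows:", all("windows" in agent.lower() for agent in counts))
```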

So I come back once more to the question: how does shutting down arbitrary ports on residential customer networks do anything to defeat this? This may seem like background noise but it’s a constant, relentless attack on the network that adds volume and congestion, which should be of concern to network providers, and puts customers at risk, which should be an issue for those customers and their network providers. It’s network abuse, at the most basic definition.

I’m sure there is enough information from various research projects, honeypots, and ad hoc stuff like this to put together a defensive strategy.

  • Find the weak links, be they network providers who don’t care, as the image below reveals, or browsers and operating systems that are unsecured on delivery (see above: the GTB5 variant is a toolbar, I assume, and those have been linked to this kind of nonsense for years), and isolate them.
  • Work with publishers and hosting companies to identify and report patterns of abuse.
  • Establish a buyer’s guide to network service that monitors and publicizes poor security practices, advising both business and personal users that they may find themselves cut off from the internet if they sign on with one of these suspect providers.

For the moment, I am simply taking all the IP addresses that get caught by the login trap and adding them to a ban list. When I have a few more, perhaps I’ll sift through them to find out what netblocks/ISPs claim them and see what that reveals.
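
A first pass at that sifting could be as simple as bucketing the banned addresses by network. This is a sketch only: the /24 grouping is an arbitrary assumption (real netblock ownership needs a whois lookup), and the addresses are documentation-range placeholders, not entries from the actual ban list.

```python
import ipaddress
from collections import Counter


def netblocks(addresses, prefix=24):
    """Count banned addresses per enclosing network of the given prefix length."""
    counts = Counter()
    for addr in addresses:
        # strict=False masks off the host bits instead of raising an error
        network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts


# Placeholder addresses from the documentation ranges.
banned = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.200"]

for block, hits in netblocks(banned).most_common():
    print(block, hits)
```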

stay classy, there.

Well, apparently this is a new thing.

Right now there’s a botnet going around all of the WordPresses it can find trying to login with the “admin” username and a bunch of common passwords, and it has turned into a news story […] Most other advice isn’t great — supposedly this botnet has over 90,000 IP addresses, so an IP limiting or login throttling plugin isn’t going to be great (they could try from a different IP a second for 24 hours).

The first 24 hours

So here’s how the first day or so went of logging and blocking repeated attempts to access the login screen by brute force.

[image: 20130411-193234.jpg]

What protection against this does Comcast offer with their SMTP lockdowns and block on HTTPS?

Added the Stealth Login Page Plugin as well, with the added benefit that future script kiddies will get an eyeful of sex.com on each attempt. What would be better is some site that jams a ton of bits down the pipe upon accessing a page.

and moar cleanup

One of the side effects of ownCloud is the client chatter, all those PROPFIND requests.

grep -c PROPF /var/log/httpd/httpd-access.log
2740
which is a bit over a quarter of the log file, going by the line count:
9438 /var/log/httpd/httpd-access.log
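
The same tally can be broken out per HTTP method. A sketch, assuming the request line appears in quotes as in Apache's common log format; the PROPFIND sample line and its path are hypothetical, while the GET line is the one from this log.

```python
import re
from collections import Counter

# Matches the quoted request line, e.g. "GET /cloud/status.php HTTP/1.1"
REQUEST = re.compile(r'"([A-Z]+) \S+ HTTP/[\d.]+"')


def method_share(lines):
    """Return each HTTP method's fraction of the parsed requests."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    return {method: n / total for method, n in counts.items()} if total else {}


sample = [
    '192.168.0.4 - paul [10/Apr/2013:15:38:30 -0700] "GET /cloud/status.php HTTP/1.1" 200 74',
    # Hypothetical PROPFIND entry; the path is a placeholder.
    '192.168.0.4 - paul [10/Apr/2013:15:38:31 -0700] "PROPFIND /cloud/files/ HTTP/1.1" 207 512',
]

for method, share in method_share(sample).items():
    print(f"{method}: {share:.0%}")

# The counts from the grep above work out to:
print(f"PROPFIND share of the real log: {2740 / 9438:.0%}")
```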

I have yet to decipher the right Sacred Rune that will tell me how to prevent this cruft from being logged. I’m able to ignore/not log traffic from 127.0.0.1 but cannot ignore requests for specific URI contents (like PROPFIND requests) or by other IP addresses.

Aha. It turns out the issue was either some old stuff that was messing up the evaluation of each request, or that the SetEnvIf directives have to come in a particular order. Or it just doesn’t work, as I have tested those already. Requests from 192.168.0.x networks are logged, but requests from the outside are ignored. Requests for some specified directories are not logged but others are, despite near-identical config options.


SetEnvIfNoCase Request_URI "^/metrics" dontlog
# PROPFIND is the HTTP method, not part of the URI, so match on Request_Method
SetEnvIf Request_Method "^PROPFIND$" dontlog

Seems legit.

192.168.0.4 - paul [10/Apr/2013:15:38:30 -0700] "GET /cloud/status.php HTTP/1.1" 200 74

I wonder if it’s because I am authenticated?

Nope, it seems to be the menubar client (on OS X) that does that. In a browser, none of the requests are logged, even authenticated. But the little menubar gadget? Everything is logged…

What’s also interesting is that IP addresses can be used unescaped. I didn’t know that.

SetEnvIf Remote_Addr "192.168.0.4" dontlog
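
One caveat worth illustrating: the second argument to SetEnvIf is a regular expression, so an unescaped dot matches any character and the pattern is not anchored. The sketch below uses Python's re module as a stand-in for Apache's regex engine, which is an assumption, though the behaviour for patterns this simple should be the same.

```python
import re

unescaped = re.compile("192.168.0.4")        # as written in the directive above
escaped = re.compile(r"^192\.168\.0\.4$")    # dots escaped and pattern anchored

for addr in ["192.168.0.4", "192.168.0.40", "192.168.0.41"]:
    loose = bool(unescaped.search(addr))
    strict = bool(escaped.search(addr))
    print(f"{addr}: unescaped={loose} escaped={strict}")
```

So the unescaped form works, but it would also catch 192.168.0.40 through 192.168.0.49 on the same network.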

moar cleanup

I see a lot of garbage had crept in over the course of using different publishing platforms, editors, and other tools. I’m sure there’s a better way to do this (like a stored procedure in MySQL) but I managed to hack back a lot of the weeds with stuff like this:

mysql> UPDATE crank_posts SET post_content = REPLACE(post_content, '’','\'');

I probably spent more time trying to do this in SequelPro and wrestling with syntax, none of which was necessary: the line above works in the command-line environment. It seems to have worked once but I don’t think I saw the status message saying so; I didn’t realize it until I saw a nonsense test string staring back at me.
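
For handling several of these characters in one pass, a translation table is one option. This is a sketch that works on a plain string rather than on the database, and the set of characters in it is an assumption; the query above only dealt with the right single quote.

```python
# Map typographic quotes to their plain ASCII equivalents.
REPLACEMENTS = {
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
}
TABLE = str.maketrans(REPLACEMENTS)


def straighten(text):
    """Replace curly quotes in text with straight ones."""
    return text.translate(TABLE)


print(straighten("I didn\u2019t \u201crealize\u201d it"))
```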

Just one more thing that should Just Work.