Posted by Doug
Wed, 29 Jan 2003 22:13:00 GMT
I finally got tired enough of hitting the delete key that I set up a Bayesian-style filter to categorize my mail using statistical analysis. The point is that I don’t want to stop what I’m working on and switch over to my mail just to see something about a Nigerian needing assistance. So I’m using bogofilter. After training it on the more than 3000 spam messages I happen to have lying around on my hard drive (that’s roughly the number of spam messages I’ve received from the evening of Dec 4 to the present) and on an equally large pile of legitimate mail, I’m happy to announce that not one spam has leaked through to my inbox. I’m also sad to say there have been several false positives. So I’ll probably have to spend the next few days going through all of my junk mail saying, “Yes, that’s really spam,” and “No, that’s not really spam.”
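For anyone wondering what "statistical analysis" means here, this is a minimal Python sketch of the general idea behind a Bayesian-style filter. It is not bogofilter's actual algorithm or code; the class name, the tokenizer, and the tiny training messages are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class TinyBayesFilter:
    """Toy naive Bayes classifier (hypothetical, not bogofilter's method)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record which tokens appear in a message labeled 'spam' or 'ham'."""
        self.messages[label] += 1
        # Count each token once per message, however often it repeats.
        self.counts[label].update(set(tokenize(text)))

    def spam_probability(self, text):
        """Combine per-token evidence into one probability that text is spam."""
        spam_n, ham_n = self.messages["spam"], self.messages["ham"]
        # Start from the prior odds, smoothed so an empty filter gives 0.5.
        log_odds = math.log((spam_n + 1) / (ham_n + 1))
        for token in set(tokenize(text)):
            # Add-one smoothing keeps unseen tokens from zeroing the product.
            p_spam = (self.counts["spam"][token] + 1) / (spam_n + 2)
            p_ham = (self.counts["ham"][token] + 1) / (ham_n + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp before exponentiating to avoid overflow on long messages.
        log_odds = max(-700.0, min(700.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up training data, far smaller than a real mailbox.
mail_filter = TinyBayesFilter()
mail_filter.train("Urgent assistance needed to transfer funds from Nigeria", "spam")
mail_filter.train("Transfer funds now, urgent reply needed", "spam")
mail_filter.train("Meeting notes from the build server review", "ham")
mail_filter.train("Can you review the patch before the meeting", "ham")

print(mail_filter.spam_probability("urgent funds transfer"))  # well above 0.5
print(mail_filter.spam_probability("review the meeting"))     # well below 0.5
```

The false positives fit this picture: a legitimate message that happens to share a lot of vocabulary with the spam pile gets pushed over the threshold until the filter is retrained on it.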
Posted in Software | no comments
Posted by Doug
Wed, 29 Jan 2003 00:47:00 GMT
Joel Spolsky comes up with another good article:
Something we had done since the last release of CityDesk somehow
caused our publish times to increase by about 100%; on a particular
large site we use for stress testing it had gone from about a minute
to about two minutes.
The first thing I tried was a profiler: Compuware DevPartner Studio.
Indeed this showed me where a lot of bottlenecks are; that data will
be useful to speed up our publish times even more, but I really
wanted to find the specific bug that I thought we had introduced
which was slowing us down.
The next thing I tried was a method I learned from Gabi at Juno:
the old binary search method. Before we started work on this release,
publishing took 1’04”. Today it takes 1’57”. So I started checking out
old versions of the source from CVS by date, rebuilding, and timing
how long publishing took with each day’s build. Here’s what I
found:
As of May 1: 1’57”
As of April 1: 1’05”
As of April 15: 1’05”
As of April 22: 1’06”
As of April 26: 1’58”
As of April 24: 1’05”
As of April 25: 1’05”
Aha! Now all I had to do was run WinDiff to compare the source tree
from April 25th and April 26th, and I discovered four things that were
changed that day, one of which was a function that DevPartner had told
me was kind of slow, anyway. Within minutes I found the culprit—
that function was originally written to cache its results because it’s
often called with the same inputs, and I had inadvertently changed the
cache key in one place and not another, so we were getting 100% misses
instead of 99% hits. Solved! Total elapsed time to find this bug:
about an hour. If your source code is much bigger than CityDesk,
builds and checkouts may be slow. This is as good a reason as any
to keep all your old daily builds around.
© 2003 Joel Spolsky
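The date-by-date search Joel describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `first_slow_day` and `simulated_is_slow` are hypothetical names, the year 2002 is a placeholder (the article gives no year), and the stand-in check simply mimics the quoted timings instead of checking out the source, rebuilding, and timing a publish.

```python
from datetime import date, timedelta


def first_slow_day(good, bad, is_slow):
    """Return the earliest date whose build is slow.

    Assumes the build on `good` is fast, the build on `bad` is slow, and
    that the slowdown persists once introduced, so bisection is valid.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_slow(mid):
            bad = mid   # slowdown was introduced on or before mid
        else:
            good = mid  # slowdown was introduced after mid
    return bad


probed = []  # every date that would have needed a checkout and rebuild


def simulated_is_slow(day):
    """Stand-in for 'check out this date, rebuild, and time a publish'."""
    probed.append(day)
    # Mimics the quoted timings: about 65 s through April 25, 118 s after.
    seconds = 118 if day >= date(2002, 4, 26) else 65
    return seconds > 90  # threshold between the fast and slow timings


culprit = first_slow_day(date(2002, 4, 1), date(2002, 5, 1), simulated_is_slow)
print(culprit)      # 2002-04-26
print(len(probed))  # 5
```

A strict halving probes slightly different dates than the ones in the article, but it lands on the same day after five builds instead of thirty, which is why the approach stays practical even when each build is slow.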
Posted in Software | no comments
Posted by Doug
Thu, 23 Jan 2003 15:25:00 GMT
Yes, I’m boycotting amazon.com because they not only hold but are enforcing several unbelievably obvious patents. The only reason these patents are considered “novel” is that they take an everyday business practice and apply it to the Internet. The USPTO must be thinking, “Ooo, the Internet! It’s so new and powerful! This patent mentions the Internet, it must be new and powerful too!”
Well, it appears to me that The Register is the first to report on another stupid patent. And it is a fine piece of reporting: they not only report that SBC Communications Inc is enforcing this stupid patent, they also did the legwork to find the prior art!
Posted in Software | no comments