This is a simple one built in C#/.NET. I fixed a significant bug and raised classification accuracy from 74% to 96% on my noisy dataset (automobile accident claims). I emailed the author with a bunch of improvements (such as histograms) and some other tweaks but never heard back. Anyway, the bugfix is simple: take a look at category.cs. In TeachPhrase(), move m_TotalWords++ inside the test for "if (!m_Phrases.TryGetValue(phrase, out pc))".
What you want here is to count the number of unique words. The original code was counting the total number of times all words appear. This one change reduced classification errors by 3X.
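For anyone who doesn't have the source handy, here is roughly what the change looks like. This is a minimal sketch, not the library's actual code: only TeachPhrase(), m_TotalWords, m_Phrases and the TryGetValue test come from category.cs as described above. The PhraseCount class, the TotalWords getter body and the GetPhraseCount() helper are my own stand-ins so the snippet compiles by itself.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the library's per-phrase counter.
public class PhraseCount
{
    public int Count;
}

// Hypothetical, trimmed-down Category; only the counting logic is shown.
public class Category
{
    private readonly Dictionary<string, PhraseCount> m_Phrases =
        new Dictionary<string, PhraseCount>();
    private int m_TotalWords;

    public int TotalWords
    {
        get { return m_TotalWords; }
    }

    // Helper added for illustration only: how often a phrase was taught.
    public int GetPhraseCount(string phrase)
    {
        PhraseCount pc;
        return m_Phrases.TryGetValue(phrase, out pc) ? pc.Count : 0;
    }

    public void TeachPhrase(string phrase)
    {
        PhraseCount pc;
        if (!m_Phrases.TryGetValue(phrase, out pc))
        {
            pc = new PhraseCount();
            m_Phrases.Add(phrase, pc);
            // The increment now lives inside the "not seen before" branch,
            // so each distinct phrase is counted exactly once.
            m_TotalWords++;
        }
        pc.Count++;
    }
}
```

With this version, teaching "crash" twice and "claim" once leaves TotalWords at 2, while the count for "crash" is still 2.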
Oh, forgot to mention that you'll want to compute probabilities using logarithms to avoid floating-point underflow (which will introduce classification errors). That means summing the logs of the per-word probabilities instead of multiplying the raw values.
Example: wordValue = System.Math.Log((double)count / (double)cat.TotalWords);
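To show why this matters, here is a small sketch comparing the two approaches. The LogScoring class and both method names are made up for illustration and are not part of the library; it also assumes every count is greater than zero, since Math.Log(0) returns negative infinity.

```csharp
using System;

// Hypothetical helper, not part of the library being discussed.
public static class LogScoring
{
    // Naive approach: multiply the raw per-word probabilities.
    // With enough words the product underflows to exactly 0.0.
    public static double NaiveScore(int[] counts, int totalWords)
    {
        double score = 1.0;
        foreach (int count in counts)
        {
            score *= (double)count / (double)totalWords;
        }
        return score;
    }

    // Log approach: add the logarithm of each per-word probability.
    // The sum stays comfortably within the range of a double.
    public static double LogScore(int[] counts, int totalWords)
    {
        double score = 0.0;
        foreach (int count in counts)
        {
            score += Math.Log((double)count / (double)totalWords);
        }
        return score;
    }
}
```

For example, 200 words that each have probability 1/1000 give a true product of 1e-600, which a double cannot represent, so NaiveScore returns 0 for every category and there is nothing left to compare. LogScore returns a finite negative number, and you simply pick the category with the largest (least negative) score.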
derefr,
Unfortunately, you are either jealous and letting that affect your judgment or you completely fail to understand that there are real-world reasons to own a Mac. I bought a Mac because I wanted a system that is designed rather than evolved, comes with software I actually want to use (iMovie, Garageband, iPhoto), is secure, and not a target for viruses/spyware, and can occasionally play my Windows video games and host my Windows development environment.
The reason I switched my mom, in-laws, brother and sister to Macs is because I got damned sick and tired of being their IT Helpdesk every other weekend -- diagnosing problems, rebuilding machines infected with spyware & viruses, installing patches and fixes, etc. I've got better things to do with my life, such as making original content (music & video) with the computer. However, some people just enjoy farting around with settings. That's not me.
Those commercials appeal to me: a 40-year-old guy wearing $17 Wrangler jeans and a t-shirt from Target. You know why? Because they bring out the defensive XP fanboys such as yourself. We laugh at the commercials, then we laugh at you. Now run along.
Don't you have some patches to install or something to update?
I'm sorry for coming off as an "XP fanboy"--I didn't mean to sound negative with anything I said. I fit into at least three of the groups I mentioned, am currently on my third iPod, and am thinking of getting the very MacBook Air I'm talking about. I love OS X, but that has nothing to do with my argument. I know full well that I'm in the upper-middle class, and you probably are too--you've just chosen, like most enthusiasts, to spend your money on technology rather than fashion.
The fact that Apple has a successfully viral strategy in using their more inexpensive products (such as the Mac Mini) to guide people into buying complementary products (such as the Apple TV, Cinema displays, and Airport Express) and eventually fully integrate them into the "Mac lifestyle" is the mark of a smart company, and exactly what Microsoft wishes it had when it refers to a lacking "consumer experience." They probably never will as long as they don't control the hardware, though.
Of course, from the perspective of the consumer, Apple products are sometimes the "pragmatic decision." However, in all of this, I was referring to Apple's marketing department's intended market for their products and services, not necessarily the "long tail" of economic and word-of-mouth users.
Mac backup: On the last day of the month I run Carbon Copy Cloner to back up everything to a 500GB FireWire external drive. (FireWire, so that I can boot OS X from that image, if need be.)
PC Backup: I dumped my PCs 18 months ago when I bought my Intel-chip Mac. Now both my "PCs" (XP development machine and Win2k3 Web Server) are running as virtual machines on my Mac, so my Windows install is actually just one big file used by Parallels which is backed up as part of the Mac backup.
Cheers, --Jack