Ideas to monetize new artificial intelligence
27 points by marcus on Nov 13, 2007 | 102 comments
I've developed a new machine learning algorithm that understands the relationships between its inputs better and outperforms existing algorithms on almost every test I've done, and I'm looking for new ideas on how to monetize it. I'm already working on trading commodities with it, which is progressing but hasn't yet reached the point where it's making money. I'm also working on some CAD (computer-assisted diagnosis) applications.

Any ideas?



I don't believe you.


Challenge me. Select a dataset, send me the training data, and I'll send you my results so you can verify them...

I don't mind being tested.


I'd like to see the result of your algorithm on the Netflix challenge as well.


You can tell us your results with KDD data sets: http://kdd.ics.uci.edu/


He says in some other comment:

"I've tested it on the data from the KDD 2006 cup a contest in the KDD conference whose goal was to identify Pulmonary Embolism based on data generated from CT scans and it out scored the cup winners by a 50% margin."


There are a lot of different fields in ML. I don't believe that you have an algorithm that beats all algorithms out there, even those that are specialized for a specific setting.


You are right, there are quite a few problem types in ML and a lot of different algorithms, but because my idea is a basic insight into something that is missing in existing algorithms, I've been able to incorporate the change into several different algorithms.


What's your score on the Netflix Prize quiz?


Can't really get you a score that you'll be able to trust without submitting a result set, can't submit a result set without agreeing to publicize my algorithm... Pick a different test. One where you can do the verification without a 3rd party that requires me to relinquish my trade secret.


That's not true. You can submit your results and then refuse to release your algorithm, disqualifying you from the competition.

"Upon qualifying, as described above, the Participant is required to submit within one (1) week for judging a description of their algorithm along with all source code. The Participant warrants that the source code is either fully or substantially developed and functions or will function as represented by the description. Failure to deliver both the description and source code within one (1) week will disqualify that entry and additional qualifying entries will be considered."


OK, I missed that part of the contract... it was a bit long and in legalese. I'll tinker with it to fit it to the problem (it's a different algorithm category: mine is best suited to classification problems and the test is a clustering problem, so the dataset needs to be manipulated a bit first for it to work).


Build a public web API to use your technology for a fee. Make it dead simple to incorporate AI into any application.


It's tempting to be a tool vendor, but unfortunately there are many established tool people. You don't even know the names, most likely, of all the little neural network toolkit companies that flopped in the early-mid 90s.

It's better to use a new technology to create a complete solution for people.


The problem with building a complete solution is that it usually requires a lot of investment and domain specific knowledge. Having a better algorithm offsets a lot of it but I am not sure I could write a better CAD (computer assisted diagnosis) software on my own in a reasonable time frame that will outperform the market leaders even if using their data I can improve their results by 50%.

There is just too much science involved in feature selection etc...


Subject matter expertise can be hired, as employees, or as advisors to your company. For this type of start up, you not only need a team to build out the product end users will ultimately use (an expensive endeavor in and of itself), but you also need to gain credibility from experts and get published before doctors will look at your stuff.


I was hoping to avoid all the hassle by sticking to my core competency of artificial intelligence, doing a co-venture with some established players in these fields and helping them improve their results. I know that the Ralph Waldo Emerson quote "Build a better mousetrap and the world will beat a path to your door" is not true, but I was hoping that the world would at least meet me halfway...


I see. How about creating a public API (like someone else suggested) so we can see what it does, and then have a couple guys shop it around for you as you get validation studies going?

Then you give people enough to get enticed, but don't give everything away. You can then pursue multiple applications with a parallel effort.


Funny, I have the exact same dilemma with my image analysis algorithm...


That's an awesome idea. Integration is pretty simple, as you can use it almost as a drop-in replacement for a backprop neural network.


Uhm? The latency of a public API will render it useless for most applications.


Most AI applications aren't realtime (although some of them, in the chemical industry for example, are). A doctor doesn't care if his cancer diagnosis takes another 20 milliseconds so long as it's a bit more accurate. And a bank trying to analyze if a credit card transaction is fraud or not doesn't care about the short delays either.


> And a bank trying to analyze if a credit card transaction is fraud or not doesn't care about the short delays either.

Oh yes, they do. That I do know.


Even if it's in the millisecond range?


Financial fraud detection bottlenecks are typically between RAM and the processor. Thousands of synchronously incoming transactions have to be examined simultaneously, because they correlate heavily.

It is not like "here is one transaction, is it a fraud?" but "here are 2^20 transactions, what are the frauds?".

You could do this by pipelining, but I guess banks want a zero-downtime system, and I personally would not trust an API in terms of reliability.

Another point is that banks will not give you the original data. They will have to "pseudonymize" several fields, such as credit card numbers, names, ... This would force them to preprocess the data, which adds a small cost to every transaction (O(n) over the batch) and might decrease the speed even more.

(I'm not saying it's technically impossible, but I'd say there are better ways, such as releasing it closed source or just using it to predict financial data - which as we all know is possible and being done by hedge funds, so this should be the best way IF you have that algorithm ;)
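
To make the pseudonymization step concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the field names (`card_number`, `name`), the key, and the choice of HMAC-SHA256 are mine, not anything a particular bank is known to use. It shows why the cost is a constant amount of work per transaction, so O(n) over a batch.

```python
import hashlib
import hmac

# Hypothetical secret held by the bank; it never leaves the bank's side.
SECRET_KEY = b"bank-side-secret"

# Assumed names of the fields that must not be sent in the clear.
SENSITIVE_FIELDS = ("card_number", "name")


def pseudonymize(transaction, key=SECRET_KEY):
    """Return a copy of the transaction with sensitive fields replaced by keyed hashes."""
    out = dict(transaction)  # copy, so the original record is left untouched
    for field in SENSITIVE_FIELDS:
        if field in out:
            digest = hmac.new(key, str(out[field]).encode("utf-8"), hashlib.sha256)
            out[field] = digest.hexdigest()
    return out


def pseudonymize_batch(transactions):
    """One pass over the batch: constant work per transaction, O(n) overall."""
    return [pseudonymize(t) for t in transactions]
```

Because the hash is keyed and deterministic, the same card number always maps to the same token, so correlations between transactions survive even though the real number is hidden.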


You could simply prove it works and sell the whole thing to the bank.


I've taken classes from people who worked on fraud detection for banks (Fair Isaac) and they were working on legacy hardware (shitty old mainframes) with retardedly limited floating point precision.

Performance is of the essence in these situations; any clever trick you can think of to speed things up should be used in such a situation (but keep it fairly simple; lookup tables and so forth, for example).
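
As an example of the lookup table trick, here is a small Python sketch that precomputes a sigmoid so the inner loop never calls `exp`. The sigmoid, the table size and the input range are my own assumptions for illustration, not what was actually run on those mainframes.

```python
import math

# Assumed table parameters; tune them for the precision you actually need.
TABLE_SIZE = 1024
X_MIN, X_MAX = -8.0, 8.0
STEP = (X_MAX - X_MIN) / (TABLE_SIZE - 1)

# Built once up front; after this, evaluation is an index computation and a read.
SIGMOID_TABLE = [
    1.0 / (1.0 + math.exp(-(X_MIN + i * STEP))) for i in range(TABLE_SIZE)
]


def fast_sigmoid(x):
    """Approximate sigmoid(x) using the nearest precomputed table entry."""
    if x <= X_MIN:
        return SIGMOID_TABLE[0]
    if x >= X_MAX:
        return SIGMOID_TABLE[-1]
    index = int(round((x - X_MIN) / STEP))
    return SIGMOID_TABLE[index]
```

The trade is accuracy for speed: the result is only as fine-grained as the table, which is usually acceptable when the hardware's floating point precision is limited anyway.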


Applying the algorithm to financial markets would be highly latency-sensitive, though.

I'm not sure I buy the web service idea, either: wouldn't typical applications require a lot of input data (training set + test set) in order to be effective? Uploading all that data could be annoying, compared with just running the algorithm locally at the customer's site.


Training always takes a ton of time, even more so with my algorithm which is a bit more complex. But training is usually only done once and afterwards results can be generated very quickly.


I wasn't talking about training time -- I was talking about data set size. Frequently uploading a few GB of data over the public Net to do effective training is going to be an annoyance. You may also need to perform the training multiple times, especially if your algorithm takes any parameters.


It does take a few parameters but there is no reason why the same dataset would have to be uploaded every time you tweak a parameter, I can just store it and let you play with it until you are satisfied with the results. And again training isn't done that frequently usually.
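
A rough sketch of what I mean, in Python. Everything here is hypothetical: `DatasetStore` is an in-memory stand-in for server-side storage, and `train_threshold` is a toy placeholder for real training, not my algorithm.

```python
import hashlib


class DatasetStore:
    """In-memory stand-in for server-side dataset storage (hypothetical)."""

    def __init__(self):
        self._datasets = {}

    def upload(self, rows):
        # Use a content hash as the ID, so uploading identical data twice is a no-op.
        dataset_id = hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()[:16]
        self._datasets.setdefault(dataset_id, list(rows))
        return dataset_id

    def get(self, dataset_id):
        return self._datasets[dataset_id]


def train_threshold(rows, threshold):
    """Toy stand-in for training: predict 1 when x >= threshold, return accuracy."""
    correct = sum(1 for x, label in rows if (x >= threshold) == bool(label))
    return correct / len(rows)


def sweep(store, dataset_id, thresholds):
    """Try several parameter values against data that is already stored."""
    rows = store.get(dataset_id)  # read from storage, nothing is re-uploaded
    return {t: train_threshold(rows, t) for t in thresholds}
```

The client uploads once, keeps the returned ID, and then only sends parameter values for each new run.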


Investing. Look to Machine Insight in Cambridge, and the work they've done. They're trying to train an AI system to be Warren Buffett in a box. You don't hear much about this trend, because people are making way too much money to talk about it openly.

I'll put it this way: if you can consistently beat the market by a few percentage points, you can be a billionaire.


>...people are making way too much money to talk about it openly.

Or maybe people are _wasting_ way too much money to talk about it openly.


Do you have any information at all to base your comment on?


Do you have any information of anyone algorithmically outperforming the market enough to make them personally billionaires?


I've heard anecdotes from multiple sources that there are groups of software engineers and algorithm designers buried deep inside all the major firms that are doing just this. I know those firms manage and create hundreds of billions of dollars and I've been told that the stiffest competition is in these tiny software components of the business.

But I don't mean to sound too snarky. Sorry about that. So go ahead and try to get a job at Machine Insight or another such firm to find out what they're all about. http://www.machineinsight.com/


Algorithmic trading is actually commonplace:

http://en.wikipedia.org/wiki/Quantitative_analyst

http://en.wikipedia.org/wiki/Algorithmic_Trading_Platforms

http://www.iht.com/articles/2006/11/23/business/trading.php?...

Typically this is not a question of "find the secret strategy and make billions!", but rather finding a good execution algorithm to break an order of $billions down into a series of smaller orders for a good average price, or finding and exploiting minute arbitrage opportunities, etc. Algorithmic trading is responsible for many billions of dollars of trades daily; I don't know offhand of particular people who have become billionaires off it, but there are definitely lots of folks who have made a lot of money.
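
For a feel of the order-splitting part, here is a deliberately naive Python sketch: it cuts a large order into near-equal child orders and computes the volume-weighted average price of the resulting fills. Real execution algorithms adapt to volume and price as they go; the function names and the equal-slice rule are assumptions for illustration only.

```python
from typing import List, Tuple


def slice_order(total_quantity: int, num_slices: int) -> List[int]:
    """Split a parent order into near-equal child orders that sum to the total."""
    if total_quantity < 0 or num_slices <= 0:
        raise ValueError("need a non-negative quantity and at least one slice")
    base, remainder = divmod(total_quantity, num_slices)
    # The first `remainder` slices carry one extra unit each.
    return [base + (1 if i < remainder else 0) for i in range(num_slices)]


def average_price(fills: List[Tuple[int, float]]) -> float:
    """Volume-weighted average price over (quantity, price) fills."""
    total_qty = sum(q for q, _ in fills)
    if total_qty == 0:
        raise ValueError("no filled quantity")
    return sum(q * p for q, p in fills) / total_qty
```

The average price across all child fills is the number such an algorithm tries to improve, compared with sending the whole order at once and moving the market against itself.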


I know there are people who trade algorithmically (I'm one of them). I just wanted evidence of one person who had made billions from it.


You've taken my comment on a personal level in that you think I'm talking about an individual. I would form a company around the idea and target investing.


How about D. E. Shaw? I'm not sure how much he's made, but it's surely quite a lot.


Fine, let me rephrase.

Or maybe people are _wasting_ way too much money to talk about it openly?


Patent and publish it. CAD is not limited by the algorithm so much as by a lack of new problems. However, if you can generate a little buzz you can become a consultant / start a consulting company.

PS: Many machine learning systems can trade off accuracy for efficiency so you might look into increasing efficiency vs. accuracy for some existing application.


Thought about patenting it, but the entire idea of a patent is the ability to enforce a monopoly. I will never be able to inspect the insides of any commercial AI in the field, so as to allege an infringement on my patent.


Do not do this. Patents are not good protection, and a good enough algorithm is much better kept a secret.


This is what's wrong with our patent system. Imagine if all those great minds who came up with theorems you had to learn in 4-6 years of college decided to patent them.

Math should not be patentable. It's a crime.


Actually most of them have been dead for a long time so it wouldn't really matter, but I agree with you some things should not be patentable.

On the other hand, imagine a world without patents and you'd probably get a world filled with trade secrets, and that is even worse. At least a patent guarantees that the idea will pass into the public domain after a certain period. Would the world benefit if technology was segmented and kept secret like the Coca-Cola formula?

But seriously, that is a great discussion for a different thread.


At least patents are only 20 years, copyright is forever.


Copyright isn't forever. In the US it originally lasted 14 years, renewable once for another 14. Unfortunately Disney has pushed for copyright extensions every time its key copyrights (Mickey Mouse & friends) were about to expire, and has managed to get them extended quite unreasonably.


Right, Disney has effectively made copyright forever by making it last more than a lifetime.


Here's a start: http://www.netflixprize.com

If you've got awesome prediction technology, that's an easy $50k, and maybe an easy $1 million.


How about _proving_ that the algorithm works first? Prove it on something simple and verifiable like symbolic addition: http://news.ycombinator.com/item?id=75439 Here I go again with my own agenda, sorry...


This is a classic example of the saying "When the only tool you have is a hammer, every problem looks like a nail." To a guy with a spam filter, everything looks like mail to be filtered. (It was a cool & very interesting experiment nonetheless.)

Symbolic addition is exactly the wrong kind of problem for this algorithm as the symbols don't have any relationships with each other, and relationships between inputs are exactly what this algorithm exploits; almost every other dataset has them.

I've proven that the algorithm works to my complete satisfaction. I've tested it on almost every dataset in the UCI Machine Learning Repository and it outperforms the best published results on almost all of them. I've tested it on the data from the KDD Cup 2006, a contest at the KDD conference whose goal was to identify pulmonary embolism based on data generated from CT scans, and it outscored the cup winners by a 50% margin.

I know the algorithm works, I am just not sure how to monetize it...


>Symbolic addition is exactly the wrong kind of problem for this algorithm as the symbols don't have any relationships with each other...

That was my thinking exactly. I am glad it is confirmed by an expert...


My team is currently in 5th in the Netflix Prize competition. The main part of the method I'm using is pretty general (nothing specific to movies). If anyone here thinks they might have a use for this kind of thing / wants to collaborate on something, send me an email.


I am a tech commercialization consultant for some of the crazy stuff that comes out of the Small Business Innovation Research (SBIR) program. Scientists and engineers make all this cool stuff, but they don't really know what to do with it, and how to package and sell it. That's where I come in.

E-mail me and maybe we can figure out what it would be good for. I'm thinking you should patent the technology, show proof of concept comparison tests, and shop application specific licenses around.


I'd love to have a virtual clone of myself, i.e. something that has learned to think like me. I would treat it as a virtual assistant. I'd like it to prioritize and schedule my work. Tell me when it's time to call it quits or when I need to be doing something else.

Essentially, it'd be the virtual embodiment of my conscience.

Going along those lines, let's say you're a brilliant CEO or hacker. You could license copies of your wisdom to others who need guidance.


http://channel9.msdn.com/Showpost.aspx?postid=40064

Talks about some similar ideas -- applying ML techniques to make information workers more productive. Cool stuff.


Can you give any specific examples (references to academic papers) you've already tested it on and shown an improvement - no need to give details if you don't feel you can, but it would be interesting to know which tests you've done and against which existing algorithms.


KDD 2006 cup http://www.cs.unm.edu/kdd_cup_2006

My algorithm scored 2.043 on task 1 and 19 on task 2.
Top 3 places, task 1: 1.35, 1.28, 1.27
Top 3 places, task 2: 13.58, 13.56, 13.44
Note these scores represent 5 different teams.


No doubt the Siemens Medical Solutions people would be interested in talking with you further. The KDD 2006 training and results data have now been released, so it's not really a great demonstration.


I had the same idea but I tried approaching them and couldn't get them to try and test my algorithm. Cold calls in that industry are very difficult...

I don't want them to take my word for it, just supply me with more test data and I could send them my results.


Try the specific kdd people there as listed on this paper: http://portal.acm.org/citation.cfm?id=1233321.1233326

I'm sure an opportunity to prove your algorithm will pop up. It's the sort of field where you can fool yourself sometimes by not being careful enough with training and testing data, so random people off the internet generally don't seem worth the trouble to an established researcher.
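
A small Python sketch of the kind of care I mean: hold out part of the data before any training, and only trust the score on the held-out part. The 1-nearest-neighbor "model" is just a stand-in I picked because it memorizes its training data, which makes the effect easy to see.

```python
import random


def split_holdout(rows, test_fraction=0.25, seed=0):
    """Shuffle once, then set aside a test set that training never sees."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def nearest_neighbor_predict(train, x):
    """1-NN on a single feature: return the label of the closest training point."""
    return min(train, key=lambda row: abs(row[0] - x))[1]


def accuracy(train, eval_rows):
    """Fraction of eval_rows whose label the 1-NN model predicts correctly."""
    hits = sum(1 for x, y in eval_rows if nearest_neighbor_predict(train, x) == y)
    return hits / len(eval_rows)
```

Scoring this model on its own training rows always looks perfect when the inputs are distinct, because every point is its own nearest neighbor. Only `accuracy(train, test)` says anything about new data.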


Dating site matching?

There are a zillion ways to monetize it. The real question is whether you can successfully do so. What about selling an appliance, similar to Google's search appliance, that runs your predictive technology instead? Make a public API for people to test it, and then sell the hardware option.


Ad targeting.


Very interesting idea. Do you have any idea where I can get a dataset for it? I'd rather test it before I start pouring time & money into building an ad network.


In response to the idea that your search algorithm is better than all others:

http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_opt...

In response to your question, I say that it should be easy to monetize your algorithm if it is truly good at solving some class of problems. Just find problems where the solutions are valuable to some group of people and sell the solutions you create.


Never said it was a search algorithm.

That is what I am trying to find in this thread: ideas on where the solutions are most valuable and how to pitch them.


How do you use it to trade commodities? What do you trade?


You could try to find a good pilot customer who has valuable data and is willing to work with you to adapt the algorithm to their particular problem. What you want as an outcome is a story like this: "Math Whizzes Turbocharge An Online Retailer's Sales" (http://www.informationweek.com/showArticle.jhtml?articleID=2...)


Working with a medical company to improve the results of their CAD software. After it works (hopefully :) ) that can be a great way to drive the next client.


Who are you? Maybe you can't say a lot about your discovery, but can you at least tell us what your name is and how you got to this?


My name is Avi Marcus. I am a 29-year-old, semi-retired (sold my previous startup) hacker from Israel. Started programming at 8, started CS in college when I was 13. Flunked out at 15 because I was bored. I've always been interested in artificial intelligence, but a year and a half ago I read a book on the way the human brain works and suddenly I had a vision of something programmers missed in all of the machine learning algorithms they built, which is a basic part of the human cognitive process, and I suddenly understood how to add it to existing algorithms.

Anything else?


Going out and winning the Netflix prize and/or KDD and refusing to release your code would bring you the best tests (and biggest paying customers) because of the publicity. I actually can't think of a better and faster way of building a business around machine learning. Beat the best teams in the world as an individual and you won't have to worry about how to make money any more.


Thank you for your answer. I was curious about it, and I'm sure this info would be useful for anyone interested in contacting you.


Added my email to my YC news profile.


We'll all look like fools if we cry "bullshit" and you happen to be right. But we'd look like even bigger fools if we just took your word for it.


I think we have to do neither; if he's clever (and if he's done what he claims, he surely is) he's not asking us to take his word for it, or to say he's a great genius and all that. He's just looking for a way to monetize his discovery, and I think that's fine.


In my experience genius isn't a boolean thing; people can be brilliant at some things and horrible at others. I try to acknowledge my limitations and ask for help with the things I suck at.


Sounds like you read: "On Intelligence" - if so, cool, if not do so. Then do this:

break CAPTCHA

There are plenty of people who will pay for that, and a successful implementation of the Hawkins/George system will be able to do so.


Let's try to stick to something ethical, with either a positive impact on the world or at least without a negative one.

I don't want the spam in the world's inboxes to be on my hands.


Netflix challenge? No recurring revenue, but, hey, a million dollars. You'll have to disclose the algorithm to accept the prize.


That's the problem: disclosing the algorithm is something I'd rather avoid. I don't mind sharing it with a company for a licensing fee as long as I don't have to put it in the public domain.


Then don't disclose the algorithm or collect the $1 million. If your algorithm really is good, demonstrate it by 'winning the prize', but rather than claiming the money and turning your algorithm over to Netflix sell it to someone.

You're really wasting your time with the stock market though. Technical analysis is just silly.


> Technical analysis is just silly.

That was also my first reaction, but maybe the commodities market (which he's targeting) is more inefficient than equities.


Indeed, by winning the Netflix prize you've proven your approach is worth at least 1 million. And by beating some very, very good teams you've shown you know something they don't. That's very valuable evidence. That's IF you can hack it :)


I couldn't agree more. Based on the quality of the teams at the top, if you beat their algorithms you have proven you have something particularly special. I think you'll easily be able to sell it for more than $1 million.


I just glanced at the rules. I didn't see anything about disclosing outside of Netflix, but you do have to license winning algorithms to them. The license appears to grant them the right to make & sell products based on the algorithm.


Just happened upon this board. I quickly looked through this thread and didn't see this idea mentioned. I've worked with ACIS, whose kernel makes AutoCAD what it is. When you want to use their code they provide the object files for you to link into your code. It preserves their secrets but allows you to make use of the functionality. You could try something like that... I have an encryption scheme that could further hide functionality, even during execution. muxzero@inbox.com if I can help.


I would use it in productivity software.

The users would input their to-do list and the AI would suggest the best next task for them to complete. You could have the user input some information about their lifestyle (married? family? when do they work? etc.) and the AI would take these factors into consideration when determining the next best task.


Very knowledge intensive, isn't it? Sounds as hard as passing the Turing test, to me.


Sports Betting


Nigerian scamming. Oh wait, you want something ethical?:)


If it doesn't make money trading, then it isn't better than existing algorithms, because that's what you're competing against.


There are probably some AIs making money out there but they try to beat the market by examining a variety of commodities at once and discovering their interdependencies faster and better than humans can. I have far more hubris, I'm trying to make it work with pure technical analysis on a single commodity. Which is a bit harder.


win the "go" competition.

check out http://evolvedmachines.com

those guys might like to help you out.

also there is http://www.numenta.com/

that might be interested in some good algorithms


License it to game dev companies to provide better opponent AI.


I have no idea how game AIs work, I thought most of it was totally scripted to just appear intelligent. I'll need to research the field a bit.


Forget it. Modern game AI is 90% specialization to the problem domain and 10% application of algorithms. That's the breakdown of how much work is involved as well as the breakdown of what contributes to a "good" game AI.

When it comes to what middleware game AI developers want, they're far more interested in a framework that helps them organize all the domain specialization rather than improved algorithms. And game middleware is an increasingly well-trodden and difficult market. Again, forget it.


Some is scripted. They do a variety of other things too. For example, path finding gets its own algorithm. And they have algorithms for getting multiple guys to move around in formation. Sometimes there is some type of system with goals (get 1000 wood, get an 80 food army, kill X base, etc.) and then it picks things to achieve those goals (build more wood harvesters, build more units, attack). Sometimes they use "learning" algorithms where they tweak parameters by running the AI repeatedly and seeing which does better, and they can use games with a human player to do that too (if it wins too badly, have it aim less precisely). And lots of other one-off things.
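
Here is a toy Python sketch of that last kind of tuning. Everything in it is made up for illustration (the `play_match` win model, the `aim_precision` parameter, the step size); no shipped game is known to work exactly like this. The loop plays a batch of matches, then nudges the parameter so the AI wins about half the time.

```python
import random


def play_match(aim_precision, player_skill, rng):
    """Toy match: the AI wins with probability aim / (aim + skill). Made-up model."""
    p_ai_wins = aim_precision / (aim_precision + player_skill)
    return rng.random() < p_ai_wins


def tune_aim(player_skill, rounds=200, matches_per_round=50, seed=0):
    """Repeatedly play batches of matches and adjust aim toward a 50% win rate."""
    rng = random.Random(seed)
    aim = 1.0
    for _ in range(rounds):
        wins = sum(play_match(aim, player_skill, rng) for _ in range(matches_per_round))
        win_rate = wins / matches_per_round
        # Winning too often: aim less precisely. Losing too often: aim better.
        aim *= 1.0 + 0.1 * (0.5 - win_rate)
        aim = max(aim, 0.01)  # keep the parameter positive
    return aim
```

With this win model the tuned value drifts toward the player's skill level, which is the "if it wins too badly, have it aim less precisely" rule expressed as a feedback loop.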


My algorithm won't replace A* as a path finding algorithm, and it wasn't designed to do swarming ( although maybe I can adapt it to swarming ). And as far as I know most of the rest is totally scripted... Again not exactly my field so I need to read up on it before I can rule it out.


Yeah, do you have AIM or any other way to contact you?


You could also email me at bkmrkr at yahoo.com


Oil futures?



