Hacker News: aficionado's comments

Why are they regulating only AI? They should widen the scope of the regulation and include, for example, graph theory, combinatorics, probability theory, and any other branch of discrete mathematics, or even mathematics and statistics in general. Those look like pretty good tools for perpetrating those crimes too. Or, wait, are those “practices” even classified as crimes under current law? How many people and organizations were fined or incarcerated for them last year? Let’s suppose they are crimes. If someone commits one of them by hardcoding a system, by using some random number generator, or simply by sending handwritten letters, will this regulation apply? I really wonder how many AI-based systems the people proposing this regulation have actually built.


The regulation includes things like "specialist systems", so yes, "graph theory, combinatorics, probability theory" in that context count as "AI" (this is in TFA).


The links to the dataset do not work.


Not sure why you're being downvoted. I also cannot get to at least one dataset: https://storage.googleapis.com/cloudml-demo-lcm/SO_ml_tags_a... which is linked from the fourth paragraph, directly above the "What Is The Bag Of Words Model?" section heading.


I pinged Sara. She probably just forgot to make it publicly accessible.


Thanks!


Article author here.

To access the dataset you need to be logged in with a Google account. Details here: https://github.com/GoogleCloudPlatform/ai-platform-text-clas...
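
For anyone scripting the download, here is a rough sketch (not the article's official instructions) using the google-cloud-storage Python client once you are logged in; the object name is a placeholder, since only a truncated path appears above:

    # Sketch only: fetch the dataset with credentials from
    # `gcloud auth application-default login`. The object name is a
    # placeholder, not the real path from the article.
    from google.cloud import storage

    client = storage.Client()                      # uses your Google credentials
    bucket = client.bucket("cloudml-demo-lcm")     # bucket from the article's link
    blob = bucket.blob("SO_ml_tags_PLACEHOLDER.csv")
    blob.download_to_filename("so_ml_tags.csv")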


Ideal to practice, learn, and teach machine learning:

https://bigml.com/ml101

https://bigml.com/education/videos



This is one of the many desperate moves that we will see from IBM trying to deliver on the Watson overpromise and all their rocambolesque cognitive computing. AI is a late project (https://en.wikipedia.org/wiki/The_Mythical_Man-Month). So adding more manpower (or brainpower) to a late software project only makes it later. Only folks poorly educated on the topic (such as Elon Musk) really believe that Watson-like AI is around the corner.


I don't think it's universally true that adding more manpower to a late software project always makes it later. The Mythical Man-Month takeaway is that things don't scale up linearly when you add manpower, but they clearly can, and often do, scale up.

Imagine one person given the task of rewriting the Windows operating system in Java. This one-person team won't benefit from more programmers? Of course the project will be completed faster, in this case, with more manpower.


> I don't think it's universally true that adding more manpower to a late software project always makes it later.

It always results in a period of reduced progress, because onboarding the new staff drags on the existing staff; and the bigger the scale-up, the longer you stay behind where you would have been without it. The more you scale up, the more you also need to reorganize and build new coordination infrastructure to make use of the new resources even once they are technically up to speed, which takes time to set up and time for staff to acclimate to the new organization and teams, creating its own drag.

In realistic scenarios, this pretty much invariably means late project + more resources = later project.

But, sure, there are extreme situations where that wouldn't be true; I just don't think they pop up often in practice. One should be extraordinarily skeptical of any claim (or internal intuition) that the rule doesn't apply to your project.


What I presented was an extreme example, but the point stands. Obviously things like the Windows OS, iOS, etc. are projects that benefited from an increase in manpower, and this increase did not stifle development or cause delays. It is true that doubling the manpower does not double progress; it's not a linear relationship.


rocambolesque adj.: Suggestive of Rocambole, a character in the novels of the French author Ponson du Terrail (1829–71), renowned for his improbable and fanciful adventures; incredible, fantastic, bizarre.

https://en.oxforddictionaries.com/definition/rocambolesque


I've never believed that Musk could be so daft. I always thought he was playing up media hype ('AI so powerful, could be evil guys!') as a way to recruit young ambitious talent to his endeavors.


The solution is easy and it applies to most sciences: all research articles should include a pointer to download the dataset that was used and an annex with the details on how it was collected.


Basically this video ignores the history of machine learning in general. Jumping from Expert Systems straight to Neural Networks and Deep Learning ignores 36 years (and billions of dollars) of research http://machinelearning.org/icml.html (Breiman, Quinlan, Mitchell, Dietterich, Domingos, etc). Calling 2012 the seminal moment of Deep Learning is quite hard to digest. Maybe it means that 2012 is the point in time when the VC community discovered machine learning? Even harder to digest is calling Deep Learning the most productive and accurate machine learning technique. What about more business-oriented domains (without unstructured inputs), the extreme difficulty and expertise required to fine-tune a network for a specific problem, or drawbacks like the ones explained in http://arxiv.org/pdf/1412.1897v2.pdf and http://cs.nyu.edu/~zaremba/docs/understanding.pdf?

Those who ignore history are doomed to repeat it. As Roger Schank pointed out recently http://www.rogerschank.com/fraudulent-claims-made-by-IBM-abo..., another AI winter is coming soon! Funny that the video details the first three AI winters, but the author doesn't realize that this excessive enthusiasm for one particular technique is contributing to a new one!


I think you're spot on with the observation that 2012 refers to when VCs discovered machine learning. Anyone who has recently interacted with VCs will tell you that they look for anything to do with machine learning (and VR/AR/MR), even when it makes no sense. There are going to be some companies who will be able to leverage machine learning to advance their business, namely, Google/Facebook who will probably claim they can offer better targeted advertising and such. Most other players who merely try to force machine learning on other fields are likely to realize that while the technology is cool, it's still too early for it to be generally applicable to "any" problem.

Especially dangerous is going to be the mix of machine learning with healthcare. I believe Theranos tried it and found out it's not that easy... I'd watch this space with skepticism.


> Especially dangerous is going to be the mix of machine learning with healthcare.

Medical diagnosis has been one of the primary application areas of AI since the 70s (maybe earlier, I can't remember off the top of my head). The widespread non-availability of automatic doctors should tell you how well that has worked :-(.

Coincidentally enough I worked both in AI (1980s) and drug development (2000s) and now really understand how hard it is!

I do believe we will soon see automated radiology analysis, as it is the area likely to be most amenable to automation. Presumably in Asia first, as the US FDA will justifiably require a lot of validation. The opportunity for silent defects is quite high -- you are right to say "especially dangerous".


First, I would have to agree that there is quite a bit of history being ignored, but I suspect this deck is more to excite and interest investors than anything else.

The value of AI depends largely on perceived value, IMHO, and the frequency of "winters" will correlate with that. I think we are still a bit too early for VR to really take off, but that did not stop a $2b acquisition and loads of investor interest. This will probably artificially hold off what should be an AI winter right now, just because so much money keeps going into it.

I personally applaud that so much enthusiasm is going into AI right now, and though we are repeating history to some extent I still think we are making incremental advancements (however small) - even if this just means applying old AI techniques to new advancements in hardware.


It's ridiculous, but xkcd's "Tasks" goes over how, in CS, it can be hard to explain the difference between the easy and the virtually impossible. https://www.quora.com/As-a-programmer-what-are-some-of-the-t...


I'm a little bit worried. At least at the same level as when I see a bunch of developers compiling programs without much understanding of what an LL(k) parser does, or how a pushdown automaton works, or what a Turing machine is. I usually feel the same every time I see an elevator without a liftman, don't compute a square root by hand, or hear about Google self-driving cars.


The difference is that the software or the elevator will work, but the statistical model is wrong and doesn't work. It is like an elevator that only lifts people above 120 and below 90; for everyone else it just doesn't work, or takes them to the wrong floor.


> The difference is that the software... will work

Lots of software doesn't work. Is there a substantial difference between putting an overfitting model in production, and putting a poorly tested program in production?


I think there might be. When ML fails, the only individual capable of noticing is someone who understands the math. When code breaks, often the "lay" user notices. The result is obvious to a novice. When ML fails, it looks like a duck and quacks like a duck, but after multiple years of study it's immediately recognizable as an antelope. Though, to disagree with my own point, security vulnerabilities have a similar profile. In essence, to all but the highly trained, the difference is imperceptible.


>"When code breaks often the "lay" user notices. The result is obvious to a novice."

That depends "how" it breaks. As a novice coder myself, I've had things go wrong that I don't notice or can't identify, and it looks like my program is running fine.

I think that's the parent's point: it might be stupid to put crappy machine learning models into production, but it isn't worrisome. It's expected.


I hear ya. I knew that assertion was going to draw some criticism, as it's a judgement call about where we draw the line: who's a novice and what's obvious? However, I can't get away from my nagging impression that statistical validity is not inherently clear even to the absolute best practitioners. Causality is the goal, and it's notoriously difficult, even for world-class minds. In my experience, the only similar effervescent specter in software development is security. Such circumstances seem, to me, to require great humility and introspection about one's abilities, but I suppose a little of that would go a long way in general too!


That the program is likely to fail loudly and obviously, but the overfitted model will just sit there being subtly yet perniciously much wronger than you think it is, forever.
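
To make that failure mode concrete, here is a tiny scikit-learn sketch (mine, not from the thread) where an unconstrained tree looks perfect on the data it memorized and much worse on data it hasn't seen:

    # Illustration only: an unconstrained decision tree memorizes noise,
    # scoring ~100% on training data while doing far worse on held-out data.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))
    y = (X[:, 0] + rng.normal(scale=2.0, size=500) > 0).astype(int)  # mostly noise

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = DecisionTreeClassifier().fit(X_tr, y_tr)   # no depth limit -> free to overfit

    print("train accuracy:", model.score(X_tr, y_tr))  # close to 1.0
    print("test accuracy: ", model.score(X_te, y_te))  # noticeably lower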


I would like to see some ML applied to stop lights with a fallback to the PLC with timers if that fails. It would save the nation a lot of gas.


How many elevators failed today? How many were fixed today? And how many are monitored in real time so they get fixed as they fail? Likewise, statistical models can be monitored and fixed automatically, and, of course, even so they will fail from time to time, just like elevators do.
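
As a sketch of what "monitored and fixed automatically" might look like (hypothetical helper names, not any vendor's API): track accuracy on recent labelled feedback and flag the model for retraining when it drifts, the way a stuck elevator gets flagged for maintenance.

    # Sketch: rolling accuracy over the last WINDOW labelled outcomes;
    # trigger_retraining() is a hypothetical hook, not a real API.
    from collections import deque

    WINDOW, THRESHOLD = 500, 0.85
    recent_hits = deque(maxlen=WINDOW)

    def record_outcome(prediction, actual):
        """Call whenever the true label for a past prediction arrives."""
        recent_hits.append(prediction == actual)
        if len(recent_hits) == WINDOW:
            accuracy = sum(recent_hits) / WINDOW
            if accuracy < THRESHOLD:
                trigger_retraining()

    def trigger_retraining():
        print("accuracy below threshold, scheduling retraining")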


Well someone clearly has a horse in this race.

Your comment is a bit obtuse; developers are not creating parsers and compilers. Elevators and calculators are very robust technologies that already work.

What I worry about is a new wave of engineers and developers thinking they understand statistical models and then proceeding to work at the big banks and have their models blow things up. If PhDs can make such disastrous non-robust models, how on earth is a random developer who took a summer course not going to do the same?

Now if the banks actually failed on their own, then by natural selection the less skilled would be out of jobs and people would stop trying to "short-cut" gaining this type of knowledge. But that's not what happens. Academics keep writing papers and hyping up specific techniques they can give conferences on, and the taxpayer bails out the idiots at the top.


I'm a developer with very little statistics knowledge who at one time had extensive math knowledge, but I haven't applied it in so long that I don't recall most of it these days.

I worked in the finance sector as a (lead) developer on a production use trading system for quite some years. None of the developers had formal math or statistics knowledge to the extent required to develop this system.

This didn't really bother anyone in the least, despite the fact that a mistake could cause a loss of millions of dollars in practically no time at all...

The reason wasn't that anyone was ignorant enough to think the programmers knew what was going on. It was that the finance company also employed plenty of mathematicians, statisticians, physicists, and others with the proper math/stats background. The programmers wrote the code, but the math/stats people wrote the business rules and the formulas, and they tested the system against a large enough variety of models with expected outcomes to have sufficient confidence that its reward/risk measurements were appropriate.

So my answer would be no; I don't find this all that troublesome. We can't be experts at everything and smart companies realize this so they should be creating teams with the correct skillset to be successful.


It's different. In ML the model, the analysis, and the insights are the product. In general software engineering, your compiler is not your product.


Don't think so. The data and how it's represented is the code. ML is the compiler.


This is overly pedantic. The code is doing machine learning. In order to write and understand the code you have to understand the machine learning algorithms. Before you even choose a model and tune the parameters you have to know how the parameters interact and how different models work.
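
For instance (scikit-learn, purely illustrative): even a minimal grid search forces you to decide which parameters to sweep and over what ranges, which you can't do sensibly without knowing how they interact.

    # Illustration: tuning two interacting knobs of a random forest.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    X, y = load_iris(return_X_y=True)
    grid = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid={"n_estimators": [10, 100], "max_depth": [2, None]},
        cv=5,
    )
    grid.fit(X, y)
    print(grid.best_params_, round(grid.best_score_, 3))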


Did anyone actually give it a try? I only get this error with any dataset (even a humble Iris): Amazon ML cannot create an ML model: 1 validation error detected: Value null at 'predictiveModelType' failed to satisfy constraint: Member must not be null


Go to the datasources tab and see if there's an error message from data source creation. I had the same error due to an issue with variable names.


No... it's Amazon ML and Azure ML trying to catch up with BigML. They copied many things from our service but forgot to copy the ease of use. Services like Azure ML, Amazon ML, and even the Google Prediction API work like black boxes and lock your model away, making you extremely dependent on their proprietary service. With BigML, you can easily export your models and use them anywhere for free. If the goal is to democratize machine learning, then the ability to extract your models and use them as you see fit is essential, and only BigML offers that level of freedom.


I just tried out BigML and it looks awesome. I use the Google Prediction API to fill in a value on a form during a web request, so I need the result immediately. Why does BigML require two web requests and take so long to get a prediction from a trained model?


If you use BigML's web forms, the first request caches the model locally so that all the subsequent predictions are performed directly in your browser.
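
Outside the browser, the same fetch-once-then-predict-locally pattern is possible with the BigML Python bindings (pip install bigml); the model id below is a placeholder and the exact call names are worth checking against the current docs:

    # Sketch: one request to download the model, then purely local predictions.
    from bigml.api import BigML
    from bigml.model import Model

    api = BigML()                                # reads BIGML_USERNAME / BIGML_API_KEY
    remote = api.get_model("model/PLACEHOLDER")  # single web request
    local_model = Model(remote)                  # cached in memory from here on

    # No further network calls for predictions:
    print(local_model.predict({"petal length": 4.2, "petal width": 1.3}))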

