How do we know that AI is truly transformative tech? The null hypothesis is that most businesses could achieve equivalent ROI by using older, simpler data analysis techniques like linear regression.
Because most large tech companies are driven by it right now, and you likely interact with a system influenced by ML every time you use your phone, email, maps, translation, social media, etc. The catch is that it's only transformative if you're dealing with large enough problems. You're right that more tiny companies should probably stick to classic analysis techniques.
I would argue that linear regression falls under the umbrella of ML, and is probably the most important ML technique for existing businesses. A lot of the "low hanging fruit" for existing business cases is not discovering how to use regression analysis on existing data (which is probably already happening at a lot of places) but figuring out how to operationalize and continually update linear regression models so they can become a mission-critical part of the infrastructure.
Linear regression has been around since before computers. It's a real stretch to put that under the umbrella of AI. Nor is AI required to continually update linear regression models; a "for" loop will suffice.
That loop has to run over high-quality, relevant data. The regression model needs to be retrained if the production data drifts from the original training dataset. The output of the regression model needs to be wired into a downstream application or decision support tool. That downstream application needs to know how to deal with error bars. A data scientist / statistician needs to re-architect the whole thing if the data or business requirement changes substantially.
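To make that concrete, here is a minimal sketch of what the "for loop" ends up looking like once drift checks and retraining are included (the PSI metric and its cutoff are illustrative assumptions, not a production recipe):

    # Hypothetical retrain-on-drift loop; the model fit is the easy part.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    DRIFT_THRESHOLD = 0.2  # illustrative population-stability cutoff

    def population_stability_index(expected, actual, bins=10):
        # Crude PSI between training and production feature distributions.
        edges = np.histogram_bin_edges(expected, bins=bins)
        e_pct = np.histogram(expected, bins=edges)[0] / len(expected) + 1e-6
        a_pct = np.histogram(actual, bins=edges)[0] / len(actual) + 1e-6
        return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

    def retrain_if_drifted(model, X_train, y_train, X_prod, y_prod):
        # Compare each feature's production distribution to the training set.
        drift = max(
            population_stability_index(X_train[:, j], X_prod[:, j])
            for j in range(X_train.shape[1])
        )
        if drift > DRIFT_THRESHOLD:
            # Fold recent production data into the training set and refit.
            X_train = np.vstack([X_train, X_prod])
            y_train = np.concatenate([y_train, y_prod])
            model = LinearRegression().fit(X_train, y_train)
        return model, X_train, y_train

Even this toy version says nothing about error bars, serving, monitoring, or what happens when the business requirement changes.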
As usual, the hardest problems are outside of the code.
I agree with you here: data and measurement are the single most important part of the process.
From my experience working in an industrial plant that has been involved in several machine learning trials, a lot of the time attempts are made to use complex modeling techniques to make up for a lack of measurements.
Something I question is whether the outcome would have been better if the money invested in hiring AI consultants had instead been spent on better plant instrumentation.
Industrial instruments are not cheap; something like a PGNAA analyser (https://en.wikipedia.org/wiki/Prompt_gamma_neutron_activatio...) is an expensive capital purchase, and I suspect some people have unrealistic expectations that AI and machine learning can replace things like this.
I think there is some middle ground where AI complements better sensors (maybe instrument companies should be pushing this). I've yet to see any of the data experts push back and say something like "actually you need to measure this better first before we can model it."
Most ML algorithms boil down to linear regression with some loss function. Even deep learning networks are essentially just linear regressions stacked on top of each other, with higher layers trained on features produced by the layers below them.
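For what it's worth, here is a toy numpy sketch of that "stacked" picture (not any framework's actual implementation): each layer is an affine map, the same w.x + b that linear regression fits, with a nonlinearity in between. Drop the nonlinearity and the stack collapses into a single linear regression.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=3)             # input features

    # Layer 1: affine map (the w.x + b of linear regression), then ReLU.
    W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
    h = np.maximum(0.0, W1 @ x + b1)   # remove the ReLU and the two layers
                                       # reduce to one linear map of x

    # Layer 2: another affine map, now over the learned features h.
    W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)
    y = W2 @ h + b2                    # a final "regression" on learned features
    print(y)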
I think if neural networks or SVMs are AI, then linear regression is as well. Neural Turing machines and other recent developments I think are closer to the layperson's idea of "AI," though.
>Most ML algorithms boil down to linear regression
I think (particularly with DL) it would probably be more accurate to claim it boils down to nonlinear logistic regression rather than linear regression. To your point, both are relatively old techniques.
Nonlinear regression is still linear regression with transformed features. Kernel regression, for example, just uses features generated from the data using a supplied kernel function/covariance function. DL just allows the features to be trained.
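A quick scikit-learn sketch of that point (the sine-wave data and degree-7 feature map are arbitrary choices for illustration): fit the same ordinary linear regression twice, once on the raw input and once on a fixed polynomial feature map, and the second fit is nonlinear in x while staying linear in the features.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)

    # Plain linear regression on x: badly underfits the sine curve.
    linear = LinearRegression().fit(X, y)

    # The same linear regression on transformed features [1, x, x^2, ..., x^7].
    Phi = PolynomialFeatures(degree=7).fit_transform(X)
    poly = LinearRegression().fit(Phi, y)

    # R^2 of each fit: the polynomial features track the curve far better.
    print(linear.score(X, y), poly.score(Phi, y))

Kernel methods do the same thing implicitly via the kernel function; DL learns the feature map instead of fixing it up front.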
You could replace "AI" there with any other newly developed tech paradigm, like "Big Data" several years ago. Your null hypothesis is probably correct for most companies, but there will always be some companies for which the new tech is beneficial. Most companies do not need big data, and are fine running a single postgres instance for the entirety of their lifetime, just like most companies probably don't need AI.
This is true, but I think the real benefit of the “Big Data” paradigm (scare quotes included) was spreading the idea that you should actually measure things, and then make a decision. You’d think this was obvious, but apparently it wasn’t. (Think back to the book Lean Startup, whose premise is, “A/B test hypotheses.”) Similarly, the “AI Revolution” could be boiled down to, “run a regression.”
First - there is much more data now than 5 or 10 years ago; it is generated by every process and is easy to store.
Second - there is a greater art and capability to aggregate and manipulate data. It's simply faster, but also there is a lot of supporting technology in the form of workflows and tooling.
Third - there are more algorithms now; these are often derived from AI research (DNNs, RNNs, Bayesian methods...)
The first two definitely mean that linear regression can generate much more value than 10 years ago.
The third one is a product of frustration with linear regression and many other "traditional" algorithms. In many domains (speech, images, text processing) the community smashed its head against the wall for 30 years before the computational resources and algorithmic tricks of the 2010s came on stream. You just can't do much with TF-IDF or similar for text (I tried very, very hard); on the other hand, using a transformer is like bloody magic.
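To put a number on that frustration, a small sketch (both libraries are real; the model name is just one commonly used public checkpoint, and the two sentences are an arbitrary example): TF-IDF sees almost no lexical overlap between two paraphrases, while a sentence transformer maps them to nearby vectors.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["the cat sat on the mat", "a feline rested on the rug"]

    # TF-IDF: the only shared tokens are stop words ("the", "on"),
    # so whatever similarity it finds here is spurious.
    tfidf = TfidfVectorizer().fit_transform(docs)
    print(cosine_similarity(tfidf[0], tfidf[1]))

    # A sentence transformer embeds both near each other in vector space.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(docs)
    print(cosine_similarity([emb[0]], [emb[1]]))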