It looks interesting, but I’m slightly confused about the way this is presented. It feels like it’s coming from the wrong angle.
Specifically, reducing a time series to a sequence of patterns and trying to predict what happens next is something that has been done for decades in some form or another. To me the unique aspect of this is that it fits the approach into a transformer.
So I’d expect to see comparisons against other approaches that do the same thing, not against other transformer approaches.
I wouldn’t be confused if the title was “Inverted Transformers are MORE EFFECTIVE THAN NORMAL TRANSFORMERS For Time Series Forecasting”.
However, if the target audience is transformer folk, then it makes sense; it just seems that I’m looking at it from the other direction.
The equivalence between efficient AI and universal sequence prediction has been known for decades, so it would be surprising if AI algorithms were poor at sequence prediction. Of course, optimal universal sequence prediction is profoundly intractable and memory-hard, which has implications for the limits of AI efficiency and scalability.
There used to be a small hobbyist subculture on the Internet in the late 1990s that designed highly efficient approximate universal sequence predictor algorithms for the challenge of it. Now that AI is a thing, I've often wondered if there were some lost insights there on maximally efficient representations of learning systems on real computers. Most of those people would be deep into retirement by now.
I think we're living in a world where deep learning is winning so consistently that comparison to other methods is often just a time suck. It would be nice to provide a non-DL approach as a baseline, but I would expect it to lag behind the DL methods.
Furthermore, pre-DL methods can often be recast as hand-tuned special cases of DL models: some sequence of linear operations with hand-picked discontinuities sprinkled around. If you can implement the pre-DL method using standard neural network components, then gradient descent training of a neural network "should" find an equivalent or better solution.
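For what it's worth, here is a minimal sketch of that recasting (the lag order, optimizer, and toy random-walk series are my own assumptions, not anything from the paper): an AR(p) forecaster is literally a single linear layer over the last p values, so gradient descent on that layer "should" recover the classical fit or something at least as good.

    # Minimal sketch: an AR(p) forecaster expressed as a one-layer neural network.
    # The lag order p and the toy data are illustrative assumptions.
    import numpy as np
    import torch
    import torch.nn as nn

    p = 8                                        # lag order (assumed)
    rng = np.random.default_rng(0)
    series = np.cumsum(rng.normal(size=2000)).astype(np.float32)  # toy random walk

    # Build (lag-window, next-value) training pairs.
    X = np.stack([series[i:i + p] for i in range(len(series) - p)])
    y = series[p:]

    model = nn.Linear(p, 1)                      # exactly the AR(p) functional form
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()

    Xt, yt = torch.from_numpy(X), torch.from_numpy(y).unsqueeze(1)
    for _ in range(500):                         # gradient descent on the lag weights
        opt.zero_grad()
        loss = loss_fn(model(Xt), yt)
        loss.backward()
        opt.step()

    print(model.weight.data)                     # learned lag coefficients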
Deep learning models are not better for vast problem areas which have analytical design algorithms. Deep learning's succession of triumphs has been across areas where analytical design has proven difficult.
First, there are many optimal, or near-optimal, direct design algorithms for systems that are well characterized. These solutions are more concise, easier to analyze, reveal important insights, and come with guarantees regarding reliability, accuracy, stability, resource requirements, and operating regimes. Those are clear advantages over inductively learned solutions.
Second, just assuming that new algorithms are better than older algorithms is completely irrational, and anathema to the purpose and benefits of science, math, and responsible research in general.
If you are going to propose new algorithms, you need to compare the new algorithm against the previous state of the art.
Otherwise practitioners and future researchers will be driven into dead ends, deploy pointlessly bad designs, forget important knowledge, and, worst of all, lose out on what older algorithms can suggest for improving newer algorithms, with no excuse but gross carelessness.
This is something that DL researchers like to think, but it is definitely not true for time series forecasting. See https://forecastingdata.org/ for some examples where simple non-DL approaches beat state-of-the-art DL systems.
> I think we're living in a world where deep learning is winning so consistently that comparison to other methods is often just a time suck.
This is quite untrue. DL methods work well when there’s a lot of data in closed domains. DL works well by learning from corpuses of text and media where it can make reasonable interpolations.
When you don’t have enough data and you don’t have a known foundation model that you can do zero-shot prediction from, DL doesn’t work better than simpler conventional methods.
> It would be nice to provide a non-DL approach as a baseline, but I would expect it to lag behind the DL methods.
The M competitions have usually shown that very old forecasting algorithms work quite well, with, frankly, way less training overhead and data. Ensemble models usually do best, but for a lot of use cases, DL is probably overkill versus ARIMA or triple exponential smoothing.
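To make "overkill" concrete, here is a rough sketch of the kind of baseline I mean: Holt-Winters (triple exponential smoothing) via statsmodels. The toy monthly series and seasonal period of 12 are illustrative assumptions, not data from any of the competitions.

    # Rough sketch of a classical baseline: Holt-Winters / triple exponential smoothing.
    # Toy series with trend + yearly seasonality; seasonal period of 12 is assumed.
    import numpy as np
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    rng = np.random.default_rng(1)
    t = np.arange(240)
    series = (10 + 0.05 * t                          # slow trend
              + 3 * np.sin(2 * np.pi * t / 12)       # yearly seasonality
              + rng.normal(scale=0.5, size=t.size))  # noise

    train, test = series[:-24], series[-24:]

    fit = ExponentialSmoothing(train, trend="add", seasonal="add",
                               seasonal_periods=12).fit()
    forecast = fit.forecast(24)                      # fits in milliseconds, no GPU needed

    print("MAE:", np.mean(np.abs(forecast - test)))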