I think arbitrary distribution choice is dangerous. You're bound to end up using lots of quantities that are integers, or positive only (for example). "Confidence" will be very difficult to interpret.
Does it support constraints on solutions? E.g. A = 3~10, B = 4 - A, B > 0
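One way a tool could, in principle, support constraints like that is rejection sampling: draw, evaluate, keep only samples satisfying the constraint. A minimal sketch, assuming `3~10` means uniform on [3, 10] (an assumption, not necessarily the tool's semantics; none of these names are its actual API):

```python
import random

def sample_constrained(n, rng):
    """Rejection sampling for: A = 3~10, B = 4 - A, subject to B > 0."""
    samples = []
    while len(samples) < n:
        a = rng.uniform(3.0, 10.0)   # A = 3~10 (read as uniform -- an assumption)
        b = 4.0 - a                  # B = 4 - A
        if b > 0:                    # constraint: B > 0
            samples.append((a, b))
    return samples

rng = random.Random(0)
pairs = sample_constrained(5, rng)
```

Note the catch this exposes: the constraint B > 0 silently truncates A to [3, 4), so the "confidence" you assigned to A = 3~10 no longer means what it said, which is exactly the interpretability worry above.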
NN computations -- in the main -- are high school Maths. Architectures are a topology governing chains of high school Maths. This is so much the case that the building blocks can neatly be expressed in a spreadsheet, which may or may not help some people establish more mechanical sympathy for what is going on.
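To make the "spreadsheet cells" point concrete, here is one dense layer plus a ReLU written with nothing but the operations a spreadsheet offers -- multiply, add, and max (a sketch of the building block, not any particular library's API):

```python
def dense_relu(inputs, weights, biases):
    """One layer: each output unit is a weighted sum plus a bias,
    passed through max(0, x). weights[j][i] connects input i to output j."""
    outputs = []
    for j in range(len(weights)):
        total = biases[j]
        for i in range(len(inputs)):
            total += weights[j][i] * inputs[i]   # just multiply-and-add
        outputs.append(max(0.0, total))          # ReLU: just max
    return outputs

# Two inputs, two hidden units:
print(dense_relu([1.0, 2.0],
                 [[0.5, -1.0], [1.0, 1.0]],
                 [0.0, -0.5]))  # -> [0.0, 2.5]
```

An architecture is then just a choice of how many of these blocks to chain and how to wire their outputs into the next block's inputs.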
Set theory, mathematical logic and proof make sense to learn early on -- maybe even right after gaining a strong command of arithmetic. These things are very portable to thinking generally. They'll change the way you see in a way you can't unlearn.
To my understanding, an LLM -- and similar models -- has a Markov chain equivalent.
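The construction behind that claim: with a fixed context window, the next token depends only on the current window contents, so the window itself is the Markov state. A toy sketch with a 1-token window (a bigram model; made-up probabilities, but the same shape works for any fixed window size):

```python
import random

# state -> {next_token: probability}; the "state" is the context window.
transitions = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"ran": 1.0},
}

def step(state, rng):
    """One Markov step: the next token depends only on the current state."""
    tokens, probs = zip(*transitions[state].items())
    return rng.choices(tokens, weights=probs)[0]

rng = random.Random(0)
state = "the"
out = [state]
while state in transitions:
    state = step(state, rng)
    out.append(state)
print(" ".join(out))
```

For a real LLM the state space (all possible windows) is astronomically large, which is why the equivalence is a theoretical observation rather than a practical implementation strategy.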
There is an old argument from philosophy that any mechanical interpretation of mind has no need for consciousness. Or conversely, that consciousness is not needed to explain any mechanistic aspect of mind.
Yet, consciousness -- sentience -- is our primary differentiator as humans.
From my perspective, we are making strides in processing natural language. We have made the startling discovery that language encodes a lot about the thought patterns of the humans producing the text, and we now have machines which can effectively learn those patterns.