> Maybe there is something clever that can be done to avoid regenerating from the start? What you'd need to achieve is that a token that has an x% probability of leading to an incorrect output also has an x% probability of being erased.
Like giving the LLM a backspace token? There is a paper related to this: https://news.ycombinator.com/item?id=36425375
I mean, one way or another you're going to need to include some probability of backtracking, but simply adding a backtrack token seems more like a trick to make the model easier to fit than a way to make constrained generation more accurate.

Having a probability of backtracking does turn the whole generation process into an ergodic Markov chain, though, so you might be able to use something like MCMC to make it work. Technically MCMC chains only sample the target distribution asymptotically, but picking the first or nth full output might be good enough for all practical purposes, especially at low temperatures where there aren't many reasonable options in the first place.
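A minimal sketch of what that MCMC could look like. Everything here is a hypothetical stand-in: `next_token_dist` plays the role of an LLM's next-token distribution, and `constraint_weight` plays the role of the correctness check. The proposal backtracks to a random position and regenerates the suffix from the model itself, so the model-probability terms cancel in the Metropolis-Hastings ratio and acceptance depends only on the constraint weights:

```python
import random

VOCAB = ["a", "b", "c", "<eos>"]

def next_token_dist(prefix):
    # Hypothetical stand-in for an LLM's next-token probabilities.
    if len(prefix) >= 5:
        return {"<eos>": 1.0}
    return {"a": 0.4, "b": 0.3, "c": 0.2, "<eos>": 0.1}

def sample_from(dist, rng):
    r = rng.random()
    acc = 0.0
    for tok, p in dist.items():
        acc += p
        if r <= acc:
            return tok
    return tok  # guard against floating-point rounding

def generate_suffix(prefix, rng):
    # Roll the model forward from a prefix until it emits <eos>.
    seq = list(prefix)
    while not seq or seq[-1] != "<eos>":
        seq.append(sample_from(next_token_dist(seq), rng))
    return seq

def constraint_weight(seq):
    # Soft constraint standing in for "the output is correct":
    # here we strongly prefer outputs containing at least one "b".
    return 1.0 if "b" in seq else 1e-3

def mcmc_step(seq, rng):
    # Proposal: backtrack to a uniformly random position and regenerate
    # the suffix from the model. Since the suffix is drawn from the model,
    # p(proposal) * q(seq | proposal) / (p(seq) * q(proposal | seq))
    # reduces to the ratio of constraint weights.
    i = rng.randrange(len(seq))
    proposal = generate_suffix(seq[:i], rng)
    accept = min(1.0, constraint_weight(proposal) / constraint_weight(seq))
    return proposal if rng.random() < accept else seq

rng = random.Random(0)
seq = generate_suffix([], rng)
for _ in range(200):
    seq = mcmc_step(seq, rng)
print(seq)
```

After a couple hundred steps the chain spends almost all of its time on constraint-satisfying outputs, which matches the intuition above: tokens that tend to lead to bad outputs tend to get erased.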