
I think this is very similar in concept to TiDE (https://arxiv.org/abs/2304.08424), which came earlier and is cited in the paper mentioned in this post. I haven't read through the paper, so I can't point out the differences in approach yet.

However, just looking at the results in this post's paper, the numbers reported for TiDE, at least, are completely different from those in the original TiDE paper. This looks like cherry-picking a particular configuration, since the gap is too large to blame on irreproducibility alone.




In the TiDE paper, the input sequence length is tuned, while this work uses a uniform input length.



