I wonder if you could post the transcript to a git repo and allow corrections via pull request. Auto-captioning is a great first step to get phrases set to time-codes; you could then open it up to the community for corrections and translations.
Not as well as I would have expected, tbh. See the trint link above: it's pretty good, but there are lots of errors, so correcting it is quite time-consuming.