>Even a searchable transcript [...] Bonus points if clicking on a sentence seeks the video to that timestamp
YouTube's auto-generated transcripts work that way. On most videos, you click the "..." (three dots) menu to open the transcript. Clicking a text fragment then instantly seeks the video to that point.
Since the transcript is machine-generated, names and technical terms are sometimes misspelled, but it's still useful.