All the solutions I’m seeing are some third party service where I would have to upload my videos to them to get them transcribed.
Here’s one I’ve been playing with: github.com/jhj0517/Whisper-WebUI
The small faster-whisper model has been amazing across the 3 input options it gives (files, YT, or recording), tho I’m keeping its limitations in mind, and I’ve only used it with somewhat clear audio.
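If you'd rather script it than click through a web UI, here's a rough sketch of the same idea using the faster-whisper Python package directly, so nothing gets uploaded anywhere. Assumptions on my part: you've run `pip install faster-whisper`, you're on CPU with int8 quantization, and you want SRT subtitles out. The helper names (`srt_timestamp`, `to_srt`, `transcribe_locally`) are made up for this example and are not part of Whisper-WebUI.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Turn (start, end, text) tuples into the body of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_locally(path: str, model_size: str = "small") -> str:
    """Transcribe a local media file on this machine and return SRT text."""
    # Imported here so the formatting helpers work without the package installed.
    from faster_whisper import WhisperModel  # third-party dependency

    # Assumption: CPU with int8 weights; adjust device/compute_type for a GPU.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    return to_srt((s.start, s.end, s.text) for s in segments)
```

Something like `open("video.srt", "w", encoding="utf-8").write(transcribe_locally("video.mp4"))` would then write subtitles next to the file (the file names are placeholders). The model weights get downloaded on first use, but the audio itself stays local.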
Research8165@kbin.social 1 year ago
Maybe Whishper would be suitable?
princessnorah@lemmy.blahaj.zone 1 year ago
Okay yeah, I spun up a docker instance and this is cool as fuck. It seems to be exactly what OP is looking for. This is cool enough to be a post on its own tbh. It would be perfect in a ytdl workflow, as you can do the transcription by linking a video. I’ve been holding off on adding YouTube to my Jellyfin setup for just this sort of tool. I hope they add the GPU-accelerated faster-whisper models soon.
Research8165@kbin.social 1 year ago
Luckily I still had the project in my history! Glad it was useful.
nieceandtows@programming.dev 1 year ago
That looks perfect! Thank you!
princessnorah@lemmy.blahaj.zone 1 year ago
Oh that looks really cool, thank you for the link.