Swedish Speech-To-Text in OpenWebUI

OpenWebUI is a great AI toolbox for personal or family use. It offers a pretty user-friendly UI that can tie into both local and remote AI servers, with some caveats. Speech-To-Text support is one of them: the only local option is the built-in Faster Whisper server, which only lets you pick one of the models in the Systran repository.

I don’t want to use one of those models. I want to use the one by Kungliga Biblioteket, which handles both Swedish and English.

The solution is to use the huggingface-cli binary and download the repository from KB:

$ huggingface-cli download KBLab/kb-whisper-small

… then enter the model cache folder in your OpenWebUI installation. I run mine dockerized, and by default it maps the host folder data into /app/backend/data in the container. The Whisper model cache thus ends up at data/cache/whisper/models/ on the host.

Move your downloaded model folder there (unless you told it otherwise, huggingface-cli puts it under ~/.cache/huggingface/hub), and rename it to match the Systran repo name:

$ mv models--KBLab--kb-whisper-small models--Systran--faster-whisper-small
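If you would rather script the move and rename in one go, here is a rough sketch. The function name install_whisper_override and its four arguments are my own invention for illustration, not anything OpenWebUI or huggingface-cli provides, and the paths in the usage line assume the default Hugging Face cache location and the data folder mapping described above. It refuses to overwrite an existing Systran folder, so move any previously downloaded one out of the way first.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenWebUI or huggingface-cli):
# moves a downloaded Hugging Face model folder into the OpenWebUI Whisper
# cache under the Systran folder name that the STT model setting resolves to.
install_whisper_override() {
    src_repo="$1"     # repository you downloaded, e.g. KBLab/kb-whisper-small
    size="$2"         # model name you will enter in OpenWebUI, e.g. small
    hf_cache="$3"     # folder the download ended up in
    owui_models="$4"  # OpenWebUI Whisper cache, e.g. data/cache/whisper/models

    # Hugging Face cache folders are named models--<org>--<name>
    src="$hf_cache/models--$(printf '%s' "$src_repo" | sed 's|/|--|g')"
    dst="$owui_models/models--Systran--faster-whisper-$size"

    if [ ! -d "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "already exists, not overwriting: $dst" >&2
        return 1
    fi

    # Move the whole folder in one piece and print where it ended up
    mkdir -p "$owui_models" && mv "$src" "$dst" && echo "$dst"
}

# Usage (assumed paths, adjust to your setup):
#   install_whisper_override KBLab/kb-whisper-small small \
#       "$HOME/.cache/huggingface/hub" data/cache/whisper/models
```

The folder is moved whole rather than file by file, so whatever layout huggingface-cli created inside it stays intact.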

… and now you can enter “small” (in this example) as the model in the OpenWebUI STT settings and the model you downloaded will be used instead of the one from Systran.

I probably should’ve made a PR on OpenWebUI rather than a blog post, but hey. Maybe it’ll help someone in the meantime.