Comment on What are the differences between conversation, intents, intent_script, and responses?

RandomLegend@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago

I’m very confused and interested in an explanation as well

I just set up Whisper on my external GPU server to run the medium model with 0.5 s of processing time, but the built-in intents are somewhat lacking.

mike_wooskey@lemmy.d.thewooskeys.com ⁨1⁩ ⁨year⁩ ago
What’s involved in running Whisper on a computer other than the Home Assistant computer? I’m guessing it’s relatively easy to install, hopefully in Docker. How do you tell HA to use that Whisper?

Also, it’s a bit surprising that moving the voice recognition to a GPU on a (presumably) more powerful computer doesn’t improve HA performance.

- RandomLegend@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
  First of all: it increases performance tremendously. For comparison:
  
  RPI4B
  
  tiny-int8 – WER 40% – Processing time ~5s
  
  base-int8 – WER 70% – Processing time ~10s
  
  medium-int8 – Impossible
  
  HP EliteDesk 800 G5
  
  tiny-int8 – Irrelevant
  
  base-int8 – WER 70% – Processing time ~2s

  medium-int8 – WER 95% – Processing time ~8s

  External Server with GTX1660

  medium-int8 – WER 95% – Processing time ~0.5s
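
  To put these timings in perspective, here is a small sketch that computes how much faster the GTX1660 run is than the others. It uses only the approximate processing times quoted above, taken at face value; the machine labels and the `speedup` helper are names chosen for illustration.

  ```python
  # Approximate processing times in seconds, as quoted in the comparison above.
  times = {
      ("RPI4B", "tiny-int8"): 5.0,
      ("RPI4B", "base-int8"): 10.0,
      ("EliteDesk 800 G5", "base-int8"): 2.0,
      ("EliteDesk 800 G5", "medium-int8"): 8.0,
      ("GTX1660 server", "medium-int8"): 0.5,
  }

  gpu_key = ("GTX1660 server", "medium-int8")


  def speedup(machine, model):
      """How many times faster the GPU run is than the given machine/model run."""
      return times[(machine, model)] / times[gpu_key]


  for (machine, model), seconds in times.items():
      if (machine, model) != gpu_key:
          factor = speedup(machine, model)
          print(f"{machine} {model}: {seconds:.1f}s, GPU medium-int8 is {factor:.0f}x faster")
  ```

  Note that the comparison mixes model sizes: only the EliteDesk medium-int8 row is a like-for-like comparison with the GPU run.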
  
  So running it on a cheap, used 100€ GPU gets you accuracy and speed that Alexa, Siri and Google would have to respect. This is a game changer for me. I have already installed 3 M5Stack ATOM ECHOs in my home and more will follow soon. It’s incredibly accurate and quick.
  
  Now, getting it running is actually pretty easy. First, go to this link and download all the files. You have to build a custom Docker image with those files. I have no idea how to do that with plain Docker, as I am using Portainer for everything. In Portainer you have to do the following:
  
  “Images” in the navigation menu

  "+ Build new image" on the right-hand side of the header of your images list

  Name it wyoming-whisper
  
  Copy and paste the content of “Dockerfile” you downloaded earlier into the "Web Editor"
  
  Under “Upload” you click on “Select Files” and select the Makefile and run.sh
  
  Click on “Build the image”
  
  Next, go to:
  
  "Stacks" in your Navigation menu
  
  "+ Add stack" at the right side
  
  Give it a name (whisper e.g.)
  
  Copy the content of docker-compose.example.yml from the files you downloaded earlier, paste it into the stack’s web editor and deploy the stack.
  
  That will spin up a docker-compose stack using the local custom image you just built. It runs faster-whisper, which is compatible with the Wyoming protocol in Home Assistant and can run on an NVIDIA GPU with CUDA acceleration.
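
  For readers without Portainer, the two Portainer stages above probably correspond to a plain `docker build` followed by `docker compose up`. The sketch below only assembles and prints those commands. It assumes the downloaded Dockerfile, Makefile and run.sh sit in the current directory, that the tag `wyoming-whisper` is the image name docker-compose.example.yml expects, and that the Docker Compose v2 plugin is installed. None of this is confirmed in the thread, and the helper names are made up for illustration.

  ```python
  import shlex


  def build_command(tag="wyoming-whisper", context_dir="."):
      """Build an image from the Dockerfile in context_dir (Makefile and run.sh must be there too)."""
      return ["docker", "build", "-t", tag, context_dir]


  def stack_command(compose_file="docker-compose.example.yml"):
      """Start the services from the example compose file in the background."""
      return ["docker", "compose", "-f", compose_file, "up", "-d"]


  # Print the commands for review. To execute them on the GPU server,
  # pass each list to subprocess.run(cmd, check=True).
  for cmd in (build_command(), stack_command()):
      print(shlex.join(cmd))
  ```

  Printing first makes it easy to check the tag and file name against the compose file before anything is built.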
  
  As you can see in the docker-compose file, it exposes port 10300. Next:
  
  Go into Home Assistant

  Open Integrations

  Click on Wyoming

  Add a device

  Enter the IP of your external GPU server and the port 10300
  
  It will automagically know that it’s whisper and will be fully integrated into your system. You can now add it into your voice assistant.
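
  If Home Assistant cannot find the service, it may help to check first that port 10300 is reachable from another machine on the network. The sketch below is a plain TCP connection test and does not speak the Wyoming protocol; the function name and the example IP are hypothetical.

  ```python
  import socket


  def wyoming_port_open(host, port=10300, timeout=3.0):
      """Return True if a TCP connection to host:port succeeds within timeout."""
      try:
          with socket.create_connection((host, port), timeout=timeout):
              return True
      except OSError:
          # Covers refused connections, timeouts and unreachable hosts.
          return False


  # Example call (replace the placeholder IP with your GPU server's address):
  # print(wyoming_port_open("192.168.1.50"))
  ```

  A result of False usually points to a firewall, a wrong IP, or a container that is not running, rather than to Home Assistant itself.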
  
  If you look at the logs of your new docker container you can see every voice command that is sent to your new whisper.
  
  - mike_wooskey@lemmy.d.thewooskeys.com ⁨11⁩ ⁨months⁩ ago
    I finally got around to trying this. It’s super easy and significantly improved response time. I will add that the last step is to configure the Voice Assistant you’re using in Home Assistant to use the new entity you just added as the “Speech to Text” engine.
    
    Thanks, @RandomLegend@lemmy.dbzer0.com!
    
    RandomLegend@lemmy.dbzer0.com ⁨11⁩ ⁨months⁩ ago
    Ah yes, I forgot that final step.
    
    Awesome that it works for you!
    
  - mike_wooskey@lemmy.d.thewooskeys.com ⁨1⁩ ⁨year⁩ ago
    Thanks for all that info, @RandomLegend@lemmy.dbzer0.com!
    
    Currently I’m running HA on a mini PC with a Celeron CPU and some Atom Echos around the house, and it takes 15-30 seconds for HA to respond to a voice command. I have a server with a GeForce 4060Ti (8GB), so I’m going to try to install Whisper on it and direct HA to use that Whisper service, hopefully reducing the response time to something reasonable. I don’t use Portainer, but I think I’ll be able to figure out how to build the images and customize the docker-compose.yaml, thanks to your info.
    
    RandomLegend@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
    Oh my, that 4060Ti will reduce the response time to the bare minimum that Whisper is capable of, I am sure!
    
    You don’t have to customize the docker-compose file at all. That’s the plug’n’play part. You just have to make sure to build the Docker image so that it uses the Makefile and the run.sh file.
    
    Also make sure your docker environment is able to use the 4060Ti.
    
    The easiest way is to run docker run -it --rm --gpus all ubuntu nvidia-smi and see if you get proper nvidia-smi output.
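
    The same check can be scripted. This is a minimal sketch, assuming the Docker CLI is installed on the GPU server. It runs the command above without `-it` (no terminal is needed when the output is captured) and treats exit status 0 as success. The function name is made up for illustration.

    ```python
    import shutil
    import subprocess

    # Same command as above, minus -it because no interactive terminal is attached.
    GPU_TEST = ["docker", "run", "--rm", "--gpus", "all", "ubuntu", "nvidia-smi"]


    def docker_sees_gpu(cmd=GPU_TEST, timeout=300):
        """Return True if cmd exits with status 0, i.e. nvidia-smi ran inside the container."""
        if shutil.which(cmd[0]) is None:
            return False  # the executable (normally docker) is not on PATH
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0
    ```

    Calling `docker_sees_gpu()` on the server may pull the ubuntu image on first use, so the first run can take a while.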
    