Microsoft seems to be attempting this with the new Copilot in Windows. You can ask it to open applications, etc., and also chat with it. But it is still pretty clunky when it comes to the assistant part (e.g. I asked it to open my power settings and after a bit of to and fro it managed to open the Settings app, after which I had to find the power settings for myself). And they’re planning to charge for it, starting at an outrageous $30 per month.
Comment on Amazon lays off Alexa employees as 2010s voice-assistant boom gives way to AI
phoneymouse@lemmy.world 1 year ago
That would be the goal. The tricky part is matching intents that align with some API integration to whatever psychobabble the LLM spits out.
floofloof@lemmy.ca 1 year ago
ExLisper@linux.community 1 year ago
It’s actually fairly easy: “I’m a computer. From now on only communicate with me in valid JSON in the format of {"command": "name", "parameters": []}. Possible commands are "toggle_lights", "pizza", "set_timer".” And so on and so on. Current models are remarkably good at responding with valid JSON; I didn’t have any issues with that. They will still hallucinate about details (like, what would it do if you tried to set a timer for pizza?) but I’m sure you can train those models to address those issues. I was thinking about doing an OpenAI/Google Assistant bridge myself for Spotify. Like “Play me that Michael Jackson song with the music video with monsters”. The current assistant can’t handle that, but you can just ask ChatGPT for the name of the song and then pass it to the assistant. This is what they have to do, but on a bigger scale.
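For anyone who wants to try the JSON-only prompt idea, here is a minimal Python sketch of the receiving side, assuming the model's reply arrives as a plain string. The `parse_command` helper and the `ALLOWED_COMMANDS` set are made up for illustration (the command names are the ones from the prompt above), and this is not any vendor's API. It only checks that the reply is valid JSON in the agreed shape and rejects hallucinated commands.

```python
import json

# Command names taken from the example prompt; everything else is hypothetical.
ALLOWED_COMMANDS = {"toggle_lights", "pizza", "set_timer"}


def parse_command(reply: str):
    """Return (command, parameters), or raise ValueError if the reply is unusable."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    command = data.get("command")
    parameters = data.get("parameters", [])

    # Reject anything the model invented that is not in the agreed list.
    if not isinstance(command, str) or command not in ALLOWED_COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    if not isinstance(parameters, list):
        raise ValueError("parameters must be a list")

    return command, parameters
```

Something like `parse_command('{"command": "set_timer", "parameters": ["10m"]}')` would hand back the command name and its parameter list, while free-form chatter or an invented command raises `ValueError` so the caller can re-prompt.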
Deceptichum@kbin.social 1 year ago
Eh just ask the LLM to format requests in a way that can be parsed to a function.
It’s pretty trivial to get an LLM to do that.
PupBiru@kbin.social 1 year ago
in fact it’s literally the basis for the “tools” functionality in the new openai/chatgpt stuff!
that “browse the web”, “execute code”, etc is all the LLM formatting things in a specific way
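A rough Python sketch of the "format it so it can be parsed to a function" idea discussed in this thread, assuming the model replies with the {"command": ..., "parameters": [...]} shape mentioned earlier. The `tool` decorator, the `TOOLS` registry, and both example functions are hypothetical and only stand in for real integrations; this shows the general pattern, not how OpenAI's tools feature is actually implemented.

```python
import json

# Hypothetical registry mapping command names to plain Python functions.
TOOLS = {}


def tool(func):
    """Register a function under its own name so the model can request it."""
    TOOLS[func.__name__] = func
    return func


@tool
def set_timer(minutes):
    return f"timer set for {minutes} minutes"


@tool
def toggle_lights(room):
    return f"toggled lights in {room}"


def dispatch(reply: str) -> str:
    """Parse the model's JSON reply and call the matching registered function."""
    call = json.loads(reply)
    if not isinstance(call, dict):
        return "expected a JSON object"
    name = call.get("command")
    func = TOOLS.get(name) if isinstance(name, str) else None
    if func is None:
        return f"unknown command: {name!r}"
    try:
        return func(*call.get("parameters", []))
    except TypeError as exc:
        # Wrong number or shape of arguments, e.g. a hallucinated parameter list.
        return f"bad parameters: {exc}"
```

The returned string could be fed back to the model as the result of the call, which is roughly the loop the comments above describe.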