You can run smaller models locally, and they can get the job done, but they are not as good as the huge models that would not fit on your graphics card.
If you are technically adept and can run Python, you can try using this:
It has a front end, and I can run queries against it in the same API format I'd use when sending them to OpenAI.
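A minimal sketch of what that looks like, assuming the local server exposes an OpenAI-compatible endpoint (the `http://localhost:5000/v1` URL and model name here are placeholders; the real ones depend on which tool you run):

```python
# Query a local OpenAI-compatible server with the official openai client.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # hypothetical local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; use whatever model you loaded
    messages=[{"role": "user", "content": "Summarize this thread for me."}],
)
print(response.choices[0].message.content)
```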
Just_Pizza_Crust@lemmy.world 1 year ago
If you have decent hardware, running ‘Oobabooga’ locally seems to be the best way to achieve decent results. Not only can you remove the limitations by running uncensored models (wizardlm-uncensored), but you can also steer it toward more practical results by writing the first part of the AI’s response yourself.
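As a rough sketch of that “write the first part of the response” trick: with a raw completion endpoint, you end the prompt with the opening words of the AI’s reply and let the model continue from there. The endpoint and model name below are the same placeholder assumptions as above:

```python
# Prefill the start of the AI's answer so the model continues it.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="not-needed")

prompt = (
    "User: Give me a packing checklist for a weekend hike.\n"
    "Assistant: Sure, here is a practical checklist:\n1."  # prefilled opening
)
response = client.completions.create(
    model="local-model",  # placeholder
    prompt=prompt,
    max_tokens=200,
)
print(response.choices[0].text)
```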