I’d like to self-host a large language model (LLM).
I don’t mind needing a GPU and all that; at least it will be running on my own hardware, and it will probably even be cheaper than the $20/month everyone is charging.
What LLMs are you self-hosting? And what are you using to do it?
LLMs use a ton of VRAM; the more you have, the larger the models (and the longer the contexts) you can run.
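As a rough rule of thumb, you can estimate the VRAM needed for the model weights from the parameter count and the quantization level. This sketch only covers the weights themselves; the KV cache and framework overhead add more on top, so leave headroom.

```python
# Rough VRAM estimate for model weights alone (a sketch; actual usage
# also includes the KV cache and framework overhead, so add headroom).
def weights_vram_gb(params_billion: float, bits_per_weight: int) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

# A 7B model at fp16 needs roughly 14 GB just for weights...
print(round(weights_vram_gb(7, 16), 1))  # 14.0
# ...while a 4-bit quantization brings that down to about 3.5 GB.
print(round(weights_vram_gb(7, 4), 1))   # 3.5
```

This is why quantized models are so popular for self-hosting: a 4-bit quant of a 7B model fits comfortably on a consumer GPU that fp16 weights would not.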
If you just need an API, then TabbyAPI is pretty great.
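TabbyAPI serves an OpenAI-compatible HTTP API, so any client that can POST JSON will work against it. Here's a minimal sketch of building such a request; the port and route are assumptions based on common defaults, so check your own config for the actual values.

```python
import json

# Assumed default endpoint for a local TabbyAPI instance; check your
# server config for the actual host, port, and any API key settings.
API_URL = "http://localhost:5000/v1/chat/completions"

def build_chat_request(prompt: str, max_tokens: int = 256) -> bytes:
    """Build the JSON body for an OpenAI-style chat completion request."""
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode("utf-8")

# To actually send it against a running server:
# import urllib.request
# req = urllib.request.Request(
#     API_URL,
#     data=build_chat_request("Hello!"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

The nice part of the OpenAI-compatible shape is that existing SDKs and tools can usually be pointed at your local server just by changing the base URL.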
If you need a full UI, then Oobabooga’s Text Generation WebUI is a good place to start.