Frustratingly bad at self hosting. Can someone help me access LLMs on my rig from my phone

BlackSnack@lemmy.zip · 7 months ago

Frustratingly bad at self hosting. Can someone help me access LLMs on my rig from my phone

brucethemoose@lemmy.world · edit-2 7 months ago

Yeah. But it also messes stuff up from the llama.cpp baseline, and hides or doesn’t support some features/optimizations, and definitely doesn’t support the more efficient iq_k quants of ik_llama.cpp and its specialzied MoE offloading.

And that’s not even getting into the various controversies around ollama (like broken GGUFs or indications they’re going closed source in some form).

…It just depends on how much performance you want to squeeze out, and how much time you want to spend on the endeavor. Small LLMs are kinda marginal though, so IMO its important if you really want to try; otherwise one is probably better off spending a few bucks on an API that doesn’t log requests.