tl-dr
-Can someone give me step by step instructions (ELI5) on how to get access to my LLM’s on my rig from my phone?
Jan seems the easiest but I’ve tried with Ollama, librechat, etc.
…
I’ve taken steps to secure my data and now I’m going the selfhosting route. I don’t care to become a savant with the technical aspects of this stuff but even the basics are hard to grasp! I’ve been able to install a LLM provider on my rig (Ollama, Librechat, Jan, all of em) and I can successfully get models running on them. BUT what I would LOVE to do is access the LLM’s on my rig from my phone while I’m within proximity. I’ve read that I can do that via wifi or LAN or something like that but I have had absolutely no luck. Jan seems the easiest because all you have to do is something with an API key but I can’t even figure that out.
Any help?
Yeah. But it also messes stuff up from the llama.cpp baseline, and hides or doesn’t support some features/optimizations, and definitely doesn’t support the more efficient iq_k quants of ik_llama.cpp and its specialzied MoE offloading.
And that’s not even getting into the various controversies around ollama (like broken GGUFs or indications they’re going closed source in some form).
…It just depends on how much performance you want to squeeze out, and how much time you want to spend on the endeavor. Small LLMs are kinda marginal though, so IMO its important if you really want to try; otherwise one is probably better off spending a few bucks on an API that doesn’t log requests.