I’m wondering which services to test on either a 16 GB RAM “AI Capable” arm64 board or a laptop with a modern RTX GPU. I’m only looking for open source options, but I’m curious to hear what people say. Cheers!

  • SmokeyDope@lemmy.world
    English
    arrow-up
    2
    ·
    edit-2
    34 minutes ago

    I run kobold.cpp, a cutting-edge local model engine, on my local gaming rig turned server. I like to play around with the latest models to see how they improve and change over time. The current chain-of-thought thinking models, like the DeepSeek R1 distills and Qwen QwQ, are fun to poke at with advanced open-ended STEM questions.

    As for actual use: I prefer using Mistral Small 24B and treating it like a local search engine with the legitimacy of Wikipedia. I ask it questions about general things I don’t know about or want advice on, and then I usually do further research through more legitimate sources. It’s important not to take the LLM too seriously, as there’s always a small statistical chance it hallucinates some bullshit, but most of the time it’s fairly accurate and a pretty good jumping-off point for further research.

    For example, I might want an overview of how to repair holes in concrete, or general ideas on how to invest. If the LLM uses a word or related concept I don’t recognize, I grill it for clarifying info.

    I’ve used an LLM to help me go through old declassified documents and speculate on internal gov terminology I was unfamiliar with.

    I’ve used a speech-to-text model and gotten it to speak just for fun. I’ve used a multimodal model and gotten it to see/scan documents for info.

    I’ve used web search to get the model to retrieve information it didn’t know from a DDG search, again mostly for fun.

    Feel free to ask me anything, I’m glad to help get newbies started.

  • y0shi@lemm.ee
    English
    arrow-up
    2
    ·
    2 hours ago

    I’ve an old gaming PC with a decent GPU lying around and I’ve thought of doing that (I currently use it for Linux gaming and GPU-related tasks like photo editing). However, I’m currently stuck using LLMs on demand locally with Ollama. The energy costs of having it powered on all the time for on-demand queries seem a bit overkill to me…
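
The energy trade-off mentioned above is easy to put rough numbers on. A minimal sketch, assuming a constant power draw; the 60 W idle figure and the 0.30 per kWh price are hypothetical placeholders, not measurements from anyone in this thread:

```python
def yearly_energy_cost(watts: float, hours_per_day: float, price_per_kwh: float) -> float:
    """Cost of running a machine for one year at a constant power draw."""
    kwh_per_year = watts / 1000 * hours_per_day * 365
    return kwh_per_year * price_per_kwh


# Hypothetical figures: 60 W idle draw, 0.30 currency units per kWh.
always_on = yearly_energy_cost(watts=60, hours_per_day=24, price_per_kwh=0.30)
on_demand = yearly_energy_cost(watts=60, hours_per_day=2, price_per_kwh=0.30)

print(f"always on:       {always_on:.2f} per year")
print(f"two hours a day: {on_demand:.2f} per year")
```

Swap in a measured wattage and your own tariff to see whether leaving the machine on is worth it for you.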

    • pezhore@infosec.pub
      English
      arrow-up
      1
      ·
      1 hour ago

      I put my Plex media server to work doing Ollama - it has a GPU for transcoding that’s not awful for simple LLMs.

      • y0shi@lemm.ee
        English
        arrow-up
        1
        ·
        40 minutes ago

        That sounds like a great way of leveraging existing infrastructure! I host Plex together with other services on a server with an Intel CPU that is capable of transcoding. I’m quite sure I would get much better performance with the GPU machine, so I might end up following this path!

    • kiol@lemmy.worldOP
      English
      arrow-up
      1
      ·
      2 hours ago

      Have to agree on that. It certainly only makes sense to have it up when you are using it.

  • NeatoBuilds@lemmy.today
    English
    arrow-up
    4
    arrow-down
    1
    ·
    3 hours ago

    I have Immich machine learning and Ollama with Open WebUI.

    I use immich search a lot to find things like pictures of the side of the road to post on my community !sideoftheroad@lemmy.today

    I almost never use Ollama though; I’m not really sure what to do with it other than ask it dumb questions just to see what it says.

    I use the DuckDuckGo one when it automatically has an answer to something I searched, but it’s not too reliable.

  • Helmaar@lemmy.world
    English
    arrow-up
    3
    arrow-down
    1
    ·
    3 hours ago

    I was able to run a distilled version of DeepSeek on Linux. I ran it inside a Podman container with ROCm support (I have an AMD GPU). It wasn’t super fast, but for a locally deployed and self-hosted option the performance was okay. Apart from that, I have deployed Fooocus for image generation in a similar manner. Currently, I am working on deploying Stable Diffusion with either ComfyUI or Automatic1111 inside a Podman container with ROCm support.

    • kiol@lemmy.worldOP
      English
      arrow-up
      1
      ·
      2 hours ago

      Didn’t know about these image generation tools, besides Stable Diffusion. Thanks!

  • kata1yst@sh.itjust.works
    English
    arrow-up
    12
    arrow-down
    1
    ·
    5 hours ago

    I use Ollama and Open WebUI: Ollama on my gaming rig and Open WebUI as a frontend on my server.

    It’s been a really powerful combo!
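
A split setup like the one described above works because Ollama exposes an HTTP API that any machine on the network can call. A minimal sketch of talking to it directly from Python, assuming Ollama is reachable from the server (by default it listens only on localhost, so the rig would need `OLLAMA_HOST` set to a reachable address). The hostname `gaming-rig.lan` and the choice of `llama3.2` are illustrative assumptions, not details from the commenter:

```python
import json
import urllib.request

# Hypothetical hostname for the gaming rig; 11434 is Ollama's default port.
OLLAMA_URL = "http://gaming-rig.lan:11434"


def build_generate_request(prompt: str, model: str = "llama3.2",
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, **kwargs) -> str:
    """Send the prompt to the remote Ollama instance and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

Calling `ask("Give me an overview of how to repair holes in concrete.")` would return the model's reply as a string, provided the model has already been pulled on the rig.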

    • kiol@lemmy.worldOP
      English
      arrow-up
      3
      arrow-down
      1
      ·
      5 hours ago

      Would you please talk more about it? I forgot about Open WebUI, but I intend to start playing with it. Honestly, what do you actually do with it?

      • Lucy :3@feddit.org
        English
        arrow-up
        1
        ·
        57 minutes ago

        Sex chats. For other uses, simple searches are better 99% of the time. And for the 1%, something like Kagi’s FastGPT helps to find the correct keywords.

      • Oisteink@feddit.nl
        English
        arrow-up
        4
        arrow-down
        1
        ·
        edit-2
        4 hours ago

        I have the same setup, but it’s not very usable as my graphics card has 6 GB of RAM. I want one with 20 or 24, as the 6B models are a pain and the tiny ones don’t give me much.

        Ollama was pretty easy to set up on Windows, and it’s easy to download and test the models Ollama has available.
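
To see why a 6 GB card feels cramped, it helps to estimate how much memory the model weights alone need. A rough sketch, assuming memory is simply parameter count times bits per weight; it ignores the context cache and runtime overhead, so real usage is higher, and the bit widths shown are just common examples:

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Compare a few model sizes at 16-bit, 8-bit and 4-bit weights.
for params in (6, 14, 24):
    for bits in (16, 8, 4):
        print(f"{params:>2}B at {bits:>2}-bit: ~{weights_gib(params, bits):.1f} GiB")
```

By this estimate a 6B model only fits in 6 GB when quantized to around 4 bits, and a 24B model does not fit at any of these settings, which matches the wish for a 20 or 24 GB card.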

          • Oisteink@feddit.nl
            English
            arrow-up
            4
            arrow-down
            1
            ·
            edit-2
            4 hours ago

            Possibly. I’ve been running it since last summer, but like I say, the small models don’t do much good for me. I have tried llama3.1, olmo2, deepseek r1 in a few variants, qwen2, qwen2.5-coder, mistral, codellama, starcoder2, nemotron-mini, llama3.2, gemma2 and llava.

            I pay for Perplexity and Mistral, which give much better quality. Open WebUI is great, though; it’s my hardware that is lacking.

            Edit: I saw that my mate is still using it a bit, so I’ll update Open WebUI from 0.4 to 0.5.20 for him. He’s a bit anxious about sending data to the cloud, so he doesn’t mind the quality.

            • Oisteink@feddit.nl
              English
              arrow-up
              1
              arrow-down
              1
              ·
              3 hours ago

              Scrap that - after upgrading it went bonkers and will always use one of my «knowledges» no matter what I try. The web search fails even with DDG as the engine. It’s always seemed like the UI was made by unskilled labour, but this is just horrible. 2/10, not recommended.

  • colourlesspony@pawb.social
    English
    arrow-up
    6
    ·
    6 hours ago

    I messed around with Home Assistant and the Ollama integration. I have passed on it and just use the default one with voice commands I set up. I couldn’t really get Ollama to do or say anything useful. Like I asked it what’s a good time to run on a treadmill for beginners and it told me it’s not a doctor.

    • Starfighter@discuss.tchncs.de
      English
      arrow-up
      3
      ·
      edit-2
      40 minutes ago

      There are some experimental models made specifically for use with Home Assistant, for example home-llm.

      Even though they are tiny (1-3B), I’ve found them to work much better than even 14B general-purpose models. Obviously they suck for general-purpose questions because of their size alone.

      That being said they’re still LLMs. I like to keep the “prefer handling commands locally” option turned on and only use the LLM as a fallback.
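
The "local first, LLM as fallback" idea above can be illustrated in a few lines. This is only a sketch of the pattern, not how Home Assistant implements it: the command table, `match_local`, and `handle` are hypothetical names, and a real setup would rely on Home Assistant's own intent matching.

```python
from typing import Callable, Optional

# Hypothetical fixed phrases mapped to service names, for illustration only.
LOCAL_COMMANDS = {
    "turn on the lights": "light.turn_on",
    "turn off the lights": "light.turn_off",
}


def match_local(text: str) -> Optional[str]:
    """Return the service for an exact known command, or None if unknown."""
    return LOCAL_COMMANDS.get(text.strip().lower())


def handle(text: str, llm: Callable[[str], str]) -> tuple[str, str]:
    """Try the cheap, deterministic local match first; only then ask the LLM."""
    action = match_local(text)
    if action is not None:
        return ("local", action)
    return ("llm", llm(text))
```

The point of the ordering is that common commands stay fast and predictable, while the model is only consulted for requests the fixed rules cannot handle.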

    • metoosalem@feddit.org
      English
      arrow-up
      6
      ·
      5 hours ago

      Like I asked it what’s a good time to run on a treadmill for beginners and it told me it’s not a doctor.

      Kirkland brand Meeseeks energy.

    • kiol@lemmy.worldOP
      English
      arrow-up
      2
      arrow-down
      1
      ·
      5 hours ago

      Haha, that is hilarious. Sounds like it gave you some snark. AFAIK, you have to clarify by asking again when it says such things: “I’m not asking for medical advice, but…”

    • kiol@lemmy.worldOP
      English
      arrow-up
      2
      ·
      5 hours ago

      Well, let me know your suggestions if you wish. I took the plunge and am willing to test on your behalf, assuming I can.