• 0 Posts
  • 139 Comments
Joined 2 years ago
Cake day: June 4th, 2023


  • Not sure if it counts as “budget friendly”, but the best and cheapest way right now to run decently sized models is a Strix Halo machine like the Bosgame M5 or the Framework Desktop.

    Not only does it have 128GB of unified RAM that can be allocated as VRAM, it also sips power: around 10W at idle and 120W under full load.

    It can run models like gpt-oss-120b or glm-4.5-air (Q4/Q6) at full context length and even larger models like glm-4.6, qwen3-235b, or minimax-m2 at Q3 quantization.

    Running these models is otherwise not currently feasible without putting 128GB of RAM in a server mainboard or paying the Nvidia tax for an RTX 6000 Pro.
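    The quantization levels above follow from simple arithmetic: weight memory is roughly parameter count times bits per weight. A back-of-envelope sketch (the effective bits-per-weight figures are assumptions, and real usage adds KV cache and runtime overhead on top):

    ```python
    def quant_size_gb(params_billion: float, bits_per_weight: float) -> float:
        """Rough weight-memory estimate for a quantized model in GB.
        Ignores KV cache and runtime overhead, so real usage is higher."""
        return params_billion * bits_per_weight / 8

    # gpt-oss-120b at ~Q4 (assume ~4.5 effective bits/weight) fits in 128GB:
    print(round(quant_size_gb(120, 4.5), 1))   # 67.5
    # qwen3-235b at the same Q4 would overflow the 128GB budget...
    print(round(quant_size_gb(235, 4.5), 1))   # 132.2
    # ...which is why the larger models drop to ~Q3 (~3.5 bits/weight):
    print(round(quant_size_gb(235, 3.5), 1))   # 102.8
    ```

    That is why the 120B-class models run at Q4/Q6 with room to spare for context, while the 235B-class models only fit at Q3.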


  • I use Jellyfin, but I download all my songs from Tidal, Qobuz, or Deezer and tag them automatically right then and there in a clean format, so Jellyfin does not have to guess at all.

    I also have some automatic checks in place that convert incorrect metadata to a proper format, for example moving featured artists out of the title (“Song (feat. Somebody Else)”) and into the artists tag (“Somebody; Somebody Else”), among other fixes.

    Together with Finamp on desktop and mobile, everything works pretty much as expected.
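    The “(feat. …)” cleanup described above can be sketched as a small pure-Python helper. This is a hypothetical function, not part of Jellyfin or any tagging tool; in practice the result would be written back with a tagging library:

    ```python
    import re

    def split_featured(title: str, artist: str) -> tuple[str, str]:
        """Move a trailing '(feat. X)' or '[feat. X]' credit out of the
        title and into the artists tag, joined with '; '."""
        m = re.search(r"\s*[\(\[]feat\.?\s+(.+?)[\)\]]\s*$", title,
                      flags=re.IGNORECASE)
        if not m:
            return title, artist  # nothing to fix
        clean_title = title[:m.start()].rstrip()
        # split multiple featured artists on ',', '&', or 'and'
        featured = re.split(r",\s*|\s+&\s+|\s+and\s+", m.group(1))
        artists = "; ".join([artist, *featured])
        return clean_title, artists

    print(split_featured("Song (feat. Somebody Else)", "Somebody"))
    # → ('Song', 'Somebody; Somebody Else')
    ```

    A check like this runs once at import time, so the library never depends on the player guessing artist credits later.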