• 0 Posts
  • 15 Comments
Joined 1 year ago
Cake day: April 3rd, 2024

  • To quote that same document:

    Figure 5 looks at the average temperatures for different age groups. The distributions are in sync with Figure 4 showing a mostly flat failure rate at mid-range temperatures and a modest increase at the low end of the temperature distribution. What stands out are the 3 and 4-year old drives, where the trend for higher failures with higher temperature is much more constant and also more pronounced.

    That’s what I referred to. I don’t see a total age distribution for their HDDs, so I have no idea whether they simply didn’t have many HDDs in the three-to-four-year range, which would explain why they didn’t see a correlation in the total population. However, they do show a correlation between high temperatures and AFR for drives after more than three years of usage.

    My best guess is that HDDs wear out slightly faster at temperatures above 35-40 °C, so if your HDD is going to die of an age-related problem, it’s going to die a bit sooner if it’s hot. (Also note that we’re talking about average temperature, so peak temperatures might have been much higher.)

    In a home server where the HDDs spend most of their time idling (probably even below Google’s “low” usage bracket) you’re unlikely to see a difference within the expected lifespan of the HDD. Still, a correlation does exist, and it might be prudent to add some HDD cooling if temps exceed 40 °C regularly.


  • Hard drives don’t really like high temperatures for extended periods of time. Google did some research on this way back when. Failure rates start going up at an average temperature of 35 °C and become significantly higher if the HDD is operated beyond 40 °C for much of its life. That’s HDD temperature, not ambient.

    The same applies at the low end: failure rates also rise for unusually cold drives. The ideal temperature range seems to be between 20 °C and 35 °C.

    Mind you, we’re talking “going from a 5% AFR to a 15% AFR for drives that saw constant heavy use in a datacenter for three years”. Your regular home server with a modest I/O load is probably going to see much less in terms of HDD wear. Still, heat amplifies that wear.

    I’m not too concerned myself, despite the fact that my server’s HDD temps all sit somewhere between 41 and 44 °C. At 30 °C ambient there’s not much more I can do, and the HDDs spend most of their time idling anyway.
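
    For anyone who wants to keep an eye on this: here’s a minimal sketch that reads a drive’s current temperature from smartmontools’ JSON output. It assumes smartctl 7.0+ (for -j), root privileges, a drive at /dev/sda, and my 40 °C threshold; adjust all of those for your setup.

    ```python
    import json
    import subprocess

    # Ask smartctl for SMART attributes as JSON (needs smartmontools 7.0+, root).
    # "/dev/sda" is a placeholder; point it at your actual drive.
    out = subprocess.run(
        ["smartctl", "-j", "-A", "/dev/sda"],
        capture_output=True, text=True,
    ).stdout

    temp = json.loads(out).get("temperature", {}).get("current")

    if temp is None:
        print("no temperature reported")
    elif temp > 40:  # threshold from the rule of thumb above
        print(f"drive is running hot: {temp} °C")
    else:
        print(f"drive temperature looks fine: {temp} °C")
    ```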





  • That does make encryption way less appealing to me. On one of my machines, / and /home are on different drives and parts of ~ are on yet another one.

    I consider the ability to mount file systems in arbitrary folders or to replace directories with symlinks at will to be absolutely core features of unixoid systems. If the current encryption toolset can’t easily facilitate that, then it’s not quite RTM for my use case.


  • If you use a .local domain, your device MUST send its query to the mDNS multicast address (224.0.0.251 or FF02::FB) and MAY also ask another DNS provider. Successful resolution without mDNS is not an intended feature but something that just happens to work sometimes. There’s a reason why the user interfaces of devices like Ubiquiti gateways warn against assigning a name ending in .local to any device. (A minimal one-shot mDNS lookup is sketched below.)

    I personally have all of my locally-assigned names end with .lan, although I’m considering switching to a sub-subdomain of a domain I own (so instead of mycomputer.lan I’d have mycomputer.home.mydomain.tld). That would make the names much longer but would protect me against some asshat buying .lan as a new gTLD.
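
    To illustrate the “MUST ask the mDNS address” part, here’s a rough sketch of a one-shot mDNS lookup (a so-called legacy query): an ordinary DNS question sent to the multicast address, with the responder answering via unicast. It assumes dnspython for packet construction, and mycomputer.local is a made-up name.

    ```python
    import socket

    import dns.message  # dnspython, only used to build/parse the packet

    # One-shot ("legacy") mDNS query: send an ordinary DNS question to the
    # well-known multicast address 224.0.0.251, port 5353. The responder
    # replies unicast to our ephemeral source port.
    query = dns.message.make_query("mycomputer.local.", "A")  # hypothetical name

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(2.0)
    sock.sendto(query.to_wire(), ("224.0.0.251", 5353))

    try:
        data, _peer = sock.recvfrom(4096)
        for rrset in dns.message.from_wire(data).answer:
            print(rrset)
    except socket.timeout:
        print("no mDNS responder answered")
    ```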






  • Flatpak has its benefits, but there are tradeoffs as well. I think it makes a lot of sense for proprietary software.

    For everything else I do prefer native packages since they have fewer issues with interop. The space efficiency isn’t even that important to me; even if space issues should arise, those are relatively easy to work around. But if your password manager can’t talk to your browser because the security model has no solution for safe arbitrary IPC, you’re SOL.
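
    To make the password manager example concrete: browser integration usually runs over native messaging, where the browser spawns a helper process and exchanges length-prefixed JSON with it over stdin/stdout. A sandboxed browser can’t spawn arbitrary host binaries, which is exactly where this breaks down. Here’s a sketch of the host side; the framing is the standard protocol, but the echo loop is just a stand-in for a real password manager host.

    ```python
    import json
    import struct
    import sys

    # Native messaging framing: a 32-bit message length in native byte order,
    # followed by that many bytes of UTF-8 JSON.
    def read_message():
        raw_len = sys.stdin.buffer.read(4)
        if len(raw_len) < 4:
            sys.exit(0)  # browser closed the pipe
        (msg_len,) = struct.unpack("=I", raw_len)
        return json.loads(sys.stdin.buffer.read(msg_len))

    def send_message(obj):
        payload = json.dumps(obj).encode("utf-8")
        sys.stdout.buffer.write(struct.pack("=I", len(payload)))
        sys.stdout.buffer.write(payload)
        sys.stdout.buffer.flush()

    # Echo loop standing in for an actual password manager host.
    while True:
        send_message({"echo": read_message()})
    ```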



  • Jesus_666@lemmy.world to Linux@lemmy.ml · *Permanently Deleted* · 9 months ago

    Oh yeah, the equation completely changes for the cloud. I’m only familiar with local usage where you can’t easily scale out of your resource constraints (and into budgetary ones). It’s certainly easier to pivot to a different vendor/ecosystem locally.

    By the way, AMD does have one additional edge locally: They tend to put more RAM into consumer GPUs at a comparable price point – for example, the 7900 XTX competes with the 4080 on price but has as much memory as a 4090. In systems with one or few GPUs (like a hobbyist mixed-use machine) those few extra gigabytes can make a real difference. Of course this leads to a trade-off between Nvidia’s superior speed and AMD’s superior capacity.
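
    If you want to check what you’re actually working with, PyTorch reports per-device memory, and ROCm builds expose AMD cards through the same torch.cuda API (device 0 below is just an example):

    ```python
    import torch

    # ROCm builds of PyTorch report AMD GPUs through the torch.cuda namespace,
    # so this works for both vendors.
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(f"{props.name}: {props.total_memory / 2**30:.1f} GiB VRAM")
    else:
        print("no supported GPU backend found")
    ```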


  • Jesus_666@lemmy.world to Linux@lemmy.ml · *Permanently Deleted* · 9 months ago

    These days ROCm support is more common than it was a few years ago, so you’re no longer entirely dependent on CUDA for machine learning. (Although I wish fewer tools required non-CUDA users to manually install Torch in their venv just because the auto-installer assumes CUDA. At least take a parameter or something if you don’t want to implement autodetection.)
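
    That autodetection doesn’t have to be fancy, either. Here’s a naive sketch that picks a PyTorch wheel index based on which vendor CLI is on the PATH; the heuristic is mine and the index URLs reflect PyTorch’s current per-backend packaging, so treat both as assumptions:

    ```python
    import shutil

    # Crude backend detection: check which vendor tool is installed.
    def torch_index_url() -> str:
        if shutil.which("nvidia-smi"):
            return "https://download.pytorch.org/whl/cu121"
        if shutil.which("rocminfo") or shutil.which("rocm-smi"):
            return "https://download.pytorch.org/whl/rocm6.0"
        return "https://download.pytorch.org/whl/cpu"

    print(f"pip install torch --index-url {torch_index_url()}")
    ```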

    Nvidia’s Linux drivers are generally a bit behind AMD’s; driver versions before 555, for example, tended not to play well with Wayland.

    Also, Nvidia’s drivers tend not to give any meaningful information when something goes wrong. There’s typically just an error code for “the driver has crashed”, no matter why it crashed.
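
    The one breadcrumb the kernel driver does leave is Xid events in the kernel log; the numeric code maps to a failure category in Nvidia’s documentation. A small sketch that pulls them out, assuming systemd’s journalctl and the usual “NVRM: Xid” line format:

    ```python
    import re
    import subprocess

    # Nvidia's kernel module logs crashes as "NVRM: Xid (...): <code>, ..." lines;
    # the numeric code is about all the context you get.
    log = subprocess.run(
        ["journalctl", "-k", "--no-pager"],
        capture_output=True, text=True,
    ).stdout

    for line in log.splitlines():
        match = re.search(r"NVRM: Xid \((.*?)\): (\d+)", line)
        if match:
            print(f"GPU {match.group(1)}: Xid {match.group(2)}")
    ```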

    Personal anecdote for the last one: I had a wonky 4080, and tracing the problem to the card took months because the log (both on Linux and Windows) didn’t contain error information beyond “something bad happened” and the behavior had dozens of possible causes, ranging from “the 4080 is unstable if you use XMP on some mainboards” and “some BIOS setting might need to be changed” through “sometimes the card doesn’t like a specific CPU/PSU/RAM/mainboard” to “it’s a manufacturing defect”.

    Sure, manufacturing defects can happen to anyone; I can’t fault Nvidia for that. But the combination of useless logs and 4000-series cards having so many things they can possibly (but rarely) get hung up on made error diagnosis incredibly painful. I finally just bought a 7900 XTX instead. It’s slower but I like the driver better.