Some thoughts on how useful Anubis really is. Combined with comments I've read elsewhere about scrapers starting to solve the challenges, I'm afraid Anubis will soon be outdated and we'll need something else.

    • dabe@lemmy.zip · English · 1 upvote · 5 hours ago

      That solution still introduces lots of friction. At the volume and rate at which these bots want to traverse the internet, they probably don't want to fully render pages graphically, spawn extra browser processes, and then run text recognition before passing the result on to the LLM training sets. Maybe I'm wrong there, but I don't think it's that simple, and it really just shifts the cost of solving the math challenge horizontally (i.e., in both cases, the scraper or the network the scraper is running on still has to solve the challenge).

    • Passerby6497@lemmy.world · English · 3 upvotes · 14 hours ago

      Congrats on doing it the way the website owner wants! You're now in the content, and you had to spend seconds of processing power to get there (effectively being throttled by the owner), so everyone is happy. You can't overload the site, but you can still get there after a short wait.

    • Badabinski@kbin.earth · 6 upvotes · 18 hours ago

      Anubis has worked if that’s happening. The point is to make it computationally expensive to access a webpage, because that’s a natural rate limiter. It kinda sounds like it needs to be made more computationally expensive, however.

    • zalgotext@sh.itjust.works · English · 1 upvote · 18 hours ago

      LLMs can't just run Chromium unless they're tool-aware and have an agent running alongside them to facilitate tool use. I highly suspect that AI web crawlers aren't that sophisticated.