Got a warning that my blog went over 100GB of bandwidth this month… which sounded incredibly unusual. My blog is text and a couple of images, and I haven’t posted anything to it in ages… like, how would that even be possible?

Turns out it’s possible when you have crawlers going apeshit on your server. Am I even reading this right? 12,181 with 181 zeros at the end for ‘Unknown robot’? This is actually bonkers.

Edit: As Thunraz points out below, there’s a footnote that reads “Numbers after + are successful hits on ‘robots.txt’ files”, so the number isn’t scientific notation after all.

Edit 2: After doing more digging, I found the culprit: a post where I shared a few wallpapers for download. The bots have been downloading those wallpapers over and over, burning through 100GB of bandwidth in the first 12 days of November. That’s when my account was suspended for exceeding bandwidth (it’s an artificial limit I put in place a while back and forgot about…), which is also why the ‘last visit’ for all the bots is November 12th.
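
For a sense of scale, some napkin math (the ~5 MB average wallpaper size is just a guess; the real files vary):

```python
# Rough scale of the bot traffic, not exact stats.
bandwidth_bytes = 100 * 10**9      # ~100GB used before the limit kicked in
days = 12                          # November 1st through 12th
avg_wallpaper_mb = 5               # assumed average download size (a guess)

downloads = bandwidth_bytes / (avg_wallpaper_mb * 10**6)
print(f"~{downloads:,.0f} downloads total")             # ~20,000
print(f"~{downloads / days:,.0f} downloads per day")    # ~1,667
print(f"~{downloads / (days * 24):,.0f} per hour")      # ~69
```

That works out to roughly one wallpaper fetch every minute, around the clock, for nearly two weeks.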

  • Thunraz@feddit.org · 35 points · 9 hours ago

    It’s 12,181 hits, and the number after the plus sign is the robots.txt hits. See the footnote at the bottom of your screenshot.

    • benagain@lemmy.ml (OP) · 14 points · 9 hours ago

      Phew, so I’m a dumbass and wasn’t reading it right. I wonder how they’ve managed to use 3MB per visit?

      • EarMaster@lemmy.world · 4 points · 9 hours ago

        The traffic is really suspicious. Do you by any chance have a health or heartbeat endpoint that provides continuous output? That would explain why so many hits cause so much traffic.

        • benagain@lemmy.ml (OP) · 2 points · 8 hours ago

          It’s super weird for sure. I’m not sure how the bots have managed to use so much more bandwidth with only 30k more hits than regular traffic. I guess they probably don’t rely on any caching and fetch each page from scratch?

          Still going through my stats, but it doesn’t look like I’ve gotten much traffic via any API endpoint (running WordPress). I had a few wallpapers available for download and it looks like for whatever reason the bots have latched onto those.
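
          In case anyone wants to do the same digging on their own site, something like this would tally bytes served per URL from a standard combined-format access log (rough sketch; the log path is a placeholder, adjust for your own setup):

          ```python
          import re
          from collections import Counter

          # Sum bytes served per URL from an Apache/nginx combined-format access log.
          # LOG_PATH is a placeholder; point it at your own server's log.
          LOG_PATH = "/var/log/apache2/access.log"

          # Matches e.g. ... "GET /some/path HTTP/1.1" 200 12345 ...
          line_re = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]*" \d{3} (?P<size>\d+)')

          bytes_per_path = Counter()
          with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
              for line in log:
                  m = line_re.search(line)
                  if m:
                      bytes_per_path[m.group("path")] += int(m.group("size"))

          # Top ten bandwidth hogs
          for path, total in bytes_per_path.most_common(10):
              print(f"{total / 10**6:10.1f} MB  {path}")
          ```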

        • benagain@lemmy.ml (OP) · 6 points · edited · 9 hours ago

          12,000 visits, with 181 of those to the robots.txt file, makes way, way more sense. The ‘Not viewed traffic’ adds up to 136,957 too, so I should have figured it out sooner.

          I couldn’t wrap my head around how large that number was, or how many visits it would actually take to reach it in 25 days. Turns out it would be roughly 5.64 quinquinquagintillion visits per nanosecond. Call it a hunch, but I suspect my server might not handle that.
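
          For anyone who wants to check that napkin math, reading the misread figure as 12,181 followed by 181 zeros and spreading it over 25 days (rough sketch):

          ```python
          # Back-of-envelope check of the joke math above.
          visits = 12_181 * 10**181            # the misread "12,181 with 181 zeros"
          period_ns = 25 * 24 * 3600 * 10**9   # 25 days in nanoseconds

          print(f"{visits / period_ns:.2e} visits per nanosecond")  # ~5.64e+169
          ```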