I don’t know what to think, really.

The Dekaif channel has 434 videos, but YouTube is only showing 275 to clients, whether logged in or not, whether yt-dlp or official access.

This isn’t the first channel where I’ve seen this, and I’ve seen weirder stuff too. Another example is this video, the “Belt” meme: it is accessible on Grayjay but not on YouTube, which means (I think) that publicly shared videos are being deindexed while still being hosted.

You used to be able to take the video code from the URL (everything after ‘?v=’ and before ‘&’) and get the exact video in search results. Not now. The second YouTuber, Sparky, has 35 uploads, only 9 of which are visible. And I can attest that at least one of the remaining 26 is hosted, but invisible. I don’t even know how it came up using Grayjay but not YouTube or ReVanced.
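For anyone who wants to repeat the lookup, here is a minimal sketch of pulling that video code out of a URL. It assumes the standard `watch?v=` URL form; the function name and the example ID are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    # parse_qs splits the query string on '&', so anything after the
    # video code (timestamps, playlist parameters) is dropped automatically.
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


# Hypothetical URL with a placeholder ID, not a real video.
print(extract_video_id("https://www.youtube.com/watch?v=VIDEO_ID&t=10s"))
```

This only handles the `?v=` form described above; short links and other URL shapes would need their own handling.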

Basically, there’s a TON of shady underhanded shit happening at YTHQ and everyone needs to jump ship to Odysee, Peertube or some platform that won’t be clogged with AI. This is bad for everyone.

I’m posting it here mainly because I verified my findings with yt-dlp, and this new bs is successfully thwarting my attempts to archive.

3rd Oct edit: I am seeing massive differences between indexed videos and archived videos. I am still aggregating, but the share of videos definitely affected ranges from 10% to 50%.
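The comparison itself is just a set difference. Below is a rough sketch, assuming you already have two lists of video IDs: one from your archive and one from what the channel page currently lists (for example, the output of `yt-dlp --flat-playlist --print id` on the channel URL). The function name and sample IDs are hypothetical.

```python
def missing_from_index(archived_ids, indexed_ids):
    """Return the archived IDs that are no longer listed, plus their share."""
    archived = set(archived_ids)
    indexed = set(indexed_ids)
    # IDs that exist in the archive but do not show up in the current listing.
    missing = archived - indexed
    share = len(missing) / len(archived) if archived else 0.0
    return sorted(missing), share


# Placeholder IDs for illustration only.
archived = ["a1", "b2", "c3", "d4"]
indexed = ["a1", "c3"]
missing, share = missing_from_index(archived, indexed)
print(missing, f"{share:.0%}")  # ['b2', 'd4'] 50%
```

A missing ID on its own does not say why a video is gone; it could have been deleted or made private by the uploader, so each one still needs checking by hand.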

  • floofloof@lemmy.ca
    1 day ago

    Someone would have to pay for the API calls though. And that tends to mean either pay a subscription or view ads. There’s no technical reason your local LLM couldn’t call a search engine’s API to give you an ad-free search experience, and in fact you don’t need an LLM to run a local ad-free search frontend. But there is a commercial reason, namely that whoever runs the search engine API will want payment. It would be some progress to have an ad-free search subscription, but it wouldn’t get around all the megacorp fuckery that decides what search results you get.

    • Gravitywell@sh.itjust.works
      24 hours ago

      APIs are the compromise that sites have to make if they don’t want the much more resource-heavy scraping methods used.

      The most they could do is rate limit IP addresses, and that doesn’t work too well when it’s individual users who can just request a new IP any time.

      • Coopr8@kbin.earth
        21 hours ago

        Not to mention that the scraped indexes can and should be shared. Unfortunately, what OP is seeing may be a move to thwart this type of brute-force scraping, and it might end up as dynamically assigned domain addresses: the URL of a given object is temporarily assigned and streamed only to the single IP address or group of IP addresses that request it within a given timeframe, then rotated out until the object is found in search again and assigned a new URL, and so on. This is a frankly stupid use of resources, but it can effectively prevent crowdsourced indexes from proliferating. It can also be used to punish the IPs, or even MAC addresses or browser fingerprints, associated with downloading and reuploading videos, which almost certainly have steganographic fingerprinting embedded that ties each copy to whoever it was served to at the time it was downloaded.

        • Coopr8@kbin.earth
          21 hours ago

          Also, you know what would make this all even worse? Laws requiring that people prove their identity in order to consume content or pull videos… just like age verification laws now being passed in several countries. What a coincidence.