Comment on FediDB has stopped crawling until they get robots.txt support

hendrik@palaver.p3x.de ⁨1⁩ ⁨year⁩ ago

Did someone complain? Or why stop?


mesamunefire@lemmy.world ⁨1⁩ ⁨year⁩ ago
No idea, honestly. If anyone knows, let us know! I don't think it's necessarily a bad thing: if their crawler was being too aggressive, it could accidentally DDoS smaller servers. I'm hoping they are fixing that and will respect the robots.txt that some sites have.

- ada@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
  GoToSocial has a setting in development that is designed to baffle bots that don’t respect robots.txt. FediDB didn’t know about that feature and thought GoToSocial was trying to inflate their stats.
  
  In the arguments that went back and forth between the devs of the apps involved, it turned out that FediDB was ignoring robots.txt, i.e., it was badly behaved.
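
For anyone wondering what respecting robots.txt looks like on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleStatsBot`, the host `example.social`, and the robots.txt contents are all made up for illustration; this is not FediDB's or GoToSocial's actual code, just the general idea of checking before fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt an instance admin might serve (assumption,
# not copied from any real server).
ROBOTS_TXT = """\
User-agent: ExampleStatsBot
Disallow: /

User-agent: *
Disallow: /api/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to request url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    nodeinfo_url = "https://example.social/nodeinfo/2.0"  # hypothetical instance
    for agent in ("ExampleStatsBot", "OtherBot"):
        print(agent, may_fetch(ROBOTS_TXT, agent, nodeinfo_url))
```

A well-behaved crawler would run a check like this against the instance's real robots.txt and skip the request when it comes back false, no matter how cheap the request itself is.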
  
  - mesamunefire@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Interesting! Is this over a Git issue somewhere? That could explain quite a bit.
    
    Pika@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
    issue link here
    
    It was a good read
    
    ada@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
    Yep!
    
- hendrik@palaver.p3x.de ⁨1⁩ ⁨year⁩ ago
  I think it's just one HTTP request to the nodeinfo API endpoint once a day or so. Can't really be an issue regarding load on the instances.
  
  - jmcs@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
    It’s not about the impact; it’s about consent.
    
    hendrik@palaver.p3x.de ⁨1⁩ ⁨year⁩ ago
    True. The question here is: if you run a federated service, is that enough to assume you consent to federation?
    