Comment

Comment on FediDB has stopped crawling until they get robots.txt support

No idea, honestly. If anyone knows, let us know! I don't think it's necessarily a bad thing. If their crawler was being too aggressive, it could accidentally DDoS smaller servers. I'm hoping that is what they are doing: respecting the robots.txt that some sites have.


ada@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
GoToSocial has a setting in development that is designed to baffle bots that don’t respect robots.txt. FediDB didn’t know about that feature and thought GoToSocial was trying to inflate their stats.

In the arguments that went back and forth between the devs of the apps involved, it turned out that FediDB was ignoring robots.txt, i.e., it was badly behaved.

- mesamunefire@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Interesting! Is this in a Git issue somewhere? That could explain quite a bit.
  
  - Pika@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
    issue link here
    
    It was a good read.
    
    mesamunefire@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Thank you for providing the link.
    
  - ada@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
    Yep!
    
hendrik@palaver.p3x.de ⁨1⁩ ⁨year⁩ ago
I think it's just one HTTP request to the nodeinfo API endpoint once a day or so. Can't really be an issue regarding load on the instances.
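
For anyone wondering what "respecting robots.txt" would look like even for that one nodeinfo request, here is a minimal sketch using Python's standard `urllib.robotparser`. The user agent name and the sample robots.txt contents are made up for illustration, and this is not FediDB's actual code, just one way a crawler could check before fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical values for illustration; not taken from FediDB or GoToSocial.
USER_AGENT = "ExampleStatsBot"
NODEINFO_PATH = "/.well-known/nodeinfo"


def may_fetch(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to request path."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# An instance that opts out of all crawling.
deny_all = "User-agent: *\nDisallow: /\n"
# An instance that only blocks one unrelated path.
allow_nodeinfo = "User-agent: *\nDisallow: /api/\n"

print(may_fetch(deny_all, USER_AGENT, NODEINFO_PATH))
print(may_fetch(allow_nodeinfo, USER_AGENT, NODEINFO_PATH))
```

In a real crawler the robots.txt text would first be downloaded from the instance, and the nodeinfo request would be skipped whenever the check returns False.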

- jmcs@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
  It’s not about the impact; it’s about consent.
  
  - hendrik@palaver.p3x.de ⁨1⁩ ⁨year⁩ ago
    True. Question here is, if you run a federated service... Is that enough to assume you consent to federation?
    
    JustAnotherKay@lemmy.world ⁨1⁩ ⁨year⁩ ago
    
    if you run a federated service… Is that enough to assume you consent
    
    If she says yes to the marriage, that doesn’t mean she permanently says yes to sex. I can run a fully air-gapped “federated” instance if I want to.
    
    WhoLooksHere@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Why invent implied consent when explicit consent has been the standard in robots.txt for ages now?
    