Download Limit/Robots.txt

Hello! I am very glad to see that the site and forum have come back! I've got one question, though: are the download limit and the /download/ restriction for crawlers temporary? I don't really mind the limit too much (50 in one day should be more than enough!), but I am concerned about restricting crawler access to /download/, since that prevents the Wayback Machine from archiving any new downloads. I assume this is to let the site get back on its feet, so I'm not complaining; I'm just curious whether it's temporary or not :smile:

Comments

  • Crawlers are explicitly disallowed from hitting /download/* (because of weird indexing issues; the search spiders keel over there for some reason).

    As for the IA, we could consider something like periodically syncing a mirror to the IA instead of relying on Wayback access. (I note the IA crawler can also just retry anything it gets HTTP 429 on later; see the sketch below.)
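
    For anyone scripting against the site, here is a minimal Python sketch of a client that plays nicely with both rules: it checks robots.txt before fetching and backs off when it gets an HTTP 429. The host, user-agent string, and example paths are placeholders, not the site's actual configuration.

      import time
      import urllib.robotparser
      import urllib.request
      from urllib.error import HTTPError

      SITE = "https://winworldpc.com"  # assumed host

      # Load the site's robots.txt (which disallows /download/* for crawlers).
      robots = urllib.robotparser.RobotFileParser()
      robots.set_url(SITE + "/robots.txt")
      robots.read()

      def polite_fetch(path, user_agent="example-archiver"):
          """Fetch a page only if robots.txt allows it, retrying once on HTTP 429."""
          url = SITE + path
          if not robots.can_fetch(user_agent, url):
              print("robots.txt disallows", path, "- skipping")
              return None
          req = urllib.request.Request(url, headers={"User-Agent": user_agent})
          try:
              with urllib.request.urlopen(req) as resp:
                  return resp.read()
          except HTTPError as err:
              if err.code == 429:
                  # Rate-limited: wait for Retry-After (assumed to be seconds) and retry once.
                  time.sleep(int(err.headers.get("Retry-After", "60")))
                  with urllib.request.urlopen(req) as resp:
                      return resp.read()
              raise

      # Hypothetical paths: a product page should be crawlable, a /download/ URL should not.
      polite_fetch("/product/hypothetical-product")
      polite_fetch("/download/hypothetical-id")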

  • That would be fine, as long as it's backed up (and public) somehow! Was /download/ always disallowed and this is just the first time I've noticed, or was that changed after WinWorldPC's "reboot"?

  • As far as I know, automated systems have been unable to directly access downloads since WinWorld moved to the database system. Google had been indexing the individual "download" pages (rather than just the product or release pages), but I never really liked that, because the download pages were missing important information (such as the language designation) and were often confusing.

  • Ok, thank you for your answer!

  • @SomeGuy said:
    As far as I know, automated systems have been unable to directly access downloads since WinWorld moved to the database system. Google had been indexing the individual "download" pages (rather than just the product or release pages), but I never really liked that, because the download pages were missing important information (such as the language designation) and were often confusing.

    I actually did add an info box for download pages now.
