balthasar_s wrote:I wonder: / If the bot doesn't download anything, but I see the pages in a browser …
Yes, I think that approach would be okay (anything that depends upon a person manually loading or refreshing each page).
My browser stores everything temporarily in a cache (including HTML pages), and if the thing being stored is bigger than 16384 bytes, it is in a file by itself. For example, right now I have a file called ~/Library/Caches/Google/Chrome/Default/Cache/f_001691 that contains the HTML of this page, which is related to my work. In the past I have written scripts to analyse files in that cache looking for something specific.
I believe that if I visit the web pages manually, and write a program that looks in the cache for the pages I want, then I can keep copies of what it finds and I'm not violating the "automatic means" clause either in letter or in spirit.
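A script like the ones described could be sketched roughly as follows. This is an assumption-laden illustration, not the poster's actual code: the cache path and search string are placeholders, and Chrome's real cache packs small entries into shared block files, so a naive scan like this only catches entries big enough to get their own f_* file.

```python
import os

# Hypothetical values for illustration only.
CACHE_DIR = os.path.expanduser("~/Library/Caches/Google/Chrome/Default/Cache")
NEEDLE = b"oeis.org"  # whatever marks the pages you care about

def find_cached_pages(cache_dir, needle):
    """Return paths of cache files whose raw contents include `needle`."""
    hits = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            if needle in f.read():
                hits.append(path)
    return hits
```

Since the scan only reads files the browser already wrote, it never issues a new web request.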
However, most web pages are not cached in this way, because their server serves them as a "volatile" resource. For example, the xkcd fora server returns a Pragma: no-cache in its headers, but the above-linked example HTML from oeis.org does not, so the OEIS content appears in my cache but xkcd fora pages do not.
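The distinction being drawn is driven by the response headers. A rough, hedged heuristic for "will my browser keep a copy of this?" might look like the following (real browsers apply considerably more nuanced rules from the HTTP caching spec; this only checks the headers named above plus Cache-Control):

```python
def is_cacheable(headers):
    """Rough heuristic: a response is browser-cacheable unless it
    carries Pragma: no-cache or Cache-Control: no-cache/no-store.

    `headers` is a plain dict of response header names to values.
    """
    pragma = headers.get("Pragma", "").lower()
    cache_control = headers.get("Cache-Control", "").lower()
    if "no-cache" in pragma:
        return False  # e.g. the xkcd fora case described above
    if "no-cache" in cache_control or "no-store" in cache_control:
        return False
    return True  # e.g. the oeis.org case described above
```

Under this heuristic the OEIS response would come back cacheable and the fora response would not, matching what the poster observed.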
But I have also (like Randall Munroe, though independently of him) written a bot that re-loads and caches all URLs that I visit in my browser, so that I can do searches in the full content of my browsing history (I stopped using that bot after switching to Chrome, which had its own history content search). In this case it's a bot in the normal "scraping-web-loading" sense of the word, but each of its actions is in response to a manual action by me.
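The history-archiving bot described above could be sketched like this. Everything here is an assumption about the general shape, not the poster's implementation: the URL list, output directory, and the injectable `fetch` hook are all illustrative, and a real version would need throttling and error handling.

```python
import hashlib
import os
import urllib.request

def archive_history(urls, out_dir, fetch=None):
    """Re-download each visited URL and save a local copy, returning
    a {url: saved_path} index. `fetch` is injectable for testing."""
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url).read()
    os.makedirs(out_dir, exist_ok=True)
    index = {}
    for url in urls:
        # Hash the URL to get a safe, stable filename.
        name = hashlib.sha1(url.encode()).hexdigest() + ".html"
        path = os.path.join(out_dir, name)
        with open(path, "wb") as f:
            f.write(fetch(url))
        index[url] = path
    return index

def search_archive(index, needle):
    """Return the URLs whose saved page content contains `needle`."""
    hits = []
    for url, path in index.items():
        with open(path, "rb") as f:
            if needle in f.read():
                hits.append(url)
    return hits
```

Note that, unlike the cache-reading approach, this variant does re-request each page from the server, which is exactly why it counts as a bot in the ordinary scraping sense.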
slinches wrote:What can FB really do besides detecting the bots and banning user accounts? At most they could request that you take down the "offending" code, but since it isn't illegal to write a bot there's no obligation to comply.
I think at this point we're only talking about a "bot" that runs locally on a user's computer and saves a copy of what they have already manually loaded into their own browser. If it does this without submitting a new request to the internet then there probably wouldn't be any way for FB to detect it.
THAT IS NOT EPSILONISH WHICH CAN ETERNAL LIE
AND WITH STRANGE AEONS EVEN EPSILONITY MAY DIE
There is an audio or video recording of a Munroe interview/lecture in which he describes his browser-history bot and how it got him blocked from Wolfram|Alpha, but I don't know which session it was. I spent some time looking through a few of the videos (JoCo Cruise Crazy 3, the Wil Wheaton Live Talks one, and the 2007 Google talk that couldn't possibly be it because it pre-dated What If?) but then I had to move on to other things. Does anyone else know?