Home Messages Index
[Date Prev][Date Next][Thread Prev][Thread Next]
Author IndexDate IndexThread Index

Re: Allow robot access to protected content

__/ [ John Bokma ] on Thursday 08 June 2006 17:06 \__

> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
> 
>> __/ [ John Bokma ] on Thursday 08 June 2006 05:23 \__
> 
> [..]
> 
>>> If they check for that, yup. Some sites check for the crawlers, based
>>> on IP or name.
>> 
>> In worse scenarios, if you have no browser extensions, wget can be
>> used to fetch the page in question. There's the "--user-agent" option.
> 
> In worse scenarios that doesn't work, unless you work at Google.


Maybe they can set up an account for us. You know... to use as a proxy, via
SSH, or PHPProxy, or whatever. They could even put an interface on it:

        http://proxy.google.com

Imagine the banner. Imagine the integration with Google Wi-Fi, which is
currently deployed in SF and the Bay Area.
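For what it's worth, the "--user-agent" trick mentioned above amounts to
something like the sketch below. The URL is a placeholder, and the UA string
is the one Googlebot is known to send; the command is only echoed here since
the page is hypothetical:

```shell
# Sketch: fetch a protected page while presenting Googlebot's user-agent
# string. URL is a placeholder -- substitute the page that blocks browsers.
UA='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
URL='http://example.com/protected-page'
# Print the command rather than running it, since the URL is fake:
echo wget --user-agent="$UA" -O page.html "$URL"
```

Of course, as noted, sites that verify the crawler's IP or do a reverse DNS
lookup will see through this immediately.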


> [ website structures ]
>> *smile* I can remember the time when I ceased to maintain the sitemap
>> and lost that visual, conceptual idea of how my site was constructed.
>> It is now somewhat of a messy Web, which I sometimes try to
>> restructure. Same situation with E-mail accounts, Web hosts, and
>> domain names.
> 
> I think the messy web structure is the best. Websites are rarely a perfect
> tree structure.


Most are progress-driven. No top-down approach. No specification. No plan.
That is very natural for sites that expand without a pre-allocated budget
(cf. Google.com), as well as for personal sites.

Best wishes,

Roy

-- 
Roy S. Schestowitz      | Windows O/S: chmod a-x internet; kill -9 internet
http://Schestowitz.com  | Free as in Free Beer ¦  PGP-Key: 0x74572E8E
  5:15pm  up 41 days 22:48,  10 users,  load average: 1.14, 1.35, 1.40
      http://iuron.com - semantic engine to gather information
