Home Messages Index
[Date Prev][Date Next][Thread Prev][Thread Next]
Author IndexDate IndexThread Index

Re: Allow robot access to protected content

__/ [ John Bokma ] on Thursday 08 June 2006 17:06 \__

> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
> 
>> __/ [ John Bokma ] on Thursday 08 June 2006 05:23 \__
> 
> [..]
> 
>>> If they check for that, yup. Some sites check for the crawlers, based
>>> on IP or name.
>> 
>> In worse scenarios, if you have no browser extensions, wget can be
>> used to fetch the page in question. There's the "--user-agent" option.
> 
> In worse scenarios that doesn't work, unless you work at Google.


Maybe they can set up an account for us. You know... to use as a proxy, via
SSH, or PHPProxy, or whatever. They could even put an interface on it:

        http://proxy.google.com

Imagine the banner. Imagine the integration with Google Wi-Fi, which is
currently deployed in SF and the Bay Area.
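For what it's worth, the "--user-agent" trick mentioned above amounts to
something like the sketch below. The URL is a placeholder, and the UA string
is the one Googlebot is known to send; the command is only echoed here since
the page is hypothetical:

```shell
# Sketch: fetch a protected page while presenting Googlebot's user-agent
# string. URL is a placeholder -- substitute the page that blocks browsers.
UA='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
URL='http://example.com/protected-page'
# Print the command rather than running it, since the URL is fake:
echo wget --user-agent="$UA" -O page.html "$URL"
```

Of course, as noted, sites that verify the crawler's IP or do a reverse DNS
lookup will see through this immediately.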


> [ website structures ]
>> *smile* I can remember the time when I ceased to maintain the sitemap
>> and lost that visual, conceptual idea of how my site was constructed.
>> It is now somewhat of a messy Web, which I sometimes try to
>> restructure. Same situation with E-mail accounts, Web hosts, and
>> domain names.
> 
> I think the messy web structure is the best. Websites are rarely a perfect
> tree structure.


Most are progress-driven. No top-down approach. No specification. No plan.
That is very natural for sites that expand without a pre-allocated budget
(cf. Google.com), as well as for personal sites.

Best wishes,

Roy

-- 
Roy S. Schestowitz      | Windows O/S: chmod a-x internet; kill -9 internet
http://Schestowitz.com  | Free as in Free Beer ¦  PGP-Key: 0x74572E8E
  5:15pm  up 41 days 22:48,  10 users,  load average: 1.14, 1.35, 1.40
      http://iuron.com - semantic engine to gather information
