Mystery with Googlebot and IT-Austria

Posted on: June 01 2005

I put a rather private page online, in a subdirectory which didn't exist before. Nothing links to that page, at least nothing I know of. Two minutes after I had uploaded it, Googlebot came along and indexed it.
WTF? How did it do that? I mean, I'm no web expert, but how did Googlebot find its way there?
I immediately added a no-archive meta tag to that page and moved it into another directory.
Even scarier: just two hours ago, someone from IT-Austria (what is that, btw?) came along to the page again and read it. How did he get there? The referer log doesn't tell me anything. So now I have added a .htaccess password. This will help, but I would still like to know how this was possible. Hm.


Did you surf the web _from_ that page?
If you did, your browser sent the page's address as the referer, and that left a trace in the logs of the sites you visited.
Anyway, that's quite strange, and scary.
2005-06-01 18:27:00

That's a good point, but no, I didn't.
2005-06-02 06:18:00

Niko, I came across the same problem some years ago. Here is something you will find useful:

By putting a file called robots.txt in your root web folder, you can tell bots to ignore some of your folders and files.
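To make that concrete, here is a minimal sketch of such a rule and a quick way to check what it tells a well-behaved bot, using Python's standard `urllib.robotparser`. The directory name `/private/` and the `example.com` URLs are made up for illustration, and keep in mind that robots.txt is only a request that polite crawlers honor, not access control.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "/private/" is an illustrative directory name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/private/page.html",
        "http://example.com/index.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

One thing to consider: the file itself is publicly readable, so listing a secret directory in it also announces that the directory exists.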
2005-06-02 12:45:00

That's why I added the meta-tags I mentioned in the text, they should do the job too. :)
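For anyone who wants to double-check which robots directives a page really carries, here is a small sketch using Python's standard `html.parser`. The sample page and its `noindex, noarchive` content are assumptions for illustration, not the actual page. As far as I know, `noarchive` only asks search engines not to show a cached copy, while `noindex` is the directive that asks them not to list the page at all.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, noarchive".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only to exercise the parser.
PAGE = (
    '<html><head><meta name="robots" content="noindex, noarchive"></head>'
    "<body>private</body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```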
2005-06-02 14:17:00

Hm. The same happened to me. Google indexed a page which it couldn't have known about, because nothing links to it.
I thought that it was my mistake, so I forgot about it. Now I will go and look into the HTTP headers of my Apache...
2005-06-06 21:40:00

Back in the day I used a program called GetRight. It had a feature that allowed you to browse an entire site and see all of its files and such. I'm not sure if it was just some sort of spider that followed links from the main page, but it impressed me at the time.
2005-06-09 06:38:00
