This is a running list of problems and solutions found in the "Pages not found" section of the AWStats traffic report.
This is the first place I head. It is amazing how many broken links exist on websites. Note that SE = search engine. Listed below are several types of problems you may find. An example URL and referrer are given for each, along with possible fixes. If a referrer is missing, it usually means the URL was requested directly (typed in, bookmarked, or fetched by a bot) rather than clicked on from a page.
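If you want to cross-check the report against the raw server log, the following is a minimal sketch that counts 404 requests by URL and referrer. It assumes the log is in the Apache combined log format; the regular expression, the function name `count_not_found`, and the sample lines (including their IP addresses and dates) are illustrative assumptions, not something AWStats itself provides.

```python
import re
from collections import Counter

# Assumed format (Apache "combined"):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?:\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)"'
)

def count_not_found(lines):
    """Count (url, referrer) pairs for requests that returned 404."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            referrer = match.group("referrer")
            if referrer == "-":
                referrer = ""  # "-" means no referrer was sent
            counts[(match.group("url"), referrer)] += 1
    return counts

# Made-up sample lines for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /my-images/myimage.gif HTTP/1.1" 404 209 "http://www.hotlinkingsite.com" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /test.htm HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for (url, referrer), hits in count_not_found(sample).most_common():
    print(hits, url, referrer or "(no referrer)")
```

To run it on a real log, pass an open file object instead of `sample`, adjusting the pattern if your server uses a different log format.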
If you see an odd URL from a familiar referrer, the URL may be commented out on the referring web page. I'm not sure why commented-out markup gets requested, unless a bot is parsing it.
URL: /my-images/myimage.gif
Referrer: https://www.mysite.com/secure-webpage.html
Fix: The images are in a secure directory, i.e. one that requires a password to access.
URL: /oc/www.mysite.com/oc/oc-pricing.htm
Referrer: none
Fix: www.mysite.com/oc/oc-pricing.htm appears on a page in the oc directory without the http:// before it, so it is treated as a relative URL. This scenario happens when a return URL is needed (as a parameter in another URL, for example) but the http:// part is not supposed to be specified. For example:
<input name="return" value="www.mysite.com/oc/oc-pricing.htm" type="hidden" />
or
<a href="http://www.aitsafe.com/cf/review.cfm?return=www.mysite.com/oc/oc-pricing.htm">Price</a>
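To see why a scheme-less value ends up as that odd URL, here is a small sketch using Python's standard `urllib.parse.urljoin`, which resolves links the way a browser or crawler would. The page address `http://www.mysite.com/oc/index.htm` and the helper name `resolved_path` are assumptions made up for the illustration.

```python
from urllib.parse import urljoin, urlsplit

def resolved_path(page_url, link):
    """Return the path a client requests when it follows `link` found on `page_url`."""
    return urlsplit(urljoin(page_url, link)).path

page = "http://www.mysite.com/oc/index.htm"  # hypothetical page in the oc directory
link = "www.mysite.com/oc/oc-pricing.htm"    # written without http://

# Without a scheme, the value is resolved relative to the page's directory.
print(resolved_path(page, link))
# → /oc/www.mysite.com/oc/oc-pricing.htm

# With the scheme, it is an absolute URL and resolves as intended.
print(resolved_path(page, "http://" + link))
# → /oc/oc-pricing.htm
```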
URL: /my-images/myimage.gif
Referrer: http://www.hotlinkingsite.com
Fix: hotlinkingsite.com is trying to hotlink your image but cannot find it, which is a good thing. This happened to me after I found a site hotlinking an image and renamed the image.
URL: /mybad.com%20homepage%20link
Referrer: http://www.mydomain.com/
Fix: The problem was that an image link to mybad.com had the image's longdesc attribute filled in with the text "mybad.com homepage link". Since longdesc expects a URL, the text was interpreted as one.
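One way to catch this kind of mistake is to scan a page for longdesc values and see what path each would turn into if followed. The sketch below uses only the Python standard library; the class name `LongdescFinder`, the helper `requested_path`, and the sample markup are assumptions for illustration, not the original page.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urljoin, urlsplit

class LongdescFinder(HTMLParser):
    """Collect the longdesc values of all <img> tags."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "longdesc" and value:
                    self.values.append(value)

def requested_path(page_url, value):
    """Path a client requests if it treats `value` as a relative URL."""
    return quote(urlsplit(urljoin(page_url, value)).path)

# Hypothetical markup resembling the case described above.
html = ('<a href="http://mybad.com/"><img src="logo.gif" alt="logo" '
        'longdesc="mybad.com homepage link"></a>')

finder = LongdescFinder()
finder.feed(html)
for value in finder.values:
    print(requested_path("http://www.mydomain.com/", value))
# → /mybad.com%20homepage%20link
```

Any longdesc value containing spaces is a strong hint that descriptive text was entered where a URL belongs.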
URL: test.htm
Referrer:
Fix: test.htm was uploaded for testing and later deleted. Even though there was no link to test.htm, the search engines apparently picked it up anyway.
URL: /howto/%22mailto:news&
Referrer:
Fix: This was an example showing how to use an HTML mailto: link. The SEs picked it up as a real email address. You can spell out the colon, or change something else, so the example does not look real. Publishing a real mailto: address is also not a good idea, since spambots can easily grab the email address and use it. Read about how to avoid this problem.
URL: unknown web page
Referrer:
Fix: If you see odd pages here, such as forms you do not use, they are most likely from people probing to find out whether you have a particular form so they can exploit it. Old web pages may be in the list, too.
URL: /_vti_bin/shtml.exe/_vti_rpc
Referrer:
Fix: It looks like a FrontPage or Expression Web hidden directory was uploaded to the server. FP/EW have several hidden directories, all beginning with an underscore '_'. They do NOT need to be uploaded to the server. Here is a list:
This usually happens when an external FTP program is used to upload the web site. FP/EW will not upload these directories. Delete the server copy only.
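Before uploading with an external FTP program, you could list the directories in your local copy whose names begin with an underscore, so you know what to exclude. This is a minimal sketch; the function name `underscore_dirs` is an assumption, and it matches any directory starting with '_', so review the output, since one of your own directories may match too.

```python
import os

def underscore_dirs(site_root):
    """Return relative paths of directories whose names start with '_'."""
    found = []
    for current, dirnames, _filenames in os.walk(site_root):
        for name in dirnames:
            if name.startswith("_"):
                full_path = os.path.join(current, name)
                found.append(os.path.relpath(full_path, site_root))
    return sorted(found)

# List candidates under the current directory (assumed to be the local site root).
for path in underscore_dirs("."):
    print(path)
```

The script only reports directories; it does not delete or modify anything.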