Question: Sitemap Creator is not crawling all the pages on my site. Why not?
Answer: Most crawling issues are caused by one of the following problems:
- The domain name in the Root URL field does not match the domain name your website uses when linking to itself. For example, if you enter “example.com” in the Root URL field but your website links to itself as “www.example.com” (note the “www.” prefix), Sitemap Creator may not index all the URLs. Sitemap Creator treats any URL with a different domain as an external site. This includes subdomains, which is why the “www.” prefix matters. Try changing your Root URL to match and crawl your website again. Note: If your website uses multiple domains interchangeably and you cannot correct it, you can use the “Domain Alias” feature to tell Sitemap Creator to treat the domains as one. The sitemap protocol only allows one domain per sitemap, so Sitemap Creator will change all URLs to match the domain used in the Root URL.
- Your pages are not fully interlinked. Like Googlebot, Sitemap Creator discovers pages by following the links on your site. If no page links to a given page, Sitemap Creator cannot find it. (You can work around this with the “Additional Root URLs” setting, which gives the crawler multiple starting points, but it is better from an SEO perspective to have your pages interlinked.)
- Your pages have the Canonical URL tag set incorrectly. Sitemap Creator respects the Canonical URL tag. If this tag is present on your pages and used incorrectly, it will cause Sitemap Creator to ignore those pages. The easiest way to tell whether this affects your website is to check Sitemap Creator’s “Log” tab. If the log reports that many pages are omitted because of the Canonical URL tag, this problem probably affects your website. Check your Canonical URL tags and correct or remove them as necessary. Note: You can tell Sitemap Creator to ignore Canonical URL tags in Advanced Project Settings, but Google will still not index your website correctly. We always recommend correcting or removing these tags if they are set up incorrectly.
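If you would like to check a page by hand for the first and third problems above, the following Python sketch may help. It is not part of Sitemap Creator and does not reflect how Sitemap Creator works internally. The names `PageScanner` and `diagnose` are made up for this illustration, and the script assumes you have already saved the page's HTML source as a string. It lists the hosts your links point to that differ from the Root URL host, and reports whether the page's canonical URL differs from the page's own URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class PageScanner(HTMLParser):
    """Collects <a href> targets and the <link rel="canonical"> target of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if tag == "a":
            self.links.append(href)
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = href


def diagnose(root_url, page_url, html):
    """Report link hosts that differ from the root host, plus the canonical URL.

    root_url and page_url must include the scheme (for example "http://").
    """
    root_host = urlsplit(root_url).hostname
    scanner = PageScanner()
    scanner.feed(html)

    other_hosts = set()
    for href in scanner.links:
        # Resolve relative links against the page URL before comparing hosts.
        host = urlsplit(urljoin(page_url, href)).hostname
        if host and host != root_host:
            other_hosts.add(host)

    canonical = urljoin(page_url, scanner.canonical) if scanner.canonical else None
    return {
        "other_hosts": sorted(other_hosts),
        "canonical": canonical,
        # Exact string comparison: even a trailing-slash difference counts.
        "canonical_differs": canonical is not None and canonical != page_url,
    }


if __name__ == "__main__":
    sample = (
        '<html><head><link rel="canonical" href="http://www.example.com/"></head>'
        '<body><a href="/about">About</a>'
        '<a href="http://www.example.com/contact">Contact</a></body></html>'
    )
    print(diagnose("http://example.com/", "http://example.com/products", sample))
```

In the sample, the Root URL uses “example.com” while one link and the canonical tag use “www.example.com”, so the report lists “www.example.com” as a different host and flags the canonical URL as differing from the page URL. A canonical URL that differs from the page URL is not always a mistake, but if it appears on many unrelated pages it is worth a closer look.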
If you continue to have difficulty crawling your site, please contact technical support and we would be happy to assist you.