Question: When I run InSite on my website I receive a “System out of memory” exception. How can I fix it?
Answer: While InSite has no hard limits, it is designed to run on small to medium-sized websites. If your site is very large, your system may run out of memory. We’ve successfully tested InSite on sites of up to 50,000 pages; your mileage may vary depending on your system configuration.
If you’d like to check a site larger than 50,000 pages, we recommend breaking it down into sections by specifying alternate root URLs. For example, you might split your site into several projects with the following root URLs:
www.example.com
www.example.com/sales
www.example.com/support
This will not only enable you to work around the memory limitations of your system, but also make reports and rechecking more manageable.
Note: You will want to exclude the “/sales” and “/support” directories from the www.example.com project to prevent those pages from being checked twice.
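The split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not part of InSite: the function names and the longest-prefix rule are assumptions made for the example. Each URL is assigned to the most specific matching root, which is the same effect as excluding “/sales” and “/support” from the top-level project.

```python
from urllib.parse import urlsplit

# Root URLs from the example above, one per project.
ROOTS = ["www.example.com", "www.example.com/sales", "www.example.com/support"]


def _normalize(url):
    """Return host + path without the scheme or a trailing slash."""
    # urlsplit only recognizes the host if the URL contains "//".
    parts = urlsplit(url if "//" in url else "//" + url)
    return (parts.netloc + parts.path).rstrip("/")


def assign_project(url, roots=ROOTS):
    """Pick the most specific (longest) root that contains the URL.

    Hypothetical helper: it mimics excluding sub-directories from the
    parent project so that no page is counted in two projects.
    """
    target = _normalize(url)
    matches = [r for r in roots if target == r or target.startswith(r + "/")]
    return max(matches, key=len) if matches else None


if __name__ == "__main__":
    for page in ("http://www.example.com/sales/pricing.html",
                 "http://www.example.com/about.html"):
        print(page, "->", assign_project(page))
```

Running a list of known URLs through a helper like this before setting up the projects gives a rough idea of how many pages each project would contain.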
Another approach to reducing the amount of memory required is to disable any features in InSite you are not using. For example, if you are primarily using InSite for spell checking and link checking, you can disable the Word Count and Keyword Analysis features to conserve memory. If you are only interested in word counting, try disabling InSite’s spell checking and link checking features.
A third way to reduce the amount of memory required is to exclude some pages from the crawl. If your website contains multiple pages with the same content, try creating rules in the Exclusion List to filter out these duplicate pages. For example, many online stores offer product pages in a “Printer Friendly” format (in addition to the standard product page). The text content on the Printer Friendly page is usually identical to the content on the standard product page, so you can safely exclude these Printer Friendly pages.
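To preview the effect of such exclusion rules on a list of URLs, a small Python sketch like the one below can help. It is not InSite’s Exclusion List syntax: the shell-style wildcard patterns and the “/print/” and “print=1” URL conventions are assumptions chosen for illustration, so adapt them to how your own site marks its Printer Friendly pages.

```python
from fnmatch import fnmatchcase

# Hypothetical wildcard patterns for printer-friendly pages.
# "*" matches any run of characters; adjust to your site's URL scheme.
EXCLUDE_PATTERNS = ["*/print/*", "*print=1*"]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the URL matches any exclusion pattern."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


def filter_crawl(urls, patterns=EXCLUDE_PATTERNS):
    """Keep only the URLs that no exclusion pattern matches."""
    return [url for url in urls if not is_excluded(url, patterns)]


if __name__ == "__main__":
    pages = [
        "http://www.example.com/products/widget.html",
        "http://www.example.com/products/print/widget.html",
        "http://www.example.com/products/widget.html?print=1",
    ]
    for url in filter_crawl(pages):
        print(url)
```

Comparing the length of the list before and after filtering shows roughly how many pages the rules would remove from the crawl.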
If your site is not large and you still receive this error, a single large page that generates many false positives may be the cause. For example, a list of URLs on a website statistics page can produce several hundred thousand spelling errors. Check your results and, if necessary, exclude that page from the crawl; you probably didn’t want those errors in your results anyway.