The Indexing Service Companion contains the following features:
Enables Index Server (Windows NT 4.0 Server) or Indexing Service (Windows 2000
Server and above) to search any web server or ODBC-compliant database.
Integrated web robot extracts content from websites. Includes support for robots.txt
files and robots meta tags. The robot can navigate sites that use HTML frames.
Optional ability to retrieve URLs containing QueryStrings.
Ability to retrieve individual URLs from servers.
Ability to specify crawl depth of web robot.
Support for crawling websites using ASP/ASP.NET sessions.
Supports crawling of secure websites using Secure Sockets Layer (SSL).
Supports crawling of websites secured using HTTP Basic Authentication.
Supports crawling of Microsoft IIS websites secured using NTLM (Windows integrated
security).
Ability to retrieve binary files from servers, including Adobe Acrobat PDF, Microsoft
Office documents and even images.
Ability to save content from web crawls to any ODBC database (e.g. Access or SQL
Server), which can allow you to create a website search facility using SQL
(SQL Server full-text catalog searching recommended).
Automatic uploading of saved content to an FTP site.
Ability to save content from web crawls into an XML file.
Support for full or incremental project updates of both web and database content,
meaning that Indexing Service only has to re-index content that has changed.
The Indexing Service Companion is configured by editing a plain text
configuration file.
Indexing Service Companion can be run from the command line, and scheduled using the
Windows Task Scheduler.
Full reporting of activity to an external plain text log file.
Flexible output options mean that administrative access to Index Server is not
necessarily required.
Facility for creating a site summary page that can be used to optimise sites for search
engines, by showing which keywords are used on a site, the <title> tags used by each
page, and so on. The site summary page shows:
The top 40 most used keywords listed in the keywords meta tags in all of the pages
crawled.
The body of the summary page contains the following from the pages encountered in the
web crawl:
A list of page titles.
A list of page headings (H1 to H3 inclusive).
A list of bold (and strong) text.
A list of external hyperlinks in the site.
A list of italicised text.
A selection of relevant paragraph text. Paragraphs are assumed to be relevant if they
contain one or more of the 40 most frequently used keywords.
A list of page description meta tags.
A list of page keywords.
A list of page description meta tags hyperlinked back to the URL from which each
description was extracted.
A list of page keywords hyperlinked back to the URL from which each was extracted.
Fully documented VBScript examples show how to make use of the Indexing Service
Companion in ASP pages.
Includes sample code showing how to use the Indexing Service Companion with Index
Server/Indexing Service on ASP.NET (C# and VB.NET). Display your search results in
DataGrids, Repeaters and other Web Form components.
Detailed documentation in Microsoft's HTML Help format.
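
The site summary rule described above (rank the keywords found in keywords meta tags, keep the top 40, and treat a paragraph as relevant if it contains at least one of them) can be sketched in a few lines of Python. This is only an illustration, not the Indexing Service Companion's own code: the function names, the sample page data, and the use of case-insensitive substring matching are all assumptions made for the example.

```python
from collections import Counter


def top_keywords(pages, limit=40):
    """Rank keywords from each page's keywords meta tag by frequency.

    `pages` is a hypothetical structure: a list of dicts with a
    comma-separated "keywords" string, as found in a keywords meta tag.
    """
    counts = Counter()
    for page in pages:
        for keyword in page["keywords"].split(","):
            keyword = keyword.strip().lower()
            if keyword:
                counts[keyword] += 1
    # most_common() returns (keyword, count) pairs, highest count first.
    return [keyword for keyword, _ in counts.most_common(limit)]


def relevant_paragraphs(paragraphs, keywords):
    """Keep paragraphs containing one or more of the given keywords.

    Assumption: a simple case-insensitive substring test stands in for
    whatever matching the real product performs.
    """
    selected = []
    for text in paragraphs:
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            selected.append(text)
    return selected


# Hypothetical crawl results for two pages.
pages = [
    {
        "keywords": "search, indexing, crawler",
        "paragraphs": ["Our crawler follows links.", "Contact us by phone."],
    },
    {
        "keywords": "search, robots",
        "paragraphs": ["Search results are ranked."],
    },
]

keywords = top_keywords(pages)
all_paragraphs = [p for page in pages for p in page["paragraphs"]]
summary = relevant_paragraphs(all_paragraphs, keywords)
```

With this sample data, "search" ranks first because it appears in both keywords meta tags, and the paragraph "Contact us by phone." is dropped because it contains none of the ranked keywords.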