Features of the Indexing Service Companion

The Indexing Service Companion provides the following features:
- Enables Index Server (Windows NT 4.0 Server) or Indexing Service (Windows 2000
Server and above) to search any web server or ODBC-compliant database.
- Integrated web robot extracts content from websites. Includes support for robots.txt
files and robots meta tags. The robot can navigate sites that use HTML frames.
- Optional ability to retrieve URLs containing query strings.
- Ability to retrieve individual URLs from servers.
- Ability to specify crawl depth of web robot.
- Support for crawling websites using ASP/ASP.NET sessions.
- Supports crawling of secure websites using Secure Sockets Layer (SSL).
- Supports crawling of websites secured using HTTP Basic Authentication.
- Supports crawling of Microsoft IIS websites secured using NTLM (Windows integrated
security).
- Ability to retrieve binary files from servers, including Adobe Acrobat PDF, Microsoft
Office documents and even images.
- Ability to save content from web crawls to any ODBC database (e.g. Access or SQL
Server), so that you can build a website search facility using SQL queries
(SQL Server full-text catalog searching is recommended).
- Automatic uploading of saved content to an FTP site.
- Ability to save content from web crawls into an XML file.
- Support for full or incremental project updates of both web and database content,
meaning that Indexing Service only has to re-index content that has changed.
- The Indexing Service Companion is configured by editing a plain text
configuration file.
- Indexing Service Companion can be run from the command line, and scheduled using the
Windows Task Scheduler.
- Full reporting of activity to an external plain text log file.
- Flexible output options mean that administrative access to Index Server is not
necessarily required.
- Facility for creating a site summary page that can be used to optimise sites for search
engines, by showing what keywords are used on a site, the <title> tags used by each
page etc. The site summary page shows:
  - The top 40 most used keywords listed in the keywords meta tags in all of the pages
  crawled.
  - The body of the summary page contains the following from the pages encountered in the
  web crawl:
    - A list of page titles.
    - A list of page headings (H1 to H3 inclusive).
    - A list of bold (and strong) text.
    - A list of external hyperlinks in the site.
    - A list of italicised text.
    - A selection of relevant paragraph text. Paragraphs are assumed to be relevant if they
    contain one or more of the 40 most frequently used keywords.
    - A list of page description meta tags.
    - A list of page keywords.
    - A list of page description meta tags hyperlinked back to the URL from which each
    description was extracted.
    - A list of page keywords hyperlinked back to the URL from which it was extracted.
- Fully documented VBScript examples show how to make use of the Indexing Service
Companion in ASP pages.
- Includes sample code showing how to use the Indexing Service Companion with Index
Server/Indexing Service on ASP.NET (C# and VB.NET). Display your search results in
DataGrids, Repeaters and other Web Form components.
- Access to product updates and technical support.
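
The site summary rule described in the feature list (rank the keywords found in keywords meta tags, keep the top 40, and treat a paragraph as relevant if it contains at least one of them) can be illustrated with a short sketch. This is not the Indexing Service Companion's own code: the function names `top_keywords` and `relevant_paragraphs`, the comma-separated keyword format, and the case-insensitive whole-word matching are all assumptions made for illustration.

```python
import re
from collections import Counter


def top_keywords(keyword_tags, limit=40):
    """Rank keywords across all pages and return the `limit` most used.

    `keyword_tags` holds one string per crawled page: the content of that
    page's keywords meta tag, assumed here to be comma-separated.
    """
    counts = Counter()
    for raw in keyword_tags:
        for keyword in raw.split(","):
            keyword = keyword.strip().lower()
            if keyword:
                counts[keyword] += 1
    return [keyword for keyword, _ in counts.most_common(limit)]


def relevant_paragraphs(paragraphs, keywords):
    """Keep paragraphs that contain one or more of the given keywords.

    Matching is case-insensitive and on whole words or phrases (an
    assumption; the product's exact matching rule is not documented here).
    """
    patterns = [
        re.compile(r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE)
        for keyword in keywords
    ]
    return [p for p in paragraphs if any(pat.search(p) for pat in patterns)]


if __name__ == "__main__":
    # Hypothetical crawl data, for illustration only.
    tags = ["search, indexing, robots", "search, odbc"]
    paras = ["Full-text search over crawled pages.", "Contact us for pricing."]
    print(relevant_paragraphs(paras, top_keywords(tags)))
```

With the hypothetical data above, only the first paragraph is kept, because it is the only one containing a ranked keyword ("search").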