Web Robot View of http://www.brettb.com/IndexServerCompanion.asp

Page Item Value
Title The Index Server Companion
Description Index Server Companion; Allows Index Server to index content from remote websites and ODBC databases
Keywords index server companion, index server, site server search, odbc, database, index, remote, access, sql server, oracle, indexing service, iis, asp, windows, application, download, pdf, word, files, separate, remote
Robots Meta Tag  
Page Content

The Index Server Companion

This article describes the Index Server Companion, an application I have created that allows Microsoft Index Server to index content from remote websites and ODBC databases. It is available for purchase through my business website. An evaluation version is also available.

The Problem

Index Server is a great product! On the administrative side, it is easy to install, performance is good, and once it is installed, maintenance tasks are minimal. Developing search applications with ASP is also fairly straightforward thanks to the Query and Utility server components.

The main limitation of Index Server is that it can really only index content hosted on the same machine or network as the machine running the Index Server service. Although it is possible to set up a share to a Unix/Linux webserver using a file sharing solution such as Samba, this isn't always satisfactory: Index Server is not case sensitive with respect to filenames, whereas Unix/Linux filesystems are, and this can cause problems when displaying search results.

Another issue is that it can be a chore to prevent Index Server from indexing certain content on a server. Unlike a web robot, it has no concept of the Robots Exclusion Standard specification (i.e. robots.txt files) and is unaffected by the 'robots' meta tag.
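For readers unfamiliar with the Robots Exclusion Standard, the following is a minimal Python sketch of how a web robot might consult a robots.txt file before fetching a page. It uses the standard library's `urllib.robotparser`; the robot name `ExampleBot`, the example.com URLs and the sample rules are illustrative assumptions, not taken from the Index Server Companion.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Build a parser from robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


robots = build_parser(SAMPLE_ROBOTS)

# A well-behaved robot checks each URL before requesting it.
allowed = robots.can_fetch("ExampleBot", "http://www.example.com/index.html")
blocked = robots.can_fetch("ExampleBot", "http://www.example.com/private/notes.html")
```

Here `allowed` is true and `blocked` is false, because only paths under `/private/` are excluded for all user agents.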

The Solution

The solution is to retrieve and index content from a web server by means of a web robot. The web robot mimics a web browser, starting at one page and following the links in the site until it has retrieved all of the site's pages. The robot can potentially retrieve content from any webserver, regardless of the platform it is hosted on. Two products that allow you to do this are Microsoft's Site Server 3.0 and the author's own Index Server Companion.
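The link-traversal idea can be sketched in a few lines of Python. This is only an illustration of the general technique (a breadth-first crawl restricted to one host), not the Index Server Companion's own code. The `fetch` callable and the `FAKE_SITE` dictionary are hypothetical stand-ins for real HTTP requests, so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl of one host; fetch(url) returns HTML or None."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue:
        url = queue.popleft()
        page_html = fetch(url)
        if page_html is None:
            continue  # page could not be retrieved
        pages[url] = page_html
        extractor = LinkExtractor()
        extractor.feed(page_html)
        for href in extractor.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Hypothetical three-page site held in memory instead of on a web server.
FAKE_SITE = {
    "http://www.example.com/": '<a href="about.html">About</a> '
                               '<a href="http://other.example.org/">External</a>',
    "http://www.example.com/about.html": '<a href="/">Home</a> '
                                         '<a href="contact.html#top">Contact</a>',
    "http://www.example.com/contact.html": "<p>No links here.</p>",
}

crawled = crawl("http://www.example.com/", FAKE_SITE.get)
```

The external link is skipped because its host differs from the start URL, so `crawled` ends up holding the three pages of the hypothetical site.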

Microsoft Site Server 3.0

Microsoft's Site Server 3.0 software suite has a Search application that enhances Index Server by allowing you to (amongst other things) retrieve and index content from remote websites using an integrated web robot. For an overview of Site Server 3.0 Search, take a look at an article I wrote for ariadne.ac.uk. Unfortunately Site Server 3.0 Search has a few shortcomings, including:

- Site Server 3.0 isn't the easiest of applications to install.
- The product wasn't really designed for Windows 2000 Server.
- It doesn't appear that the product is still in active development.
- It isn't very useful if your websites are hosted by a third party that doesn't have Site Server 3.0 installed.
- Site Server 3.0 costs a lot of money, which cannot always be justified if you only want to use the Search application of the software suite.

Index Server Companion

The Index Server Companion is a cost-effective method of retrieving content from remote webservers for Index Server to index. It also retrieves content from ODBC databases, which can subsequently be indexed by Index Server.

Features

The main features of the Index Server Companion are:

- Enables Index Server to allow searching of potentially any web server or ODBC compliant database.
- Integrated web robot extracts content from websites. Includes support for robots.txt files and robots meta tags.
- Robot can negotiate sites using HTML Frames.
- Optional mode allows QueryStrings to be treated as distinct URLs (e.g. treat http://www.aspalliance.com/brettb/WebJobMarket.asp?Skill=ASP as being a distinct URL from http://www.aspalliance.com/brettb/WebJobMarket.asp?Skill=JSP).
- Ability to retrieve binary files from servers, including Adobe Acrobat PDF, Microsoft Office documents and even images.
- Support for full or incremental project updates of both web and database content, meaning that Index Server only has to re-index content that has changed.
- Configuration of the Index Server Companion is through the editing of a plain text configuration file.
- Index Server Companion can be run from the command line, and scheduled using the Windows Task Scheduler.
- Includes sample code showing how to use the Index Server Companion with Index Server/Indexing Service on ASP.NET (C# and VB.NET). Display your search results in DataGrids, Repeaters and other Web Form components.
- Facility for creating a basic table of contents page for the sites that are crawled [ sample from WinnershTriangle.com website, sample from myCDE.com, sample from Lachesis.biz ].
- Facility for creating a site summary page that can be used to optimise sites for search engines, by showing what keywords are used on a site, the title tags used by each page etc. The site summary page shows:
  - The top 40 most used keywords listed in the keywords meta tags in all of the pages crawled.
  - The body of the summary page contains the following from the pages encountered in the web crawl:
    - A list of page titles.
    - A list of page headings (H1 to H3 inclusive).
    - A list of bold (and strong) text.
    - A list of external hyperlinks in the site.
    - A list of italicised text.
    - A selection of relevant paragraph text. Paragraphs are assumed to be relevant if they contain one or more of the 40 most frequently used keywords.
    - A list of page description meta tags.
    - A list of page keywords.
    - A list of page description meta tags hyperlinked back to the URL from which each description was extracted.
    - A list of page keywords hyperlinked back to the URL from which it was extracted.
- Full reporting of activity to an external plain text log file.
- Flexible output options mean that administrative access to Index Server is not necessarily required.
- Fully documented VBScript examples show how to make use of the Index Server Companion in ASP pages.
- Detailed documentation in Microsoft's HTML Help format.
- Fully documented source code.
- Access to product updates and technical support.
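The QueryString option in the feature list comes down to how a crawler decides whether two URLs are "the same page". A minimal Python sketch of that decision follows; the function name `crawl_key` and its `distinct_query_strings` flag are hypothetical and are not the product's actual configuration setting.

```python
from urllib.parse import urlsplit, urlunsplit


def crawl_key(url, distinct_query_strings):
    """Return the key a crawler would use to decide if a URL was already visited."""
    parts = urlsplit(url)
    # Keep the query string only when each QueryString counts as its own page.
    query = parts.query if distinct_query_strings else ""
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, query, ""))


asp_url = "http://www.aspalliance.com/brettb/WebJobMarket.asp?Skill=ASP"
jsp_url = "http://www.aspalliance.com/brettb/WebJobMarket.asp?Skill=JSP"

# With the option on, the two URLs are crawled separately.
distinct_keys = {crawl_key(asp_url, True), crawl_key(jsp_url, True)}

# With the option off, both collapse to the same page.
merged_keys = {crawl_key(asp_url, False), crawl_key(jsp_url, False)}
```

With the flag on, `distinct_keys` holds two entries; with it off, `merged_keys` holds the single URL without its query string.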


Figure 1. The Index Server Companion contains fully searchable documentation in Microsoft's HTML Help format.

System Requirements

The Index Server Companion can be used on a machine running Microsoft Windows 95 or any subsequent version of Windows. Windows NT 4.0 or Windows 2000 is recommended.

It also (of course) requires a server running either Index Server on Windows NT 4.0 Server, or the Indexing Service on Windows 2000. Note that Index Server Companion does not have to be run from the machine on which the Index Server is installed.

Configuring and Running the Index Server Companion

The Index Server Companion executable file or Perl script needs to be run from the Windows command line. Fortunately there is only a single mandatory parameter, which tells the script which configuration file to use. So to run the Index Server Companion for the Sample Project, open an MS-DOS Command Prompt in the folder where the Index Server Companion files are installed and type the following:

IndexServerCompanion.exe --c= SampleProject/SampleProject.ini

It is of course possible to run the Index Server Companion from .bat scripts, which can then be scheduled using the AT command or the Windows Task Scheduler. This makes it straightforward to update the Index Server's index of website and database content at specific times and frequencies.

The configuration file (in this instance it is called SampleProject.ini) is a plain text file containing a number of settings. View a sample configuration file. The Index Server Companion is supplied with full documentation in Microsoft's HTML Help format that describes each of the configuration settings.

When the script is run, the Index Server Companion will display details of its status in the Command Prompt window. A detailed log file is also created. View a sample log file.

How the Index Server Companion Works

The Index Server Companion script contains a fully functional web robot that is able to extract the content from all of the required pages of the specified website. It supports the Robots Exclusion Standard specification and the robots meta tag contained within individual pages. Each file extracted from the website is modified to contain a special meta tag that gives the original URL (for web content). It is then saved to disk, from where it can be indexed by Index Server. The contents of these special meta tags can then be used by the ASP page displaying the results of a web search, so that clicking on a search result item will display the original URL. Unfortunately Index Server will not allow you to retrieve the content from custom meta tags without making a minor modification in the Index Server's Microsoft Management Console (MMC), so there is also a special mode in the Index Server Companion that inserts the original URL into the page's HTML title tag.
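The two tagging modes described above can be illustrated with a short Python sketch. It is not the product's own code. The meta tag name `ISC_URL` is an assumption (the article does not give the name used for web content), and the `ISC_URL=` title prefix followed by a tab is modelled on the title example shown later in this article.

```python
import html
import re


def add_url_meta_tag(page_html, url, meta_name="ISC_URL"):
    """Insert a meta tag holding the original URL just after the <head> tag."""
    tag = '<meta name="%s" content="%s">' % (meta_name, html.escape(url, quote=True))
    new_html, count = re.subn(
        r"(<head(?:\s[^>]*)?>)",
        lambda match: match.group(1) + "\n" + tag,
        page_html,
        count=1,
        flags=re.IGNORECASE,
    )
    # Pages without a <head> tag simply get the meta tag prepended.
    return new_html if count else tag + "\n" + page_html


def prefix_title_with_url(page_html, url):
    """Alternative mode: put the original URL in front of the page title."""

    def replace_title(match):
        return (match.group(1) + "ISC_URL=" + url + "\t"
                + match.group(2).strip() + match.group(3))

    return re.sub(
        r"(<title[^>]*>)(.*?)(</title>)",
        replace_title,
        page_html,
        count=1,
        flags=re.IGNORECASE | re.DOTALL,
    )


source_url = "http://asptoday.com/content.asp?id=1435"
page = ("<html><head><title>ASP Documentation Systems</title></head>"
        "<body></body></html>")

page_with_meta = add_url_meta_tag(page, source_url)
page_with_title = prefix_title_with_url(page, source_url)
```

Either result would then be written to disk for Index Server to index. The title-based variant needs no Index Server configuration change, which is the trade-off the paragraph above describes.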

The Index Server Companion is also able to index content from database tables, queries (Microsoft Access) and stored procedures (SQL Server). Database connectivity is achieved through the use of ODBC, so potentially any type of database that has an ODBC driver is supported.

Searching Web Content with the Index Server Companion

Index Server Companion allows content from remote websites to be retrieved and consequently indexed by Index Server. A working example of this search facility is available. This is a search page running on Internet Information Services 5.0 (Windows 2000 Server) that allows you to search my ASPAlliance site (including the article you are presently reading!), together with the articles I have written for Ariadne.ac.uk and ASPToday.com. Since I don't have administrative access to the Index Server on the machine hosting the search page, I have used the feature of the Index Server Companion that allows the document's original URL to be added to the original title. For example, the title tag of the ASPToday article ASP Documentation Systems at http://asptoday.com/content.asp?id=1435 is modified in the saved file to read:

<title>ISC_URL=http://asptoday.com/content.asp?id=1435 ASP Documentation Systems</title>

The URL and original title are separated by a tab character. The search results page then contains a small piece of ASP code to split this title back into the article's URL and original title.
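The splitting step is simple enough to sketch. The sample search page does this in ASP; the Python below is only an equivalent illustration, and the function name `split_isc_title` is hypothetical.

```python
def split_isc_title(title):
    """Split an 'ISC_URL=<url><TAB><original title>' string into its two parts.

    Returns (url, original_title); url is None if the title was not modified.
    """
    prefix = "ISC_URL="
    if title.startswith(prefix) and "\t" in title:
        url, original_title = title[len(prefix):].split("\t", 1)
        return url, original_title.strip()
    return None, title.strip()


indexed_title = "ISC_URL=http://asptoday.com/content.asp?id=1435\tASP Documentation Systems"
url, original_title = split_isc_title(indexed_title)

# Titles that were never modified pass through unchanged.
plain_url, plain_title = split_isc_title("An ordinary page title")
```

A search results page can then link `original_title` to `url` rather than to the locally saved copy of the file.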

The ASP code for the sample search page is available: View sample ASP code.

Searching Databases with the Index Server Companion

The Index Server Companion is able to index content from database tables, queries (Microsoft Access) and stored procedures (SQL Server). It is of course entirely possible to search databases using Structured Query Language (SQL), but by making use of Index Server Companion, it is a lot more straightforward to integrate database searches with Index Server search results from web page searches. There are also other advantages: Index Server contains sophisticated pattern matching, and it is often a lot faster at returning search results than an equivalent SQL statement would be when using a database such as Microsoft Access.

Index Server Companion is able to retrieve the rows of a specified database table and make an HTML file containing the data from a specific database row. Index Server can then be used to index the HTML files, and it is possible to extract the details of the table and row from which the data originated so that the search results page can be modified to point to the original database data. A sample page produced from the SQL Server sample pubs database is shown below:

<html>
<head>
<meta name="ISC_title_id" content="MC2222">
<meta name="ISC_title" content="Silicon Valley Gastronomic Treats">
<meta name="ISC_type" content="mod_cook">
<meta name="ISC_price" content="19.99">
<meta name="ISC_pubdate" content="6/9/1991 12:00:00 AM">
<meta name="ISC_notes" content="Favorite recipes for quick, easy, and elegant meals.">
<meta name="description" content="Favorite recipes for quick, easy, and elegant meals."></head>
<title>Silicon Valley Gastronomic Treats</title>
<body>
</body>
</html>

In this example, the title field is optionally used to give the page a title, and the notes field is used for the description meta tag.
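The row-to-page conversion can be approximated with the Python sketch below. It is an illustration under stated assumptions, not the product's implementation: an in-memory SQLite table stands in for an ODBC data source, only some columns of the pubs titles table are recreated, and the sketch places the title tag inside the head element. The function name `row_to_html` is hypothetical.

```python
import html
import sqlite3


def row_to_html(columns, row, title_field=None, description_field=None):
    """Render one database row as an HTML page with ISC_-prefixed meta tags."""
    record = dict(zip(columns, row))
    lines = ["<html>", "<head>"]
    for name, value in record.items():
        lines.append('<meta name="ISC_%s" content="%s">'
                     % (name, html.escape(str(value), quote=True)))
    if description_field:
        lines.append('<meta name="description" content="%s">'
                     % html.escape(str(record[description_field]), quote=True))
    if title_field:
        lines.append("<title>%s</title>" % html.escape(str(record[title_field])))
    lines += ["</head>", "<body>", "</body>", "</html>"]
    return "\n".join(lines)


# In-memory stand-in for the database; a real project would connect via ODBC.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title_id TEXT PRIMARY KEY, title TEXT, "
             "type TEXT, price REAL, notes TEXT)")
conn.execute(
    "INSERT INTO titles VALUES (?, ?, ?, ?, ?)",
    ("MC2222", "Silicon Valley Gastronomic Treats", "mod_cook", 19.99,
     "Favorite recipes for quick, easy, and elegant meals."),
)

cursor = conn.execute("SELECT * FROM titles")
columns = [description[0] for description in cursor.description]

# One HTML document per row, keyed by the row's primary key.
pages = {row[0]: row_to_html(columns, row, "title", "notes") for row in cursor}
```

Each value in `pages` would be saved as its own file in a folder that Index Server is configured to index.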

Each of the custom ISC_ prefixed meta tags can be queried using Index Server, although to retrieve their contents a minor configuration change to Index Server is required. It is straightforward to create a page that will, for example, return the records where the value of the ISC_type meta tag is mod_cook.

The Index Server Companion can also modify the HTML's title tag to include the table name and row ID, e.g.:

<title>ISC_Table=titles ISC_KeyField=title_id ISC_RowNumber=MC2222 Silicon Valley Gastronomic Treats</title>
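A search results page could recover the table name and row ID from such a title along the following lines. This Python sketch assumes the fields are separated by whitespace, which the article does not confirm. The function name `parse_db_title` and the dictionary keys are hypothetical.

```python
import re

# Assumed layout: three ISC_ fields separated by whitespace, then the title.
ISC_DB_TITLE = re.compile(
    r"^ISC_Table=(?P<table>\S+)\s+"
    r"ISC_KeyField=(?P<key_field>\S+)\s+"
    r"ISC_RowNumber=(?P<row_id>\S+)\s*"
    r"(?P<title>.*)$",
    re.DOTALL,
)


def parse_db_title(title):
    """Return the table, key field, row ID and original title, or None."""
    match = ISC_DB_TITLE.match(title.strip())
    if match is None:
        return None
    return match.groupdict()


parsed = parse_db_title(
    "ISC_Table=titles ISC_KeyField=title_id ISC_RowNumber=MC2222 "
    "Silicon Valley Gastronomic Treats"
)
not_a_db_title = parse_db_title("An ordinary page title")
```

The `table`, `key_field` and `row_id` values are enough to build a link or a query that points back at the original database record.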

Summary

The Index Server Companion allows Microsoft's Index Server to index content from remote websites and ODBC databases, making it a cost-effective way of significantly extending the functionality of Index Server.

Downloads

- Index Server Companion Evaluation Version (528K zip file).
- Index Server Companion Documentation (95K zip file).
- Purchase the Index Server Companion. [ ...more details ]
- Special offers on both Index Server Companion and the ASP Documentation Tool.

Further information

- Read about the Index Server Companion on Ariadne.ac.uk.
- The Microsoft Site Server Search Facility.
- Searching Index Server With ASP. Introductory guide to using Index Server from ASP.
- More about Searching Index Server With ASP. More advice and source code.

Useful Development Tools

ASP Documentation Tool
Automatically creates developer documentation for ASP 2.0 and 3.0 web applications written in VBScript and JScript. Documentation for Microsoft Access, SQL Server 7/2000 databases and Visual Basic 6.0 components associated with the web application can also be incorporated into the reports. Documentation is created in HTML, HTML Help and plain text formats.
View Sample Output (HTML Help format).
View Sample Output (HTML Format).
Download Trial Version (5.2Mb ZIP file).

ASP.NET Documentation Tool
Automatically creates developer documentation for ASP.NET web applications written in C# or VB.NET. Documentation for SQL Server 7/2000 databases and C#/VB.NET components associated with the web application can also be incorporated into the reports. Documentation is created in HTML, HTML Help and plain text formats.
View Sample Output (HTML Help format).
View Sample Output (HTML Format).
Download Trial Version (2.9Mb ZIP file).

SQL Documentation Tool
The SQL Documentation Tool creates technical documentation for Microsoft SQL Server 7.0 and 2000 databases. Technical documentation is created in HTML and HTML Help formats. The HTML Help format documentation is fully searchable and cross referenced. The SQL Documentation Tool documents SQL Server Tables, Views, Stored Procedures, Triggers and Table Relationships.
View Sample Output (HTML Help format).
View Sample Output (HTML Format).
Download Trial Version (10.3Mb ZIP file).

Index Server Companion
The Index Server Companion is a Windows application that extends the functionality of Microsoft Index Server so that it is able to index content from remote websites and also from ODBC databases. As such it can be used as a low cost alternative to Site Server 3.0 Search.
View Product Documentation (119K ZIP file).
Try Sample Search Facility.
Download Trial Version (1.7Mb ZIP file).

The Website Utility
The Website Utility examines websites for errors and areas that need to be optimised for search engines by using a built in web crawling engine. Errors checked for include broken or moved hyperlinks, missing page titles and missing meta tags. It also generates HTML for use in creating website site maps (table of contents pages - like this one), and is able to create both client-side JavaScript Search Engines and server-side ASP Search Engines for a website.
View Sample Output (HTML Help format).
View Sample Output (HTML Format).
Download Trial Version (3Mb ZIP file).

All content is © 1995 - 2006 Brett Burridge

Image Alt Tags
Brettb.Com
Microsoft Certified Professional
The Index Server Companion contains fully searchable documentation in Microsoft's HTML Help format
View Sample Output (HTML Help format)
View Sample Output (HTML Format)
Download Trial Version
View Sample Output (HTML Help format)
View Sample Output (HTML Format)
Download Trial Version
View Sample Output (HTML Help format)
View Sample Output (HTML Format)
Download Trial Version
View Product Documentation
Try Sample Search Facility
Download Trial Version
View Sample Output (HTML Help format)
View Sample Output (HTML Format)
Download Trial Version
Internal Links
http://www.brettb.com/redirector.asp (15 links in this page) [ Robot View of this URL ]
http://www.brettb.com/IndexServerCompanion.asp (6 links in this page) [ Robot View of this URL ]
http://www.brettb.com/DeveloperTools.asp (3 links in this page) [ Robot View of this URL ]
http://www.brettb.com/Default.asp (3 links in this page) [ Robot View of this URL ]
http://www.brettb.com/ASPDocumentationTool.asp (2 links in this page) [ Robot View of this URL ]
http://www.brettb.com/TheWebsiteUtility.asp (2 links in this page) [ Robot View of this URL ]
http://www.brettb.com/technicalwriting.asp [ Robot View of this URL ]
http://www.brettb.com/JavaScriptArticles.asp [ Robot View of this URL ]
http://www.brettb.com/Website_Search_Engine_Optimisation.asp [ Robot View of this URL ]
http://www.brettb.com/SearchResults.asp [ Robot View of this URL ]
http://www.brettb.com/js_banner_ad_rotator.asp [ Robot View of this URL ]
http://www.brettb.com/web.asp [ Robot View of this URL ]
http://www.brettb.com/CanonEOS300D_Gallery1.asp [ Robot View of this URL ]
http://www.brettb.com/BuildingAnASPSearchEngine.asp [ Robot View of this URL ]
http://www.brettb.com/backgrounds.asp [ Robot View of this URL ]
http://www.brettb.com/VBScriptRegularExpressions.asp [ Robot View of this URL ]
http://www.brettb.com/toc.asp [ Robot View of this URL ]
http://www.brettb.com/ASP.NETArticles.asp [ Robot View of this URL ]
http://www.brettb.com/TransactSQLColorCoder.asp [ Robot View of this URL ]
http://www.brettb.com/Investments_ISAs.asp [ Robot View of this URL ]
http://www.brettb.com/CanonEOS300D_Gallery3.asp [ Robot View of this URL ]
http://www.brettb.com/ASPWatchArticles.asp [ Robot View of this URL ]
http://www.brettb.com/Red_Arrows_2004.asp [ Robot View of this URL ]
http://www.brettb.com/Living_Coasts_Photos.asp [ Robot View of this URL ]
http://www.brettb.com/SQL_Help.asp [ Robot View of this URL ]
http://www.brettb.com/contact.asp [ Robot View of this URL ]
http://www.brettb.com/Biotechnology.asp [ Robot View of this URL ]
http://www.brettb.com/MyTropicalFishtank.asp [ Robot View of this URL ]
http://www.brettb.com/ASPNetDocumentationTool.asp [ Robot View of this URL ]
http://www.brettb.com/gallery.asp [ Robot View of this URL ]
http://www.brettb.com/what's_new.asp [ Robot View of this URL ]
http://www.brettb.com/Gallery.asp [ Robot View of this URL ]
http://www.brettb.com/ASPAlliance/IndexServerCompanion/Sample_Configuration_File.txt [ Robot View of this URL ]
http://www.brettb.com/SearchingIndexServerWithASP.asp [ Robot View of this URL ]
http://www.brettb.com/ASPAlliance/IndexServerCompanion/Sample_Web_ASPCode.html [ Robot View of this URL ]
http://www.brettb.com/ASPAlliance/IndexServerCompanion/Sample_Log_File.txt [ Robot View of this URL ]

Reporting Main Page

Report generated by The Website Utility 2.8