Winnersh Triangle Web Solutions Limited

Timesaving tools for software developers


Indexing Service Companion Version History


Download the Indexing Service Companion Documentation (127K ZIP file).

10 February 2008: 3.5 Release

  • The product has been renamed from the Index Server Companion to the Indexing Service Companion to reflect the name change of the Microsoft Windows indexing service. Note that the product continues to be compatible with both Index Server and the Windows Indexing Service.
  • The Indexing Service Companion can now add the ISC_URL meta tag to documents where the <head> tag has attributes.
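
To illustrate the <head> handling described above, the following Python sketch inserts a meta tag directly after an opening <head> tag, whether or not that tag carries attributes. It is not the product's own code: the function name add_isc_url_meta and the name/content layout of the ISC_URL meta tag are assumptions made for this example.

```python
import html
import re

# Matches an opening <head> tag with or without attributes.
# The \b stops the pattern from matching <header>.
HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def add_isc_url_meta(document: str, url: str) -> str:
    """Insert a (hypothetical) ISC_URL meta tag after the opening <head> tag."""
    meta = '<meta name="ISC_URL" content="%s">' % html.escape(url, quote=True)
    match = HEAD_OPEN.search(document)
    if match is None:
        return document  # no <head> tag found: leave the document unchanged
    return document[:match.end()] + meta + document[match.end():]
```

A pattern that only looks for the literal text <head> would miss a tag such as <head profile="...">, which is the kind of case this release addresses.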

22 July 2007: 3.4 Release

Added the following new features:

  • It is now possible to crawl websites that use Basic Authentication.
  • The documentation has been updated.
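
For readers unfamiliar with Basic Authentication: a client sends the user name and password, joined by a colon and base64 encoded, in the HTTP Authorization header of each request. The Python sketch below shows how such a header value is built. The function name is hypothetical, and the sketch does not show how the Indexing Service Companion itself is configured for authentication.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic Authentication."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Example request headers a crawler might send (illustrative credentials).
headers = {"Authorization": basic_auth_header("user", "pass")}
```

Base64 is an encoding rather than encryption, so Basic Authentication is normally combined with SSL (https).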

14 July 2007: 3.3 Release

Added the following new features:

  • It is now possible to crawl websites that use Windows NTLM authentication.
  • The documentation has been updated.

10 July 2007: 3.2 Release

Added the following new feature:

  • It is now possible to crawl SSL (https) websites with the Indexing Service Companion.

14 November 2006: 3.1 Release

Resolved the following issues:

  • Modified the database data extraction routine in order to ensure that fields of type adDBDate are extracted in date format and adDBTime are extracted in time format.
  • Modified the database data extraction routine in order to ensure that date and time fields are not converted to date/time format if the field has a null value.

14 June 2006: 3.0 Release

Resolved the following issue:

  • Modified the database data extraction routine in order to ensure that fields of type adDate are extracted in date format.

30 May 2006: 2.9 Release

Added the following new feature:

  • In response to popular demand, there is now a .NET Framework code sample for VB.NET. Find it in the Samples\Index Server with ASP.NET\ASP.NET\VB.NET folder within the Index Server Companion folder.

04 April 2006: 2.8 Release

Added the following new feature:

  • The AddURLToTitle configuration option has improved support for HTML documents with blank title tags (i.e. <title></title>).

12 March 2006: 2.7 Release

Resolved the following issues:

  • The AddURLToTitle configuration option now works correctly on pages that have attributes within the opening <title> HTML tag.
  • The SaveContentToDisk configuration option documentation has been revised.
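
The title handling covered by the 2.7 and 2.8 releases (attributes within the opening <title> tag, and blank title tags) can be sketched in Python as follows. This is an illustration only. It assumes that AddURLToTitle appends the page URL to the title text; the "title (URL)" layout, the function name and the use of 'untitled document' for blank titles are assumptions for this example.

```python
import html
import re

# Opening <title> tag (attributes allowed), the title text, and the closing tag.
TITLE = re.compile(r"(<title\b[^>]*>)(.*?)(</title\s*>)", re.IGNORECASE | re.DOTALL)


def add_url_to_title(document: str, url: str) -> str:
    """Append the URL to the first <title> in the document (assumed behaviour)."""
    def replace(match):
        # A blank title gets a placeholder so the result is never just the URL.
        text = match.group(2).strip() or "untitled document"
        safe_url = html.escape(url, quote=False)
        return "%s%s (%s)%s" % (match.group(1), text, safe_url, match.group(3))

    return TITLE.sub(replace, document, count=1)
```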

19 June 2005: 2.6 Release

Resolved the following issue:

  • Fixed problem when extracting certain malformed URLs from documents.

19 April 2005: 2.5 Release

Added the following new features:

  • The Index Server Companion checks that Start URL and Base URL configuration options contain the http:// prefix.
  • The IgnoreURLsString configuration option now affects files downloaded using the FileExtensions configuration option.

Resolved the following issues:

  • The web robot's behaviour has been improved when either the IgnoreURLsString configuration option is used or a URL is listed in the website's robots.txt file.
  • Files downloaded using the FileExtensions configuration option are now correctly removed from the file system/FTP server when the project is reset or when the files are no longer accessible in a web crawl.

8 October 2004: 2.4 Release

Added the following new features:

  • To enhance performance by reducing RAM usage, the following features have been removed from this version:
    • Table of contents generation using the TableOfContentsPage configuration option (this functionality is now in The Website Utility).
    • Summary page generation using the SummaryPage configuration option.
    • Description and Keyword meta tags are no longer output to the XML file specified by the XMLOutputFile configuration option.
  • To assist with troubleshooting, the log file now shows which page a specific link to another page was extracted from (e.g. 'Found new link to http://www.brettb.com/default.asp').

Resolved the following issues:

  • The web robot now detects URLs containing "../" that some web servers (including Internet Information Server/Internet Information Services on Windows) incorrectly treat as valid requests for a page. Such links can cause web robots to loop endlessly when crawling a website.
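
One common way for a web robot to avoid this kind of loop is to collapse "../" segments before checking whether a URL has already been visited, so that every variant of a page maps to a single URL. The Python sketch below shows the idea. It is not the product's own routine, and the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def collapse_dot_segments(path: str) -> str:
    """Resolve "." and ".." segments in an absolute URL path."""
    kept = []
    for segment in path.split("/"):
        if segment == "..":
            # Step up one level, but never above the site root
            # (the leading empty segment stands for the root "/").
            if len(kept) > 1:
                kept.pop()
        elif segment != ".":
            kept.append(segment)
    return "/".join(kept) or "/"


def canonical_url(url: str) -> str:
    """Return the URL with its path collapsed; other parts are unchanged."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=collapse_dot_segments(parts.path)))


visited = set()
for link in ("http://www.example.com/a/page.asp",
             "http://www.example.com/a/../../a/page.asp"):
    visited.add(canonical_url(link))  # both links map to the same entry
```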

20 May 2004: 2.3 Release

Added the following new features:

  • Registration codes are now valid for 18 months, instead of the previous 6 months. As a gesture of goodwill towards existing customers, all previously expired registration codes for purchases made less than 18 months ago have been reactivated.
  • The web robot now displays a progress indicator while crawling each URL. This shows the current URL number, the total number of URLs crawled and the approximate progress percentage through the crawl. Be aware that due to the nature of web crawling, the Index Server Companion will not initially be able to determine the total number of URLs in the website until it is some way through the crawl. It also depends on the linking structure of the website.
  • The page size calculation routine has been improved.
  • Excessive white space is removed from HTML title tags in saved content. Visual Studio .NET is particularly prone to producing HTML with this problem, which may interfere with the Indexing Service Companion's AddURLToTitle configuration option.
  • The ASP Sessions functionality now also supports ASP.NET sessions.
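
The caveat about the progress indicator in the list above follows from how a crawl works: the total is only the number of URLs discovered so far, so the percentage can drop when a page yields many new links. A minimal Python sketch, with a hypothetical function name and output format:

```python
def progress_line(urls_crawled: int, urls_queued: int) -> str:
    """Format crawl progress from the URLs crawled and those still waiting."""
    known_total = urls_crawled + urls_queued  # grows as new links are found
    percent = 100 * urls_crawled // known_total if known_total else 0
    return "URL %d of %d (%d%%)" % (urls_crawled, known_total, percent)
```

For example, progress_line(5, 5) reports 50%, but if the next page adds nine new links to the queue, progress_line(6, 14) reports 30%.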

Resolved the following issues:

  • URLs of the format http://www.foo.com/home?pageid=12&server=3 can now be crawled.
  • The Indexing Service Companion no longer saves URLs that do not return a mime content type of text (e.g. text/html or text/plain). This prevents it from saving binary content returned from pages with file extensions that usually denote HTML content, e.g. http://www.foo.com/ShowImage.asp?ImageID=12
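
The content type check described above amounts to inspecting the Content-Type response header rather than the file extension in the URL. A Python sketch of such a test follows; the function name is hypothetical and the product's own check may differ in detail.

```python
from typing import Optional


def is_text_content(content_type_header: Optional[str]) -> bool:
    """Return True if a Content-Type header value denotes a text/* media type."""
    # "text/html; charset=utf-8" -> "text/html"
    media_type = (content_type_header or "").split(";", 1)[0].strip().lower()
    return media_type.startswith("text/")
```

With this test, a URL such as ShowImage.asp that returns image/gif is skipped even though its extension usually denotes HTML content.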

5 November 2003: 2.2 Release

Added the following new features:

  • The functionality that checks to see if a URL has been modified has been improved.
  • There is a new code sample that shows how to use Index Server/Indexing Service with ASP.NET.
  • Sample projects use authors.aspalliance.com instead of www.aspalliance.com.

Resolved the following issues:

  • The UseRobotsMetaTag configuration option now works as intended.
  • The UseRobotsTextFile and UseRobotsMetaTag configuration options now have their default settings correctly applied if they are not included in the configuration file.
  • Other files specified using the FileExtensions configuration option are no longer downloaded if they are outside the scope of the BaseURL configuration option.
  • Page titles with tab characters in them no longer cause problems when using the AddURLToTitle configuration option.

20 July 2003: 2.1 Release

Added the following new features:

  • There is now a SaveContentToDisk configuration option which can prevent the Indexing Service Companion from saving the content from web crawls to disk.
  • There is an experimental XMLOutputFile configuration option. If a filename is specified using this configuration option then an XML file is created containing information about the URLs visited during the web crawl.

Resolved the following issues:

  • The AddURLToTitle configuration option is now working properly again.

30 June 2003: 2.0 Release

Added the following new features:

  • URLs without file extensions are now crawled. An example is http://authors.aspalliance.com/brettb/.
  • It is now possible to configure the Indexing Service Companion to copy the files to a server using the file transfer protocol (FTP). The following are the new configuration options that are used to configure the FTP facility: FTPContent, FTPDebugMode, FTPDestinationHostName, FTPDestinationHostPortNumber, FTPDestinationHostUserName, FTPDestinationHostPassword, FTPDestinationHostDirectory, FTPPassiveMode.

12 June 2003: 1.7 Release

Resolved the following issues:

  • Keywords from page meta tags are stripped of certain punctuation characters which can cause errors with the operation of the Indexing Service Companion.

Added the following new features:

  • ASP Sessions are now supported - see the UseASPSessions and the ASPSessionDomain configuration options.
  • It is now possible to save the content from HTML pages visited during web crawls into an ODBC database. See the DBSaveContent configuration option for more details. If this feature proves popular it will be enhanced in future releases.
  • Added the SaveContentToDatabase sample project (see the Samples/SaveContentToDatabase folder).

30 April 2003: 1.6 Release

Resolved the following issues:

  • Solved a case sensitivity issue with the detection of framesets in URLs.
  • The handling of URLs that return redirects (i.e. those returning HTTP status code 302, such as those created by ASP Response.Redirect statements) has been modified so that the URL containing the redirect is no longer saved; the URL redirected to is saved instead, under its own URL.
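
The redirect rule above can be expressed as a small decision: when a response has status code 302 and a Location header, the content is filed under the redirect target instead of the requested URL. The Python sketch below illustrates this. The function name is hypothetical, and the sketch covers only status code 302, as the release note does.

```python
from typing import Optional
from urllib.parse import urljoin

REDIRECT_STATUS = 302


def url_to_save(requested_url: str, status_code: int,
                location: Optional[str] = None) -> str:
    """Return the URL under which a crawled page should be saved."""
    if status_code == REDIRECT_STATUS and location:
        # The Location header may be relative, so resolve it
        # against the URL that issued the redirect.
        return urljoin(requested_url, location)
    return requested_url
```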

Added the following new features:

  • Added the CrawlDepth configuration option.
  • Added the SaveFrameSets configuration option.
  • Added the SaveAllExternalLinks configuration option.
  • The evaluation version is now fully functional except for the following features:
    • URLs are saved without the contents of the HTML <body> part of the document.
    • Only the first 100 database rows are saved.
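
As background to the CrawlDepth configuration option added in this release, a depth limit in a web robot is typically applied as shown in the Python sketch below: links are followed breadth first, and pages at the maximum depth are retrieved but their links are not followed. The function name, the in-memory link table and the convention that the start URL has depth 0 are all assumptions for this example; see the product documentation for the exact meaning of CrawlDepth.

```python
from collections import deque


def crawl_order(links: dict, start_url: str, crawl_depth: int) -> list:
    """Return the URLs visited, breadth first, within crawl_depth links of the start."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == crawl_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(url, []):
            if target not in seen:  # also guards against pages linking to themselves
                seen.add(target)
                queue.append((target, depth + 1))
    return order


# A tiny, made-up site: each page maps to the pages it links to.
site = {"/": ["/a", "/b"], "/a": ["/c", "/"], "/c": ["/d"]}
```

With this table, crawl_order(site, "/", 1) visits "/", "/a" and "/b", while a depth of 2 also reaches "/c".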

10 April 2003: 1.5 Release

Resolved the following issues:

  • Database stored procedures can now be used as a source of data.

Added the following new features:

  • The Indexing Service Companion has been compiled as a Windows executable file, so Perl is no longer required. There is no significant performance difference between using IndexServerCompanion.exe and IndexServerCompanion.pl.
  • Added the DataTable_X_HTMLFields configuration option.
  • Added the DataTable_X_IsStoredProcedure configuration option.

12 February 2003: 1.4 Release

Incorporates the changes added to the 1.3 internal release.

Resolved the following issues:

  • Fixed a few minor issues with the application, documentation and sample files.
  • The Indexing Service Companion no longer attempts to save pages with very long URLs or HTML title tags to the Information Store and to disk. The upper limit is set to 960 characters, which is sufficient for most URLs.

Added the following new features:

  • The Indexing Service Companion now displays how long the web crawl process took. Incidentally, this feature is a handy way to compare the download speeds of identical websites hosted by different service providers.
  • If the Indexing Service Companion encounters any error messages while retrieving URLs, these errors are summarised in the new section of the log file labelled URLs With Error Messages.
  • URLs that exceed the size specified in MaxURLSize are no longer parsed or saved to disk. URLs that exceed the size limit are listed in the log file.
  • The Indexing Service Companion now includes the ability to create a basic table of contents page. See the new TableOfContentsPage configuration option for further details.
  • The log file now contains a new section labelled Summary of URLs Retrieved. This lists all of the URLs retrieved.
  • The IgnoreURLsString configuration option can be used to prevent the Indexing Service Companion retrieving and following links from specific URLs.
  • The experimental SummaryPage configuration option may be useful!

4 September 2002: 1.2 Release

Resolved the following issues:

  • Updated parts of the documentation.
  • Value of MaxURLSize reported in log file is now reported as Kb.
  • robots.txt file no longer treated as being case sensitive on web servers that aren't case sensitive.

Added the following new features:

  • The FileExtensions configuration option allows the retrieval of other (i.e. non-HTML) files and as such it can be used to retrieve and index Adobe Acrobat PDF and Microsoft Word documents.
  • ASPArticles sample modified to show how to retrieve and index Adobe Acrobat PDF and Microsoft Word documents.

27 August 2002: 1.1 Release

Resolved the following issues:

  • AddURLToTitle functionality is now case-insensitive when substituting HTML <title> tags. For documents without HTML title tags it will insert a title, defaulting to 'untitled document'.
  • The <meta name="robots" content="all"> meta tag no longer causes the web robot to avoid indexing that page.
  • Corrected various minor errors in the documentation.
  • The verbose reporting function now works correctly when URLs are removed from the Information Store.
  • Sites that do not return the correct HTTP status code for a 404 Not Found error when a robots.txt file is requested are now handled properly.
  • The last file extension in the list of URLExtensions now works correctly.
  • Prevented possible endless loops if a page with a link to itself was indexed.

Added the following new features:

  • The new UseURLQueryStrings option will allow URLs to be indexed complete with their QueryString.
  • Added the Verbose and ResetProject options to the command line options.
  • Added the Version History page to the documentation.
  • Added the MaxNumberOfURLs option which specifies the upper limit to the number of URLs the web robot retrieves from the web server.
  • Robots.txt parsing subroutines can now handle robots.txt files from more than one site.
  • Added the SingleURL option which allows the Index Server Companion to retrieve and index individual URLs. Each SingleURL can be on a different server to that specified in the StartURL option.
  • HTML Bookmarks are removed from URLs.
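
Two of the changes above concern what counts as "the same URL": HTML bookmarks (the part after #) are removed, and the UseURLQueryStrings option controls whether the QueryString is kept. The Python sketch below shows one way to apply both rules. The function name is hypothetical, and the assumption that the QueryString is dropped when the option is off is ours.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url: str, use_query_strings: bool = True) -> str:
    """Strip the bookmark from a URL and, optionally, its QueryString."""
    scheme, netloc, path, query, _bookmark = urlsplit(url)
    if not use_query_strings:
        query = ""  # assumed behaviour when UseURLQueryStrings is off
    return urlunsplit((scheme, netloc, path, query, ""))
```

Removing bookmarks means that page.asp#top and page.asp#contact are retrieved once, as page.asp, rather than as two separate documents.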

20 August 2002: 1.0 Release

First commercial version release.

© Copyright 2002 - 2008 Winnersh Triangle Web Solutions Limited. Registered company number: 4493816.