How to integrate Content Sources into Grace

To make a new content source searchable through the Grace interface, you need to:

  1. customize the configuration file (SPEC file) for the specific content source;
  2. integrate the configuration file with the application.

For these tasks it helps to know the basics of XML, XSL and XPath. Also look at the content source in a web browser to become familiar with it.
If you are integrating a content source for the first time, read the complete HOWTO.

Step 1

Introduction

The SPEC file is an XML file that serves as the communication interface between the Content Source and the Grace system. It is used to send the query to the Content Source, retrieve the query results, and process them using XSL.

There are two different types of content sources:

a) OAI-supporting content sources

OAI-supporting content sources can be integrated easily because they conform to the OAI definitions. Their results are returned in XML, so the XSL processing can be similar for several content sources.

b) Non-OAI content sources

Here you only get HTML output, which you have to process with XSL.

This HOWTO addresses content sources of type b) ONLY.

Create the XML file

You can start from the example.xml file. This file is divided into two main parts that correspond to the query step (query-channel definition) and the results processing step (results-parser configuration).

Firstly, give your content source a name using the drivername attribute of the query-driver element:

<query-driver drivername='myQueryDriver'>

Next, specify the URL of the content source's search script:

<text v="http://www.my-contentsource.com/search" />

You must specify the parameters of the POST or GET request that will be sent to the Content Source. Each parameter is declared with its name and value, for example:

<get-param name='MyParamName'><paramval name='MyParamValue' /></get-param>

To obtain all query parameters, you can run the script FormExtractor.pl with the HTML of the query page as standard input.
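
As an illustration of how the fields of a query form map to parameter entries, here is a minimal sketch. The form, the parameter names lang and max, and their values are hypothetical, and nesting the entries inside the query-driver element is an assumption; check example.xml for the actual layout. How the user's search term is bound to a parameter is not shown here.

```xml
<!-- Hypothetical query page form (not taken from a real content source):
     <form action="http://www.my-contentsource.com/search" method="get">
       <input type="hidden" name="lang" value="en" />
       <input type="hidden" name="max" value="20" />
     </form>
-->
<query-driver drivername='myQueryDriver'>
  <!-- URL of the search script, taken from the form's action attribute -->
  <text v="http://www.my-contentsource.com/search" />
  <!-- one get-param entry per form field (names and values are assumptions) -->
  <get-param name='lang'><paramval name='en' /></get-param>
  <get-param name='max'><paramval name='20' /></get-param>
</query-driver>
```

With these entries, the request sent to the content source would presumably correspond to http://www.my-contentsource.com/search?lang=en&max=20.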

Secondly, extract the information returned by the Content Source. You select the relevant information using XSL and XPath expressions. First, provide an XPath expression that matches the results:

<xsl:template match="MyXPathExpression">

Once the XPath expression has matched the results, you can use XSL instructions such as xsl:value-of to extract the information:

<Title><xsl:value-of select="." /></Title>
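
The two pieces above can be combined as in the following sketch. It assumes a hypothetical result page in which every hit is a table row of the form <tr class="result"><td><a href="...">title</a></td></tr>. The Result and Link elements are illustrative names, not part of the Grace format, and the xsl:stylesheet wrapper is only there to make the fragment a complete stylesheet; in the SPEC file the template sits in the results-parser part.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- matches one result row of the assumed page layout -->
  <xsl:template match="tr[@class='result']">
    <Result>
      <!-- text of the link in the first cell -->
      <Title><xsl:value-of select="td[1]/a" /></Title>
      <!-- target of that link (hypothetical output element) -->
      <Link><xsl:value-of select="td[1]/a/@href" /></Link>
    </Result>
  </xsl:template>

  <!-- suppress all text outside the matched rows -->
  <xsl:template match="text()" />

</xsl:stylesheet>
```

The select expressions are evaluated relative to the matched row, so select="." as in the line above would return the text of the whole row, while td[1]/a narrows it to the link text.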

Step 2

After testing the SPEC file with the test script CSTest.sh, which also validates the results it receives, you have to integrate the file into the Grace application and add your new content source to the Grace interface.

(more info from GL2006)

More detailed information

To get more detailed information you can consult the HOWTO.
For development and testing of the files you should download the complete package (Note: it will be provided soon). It includes 10 SPEC files for existing content sources, the testing scripts and binaries, and the HOWTO.
