wiki-archive/twiki/data/TAPIR/FirstProposal.txt,v

head	1.7;
access;
symbols;
locks; strict;
comment	@# @;


1.7
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	1.6;

1.6
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	1.5;

1.5
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	1.4;

1.4
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	1.3;

1.3
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	1.2;

1.2
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	1.1;

1.1
date	2007.01.09.00.00.00;	author MoinMoin;	state Exp;
branches;
next	;


desc
@Initial revision
@


1.7
log
@Revision 7
@
text
@---+ First proposal: The XQuery alternative

*General strategy:*

To adopt the XqueryLanguage as a substitute for the current way of filtering and specifying the resulting record structure.

*Details:*
XQuery is part of a new generation of query languages and it aims at being far more generic than the existing options (SQL for instance). Statements can be expressed in order to act across different data sources at the same time, including structured and semi-structured documents, relational databases, object repositories, or even web services (http://www.w3.org/TR/xquery-use-cases/). XQuery could be used not only on search and update operations, but also on transformation and re-structruring of data being retrieved.

In a network of distributed and heterogeneous databases, a possible way to use XQuery would be to write statements based on the data structure of a federated schema. Since each data provider would have mapped its local database against the same federated schema, those statements could be translated according to the local data structure by a wrapper software. Due to the richness and complexity of XQuery, it would be too costly to write complete parsers for it. So instead of having such a parser in the wrapper, the wrapper could pass on the translated queries to a local XML server or middleware software capable of dealing with XQuery statements to be processed.

There could be an interface for advanced users to write their own XQuery statements, but also other simplified interfaces which should be able to automatically generate at least the most used and required types of XQuery statements.

However, one of the consequences of XQuery's flexibility is that it remains unclear how data providers could restrict the ammount of data to be returned in a single response. This issue would also affect paging capabilities. Both features (limiting the number of records and paging) are present and considered important in the DiGIR protocol.

The current XQuery specification version, although considered stable, is still in the stage of a working draft, and there are only a few implementations of it. Presently, most part of the available XML servers, and XML middleware tools are proprietary software (http://www.rpbourret.com/xml/ProdsMiddleware.htm), and not all of them support the complete XQuery specification. Using a commercial software would certainly be too expensive for networks with so many data providers. And the costs should not only include licenses, but equipment and training as well.

XQuery parsers have been mostly implemented in native XML databases, some of them are free and open source (http://www.rpbourret.com/xml/ProdsNative.htm). But that would require frequent data exporting for almost all production databases, and performance could also be a problem when querying larger data sets. Moreover, most database administrators are still not familiar with XML database products.

On the other hand, only a few commercial relational databases have released prototype XQuery interfaces (http://www.oreillynet.com/lpt/wlg/4991), which is certainly not enough to cover the diverse infrastructure of existing data providers.

Still, one specific tool has been identified as a potential candidate to be used in a scenario similar to the one previously suggested. The XQuark project (XQuark Group - http://xquark.objectweb.org/index.html) is working on a suite of free and open source tools (XQuark Bridge and XQuark Fusion) that seem to implement the whole XQuery specification in a distributed environment. Both tools are platform independent (java), and each local data source needs to map its data structure against a federation schema through an object-relational mapping technique. Although the number of supported databases is still limited and the web services interface is not available yet, it would definitely be interesting to conduct a more detailed study to assess the viability of using these tools. With positive conclusions from such a study, the protocol could be substituted by XQuery, and almost all existing software could be replaced by the aforementioned tools or by similar ones.
@


1.6
log
@Revision 6
@
text
@d22 1
a22 1
Still, one specific tool has been identified as a potential candidate to be used in a scenario similar to the one previously suggested. The XQuark project (XQuark Group - http://xquark.objectweb.org/index.html) is working on a suite of free and open source tools (XQuark Bridge and XQuark Fusion) that seem to implement the whole XQuery specification in a distributed environment. Both tools are platform independent (java), and each local data source needs to map its data structure against a federation schema through an object-relational mapping technique. Although the number of supported databases is still limited and the web services interface is not available yet, it would definitely be interesting to conduct a more detailed study to assess the viability of using these tools.
@


1.5
log
@Revision 5
@
text
@d8 1
a8 1
XQuery is part of a new generation of query languages and it aims at being far more generic than the existing options (SQL for instance). Statements can be expressed in order to act across different data sources at the same time, including structured and semi-structured documents, relational databases, object repositories, or even web services [http://www.w3.org/TR/xquery-use-cases/]. XQuery could be used not only on search and update operations, but also on transformation and re-structruring of data being retrieved.
d16 1
a16 1
The current XQuery specification version, although considered stable, is still in the stage of a working draft, and there are only a few implementations of it. Presently, most part of the available XML servers, and XML middleware tools are proprietary software [http://www.rpbourret.com/xml/ProdsMiddleware.htm], and not all of them support the complete XQuery specification. Using a commercial software would certainly be too expensive for networks with so many data providers. And the costs should not only include licenses, but equipment and training as well.
d18 1
a18 1
XQuery parsers have been mostly implemented in native XML databases, some of them are free and open source [http://www.rpbourret.com/xml/ProdsNative.htm]. But that would require frequent data exporting for almost all production databases, and performance could also be a problem when querying larger data sets. Moreover, most database administrators are still not familiar with XML database products.
d20 1
a20 1
On the other hand, only a few commercial relational databases have released prototype XQuery interfaces [http://www.oreillynet.com/lpt/wlg/4991], which is certainly not enough to cover the diverse infrastructure of existing data providers.
d22 1
a22 1
Still, one specific tool has been identified as a potential candidate to be used in a scenario similar to the one previously suggested. The XQuark project [[XQuark][Group - http://xquark.objectweb.org/index.html]] is working on a suite of free and open source tools (XQuark Bridge and XQuark Fusion) that seem to implement the whole XQuery specification in a distributed environment. Both tools are platform independent (java), and each local data source needs to map its data structure against a federation schema through an object-relational mapping technique. Although the number of supported databases is still limited and the web services interface is not available yet, it would definitely be interesting to conduct a more detailed study to assess the viability of using these tools.
@


1.4
log
@Revision 4
@
text
@a22 2

*Impact:* Every piece of software, except maybe the portals (broadcasters) which mainly distribute requests and send back the compiled result. But it would be possible to substitute the portals by XML servers too.
@


1.3
log
@Revision 3
@
text
@a11 2
-> should we include an image here representing the approach?

@


1.2
log
@Revision 2
@
text
@d8 1
d10 1
a10 1
XQuery emerges as a new generation of query languages and it aims at being far more generic than the existing options (SQL for instance). Its syntax allows for interfacing with different datasources at the same time (could be a collection of XML documents, a relational database, simple text files, web services, etc). XQuery could be used not only on search and update operations, but also on transformation and re-structruring of data.
d12 1
a12 1
Unfortunately, since it is a fairly recent technology, it would probably be very costly now. There seems to be no free available XML servers which are able to parse an XQuery request and to interface with different datasources on a multi-platform basis. The freely available implementations usually interface only with XML databases - which would require constant exporting of data from the working databases (mostly SQL based). On the other hand, overcoming the current limitations would mean writing parsers or drivers to XQuery, and that would be a tremendous ammount of work because of its complexity.
d14 11
a24 1
When robust XML servers become freely available, a possible architecture would be to write a thin wrapper that would receive XQuery requests from portals (referencing data elements from common data standards), then translate the concepts according to some local mapping, and finaly passing on the translated request to a local XML server packaged together with the wrapper (and configured to access some local generic datasource).
@


1.1
log
@Initial revision
@
text
@a15 6

*Comments:*

I think this proposal is too advanced to be suggested now.

-- RenatoDeGiovanni 28 jun 2004
@