Extending a High Level Mapping Language with Transformations
Publication Date: 2015-Jul-31
The IP.com Prior Art Database
Disclosed is a solution applied to systems for warehousing linked data that extends the mapping language with transformation specifications to apply to the data. The method leverages the SPARQL 1.1 Query Language hash function to transform and then store long Uniform Resource Locators (URLs), which are intended for use as delivery keys, in a compact way.
An existing design for a system for warehousing linked data based on a high-level mapping specification is a model-driven solution for warehousing Linked Data. A mapping file, which uses R2RML (the Relational Database (RDB) to Resource Description Framework (RDF) Mapping Language) and its own RDF language, is used as input to the data collection service. The mapping file specifies how to map RDF data to target database tables and columns. The specification mentions the Extract Transform Load (ETL) process for data, but does nothing to address the transformation of the data; the solution only extracts and loads data. This solution is inefficient, as data probably needs to undergo some transformation before it is warehoused. The existing specification does not make any provisions for this.
For example, an existing solution for collaborative lifecycle management provides solutions for requirements management, quality management, testing, change and configuration management, and project planning and tracking. Part of the implementation of these products includes Uniform Resource Locators (URLs) to uniquely identify web resources created in these management systems. These URLs are good candidates for uniquely identifying these records and hence a good field to specify as a delivery key in the mapping for the target table that will eventually host the data.
The novel contribution is a design and method to extend the mapping language with transformation specifications to apply to the data. The system and mapping definition strategy can be extended to provide better functionality for transforming data if needed. This can be achieved by applying a transformation, instructing the data collection service that the values need to undergo a transformation.
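The idea of attaching a transformation to a mapping entry can be sketched in Python as follows. This is only an illustration under stated assumptions: the disclosure does not show the mapping syntax, so the dictionary layout, the `TRANSFORMS` registry, the `build_row` function, the column names, and the sample record are all hypothetical stand-ins for the mapping file and the data collection service.

```python
import hashlib
from typing import Callable, Dict, List

# Hypothetical registry of named transformations that a data collection
# service could look up when a mapping entry requests one.
TRANSFORMS: Dict[str, Callable[[str], str]] = {
    "md5": lambda value: hashlib.md5(value.encode("utf-8")).hexdigest(),
}

# Hypothetical mapping entries: source field -> target column, with an
# optional transformation name (None means "load the value unchanged").
COLUMN_MAPPINGS: List[dict] = [
    {"source": "url", "column": "RESOURCE_KEY", "transform": "md5"},
    {"source": "title", "column": "TITLE", "transform": None},
]


def build_row(record: Dict[str, str], mappings: List[dict]) -> Dict[str, str]:
    """Extract each mapped value, apply its transformation if one is named,
    and return the row that would be loaded into the target table."""
    row: Dict[str, str] = {}
    for entry in mappings:
        value = record[entry["source"]]
        transform_name = entry.get("transform")
        if transform_name is not None:
            value = TRANSFORMS[transform_name](value)
        row[entry["column"]] = value
    return row


# Illustrative record; the URL and title are made up.
record = {
    "url": "https://example.org/cm/resource/workitem/12345",
    "title": "Login page times out",
}
row = build_row(record, COLUMN_MAPPINGS)
print(row)
```

In this sketch, entries without a transformation behave exactly like the extract-and-load-only design, so adding the optional field would not change how existing mappings are processed.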
For the URL example, the method can leverage the SPARQL 1.1 Query Language hash function to transform the URLs to the message-digest (MD5) value before storing the URLs in the data warehouse, effectively extending the current data collection service runtime capabilities with transformation capabilities.
This solution addresses the issue of storing information regarding arbitrarily long URLs with the intent of using these URLs as delivery keys. It leverages the SPARQL 1.1 Query Language hash function to transform and then store the URLs in a compact way. The fixed length of the hashed value makes the key more efficient for key comparison and performance in the database.
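The fixed-length property can be illustrated with a short Python sketch. It uses Python's `hashlib` in place of the SPARQL function, on the assumption that both produce the MD5 digest of the string as hexadecimal digits; the function name `url_to_key` and the example URLs are hypothetical. In SPARQL 1.1 itself, `MD5` takes a string literal, so a resource IRI would typically be converted first, for example `MD5(STR(?resource))`.

```python
import hashlib


def url_to_key(url: str) -> str:
    """Return the MD5 digest of the URL's UTF-8 bytes as 32 hex characters."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Made-up URLs of very different lengths.
short_url = "https://example.org/rm/1"
long_url = "https://example.org/rm/resources/" + "segment/" * 50 + "requirement-42"

for url in (short_url, long_url):
    key = url_to_key(url)
    # The key length stays at 32 no matter how long the URL is.
    print(len(url), "->", len(key), key)
```

Because MD5 maps arbitrary input to 128 bits, two different URLs could in principle produce the same key; one possible design choice, not stated in the disclosure, is to keep the original URL in a separate non-key column for reference.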
The disclosed solution uses a high-level mapping language and extends the capabilities by allowing the use of transformation specifications to further process data. The MD5 example above demonstrates one such possible transformation.
This idea builds on top of a solution for warehousing linked data based on a high-level mapping specification, by adding the capability to add transformation specifications to further process the data before it is stor...