Linked Edit Rules (LER) is a methodology to publish, link, combine and execute edit rules on the Web as Linked Data in order to verify the consistency of statistical datasets.
Edit rules (or edits) are rules that encode a wide variety of consistency constraints on statistical data. For example, in a statistical dataset with demographic information on a population, a man cannot be pregnant, a minor cannot hold a driving license, and a child cannot be married. With LER, these constraints are encoded in Web standard formats and languages, such as RDF, that make them machine-readable and network-redistributable.
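To make the idea of an edit rule concrete, the three demographic examples above can be sketched as plain predicates over records. This is only an illustration in Python, not the LER encoding (which relies on RDF and SPARQL); the field names, the `violated_rules` helper and the age thresholds (18 and 16) are assumptions made for the example.

```python
from typing import Callable, Dict, List

# One record is one observation. Field names are hypothetical,
# not taken from any published LER rule set.
Record = Dict[str, object]

# Each edit rule is a predicate that returns True when the record VIOLATES it.
# The age thresholds (18, 16) are illustrative assumptions.
EDIT_RULES: Dict[str, Callable[[Record], bool]] = {
    "pregnant_male": lambda r: r["sex"] == "male" and r["pregnant"],
    "underage_driver": lambda r: r["age"] < 18 and r["has_license"],
    "married_child": lambda r: r["age"] < 16 and r["marital_status"] == "married",
}


def violated_rules(record: Record) -> List[str]:
    """Return the names of all edit rules the record violates."""
    return [name for name, violates in EDIT_RULES.items() if violates(record)]


records = [
    {"sex": "female", "age": 34, "pregnant": True,
     "has_license": True, "marital_status": "married"},
    {"sex": "male", "age": 12, "pregnant": False,
     "has_license": True, "marital_status": "married"},
]

for index, record in enumerate(records):
    print(index, violated_rules(record))
```

The first record passes every rule, while the second is flagged by both the driving-license rule and the marriage rule. Keeping each rule as a named, independent predicate is what makes it possible to publish and recombine rules separately.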
In LER, edit rules are expressed in SPARQL, combining the RDF Data Cube model and Semantic Web rule models. With these, edit rules can be distributed on the Web (on a site like this one) and later retrieved and combined to customize an inference engine. A lightweight LER vocabulary models the rules, their scope and their statistical components.
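The combination of rules from different sources, and the role of a rule's scope, can be sketched as follows. This is a simplified stand-in in Python rather than the SPARQL and LER vocabulary used by the methodology: the `EditRule` class, the two rule sets and the component names are all hypothetical, and "scope" is reduced here to the set of components a rule needs in order to be applicable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Observation = Dict[str, object]


@dataclass(frozen=True)
class EditRule:
    name: str
    scope: FrozenSet[str]                    # components the rule needs
    violates: Callable[[Observation], bool]  # True when the rule is broken


# Two hypothetical rule sets, standing in for rules retrieved
# from two different publishers on the Web.
census_rules = [
    EditRule("married_child", frozenset({"age", "marital_status"}),
             lambda o: o["age"] < 16 and o["marital_status"] == "married"),
]
wage_rules = [
    EditRule("negative_wage", frozenset({"wage"}), lambda o: o["wage"] < 0),
]


def applicable(rules: List[EditRule], components: FrozenSet[str]) -> List[EditRule]:
    """Keep only the rules whose scope is covered by the available components."""
    return [rule for rule in rules if rule.scope <= components]


def check(observations: List[Observation], rules: List[EditRule]) -> Dict[int, List[str]]:
    """Map each observation index to the names of the rules it violates."""
    report: Dict[int, List[str]] = {}
    for index, obs in enumerate(observations):
        usable = applicable(rules, frozenset(obs))
        failed = [rule.name for rule in usable if rule.violates(obs)]
        if failed:
            report[index] = failed
    return report


combined = census_rules + wage_rules
data = [
    {"age": 9, "marital_status": "married"},  # no wage component: wage rule skipped
    {"age": 40, "marital_status": "single", "wage": -100},
]
print(check(data, combined))
```

Because each rule declares the components it depends on, a combined rule set can be applied to datasets with different structures without failing on rules that do not apply.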
The consistency of any statistical dataset published on the Web as an RDF Data Cube can be checked, live, against user-specified LER published elsewhere. QBsistent is an implementation of this consistency-checking methodology on top of Stardog, an enterprise graph database.
As part of the methodology for checking statistical consistency, we bring together SPARQL, the Semantic Web query language, and R, a widely used language for statistical computing, in a Stardog extension. This extension expands the standard functionality of SPARQL in Stardog's query engine by allowing R functions, such as test statistics, to be called directly from SPARQL queries.
As a first use case, we implement a set of rules from the census domain as LER. Any RDF Data Cube published on the Web with census data can be checked for internal consistency using these rules.
As a second use case, we implement a set of rules from the economics domain as LER. Any RDF Data Cube published on the Web with price and wage data can be checked for internal consistency using these rules.