Exploring the JVMG knowledge graph

The JVMG project collects data from multiple sources and converts it into the RDF format. One of the core characteristics of this format is that all entities and attributes are identified by URIs, while the values of these attributes are either URIs (thus linking two entities through a property) or literal values.
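To make the distinction between URI values and literal values concrete, here is a minimal sketch in plain Python. It is only an illustration: the `URI` and `Literal` classes, the `http://example.org/` namespace, and the property names are hypothetical and are not taken from the JVMG data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    """An identifier for an entity or a property."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain value such as a name or a date."""
    value: str


BASE = "http://example.org/"  # hypothetical namespace, for illustration only

# Each statement is a (subject, property, value) triple.
triples = [
    # The value is a literal: it describes the entity but links to nothing.
    (URI(BASE + "character/1"), URI(BASE + "property/name"), Literal("Example Character")),
    # The value is a URI: the property links two entities.
    (URI(BASE + "character/1"), URI(BASE + "property/appearsIn"), URI(BASE + "work/7")),
]


def links(statements):
    """Return only the triples whose value is a URI, i.e. links between entities."""
    return [t for t in statements if isinstance(t[2], URI)]
```

Calling `links(triples)` keeps only the second statement, since it is the only one that connects two entities.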

The SPARQL language can then be used to formulate search queries on RDF stored in a database, but this requires the user to be familiar with both the query language and the structure of the RDF data.

As all entities and properties are identified by URIs, one way to explore RDF data is to run a web server for the domain in which the data URIs reside and have it show all information associated with a given URI.

This functionality is one of the main ideas of linked data: a linked data frontend can serve “raw” RDF data to programs that try to resolve a URI, while human users who resolve the same URI in a browser get a human-readable HTML view of all the data associated with it.

Such a frontend also allows for simple exploration and navigation of a dataset, as all URIs in the human-readable view can be made into clickable links.
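The decision between raw RDF and an HTML view is typically made from the HTTP `Accept` header that the client sends. The following is a simplified sketch of that idea and not the JVMG frontend's actual implementation: the function name `choose_representation` is hypothetical, and the code ignores `q` weights and simply takes the first recognised media type.

```python
# Media types commonly used for RDF serializations.
RDF_TYPES = (
    "text/turtle",
    "application/rdf+xml",
    "application/n-triples",
    "application/ld+json",
)
HTML_TYPES = ("text/html", "application/xhtml+xml")


def choose_representation(accept_header: str) -> str:
    """Pick a response media type from an HTTP Accept header.

    Simplification (assumption): q-values are ignored and the first
    recognised media type in the header wins.
    """
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RDF_TYPES:
            return media_type
        if media_type in HTML_TYPES:
            return "text/html"
    # Fall back to the human-readable view when nothing is recognised.
    return "text/html"
```

A browser that sends `text/html,application/xhtml+xml,*/*;q=0.8` would get the HTML view, while a program that sends `text/turtle` would get the raw data for the same URI.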


Turning Fan-Created Data into Linked Data II: Data Transformation

In a previous post, we discussed the creation of a Linked Data ontology that can be used to describe existing fan-created data that the JVMG is working with. For the ontology to work correctly, the data itself must also be converted into a Linked Data format, and so in this post we’ll be discussing the transformation of data, as it’s received from providers, into RDF.

To summarize, our workflow involves using Python and the RDFLib library inside a set of Jupyter notebooks to transform and export the data from all of the data provider partners. Data ingestion is also sometimes done using Python and Jupyter notebooks, but here we’ll just focus on the data transformation.
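As a rough illustration of what such a transformation step does, the sketch below turns one JSON-like record into N-Triples lines. It deliberately uses only the Python standard library rather than RDFLib so that it runs on its own. The record layout, the `http://example.org/` namespace, and the function names are assumptions made for this example, not the project's actual notebooks.

```python
def escape_literal(text: str) -> str:
    """Escape the characters that must not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record: dict, base: str = "http://example.org/") -> list:
    """Convert one record into a list of N-Triples statements.

    Assumptions: every record has an "id" key, nested dicts with an "id"
    refer to other entities, and everything else is a literal value.
    """
    subject = f"<{base}entity/{record['id']}>"
    lines = []
    for key, value in record.items():
        if key == "id":
            continue
        predicate = f"<{base}property/{key}>"
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict) and "id" in item:
                # A reference to another entity becomes a URI.
                obj = f"<{base}entity/{item['id']}>"
            else:
                # Anything else is stored as a plain string literal.
                obj = f'"{escape_literal(str(item))}"'
            lines.append(f"{subject} {predicate} {obj} .")
    return lines


# Hypothetical input record, as it might arrive from a provider's JSON API.
record = {"id": 1, "name": "Example Character", "appears_in": [{"id": 7}]}
for line in record_to_ntriples(record):
    print(line)
```

With a library such as RDFLib, the same step would build a graph object and let the library handle serialization. The manual string formatting here only serves to show the mapping from record fields to triples.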


Turning Fan-Created Data into Linked Data I: Ontology Creation

One of the primary functions of the JVMG project is to enable researchers to work with existing data in ways that are not readily enabled by the data providers themselves. One way in which we are attempting to facilitate this flexible data work is through the use of Linked Data. As we are working with a diverse set of data providers, the ways in which they create, store, and serve data are similarly diverse. Some of these providers are MediaWiki sites whose data is available as JSON through an API, while others are closer to searchable databases, with data stored in SQL and offered as large data dumps.

What remains constant across these data providers is our general data workflow: data must be accessed in some way, analyzed so that a suitable ontology can be created to represent it, transformed into a Linked Data format (in our case RDF), and finally made available so that researchers can work with it. To give readers an idea of what this workflow looks like and how the data we work with is altered to meet the needs of researchers, we’ll be going over a couple of these steps in separate blog posts. Here, we’ll talk about the creation of the ontology based on how data providers describe their own data, and in a follow-up post, we’ll talk about some technical aspects of data transformation.
