Our project has so far transformed data from multiple online enthusiast communities as well as other data-centric projects into the RDF data format, and it is now available online as Linked Open Data. The next step is matching the corresponding entities in each source and merging the properties that each source provides.
Milestone: Public access to the knowledge graph
The project has reached an important milestone. The collected data from six sources is available in RDF and can be viewed on the mediagraph.link domain. Currently, only the transformed original data is available, as we are still working to complete the next step in the data integration process.
We would like to use this opportunity to thank all the enthusiast communities who make data available under a free license on the web, and specifically the communities who kindly supported the project by attending our initial workshop and exchanging ideas on data in the Japanese visual media domain with us. We are especially grateful to the communities who have agreed to offer us a specific open license (detailed information is available here) for the parts of their data that have been integrated into our database.
Exploring the JVMG knowledge graph
The JVMG project collects data from multiple sources and converts it into the RDF format. One of the core characteristics of this format is that all entities and attributes are represented as URIs, while the values of said attributes are either URIs (thus linking two entities via a property) or literal values.
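As a minimal sketch of this idea (the namespace and entity names below are invented for illustration and are not the actual JVMG vocabulary), a URI-valued property and a literal-valued property can be written with RDFLib like this:

```python
# Minimal sketch: one URI-valued and one literal-valued property.
# The namespace and names are placeholders, not actual JVMG data.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

EX = Namespace("https://example.org/jvmg-demo/")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)

# A URI-valued property links two entities...
g.add((EX.character_1, EX.appearsIn, EX.work_1))
# ...while a literal-valued property attaches a plain value to an entity
g.add((EX.character_1, RDFS.label, Literal("Example character")))

print(g.serialize(format="turtle"))
```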
The SPARQL language can then be used to formulate search queries on RDF stored in a database, but this requires the user to be familiar with both the query language and the structure of the RDF data.
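To illustrate what such a query looks like (again with invented example data rather than the JVMG graph itself), a SPARQL query can be run against a small RDF graph directly through RDFLib:

```python
# Self-contained sketch: parse a tiny illustrative graph, then query it
# with SPARQL via RDFLib. Data and vocabulary are placeholders.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <https://example.org/jvmg-demo/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:character_1 ex:appearsIn ex:work_1 ;
               rdfs:label "Example character" .
""", format="turtle")

results = g.query("""
PREFIX ex: <https://example.org/jvmg-demo/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?character ?label WHERE {
    ?character ex:appearsIn ex:work_1 ;
               rdfs:label ?label .
}
""")

for row in results:
    print(row.character, row.label)
```

Writing even this small query already requires knowing that characters are linked to works via `ex:appearsIn`, which is exactly the kind of structural knowledge a casual user cannot be expected to have.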
As all entities and properties are identified by URIs, one way to explore RDF data is to run a web server that serves the domain in which the data URIs reside and shows all information associated with a given URI.
This functionality is one of the main ideas of linked data: a linked data frontend can serve “raw” RDF data to programs that try to resolve a URI, while human users who resolve the same URI in a browser get a human-readable HTML view of all the data associated with it.
Such a frontend also allows for simple exploration and navigation of a dataset, as all URIs in the human-readable view can be made into clickable links.
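The sketch below shows the general content-negotiation pattern with Flask and RDFLib. It is not the JVMG frontend, and all URIs and data in it are placeholders; it only illustrates the idea of serving raw RDF to programs and an HTML view with clickable links to browsers.

```python
# Content-negotiation sketch, NOT the JVMG frontend: programs asking for
# Turtle get raw RDF, browsers asking for HTML get a clickable view.
from flask import Flask, Response, request
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDFS

EX = Namespace("https://example.org/resource/")  # hypothetical namespace

# A toy dataset standing in for the knowledge graph
g = Graph()
g.add((EX.character_1, RDFS.label, Literal("Example character")))
g.add((EX.character_1, EX.appearsIn, EX.work_1))

app = Flask(__name__)

@app.route("/resource/<name>")
def resource(name):
    subject = EX[name]
    triples = list(g.triples((subject, None, None)))
    best = request.accept_mimetypes.best_match(["text/html", "text/turtle"])
    if best == "text/turtle":
        # Programs resolving the URI get "raw" RDF for this resource
        sub_graph = Graph()
        for triple in triples:
            sub_graph.add(triple)
        return Response(sub_graph.serialize(format="turtle"),
                        mimetype="text/turtle")
    # Browsers get a human-readable view; URI values become clickable links
    rows = "".join(
        f'<li>{p}: <a href="{o}">{o}</a></li>' if isinstance(o, URIRef)
        else f"<li>{p}: {o}</li>"
        for _, p, o in triples
    )
    return f"<h1>{subject}</h1><ul>{rows}</ul>"

if __name__ == "__main__":
    app.run()
```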
Turning Fan-Created Data into Linked Data II: Data Transformation
In a previous post, we discussed the creation of a Linked Data ontology that can be used to describe existing fan-created data that the JVMG is working with. For the ontology to work correctly, the data itself must also be converted into a Linked Data format, and so in this post we’ll be discussing the transformation of data, as it’s received from providers, into RDF.
To summarize, our workflow involves using Python and the RDFLib library inside a set of Jupyter notebooks to transform and export the data from all of the data provider partners. Data ingestion is also sometimes done using Python and Jupyter notebooks, but here we’ll just focus on the data transformation.
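To give a rough flavour of what such a notebook cell can look like (the record fields, namespaces, and property names below are invented for the example and do not reflect any particular provider’s schema), a provider record can be mapped onto RDF triples roughly like this:

```python
# Simplified transformation sketch: mapping one provider record onto RDF
# triples with RDFLib. Field names and vocabulary are illustrative only.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

DATA = Namespace("https://example.org/data/")      # hypothetical namespaces
ONTO = Namespace("https://example.org/ontology/")

# A record as it might arrive from a provider's API or data dump
record = {"id": "42", "name": "Example work", "year": "2001", "creator_id": "7"}

g = Graph()
g.bind("onto", ONTO)

subject = DATA[f"work_{record['id']}"]
g.add((subject, RDF.type, ONTO.Work))
g.add((subject, RDFS.label, Literal(record["name"])))
g.add((subject, ONTO.releaseYear, Literal(record["year"])))
# Identifiers pointing at other records become links between entities
g.add((subject, ONTO.createdBy, DATA[f"person_{record['creator_id']}"]))

# Export the result, e.g. for loading into the triple store
g.serialize(destination="work_42.ttl", format="turtle")
```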
Turning Fan-Created Data into Linked Data I: Ontology Creation
One of the primary functions of the JVMG project is to enable researchers to work with existing data in ways that are not readily enabled by the data providers themselves. One way in which we are attempting to facilitate this flexible data work is through the use of Linked Data. As we are working with a diverse set of data providers, the ways in which they create, store, and serve data are similarly diverse. Some of these providers run MediaWiki sites, with data being available as JSON through the use of an API, while others are closer to searchable databases, with data existing as SQL and being offered in large data dumps.
What remains constant across these data providers is our general data workflow: data must be accessed in some way, analyzed so that a suitable ontology can be created that is able to represent it, transformed into a Linked Data format (in our case RDF), and finally made available so that it can be worked with by researchers. To give readers an idea of what this workflow looks like and how the data we work with is altered to help it meet the needs of researchers, we’ll be going over a couple of these steps in separate blog posts. Here, we’ll talk about the creation of the ontology based on how data providers describe their own data, and in a followup post, we’ll talk about some technical aspects of data transformation.
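To give a concrete flavour of the output of the ontology-creation step (the class and property below are invented for illustration and are not the actual JVMG ontology), a small ontology fragment can itself be declared as RDF, for example with RDFLib:

```python
# Minimal sketch of declaring ontology terms as RDF with RDFLib.
# The class and property are illustrative, not the JVMG ontology.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

ONTO = Namespace("https://example.org/ontology/")  # hypothetical namespace

g = Graph()
g.bind("onto", ONTO)

# A class for one kind of entity described by a data provider
g.add((ONTO.Work, RDF.type, OWL.Class))
g.add((ONTO.Work, RDFS.label, Literal("Work", lang="en")))

# A property derived from how the provider labels its own fields
g.add((ONTO.releaseYear, RDF.type, OWL.DatatypeProperty))
g.add((ONTO.releaseYear, RDFS.domain, ONTO.Work))
g.add((ONTO.releaseYear, RDFS.label, Literal("release year", lang="en")))

print(g.serialize(format="turtle"))
```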