About the project

Original proposed goals

The project aims to create a research database on Japanese visual media, including, but not limited to, anime, manga, computer games, and visual novels. It is aimed at researchers in Japan studies who focus on modern media and its expressions, themes, topics, characters, and reception. We envision a graph-based, highly interconnected database structure, similar to the Google Knowledge Graph, combined with a flexible search interface and analytic tools.

We intend to use the data on Japanese visual media that is being created and curated by the many enthusiast communities on the web. An initial survey of several larger community websites showed an incredible depth of information, a deep understanding of the source material, and close attention to detail on the part of the volunteer contributors. Making contact with these communities and learning about their needs and motivations is therefore one of the main project elements. We intend to engage in a meaningful discussion with representatives and administrators of the community sites in order to establish a long-term cooperation that benefits both sides.

The architecture will be completely open source and will most likely be based on the software stack that is being developed in the Wikidata project. Wikidata is one of the more advanced deployments of a huge graph-based database with integrated search and visualization features, and it already has an established development community. Its software stack also provides means to annotate and propose changes to the data in a well-documented and traceable way.

The whole development phase will be accompanied by researchers from Japan studies, who will be responsible for data selection as well as data quality assurance. They will make sure that both the data model and the architecture of the prototype support their research by conducting example research and verifying the results. Once a first prototype has been developed to the satisfaction of the project partners, it will be made accessible to the larger Japanese media studies research community as well as to other researchers who are interested in the data harvesting and modelling aspects.

Second project phase 2023-2026

The project has entered its second phase, again funded by the German Research Foundation (DFG). In the first three years, we approached several enthusiast communities and received permission to use their data as a starting point for a knowledge graph in the domain of Japanese visual media. We learned a lot about the workflows and motivations of enthusiast communities, and confirmed the high quality of their data through a thorough assessment of statistically representative random samples.

We also entered a lengthy discussion on the licensing terms of said data and found a solution that both allows all relevant uses in research and protects the interests of the enthusiast communities. The specific licence is compatible with many other open data collections on the web and also allows other groups or individuals to use the knowledge graph in their own local projects.

The data integration process is fully developed and can easily be extended to further data sources. The main site of the knowledge graph is run using an open source graph database, an open source search engine, and a custom web frontend that facilitates searching and browsing. Ad-hoc data analysis is supported by plugins that can be developed and deployed independently of the main frontend. All custom developments are published on GitHub under an open licence.

In addition, we have demonstrated not only the quality but also the general utility of the collected data for research in the Japanese visual media domain using the “tiny use case” approach. These tiny use cases showcase different data-driven research approaches and help researchers understand the potential and limits of the knowledge graph data.

The second project phase will focus on consolidating and further disseminating the results of the first phase. We plan to include more data sources, conduct joint research with multiple international collaborators, and refine the technological foundations of our services. A series of hands-on workshops and comprehensive documentation will support students and researchers who want to explore the data in their own work.