JVMG Project Phase One Wrap-Up Workshop

In the last week of January, we had a fantastic experience hosting the JVMG project phase one wrap-up workshop at Stuttgart Media University with an amazing line-up of collaborators and supporters, some of whom have already contributed to the success of the project over the past three and a half years and others with whom we hope to work and collaborate going forward. What made this workshop so special was the lively dialogue between so many different backgrounds and interests: from representatives of the online enthusiast communities, through academic research librarians, developers and researchers working on various digital humanities and large-scale database projects in musicology, literary and fan studies and Japanese media, to researchers from game and media studies, anime and manga research, Japanese studies and even law, from all across Europe, Japan and North America.

The first panel featured an overview of the state of the art of the JVMG project as well as a summary of the lessons learned during the first phase of the project. First, Magnus Pfeffer introduced the project and discussed the technical problems and solutions in relation to data ingestion and merging as well as the frontend development process.

Then, Zoltan Kacsuk summarized our work (together with Simone Schroff, who was also present at the workshop) on the legal integration of the knowledge graph, before providing an overview of the data quality assessment we conducted on our data sources and its results.

Finally, Martin Roth explained the Tiny Use Case workflow methodology adopted from the diggr project, and how it helped shape both the media research undertakings within the project and the technical development of the knowledge graph. (The slides for all four of these presentations can be found at the end of this blogpost.)

The next panel showcased three of the main collaborations we are looking forward to pursuing in the potential next phase of the JVMG project. First, Federico Pianzola introduced the GOLEM (Graphs and Ontologies for Literary Evolution Models) project and the pilot research it is based on. One of the initial key questions that this research had to resolve was how to find data about the impact that stories have on readers. By turning to online fanfiction sites like AO3, which hosts around eight million stories in roughly forty thousand fandoms, Federico Pianzola was able to demonstrate in a pilot study on English-language Harry Potter fanfiction how the evolution of these stories can be analysed to identify an increase in the complexity of stories, as well as in the relationships and the themes depicted therein. The GOLEM project will work with fanfiction resources in a number of different languages (English, Spanish, Italian, Korean, and Indonesian) to create a graph database, and will take advantage of connecting their metadata to other linked open data sources such as Wikidata and the JVMG knowledge graph.

Next up, Bryan Hartzheim and Stevie Suan explained their work on researching the networks of production in anime and how these relate to the sources of creativity within the industry on the one hand, and to the transnationality of anime production on the other. They then highlighted the potential for combining more in-depth and mostly qualitative micro-analysis with the type of large-scale quantitative macro-analysis that the JVMG knowledge graph would enable. In a pilot study examining the career paths of famous anime directors, they pointed out how the role of episode directing (with its emphasis on directing first in television work) is often overlooked, even though it can be seen to be a common element of many career paths leading to directing. Another role in anime production, storyboarding, was also found to be foundational for later work in directing. Going forward, this work on researching networks of production in anime will expand on charting and analyzing career trajectories and patterns as well as the differentiation of roles within the anime industry, with the possibility of even engaging with formal aspects of anime and how they relate to the actual people responsible for making them.

Closing the second session, Martin Hennig took us on a tour of the state of the art in genre theory and highlighted some of the key problems that media specificity adds to working with genres. Research on genres often focuses, first, on production and/or reception, in which case genres are usually linked to market recognition; second, on textuality, focusing on narrative conventions and the way texts operate with genre conventions and expectations; or third, on media culture, highlighting transmedia storytelling and the way genre tropes can act as central connecting nodes. Genres are often interrogated in relation to culturality, in which case genres can be examined for their ideological content; in relation to transformations, focusing on the emergence of new variations and combinations; or in relation to the status of genre systems as a whole, drawing attention to the tension between standardization and innovation. Finally, video games were offered as an excellent example of the media dependence of genre systems, where genre labels refer to conventions of visualization, game mechanics, non-virtual games (e.g. soccer), other genres (e.g. horror), and so on. Working on questions of genre classification and analysis will be an exciting challenge for the further development of the JVMG knowledge graph.

Day two started with the highly anticipated presentations of the representatives of the three enthusiast communities who have shared their data with the JVMG project and who have worked with us since the first project kick-off workshop in Leipzig in 2019. First, Maria Pino offered an overview of not only the story and development of AnimeClick, but also the many new developments in the website, the database, the organization and the community engagement. The pandemic resulted in AnimeClick no longer being able to participate in conventions, and in order to keep their community alive they decided to establish a new form of direct contact on Twitch, where they now host five to seven live streams per week featuring guests, such as Italian translators and voice actors, and sometimes even international guests, or just interactive sessions with staff from the site. An aspect of the database that has been growing rapidly in recent years is data on live action drama, which is in part due to Netflix and other streaming providers featuring an increasing number of Japanese and Korean drama series.

Then, Stephen Goral (Rei), the creator of Anime Characters Database (ACDB), started his presentation by explaining how ACDB grew out of the initial idea to help identify the avatars that people were using on online forums back in the day, and also provided us with an update on his goal of adding a million quotes to the database. We were then treated to a very special behind-the-scenes look at the various types of problems encountered in relation to data inputs and structures and the mitigation strategies that have been developed to deal with them at ACDB. These included the various types of bad data (such as random and empty strings, as well as spam and malicious code), incorrect data and problematic user behaviours. He also described the ongoing commitment ACDB has demonstrated regarding the handling of objectionable and/or offensive data in their constant updating of their database, even changing their original mascot figure to make the community as non-exclusionary and friendly as possible to all users.

Finishing the series of presentations about the online enthusiast communities and their experiences with participating in the JVMG project, Yoran Heling, the creator of The Visual Novel Database (VNDB), took to the stage and offered a detailed overview of the many changes and updates that have happened since the first project workshop three and a half years ago. Many new features have been added to VNDB, including a powerful advanced search feature, internationalized visual novel titles, support for user reviews and more exact playthrough times, and further links to other databases. Yoran Heling also recounted how the last workshop was the final push for him to open up the VNDB data to the public, with almost the entire database now available for download under an ODbL license. Finally, we got to see a roadmap of the plans for the further development of VNDB, warranted in part by the way community activity on the site is only increasing over time.

In the next session, Hideyuki Ōtsubo, executive director of the Japan Animation Creators Association (JAniCA) and also representing the Non-Profit Organization Anime Tokusatsu Archive Centre (ATAC), took us on an in-depth tour of the work that goes into compiling the anime data in the Media Arts Database. One of the most exciting aspects of this work is how they approach gathering data on anime releases from the past and on anime that is being broadcast on TV, released in cinemas and published in various formats such as DVDs or Blu-ray Discs today. For the former they rely on various data compilations, such as the Annual Perfect Data of All Animation Works, which used to be published in Animage Monthly. For the latter we learned about the formidable operation that goes on to capture all current broadcast anime for their title and credits information, which is the ultimate source of ground truth for this domain. During the presentation we also learned about the many questions that have to be addressed both in relation to the bounds of what needs to be included in the Media Arts Database, and regarding the development of a data model with fine enough granularity to capture the multiple layers, versions and variations of anime production and its many (re)distributions.

The last part of the workshop featured lightning talks, with an increasingly rich discussion that both engaged with the contents of the individual talks and wove in the many common themes that had emerged over the course of the workshop.

The lightning talks started with Michele Newman from the GAMER Group talking about issues of game preservation. The Video Game Metadata Schema was an important first step towards the description and organization of metadata on video games. A central problem of game preservation work is that, although games are experiences of play, most preservation work does not address this. One approach is to try and use player-generated content such as walkthroughs, “Let’s Play” videos and so on. With a taxonomy created for them, content creators could themselves add metadata to these types of content. Going forward it is important to keep in mind that fans and creators (who are often archivists as well in a way) have a wealth of knowledge, and that a common understanding needs to be reached between them and game preservation workers.

Next, Kazufumi Fukuda from the Ritsumeikan Center for Game Studies, who is leading the work on the video games data collection for the Media Arts Database (MADB), discussed the question of data utilization. He first introduced two examples of the MADB data being used in different projects: the first one on learning data visualization with manga, and the second one providing a MADB quiz. He then directed attention to the divide between domain specialist researchers and those working on the creation and development of databases, and posed the question of what type of methodologies and strategies (e.g. documentation, events, etc.) we need for better community development with researchers who would be interested in working with the data. In closing we also learned that an online catalog is under active development at the Ritsumeikan Center for Game Studies, which builds on the frontend developed within the JVMG project.

Continuing on the topic of the Media Arts Database, but expanding the discussion towards more general problems of modeling cultural metadata, Shigeo Sugimoto offered a number of important insights in his lightning talk. First, he outlined the way the Media Arts Database is actually a part of a larger Media Art project initiated by the Agency for Cultural Affairs of the Japanese Ministry of Education, Culture, Sports, Science and Technology. Second, he drew attention to the way that different branches of the MADB have various data co-operations in place and different challenges facing them. For example, while the manga data community and co-operation is quite strong and advanced, collaborations in the game field are more difficult and slower, and as for anime, although archiving the products is clearer, preservation of production materials is still challenging. Third, he introduced the problem of item-centric (manga, game, anime (package)) vs. event-oriented/content-oriented (anime (broadcast) & art) archiving, leading into the question of preserving information on cultural works. Here (similar to the problem discussed by Michele Newman in her lightning talk) we are faced with the doughnut problem: we are trying to preserve the center, the experience itself, but we can only reasonably record the surrounding ring of descriptions and artifacts.

Taking the stage, Simone Schroff demonstrated once again how a legal perspective on cultural industries can yield a series of uncanny insights. Acknowledging both the positive and negative sides of copyright legislation and how it is still essential to the functioning of the creative industries despite its many flaws, she went on to introduce some key questions that have emerged during the presentations and discussions at the workshop, and which can serve as examples of how to enable dialogue between social science, arts and humanities perspectives on the one hand and law on the other. First, auteur discourse (going back all the way to Kant) is present in copyright as a rationale, and we could try and examine how this factors in on the industry side of the discourse in relation to creative production. Second, on the level of visual and artistic analysis we could examine the role of originality in the courts and its implications for and impact on the arts. Third, we could look at the connections between business models, licensing and brand management practices. And finally, fourth, we could ask how copyright interacts with other intellectual property (e.g. in the case of merchandising), as well as with trademarks and design law.

Next up, Marco Antonio Stranisci introduced The Under-Represented Writers (URW) Project, which takes as its starting point the problem that non-Western literature is still underrepresented in many digital archives. The project aims to address this problem through three main approaches: mapping Wikidata with other archives, extracting knowledge from unstructured data (a model was trained for extracting biographical events from Wikipedia biographies), and listening to what reader communities have to say (based on reviews from various platforms). An ontology network for aligning and mapping information about works in three platforms (Wikidata, Open Library and Goodreads) was developed in the URW project. This mapping work has resulted in a dramatic increase in representation after the merging of the three data sources, while the work on knowledge extraction and reader reviews is still ongoing.

Then, our former project member Luca Bruno presented on both his experience with the compilation of a complete list of anime in the JVMG project and his dissertation work on character intimacy games. In the latter he provides a new theoretical framework for understanding and engaging with visual novel games that builds on the player's perspective and experience, the affordances of these types of video games, the interplay of these two, as well as the (sub)cultural templates and literacy invoked during the course of the gameplay itself.

Next, José Andrés Santiago Iglesias introduced a new project that has just been submitted and is pending review. This project will focus on examining anime co-productions, both old and current works. Some of the questions the project hopes to explore include checking for the consistency of the staff between the different releases (by language and geographical area) of the works, and looking for shifts in emphases or even contestations in the attribution of creative work. They will also study how the dominance of certain topics shifts over time in co-productions, examine conventionalized aspects of the narrative as well as the evolution of styles through different periods, and explore how we can look for and identify such periods.

In the last lightning talk of the workshop, Lukas R.A. Wilde offered an insightful meta-commentary on one of the central questions underpinning so much of the work on Japanese visual media and beyond, a question we are grappling with through the new possibilities opened up by large-scale databases such as those developed in the GOLEM, URW and JVMG projects, as well as the MADB. Namely, can we pinpoint the problem of narrative factuality/reliability and intersubjectivity in the chronological development of the data we are working with? The talk also underlined the need to reflect on this problem as opposed to simply accepting the data as objectively given.

At the end of the two-day workshop everyone was quite exhausted and electrified at the same time, because we had witnessed a unique convergence of various disciplinary and domain-specific perspectives in dialogue with each other, and the opening up of new horizons of inquiry both theoretically and methodologically. We are very grateful to all the workshop participants for their contributions, both in presentation form and as part of the ongoing lively discussions, and we sincerely hope that we will be able to continue the dialogue that was initiated at this event in the coming years.