Tiny Use Case 3 Part II: Stratifying Our Dataset

In the first part of this blogpost, we outlined our investigation into recurring practices of character design in visual novel games, employing character data from The Visual Novel Database (VNDB). To map these practices, we visualized our dataset as a network of nodes and examined its modularity and the eigenvector centrality of its subnetworks. Through the combined examination of modularity and eigenvector centrality, we were able to observe patterns of trait distribution across our dataset. We identified three trait communities, one of which included nearly all character traits describing character sexual activity and pornographic depictions. The gendered distribution of types of pornography in the field of visual novel games prompted us to stratify our dataset according to characters’ intended audiences. This second part of our blogpost describes the results of our data stratification.

Visual novel games feature depictions of intimacy between the player’s avatar and the game’s characters, which may include sexual intercourse depicted in pornographic fashion. Having most traits describing sexual activities and pornographic depictions of characters cluster into one community was very interesting for us, as it meant that above all, these “sexual/pornographic traits” co-occur with each other before anything else. This offered us an interesting insight regarding character design practices in visual novel games: once a character is depicted as sexually active by way of pornographic representation, it is highly likely that sexual activity is depicted in a variety of ways. If characters were represented in pornographic fashion only once, or maybe twice in the game, we could see a different distribution of traits in our dataset, rather than what we observed in our communities of traits.
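The idea that a community's traits co-occur with each other before anything else is what a modularity score quantifies: the share of links that fall inside communities, minus the share that would be expected if links were placed at random while keeping each node's number of links. The sketch below computes that score for a toy network. It is an illustration only, assuming an unweighted trait co-occurrence network; the trait names, the `modularity` helper and the two-community split are hypothetical and are not taken from our dataset or tooling.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted network.

    Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    internal = {}    # edges with both ends inside the same community
    degree_sum = {}  # summed degree of each community's nodes
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Hypothetical toy network: two tightly linked groups of traits joined by one edge.
edges = [
    ("Trait A", "Trait B"), ("Trait B", "Trait C"), ("Trait A", "Trait C"),
    ("Pale Skin", "Slim"), ("Slim", "Teen"), ("Pale Skin", "Teen"),
    ("Trait C", "Pale Skin"),
]

two_groups = {"Trait A": 1, "Trait B": 1, "Trait C": 1,
              "Pale Skin": 2, "Slim": 2, "Teen": 2}
one_group = {trait: 1 for trait in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 3), round(q_one, 3))
```

In this toy case the two-group split scores about 0.36, while lumping every trait into a single community scores 0. Community detection searches for the split with the highest score, which is why traits that mostly co-occur with each other end up in the same community.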

The existence of a community comprising nearly all traits describing character sexual activity and pornographic depictions prompted us to refine our approach, as our initial analysis had not accounted for specificities such as intended audiences. In particular, the majority of pornographic character depictions in visual novel games are found in games directed at male audiences, which feature a male player avatar and female characters as the subject of romantic interaction. Visual novel games directed at female audiences featuring a female avatar and male romanceable characters, on the other hand, tend to forego pornographic depictions of characters. There is also a third group, visual novel games directed at female audiences featuring male player avatars and male characters as the subject of romantic interactions, where pornography is present, but less frequently than in games intended for male audiences.

Considering the above specificities, we decided to stratify our dataset to account for the gendered distribution of pornography across the field of visual novel games. To do so, we took the combination of character gender and being the target of romantic interaction – the character possessing a storyline leading into one of the game’s endings – as a marker for the intended audience of the character’s host game. On The Visual Novel Database, most characters have one of two genders, male or female, and can have one of three possible roles: ‘protagonist’, ‘main character’ and ‘side character’. Protagonists are player avatar characters; they are neither the target of romance nor on screen for long. Main characters are characters who can be romanced by the player and who possess a storyline leading into one of the game’s endings. Side characters are characters who cannot be the subject of romance and do not possess a storyline.

Our new approach would forego analysis of protagonists, as they serve as player avatars and seldom have developed personalities requiring extensive description. More importantly, they are not subjects in a player-character intimate interaction, but rather a tool for the player to interact with other characters. After removing protagonist characters, we divided our dataset into four sub-datasets, each grouping characters according to a combination of gender and character role: female main characters, female side characters, male main characters and male side characters. We then repeated our analysis, calculating modularity and eigenvector centrality, on each of the four sub-datasets. This revealed a significant numerical disparity between female and male characters, which was in line with our initial expectations based on the market of visual novel games.
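The stratification step described above can be sketched in a few lines. This is a minimal illustration rather than our actual pipeline: the record layout, the field names (`gender`, `role`, `traits`) and the sample characters are assumptions made for the example and do not reflect VNDB's real schema.

```python
from collections import defaultdict

# Hypothetical character records; field names are assumed for illustration.
characters = [
    {"name": "A", "gender": "female", "role": "main", "traits": ["Pale Skin", "School Uniform"]},
    {"name": "B", "gender": "female", "role": "side", "traits": ["Maid (Role)"]},
    {"name": "C", "gender": "male", "role": "protagonist", "traits": []},
    {"name": "D", "gender": "male", "role": "main", "traits": ["Childhood Friend", "Classmate"]},
    {"name": "E", "gender": "male", "role": "side", "traits": ["Tall", "Suit"]},
]


def stratify(records):
    """Group non-protagonist characters by (gender, role)."""
    strata = defaultdict(list)
    for record in records:
        if record["role"] == "protagonist":
            continue  # protagonists are player avatars and are left out
        strata[(record["gender"], record["role"])].append(record)
    return dict(strata)


strata = stratify(characters)
for key, members in sorted(strata.items()):
    print(key, [member["name"] for member in members])
```

Each of the resulting groups would then be turned into its own trait network and analysed separately. Characters whose gender is neither male nor female would land in additional groups under this sketch and would need an explicit decision.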

There are many more female characters and games directed at a male audience than characters and games directed at a female audience. The numbers allow us to derive a very clear picture of visual novel games intended for male audiences; however, they leave us with a fuzzier picture of visual novel games intended for female audiences. Calculating each sub-dataset’s modularity and the eigenvector centralities of its main subnetworks returned the following results:

Female Main Characters

Community | 10 highest ranking traits by eigenvector centrality | Percentage within the dataset
1 | ‘Pale Skin’, ‘Slim Body’, ‘Teen (Apparent age)’, ‘Explicit Trait 14’, ‘Waist Length+ (Hair)’, ‘Sidehair (Hair Tail)’, ‘School Uniform’, ‘Straight Hair’, ‘Average Height’, ‘Thigh-high stockings’ | 60.38%
2 | ‘Explicit Trait 1’, ‘Explicit Trait 2’, ‘Explicit Trait 8’, ‘Explicit Trait 3’, ‘Explicit Trait 7’, ‘Long (Hair)’, ‘Explicit Trait 11’, ‘Explicit Trait 5’, ‘Explicit Trait 12’, ‘Explicit Trait 13’ | 35.73%
3 | ‘Thigh-High Boots’, ‘Corset’, ‘Pointed ears’, ‘Princess (Role)’, ‘Olive (Skin tone)’, ‘Sword’, ‘Crown’, ‘Explicit Trait 15’, ‘Tattoo’, ‘Plate Armour’ | 3.89%

Female Side Characters

Community | 10 highest ranking traits by eigenvector centrality | Percentage within the dataset
1 | ‘Pale Skin’, ‘Slim’, ‘Average Height’, ‘Blue eyes’, ‘Young-adult’, ‘Tareme’, ‘Violet Eyes’, ‘Green’, ‘Watashi’, ‘Kind’ | 37.35%
2 | ‘Explicit Trait 14’, ‘Explicit Trait 1’, ‘Explicit Trait 2’, ‘Explicit Trait 3’, ‘Explicit Trait 4’, ‘Explicit Trait 15’, ‘Explicit Trait 5’, ‘Explicit Trait 12’, ‘Explicit Trait 16’ | 26.73%
3 | ‘Amber Eyes’, ‘Thigh-high Stockings’, ‘Tsurime’, ‘Big Breast Sizes’, ‘Red Eyes’, ‘Fighting’, ‘Headband’, ‘Bracelet’, ‘Gloves’, ‘Necklace’ | 23.39%
4 | ‘Teen’, ‘School Uniform’, ‘Miniskirt’, ‘High School Student’, ‘Ribbon Hair Tie’, ‘Ribbon Tie’, ‘Hairpin’, ‘Knee-High Socks’, ‘Necktie’, ‘Pleated Skirt’ | 12.2%
5 | ‘Maid’s Dress’, ‘Maid’s Headdress’, ‘Maid (Role)’, ‘Sash’, ‘Comedian’ | 0.23%

Note: there are actually seven communities, but two of them have been omitted from this table because each consists of a single trait: ‘Tone Deafness (Subject of Disability, Health Related)’ and ‘Amphibian (Role Animal, Nature)’.

Male Main Characters

Community | 10 highest ranking traits by eigenvector centrality | Percentage within the dataset
1 | ‘Explicit Trait 8’, ‘Explicit Trait 1’, ‘Explicit Trait 18’, ‘Explicit Trait 7’, ‘Explicit Trait 19’, ‘Explicit Trait 10’, ‘Explicit Trait 17’, ‘Explicit Trait 20’, ‘Explicit Trait 2’, ‘Explicit Trait 6’ | 32.74%
2 | ‘Fighting (Engages in)’, ‘Death’, ‘Confinement’, ‘Murder (Engages in)’, ‘Avoidable Death’, ‘Teasing (Engages in)’, ‘Planning’, ‘Smart’, ‘Arrogant’, ‘Unarmed Fighting’ | 28.86%
3 | ‘Young Adult’, ‘Pale Skin’, ‘Trousers’, ‘Tall’, ‘Hosome’, ‘Slim’, ‘Blue Eyes’, ‘Amber Eyes’, ‘Belt’ | 25.43%
4 | ‘Teen (Apparent Age)’, ‘School Uniform’, ‘High School Student’, ‘Ore’, ‘Average Height’, ‘Brown eyes’, ‘Childhood Friend’, ‘Classmate’, ‘Friendly’, ‘Friend’ | 12.87%

Male Side Characters

Community | 10 highest ranking traits by eigenvector centrality | Percentage within the dataset
1 | ‘Explicit Trait 15’, ‘Explicit Trait 17’, ‘Explicit Trait 20’, ‘Cross-dressing’, ‘Bracelet’, ‘Explicit Trait 8’, ‘Explicit Trait 2’, ‘Explicit Trait 5’, ‘Explicit Trait 4’, ‘Not a Virgin’ | 22.67%
2 | ‘Tall’, ‘Fighting (Engages in)’, ‘Adult Body’, ‘Muscular’, ‘Grey hair’, ‘Suit’, ‘Death (Subject of)’, ‘Black Eyes’, ‘Long Hair’, ‘Red Eyes’ | 47.94%
3 | ‘Short Hair’, ‘Pale Skin’, ‘Slim Body’, ‘Brown Hair’, ‘Trousers’, ‘Young Adult’, ‘Black Hair’, ‘Hosome’, ‘Brown eyes’, ‘Teen’ | 29.39%

While we continued to observe a strong tendency for sexual/pornographic traits to cluster together, we could see interesting differences in our four stratified datasets. We could observe select sexual activity traits in communities other than those clustering traits related to sexual activities and pornographic depictions. Seeing sexual/pornographic traits in other communities suggested to us that, when character gender and role are taken into account, the number of depictions of character sexual activity changes across different character populations.

At the same time, we could now see a higher degree of thematic commonality – traits pointing at specific situations, social and family status and more – in each community’s highest eigenvector centrality traits. The increasing thematic commonality in high eigenvector centrality traits in our stratified datasets suggested that select groups of character traits might possess an ‘archetypal’ function. These archetypes – a model for characters described with the traits grouped within a subnetwork – vary on the basis of the character’s intended audience, and do not apply to the field of visual novel games in its entirety.

For example, the first community of the female main characters dataset, comprising 60.38% of all nodes in the network, suggests a high prevalence of characters that attend high school. Among the community’s highest-ranking traits by eigenvector centrality, we observed the presence of traits describing specific pieces of clothing (‘School Uniform’ and ‘Thigh-high Stockings’) and social statuses (‘High School Student’) rather than a sequence of nondescript visual traits. While high eigenvector centrality in a group of traits does not guarantee that these traits will always co-occur, it offers a reasonable assurance that these traits co-occur frequently with each other and with other traits across the node community.
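To make the reading of eigenvector centrality above more concrete, the following sketch builds a small trait co-occurrence network and ranks its traits. It is a simplified illustration under stated assumptions: traits are nodes, two traits are linked whenever they appear on the same character, and link weights count how often that happens. The character trait sets are invented, and the helper names (`cooccurrence_weights`, `eigenvector_centrality`) are hypothetical rather than part of any tool we used.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical characters, each reduced to its set of traits.
character_traits = [
    {"School Uniform", "Teen", "Thigh-high Stockings"},
    {"School Uniform", "Teen", "Pale Skin"},
    {"School Uniform", "Thigh-high Stockings", "Pale Skin"},
    {"Teen", "Pale Skin"},
    {"Sword", "Plate Armour"},
    {"Sword", "Pale Skin"},
]


def cooccurrence_weights(trait_sets):
    """Count how often each pair of traits appears on the same character."""
    weights = Counter()
    for traits in trait_sets:
        for a, b in combinations(sorted(traits), 2):
            weights[(a, b)] += 1
    return weights


def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on a connected, undirected, weighted network."""
    neighbours = {}
    for (a, b), w in weights.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    score = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbours' weighted scores;
        # keeping the own score avoids oscillation without changing the ranking.
        new = {node: score[node] + sum(w * score[other] for other, w in links.items())
               for node, links in neighbours.items()}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {node: value / norm for node, value in new.items()}
        if max(abs(new[node] - score[node]) for node in new) < tol:
            return new
        score = new
    return score


centrality = eigenvector_centrality(cooccurrence_weights(character_traits))
ranking = sorted(centrality, key=centrality.get, reverse=True)
for trait in ranking:
    print(f"{trait}: {centrality[trait]:.3f}")
```

In this toy data ‘School Uniform’ ranks first: it is linked to other traits that are themselves well connected. ‘Plate Armour’ ranks last, since its only link is to the weakly connected ‘Sword’. A trait scores highly because of who its neighbours are, not only because of how many it has.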

Traits like ‘School Uniform’ do not just describe a piece of clothing connected with an educational institution. ‘School Uniform’ points to a setting (a school), a social status (being a student) and to the narratives that can be developed in a school setting. Looking at ‘Teen (Apparent age)’ and ‘Thigh-high Stockings’ together with ‘School Uniform’ reinforces the connection with a setting, a social status and its potential narratives. We can therefore reasonably assume that there is a recurring set of traits employed in character design for visual novel games intended for male audiences, namely a teenage character that attends secondary educational institutions, and narratives involving romance in high school settings.

Similar archetypal ensembles can be seen in other datasets, such as the fourth community of the male main characters dataset. This community is much more limited in scale and comprises only 12.87% of all traits in its dataset. ‘Childhood Friend’, ‘Classmate’ and ‘Friend’, along with ‘Teen (Apparent Age)’ and ‘School Uniform’, outline an even stronger archetype: a high school student who is a childhood friend of the protagonist character. In the context of male main characters, these traits ranking high in eigenvector centrality point to narratives, social status and settings connected with school life more strongly than their female counterparts do. On the other hand, the percentage of characters actually featuring in high school settings in visual novel games directed at female audiences is certainly lower, as the fourth node community in the male main characters dataset comprises a much smaller percentage of all traits.

Beyond high school student characters, we could observe different distributions of traits describing character sexual activity and pornographic depictions across different character roles. In male main characters, we can observe that, among traits pointing to character sexual activity, traits referring to homosexual intercourse (‘Explicit Trait 20’) or practices that can be related to homosexual erotic activity (‘Explicit Trait 8’, ‘Explicit Trait 1’) possess high eigenvector centrality. If we contrast this node community with community number four, we surmise that pornography in the male main characters dataset possesses a specific trajectory towards depictions of homosexual intercourse.

A similar division, albeit not as clear cut, can be observed in female main characters. While traits describing character sexual activity cluster together, this time pointing at heterosexual intercourse, there is one trait found outside the community, ‘Explicit Trait 14’. The presence of ‘Explicit Trait 14’ outside of the second, pornography-centered node community allows us to envision a dividing line amongst visual novel games directed at a male audience. Games where characters tend to be depicted in a variety of pornographic and sexual acts, rather than having a focus on one specific depiction, might offer a different type of experience than visual novel games where character pornography is present but not at the forefront of the game’s experience.

We also need to specify that depictions of characters performing sexual intercourse with characters of the same sex do not necessarily imply a non-heteronormative identity. Pornography in visual novel games is still mostly intended for heterosexual audiences of both genders. However, looking at our data allows us to highlight the fault lines in what is apparently a production niche featuring characters similar in aesthetics and mannerisms, with an apparent tendency toward global self-sameness. Observing the different distributions of sexual/pornographic traits allows us to draw a very significant division in visual novel games directed at different audiences.

Our data analysis allows us to conclude that our dataset hints at significant numbers of characters being designed with a specific experience in mind. In the case of female main characters attending high school, it points to a setting, a social status and to an array of modes of relating with the player, which would be different in other settings. In the case of male main characters, the archetype suggests an even tighter trajectory towards a certain array of interactions, pointing to a background narrative, to an existing friendship and to a current social status. Furthermore, the absence of traits describing sexual activity in male main characters arguably shifts the focus of the interactive experience away from pornography and towards non-sexual emotional engagement.

The results prompt us to return to our initial research question, namely the presence of recurring practices in the field of visual novel games. By examining the node communities we have derived from our stratified dataset, we can conclude that there are recurring practices of character design in visual novel games. In fact, the thematic commonality of high eigenvector centrality nodes allows us to surmise that there are also strong hints of the use of character archetypes in visual novel game character creation. Our results would not be possible if VNDB’s data didn’t follow its highly granular data model, which allowed us to create our dataset.

In turn, this answers our second research question, namely the viability of fan-curated data for research use: VNDB’s data, which is tailored to the needs of fans rather than researchers, allows a perspective on visual novel game characters that would not be possible with other databases not focused on characters. In fact, fans’ need to describe characters in such a precise way has allowed us to derive a sort of baseline for characters in visual novel games, against which single characters of interest can be compared.

The possibilities offered by our data-driven approaches are potentially manifold, and as with other TUCs, remain to be explored. While in-depth case studies of characters are outside this TUC’s scope, the possibility of fixing a baseline for further research is an interesting development which warrants further exploration.