Tiny Use Case 2: Can we test one of the points from Hiroki Azuma’s “Otaku: Japan’s Database Animals” with the JVMG database? Part 4: Questions of validity and the theoretical implications of our results

It has been quite a journey getting to this fourth part in our series on Tiny Use Case 2. We started out by introducing Hiroki Azuma’s discourse-defining work, Otaku: Japan’s Database Animals, and picking out a claim that would be worth examining on the JVMG database. Next, we introduced the two databases we were employing for our analysis, The Visual Novel Database (VNDB) and Anime Characters Database (ACDB), and examined some key descriptive statistics. Finally, in part three we employed the toolkit of regression analysis to see whether our two hypotheses are confirmed or contradicted by the data at our disposal. Our hypotheses were:

    1. The portion of new characters with shared traits should increase over time.
    2. The portion of shared traits among new characters should increase over time.

We found that our first hypothesis was not substantiated by our regression analyses, and we found no adequate regression model for testing our second hypothesis.

Now, in this fourth part of the series we will first assess our approach and the validity of our results. Then we will consider what implications our findings could have for Azuma’s original argument and the wider theoretical discourse on anime and manga.

Can we be certain we actually did what we set out to do?

Starting with some potential critiques that could be raised in relation to our work, the three most obvious would be:

    • The data was wrong or insufficient.
    • Our analysis was wrong.
    • We misinterpreted what Azuma is saying, and thus our hypotheses were wrong.

We know from our analysis that the data is indeed insufficient. On the one hand, the attention paid to providing information on character traits is clearly uneven across publication dates. On the other hand, it is almost certain (especially for works with publication dates in the latter half of the 2010s) that a number of works and characters are missing from the databases. How do we know this? First of all, the trends we saw for the number of characters and for the average number of traits are similar. Since the latter can be thought of as a proxy for the attention being paid to the completion of the database, it seems logical that the parallel downward trend in the two is no mere coincidence, but rather a result of less attention being paid to the completion of the databases along both these dimensions.

However, we can go one step further and check this hunch against another data source. For example, a cursory glance at the summary page on Wikipedia of the number of anime television series released each year seems to indicate that there was no major drop in the number of television anime produced during the second half of the 2010s. Again, we have to be cautious here, since we don’t know anything about the completeness of this data source either. Nevertheless, suppose we take the volume of television anime being produced as a proxy for the level of production in general of the types of media included, and assume that the average number of new series and the mean number of characters per series have not changed significantly (yes, that’s another two so far unexamined ifs). In that case, the data in Wikipedia does not seem to corroborate the story of a declining number of works and characters we saw in our databases. This once again points towards the explanation that the downward trend in the number of characters in our data stems from a drop in the attention being paid to the completion of the databases for more recent titles.

Knowing that our data is not complete or perfect does not invalidate our results, but rather helps us better assess their validity. First, if the temporal change we were looking for exists, we should find it regardless of the level of attention the community pays to different years of publication, precisely because we are hypothesizing an increase in the portion of characters with shared traits, that is, a ratio rather than an absolute count. Therefore, even if there are fewer games and characters recorded in the databases for later years, given large enough numbers, we can still expect the effect to make its presence felt. Second, by having examined two different databases and three different datasets, we can hope to have circumvented the potential bias that could have been introduced if one community paid more attention to certain types of games or characters than to others. There is, of course, still a possibility that the data in the two databases was actually filled in by two largely overlapping communities, in which case the previous argument would not hold. However, by comparing results for two databases and three datasets, we have at least taken reasonable steps towards mitigating the possibility of this type of bias influencing our results. Third, by including a step in the analysis where we fixed the value of the average number of traits variable, we controlled for at least part of the effect that the uneven attention afforded to different characters had on our results.

Another possible problem with our data is the potential presence of factual errors, such as wrong traits being assigned to characters, incorrectly provided publication dates, and so on. Although we are in the process of assessing the level of factual accuracy found in our datasets, for now we can only say that since these databases are built and constantly cross-checked by the communities that are invested in these media, they can be assumed to be mostly correct. And even if there is a given percentage of factual errors, unless these errors are clustered according to some special pattern in the data, their random presence should not impact our overall results, since the data points we have been working with are aggregated from thousands of characters. The exception is our very last regression analysis, which relied on only fifty characters per year, which is, of course, still a high enough number statistically speaking. Again, using three datasets for our analysis also helps mitigate the effects of these types of potential problems with the accuracy of our data.
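The intuition that random errors should not create or hide a trend can be made slightly more precise with a back-of-the-envelope sketch. This is not part of our original analysis. It assumes, as a simplification, that the "has shared traits" status of each character is recorded incorrectly with the same probability e in every year, independently of its true value. The symbols p_t (true share in year t), q_t (share we expect to observe), e and n (characters per year) are introduced here purely for illustration.

```latex
% Expected observed share under a constant, random error rate e:
q_t = (1 - e)\,p_t + e\,(1 - p_t) = e + (1 - 2e)\,p_t

% A difference between two years t and s is shrunk, but keeps its sign for e < 1/2:
q_t - q_s = (1 - 2e)\,(p_t - p_s)

% Sampling noise of a yearly share estimated from n characters:
\mathrm{SE}(\hat{q}_t) = \sqrt{\frac{q_t\,(1 - q_t)}{n}} \le \frac{1}{2\sqrt{n}}

% n = 5000 gives at most about 0.007, n = 50 gives at most about 0.071
```

Under these assumptions, random errors weaken a real trend without reversing it. The yearly noise is about ten times larger for fifty characters than for five thousand, so the last regression is the one most exposed to it.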

Moving on to the next likely critique, what if our analysis was wrong? As odd as it sounds, that is another possibility we have to consider. Although we hope that the steps and results described in part three have helped convince most readers that we have progressed in our analysis with due care, the possibility of having missed one or more potential variables we should have included in our model building process is quite real, and one that is important to keep in mind, even if we consider the rejection of our first hypothesis to be well grounded in our results.

Another potential problem with our analysis is the operationalization of our hypotheses using our data. For example, we never examined in detail what the five shared traits for characters with shared traits were. This is especially important in relation to the VNDB data, where a single trait can be recorded as several traits at once, at different levels of specificity (e.g. “has a weapon”, “has a bladed weapon”, “has a sword”). In fact, examining and potentially cleaning our data in relation to this aspect could be a very important next step in our analysis. Here again we relied on the assumption that, with so much data, these types of false positives (where characters seem to share at least five traits, but only because of the above-described inflation of certain traits into groups of nested traits) would not have an overall impact on the trends we were attempting to identify and test for.
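The cleaning step mentioned above could be sketched as follows. This is a minimal illustration, not the procedure used in our analysis. The `PARENT` mapping, the function names and the two example characters are all hypothetical, and the real trait hierarchy would have to be loaded from the database itself. The idea is to count each chain of nested traits only once when comparing two characters.

```python
# Hypothetical child -> parent links between nested traits, mirroring the
# example in the text. The real hierarchy is an assumption left to the reader.
PARENT = {
    "has a sword": "has a bladed weapon",
    "has a bladed weapon": "has a weapon",
}


def ancestors(trait, parent=PARENT):
    """Return every more general trait above `trait` in the hierarchy."""
    found = set()
    while trait in parent:
        trait = parent[trait]
        found.add(trait)
    return found


def with_ancestors(traits, parent=PARENT):
    """Add all implied general traits, so nested tagging is made explicit."""
    closed = set(traits)
    for trait in traits:
        closed |= ancestors(trait, parent)
    return closed


def most_specific(traits, parent=PARENT):
    """Drop traits that are only a more general form of another trait in the set."""
    implied = set()
    for trait in traits:
        implied |= ancestors(trait, parent)
    return set(traits) - implied


def shared_count(a, b, parent=PARENT):
    """Count shared traits, counting each nested chain at most once."""
    common = with_ancestors(a, parent) & with_ancestors(b, parent)
    return len(most_specific(common, parent))


# Two made-up characters for illustration only.
char_a = {"has a weapon", "has a bladed weapon", "has a sword", "blue hair", "student"}
char_b = {"has a weapon", "has a bladed weapon", "has a sword", "blue hair", "glasses"}

raw_shared = len(char_a & char_b)             # 4: the weapon chain counts three times
clean_shared = shared_count(char_a, char_b)   # 2: "has a sword" and "blue hair"
print(raw_shared, clean_shared)
```

In this toy case the naive count comes close to the five-trait threshold, while the collapsed count shows only two independent shared traits. Whether this inflation matters at scale is exactly what such a cleaning pass would let us check.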

Finally, we get to the most interesting potential critique: what if we misinterpreted Azuma’s work, and thus failed to formulate an adequate hypothesis? Let’s start with the most obvious objection: did we just pluck the following quote out of context? Maybe it wasn’t even an important part of the book’s argument?

“As a result, many of the otaku characters created in recent years [the late nineties to two thousand] are connected to many characters across individual works, rather than emerging from a single author or a work.” (Azuma, 2010: 49)

This quote is from the section Connections between Characters across Individual Works, which in turn is from the subchapter Database consumption, which would seem to imply that it is not about shifts in production practices. Yet Azuma very clearly addresses the production side of the changes he discusses in this part of the book, for example, in relation to the question of originals versus derivative works, and the primacy of characters versus that of narratives. He also explicitly addresses the question of influence, and points out how it should be reframed from an author-centric understanding to one that focuses on the changes effected in the database, the actual source of the building blocks for characters (see again the quote in relation to Ayanami Rei in part one).

If we thus accept that we interpreted the above quote and its context in line with the arc of the argument in the book, we might still ask: what if we overgeneralized in translating this point into our hypotheses? This is indeed a valid objection, since one might infer from the same subchapter that Azuma is talking only about the proliferation of Ayanami Rei-like templates, which are obviously derivative, but not necessarily so pervasive as to affect the overall presence of characters with shared traits in the way our hypotheses suggest. To rephrase, even though Azuma is describing an important new phenomenon that is clearly visible on the level of the individual works where this trend can be observed, it may be far too small compared to the scale of character production to be identifiable by the large-scale quantitative approach we employed. We would argue that the way Azuma talks about the paradigm shift in consumption practices seems to imply a correspondingly large-scale shift in production practices, which would justify our hypotheses. Nevertheless, we will also consider this more restrictive reading in the following discussion of the impact our findings could have on rethinking Azuma’s arguments and their position in the wider discourse on anime, manga and otaku.

Theoretical implications of our analysis

Having established all the above potential limitations to the validity of our findings, let us see what avenues of thought they open up for us. Azuma is mostly concerned with practices of consumption and the way they are changing in late modernity, but he nevertheless addresses the production side as well. And, as discussed above, depending on our reading he is either implying a corresponding paradigm shift happening on the side of production, or at the very least points out the emergence of the proliferation of templated character creation.

Without this whole analysis it would not have occurred to us that there may be no paradigm shift taking place on the production side, or, to address the narrower interpretation, that the templated creation of characters is far less of a new phenomenon. But now, having seen that most of our results seem to indicate no detectable temporal effect in the way the ratio of characters with shared traits changes in our data, we can seriously contemplate the possibility that this could actually be the case.

Changing the optics through which we look at this question, it suddenly becomes more than obvious that an equally strong argument could be made for why the production side has always operated in the manner Azuma describes. We could, of course, argue that inspiration is always part influence, and character creation has always drawn on preceding works, but we can make an even stronger case for the tendency to draw on a pool of available character templates in the case of Japanese manga and anime. Going back to the fountainhead of both modern story manga and limited animation (or television anime) in Japan, Osamu Tezuka, we find that it would not be hard to argue that this mode of production, namely relying on the elements of “the database”, is how things were set up to be done from the very beginning. First of all, Tezuka is famous for casting the same set of characters in various works in different roles, which, as far as templates go, is the end point along the continuum from original to copy in the direction of re-use. Second, the limited animation techniques adopted by Tezuka for Astro Boy, which set the model for most television anime to follow, also rely heavily on the re-use of character elements.

“When it comes to animating characters, it is true that limited animation tends to move as little of the figure as possible and to reuse as much of the figure as possible. With faces, for instance, the eyebrows, eyes, or the mouth may move but nothing else; and drawings of the face seen from a couple different angles are used again and again. Likewise with the animation of bodies, the legs and arms may move, but nothing else. Limited animation tends toward the production a series of cel copies of the same body or face, and minor additions are made to them as you use them. The best way to assure maximum reuse of figures and bits of figures is to develop a cel bank, so you can piece together different scenes and different movements by assembling elements already drawn. The cel bank prepares the way for a relation to characters based on assembly—it forms the basis for the overlap between animation and garage kits and models (self-assembled characters) as well as an overlap between cel animation and the customizable characters of many videos games. It goes hand in hand with the sense of a transformation of humans and other life forms into a standing reserve or human park, as in exemplified in Nadia and Evangelion. The cel bank provides the assembly diagrams for taking apart and piecing together animated life forms. The character form becomes, in effect, a site and mode of technological enframing.” (Lamarre, 2009: 192)

Following on from Thomas Lamarre’s description of the cel bank in his book The Anime Machine (2009), it is quite easy to imagine how the materiality of the animation process would inspire a similar approach in the act of character creation, both as a matter of economic necessity (saving money by using already available elements from the cel bank) and as a model for creativity. Although we won’t be able to provide a well-researched argument that this approach to character creation in Japanese anime and manga has indeed been the dominant form throughout their history, we hope to have provided a few convincing pointers for why such a position is quite plausible.

However, the fact that Azuma might have only projected the changes he discusses in relation to consumption practices on to the production side without there being any substantial changes there in reality, in no way poses a significant challenge to his book’s overall argument, as it hinges on his discussion of consumption practices. Nevertheless, potentially amending his argument in this way helps better draw out its connections and indebtedness to Toshio Okada’s 1996 book Otaku-gaku Nyumon (Introduction to Otakuology), which is conspicuously absent from Azuma’s book’s references, even though it is highly unlikely that he did not read it, especially in light of his explicit use of the generational model of otaku introduced by Okada.

In Otaku-gaku Nyumon Okada explained how early otaku would take note of and catalog the differences in the television anime series they enjoyed, in a way attempting to reverse engineer the production process. They started to understand the connections between the end credits and the changes in the looks of the characters, the animation style, or the structure of the story. To rephrase this in Azuma’s vocabulary these early otaku were invested in understanding the underlying structure of “the database” that underpinned their favorite shows.

Thus, by positing that the production side has to a certain degree always followed this model of relying on “the database” of character elements, the corresponding argument would be that it is only now that the consumption side has caught up to it, and thereby also made these tendencies more explicit on the production side as well. This might seem like a minor shift in emphasis, but it would definitely require a stronger acknowledgement of Okada’s work in relation to highlighting the way otaku have from the start been engaged with the database aspect of the production side of anime. This is very much in line with the way Lamarre treats the connection between Okada’s and Azuma’s work, through the image of the exploded projection, which serves as a central metaphor in The Anime Machine for the way Okada and studio GAINAX approach anime:

“In fact, I would go so far as to say that the underlying structure in Azuma, which he calls database structure, is actually exploded projection.” (Lamarre, 2009: 260)

To summarize then, even though we started out on this tiny use case hoping to be able to quantitatively verify one of the points from Azuma’s Database Animals, by having found no strong evidence in our data to support our hypotheses we have stumbled upon an equally interesting position. Namely, the possibility that the production side of Japanese anime and manga has always operated in a manner congruent with Azuma’s database description. Should this indeed be the case, it would mean only a minor adjustment to the book’s overall argument. However, it would help bring to the fore Azuma’s indebtedness to Okada’s work, with the connection between the two authors’ positions already having been highlighted by researchers like Lamarre.

Finally, we will close this series in part five by offering an overview of the most important open questions remaining for further exploration and of the various lessons learned from this tiny use case.
