Following the success of our project launch workshop in July 2019, work on processing community databases started in earnest (you can read about the technical details of the process in relation to ontology creation and data transformation). By November 2019, we were ready to start examining the data and our infrastructure through the lens of exploratory research.
We decided to adopt the Tiny Use Case (TUC) workflow methodology: a series of short-term research projects substantial enough to generate meaningful and interesting results in their own right, yet compact enough to provide an ongoing stream of feedback on issues with the database, the project infrastructure, and researcher needs. Since each TUC lasts only 3-4 months and has its own focus and somewhat different requirements, the workflow gives us an excellent tool for assessing our progress and uncovering new issues.
The first TUC ran from November 2019 to February 2020, the second spanned the spring of 2020, and the third started in June and is scheduled to run until September. With two TUCs completed and a third well underway, we already have a number of takeaways from these research projects. Some are minor but can still lead to new features, like the implementation of the “simple” view on the front-end prototype. Others are so fundamental that they cannot be resolved within the span of a single TUC and instead become mainstay fixtures of the feedback from each subsequent TUC. One example is our ongoing realization of just how severe the big data issues in the research projects are, and how much work handling them requires.
The wrap-up and evaluation phase of each TUC also offers a chance to momentarily step away from whatever we are currently pushing hardest, which may or may not be the right way of approaching certain problems, and look at the big picture again for some much-needed reflection. One idea that has come out of these moments of stepping back is the need to engage with the toolkit of network analysis. The most common data analysis tools require tabular input, yet we are investing a lot of time and effort into moving data out of their various original tabular formats and into a unified linked data format. One of the defining characteristics of linked data is that we end up with a graph of all data elements. What if we could harness this aspect of the data structure, either to gain insights specific to network analysis, or to perform certain types of analysis more efficiently? We are currently exploring the opportunities opened up by this approach and will report on our findings here on the blog.
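To make the idea concrete, here is a minimal sketch (with made-up triples, not our actual data or pipeline) of how linked data lends itself to network analysis: each subject-predicate-object triple can be read directly as a labelled edge of a graph, which immediately unlocks measures such as degree centrality that have no natural tabular counterpart.

```python
# Illustrative sketch with hypothetical triples, not the project's real data.
from collections import Counter

# Hypothetical subject-predicate-object triples derived from tabular rows
triples = [
    ("person:alice", "memberOf", "community:a"),
    ("person:bob", "memberOf", "community:a"),
    ("person:bob", "knows", "person:alice"),
    ("person:bob", "knows", "person:carol"),
    ("person:carol", "memberOf", "community:b"),
]

# Subjects and objects become graph nodes; each triple is an edge,
# so counting triple endpoints gives each node's degree
degree = Counter()
for subj, _pred, obj in triples:
    degree[subj] += 1
    degree[obj] += 1

# Degree centrality: a node's edge count divided by the maximum possible
nodes = {n for s, _p, o in triples for n in (s, o)}
centrality = {n: degree[n] / (len(nodes) - 1) for n in nodes}

most_connected = max(centrality, key=centrality.get)
print(most_connected)  # person:bob, the most connected entity in this toy graph
```

In practice one would load the triples into a dedicated graph library rather than counting by hand, but the point stands: the graph structure comes for free with the linked data format, with no extra transformation step.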
Finally, thanks to the iterative nature of the TUC workflow, we can see how the growing library of data extraction and analysis pipeline elements, along with our expanding list of identified best practices, makes each subsequent TUC easier to get off the ground and running. Communication between the library and computer science researchers on one side and the humanities and social science researchers on the other is also becoming smoother with each new TUC, as both groups learn more about each other's concepts and tools and an increasingly shared vocabulary emerges from our work together.
We look forward to sharing the most interesting questions and findings from each of our TUCs here on the blog, so stay tuned!