Why do democratic states rarely fight one another? Does the use of mechanization/armor generally make counterinsurgents less effective? When (if ever) does the threat of Western sanctions deter foreign aggression? Will economic decoupling with China significantly increase the likelihood of great power war?
Many natural and applied scientists can rely on rigorous empirical methods, such as well-controlled experiments, extensive observational studies, and high-fidelity simulations, to test claims or resolve disagreements in their fields. Unfortunately, such rigorous methods are often impracticable for testing claims or resolving disagreements in disciplines such as international relations (IR) and peace & conflict studies (PCS), which frequently involve numerous variables and poor data availability or quality.
The inability to rely heavily on these approaches to answer many questions—including the questions listed above—has plagued research efforts in IR and PCS and has even led some critics to deride the social sciences more generally as unreliable or unscientific. Researchers in IR and PCS can and often do turn to qualitative and theoretical methods, but a variety of problems undermine progress in complex debates, such as the difficulty of thoroughly finding or anticipating counterarguments. In many scientific fields over the past two centuries, researchers have mitigated the challenges of complexity by using systems (e.g., spreadsheets, regression models, charts) that handle information more effectively than human cognition and traditional paragraph text. However, many complex debates are still conducted largely via paragraphs, despite the problems with such a format.
Thus, this article advances two proposals to increase the efficiency and/or reliability of claim evaluation in disciplines such as IR and PCS:
1. Use methods for managing arguments more effectively than traditional paragraph/bullet-point approaches (with Kialo being an example of such a method).
2. Use dynamic documents that allow commentary to be incorporated after initial publication.
Complex Debates and the Failure Modes of Traditional Argumentation
One of the major advantages of rigorous empirical methods—especially experiments—is that they serve as checklists/algorithms to test many plausible alternative explanations, such as statistical coincidence, reverse causality, and confounding variables. When one cannot algorithmically test alternative explanations or incorporate related data via such methods, researchers typically must use other methods to identify and evaluate important counterarguments, such as conflicting examples/data, confounding factors, and gaps or contradictions in one’s theory.
In fields such as IR and PCS, scholars frequently rely on traditional, low-structure text formats such as paragraphs to present and address counterarguments. However, such approaches are susceptible to a variety of failure modes. For example, producers or consumers of research (including policymakers) may fail to find important counterarguments that exist, especially if they are buried among other arguments or have never actually been published (as opposed to being informally shared among peers). Relatedly, researchers may struggle to anticipate objections to new claims. Even when a researcher does find or anticipate an important counterargument, the researcher may misrepresent it (deliberately or accidentally), forget to address it among a sea of other points, lack the space to address it, omit reference to it entirely, or hastily dismiss it while redirecting focus to weaker counterarguments.
Ideally, the research community could enforce norms against analytical malpractice. However, the frequency of “honest mistakes” and the potential difficulty of distinguishing between excusable oversights and malpractice undermine the practicality of norm enforcement. Thus, it seems that these fields should work to develop and promote research methodologies that mitigate these failure modes.
Discussion From Prominent Scholars
A variety of scholars have identified problems and provided recommendations for how to improve fields such as IR and PCS. For example, John Mearsheimer and Stephen Walt argue that the rise of “simplistic hypothesis testing” via quantitative methods and the decline of emphasis on “theory” (e.g., causal mechanisms) are problematic because such quantitative methods often require more and better data than is usually available in these fields. Relatedly, Georgetown’s Andrew Bennett advocates for the use of process tracing in qualitative research to better explicate and test claims in the face of sparse data. In the same journal issue, David Lake argues that scholars should focus more on narrower, middle-range theories rather than the “paradigm wars” (e.g., realism vs. liberalism) and other sweeping debates. More broadly, Stephen van Evera has made multiple contributions to political science research methods and suggested a variety of institutional reforms (e.g., restructuring academic departments, changing tenure evaluation criteria).
All of these scholars make valuable contributions, but Mearsheimer and Walt’s analysis is particularly relevant to this article because they emphasize some of the major challenges and risks of overreliance on empirical methods. One illustration of this point is Jason Lyall and Isaiah Wilson’s (peer-reviewed) “Rage Against the Machines” and some of the criticisms that it eventually faced: although the article made some noteworthy contributions (e.g., its dataset), the data limitations and subject complexity prevented the authors from thoroughly testing counterarguments via quantitative methods, despite their efforts to control for a variety of variables. Additionally, the article’s reasoning on causal mechanisms paid little attention to readily available counterarguments.
However, Mearsheimer and Walt’s emphasis on theory does not mitigate the problems with traditional argumentation laid out in the previous section. Additionally, although process tracing seems helpful for mitigating some problems such as opaque reasoning, it is still susceptible to some of the failure modes mentioned in the previous section (e.g., researchers not being aware of counterarguments). Thus, it seems that better methods of conducting and facilitating complex debate would be valuable.
A Pair of Proposals for Conducting and Facilitating Complex Debate
In order to mitigate some of the problems discussed thus far, this article offers two proposals:
Improved argument tracking methods (“argument management”)
Across many disciplines, researchers have adopted systems (e.g., calculators, spreadsheets, charts) that perform tasks such as calculation, information storage, and communication more reliably or efficiently than the human mind or traditional paragraph text. For some complex questions in IR and PCS, such as the reasons for the democratic peace, there may be dozens or even hundreds of non-trivial arguments across multiple levels of argumentative back-and-forth. Relying purely on human cognition to track such arguments is clearly error-prone, but tracking the claims in paragraph or even bullet-point format can also be impractical or inefficient, especially when some claims apply across multiple branches of a debate, a relationship that even more-hierarchical text structures such as bullet points express poorly. In short, quantitative studies rely on structured datasets (rather than files of paragraph text) to manage quantitative data; it seems reasonable to use at least slightly more structure for argument management in research that relies so heavily on presenting and responding to arguments.
What might such argument management systems in academia look like? One approach would be to separate and present arguments/claims as nodes or clusters of nodes, ideally with labeled links that help readers understand or search for relationships between claims. Perhaps one of the best illustrations of this approach is Kialo: a user-friendly platform that requires no formal logic and allows indefinite back-and-forth branches of supporting and conflicting arguments, as well as cross-linking of claims between branches. Kialo’s simplicity does entail some weaknesses and limitations, and current systems generally cannot automate the analysis or synthesis of arguments the way statistical packages automate the analysis of data (although they could support processes that collect and combine the assessments of relevant scholars or observers, such as the Delphi method and crowdsourced forecasting).
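To make the node-and-link idea concrete, the following is a minimal, illustrative sketch of such a structure in Python. It is a hypothetical data model, not Kialo’s actual implementation or API; the class and relation names (`Claim`, `ArgumentGraph`, `"supports"`, `"opposes"`) are invented for illustration. The point is simply that once claims and their labeled relationships are stored as structured data rather than paragraph text, questions like “which objections to this claim remain?” become mechanical queries.

```python
from dataclasses import dataclass, field

# Hypothetical argument-graph sketch (not Kialo's data model). Each claim is
# a node; labeled links record whether one claim supports or opposes another,
# and a single claim may link into multiple branches of the debate.

@dataclass
class Claim:
    claim_id: str
    text: str
    # Each link is a (target claim_id, relation label) pair.
    links: list = field(default_factory=list)

class ArgumentGraph:
    def __init__(self):
        self.claims = {}

    def add_claim(self, claim_id, text):
        self.claims[claim_id] = Claim(claim_id, text)

    def link(self, source_id, target_id, relation):
        """Record that the source claim supports or opposes the target claim."""
        self.claims[source_id].links.append((target_id, relation))

    def responses_to(self, target_id, relation=None):
        """Return all claims linked to target_id, optionally filtered by label,
        so a reader can check which responses to a claim exist."""
        return [c for c in self.claims.values()
                for (t, r) in c.links
                if t == target_id and (relation is None or r == relation)]

# Example: a tiny fragment of the mechanization/counterinsurgency debate.
g = ArgumentGraph()
g.add_claim("c1", "Mechanization makes counterinsurgents less effective.")
g.add_claim("c2", "Mechanized units gather less local intelligence.")
g.add_claim("c3", "Some mechanized forces pair armor with dismounted patrols.")
g.link("c2", "c1", "supports")
g.link("c3", "c2", "opposes")

# Query: which objections to c2 have been registered?
objections = g.responses_to("c2", relation="opposes")
```

Even this toy version illustrates the proposal’s core benefit: a reviewer or policymaker can query the graph for unaddressed objections instead of hunting for them through prose.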
Despite these limitations, independent or collaborative use of such systems for contentious topics could improve the accuracy and speed with which researchers or policymakers determine whether important counterarguments have been addressed, identify and clarify reasons for disagreement, correct misconceptions or mischaracterizations, incorporate old debates into new contexts, and more. In turn, these effects could facilitate the promotion of better research standards by reducing the room for excusable oversights. For example, a researcher writing about the efficacy of sanctions could not as easily plead ignorance of an important counterexample or counterargument if it were available in a widely known and well-respected hub of arguments on the topic.
Dynamic documents to solicit/convey counterarguments after initial publication
Most scientific disciplines have methods for soliciting counterarguments (e.g., formal peer review, direct reply letters), but these have flaws: peer review is far from ideal, reply letters may be short or shallow, it may take more than a year for new articles to address an older article (whether due to slow writers or slow publication review processes), and readers may never learn that these later criticisms exist. Although these problems are not always severe, more can and should be done to make it easier for peers to contribute arguments and for readers to find them.
One major reform would be to de-emphasize the one-off, static PDF publication model and instead publish an additional, dynamic document that accepts post-publication commentary, such as directly attached responses from cited authors. Ideally, such counterarguments could also be integrated into a shared argument management system (per the previous proposal).
As with current journal practices, developing appropriate rules (e.g., who can contribute) will require thoughtful consideration and will still invite accusations of gatekeeping. Additionally, without the previous suggestion, this proposal would likely still produce tangled and confusing debates with many unaddressed counterarguments. Nonetheless, it could reduce accidental oversights and the time required to find counterarguments, which could in turn facilitate the promotion of better research standards.
Better scholarship in fields such as IR and PCS could support better policy analysis for high-stakes decisions, so improvements in these fields could be quite valuable. One of the major challenges plaguing these fields is the frequent inability to rely on rigorous empirical or quantitative methods to resolve disagreements. Although quantitative methods do have some place in these fields, scholars such as Mearsheimer and Walt are correct to highlight their limitations. However, many theoretical and qualitative approaches face their own challenges, stemming from the difficulty of evaluating complex debates with conflicting arguments and evidence, especially for an observer who may not have the same familiarity or time as the participants. In light of these challenges, systems or methods that make it easier to contribute, find, understand, recall, and discuss counterarguments seem likely to be very helpful, if only to mitigate some of the failure modes discussed in this article.
There are many other potential targets for reform to improve research quality beyond those mentioned in this article, such as changing the incentive systems surrounding tenure and publication. Additionally, this article is a patient of its own diagnosis: it lacks the space to preemptively address most potential counterarguments, such as the weaknesses of (formal) argument mapping, the potential importance and impracticality of reforming academic incentive systems to align with these reforms, and the heuristic that if these reforms were good ideas they would probably have more vocal advocates by now. Moreover, researchers should subject any such proposal to thoughtful testing, evaluation, and refinement before pushing for a specific methodology. However, this article is simply meant to be an opening salvo aimed at problems that currently seem widely underappreciated in the literature, most notably the deficiencies of traditional argumentation via paragraph text. Ultimately, these proposals may prove ineffective or impractical, but the problems highlighted here do suggest that scholars should explore and discuss methods for improving argumentation.