ARROS: A Universal Framework for Scrutinizing Policy Pros and Cons


Policymakers and researchers frequently grapple with complex questions of the form “will taking X action produce Q effect—and how good/bad is Q?” For example, lobbyists or researchers may claim that semiconductor export controls will accelerate China’s indigenization of semiconductor manufacturing, accepting Finland into NATO will increase the likelihood of nuclear war with Russia, or allowing South Korea to acquire nuclear weapons will provide leverage for North Korean denuclearization. Unfortunately, policymakers are often time-constrained, subject to biases, or not deeply knowledgeable in the topic areas, which can allow exaggerated claims about the positive or negative effects (advantages or disadvantages) of a policy to evade scrutiny.

Although it is sometimes easy in hindsight to recognize the flawed assumptions in well-studied policy debacles (e.g., Prohibition), it is often less easy for overworked legislative staffers to identify such mistakes early in the process of evaluating new policy proposals. Thus, this article presents a framework—herein titled ARROS (“arrows”)—to enhance the speed and rigor of pro-con analyses. Specifically, ARROS categorizes all of the possible ways in which a given advantage or disadvantage can be flawed: the plan cannot or will not be implemented as assumed (Action details/implementation); the claimed outcome does not occur even with implementation of the plan (Results); the claimed outcome occurs even without implementation of the plan (Reference Option); the claimed outcome is less good/bad than assumed (Significance). This framework is not constrained to specific policy domains or even government policy: its categorization is designed to apply in any decision context, including business, academia, and personal life. Ultimately, using the framework does not guarantee that one will identify all of the flawed assumptions in a set of advantages or disadvantages, but ARROS can help users more quickly and thoroughly probe assumptions—even when they lack familiarity with a specific topic.

Understanding and Using ARROS

There are many steps in the policymaking process, including the creation, analysis, and implementation of policy proposals. Unlike other frameworks, which attempt to cover the entire process, ARROS focuses only on the evaluation (not the identification or generation) of advantages and disadvantages. One core observation of the framework is that the chains of reasoning for all advantages or disadvantages rely on the following categories of links (claims):

  • Action details/implementation: “The proposal would be implemented as some set of actions X (e.g., pass a specific bill, impose a specific sanctions package).”
  • Results: “If we do X, outcome Q happens.”
  • Reference Option: “The alternative to X would be Y (e.g., ‘do nothing’), in which case Q does not happen.”
  • Significance: “Q happening is good/bad.”

To illustrate the four categories with an example, consider a claimed advantage such as “Carrying out X proposed drone strike would reduce an insurgency’s activity by eliminating their leader.” ARROS could prompt the following questions:

  • Action details/implementation: What does the proposal actually entail doing upon (attempted) implementation? Is it infeasible or excessively vague? Are analysts overlooking exceptions or loopholes? For example, are the specified weapon platforms actually available for the given area and time? Are there conditions or exceptions in the proposal that would likely trigger a mission abort?
  • Results: To what extent does the effect occur in the world with a given implementation of the proposal, in descriptive terms? For example, what is the probability that the leader survives the strike (or is not even in the designated area)? If the goal is to disrupt the insurgent network through decapitation, what is the likelihood that someone just replaces them (and how competent would they be)?
  • Reference Option: What happens in the hypothetical world without the plan (with regards to the claimed effect), in descriptive terms? As part of this, what would the alternative course of action be (especially if it is not just “do nothing”)? For example, what is the likelihood that the leader is eliminated by other means (e.g., rival insurgent groups, internal challengers, our own troops)?
  • Significance: How much better or worse is the world with the implemented action than the reference option due to the claimed effect, in normative or value terms? What moral framework or decision criteria are you applying? For example, how valuable would it be if the leader were eliminated or the insurgent network were disrupted? How do these outcomes relate to end goals? In some situations a specific estimate may not be necessary, but when considering outcomes such as “10% chance of eliminating insurgent violence” vs. “50% chance of civilian casualties,” analysts often need some weighing mechanism.
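One simple weighing mechanism for comparing such outcomes is an expected-value rule. The sketch below illustrates it for the two outcomes mentioned above; all of the probabilities and utility weights are purely hypothetical assumptions, and an expected-value rule is only one of many possible decision criteria.

```python
# Hypothetical expected-value weighing of the two outcomes mentioned above.
# Every number here is an illustrative assumption, not a real estimate.

p_eliminate_violence = 0.10   # assumed chance the strike ends insurgent violence
p_civilian_casualties = 0.50  # assumed chance the strike causes civilian casualties

# Assumed utilities on an arbitrary common scale. In practice, the hard part
# is justifying these weights, which depend on one's moral framework.
value_of_ending_violence = 100
cost_of_civilian_casualties = -60

# Expected value = sum over outcomes of (probability * utility).
expected_value = (p_eliminate_violence * value_of_ending_violence
                  + p_civilian_casualties * cost_of_civilian_casualties)
print(expected_value)  # 10.0 + (-30.0) = -20.0
```

Under these particular (invented) weights, the strike nets out negative; the point is not the verdict but that making the weights explicit forces the Significance assumptions into the open.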

To provide another example for illustration: suppose it is September of 2022 and someone is arguing that the US government’s plan to impose export controls on semiconductors to China has a major disadvantage: doing so would prompt China to indigenize semiconductor manufacturing and thus would destroy our leverage for future export controls. The alternative (Reference Option), the critic argues, should be to delay implementing such export controls. Consulting ARROS could generate the following questions/responses:

  • Action details/implementation: Are the export controls actually as strict as described? Does the claim take into consideration exceptions or mitigation measures in the policy? In reality, there are at least some temporary “release valves” regarding chip design and access to cloud computing (although this could change over time).
  • Results: How likely is it that China actually achieves indigenization even with the implementation of the plan? How long would it take? Some analysts are quite skeptical that China could achieve this in the near future given their failed attempts thus far (although this would not be the first time an adversary exceeded expectations in technology acquisition).
  • Reference Option: How likely is it that China indigenizes its semiconductor manufacturing capabilities even without the new level of external pressure—especially if Beijing believes the West will eventually crack down? China has already been attempting to indigenize for years; how long would indigenization take in this scenario?
  • Significance: How valuable is it to have X additional years of leverage for future export controls? Is future leverage actually significantly more valuable than current leverage (given that future semiconductors will be more powerful), as Paul Scharre claims, or will future controls be too late to prevent China from accessing sufficiently powerful chips to pull ahead in AI research? In reality, analysts might need to look beyond the goals of “Western leverage” or “slower Chinese AI progress” and focus on specific downstream effects such as human rights violations, stronger Chinese military capabilities, or dangerous races for powerful AI systems.

Ultimately, there are many possible ways to use the framework in a broader workflow, but the two core steps are 1) identify specific claimed advantages/disadvantages for a given proposal and reference option, and 2) use the categories and their associated questions to identify questionable or outright flawed assumptions in specific claims. These findings can then guide subsequent research, argumentation, etc.

The Adaptability and Thoroughness of ARROS

There are multiple other frameworks for analyzing policies, and some of these frameworks (e.g., the “Five E’s,” the “Eightfold Path”) touch on important steps in the policy process. However, many of them rely on categories that are redundant or not collectively exhaustive, prescribe overly rigid/linear procedures, and/or are constrained to specific subjects. In contrast, 1) ARROS is not constrained to a specific subject area and can even be applied to non-governmental decisions; 2) a user can define the categories in minimally overlapping ways; and 3) ARROS’ categorization is exhaustive: every rebuttal to an advantage/disadvantage addresses one or more of the four categories of assumptions, either directly or indirectly.

Potential Benefits of ARROS

There are a few ways in which using ARROS can help someone avoid oversights or more quickly identify flaws. In general, breaking down complicated questions into smaller pieces can improve the feasibility or accuracy of analysis, especially when this can be done without losing pieces of the original whole or creating redundant categories.

One specific way this decomposition helps is by creating a checklist for analysis/research, which can more quickly and reliably prompt someone to check an argument’s key assumptions. Without such a checklist, people may fall back on an ad hoc approach that is slower or overlooks entire categories, especially if the person is time-constrained, unfamiliar with the topic, and/or biased. For example, it is easy to become too focused on how implementable and effective some novel technological solution is without considering the sufficiency of current alternatives, or one might hyperfocus on metrics even after they lose their value. Still, one should be wary of over-routinizing the process and losing sight of the forest for the trees.

Decomposition can also be helpful via pattern recognition and comparison: Using consistent categories can help someone organize their reasoning and recall relevant experiences, which can make it easier to generate—or scrutinize—rebuttals in new contexts.

Additionally, sometimes an analyst can loosely treat the ARROS categories as multipliable factors to perform heuristic-level calculations. One of the simplest and most reliable shortcuts is that if any category is “zero” (e.g., 100% infeasible, 100% ineffective, 100% redundant) then the entire effect is automatically nullified. Setting aside cases with zeros, consider the claim “X intelligence collection plan (e.g., deploying a surveillance drone) will give us Y information”: suppose that an advocate of X initially claims it is 80% likely to provide Y and implies by omission that without X they will not get Y. If a skeptical analyst instead estimates that 1) doing X is only 50% likely to provide Y, 2) they are 20% likely to acquire Y with the alternative to X, and 3) Y is only 67% as valuable as originally claimed, then they could loosely estimate the claimed benefit to be 25% of the original claim. However, treating the steps as neatly multipliable factors is merely a simplifying heuristic and should only be used with caution (or in desperation) given that the factors are often multi-dimensional and have non-linear relationships between them. For example, halving a proposed program’s budget might render it completely ineffective rather than just 50% effective.
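The arithmetic in the surveillance-drone example can be sketched as follows, treating the categories as independent, linearly combinable factors; as noted, this is only a rough heuristic, and the function name and structure here are illustrative rather than part of the framework itself.

```python
# Rough sketch of the "ARROS categories as multipliable factors" heuristic.
# The claimed benefit of an action is approximated as:
#   (P(Q | action) - P(Q | reference option)) * value_of_Q
# This assumes the factors are one-dimensional and combine linearly,
# which the text warns is often not the case.

def heuristic_benefit(p_with_action, p_with_reference, value_of_q):
    """Expected incremental value of the action over the reference option."""
    return (p_with_action - p_with_reference) * value_of_q

# Advocate's implicit claim: 80% likely to provide Y with X,
# 0% without X (implied by omission), full claimed value of Y.
claimed = heuristic_benefit(0.80, 0.00, 1.0)    # 0.8

# Skeptic's estimates: 50% with X, 20% without X, Y only 67% as valuable.
skeptical = heuristic_benefit(0.50, 0.20, 0.67)  # about 0.2

print(skeptical / claimed)  # roughly 0.25 of the original claim
```

Note that the “zero nullifies everything” shortcut falls out of the same structure: if the action is 100% infeasible or the outcome is 100% certain under the reference option, the incremental benefit collapses to zero regardless of the other factors.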

Ultimately, there are some drawbacks to routinization and decomposition, but it is often helpful to recognize and label categories when they can be neatly delineated.

Limitations

Despite the potential benefits, ARROS has some important limitations. Most significantly, ARROS is not designed to directly generate ideas for alternative courses of action or advantages/disadvantages. Additionally, using ARROS is not always the most efficient approach: for some decisions, intuition can identify important effects or cruxes much faster than stepping through the framework. It also is not always efficient to explicitly identify which assumptions or even which claimed effects a response is targeting (e.g., “all of your sources are unreliable”). Lastly, as with most other frameworks, ARROS requires some practice before it becomes efficient or intuitive, and it still cannot guarantee identification of all flaws. For example, if a user erroneously evaluates each claimed advantage/disadvantage in strict isolation, they may overlook double counting (a flaw in Significance).

Ultimately, ARROS is not intended to be a policymaker’s panacea; it is intended to provide a more rigorous rubric for a narrow but important and error-prone step in the policymaking process: identifying flaws in advantages and disadvantages.

Conclusion

When predicting the effects of decisions, people occasionally rely on questionable assumptions, such as regarding the feasibility of a proposal or the value of achieving an outcome. Unfortunately, policymakers do not always have the topic familiarity, impartiality/motivation, or time to identify and scrutinize these assumptions. The urgency or secrecy of security policy can exacerbate these problems by limiting the time or external expertise to evaluate options—even when the stakes are high. ARROS is only designed for argument evaluation and it cannot ensure that someone will identify all of an argument’s flaws, but it does provide a way to exhaustively categorize flawed assumptions. This can help users break down complex, unfamiliar arguments into smaller, familiar categories to aid with identifying weaknesses or applying relevant lessons from past experiences. It can also help users overcome confirmation biases when probing their own policy proposals for weaknesses. Ultimately, given the importance and challenges of national security policymaking, experimentation with ARROS seems worthwhile.
