Harold Leventhal Talk: Pitfalls of Empirical Studies that Attempt to Understand the Factors Affecting Appellate Decisionmaking
By The Honorable Harry T. Edwards
I recently read a paper in which a prominent legal scholar argued that empirical evidence conclusively demonstrates that judicial review of agency action in the federal courts of appeals is highly politicized. My view of the work of appellate judges is quite different. My bottom line is simple: The decisions of federal appellate judges are not highly politicized and there is no body of empirical evidence to support a claim to the contrary.
When federal appellate judges decide a case, we focus on the relevant legal materials, including the record from the trial court or agency; the challenged judgment, decision, or verdict; the precise issues raised and preserved by the litigants; the parties' arguments raised in their written briefs and oral arguments; the applicable constitutional, treaty, statutory, regulatory, or contractual provisions; any relevant case precedent; and the applicable standards of review. And, because we typically sit in panels of three, we do not act alone in considering the outcome counseled by these materials; rather, we deliberate as a group--often extensively--with the goal of reaching a consensus as to the appropriate result.
When the relevant legal materials are uncomplicated, the issues are well defined and involve no new questions, and the precedent is fairly clear, deliberations are straightforward and judgments are easily reached. I have estimated that at least one-half of the cases decided by the courts of appeals fit this characterization and are thus "easy."
I have estimated that only 5 percent to 15 percent of the disputes that come before the court in any given term involve situations in which a fair application of the law to the facts produces no clear answer. I view these cases as "very hard." That leaves roughly 35 percent to 45 percent of the cases in which each party is able to advance a colorable legal argument; but, upon careful review, one side's claim is seen to be stronger. I classify these cases as "hard."
There is no doubt that, in "hard" and "very hard" cases, judges must exercise some discretion in order to reach an outcome that best fits with existing law. Given this reality, some commentators hypothesize that judges' decisionmaking is significantly determined by their personal political or ideological predilections. Legal empiricists have tried to test this hypothesis through the application of statistical analysis to case outcomes. Rather than considering the reasoning contained in opinions, these scholars treat case outcomes as raw data and attempt to statistically correlate those outcomes to a judge's presumed personal ideological and political views. These views are often identified by reference to the party of the President who appointed the judge.
I find these studies seriously flawed due to conceptual and methodological problems, and I have concluded that they tell us very little about how appellate judges decide cases. My views on this subject are detailed in an article that I co-authored with Michael Livermore, entitled Pitfalls of Empirical Studies That Attempt to Understand the Factors Affecting Appellate Decisionmaking, 58 DUKE L.J. 1895 (2009). The article was just published by the Duke Law Journal and it includes an extensive review of three large-scale empirical studies authored by Professor Frank Cross, Professor Cass Sunstein and colleagues, and Professor William Landes and Judge Richard Posner. What I offer today is necessarily a condensed overview of our critique of these and other empirical legal studies.
Before turning to that overview, I would like to share with you some information on how my colleagues and I decide cases and then offer you some data on D.C. Circuit decisions in administrative law cases during the Bush Administration.
My assessment of the work of the D.C. Circuit is not based on a rigorous quantitative study. However, I can offer you an "insider's view" of the court, and I will support a number of my assertions with data. My qualitative assessment will at least give you a meaningful starting point, which is important because no legal empiricist has yet been able to determine how to quantitatively capture the effect of legal materials or confidential deliberations on appellate decisionmaking.
Decisions issued by the D.C. Circuit are arrived at as a result of judicial deliberation. In each case, judges work to articulate a shared vision of the appropriate outcome based on the relevant legal materials. In most instances, a judge who initially disagrees with a proposed case outcome either eventually agrees that the majority position is superior or convinces the other two judges to change their positions. In other words, because of fruitful judicial deliberations, there is rarely a dissenting opinion.
This is not to suggest that judicial decisionmaking on my court has always strictly hewed to the process I describe. In my years on the court, the D.C. Circuit has evolved from a place that was divided, divisive, and arguably susceptible to the influence of personal politics and ideology, to one stamped with the best elements of collegiality.
Even now, I have no doubt that there are occasions when, despite our best efforts, my colleagues and I are unwittingly influenced by extralegal factors. Nonetheless, in the vast majority of cases, we deliberate seriously and focus hard on reaching a consensus as to what the law counsels without recourse to personal political or ideological views.
My Pitfalls article was written in conjunction with a symposium sponsored by the Duke Law Journal. In preparing the article, I examined the judgments of the D.C. Circuit reviewing administrative agency actions between 2000 and 2008. I selected this eight-year period because a Republican President was in the White House, Congress was controlled by Republicans for a majority of the eight years, and a clear majority of the judges on the D.C. Circuit was appointed by Republican Presidents. I selected administrative agency actions, both because the Duke symposium was focused on administrative law and also because it is well understood that this large category of cases includes some of the most difficult and controversial appeals heard by my court. If judges' personal political and ideological predilections played a significant role in their decisions during this period, a prime place to look for their effect would be the court's decisions to support or overturn agency actions. In these same cases, one might also expect to find sharp divisions among judges along political lines. That is not what our case dispositions indicate.
Rather, our case dispositions demonstrate: (1) most of the decisions involving administrative agency actions were issued without dissent; (2) judges routinely crossed over presumed political lines in the few cases in which dissents were issued; (3) "mixed panels" of the court (meaning panels consisting of judges appointed by both Republican and Democratic Presidents) routinely issued unanimous decisions – based on the record and grounded in law and precedent – resolving complex, difficult, and important administrative law cases; and (4) the full court rarely reheard a case en banc.
• During the applicable 2000-2008 period, 10 of the 14 judges who served on the D.C. Circuit were appointed by Republican Presidents.
• During this period, there were 913 decisions involving administrative agency actions, and only 41 included a dissent.
• There were "mixed panel splits" in 22 of the 41 administrative law cases in which a dissent was filed. In other words, in more than half of the cases in which a dissent was filed, judges appointed by Presidents of the same party disagreed.
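As a quick arithmetic check on the dissent figures quoted above, the following minimal sketch computes the share of decisions that drew a dissent. The variable names are illustrative only; the two counts are the ones given in these remarks.

```python
# Counts quoted in the remarks: D.C. Circuit administrative law decisions, 2000-2008.
total_decisions = 913
decisions_with_dissent = 41

# Share of decisions that included a dissent, and the complementary unanimous share.
dissent_rate = decisions_with_dissent / total_decisions
unanimous_rate = 1 - dissent_rate

print(f"Decisions with a dissent: {dissent_rate:.1%}")   # 4.5%
print(f"Decisions without dissent: {unanimous_rate:.1%}")  # 95.5%
```

On these counts, roughly 19 of every 20 administrative law decisions in the period were issued without dissent.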
Some legal empiricists argue that the extraordinarily high number of unanimous decisions reached by appellate judges proves little, since, in their view, judges may join a decision even though they disagree with the reasoning supporting it. In other words, some empiricists claim that, due to "ideological dampening," a judge sitting on a "mixed panel" will be less likely to vote according to his or her own ideological preferences. The problem with this claim regarding alleged "panel effects" is that it is based on rank speculation--no empiricist can prove it. Indeed, my colleague, Dean Richard Revesz, who has written extensively on empirical legal studies, has readily acknowledged that "panel effects" can be explained by a "deliberation hypothesis," pursuant to which judges modify their views because they are persuaded by their colleagues on the appropriate application of the law.
Some empiricists go further in their attempts to diminish the importance of unanimous decisions. These empiricists posit that judges sometimes suppress their personal ideological and political preferences in the short term to achieve long-term objectives. In other words, it is claimed that a judge appointed by a Democratic President might vote with her Republican-appointed panel members and against her own presumed political preferences in order to maximize the chance that those Republican appointees will vote with her, and against their presumed political preferences, in a future case. The proposition is really quite silly and it finds no support in any study.
The simple point here is that the low rate of dissents in appellate decisions shows that judges appointed by both Democratic and Republican Presidents can and almost always do agree on what the law requires, regardless of their personal political and ideological leanings. Studies indicate that about 90 percent of all published appellate decisions are unanimous. However, unpublished decisions constitute over 80 percent of the cases decided by federal appellate courts, and judges rarely issue dissents in unpublished decisions. If both published and unpublished decisions were counted, the rate of dissent in federal appellate courts would border on negligible.
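The combined rate described above is a weighted average of the published and unpublished dissent rates. The sketch below works through that arithmetic under stated assumptions: published decisions are taken to be 20 percent of the total (an upper bound, since unpublished decisions are said to exceed 80 percent), published decisions draw a dissent about 10 percent of the time, and the unpublished dissent rate is a purely hypothetical placeholder, because the remarks say only that such dissents are rare. The function name is illustrative.

```python
def overall_dissent_rate(published_share, published_dissent_rate,
                         unpublished_dissent_rate):
    """Weighted average dissent rate across published and unpublished decisions."""
    unpublished_share = 1.0 - published_share
    return (published_share * published_dissent_rate
            + unpublished_share * unpublished_dissent_rate)

# Assumption: 20% of decisions are published and 10% of those draw a dissent.
# The unpublished dissent rates (0% and 1%) are hypothetical placeholders.
no_unpublished_dissents = overall_dissent_rate(0.20, 0.10, 0.00)
rare_unpublished_dissents = overall_dissent_rate(0.20, 0.10, 0.01)

print(f"No unpublished dissents:   {no_unpublished_dissents:.1%}")    # 2.0%
print(f"1% unpublished dissents:   {rare_unpublished_dissents:.1%}")  # 2.8%
```

Under either placeholder, the overall dissent rate falls to a few percent once unpublished decisions are counted.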
In pressing my claim that empirical studies tell us very little about the factors affecting appellate decisionmaking, I do not want to rest solely (or even principally) on my qualitative assessment of the D.C. Circuit. Rather, I would like to return to the conceptual and methodological problems to which I alluded at the outset of my remarks.
One glaring problem with the empirical studies to which I have referred is that they are overly informed by the attitudinal model of judicial behavior. The attitudinal model is premised on the assumption that judicial decisions are determined principally by the personal political and ideological preferences or attitudes of judges, and that judges' written opinions are merely "smokescreens" designed to hide this reality. The attitudinal model thus, quite purposefully, fails to take account of the effect of law, precedent, and deliberations on judicial decisionmaking. Consequently, at least in its starkest forms, the attitudinal model speaks about judicial opinions solely in terms of case outcomes. The attitudinal model also assumes that individual judges' personal views are immutable and can be accurately characterized pursuant to a simplistic liberal/conservative dichotomy. These are patently unrealistic assumptions. There is no reason to think that judges' preferences are particularly stable, and few of us can be easily characterized as liberal or conservative on all issues.
Even those legal empiricists who recognize the problems inherent in the attitudinal model face unsolved methodological difficulties that render suspect all but the most modest empirical conclusions about appellate decisionmaking. Empiricists normally draw on a defined data set of cases and look at "dependent" and "independent" variables. Dependent variables concern the object of study, while independent variables are those that are hypothesized to affect the dependent variable. In studies of judicial decisionmaking, the dependent variables relate to judicial decisions, typically case "outcomes." And in defining case outcomes, a number of empiricists rely on the "U.S. Courts of Appeals Database."
Although this database was created to facilitate empirical analysis, it is seriously flawed in that it does not include appellate cases resolved by unpublished decisions. As I have already noted, over 80 percent of all federal appellate decisions are unpublished. Unpublished decisions typically are unanimous and involve the most straightforward applications of the law. Importantly, then, unpublished decisions offer valuable information regarding appellate judges' adherence to precedent. For legal empiricists whose stated concern is whether judges follow the law or personal preferences, every judgment must count if the basis of appellate decisionmaking is to be accurately characterized.
The second major methodological obstacle faced by empiricists involves coding difficulties that can distort the dependent case outcome variable. There are many possible dispositions of appellate cases. However, empiricists routinely collapse these dispositions into simple binary outcomes such as "appellant prevails" or "appellee prevails." Some studies code case outcomes according to topical or political binary criteria, such as pro or anti-environment, pro or anti-criminal defendant, pro or anti-civil rights, and so on. Perhaps the most common metric used in empirical studies is a simple "liberal/conservative" binary.
These practices necessarily simplify and distort a court's holding, reducing to a simple, often uninformative label what may be a complex and nuanced decision. Thus, for example, the court's disposition in an administrative law case might include a judgment on standing that appears to be "conservative," a judgment on "arbitrary and capricious" review that appears to be "liberal," and a judgment under Chevron that is neither. All of these nuances are lost in a binary outcome characterization.
Another well-recognized problem faced by empiricists studying appellate decisionmaking is that only the outcomes of decisions are coded, not the content. A disposition on procedural grounds against an environmental group is treated exactly the same as a decision on the merits, although the consequences can be quite different. Opinions that reach broad conclusions of law and include significant dicta are treated the same as opinions that decide cases narrowly and address only the arguments supporting the decision. Whether an opinion hews closely to precedent or decides a case on first principles is usually ignored. Coding only for outcome eliminates large amounts of data and treats as identical opinions that are, in many ways, quite different.
Difficulties in the coding of independent variables also cause problems for empiricists. The things that legal empiricists are most interested in studying – the personal political and ideological preferences of judges – are not easily quantified. Scholars consequently seek to describe these preferences through reference to proxies. The proxy typically employed is the party of the appointing President or "PAP." Relying on this proxy, researchers assume that judges appointed by Republican Presidents are "conservative," and judges appointed by Democratic Presidents are "liberal." The PAP proxy is a highly unsatisfying measure of a judge's personal political and ideological preferences.
Assuming we could agree on the meanings of "conservative" and "liberal," it is not the case that all Republican Presidents are conservatives and all Democratic Presidents are liberal. Moreover, Presidents are not solely motivated to appoint judges who reflect their politics. Commentators have noted alternative motivations for presidential appointments, including personal relationships and party building.
The link between the party of the appointing President and judicial "ideology" is even more attenuated. As Professor Gregory Sisk, a recognized empiricist, has explained:
The International Encyclopedia of the Social Sciences defines ideology as "one variant form of those comprehensive patterns of cognitive and moral beliefs about man, society, and the universe in relation to man and society, which flourish in human societies." Nothing nearly so sophisticated is in operation in most empirical research conducted on the courts, whether undertaken by political scientists or law professors.
The critically important legal influences on appellate decisionmaking, including case records, the applicable law, precedent, and judicial deliberations, pose even more difficult coding challenges for empiricists and consequently have been largely ignored. To code precedent, for example, formalized and repeatable procedures would have to be developed for identifying and numerically describing the legal issues present in a case, the scope of authoritative and persuasive law, and the effect of that law on the outcome of the case.
Legal empiricists have yet to figure out how to reliably code "precedent" as an independent variable. The coding of deliberations presents an even more insurmountable task, for judges' deliberations are confidential. What is said as judges deliberate over how best to resolve the issues before them is critical to the decisionmaking process but is not public and therefore cannot be coded.
Persons who read about empirical studies often do not understand the coding problems underlying the use of proxies for political beliefs and ideology. Nor may they appreciate the consequences of empiricists' failure to account for unpublished decisions, case records, the applicable law, the effect of precedent, and the impact of judicial deliberations.
The problem is compounded when empirical scholars fail to fully reveal the limitations of their studies. Some empiricists bandy about the term "significant correlation" as if to suggest that their regression analyses conclusively demonstrate an irrefutable and strong connection between judges' personal political and ideological views and their decisions. But in his thoughtful and measured book, Decision Making in the U.S. Courts of Appeals, Professor Frank Cross explains that "[a] reader [of empirical studies] should not place undue importance on a finding of statistical significance, because such a finding shows a correlation between variables but by itself does not prove the substantive significance of that correlation. One must also consider the magnitude of the association."
In other words, even when a regression study indicates a strong relationship, the meaning of that relationship may still be unclear.
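The distinction between statistical significance and magnitude can be illustrated with a small simulation on entirely synthetic data. This is a sketch under stated assumptions, not a reconstruction of any study discussed here: the binary `proxy` variable, the continuous `outcome` score, the effect size of 0.1, and the sample size of 20,000 are all hypothetical choices made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20_000               # hypothetical sample size
TRUE_EFFECT = 0.1        # hypothetical small effect of the proxy on the outcome

# Synthetic data: a 0/1 proxy and an outcome dominated by unrelated variation.
proxy = [rng.randint(0, 1) for _ in range(n)]
outcome = [TRUE_EFFECT * p + rng.gauss(0, 1) for p in proxy]

r = pearson_r(proxy, outcome)
t_stat = r * math.sqrt((n - 2) / (1 - r * r))  # t-statistic for testing r = 0
r_squared = r * r                              # share of variance explained

print(f"t-statistic: {t_stat:.2f}")
print(f"share of variance explained: {r_squared:.4f}")
```

With a sample this large, the t-statistic should comfortably exceed the conventional 1.96 threshold even though the proxy accounts for well under 1 percent of the variation in the outcome. A correlation can be "significant" in this sense and still explain almost nothing.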
There are two areas of disagreement that might arise with respect to my critique of empirical legal studies.
First, empiricists might point out that it is often the case in studies involving correlational analyses that researchers do not have direct access to the phenomena they want to measure and must therefore resort to proxies. So long as there is a general correlation between the proxy and the underlying phenomenon (and so long as the proxy is not correlated significantly with the absence of that phenomenon), the imperfections merely burden the estimates with a degree of randomness or noise.
Some empiricists might also argue that a significant relationship between PAP and case outcomes is telling because any such relationship is unconnected to anything intrinsic to legal reasoning.
These claims do not hold up under close scrutiny. The hypothesis that judicial decisionmaking is influenced by the ideology of judges is remarkable only if and to the extent that that ideology is extrinsic to law. It is well understood, however, that legal reasoning partakes of political and moral judgments in a number of cases in which judges must exercise delegated or common-lawmaking authority. Thus, in cases where the law requires it, judicial decisionmaking includes a situated and disciplined elaboration of the conventional norms of the American political community, some of which may coincidentally overlap with a judge's own personal ideology. One of many examples of judges undertaking this sort of disciplined elaboration in a self-conscious and overt way is seen in the political (Meiklejohnian) conception of free speech that animated the Supreme Court's seminal decision in New York Times v. Sullivan.
If one accepts that such reasoning is legal reasoning, then any statistical model that uses a measure of ideology that potentially captures this reasoning cannot tell us much about appellate decisionmaking beyond the bland assertion that judicial disagreement explains variation in outcomes. Ideology may inappropriately affect variation in legal outcomes only if (a) ideology or politics takes on an impermissible, extralegal characteristic--something that empirical scholarship has not shown--or (b) we are wrong in our view that some political and ideological questions are intrinsic to law itself.
If empirical scholars could convince us that we are wrong in our view that some political and ideological questions are intrinsic to law itself, they would not be showing that judges have been substituting their ideology for law but, rather, that judges have been following a conception of law that we should reject for normative reasons. And, if they are right in this, their claim would be recognized as a contribution to philosophical jurisprudence, not empirical legal studies.
Second, empiricists might point out that where there is no correlation between independent variables, the omission of one independent variable should have no measurable impact on the estimated effect of another on the dependent variable. On this view, an empiricist might argue that the failure to take account of law, precedent, and deliberations does not bias any estimate of how much of the variance in outcomes is attributable to judges' personal politics or ideology.
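The statistical premise behind this objection, and the condition on which it depends, can be sketched with synthetic data. Everything in the example is hypothetical: `x1` stands in for an ideology proxy, `x2` for an omitted legal variable, and the coefficients `TRUE_B1` and `TRUE_B2` and the `overlap` parameter are made-up values chosen only to show the mechanics.

```python
import random

TRUE_B1 = 0.2  # hypothetical effect of the included variable (x1)
TRUE_B2 = 1.0  # hypothetical effect of the omitted variable (x2)

def ols_slope(x, y):
    """Slope from a one-variable least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov / var_x

def estimated_effect(overlap, n=20_000, seed=0):
    """Regress y on x1 alone; `overlap` sets how strongly x2 tracks x1."""
    rng = random.Random(seed)
    x1, y = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = overlap * a + rng.gauss(0, 1)  # omitted variable
        x1.append(a)
        y.append(TRUE_B1 * a + TRUE_B2 * b + rng.gauss(0, 1))
    return ols_slope(x1, y)

uncorrelated = estimated_effect(overlap=0.0)  # x2 independent of x1
correlated = estimated_effect(overlap=0.8)    # x2 correlated with x1

print(f"omitted variable uncorrelated: {uncorrelated:.2f}")  # near 0.2
print(f"omitted variable correlated:   {correlated:.2f}")    # near 1.0
```

When the omitted variable is unrelated to the included one, the estimate stays close to the true value of 0.2. When the two overlap, the estimate absorbs the omitted variable's effect and lands near 1.0, so the objection stands or falls on the assumption that the variables are uncorrelated.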
There are at least two problems with this argument. First, we have good reason to believe that the quality of judicial deliberations affects appellate decisionmaking. Therefore, if an empirical model omits the deliberations variable, we have reason to believe that it will falsely suggest that the influence of personal ideology is immutable and endemic to judicial decisionmaking rather than the source of a correctable pathology that is likely concentrated in relatively discrete segments of the federal circuit courts at any given time.
Second, empirical studies also go wrong when they assume that as ideological correlations go up, precedent correlations invariably go down. Higher correlations between ideological and political preferences and case outcomes tell us nothing about the relationship between precedent and case outcomes. If a case outcome adheres to precedent, it really does not matter whether the outcome is consistent with a judge's personal political or ideological views.
The simple truth is that, even accepting recent empirical studies on their own terms--that is, with all of their inherent flaws--it is still clear that they predict very little about the effects of extralegal factors on appellate decisionmaking. Professor Cross, who has completed the most comprehensive study of federal appellate decisionmaking, concedes as much in his book. He concludes that, while appointment variables (such as PAP) had measurable effects, they had "very limited explanatory power," especially when compared to legal variables for which "there was consistently a statistically significant association that was robust to different samples and control variables." Unsurprisingly, none of the studies refute the claim that case records, the applicable law, precedent, and judicial deliberations are the critically important determinants of appellate decisionmaking.
In conclusion, I want to be clear that, in forwarding my thesis, I do not mean to dispute the reality that Presidents often appoint judges whose views are consistent with their own. Indeed, when a court is composed of judges who come from a variety of personal, professional, and political backgrounds, this can make for better informed deliberations. My principal point is that it does not follow from the political reality of partisan appointments that judges act in a partisan way in deciding cases once on the bench. Rather, what I believe is that, on an appellate court that adheres to collegial practices, the applicable law, controlling precedent, and the collegial deliberative process are the primary determinants of case outcomes. Certainly, no empirical study has shown otherwise.
[Note: Citations to the authorities mentioned in this paper may be found in Edwards & Livermore, Pitfalls of Empirical Studies That Attempt to Understand the Factors Affecting Appellate Decisionmaking, 58 DUKE L.J. 1895 (2009).]