Statistical Lessons of Ricci

By Jonathan Falk
August 25, 2009

The Supreme Court's decision in Ricci v. DeStefano has already garnered a great deal of attention from lawyers, political pundits, and Supreme Court watchers. While the statistical issues got almost no attention in the decision from either side, there are important statistical currents in Ricci, as there are in any disparate impact case, that are worthy of further attention. This two-part article focuses on three issues: 1) What do statisticians really have to say about disparate impact? 2) How might statistical analysis have played out in Ricci? and 3) Going forward, what role do statisticians have to play in the new standard (i.e., strong basis in evidence)?

To frame the discussion, it will help to lay out a few facts about the case. The City of New Haven, CT, hired a company to develop a promotional test for firefighters that would accomplish two objectives: test only skills relevant for promotion and, subject to that, minimize potential disparate impact of the results. It should be noted that under the precedent provided in Griggs v. Duke Power, 401 U.S. 424 (1971), any procedure that accomplishes the first task has a safe harbor against allegations of disparate impact, albeit a safe harbor that might well have been challenged in court on the facts.

The lieutenant's test (I focus in this paper for expository purposes only on the lieutenant's test and only on blacks and whites) was given to 43 white firefighters and 19 black firefighters, producing the following results: 25 whites passed (58%) and six blacks passed (32%). The New Haven procedure for promotion involved a second phase and the results of that phase meant that 10 whites (40% of those passing the test) and no blacks (0%) would actually receive promotions.

The Statistics of Disparate Impact

The Civil Rights Act of 1964 and its subsequent amendments prohibit discrimination in promotion. Facially neutral promotion policies are still suspect under Title VII of the Act when their application produces a “disparate impact.” The actual judicial interpretation of what “disparate impact” means is quite thin, and the questions of when statistical analysis is appropriate, and to what purpose it should be put, are particularly vague. This is not surprising, since decisions are written by judges, who are well versed in the law but are unlikely to have had substantial training in statistics. As a consequence, the overwhelming tendency in disparate impact cases is to rely on one of two references. First are the Supreme Court's statistical pronouncements, which appear in footnotes in two cases: Castaneda v. Partida, 430 U.S. 482 (1977) and Hazelwood School District v. United States, 433 U.S. 299 (1977). Those footnotes have engendered extensive commentary, but it suffices to note that the Court in these cases allowed, but did not mandate, the use of statistical analysis to help assess disparate impact.

The second reference that is uniformly employed is the so-called “four-fifths rule” promulgated by the Equal Employment Opportunity Commission (EEOC), under which a rate of promotion for a disfavored group that is less than four-fifths of the rate for the most favored group is “evidence of adverse impact.” It is this rule that the majority decision in Ricci cited for the proposition that “[t]he racial adverse impact here was significant, and petitioners do not dispute that the City was faced with a prima facie case of disparate impact liability.” Looking to the lieutenant's exam, the pass rate of blacks was about half that of whites, and the zero promotion rate for blacks following the second phase clearly falls substantially short of the promotion rate for whites.

The “four-fifths rule” is not a legal standard on its own, nor has it been characterized by the courts as anything more than a “rule of thumb.” (See Watson v. Fort Worth Bank and Trust, 487 U.S. 977 (1988).) Nor should it be: the rule itself is qualified in EEOC regulations and has received substantial scorn from statisticians who have looked at it. Moreover, as we shall see, the actual disparities observed in the test fall short of the standards set forth in Castaneda and Hazelwood.
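The four-fifths comparison itself is simple arithmetic. The following is a minimal sketch in Python using the lieutenant's exam pass counts quoted earlier; the function name `selection_rate_ratio` and the constant `FOUR_FIFTHS` are illustrative names chosen for this sketch, not part of any EEOC tooling.

```python
def selection_rate_ratio(selected_low, total_low, selected_high, total_high):
    """Ratio of the lower group's selection rate to the higher group's rate."""
    rate_low = selected_low / total_low
    rate_high = selected_high / total_high
    return rate_low / rate_high


FOUR_FIFTHS = 0.8  # the EEOC rule-of-thumb threshold discussed in the text

# Lieutenant's exam: 6 of 19 black candidates and 25 of 43 white candidates passed.
ratio = selection_rate_ratio(6, 19, 25, 43)
below_threshold = ratio < FOUR_FIFTHS

print(f"pass-rate ratio: {ratio:.2f}; below four-fifths: {below_threshold}")
```

On these figures the ratio comes out at roughly 0.54, in line with the observation that the black pass rate was about half the white pass rate.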

The Fisher Exact Test

The basic calculation that is normally employed begins with the assumption that, ex ante, all candidates ought to be equally capable of securing promotion in an unbiased test. If that is true, then we can calculate the probability that one group will have a particular success rate as a function of the total number of successful candidates and the number of candidates in one of the groups. This is the basis for the so-called Fisher Exact Test, and it is the most commonly used test in these circumstances, though it is not the only one that might be employed. If the combined probability of all events less likely than, or equally likely as, the one that actually occurred is sufficiently low, then we can conclude that one of three things has happened:

  • Something unusual has occurred by chance.
  • The test has taken equally qualified people and is somehow biased in its result.
  • People were not equally qualified to begin with.

It is the logical disjunction of these three possibilities that explains why a statistician's evidence in a disparate impact case can never be dispositive. The statistician's calculations can shed some light on these three possibilities, but it requires a finder of fact to separate these three causes for the observed result.
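As a rough illustration of the calculation described above, the sketch below applies it to the lieutenant's exam pass counts quoted earlier. It uses only the Python standard library and exact integer counts. The function name `fisher_exact_p`, its parameter names, and the "one-tailed" and "two-tailed" labels are choices made for this sketch and should not be read as the procedure any party actually used in Ricci.

```python
from math import comb


def fisher_exact_p(focal_passed, focal_total, other_passed, other_total):
    """Return (one_tailed, two_tailed) p-values for a 2x2 pass/fail table.

    Assumes every candidate was equally likely to pass, and holds fixed the
    total number of passers and the size of each group.
    """
    total = focal_total + other_total
    passed = focal_passed + other_passed
    lowest = max(0, passed - other_total)
    highest = min(focal_total, passed)

    # Number of ways each possible outcome (k passers in the focal group) can arise.
    ways = {
        k: comb(focal_total, k) * comb(other_total, passed - k)
        for k in range(lowest, highest + 1)
    }
    all_ways = comb(total, passed)
    observed = ways[focal_passed]

    # One-tailed: the focal group has as few passers as observed, or fewer.
    one_tailed = sum(w for k, w in ways.items() if k <= focal_passed) / all_ways
    # Two-tailed: every outcome no more likely than the one observed.
    two_tailed = sum(w for w in ways.values() if w <= observed) / all_ways
    return one_tailed, two_tailed


# Lieutenant's exam: 6 of 19 black candidates and 25 of 43 white candidates passed.
one_tailed, two_tailed = fisher_exact_p(6, 19, 25, 43)
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

The two-tailed figure corresponds to the combined probability of all outcomes no more likely than the observed one, which is the quantity the text describes. Comparing integer counts rather than floating-point probabilities avoids rounding problems when two outcomes are exactly equally likely.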

The Statistician's Procedure

The standard procedure for a statistician is to choose a probability below which the first possibility, i.e., “Something unusual happened by chance,” ought to be discounted. A rule of thumb in social science is 5%, but the court should recognize that this is no more than a rule of thumb and needs to be considered in light of both the sample size involved and the other facts in the case.

After the statistician has disposed of this possibility by declaring the result “statistically significant,” the next step is to assert (sometimes directly, sometimes indirectly) that the test was biased. This follows because the assumption that everyone was equally well qualified ex ante is an assumption by the statistician that cannot be altered without invalidating the procedure. Thus, of the three possibilities, one is rejected by the use of a rule of thumb and one is rejected by assumption, which leaves only the remaining possibility: that the test was biased.

However, the court is not so constrained. Suppose, for example, that the test results make it clear that five black candidates and five white candidates were manifestly unqualified for some reason. Then these candidates should not have been included in the original statistical analysis. If this is the case (using the New Haven numbers), the white pass rate (from among the ex ante equally qualified candidates) will rise, but the black rate will rise more. Indeed, this possibility is explicitly mentioned in the EEOC guidelines. (“Greater differences in selection rate may not constitute adverse impact where ... special recruiting or other programs cause the pool of minority ... candidates to be atypical of the normal pool of applicants from that group.”)
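The effect of the hypothetical exclusion can be worked through directly. The sketch below assumes, as the example implies, that all five excluded candidates in each group had failed the test, so only the number tested changes; the variable names are illustrative.

```python
def pass_rate(passed, tested):
    """Share of tested candidates who passed."""
    return passed / tested


# Lieutenant's exam figures quoted in the article.
white_passed, white_tested = 25, 43
black_passed, black_tested = 6, 19

# Hypothetical from the text: five manifestly unqualified candidates per group.
# Assumption for this sketch: all of them failed, so pass counts are unchanged.
excluded_per_group = 5

white_before = pass_rate(white_passed, white_tested)
black_before = pass_rate(black_passed, black_tested)
white_after = pass_rate(white_passed, white_tested - excluded_per_group)
black_after = pass_rate(black_passed, black_tested - excluded_per_group)

white_gain = white_after - white_before
black_gain = black_after - black_before
ratio_before = black_before / white_before
ratio_after = black_after / white_after

print(f"white pass rate: {white_before:.1%} -> {white_after:.1%}")
print(f"black pass rate: {black_before:.1%} -> {black_after:.1%}")
print(f"black/white ratio: {ratio_before:.2f} -> {ratio_after:.2f}")
```

Under these assumptions both pass rates rise and the black rate rises by more, so the ratio of the two rates moves closer to one.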

What the statistician can usefully do to aid a court is to explain the assumptions behind the analysis, to explain the calculations made under those assumptions, and to explain what would be the normal conclusion in professional practice subject to those assumptions being true. The finder of fact, weighing the totality of the evidence, is then in a position to make a determination of disparate impact.

Part Two of this article will discuss the Ricci results, and lessons from the new rule.


Jonathan Falk is a Vice President of NERA Economic Consulting. His practice covers many areas, including statistical analysis in labor-related cases, covering issues of hiring, promotion, and reductions in force. He is a member of the American Statistical Association, and can be reached at [email protected].
