
Survey Power

By Alex Simonson
November 29, 2007

Surveys attempt to extrapolate from a sample what is happening in the defined universe as a whole. If a study is designed with biases (such as leading questions), the statistics typically used for hypothesis testing cannot be relied upon to establish 'statistical significance,' because those statistics are unaware of biases in the questions. Statistics are blind to conceptual problems in questionnaire design.

Assuming, however, that we have a non-leading, unbiased instrument and a random sample of respondents, statistics from the sample can be used to determine how likely it is that the results occurred by chance rather than reflecting a true result. There are two general kinds of correct results: results indicating the presence of some effect when the effect truly exists in the underlying universe, and results indicating the absence of some effect when the effect truly does not exist in the underlying universe. Likewise, there are two general kinds of incorrect results: results indicating the presence of some effect when the effect is not there in the underlying universe, and results indicating the absence of some effect when the effect actually is there in the underlying universe.

Statistical Power

Surveys commissioned by plaintiffs typically are used to demonstrate the presence of some effect, such as secondary meaning, likelihood of confusion, or likelihood of dilution. Defendants often attempt to establish the absence of some effect: no secondary meaning, no likelihood of confusion, no likelihood of dilution. The important point that is often overlooked, however, is this: the mere failure to find an effect in a survey is not sufficient to establish a 'no effect' conclusion. The reason lies in large part in a concept called 'statistical power.' High statistical power (defined below) is required in order to establish a no-effect conclusion, and the mere absence of a statistically significant finding does not equate to, and is not the same as, having high statistical power. Statistical power is not an esoteric concept; it is a key component of scientific testing in industry and in marketing research generally.

Statistical power, in words, is the chance that a survey would detect an effect (confusion or dilution, etc.) when such an effect truly exists in the universe. Optimally, scientists would like this chance to be 80% or higher. The practical consequence is this: if a survey with at least 80% power fails to find an effect, the 'no effect' conclusion carries real weight, because an effect of the size the survey was designed to detect would have been found at least 80% of the time.

There are four possibilities in testing for effects. Two of them arise when we conclude from the survey that there is an effect (a certain level of confusion, for example): in reality, there may indeed be such an effect in the universe, or instead our sample may not be reflecting the true number in the universe. So, the first two possibilities are: a) we conclude from our sample that there is an effect, and there is indeed one in the defined universe; or b) we conclude from our sample that there is an effect, but in fact there is no such effect in the defined universe. Scientific standards tend to converge on keeping the chance of (b), concluding from our sample that there is an effect when in fact there is no such effect in the universe, low: less than 5%, typically. Thus, the converse, the chance that when there truly is no effect in the universe we correctly refrain from declaring one, would be greater than 95% (or 100% minus our error rate of 5%). The error rate here is termed 'Type A' error (conventionally called Type I error): the error of concluding there is an effect based on our survey when in fact there is no such effect in the universe.

The other two possibilities arise when we conclude from the survey that there is no effect (no confusion, for example): in reality, there may indeed be no such effect in the universe, or instead our sample may not be reflecting the true number in the universe. So, the two possibilities are: c) we conclude from our sample that there is no effect, and there is no such effect in the defined universe; or d) we conclude from our sample that there is no effect, but in fact there is an effect in the defined universe. Scientific standards tend to converge on keeping the chance of (d), concluding from our sample that there is no effect when in fact there is an effect in the universe, low: less than 20%, typically. Thus, the converse, the chance that we detect the effect from our sample when it truly exists in the universe, would be greater than 80% (or 100% minus our error rate of 20%). This is the 'statistical power' of the test. The error rate here is termed 'Type B' error (conventionally called Type II error): the error of concluding there is no effect based on our survey when in fact there is an effect in the universe.
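
To make these four possibilities concrete, the short simulation below is an illustrative sketch rather than anything drawn from an actual case. It assumes a two-cell confusion survey analyzed with a standard two-proportion z-test, draws many hypothetical samples, and tallies how often the test declares an effect. When no effect exists in the assumed universe, such declarations are Type A (Type I) errors; when an effect does exist, the detection rate is the statistical power, and the misses are Type B (Type II) errors. The cell size, the confusion levels, and the choice of test are assumptions made only for illustration.

import random
from statistics import NormalDist

def significant(x_test, x_control, n, alpha=0.05):
    """Two-sided, two-proportion z-test with a pooled standard error.
    Returns True when the observed difference is 'statistically significant'."""
    p1, p2 = x_test / n, x_control / n
    pooled = (x_test + x_control) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:
        return False
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 at the 5% level
    return abs(p1 - p2) / se > z_crit

def detection_rate(p_test, p_control, n=200, surveys=5000, seed=1):
    """Fraction of simulated surveys in which the test declares an effect."""
    rng = random.Random(seed)
    found = 0
    for _ in range(surveys):
        x_t = sum(rng.random() < p_test for _ in range(n))     # test cell
        x_c = sum(rng.random() < p_control for _ in range(n))  # control cell
        found += significant(x_t, x_c, n)
    return found / surveys

# No true effect (both cells at 29%): any 'significant' finding is a Type A
# (Type I) error; the rate comes out near the 5% benchmark.
print("Type A error rate:", detection_rate(0.29, 0.29))

# A true ten-point effect (30% vs. 20%): the detection rate is the power,
# and one minus that rate is the Type B (Type II) error rate.
print("Power:", detection_rate(0.30, 0.20))

The loop simply re-runs the same survey design thousands of times; power is the long-run frequency with which that design catches an effect that really is there, and at 200 respondents per cell even the assumed ten-point effect is detected in fewer than 80% of the simulated surveys.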

Sample Size

Sample size is one of the key variables that influence statistical power. It turns out that a typical test or control group size of 200 respondents often is too small to yield high statistical power. Thus, not finding a significant difference between test and control groups often is not sufficient to conclude that there is really no effect in the defined universe. So, when a survey researcher conducts a confusion survey with 200 respondents in each cell (test and control) and finds that the test cell shows a 30% level of confusion and the control cell a 29% level, the result tends to indicate the following: the survey does not prove there is confusion, but neither does it prove that there is 'no confusion' in the defined universe. In my experience as a professor, this point seems peculiarly difficult to grasp. An analogy is useful: it is akin to trawling the ocean in search of a sunken ship, the ship being the 'effect.' If one finds the ship, that is clearly proof of its existence. Not finding the ship, however, does not prove that it is not there, because the measuring apparatus may not be strong enough to support the conclusion that there is no ship. In statistical terms, this is the 'power' of the test. Survey power means the ability of a survey to detect an effect. In the science of statistics and surveys, it is well accepted that for a survey to be scientifically valid, it must have sufficient survey power.
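
As a rough check on the example above, the sketch below is an illustration, not a calculation from any actual survey: it applies the standard normal approximation for a two-sided, two-sample proportion test. With 200 respondents per cell and a true gap of 30% versus 29% in the universe, the approximate power comes out at only about 5% to 6%, which is why a non-significant result in that design cannot support a 'no confusion' conclusion.

from statistics import NormalDist

def approx_power(p_test, p_control, n_per_cell, alpha=0.05):
    """Approximate power of a two-sided, two-sample proportion z-test
    (normal approximation, equal cell sizes)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)       # about 1.96 at the 5% level
    p_bar = (p_test + p_control) / 2        # pooled proportion if there is no effect
    se_null = (2 * p_bar * (1 - p_bar) / n_per_cell) ** 0.5
    se_alt = (p_test * (1 - p_test) / n_per_cell
              + p_control * (1 - p_control) / n_per_cell) ** 0.5
    diff = p_test - p_control
    # Chance that the observed difference lands in either rejection region
    return (z.cdf((diff - z_crit * se_null) / se_alt)
            + z.cdf((-diff - z_crit * se_null) / se_alt))

# The article's hypothetical: 30% vs. 29% confusion, 200 respondents per cell.
# The approximate power is only about 0.05-0.06, far below the 0.80 benchmark,
# so the design had almost no chance of detecting a true one-point difference.
print(round(approx_power(0.30, 0.29, 200), 3))

The same function can be rerun with larger cell sizes or larger assumed gaps to see how quickly, or slowly, power climbs toward the 80% benchmark.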

In industry and in the world of policy, where just as much impact can arise from either type of study (one concluding there is an effect or one concluding there is no effect), and where either type of study will be given equal weight in determining whether there is or is not an effect, the ramifications of Type B error are just as important as those of Type A error. In litigation, for example, if the court credits the conclusions of a study showing that an effect occurred, and that study happens to reflect a Type A error, one party (the defendant) suffers from the court's decision because of the error. Equally, if the court credits the conclusions of a study showing that no effect occurred, and that study happens to reflect a Type B error, the other party (the plaintiff) suffers from the court's decision because of the error. Thus, in the policy or litigation world, where a study showing no effect may be admissible and may be relied upon by the court to effect court orders and change rights between parties in the same manner as a study showing an effect, the error rates for both types of error (A and B) should be low.

Conclusion

Courts and lawyers should pay particular attention to survey findings such as 'there is no likelihood of confusion' or 'there is no likelihood of dilution,' and should ensure that the power of the test is high before relying on such conclusions. Anytime a commissioned survey indicates 'no effect,' then in addition to issues such as question clarity, randomness of the sample, and other methodological issues, there is the additional critical issue of statistical power. Power tables can be used to determine, more or less precisely depending on certain a priori assumptions one must make, what the power of the test is likely to have been. The astute litigator will make use of such power tables to determine whether a 'no effect' conclusion is warranted from a particular survey.
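
For readers who wish to reproduce the flavor of such a power table, the sketch below is one illustrative way to do it, not a prescribed method: the assumed true gap between test and control cells (30% versus 20%), the 5% significance level, and the candidate cell sizes are all assumptions chosen for the example, and the calculation uses the power routines of the statsmodels Python library, with Cohen's h as the effect size for a two-sided, two-sample test.

# A hypothetical power table for a two-cell survey. The assumed true gap
# (30% vs. 20%), the 5% significance level, and the cell sizes below are
# assumptions chosen only for illustration.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

effect = proportion_effectsize(0.30, 0.20)   # Cohen's h for the assumed gap
analysis = NormalIndPower()

print("respondents per cell    approximate power")
for n in (100, 200, 300, 400, 500):
    power = analysis.solve_power(effect_size=effect, nobs1=n, alpha=0.05,
                                 ratio=1.0, alternative='two-sided')
    print(f"{n:>20}    {power:.2f}")

# Cell size needed to reach the conventional 80% power for this assumed gap
needed = analysis.solve_power(effect_size=effect, nobs1=None, alpha=0.05,
                              power=0.80, ratio=1.0, alternative='two-sided')
print("per-cell size for 80% power:", round(needed))

Under these assumed inputs, 200 respondents per cell falls short of 80% power even for a ten-point gap; that is precisely the kind of diagnosis a power table supports.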

Dr. Alex Simonson is president of Simonson Associates, Inc., in Englewood Cliffs, NJ, and has been a survey researcher for more than 15 years. He may be contacted at [email protected], or visit www.simonsonassociates.com.


