The IQ Illusion: Why IQ Tests Aren’t Very Clever
Do you use psychometric assessments that measure "general mental ability" through numerical, verbal, and spatial reasoning tests? Read on.
For over a century, IQ tests have been used to measure human intelligence, influencing educational policies, job recruitment, and even social perceptions. However, as our understanding of complexity science evolves, serious questions arise about the validity and reliability of these tests in themselves, let alone the adverse impact they already have on different demographics. In this post, we'll explore why evidence now suggests that IQ tests fall short as a scientific measure of intelligence, drawing on insights from complexity theory and modern data analysis techniques. The latter is something I toiled with in my dissertation; psychology and maths are uncomfortable partners.

This blog was inspired by, and tries to summarise, a long-form (64-minute read) piece written by Sean McClure. I shall space this over two blogs so as to give it the coverage that, in my opinion, it requires and deserves, given the prevalence of psychometrics in current HR practices. Part one is about the underlying assumptions that IQ tests sit upon; the second part covers their relation to performance.
The Complexity of Intelligence
Intelligence, or cognition, is perhaps the most complex phenomenon we know, surpassing even weather systems in its intricacy and emerging from countless interdependent dimensions of brain function, environmental factors, and personal experiences. If we still don't have an agreed understanding of how we cognise, then claiming to measure it must be an illusion. As Sean McClure points out in his critique of IQ testing:
"Intelligence is an extremely complex phenomenon, sitting at the far end of the complexity spectrum."
This complexity presents significant challenges for measurement and prediction. Complex systems exhibit key hallmarks that include:
1. Emergence: Properties that arise from the interactions of components, not predictable from individual parts.
2. Opacity: The inability to trace a clear path from cause to effect.
3. Non-linearity: The relationship between cause and effect is not proportional. Small changes in input can lead to disproportionately large or unpredictable outcomes (the butterfly effect; see the sketch after this list).
4. Self-organisation: The system spontaneously arranges itself into a more ordered state without external control or central coordination. Simple rules at the local level give rise to complex behaviour at the global level.
These characteristics make it extremely difficult to reduce intelligence to a single, environment- and task-agnostic number, or even to a set of discrete measurements.
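To make the non-linearity hallmark concrete, here is a minimal Python sketch (my own illustration, not from McClure's piece) using the logistic map, a textbook nonlinear system. Two starting points that differ by one part in a billion end up on completely different trajectories:

```python
# Illustrative only: the logistic map x -> r * x * (1 - x) is a classic
# nonlinear system. A perturbation of one part in a billion eventually
# produces a completely different trajectory -- the butterfly effect.

def logistic_trajectory(x0, r=3.9, steps=50):
    """Iterate the logistic map from x0 and return all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.200000000)
b = logistic_trajectory(0.200000001)  # perturbed by 1e-9

for step in (0, 10, 25, 50):
    print(f"step {step:2d}: |a - b| = {abs(a[step] - b[step]):.9f}")
# The gap grows from 1e-9 to order 1: a tiny change in input,
# a disproportionately large change in output.
```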
The Problem of Measurement in Complex Systems
When dealing with complex systems like intelligence, we must recognise the limitations of our measurements. McClure emphasises:
"If your system of interest is considered complex you must assume any measurement made in the pursuit of understanding that system is highly indirect. This is a consequence of the complexity hallmarks."
This indirectness creates a "proxy distance" between what we are measuring and the underlying phenomenon we are trying to understand. In the case of intelligence, this distance is vast, making it challenging to draw meaningful conclusions from IQ test scores. This starts to undermine a cornerstone of the science that underpins much research (including my own): the independent/dependent variable paradigm. If the thing we are measuring is a complex system, then the effect of any manipulation (the independent variable) will not show up as a localised change in the outcome (the dependent variable) but will resonate throughout the system due to "interaction-dominant dynamics" (Favela, 2024, p. 102). This was also evidenced in the critiques of Mindset Theory.
The Pitfalls of Data-Driven vs. Model-Driven Approaches
Traditional IQ research often falls into the trap of being model-driven rather than data-driven. This approach can lead to confirmation bias and statistical manipulation. As McClure notes:
"It's easy to 'confirm' what we believe if we look hard enough. In statistics we call this 'torturing the data until they confess' meaning we can repeatedly gather and interpret source data until they align with whatever pre-conceived model we have in possession."
Alex Edmans has written about this in challenging spurious correlations from the likes of McKinsey (Edmans et al., 2023): slicing the data until a desired p-value appears, or manipulating the scales on the x/y axes to create the appearance of correlation. This problem is exacerbated by the complexity of intelligence, as the sheer number of variables involved makes it easy to find spurious correlations if one is looking for them.
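To see how easily this happens, here is an illustrative sketch (invented data, my own example, not Edmans'): generate an outcome and 200 "predictors" that are pure random noise, test them all, and roughly 5% will look "significant" at p < 0.05 by chance alone.

```python
# Illustrative only: with enough candidate variables, pure noise will
# "predict" an outcome at p < 0.05 through sheer multiplicity.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
n_people, n_vars = 100, 200

outcome = rng.normal(size=n_people)          # e.g. a test score
noise = rng.normal(size=(n_vars, n_people))  # 200 unrelated "predictors"

false_hits = []
for i in range(n_vars):
    r, p = pearsonr(noise[i], outcome)
    if p < 0.05:
        false_hits.append((i, r, p))

# By construction nothing is related, yet about 10 of the 200 variables
# will clear the conventional significance bar by chance alone.
print(f"{len(false_hits)} of {n_vars} random variables 'predict' the outcome")
for i, r, p in false_hits[:3]:
    print(f"  variable {i:3d}: r = {r:+.2f}, p = {p:.3f}")
```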
The Physics Envy Problem
Much of the criticism of IQ tests stems from what's known as "physics envy" in psychology. Early psychologists, eager to establish their field as a "hard science," attempted to mimic the reductionist approach of physics. They sought to quantify and reduce intelligence to its component parts, much like physicists break down matter into atoms and subatomic particles. This is where Likert scales became psychology's saviour, allowing it to be seen as a "real science" through early-20th-century eyes.
However, this approach fundamentally misunderstands the nature of intelligence as a complex system. As complexity science teaches us, the whole is more than the sum of its parts. Emergent properties arise from the interactions of components in ways that can't be predicted by studying those components in isolation. This again brings into question the entire independent/dependent variable approach to studying behaviour.
The Limitations of Linear Models
Traditional statistical approaches used in IQ research often rely on linear models and normal distributions. These methods are inappropriate for complex systems like intelligence. As Luis H. Favela points out in his work on complexity science:
"On the other hand, if a system's dynamics are the result of nonlinear (e.g., multiplicative) processes, then it follows that the effects of perturbations will not be localised and will percolate throughout the system due to interaction-dominant dynamics."
This means we can't isolate single variables and measure their effects independently. Small changes in one area might have large, nonlinear effects elsewhere in the system due to complex interactions.
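Here is a toy sketch of that contrast, under invented dynamics (this is not a model of any real cognitive system): in a component-dominant (additive) system, a nudge to one component stays put; in an interaction-dominant (multiplicatively coupled) system, the same nudge percolates everywhere.

```python
# Toy contrast under invented dynamics (not a model of the brain):
# nudge one component, then check whether the nudge stays local.
import numpy as np

rng = np.random.default_rng(0)
n, steps = 5, 20
W = rng.uniform(0.1, 0.3, size=(n, n))  # arbitrary coupling weights

def run(interacting, bump):
    x = np.ones(n)
    x[0] += bump  # perturb component 0 only
    for _ in range(steps):
        if interacting:
            # multiplicative, interaction-dominant: each component's update
            # depends on all the others through W @ x
            x = x * (1 + 0.1 * np.tanh(W @ x - 1))
        else:
            # additive, component-dominant: each component evolves alone
            x = 0.9 * x + 0.1
    return x

for interacting in (False, True):
    diff = np.abs(run(interacting, 0.5) - run(interacting, 0.0))
    kind = "interaction-dominant" if interacting else "component-dominant "
    print(kind, "spread to components 1-4:", np.round(diff[1:], 4))
# Component-dominant: zeros -- the perturbation stays localised.
# Interaction-dominant: every component shifts -- it percolates.
```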
The Problem of Equifinality
Another challenge in measuring intelligence is the concept of equifinality: the idea that multiple different configurations can lead to the same outcome. In the context of intelligence, this means that two individuals might achieve the same level of performance or problem-solving ability through entirely different cognitive processes or brain structures.
This phenomenon makes it extremely difficult to create a standardised test that accurately measures intelligence across diverse populations.
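A minimal worked example of equifinality, with hypothetical subtest names, weights, and scores: two very different cognitive profiles collapse to an identical composite score, so the score alone cannot tell them apart.

```python
# Equifinality in miniature (hypothetical profiles and weights):
# two different internal configurations, one identical measured score.
import numpy as np

# A composite score built from four subtests, equally weighted.
weights = np.array([0.25, 0.25, 0.25, 0.25])

#                      verbal  memory  spatial  speed
profile_a = np.array([  140,    130,     90,    100])  # verbal/memory specialist
profile_b = np.array([   90,    100,    140,    130])  # spatial/speed specialist

print(weights @ profile_a, weights @ profile_b)  # 115.0 115.0
# The single number cannot distinguish the two configurations,
# let alone the different cognitive processes behind them.
```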
The Dangers of Oversimplification
Despite these complexities, IQ tests continue to be used to make significant decisions in education, employment, and even policy. McClure warns about the dangers of this approach:
"Psychologists propose changes to policy. This represents an intervention into a complex system. The burden of interpretation should always be taken into account whenever interventions are proposed. This makes intuitive sense. If the proxy distance between my measurements and the underlying function is large I should be exceedingly wary of turning my findings into a recommended policy or promoted piece of technology."
When we oversimplify intelligence to a single number or even a small set of scores, we risk making poor decisions that affect people's lives and futures.
The Limitations of Current Statistical Methods
Even within the framework of traditional statistics, there are significant issues with how IQ test results are analysed. Many studies rely on ordinal data (Likert scales) but treat them as interval or ratio data in their analyses. This introduces a fundamental flaw in the statistical modelling.
As noted in my dissertation research, there's an "acknowledged fallacy" here: the difference between "strongly agree" and "agree" on a Likert scale may not be the same as the difference between "agree" and "neutral," yet these are often treated as equal intervals in statistical analyses (Harpe, 2015; Weijters et al., 2021).
This "fudge," while widely accepted in psychological research, introduces significant errors into our understanding of intelligence and its measurement.
Conclusion
As our understanding of complexity science grows, it becomes increasingly clear that traditional IQ tests are inadequate measures of human intelligence. The complex, emergent nature of intelligence defies simple quantification and linear modelling.
To truly understand and measure intelligence, we need to embrace more sophisticated approaches that can handle high-dimensional, nonlinear data. Machine learning and other algorithmic models offer promising avenues for future research, but we must also remain humble about the limitations of any attempt to quantify something as complex as human intelligence.
As we move forward, it's crucial that we critically examine the tools we use to measure intelligence and the decisions we make based on those measurements. Only by acknowledging the full complexity of human cognition can we hope to develop more accurate and useful ways of understanding and nurturing intelligence in all its diverse forms.
Edmans, A., Flammer, C., & Glossner, S. (2023). Diversity, equity, and inclusion. ECGI Working Paper Series in Finance.
Favela, L. H. (2024). The Ecological Brain: Unifying the Sciences of Brain, Body, and Environment. Routledge.
Harpe, S. E. (2015). How to analyze Likert and other rating scale data. Currents in Pharmacy Teaching and Learning, 7(6), 836–850. https://doi.org/10.1016/J.CPTL.2015.08.001
Weijters, B., Millet, K., & Cabooter, E. (2021). Extremity in horizontal and vertical Likert scale format responses. Some evidence on how visual distance between response categories influences extreme responding. International Journal of Research in Marketing, 38(1), 85–103. https://doi.org/10.1016/J.IJRESMAR.2020.04.002
The debate doesn't end with IQ. Howard Gardner's theory of Multiple Intelligences (MI), often proposed as an alternative, has attracted its own well-documented criticisms and potential flaws:
## Lack of Empirical Evidence
One of the main criticisms is that Gardner's theory lacks sufficient empirical evidence to support its claims[1][2]. Critics argue that Gardner has not conducted rigorous experimental research to validate the existence of distinct, independent intelligences[2]. Gardner himself has admitted that he has not carried out experiments specifically designed to test the theory[4].
## Misuse of the Term "Intelligence"
Some critics argue that Gardner's use of the term "intelligence" is problematic[4]. The theory suggests a predictive power that it does not actually have. Gardner later acknowledged that he used the term "intelligences" rather than "talents" partly to challenge psychologists who claimed ownership of the definition of intelligence[4].
## Conflation with Learning Styles
Despite Gardner's intentions, his theory has often been misinterpreted and conflated with learning styles theory in educational settings[5]. This has led to the problematic practice of labelling students as having a single "preferred intelligence," which Gardner himself criticises as unhelpful or ill-conceived[5].
## Outdated Framework
Gardner has admitted that the theory is no longer current, as several fields of knowledge have advanced significantly since the early 1980s when the theory was first proposed[4]. He suggests that any reinvigoration of the theory would require a new comprehensive survey of scientific findings.
## Criticism from Psychologists
Many psychologists argue that Gardner's theory goes against the widely accepted notion of general intelligence or "g"[2]. The dominant view in psychology supports a hierarchical model of intelligence, where "g" is at the top, influencing various cognitive processes[2].
## Educational Implications
While popular among educators, the practical application of MI theory in classrooms has been questioned. Research shows that when teachers try to match instruction to perceived learning styles (often conflated with multiple intelligences), the benefits are nonexistent[5].
## Definition and Measurement Issues
Critics point out that Gardner's definition of intelligence is too broad and that the criteria for identifying an intelligence are subjective[1]. There are also concerns about the lack of standardised measures for assessing the different intelligences proposed by Gardner.
In conclusion, while Gardner's theory of Multiple Intelligences has been influential in challenging traditional views of intelligence, it has faced significant criticism for its lack of empirical support, potential misuse in educational settings, and inconsistency with current psychological understanding of intelligence. Gardner himself has acknowledged some of these criticisms and the need for updating the theory in light of recent scientific advancements.
Citations:
[1] https://typeset.io/questions/what-are-the-criticisms-of-howard-gardner-s-theory-of-5b1d7wk4jd
[2] https://files.eric.ed.gov/fulltext/ED500515.pdf
[3] https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2023.1217288/full
[4] https://researched.org.uk/2018/09/26/myth-busting-gardners-multiple-intelligences/
[5] https://www.edutopia.org/article/multiple-intelligences-theory-widely-used-yet-misunderstood/
[6] https://www.verywellmind.com/gardners-theory-of-multiple-intelligences-2795161