Why Performance Ratings Deserve a Performance Review
The problems with ratings and forced distribution curves, and what leaders and HR practitioners can do about them
Performance management has become a pivotal process for organisations seeking to boost productivity and align employee behaviours with strategic objectives. However, managers and HR professionals often put too much stock in performance rating numbers, using them as the basis for significant decisions around promotion, compensation, bonuses, stock options and even termination.
Research shows these performance rating numbers are significantly skewed by the idiosyncratic biases and perceptions of raters, rather than coming close to objective assessments of actual job performance.
The Idiosyncratic Rater Effect
Research into what is known as the “idiosyncratic rater effect” has found that a striking 50-72% of the variance in performance rating scores is explained purely by the personal perceptions, biases and tendencies of the specific rater, rather than by the actual performance of the ratee.
This means that if two managers independently rate the same employee, they can come up with wildly differing scores across a range of performance dimensions. And this variance isn’t necessarily because one rater is more accurate or “right” than the other: both may be equally inaccurate, each distorted by their own biases.
What the Research Says on Rating Biases
A range of studies have delved into the factors influencing performance ratings and most identify five core elements:
Actual job performance of the ratee
Rater’s perception of the ratee’s performance
Rater’s own idiosyncratic tendencies
Rater perspective based on role (manager, peer, etc.)
Random errors
However, research overwhelmingly finds that out of these five factors, idiosyncratic rater effects and perceptions explain far more of the variance in scores than actual job performance:
A major study found idiosyncratic rater effects explained over 50% of rating variance, versus around 20% for actual performance
Another large study pegged idiosyncratic effects at 72% of rating variance
Research by Viswesvaran et al. calculated that 19% of variance was attributable to pure rater error
Taken together, this research shows that performance rating numbers reflect the rater’s internal biases, perceptions and tendencies to a very large degree, rather than providing anything close to an objective measure of the ratee’s actual performance.
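To make that variance split concrete, here is a minimal simulation sketch in Python. The variance components (55% rater effect, 20% true performance, 25% random error) are illustrative numbers chosen to echo the studies above, not figures drawn from any one of them:

```python
import numpy as np

rng = np.random.default_rng(42)
n_employees, n_raters = 1000, 50

# Illustrative variance components: ~55% idiosyncratic rater effect,
# ~20% true performance, ~25% random error (they sum to 1.0).
var_rater, var_perf, var_noise = 0.55, 0.20, 0.25

true_perf = rng.normal(0.0, np.sqrt(var_perf), n_employees)   # each ratee's actual performance
rater_bias = rng.normal(0.0, np.sqrt(var_rater), n_raters)    # each rater's idiosyncratic shift

# Every rater scores every employee: rating = performance + rater bias + error.
ratings = (true_perf[None, :] + rater_bias[:, None]
           + rng.normal(0.0, np.sqrt(var_noise), (n_raters, n_employees)))

total_var = ratings.var()
rater_share = ratings.mean(axis=1).var() / total_var   # variance across rater averages
perf_share = ratings.mean(axis=0).var() / total_var    # variance across employee averages
print(f"explained by who does the rating: {rater_share:.0%}")
print(f"explained by actual performance:  {perf_share:.0%}")
```

This toy additive model, in which every rater scores every employee, is far simpler than the multi-source designs the studies actually used, but the output makes the point: who does the rating explains more of the score than how the ratee performed.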
Problems with Connecting Ratings Directly to Pay and Rewards
Unfortunately, despite the well-documented issues with rater subjectivity and bias, many leaders still directly connect performance rating numbers to pay rates, bonuses, stock allocations and other rewards.
By tying rating numbers directly to monetary rewards and punishments, organisations effectively bake rater biases and idiosyncratic tendencies into compensation decisions. An employee may end up with a lower pay rise or bonus not because of poor performance, but because their rater holds fixed perceptions of them, rates harshly in general, or harbours some unconscious bias against them.
Likewise, less deserving employees may receive an outsized share of rewards simply because their manager rates leniently or has taken a shine to them. With so much research pointing to the influence of rater bias, it seems unwise for leaders to link employee financial outcomes directly to subjective rating numbers.
Problems with Forced Ranking of Employees
Many leaders still follow variants of the infamous forced ranking approach popularised by Jack Welch at GE in the 1980s. This system pits employees against one another by forcing managers to fit their rating distributions to a bell curve, where most people end up rated as average and fixed percentages fall into “top” and “bottom” bands.
Forced ranking rests on two outdated and flawed assumptions:
Performance across knowledge workers fits a standard normal distribution, akin to fixed physical attributes like height or weight, i.e. most people fall somewhere in the middle of the curve.
Employees must compete against each other for rankings rather than being assessed objectively versus job expectations. This creates perverse incentives for employees to undermine teamwork and information sharing.
Abundant research now shows that performance among knowledge workers in fields like engineering, law, academia and creative vocations follows more of a heavily skewed, long-tailed distribution. The top 20% of performers are often responsible for generating 80% or more of the business outputs and innovations.
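A quick simulation sketch shows why the shape of the distribution matters. The parameters are illustrative rather than estimates from the research cited above; a Pareto shape of roughly 1.16 is simply the textbook value that yields an 80/20 split:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two hypothetical distributions of individual output.
bell = np.clip(rng.normal(100, 15, n), 0, None)  # symmetric, height-like
tail = rng.pareto(1.16, n) + 1                   # heavy-tailed; shape ~1.16 gives ~80/20

def top20_share(output):
    """Fraction of total output produced by the top 20% of performers."""
    cutoff = np.quantile(output, 0.8)
    return output[output >= cutoff].sum() / output.sum()

print(f"bell curve: top 20% produce {top20_share(bell):.0%} of output")
print(f"heavy tail: top 20% produce {top20_share(tail):.0%} of output")
```

Under the bell curve, the top quintile produces only about a quarter of total output, so rating most people “average” costs little; under the heavy tail it produces around 80%, and mislabelling a top performer as average becomes very expensive.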
Enforcing bell curve rankings means valuable top performers are unfairly labelled as merely average, and managers are pressured to hand out arbitrary bottom rankings even when every team member is competent. The result is a demoralising and demotivating system that damages team cohesion, information sharing and retention of top talent.
Many well-known companies that formerly used rigid forced ranking systems, including Microsoft, Adobe, Accenture and Juniper Networks, have abandoned them after observing their deleterious organisational impacts. It is time others rethought this questionable practice too, especially given the dominance of the knowledge economy.
Ways Leaders Can Mitigate Rating Biases
Though it may be impossible to eliminate biases and idiosyncrasies from performance evaluation entirely, leaders can take steps to mitigate their influence:
Use multi-source 360 degree reviews: Rather than relying purely on a single manager’s rating, collect perspectives from peers, subordinates, internal customers and other stakeholders. Aggregating views through multiple lenses can balance out individual rater biases. However, avoid compressing 360 feedback into a single aggregate number, as that merely hides the variance amongst raters.
Provide rating ranges, not averages: Present 360 feedback in terms of ranges to capture the diversity of rater perceptions rather than hiding it behind averages; a minimal sketch of this follows the list.
Institute continuous evaluation: Conducting frequent mini-reviews instead of a single major annual review provides a better opportunity to correct rating biases and perceptions over time, before serious decisions are made.
Favour developmental uses over administrative purposes: Research shows that emphasising employee development and feedback, rather than using ratings primarily for promotion or pay decisions, incentivises more accurate ratings from managers.
Train raters on avoiding biases: Rater training can increase self-awareness of the halo effect, recency effects and leniency/harshness tendencies that distort performance perceptions and ratings. Frame-of-reference training is one proven approach.
Define metrics and goals clearly: Setting expectations through measurable, well-defined goals and success metrics leaves less room for subjective perception to drive ratings.
Allow rater anonymity: Where suitable, let raters provide 360 feedback anonymously; anonymity encourages more candid feedback to surface.
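As a minimal sketch of the “ranges, not averages” point above, here is how 360 scores might be summarised per dimension. The dimensions, raters and 1-5 scale are all hypothetical:

```python
from statistics import median

# Hypothetical 360 scores (1-5) for one employee, per dimension, from
# four raters: a manager, two peers and a direct report.
feedback = {
    "collaboration": [4, 5, 3, 5],
    "delivery":      [3, 4, 4, 2],
    "communication": [5, 2, 4, 3],
}

for dimension, scores in feedback.items():
    avg = sum(scores) / len(scores)
    # Reporting only the average hides disagreement; the range surfaces it.
    print(f"{dimension:14s} avg={avg:.2f}  "
          f"range={min(scores)}-{max(scores)}  median={median(scores)}")
```

Here “communication” averages a respectable 3.50, yet individual raters scored it anywhere from 2 to 5: exactly the kind of disagreement a lone average would bury, and a range puts on the table for discussion.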
Key Takeaways for Leaders
While various rating biases may be inherently unavoidable in human evaluation processes, understanding their existence and thoughtfully designing rating systems can go some way toward minimising unfair impacts on employees.
Leaders would do well to maintain healthy scepticism about the true objectivity and accuracy of performance rating measures, rather than taking the numbers at face value. Recognising the inherent subjectivity of ratings, and adopting approaches that capture rater perspectives in the round, is the sensible response.
Most critically, feeding flawed rating numbers through inherently biased forced ranking systems, and then using the results as the sole basis for pay and promotion decisions, does more harm than good.