Next Chapter of the Performance Ratings Debate


No one with an interest in HR and organizational behavior is likely to have missed that there is a lot happening within the field of performance management right now. As covered rather extensively in a series of blog posts here in the fall, a pervasive trend over the last three or four years has been to get rid of the annual performance review (APR) and, most notably, the numerical performance ratings. The idea has been that this highly unpopular process, of doubtful accuracy and doubtful added business value, consumes a great deal of managers' time and demotivates basically all employees except the highest-performing ones.

Back in the fall, I cautioned that the scrapping of the APR – overdue and expected as it was – risked hiding the fact that the really difficult issue is still upon us. Because in reality, the task of evaluating employees’ performance has gotten no easier just because the ratings went out the window. And there is still a need to evaluate if, for example, you want to differentiate some aspect of pay or benefits based on performance.

It should come as no surprise, then, that the “get rid of the ratings” movement has now encountered its first big backlash. In a study performed by CEB with 9,500 employees and 300 HR managers in global enterprises, it turns out that the scrapping of performance ratings often has not resulted in the expected outcomes. Most notably, the quality of the performance conversations that managers hold with employees often seems to drop, since managers have a harder time explaining what they are basing their judgements on and how, concretely, the employee should improve. This also tends to lead to lower employee engagement. What is perhaps even more conspicuous is that managers, while having significantly more time on their hands after the administrative beast of the APR is abolished, spend significantly less time on informal performance conversations with employees. The drop, according to CEB’s report, averages 10 hours per year.

What does this tell us? That once more a lot of companies have jumped on a bandwagon without thinking through the really difficult underlying issues. Some of those issues are:

  • What should take the place of APRs? Are we dumping formal differentiation of e.g. pay altogether (only likely to work in very “elite” organizations where there really are very few low performers), or do we need a new system to ensure fair and unbiased procedures?
  • How do we make sure that the additional time freed up by taking away the APR is used by managers to improve and enhance ongoing feedback?
  • Have we made sure that managers have the skills and tools necessary to provide effective ongoing coaching and feedback?
  • How do we handle the fact that managers will probably still be just as reluctant to handle performance deficits?

Instead, however, the focus so far has been exclusively on ratings per se. And of course, they were an easy target to blame for all the deeper-seated problems with performance management. I fear we will now see an equally shallow discussion as the pendulum swings again: “Getting rid of the ratings was a mistake!”

Let us remember that there are good arguments to question the APR: The administrative burden of the process, its doubtful validity, its rigidity, and – not least – the inefficiency of feedback that is given merely once a year. However, just throwing it out is not going to solve any of the hard problems of performance management. Starting by going head-to-head with the above-listed bullet points is a better way to go.



Performance Ratings, pt 4: New Ways Forward


Welcome back to our fourth and final episode of a long and rocky odyssey in the world of performance management. The short version of what we have established so far goes something like: Scrapping the annual performance review is quite uncontroversial in the face of research. We know that is not how to motivate and enable improved performance – that is rather done through close contact with the supervisor, useful feedback, challenging tasks, etc. However, we are still stuck with the question of how to evaluate performance. This happens to be an incredibly difficult task – but if we want to work systematically with quality improvement, and aim for fair and unbiased decisions in e.g. pay and promotions, we need to take it on. So today, after this long journey, I will finally try to get to the natural follow-up question: How are we supposed to do it?

I dare say that the utopia of completely objective performance judgments is out, or at least it should be out. It is simply vain, and even dangerous, to believe that we can ever find a way to rate other people’s performance that is completely ”accurate” in some universal meaning. Let us draw on Waters et al. (2016) and call that a fantasy of the industrial era. Still, if we are to make these judgements and let people’s careers and salaries depend on them, we absolutely cannot settle for completely subjective statements either. So where do we go from here? A number of developments have arisen in recent years, and many of them seem promising. The overview below draws both on current research in management and organizational psychology, and on accounts from foresighted practitioners that I meet as part of my research.

Multiple raters. This is what you usually call 360 ratings: using ratings not only from your supervisor but also from colleagues, subordinates, clients, etc. The logic is that it is unlikely that all of the stakeholders you interact with hold the same biases or political interests about you (Pulakos et al., 2015). If, for example, your supervisor is constantly underrating your performance, your colleagues, subordinates, and customers will probably at least give you a higher rating, which evens out the adverse impact. Evidence is scattered, however. It is not certain that multiple raters substantially improve rating accuracy – sometimes, more raters seem to add marginal effects at best (e.g. Howard, 2016). Still, the approach is increasingly being used to get a broader picture and to decrease the risk of bias. We should keep in mind, however, that stereotypes based on e.g. gender, background, or age could very well break through in 360 ratings as well.
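As a toy numeric sketch of that aggregation logic (the ratings below are made up, and real 360 instruments are of course far more elaborate), averaging one biased supervisor rating with several independent raters pulls the score back toward the broader view:

```python
from statistics import mean

# Hypothetical 1-5 ratings for one employee: a supervisor who
# consistently underrates, plus four other stakeholders
# (colleagues, a subordinate, a client).
supervisor_rating = 2
other_ratings = [4, 4, 3, 4]

# Supervisor-only score vs. the aggregate across all raters.
solo_score = supervisor_rating
aggregate_score = mean([supervisor_rating] + other_ratings)

print(solo_score)       # 2
print(aggregate_score)  # 3.4
```

The single biased rater still drags the aggregate down somewhat – which is the point made above: multiple raters dilute one person’s bias, but a shared stereotype held by all raters would survive the averaging untouched.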

Supervisor training. In research, opinions diverge as to the effectiveness of training managers in rating performance. Some state that these initiatives do not reliably improve rating quality (e.g. Adler et al., 2016). However, there is reason to believe that this is an effect of benchmarking against an unattainable goal (i.e., complete “accuracy”). As some scholars are now starting to argue, the goal can never be complete objectivity – instead, it should be the more humble state of intersubjectivity. That is: we should train raters to share a common way of thinking when making the ratings, even if that way of thinking does not reflect an objective “truth” about performance. If we can attain that, there is good hope that rating accuracy in a more realistic sense can improve (Gorman & Rentsch, 2009; Schleicher & Day, 1998).

Self-rating. An idea increasingly heard in the start-up world and among avant-garde HR people: why not let people rate their own (1) performance and (2) access to the help they needed? Personally, I believe it is a practice we will be seeing more of. However, it is probably not the best idea if you are tying ratings to pay.

Evaluating only on goal fulfillment. Common in the tech world. I.e., you throw out all performance criteria that demand subjective judgment, and only ask the question: Did you deliver on your measurable goals or not? This strategy could be one way of decreasing subjectivity, as long as the goals are pre-determined and clearly measurable. As noted by Hunt (2016), however, the strategy might open the door to attaining your goals in ways that do not align with company ethics or values. Further, it is badly suited for positions where the end goals are actually unclear or develop along the way.

Rating on behaviors. This is another development, sprung from behavioral psychology and organizational behavior management (OBM): Instead of trying to judge fluffy criteria like ”team player” or ”strategic thinking”, the rater only looks at observable behaviors such as ”takes time to help colleagues with problems” or ”identifies upcoming obstacles in planning meetings”. The advantage of this approach is that observable behaviors are less ambiguous than more general evaluative judgements (Lievens, 2001). The high level of specificity is also their problem, however – it can easily get too specific and rigid, which makes it difficult to account for the fact that different people can demonstrate the same quality in different kinds of behaviors.

Skipping the compensation link. Some companies have come to the conclusion that if you want accurate, truthful ratings that can be used for systematic organizational development, you should stop tying them to pay. Some are switching to standard pay levels for different types of positions, usually with reference to the ”startup philosophy” mentioned earlier in this series: We only hire really good people, so there is no reason why not all people in the same type of role would get the same pay. Controversial? To some, absolutely. Probably less so in Sweden than in the US. Regardless, research supports the notion that lower stakes lead to more honest performance ratings. In addition, an ongoing survey study (Ledford et al., 2016) has shown that organizations that drop the annual rating do not face increasing reward costs.

Hopefully, this brief review provided some hope for the future: New and promising solutions for better performance evaluations are coming forward quickly. From the ashes of the old annual performance review, chances are we will see a new paradigm arising – one that holds humbler hopes of completely objective ratings, but still uses research and creativity to find new high-quality ways of evaluating performance.



Performance Ratings, pt 3: Why the Really Daunting Task Is Still Looming


We continue our odyssey through the complex topic of performance management – a practice that is changing forcefully at the moment. The annual performance review, where employees’ performance is rated according to complex criteria and scales once a year, is increasingly being abandoned. To the joy of many, one should add: This practice is disliked by most stakeholders and has been accused of not adding any substantial value. So, that’s it? Ding dong, the witch is dead? Not really.

As noted last time, performance management serves at least three purposes in organizations: Motivating performance improvement; enabling analysis of performance patterns and trends; and serving as a decision basis for e.g. promotions, layoffs, talent nominations, and compensation. We also noted that the first function is actually the one where research can provide the most clear-cut answer with regards to ratings: In order to develop and motivate employees, ratings generally give little added value. When it comes to the second and third purposes of performance management, however, the issue becomes a lot more complicated.

Let’s say you are an executive, and want to identify those managers in your organization that continuously succeed at growing high-performing individuals. Or you want to investigate whether there is a pattern of where in the organization low-performers are located. Or, for that matter, you want to tie some aspect of compensation to performance. How are you going to do this? First thing, you need to be able to compare people to each other. That means you will need some kind of systematization and documentation of performance. And as soon as you start logging people’s performance in any standardized way, be it by a number or a qualitative judgment, you are indeed doing a performance evaluation. Bottom line: Even if you throw out the annual performance review and remove ratings from the coaching sessions with employees, you will arguably still need some kind of method for evaluating people’s performance.

In light of this, it should come as no surprise that the witch is not really dead. As pointed out by several scholars (Hunt, 2016; Ledford et al., 2016), most companies that claim they have gotten rid of performance ratings really only refer to the annual performance review. Most continue to rate their employees as part of e.g. their talent review, their compensation process, or their leadership audits. And then we are actually back at what is really the key issue here: Evaluating someone’s work performance is an incredibly difficult task. As noted by Adler et al. (2016), sports judges spend their entire careers specializing in this – in contrast to managers, who are supposed to handle performance judgments as a ”side task”. And yet, sports judges often disagree with each other…

To no surprise, substantial research has shown that supervisor ratings of employee performance are pretty far from being accurate or consistent (Levy & Williams, 2004; Murphy & Cleveland, 1995). A number of things besides performance tend to go into these ratings: Personal liking, politics, different kinds of cognitive biases, stereotypes, and chance. Unfortunately, there is no real reason to believe that the ratings would be any more accurate just because they are conducted in another format than the annual performance review.

What I am trying to say here is this: In the general excitement about scrapping the annual performance review, there is a risk that organizations speed through the issue of how to actually evaluate employees’ performance. Chances are, then, that we just move this daunting task from one place to another (e.g., from the annual review to the leadership audit), without having improved our methods. Instead, why not take this time of change as a perfect opportunity to actually try to improve performance evaluation in a broader sense? Like so often before, the most forward-thinking practitioners are already ahead of research on this matter. In the next blog post we will look closer at some of the concrete strategies that are now being used to try to make performance evaluations more fair, accurate, and fit to our knowledge-intensive, post-industrial era.



Performance Ratings, pt 2: The Paradox of Dual Purposes


As noted last time, one of the most conspicuous trends in HR right now is the change going on in performance management. An increasing number of companies are throwing out their annual performance reviews, and often also their complex criteria and matrices for rating employees’ performance. An old paradigm thus seems to be on its way out. Why is this happening, and what broader underlying challenges are driving this change? Today, I thought we would dwell on one of the inherent difficulties of performance management: Combining development with evaluation.

First of all, we need to acknowledge that performance management serves at least three broad purposes in organizations today. First; to enable and motivate employees to improve their performance. Second; to feed systematic analysis and business intelligence, e.g. finding patterns in performance differences. And third, to inform decisions about promotion, talent nominations, layoffs, and – not least – compensation and benefits. This multi-purpose nature of performance management is absolutely central if we want to understand what is happening now – and what an up-to-date performance management system might look like.

So far, the discussion has mostly focused on the first purpose. And if that was the only one, the issue of throwing out performance ratings would indeed be pretty clear-cut: There is little evidence to suggest that ratings should be a central part of motivating or enabling employees to improve their performance (DeNisi & Smith, 2014). On the contrary, there is quite a lot of research pointing to the demotivating effects of ratings (Aguinis et al., 2011; Culbertson et al., 2013), or at least indicating that ratings only motivate a minority of employees (usually – surprise! – the top-rated ones). This should come as no surprise. We have long known that motivation at work is fueled by frequent feedback, challenging goals combined with the right resources to achieve them, and close contact with a supportive supervisor. A label or number put on your performance once a year has scant chances of affecting your everyday behavior and engagement at work.

Furthermore, we know that evaluation and human growth tend to be like oil and water: They are virtually impossible to combine in the same process (a fact noted already by Meyer, Kay, and French, 1965). Performance management as it has been carried out to date thus carries an inherent paradox: In one and the same process, supervisors are supposed to help the employee develop and grow, while at the same time giving an evaluative judgment of his or her performance over the past year. There is also a lot of research showing that the use of numbers or categories in that process actually works to aggravate this paradox (e.g. Murphy, Cleveland, & Lim, 2007 in Langan-Fox, Cooper, & Klimoski (eds.)). By introducing a rating scale, you forcefully direct the employee’s attention to the rating itself and not to the qualitative feedback and discussion that go with it. No matter how much you emphasize that the process is forward-looking and developmental, the employee will tend to focus mainly on the rating.

Thus, you might say that many of the problems of performance management stem from its apparent Janus face: It includes both a developmental and a judgmental focus. One clear practical implication can be drawn out of this: If you want to enable performance improvement among employees, try removing any talk about ratings and formal judgments from that conversation. Coaching or development sessions between supervisor and employee are best held with little focus on evaluation. In other words, the first of the above purposes of performance management is often best fulfilled when separated from the second and third. From that perspective, the scrapping of the performance review is a promising development.

As evident above, we actually know quite a lot about how to motivate employees at work. We definitely know enough to say that performance ratings seldom serve that purpose. But – and this is an important but – when it comes to the second and third purpose of performance management, the issue of ratings becomes a lot more complicated. The reason is, they relate to one of the most difficult issues that organizational psychology has to offer: How do you fairly evaluate another person’s performance? This daunting task does not go away just by getting rid of the annual performance review – and we will dig into it in depth in the next blog post.



Performance Ratings, pt 1: A New Paradigm Under Way?


If there is one trend really exploding in the HR/Management space right now, it is the wave of companies getting rid of their annual performance reviews – i.e., the process where all employees’ performance is rated once a year on a standardized scale, according to standardized criteria, usually by their immediate supervisor. When software giant Adobe declared that they had thrown out their entire annual performance review in 2013, they seemed to open Pandora’s box: Since then, companies like Gap, Accenture, Deloitte, and even the rating pioneer GE have followed. HR representatives, managers, and employees have all testified to their satisfaction with finally being rid of the unpopular, time-consuming performance ratings. The issue is now sparking heated debates at both scientific conferences and industry meetings.

In one way, you could say that the change was long overdue. It is well known that performance ratings in general, and the annual performance review in particular, are highly unpopular among practitioners. 95 percent of American managers think performance ratings are too time-consuming and do not contribute enough value to the organization (CEB Corporate Leadership Council, 2002; 2012). A Deloitte study from 2014 showed that only eight percent of companies agree that their performance management process contributes substantially to their success. This is quite remarkable for a process that the average manager of a large American company spends 210 hours a year working on (Corporate Leadership Council, 2012). In addition, many of the practitioners I meet are emphasizing that traditional performance ratings no longer meet the demands of modern work life. It is a far too rigid process for today’s fast-paced, flexible, and fluid world of work.

To add to this, psychological research has long pointed to a number of problems with performance ratings. For one, there is very little research showing any consistent increases in motivation or work effort following performance ratings (e.g. Aguinis, Joo & Gottfredson, 2011; DeNisi & Smith, 2014). Numerous studies have also shown that the accuracy of the ratings is generally far from good (Levy & Williams, 2004; Murphy & Cleveland, 1995). Furthermore, anyone with rudimentary knowledge of behavioral psychology knows that feedback given once a year has a scant chance of directing or changing employees’ everyday actions.

So, to sum up: We have here a practice that is disliked by most if not all important stakeholders, that is afflicted by various problems in both accuracy and value creation, and that is a bad match with today’s work life. Shouldn’t it be a no-brainer to throw it out, then? Well, I think it is safe to say that the traditional performance management practice, with its annual review and complex rating forms, is a legacy of the industrial era and was bound to be revised sooner or later. However, the story does not end with getting rid of ratings. The really interesting question to ask is: Where do we go from here? Just because the annual performance review gets the boot, it does not mean that performance management as a whole has become any less important. Rather the opposite, actually.

Now we need to ask ourselves: What deeper challenges is our discontent with ratings a symptom of? Well, for one, we are back to the classical issue of how to fairly judge other people’s performance. We are also back to the ever-present issue of managers’ unwillingness to bring up performance problems. These issues will not resolve themselves just because we drop the annual performance review – but it could be a start. So now that things are really moving in the performance management space, it seems like a good time to revisit some of the broader and more fundamental issues that a modern, well-designed performance management system must be able to handle. We will spend the three following posts digging into just those issues.