Survey-style leadership assessments promised to standardize the way leaders are evaluated in organizations. Yet survey data can still reflect and reinforce common biases. To try to correct for bias in these assessments, the authors recommend three strategies when developing survey questions: 1) Have people rate an ideal leader before rating their actual leader; 2) Require raters to give specific, qualitative examples for each rated behavior; and 3) Create friction during the evaluation to slow people down and make them aware of potential bias.
Leadership questionnaires are ubiquitous in organizations today. Surveys are a common tool to measure leadership potential, help determine whether employees should be promoted or afforded a bonus, and understand the leadership culture of an organization.
The push toward survey-based assessments was driven by a general dissatisfaction with older, more anecdote-driven processes that carried a great deal of subjectivity. Surveys sent to leaders, to their employees, or sometimes administered in full 360-degree fashion promised more objectivity, assuming the data obtained accurately reflects reality.
But we know that survey data still reflects common biases. To name just a few: Men are generally evaluated more favorably than women. Taller men are judged better leaders than shorter ones. White people are evaluated more favorably than people of color. Conventionally attractive leaders and those with angular face structures are also evaluated more favorably. And leaders who share a rater's (political, religious, etc.) values are more easily forgiven their shortcomings. Something is fundamentally flawed in the assessment, and this list is only the tip of the iceberg.
If leadership surveys are not an accurate measure of leadership behavior, what do they measure? The answer is perceived leader effectiveness rather than actual leadership behaviors. It has taken decades for academics to understand the importance of this distinction when evaluating women in leadership. Organizations often operate with data that captures the degree to which someone is perceived as a leader, and less so whether that person actually leads well through their actions. This matters because leadership perceived as good may, over time, prove highly ineffective. For example, research has shown that narcissists are often perceived as good leaders but are not necessarily effective ones.
Practitioners are not alone in this misconception. In fact, leadership researchers around the world are rethinking their use of questionnaires, so much so that top-tier scientific journals devote whole sections to the topic. Scholarship increasingly agrees that we need to get better at capturing and interpreting actual leadership behaviors.
Naturally, the alternative assessment methods at researchers' disposal may not be realistic in the workplace: observation (in which leadership behaviors are carefully observed and coded), indirect measurement through unobtrusive data collection (e.g., email content or network analyses), and experiments (in which groups of participants are exposed to different leadership behaviors). Leadership surveys are likely here to stay to some extent, and we need to carefully consider how these surveys are built and interpreted to minimize bias as much as possible.
One way to reduce bias in surveys is to pay careful attention to the way the questions are asked, per classical test theory. This includes, for example, asking about specific behaviors rather than general evaluations like “my leader is fair” or “my leader is supportive.” Such broad items will mostly tell you whether the rater likes their leader and little else.
While such adjustments may help improve the accuracy of leadership surveys, the risk of bias remains. But you don’t have to be a statistician or researcher to create leadership surveys that reduce the risk of common biases. We recommend three strategies, each of which forces survey respondents to reflect on their answers, interrupting the implicit bias that might otherwise guide their responses.
1. Have people rate an ideal leader before rating their actual leader. When rating leaders, we can be influenced by our views of ideal leadership. For example, when self-rating, we are typically guided more by who we want to be (our ideal self) than by what we actually do (our actual self). A simple trick to combat this bias is to have leaders rate their ideal self (who would you like to be?) before rating their actual self (who are you at this moment?). Doing this dramatically increases the positive correlation between self-ratings and ratings by others, suggesting a more accurate evaluation. Not only will self-evaluations become more accurate, they can also offer ideas for how the person can develop as a leader.
Similarly, asking raters to first rate their ideal leadership qualities before rating their actual leader helps filter out some of the bias in their perception of what a leader should be. Beyond yielding more accurate survey data, reminding people of their perceptual filters may carry over from completing a survey to recognizing bias in other leadership tasks (e.g., selection, performance management, feedback sessions).
2. Require raters to give specific, qualitative examples for each rated behavior. For example, when asked whether a leader provides opportunities for development, employees should also be asked to provide specific examples. Having to come up with concrete examples will make people reflect more deeply on the actual quality and frequency of such behaviors. On the one hand, such prompted reflections will help employees to calibrate their numeric assessment scores for their leader, thus increasing their validity. On the other hand, requiring numeric scores to be backed up with concrete examples will also help the assessed leaders to better understand how their behavior impacts others, thus improving the utility of the feedback for their development.
Providing concrete examples of leader behavior may be difficult for employees, especially if the information is sensitive, as in the case of abusive behavior. To ensure employees are honest in their reporting, they should be given credible assurance that their responses will be anonymous and treated confidentially. Reports should, for example, go to HR for aggregation and moderation before being returned to the leader in question.
3. Create what Jennifer Eberhardt calls “friction” during the evaluation. Bias tends to appear when people make fast decisions, relying on heuristics rather than objective data. Slowing people down and making them aware of potential bias before they evaluate a leader can remind people to base their evaluation on specific behaviors.
Designing tweaks and nudges, such as dialogue boxes, warnings, or confirmation messages that appear before and after people evaluate a leader, should prompt respondents to examine their reasoning and reduce the incidence of bias. For example, before completing an assessment of one’s leader, raters could be reminded of how implicit bias can lead to unintentional discrimination with real consequences for the leaders being evaluated. Nextdoor, the neighborhood-based social network, used this process successfully; taking steps to slow people down when reporting suspicious activity reduced instances of racial profiling on the platform by 75%.
When these recommendations are put into practice, they can tease apart why someone is perceived to be a good leader from what actually makes them one. Of course, both perceived and actual effectiveness are necessary for good leadership. However, identifying this difference can give an organization the chance to promote someone who leads effectively but is not (yet) perceived as a leader. Doing so gives that individual a chance to lead and, in turn, gradually changes others’ perception within the organization of what it takes to be a good leader, syncing the image of an ideal leader with effective leadership behavior.
As long as humans are observers, there will be some error in the observation. Humans were not built to objectively sense information but to immediately make sense of it. The latter will continue to distort data even if we do our best to keep the impact low. We all have a responsibility to correct for that bias for the good of organizations and the people that work for them.