Research Evaluation’s Gender Problem – and Some Suggestions for Fixing It #STMchallenges
Stacy Konkiel is a Research Metrics Consultant at Altmetric, a data science company that helps researchers discover the attention their work receives online. Since 2008, she has worked at the intersection of Open Science, research impact metrics, and academic library services with teams at Impactstory, Indiana University & PLOS.
Research evaluation in the sciences has a gender problem: it’s mostly based on indicators and practices that reflect an unconscious bias against women.
This bias, when embedded into research evaluation systems, can harm female researchers as they apply for grant funding, jobs, and promotion, and as they navigate other professional advancement opportunities.
Unfortunately, this problem is spreading to other disciplines, too. As the humanities and social sciences start to favor the use of quantitative metrics like citation counts for research evaluation, the problems of bias will follow.
Though more research is being done today than ever before on these problems, solutions have been harder to come by. The literature suggests that education, support, and the use of context-aware metrics may help mitigate the gender gap in both the short and long term.
The most used productivity indicator penalizes women researchers
Around the world, it’s common to use “number of papers published per year” as an indicator of a researcher’s productivity, especially in the sciences. Common sense and a lot of well-written editorials explain why that practice is misguided: it incentivizes writing over the act of doing research; it has led to the practice of “salami slicing” (whereby authors split up one paper’s worth of insight over the course of several publications); and it is an invalid measure for the ultimate desired outcome: doing a lot of high-quality research.
Judging productivity simply by counting a researcher’s publications per year is flawed for another reason: it penalizes women researchers.
Research has shown that female researchers tend to publish less than their male counterparts in several scientific fields, especially early in their careers. Even in fields where publishing patterns are seemingly equal, men hold more of the prestigious first and last author positions (one study found that male “first authorships” outnumbered female ones nearly 2:1). The latter point is significant because some disciplinary evaluation practices specifically take “first/last author” publications into account when determining productivity.
Perhaps the most sobering statement from the literature can be found in a recent study: “Fewer than 6% of countries represented in the Web of Science come close to achieving gender parity in terms of papers published.”
One theory for this discrepancy – that women are less productive because they bear the brunt of domestic responsibilities, including childcare – has been mostly discounted. Another theory – that traditional gender roles within marriage demand more domestic labor of women, leaving them less time to publish – has also been ruled out.
What seems a more likely culprit is gendered differences in how research is done, along with a healthy dose of institutionalized sexism. Compared with their male counterparts, women tend to collaborate less (and especially less often internationally), use different collaboration strategies, have been found to have less access to research funding, and are on the receiving end of harsher criticism from hiring panels and grant reviewers, simply for being female! All of these variables have a bearing on a woman’s ability to “be productive” by publishing, making publication counts an inherently biased metric.
Citation counts aren’t sexist, but citation practices can be
Citation counts are another popular means by which researchers are often evaluated, especially in the sciences. These indicators, much like publication counts, disadvantage female researchers. That’s because researchers cite their female colleagues less often than their male colleagues.
One of the largest ever studies on citations and gender uncovered the following fact:
“We discovered that when a woman was in any of these [prominent authorship, i.e. sole authorship, first-authorship and last-authorship] roles, a paper attracted fewer citations than in cases in which a man was in one of these roles…The gender disparity holds for national and international collaborations.”
Other studies have found that, no matter the authorship position of a female researcher, she is less likely to be cited than her male counterparts.
In fields where a researcher is judged upon the size of their h-index, this disparity has obvious consequences for women, who are more likely to have lower h-indices, due to both citation bias and publication frequency differences.
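To make the mechanics concrete, here is a minimal Python sketch of the standard h-index definition: the largest number h such that a researcher has h papers with at least h citations each. The two citation lists are hypothetical, invented purely for illustration and not drawn from any study. They show how a shorter publication list combined with fewer citations per paper pulls the number down.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts per paper (illustrative only).
researcher_a = [25, 18, 12, 9, 7, 6, 3]  # more papers, more citations per paper
researcher_b = [14, 9, 5, 3, 2]          # fewer papers, fewer citations per paper

print("Researcher A h-index:", h_index(researcher_a))
print("Researcher B h-index:", h_index(researcher_b))
```

Because the h-index depends on both how many papers someone has and how often each one is cited, a gap in either input is enough to lower the result.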
Yet citations themselves aren’t inherently biased – they’re simply an indicator of one’s influence in a discipline. Citations merely reflect our own unconscious biases. As one editorial puts it:
“[T]o say that existing measures are not gender-biased is not the same as saying that there may not be structural gender-based biases in the larger environment in which scientists send and receive messages about research findings which, in turn, affects interpretation of the meaning of these measures.”
Even with the increased attention being paid to these citation discrepancies, many institutions still use raw citation counts and h-indices in their evaluation practices.
Clearly, the indicators we use to understand academic impact are faulty. What about indicators for understanding non-academic impact, like patents and altmetrics?
Innovation indicators penalize women, too
The number of patents a researcher files – understood to be an indicator of one’s impact upon technology and the economy – is yet another research evaluation metric that doesn’t adequately represent women’s contributions.
Studies have repeatedly shown that women researchers receive patents at lower rates than their male counterparts. One study found that though women represent about a third of the workforce in disciplines where patenting is common, they only account for around 11% of all patent holders.
Much like the “productivity puzzle”, the lack of gender equity in patenting is likely due to a number of social phenomena. It has been suggested that female researchers’ less diverse social networks, the lack of support for underrepresented faculty from university technology transfer offices, one’s motherhood status, and the designation of patenting as an “optional” career advancement activity may all play a role in this disparity.
Though an improvement, altmetrics are still biased
Altmetrics improve upon citations in many ways: we can use these discussions of research on the social web to understand non-academic impacts; they’re quicker to accumulate; and they apply to research other than journal articles (datasets, software, presentations, and other outputs, as well).
However, altmetrics research has shown that articles penned by female authors tend to receive less attention on social media, blogs and in the news than those of male authors, making raw altmetrics counts – much like citations – a problematic means by which to evaluate researchers. It is worth noting, though, that the gender gap for altmetrics is less severe than the gender gap for citations, meaning that they are still an improvement upon citations.
So, what can we do to improve research evaluation?
Many traditionalists would be the first to posit that we need to do away with using impact and productivity metrics altogether and simply use peer review if we wish to rid ourselves of biased forms of research evaluation. But peer review isn’t necessarily the answer.
Peer review has been shown time and time again to be subject to gender bias. Even Sweden, that Nordic bastion of gender equality, has had a problem with gender-based bias affecting peer review in their national funding council!
Instead, there are a number of other tactics that academia might implement to combat the gender bias that’s manifested in various research evaluation metrics. These upstream changes could potentially alter citation practices, offer more support for female researchers, and encourage other improvements to gender equity that would render some impact metrics more reliable.
A solid first step would be to raise awareness among academics worldwide of implicit bias with respect to gender. Implicit bias is the unconscious assumption that people of certain genders, races, etc. act in certain ways or, sometimes, that they are less capable than others. Certainly, it’s implicit bias (along with a healthy dose of the Matthew Effect) rather than overt sexism that subtly nudges researchers away from citing or sharing women-authored articles.
Luckily, being made aware of one’s own implicit bias has been reported to be the first step towards correcting it. Knowing that you don’t cite female authors as often as you do male authors may make you more likely to seek out and cite high-quality research from women.
Support programs for women in academia are another important step that can be taken. To encourage female researchers to collaborate – and thus, author – more often, bibliometrics researchers have suggested that “programmes fostering international collaboration for female researchers might help to level the playing field.” Similarly, it has been suggested that technology transfer offices have a role in eliminating the patent gender gap by offering women-oriented programs and support.
Until these upstream changes take effect, what can be done to ensure that women are being evaluated fairly using impact metrics?
To be frank, research evaluation programs should incorporate gender-aware indicators to correct for current disparities. One team of researchers has suggested the use of gender-sensitive metrics to replace the h-index. This theme was echoed by a prominent altmetrics researcher, as well, who once suggested to me that Altmetric should provide “gender percentiles” based on lead authors on our details pages, alongside other “Score in Context” percentiles. (It’s something that certainly got support at Altmetric, though we have no firm plans to add that feature soon.) In theory, gender-aware percentiles could be built into citation databases like Scopus and Web of Science, too.
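As a rough illustration of what a “gender percentile” might look like, here is a small Python sketch. It is not how Altmetric, Scopus, or Web of Science calculate anything. The percentile definition (the share of reference scores at or below a given score), the function names, and the attention scores are all assumptions made up for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def percentile_rank(score, reference_scores):
    """Percentage of reference scores that are less than or equal to score."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    ordered = sorted(reference_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


def scores_by_group(records):
    """Group (lead_author_gender, score) pairs into a list of scores per gender."""
    groups = defaultdict(list)
    for gender, score in records:
        groups[gender].append(score)
    return dict(groups)


# Hypothetical attention scores, invented purely for illustration.
records = [
    ("female", 2), ("female", 5), ("female", 12), ("female", 30),
    ("male", 10), ("male", 20), ("male", 40), ("male", 80),
]
groups = scores_by_group(records)
all_scores = [score for _, score in records]

article_score = 12  # a hypothetical article with a female lead author
overall = percentile_rank(article_score, all_scores)
within_group = percentile_rank(article_score, groups["female"])

print(f"Percentile among all articles: {overall:.1f}")
print(f"Percentile among female-led articles: {within_group:.1f}")
```

The point of the comparison is context: the same raw score can look middling against all articles yet sit well above the median among articles whose lead authors face the same attention gap.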
Do you have suggestions for how we might address the structural inequities in research evaluation? Leave them in the comments below or share them with me on Twitter (@skonkiel).