Adventures in Signal Processing and Open Science

Tag: open science

Peer Evaluation of Science

This is a proposal for a system for evaluation of the quality of scientific papers by open review of the papers through a platform inspired by StackExchange. I have reposted it here from The Self-Journal of Science where I hope my readers will go and comment on it: http://www.sjscience.org/article?id=401. The proposal is also intended as a contribution to #peerrevwk15 on Twitter.

I have chosen to publish this proposal on SJS since this is a platform that comes quite close to what I envision in this proposal.

Introduction

Researchers currently rely on traditional journals for publishing their research. Why is this, you might ask? Is it because it is particularly difficult to publish research results? Perhaps 300 years ago, but certainly not today, when anyone can publish anything on the Internet with very little trouble. Why, then, do we keep publishing with journals that charge outrageous amounts for their services in the form of APCs from authors or subscriptions from readers or their libraries? One of the real reasons, I believe, is prestige.

The purpose of publishing your work in a journal is not really to get your work published and read; it is to prove that your paper was good enough to be published in that particular journal. The more prestigious the journal, the better the paper, it seems. This roughly boils down to using the impact factor of the journal to evaluate the research of authors publishing in it (a bad idea, see for example Wrong Number: A closer look at Impact Factors). It is often mentioned in online discussions how researchers are typically evaluated by hiring committees or grant reviewers based on which journals they have published in. In Denmark (and Norway – possibly other countries?), universities are even funded based on which journals their researchers publish in.

I think the journal’s reputation (impact factor) is used in current practice because it is easy. It is a number that a grant reviewer or hiring committee member can easily look up and use to assess an author without having to read piles of papers on topics they may not be experts in. I support a much more qualitative approach based on the individual works of the individual researcher. So, to have any hope of replacing current practice, I think we need to offer a quantitative “short-cut” that can compete with the impact factor (and the H-index etc.), which say little about the actual quality of a researcher’s work. Sadly, a quantitative metric is likely what hiring committees and grant reviewers are going to be looking at. A (quantitative) “score” – or several such scores on different aspects of a paper – accompanying the (qualitative) review could provide such an evaluation metric. In the following I present some ideas for how such a metric could be calculated, along with some potential pitfalls we need to discuss how to handle.

I believe that a system to quantify various aspects of a paper’s quality as part of an open review process could help us turn to a practice of judging papers and their authors by the merits of the individual paper instead of by the journal in which they are published. I also believe that this can be designed to incentivise participation in such a system.

Research and researchers should be evaluated directly by the quality of the research instead of indirectly through the reputation of the journals they publish in. My hope is to base this evaluation on open peer review, i.e. the review comments are open for anyone to read along with the published paper. Even when a publisher (in the many possible incarnations of that word) chooses to use pre-publication peer review, I think it should be made open in the sense that the review comments are open for all to read after paper acceptance. In any case, I think it should be supplemented by post-publication peer review – open both in the sense that the reviews are open to read and in the sense that anyone can comment, although one might restrict reviewing to researchers who have published something themselves, as for example ScienceOpen does.

What do I mean by using peer review to replace journal reputation as a method of evaluation? This is where I envision calculating a “quality” or “reputation” metric as part of the review process. This metric would be established through a quality “score” (there could be multiple scores targeting different aspects of the paper) assigned by the reviewers/commenters, but endorsed (or not) by other reviewers through a two-layer scoring system inspired by the reputation metric from StackExchange. This would, in my opinion, constitute a metric that:

  1. specifically evaluates the individual paper (and possibly the individual researcher through a combined score of her/his papers),
  2. is more than a superficial number – the number only accompanies a qualitative (expert) review of the individual paper that others can read to help them assess the paper,
  3. is completely transparent – accompanying reviews/comments are open for all to read and the votes/scores and the algorithm calculating a paper’s metric is completely open.

I have mentioned that this system is inspired by StackExchange. Let me first briefly explain what StackExchange is and how their reputation metric works: StackExchange is a question & answer (Q&A) site where anyone can post questions in different categories and anyone can post answers to those questions. The whole system is governed by a reputation metric which seems to be the currency that makes this platform work impressively well. Each question and each answer on the platform can be voted up or down by other users. When a user gets one of his/her questions or answers voted up, the user’s reputation metric increases. The score resulting from the voting helps rank questions and answers so the best ones are seen at the top of the list.

The System

A similar system could be used to evaluate scientific papers on a platform designed for the purpose. As mentioned, my proposal is inspired by StackExchange, but the question-and-answer mechanism used there does not exactly fit the purpose here, so I propose the following two-layer system instead.

  • First layer: each paper can be reviewed openly by other users on the platform. When someone reviews a paper, along with submission of the review text, the reviewer is asked to score the paper on one or more aspects. This could be simply “quality”, whatever this means, or several aspects such as “clarity”, “novelty”, “correctness”. It is of course an important matter to determine these evaluation aspects and define what they should mean. This is however a different story and I focus on the metric system here.
  • Second layer: other users on the platform can of course read the paper as well as the reviews attached to it. These users can score the individual reviews. This means that some users, even if they do not have the time to write a detailed review themselves, can still evaluate the paper by expressing whether they agree or disagree with the existing reviews of the paper.
  • What values can a score take? We will get to that in a bit.

How are metrics calculated based on this two-layer system?

  • Each paper’s metric is calculated as a weighted average of the scores assigned by reviewers (first layer). The weights assigned to the individual reviews are calculated from the scores other users have assigned to the reviews (second layer). The weight could be calculated in different ways depending on which values scores can take. It could be an average of the votes. It could also be calculated as the sum of votes on each review, meaning that reviews with lots of votes would generally get higher weights than reviews with few votes.
  • Each author’s metric is calculated based on the scores of the author’s papers. This could be done in several ways: one is a simple average, but this would not take into account the number of papers an author has published. Maybe it should, so the sum of the scores of the author’s papers could be another option. Alternatively, it might be argued that each paper’s score in the author’s metric should be weighted by the “significance” of the paper, which could be based on the number of reviews and votes the paper has received.
  • Each reviewer’s metric is calculated based on the scores of her/his reviews in a similar way to the calculation of authors’ metrics. This should incentivise reviewers to write good reviews. Most users on the proposed platform will act as both reviewers and authors and will therefore have both a reviewer and an author metric.

Which Values Can Votes Have?

I propose to make the scores of both papers (first layer) and individual reviews (second layer) a ±1 vote. One could argue that this is a very coarse-grained scale, but consider the alternative of, for example, a 10-level scale. This could cause problems with different users interpreting the scale differently. Some users might hardly ever use the maximum score, while other users might give the maximum score to all papers they merely find worthy of publication. By relying on a simple binary score instead, an average over a (hopefully) high number of reviews and review endorsements/disapprovals would be less sensitive to individual interpretations of the score value than a many-level scale.
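
To make the two-layer calculation concrete, here is a minimal sketch in Python of how the metrics described above might be computed, assuming the ±1 votes proposed here and the “sum of votes” weighting option mentioned earlier. All names are hypothetical, and clipping negative review weights to zero is just one possible design choice for illustration, not part of the proposal as such.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Review:
    score: int  # first layer: +1 or -1 vote on the paper
    endorsements: List[int] = field(default_factory=list)  # second layer: +1/-1 votes on this review

    def weight(self) -> float:
        # One of the weighting options discussed above: the sum of votes the review
        # has received. Negative sums are clipped to zero here so that heavily
        # disapproved reviews simply stop counting (an illustrative choice).
        return max(0.0, float(sum(self.endorsements)))

def paper_metric(reviews: List[Review]) -> float:
    """Weighted average of first-layer scores, weighted by second-layer endorsement."""
    if not reviews:
        return 0.0
    total_weight = sum(r.weight() for r in reviews)
    if total_weight == 0.0:
        # No endorsed reviews yet: fall back to a plain average of the scores.
        return sum(r.score for r in reviews) / len(reviews)
    return sum(r.score * r.weight() for r in reviews) / total_weight

def author_metric(paper_metrics: List[float]) -> float:
    # The "sum of scores" option, which takes the number of papers into account.
    return sum(paper_metrics)

# Hypothetical example: two positive reviews and one well-endorsed critical review.
reviews = [
    Review(score=+1, endorsements=[+1, +1, +1]),
    Review(score=+1, endorsements=[+1]),
    Review(score=-1, endorsements=[+1, +1, -1]),
]
print(paper_metric(reviews))  # 0.6 – pulled down by the endorsed critical review
```

A reviewer metric could be computed analogously from the endorsement sums of a user’s reviews, and the simple average or significance-weighted variants of the author metric would be equally easy to swap in.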

Conclusion

As mentioned, I hope the proposed model of evaluating scientific publications by accompanying qualitative reviews with a quantitative score would provide a useful metric that, although still quantitative, could prove a more accurate measure of the quality of individual publications for those who need to rely on such a measure. This proposal should not be considered a scientific article itself, but I hope it can be a useful contribution to a debate on how to make peer review both more open and more broadly useful to readers and evaluators of scientific publications.

As noted in the introduction, I have published this proposal on SJS because that platform comes quite close to what I envision here. I hope that readers will take the opportunity to comment on the proposal there and help start a discussion about it.

It’s all about replication

A new journal appeared recently in the scientific publishing landscape: ReScience, announced at the recent EuroSciPy 2015 conference. The journal has been founded by Nicolas Rougier and Konrad Hinsen. This journal is remarkable in several ways, so remarkable in fact that I could not resist accepting their offer to become associate editor for the journal.

So how does this journal stand out from the crowd? First of all, it is about as open as it gets: the entire publishing process is completely transparent, from first submission through review to final publication. Second, the journal platform is based entirely on GitHub, the code repository home to a plethora of open source projects; this is part of what enables the journal to be so open about the entire publishing process. Third, the journal does not actually publish original research – there are plenty of journals for that already. Instead, ReScience focuses entirely on replications of already published computational science.

As numerous people have pointed out before me, when dealing with papers based on computational science it is not really enough to review the paper in the classical sense to ensure that the results can be trusted (this is not only a problem in computational science, but it is the particular focus of ReScience). Results need to be replicated to validate them, and this is what ReScience addresses.

Many of us probably know the situation: you are working on a new paper of your own and need to replicate the results of some previous paper that you wish to compare your results against. Apart from that comparison, this is essentially lost work once your paper is published, and others looking at the original paper may not be aware that anyone has replicated its results. Now you can publish the replication of those previous results as well and get credit for it. At the same time, you benefit the authors of the original results by helping validate their research.

The process of submitting your work to ReScience is described on their website along with the review process and the roles of editors and reviewers. So if you have replicated someone else’s computational work, go ahead and publish it in ReScience. If it is in the signal processing area I will be happy to take your submission through the publishing process.

Open review in the wild

Few journals and conferences so far seem to use open review. We mostly see open review practised as post-publication commenting on for example pubpeer.com where it so far seems to be mainly about spotting errors in already published papers.

I would personally like to see more open review employed by journals and conferences in the publishing of scientific papers to increase transparency in the process.
Today I have found such an example thanks to Igor Carron’s post The papers for ICLR 2015 are now open for discussion! The machine learning conference International Conference on Learning Representations uses an open review model where reviews are published, anyone can comment on the papers, and anyone can ask to become a designated reviewer: http://www.iclr.cc/doku.php?id=pubmodel.

Even though independent sites exist for post-publication commenting and review, I think it is especially exciting to see it being actively encouraged and fully integrated into the paper submission and acceptance process by the conference organisers. In addition to providing transparency in the process, I hope it also stimulates more discussion when it is actively encouraged as we see here.

Episciences.org update

I mentioned the Episciences project the other day in Scientific journals as an overlay. In the meantime I have tried to contact the people behind this project and The Open Journal, so far without any luck.

I went and checked the Episciences website yesterday, and it actually seems that they are moving forward. They have changed the page design completely, and there is now a button in the upper right corner to create an account and log in. I took the liberty of doing so to have a look around. I was able to create an account, but that is just about it so far. The site still seems quite “beta” – I was not able to save changes to my profile, and I cannot yet find anywhere to submit papers. Still, it is nice to see some progress on the platform, and I will be keeping an eager eye on it to find out when they go operational.

Scientific journals as an overlay

There is an update on this post in Episciences.org update

In many of my posts since I started this blog, I have been writing about open peer review. Another topic related to open science that interests me is open access (to scientific papers). Part of open access in practice is about authors posting their papers, perhaps submitted to traditional journals, to preprint servers such as arXiv. This is used a lot, particularly in physics and mathematics.

How to attract reviewers for open / post-publication review?

In traditional journals with closed pre-publication peer review, reviewers are typically invited by the editor. Editors can for example draw on previous authors from the journal or (I guess) their professional network in general. With the recent appearance of several open peer review platforms – for example PubPeer, Publons etc. (more here) – there will be a need to attract reviewers to such platforms. Sufficiently flawed papers seem to attract enough attention to trigger reviews, but it is my impression that papers that are generally OK do not get a lot of post-publication review. This is perhaps not such a big deal for papers that have already been published in some journal with closed pre-publication review – the major downside is that the rest of us do not get to see the review comments. But if you are going to base an entire journal on post-publication peer review, you will want to ensure at least a few reviews of each published paper as a sort of stamp of approval.

The Winnower – the new journal based entirely on open post-publication peer review that I have previously written about here – is about to launch. First and foremost, they will of course need to attract some papers to publish. Their price of $100, in my opinion very fair compared to other open access journals, should help, and the fact that they span a very broad range of scientific disciplines should also give them a lot of potential authors. They also need to attract reviewers to their papers. The platform is open to review by anyone, but actively engaging reviewers seems like a good idea in order to ensure a minimum number of reviews of each paper, with at least some experts on the topic among them. But how do you do this? As a new journal, there are no previous authors to draw on. It will probably be difficult to get in touch with sufficiently many qualified reviewers across all of the journal’s disciplines, and on top of that, reviewers may be more reluctant to accept since the reviews will be open with reviewers’ identities disclosed.

What can be done to attract sufficiently many reviewers? Should the journal gamble on being the cool new kid in class that everyone wants to be friends with, or simply try to buy friends? I have been discussing this with their founder Josh Nicholson. One possibility is to pay reviewers a small amount for each review they complete. If we were talking about one of the traditional publishers, who I think in many cases exploit authors and reviewers shamelessly to stuff their own pockets, I think it would only be reasonable to actually start paying reviewers. In the case of The Winnower, it may be different. The Winnower is a new journal trying to get authors and reviewers on board. With its very idealistic approach and pricing, I do not think people are likely to suspect that it is just trying to make money – being a “predatory publisher”. On the other hand, paying reviewers might make it look like they are trying to “buy friends”. With the journal’s profile, potential reviewers might mainly be people who like to think of themselves as a bit idealistic and revolutionary too, and that might just not go well with being paid for reviews. Josh told me an anecdote he had been told recently:

To illustrate this point, imagine walking down the street and an able bodied young man asks for your help loading a large box into a truck.  If he were to politely ask for help, most people would be highly likely to assist him in his request.  However, if he were to politely ask and also mention that for your time, he will pay you $0.25 most people will actually turn him down.  Despite being totally irrational, given that under the same circumstance they would do it without the promise of a quarter, the mention of money evokes the passerby to calculate what their time is worth to them.  To many, the $0.25 isn’t going to be worth the effort.  I think this may pose a similar issue.

Then again, paying reviewers could also send the message that the journal takes its reviewers and the work they do very seriously. Ideally, I think it might work better with an incentive structure and “review of reviews” / scoring of reviews like the StackExchange network, for example. I am just afraid that something like that will take considerable “critical mass” to be effective. Another option Josh mentioned could be to let reviewers earn free publications with the journal by completing a number of reviews. This sounds better to me: you do not risk offending potential reviewers with a “price on their head”, but there is still something to gain for reviewers.

I think this is a very interesting question and probably one that a lot of people have much more qualified answers for than I. Let me know what you think?!

There’s a new journal in town…

[Image: Jean-François Millet: Le Vanneur. Source: Wikipedia]

I have been writing a few posts lately about open peer review in scientific publishing (Open Review of Scientific Literature, Openness And Anonymity in Peer Review, Third-party review platforms). As I have mentioned, quite a few platforms experimenting with open post-publication peer review have been appearing around us recently.

Now it seems there is an actual journal on its way, embracing open review and open access from the very beginning to an extent I have not seen yet. It sounds like a very brave and exciting initiative. According to their own description it is going to be a journal for all disciplines of science. You can read more about the ideas behind the journal on their blog: The Winnower. It was also featured recently here.

Curious about this new journal as I am, I have been talking to its founder, Josh Nicholson, online on a few occasions lately to find out more about the journal. I have decided to publish this Q&A correspondence here in case others are interested.

Q&A with Josh Nicholson

2013/10/04 – on Google+:
As I understand, you will publish manuscripts immediately and publish the accompanying reviews of them when ready. Will these manuscripts be open to review by anyone, will you find reviewers, or a combination thereof?
In principle, it would be “most open” to allow reviews by anyone, but specifically when a paper is not “popular” enough to attract reviewers spontaneously, I guess it might also be necessary to actively engage reviewers? If so, do you consider somehow paying (monetarily or otherwise) reviewers?

The papers will indeed be open to review by anyone. We want it to be completely transparent and open. We also wish to be completely clear that papers without reviews linked to them have not been reviewed and should be viewed accordingly. We would like to engage reviewers with different incentives in the future and will explore the best ways to do that as we move forward. Our system will in essence be quite similar to “pre-prints” where authors are allowed to solicit reviews and anyone is allowed to review, but it will all occur in the open. We will charge $100 per publication so that we can sustain the site without relying on grants. We would love to hear more of your feedback should you have any!

I have been considering for some time how an open peer review system can attract reviewers and possibly encourage them to identify themselves to “earn” reputation.
The StackExchange network, among others, seems to be quite popular, and it seems to me that one of the things driving users to contribute is the reputation system, where a reputation score becomes the “currency” of the site. Users can vote other users’ questions and answers up or down. This lets other users quickly assess which questions and answers are “good”. Votes earn the poster of the question or answer reputation points, and this encourages posters to make an effort to write good questions and answers.
It seems to me that such a system could be used more or less directly on a peer review platform. It would both encourage users to write reviews and let other users assess and score reviews (review of reviews).

We agree with you 100%.  We would even like to offer the “best” reviewers, as judged by the community, free publishing rights.  Ultimately we would also like to make the reviews citeable.  Some of these features will not be present in the initial launch but will be expanded upon and rolled out over time.  We hope you will consider submitting in 2014 and reviewing!
We have a few other select features that will be present in the initial build to attract reviews.  Some of these will be discussed in future blog posts.

2013/10/06 – in blog comment:
Have you at The Winnower considered if you could make use of third-party reviewer platforms for your publishing?

We have briefly communicated with LIBRE and are indeed open to reviews from third-party platforms. We are happy to work with anyone towards the goal of making reviewing more transparent.

2013/10/11 – on Twitter:
Will you have any sort of editorial endorsement of papers you publish or will the open reviews be the only “stamp of approval”?

Open reviews will serve as ‘stamp of approval.’ We hope papers will accumulate many reviews.
Papers can be organized by content as well as various metrics, including most reviewed etc.

2013/10/11 – in blog comment:
I am very excited about your new journal – that’s why I keep asking all sorts of questions about it here and there 😉
In terms of archiving papers, what will you do to ensure that the papers you have published do not disappear in the event that the Winnower should be out of business? Do you have any mutual archival agreements with other journals or institutional repositories?

We are happy you are excited about The Winnower. Please keep the questions and comments coming!
We are currently looking at what is the best way to preserve papers published in The Winnower should The Winnower not survive. We are looking to participate in CLOCKSS but have not made any agreements as of yet.

Another one: under which terms are you going to license the published manuscripts? For example, I have heard authors express concern about third-party commercial reuse of papers without consent under CC-BY. I am not sure yet what to think about that.

Content published with The Winnower will be licensed under a CC BY license. Commercial reuse of work, as we understand it, must cite the original work. We want to open up the exchange of ideas and information.

2013/10/18 – in blog comment:
Here goes another one of my questions: will your platform employ versioning of manuscripts?
I imagine that authors of a paper may want to revise their paper in response to relevant review comments. Just like it often happens in traditional pre-publication review – here we just get the whole story out in the open. If so, I think there should be a mechanism in place to keep track of different versions of the paper – all of which should remain open to readers. As a consequence of this, there will also be a need to keep track of which version of a paper specific comments relate to.
Rating: will it be possible to rate/score papers in addition to reviewing/commenting? While a simple score may seem a crude measure I think there is a possibility that it could help readers sift more efficiently through the posted papers. In a publishing model like yours, it is going to be harder for, e.g. funding agencies or hiring committees to assess an author’s work, because they cannot simply judge it by where it was published (that may be the wrong way to do it anyway, but that is not what I am aiming to discuss here). A simple score might make the transition to your proposed publishing process “easier” for some stakeholders. I am a bit reluctant about it myself, but in order not to make it too superficial, maybe scoring/rating should only be possible after having provided a proper review comment. This should make it difficult for readers to score the paper without making a proper effort in assessing the paper.

We are happy to have your questions! There will indeed be an option to revise manuscripts after an MS has collected reviews. We are, however, a bit uneasy about hosting multiple versions of the paper, as we think it may become quite confusing. We are happy to explore this option in the future, but currently we believe that the comments along with the responses should be sufficient to inform the reader of what was changed.

Do you agree?

Our reviews will be structured meaning that there will be prompts which allow different aspects of the papers to be rated.

So you plan to allow revision, but previous versions are “lost”?
I see the point about the possible confusion, but what if a commenter points out some flaw about some details in the paper, the author acknowledges it and revises the paper? Now, future readers can no longer see in the paper what that comment was about. Well, they can see the comment, but they can no longer see for themselves in the actual paper what the flawed part originally said.
Could the platform perhaps always display the most recent version of the paper but show links to previous versions somewhere along with the metadata, abstract etc. that I assume you will be displaying on a paper’s “landing page”? The actual PDF of an outdated version of a paper could have a prominently displayed text or “stamp” saying that this is an outdated version kept for the record and that a newer version is available?
Perhaps links to older versions of a paper would only be visible in a specific comment that refers to an earlier version of the paper?

These are good points. We will discuss some of these and see what approach will work best and what we are capable of. If it is not too much confusion and not too much to implement this into the build this could be quite useful as you point out. Thanks!

By the way, I think it would be interesting if it were possible not only to comment on papers, but to annotate the actual text in-line. I think it would be great if readers could mark up parts of text and write comments directly next to them. Would it seem too draft-like?
I am not sure how this could be done technically, but it seems like the technology that http://hypothes.is/ is brewing could enable something like this.

We agree that inline comments are quite interesting, and we have this as a possible tool to build into the platform in the future. We however have limited funding for the initial build and want to focus on features that are critical first and complementary second. But this is a great idea and definitely something we will be exploring in the future.

Good point. It is best to get the essential features working well first.

Academia.edu acquires Plasmyd to bring peer review into the 21st century

I noticed this news piece today. I have previously written about open peer review platforms. Most of the recent initiatives in open peer review are entirely new platforms that provide the mechanics to get open peer review going, but in my opinion a challenge for them is to attract a critical mass of users.

Academia.edu’s move seems to be a bit the other way around: they already have an existing science-related platform with quite a few users, and now they are adding peer review functionality. It is not entirely clear to me whether this means open review, but the mechanism they describe could help address the challenge of how to attract sufficient numbers of qualified reviewers to such a platform. The article does hint at the possibility that Academia.edu might try to “build a revenue model around their modern approach to peer review”. I am not a fan of such a model, as this is one of the things that are wrong with the traditional journal publishing model. Nevertheless, it is going to be interesting to see how it goes.

More on anonymity in peer review

[Image: “This study lacked an appropriate control group” – two stars]

I came across this post on anonymity in peer review by Jon Brock. I have previously tried to discuss the pros and cons of anonymity here. I think Jon’s post makes quite a good argument in favour of identifying reviewers.

In relation to this, I actually wanted to sign a review I did recently for a journal, as I wanted to personally stand by my assessment of the manuscript. I checked with the editor first whether this was OK with him. He explicitly requested me NOT to do so, because it was against their review policy…

Science Publishing Laboratory

Browsing on Twitter, I just stumbled on this blog by Alexander Grossmann. It looks like he has a ton of interesting reading on open scientific publishing. I found it through Giuseppe Gangarossa on Twitter.
