
Incentivising refereeing

To shorten the refereeing lag and improve report quality in economics, the natural solution is to incentivise academics to do a better and quicker job. Economists respond to incentives, but currently no salary or promotion consequences arise from good or bad refereeing, as far as I know. In http://sanderheinsalu.com/ajaveeb/?p=503, I wrote about incentives for authors not to submit careless papers (in the hope that a refereeing mistake gets them accepted). One such incentive is requiring refereeing for a journal as a precondition for submitting to that journal. If a submitted paper gets n referee reports on average, then before submitting, an author should referee n papers in a timely manner, which should balance the supply and demand of refereeing. This forced refereeing may lead to quick but sloppy reports.

An additional incentive is needed for quality. Rahman’s 2012 paper on the question “who watches the watchmen” suggests an answer. The editor can insert a deliberate mistake in every paper and see whether a referee finds it. If not, then that referee’s work is likely of low quality. The mistake is corrected before publication. Alternatively, the editor can ask each author to insert a mistake and tell the editor about it. The author is not penalised for this mistake and is asked to correct it if the paper is accepted. The referees are again judged on whether they find the mistake.

The above scheme derives refereeing incentives from publication incentives, requiring minimal change to the current system. However, it is somewhat indirect. A more straightforward incentive for refereeing is to reward it directly, either paying for it or basing promotion decisions partly on it. The speed of refereeing is already slightly monetarily incentivised in the American Economic Journal: Microeconomics. If the referee sends the report before a deadline, then she or he is paid 100 dollars. If a good referee report takes about 10 hours, then the amount is certainly not enough to motivate quality provision, but it is a step in one of the right directions. A simple improvement on the binary “before vs after deadline” reward scheme is to reduce the payment gradually as the delay of the referee report increases.
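As a rough illustration, a gradual scheme could look like the following sketch. The 100-dollar base matches the figure mentioned above, but the per-week reduction is a made-up number, not any journal’s actual policy:

```python
def referee_payment(days_late: int, base: float = 100.0,
                    penalty_per_week: float = 20.0) -> float:
    """Pay the full base amount for an on-time report and reduce the payment
    gradually with delay, instead of a binary before/after-deadline cutoff.
    The base amount and penalty rate are illustrative assumptions."""
    if days_late <= 0:
        return base
    weeks_late = days_late / 7
    return max(0.0, base - penalty_per_week * weeks_late)

# An on-time report earns 100, two weeks late earns 60, half a year late earns 0.
print(referee_payment(0), referee_payment(14), referee_payment(180))
```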

If refereeing is incentivised, then lower-ranked journals need larger incentives to compensate for the fact that refereeing for these has less inherent prestige and the paper one has to read is of lower quality. On the other hand, lower-ranked journals are less able to motivate refereeing with the threat of not accepting submissions from those who have not refereed. There are more lower-ranked journals, and it is less important to get accepted by any particular one of them. Some of the less prestigious journals would find no referees under the system proposed above. This is good, because it removes the “peer reviewed” status of some junk journals and may force them to close. If authors know that quality journals require refereeing before submission, then they draw the obvious conclusion about a journal that does not require it.

Publication delay provides incentives

From submitting a paper to a journal until getting the first referee reports takes about six months in economics. It is very rare to get accepted on the first try. Most papers are rejected, and an immediate acceptance implies having submitted to too weak a journal. Waiting for the referee reports on the revision and the second revision takes another six or more months. This seems unnecessary (reading a paper does not take six months) and inefficient (it delays the dissemination of research results), but the delay is used for incentives.
Delay discourages frivolous submissions. It forces authors to evaluate their own work with some honesty. If the referee reports were immediate, then everyone would start at the top journal and work their way down through every venue of publication until getting accepted. This would create a large refereeing and editing burden. Delay is a cost for the authors, because simultaneous submissions to multiple journals are not allowed. Trying for high-ranking journals is a risk, because the author may not have anything to show at the next evaluation. This reduces submissions to top journals. It may be optimal to start at the middle of the ranking where the chances of acceptance are higher.
A similar incentive to submit to the correct quality level of journal can be created by imposing a submission fee, forbidding further submissions for a period of time after a rejection, or requiring the author to write referee reports on others’ papers. A submission fee should be distinguished from publication fees, which are used at fake journals. The submission fee is paid whether or not the paper is accepted, so it does not give the journal an incentive to lower its standards and publish more papers.
The submission fee would impose different costs on authors in different financial circumstances. Some have research funds to pay the fee, some do not. Similarly, delay has a larger effect on people whose evaluation is coming sooner. Being banned from a journal for a given amount of time after a rejection is worse for a researcher working in a single field. Interdisciplinary folk have a wider variety of venues to publish in. Writing referee reports as a price of having one’s work evaluated may lead to sloppy reviewing. Any mechanism to induce self-selection has a cost. Yet self-selection is needed.

When to permit new construction

In places with zoning laws (restrictions on what kind of buildings are allowed at a given address), there is often debate on whether to relax the restrictions. This would allow new construction or enlargement of existing buildings. The renters are generally in favour of more buildings, because the increased supply of housing lowers prices at a given demand. The landlords oppose construction, because it reduces the rents they can charge. These economic arguments are already part of the debate.

Much lobbying effort (which costs time and money and may create corruption) could be avoided if the market price of housing (rents or house transaction prices) were used directly in the regulations. New construction is allowed if the average rent is above a cutoff and denied if it is below. Zoning laws may be a bad thing overall, but if they are to remain, they could be made more resistant to manipulation by basing restrictions on objective indicators, not lobbying.

The good incentives created by this require interest groups to put their money where their mouth is: if landlords want to prevent new construction, they should lower the rents they charge. Only with average rents low would building be blocked. Similarly, if tenants want more housing, they should pay the landlords more. They may of course decide to pool their money and found a property development firm instead.

Property developers want to get construction permits for themselves, but deny them to other property developers (their competition). The motivation to get a permit by fair means or foul is stronger when property prices are higher. In this case, the above reliance on the market price to regulate permits does not create good incentives. If new housing is allowed when prices are high, developers are motivated to form a cartel and raise the price. Permits reward high prices. A good price-based regulation of property development would require the opposite of the rental market mechanism – a low selling price of new housing should lead to more construction permits.
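A sketch of what such price-based permit rules could look like, with the cutoffs and permit numbers as purely illustrative assumptions. The rental rule allows building when rents are high, while the developer-side rule goes the other way and grants more permits when new homes sell cheaply:

```python
def permits_from_rents(average_rent: float, rent_cutoff: float) -> bool:
    """Rental-market rule: allow new construction only when the observed
    average rent is above the cutoff."""
    return average_rent > rent_cutoff

def permits_from_sale_prices(new_home_price: float, price_cutoff: float,
                             base_permits: int, extra_permits: int) -> int:
    """Developer-side rule: a LOW selling price of new housing leads to MORE
    permits, the opposite of the rental-market rule, so that a developer
    cartel cannot earn permits by raising prices."""
    return base_permits + (extra_permits if new_home_price < price_cutoff else 0)

# Illustrative numbers only: rents above 1500 allow building; new homes
# selling below 300k trigger 50 extra permits on top of a base of 100.
print(permits_from_rents(1800, 1500))                        # True -> building allowed
print(permits_from_sale_prices(250_000, 300_000, 100, 50))   # 150 permits
```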

Of rankings

Many universities advertise themselves as being among the top n in the world (or region, country etc). Many more than n in fact, for any n small enough (1000, 500, 100 etc). How can this be? There are many different rankings of universities; each university picks the ranking in which it places highest and advertises that. So if there are 10 rankings, each with a different university as number one, then there are at least 10 universities claiming to be number one.
There are many reasonable ways to rank a scientist, a journal, a university or a department. For a scientist, one can count all research articles published, or only those in the last 10 or 5 years, or only those in the top 100 journals in the field, or any of the previous weighted by some measure of quality, etc. Quality can be the number of citations, or citations in the last 5 years or from papers in the top 50 journals or quality-weighted citations (for some measure of quality)…
What are the characteristics of a good ranking? That partly depends on what one cares about. If a fresh PhD student is looking for an advisor, it is good to have an influential person who can pull strings to get the student a job. Influence is positively related to total citations or quality-weighted publications, and older publications may be better known than newer ones. If a department is looking to hire a professor, they would like someone who is active in research, not resting on past glory. So the department looks at recent publications, not total lifetime ones. Or at least divides the number of publications by the number of years the author has been a researcher.
Partly, the characteristics of a good ranking are objective. It is the quality-weighted publications that matter, not just total publications. Similarly for citations. Coauthored publications should matter less than solo-authored ones. The ranking should not be manipulable by splitting one’s paper into two or more, or by merging several papers into one. It should not increase if two authors with solo papers agree to add each other as coauthors, or if two coauthors with two joint papers agree to make each paper single-authored, one under each of their names. Therefore credit for coauthored papers should be split between the authors so that the shares sum to one.
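A minimal sketch of such a splitting rule, assuming equal shares per coauthor and an externally given quality weight per paper (both are my assumptions, not the only possible choices):

```python
from collections import defaultdict

def researcher_scores(papers):
    """papers: list of (list_of_authors, quality_weight) pairs.
    Each paper contributes its quality weight split equally among its
    authors, so the shares of every paper sum to one."""
    scores = defaultdict(float)
    for authors, quality in papers:
        share = quality / len(authors)
        for author in authors:
            scores[author] += share
    return dict(scores)

# Two coauthored papers of quality 1 by A and B...
merged = researcher_scores([(["A", "B"], 1.0), (["A", "B"], 1.0)])
# ...give the same totals as the manipulated version where the same work
# is reported as one solo paper under each name.
split = researcher_scores([(["A"], 1.0), (["B"], 1.0)])
print(merged, split)  # both give A: 1.0 and B: 1.0
```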
How to measure the quality of a publication? One never knows the true impact that a discovery will have over the infinite future. Only noisy signals about this can be observed. There currently is no better measure than the opinion of other scientists, but how to transform vague opinions into numerical rankings?  The process seems to start with peer review.
Peer review is not a zero-one thing that a journal either has or not. There is a continuum of quality levels, from the top journals with very stringent requirements, to middle ones where the referees only partly read or understand the paper, to fake journals that claim to have peer review but really don’t. There have been plenty of experiments where someone has submitted a clearly wrong or joke article to (ostensibly peer-reviewed) journals and got it accepted. Even top journals are not perfect, as evidenced by corrigenda published by authors and critical comments on published articles by other researchers. Even fake journals are unlikely to accept a paper where every word is “blah” – it would make their lack of review obvious and reduce revenue from other authors.
The rankings (of journals, researchers, universities) I have seen distinguish peer-reviewed journals from other publications in a zero-one way, not acknowledging the shades of grey between lack of review and competent review.
How to measure the quality of peer review in a journal? One could look at the ranking of the researchers who are the reviewers and editors, but then how to rank the researchers? One could look at the quality-weighted citations per year to papers in the journal, but then what is the quality of a citation?
Explicit measurement of the quality of peer review is possible: each author submitting a paper is asked to deliberately introduce a hard-to-notice mistake into the paper and report that mistake to an independent database, and the referees are asked to report all mistakes they find to the same database. The author can dispute claimed mistakes, and an editor has to make the final judgement. It is then easy to compare the quality of review across journals and reviewers by the fraction of introduced mistakes they find. This is the who-watches-the-watchmen situation studied in Rahman (2012) “But who will monitor the monitor?” (http://www.aeaweb.org/articles.php?doi=10.1257/aer.102.6.2767).
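A sketch of how such a database could be scored, assuming each review record simply pairs the planted mistakes with the mistakes the referee reported (the record format and the example data are assumptions):

```python
def detection_rate(planted: set, reported: set) -> float:
    """Fraction of deliberately introduced mistakes the referee found,
    ignoring any extra (genuine or disputed) mistakes reported."""
    if not planted:
        return float("nan")  # nothing was planted, nothing to measure
    return len(planted & reported) / len(planted)

def journal_quality(reviews) -> float:
    """Average detection rate over a journal's reviews.
    reviews: list of (planted_mistakes, reported_mistakes) pairs."""
    rates = [detection_rate(p, r) for p, r in reviews if p]
    return sum(rates) / len(rates) if rates else float("nan")

# Made-up example: the referee found the planted mistake in one paper
# but missed it in the other, so the journal scores 0.5.
print(journal_quality([({"eq3_sign"}, {"eq3_sign", "fig2_label"}),
                       ({"lemma2_bound"}, set())]))
```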
One could disregard the journal altogether and focus on quality-weighted citations of the papers, but there is useful information contained in the reputation of a journal. The question is how to measure that reputation explicitly.
If there is no objective measure of paper quality (does the chemical process described in it work, does the algorithm give a correct solution, does the material have the claimed properties etc), then a ranking of papers must be based on people’s opinions. This is like voting. Each alternative to be voted on is a ranking of papers, or there may simply be voting for the best paper. With three or more papers, Arrow’s impossibility theorem applies – it is not possible to establish an overall ranking of papers (one that is Pareto efficient, independent of irrelevant alternatives and non-dictatorial) from people’s individual rankings.
Theorists have weakened independence of irrelevant alternatives (the requirement that the ranking of A and B does not depend on preferences about other options). If preferences are cardinal (utility values have meaning beyond their ordering), then some reformulations of Arrow’s criteria can be simultaneously satisfied and a cardinal ordering of papers may be derivable from individual ratings.
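For instance, one simple aggregation that uses the cardinal information is to average each paper’s ratings; a minimal sketch with made-up papers and scores:

```python
def cardinal_ranking(ratings):
    """ratings: dict mapping paper -> list of cardinal scores given by voters.
    Averaging the scores yields a cardinal ordering of papers, using the
    information in utility levels that a purely ordinal setting throws away."""
    averages = {paper: sum(scores) / len(scores) for paper, scores in ratings.items()}
    return sorted(averages, key=averages.get, reverse=True)

print(cardinal_ranking({"paper_A": [9, 7, 8],
                        "paper_B": [6, 9, 5],
                        "paper_C": [4, 5, 6]}))  # ['paper_A', 'paper_B', 'paper_C']
```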
If the weight of a person’s vote on the ranking of papers or researchers depends on the rank this person or their papers get, then the preference aggregation becomes a fixed point problem even with truthful (nonstrategic) voting. (This is the website relevance ranking problem, addressed by Google’s PageRank and similar algorithms.) There may be multiple fixed points, i.e. multiple different rankings that weight the votes of individuals by their rank and derive their rank from everyone’s votes.
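A sketch of that fixed-point computation in the PageRank spirit, assuming each voter spreads their ballot over researchers and vote weights equal current scores; the damping factor and the made-up ballots are my assumptions, not a description of PageRank’s actual parameters:

```python
def rank_by_weighted_votes(votes, damping=0.85, iterations=100):
    """votes[voter][target] is the share of the voter's ballot given to target
    (each ballot sums to one).  Every ballot is weighted by the voter's own
    current score, and scores are iterated towards a fixed point."""
    names = list(votes)
    n = len(names)
    scores = {name: 1.0 / n for name in names}   # uniform starting scores
    for _ in range(iterations):
        scores = {
            target: (1 - damping) / n
            + damping * sum(scores[voter] * ballot.get(target, 0.0)
                            for voter, ballot in votes.items())
            for target in names
        }
    return scores

# Made-up ballots: A splits its vote between B and C, while B and C back A.
print(rank_by_weighted_votes({
    "A": {"B": 0.8, "C": 0.2},
    "B": {"A": 1.0},
    "C": {"A": 1.0},
}))  # A ends up with the highest score, then B, then C
```

The damping term pulls every score slightly towards uniform and makes the fixed point of this particular iteration unique; without it, as in the dictator example below, several fixed points can coexist.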
For example, A, B and C are voting on their ranking. Whoever is top-ranked by the vote becomes the dictator and determines everyone’s ranking. A, B and C most prefer the rankings ABC, BCA and CAB respectively, each putting themselves on top. Each of these rankings is a fixed point: whoever is made the dictator ranks themselves first, so the dictator’s own vote confirms them as the dictator.
A fixed point may not exist: with voters A and B, if A thinks B should be the dictator and B thinks A should, and the dictator’s vote determines who becomes the dictator, then a contradiction results no matter whether A or B is the dictator.
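Both examples can be checked mechanically: a candidate dictator is a fixed point exactly when their own preferred ranking keeps them on top. A minimal sketch (the string encoding of rankings is just for illustration):

```python
def dictator_fixed_points(preferred_ranking):
    """preferred_ranking[x] is the ranking voter x would impose as dictator.
    x is a fixed point if making x the dictator leaves x top-ranked, so x
    would again be chosen as dictator."""
    return [x for x, ranking in preferred_ranking.items() if ranking[0] == x]

# Three voters, each ranking themselves first: three fixed points.
print(dictator_fixed_points({"A": "ABC", "B": "BCA", "C": "CAB"}))  # ['A', 'B', 'C']

# Two voters who each want the other to be dictator: no fixed point exists.
print(dictator_fixed_points({"A": "BA", "B": "AB"}))                # []
```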
If voting is strategic and more votes for you gives a greater weight to your vote, then the situation is the one studied in Acemoglu, Egorov, Sonin (2012) “Dynamics and stability of constitutions, coalitions, and clubs.” (http://www.aeaweb.org/articles.php?f=s&doi=10.1257/aer.102.4.1446). Again, multiple fixed points are possible, depending on the starting state.
Suppose now that the weights of votes are fixed in advance (they are not a fixed point of the voting process). An objective signal that ranks some alternatives, but not all, does not change the problem coming from Arrow’s impossibility theorem. An objective signal that gives some noisy information about the rank or quality of each alternative can be used to prevent strategic manipulation in voting, but does not change the outcome under truthful voting much (outcome is probably continuous in the amount of noise).