Tag Archives: game theory

Dilution effect explained by signalling

Signalling confidence in one’s arguments explains the dilution effect in marketing and persuasion. The dilution effect is that the audience averages the strength of a persuader’s arguments instead of adding the strengths. More arguments in favour of a position should intuitively increase the confidence in the correctness of this position, but empirically, adding weak arguments reduces people’s belief, which is why drug advertisements on US late-night TV list mild side effects in addition to serious ones. The target audience of these ads worries less about side effects when the ad mentions more slight problems with the drug, although additional side effects, whether weak or strong, should make the drug worse.

A persuader who believes her first argument to be strong enough to convince everyone does not waste valuable time adding other arguments. Listeners evaluate arguments partly by the confidence they believe the speaker has in these claims. This is rational Bayesian updating, because a speaker’s conviction in the correctness of what she says is positively correlated with the actual validity of the claims.

A countervailing effect is that a speaker with many arguments has spent significant time studying the issue, so knows more precisely what the correct action is. If the listeners believe the bias of the persuader to be small or against the action that the arguments favour, then the audience should rationally believe a better-informed speaker more.

An effect in the same direction as dilution is that a speaker with many arguments in favour of a choice strongly prefers the listeners to choose it, i.e. is more biased. Then the listeners should respond less to the persuader’s effort. In the limit where the speaker’s only goal is for the audience to comply, regardless of the time cost of persuasion, the listeners should ignore the speaker, because a constant signal carries no information.


Start with the standard model of signalling by information provision and then add countersignalling.

The listeners choose either to do what the persuader wants or not. The persuader receives a benefit B if the listeners comply, otherwise receives zero.

The persuader always presents her first argument; otherwise she reveals that she has no arguments, which ends the game with the listeners not doing what the persuader wants. The persuader chooses whether to spend time at cost c, with 0 < c < B, to present her second argument, which may be strong or weak. The persuader knows the strength of the second argument, but the listeners only have the common prior belief that the probability of a strong second argument is p0. If the second argument is strong, then the persuader is confident, otherwise not.

If the persuader does not present the second argument, then the listeners receive an exogenous private signal in {1,0} about the persuader’s confidence, e.g. via her subconscious body language. The probabilities of the signals are Pr(1|confident) = Pr(0|not confident) = q > 1/2. If the persuader presents the second argument, then the listeners learn the confidence with certainty and can ignore any signals about it. Denote by p1 the updated probability that the audience puts on the second argument being strong.

If the speaker presents a strong second argument, then p1 = 1; if the speaker presents a weak argument, then p1 = 0. If the speaker presents no second argument, then after signal 1 the audience updates their belief to p1(1) = p0*q/(p0*q + (1-p0)*(1-q)) > p0, and after signal 0 to p1(0) = p0*(1-q)/(p0*(1-q) + (1-p0)*q) < p0.

The listeners prefer to comply (take action a=1) when the second argument of the persuader is strong, otherwise prefer not to do what the persuader wants (action a=0). At the prior belief p0, the listeners prefer not to comply; assume they comply after signal 1 but not after signal 0. Therefore a persuader with a strong second argument chooses max{B*1 - c, q*B*1 + (1-q)*B*0} and presents the argument iff (1-q)*B > c. A persuader with a weak argument chooses max{B*0 - c, (1-q)*B*1 + q*B*0}, so never presents the argument. If a confident persuader chooses not to present the argument, then the listeners use the exogenous signal, otherwise they use the choice of presentation to infer the type of the persuader.
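The equilibrium reasoning above can be sketched numerically. The following is a minimal Python illustration; the parameter values are purely assumed, chosen only to satisfy the stated inequalities 0 < c < B and q > 1/2.

```python
# Sketch of the signalling model above; parameter values are illustrative
# assumptions satisfying 0 < c < B and q > 1/2.
B = 1.0    # persuader's benefit if the listeners comply
c = 0.1    # time cost of presenting the second argument
p0 = 0.4   # prior probability that the second argument is strong
q = 0.8    # accuracy of the exogenous confidence signal

# Posterior beliefs after the exogenous signal (no argument presented).
p1_after_1 = p0 * q / (p0 * q + (1 - p0) * (1 - q))        # > p0
p1_after_0 = p0 * (1 - q) / (p0 * (1 - q) + (1 - p0) * q)  # < p0

# The listeners comply after signal 1 but not after signal 0.
# Expected payoffs of the confident (strong-argument) type:
strong_present = B - c        # argument revealed strong, listeners comply
strong_withhold = q * B       # comply only when the signal happens to be 1
# Expected payoffs of the non-confident (weak-argument) type:
weak_present = -c             # argument revealed weak, listeners refuse
weak_withhold = (1 - q) * B   # comply when the noisy signal is mistakenly 1

strong_presents = strong_present > strong_withhold  # iff (1-q)*B > c
weak_presents = weak_present > weak_withhold        # never, since c > 0
print(strong_presents, weak_presents)  # True False
```

With these numbers, (1-q)*B = 0.2 > 0.1 = c, so the types separate: only the confident type presents the second argument.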

One extension is that presenting the argument still leaves some doubt about its strength.

Another extension has many argument strength levels, so each type of persuader sometimes presents the second argument, sometimes not.

In this standard model, if the second argument is presented, then always by the confident type. As is intuitive, the second argument increases the belief of the listeners that the persuader is right. Adding countersignalling partly reverses the intuition: a very confident type of persuader knows that the first argument already reveals her great confidence, so the listeners do what the very confident persuader wants. The very confident type never presents the second argument, so if the confident type chooses to present it, then the extra argument reduces the belief of the audience in the correctness of the persuader. However, compared to the least confident type, who also never presents the second argument, the confident type’s second argument increases the belief of the listeners.

On the optimality of self-quarantine

Is self-quarantine early in an epidemic optimal, either individually or for society?

Individual incentives are easier to analyse, so let’s start with these. Conditional on catching a disease, other things equal, later is better. The reasons are discounting and the advances in treatment. A delay of many years may increase the severity conditional on infection (old age weakens immunity), but such long time intervals are typically not relevant in an epidemic.

Conditional on falling ill within the next year (during which discounting and advances in treatment are negligible), it is better to catch the disease when few others are infected, so hospitals have spare capacity. This suggests either significantly before or long after the peak of the epidemic. Self-quarantine, if tight enough, may postpone one’s infection past the peak.

Another individually optimal choice is to get infected early (also called vaccination with live unattenuated virus), although not if immunity increases very little or even decreases. The latter means that one infection raises the probability of another with the same disease, like for malaria, HIV and herpes, which hide out in the organism and recur. Cancer displays similar comebacks. For viral respiratory diseases, as far as I know, immunity increases after infection, but not to 100%. The optimality of self-quarantine vs trying to be infected early then depends on the degree of immunity generated, the quality of the quarantine, whether the disease will be eradicated soon after the epidemic, and other details of the situation.

Individual optimality also depends on what the rest of the population is doing. If their self-quarantine is close to perfect, then an individual’s risk of catching the disease is very low, so no reason to suffer the disutility of isolation. If others quarantine themselves moderately, so the disease will be eradicated soon, but currently is quite infectious, then self-isolation is individually optimal. If others do almost nothing, and the disease spreads easily and does not generate much immunity, then an individual will either have to self-quarantine indefinitely or will catch it. Seasonal flu and the common cold (various rhinoviruses and adenoviruses) are reasonable examples. For these, self-quarantine is individually suboptimal.

Social welfare considerations seem to weigh in favour of self-quarantine, because a sick person infects others, which speeds up the epidemic. One exception to the optimality of self-quarantine comes from economies of scale in treatment when prevalence is not so high as to overwhelm the health system. If the epidemic is fading, but the disease increases immunity and is likely to become endemic, with low prevalence, then it may be better from a social standpoint to catch the disease when treatment is widely available, medical personnel have just had plenty of experience with this illness, and not many other people remain susceptible. This is rare.

Herd immunity is another reason why self-quarantine is socially suboptimal for some diseases. The logic is the same as for vaccination. If catching chickenpox as a child is a mild problem and prevents contracting and spreading it at an older age when it is more severe, then sending children to a school with a chickenpox epidemic is a smart idea.

Reducing the duration of quarantine for vulnerable populations is another reason why being infected sooner rather than later may be socially optimal. Suppose a disease is dangerous for some groups, but mild or even undetectable for most of the population, spreads widely and makes people resistant enough that herd immunity leads to eradication. During the epidemic, the vulnerable have to be isolated, which is unpleasant for them. The faster the non-vulnerable people get their herd immunity and eradicate the infection, the shorter the quarantine required for the vulnerable.

For most epidemics, but not all, self-quarantine is probably socially optimal.

Pre-refereeing increases the inequality of research output

Why do top researchers in economics publish almost exclusively in the top 5 journals? Random idea generation and mistakes in the course of their implementation should imply significant variance in the quality of finished research projects, even for the best scientists. So top people should produce more papers at every quality level, not just at the top.

Nepotism is not necessary to explain why those at top universities find it easier to publish in top journals. Researchers at the best departments have frequent access to editors and referees of top journals (their colleagues), so can select ideas that the editors and referees like and further tailor the project to the tastes of these gatekeepers during writing. Researchers without such access to editors and referees choose their projects “blindly” and develop the ideas in directions that only match gatekeeper tastes by chance. This results in much “wasted work” if the goal is to publish well (which may or may not be correlated with the social welfare from the research).

In addition to selecting and tailoring projects, those with access can also better select journals, because they know the preferences of the editorial board. So for any given project, networking with the gatekeepers allows choosing a journal whose editors are likely to like this project. This reduces the number of rejections before eventual acceptance, allowing publications to accumulate more quickly and saving the labour of some rounds of revision (at journals that reject after a revise-and-resubmit, for example).

A similar rich-get-richer positive feedback operates in business, especially for firms that sell to other firms (B2B). Top businesspeople get access to decisionmakers at other organisations, so can learn what the market desires, thus can select and tailor products to the wants of potential customers. Better selection and targeting avoids wasting product development costs. The products may or may not increase social welfare.

Information about other business leaders’ preferences also helps target the marketing of any given product to those predisposed to like the product. Thus successful businesspeople (who have access to influential decisionmakers) have a more popular selection of products with lower development and marketing costs.

On the seller side, firms would not want their competitors to know what the buyers desire, but the buyer side has a clear incentive to inform all sellers, not just those with access. Empirically, few buyers publish on their websites any information about their desired products. One reason may be that information is costly to provide, e.g. requests for product characteristics reveal business secrets about the buyer. However, disclosure costs would also prevent revealing information via networking. Another reason buyers do not publicly announce their desired products may be that the buyers are also sellers of other products, so they trade information for information with their suppliers, who are also their customers. The industry or economy as a whole would benefit from more information-sharing (saving the cost of unwanted products), so some trading friction must prevent this mutually beneficial exchange.

One friction is an agency conflict between managers and shareholders. If managers are evaluated based on relative performance, then the managers of some firms may collude to only share useful information with each other, not with those outside their circle. The firms managed by the circle would benefit from wider sharing of their product needs, because outside companies would enter the competition to supply them, reducing their costs. However, those outside firms would get extra profit, making their managers look good, thus lowering the relative standing of the managers in the circle.

Popularity inequality and multiple equilibria

Suppose losing a friend is more costly for a person with few contacts than with many. Then a person with many friends has a lower cost of treating people badly, e.g. acting as if friends are dispensable and interchangeable. The lower cost means that unpleasant acts can signal popularity. Suppose that people value connections with popular others more than unpopular. This creates a benefit from costly, thus credible, signalling of popularity – such signals attract new acquaintances. Having a larger network in turn reduces the cost of signalling popularity by treating friends badly.

Suppose people on average value a popular friend more than the disutility from being treated badly by that person (so the bad treatment is not too bad, more of a minor annoyance). Then a feedback loop arises where bad treatment of others attracts more connections than it loses. The popular get even more popular, reducing their cost of signalling popularity, which allows attracting more connections. Those with few contacts do not want to imitate the stars of the network by also acting unpleasantly, because their expected cost is larger. For example, there is uncertainty about the disutility a friend gets from being treated badly or about how much the friend values the connection, so treating her or him badly destroys the friendship with positive probability. An unpopular person suffers a large cost from losing even one friend.

Under the assumptions above, a popular person can rely on the Law of Large Numbers to increase her or his popularity in expectation by treating others badly. A person with few friends does not want to take the risk of losing even them if they turn out to be sensitive to nastiness.
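The asymmetry between the popular and the unpopular can be illustrated with a small Monte Carlo sketch. Everything numerical here is an assumption: the loss and gain probabilities, the log (risk-averse) utility over the number of friends, and the large penalty standing in for the catastrophe of ending up with zero friends.

```python
import math
import random

random.seed(0)

LOSS_P = 0.25     # assumed chance that a badly treated friend leaves
GAIN_RATE = 0.35  # assumed chance per existing friend of attracting a new one

def utility(n_friends):
    # Concave (risk-averse) utility over popularity; zero friends is
    # treated as catastrophic via an assumed large penalty.
    return math.log(n_friends) if n_friends > 0 else -6.0

def expected_utility_of_nastiness(n_friends, trials=20_000):
    """Monte Carlo estimate of expected utility after treating all friends badly."""
    total = 0.0
    for _ in range(trials):
        kept = sum(random.random() > LOSS_P for _ in range(n_friends))
        gained = sum(random.random() < GAIN_RATE for _ in range(n_friends))
        total += utility(kept + gained)
    return total / trials

# Nastiness pays for the popular (n=100) but not for the unpopular (n=2),
# even though it raises the expected number of friends in both cases.
nasty_pays = {n: expected_utility_of_nastiness(n) > utility(n) for n in (2, 100)}
print(nasty_pays)
```

For the person with 100 friends, the Law of Large Numbers makes the outcome close to its mean, which exceeds the status quo; for the person with 2 friends, the non-trivial chance of losing everyone outweighs the expected gain under risk aversion.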

Multiple equilibria may exist in the whole society: one in which everyone has many contacts and is nasty to them, and one in which people have few friends and act nice. Under the assumption that people value a popular friend more than the disutility from being treated badly, the equilibrium with many contacts and bad behaviour actually gives greater utility to everyone. This counterintuitive conclusion changes if popularity is relative, not a function of the absolute number of friends: total relative popularity is then constant in the population, in which case the bad-treatment equilibrium is worse by the disutility of the bad treatment.

In order for there to be something to signal, it cannot be common knowledge that everyone is equally popular. Signalling with reasonable beliefs requires unequal popularity. Inequality reduces welfare if people are risk averse (in this case over their popularity). Risk aversion further reduces average utility in the popular-and-nasty equilibrium compared to the pooling equilibrium where everyone has few friends and does not signal (acts nice).

In general, if one of the benefits of signalling is a reduction in the cost of signalling, then the amount of signalling and inequality increases. My paper “Dynamic noisy signaling” (2018) studies this in the context of education signalling in Section V.B “Human capital accumulation”.

The smartest professors need not admit the smartest students

The smartest professors are likely the best at targeting admission offers to students who are the most useful for them. Other things equal, the intelligence of a student is beneficial, but there may be tradeoffs. The overall usefulness may be maximised by prioritising obedience (manipulability) over intelligence or hard work. It is an empirical question what the real admissions criteria are. Data on pre-admissions personality test results (which the admissions committee may or may not have) would allow measuring whether the admission probability increases in obedience. Measuring such effects for non-top universities is complicated by the strategic incentive to admit students who are reasonably likely to accept, i.e. unlikely to get a much better offer elsewhere. So the middle- and bottom-ranked universities might not offer a place to the highest-scoring students for reasons independent of the obedience-intelligence tradeoff.

Similarly, a firm does not necessarily hire the brightest and individually most productive workers, but rather those who the firm expects to contribute the most to the firm’s bottom line. Working well with colleagues, following orders and procedures may in some cases be the most important characteristics. A genius who is a maverick may disrupt other workers in the organisation too much, reducing overall productivity.

Privacy reduces cooperation, may be countered by free speech

Cooperation relies on reputation. For example, fraud in online markets is deterred by the threat of bad reviews, which reduce future trading with the defector. Data protection, specifically the “right to be forgotten”, allows those with a bad reputation to erase their records from the market provider’s database and create new accounts with a clean slate. Bayesian participants in the market then rationally attach a bad reputation to any new account (“guilty until proven innocent”). If new entrants are penalised, then entry and competition decrease.

One way to counter this abuse of data protection laws to escape the consequences of one’s past misdeeds is to use free speech laws. Allow market participants to comment on or rate others, protecting such comments as a civil liberty. If other traders can identify a bad actor, for example using his or her government-issued ID, then any future account by the same individual can be penalised by attaching the previous bad comments from the start.

Of course, comments could be abused to destroy competitors’ reputations, so leaving a bad comment should have a cost. For example, the comments are numerical ratings and the average rating given by a person is subtracted from all ratings given by that person. Dividing by the standard deviation is helpful for making the ratings of those with extreme opinions comparable to the scores given by moderates. Normalising by the mean and standard deviation makes ratings relative, so pulling down someone’s reputation pushes up those of others.
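The mean-and-standard-deviation adjustment described above is a standard z-score normalisation per rater. A minimal sketch (the data and function name are illustrative, not a production rating system):

```python
import statistics

def normalise_ratings(ratings_by_rater):
    """Subtract each rater's mean rating and divide by their standard
    deviation, making harsh and generous raters comparable."""
    adjusted = {}
    for rater, ratings in ratings_by_rater.items():
        mean = statistics.mean(ratings.values())
        stdev = statistics.pstdev(ratings.values())
        if stdev == 0:  # a rater who gives everyone the same score
            adjusted[rater] = {target: 0.0 for target in ratings}
        else:
            adjusted[rater] = {target: (r - mean) / stdev
                               for target, r in ratings.items()}
    return adjusted

# A harsh rater and a generous rater become comparable after adjustment.
raw = {
    "harsh":    {"alice": 1, "bob": 2, "carol": 3},
    "generous": {"alice": 4, "bob": 5, "carol": 6},
}
adjusted = normalise_ratings(raw)
print(adjusted["harsh"] == adjusted["generous"])  # True
```

Note how the adjustment makes ratings relative, as in the text: a rater who pulls someone’s score down necessarily pushes the rest of their ratings up relative to their own mean.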

However, if a single entity can control multiple accounts (create fake profiles or use company accounts), then he or she can exchange positive ratings between his or her own profiles and rate others badly. Without being able to distinguish new accounts from fake profiles, any rating system has to either penalise entrants or allow sock-puppet accounts to operate unchecked. Again, official ID requirements may deter multiple account creation, but privacy laws impede this deterrence. There is always the following trilemma: either some form of un-erasable web activity history is kept, or entrants are punished, or fake accounts go unpunished.

M-diagram of politics

Suppose a politician claims that X is best for society. Quiz:

1. Should we infer that X is best for society?

2. Should we infer that the politician believes that X is best for society?

3. Should we infer that X is best for the politician?

4. Should we infer that X is best for the politician among policies that can be ‘sold’ as best for society?

5. Should we infer that the politician believes that X is best for the politician?

This quiz illustrates the general principle in game theory that players best-respond to their perceptions, not reality. Sometimes the perceptions may coincide with reality. Equilibrium concepts like Nash equilibrium assume that on average, players have correct beliefs.

The following diagram illustrates the reasoning of the politician claiming X is best for society. In case the diagram does not load, here is its description: the top row has ‘Official goal’ and ‘Real goal’; the bottom row has ‘Best way to the official goal’, ‘Best way to the real goal that looks like a reasonable way to the official goal’ and ‘Best way to the real goal’. Arrows point in an M-shaped pattern from the bottom-row items to the top-row items. The arrow from ‘Best way to the real goal that looks like a reasonable way to the official goal’ to ‘Official goal’ represents the constraint on the claims of the politician.

The correct answer to the quiz is 5.

This post is loosely translated from the original Estonian one https://www.sanderheinsalu.com/ajaveeb/?p=140

Economic and political cycles interlinked

Suppose the government’s policy determines the state of the economy with a lag that equals one term of the government. Also assume that voters re-elect the incumbent in a good economy, but choose the challenger in a bad economy. This voting pattern is empirically realistic and may be caused by voters not understanding the lag between the policy and the economy. Suppose there are two political parties: the good and the bad. The policy the good party enacts when in power puts the economy in a good state during the next term of government. The bad party’s policy creates a recession in the next term.

If the economy starts out doing well and the good party is initially in power, then the good party remains in power forever, because during each of its terms in government, it makes the economy do well the next term, so voters re-elect it the next term.

If the economy starts out in a recession with the good party in power, then the second government is the bad party. The economy does well during the second government’s term due to the policy of the good party in the first term. Then voters re-elect the bad party, but the economy does badly in the third term due to the bad party’s previous policy. The fourth government is then again the good party, with the economy in a recession. This situation is the same as during the first government, so cycles occur. The length of a cycle is three terms: in the first term, the good party is in power, with the other two terms governed by the bad party. In the first and third terms, the economy is in recession, but in the second term, it is booming.

If the initial government is the bad party, with the economy in recession, then the three-term cycle again occurs, starting from the third term described above. Specifically, voters choose the good party next, but the economy does badly again because of the bad party’s current policy. Then voters change back to the bad party, but the economy booms due to the policy the good party enacted when it was in power. Re-election of the bad party is followed by a recession, which is the same state of affairs as initially.

If the government starts out bad and the economy does well, then again the three-term cycle repeats: the next government is bad, with the economy in recession. After that, the good party rules, but the economy still does badly. Then again the bad party comes to power and benefits from the economic growth caused by the good party’s previous policy.

Overall, the bad party is in power two-thirds of the time and the economy in recession also two-thirds of the time. Recessions overlap with the bad party in only one-third of government terms.
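The three-term cycle and the two-thirds shares can be checked with a short simulation of the model (a toy sketch; the party labels and the 30-term horizon are arbitrary choices):

```python
# Simulation of the model above: a party's policy determines the next
# term's economy, and voters re-elect the incumbent iff the current
# economy is good.
def other(party):
    return "bad" if party == "good" else "good"

def simulate(first_party, first_economy, terms=30):
    history = []
    party, economy = first_party, first_economy
    for _ in range(terms):
        history.append((party, economy))
        next_economy = "boom" if party == "good" else "recession"
        next_party = party if economy == "boom" else other(party)
        party, economy = next_party, next_economy
    return history

# Good party in power with a good economy: it stays in power forever.
assert all(p == "good" for p, _ in simulate("good", "boom"))

# Any other starting point settles into the three-term cycle, with the
# bad party in power and the economy in recession two-thirds of the time.
history = simulate("good", "recession")
bad_share = sum(p == "bad" for p, _ in history) / len(history)
recession_share = sum(e == "recession" for _, e in history) / len(history)
print(round(bad_share, 2), round(recession_share, 2))  # 0.67 0.67
```

The simulation also confirms that recessions coincide with the bad party holding power in only one term out of three.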

Of course, reality is more complicated than the simple model described above – there are random shocks to the economy, policy lags are not exactly equal to one term of the government, the length of time a party stays in power is random, one party’s policy may be better in one situation but worse in another.

Tradeoff between flashiness and competitive advantage in sports

Sports equipment is often brightly coloured, with eye-catching shape, such as for bicycle frames. Sometimes flashiness is beneficial, for example improving the visibility of a bike or a runner on the road, or a boat on the water. However, in sports where competitors act directly against each other (ballgames, racquet sports, fencing), eye-catching equipment makes it easier for opponents to track one’s movements, which is a disadvantage. For a similar reason, practical military equipment is camouflaged and dull-coloured, unlike dress uniforms.

Athletes would probably gain a small advantage by using either dull grey clothing, perhaps with camouflage spots, or equipment that matches the colour of the sports arena, e.g. green grass-patterned shoes and socks for a football field, blue or red for a tennis court. Eye-deceiving colouring would be especially useful in competitions based on rapid accurate movement and feints, such as fencing or badminton.

Another option for interfering with an opponent’s tracking of one’s movements is to use reflective clothing (mirror surfaces, safety orange or neon yellow) to blind the rival. This would work especially well for outdoor sports in the sunshine or in stadiums lit by floodlights.

One downside of dull clothing may be that it does not inspire fans or sponsors, so wearing it may reduce the athlete’s income from merchandise and advertising. A similar tradeoff occurs in real vs movie fighting. Blindingly bright equipment does not have this disadvantage.

Another downside of camouflage may occur if it replaces red clothing, which has been found to give football teams a small advantage. The reason is psychological: red makes the wearers more aggressive and the opponents less.

Golf as a cartel monitoring device for skilled services

Many explanations have been advanced for golf and similar costly, seemingly boring, low-effort group activities. One reason could be signalling one’s wealth and leisure by an expensive and time-consuming sport, another may be networking during a low-effort group activity that does not interfere with talking.

An additional explanation is monitoring others’ time use. A cartel agrees to restrict the quantity that its members provide, in order to raise the price. In skilled services (doctors, lawyers, engineers, notaries, consultants) the quantity sold is work hours. Each member of a cartel has an incentive to secretly increase supply to obtain more profit. Monitoring is thus needed to sustain the cartel. One way to check that competitors are not selling more work hours is to observe their time use by being together. To reduce boredom, the time spent in mutual monitoring should be filled somehow, and the activity cannot be too strenuous, otherwise it could not be sustained for long enough to meaningfully decrease hours worked. Playing golf fulfils these requirements.

A prediction from this explanation for golf is that participation in time-consuming group activities would be greater in industries selling time-intensive products and services. By contrast, if supply is relatively insensitive to hours worked, for example in capital-intensive industries or standard software, then monitoring competitors’ time use is ineffective in restricting their output and sustaining a cartel. Other ways of checking quantity must then be found, such as price-matching guarantees, which incentivise customers to report a reduced price of a competitor.