Learning and evolution switch the sign of autocorrelations

Animals are more successful if they learn or evolve to predict locations of food, mates and predators. Prediction of anything relies on correlations over time in the environment. These correlations may be positive or negative. Learning is more difficult if the sign of the correlation switches over time, which occurs in nature due to resource depletion, learning and evolution.

If a herbivore eats a tasty patch of plants or a predator eats a nest full of eggs, then the next day that food is not there (negative correlation). The next year at the same time, however, it is probably there again (positive correlation), because the plants regrow from roots or seeds, and if the prey found the nesting spot attractive one year, then other members of the prey species will likely prefer it the next year as well. Over many generations, though, if the plants in that location get eaten before dispersing seeds or the young in that nest before breeding, then the prey will either learn or evolve to avoid that location, or go extinct. This makes the autocorrelation negative again on sufficiently long timescales.
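
The sign switch between short and long lags can be illustrated with a minimal sketch. It assumes a toy food patch that is full once per "year" and empty on the remaining days because it has just been eaten. The 5-day year, the function name `autocorrelation` and the series itself are illustrative assumptions, not data, and the very long evolutionary timescale is not modelled.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Hypothetical toy environment: a "year" lasts PERIOD days, the patch is
# full (1) on the first day of each year and empty (0) after being eaten.
PERIOD = 5
food = [1 if day % PERIOD == 0 else 0 for day in range(100 * PERIOD)]

next_day = autocorrelation(food, 1)        # short lag: depletion
next_year = autocorrelation(food, PERIOD)  # one-year lag: regrowth
print(f"lag 1: {next_day:.2f}, lag {PERIOD}: {next_year:.2f}")
```

In this toy series the lag-1 autocorrelation is negative and the one-year autocorrelation is positive, matching the depletion and regrowth pattern described above.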

Positive correlation is the easiest to learn – just keep doing the same thing and achieve the same successful outcome. Negative correlation is harder, because the absence of success at one time predicts success from the same action at another time, and vice versa. Learning a changing correlation requires a multi-parameter mental model of the superimposed different-frequency oscillations of resource abundance.

There is a tradeoff between exploiting known short-period correlations and experimenting to learn longer-period correlations. There may always be a longer pattern to discover, but finite lifetimes make learning very low-frequency events not worthwhile.

Popularity inequality and multiple equilibria

Suppose losing a friend is more costly for a person with few contacts than for one with many. Then a person with many friends has a lower cost of treating people badly, e.g. acting as if friends are dispensable and interchangeable. The lower cost means that unpleasant acts can signal popularity. Suppose that people value connections with popular others more than connections with unpopular ones. This creates a benefit from costly, thus credible, signalling of popularity – such signals attract new acquaintances. Having a larger network in turn reduces the cost of signalling popularity by treating friends badly.

Suppose people on average value a popular friend more than the disutility from being treated badly by that person (so the bad treatment is not too bad, more of a minor annoyance). Then a feedback loop arises where bad treatment of others attracts more connections than it loses. The popular get even more popular, reducing their cost of signalling popularity, which allows attracting more connections. Those with few contacts do not want to imitate the stars of the network by also acting unpleasantly, because their expected cost is larger. For example, there is uncertainty about the disutility a friend gets from being treated badly or about how much the friend values the connection, so treating her or him badly destroys the friendship with positive probability. An unpopular person suffers a large cost from losing even one friend.

Under the assumptions above, a popular person can rely on the Law of Large Numbers to increase her or his popularity in expectation by treating others badly. A person with few friends does not want to take the risk of losing even them if they turn out to be sensitive to nastiness.
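
The role of the Law of Large Numbers can be sketched numerically. The sketch assumes, purely for illustration, that each badly treated friend independently ends the friendship with the same probability `p_leave`. Neither the independence nor the value 0.2 comes from the argument above, and the function names are hypothetical.

```python
import math


def prob_lose_all(n_friends, p_leave):
    """Probability that every one of n_friends leaves (independent decisions)."""
    return p_leave ** n_friends


def sd_fraction_lost(n_friends, p_leave):
    """Standard deviation of the fraction of friends lost (binomial model)."""
    return math.sqrt(p_leave * (1 - p_leave) / n_friends)


P_LEAVE = 0.2  # hypothetical chance that a friend is sensitive to nastiness
for n in (2, 50):
    print(
        f"{n} friends: P(lose all) = {prob_lose_all(n, P_LEAVE):.3g}, "
        f"sd of fraction lost = {sd_fraction_lost(n, P_LEAVE):.3f}"
    )
```

Under these assumptions the expected fraction of friends lost is the same for both people, but only the person with few friends faces a substantial chance of ending up with none.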

Multiple equilibria may exist in the whole society: one in which everyone has many contacts and is nasty to them and one in which people have few friends and act nice. Under the assumption that people value a popular friend more than the disutility from being treated badly, the equilibrium with many contacts and bad behaviour actually gives greater utility to everyone. This counterintuitive conclusion can be changed by assuming that popularity is relative, not a function of the absolute number of friends. Then total relative popularity is constant in the population, so the bad-treatment equilibrium is worse by the disutility of bad treatment.

In order for there to be something to signal, it cannot be common knowledge that everyone is equally popular. Signalling with reasonable beliefs requires unequal popularity. Inequality reduces welfare if people are risk averse (in this case over their popularity). Risk aversion further reduces average utility in the popular-and-nasty equilibrium compared to the pooling equilibrium where everyone has few friends and does not signal (acts nice).

In general, if one of the benefits of signalling is a reduction in the cost of signalling, then both the amount of signalling and inequality increase. My paper “Dynamic noisy signaling” (2018) studies this in the context of education signalling in Section V.B “Human capital accumulation”.

Diffraction grating of parallel electron beams

Diffraction gratings with narrow bars and bar spacing are useful for separating short-wavelength electromagnetic radiation (x-rays, gamma rays) into a spectrum, but the narrow bars and gaps are difficult to manufacture. The bars are also fragile and thus need a backing material, which may absorb some of the radiation, leaving less of it to be studied. Instead of manufacturing the grating out of a solid material composed of neutral atoms, an alternative may be to use many parallel electron beams. Electromagnetic waves do scatter off electrons, thus the grating of parallel electron beams should have a similar effect to a solid grating of molecules. My physics knowledge is limited, so this idea may not work for many reasons.

Electron beams can be made with a diameter of a few nanometres, and can be bent with magnets. Thus the grating could be made from a single beam if powerful enough magnets bend it back on itself. Alternatively, many parallel beams could be generated from multiple sources.

The negatively charged electrons repel each other, so the beams tend to bend away from each other. To compensate for this, the beam sources could target the beams to a common focus and let the repulsion forces bend the beams outward. There would exist a point at which the converging and then diverging beams are parallel. The region near that point could be used as the grating. The converging beams should start out sufficiently close to parallel that they would not collide before bending outward again.

Proton or ion beams are also a possibility, but protons and ions have a larger diameter than electrons, which tends to create a coarser grating. Also, electron beam technology is more widespread and mature (cathode ray tubes were used in old televisions), thus easier to use off the shelf.

Privacy reduces cooperation, may be countered by free speech

Cooperation relies on reputation. For example, fraud in online markets is deterred by the threat of bad reviews, which reduce future trading with the defector. Data protection, specifically the “right to be forgotten”, allows those with a bad reputation to erase their records from the market provider’s database and create new accounts with a clean slate. Bayesian participants of the market then rationally attach a bad reputation to any new account (“guilty until proven innocent”). If new entrants are penalised, then entry and competition decrease.

One way to counter this abuse of data protection laws to escape the consequences of one’s past misdeeds is to use free speech laws. Allow market participants to comment on or rate others, protecting such comments as a civil liberty. If other traders can identify a bad actor, for example using his or her government-issued ID, then any future account by the same individual can be penalised by attaching the previous bad comments from the start.

Of course, comments could be abused to destroy competitors’ reputations, so leaving a bad comment should have a cost. For example, if the comments are numerical ratings, then the average rating given by a person can be subtracted from all ratings given by that person. Dividing by the standard deviation of that person’s ratings then makes the ratings of those with extreme opinions comparable to the scores given by moderates. Normalising by the mean and standard deviation makes ratings relative, so pulling down someone’s reputation pushes up those of others.
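
The normalisation can be sketched as follows. The data layout (a dictionary from rater to the scores that rater gave), the function name `normalise_ratings` and the example scores are hypothetical choices made for illustration, as is the use of the population standard deviation.

```python
from statistics import mean, pstdev


def normalise_ratings(ratings_by_rater):
    """Convert each rater's raw scores to z-scores within that rater's ratings.

    ratings_by_rater maps rater -> {rated trader: raw score}.
    """
    normalised = {}
    for rater, scores in ratings_by_rater.items():
        values = list(scores.values())
        centre = mean(values)
        spread = pstdev(values)
        normalised[rater] = {
            # A rater who scores everyone identically conveys no relative
            # information, so all of that rater's ratings become 0.
            target: (score - centre) / spread if spread > 0 else 0.0
            for target, score in scores.items()
        }
    return normalised


# Hypothetical raw ratings: one rater with extreme opinions, one moderate.
raw = {
    "extreme": {"seller_a": 10, "seller_b": 0},
    "moderate": {"seller_a": 6, "seller_b": 4},
}
print(normalise_ratings(raw))
```

After normalisation the extreme and the moderate rater give the same scores in this example, and each rater's scores sum to zero, so lowering one trader's rating necessarily raises the relative rating of the others.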

However, if a single entity can control multiple accounts (create fake profiles or use company accounts), then he or she can exchange positive ratings between his or her own profiles and rate others badly. Without being able to distinguish new accounts from fake profiles, any rating system has to either penalise entrants or allow sock-puppet accounts to operate unchecked. Again, official ID requirements may deter multiple account creation, but privacy laws impede this deterrence. There is always the following trilemma: either some form of un-erasable web activity history is kept, or entrants are punished, or fake accounts go unpunished.

Avoiding the Bulow and Rogoff 1988 result on the impossibility of borrowing

Bulow and Rogoff (1988, NBER working paper 2623) prove that countries cannot borrow, due to their inability to credibly commit to repay, if after default they can still buy insurance. The punishment for defaulting on debt is exclusion from future borrowing. This punishment is not severe enough to motivate a country to repay, by the following argument. A country has two reasons to borrow: it is less patient than the lenders (values current consumption or investment opportunities relatively more) and it is risk-averse (either because the utility of consumption is concave, or because good investment opportunities appear randomly). Debt can be used to smooth consumption or take advantage of temporary opportunities for high-return investment: borrow when consumption would otherwise be low, pay back when relatively wealthy.

After the impatient country has run up its debt to the maximum level the creditors are willing to tolerate, the impatience motive to borrow disappears, because the lenders do not allow more consumption to be transferred from the future to the present. Only the insurance motive to borrow remains. The punishment for default is the inability to insure via debt, because in a low-consumption or valuable-investment state of affairs, no more can be borrowed. Bulow and Rogoff assume that the country can still save or buy insurance by paying in advance, so “one-sided” risk-sharing (pay back when relatively wealthy, or when investment opportunities are unavailable) is possible. This seemingly one-sided risk-sharing becomes standard two-sided risk-sharing upon default, because the country can essentially “borrow” from itself the amount that it would have spent repaying debt. This amount can be used to consume or invest in the state of the world where these activities are attractive, or to buy insurance if consumption and investment are currently unattractive. Thus full risk-sharing is achieved.

More generally, if the country can avoid the punishment that creditors impose upon default (evade trade sanctions by smuggling, use alternate lenders if current creditors exclude it), then the country has no incentive to repay, in which case lenders have no incentive to lend.

The creditors know that once the country has run up debt to the maximum level they allow, it will default. Thus rational lenders set the maximum debt to zero. In other words, borrowing is impossible.

A way around the no-borrowing theorem of Bulow and Rogoff is to change one or more assumptions. In an infinite horizon game, Hellwig and Lorenzoni allow the country to run a Ponzi scheme on the creditors, thus effectively “borrow from time period infinity”, which permits a positive level of debt, sometimes even an infinite one.

Another assumption that could realistically be removed is that the country can buy insurance after defaulting. Restricting insurance need not be due to an explicit legal ban. The insurers are paid in advance, thus do not exclude the country out of fear of default. Instead, the country’s debt contract could allow creditors to seize the country’s financial assets abroad, specifically in creditor countries, and these assets could be defined to include insurance premiums already paid, or the payments from insurers to the country. The creditors have no effective recourse against the sovereign debtor, but they may be able to enforce claims against insurance firms outside the defaulting country.

Seizing premiums paid to insurers would result in negative profits for the insurers, while seizing payments from insurers would restrict the defaulter to one-sided risk-sharing, without the abovementioned possibility of making it two-sided. Seizing premiums makes insurers unwilling to insure, and seizing payments from insurers removes the country’s incentive to purchase insurance. Either way, the country’s benefit from risk-sharing after default is eliminated. This punishment would motivate loan repayment, in turn motivating lending.

M-diagram of politics

Suppose a politician claims that X is best for society. Quiz:

1. Should we infer that X is best for society?

2. Should we infer that the politician believes that X is best for society?

3. Should we infer that X is best for the politician?

4. Should we infer that X is best for the politician among policies that can be ‘sold’ as best for society?

5. Should we infer that the politician believes that X is best for the politician?

This quiz illustrates the general principle in game theory that players best-respond to their perceptions, not reality. Sometimes the perceptions may coincide with reality. Equilibrium concepts like Nash equilibrium assume that on average, players have correct beliefs.

The following diagram illustrates the reasoning of the politician claiming X is best for society.

[Figure: M-diagram of politics]

In case the diagram does not load, here is its description: the top row has ‘Official goal’ and ‘Real goal’, the bottom row has ‘Best way to the official goal’, ‘Best way to the real goal that looks like a reasonable way to the official goal’ and ‘Best way to the real goal’. Arrows point in an M-shaped pattern from the bottom row items to the top items. The arrow from ‘Best way to the real goal that looks like a reasonable way to the official goal’ to ‘Official goal’ is the constraint on the claims of the politician.

The correct answer to the quiz is 5.

This post is loosely translated from the original Estonian one https://www.sanderheinsalu.com/ajaveeb/?p=140

Economic and political cycles interlinked

Suppose the government’s policy determines the state of the economy with a lag that equals one term of the government. Also assume that voters re-elect the incumbent in a good economy, but choose the challenger in a bad economy. This voting pattern is empirically realistic and may be caused by voters not understanding the lag between the policy and the economy. Suppose there are two political parties: the good and the bad. The policy the good party enacts when in power puts the economy in a good state during the next term of government. The bad party’s policy creates a recession in the next term.

If the economy starts out doing well and the good party is initially in power, then the good party remains in power forever, because during each of its terms in government, it makes the economy do well the next term, so voters re-elect it the next term.

If the economy starts out in a recession with the good party in power, then the second government is the bad party. The economy does well during the second government’s term due to the policy of the good party in the first term. Then voters re-elect the bad party, but the economy does badly in the third term due to the bad party’s previous policy. The fourth government is then again the good party, with the economy in a recession. This situation is the same as during the first government, so cycles occur. The length of a cycle is three terms. In the first term, the good party is in power, with the other two terms governed by the bad party. In the first and third term, the economy is in recession, but in the second term, booming.

If the initial government is the bad party, with the economy in recession, then the three-term cycle again occurs, starting from the third term described above. Specifically, voters choose the good party next, but the economy does badly again because of the bad party’s current policy. Then voters change back to the bad party, but the economy booms due to the policy the good party enacted when it was in power. Re-election of the bad party is followed by a recession, which is the same state of affairs as initially.

If the government starts out bad and the economy does well, then again the three-term cycle repeats: the next government is bad, with the economy in recession. After that, the good party rules, but the economy still does badly. Then again the bad party comes to power and benefits from the economic growth caused by the good party’s previous policy.

Overall, the bad party is in power two-thirds of the time and the economy is in recession also two-thirds of the time. Recessions coincide with the bad party being in power in only one-third of government terms.
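
The cycle can be checked with a short simulation of the model as stated: the current party’s policy sets the economy of the next term, and voters keep the incumbent only in a boom. The function names and the string labels are hypothetical choices for this sketch.

```python
def other(party):
    """Return the challenger of the given party."""
    return "bad" if party == "good" else "good"


def simulate(party, economy, terms):
    """Return the list of (governing party, state of economy) for each term."""
    history = []
    for _ in range(terms):
        history.append((party, economy))
        # Voters re-elect the incumbent in a boom, else pick the challenger.
        next_party = party if economy == "boom" else other(party)
        # The current party's policy determines the economy one term later.
        economy = "boom" if party == "good" else "recession"
        party = next_party
    return history


# One full cycle, starting with the good party in power during a recession.
cycle = simulate("good", "recession", 3)
bad_party_share = sum(p == "bad" for p, _ in cycle) / len(cycle)
recession_share = sum(e == "recession" for _, e in cycle) / len(cycle)
overlap_share = sum(
    p == "bad" and e == "recession" for p, e in cycle
) / len(cycle)
print(cycle, bad_party_share, recession_share, overlap_share)
```

Starting instead from `simulate("good", "boom", terms)` keeps the good party in power with a booming economy in every term, which is the only initial condition that escapes the cycle.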

Of course, reality is more complicated than the simple model described above – there are random shocks to the economy, policy lags are not exactly equal to one term of the government, the length of time a party stays in power is random, one party’s policy may be better in one situation but worse in another.

Social welfare functions derived from revealed preference

The social welfare functions used in policy evaluation typically put more weight on poorer people, justifying redistribution from the rich to the poor. The reasoning is that the marginal benefit of a unit of money is greater for the poor than the rich. However, people with a greater marginal value of money are more motivated to earn and save, other things equal, so more likely to become rich. In this case, the rich have on average a higher marginal benefit of money than the poor, or a lower marginal cost of accumulating it. If the justification for redistribution is an interpersonal utility comparison, then revealed preference suggests a greater Pareto weight for richer people, thus redistribution in the opposite direction to the usual.

If the marginal utility of money decreases in wealth or income, then people earn until the marginal benefit equals the marginal cost, so the comparison between the rich and the poor depends on their marginal cost of earning, evaluated at their current wealth and income. The cost and benefit of earning may both be higher or lower for richer people. In a one-shot model, whoever has a greater benefit should receive redistributive transfers to maximise a utilitarian welfare criterion. Dynamic indirect effects sometimes reverse this conclusion, because incentives for future work are reduced by taxation.

Those with a high marginal utility of money are more motivated to convince the public that their marginal utility is high and that they should receive a subsidy. The marginal utility is the difference between a benefit and a cost, and these together determine whether the poor or the rich have a greater incentive to lobby for redistributive transfers. The marginal cost of an hour of persuasion equals the person’s hourly wage, so it depends on whether her income is derived mostly from capital or from labour. For example, both rentiers and low-wage workers have a low opportunity cost of time, so optimally lobby more than high-wage workers. If lobbying influences policy (which is empirically plausible), then the tax system resulting from the persuasion competition burdens the high-wage workers the heaviest and leaves loopholes and low rates for capital income and low wages. This seems to be the case in most countries.

A tax system based on lobbying is inefficient, because it is not the people with the greatest benefit that receive the subsidies (which equal the value of government services minus the taxes), but those with the largest difference between the benefit and the lobbying cost. However, the resulting taxation is constrained efficient under the restriction that the social planner cannot condition policy on people’s marginal costs of lobbying.

Seasonings may reduce the variety of diet

Animals may evolve a preference for a varied diet in order to get the many nutrients they need. A test of this on mice would be whether their preference for different grains is negatively autocorrelated, i.e. they are less likely to choose a food if they have eaten more of it recently.

Variety is perceived mainly through taste, so the mechanism via which the preference for a varied diet probably operates is that consuming a substance repeatedly makes its taste less pleasant for the next meal. Spices and other flavourings can make the same food seem different, so may interfere with variety-seeking, essentially by deceiving the taste. A test of this on mice would flavour the same grain differently and check whether this attenuates the negative autocorrelation of consumption, both when other grains are available and when not.

If seasonings reduce variety-seeking, then access to spices may lead people to consume a more monotonous diet, which may be less healthy. A test of this hypothesis is whether increased access to flavourings leads to more obesity, especially among those constrained to eat similar foods over time. The constraint may be poverty (only a few cheap foods are affordable) or physical access (living in a remote, unpopulated area).

A preference for variety explains why monotonous diets, such as Atkins, may help people lose weight: eating similar food repeatedly gets boring, so the dieter eats less.

Compatibility with colleagues is like interoperability

Interacting with colleagues is like compatibility of programs, tools or machine parts – an individually very good component may be useless if it does not fit with the rest of the machine. A potentially very productive worker who does not work well with others in the company does not contribute much.

The difference between an individual and a firm may be horizontal (different cultures, all similarly good) or vertical (bad vs good quality or productivity). The horizontal compatibility with colleagues includes personal appearance – wearing a shirt with a left-wing slogan may be fine in a left-wing company, but offend people in a right-wing one, and vice versa. When colleagues take offence, the strong emotions distract them from work, so a slogan on a shirt may reduce their productivity.

Vertical fitting in includes personal hygiene, because bad breath or body odour distracts others from work. Similarly, loud phone conversations or other noise are disruptive everywhere.