# Robustness is a form of efficiency

Efficiency means using the best way to achieve a goal – mathematically, selecting the maximizer of an objective function. The goal may be anything. For example, the objective function may be a weighted average of performance across various situations.

Robustness means performing well in a wide variety of circumstances. Mathematically, performing well may mean maximizing the weighted average performance across situations, where the weights are the probabilities of the situations. Performing well may also mean maximizing the probability of meeting a minimum standard – this probability is the sum of the probabilities of the situations in which the (situation-specific) minimum standard is reached. In any case, some objective function is being maximized for robustness. The best way to achieve a goal is being found. The goal is either a weighted average performance, the probability of exceeding a minimum standard, or some similar objective. Thus robustness is efficiency for a particular objective.
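
As a toy sketch (my own made-up numbers, not from any source), both notions of performing well reduce to maximizing a single objective function:

```python
# One policy's performance across four situations, the situations'
# probabilities, and situation-specific minimum standards (all invented
# for illustration). Each robustness notion is one number to maximize.
performance = [0.9, 0.7, 0.4, 0.1]    # policy's performance in each situation
probability = [0.5, 0.3, 0.15, 0.05]  # probability of each situation
standard = [0.5, 0.5, 0.3, 0.3]       # situation-specific minimum standards

# Objective 1: probability-weighted average performance.
weighted_average = sum(p * x for p, x in zip(probability, performance))

# Objective 2: probability of meeting the minimum standard, i.e. the sum of
# the probabilities of the situations in which the standard is reached.
prob_meet_standard = sum(
    p for p, x, s in zip(probability, performance, standard) if x >= s
)
```

Choosing the policy that maximizes either number is efficiency for that objective, which is the point of the section.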

The robustness-efficiency tradeoff is just a tradeoff between different objective functions. One objective function in this case is a weighted average that puts positive weight on the other objective function.

Whatever the goal, working towards it efficiently is by definition the best thing to do. The goal usually changes over time, but most of this change is a slow drift. Reevaluating the probabilities of situations usually changes the goal, in particular if the goal is a weighted average or a sum of probabilities that includes some of these situations. A rare event occurring causes a reevaluation of the probability of this event, thus necessarily of the probability of at least one other event, because probabilities sum to one. If the probabilities of rare events are revised up, then the goal tends to shift away from single-situation efficiency, or performance in a small number of situations, towards robustness (efficiency for a combination of a large number of situations).

To be better prepared for emergencies and crises, society should prepare efficiently. The most efficient method may be difficult to determine in the short term. If the expected time until the next crisis is long, then the best way includes gathering resources and storing these in a large number of distributed depots. These resources include human capital – the skills of solving emergencies. Such skills are produced by training, stored in people’s brains and kept fresh with further training. Both the physical and mental resources are part of the economic production of the country. Economic growth helps create emergency supplies, raise medical capacity and free up time in which to train preparedness. Unfortunately, economic growth is often wasted on frivolous consumption of goods and services, often to impress others. Resources wasted in this way may reduce preparedness by causing people to go soft physically and mentally.

Solving a crisis requires cooperation. Consumption of social media may polarize a society, reducing collaboration and thus preparedness.

# Signalling the precision of one’s information with emphatic claims

Chats both online and in person seem to consist of confident claims which are either extreme absolute statements (“vaccines don’t work at all”, “you will never catch a cold if you take this supplement”, “artificial sweeteners cause cancer”) or profess no knowledge (“damned if I know”, “we will never know the truth”), sometimes blaming the lack of knowledge on external forces (“of course they don’t tell us the real reason”, “the security services are keeping those studies secret, of course”, “big business is hiding the truth”). Moderate statements that something may or may not be true, especially off the center of all-possibilities-equal, and expressions of personal uncertainty (“I have not studied this enough to form an opinion”, “I have not thought this through”) are almost absent. Other than in research and official reports, I seldom encounter statements of the form “these are the arguments in this direction and those are the arguments in that direction. This direction is somewhat stronger.” or “the balance of the evidence suggests x” or “x seems more likely than not-x”. In opinion pieces in various forms of media, the author may give arguments for both sides, but in that case, concludes something like “we cannot rule out this and we cannot rule out that”, “prediction is difficult, especially now in a rapidly changing world”, “anything may happen”. The conclusion of the opinion piece does not recommend a moderate course of action supported by the balance of moderate-quality evidence.

The same person confidently claims knowledge of an extreme statement on one topic and professes certainty of no knowledge at all on another. What could be the goal of making both extreme and no-knowledge statements confidently? If the person wanted to pretend to be well-informed, then confidence helps with that, but claiming no knowledge would be counterproductive. Blaming the lack of knowledge on external forces and claiming that the truth is unknowable or will never be discovered helps excuse one’s lack of knowledge. The person can then pretend to be informed to the best extent possible (a constrained maximum of knowledge) or at least know more than others (a relative maximum).

Extreme statements suggest to an approximately Bayesian audience that the claimer has received many precise signals in the direction of the extreme statement and as a result has updated the belief far from the average prior belief in society. Confident statements also suggest many precise signals to Bayesians. The audience does not need to be Bayesian to form these interpretations – updating in some way towards the signal is sufficient, as is the behavioural tendency to believe that confidence or extreme claims demonstrate the quality of the claimer’s information. A precisely estimated zero, such as confidently saying both x and not-x are equally likely, also signals good information. The same goes for confidently claiming that the truth is unknowable.
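
The Bayesian logic can be sketched with standard normal-normal updating (an illustrative model of my own choosing, with invented numbers): more numerous or more precise signals pull the posterior further from the prior, so an extreme confident claim suggests many precise signals were received.

```python
def posterior_mean(prior_mean, prior_var, signals, signal_var):
    """Normal prior, i.i.d. normal signals: precision-weighted average."""
    prior_precision = 1.0 / prior_var
    signal_precision = len(signals) / signal_var
    signal_mean = sum(signals) / len(signals)
    return (prior_precision * prior_mean + signal_precision * signal_mean) / (
        prior_precision + signal_precision
    )

prior = 0.5  # society's average prior belief
few_noisy = posterior_mean(prior, 1.0, [1.0] * 2, 4.0)      # 2 imprecise signals
many_precise = posterior_mean(prior, 1.0, [1.0] * 20, 0.5)  # 20 precise signals
# The posterior moves far from the prior only with many precise signals,
# so hearing an extreme belief suggests such signals were received.
```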

Being perceived as having precise information helps influence others. If people believe that the claimer is well-informed and has interests more aligned than opposed to theirs, then it is rational to follow the claimer’s recommendation. Having influence is generally profitable. This explains the lack of moderate-confidence statements and claims of personal but not collective uncertainty.

A question that remains is why confident moderate statements are almost absent. Why not claim with certainty that 60% of the time, the drug works and 40% of the time, it doesn’t? Or confidently state that a third of the wage gap/racial bias/country development is explained by discrimination, a third by statistical discrimination or measurement error and a third by unknown factors that need further research? Confidence should still suggest precise information no matter what the statement is about.

Of course, if fools are confident and researchers honestly state their uncertainty, then the certainty of a statement shows the foolishness of the speaker. If confidence makes the audience believe the speaker is well-informed, then either the audience is irrational in a particular way or believes that the speaker’s confidence is correlated with the precision of the information in the particular dimension being talked about. If the audience has a long history of communication with the speaker, then they may have experience that the speaker is generally truthful, acts similarly across situations and expresses the correct level of confidence on unemotional topics. The audience may fail to notice when the speaker becomes a spreader of conspiracies or becomes emotionally involved in a topic and therefore is trying to persuade, not inform. If the audience is still relatively confident in the speaker’s honesty, then the speaker sways them more by confidence and extreme positions than by admitting uncertainty or a moderate viewpoint.

The communication described above may be modelled as the claimer conveying three-dimensional information with two two-dimensional signals. One dimension of the information is the extent to which the statement is true. For example, how beneficial is a drug or how harmful an additive. A second dimension is how uncertain the truth value of the statement is – whether the drug helps exactly 55% of patients or may help anywhere between 20 and 90%, between which all percentages are equally likely. A third dimension is the minimal attainable level of uncertainty – how much the truth is knowable in this question. This is related to whether some agency is actively hiding the truth or researchers have determined it and are trying to educate the population about it. The second and third dimensions are correlated: the lower the minimal attainable uncertainty, the more certain the truth value of the statement can be. It cannot be more certain than the laws of physics allow.

The two dimensions of one signal (the message of the claimer) are the extent to which the statement is true and how certain the claimer is of the truth value. Confidence emphasises that the claimer is certain about the truth value, regardless of whether this value is true or false. The claim itself is the first dimension of the signal. The reason the third dimension of the information is not part of the first signal is that the claim that the truth is unknowable is itself a second claim about the world, i.e. a second two-dimensional signal saying how much some agency is hiding or publicising the truth and how certain the speaker is of the direction and extent of the agency’s activity.

Opinion expressers in (social) media usually choose an extreme value for both dimensions of both signals. They claim some statement about the world is either the ultimate truth or completely false or unknowable and exactly in the middle, not a moderate distance to one side. In the second dimension of both signals, the opinionated people express complete certainty. If the first signal says the statement is true or false, then the second signal is not sent and is not needed, because if there is complete certainty of the truth value of the statement, then the statement must be perfectly knowable. If the first signal says the statement is fifty-fifty (the speaker does not know whether true or false), then in the second signal, the speaker claims that the truth is absolutely not knowable. This excuses the speaker’s claimed lack of knowledge as due to an objective impossibility, instead of the speaker’s limited data and understanding.

# P-value cannot be less than 1/1024 in ten binary choices

Baez-Mendoza et al. (2021) claim that for rhesus macaques choosing which of two others to reward in each trial, “the difference in the other’s reputation based on past interactions (i.e., how likely they were to reciprocate over the past 20 trials) had a significant effect on the animal’s choices [odds ratio (OR) = 1.54, t = 9.2, P = 3.5 × 10^-20; fig. S2C]”.

In 20 trials, there are ten chances to reciprocate if I understand the meaning of reciprocation in the study (monkey x gives a reward to the monkey who gave x a reward in the last trial). Depending on interpretation, there are 6-10 chances to react to reciprocation. Six if three trials are required for each reaction: the trial in which a monkey acts, the trial in which another monkey reciprocates and the trial in which a monkey reacts to the reciprocation. Ten if the reaction can coincide with the initial act of the next action-reciprocation pair.

Under the null hypothesis that the monkey allocates rewards randomly, the probability of giving the reward to the monkey who previously reciprocated the most 10 times out of 10 is 1/1024. The p-value is the probability, under the null hypothesis, of observing an outcome at least as extreme as the data. So the p-value cannot be smaller than about 0.001 for a 20-trial session, which offers at most 10 chances to react to reciprocation. The p-value cannot be 3.5 × 10^-20 as Baez-Mendoza et al. (2021) claim. Their supplementary material does not explain how this p-value was calculated.

Interpreting reciprocation or trials differently, so that 20 trials offer 20 chances to reciprocate, the minimal p-value is 1/1048576, approximately 10^-6 – again far from 3.5 × 10^-20.

A possible explanation is the sentence “The group performed an average of 105 ± 8.7 (mean ± SEM) trials per session for a total of 22 sessions.” If the monkey has a chance to react to past reciprocation in a third of the 105 × 22 ≈ 2310 trials, then the p-value can indeed be of the order 10^-20. It would be interesting to know how the authors divide the trials into reputation-building and reaction blocks.
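
The arithmetic above can be checked directly (the 65-chance figure is my own rough calculation):

```python
# With n independent binary choices under a fair-coin null, the smallest
# attainable p-value is 0.5 ** n.
p_min_10 = 0.5 ** 10   # 10 chances to reciprocate in one 20-trial session
p_min_20 = 0.5 ** 20   # alternative reading: 20 chances
p_min_65 = 0.5 ** 65   # roughly the number of chances needed for ~3e-20
# p_min_10 is about 0.001 and p_min_20 about 1e-6, so a p-value of order
# 1e-20 requires on the order of 65 chances: consistent with pooling trials
# across sessions (105 trials per session, 22 sessions), not with a single
# 20-trial session.
```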

# Symmetry of matter seems impossible

I am not a physicist, so the following may be my misunderstanding. Symmetry seems theoretically impossible, except at one instant. If there were a perfectly symmetric piece of matter (after rotating or reflecting it around some axis, the set of locations of its atoms would be the same as before, just a possibly different atom in each location), then in the next instant of time, its atoms would move to unpredictable locations by the Heisenberg uncertainty principle (the location and momentum of a particle cannot be simultaneously determined). This is because the locations of the atoms would be known by symmetry in the first instant, thus their momenta would be unknown.

Symmetry may not provide complete information about the locations of the atoms, but constrains their possible locations. Such an upper bound on the uncertainty about locations puts a lower bound on the uncertainty about momenta. Momentum uncertainty creates location uncertainty in the next instant.
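
A back-of-envelope calculation (my own illustrative numbers for an atom-sized particle, not from the text) shows how a position bound forces momentum uncertainty via Δx·Δp ≥ ħ/2:

```python
hbar = 1.054571817e-34  # reduced Planck constant, J*s
mass = 2.0e-26          # roughly the mass of a carbon atom, kg (assumed)
dx = 1.0e-10            # suppose symmetry pins each atom down to ~1 angstrom
dp = hbar / (2 * dx)    # minimum momentum uncertainty from dx*dp >= hbar/2
dv = dp / mass          # minimum velocity uncertainty, about 26 m/s
drift = dv * 1e-9       # position spread after one nanosecond, metres
# drift is about 26 nanometres, far more than typical atomic spacing,
# so an exactly pinned-down configuration could not persist.
```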

Symmetry is probably an approximation: rotating or reflecting a piece of matter, its atoms are in locations close to the previous locations of its atoms. Again, an upper bound on the location uncertainty about the atoms should put a lower bound on the momentum uncertainty. If the atoms move in uncertain directions, then the approximate location symmetry would be lost at some point in time, both in the future and the past.

# Moon phase and sleep correlation is not quite a sine wave

Casiraghi et al. (2021) in Science Advances (DOI: 10.1126/sciadv.abe0465) show that human sleep duration and onset depend on the phase of the moon. Their interpretation is that light availability during the night caused humans to adapt their sleep over evolutionary time. Casiraghi et al. fit a sine curve to both sleep duration and onset as functions of the day in the monthly lunar cycle, but their Figure 1 A, B for the full sample and the blue and orange curves for the rural groups in Figure 1 C, D show a statistically significant deviation from a sine function. Instead of same-sized symmetric peaks and troughs, sleep duration has two peaks with a small trough between, then a large sharp trough which falls more steeply than rises, then two peaks again. Sleep onset has a vertically reflected version of this pattern. These features are statistically significant, based on the confidence bands Casiraghi and coauthors have drawn in Figure 1.

The significant departure of sleep patterns from a sine wave calls into question the interpretation that light availability over evolutionary time caused these patterns. What fits the interpretation of Casiraghi et al. is that sleep duration is shortest right before full moon; what does not fit is that duration is longest right after full and new moons, yet shorter during the waning crescent between these.

It would better summarise the data to use the first four terms of a Fourier series instead of just the first term. There seems little danger of overfitting, given N=69 and t>60.
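
A minimal sketch of such a fit on synthetic data (the lunar period, sample length, coefficients and noise level are my stand-ins, not the study's data):

```python
import numpy as np

period = 29.5       # days in the lunar cycle (approximate)
t = np.arange(60)   # observation days per participant (t > 60 in the study)
rng = np.random.default_rng(0)
# Synthetic sleep durations with a non-sinusoidal lunar pattern plus noise.
sleep = (7
         + 0.3 * np.sin(2 * np.pi * t / period)
         + 0.2 * np.sin(4 * np.pi * t / period)
         + rng.normal(0.0, 0.1, t.size))

# Design matrix: constant plus sine and cosine at the first four harmonics.
columns = [np.ones_like(t, dtype=float)]
for k in range(1, 5):
    columns.append(np.sin(2 * np.pi * k * t / period))
    columns.append(np.cos(2 * np.pi * k * t / period))
X = np.column_stack(columns)
coef, *_ = np.linalg.lstsq(X, sleep, rcond=None)
fitted = X @ coef
# Nine coefficients against 60+ time points leaves little room to overfit,
# and the higher harmonics capture asymmetric peaks and troughs.
```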

A questionable choice of the authors is to plot the sleep duration and onset of only the 35 best-fitting participants in Figure 2. A more honest choice yielding the same number of plots would pick every other participant in the ranking from the best fit to the worst.

In the section Materials and Methods, Casiraghi et al. fitted both a 15-day and a 30-day cycle to test for the effect of the Moon’s gravitational pull on sleep. The 15-day component was weaker in urban communities than in rural ones, but any effect of gravity should be the same in both. By contrast, the effect of moonlight should be weaker in urban communities, yet the urban community data (Figure 1 C, D green curve) fits a simple sine curve better than the rural data. It seems strange that sleep in urban communities would correlate more strongly with the amount of moonlight, as Figure 1 shows.

# Leader turnover due to organisation performance is underestimated

Berry and Fowler (2021) “Leadership or luck? Randomization inference for leader effects in politics, business, and sports” in Science Advances propose a method they call RIFLE for testing the null hypothesis that leaders have no effect on organisation performance. The method is robust to serial correlation in outcomes and leaders, but not to endogenous leader turnover, as Berry and Fowler honestly point out. The endogeneity is that the organisation’s performance influences the probability that the leader is replaced (economic growth causes voters to keep a politician in office, losing games causes a team to replace its coach).

To test whether such endogeneity is a significant problem for their results, Berry and Fowler regress the turnover probability on various measures of organisational performance. They find small effects, but this underestimates the endogeneity problem, because Berry and Fowler use linear regression, forcing the effect of performance on turnover to be monotone and linear.

If leader turnover is increased by both success (get a better job elsewhere if the organisation performs well, so quit voluntarily) and failure (fired for the organisation’s bad performance), then the relationship between turnover and performance is U-shaped. Average leaders keep their jobs, bad and good ones transition elsewhere. This is related to the Peter Principle that an employee is promoted to her or his level of incompetence. A linear regression finds a near-zero effect of performance on turnover in this case even if the true effect is large. How close the regression coefficient is to zero depends on how symmetric the effects of good and bad performance on leader transition are, not how large these effects are.
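
A quick simulation (my own illustrative parameters, not Berry and Fowler's data) shows OLS reporting a near-zero slope under a U-shaped true relationship:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
performance = rng.normal(0.0, 1.0, n)
# True relationship is U-shaped: both very bad and very good performance
# raise the chance of leaving (fired versus hired away).
turnover_prob = np.clip(0.1 + 0.2 * performance ** 2, 0.0, 1.0)
turnover = (rng.random(n) < turnover_prob).astype(float)

# OLS slope of turnover on performance: covariance over variance.
slope = np.cov(performance, turnover)[0, 1] / np.var(performance, ddof=1)
# The linear slope is near zero even though performance strongly drives
# turnover; correlating turnover with squared performance reveals it.
quadratic_corr = np.corrcoef(performance ** 2, turnover)[0, 1]
```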

The problem for the RIFLE method of Berry and Fowler is that the small apparent effect of organisation performance on leader turnover from OLS regression misses the endogeneity in leader transitions. Such endogeneity biases RIFLE, as Berry and Fowler admit in their paper.

The endogeneity may explain why Berry and Fowler find stronger leader effects in sports (coaches in various US sports) than in business (CEOs) and politics (mayors, governors, heads of government). A sports coach may experience more asymmetry in the transition probabilities for good and bad performance than a politician. For example, if the teams fire coaches after bad performance much more frequently than poach coaches from well-performing competing teams, then the effect of performance on turnover is close to monotone: bad performance causes firing. OLS discovers this monotone effect. On the other hand, if politicians move with equal likelihood after exceptionally good and bad performance of the administrative units they lead, then linear regression finds no effect of performance on turnover. This misses the bias in RIFLE, which without the bias might show a large leader effect in politics also.

The unreasonably large effect of governors on crime (the governor effect explains 18-20% of the variation in both property and violent crime) and the difference between the zero effect of mayors on crime and the large effect of governors that Berry and Fowler find makes me suspect something is wrong with that particular analysis in their paper. In a checks-and-balances system, the governor should not have that large an influence on the state’s crime. A mayor works more closely with the local police, so would be expected to have more influence on crime.

# Learning and evolution switch the sign of autocorrelations

Animals are more successful if they learn or evolve to predict locations of food, mates and predators. Prediction of anything relies on correlations over time in the environment. These correlations may be positive or negative. Learning is more difficult if the sign of the correlation switches over time, which occurs in nature due to resource depletion, learning and evolution.

If a herbivore eats a tasty patch of plants or a predator a nest full of eggs, then the next day that food is not there (negative correlation), but the next year at the same time it is probably there again (positive correlation) because the plants regrow from roots or seeds, and if the prey found the nesting spot attractive one year, then other members of the prey species will likely prefer it the next year as well. However, over many generations, if the plants in that location get eaten before dispersing seeds or the young in that nest before breeding, then the prey will either learn or evolve to avoid that location, or go extinct. This makes the autocorrelation negative again on sufficiently long timescales.

Positive correlation is the easiest to learn – just keep doing the same thing and achieve the same successful outcome. Negative correlation is harder, because the absence of success at one time predicts success from the same action at another time, and vice versa. Learning a changing correlation requires a multi-parameter mental model of the superimposed different-frequency oscillations of resource abundance.
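
A toy series of my own construction, superimposing fast depletion on slow regrowth, shows the sign switch across timescales:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(2000)
# Fast component: a patch eaten today is empty tomorrow (period-2 swing).
depletion = 1.5 * np.cos(np.pi * t)
# Slow component: the patch regrows at the same time each 100-day "year".
season = np.cos(2 * np.pi * t / 100)
food = depletion + season + rng.normal(0.0, 0.1, t.size)

def autocorr(x, lag):
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

short_lag = autocorr(food, 1)   # negative: yesterday's feast, today's famine
year_lag = autocorr(food, 100)  # positive: same abundance a "year" later
```

A forager tracking only the lag-1 correlation would wrongly expect scarcity a year later, which is the learning difficulty described above.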

There is a tradeoff between exploiting known short-period correlations and experimenting to learn longer-period correlations. There may always be a longer pattern to discover, but finite lifetimes make learning very low-frequency events not worthwhile.

# Popularity inequality and multiple equilibria

Suppose losing a friend is more costly for a person with few contacts than with many. Then a person with many friends has a lower cost of treating people badly, e.g. acting as if friends are dispensable and interchangeable. The lower cost means that unpleasant acts can signal popularity. Suppose that people value connections with popular others more than unpopular. This creates a benefit from costly, thus credible, signalling of popularity – such signals attract new acquaintances. Having a larger network in turn reduces the cost of signalling popularity by treating friends badly.

Suppose people on average value a popular friend more than the disutility from being treated badly by that person (so the bad treatment is not too bad, more of a minor annoyance). Then a feedback loop arises where bad treatment of others attracts more connections than it loses. The popular get even more popular, reducing their cost of signalling popularity, which allows attracting more connections. Those with few contacts do not want to imitate the stars of the network by also acting unpleasantly, because their expected cost is larger. For example, there is uncertainty about the disutility a friend gets from being treated badly or about how much the friend values the connection, so treating her or him badly destroys the friendship with positive probability. An unpopular person suffers a large cost from losing even one friend.

Under the assumptions above, a popular person can rely on the Law of Large Numbers to increase her or his popularity in expectation by treating others badly. A person with few friends does not want to take the risk of losing even them if they turn out to be sensitive to nastiness.

Multiple equilibria may exist in the whole society: one in which everyone has many contacts and is nasty to them and one in which people have few friends and act nice. Under the assumption that people value a popular friend more than the disutility from being treated badly, the equilibrium with many contacts and bad behaviour actually gives greater utility to everyone. This counterintuitive conclusion can be changed by assuming that popularity is relative, not a function of the absolute number of friends. Total relative popularity is constant in the population, in which case the bad treatment equilibrium is worse by the disutility of bad treatment.

In order for there to be something to signal, it cannot be common knowledge that everyone is equally popular. Signalling with reasonable beliefs requires unequal popularity. Inequality reduces welfare if people are risk averse (in this case over their popularity). Risk aversion further reduces average utility in the popular-and-nasty equilibrium compared to the pooling equilibrium where everyone has few friends and does not signal (acts nice).

In general, if one of the benefits of signalling is a reduction in the cost of signalling, then the amount of signalling and inequality increases. My paper “Dynamic noisy signaling” (2018) studies this in the context of education signalling in Section V.B “Human capital accumulation”.

# Diffraction grating of parallel electron beams

Diffraction gratings with narrow bars and bar spacing are useful for separating short-wavelength electromagnetic radiation (x-rays, gamma rays) into a spectrum, but the narrow bars and gaps are difficult to manufacture. The bars are also fragile and thus need a backing material, which may absorb some of the radiation, leaving less of it to be studied. Instead of manufacturing the grating out of a solid material composed of neutral atoms, an alternative may be to use many parallel electron beams. Electromagnetic waves do scatter off electrons, thus the grating of parallel electron beams should have a similar effect to a solid grating of molecules. My physics knowledge is limited, so this idea may not work for many reasons.

Electron beams can be made with a diameter of a few nanometres, and can be bent with magnets. Thus the grating could be made from a single beam if powerful enough magnets bend it back on itself. Alternatively, many parallel beams could be generated from multiple sources.

The negatively charged electrons repel each other, so the beams tend to bend away from each other. To compensate for this, the beam sources could target the beams to a common focus and let the repulsion forces bend the beams outward. There would exist a point at which the converging and then diverging beams are parallel. The region near that point could be used as the grating. The converging beams should start out sufficiently close to parallel that they would not collide before bending outward again.

Proton or ion beams are also a possibility, but protons and ions have larger diameter than electrons, which tends to create a coarser grating. Also, electron beam technology is more widespread and mature (cathode ray tubes were used in old televisions), thus easier to use off the shelf.

# Privacy reduces cooperation, may be countered by free speech

Cooperation relies on reputation. For example, fraud in online markets is deterred by the threat of bad reviews, which reduce future trading with the defector. Data protection, specifically the “right to be forgotten”, allows those with a bad reputation to erase their records from the market provider’s database and create new accounts with a clean slate. Bayesian participants of the market then rationally attach a bad reputation to any new account (“guilty until proven innocent”). If new entrants are penalised, then entry and competition decrease.

One way to counter this abusing of data protection laws to escape the consequences of one’s past misdeeds is to use free speech laws. Allow market participants to comment on or rate others, protecting such comments as a civil liberty. If other traders can identify a bad actor, for example using his or her government-issued ID, then any future account by the same individual can be penalised by attaching the previous bad comments from the start.

Of course, comments could be abused to destroy competitors’ reputations, so leaving a bad comment should have a cost. For example, the comments are numerical ratings and the average rating given by a person is subtracted from all ratings given by that person. Dividing by the standard deviation is helpful for making the ratings of those with extreme opinions comparable to the scores given by moderates. Normalising by the mean and standard deviation makes ratings relative, so pulling down someone’s reputation pushes up those of others.
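
A minimal sketch of this normalisation (my implementation of the idea, not an existing system):

```python
import statistics

def normalise(ratings_by_rater):
    """ratings_by_rater: dict mapping rater -> dict mapping target -> rating."""
    normalised = {}
    for rater, given in ratings_by_rater.items():
        values = list(given.values())
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        normalised[rater] = {
            # Subtract the rater's mean and divide by the rater's standard
            # deviation; a rater with no variation contributes zeros.
            target: (r - mean) / sd if sd > 0 else 0.0
            for target, r in given.items()
        }
    return normalised

ratings = {
    "harsh":    {"a": 1, "b": 2, "c": 3},  # tough rater, low scores
    "generous": {"a": 3, "b": 4, "c": 5},  # lenient rater, high scores
}
out = normalise(ratings)
# Both raters now give identical relative scores, and each rater's
# normalised ratings sum to zero, so pulling one target down necessarily
# pushes others up.
```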

However, if a single entity can control multiple accounts (create fake profiles or use company accounts), then he or she can exchange positive ratings between his or her own profiles and rate others badly. Without being able to distinguish new accounts from fake profiles, any rating system has to either penalise entrants or allow sock-puppet accounts to operate unchecked. Again, official ID requirements may deter multiple account creation, but privacy laws impede this deterrence. There is always the following trilemma: either some form of un-erasable web activity history is kept, or entrants are punished, or fake accounts go unpunished.