Tag Archives: Bayes’ rule

Putting your money where your mouth is in policy debates

Climate change deniers should put their money where their mouth is by buying property in low-lying coastal areas or investing in drought-prone farmland. Symmetrically, those who believe the Earth is warming as a result of pollution should short sell climate-vulnerable assets. Then everyone eventually receives the financial consequences of their decisions and claimed beliefs. The sincere would be happy to bet on their beliefs, anticipating positive profit. Of course, the beliefs have to be somewhat dogmatic or the individuals in question risk-loving, otherwise the no-agreeing-to-disagree theorem would preclude speculative trade (opposite bets on a common event).

Governments tend to compensate people for widespread damage from natural disasters, because distributing aid is politically popular and there is strong lobbying for this free insurance. This insulates climate change deniers against the downside risk of buying flood- or wildfire-prone property. To prevent the cost of the damages from being passed to the taxpayers, the deniers should be required to buy insurance against disaster risk, or to sign contracts with (representatives of) the rest of society agreeing to transfer to others the amount of any government compensation they receive after flood, drought or wildfire. Similarly, those who short sell assets that lose value under a warming climate (or buy property that appreciates, like Arctic ports, under-ice mining and drilling rights) should not be compensated for the lost profit if the warming does not take place.

In general, forcing people to put their money where their mouth is would avoid wasting time on long, useless debates (e.g. do high taxes reduce economic growth, does a high minimum wage raise unemployment, do tough punishments deter crime). Approximately rational people would doubt the sincerity of anyone who is not willing to bet on her or his beliefs, so one's credibility would be tied to one's skin in the game: a stake in the claim signals sincerity. Currently, it costs pundits almost nothing to make various claims in the media – past wrong statements are quickly forgotten and barely dent their reputation for accuracy.

The bets on beliefs need to be legally enforceable, so have to be made on objectively measurable events, such as the value of a publicly traded asset. By contrast, it is difficult to verify whether government funding for the arts benefits culture, or whether free public education is good for civil society, therefore bets on such claims would lead to legal battles. The lack of enforceability would reduce the penalty for making false statements, thus would not deter lying or shorten debates much.

An additional benefit of betting on (claimed) beliefs is that it provides insurance to those harmed by the actions driven by these beliefs. For example, climate change deniers claim that air pollution does little harm. Their purchases of property that will be damaged by a warming world allow climate change believers to short sell such assets. If the Earth then warms, the deniers lose money and the believers gain at their expense. This at least partially compensates the believers for the damage caused by the deniers' actions.

Why rational agents may react negatively to honesty

Emotional people may of course dislike an honest person, simply because his truthful opinion hurts their feelings. In contrast, rational agents' payoff cannot decrease when they get additional information, so they always benefit from honest feedback. However, rational decision makers may still adjust their attitude towards a person making truthful, informative statements in the negative direction. The reason is Bayesian updating about two dimensions: the honesty of the person and how much the person cares about the audience's feelings. Both dimensions of belief positively affect the attitude towards the person. His truthful statements increase rational listeners' belief about his honesty, but may reduce the belief in his tactfulness, which may shift rational agents' opinions strongly enough in the negative direction to outweigh the benefit from honesty.

Information about how much the person cares has a greater relative effect, compared to news about his honesty, when the belief about his honesty is already relatively certain. In the limit, if the audience is completely convinced that the person is honest (or certain of his dishonesty), then the belief about his honesty stays constant no matter what he does, and only the belief about tact moves. Then telling an unpleasant truth unambiguously worsens the audience's attitude. Thus if a reasonably rational listener accuses a speaker of "brutal honesty" or tactlessness, it signals that the listener is fairly convinced either that the speaker is a liar or that he is a trustworthy type. An accusation of tactlessness may therefore be taken as an insult or a compliment, depending on one's belief about the accuser's belief about one's honesty.
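
As a minimal numerical sketch of this two-dimensional updating (the likelihoods and the attitude weights below are invented for illustration, not taken from the argument above):

```python
# Minimal sketch of two-dimensional Bayesian updating about a speaker:
# one dimension is honesty, the other is how much he cares (tact).
# The likelihoods and attitude weights are illustrative assumptions.

from itertools import product

prior = {(h, t): 0.25 for h, t in product(["honest", "liar"], ["tactful", "tactless"])}

# Assumed probability that each type makes a blunt, unpleasant but true statement.
p_blunt = {
    ("honest", "tactless"): 0.9,
    ("honest", "tactful"): 0.4,
    ("liar", "tactless"): 0.2,
    ("liar", "tactful"): 0.05,
}

# Bayes' rule: posterior over the four types after hearing the blunt truth.
joint = {k: prior[k] * p_blunt[k] for k in prior}
total = sum(joint.values())
posterior = {k: v / total for k, v in joint.items()}

p_honest = sum(v for (h, t), v in posterior.items() if h == "honest")
p_tactful = sum(v for (h, t), v in posterior.items() if t == "tactful")

# Attitude increases in both beliefs; suppose the listener weights tact twice
# as heavily as honesty (again an assumption for illustration).
attitude_before = 1 * 0.5 + 2 * 0.5
attitude_after = 1 * p_honest + 2 * p_tactful

print(f"P(honest)  : 0.50 -> {p_honest:.2f}")   # rises
print(f"P(tactful) : 0.50 -> {p_tactful:.2f}")  # falls
print(f"attitude   : {attitude_before:.2f} -> {attitude_after:.2f}")  # net fall
```

With these numbers, hearing the blunt truth raises the belief in honesty from 0.5 to about 0.84 and drops the belief in tact to about 0.29, so a listener who weights tact heavily enough ends up with a worse overall attitude towards the speaker.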

If tact takes effort, and the cost of this effort is lower for those who care about the audience’s emotions, then pleasant comments are an informative signal (in the Spence signalling sense) that the speaker cares about the feelings of others. In that case the inference that brutal honesty implies an uncaring nature is correct.

On the other hand, if the utility of rational agents only depends on the information content of statements, not directly on their positive or negative emotional tone, then the rational agents should not care about the tact of the speaker. In this case, there is neither a direct reason for the speaker to avoid unpleasant truths (out of altruism towards the audience), nor an indirect benefit from signalling tactfulness. Attitudes would only depend on one dimension of belief: the one about honesty. Then truthfulness cannot have a negative effect.

Higher order beliefs may still cause honesty to be interpreted negatively even when rational agents’ utility does not depend on the emotional content of statements. The rational listeners may believe that the speaker believes that the audience’s feelings would be hurt by negative comments (for example, the speaker puts positive probability on irrational listeners, or on their utility directly depending on the tone of the statements they hear), in which case tactless truthtelling still signals not caring about others’ emotions.

On the optimal burden of proof

All claims should be considered false until proven otherwise, because lies can be invented much faster than they can be refuted. In other words, the maker of a claim has the burden of providing high-quality scientific proof, for example by referencing previous research on the subject. Strangely enough, some people seem to believe marketing, political spin and conspiracy theories even after such claims have been proven false. One can only wish that everyone received the consequences of their choices (so that karma works).
Considering all claims false until proven otherwise runs into a logical problem: a claim and its opposite claim cannot be simultaneously false. The priority for falsity should be given to actively made claims, e.g. someone saying that a product or a policy works, or that there is a conspiracy behind an accident. Especially suspect are claims that benefit their maker if people believe them. A higher probability of falsity should also be attached to positive claims, e.g. that something has an effect in whatever direction (as opposed to no effect) or that an event is due to non-obvious causes, not chance. The lack of an effect should be the null hypothesis. Similarly, ignorance and carelessness, not malice, should be the default explanation for bad events.
Sometimes two opposing claims are actively made and belief in them benefits their makers, e.g. in politics or when competing products are marketed. This is the hardest case to find the truth in, but a partial and probabilistic solution is possible. Until rigorous proof is found, one should keep an open mind. Keeping an open mind creates a vulnerability to manipulation: after some claim is proven false, its proponents often try to defend it by asking its opponents to keep an open mind, i.e. ignore evidence. In such cases, the mind should be closed to the claim until its proponents provide enough counter-evidence for a neutral view to be reasonable again.
To find which opposing claim is true, the first test is logic. If a claim is logically inconsistent with itself, then it is false by syntactic reasoning alone. A broader test is whether the claim is consistent with other claims of the same person. For example, Vladimir Putin said that there were no Russian soldiers in Crimea, but a month later gave medals to some Russian soldiers, citing their successful operation in Crimea. At least one of the claims must be false, because either there were Russian soldiers in Crimea or not. The way people try to weasel out of such self-contradictions is to say that the two claims referred to different time periods, definitions or circumstances; in other words, to change the interpretation of the words. A difficulty for the truth-seeker is that sometimes such a change in interpretation is a legitimate clarification. Tongues do slip. Nonetheless, a contradiction is probabilistic evidence for lying.
The second test for falsity is objective evidence. If there is a streetfight and the two sides accuse each other of starting it, then sometimes a security camera video can refute one of the contradicting claims. What evidence is objective is, sadly, subject to interpretation. Videos can be photoshopped, though it is difficult and time-consuming. The objectivity of the evidence is strongly positively correlated with the scientific rigour of its collection process. "Hard" evidence is a signal of the truth, but a probabilistic signal. In this world, most signals are probabilistic.
The third test of falsity is the testimony of neutral observers, preferably several of them, because people misperceive and misremember even under the best intentions. The neutrality of observers is again up for debate and interpretation. In some cases, an observer is a statistics-gathering organisation. Just like objective evidence, testimony and statistics are probabilistic signals.
The fourth test of falsity is the testimony of interested parties, to which the above caveats apply even more strongly.
Integrating conflicting evidence should use Bayes' rule, because it keeps probabilities consistent. Consistency helps glean information about one aspect of the question from data on other aspects. Background knowledge should be combined with the evidence, for example by ruling out physical impossibilities. If a camera shows a car disappearing behind a corner and immediately reappearing, moving in the opposite direction, then physics says that the original car couldn't have changed direction so fast. The appearing car must be a different one. Knowledge of human interactions and psychology is part of the background information, e.g. if smaller, weaker and outnumbered people rarely attack the stronger and more numerous, then this provides probabilistic info about who started a fight. Legal theory incorporates background knowledge of human nature to get information about the crime – human nature suggests motives. Asking "Who benefits?" has a long history in law.
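A minimal sketch of such an update for the streetfight example, with invented numbers and the signals assumed conditionally independent given the truth:

```python
# Combining a background-knowledge prior with two probabilistic signals via
# Bayes' rule. All numbers are assumptions made up for the example.

# Hypothesis: the smaller, outnumbered person started the fight.
prior = 0.2  # background knowledge: the weaker side rarely attacks first

# Signal 1: a security-camera clip that seems to show the smaller person
# striking first. Assume it shows this in 90% of cases where he did start it,
# and (through bad angles or editing) in 15% of cases where he did not.
p_video_if_guilty, p_video_if_innocent = 0.9, 0.15

# Signal 2: a neutral bystander says the bigger group attacked first.
# Assume the bystander is right 70% of the time regardless of the truth.
p_testimony_if_guilty, p_testimony_if_innocent = 0.3, 0.7

def update(p, p_signal_if_true, p_signal_if_false):
    """One step of Bayes' rule for a binary hypothesis."""
    numerator = p_signal_if_true * p
    return numerator / (numerator + p_signal_if_false * (1 - p))

p = update(prior, p_video_if_guilty, p_video_if_innocent)      # after the video
p = update(p, p_testimony_if_guilty, p_testimony_if_innocent)  # after the testimony
print(f"P(smaller person started it) = {p:.2f}")
```

Each piece of evidence shifts the probability by its likelihood ratio: the video pushes the hypothesis up, the testimony pulls it back down, and neither settles the question on its own.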

On simple answers

Bayes’ rule exercise: is a simple or a complicated answer to a complicated problem more likely to be correct?

It depends on the conditional probabilities: if simple questions are more likely to have simple answers and complicated questions complicated ones, then a complicated answer is more likely to be correct for a complicated problem.

It seems reasonable that the complexity of the answer is correlated with the difficulty of the problem. But this is an empirical question.
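
As a worked version of the exercise under assumed numbers (10% of answers offered to a complicated problem are correct; 80% of the correct ones are complicated, versus 50% of the incorrect ones):

```python
# Bayes'-rule version of the exercise, with made-up numbers.
p_correct = 0.10                    # prior that a randomly offered answer is correct
p_complicated_given_correct = 0.80  # complex problems tend to have complex answers
p_complicated_given_wrong = 0.50    # wrong answers come in both flavours

def posterior_correct(p_type_given_correct, p_type_given_wrong):
    num = p_type_given_correct * p_correct
    return num / (num + p_type_given_wrong * (1 - p_correct))

# Posterior that an answer is correct, given its complexity.
print(posterior_correct(p_complicated_given_correct, p_complicated_given_wrong))      # ~0.15
print(posterior_correct(1 - p_complicated_given_correct, 1 - p_complicated_given_wrong))  # ~0.04
```

Under these assumptions, a complicated answer to the complicated problem is correct with probability about 0.15, a simple one with only about 0.04.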

If difficult problems are likely to have complex answers, then this is an argument against slogans and ideologies. These seek to give a catchy one-liner as the answer to many problems in society. No need to think – ideology has the solution. Depending on your political leaning, poverty may be due to laziness or exploitation. The foreign policy “solution” is bombing for some, eternal appeasement for others.

The probabilistic preference for complex answers in complicated situations seems to contradict Occam’s razor (among answers equally good at explaining the facts, the simplest answer should be chosen). There is no actual conflict with the above Bayesian exercise. There, the expectation of a complex answer applies to complicated questions, while a symmetric anticipation of a simple answer holds for simple problems. The answers compared are not equally good, because one fits the structure of the question better than the other.

Which ideology is more likely to be wrong?

Exercise in Bayes’ rule: is an ideology more likely to be wrong if it appeals relatively more to poor people than the rich?

More manipulable folks are more likely to lose their money, so less likely to be rich. Stupid people have a lower probability of making money. By Bayes, the rich are on average less manipulable and more intelligent than the poor.

Less manipulable people are less likely to find an ideology built on fallacies appealing. By Bayes, an ideology relatively more appealing to the stupid and credulous is more likely to be wrong. Due to such people being poor with a higher probability, an ideology embraced more by the poor than the rich is more likely to be fallacious.

Another exercise: is an ideology more likely to be wrong if academics like it relatively more than non-academics?

Smarter people are more likely to become academics, so by Bayes’ rule, academics are more likely to be smart. Intelligent people have a relatively higher probability of liking a correct ideology, so by Bayes, an ideology appealing to the intelligent is more likely to be correct. An ideology liked by academics is correct with a higher probability.
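
A numerical sketch of the second exercise, with all probabilities invented for illustration (the first exercise, about the poor and the rich, works the same way with "rich" in place of "academic"):

```python
# Two chained applications of Bayes' rule, with assumed numbers.

p_smart = 0.3                                   # base rate of high intelligence
p_acad_given_smart, p_acad_given_dull = 0.20, 0.02

# Step 1: academics are more likely to be smart than the base rate.
p_acad = p_smart * p_acad_given_smart + (1 - p_smart) * p_acad_given_dull
p_smart_given_acad = p_smart * p_acad_given_smart / p_acad
p_smart_given_nonacad = p_smart * (1 - p_acad_given_smart) / (1 - p_acad)

# Assume smart people like an ideology with probability 0.6 if it is correct
# and 0.2 if it is wrong, while others like it with probability 0.4 either way.
def p_like(p_smart_in_group, correct):
    p_like_smart = 0.6 if correct else 0.2
    return p_smart_in_group * p_like_smart + (1 - p_smart_in_group) * 0.4

# Step 2: posterior that the ideology is correct (prior 0.5),
# given that a member of each group likes it.
for group, p_s in [("academic", p_smart_given_acad), ("non-academic", p_smart_given_nonacad)]:
    like_if_correct, like_if_wrong = p_like(p_s, True), p_like(p_s, False)
    posterior = 0.5 * like_if_correct / (0.5 * like_if_correct + 0.5 * like_if_wrong)
    print(f"P(ideology correct | liked by a {group}) = {posterior:.2f}")
```

With these numbers, an ideology liked by an academic is correct with probability about 0.70, versus about 0.56 when liked by a non-academic.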

A random world as an argument against fanaticism

Theoretical physicists may debate whether the universe is random or not, but for practical purposes it is, because any sufficiently complicated deterministic system looks random to someone who does not fully understand it. This is the example from Lipman (1991) “How to decide how to decide…”: even if a complicated deterministic function is written down in full, its output still looks random to a person who cannot calculate it.
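A toy illustration of the point (the particular rule below is my choice, not Lipman's): a fully deterministic map whose output looks like noise to anyone who cannot compute it.

```python
# A simple deterministic rule whose output looks random without knowledge of the rule.
x = 0.1234  # known starting point; everything that follows is determined by it
sequence = []
for _ in range(20):
    x = 4.0 * x * (1.0 - x)   # logistic map: deterministic, but chaotic
    sequence.append(round(x, 3))

print(sequence)  # to someone who cannot compute the map, this looks like noise
```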
If the world is random, we should not put probability one on any event. Nothing is certain, so any fanatical belief that some claim is certainly true is almost certainly wrong. This applies to religion, ideology, personal memories and also things right before your eyes. The eyes can deceive, as evidenced by the numerous visual illusions invented and published in the past. If you see your friend, is that really the same person? How detailed a memory of your friend’s face do you have? Makeup can alter appearance quite radically (http://www.mtv.com/news/1963507/woman-celebrity-makeup-transformation/).
This way lies paranoia, but actually in a random world, a tiny amount of paranoia about everything is appropriate. A large amount of paranoia, say putting probability more than 1% on conspiracy theories, is probably a wrong belief.
How to know whether something is true then? A famous quote: “Everything is possible, but not everything is likely” points the way. Use logic and statistics, apply Bayes’ rule. Statistics may be wrong, but they are much less likely to be wrong than rumours. A source that was right in the past is more likely to be right at present than a previously inaccurate source. Science does not know everything, but this is not a reason to believe charlatans.

Evaluating the truth and the experts simultaneously

When evaluating an artwork, the guilt of a suspect or the quality of theoretical research, the usual procedure is to gather the opinions of a number of people and take some weighted average of these. There is no objective measure of the truth or the quality of the work. What weights should be assigned to different people’s opinions? Who should be counted an expert or knowledgeable witness?
A circular problem appears: the accurate witnesses are those who are close to the truth, and the truth is close to the average claim of the accurate witnesses. This can be modelled as a set of signals with unknown precision. Suppose the signals are normally distributed with mean equal to the truth (witnesses unbiased, just have poor memories). If the precisions were known, then these could be used as weights in the weighted average of the witness opinions, which would be an unbiased estimate of the truth with minimal variance. If the truth were known, then the distance of the opinion of a witness from it would measure the accuracy of that witness. But both precisions and the truth are unknown.
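In notation introduced here (not used in the original post), if witness i reports x_i and the precision of his signal, tau_i (the reciprocal of its variance), were known, the benchmark mentioned above would be the precision-weighted average:

```latex
\hat{\mu} \;=\; \frac{\sum_i \tau_i x_i}{\sum_i \tau_i},
\qquad
\operatorname{Var}(\hat{\mu}) \;=\; \frac{1}{\sum_i \tau_i},
```

which is the minimum-variance unbiased linear combination of the opinions; the circular problem is precisely that the tau_i are unknown.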
Simultaneously determining the precisions of the signals and the estimate of the truth may have many solutions. If there are two witnesses with different claims, we could assign the first witness infinite precision and the second finite, and estimate the truth to equal the opinion of the first witness. The truth is derived from the witnesses and the precisions are derived from the truth, so this is consistent. The same applies with witnesses switched.
A better solution takes a broader view and simultaneously estimates witness precisions and the truth. These form a vector of random variables. Put a prior probability distribution on this vector and use Bayes’ rule to update this distribution in response to the signals (the witness opinions).
The solution of course depends on the chosen prior. If one witness is assumed infinitely precise and the others only finitely precise, then the updating rule keeps these precisions and estimates the truth to equal the opinion of the infinitely precise witness. The assumption of a prior seems unavoidable, but at least it makes clear why the multiple solutions arise.
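A minimal sketch of this joint updating with two witnesses, a discretised truth and a two-point prior on each witness's precision (all numbers invented); changing the prior probability that a witness is precise visibly moves the estimate of the truth:

```python
# Joint Bayesian updating of the truth and the witnesses' precisions,
# on a grid, with made-up opinions and an asymmetric prior over precisions.
import numpy as np
from itertools import product

opinions = [2.0, 6.0]                  # the two witnesses' claims (made up)
truth_grid = np.linspace(-5, 15, 401)  # discretised support for the truth
precision_levels = [0.1, 4.0]          # each witness is either vague or sharp
p_sharp = [0.9, 0.5]                   # prior: witness 1 more likely to be sharp

def normal_pdf(x, mean, precision):
    return np.sqrt(precision / (2 * np.pi)) * np.exp(-0.5 * precision * (x - mean) ** 2)

posterior_truth = np.zeros_like(truth_grid)
post_sharp = [0.0, 0.0]
total = 0.0

for taus in product(precision_levels, repeat=len(opinions)):
    # prior weight of this precision profile (independent across witnesses)
    w = np.prod([p_sharp[i] if t == precision_levels[1] else 1 - p_sharp[i]
                 for i, t in enumerate(taus)])
    # likelihood of the opinions at every candidate value of the truth
    like = w * np.ones_like(truth_grid)
    for x, t in zip(opinions, taus):
        like = like * normal_pdf(truth_grid, x, t)
    posterior_truth += like
    mass = like.sum()
    total += mass
    for i, t in enumerate(taus):
        if t == precision_levels[1]:
            post_sharp[i] += mass

posterior_truth /= total
estimate = float((truth_grid * posterior_truth).sum())
print("posterior mean of the truth:", round(estimate, 2))
print("P(witness is sharp):", [round(p / total, 2) for p in post_sharp])
```

With this prior, most of the posterior weight lands on the profile in which the first witness is the precise one, so the estimate of the truth ends up near his claim; a symmetric prior pushes the estimate back towards the midpoint, which is the prior dependence described above.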

Retaking exams alters their informativeness

If only those who fail are allowed to retake an exam and it is not reported whether a grade comes from the first exam or a retake, then those who fail get an advantage: their final grade is the better of two attempts, while everyone else gets only one attempt.
A simple example has two types of exam takers: H and L, with equal proportions in the population. The type may reflect talent or preparation for the exam. There are three grades: A, B, C. The probabilities for each type to receive a certain grade from any given attempt of the exam are, for H, Pr(A|H)=0.3, Pr(B|H)=0.6, Pr(C|H)=0.1 and, for L, Pr(A|L)=0.2, Pr(B|L)=0.1, Pr(C|L)=0.7. The H type is more likely to get better grades, but there is noise in the grade.
After the retake of the exam, the probabilities for H to end up with each grade are Pr*(A|H)=0.33, Pr*(B|H)=0.66 and Pr*(C|H)=0.01. For L, Pr*(A|L)=0.34, Pr*(B|L)=0.17 and Pr*(C|L)=0.49. So the L type ends up with an A grade more frequently than H, due to retaking the exam 70% of the time, as opposed to H's 10%.
If the observers of the grades are rational, they will infer by Bayes’ rule Pr(H|A)=33/67, Pr(H|B)=66/83 and Pr(H|C)=1/50.
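A short calculation reproducing these numbers (the grade probabilities and the 50/50 type split are as given above):

```python
# Only a C can be retaken, the reported grade is the better of the two attempts,
# and observers cannot tell a retake apart from a first attempt.

types = {"H": {"A": 0.3, "B": 0.6, "C": 0.1},
         "L": {"A": 0.2, "B": 0.1, "C": 0.7}}
prior = {"H": 0.5, "L": 0.5}

# Grade distribution after an (unobserved) retake of a failed first attempt.
final = {}
for t, p in types.items():
    retake = p["C"]  # probability of failing the first attempt
    final[t] = {"A": p["A"] + retake * p["A"],
                "B": p["B"] + retake * p["B"],
                "C": retake * p["C"]}

print({t: {g: round(v, 2) for g, v in d.items()} for t, d in final.items()})
# H: 0.33, 0.66, 0.01   L: 0.34, 0.17, 0.49

# Bayes' rule: what a rational observer infers from the reported grade.
for grade in ["A", "B", "C"]:
    num = prior["H"] * final["H"][grade]
    den = num + prior["L"] * final["L"][grade]
    print(f"Pr(H|{grade}) = {num / den:.3f}")  # 33/67, 66/83, 1/50
```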
It is probably to counter the advantage of retakers that some universities in the UK discount grades obtained from retaking exams (http://www.telegraph.co.uk/education/universityeducation/10236397/University-bias-against-A-level-resit-pupils.html). At the University of Queensland, those who fail a course can take a supplementary exam, but the grade is distinguished on the transcript from a grade obtained on the first try. Also, the maximum grade possible from a supplementary exam is one step above failure – the three highest grades cannot be obtained.