
Probability of finding true love

The concept of true love has been invented by poets and other exaggerators. Evolutionarily, the optimal strategy is to settle with a good enough partner, not to seek the best in the world. But suppose for the sake of argument that a person A’s true love is another person B who exists somewhere in the world. What is the probability that A meets B?

There is no a priori reason why A and B have to be from the same country, or have similar wealth or political views. Isn’t that what poets would have us believe – that love knows no boundaries, blossoms in unlikely places, etc.?

Given the 7 billion people in the world, what fraction of them does a given person meet per lifetime? That depends on what is meant by “meets” – seeing each other from a distance, walking past each other on the street, looking at each other, talking casually. Let’s take literally the cliché “love at first sight” and assume that meeting means looking at each other. A person looks at a different number of people per day depending on whether they live in a city or in the countryside. There is also repetition, i.e. seeing the same person multiple times. A guess at the average number of new people a person looks at per day is 100. This times 365 days times a 70-year lifespan is 2,555,000. Dividing 7 billion by this, the odds of meeting one’s true love are about one in three thousand per lifetime.
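A quick sanity check of this arithmetic in Python (all inputs are the guesses stated above, not data):

```python
# Back-of-the-envelope check of the lifetime-meeting estimate.
new_people_per_day = 100          # guessed average of new people looked at per day
days_per_year = 365
lifespan_years = 70
world_population = 7_000_000_000

people_met = new_people_per_day * days_per_year * lifespan_years
odds_against = world_population / people_met

print(people_met)            # 2555000 people met per lifetime
print(round(odds_against))   # 2740, i.e. roughly one in three thousand
```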

Some questionable assumptions went into this conclusion, for example that the true love could be of any gender or age and that the meeting rate is 100 per day. Restricting the set of candidates to a particular gender and age group proportionately lowers both the number of candidates met and the total number of candidates, and so leaves the conclusion unchanged.

Someone on the hunt for a partner may move to a big city or sign up for dating websites and thereby raise the meeting rate (raising the number of candidates met while keeping the total number constant), which would improve the odds. On the other hand, if recognizing one’s true love takes more than looking at them, e.g. a conversation, then the meeting rate could fall below one – how many new people do you have a conversation with per day?

Some people claim to have met their true love, at least in the hearing of their current partner. The fraction claiming this is larger than would be expected based on the calculations above. There may be cognitive dissonance at work (reinterpreting the facts so that one’s past decision looks correct). Or perhaps the perfect partner is with high probability from the same ethnic and socioeconomic background and the same high school class (this is called homophily in sociology). Then love blossoms in the most likely places.

Statistics with a single history

Only one history is observable to a person – the one that actually happened. Counterfactuals are speculation about what would have happened if choices or some other element of the past history had differed. Only one history is observable to humanity as a whole, to all thinking beings in the universe as a whole, etc. This raises the question of how to do statistics with a single history.

The history is chopped into small pieces, which are assumed similar to each other and to future pieces of history. All conclusions require assumptions. In the case of statistics, the main assumption is “what happened in the past will continue to happen in the future.” The “what” that is happening can be complicated – a long chaotic pattern can be repeated. Before discussing the patterns of history, it should be specified what they consist of.

The history observable to a brain consists of the sensory inputs and memory. Nothing else is accessible. This is pointed out by the “brain in a jar” thought experiment. Memory is partly past sensory inputs, but may also depend on spontaneous changes in the brain. Machinery can translate previously unobservable aspects of the world into accessible sensory inputs, for example convert infrared and ultraviolet light into visible wavelengths. Formally, history is a function from time to vectors of sensory inputs.

The brain has a built-in ability to classify sensory inputs by type – visual, auditory, etc. This is why the inputs form a vector. For a given sense, there is a built-in “similarity function” that enables comparing inputs from the same sense at different times.

Inputs distinguished by one person, perhaps with the help of machinery, may look identical to another person. The interpretation is that there are underlying physical quantities that must differ by more than the “just noticeable difference” to be perceived as different. The brain can access physical quantities only through the senses, so whether there is a “real world” cannot be determined, only assumed. If most people’s perceptions agree about something, and machinery also agrees (e.g. measuring tape does not agree with visual illusions), then this “something” is called real and physical. The history accessible to humanity as a whole is a function from time to the concatenation of their sensory input vectors.

The similarity functions of people can also be aggregated, compared to machinery and the result interpreted as a physical quantity taking “similar” values at different times.

A set of finite sequences of vectors of sensory inputs is what I call a pattern of history. For example, a pattern can be a single sequence or everything but a given sequence. Patterns may repeat, due to the indistinguishability of physical quantities close to each other. The finer the distinctions one can make, the fewer the instances with the same perception. In the limit of perfect discrimination of all variable values, history is unlikely to ever repeat. In the limit of no perception at all, history is one long repetition of nothing happening. The similarity of patterns is defined based on the similarity function in the brain.

Repeated similar patterns together with assumptions enable learning and prediction. If AB is always followed by C, then learning is easy. Statistics are needed when this is not the case. If half the past instances of AB are followed by C, half by D, then one way to interpret this is by constructing a state space with a probability distribution on it. For example, one may assume the existence of an unperceived variable that can take values c, d and assume that ABc leads deterministically to ABC and ABd to ABD. The past instances of AB can be interpreted as split into equal numbers of ABc and ABd. The prediction after observing AB is equal probabilities of C and D. This is a frequentist setup.
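A minimal sketch of this frequentist setup: count the past continuations of a pattern and predict with the empirical frequencies. The toy history string below is my own illustration, not anything from the text:

```python
from collections import Counter

def frequentist_predict(history, pattern):
    """Predict the next symbol after `pattern` by counting past continuations."""
    counts = Counter()
    n = len(pattern)
    for i in range(len(history) - n):
        if history[i:i + n] == pattern:
            counts[history[i + n]] += 1
    total = sum(counts.values())
    return {sym: c / total for sym, c in counts.items()}

# Past instances of AB are followed half by C, half by D.
history = "ABCABDABCABD"
print(frequentist_predict(history, "AB"))  # {'C': 0.5, 'D': 0.5}
```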

A Bayesian interpretation puts a prior probability distribution on histories and updates it based on the observations. The prior may put probability one on a single future history after each past one. Such a deterministic prediction is easily falsified – one observation contrary to it suffices. Usually, many future histories are assumed to have positive probability. Updating requires conditional probabilities of future histories given the past. The histories that repeat past patterns are usually given higher probability than others. Such a conditional probability system embodies the assumption “what happened in the past will continue to happen in the future.”
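A hedged sketch of the Bayesian version, with two hypothetical models of what follows AB – a deterministic one (“always C”) and a fifty-fifty one. The hypothesis names and numbers are my own illustration:

```python
def bayes_update(prior, likelihood, observation):
    """Posterior over hypotheses after one observation (Bayes' rule)."""
    unnorm = {h: prior[h] * likelihood[h].get(observation, 0.0) for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

prior = {"always_C": 0.5, "fifty_fifty": 0.5}
likelihood = {"always_C": {"C": 1.0}, "fifty_fifty": {"C": 0.5, "D": 0.5}}

# One C observation shifts belief toward the deterministic hypothesis (2/3 vs 1/3).
posterior = bayes_update(prior, likelihood, "C")
print(posterior)

# A single D falsifies "always C" - its posterior probability drops to zero.
posterior = bayes_update(posterior, likelihood, "D")
print(posterior)
```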

There is a tradeoff between the length of a pattern and the number of times it has repeated. Longer patterns permit prediction further into the future, but fewer repetitions mean more uncertainty. Much research in statistics has gone into finding the optimal pattern length given the data. A long pattern contains many shorter ones, with potentially different predictions. Combining information from different pattern lengths is also a research area. Again, assumptions determine which pattern length and combination is optimal. Assumptions can be tested, but only under other assumptions.
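The tradeoff can be illustrated by counting, in a fixed toy history, how often the most frequent pattern of each length repeats – longer patterns necessarily repeat no more often than the shorter patterns they contain. The history string is my own example:

```python
def repetition_counts(history, max_len):
    """For each pattern length, count repetitions of the most common pattern."""
    counts = {}
    for n in range(1, max_len + 1):
        seen = {}
        for i in range(len(history) - n + 1):
            pat = history[i:i + n]
            seen[pat] = seen.get(pat, 0) + 1
        counts[n] = max(seen.values())
    return counts

# Longer patterns repeat at most as often as shorter ones,
# so their predictions rest on less evidence.
print(repetition_counts("ABCABDABCABD", 6))  # {1: 4, 2: 4, 3: 2, 4: 2, 5: 2, 6: 2}
```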

Causality is also a mental construct. It is based on past repetitions of an AB-like pattern, without occurrence of BA or CB-like patterns.

The perception of time is created by sensory inputs and memory, e.g. seeing light and darkness alternate, feeling sleepy or alert due to the circadian rhythm and remembering that this has happened before. History is thus a mental construct. It relies on the assumptions that time exists, that there is a past in which things happened and that current recall is correlated with what actually happened. The preceding discussion should be restated without assuming that time exists.


Bayesian vs frequentist statistics – how to decide?

Which predicts better, Bayesian or frequentist statistics? This is an empirical question. To find out, should we compare their predictions to the data using Bayesian or frequentist statistics? What if Bayesian statistics says frequentist is better and frequentist says Bayesian is better (Liar’s paradox)? To find the best method for measuring the quality of the predictions, should we use Bayesianism or frequentism? And to find the best method to find the best method for comparing predictions to data? How to decide how to decide how to decide, as in Lipman (1991)?