Tag Archives: statistics

If top people have families and hobbies, then success is not about productivity

Assume:

1 Productivity is continuous and weakly increasing in talent and effort.

2 The sum of efforts allocated to all activities is bounded, and this bound is similar across people.

3 Families and hobbies take some effort, thus less is left for work. (For this assumption to hold, it may be necessary to focus on families with children in which the partner is working in a different field. Otherwise, a stay-at-home partner may take care of the cooking and cleaning, freeing up time for the working spouse to allocate to work. A partner in the same field of work may provide a collaboration synergy. In both cases, the productivity of the top person in question may increase.)

4 The talent distribution is similar for people with and without families or hobbies. This assumption would be violated if for example talented people are much better at finding a partner and starting a family.

Under these assumptions, reasonably rational people would be more productive without families or hobbies. If success is mostly determined by productivity, then people without families should be more successful on average. In other words, most top people in any endeavour would not have families or hobbies that take time away from work.

In short, if responsibilities and distractions cause lower productivity, and productivity causes success, then success is negatively correlated with such distractions. Therefore, if successful people have families at a frequency similar to or greater than that of the general population, then success is not driven by productivity.
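To make the logic concrete, below is a minimal Monte Carlo sketch of assumptions 1–4; all numbers are invented for illustration. If success tracked productivity, the family share among the top would come out clearly below the population share.

```python
# Sketch under assumptions 1-4: identical talent distributions (assumption 4),
# less effort left for work with a family (assumption 3), productivity weakly
# increasing in talent and effort (assumption 1). All parameters are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
has_family = rng.random(n) < 0.6            # 60% have families, independent of talent
talent = rng.normal(0, 1, n)                # same talent distribution in both groups
effort = np.where(has_family, 0.7, 1.0)     # families and hobbies take some effort
productivity = talent + effort              # increasing in talent and effort

top = productivity >= np.quantile(productivity, 0.99)   # "top people" = top 1%
print(f"family share in the population: {has_family.mean():.2f}")
print(f"family share among the top:     {has_family[top].mean():.2f}")
# If success is determined by productivity, the second share is clearly lower.
```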

One counterargument is that people first become successful and then start families. In order for this to explain the similar fractions of singles among top and bottom achievers, the rate of family formation after success must be much greater than among the unsuccessful, because catching up from a late start requires a higher rate of increase.

Another explanation is irrationality of a specific form – one which reduces the productivity of high effort significantly below that of medium effort. Then single people with lots of time for work would produce less through their high effort than those with families and hobbies via their medium effort. Productivity per hour naturally falls with increasing hours, but the issue here is total output (the hours times the per-hour productivity). An extra work hour has to contribute negatively to success to explain the lack of family-success correlation. One mechanism for a negative effect of hours on output is burnout of workaholics. For this explanation, people have to be irrational enough to keep working even when their total output falls as a result.
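A toy calculation of this burnout mechanism, with an invented per-hour productivity curve, is below; the point is that total output can fall in hours even though each hour still produces something.

```python
# Illustration only: per-hour productivity p(h) falls with weekly hours h.
# Total output is h * p(h); an extra hour lowers total output once p(h)
# falls faster than 1/h, e.g. under a steep hypothetical burnout curve.
def weekly_output(hours, burnout=80.0):
    per_hour = max(0.0, 1.0 - (hours / burnout) ** 2)   # made-up burnout curve
    return hours * per_hour

for h in (40, 50, 60, 70):
    print(h, "hours ->", round(weekly_output(h), 1), "units of output")
# Under this curve, output peaks in the mid-40s of hours and declines after that,
# so a workaholic single person can produce less than someone working moderate hours.
```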

If the above explanations seem unlikely but the assumptions reasonable in a given field of human endeavour, then reaching the top and staying there is mostly not about productivity (talent and effort) in this field. For example, in academic research.

A related empirical test of whether success in a given field is caused by productivity is to check whether people from countries or groups that rank as more corrupt on corruption indices disproportionately succeed in this field. Either conditional on entering the field or unconditionally. In academia, in fields where convincing others is more important than the objective correctness of one’s results, people from more nepotistic cultures should have an advantage. The same applies to journals – the general interest ones care relatively more about a good story, the field journals more about correctness. Do people from more corrupt countries publish relatively more in general interest journals, given their total publications? Of course, conditional on their observable characteristics like the current country of employment.
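A sketch of how this test might look on hypothetical researcher-level data follows; the file and column names are assumptions, not an existing dataset.

```python
# Hypothetical regression: does origin-country corruption predict the share of a
# researcher's publications in general-interest journals, holding observables fixed?
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("researchers.csv")   # assumed columns: general_interest_share,
                                      # origin_corruption, total_pubs, employer_country
model = smf.ols(
    "general_interest_share ~ origin_corruption + total_pubs + C(employer_country)",
    data=df,
).fit()
print(model.summary())
# A positive and significant coefficient on origin_corruption would be consistent
# with storytelling and connections mattering more than correctness in those venues.
```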

Another related test for meritocracy in academia or the R&D industry is whether coauthored publications and patents are divided by the number of coauthors in their influence on salaries and promotions. If there is an established ranking of institutions or job titles, then do those at higher ranks have more quality-weighted coauthor-divided articles and patents? The quality-weighting is the difficult part, because usually there is no independent measure of quality (unaffected by the dependent variable, be it promotions, salary, publication venue).

Learning and evolution switch the sign of autocorrelations

Animals are more successful if they learn or evolve to predict locations of food, mates and predators. Prediction of anything relies on correlations over time in the environment. These correlations may be positive or negative. Learning is more difficult if the sign of the correlation switches over time, which occurs in nature due to resource depletion, learning and evolution.

If a herbivore eats a tasty patch of plants or a predator a nest full of eggs, then the next day that food is not there (negative correlation), but the next year at the same time it is probably there again (positive correlation) because the plants regrow from roots or seeds, and if the prey found the nesting spot attractive one year, then other members of the prey species will likely prefer it the next year as well. However, over many generations, if the plants in that location get eaten before dispersing seeds or the young in that nest before breeding, then the prey will either learn or evolve to avoid that location, or go extinct. This makes the autocorrelation negative again on sufficiently long timescales.
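A toy simulation of this patch example (all parameters invented) displays both signs at once: the autocorrelation is negative at a lag of one day and positive at a lag of one year.

```python
# A patch is eaten whenever food is present; it regrows after a random few days,
# but only during a 100-day season each year. Parameters are made up.
import numpy as np

rng = np.random.default_rng(1)
years, season = 20, range(100, 200)
x = np.zeros(365 * years)                 # 1 = food found and eaten that day
wait = 0                                  # days until the patch has regrown
for t in range(len(x)):
    if (t % 365) in season and wait == 0:
        x[t] = 1                          # eat the patch
        wait = rng.integers(2, 5)         # regrows after 2-4 days
    elif wait > 0:
        wait -= 1

def autocorr(series, lag):
    return np.corrcoef(series[:-lag], series[lag:])[0, 1]

print("lag 1 day :", round(autocorr(x, 1), 2))    # negative: eaten today, gone tomorrow
print("lag 1 year:", round(autocorr(x, 365), 2))  # positive: same season next year
```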

Positive correlation is the easiest to learn – just keep doing the same thing and achieve the same successful outcome. Negative correlation is harder, because the absence of success at one time predicts success from the same action at another time, and vice versa. Learning a changing correlation requires a multi-parameter mental model of the superimposed different-frequency oscillations of resource abundance.

There is a tradeoff between exploiting known short-period correlations and experimenting to learn longer-period correlations. There may always be a longer pattern to discover, but finite lifetimes make learning very low-frequency events not worthwhile.

The most liveable cities rankings are suspicious

The “most liveable cities” rankings do not publish their methodology, only vague talk about a weighted index of healthcare, safety, economy, education, etc. An additional suspicious aspect is that the top-ranked cities are all large – there are no small towns. There are many more small than big cities in the world (this is known as Zipf’s law), so by chance alone, one would expect most of the top-ranked towns in any ranking that is not size-based to be small. The liveability rankings do not mention restricting attention to sizes above some cutoff. Even if a minimum size were required, one would expect most of the top-ranked cities to be close to this lower bound, just based on the size distribution.

The claimed ranking methodology includes several variables one would expect to be negatively correlated with the population of a city (safety, traffic, affordability). The only plausible positively size-associated variables are culture and entertainment, if these measure the total number of venues and events, not the per-capita number. Unless the index weights entertainment very heavily, one would expect big cities to be at a disadvantage in the liveability ranking based on these correlations, i.e. the smaller the town, the greater its probability of achieving a given liveability score and placing in the top n in the rankings. So the “best places to live” should be almost exclusively small towns. Rural areas not so much, because these usually have limited access to healthcare, education and amenities. The economy of remote regions grows less overall and the population is older, but some (mining) boom areas radically outperform cities in these dimensions. Crime is generally low, so if rural areas were included in the liveability index, then some of these would have a good chance of attaining top rank.
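A quick sketch of the size argument with an invented scoring rule: draw city sizes from a Zipf-like distribution, let the liveability score decline mildly with size, and look at the top of the ranking.

```python
# All functional forms and weights are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
size = (rng.pareto(1.0, n) + 1) * 10_000           # Zipf-like: many small towns, few big cities
score = -0.5 * np.log(size) + rng.normal(0, 1, n)  # safety, traffic, affordability worse in big cities
top10 = np.argsort(score)[-10:]                    # the "10 most liveable" places
print("median population overall:", int(np.median(size)))
print("populations of the top 10:", np.sort(size[top10]).astype(int))
# With any score that is not strongly increasing in size, the top 10 are almost all small towns.
```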

For any large city, there exists a small town with better healthcare, safety, economy, education, younger population, more entertainment events per capita, etc (easy examples are university towns). The fact that these do not appear at the top of a liveability ranking should raise questions about its claimed methodology.

The bias in favour of bigger cities probably comes from sample selection and hometown patriotism. If people vote mostly for their own city and the respondents of the liveability survey are either chosen from the population approximately uniformly at random or the sample is weighted towards larger cities (online questionnaires have this bias), then most of the votes will favour big cities.

Blind testing of bicycle fitting

Claims that getting a professional bike fit significantly improves riding comfort and speed and reduces overuse injuries seem suspicious – how can a centimetre here or there make such a large difference? A very wrong fit (e.g. an adult using a children’s bike) of course creates big problems, but most people can adjust their bike to a reasonable fit based on a few online suggestions.

Determining the actual benefit of a bike fit requires a randomised trial: have professionals determine the bike fit for a large enough sample of riders, measure and record the objective parameters of the fit (centimetres of seatpost out of the seat tube, handlebar height from the ground, pedal crank length, etc). Then randomly change the fit by a few centimetres or leave it unchanged, without the cyclist knowing, and let the rider test the bike. Record the speed, ask the rider to rate the comfort, fatigue, etc. Repeat for several random changes in fit. Statistically test whether the average speed, comfort rating and other outcome variables across the sample of riders are better with the actual fit or with small random changes. To eliminate the placebo effect, blind testing is important – the cyclists should not know whether and how the fit has been changed.
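The analysis of such a trial could be as simple as a within-rider paired comparison; the data file and column names below are hypothetical.

```python
# Each rider is tested both with the professional fit and with a small random change,
# blinded to which is which; outcomes are compared within rider.
import pandas as pd
from scipy import stats

df = pd.read_csv("bike_fit_trial.csv")    # assumed columns: speed_pro_fit, speed_perturbed_fit
res = stats.ttest_rel(df["speed_pro_fit"], df["speed_perturbed_fit"])
print(f"paired t-test on speed: t = {res.statistic:.2f}, p = {res.pvalue:.3f}")
# Repeat for comfort and fatigue ratings; a non-significant difference would suggest
# that a few centimetres either way do not matter much for most riders.
```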

Another approach is to have each rider test a large sample of different bike fits, find the best one empirically, record its objective parameters and then have a sample of professional fitters (who should not know what empirical fit was found) choose the best fit. Test statistically whether the professionals choose the same fit as the cyclist.

A simpler trial that does not quite answer the question of interest checks the consistency of different bike fitters. The same person with the same bike in the same initial configuration goes to various fitters and asks them to choose a fit. After each fitting, the objective sizing of the bike is recorded and then the bike is returned to the initial configuration before the next fit. The test is whether all fitters choose approximately the same parameters. Inconsistency implies that most fitters cannot figure out the objectively best fit, but consistency does not imply that the consensus of the fitters is the optimal sizing. They could all be wrong the same way – consistency is insufficient to answer the question of interest.

Committing to an experimental design without revealing it

Pre-registering an experiment in a public registry of clinical trials keeps the experimenters honest (avoids ex post modifications of hypotheses to fit the data and “cherry-picking” the data by removing “outliers”), but unfortunately reveals information to competing research groups. This is an especially relevant concern in commercial R&D.

The same verifiability of honesty could be achieved without revealing scientific details by initially publicly distributing an encrypted description of the experiment, and after finishing the research, publishing the encryption key. Ex post, everyone can check that the specified experimental design was followed and all variables reported (no p-hacking). Ex ante, competitors do not know the trial details, so cannot copy it or infer the research direction.
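A minimal sketch of the idea, using a hash-based commitment rather than encryption (a standard-library variant that gives the same verifiability: publish the hash now, the design and nonce later):

```python
# Commit now, reveal later. The design text below is a placeholder.
import hashlib, secrets

design = b"Randomise 200 patients 1:1 to treatment X vs placebo; primary outcome: ..."
nonce = secrets.token_bytes(16)                    # stops competitors guessing the design
commitment = hashlib.sha256(nonce + design).hexdigest()
print("publish now:", commitment)

# After the study, publish `design` and `nonce`; anyone can then verify:
assert hashlib.sha256(nonce + design).hexdigest() == commitment
```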

Distinguishing discrimination in admissions from the opposite discrimination in grading

There are at least two potential explanations for why students from group A get a statistically significantly higher average grade in the same course than those from group B. The first is discrimination against A in admissions: if members of A face a stricter ability cutoff to be accepted at the institution, then conditional on being accepted, they have higher average ability. One form of a stricter ability cutoff is requiring a higher score from members of A, provided admissions test scores are positively correlated with ability.

The second explanation is discrimination in favour of group A in grading: students from A are given better grades for the same work. To distinguish this from admissions discrimination against A, one way is to compare the relative grades of groups A and B across courses. If the difference in average grades is due to ability, then it should be quite stable across courses, compared to a difference coming from grading standards, which varies with each grader’s bias for A.

Of course, there is no clear line for how much the relative grades of group A should vary across courses under grading discrimination as opposed to admissions bias. Only statistical conclusions can be drawn about the relative importance of the two opposing mechanisms driving the grade difference. The distinction is more difficult to make when there is a “cartel” in grading discrimination, so that all graders try to boost group A by the same amount, i.e. to minimise the variance in the advantage given to A. Conscious avoidance of detection could be one reason to reduce the dispersion in the relative grade improvement of A.

Another complication when trying to distinguish the causes of the grade difference is that ability may affect performance differentially across courses. An extreme case is if the same trait improves outcomes in one course, but worsens them in another, for example lateral thinking is beneficial in a creative course, but may harm performance when the main requirement is to follow rules and procedures. To better distinguish the types of discrimination, the variation in the group difference in average grades should be compared across similar courses. The ability-based explanation results in more similar grade differences between more closely related courses. Again, if graders in similar courses vary less in their bias than graders in unrelated fields, then distinguishing the types of discrimination is more difficult.
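A sketch of the basic comparison on hypothetical grade data (column names invented): compute the A-minus-B gap course by course and examine how much it varies.

```python
# A stable gap across courses points towards the ability (admissions) story;
# a gap that swings from grader to grader points towards grading discrimination.
import pandas as pd

df = pd.read_csv("grades.csv")            # assumed columns: course, group ("A"/"B"), grade
gaps = (df.pivot_table(index="course", columns="group", values="grade", aggfunc="mean")
          .assign(gap=lambda d: d["A"] - d["B"]))
print(gaps["gap"].describe())             # mean gap and its spread across courses
# Formally, the observed spread could be compared with the dispersion expected from
# sampling noise alone, e.g. by permuting group labels within each course.
```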

Ways in which an eater can get negative calories from food

There are at least four ways in which an eater may have less energy and nutrients after consuming a food: mechanical, chemical, physical and biological. The mechanical way is that chewing and other parts of digestion take energy, so if a food requires serious mastication and contains few calories, then more energy may be spent than absorbed. This has been claimed for raw celery.
Chemically, one food may react with another in a way that makes one or both of them less digestible. The less effective absorption reduces the nutrients obtained compared to not eating the second reactant. The chemical pathway to inefficient digestion may have multiple steps. For example, ascorbic acid leaches calcium from the body, and calcium is required for the absorption of vitamin D, so eating more citrus fruits may indirectly reduce one’s vitamin D levels.
When calculating the calorie content of food, indigestible fibre is subtracted from carbohydrates before adding up the energy obtained from carbohydrates, fats and proteins. However, if fibre reduces the absorption of calories (in addition to its known reduction of the absorption of iron, zinc, magnesium, calcium and phosphorus), then the food’s bioavailable calorie content is less than that obtained by simply subtracting the fibre. To derive the correct calorie content, the fibre should then have a negative weight in the calculation, not zero. This difference may explain why in Western countries, a high-fibre diet predicts better health in multiple dimensions in large prospective studies (Nurses’ Health Study, Framingham Heart Study), controlling for calorie intake, lifestyle and many other factors. If the calorie absorption is overestimated for people eating lots of fibre (because the calorie intake is larger than the absorption), then their predicted health based on the too-high calorie estimate is worse than their actual health. This is because most people in Western countries overeat, so eating less improves health outcomes. If the predicted health is underestimated, then the high-fibre group looks unusually healthy, which is attributed to the beneficial effects of fibre, but may actually be due to absorbing fewer calories.
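As a toy illustration of this calorie arithmetic (the standard 4/9/4 kcal per gram factors are real, but the negative weight on fibre is an assumption, not an established number):

```python
# Labels subtract fibre from carbohydrate; if fibre also blocks some absorption,
# its effective weight is negative rather than zero.
def label_kcal(carbs_g, fibre_g, fat_g, protein_g):
    return 4 * (carbs_g - fibre_g) + 9 * fat_g + 4 * protein_g

def absorbed_kcal(carbs_g, fibre_g, fat_g, protein_g, fibre_weight=-2):
    # fibre_weight = -2 kcal/g is a made-up illustrative value
    return label_kcal(carbs_g, fibre_g, fat_g, protein_g) + fibre_weight * fibre_g

print(label_kcal(50, 10, 5, 10))      # 245 kcal on the label
print(absorbed_kcal(50, 10, 5, 10))   # 225 kcal absorbed under this assumption
```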
A food may also chemically break down the body’s own tissues: for example, bromelain and papain (from fresh pineapple and papaya respectively) denature proteins, including those of the mouth lining, and can cause mouth sores. Rebuilding the damaged tissue requires energy and nutrients, the quantity of which may exceed that absorbed from the food.
Chemically causing diarrhea reduces the time that foods (including the laxative agent) spend in the gut, thus reduces nutrient absorption.
Stimulants like caffeine speed up metabolism and cause greater energy expenditure, but may give zero calories themselves, resulting in a net negative caloric balance.
Just like chemical damage, physical injury to the body necessitates spending calories and nutrients for tissue repair. For example, scratchy food (phytoliths, bran) may cause many microscopic wounds to the digestive tract.
Cold food requires the body to spend energy on heating, so if the calorie content is small, then the net energy obtained may be negative. Examples are ice cubes and cold water.
A food substance may physically partially block the absorption of another, for example a gelling agent (methylcellulose, psyllium husks) may turn a juice into a gel in the gut and thereby reduce its absorption. Based on my personal experience, psyllium husks gel liquid feces, thus effectively reducing diarrhea. Mixing psyllium husks with carrot juice and with asparagus powder dissolved in water before consuming them during the same meal results in the excretion of separated faint orange and green gels somewhat distinct from the rest of the feces (photos available upon request, not posted to keep the blog family-friendly). This is suggestive evidence that the gelling agent both kept the juices from mixing in the gut and reduced the absorption of the colourful compounds by keeping the juice in the centre of the gel away from the intestinal wall.
Biologically, a food may reduce the nutrients available to the organism by causing infection, the immune response to which requires energy and depletes the body’s reserves of various substances. Infection may lead to diarrhea, although the mechanism is chemical, namely the toxins excreted by the microbes. Infection with helminths (intestinal worms) that suck blood through the wall of the gut requires the replenishment of blood cells, which uses up calories, protein and iron.
If the food takes a long time to chew or is bulky, then chemical and electrical signals of satiation are sent from the gastrointestinal tract to the appetite centre of the brain. These signals reduce the desire to eat, thus decrease calorie intake.

“What if” is a manipulative question

“What if this bad event happens?” is a question used as a high-pressure sales tactic (for insurance, maintenance, upgrades and various protective measures). People suffering from anxiety or depression also tend to ask that question, which is called catastrophising. The question generates vague fears and is usually unhelpful for finding reasonable preventive or corrective measures for the bad event. Fearful people tend to jump on anything that looks like it might be a prevention or cure, which sometimes makes the problem worse (e.g. quack remedies for an imagined rare disease worsen health).
A more useful question is: “What is the probability of this bad event happening?” This question directs attention to statistics and research about the event. Often, the fear-generating event is so unlikely that it is not worth worrying about. Even if it has significant probability, checking the research on it is more likely to lead to solutions than vague rumination along the lines of “what if.” Even if there are no solutions, statistics on the bad event often suggest circumstances that make it more likely, thus information on which situations or risk factors to avoid.
These points have been made before, as exemplified by the aphorisms “Prepare for what is likely and you are likely to be prepared” and “Safety is an expensive illusion.”

Laplace’s principle of indifference makes history useless

Model the universe in discrete time with only one variable, which can take values 0 and 1. The history of the universe up to time t is a vector of length t consisting of zeroes and ones. A deterministic universe is a fixed sequence. A random universe is like drawing the next value (0 or 1) according to some probability distribution every period, where the probabilities can be arbitrary and depend in arbitrary ways on the past history.
The prior distribution over deterministic universes is a distribution over sequences of zeroes and ones. The prior determines which sets are generic. I will assume the prior with the maximum entropy, which is uniform (all paths of the universe are equally likely). This follows from Laplace’s principle of indifference, because there is no information about the distribution over universes that would make one universe more likely than another. The set of infinite sequences of zeroes and ones is bijective with the interval [0,1], so a uniform distribution on it makes sense.
After observing the history up to time t, one can reject all paths of the universe that would have led to a different history. For a uniform prior, any history is equally likely to be followed by 0 or 1. The prediction of the next value of the variable is the same after every history, so knowing the history is useless for decision-making.
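A tiny numerical check of this claim, enumerating all equally likely universes consistent with an arbitrary history:

```python
# Among all equally likely paths whose first t values match the observed history,
# the next value is 1 exactly half the time, whatever the history.
import itertools

t = 10
history = (1, 0, 1, 1, 1, 0, 1, 1, 1, 1)                      # arbitrary observed history
universes = itertools.product((0, 1), repeat=t + 1)            # all paths of length t+1
consistent = [u for u in universes if u[:t] == history]
print(sum(u[t] for u in consistent) / len(consistent))         # prints 0.5
```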
Many other priors besides uniform on all sequences yield the same result. For example, uniform restricted to the support consisting of sequences that are eventually constant. There is a countable set of such sequences, so the prior is improper uniform. A uniform distribution restricted to sequences that are eventually periodic, or that in the limit have equal frequency of 1 and 0 also works.
Having more variables, more values of these variables or making time continuous does not change the result. A random universe can be modelled as deterministic with extra variables. These extras can for example be the probability of drawing 1 next period after a given history.
Predicting the probability distribution of the next value of the variable is easy, because the probability of 1 is always one-half. Knowing the history is no help for this either.

Giving oneself tenure

Senior academics tell juniors that an assistant professor does not have to get tenure at his or her current university, but “in the profession”, i.e. at some university. To extend this reasoning, one does not have to get tenure at all, just guarantee one’s ability to pay one’s living costs with as low effort as possible. Government jobs are also secure – not quite tenure, but close.
Economically, tenure is guaranteed income for life (or until a mandatory retirement age) in exchange for teaching and administrative work. The income may vary somewhat, based on research and teaching success, but there is some lower bound on salary. Many nontenured academics are obsessed with getting tenure. The main reason is probably not the prestige of being called Professor, but the income security. People with families seem especially risk averse and motivated to secure their job.
Guaranteed income can be obtained by other means than tenure, e.g. by saving enough to live off the interest and dividends (becoming a rentier). Accumulating such savings is better than tenure, because there is no teaching and administration requirement. If one wishes, one can always teach for free. Similarly, research can be done in one’s free time. If expensive equipment is needed for the research, then one can pay a university or other institution for access to it. The payment may be in labour (becoming an unpaid research assistant). Becoming financially independent therefore means giving oneself more than tenure. Not many academics seem to have noticed this option, because they choose a wasteful consumerist lifestyle and do not plan their finances.
Given the scarcity of tenure-track jobs in many fields, choosing the highest-paying private-sector position (to accumulate savings) may be a quicker and more certain path to the economic equivalent of tenure than completing sequential postdocs. The option of an industry job seems risky to graduate students, because unlike in academia, one can get fired. However, the chance of layoffs should be compared to the chance of failing to get a second postdoc at an institution of the same or higher prestige. When one industry job ends, there are others. Like in academia, moving downward is easier than up.
To properly compare the prospects in academia and industry, one should look at the statistics, not listen to anecdotal tales of one’s acquaintances or the promises of recruiters. If one aspires to be a researcher, then one should base one’s life decisions on properly researched facts. It is surprising how many academics do not. The relevant statistics on the percentage of graduates or postdocs who get a tenure-track job or later tenure have been published for several fields (http://www.nature.com/ncb/journal/v12/n12/full/ncb1210-1123.html, http://www.education.uw.edu/cirge/wp-content/uploads/2012/11/so-you-want-to-become-a-professor.pdf, https://www.aeaweb.org/articles?id=10.1257/jep.28.3.205). The earnings in both higher education and various industries are published as part of national labour force statistics. Objective information on job security (frequency of firing) is harder to get, but administrative data from the Nordic countries has it.
Of course, earnings are not the whole story. If one has to live in an expensive city to get a high salary, then the disposable income may be lower than with a smaller salary in a cheaper location. Non-monetary aspects of the job matter, such as hazardous or hostile work environment, the hours and flexibility. Junior academics normally work much longer than the 40 hours per week standard in most jobs, but the highest-paid private-sector positions may require even more time and effort than academia. The hours may be more flexible in academia, other than the teaching times. The work is probably of the same low danger level. There is no reason to suppose the friendliness of the colleagues to differ.
Besides higher salary, a benefit of industry jobs is that they can be started earlier in life, before the 6 years in graduate school and a few more in postdoc positions. Starting early helps with savings accumulation, due to compound interest. Some people have become financially independent in their early thirties this way (see mrmoneymustache.com).
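A back-of-the-envelope sketch of the accumulation; the 5% return, the 25-times-spending target (a 4% withdrawal rate) and the salaries are all assumptions.

```python
# Years until savings reach 25 times annual spending, so that a 4% withdrawal
# rate covers living costs. All numbers are illustrative.
def years_to_independence(income, spending, annual_return=0.05):
    savings, years, target = 0.0, 0, 25 * spending
    while savings < target and years < 100:
        savings = savings * (1 + annual_return) + (income - spending)
        years += 1
    return years

print(years_to_independence(60_000, 30_000))   # about 17 years at a 50% savings rate
print(years_to_independence(90_000, 30_000))   # about 10 years with a higher salary
```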
If one likes all aspects of an academic job (teaching, research and service), then it is reasonable to choose an academic career. If some aspects are not inherently rewarding, then one should consider the alternative scenario in which the hours spent on those aspects are spent on paid employment instead. The rewarding parts of the job are done in one’s free time. Does this alternative scenario yield a higher salary? The non-monetary parts of this scenario seem comparable to academia.
Tenure is becoming more difficult to get, as evidenced by the lengthening PhD duration, the increasing average number of postdocs people do before getting tenure, and the lengthening tenure clocks (9 years at Carnegie Mellon vs the standard 6). Senior academics (who have guaranteed jobs) benefit from increased competition among junior academics, because then the juniors will do more work for the seniors for less money. So the senior academics have an incentive to lure young people into academia (to work in their labs as students and postdocs), even if this is not in the young people’s interest. The seniors do not fear competition from juniors, due to the aforementioned guaranteed jobs.
Graduate student and postdoc unions are lobbying universities and governments to give them more money. This has at best a limited impact, because in the end the jobs and salaries are determined by supply and demand. If the unions want to make current students and postdocs better off, then they should discourage new students from entering academia. If they want everyone to be better off, then they should encourage research-based decision-making by everyone. I do not mean presenting isolated facts that support their political agenda (like the unions do now), but promoting the use of the full set of labour force statistics available, asking people to think about their life goals and what jobs will help achieve those goals, and developing predictive models along the lines of “if you do a PhD in this field in this university, then your probable job and income at age 30, 40, etc is…”.