
The most liveable cities rankings are suspicious

The “most liveable cities” rankings do not publish their methodology, offering only vague talk about a weighted index of healthcare, safety, economy, education, etc. An additional suspicious aspect is that the top-ranked cities are all large – there are no small towns. There are many more small than big cities in the world (the size distribution approximately follows Zipf’s law), so by chance alone, one would expect most of the top-ranked towns in any ranking that is not size-based to be small. The liveability rankings do not mention restricting attention to sizes above some cutoff. Even if a minimum size were required, one would expect most of the top-ranked cities to be close to this lower bound, just based on the size distribution.
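
The size argument can be illustrated with a small simulation. The sketch below is only an illustration under assumed parameters: city sizes are drawn from a Pareto distribution with tail exponent 1 (one common way to formalise Zipf’s law), the minimum size of 10,000 and the “large city” threshold of 1,000,000 are hypothetical, and the liveability score is pure noise that ignores size.

```python
import random


def share_of_large_cities_in_top(n_cities=100_000, top_n=100,
                                 min_size=10_000, large=1_000_000, seed=0):
    """Fraction of the top_n cities that are 'large' when the score ignores size."""
    rng = random.Random(seed)
    # Pareto with tail exponent 1: P(size > s) = min_size / s for s >= min_size.
    sizes = [min_size * rng.paretovariate(1.0) for _ in range(n_cities)]
    # Hypothetical liveability score, independent of size by construction.
    scores = [rng.random() for _ in range(n_cities)]
    ranked = sorted(range(n_cities), key=lambda i: scores[i], reverse=True)
    return sum(sizes[i] >= large for i in ranked[:top_n]) / top_n


if __name__ == "__main__":
    print(share_of_large_cities_in_top())
```

Under these assumptions only 1% of cities exceed the threshold, so on average about one of the top 100 places goes to a large city. A ranking whose top is filled with large cities must therefore either depend on size or restrict the set of cities considered.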

The claimed ranking methodology includes several variables one would expect to be negatively correlated with the population of a city (safety, traffic, affordability). The only plausible positively size-associated variables are culture and entertainment, if these measure the total number of venues and events, not the per-capita number. Unless the index weights entertainment very heavily, one would expect big cities to be at a disadvantage in the liveability ranking based on the correlations, i.e. the smaller the town, the greater its probability of achieving a given liveability score and placing in the top n in the rankings. So the “best places to live” should be almost exclusively small towns. Rural areas not so much, because these usually have limited access to healthcare, education and amenities. The economy of remote regions grows less overall and the population is older, but some (mining) boom areas radically outperform cities in these dimensions. Crime is generally low, so if rural areas were included in the liveability index, then some of these would have a good chance of attaining top rank.

For any large city, there exists a small town with better healthcare, safety, economy, education, younger population, more entertainment events per capita, etc. (easy examples are university towns). The fact that these do not appear at the top of a liveability ranking should raise questions about its claimed methodology.

The bias in favour of bigger cities probably comes from sample selection and hometown patriotism. If people vote mostly for their own city and the respondents of the liveability survey are either chosen from the population approximately uniformly randomly or the sample is weighted towards larger cities (online questionnaires have this bias), then most of the votes will favour big cities.
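
The voting mechanism can be sketched in the same way. This is a toy model, not a description of how any actual survey works: city sizes follow a rank-size rule (size proportional to 1/rank), respondents are sampled in proportion to population, and the probability `p_hometown` of voting for one’s own city is a made-up parameter.

```python
import random
from collections import Counter


def simulate_votes(n_cities=1000, n_respondents=10_000, p_hometown=0.8, seed=1):
    """Count votes per city when respondents mostly vote for their home city."""
    rng = random.Random(seed)
    # Rank-size rule: city 0 is the largest, city k has relative size 1/(k+1).
    sizes = [1.0 / rank for rank in range(1, n_cities + 1)]
    # Respondents are drawn uniformly from the population, so the chance
    # that a respondent lives in a given city is proportional to its size.
    homes = rng.choices(range(n_cities), weights=sizes, k=n_respondents)
    votes = Counter()
    for home in homes:
        if rng.random() < p_hometown:
            votes[home] += 1                       # hometown patriotism
        else:
            votes[rng.randrange(n_cities)] += 1    # vote unrelated to size
    return votes


def top_cities(votes, n=10):
    """Indices of the n cities with the most votes (0 = largest city)."""
    return [city for city, _ in votes.most_common(n)]


if __name__ == "__main__":
    print(top_cities(simulate_votes()))
```

In this model the vote count mostly tracks population, so the top of the ranking is filled by the largest cities even though no respondent compared liveability across cities.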

Blind testing of bicycle fitting

Claims that getting a professional bike fit significantly improves riding comfort and speed and reduces overuse injuries seem suspicious – how can a centimetre here or there make such a large difference? A very wrong fit (e.g. an adult using a children’s bike) of course creates big problems, but most people can adjust their bike to a reasonable fit based on a few online suggestions.

Determining the actual benefit of a bike fit requires a randomised trial: have professionals determine the bike fit for a large enough sample of riders, measure and record the objective parameters of the fit (centimetres of seatpost out of the seat tube, handlebar height from the ground, pedal crank length, etc). Then randomly change the fit by a few centimetres or leave it unchanged, without the cyclist knowing, and let the rider test the bike. Record the speed, ask the rider to rate the comfort, fatigue, etc. Repeat for several random changes in fit. Statistically test whether the average speed, comfort rating and other outcome variables across the sample of riders are better with the actual fit or with small random changes. To eliminate the placebo effect, blind testing is important – the cyclists should not know whether and how the fit has been changed.
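
One possible way to do the statistical test is a paired sign-flip permutation test, sketched below. The choice of test is my assumption rather than the only valid option, and the ratings in the example are invented purely to show the call. Each rider contributes one outcome with the professional fit and one with a randomly perturbed fit.

```python
import random


def paired_permutation_test(fitted, perturbed, n_perm=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    Returns the mean difference (fitted minus perturbed) and a p-value.
    """
    diffs = [f - p for f, p in zip(fitted, perturbed)]
    observed = sum(diffs) / len(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis the two labels are exchangeable within
        # each rider, so each difference is equally likely to change sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Invented comfort ratings (1-10) for ten riders.
    professional_fit = [8, 9, 7, 8, 9, 8, 7, 9, 8, 9]
    random_change = [5, 6, 5, 4, 6, 5, 4, 6, 5, 5]
    print(paired_permutation_test(professional_fit, random_change))
```

A small p-value would indicate that the professional fit differs from a nearby random fit by more than chance. With real data, the same test can be run separately for speed, comfort and fatigue.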

Another approach is to have each rider test a large sample of different bike fits, find the best one empirically, record its objective parameters and then have a sample of professional fitters (who should not know what empirical fit was found) choose the best fit. Test statistically whether the professionals choose the same fit as the cyclist.

A simpler trial that does not quite answer the question of interest checks the consistency of different bike fitters. The same person with the same bike in the same initial configuration goes to various fitters and asks them to choose a fit. After each fitting, the objective sizing of the bike is recorded and then the bike is returned to the initial configuration before the next fit. The test is whether all fitters choose approximately the same parameters. Inconsistency implies that most fitters cannot figure out the objectively best fit, but consistency does not imply that the consensus of the fitters is the optimal sizing. They could all be wrong the same way – consistency is insufficient to answer the question of interest.

Avoiding the Bulow and Rogoff 1988 result on the impossibility of borrowing

Bulow and Rogoff (1988, NBER working paper 2623) prove that countries cannot borrow, due to their inability to credibly commit to repay, if after default they can still buy insurance. The punishment for defaulting on debt is being excluded from future borrowing. This punishment is not severe enough to motivate a country to repay, by the following argument. A country has two reasons to borrow: it is less patient than the lenders (values current consumption or investment opportunities relatively more) and it is risk-averse (either because the utility of consumption is concave, or because good investment opportunities appear randomly). Debt can be used to smooth consumption or take advantage of temporary opportunities for high-return investment: borrow when consumption would otherwise be low, pay back when relatively wealthy.

After the impatient country has run up its debt to the maximum level the creditors are willing to tolerate, the impatience motive to borrow disappears, because the lenders do not allow more consumption to be transferred from the future to the present. Only the insurance motive to borrow remains. The punishment for default is the inability to insure via debt, because in a low-consumption or valuable-investment state of affairs, no more can be borrowed. Bulow and Rogoff assume that the country can still save or buy insurance by paying in advance, so “one-sided” risk-sharing (pay back when relatively wealthy, or when investment opportunities are unavailable) is possible. This seemingly one-sided risk-sharing becomes standard two-sided risk-sharing upon default, because the country can essentially “borrow” from itself the amount that it would have spent repaying debt. This amount can be used to consume or invest in the state of the world where these activities are attractive, or to buy insurance if consumption and investment are currently unattractive. Thus full risk-sharing is achieved.

More generally, if the country can avoid the punishment that creditors impose upon default (evade trade sanctions by smuggling, use alternate lenders if current creditors exclude it), then the country has no incentive to repay, in which case lenders have no incentive to lend.

The creditors know that once the country has run up debt to the maximum level they allow, it will default. Thus rational lenders set the maximum debt to zero. In other words, borrowing is impossible.

A way around the no-borrowing theorem of Bulow and Rogoff is to change one or more assumptions. In an infinite horizon game, Hellwig and Lorenzoni allow the country to run a Ponzi scheme on the creditors, thus effectively to “borrow from time period infinity”, which permits a positive, and sometimes even an infinite, level of debt.

Another assumption that could realistically be removed is that the country can buy insurance after defaulting. Restricting insurance need not be due to an explicit legal ban. The insurers are paid in advance, thus do not exclude the country out of fear of default. Instead, the country’s debt contract could allow creditors to seize the country’s financial assets abroad, specifically in creditor countries, and these assets could be defined to include insurance premiums already paid, or the payments from insurers to the country. The creditors have no effective recourse against the sovereign debtor, but they may be able to enforce claims against insurance firms outside the defaulting country.

Seizing premiums to or payments from insurers would result in negative profits to insurers or restrict the defaulter to one-sided risk-sharing, without the abovementioned possibility of making it two-sided. Seizing premiums makes insurers unwilling to insure, and seizing payments from insurers removes the country’s incentive to purchase insurance. Either way, the country’s benefit from risk-sharing after default is eliminated. This punishment would motivate loan repayment, in turn motivating lending.

Asking questions of yourself

To make better decisions, ask about all your activities “Am I doing this right? Is there a better way?” I would have benefited from considering such questions about many everyday tasks. For example, I brushed my teeth wrong (sawing at the roots) until my late teens, and at the wrong time (right after a meal when the enamel is soft) until my 30s. I only learned to cut my own hair in my mid-20s, and this was the highest-return investment I ever made, because a hair clipper costs as much as a haircut, so pays for itself with the first use.

Peeling a kiwi with a spoon is far easier than slicing with a knife. All it took to learn this was one web search, but it required asking myself the question of whether I was peeling fruit optimally. Same for extracting the seed from an avocado.

Cracking the shell of a hard-boiled egg, making two holes at the ends and blowing air under the membrane before peeling is another trick I wish I had known earlier.

Microwaved food is cooler in the centre, so to avoid scalding one’s mouth, it is helpful to start eating it from the middle. Cooked food left in a covered cooking pot or transferred to a storage container while still mildly hot does not go bad at room temperature for several days – doing this experiment required posing this hypothesis. Drinking without touching the bottle with one’s mouth turns out to be quite easy and is widespread in India.

Only after learning to drive did I start meaningfully using gears on a bicycle, and it took about 15 years more to start shifting approximately correctly (pedalling cadence 60-100 rotations per minute, downshifting before stopping, avoiding cross-geared riding). Similarly for basic bike maintenance like cleaning and oiling the chain, selecting the appropriate front and rear tire pressure given one’s weight and tire widths. Seat height is one thing I figured out early, but not handlebar height.

As a teenager, I would have benefited from asking myself whether I was overtraining, whether my nutrition was reasonable, how soon to return to training after various injuries and whether to seek medical assistance with these. Questioning the competence of coaches and doing a simple web search for sports medicine resources would have prevented following some of their mistaken advice.

Sometimes asking yourself the question reveals that you are already doing the task correctly. On the internet, people claim that they do not use shampoo, just water, and their hair stays clean-smelling and more lush than using detergent. An experiment not to use shampoo was a failure for me, causing greasy hair and lots of dandruff after a few days. The optimality of shampoo may depend on individual scalp and hair characteristics. On the other hand, a single-blade disposable razor and cold water give me a better shave than multi-bladed fancy brands with foam (that get clogged), and the disposable razor stays sharp enough for a month or two of everyday shaving.

When going to teach, it may be worth asking whether the room is the correct one, even if some students show up and the room is free, because once in this situation I was in a room with the right label, but in the wrong building.

On the other hand, constantly doubting oneself is unhealthy and unhelpful. If enough evidence points one way, then it is time to make up one’s mind.

Blind testing of clothes

Inspired by blind taste testing, manufacturers’ claims about clothes could be tested by subjects blinded to what they are wearing. The test would work as follows. People put clothes on by feel with their eyes closed or in a pitch dark room and wear other clothes on top of the item to be tested. Thus the subjects cannot see what they are wearing. They then rate the comfort, warmth, weight, softness and other physical aspects of the garment. This would help consumers select the most practical clothing and keep advertising somewhat more honest than heretofore. For example, many socks are advertised as warm, but based on my experience, many of them do not live up to the hype. I would be willing to pay a small amount for data about past wearers’ experience. Online reviews are notoriously emotional and biased.

Some aspects of clothes can also be measured objectively – warmth is one of these, measured by heat flow through the garment per unit of area. Such data is unfortunately rarely reported. The physical measurements to conduct on clothes require some thought, to make these correspond to the wearing experience. For example, if clothes are thicker in some parts, then their insulation should be measured in multiple places. Some parts of the garment may usually be worn with more layers under or over it than others, which may affect the required warmth of different areas of the clothing item differently. Sweat may change the insulation properties dramatically, e.g. for cotton. Windproofness matters for whether windchill can be felt. All this needs taking into account when converting physical measurements to how the clothes feel.

Why research with more authors gets cited more

Empirically, articles with more authors are cited more, according to Wuchty et al. (2007). The reasons may be good or bad. A good reason is that coauthored papers may have higher quality, e.g. due to division of labour increasing the efficiency of knowledge production. I propose the following bad reasons, independent of potential quality differences between coauthored and solo articles. Suppose that researchers cite the works of their friends more frequently than warranted. A given scientist is more likely to have a friend among the authors of an article with a greater number of collaborators, which increases its probability of getting a “friendly citation”.

Another reason is defensive citing, i.e. including relatively unrelated papers in the reference list before submitting to a journal, in case the referees happen to be the authors of those works. The reason for adding these unnecessary citations is the belief, warranted or not, that a referee is more likely to recommend acceptance of a paper if it cites the referee’s publications. The probability that the set of referees overlaps with the set of authors of a given prior work increases in the number of authors of that work. Thus defensive citing is more effective when targeted to collaborative instead of solo papers.
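
Both mechanisms above rest on the same probability calculation. A minimal sketch, assuming (for illustration only) that each author of a prior work is independently a friend of the citing scientist, or independently one of the referees, with the same probability `p`:

```python
def prob_at_least_one_match(p, n_authors):
    """Probability that at least one of n_authors is a friend (or a referee),
    if each is one independently with probability p."""
    return 1.0 - (1.0 - p) ** n_authors


if __name__ == "__main__":
    # p = 0.02 is an arbitrary illustrative value.
    for n in (1, 2, 5, 10):
        print(n, round(prob_at_least_one_match(0.02, n), 4))
```

The probability is strictly increasing in the number of authors whenever `p` is between 0 and 1, and for small `p` it is approximately proportional to the number of authors.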

The referees may also directly ask the author to cite certain papers in the revision (I have had this experience). If the referees are more likely to request citations to their own or their coauthors’ work, then articles with more authors are again referenced more.

Valderas et al. (2007) offer some additional explanations. One is measurement error. Suppose that letters to the editor, annual reports of the learned society, its presidential inaugural addresses, and other non-research in scientific journals are counted as publications. These have both fewer authors and fewer citations than regular research articles, which creates a positive correlation between the popularity of a piece of writing and its number of authors.

If self-citations are not excluded and researchers cite their own work more frequently than that of others, then papers with more authors get cited more.

Articles with more collaborators are presented more frequently, thus their existence is more widely known. Awareness of a work is a prerequisite of citing it, so the wider circulation of multi-author publications gives them a greater likelihood of being referenced, independent of quality.

Easier combining of entertainment and work may explain increased income inequality

Many low-skill jobs (guard, driver, janitor, manual labourer) permit on-the-job consumption of forms of entertainment (listening to music or news, phoning friends) that became much cheaper and more available with the introduction of new electronic devices (first small radios, then TVs, then cellphones, smartphones). Such entertainment does not reduce productivity at the abovementioned jobs much, which is why it is allowed. On the other hand, many high-skill jobs (planning, communicating, performing surgery) are difficult to combine with any entertainment, because the distraction would decrease productivity significantly. The utility of low-skill work thus increased relatively more than that of skilled jobs when electronics spread and cheapened. The higher utility made low-skill jobs relatively more attractive, so the supply of labour at these increased relatively more. This supply rise reduced the pay relative to high-skill jobs, which increased income inequality. Another way to describe this mechanism is that as the disutility of low-skill jobs fell, so did the real wage required to compensate people for this disutility.
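
The mechanism can be written as a textbook supply-and-demand shift. The linear form and all numbers below are assumptions chosen for illustration: labour demand is w = a - b*L, and labour supply is w = c + d*L, where the intercept `c` stands for the wage needed to compensate the disutility of the job. Cheaper on-the-job entertainment lowers `c`.

```python
def equilibrium(a, b, c, d):
    """Equilibrium employment and wage for demand w = a - b*L, supply w = c + d*L."""
    employment = (a - c) / (b + d)
    wage = a - b * employment
    return employment, wage


if __name__ == "__main__":
    # Hypothetical parameters: entertainment lowers the required
    # compensation for disutility from c = 4 to c = 2.
    before = equilibrium(a=10, b=1, c=4, d=1)
    after = equilibrium(a=10, b=1, c=2, d=1)
    print("before (employment, wage):", before)
    print("after  (employment, wage):", after)
```

In this linear model, a fall of size x in `c` raises employment by x/(b+d) and lowers the wage by x*b/(b+d), so the wage falls by less than the disutility does unless labour supply is perfectly elastic (d = 0).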

An empirically testable implication of this theory is that jobs of any skill level that do not allow on-the-job entertainment should have seen salaries increase more than comparable jobs which can be combined with listening to music or with personal phone calls. For example, a janitor cleaning an empty building can make personal calls, but a cleaner of a mall (or other public venue) during business hours may be more restricted. Both can listen to music on their headphones, so the salaries should not have diverged when small cassette players went mainstream, but should have diverged when cellphones with headsets became cheap. Similarly, a trucker or nightwatchman has more entertainment options than a taxi driver or mall security guard, because the latter do not want to annoy customers with personal calls or loud music. A call centre operator is more restricted from audiovisual entertainment than a receptionist.
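
One way to take this implication to data is a difference-in-differences comparison, sketched below. The estimator is my suggestion, not part of the theory, and the wages in the example are invented. The “restricted” group is a job that cannot be combined with the new entertainment technology (e.g. mall cleaner), the “permissive” group is a comparable job that can (e.g. janitor of an empty building).

```python
def mean(values):
    return sum(values) / len(values)


def difference_in_differences(restricted_before, restricted_after,
                              permissive_before, permissive_after):
    """Wage growth of the restricted job minus wage growth of the permissive job."""
    restricted_change = mean(restricted_after) - mean(restricted_before)
    permissive_change = mean(permissive_after) - mean(permissive_before)
    return restricted_change - permissive_change


if __name__ == "__main__":
    # Invented hourly wages before and after cheap cellphones.
    print(difference_in_differences(
        restricted_before=[10.0, 10.5], restricted_after=[13.0, 13.5],
        permissive_before=[10.0, 10.4], permissive_after=[11.0, 11.4]))
```

A positive estimate is what the theory predicts. The comparison is only informative if the two jobs would otherwise have had similar wage trends, which is the usual identifying assumption of this method.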

According to the above theory, the introduction of radios and cellphones should have increased the wage inequality between areas with good and bad reception, for example between remote rural and urban regions, or between underground and aboveground mining. On the other hand, the introduction of recorded music should not have increased these inequalities as much, because the availability of records is more similar across regions than radio or phone coverage.

Heating my apartment with a gas stove

There is no built-in heating system in my Australian-standard un-insulated apartment, and the plug-in electric radiators do not have enough power to raise the temperature by a degree. In the past two winters, I used the gas stove as a heater. It is generally unwise to heat an enclosed space without purpose-built ventilation (such as a chimney) by burning something, because of the risk of CO poisoning. Even before CO becomes a problem, suffocation may occur because the CO2 concentration rises and oxygen concentration falls. Therefore, before deciding to heat with a gas stove, I looked up the research, made thorough calculations and checked them several times. I also bought a CO detector, tested it and placed it next to the gas stove. The ceiling has a smoke alarm permanently attached, but this only detects soot in the air, not gases like CO.
For the calculations, I looked up how much heat is produced by burning a cubic metre or kilogram of CH4 (natural gas), how much the temperature of the air in the apartment should rise as a result, how much CO2 the burning produces, and what the safe limits of long-term CO2 exposure are.
The energy content of CH4 is 37.2 MJ/m3, equivalently 50-55.5 MJ/kg. A pilot light of a water heater is estimated to produce 5.3 kWh/day = 19 MJ/day of heat, whereas a gas stove’s biggest burner turned fully on is estimated to produce 5-15 MJ/h, depending on the stove and the data source.
The chemical reaction of burning natural gas when oxygen is not a limiting factor is CH4 + 2*O2 = CO2 + 2*H2O. The molar masses of these gases are CH4=16 g/mol, O2=32 g/mol, CO2=44 g/mol, H2O=18 g/mol, air 29 g/mol. One stove burner on full for 1 hour (at about 10 MJ/h) uses about 0.182 kg = 0.255 m3 of CH4 and 4*0.182 = 0.728 kg of O2, because each mole of CH4 (16 g) needs two moles of O2 (64 g). Oxygen makes up about 23% of air by mass, so this is the oxygen content of about 3.1 kg = 2.6 m3 of air. The burning produces 2.75*0.182 = 0.5 kg of CO2, which takes up about the same volume as the CH4 burned (0.255 m3 at equal temperature and pressure), because each mole of CH4 yields one mole of CO2. The CO2 is denser than air, which is why it may remain in the apartment and displace air when the cracks around the windows are relatively high up. On the other hand, the CO2 also mixes with the air, so may escape at the same rate. Or alternatively, the CO2 is hot, so may rise and escape faster than air. For safety calculations, I want to use a conservative estimate, so assume that the CO2 remains in the apartment.
The volume of the apartment is 6x5x2.5 m = 75 m^3. The density of air at room temperature is 1.2 kg/m^3, thus the mass of air in the apartment is 90 kg. The specific heat of air is 1.005 kJ/(kg*K) at 20C. The walls and ceiling leak heat, thus more energy is actually needed to heat the apartment by a given amount than the calculation using only air shows. It takes 900 kJ of heat to raise the temperature of the air, not the walls, by 10C (from 12C to 22C). This requires 9/555 kg = 9/(16*555) kmol of CH4 with estimated energy density 55500 kJ/kg. Burning that CH4 also takes 9/(8*555) kmol of O2 and produces 9/(16*555) kmol = 9*11/(4*555) kg, or about 9/200 kg, of CO2.
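The normal concentration of CO2 in outside air is 350-450 ppm. Estimate the baseline concentration in inside air to be 1/2000 (500 ppm) because of breathing and poor ventilation. Adding 9/200 kg of CO2 to 90 kg of air raises the concentration by 1/2000 (500 ppm by mass, which is less by volume because CO2 molecules are heavier than the average air molecule), so the CO2 concentration reaches about 1/1000 (1000 ppm). This is below the legal limit for long-term exposure.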
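The arithmetic above can be rechecked with a short script. It is a sketch that uses only the figures quoted in the text (apartment dimensions, air density, molar masses, the upper energy density of 55500 kJ/kg) and the same simplifications: only the air is heated and all CO2 stays inside. The specific heat of air is entered as 1.005 kJ/(kg*K), i.e. 1005 J/(kg*K). The function and variable names are my own.

```python
# Inputs are the figures quoted in the text; nothing here is measured.
VOLUME_M3 = 6 * 5 * 2.5      # apartment volume, m^3
AIR_DENSITY = 1.2            # kg/m^3 at room temperature
CP_AIR = 1.005               # kJ/(kg*K), i.e. 1005 J/(kg*K)
CH4_ENERGY = 55_500          # kJ/kg, upper end of the quoted range
M_CH4, M_O2, M_CO2, M_AIR = 16.0, 32.0, 44.0, 29.0  # g/mol = kg/kmol


def heating_budget(delta_t=10.0):
    """Gas burned and CO2 produced to warm only the air by delta_t kelvin."""
    air_kg = VOLUME_M3 * AIR_DENSITY
    heat_kj = air_kg * CP_AIR * delta_t
    ch4_kg = heat_kj / CH4_ENERGY
    ch4_kmol = ch4_kg / M_CH4
    # CH4 + 2 O2 -> CO2 + 2 H2O: two kmol of O2 and one kmol of CO2 per kmol of CH4.
    o2_kg = 2 * ch4_kmol * M_O2
    co2_kg = ch4_kmol * M_CO2
    ppm_by_mass = co2_kg / air_kg * 1e6
    # Converting a mass fraction to a mole (volume) fraction for a dilute gas.
    ppm_by_volume = ppm_by_mass * M_AIR / M_CO2
    return {
        "air_kg": air_kg,
        "heat_kj": heat_kj,
        "ch4_kg": ch4_kg,
        "o2_kg": o2_kg,
        "co2_kg": co2_kg,
        "added_ppm_by_mass": ppm_by_mass,
        "added_ppm_by_volume": ppm_by_volume,
    }


if __name__ == "__main__":
    for name, value in heating_budget().items():
        print(f"{name}: {value:.4g}")
```

For a 10C rise this gives roughly 0.045 kg of added CO2, which is about 500 ppm by mass or about 330 ppm by volume. Exposure limits are normally stated by volume, so the by-mass figure is the more conservative of the two.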
CO is produced in low-oxygen burning. As long as the CO2 concentration in the air is low and the oxygen concentration high, the risk of CO poisoning is small.
For the actual heating, I first tested running the smallest burner all day while I was at home, and paid attention to whether I felt sleepy and whether the air in the apartment smelled more stale than outside or in the corridor. There seemed to be no problems. For nighttime heating, I started with the smallest burner in the lowest setting, similarly paying attention to whether the air in the morning smelled staler than usual and whether I felt any different. Because there were no problems, I gradually increased the heating from week to week. The maximum I reached was to turn on the largest burner to less than half power, and one or two smaller burners fully. Together, these burners produced much less heat than the largest burner on full, as could be easily checked by feel when standing next to the stove. At night, the stove prevented the temperature in the apartment from dropping by the usual 2C, but did not increase it. The CO2 produced was probably far less than the bound I calculated above by assuming a 10C increase in temperature. Empirically, I’m still alive after two winters of letting the gas stove run overnight.

“What if” is a manipulative question

“What if this bad event happens?” is a question used as a high-pressure sales tactic (for insurance, maintenance, upgrades and various protective measures). People suffering from anxiety or depression also tend to ask that question, which is called catastrophising. The question generates vague fears and is usually unhelpful for finding reasonable preventive or corrective measures for the bad event. Fearful people tend to jump on anything that looks like it might be a prevention or cure, which sometimes makes the problem worse (e.g. quack remedies for an imagined rare disease worsen health).
A more useful question is: “What is the probability of this bad event happening?” This question directs attention to statistics and research about the event. Often, the fear-generating event is so unlikely that it is not worth worrying about. Even if it has significant probability, checking the research on it is more likely to lead to solutions than vague rumination along the lines of “what if.” Even if there are no solutions, statistics on the bad event often suggest circumstances that make it more likely, thus information on which situations or risk factors to avoid.
These points have been made before, as exemplified by the aphorisms “Prepare for what is likely and you are likely to be prepared” and “Safety is an expensive illusion.”

App to measure road quality

The accelerometers in phones can detect vibrations, such as when the car that the phone is in drives through a pothole. The GPS in the phone can detect the location and speed of the car. An app that connects the jolt, location and speed (and detects whether the phone is in a moving car based on its past speed and location) can automatically measure the quality of the road. The resulting data can be automatically uploaded to a database to create an almost real-time map of road quality. The same detection and reporting would work for bike paths.
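The core of such an app could look roughly like the sketch below. Everything in it is hypothetical: the `Sample` record, the thresholds and the grid size are placeholders I made up, and reading the actual accelerometer and GPS is left to the phone platform’s own APIs. A minimum speed stands in crudely for detecting that the phone is in a moving vehicle.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sample:
    lat: float             # degrees, from GPS
    lon: float             # degrees, from GPS
    speed_mps: float       # metres per second, from GPS
    vertical_accel: float  # m/s^2 with gravity already subtracted


def road_roughness(samples, min_speed=3.0, jolt_threshold=3.0, cell_deg=0.001):
    """Share of moving samples in each map grid cell that registered a jolt."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [jolts, moving samples]
    for s in samples:
        if s.speed_mps < min_speed:
            continue  # phone is probably not in a moving vehicle
        cell = (round(s.lat / cell_deg), round(s.lon / cell_deg))
        totals[cell][1] += 1
        if abs(s.vertical_accel) > jolt_threshold:
            totals[cell][0] += 1
    return {cell: jolts / n for cell, (jolts, n) in totals.items()}


if __name__ == "__main__":
    trip = [
        Sample(-35.2810, 149.1290, 12.0, 0.4),
        Sample(-35.2810, 149.1290, 12.0, 5.1),  # pothole
        Sample(-35.2810, 149.1290, 0.0, 6.0),   # phone dropped while parked
    ]
    print(road_roughness(trip))
```

A real app would also need to account for speed when judging the size of a jolt, for differences between vehicles and phone mounts, and for pooling reports from many users before publishing a road quality score.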
Perhaps such an app has already been created, but if not, then it would complement map software nicely. Drivers and cyclists are interested in the quality of the roads as well as the route, time and distance of getting to the destination. Map software already provides congestion data and takes traffic density into account when predicting arrival time at a destination. Road quality data would help drivers select routes to minimise damage to vehicles (and the resulting maintenance cost) and to sensitive cargo. This would be useful to trucking and delivery companies, and ambulances.
A less direct use of data on road quality collected by the app is in evaluating the level of local public services provided (one aspect of the quality of local government). Among municipalities with the same climate, soil and traffic density, those with worse roads are probably less well run. For developing countries where data on governance quality and spending is difficult to get, road quality may be a useful proxy. The public services are correlated with the wealth of a region, so road quality is also a proxy for poverty.