Tag Archives: being and seeming

Partial cleaning may make surfaces look dirtier

Incomplete cleaning may increase the visual perception of dirt because removing a uniform covering of thinner dirt increases the contrast between the patches of thicker grime and the normal colour. If something is uniformly grimy, then the colour of the covering dirt may be perceived as the thing’s normal hue. Cleaning may remove approximately the same thickness of dirt from all points on the surface. If some patches initially have a thicker layer, then these remain the colour of the dirt after the cleaning, but other areas may be fully cleaned and revert to the original look of the surface. The human visual system mostly perceives contrast, not the absolute wavelength of the reflected light, as various optical illusions demonstrate. Higher contrast between the thicker patches of grime and the rest of the surface then enhances the perception of dirtiness.

Bar-coding videos to prevent faking

To prevent clips from being cut out of a video or inserted, add a non-repeating sequence of bar codes onto either the whole frame or the main object of the video, such as a person talking. The bar code can use subtle „watermark” shading that does not interfere with viewing – it only needs to be readable by computer. The sequence of bar codes can be recreated at a later time if the algorithm is known, so if a clip is cut out of the video or added, then the sequence no longer matches the replication. Altering the video frame by frame also changes the bar code, although the forger can bypass this security feature by reading the original bar code, removing it before retouching and adding it back later. Still, these extra steps make faking the video somewhat more difficult. The main security feature is that the length of the video cannot be changed without altering the sequence of bar codes, which is easily detected.

The maker of the video may generate the bar codes cryptographically using a private key. This enables confirming the source of the video, for example in a copyright dispute.

Probably the idea of bar-coding videos has already been implemented, because watermarks and time stamps on photos have long been used. The main novelty relative to treating each frame as a photo is to link the bar codes to each other over time.
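As a rough illustration of linking the codes over time, the sketch below derives each frame's code from the previous code and the frame index using a keyed hash. It rests on several assumptions that are not in the description above: it uses HMAC-SHA256 with a shared secret key as a stand-in for a real signature scheme with a private key, the function names are hypothetical, and embedding the code into the image as a watermark is left out entirely.

```python
import hashlib
import hmac
from typing import List, Optional


def barcode_sequence(key: bytes, n_frames: int) -> List[str]:
    """Return one code per frame; each code depends on the previous code and the frame index."""
    codes = []
    prev = b""
    for i in range(n_frames):
        # Chain the codes: the message is the previous digest plus the frame number.
        msg = prev + i.to_bytes(8, "big")
        prev = hmac.new(key, msg, hashlib.sha256).digest()
        codes.append(prev.hex()[:16])  # truncated (assumption) to keep the bar code compact
    return codes


def first_mismatch(key: bytes, observed: List[str]) -> Optional[int]:
    """Index of the first frame whose code differs from the regenerated sequence, or None."""
    expected = barcode_sequence(key, len(observed))
    for i, (e, o) in enumerate(zip(expected, observed)):
        if not hmac.compare_digest(e, o):
            return i
    return None
```

Anyone holding the key can regenerate the sequence and compare it with the codes read from the video. Cutting frames out of the middle or inserting foreign frames shifts the sequence and shows up as a mismatch. Truncation at the very end is only detectable in this sketch if the intended number of frames is known separately.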

If top people have families and hobbies, then success is not about productivity

Assume:

1 Productivity is continuous and weakly increasing in talent and effort.

2 The sum of efforts allocated to all activities is bounded, and this bound is similar across people.

3 Families and hobbies take some effort, thus less is left for work. (For this assumption to hold, it may be necessary to focus on families with children in which the partner is working in a different field. Otherwise, a stay-at-home partner may take care of the cooking and cleaning, freeing up time for the working spouse to allocate to work. A partner in the same field of work may provide a collaboration synergy. In both cases, the productivity of the top person in question may increase.)

4 The talent distribution is similar for people with and without families or hobbies. This assumption would be violated if for example talented people are much better at finding a partner and starting a family.

Under these assumptions, reasonably rational people would be more productive without families or hobbies. If success is mostly determined by productivity, then people without families should be more successful on average. In other words, most top people in any endeavour would not have families or hobbies that take time away from work.

In short, if responsibilities and distractions cause lower productivity, and productivity causes success, then success is negatively correlated with such distractions. Therefore, if successful people have families with a frequency similar to or greater than that of the general population, then success is not driven by productivity.

One counterargument is that people first become successful and then start families. In order for this to explain the similar fractions of singles among top and bottom achievers, the rate of family formation after success must be much greater than among the unsuccessful, because catching up from a late start requires a higher rate of increase.

Another explanation is irrationality of a specific form – one which reduces the productivity of high effort significantly below that of medium effort. Then single people with lots of time for work would produce less through their high effort than those with families and hobbies via their medium effort. Productivity per hour naturally falls with increasing hours, but the issue here is total output (the hours times the per-hour productivity). An extra work hour has to contribute negatively to success to explain the lack of family-success correlation. One mechanism for a negative effect of hours on output is burnout of workaholics. For this explanation, people have to be irrational enough to keep working even when their total output falls as a result.

If the above explanations seem unlikely but the assumptions reasonable in a given field of human endeavour, then reaching the top and staying there is mostly not about productivity (talent and effort) in this field, for example in academic research.

A related empirical test of whether success in a given field is caused by productivity is to check whether people from countries or groups that score highly on corruption indices disproportionately succeed in this field, either conditional on entering the field or unconditionally. In academia, in fields where convincing others is more important than the objective correctness of one’s results, people from more nepotistic cultures should have an advantage. The same applies to journals – the general interest ones care relatively more about a good story, the field journals more about correctness. Do people from more corrupt countries publish relatively more in general interest journals, given their total publications and conditional on their observable characteristics like the current country of employment?

Another related test for meritocracy in academia or the R&D industry is whether coauthored publications and patents are divided by the number of coauthors in their influence on salaries and promotions. If there is an established ranking of institutions or job titles, then do those at higher ranks have more quality-weighted coauthor-divided articles and patents? The quality-weighting is the difficult part, because usually there is no independent measure of quality (unaffected by the dependent variable, be it promotions, salary, publication venue).

Visually distinct social classes in agrarian societies

One argument advanced for why slavery in the US was special among the world’s slaveholding societies is that one race enslaved another. However, before the age of genetic testing, the races could only have been distinguished visually. Similarly obvious differences in the looks of slaves and masters, or serfs and nobility occurred in all agrarian societies. The obviousness of distinct looks is meant in the statistical sense: with what accuracy could people classify others into slaves and masters, or peasants and lords, averaged both across the population judging and the population judged? I believe the accuracy was close to perfect – comparable to the classification accuracy of US slaves and slaveholders – for the following reasons.

Serfs were malnourished in childhood, thus short. They did hard physical labour without stretching much, thus were bent over, with back and leg muscles better developed than the rest. They spent the day outdoors without sunscreen, wearing limited clothing, thus were tanned. The lack of sunglasses caused them to squint, creating characteristic wrinkles on the face. They seldom had opportunity to wash, thus had ingrained dirt in their skin that would not have come out with a single hard scrubbing. Both corporal punishment and intrafamily violence caused many of them to have visible scars, missing teeth, crooked noses. By contrast, the well-fed nobility were tall and practised proper erect posture in childhood for table manners and dance lessons. Their physical exercise was mostly cardiovascular, without heavy lifting, thus they were either slim or fat, but not muscular. Fencing may have developed noblemen’s quadriceps, biceps and wrist muscles, not so much the trunk. The nobility’s fashionable paleness was further ensured by wearing gloves and hats and carrying parasols during the short time spent outdoors.

All these physical contrasts ensured that even in the same clothes and surroundings, without talking or moving, a peasant and a noble could be distinguished at a glance. In this sense there was nothing special about US slavery.

The belief that US slaves were more distinguishable from their owners than those of other slaveholding societies is based on modern experience – nowadays, people of the same race but different social class are difficult to distinguish based on their physical appearance. Similar nutrition, sports opportunities and outdoor exposure lead to similar stature, musculature and tan.

Directing help-seekers to resources is playing hot potato

In several mental health first aid guidelines, one of the steps is to direct the help-seeker to resources (suggest asking friends, family, professionals for help, reading materials on how to cope with the mental condition). This can provide an excuse to play hot potato: send the help-seeker to someone else instead of providing help. For example, the therapist or counsellor suggests seeing a doctor and obtaining a prescription, and the doctor recommends meeting a therapist instead.

The hot potato game is neither limited to sufferers of mental health issues, nor to doctors and counsellors. It is very common in universities: many people „raise awareness”, „coordinate” the work of others or „mentor” them, „manage change”, „are on the team or committee”, „create an action plan” (or strategy, policy or procedure), „start a conversation” about an issue or „call attention” to it, instead of actually doing useful work. One example is extolling the virtues of recycling, as opposed to physically moving recyclable items from the garbage bin to the recycling bin, and non-recyclable waste in the other direction. Another example is calling attention to mental health, instead of volunteering to visit the mentally ill at home and help them with tasks. A third is talking about supporting and mentoring early career academics, as opposed to donating part of one’s salary to create a new postdoc position, thereby putting one’s money where one’s mouth is.

All the seeming-work activities mentioned above allow avoiding actual work and padding one’s CV. Claiming to manage and coordinate other people additionally helps with empire-building – hiring more subordinates to whom one’s own work can be outsourced.

To motivate people to do useful work, as opposed to coordinating or managing, the desirable outcomes of the work should be clearly defined, measured, and incentivised. Mere discussions, committee meetings and action plans should attract no rewards, rather the reverse, because these waste other people’s time. More generally, using more inputs for the same output should be penalised. For example, for academics, receiving more grant money should count negatively for promotions, given the same patent and publication output.

One way to measure the usefulness of someone’s activity is to use the revealed preference of colleagues (https://sanderheinsalu.com/ajaveeb/?p=1093). Some management and coordination is beneficial, but universities tend to overdo it, so it has negative value added.

Identifying useful work in large organisations by revealed preference

Some members of large organisations seemingly do work, but actually contribute negatively by wasting other people’s time, for example by sending mass emails, adding regulations, changing things for the sake of changing them (and to pad their CV with „completed projects”) or blocking change with endless committees, consultations and discussions with stakeholders. Even if there is a small benefit from this pretend-work, it is outweighed by the cost to the organisation from the wasted hours of other members. It is difficult to distinguish such negative-value-added activity from positive contributions (being proactive and entrepreneurial, leading by example). Opinions differ on what initiatives are good or bad and how much communication or discussion is enough.

Asking others to rate the work of a person would be informative if the feedback was honest, but usually people do not want to officially criticise colleagues and are not motivated to respond thoughtfully to surveys. Selection bias is also a problem, as online ratings show – the people motivated enough to rate a product, service or person are more likely to have extreme opinions.

Modern technology offers a way to study the revealed preferences of all members of the organisation without taking any of their time. If most email recipients block a given sender, move her or his emails to junk or spend very little time reading (keeping the email open), then this suggests the emails are not particularly useful. Aggregate email activity can be tracked without violating privacy if no human sees information about any particular individual’s email filtering or junking, only about the total number of people ignoring a given sender.

Making meetings, consultations and discussions optional and providing an excuse not to attend (e.g. two voluntary meetings at the same time) similarly allows members of the organisation to „vote with their feet” about which meeting they find (more) useful. This provides an honest signal, unlike politeness-constrained and time-consuming feedback.

Anonymity of surveys helps mitigate the reluctance to officially criticise colleagues, but people may not believe that anonymity will be preserved. Even with trust in the feedback mechanism, the time cost of responding may preclude serious and thoughtful answers.

The most liveable cities rankings are suspicious

The „most liveable cities” rankings do not publish their methodology, only vague talk about a weighted index of healthcare, safety, economy, education, etc. An additional suspicious aspect is that the top-ranked cities are all large – there are no small towns. There are many more small than big cities in the world (the size distribution is described by Zipf’s law), so by chance alone, one would expect most of the top-ranked towns in any ranking that is not size-based to be small. The liveability rankings do not mention restricting attention to sizes above some cutoff. Even if a minimum size was required, one would expect most of the top-ranked cities to be close to this lower bound, just based on the size distribution.
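A back-of-the-envelope calculation shows how strong the size-distribution argument is. The sketch below assumes that city sizes above a cutoff follow a Pareto tail, with the number of cities larger than x proportional to x to the power of minus the exponent, and that Zipf's law corresponds to an exponent of about 1. The function name and the choice of multiples are illustrative assumptions, not part of any published ranking methodology.

```python
def share_below(multiple: float, exponent: float = 1.0) -> float:
    """Share of cities above a size cutoff whose population is below
    `multiple` times the cutoff, assuming a Pareto tail with the given
    exponent (exponent 1 is the assumed Zipf case)."""
    if multiple < 1.0:
        raise ValueError("multiple must be at least 1")
    # P(size > multiple * cutoff | size > cutoff) = multiple ** (-exponent)
    return 1.0 - multiple ** (-exponent)


# Under the assumed exponent of 1, about half of all cities above a cutoff
# are smaller than twice the cutoff, and about nine tenths are smaller than
# ten times the cutoff.
half = share_below(2.0)
nine_tenths = share_below(10.0)
```

If the ranking were unrelated to size, the same shares would be expected among the top-ranked cities: with a cutoff of, say, 100 000 inhabitants (a hypothetical figure), roughly nine out of ten top places would go to cities below one million.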

The claimed ranking methodology includes several variables one would expect to be negatively correlated with the population of a city (safety, traffic, affordability). The only plausible positively size-associated variables are culture and entertainment, if these measure the total number of venues and events, not the per-capita number. Unless the index weights entertainment very heavily, one would expect big cities to be at a disadvantage in the liveability ranking based on the correlations, i.e. the smaller the town, the greater its probability of achieving a given liveability score and placing in the top n in the rankings. So the “best places to live” should be almost exclusively small towns. Rural areas not so much, because these usually have limited access to healthcare, education and amenities. The economy of remote regions grows less overall and the population is older, but some (mining) boom areas radically outperform cities in these dimensions. Crime in rural areas is generally low, so if these areas were included in the liveability index, then some of them would have a good chance of attaining top rank.

For any large city, there exists a small town with better healthcare, safety, economy, education, younger population, more entertainment events per capita, etc (easy examples are university towns). The fact that these do not appear at the top of a liveability ranking should raise questions about its claimed methodology.

The bias in favour of bigger cities is probably coming from sample selection and hometown patriotism. If people vote mostly for their own city and the respondents of the liveability survey are either chosen from the population approximately uniformly randomly or the sample is weighted towards larger cities (online questionnaires have this bias), then most of the votes will favour big cities.

Overbidding incentives in crowdfunding

Crowdfunding campaigns on Funderbeam and other platforms fix a price for the shares or loan notes and invite investors to submit the quantity they want to buy. If demand exceeds supply, then the financial instruments are rationed pro rata, or investors requesting quantities below a threshold get what they asked for and others receive the threshold amount plus a pro rata share in the remaining quantity after the threshold amounts are allocated. Rationing creates the incentive to oversubscribe: an investor who wants n shares and expects to be rationed to fraction x of her demanded quantity will rationally put in the order for n/x>n shares to counteract the rationing. For a mechanism not to invite such manipulation, the amount allocated to a given bidder in the event of oversubscription should not depend on that bidder’s bid quantity. For example, everyone gets the minimum of their demanded amount and a threshold quantity, where the threshold is determined so as to equate demand and supply. If there are s shares and all m investors demand more than s/m, then each gets s/m.
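The overbidding incentive under pure pro rata rationing can be seen in a small sketch. The function name and the numbers are made up for illustration; the actual rules of Funderbeam or any other platform may differ in detail.

```python
from typing import List


def pro_rata(bids: List[float], supply: float) -> List[float]:
    """Allocate `supply` in proportion to the bids when demand exceeds supply."""
    total = sum(bids)
    if total <= supply:
        return list(bids)  # no rationing needed
    return [b * supply / total for b in bids]


# Hypothetical example: 100 shares, three investors who each truly want 50.
honest = pro_rata([50.0, 50.0, 50.0], 100.0)      # everyone is rationed equally
inflated = pro_rata([100.0, 50.0, 50.0], 100.0)   # investor 0 doubles her bid
```

In this made-up case the first investor receives exactly the 50 shares she wants by asking for 100, while the two honest bidders are cut to 25 each. Because a large bidder's own order also changes the rationing fraction, the n/x rule is exact only when her bid is small relative to total demand.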

If some investors demand less than s/m, then the allocation process is recursive as follows. The i1 investors who asked for less than s/m each get what they requested. Their total t1 is subtracted from s to get s1 and the number of remaining investors is reduced to m1=m-i1. Then the i2 investors asking for less than s1/m1 get what they demanded (t2 in total), and the new remaining amount s2=s1-t2 and number of investors m2=m1-i2 are determined. Repeat until the number of investors asking for less than sj/mj is zero. Divide the remaining amount equally between the remaining investors.
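The recursive procedure can be sketched in a few lines. The function name is hypothetical, and ties (a demand exactly equal to the current equal share) are treated as not being below the share, which gives the same allocation either way.

```python
from typing import List


def uniform_allocation(demands: List[float], supply: float) -> List[float]:
    """Give small bidders their full demand, then split the rest equally."""
    alloc = [0.0] * len(demands)
    remaining = list(range(len(demands)))  # indices of investors not yet served
    s = supply                             # quantity still to be allocated
    while remaining:
        share = s / len(remaining)         # s_j / m_j in the text's notation
        small = {i for i in remaining if demands[i] < share}
        if not small:
            # Everyone left asked for at least the equal share: split equally.
            for i in remaining:
                alloc[i] = share
            break
        for i in small:
            alloc[i] = demands[i]          # small bidders get what they requested
            s -= demands[i]
        remaining = [i for i in remaining if i not in small]
    return alloc
```

For instance, with 100 shares and demands of 10, 28, 50 and 60, the first round serves the investor asking for 10 (equal share 25), the second serves the one asking for 28 (equal share 30), and the last two receive 31 each. Raising the largest bid from 60 to 600 leaves every allocation unchanged, which is the property that removes the incentive to overbid.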

An alternative is to let the market work by allowing the price to adjust, instead of fixing it in advance. Everyone should then submit demand curves: for each price, how many shares are they willing to buy. This may be too complicated for the unsophisticated crowdfunding investors.

However, complexity is probably not the main reason for the inefficient allocation mechanism that invites overbidding. The crowdfunding platform wants to appear popular among investors to attract companies to raise funds on it, so wants to increase the number of oversubscribed campaigns. Rationing is a way to achieve such manipulation if the fundraisers ignore the investors’ incentives to overbid and do not compare the platform to competing ones with similar allocation mechanisms. If fundraisers are irrational in this way, then they do not choose competing platforms without overbidding incentives, because funding campaigns there seem to attract less investor interest. Competing platforms with more efficient allocation mechanisms then go out of business, which eliminates comparison possibilities.

Feedback requests by no-reply emails

„We value your feedback” sent from a no-reply email address shows not only that the feedback is not valued, but also that the organisation is lying. More generally, when someone’s words and deeds conflict, then this is informative about his or her lack of truthfulness. If in addition the deeds are unpleasant, then this is the worst of the four possibilities (good or bad deeds combined with honest admission or lying).

The fact of sending such no-reply feedback requests suggests that either the organisations doing it are stupid, needlessly angering customers with insincere solicitations, or believe that the customers are stupid, failing to draw the statistically correct (Bayesian) conclusion about the organisation.

Some organisations send an automated feedback request by email (Mintos) or post (Yale Student Health) in response to every inquiry or interaction, even ones that clearly did not resolve the problem. The information about the non-resolution could easily be scraped from the original customer emails, without wasting anyone’s time by asking them to fill out feedback forms. The inefficient time-wasting by sending feedback requests is again informative about the organisation.

Blind testing of bicycle fitting

Claims that getting a professional bike fit significantly improves riding comfort and speed and reduces overuse injuries seem suspicious – how can a centimetre here or there make such a large difference? A very wrong fit (e.g. an adult using a children’s bike) of course creates big problems, but most people can adjust their bike to a reasonable fit based on a few online suggestions.

Determining the actual benefit of a bike fit requires a randomised trial: have professionals determine the bike fit for a large enough sample of riders, measure and record the objective parameters of the fit (centimetres of seatpost out of the seat tube, handlebar height from the ground, pedal crank length, etc). Then randomly change the fit by a few centimetres or leave it unchanged, without the cyclist knowing, and let the rider test the bike. Record the speed, ask the rider to rate the comfort, fatigue, etc. Repeat for several random changes in fit. Statistically test whether the average speed, comfort rating and other outcome variables across the sample of riders are better with the actual fit or with small random changes. To eliminate the placebo effect, blind testing is important – the cyclists should not know whether and how the fit has been changed.
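One possible way to carry out the statistical test is a paired sign-flip permutation test. This is an illustrative choice of method, not the only valid one. The sketch assumes that for each rider a single number has been computed: the outcome (e.g. average speed or comfort rating) with the professional fit minus the mean outcome with the randomly changed fits. The function name is hypothetical, and the exact enumeration is practical only for small samples of about 20 riders or fewer.

```python
from itertools import product
from typing import List


def sign_flip_p_value(diffs: List[float]) -> float:
    """Exact one-sided p-value for H0: the professional fit is no better.

    Under H0 each rider's difference is equally likely to be positive or
    negative, so every pattern of sign flips is equally likely.
    """
    observed = sum(diffs)
    at_least_as_large = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / total
```

With four riders who all do better by the same amount on the professional fit, `sign_flip_p_value([1.0, 1.0, 1.0, 1.0])` returns 0.0625, since only one of the 16 sign patterns gives a sum as large as the observed one. A larger sample is needed before a small p-value becomes possible at all.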

Another approach is to have each rider test a large sample of different bike fits, find the best one empirically, record its objective parameters and then have a sample of professional fitters (who should not know what empirical fit was found) choose the best fit. Test statistically whether the professionals choose the same fit as the cyclist.

A simpler trial that does not quite answer the question of interest checks the consistency of different bike fitters. The same person with the same bike in the same initial configuration goes to various fitters and asks them to choose a fit. After each fitting, the objective sizing of the bike is recorded and then the bike is returned to the initial configuration before the next fit. The test is whether all fitters choose approximately the same parameters. Inconsistency implies that most fitters cannot figure out the objectively best fit, but consistency does not imply that the consensus of the fitters is the optimal sizing. They could all be wrong the same way – consistency is insufficient to answer the question of interest.