Tag Archives: being and seeming

Moon phase and sleep correlation is not quite a sine wave

Casiraghi et al. (2021) in Science Advances (DOI: 10.1126/sciadv.abe0465) show that human sleep duration and onset depend on the phase of the moon. Their interpretation is that light availability during the night caused humans to adapt their sleep over evolutionary time. Casiraghi et al. fit a sine curve to both sleep duration and onset as functions of the day in the monthly lunar cycle, but their Figure 1 A, B for the full sample and the blue and orange curves for the rural groups in Figure 1 C, D show a statistically significant deviation from a sine function. Instead of same-sized symmetric peaks and troughs, sleep duration has two peaks with a small trough between them, then a large sharp trough which falls more steeply than it rises, then two peaks again. Sleep onset shows a vertically reflected version of this pattern. These features are statistically significant, judging by the confidence bands Casiraghi and coauthors have drawn in Figure 1.

The significant departure of sleep patterns from a sine wave calls into question the interpretation that light availability over evolutionary time caused these patterns. The shortest sleep duration right before full moon fits the interpretation of Casiraghi et al., but two features do not fit: the duration is longest right after both the full and the new moon, yet shorter during the waning crescent moon between these.

It would better summarise the data to use the first four terms of a Fourier series instead of just the first term. There seems little danger of overfitting, given N=69 and t>60.
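
To illustrate, extending the fit from one harmonic to four is a small least-squares problem. The sketch below uses synthetic nightly sleep durations (the real analysis would use the per-participant data behind Figure 1); the 29.5-day lunar period and all variable names are assumptions for illustration, not the authors’ code.

```python
import numpy as np

LUNAR_PERIOD = 29.5  # days, approximate synodic month

def fourier_design(t, n_harmonics, period=LUNAR_PERIOD):
    """Design matrix: a constant column plus sine/cosine pairs for each harmonic."""
    cols = [np.ones_like(t, dtype=float)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
t = np.arange(60)  # more than 60 nights per participant, as in the paper
true = 7 + 0.5 * np.sin(2 * np.pi * t / LUNAR_PERIOD) \
         - 0.3 * np.sin(2 * np.pi * 2 * t / LUNAR_PERIOD)
y = true + rng.normal(0, 0.2, t.size)  # synthetic sleep durations in hours

X1 = fourier_design(t, n_harmonics=1)  # the sine-only model of the paper
X4 = fourier_design(t, n_harmonics=4)  # four-term Fourier series
beta1, res1, *_ = np.linalg.lstsq(X1, y, rcond=None)
beta4, res4, *_ = np.linalg.lstsq(X4, y, rcond=None)
print(res1[0], res4[0])  # residual sums of squares; the 4-term fit is never worse
```

With 9 fitted coefficients against 60 or more observations per person, overfitting is indeed a minor concern.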

A questionable choice by the authors is to plot the sleep duration and onset of only the 35 best-fitting participants in Figure 2. A more honest choice yielding the same number of plots would be to pick every other participant in the ranking from best fit to worst.

In their Materials and Methods section, Casiraghi et al. fit both a 15-day and a 30-day cycle to test for the effect of the Moon’s gravitational pull on sleep. The 15-day component was weaker in urban communities than in rural ones, but any effect of gravity should be the same in both. By contrast, the effect of moonlight should be weaker in urban communities, yet the urban community data (Figure 1 C, D green curve) fit a simple sine curve better than the rural data. It seems strange that sleep in urban communities would correlate more strongly with the amount of moonlight, as Figure 1 shows.

Preventing cheating is hopeless in online learning

Technology makes cheating easy even in in-person exams with invigilators next to the test-taker. For example, in-ear wireless headphones not visible externally can play a loop recording of the most important concepts of the tested material. A development of this idea is to use a hidden camera in the test-taker’s glasses or pen to send the exam contents to a helper who looks up the answers and transmits the spoken solutions via the headphones. Without a helper, sophisticated programming is needed: the image of the exam from the hidden camera is sent to a text-recognition (OCR) program, which pipes it to a web search or an online solver such as Wolfram Alpha, then uses a text-to-speech program to speak the results into the headphones.

A small screen on the inside of the glasses would be visible to a nearby invigilator, so is a risky way to transmit solutions. A small projector in the glasses could in theory display a cheat sheet right into the eye. The reflection from the eye would be small and difficult to detect even looking into the eyes of the test-taker, which are mostly pointed down at the exam.

If the testing is remote, then the test-taker could manipulate the cameras through which the invigilators watch, so that images of cheat sheets are replaced with the background and the sound of helpers saying answers is removed. The sound is easy to remove with a microphone near the mouth of the helper, the input of which is subtracted from the input of the computer webcam. A more sophisticated array of microphones feeding sound into small speakers near the web camera’s microphone can be used to subtract a particular voice from the web camera’s stereo microphone’s input. The technology is the same as in noise-cancelling headphones.

Replacing parts of images is doable even if the camera and its software are provided by the examiners and completely non-manipulable. The invigilators’ camera can be pointed at a screen which displays an already-edited video of the test-taker. The editing is fast enough to make it nearly real-time. The idea of the edited video is the same as in old crime movies where a photo of an empty room is stuck in front of a stationary security camera. Then the guard sees the empty room on the monitor no matter what actually goes on in the room.

There is probably a way to make part of the scene invisible to a camera even with 19th century technology, namely the Pepper’s Ghost illusion with a two-way mirror. The edges of the mirror have to be hidden somehow.

Partial cleaning may make surfaces look dirtier

Incomplete cleaning may increase the visual perception of dirt by removing a uniform covering of thinner dirt, thereby increasing the contrast between the patches of thicker grime and the surface’s normal colour. If something is uniformly grimy, then the colour of the covering dirt may be perceived as the thing’s normal hue. Cleaning may remove approximately the same thickness of dirt from all points on the surface. If some patches initially have a thicker layer, then these remain the colour of the dirt after the cleaning, but other areas may be fully cleaned and revert to the original look of the surface. The human visual system mostly perceives contrast, not the absolute wavelength of the reflected light, as various optical illusions demonstrate. Higher contrast between the thicker patches of grime and the rest of the surface then enhances the perception of dirtiness.

Bar-coding videos to prevent faking

To prevent clips from being cut out of a video or inserted, add a non-repeating sequence of bar codes onto either the whole frame or the main object of the video, such as a person talking. The bar code can use subtle „watermark” shading that does not interfere with viewing – it only needs to be readable by computer. The sequence of bar codes can be recreated at a later time if the algorithm is known, so if a clip is cut out of the video or added, then the sequence no longer matches the replication. Altering the video frame by frame also changes the bar code, although the forger can bypass this security feature by reading the original bar code, removing it before retouching and adding it back later. Still, these extra steps make faking the video somewhat more difficult. The main security feature is that the length of the video cannot be changed without altering the sequence of bar codes, which is easily detected.

The maker of the video may generate the bar codes cryptographically using a private key. This enables confirming the source of the video, for example in a copyright dispute.
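
A minimal sketch of this keyed bar-code idea, assuming an HMAC construction: each frame’s watermark payload is a keyed hash of a video identifier and the frame index, so the sequence never repeats and only the key holder can recreate it. The names (`video_id`, the 64-bit payload size) are illustrative assumptions, not a real standard.

```python
import hmac
import hashlib

def frame_code(key: bytes, video_id: str, frame_index: int) -> bytes:
    """Watermark payload for one frame: HMAC of the video id and frame index."""
    msg = f"{video_id}:{frame_index}".encode()
    return hmac.new(key, msg, hashlib.sha256).digest()[:8]  # 64-bit payload

def verify(key: bytes, video_id: str, extracted_codes: list) -> bool:
    """Recreate the expected sequence and compare with the codes read back
    from the frames; any cut or inserted clip shifts the indices and fails."""
    return all(hmac.compare_digest(c, frame_code(key, video_id, i))
               for i, c in enumerate(extracted_codes))

key = b"private signing key"
codes = [frame_code(key, "clip-42", i) for i in range(100)]
assert verify(key, "clip-42", codes)      # untouched video passes
del codes[40:50]                          # a forger cuts out ten frames
assert not verify(key, "clip-42", codes)  # the shifted sequence no longer matches
```

Because the HMAC depends on a private key, the same check also confirms the source of the video: only the key holder could have generated a matching sequence.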

Probably the idea of bar-coding videos has already been implemented, because watermarks and time stamps on photos have long been used. The main novelty relative to treating each frame as a photo is to link the bar codes to each other over time.

If top people have families and hobbies, then success is not about productivity

Assume:

1 Productivity is continuous and weakly increasing in talent and effort.

2 The sum of efforts allocated to all activities is bounded, and this bound is similar across people.

3 Families and hobbies take some effort, thus less is left for work. (For this assumption to hold, it may be necessary to focus on families with children in which the partner is working in a different field. Otherwise, a stay-at-home partner may take care of the cooking and cleaning, freeing up time for the working spouse to allocate to work. A partner in the same field of work may provide a collaboration synergy. In both cases, the productivity of the top person in question may increase.)

4 The talent distribution is similar for people with and without families or hobbies. This assumption would be violated if for example talented people are much better at finding a partner and starting a family.

Under these assumptions, reasonably rational people would be more productive without families or hobbies. If success is mostly determined by productivity, then people without families should be more successful on average. In other words, most top people in any endeavour would not have families or hobbies that take time away from work.

In short, if responsibilities and distractions cause lower productivity, and productivity causes success, then success is negatively correlated with such distractions. Therefore, if successful people have families at a frequency similar to or greater than that of the general population, then success is not driven by productivity.

One counterargument is that people first become successful and then start families. In order for this to explain the similar fractions of singles among top and bottom achievers, the rate of family formation after success must be much greater than among the unsuccessful, because catching up from a late start requires a higher rate of increase.

Another explanation is irrationality of a specific form – one which reduces the productivity of high effort significantly below that of medium effort. Then single people with lots of time for work would produce less through their high effort than those with families and hobbies via their medium effort. Productivity per hour naturally falls with increasing hours, but the issue here is total output (the hours times the per-hour productivity). An extra work hour has to contribute negatively to success to explain the lack of family-success correlation. One mechanism for a negative effect of hours on output is burnout of workaholics. For this explanation, people have to be irrational enough to keep working even when their total output falls as a result.

If the above explanations seem unlikely but the assumptions reasonable in a given field of human endeavour, then reaching the top and staying there in that field is mostly not about productivity (talent and effort) – academic research, for example.

A related empirical test of whether success in a given field is caused by productivity is to check whether people from countries or groups that rank as highly corrupt on corruption indices disproportionately succeed in this field. Either conditional on entering the field or unconditionally. In academia, in fields where convincing others is more important than the objective correctness of one’s results, people from more nepotistic cultures should have an advantage. The same applies to journals – the general interest ones care relatively more about a good story, the field journals more about correctness. Do people from more corrupt countries publish relatively more in general interest journals, given their total publications? Of course, conditional on their observable characteristics like the current country of employment.

Another related test for meritocracy in academia or the R&D industry is whether coauthored publications and patents are divided by the number of coauthors in their influence on salaries and promotions. If there is an established ranking of institutions or job titles, then do those at higher ranks have more quality-weighted coauthor-divided articles and patents? The quality-weighting is the difficult part, because usually there is no independent measure of quality (unaffected by the dependent variable, be it promotions, salary, publication venue).

Visually distinct social classes in agrarian societies

One argument advanced for why slavery in the US was special among the world’s slaveholding societies is that one race enslaved another. However, before the age of genetic testing, the races could only have been distinguished visually. Similarly obvious differences in the looks of slaves and masters, or serfs and nobility, occurred in all agrarian societies. The obviousness of distinct looks is meant in the statistical sense: with what accuracy could people classify others into slaves and masters, or peasants and lords, averaged both across the population judging and the population judged? I believe the accuracy was close to perfect – comparable to the classification accuracy of US slaves and slaveholders – for the following reasons.

Serfs were malnourished in childhood, thus short. They did hard physical labour without stretching much, thus were bent over, with back and leg muscles better developed than the rest. They spent the day outdoors without sunscreen, wearing limited clothing, thus were tanned. The lack of sunglasses caused them to squint, creating characteristic wrinkles on the face. They seldom had opportunity to wash, thus had ingrained dirt in their skin that would not have come out with a single hard scrubbing. Both corporal punishment and intrafamily violence caused many of them to have visible scars, missing teeth, crooked noses. By contrast, the well-fed nobility were tall and practised proper erect posture in childhood for table manners and dance lessons. Their physical exercise was mostly cardiovascular, without heavy lifting, thus they were either slim or fat, but not muscular. Fencing may have developed noblemen’s quadriceps, biceps and wrist muscles, not so much the trunk. The nobility’s fashionable paleness was further ensured by wearing gloves and hats and carrying parasols during the short time spent outdoors.

All these physical contrasts ensured that even in the same clothes and surroundings, without talking or moving, a peasant and a noble could be distinguished at a glance. In this sense there was nothing special about US slavery.

The belief that US slaves were more distinguishable from their owners than those of other slaveholding societies is based on modern experience – nowadays, people of the same race but different social class are difficult to distinguish based on their physical appearance. Similar nutrition, sports opportunities and outdoor exposure lead to similar stature, musculature and tan.

Directing help-seekers to resources is playing hot potato

In several mental health first aid guidelines, one of the steps is to direct the help-seeker to resources (suggest asking friends, family, professionals for help, reading materials on how to cope with the mental condition). This can provide an excuse to play hot potato: send the help-seeker to someone else instead of providing help. For example, the therapist or counsellor suggests seeing a doctor and obtaining a prescription, and the doctor recommends meeting a therapist instead.

The hot potato game is neither limited to sufferers of mental health issues, nor to doctors and counsellors. It is very common in universities: many people „raise awareness”, „coordinate” the work of others or „mentor” them, „manage change”, „are on the team or committee”, „create an action plan” (or strategy, policy or procedure), „start a conversation” about an issue or „call attention” to it, instead of actually doing useful work. One example is extolling the virtues of recycling, as opposed to physically moving recyclable items from the garbage bin to the recycling bin, and non-recyclable waste in the other direction. Another example is calling attention to mental health, instead of volunteering to visit the mentally ill at home and help them with tasks. Talking about supporting and mentoring early career academics, as opposed to donating part of one’s salary to create a new postdoc position, thereby putting one’s money where one’s mouth is.

All the seeming-work activities mentioned above allow avoiding actual work and padding one’s CV. Claiming to manage and coordinate other people additionally helps with empire-building – hiring more subordinates to whom one’s own work can be outsourced.

To motivate people to do useful work, as opposed to coordinating or managing, the desirable outcomes of the work should be clearly defined, measured and incentivised. Mere discussions, committee meetings and action plans should attract no rewards – rather the reverse, because these waste other people’s time. More generally, using more inputs for the same output should be penalised; for example, for academics, receiving more grant money should count negatively for promotions, given the same patent and publication output.

One way to measure the usefulness of someone’s activity is to use the revealed preference of colleagues (https://sanderheinsalu.com/ajaveeb/?p=1093). Some management and coordination is beneficial, but universities tend to overdo it, so it has negative value added.

Identifying useful work in large organisations by revealed preference

Some members of large organisations seemingly do work, but actually contribute negatively by wasting other people’s time. For example, by sending mass emails, adding regulations, changing things for the sake of changing them (and to pad their CV with „completed projects”) or blocking change with endless committees, consultations and discussions with stakeholders. Even if there is a small benefit from this pretend-work, it is outweighed by the cost to the organisation of the wasted hours of other members. It is difficult to distinguish such negative-value-added activity from positive contributions (being proactive and entrepreneurial, leading by example). Opinions differ on which initiatives are good or bad and how much communication or discussion is enough.

Asking others to rate the work of a person would be informative if the feedback were honest, but usually people do not want to officially criticise colleagues and are not motivated to respond thoughtfully to surveys. Selection bias is also a problem, as online ratings show – the people motivated enough to rate a product, service or person are more likely to hold extreme opinions.

Modern technology offers a way to study the revealed preferences of all members of the organisation without taking any of their time. If most email recipients block a given sender, move her or his emails to junk or spend very little time with the email open, then this suggests the emails are not particularly useful. Aggregate email activity can be tracked without violating privacy if no human sees information about any particular individual’s email filtering or junking, only the total number of people ignoring a given sender.

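
A minimal sketch of the aggregation, assuming each mail client reports only (sender, action) pairs: no human ever sees which recipient ignored whom, only per-sender totals. The action names and thresholds are illustrative assumptions.

```python
from collections import Counter

# Actions counted as "ignoring" an email; names are hypothetical.
IGNORE_ACTIONS = {"blocked", "junked", "closed_under_5s"}

def sender_ignore_rates(events):
    """events: iterable of (sender, action) tuples pooled from all clients.
    Returns, for each sender, the fraction of their emails that were ignored."""
    ignored, total = Counter(), Counter()
    for sender, action in events:
        total[sender] += 1
        if action in IGNORE_ACTIONS:
            ignored[sender] += 1
    return {s: ignored[s] / total[s] for s in total}

events = [("alice", "read"), ("alice", "junked"),
          ("admin-announce", "junked"), ("admin-announce", "blocked"),
          ("admin-announce", "closed_under_5s"), ("admin-announce", "read")]
rates = sender_ignore_rates(events)
print(rates["admin-announce"])  # 0.75: most recipients ignore this sender
```

Only the dictionary of per-sender rates needs to leave the aggregation step, which keeps any individual’s filtering behaviour private.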
Making meetings, consultations and discussions optional and providing an excuse not to attend (e.g. two voluntary meetings at the same time) similarly allows members of the organisation to „vote with their feet” about which meeting they find (more) useful. This provides an honest signal, unlike politeness-constrained and time-consuming feedback.

Anonymity of surveys helps mitigate the reluctance to officially criticise colleagues, but people may not believe that anonymity will be preserved. Even with trust in the feedback mechanism, the time cost of responding may preclude serious and thoughtful answers.

The most liveable cities rankings are suspicious

The „most liveable cities” rankings do not publish their methodology, only vague talk about a weighted index of healthcare, safety, economy, education, etc. An additional suspicious aspect is that the top-ranked cities are all large – there are no small towns. There are many more small cities than big ones in the world (city sizes follow Zipf’s law), so by chance alone, one would expect most of the top-ranked towns in any ranking that is not size-based to be small. The liveability rankings do not mention restricting attention to sizes above some cutoff. Even if a minimum size were required, one would expect most of the top-ranked cities to be close to this lower bound, just based on the size distribution.
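
A back-of-the-envelope simulation of the Zipf argument: draw city sizes from a heavy-tailed Pareto distribution (tail exponent 1, the Zipf case) and give every city an i.i.d. liveability score unrelated to size. The specific sample size, seed and „small” cutoff are arbitrary assumptions for illustration.

```python
import random

random.seed(1)
N = 10_000
# Zipf-like size distribution: Pareto with tail exponent 1, minimum 1000 people.
sizes = [int(1_000 * random.paretovariate(1.0)) for _ in range(N)]
scores = [random.random() for _ in range(N)]  # liveability unrelated to size

# Rank cities by score alone and inspect the sizes of the top 10.
top10 = sorted(range(N), key=lambda i: scores[i], reverse=True)[:10]
median_size = sorted(sizes)[N // 2]
small_in_top10 = sum(sizes[i] < 10 * median_size for i in top10)
print(small_in_top10, "of the top 10 are small towns")
```

With size playing no role in the score, nearly all top-ranked cities come out small, because nearly all cities are small.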

The claimed ranking methodology includes several variables one would expect to be negatively correlated with the population of a city (safety, traffic, affordability). The only plausible positively size-associated variables are culture and entertainment, if these measure the total number of venues and events, not the per-capita number. Unless the index weights entertainment very heavily, one would expect big cities to be at a disadvantage in the liveability ranking based on these correlations, i.e. the smaller the town, the greater its probability of achieving a given liveability score and placing in the top n of the rankings. So the “best places to live” should be almost exclusively small towns. Rural areas not so much, because these usually have limited access to healthcare, education and amenities. The economy of remote regions grows less overall and the population is older, but some (mining) boom areas radically outperform cities in these dimensions. Crime is generally low, so if rural areas were included in the liveability index, then some of these would have a good chance of attaining a top rank.

For any large city, there exists a small town with better healthcare, safety, economy, education, younger population, more entertainment events per capita, etc (easy examples are university towns). The fact that these do not appear at the top of a liveability ranking should raise questions about its claimed methodology.

The bias in favour of bigger cities is probably coming from sample selection and hometown patriotism. If people vote mostly for their own city and the respondents of the liveability survey are either chosen from the population approximately uniformly randomly or the sample is weighted towards larger cities (online questionnaires have this bias), then most of the votes will favour big cities.

Overbidding incentives in crowdfunding

Crowdfunding campaigns on Funderbeam and other platforms fix a price for the shares or loan notes and invite investors to submit the quantity they want to buy. If demand exceeds supply, then the financial instruments are rationed pro rata, or investors requesting quantities below a threshold get what they asked for and the others receive the threshold amount plus a pro rata share of the quantity remaining after the threshold amounts are allocated. Rationing creates an incentive to oversubscribe: an investor who wants n shares and expects to be rationed to fraction x of her demanded quantity will rationally put in an order for n/x>n shares to counteract the rationing. For a mechanism not to invite such manipulation, demanding more than one truly wants should not increase the amount a bidder receives in the event of oversubscription. For example, everyone gets the minimum of their demanded amount and a threshold quantity, where the threshold is determined so as to equate demand and supply. If there are s shares and all m investors demand more than s/m, then each gets s/m.

If some investors demand less than s/m, then the allocation process is recursive as follows. The i1 investors who asked for less than s/m each get what they requested. Their total t1 is subtracted from s to get s1 and the number of remaining investors reduced to m1=m-i1. Then the i2 investors asking for less than s1/m1 get what they demanded (t2 in total), and the new remaining amount s2=s1-t2 and number of investors m2=m1-i2 determined. Repeat until the number of investors asking for less than sj/mj is zero. Divide the remaining amount equally between the remaining investors.
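
The recursion above is the uniform rationing rule and fits in a few lines. This is a sketch of the described procedure, not any platform’s actual code; the function name and float arithmetic are assumptions.

```python
def allocate(supply: float, demands: list) -> list:
    """Uniform rationing: repeatedly give investors who ask for less than an
    equal share of the remainder exactly what they asked, then split what is
    left equally among the rest. Overbidding cannot raise one's allocation
    above the final equal share, so it is pointless."""
    alloc = [0.0] * len(demands)
    remaining = set(range(len(demands)))
    s = supply
    while remaining:
        share = s / len(remaining)
        small = {i for i in remaining if demands[i] < share}
        if not small:                      # everyone left gets the equal share
            for i in remaining:
                alloc[i] = share
            return alloc
        for i in small:                    # small bidders get what they asked
            alloc[i] = demands[i]
            s -= demands[i]
        remaining -= small
    return alloc

print(allocate(100, [10, 50, 80]))  # [10, 45.0, 45.0]
```

In the example, the investor asking for 10 gets 10, and the remaining 90 shares are split equally between the two larger bidders; bidding 80 instead of 50 earns nothing extra.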

An alternative is to let the market work by allowing the price to adjust instead of fixing it in advance. Everyone would then submit demand curves: for each price, how many shares they are willing to buy. This may be too complicated for unsophisticated crowdfunding investors.

However, complexity is probably not the main reason for the inefficient allocation mechanism that invites overbidding. The crowdfunding platform wants to appear popular among investors to attract companies to raise funds on it, so wants to increase the number of oversubscribed campaigns. Rationing is a way to achieve such manipulation if the fundraisers ignore the investors’ incentives to overbid and do not compare the platform to competing ones with similar allocation mechanisms. If fundraisers are irrational in this way, then they do not choose competing platforms without overbidding incentives, because funding campaigns there seem to attract less investor interest. Competing platforms with more efficient allocation mechanisms then go out of business, which eliminates comparison possibilities.