Tag Archives: computer

Computer vision training sets of photos are endogenous

In principle, every pixel could be independent of any other, so the number of possible photos is the number of colours raised to the power of the number of pixels – far more than billions. No training data set is large enough to cover these possibilities many times over, as required for statistical analysis, of which machine learning is a subfield. The problem is solved by restricting attention to a small subset of possible photos. This subset contains a manageable number of photos, which a reasonably large training data set can cover.
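To get a feel for the size of the unrestricted set: with every pixel free to take any colour independently, the count of possible photos is the number of colours raised to the power of the number of pixels. The sketch below only reports how many decimal digits that count has, because the number itself is too large to print usefully. The function name and the example image sizes are my own illustrative choices.

```python
import math


def digits_in_photo_count(width: int, height: int, colours_per_pixel: int) -> int:
    """Return the number of decimal digits in colours_per_pixel ** (width * height)."""
    pixels = width * height
    # digits(n) = floor(log10(n)) + 1, and log10(c ** p) = p * log10(c)
    return math.floor(pixels * math.log10(colours_per_pixel)) + 1


if __name__ == "__main__":
    # A tiny 100 x 100 greyscale image with 256 shades per pixel
    print(digits_in_photo_count(100, 100, 256))
```

Even for this tiny greyscale image the count has about 24 thousand decimal digits, so no data set could cover the unrestricted set even once.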

Useful photos on any topic usually contain just one main object, such as a face, with fewer than 100 secondary objects (furniture, clothes, equipment). There is a long right tail – some useful photos have dozens of the main object, like a group photo full of faces, but I do not know of a photo with a thousand distinguishable faces. Photos of mass events may have ten thousand people, but lack the resolution to make any of the faces useful.

Only selected photos are worth analysing. Only photos sufficiently similar to these are worth putting in a computer vision training dataset. The sample selection occurs both on the input and the output side: few of the possible pixel arrangements actually occur as photos to be classified by machine vision, and most of the training photos are similar to those. There are thus fewer outputs to predict than would be generated from a uniform random distribution, and more inputs close to those outputs than would occur if the input data were uniformly random. Both effects speed up learning.

When photo resolution improves, more objects of interest may appear in photos without losing usefulness to blur. Then such photos become available in large numbers and are added to the datasets.

Reduce temptation by blocking images

Web shops try to tempt customers into unnecessary and even harmful purchases, including grocery and food ordering sites which promote unhealthy meals. The temptation can be reduced by blocking images on shopping websites. I find it useful when ordering food. Similarly, Facebook and news sites try to tempt viewers with clickbait and ads. To reduce my time-wasting, I make the clickbait less attractive by blocking images. The pictures in most news stories do not contribute any information – a story about a firm has a photo of the main building or logo of the firm or the face of its CEO, a “world leaders react to x” story has pictures of said leaders.

The blocking may require a browser extension (“block images”), and each browser and version has slightly different steps for this.

In Chromium on 20 Jan 2021, no extension is needed:

1) click the three vertical dots at the top right,

2) click Settings to go to chrome://settings/,

3) scroll down to Site settings, click it,

4) scroll down to Images, click it.

5) Click the Add button to the right of the Block heading. A dialog pops up to enter a web address.

6) Copy the URL of the site on which you want to block pictures, for example https://webshop.com, into the Site field.

If seeing the images is necessary for some reason, then re-enable images on the website: follow steps 1-4 above, then click the three vertical dots under the Add button under the Block heading. A menu of three options pops up. Click the Allow option.

Alternatively, you may block all images on all websites and then allow only specific sites to show images. For this, follow steps 1-4 above, then click the blue button to the right of the Allow all (recommended) heading. Then click the Add button next to Allow. A dialog pops up to enter a web address. Copy the URL of the site on which you want to allow pictures, for example https://webshop.com, into the Site field.

Preventing cheating is hopeless in online learning

Technology makes cheating easy even in in-person exams with invigilators next to the test-taker. For example, in-ear wireless headphones not visible externally can play a loop recording of the most important concepts of the tested material. A development of this idea is to use a hidden camera in the test-taker’s glasses or pen to send the exam contents to a helper who looks up the answers and transmits the spoken solutions via the headphones. Without a helper, sophisticated programming is needed: the image of the exam from the hidden camera is sent to a text-recognition (OCR) program, which pipes it to a web search or an online solver such as Wolfram Alpha, then uses a text-to-speech program to speak the results into the headphones.

A small screen on the inside of the glasses would be visible to a nearby invigilator, so is a risky way to transmit solutions. A small projector in the glasses could in theory display a cheat sheet right into the eye. The reflection from the eye would be small and difficult to detect even looking into the eyes of the test-taker, which are mostly pointed down at the exam.

If the testing is remote, then the test-taker could manipulate the cameras through which the invigilators watch, so that images of cheat sheets are replaced with the background and the sound of helpers saying answers is removed. The sound is easy to remove with a microphone near the mouth of the helper, the input of which is subtracted from the input of the computer webcam. A more sophisticated array of microphones feeding sound into small speakers near the web camera’s microphone can be used to subtract a particular voice from the web camera’s stereo microphone’s input. The technology is the same as in noise-cancelling headphones.

Replacing parts of images is doable even if the camera and its software are provided by the examiners and completely non-manipulable. The invigilators’ camera can be pointed at a screen which displays an already-edited video of the test-taker. The editing is fast enough to make it nearly real-time. The idea of the edited video is the same as in old crime movies where a photo of an empty room is stuck in front of a stationary security camera. Then the guard sees the empty room on the monitor no matter what actually goes on in the room.

There is probably a way to make part of the scene invisible to a camera even with 19th century technology, namely the Pepper’s Ghost illusion with a two-way mirror. The edges of the mirror have to be hidden somehow.

Bar-coding videos to prevent faking

To prevent clips from being cut out of a video or inserted, add a non-repeating sequence of bar codes onto either the whole frame or the main object of the video, such as a person talking. The bar code can use subtle “watermark” shading that does not interfere with viewing – it only needs to be readable by computer. The sequence of bar codes can be recreated at a later time if the algorithm is known, so if a clip is cut out of the video or added, then the sequence no longer matches the replication. Altering the video frame by frame also changes the bar code, although the forger can bypass this security feature by reading the original bar code, removing it before retouching and adding it back later. Still, these extra steps make faking the video somewhat more difficult. The main security feature is that the length of the video cannot be changed without altering the sequence of bar codes, which is easily detected.

The maker of the video may generate the bar codes cryptographically using a private key. This enables confirming the source of the video, for example in a copyright dispute.
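A minimal sketch of such a linked sequence, under my own assumptions: each frame's code is an HMAC-SHA256 of the frame index together with the previous frame's code, so every code depends on everything before it. HMAC uses a shared secret key and here stands in for the private-key scheme; proving the source to third parties would need a digital signature instead. The function names and the 16-character truncation are illustrative, and rendering the code as a visual bar code is left out.

```python
import hashlib
import hmac
from typing import List, Optional


def frame_codes(key: bytes, n_frames: int) -> List[str]:
    """Generate one code per frame, each chained to the previous frame's code."""
    codes = []
    prev = b""
    for index in range(n_frames):
        message = prev + index.to_bytes(8, "big")
        digest = hmac.new(key, message, hashlib.sha256).digest()
        codes.append(digest.hex()[:16])  # shortened so it fits in a small bar code
        prev = digest
    return codes


def first_mismatch(key: bytes, observed: List[str]) -> Optional[int]:
    """Return the index of the first frame whose code is wrong, or None if all match."""
    expected = frame_codes(key, len(observed))
    for index, (want, got) in enumerate(zip(expected, observed)):
        if not hmac.compare_digest(want, got):
            return index
    return None


if __name__ == "__main__":
    key = b"hypothetical secret key"
    original = frame_codes(key, 10)
    cut = original[:3] + original[5:]  # frames 3 and 4 removed
    print(first_mismatch(key, original), first_mismatch(key, cut))
```

Regenerating the sequence with the key and comparing it to the codes read from the video shows where a cut or insertion starts. Trimming frames off the very end leaves the remaining codes valid, so this sketch would only catch that if the total frame count were also recorded somewhere.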

Probably the idea of bar-coding videos has already been implemented, because watermarks and time stamps on photos have long been used. The main novelty relative to treating each frame as a photo is to link the bar codes to each other over time.

Virtual reality helmet for video calling

Current virtual reality headsets can display video calls, but the person wearing the VR goggles is filmed from outside these. A face with its top half covered by VR goggles is not very expressive, which somewhat defeats the purpose of a video call. The solution is a sphere around the head with the webcam inside it and the video of the other caller projected on the inside. An astronaut’s helmet is an analogy.

To prevent suffocation, the sphere should not be airtight – small CPU fans can be installed at the top or back to circulate air in and out. This also prevents humidity buildup. For headphones as well, I would prefer some ventilation of the area covered.

Multiple webcams pointed at the face allow for 3D imaging, so the video call could take full advantage of the 3D display of virtual reality headsets. However, 3D display relies on projecting a different image to each eye. If the video call is simply projected on the inside of the sphere, then it is a single image and the 3D effect is lost. One solution is to point a small data projector at each eye to display different images. Then the sphere is not needed, just cameras and projectors attached to a stick attached to a headband. A Dilbert comic had this idea, but I cannot find the link on the web.

Ebay should allow conditional bids

Ebay should allow buyers to bid for a single item across multiple auctions: make a bid for one item, then if outbid, automatically make the same bid on the next identical (as defined by the buyer) item and so on. This increases efficiency by joining multiple auctions for identical items into one market with many sellers and buyers. It also reduces selling times, because a buyer who just wants one unit does not have to wait until being outbid before bidding for the next identical item. Buyers generally are not continuously watching the auction, so there is a delay between being outbid and manually making the next bid. Buyers are willing to pay to reduce the delay, as evidenced by purchases at “buy it now” prices greater than the highest bids in the auctions.

More generally, bids conditional on being outbid would help merge auctions into markets, gaining efficiency and speed. For example, a buyer has different values for used copies of the same item in different condition and wants just one copy of the item. Conditional bids allow the buyer to enter a sequence of different-sized bids, one for each copy, with each bid in the sequence conditional on the preceding bids losing.

Linking the bids is not computationally difficult because Ebay already sends an automatic email to a buyer who has been outbid. Instead of an email, the event of being outbid can be used to trigger entering a bid on the next copy of the item.
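The chain of conditional bids can be sketched as a simple loop. This is an illustration under simplifying assumptions of mine, not a description of any Ebay interface: the auctions are assumed to resolve one after another, the rival's highest bid in each is known, and ties go to the rival. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConditionalBid:
    auction_id: str
    max_bid: float  # the buyer's value for this particular copy


def run_conditional_bids(
    bids: List[ConditionalBid], rival_high_bids: Dict[str, float]
) -> Optional[str]:
    """Place bids in order; each is entered only if all preceding ones lost.

    Returns the id of the auction the buyer wins, or None if outbid everywhere.
    """
    for bid in bids:
        rival = rival_high_bids.get(bid.auction_id, 0.0)
        if bid.max_bid > rival:
            return bid.auction_id  # won: the later bids are never placed
        # outbid: this event triggers the next bid in the sequence
    return None


if __name__ == "__main__":
    sequence = [
        ConditionalBid("copy-A", 50.0),
        ConditionalBid("copy-B", 40.0),  # worse condition, lower value
        ConditionalBid("copy-C", 60.0),
    ]
    print(run_conditional_bids(sequence, {"copy-A": 55.0, "copy-B": 35.0, "copy-C": 10.0}))
```

The buyer loses the first auction, wins the second and never bids on the third, so at most one copy is bought.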

Faster selling times benefit everyone: sellers sell faster, buyers do not have to waste time checking whether they have been outbid and then making the next bid, Ebay can charge higher fees to appropriate part of the increased surplus from greater efficiency. Ebay can also use the data on which items buyers consider similar enough to classify products and remove duplicate ads.

A browser extension or app can provide the same functionality: an email with title containing “You have been outbid” triggers code that logs in the user (with the credentials saved into a password manager or the browser) and types in a bid on the next copy of the item.

Feedback requests by no-reply emails

“We value your feedback” sent from a no-reply email address shows not only that the feedback is not valued, but also that the organisation is lying. More generally, when someone’s words and deeds conflict, then this is informative about his or her lack of truthfulness. If in addition the deeds are unpleasant, then this is the worst of the four possibilities (good or bad deeds combined with honest admission or lying).

The fact of sending such no-reply feedback requests suggests that either the organisations doing it are stupid, needlessly angering customers with insincere solicitations, or believe that the customers are stupid, failing to draw the statistically correct (Bayesian) conclusion about the organisation.

Some organisations send an automated feedback request by email (Mintos) or post (Yale Student Health) in response to every inquiry or interaction, even ones that clearly did not resolve the problem. The information about the non-resolution could easily be scraped from the original customer emails, without wasting anyone’s time by asking them to fill out feedback forms. The inefficient time-wasting by sending feedback requests is again informative about the organisation.

Online check-in lies

Almost all airlines advertise the option to check in online and send email reminders to do so. In my experience, some airlines (Qantas, Air New Zealand and Qatar Airways) frequently do not allow online check-in despite falsely claiming that it is always available, or only unavailable to underage people and large groups. Email reminders to check in online seem like mockery in this case, but are still sent.

The false advertising of online check-in wastes customers’ time by encouraging them to start the data entry process. Often the process can be almost completed and only at the end does a message appear saying that online check-in is unavailable. To reduce the wasted time, the process should be stopped as soon as possible whenever it cannot be completed but is nonetheless started. It seems a simple IT fix to not send the automated reminder emails when online check-in is unavailable, and display the message “Online check-in unavailable” at the start of the data entry process instead of at the end.

A similarly ironic tone to falsely advertising online check-in is achieved by sending “we value your opinion” emails from a no-reply email address, or claiming to listen to customers but providing no contact email or phone on the website. Such mockery is practiced by many large companies. Sometimes the firms provide a feedback form that is user-unfriendly and requests lots of personal data. Or they may refer inquiries to a very limited FAQ section. The FAQ sometimes lists questions no real customer would ask, along the lines of “What makes your product so excellent?” These questions are in the FAQ just to let the company repeat their marketing slogans.

Recording speaking time to prevent meetings from running over

One way to prevent meetings from running over because some people like to listen to their own voice is to publish how much of others’ time each participant took. Measuring the talking time and making the results public helps participants with low self-awareness realise how long they talked, and creates social disapproval of those who go on for too long, potentially motivating them to be more concise.

A related method to prevent time overruns using current meeting rules, e.g. Robert’s Rules, is to allocate each speaker a fixed amount of time in advance. The problem with this method is the lax enforcement both during and after the meeting. If a speaker goes over and does not respond to requests to stop, then the moderator or chairperson usually does not shut the speaker up (turn off the microphone, forcefully remove the waffler from the stage, clamp a hand over their mouth). After the meeting, the possible sanctions (e.g. not inviting the speaker to future meetings, monetary fine, opposing the speaker’s proposed policy) are also infrequent or weak. Of course this enforcement problem also arises when talk time is recorded and published. However, the clear measurement removes one excuse of the speakers going over, namely their flat denial that they took more time than allocated, or more than others.

Public time-recording is especially helpful in less formal meetings that have no moderator or chairperson keeping time and notifying speakers to stop, and in meetings where a speaker is powerful enough that other participants are reluctant to interrupt with reminders of the time limit. A timekeeper is not needed to record the duration of a speech nowadays, because smartphones can identify a person based on their voice and calculate the time for which each voice spoke. There is a business opportunity in developing an app that identifies the number and timing of the speakers. The resulting data could also be used for research into social dynamics, e.g. whether some age, gender or race groups speak less, whether people in positions of power talk and interrupt more.

A smartphone app can also play a notification sound when a speaker’s time is up, eliminating the problem that the less powerful participants do not remind an important speaker to stop. In large meetings with a microphone, a computer keeping track of speech durations can force a speaker to stop by cutting power to the microphone when the time is up. A computer may be attached to other means to stop a speaker from unreasonably taking others’ time, e.g. it may draw the stage curtain, turn off the stage lights or start noise-cancelling the speech.
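Once voices have been identified, the bookkeeping is simple. The sketch below assumes that some voice-recognition step (not shown) has already produced a list of segments of the form (speaker, start second, end second). It adds up each speaker's total and lists those over a chosen limit. The function names and the example data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Segment = Tuple[str, float, float]  # (speaker label, start second, end second)


def talk_time(segments: Iterable[Segment]) -> Dict[str, float]:
    """Total speaking time in seconds for each speaker."""
    totals: Dict[str, float] = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)


def over_limit(totals: Dict[str, float], limit_seconds: float) -> List[str]:
    """Speakers whose total exceeds the limit, in alphabetical order."""
    return sorted(name for name, total in totals.items() if total > limit_seconds)


if __name__ == "__main__":
    meeting = [("Ann", 0, 60), ("Bob", 60, 90), ("Ann", 90, 200)]
    totals = talk_time(meeting)
    print(totals)
    print(over_limit(totals, limit_seconds=120))
```

The same totals could be published after the meeting or checked live to trigger the notification sound or the microphone cut-off.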

Real estate website improvements

Almost all real estate websites I have read are missing the essential information about a lease from their search filters. The core info is whether the property is available from a certain date to a certain date, whether the minimum and maximum lease lengths allow a tenancy for the specified term, and what the monthly rent is for that term. Some websites specify that information in the description of the property, but do not allow searches based on it. On some websites, the rent doubles when the term is halved, which would be good to know from the start instead of after clicking on a search result.
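A search filter on these three criteria could look roughly like the sketch below. The data model is my assumption rather than any existing website's: each listing carries its availability window, its minimum and maximum lease lengths in days, and rent tiers given as (minimum term in days, monthly rent), so that the rent can depend on the term.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Listing:
    name: str
    available_from: date
    available_to: date
    min_lease_days: int
    max_lease_days: int
    rent_tiers: List[Tuple[int, float]]  # (minimum term in days, monthly rent)


def rent_for_term(listing: Listing, term_days: int) -> Optional[float]:
    """Monthly rent from the tier with the largest minimum term not exceeding term_days."""
    best: Optional[Tuple[int, float]] = None
    for min_days, rent in listing.rent_tiers:
        if term_days >= min_days and (best is None or min_days > best[0]):
            best = (min_days, rent)
    return None if best is None else best[1]


def search(listings: List[Listing], move_in: date, move_out: date,
           max_rent: float) -> List[Tuple[str, float]]:
    """Listings available for the whole stay, with an allowed term and affordable rent."""
    term_days = (move_out - move_in).days
    results = []
    for listing in listings:
        if listing.available_from > move_in or listing.available_to < move_out:
            continue  # not available for the whole period
        if not listing.min_lease_days <= term_days <= listing.max_lease_days:
            continue  # lease length not allowed
        rent = rent_for_term(listing, term_days)
        if rent is None or rent > max_rent:
            continue
        results.append((listing.name, rent))
    return results


if __name__ == "__main__":
    flats = [
        Listing("A", date(2024, 1, 1), date(2024, 12, 31), 30, 365, [(30, 2000.0), (180, 1000.0)]),
        Listing("B", date(2024, 3, 1), date(2024, 12, 31), 30, 365, [(30, 900.0)]),
    ]
    print(search(flats, date(2024, 2, 1), date(2024, 5, 1), max_rent=2500.0))
```

In the example, listing B is dropped because it is not yet available on the move-in date, and listing A shows the higher short-term rent up front instead of after clicking through.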

If the property is only rented to a specific class of people, e.g. requires a number of years of good rental history, a certain income level, etc., then it would be good to know this at the start of the search, instead of during the application process.

Almost useless info like hardwood floors, granite countertops, historic building, etc should be removed or relegated to the bottom of the page.

In general, any search website should allow the user to remove specific results (that the user has deemed irrelevant) from future searches, like Craigslist does. Being able to save the search like in Craigslist is also a useful feature.