Tag Archives: data

Online reviews should include more facts

Online reviews are a public good and increase social welfare, but they could be improved by including more concrete data. For example, a restaurant or grocery store review should list the prices of specific foods. A review of a bar or function venue could estimate the number of tables and seats and the distance between tables, thus quantifying how cramped the room is. Currently, most reviews on Google Maps, Yelp and other similar sites are vague, just stating that the reviewer had a bad or great experience, that the staff were helpful or not, etc.
The purpose of a review is (hopefully) to help others (although some people just write rants to vent their emotions). Facts in reviews would help others more than opinions. Photos of the establishment and the food are useful, because they provide factual information. Some photos are more helpful than others. For example, it is more useful to see the inside than the outside of a venue. It is not very useful to see a picture of the outdoor sign of the establishment, but a readable photo of the menu conveys lots of information. In the future, Google Maps and competitors could automatically extract text from photos that contain it, and display the information in search results. Then photos of the menu, or of prices in a grocery store would be even more useful.
The idea for this post came from fruitlessly searching the web for current prices of groceries in different supermarkets in town. It would have been helpful if recent reviews of these supermarkets had included prices of at least some items.
The grocery price comparison apps that I tried had the limitation that prices were listed for specific branded products and per package (e.g. Organic Carrots 500g), not per kilogram of a generic product (e.g. 1 kg of carrots). This made it difficult to compare general pricing across shops, because each shop stocks a different range of brands, and only the price of the exact same branded product can be compared directly.
An easy fix to improve the apps would be to allow users to specify which differently-branded products should be treated as identical, for example “Coles orange juice 2 litres” is the same for me as “Woolworths orange juice 2 litres”. Merging similar products would also reduce the memory requirement of the app, because the product database would have fewer entries to keep track of.
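The merging idea above can be sketched in code. The products, prices, package sizes and equivalence pairs below are all invented for illustration; a real app would take them from its product database and from user input.

```python
# Sketch: merge user-declared equivalent products and compare per-unit prices.
# All product names and prices below are made-up examples.

products = {  # product name -> package price in dollars
    "Coles orange juice 2 litres": 5.50,
    "Woolworths orange juice 2 litres": 5.00,
    "Organic Carrots 500g": 2.40,
    "Carrots 1kg": 3.00,
}

sizes = {  # package size in litres or kilograms, for per-unit normalisation
    "Coles orange juice 2 litres": 2.0,
    "Woolworths orange juice 2 litres": 2.0,
    "Organic Carrots 500g": 0.5,
    "Carrots 1kg": 1.0,
}

# User-declared equivalences: each pair should be treated as one product.
equivalent = [
    ("Coles orange juice 2 litres", "Woolworths orange juice 2 litres"),
    ("Organic Carrots 500g", "Carrots 1kg"),
]

def canonical_map(pairs):
    """Union-find over product names: map each name to one representative."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in pairs:
        parent[find(a)] = find(b)
    return {name: find(name) for name in parent}

def cheapest_per_unit(products, sizes, pairs):
    """For each merged product group, return the lowest per-unit price."""
    canon = canonical_map(pairs)
    best = {}
    for name, price in products.items():
        group = canon.get(name, name)  # unmerged products stand alone
        per_unit = price / sizes[name]
        best[group] = min(best.get(group, float("inf")), per_unit)
    return best
```

With the sample data, the two juice entries collapse into one group priced at $2.50 per litre, and the two carrot entries into one group at $3.00 per kilogram, so per-unit prices become comparable across shops and brands.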

Restaurant learning what food people like

A restaurant chain can collect data on what food people like by examining the plates collected from the tables – the more leftovers relative to the size of the dish, the less popular the food. However, looking at the plates and entering the data takes time. It would be much faster to automate the process. For example, there could be a small conveyor belt for dirty dishes brought back from the eating area. The dishes would be weighed to record the amount of leftovers before scraping and washing. To detect which food was left over, one option is for a camera above the belt to photograph the leftovers and a computer to identify the food. This is a complicated machine vision and machine learning problem. A simpler option is to serve different dishes on plates with different shapes, or with patterns such as lines and circles that are easily distinguished by a computer. Then the plate identifies the dish for the camera, similarly to colour-coded plates identifying the price at sushi-train restaurants.
Even less costly in terms of computation (and without any camera requirement) would be to put RFID tags or other remote-id technology in plates. Each dish would have to be served on a plate with a dish-specific RFID, so the returned plates can be exactly matched to the food served on them. Each plate becomes more costly, but not by much, because RFID tags are cheap.
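The RFID scheme amounts to a lookup table plus a subtraction. A minimal sketch, with invented tag IDs, dishes and weights (a real system would read the tag and the scale automatically):

```python
# Sketch: track leftovers per dish via RFID-tagged plates.
# Assumed registry: tag id -> (dish, empty plate weight g, serving weight g).
# All values below are made up for illustration.

from collections import defaultdict

registry = {
    "tag-001": ("pad thai", 400, 350),
    "tag-002": ("green curry", 420, 300),
}

leftover_fraction = defaultdict(list)  # dish -> list of leftover fractions

def record_return(tag_id, returned_weight_g):
    """Record the fraction of a serving left over when a plate comes back."""
    dish, plate_g, serving_g = registry[tag_id]
    leftover = max(returned_weight_g - plate_g, 0)  # never negative
    leftover_fraction[dish].append(min(leftover / serving_g, 1.0))

def popularity_ranking():
    """Rank dishes from fewest to most average leftovers (most popular first)."""
    avg = {d: sum(f) / len(f) for d, f in leftover_fraction.items()}
    return sorted(avg, key=avg.get)
```

For example, a pad thai plate returned at 420 g means only 20 g of a 350 g serving was left, while a green curry plate returned at 570 g means half the serving was left, so pad thai ranks as the more popular dish.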
A single restaurant could also collect data on leftovers, but a chain of restaurants would get a larger dataset faster, and thus obtain useful information sooner about which dishes to keep and which to discontinue.

App to measure road quality

The accelerometers in phones can detect vibrations, such as when the car that the phone is in drives through a pothole. The GPS in the phone can detect the location and speed of the car. An app that combines the jolt, location and speed (and detects whether the phone is in a moving car based on its past speed and location) can automatically measure the quality of the road. The resulting data can be automatically uploaded to a database to create an almost real-time map of road quality. The same detection and reporting would work for bike paths.
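The detection step described above reduces to a filter over sensor samples. A minimal sketch, where the sample fields and both thresholds are assumptions for illustration (a real app would tune them and smooth over consecutive samples):

```python
# Sketch: flag likely potholes from phone sensor samples.
# Field names and thresholds are invented; a real app would calibrate them.

from dataclasses import dataclass

@dataclass
class Sample:
    t: float         # timestamp, seconds
    lat: float       # GPS latitude
    lon: float       # GPS longitude
    speed_ms: float  # GPS speed, metres per second
    accel_z: float   # vertical acceleration in m/s^2, gravity removed

JOLT_THRESHOLD = 6.0  # m/s^2 spike taken to indicate a pothole (assumed)
DRIVING_SPEED = 3.0   # m/s; below this the phone is probably not in a moving car

def detect_potholes(samples):
    """Return (lat, lon) of samples with a large jolt while the car is moving."""
    return [(s.lat, s.lon) for s in samples
            if s.speed_ms >= DRIVING_SPEED and abs(s.accel_z) >= JOLT_THRESHOLD]
```

The speed check is what distinguishes a pothole from, say, the phone being dropped while the car is parked; each flagged (latitude, longitude) pair would then be uploaded to the shared road-quality database.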
Perhaps such an app has already been created, but if not, then it would complement map software nicely. Drivers and cyclists are interested in the quality of the roads as well as the route, time and distance of getting to the destination. Map software already provides congestion data and takes traffic density into account when predicting arrival time at a destination. Road quality data would help drivers select routes to minimise damage to vehicles (and the resulting maintenance cost) and to sensitive cargo. This would be useful to trucking and delivery companies, and ambulances.
A less direct use of the road quality data collected by the app is in evaluating the level of local public services provided (one aspect of the quality of local government). Among municipalities with the same climate, soil and traffic density, those with worse roads are probably less well run. For developing countries where data on governance quality and spending is difficult to get, road quality may be a useful proxy. Public services are correlated with the wealth of a region, so road quality is also a proxy for poverty.

Empirical project ideas with econjobmarket and AEAweb JOE

The websites econjobmarket.org and AEAweb JOE are centralized job finding sites for economics PhDs. These have databases of application materials of thousands of job candidates, and the interviews many of them got. The subsequent jobs and publications of the job candidates are listed on the web. There are many empirical projects that can be done with this data, for example how certain keywords in recommendation letters predict the job that a candidate gets, or how the CV at the time of job application predicts future performance. One comparison that has been done in the sciences (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2572075/) is how recommendations of male and female candidates differ, i.e. what words are frequently used for one gender that are not used for the other. It is likely that economics recommendation letters contain similar biases.
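The word-comparison study mentioned above can be sketched simply: count word frequencies in each group of letters and report the words exclusive to one group. The toy letters below are invented; real input would come from the sites' databases (with appropriate permissions and anonymisation).

```python
# Sketch: find words used in one group of recommendation letters but not
# the other. The example letters are made up for illustration.

import re
from collections import Counter

def word_counts(letters):
    """Count lowercase word occurrences across a list of letter texts."""
    counts = Counter()
    for text in letters:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

def distinctive_words(group_a, group_b):
    """Words appearing in one group's letters but never in the other's."""
    a, b = word_counts(group_a), word_counts(group_b)
    return set(a) - set(b), set(b) - set(a)
```

A more careful analysis would compare relative frequencies (e.g. log-odds ratios) rather than strict presence or absence, but even this crude set difference surfaces the kind of gendered vocabulary gaps the cited study found in the sciences.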
The professors of top universities who have access to the databases of the job market websites have an advantage in hiring. They can predict which candidates will perform well in the future and offer jobs to them. The employers without access to the databases are left with the less promising candidates.