The missing innovation

A random conversation about Farmville with C this morning got me thinking about how successful online games are at enticing players to spend money on virtual goods. While it is intriguing to me how the internet has enabled new ways of making money, these offerings ultimately appeal to certain fundamental human desires that are constant over time. In the case of online games, the desire to win and the desire to be part of a community are among the driving factors.

Cafeteria style food delivery

The food delivery business needs to solve an array of issues in order to achieve profitability. Fundamentally, a lack of product differentiation and a low barrier to market entry drive margins down to low or non-existent levels. But as a consumer, what frustrates me the most is how much time I spend going through a seemingly endless list of choices, only to be presented at the end with a delivery fee that is more than I am willing to pay, and then having to cancel the order altogether.

When Paying is better than Free

The public tennis courts in the DMV (DC/Maryland/Virginia) area are always in high demand. On a weekend with nice weather, it’s not uncommon for players to have to wait about 40 minutes for a court to free up. At the same time, many players are unaware that the courts can actually be reserved online in advance for a fee of $10/hour, an amount that many would gladly pay to avoid waiting aimlessly. The local government can also collect revenue from running a reservation system, making this a rare win-win situation.

Generalized Additive Model

In many social science and business problems, it is often more important to explain why a phenomenon happens than to improve a model’s ability to predict it. Having an interpretable model is, therefore, crucial in understanding how different factors relate to the outcome of interest. A model’s interpretability is also important in highly regulated business environments, such as loan approval decisions. Even in situations where prediction accuracy is more important than the “why”, an interpretable model can help debug more complicated models and guide new approaches to feature engineering and data preprocessing.

Algorithms, part 2

I finished the Algorithm Specialization offered by Prof. Tim Roughgarden of Stanford University on Coursera a few days ago. It has been a wonderful online learning journey, albeit a challenging one. Part II of the specialization focuses on topics such as Dynamic Programming, Greedy Algorithms, and computational tractability. Passing the course requires a deep understanding of the content rather than merely memorizing it at face value.

Spark for Big Data Analysis

I recently completed Big Data Analysis with Scala and Spark on Coursera. While I have previously used PySpark at work, this course provided a really nice overview of the pros and cons of using different data types (RDD vs DataFrame vs Dataset) and explained why I sometimes did not get the performance I expected. (Understand the difference between transformations and actions, and use persist() wisely!)

Reimagining Peer Review

It is often a test of resolve for researchers to go through peer review. No matter how successful you are as an academic researcher, your articles have probably been rejected by journals at least once, and you have probably wondered if your papers will ever find a home.

Divide and Conquer

I recently started and finished two algorithm courses offered by Stanford on Coursera. The courses were more challenging than I originally anticipated. Tracing through the logic of the code often requires you to be completely alert and focused, and it sometimes feels like a mental gymnastics class. Still, it was incredibly satisfying to see my code work and run fast! As a data scientist, I often write one-off analysis scripts that place little emphasis on computational performance. So naturally, I did not know how much of a difference efficient code and suitable data structures make until I forced myself to think like a software engineer.

Classifying Images of Individuals wearing Uniforms

The motivation for this project comes from wanting to understand whether certain media outlets are more likely to post pictures of individuals in uniform to emphasize a narrative that resonates or sympathizes with the authorities. For example, when covering protests, do media outlets feature more pictures of the police or of protestors to promote a certain narrative? At the ICA conference last year, political science researchers were just starting to analyze news images, and I think this could be a very fruitful area of research.

How do textbooks in mainland China, Hong Kong and Taiwan differ?

The growth of nationalism in mainland China, particularly among the younger generation, is hard to miss. This presents a striking contrast to young people in Hong Kong and Taiwan, who are more eager to embrace “Western values” such as democracy and individual freedom. Researchers have suggested that history education has played an important role in mobilizing popular support for the CCP.
