Machine Learning Compared to Picking the Best Mangoes – Data Digest

We’re back with our round-up of the biggest news on big data, business intelligence, and data visualization. Going forward, we’ll publish it on a bi-weekly schedule, which we think is the ideal frequency.

1. Explaining machine learning to non-computer-science people

Quora is a great place to find interesting conversations between today’s prominent thinkers. One conversation particularly caught my interest this week: it tries to put machine learning in layman’s terms. Many people offer their best stories and examples to explain it, but the top-voted reply is an entertaining illustration based on the way we pick mangoes in the market. It’s a great read for experts looking for an easy way to explain what they do, or for novices seeking to wrap their minds around the term. Here’s an excerpt:

You take a randomly selected specimen of mangoes from the market (training data), make a table of all the physical characteristics of each mango, like color, size, shape, grown in which part of the country, sold by which vendor, etc (features), along with the sweetness, juicyness, ripeness of that mango (output variables). You feed this data to the machine learning algorithm (classification/regression), and it learns a model of the correlation between an average mango’s physical characteristics, and its quality.

The writer ends with a catchy definition of machine learning: “Making your algorithms smart, so that you don’t need to be. ;)”. Head over to Quora to read this and the other amusing answers. It’s probably the lightest read on this otherwise heavy topic.
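
For readers who like to see the idea in code, here is a minimal sketch of the mango example. It is not taken from the Quora answer: the data values, the two features (yellowness and softness), and the function name `predict_sweetness` are all invented for illustration, and the learning method is a simple nearest-neighbour average, chosen only because it fits in a few lines.

```python
from math import dist

# Hypothetical training data (all values invented for illustration).
# Each row: ((yellowness 0-1, softness 0-1), sweetness score 0-10)
TRAINING_MANGOES = [
    ((0.9, 0.8), 9.0),
    ((0.8, 0.7), 8.0),
    ((0.3, 0.2), 3.0),
    ((0.2, 0.1), 2.0),
    ((0.6, 0.5), 6.0),
]


def predict_sweetness(features, training=TRAINING_MANGOES, k=2):
    """Average the sweetness of the k training mangoes most similar to `features`."""
    # Sort the known mangoes by how close their features are to the new mango.
    nearest = sorted(training, key=lambda row: dist(row[0], features))[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Two mangoes we have not tasted yet.
ripe_looking = predict_sweetness((0.85, 0.75))
green_looking = predict_sweetness((0.25, 0.15))
print(ripe_looking, green_looking)
```

The table of tasted mangoes plays the role of the training data, the tuple of physical characteristics is the feature vector, and the sweetness score is the output variable. A real system would use far more mangoes and features, and usually a library rather than hand-written code.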

2. Facebook’s white paper ‘A Focus on Data Efficiency’

This past month Facebook, Qualcomm, and Ericsson released an extensive white paper on how they’re making their work with data more efficient, and on the challenges they face with existing spectrum standards.

Here are some of the main points from the white paper:

  • How Facebook built and open-sourced the HipHop Virtual Machine execution engine and achieved a 500% increase in server throughput
  • How Open Compute technologies allow Facebook to house 250 billion photos without hang-ups
  • How Facebook created Air Traffic Control to help engineers handle different network conditions like bandwidth, latency, packet loss, corrupted packets, and packet ordering
  • How Facebook uses Google’s WebP image compression technology to save over 20% of total network traffic, without loss of image quality
  • How Qualcomm is working on technology that can handle 1000 times more network traffic than today

The white paper proposes solutions for giving developing countries easy internet access. It is full of proven examples of how these companies, especially Facebook, are solving tomorrow’s problems with technology available today.

If you’re in the mood for a long read, here’s the complete white paper:

3. Data Visualization Makes Sailing More Enjoyable

We have all watched rowing races at some point and found them exciting to see and easy to follow. A single person or a team races mostly side by side with the others, in the same direction, trying to reach the finish line first. It’s all about muscle power and teamwork. Sailing, however, is a totally different case. With the complex routes taken by the sailboats, and boundaries that only avid watchers can spot, anyone watching it on television would find it hard to follow.

This year’s America’s Cup brought in Stan Honey, an expert at using data visualization to change the way we watch popular sports, to make the race more interesting to a television audience. The result was a fresh way to watch sailing on television, with boundaries, penalties, and other aspects of the sport visualized in an easy-to-understand manner. Honey describes the project as an effort to take sailing to a non-sailing audience. In the video below, he talks about the challenges of capturing data from video cameras in helicopters and from other measuring devices placed in the water. The team talks about how important it was to sync the data precisely with the live video and to know exactly where everything (boats, boundaries, routes) is. It’s a fascinating watch, and it shows how data visualization is moving beyond the confines of business into the sporting world as well.
