December 16, 2012

Managing Messy Data with the Microwork™ Model

By: Lauren Schulte, Director, Marketing & Communications

Four years ago we started Samasource and the Microwork™ model was born. What started as an altruistic model has grown into a viable business solution for some of the world’s leading enterprises. Our Microwork™ platform has been transformative for our clients’ businesses. People who haven’t heard of a Microwork™ platform, however, often have a difficult time understanding what it is and how it helps enterprises be more competitive and profitable.

To define microwork, we have to start with our clients. All of our clients have one thing in common: they work with a large amount of data. Data has become a crucial part of many organizations’ core business. This data has to be accurate, and it often needs manual verification or judgment by humans, which is expensive and time-consuming. We partner with large enterprises such as Google, Microsoft, eBay, and Walmart.com to help solve their enormous, and often messy, data challenges.

When our clients talk about messy data they often have a broad spectrum of data challenges. These needs range from creating data sets that are fed into machine algorithms for computer vision, to conducting sentiment analysis for social media, to mining data.

Our client services team takes these big, messy data projects and loads them onto our proprietary technology platform called the SamaHub. The SamaHub breaks down large data projects into very small digital tasks. The combination of our specialized, hands-on client services and our proprietary technology makes up our one-of-a-kind Microwork™ model.

We liken the Microwork™ model to the Ford assembly line; you can’t easily teach one worker to build a car, but you can easily teach someone to install a steering wheel, add a tire or fasten a bolt. At the end of the line, a new car is fully assembled, as a result of a large group of workers completing a number of small, individual tasks. In terms of the type of work our agents perform, these tasks can include looking up a business address, tagging an image, or transcribing an audio clip. While these tasks are being completed by workers, the SamaHub is automating up to five steps of quality assurance to ensure accuracy of every task. Once the tasks are completed, the project is recompiled by our client services team and delivered to our clients.
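To make the assembly-line idea concrete, here is a minimal Python sketch of the split, complete, check, and recompile flow described above. It is an illustration only, not the SamaHub's actual implementation: the names `Microtask`, `split_project`, `passes_basic_check`, and `recompile` are hypothetical, and the single non-empty check stands in for the automated quality assurance steps, whose details are not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Microtask:
    """One small unit of work (hypothetical structure for illustration)."""
    task_id: int
    kind: str                      # e.g. "image_tagging" or "transcription"
    payload: str                   # the item the agent works on
    result: Optional[str] = None   # filled in when an agent completes the task


def split_project(records: List[str], kind: str) -> List[Microtask]:
    """Break a large project into one microtask per record."""
    return [Microtask(task_id=i, kind=kind, payload=r)
            for i, r in enumerate(records)]


def passes_basic_check(task: Microtask) -> bool:
    """Placeholder QA step (an assumption): the result must be non-empty."""
    return task.result is not None and task.result.strip() != ""


def recompile(tasks: List[Microtask]) -> List[str]:
    """Reassemble results in their original order once every task passes QA."""
    failed = [t.task_id for t in tasks if not passes_basic_check(t)]
    if failed:
        raise ValueError(f"tasks failed QA: {failed}")
    return [t.result for t in sorted(tasks, key=lambda t: t.task_id)]
```

In this sketch, a project of three images becomes three independent tasks that can be handed to different agents in any order, and `recompile` puts the results back in the original order only after every task has passed the check.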

The beauty of the Microwork™ model is that almost anyone can be taught how to perform small tasks with a very high-quality output. This is where impact sourcing plays a role: we aim to employ the world’s bottom billion. We give this work to poor women and youth in nine countries around the world. These are people who previously lived on less than $3 per day. We vet and train our agents, teach them to specialize in different types of skills (such as image tagging or data mining), and manage them by providing real-time feedback. Our agents are able to learn marketable skills and earn a living wage within a matter of weeks.

What makes the Microwork™ model a viable business opportunity is that our process yields extraordinarily high-quality results. Our clients have reported that crowdsourcing companies (which use a random, unmanaged crowd) typically produce results with accuracy around 76%. We are redefining the industry standard by committing to 95%+ quality in our client SLAs. In fact, just last week one of our clients did an internal evaluation of our quality and found that our data was 99.62% accurate (in real time). Results such as these have allowed us to retain 100% of our client base this year.

At Samasource, we don’t ask our clients to create new work for us. Instead, we partner with them to create innovative ways to solve the messy data challenges they are already struggling with, and deliver the high-quality results that they require. Our Microwork™ model has already been proven to be a viable solution for our clients and will continue to grow as a staple for hundreds of enterprises around the world.
