Jasleen Kaur

August 2015


Crowdsourcing is a process that begins with a person who needs a service and ends with that person receiving it. The service is provided by a community whose members take part in order to learn or in return for an incentive. This differs from the traditional model of employees providing services: it involves a “crowd” of online service providers who interact on a platform to analyze and solve problems.


Crowdsourcing in data analytics is an emerging field and, I must say, a very promising one. Here, I am not thinking of an individual or a company, but of the “crowd” that gets to learn, solve, earn, and keep on learning. Companies like CrowdANALYTIX, Kaggle, Topcoder, and InnoCentive outsource data analytics projects to the public by hosting them as competitions on their platforms. Community members participate to win. For client-based projects, the incentive to win is the attractive prize money. The incentive to participate at all is the chance to learn analytics for free! These companies share the code and scripts of the best models on their sites, where anyone (literally anyone!) can view them and learn where they went wrong, or come up with new methods of mining data. Some of these companies, such as CrowdANALYTIX and Kaggle, also host “Learning Contests” and “Knowledge contests” respectively, which involve no payment or receipt of money. They exist solely to impart Machine Learning knowledge to the solvers.


In my view, the advantages of the crowdsourcing and data analytics combo are many. Firstly, the process is less expensive than the traditional one of employees solving projects for their client. Also, since skilled data analysts are few, offering projects to the community opens up new possibilities and methods for approaching the same problem, and the company can choose the best solutions to offer the client. Secondly, the solvers get to learn a lot. With a system in place that combines fun and incentives, they get to build models on real data and experiment with it freely. Who wouldn’t want to learn and be paid for it? The solvers are also motivated to learn software such as R, Python, and Tableau. Companies like CrowdANALYTIX also host Research Contests. These contests require the solvers to put on their research hats, formulate hypotheses, state factors supporting or contradicting their arguments, and search for data sources to validate those arguments. You can have a look at one of my favorite Research contests here.


A perfect balance of academics and industry, this combo has a lot of scope in the coming years. With the realization that “Big data” is everywhere, “Big data Analysts” should be everywhere too! And of course, these contests and the knowledge shared on contest platforms are the best place to start.
Are you inspired? You sure are.
Join communities and start solving. A lot of data is waiting to be mined and grilled by you.