Sunday, 19 February 2017

The Data Science Process with Azure Machine Learning

It’s no secret that our applications and devices generate enormous amounts of data, which has made data analytics a very hot topic. Microsoft Azure has all the tools necessary to ingest, manage and process this data, often called Big Data.
However, this data is not useful in itself unless it is processed, interpreted and visualized correctly. Another strength of data collected over the years is that it enables Predictive Analytics, that is, using the data to make forecasts and predictions.
But raw data alone is difficult to analyze. Before it can be used, it needs to be cleansed, transformed and processed into a format that we can use to build Predictive Models. This process is called the Data Science Process.
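To make the cleanse and transform steps a bit more concrete, here is a minimal sketch in plain Python. It is not how Azure Machine Learning does it; the sample rows, the field names (device, temperature) and the helper functions are all assumptions made up for illustration. It simply drops rows that cannot be parsed, tidies up the text field, and rescales the numeric field so that a model could consume it.

```python
from typing import Optional

# Hypothetical raw readings; the field names are invented for this example.
raw_rows = [
    {"device": " sensor-a ", "temperature": "21.5"},
    {"device": "sensor-b", "temperature": ""},      # missing value
    {"device": "sensor-c", "temperature": "n/a"},   # unparsable value
    {"device": "SENSOR-D", "temperature": "30.5"},
]


def cleanse(row: dict) -> Optional[dict]:
    """Tidy the text field; return None if the temperature cannot be parsed."""
    try:
        temperature = float(row["temperature"])
    except ValueError:
        return None
    return {"device": row["device"].strip().lower(), "temperature": temperature}


def min_max_scale(values: list) -> list:
    """Rescale a list of numbers to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Cleanse: keep only the rows that survived parsing.
clean_rows = [r for r in (cleanse(row) for row in raw_rows) if r is not None]

# Transform: scale the numeric column into a model-friendly range.
scaled = min_max_scale([r["temperature"] for r in clean_rows])

# Process: assemble the final feature records.
features = [
    {"device": r["device"], "temperature_scaled": s}
    for r, s in zip(clean_rows, scaled)
]
print(features)
```

In a real project the same ideas apply at a much larger scale, and dropping bad rows is only one of several options; filling in missing values is often a better choice, depending on the data.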
The Data Science Process
Before the “Buzz Words”, the Cross Industry Standard Process for Data Mining (CRISP-DM) presented a data mining process model that describes the approaches data mining experts commonly use to tackle problems.

[Figure: the CRISP-DM process model]
As you can see in the illustration above, the proposed model consists not only of technical steps but also focuses on understanding the business process and applications before moving on to the data preparation, modelling and evaluation steps.