Developing a Recommender Solution with Azure Machine Learning

While preparing my presentation for the Developer's Conference on Machine Learning, I got the idea to build a demo of a recommender engine.
Ever wondered how websites like Amazon and eBay provide you with useful suggestions and recommendations? This blog post is for you!






Introduction

For a complete introduction to Machine Learning in general and to Azure Machine Learning in particular, please read my previous blog post here.

Designing the Experiment


Below are the steps to develop the experiment:

1. Add the dataset

In AzureML, you may upload your existing dataset or load one from an Azure Database, Azure Blob Storage, a Data Feed Reader, a Web Service or a Hive Query.

In this example we shall add the Movie Ratings Sample Data.

image


The Movie Rating sample has the following columns:

image


2. Exclude the columns that shall not be needed

To do so, the Project Columns tool can be used. Add it to the experiment.

image

Now, from the right menu, select "launch column selector" to select the fields we do not need. Here, we shall exclude the timestamp column.

image
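Outside the designer, excluding a column amounts to keeping every field except the unwanted one. Here is a minimal sketch in plain Python; the rows and column names are made up for illustration and are not the actual schema of the Movie Ratings sample.

```python
# Hypothetical rows; the column names are assumptions, not the sample's real schema.
ratings = [
    {"UserId": 1, "MovieId": 10, "Rating": 8, "Timestamp": "2013-01-05"},
    {"UserId": 2, "MovieId": 10, "Rating": 6, "Timestamp": "2013-02-11"},
]


def exclude_columns(rows, excluded):
    """Return copies of the rows without the excluded column names."""
    return [{k: v for k, v in row.items() if k not in excluded} for row in rows]


projected = exclude_columns(ratings, {"Timestamp"})
print(projected[0])
```

The original rows are left untouched, much as the tool in the experiment passes a new dataset to the next step.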

3. Split the data
We now need to partition the data into 2 distinct sets:
     - Train Data : Used to “train” the recommender
     - Test Data : Used to validate the results of the recommender

Drag the split tool and connect it as below.

image


Deciding how much data to use for training and how much for testing is subjective.

image

The ratio should be typed as a decimal number between 0 and 1 to represent the percentage of rows sent to the first output dataset.
For example, if you type 0.75 as the value, the dataset would be split by using a 75:25 ratio, with 75% of the rows sent to the first output dataset, and 25% sent to the second output dataset.
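To make the ratio concrete, here is a small sketch of a 0.75 split in plain Python. The helper name `split_rows`, the shuffling and the fixed seed are assumptions made for this illustration; they are not a description of how the split tool works internally.

```python
import random


def split_rows(rows, fraction, seed=0):
    """Shuffle a copy of rows; send `fraction` of them to the first output."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 rating rows
train, test = split_rows(rows, 0.75)
print(len(train), len(test))  # 75 25
```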


4. Add the Train Matchbox Recommender
The Train Matchbox Recommender trains a recommendation model based on the Matchbox recommender engine. It has the ability to learn about people’s preferences from observing how they rate items such as movies, content, or other products.
This is where learning occurs.

image
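To give a feel for how "related" items can be derived from ratings, here is a toy sketch. It is NOT the Matchbox algorithm: it swaps in a much simpler technique, item-to-item cosine similarity, and all users, movies and ratings in it are made up.

```python
import math
from collections import defaultdict

# Hypothetical (user, movie, rating) triples.
ratings = [
    ("u1", "A", 9), ("u1", "B", 8),
    ("u2", "A", 8), ("u2", "B", 9), ("u2", "C", 2),
    ("u3", "C", 9), ("u3", "A", 1),
]


def item_vectors(triples):
    """Group ratings by movie: {movie: {user: rating}}."""
    vectors = defaultdict(dict)
    for user, item, rating in triples:
        vectors[item][user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {user: rating} vectors."""
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_related(item, vectors):
    """Return the other movie whose rating pattern is closest to `item`."""
    others = [other for other in vectors if other != item]
    return max(others, key=lambda other: cosine(vectors[item], vectors[other]))


vectors = item_vectors(ratings)
print(most_related("A", vectors))  # B
```

Movies A and B are rated similarly by the same users, so B comes out as the movie most related to A.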


5. Add the Score Matchbox Recommender

image


The Score Matchbox Recommender scores predictions for a dataset using the Matchbox recommender.
It generates results based on a trained recommendation model.


6. Add the Evaluate Recommender
It evaluates the accuracy of the recommender model's predictions.



image
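This post does not go into which metrics the Evaluate Recommender reports. For predicted ratings, two common accuracy measures are the mean absolute error (MAE) and the root mean squared error (RMSE). The sketch below computes both on made-up numbers; it illustrates those measures and not the tool's internals.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equally long lists of ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


actual = [8, 6, 9, 4]       # hypothetical true ratings
predicted = [7, 6, 10, 6]   # hypothetical predicted ratings
print(mae(actual, predicted))   # 1.0
print(rmse(actual, predicted))
```

Lower values mean the predictions are closer to the ratings held back in the test data.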


At this point, our solution looks like the one below, and we may run it by clicking on the Run button.

image

After its execution, if we click on the output of the Score Matchbox Recommender and click on Visualize, we see all the movie IDs together with their respective "related" movies, as shown below.


image


However, this is not very useful for analysis purposes. What we want is to have the movie names instead of the movie IDs.

Fortunately, we can use the Join Operator. That's what we shall do below.

7. Add the IMDB Movie Title Sample
    This sample has all the Movie Names and their respective Movie IDs.
image

image


8. Add the Meta Data Editor and make it treat the values as String
This can be done by selecting all the columns from the column selector and setting the data type to String from the right pane.

image
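A plausible reason for this step, assumed here rather than stated anywhere above, is that join keys need to have the same type on both sides. The tiny Python sketch below uses made-up values to show how an ID stored as a number fails to match the same ID stored as text.

```python
titles = {"1": "Movie A"}  # hypothetical: movie IDs stored as text
scored_item = 1            # hypothetical: the same ID stored as an integer

print(scored_item in titles)       # False: 1 and "1" are different keys
print(str(scored_item) in titles)  # True once both sides are strings
```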


9. Join the Movie IDs from the Meta Data Editor with the ones from the Score Matchbox Recommender.



image


In the column selector, select "Item" from the left column and select "Movie Id" from the Right column selector.

What we just did is join the Item column from the Score Matchbox Recommender to the Movie ID from the IMDB Movie Titles sample. So, if we run the experiment, we shall have the Movie Name and all the related Movie IDs as shown below.

image

Now, we want to have the names of the related movies too! To do so, proceed with the step below.

10. Add another Join operator to join the result of the previous join with the Movie Titles sample.

image

In the left column selector, select Related item 1 and in the right column selector, select Movie ID.
This will join the related movie id 1 with the Movie Titles sample to return the name of the related movie.
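The two joins can be sketched in plain Python as follows. The data is made up, the helper `inner_join` is hypothetical, and the " (2)" suffix for clashing column names simply mimics the naming seen in the results of this experiment.

```python
# Hypothetical data: IDs and titles are invented for illustration.
titles = [
    {"Movie ID": "1", "Movie Name": "Movie A"},
    {"Movie ID": "2", "Movie Name": "Movie B"},
    {"Movie ID": "3", "Movie Name": "Movie C"},
]
scored = [
    {"Item": "1", "Related Item 1": "2"},
    {"Item": "2", "Related Item 1": "3"},
    {"Item": "9", "Related Item 1": "1"},  # no matching title, dropped by the join
]


def inner_join(left, right, left_key, right_key, suffix=" (2)"):
    """Inner-join two lists of dicts; clashing right-hand columns get a suffix."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[left_key], []):
            merged = dict(left_row)
            for key, value in right_row.items():
                name = key + suffix if key in merged else key
                merged[name] = value
            joined.append(merged)
    return joined


# First join: Item -> Movie ID gives the name of each movie.
with_names = inner_join(scored, titles, "Item", "Movie ID")
# Second join: Related Item 1 -> Movie ID gives the name of the related movie.
with_related = inner_join(with_names, titles, "Related Item 1", "Movie ID")

for row in with_related:
    print(row["Movie Name"], "->", row["Movie Name (2)"])
```

With this data the loop prints "Movie A -> Movie B" and "Movie B -> Movie C".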

Run the experiment to obtain a list of movies and their related movies.

image

From our experiment, we have a list of movies (Movie Name) and their related movies (Movie Name (2)).
For example, we can deduce that people who liked Thor also liked Iron Man. :)
