Posts

Reinforcement Learning - UCB and Thompson Sampling

Multi-armed bandit problem: In probability theory, the multi-armed bandit problem is a problem in which a fixed, limited set of resources must be allocated between competing (alternative) choices in a way that maximizes the expected gain, when each choice's properties are only partially known at the time of allocation and may become better understood as time passes or as resources are allocated to that choice. This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff. Imagine that you are in a casino with 5 slot machines. You have to decide which machines to play, how many times to play each one, in which order to play them, and whether to stay with the current machine or try a different one, all in order to maximize your profit or win percentage. In this problem, each machine provides a random reward drawn from a probability distribution specific to that machine. Your objective is to
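To make the casino setup concrete, here is a minimal sketch of the UCB1 rule in plain Python. The five win probabilities, the win/lose (Bernoulli) rewards, and the function name run_ucb1 are assumptions made for illustration; they are not taken from the post.

```python
import math
import random


def run_ucb1(win_probs, rounds, seed=0):
    """Play `rounds` pulls on simulated slot machines using the UCB1 rule.

    win_probs are hypothetical per-machine win probabilities; the player
    never reads them directly, it only observes the rewards.
    Assumes rounds >= len(win_probs).
    """
    rng = random.Random(seed)
    n_arms = len(win_probs)
    counts = [0] * n_arms    # how often each machine was played
    totals = [0.0] * n_arms  # total reward collected per machine

    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Play every machine once so each has an estimate.
            arm = t - 1
        else:
            # Average reward (exploitation) plus a bonus that shrinks
            # the more often a machine has been played (exploration).
            scores = [
                totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i])
                for i in range(n_arms)
            ]
            arm = scores.index(max(scores))

        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward

    return counts, totals


# Assumed win probabilities for the 5 machines (illustrative only).
WIN_PROBS = [0.1, 0.2, 0.3, 0.4, 0.7]
counts, totals = run_ucb1(WIN_PROBS, rounds=5000)
print("plays per machine:", counts)
```

The exploration bonus is large for machines that have rarely been tried, so every machine keeps getting an occasional pull, while most pulls should drift toward the machine with the highest observed average.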

Image Denoising Using Auto Encoders

In this post, we will discuss how to remove noise from an image using a type of neural network called an autoencoder. Autoencoders can learn efficient data codings in an unsupervised manner. The purpose of an autoencoder is to learn a representation for a set of data, typically for dimensionality reduction, by training the network to ignore signal noise. As described in the image, the same image appears on both sides of the network, as input and as output. In the middle of the network sits a bottleneck, which is nothing but a low-dimensional representation of the input image. The layers before the bottleneck form the encoder, which encodes the input image. After it comes the decoder, which reverses what the encoder did in order to get the input image back. Note: If this seems a little confusing, don't worry, it's a lot easier to understand with the code.
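As a rough illustration of the encoder, bottleneck, and decoder idea, here is a tiny linear denoising autoencoder in plain Python. It is a sketch under strong assumptions, not the network from the post: the "images" are 4-value signals built from one fixed pattern, the bottleneck is a single number, and all names (PATTERN, NOISE_STD, train, evaluate) are made up for this example. For denoising, the network is fed the noisy signal and trained to reproduce the clean one.

```python
import random

DIM = 4                           # size of each toy "image"
PATTERN = [0.5, 0.5, 0.5, 0.5]    # assumed structure shared by all clean signals
NOISE_STD = 0.1                   # assumed noise level


def make_pair(rng):
    """Return (noisy, clean) versions of one random toy signal."""
    scale = rng.uniform(-1.0, 1.0)
    clean = [scale * p for p in PATTERN]
    noisy = [c + rng.gauss(0.0, NOISE_STD) for c in clean]
    return noisy, clean


def encode(w_enc, x):
    """Encoder: squeeze the input into a single bottleneck value."""
    return sum(w * xi for w, xi in zip(w_enc, x))


def decode(w_dec, code):
    """Decoder: expand the bottleneck value back to DIM outputs."""
    return [w * code for w in w_dec]


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(steps=5000, lr=0.05, seed=0):
    """Fit encoder and decoder weights by stochastic gradient descent."""
    rng = random.Random(seed)
    w_enc = [0.1] * DIM
    w_dec = [0.1] * DIM
    for _ in range(steps):
        noisy, clean = make_pair(rng)
        code = encode(w_enc, noisy)
        recon = decode(w_dec, code)
        err = [r - c for r, c in zip(recon, clean)]
        # Error signal passed back through the decoder to the bottleneck.
        back = sum(w * e for w, e in zip(w_dec, err))
        for i in range(DIM):
            w_dec[i] -= lr * 2.0 * err[i] * code
            w_enc[i] -= lr * 2.0 * back * noisy[i]
    return w_enc, w_dec


def evaluate(w_enc, w_dec, n=500, seed=1):
    """Average squared error before and after denoising on fresh samples."""
    rng = random.Random(seed)
    noisy_err = 0.0
    denoised_err = 0.0
    for _ in range(n):
        noisy, clean = make_pair(rng)
        recon = decode(w_dec, encode(w_enc, noisy))
        noisy_err += squared_error(noisy, clean)
        denoised_err += squared_error(recon, clean)
    return noisy_err / n, denoised_err / n


w_enc, w_dec = train()
noisy_err, denoised_err = evaluate(w_enc, w_dec)
print("error of noisy input:    ", noisy_err)
print("error after autoencoder: ", denoised_err)
```

Because the bottleneck can only carry one number, the network has room for the shared pattern but not for the independent noise in each value, which is why the reconstruction should end up closer to the clean signal than the noisy input was. A real image model would use nonlinear (typically convolutional) layers and a deep learning library instead of hand-written gradients.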

Sentiment Analysis On IMDB Dataset

Problem statement: The main objective of this project is to predict the sentiment of a number of movie reviews obtained from the Internet Movie Database (IMDb). This dataset contains 50,000 movie reviews that have been pre-labeled with “positive” and “negative” sentiment class labels based on the review content. The dataset can be obtained here, courtesy of Stanford University and Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. The authors provide the data as raw text as well as in preprocessed bag-of-words format. We will only be using the raw labeled movie reviews for our analysis. Hence our task will be to predict the sentiment of 15,000 labeled movie reviews, using the remaining 35,000 reviews to train our supervised models. What is sentiment analysis? Sentiment analysis is contextual mining of text which identifies and extracts subjective information in source material, helping a business to understand the so