Projects

Self-supervised 3D scene flow estimation from LIDAR point clouds [Report]

Guide: Evangelos Kalogerakis

This project was my master's thesis. It estimates scene flow from LIDAR point clouds using self-supervised learning, which lets it take advantage of large amounts of unlabelled LIDAR data. We use a simple unsupervised method for estimating 3D scene flow based on mean-shift clustering and iterative closest point (ICP). We show that this method generates unsupervised flow labels that are good enough to fine-tune existing scene flow models and improve their performance.
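As a rough illustration of the ICP half of that idea, here is a minimal NumPy sketch. It is not the thesis code: the function names, the brute-force nearest-neighbour matching, and the way flow is read off a cluster's rigid motion are all assumptions made for this example, and the clustering step is left out.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(src, dst, iterations=10):
    """Align src (N x 3) to dst (M x 3); returns the accumulated R and t."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iterations):
        # Brute-force nearest neighbour in dst for every current point.
        dists = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        matched = dst[dists.argmin(axis=1)]
        R, t = fit_rigid_transform(cur, matched)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

def cluster_flow(points, R, t):
    """Assumed labelling rule: flow = displacement under the cluster's rigid motion."""
    return points @ R.T + t - points
```

In this sketch, each cluster found in one frame would be aligned to the next frame with `icp`, and `cluster_flow` would turn the resulting motion into per-point pseudo-labels.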

Building a Reinforcement Learning agent for text-based games [Report]

Collaborators: Bryon Kucharski, Rakesh R Menon, Clayton Thorrez ; Guide: Yash Chandak (UMass), Marc-Alexandre Cote & Adam Trischler (MSR Montreal)

The project aimed to build a reinforcement learning (RL) agent for solving text-based games generated with TextWorld, a framework for generating such games. We explored several methods for reducing the action space of these games, and used A2C and DRRN to learn a control policy for selecting actions in a given state. Using these approaches, we placed 8th in the competition conducted by Microsoft Research Montreal.

Bundle Adjustment in the Large [Report]

Guide: Prof Anurag Mittal

This project explored the inexact Newton-type bundle adjustment algorithm proposed in the paper of the same name. The inexact Newton method and methods that use the Schur complement trick to solve bundle adjustment were run on small, medium, and large datasets, and their performance was compared.
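The Schur complement trick mentioned above can be sketched on a toy system. This is only an illustration under assumptions: the block names `B`, `E`, `C` and the function `solve_with_schur` are chosen for this example, and the point block is inverted densely, whereas in real bundle adjustment it is block diagonal and therefore cheap to invert.

```python
import numpy as np

def solve_with_schur(B, E, C, v, w):
    """Solve [[B, E], [E^T, C]] [dc; dp] = [v; w] by eliminating dp first.

    B: camera block, C: point block, E: camera-point coupling block.
    """
    C_inv = np.linalg.inv(C)           # block diagonal in real bundle adjustment
    S = B - E @ C_inv @ E.T            # Schur complement (reduced camera system)
    dc = np.linalg.solve(S, v - E @ C_inv @ w)
    dp = C_inv @ (w - E.T @ dc)        # back-substitute for the point update
    return dc, dp
```

The reduced system `S` only has as many unknowns as there are camera parameters, which is usually far fewer than the number of point parameters.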

Categorization of Human Actions from Videos [Report]

Guide: Prof Sukhendu Das

The aim of the project was to build a system that can categorize a large number of videos according to a set of complex human actions being performed. The challenges include large variation among videos of the same action, along with noise and jitter. Features were extracted from the videos using multi-skip feature stacking and then used for classification with models such as SVMs and neural networks. Various dimensionality reduction techniques were also tried to improve the accuracy of the model.

Question-Answering system : Smarter than an Eighth grader? [Report]

Collaborators: Abhishek Naik, Shiva Krishna Reddy, Mohan Bhambhani ; Guide: Prof Sutanu Chakraborty

The goal of the project is to build a system that can answer questions from 8th-grade standardized science tests. The project improves upon NLP and information retrieval baselines for the multiple-choice question answering task. It explores the influence of different knowledge sources on the question answering system. We formulate the query for the information retrieval system using query expansion. NLP concepts such as n-grams and named entity similarities were also explored.

Contextual Spell Checker [Code] [Report]

Collaborators: Abhishek Naik, Shiva Krishna Reddy, Mohan Bhambhani ; Guide: Prof Sutanu Chakraborty

In this project, we build a spell checker that suggests possible correct words for a given typo, both with and without context. For word-level spell check without context, we use a noisy channel model. Using this formulation, we estimate the correct word by generating suitable candidates and ranking them by the posterior probability of each candidate given the typo. For phrase- and sentence-level spell check, we use a bigram HMM in which part-of-speech (POS) tags are the hidden variables and words are the observed variables, as well as a web-scale N-gram model, to estimate the correct word given the context.
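The context-free noisy channel ranking can be sketched as follows. This is a simplified illustration, not the project's code: the uniform error model (`p_error` shared by every single-edit candidate), the single-edit candidate generator, and all names here are assumptions.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one deletion, transposition, replacement, or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def rank_corrections(typo, word_counts, p_error=0.05):
    """Rank candidates by P(typo | word) * P(word), highest first."""
    total = sum(word_counts.values())
    scores = {}
    if typo in word_counts:
        # The typed word is itself a dictionary word: assume it was typed correctly.
        scores[typo] = (1 - p_error) * word_counts[typo] / total
    for cand in edits1(typo):
        if cand in word_counts and cand != typo:
            # Assumed channel model: every single-edit error is equally likely.
            scores[cand] = p_error * word_counts[cand] / total
    return sorted(scores, key=scores.get, reverse=True)

# Toy language model: unigram counts from a hypothetical corpus.
counts = Counter({"the": 100, "ten": 5, "tea": 3})
suggestions = rank_corrections("teh", counts)
```

A more realistic channel model would estimate separate probabilities for each kind of edit from a corpus of observed typos.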

Exploration of YOLO [Presentation] [Report]

Collaborators: Shiva Krishna Reddy ; Guide: Prof Anurag Mittal

In this work, we experiment with YOLO and its variants and observe how different design decisions influence the final results.

Light Field Photography [Report] [Demo]

Collaborators: Abhishek Naik, Shiva Krishna Reddy ; Guide: Prof Kaushik Mitra

This project explores light field imaging and its applications in computer vision. The ray tracer POV-Ray was used to generate synthetic light field images. Several applications of the light field were explored, such as digital refocusing using the Fourier Slice Theorem, looking behind an occluding object, and depth map estimation using a focal stack. An interactive GUI was developed to visualize light field imaging and its applications.

Source Code Authorship Attribution [Report]

Collaborators: Sai Srinivas ; Guide: Prof Balaram Ravindran

The goal of the project is to automatically identify the author of a given piece of source code. This was done by modeling the source code as a generative process using Latent Dirichlet Allocation (LDA). Sampling techniques such as Gibbs sampling were used to estimate the model parameters. The project also explored the influence of punctuation on identifying the author of source code.

JOS Operating System [Code]

Collaborators: Shiva Krishna Reddy ; Guide: Prof Chester Rebeiro

The aim of the project is to add features to the MIT JOS operating system, starting from a bare-bones skeleton of the OS. Features such as paging, interrupt and exception handling, inter-process communication, and user environments were added.

E-Book Reader [Code]

Collaborators: Shiva Krishna Reddy

The objective of the project is to build a PDF reader that supports searching for specific files and importing them into the reader. The search option offers auto-suggestions, and the reader also supports searching for specific text within a PDF. A ternary tree was used to store the names of all the files, and a BK-tree was used for auto-complete suggestions. The JPedal library was used for reading PDFs.
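For readers unfamiliar with BK-trees, here is a minimal Python sketch of how one can return names close to a query. It is an illustration only: the project's own implementation is not shown here, and the use of Levenshtein distance as the metric, the class layout, and the file names are assumptions.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

class BKTree:
    def __init__(self):
        self.root = None  # each node is (word, {distance_to_child: child_node})

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = levenshtein(word, node[0])
            if d == 0:
                return  # already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query, max_dist):
        """Return sorted (distance, word) pairs within max_dist of query."""
        results = []
        stack = [self.root] if self.root else []
        while stack:
            word, children = stack.pop()
            d = levenshtein(query, word)
            if d <= max_dist:
                results.append((d, word))
            # Triangle inequality: only these subtrees can contain matches.
            for k, child in children.items():
                if d - max_dist <= k <= d + max_dist:
                    stack.append(child)
        return sorted(results)

tree = BKTree()
for name in ["notes.pdf", "noted.pdf", "thesis.pdf", "report.pdf"]:
    tree.add(name)
close_names = tree.search("notes.pdf", 1)
```

The pruning step in `search` is what makes the structure useful: subtrees whose edge distance falls outside the allowed band are never visited.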

4-bit Processor [Code] [Report]

Collaborators: Ujjawal Soni, Sowmith ; Guide: Prof Krishna Shivalingam

The project aims to build a 4-bit processor in Verilog using only basic gates. Modules are implemented for basic operations such as addition, using a carry-lookahead adder, and multiplication, using a Booth multiplier. Increment and decrement operations, along with bitwise left and right shifts, are also implemented, as are flip-flops, multiplexers, and registers. A control unit generates the signals that drive the processor.
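The carry-lookahead idea can be modelled at the bit level in a few lines. The sketch below is a Python simulation written for illustration, not the project's Verilog; the function name and signal names are assumptions.

```python
def carry_lookahead_add4(a, b, c_in=0):
    """Add two 4-bit values; returns (4-bit sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # generate: this bit creates a carry
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate: this bit passes a carry on

    # Every carry is written directly in terms of g, p and c_in,
    # so no carry has to wait for the previous stage to settle.
    c0 = c_in
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))
    return total, c4
```

Each carry expression corresponds to a two-level AND-OR network, which is why this adder is faster than a ripple-carry design at the cost of more gates.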

Copter Game [Code]

My first big project was to build a copter game. The game is based on the same mechanism as the popular game Flappy Bird. The SDL 2.0 library was used for graphics. Collisions are detected using bounding-box collision detection. High-score recording has also been incorporated. Play it for the crude fun!
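Bounding-box collision detection comes down to an overlap test on two axis-aligned rectangles. Here is a minimal Python sketch of that test; the `Box` type and its field names are assumptions for this example and are not taken from the game's code.

```python
from dataclasses import dataclass

@dataclass
class Box:
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

def boxes_collide(a, b):
    """True if the two axis-aligned boxes overlap (touching edges do not count)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)
```

In a game loop, the copter's box would be tested against each obstacle's box on every frame.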