Deepayan Patra


Developer · Designer · Technologist


I am a rising senior at Carnegie Mellon University, School of Computer Science, pursuing a BS in Computer Science with a concentration in Computer Systems and a minor in Machine Learning. My strengths lie in systems engineering with a focus on autonomous operations. I also have interests in design, economics, education and cybersecurity.

I enjoy developing products and solutions to real-world problems. I am actively pursuing internship or full-time opportunities for 2021.

Education

Carnegie Mellon University

B.S. in Computer Science


Computer Systems + Machine Learning


QPA: 3.90

August 2018 - May 2021

Semester: Spring 2020

Involvement: My team implemented an API-driven, resumable, NUMA-aware task scheduler and thread pool to update the execution framework of the self-driving DBMS "Terrier". Our implementation allowed for asynchronous execution of tasks in a memory region-aware manner, reducing contention on system latches and wait times from long-running operations. We improved system performance by approximately 60-70x on general scan workloads and up to 600x on worst-case contentious workloads. We also implemented a B+Tree acting as a secondary index for the system. Our implementation supported optimistic inserts and deletes, relying on latches only for splitting inserts. We utilized thread-safe memory allocators and garbage collection queues for each thread and implemented epoch-based garbage collection.
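To illustrate the idea behind epoch-based garbage collection mentioned above, here is a minimal single-threaded Python sketch. It is not the Terrier implementation: the class `EpochManager` and its method names are hypothetical, and a real version would need atomic counters, per-thread retire queues, and actual memory reclamation instead of returning a list.

```python
from typing import Any, Dict, List, Tuple


class EpochManager:
    """Toy, single-threaded model of epoch-based deferred reclamation (hypothetical API)."""

    def __init__(self) -> None:
        self.global_epoch = 0
        self.active: Dict[int, int] = {}            # reader id -> epoch when it entered
        self.retired: List[Tuple[int, Any]] = []    # (epoch at retirement, object)
        self._next_id = 0

    def enter(self) -> int:
        """Register a reader in the current epoch and return its id."""
        rid = self._next_id
        self._next_id += 1
        self.active[rid] = self.global_epoch
        return rid

    def exit(self, rid: int) -> None:
        """Unregister a reader that has finished its critical section."""
        del self.active[rid]

    def retire(self, obj: Any) -> None:
        """Mark an already-unlinked object as garbage, then advance the epoch."""
        self.retired.append((self.global_epoch, obj))
        self.global_epoch += 1

    def collect(self) -> List[Any]:
        """Return objects that no active reader could still be holding."""
        oldest = min(self.active.values(), default=self.global_epoch)
        freed = [obj for epoch, obj in self.retired if epoch < oldest]
        self.retired = [(e, o) for e, o in self.retired if e >= oldest]
        return freed
```

An object retired in epoch e is only reclaimed once every active reader entered after epoch e, since only readers that entered at or before e could still hold a reference to it.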

Description: This course is a comprehensive study of the internals of modern database management systems. It will cover the core concepts and fundamentals of the components that are used in both high-performance transaction processing systems (OLTP) and large-scale analytical systems (OLAP). The class will stress both efficiency and correctness of the implementation of these ideas. All class projects will be in the context of a real in-memory, multi-core database system. The course is appropriate for graduate students in software systems and for advanced undergraduates with dirty systems programming skills.

Semester: Spring 2020

Involvement: My team implemented a set of image classification models for the Caltech-101 Dataset. We developed a series of neural network models (VGG, ResNet, and Inception-v3) and applied the data augmentation techniques of weighted random sampling, data transformation, data duplication, and GAN-based augmentation to improve model accuracy from a 47% baseline to 76% on a representative testing set.
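As a rough illustration of weighted random sampling for an imbalanced dataset such as Caltech-101, here is a small Python sketch. The inverse-class-frequency weighting and the function names `inverse_frequency_weights` and `sample_balanced_indices` are assumptions made for this example, not a record of what the project used.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence


def inverse_frequency_weights(labels: Sequence[str]) -> List[float]:
    """Give each sample a weight of 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_balanced_indices(
    labels: Sequence[str], k: int, seed: Optional[int] = None
) -> List[int]:
    """Draw k sample indices with replacement so classes are equally likely."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)
```

With these weights, every class has the same total probability mass, so a rare class is drawn about as often as a common one even when it has far fewer images.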

Description: Machine learning studies the question "How can we build computer programs that automatically improve their performance through experience?" This includes learning to perform many types of tasks based on many types of experience. For example, it includes robots learning to better navigate based on experience gained by roaming their environments, medical decision aids that learn to predict which therapies work best for which diseases based on data mining of historical health records, and speech recognition systems that learn to better understand your speech based on experience listening to you. This course is designed to give PhD students a thorough grounding in the methods, mathematics and algorithms needed to do research and applications in machine learning.

Semester: Fall 2020

Description: This course covers the design and implementation of compiler and run-time systems for high-level languages, and examines the interaction between language design, compiler design, and run-time organization. Topics covered include syntactic and lexical analysis, handling of user-defined types and type-checking, context analysis, code generation and optimization, and memory management and run-time organization.

Semester: Spring 2020

Involvement: In this course, my team developed a fully-verified backtracking SAT solver with unit propagation, along with a partially-verified version that added the pure literal optimization and further performance gains from an adjacency list data structure. I also implemented a fully-verified hash-table data structure that supported resizing inserts and deletes.
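For readers unfamiliar with the technique, here is a minimal, unverified Python sketch of backtracking search with unit propagation and the pure literal rule. It is an illustration only and not the course implementation: the names `simplify` and `dpll` and the list-of-integers clause encoding are assumptions chosen for this example.

```python
from typing import Dict, List, Optional

Clause = List[int]            # non-zero ints; -x means "not x"
Assignment = Dict[int, bool]


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Assume lit is true. Returns None if some clause becomes empty."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue                              # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                           # conflict
        result.append(reduced)
    return result


def dpll(clauses: List[Clause],
         assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Return a satisfying partial assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(not c for c in clauses):
        return None
    # Unit propagation: a one-literal clause forces that literal.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    # Pure literal rule: a literal whose negation never appears can be set true.
    literals = {l for c in clauses for l in c}
    for lit in literals:
        if -lit not in literals:
            assignment[abs(lit)] = lit > 0
            clauses = [c for c in clauses if lit not in c]
    if not clauses:
        return assignment
    # Backtracking: branch on the first remaining literal, then its negation.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            model = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if model is not None:
                return model
    return None
```

Variables missing from the returned assignment are don't-cares: every clause that mentioned them was already satisfied by another literal.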

Description: High-profile bugs continue to plague the software industry, leading to major problems in the reliability, safety, and security of systems. This course teaches students how to write bug-free code through the process of software verification, which aims to prove the correctness of a program with respect to a mathematical specification. Along the way, students will learn how to specify correct program behavior, prove the correctness of their code, use formal semantics to reason about the soundness of proof rules, write practical and efficient verified code, and use decision procedures and model checkers to reduce verification effort. Students will learn the principles and algorithms behind automated verification tools, and understand their practical limitations while gaining experience writing verified, machine-checked code that solves real problems.

Semester: Fall 2020

Description: Computational photography is the convergence of computer graphics, computer vision and imaging. Its role is to overcome the limitations of the traditional camera, by combining imaging and computation to enable new and enhanced ways of capturing, representing, and interacting with the physical world. This advanced undergraduate course provides a comprehensive overview of the state of the art in computational photography. At the start of the course, we will study modern image processing pipelines, including those encountered on mobile phone and DSLR cameras, and advanced image and video editing algorithms. Then we will proceed to learn about the physical and computational aspects of tasks such as 3D scanning, coded photography, lightfield imaging, time-of-flight imaging, VR/AR displays, and computational light transport. Near the end of the course, we will discuss active research topics, such as creating cameras that capture video at the speed of light, cameras that look around walls, or cameras that can see through tissue. The course has a strong hands-on component, in the form of seven homework assignments and a final project. In the homework assignments, students will have the opportunity to implement many of the techniques covered in the class, by both acquiring their own images of indoor and outdoor scenes and developing the computational tools needed to extract information from them. For their final projects, students will have the choice to use modern sensors provided by the instructors (lightfield cameras, time-of-flight cameras, depth sensors, structured light systems, etc.). This course requires familiarity with linear algebra, calculus, programming, and doing computations with images. The course does not require prior experience with photography or imaging.

Semester: Fall 2020

Description: This course is about the design and analysis of algorithms. We study specific algorithms for a variety of problems, as well as general design and analysis techniques. Specific topics include searching, sorting, algorithms for graph problems, efficient data structures, lower bounds and NP-completeness. A variety of other topics may be covered at the discretion of the instructor. These include parallel algorithms, randomized algorithms, geometric algorithms, low level techniques for efficient programming, cryptography, and cryptographic protocols.

Semester: Fall 2019

Involvement: I developed a series of CV-related projects, including Lucas-Kanade and Matthews-Baker video trackers, 3D image reconstructions, Augmented Reality image rectifications, and randomized feature detectors.

Description: This course provides a comprehensive introduction to computer vision. Major topics include image processing, detection and recognition, geometry-based and physics-based vision and video analysis. Students will learn basic concepts of computer vision as well as hands on experience to solve real-life vision problems.

Work Experience

  • Research Assistant

    Terrier DBMS

    October 2019 - Ongoing

    As a research assistant under Professor Andy Pavlo, I contribute to CMU's in-memory self-driving relational database management system, Terrier. I have previously worked on the storage and execution engines, developing a NUMA-aware, resumable resource management system that intelligently manages task scheduling to prevent busy-waits and allow for fine-grained execution control à la SQL Server's SQLOS and HyPer's morsel-driven scheduling model. I have also improved the performance of our write-ahead logging implementation, extended our query language support with additional builtin functions, and worked on the optimizer to improve the performance of scans in our engine.

  • Teaching Assistant

    CMU School of Computer Science

    September 2020 - Ongoing

    I serve as a teaching assistant for 07-131, Great Practical Ideas in Computer Science. In this role I contribute to designing the curriculum for this offering of the course and will lecture the freshman Computer Science class on technologies and tools important to their professional development.

  • Software Engineering Intern

    Rockset

    May 2020 - August 2020

    As a software engineering intern at Rockset, my role would have been to work on projects throughout the frontend, product feature, and backend infrastructure teams. Unfortunately, due to COVID-19, this position was cancelled for Summer 2020.

  • EXCEL Leader

    CMU Academic Development

    August 2019 - May 2020

    I served as an EXCEL Leader for 21-241, Matrices and Linear Transformations, and 15-122, Principles of Imperative Computation, where I developed content to engage students during weekly sessions focused on individualized learning.

  • Supplemental Instruction Leader

    CMU Academic Development

    August 2019 - May 2020

    I served as a Supplemental Instruction Leader for 15-122, Principles of Imperative Computation. In this role I worked with larger groups, integrating the course content with techniques for understanding and improving one's learning skills.

  • Research Assistant

    Tartan Research Project

    January 2019 - September 2019

    As a research assistant under Professor Alex Rudnicky, I worked on CMU's submission for the Amazon Alexa Prize, a competition to create a conversational social chatbot. I worked on creating a reinforcement learning agent to traverse an augmented knowledge graph using Reddit conversational data for natural language generation. I also implemented an end-to-end logger for our application.

Projects

LensFlare

PennApps 2019

LensFlare works to reinvent video players: we've removed annoying ad popups and instead let you focus on what you want to see. LensFlare is a simple visual platform that uses ML to analyze embedded video content such as images and audio and surface information about products shown or mentioned, along with trends regarding their associated organizations or corporations. Users can easily view the goods they are interested in through the cards presented, without being overwhelmed by promotional content. Our system lets users decide specifically which items they value and contact organizations directly.

Built with: GCP, Python, React, cv2


FLOW

 2nd place Award - Riot Games

HackCMU 2018

FLOW is a scalable, intelligent IoT solution to track and predict personal water usage data via an intelligent monitoring algorithm. Easily installed micro-turbines measure the water flow rate in pipes and relay information back to our internal database, providing live usage statistics to effectively counter issues and meet your monthly goals.

Built with: Express.js, Node.js, HTML5, CSS3, Redis, Flutter, Dart


ML Analysis of Maryland Judiciary Records

Pennsylvania Governor's School for the Sciences

Our project focused on developing supervised and unsupervised machine learning methods and standard SQL queries on data scraped from the public domain Maryland court website. We developed a customized web scraper and an associated parser to consistently extract and store all case data, despite inconsistent representational formats, in a relational database hosted on CMU Database Group servers. We analyzed over 12 million case records, from which we drew insights into how individual and case attributes relate to conviction rates. The data collected was later used in CMU's Introduction to Database Systems course, 15-445.

Built with:  Python, Scrapy, BeautifulSoup4, Psycopg2, scikit-learn, PostgreSQL


Assure

TartanHacks 2019

Assure was created to help bridge the gap between our ideals of social support and reality. We created a quick and easy method for individuals to donate and request goods within a nearby range, bringing together those in need and those willing to provide. Recipients can search for items available nearby and contact owners directly. By removing middlemen, our product eliminates occasionally prohibitive costs and creates a truly free system of exchange, helping people get back on track by extending a helping hand.

Built with: Node.js, MongoDB, React


Skills and Technologies

Over the years, I've picked up a set of skills that I have a good amount of experience with. The following are the skills with which I have the greatest familiarity.


Languages:

90 C/C++
90 Python
80 Unix/Linux
80 HTML + CSS + JS
75 OCaml
75 SQL


Tools and Technologies:



Contact Me

Love making new connections! If you'd like to get in touch, send me an email or a message!

dpatra@deepayan.dev

thepinetree

dpatra2022

Pittsburgh, PA