My research group studies the theory and practice of building decision systems that are robust to corrupted, missing, or otherwise uncertain data. Our work brings together ideas from statistics, machine learning, and database systems. We are currently studying systems that can analyze large amounts of video, certifiable accuracy guarantees in partially complete databases, and theoretical lower bounds for lossy compression in relational databases.

News/Updates

8/04/2020 Redesigned online data science course: http://sanjayk.io/cmsc21800/

7/17/2020 Adam Dziedzic Graduates! https://adam-dziedzic.github.io/

6/16/2020 Data Engineering Lectures are Online!! http://sanjayk.io/cmsc13600/

4/13/2020 Two SIGMOD Papers: Incrementability Aware Query Processing and Predicate Constraints.

1/13/2020 Aaron presents our work at CIDR: http://cidrdb.org/cidr2020/papers/p14-shang-cidr20.pdf

Recent Publications

(Video Analytics) Ted Shao*, JinJin Zhao, Sanjay Krishnan. A Storage Manager for Video Analytics. Under Review 2021

(Approximate Query Processing) Xi Liang, Stavros Sintos, Zechao Shang, Sanjay Krishnan. Precomputation-Assisted Stratified Sampling. Under Review 2021

(Approximate Query Processing) Xi Liang, Zechao Shang, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. Fast and Reliable Missing Data Contingency Analysis with Predicate-Constraints. SIGMOD 2020 https://arxiv.org/pdf/2004.04139.pdf

(Resource-Efficient Analytics) Dixin Tang, Zechao Shang, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. Thrifty Query Execution via Incrementability. SIGMOD 2020. pdf

(Resource-Efficient Analytics, Query Optimization) Zechao Shang, Xi Liang, Dixin Tang, Cong Ding, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. CrocodileDB: Efficient Database Execution through Intelligent Deferment. CIDR 2020. pdf

(Query Optimization) Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, and Ion Stoica. Deep Unsupervised Cardinality Estimation. VLDB 2020. pdf

(Video Analytics) Sanjay Krishnan, Adam Dziedzic, and Aaron J. Elmore. DeepLens: Towards a Visual Data Management System. CIDR 2019. pdf

(Resource-Efficient Analytics, Query Optimization) Dixin Tang, Zechao Shang, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. Intermittent Query Processing. VLDB 2019. pdf

(Resource-Efficient Analytics) Adam Dziedzic*, John Paparrizos*, Sanjay Krishnan, Aaron Elmore, Michael Franklin. Band-limited Training and Inference for Convolutional Neural Networks. ICML 2019. pdf

(Query Optimization) Xi Liang, Aaron J. Elmore, and Sanjay Krishnan. Opportunistic View Materialization with Deep Reinforcement Learning. 2019. pdf

(Query Optimization) Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, and Ion Stoica. Learning to Optimize Join Queries with Deep Reinforcement Learning. 2018. pdf

All Publications

How do we build systems that continuously learn from real-world interactions? Real-world reinforcement, imitation learning, and control

Richard Shin, Roy Fox, Sanjay Krishnan, Dawn Song, Ion Stoica. Parametrized Hierarchical Procedures For Neural Programming. ICLR 2018.

Vatsal Patel*, Sanjay Krishnan, Aimee Goncalves, Carolyn Chen, Walter Doug Boyd, Ken Goldberg. Using Intermittent Synchronization to Compensate for Rhythmic Body Motion During Autonomous Surgical Cutting and Debridement. ISMR 2018.

Sanjay Krishnan*, Roy Fox*, Ion Stoica, Ken Goldberg. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations. CoRL 2017. (pdf)

Roy Fox*, Sanjay Krishnan*, Ken Goldberg, Ion Stoica. Multi-Level Discovery of Deep Options. (arxiv)

Sanjay Krishnan, Eugene Wu, Michael Franklin, Ken Goldberg. BoostClean: Automated Error Detection and Repair for Machine Learning. Preprint Available. 2017.

Tejas Kannan, Sanjay Krishnan. Exploring the Sensitivity of Policy Gradients to Observation Noise. RLDM 2017.

Sanjay Krishnan, Animesh Garg, Richard Liaw, Brijen Thananjeyan, Lauren Miller, Florian T. Pokorny, Ken Goldberg. SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards. Under Review IJRR (Request Copy).

Brijen Thananjeyan, Animesh Garg, Sanjay Krishnan, Carolyn Chen, Lauren Miller, Ken Goldberg. Multilateral Surgical Pattern Cutting in 2D Orthotropic Gauze with Deep Reinforcement Learning Policies for Tensioning. ICRA 2017. read

Michael Laskey, Caleb Chuck, Jonathan Lee, Jeffrey Mahler, Sanjay Krishnan, Kevin Jamieson, Anca Dragan, Ken Goldberg. Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations. ICRA 2017. read

Sanjay Krishnan, Animesh Garg, Richard Liaw, Brijen Thananjeyan, Lauren Miller, Florian T. Pokorny, Ken Goldberg. SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards. WAFR 2016. arxiv

How Do We Visualize/Interpret Data?

Sanjay Krishnan, Eugene Wu. Arachnida: A Transformation-Oriented Explanation Engine. Under Review. 2018

Sanjay Krishnan, Eugene Wu. PALM: Machine Learning Explanations For Iterative Debugging. HILDA 2017. (pdf)

Mo Zhou, Alison Cliff, Sanjay Krishnan, Brandie Nonnecke, Camille Crittenden, Kanji Uchino, Ken Goldberg. M-CAFE 1.0: Motivating and Prioritizing Ongoing Student Feedback During MOOCs and Large on-Campus Courses using Collaborative Filtering. Proceedings of the 16th Annual ACM Conference on Information Technology Education, SIGITE 15, Chicago, September, 2015. (pdf)

Mo Zhou, Alison Cliff, Allen Huang, Sanjay Krishnan, Brandie Nonnecke, Kanji Uchino, Sam Joseph, Armando Fox, and Ken Goldberg. M-CAFE: Managing MOOC Student Feedback with Collaborative Filtering. In Learning@Scale 2015. (pdf)

Jay Patel, Gil Gershoni, Sanjay Krishnan, Matti Nelimarkka, Brandie Nonnecke, Ken Goldberg. A Case Study in Mobile-Optimized vs. Responsive Web Application Design. In Mobile HCI 2015. (pdf)

Sanjay Krishnan, Jay Patel, Michael J. Franklin, and Ken Goldberg. Social Influence Bias in Recommender Systems: A Methodology for Learning, Analyzing, and Mitigating Bias in Ratings. Under Review: ACM Conference on Recommender Systems (RecSys). Foster City, CA, USA. Oct 2014 (pdf)

Sanjay Krishnan, Ken Goldberg, Yuko Okubo, Kanji Uchino. Using a Social Media Platform to Explore How Social Media Can Enhance Primary and Secondary Learning. The Sixth Conference of MIT’s Learning International Networks Consortium. June 2013 (pdf)

Sanjay Krishnan, Ken Goldberg. Distributed Spectral Dimensionality Reduction for Visualizing Textual Data. ICML Workshop on Spectral Learning Methods, Atlanta, GA, June 2013. (pdf)

How Do We Clean Data?

Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Tim Kraska, Tova Milo, Eugene Wu. SampleClean: Fast and Reliable Analytics on Dirty Data. IEEE Data Engineering Bulletin. 2015 (pdf)

Xu Chu, Ihab Ilyas, Sanjay Krishnan, Jiannan Wang. Data Cleaning: Overview and Emerging Challenges. SIGMOD 2016. (read)

Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Tim Kraska. PrivateClean: Data Cleaning and Differential Privacy. SIGMOD 2016. (pdf)

Sanjay Krishnan, Eugene Wu, Michael Franklin, Ken Goldberg, Jiannan Wang. ActiveClean: An Interactive Data Cleaning Framework For Machine Learning. SIGMOD 2016 Demo. (pdf)

Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Eugene Wu. ActiveClean: Interactive Data Cleaning For Statistical Modeling. VLDB 2016. (pdf)

Sanjay Krishnan, Daniel Haas, Eugene Wu, Michael Franklin. Towards Reliable Interactive Data Cleaning: A User Survey and Recommendations. HILDA 2016. (pdf)

Daniel Haas, Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Eugene Wu. Wisteria: Nurturing Scalable Data Cleaning Infrastructure. VLDB 2015 Demo. (pdf)

Jiannan Wang, Sanjay Krishnan, Michael Franklin, Ken Goldberg, Tim Kraska, Tova Milo. A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data. In SIGMOD, Jun. 2014 (pdf)

How Do We Structure Data? Robotics Trajectory Segmentation/Cleaning

Sanjay Krishnan, Animesh Garg, Sachin Patil, Colin Lea, Gregory Hager, Pieter Abbeel, Ken Goldberg. Transition State Clustering: Unsupervised Surgical Task Segmentation For Robot Learning. IJRR 2018. (read)

Caleb Chuck, Michael Laskey, Sanjay Krishnan, Ruta Joshi, Ken Goldberg. Statistical Data Cleaning for Deep Learning of Automation Tasks from Demonstrations. CASE 2017.

Adithya Murali, Animesh Garg, Sanjay Krishnan, Florian T. Pokorny, Pieter Abbeel, Trevor Darrell, Ken Goldberg: TSC-DL: Unsupervised Trajectory Segmentation of Multi-Modal Surgical Demonstrations with Deep Learning. (read)

Sanjay Krishnan, Animesh Garg, Sachin Patil, Colin Lea, Gregory Hager, Pieter Abbeel, Ken Goldberg. Transition State Clustering: Unsupervised Surgical Task Segmentation For Robot Learning. International Symposium on Robotics Research (ISRR). 2015. (read)

How Do We Store and Update Data?

Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, and Tim Kraska. Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views. In VLDB 2015. (pdf) (arxiv)

Liwen Sun, Sanjay Krishnan, Reynold S. Xin and Michael J. Franklin. A Partitioning Framework for Aggressive Data Skipping. VLDB 2014. (pdf)

Liwen Sun, Michael J. Franklin, Sanjay Krishnan, Reynold S. Xin: Fine-grained partitioning for aggressive data skipping. SIGMOD Conference 2014. (pdf)

How Do We Collect Data? Sensors, Actuators, Crowds

Vatsal Patel, Sanjay Krishnan, Aimee Goncalves. SPRK: A Low-Cost Stewart Platform For Motion Study In Surgical Robotics. Under Review ISMR 2018.

Daniel Seita, Sanjay Krishnan, Roy Fox, Stephen McKinley, John F. Canny, Ken Goldberg. Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure. Under Review ICRA 2018.

Yeounoh Chung, Sanjay Krishnan, Tim Kraska. A Data Quality Metric (DQM): How to Estimate the Number of Undetected Errors in Data Sets. VLDB 2017. (pdf)

Brandie Nonnecke, Sanjay Krishnan, Jay Patel, Mo Zhou, Laura Byaruhanga, Dorothy Masinde, Maria Elena Meneses, Alejandro Martin del Campo, Camille Crittenden, Ken Goldberg. DevCAFE 1.0: A Participatory Platform for Assessing Development Initiatives in the Field. IEEE Global Humanitarian Technology Conference (GHTC). 2015 (Best Paper) (pdf)

Jeffrey Mahler, Sanjay Krishnan, Michael Laskey, Siddarth Sen, Adithyavairavan Murali, Ben Kehoe, Sachin Patil, Jiannan Wang, Mike Franklin, Pieter Abbeel, Ken Goldberg. Learning Accurate Kinematic Control of Cable-Driven Surgical Robots Using Data Cleaning and Gaussian Process Regression. CASE 2014. (read)