How much of data analytics can we automate? Data scientists spend many hours manipulating and cleaning data, writing careful/scalable data analysis programs, and debugging analyses. My group explores the intersection of AI and Data Science—towards a world where intelligent systems can automatically perform many of the data analytics tasks we currently expect humans to do.
9/23/19 New course, CMSC 21800: http://sanjayk.io/cmsc21800/
8/24/19 Dixin presented our work at VLDB: http://www.vldb.org/pvldb/vol12/p1427-tang.pdf
7/24/19 New vision paper on Deep Learning Systems: https://dl.acm.org/citation.cfm?id=3352022
6/10/19 Adam presented our work at ICML: http://proceedings.mlr.press/v97/dziedzic19a/dziedzic19a.pdf
4/30/19 New paper on selectivity estimation with deep likelihood models: https://arxiv.org/abs/1905.04278
Current Projects and Recent Publications
We are always looking for exceptional undergraduates, graduate students, and post-docs! Email me. skr @ cs . uchicago
Deep Query Optimization. What is the role of machine learning in the design and implementation of a modern database system? This question has sparked considerable introspection in the data management community, and the epicenter of this debate is the core database problem of query optimization, where the database system finds the best physical execution path for an SQL query.
Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, and Ion Stoica. Learning to Optimize Join Queries with Deep Reinforcement Learning. 2018. pdf
Xi Liang, Aaron J. Elmore, and Sanjay Krishnan. Opportunistic View Materialization with Deep Reinforcement Learning. 2019. pdf
Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, and Ion Stoica. Selectivity Estimation with Deep Likelihood Models. 2019. pdf
Compressed Deep Learning: The computational demands of modern AI techniques are immense, and as the number of practical applications grows, there will be an increasing burden on shared computing infrastructure. We have a number of projects studying the effects of data compression in deep learning systems.
Sanjay Krishnan, Adam Dziedzic, and Aaron J. Elmore. DeepLens: Towards a Visual Data Management System. 2018. pdf
Adam Dziedzic *, John Paparrizos *, Sanjay Krishnan, Aaron Elmore, Michael Franklin. Band-limited Training and Inference for Convolutional Neural Networks. 2019. pdf
Adam Dziedzic, John Paparrizos, Sanjay Krishnan. Imprecise Neural Networks are Robust To Adversarial Attacks. 2019. (coming soon, ask)
Sanjay Krishnan, Aaron J. Elmore, Michael Franklin, John Paparrizos, Zechao Shang, Adam Dziedzic, Rui Liu. Artificial Intelligence in Resource-Constrained and Shared Environments. 2019. pdf
Resource-Constrained Databases: The end of Moore’s law will push database system designers to be more judicious with computation as the growth in data outpaces the availability of computational resources. Eagerness, meaning aggressively consuming resources to complete the task at hand as quickly as possible, is one source of waste in modern data systems: the system expends resources unnecessarily while waiting on queries, data, or both. Intelligently deferring a task to a later point in time can increase result reuse, reduce work that might later be invalidated, or avoid unnecessary work altogether.
Dixin Tang, Zechao Shang, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. Intermittent Query Processing. 2019. pdf
Zechao Shang, Xi Liang, Dixin Tang, Cong Ding, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. CrocodileDB: Efficient Database Execution through Intelligent Deferment. 2019. (coming soon, ask)
Real-World Reinforcement Learning, Imitation Learning, and Control: How do we build systems that continuously learn from real-world interactions?
Richard Shin, Roy Fox, Sanjay Krishnan, Dawn Song, Ion Stoica. Parametrized Hierarchical Procedures For Neural Programming. ICLR 2018.
Vatsal Patel*, Sanjay Krishnan, Aimee Goncalves, Carolyn Chen, Walter Doug Boyd, Ken Goldberg. Using Intermittent Synchronization to Compensate for Rhythmic Body Motion During Autonomous Surgical Cutting and Debridement. ISMR 2018.
Sanjay Krishnan*, Roy Fox*, Ion Stoica, Ken Goldberg. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations. CoRL 2017. (pdf)
Roy Fox*, Sanjay Krishnan*, Ken Goldberg, Ion Stoica. Multi-Level Discovery of Deep Options. (arxiv)
Sanjay Krishnan, Eugene Wu, Michael Franklin, Ken Goldberg. BoostClean: Automated Error Detection and Repair for Machine Learning. Preprint Available. 2017.
Tejas Kannan, Sanjay Krishnan. Exploring the Sensitivity of Policy Gradients to Observation Noise. RLDM 2017.
Sanjay Krishnan, Animesh Garg, Richard Liaw, Brijen Thananjeyan, Lauren Miller, Florian T. Pokorny, Ken Goldberg. SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards. Under Review IJRR (Request Copy).
Brijen Thananjeyan, Animesh Garg, Sanjay Krishnan, Carolyn Chen, Lauren Miller, Ken Goldberg. Multilateral Surgical Pattern Cutting in 2D Orthotropic Gauze with Deep Reinforcement Learning Policies for Tensioning. ICRA 2017. read
Michael Laskey, Caleb Chuck, Jonathan Lee, Jeffrey Mahler, Sanjay Krishnan, Kevin Jamieson, Anca Dragan, Ken Goldberg. Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations. ICRA 2017. read
Sanjay Krishnan, Animesh Garg, Richard Liaw, Brijen Thananjeyan, Lauren Miller, Florian T. Pokorny, Ken Goldberg. SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards. WAFR 2016. arxiv
Sanjay Krishnan, Eugene Wu. Arachnida: A Transformation-Oriented Explanation Engine. Under Review. 2018
Sanjay Krishnan, Eugene Wu. PALM: Machine Learning Explanations For Iterative Debugging. HILDA 2017. (pdf)
Mo Zhou, Alison Cliff, Sanjay Krishnan, Brandie Nonnecke, Camille Crittenden, Kanji Uchino, Ken Goldberg. M-CAFE 1.0: Motivating and Prioritizing Ongoing Student Feedback During MOOCs and Large on-Campus Courses using Collaborative Filtering. Proceedings of the 16th Annual ACM Conference on Information Technology Education, SIGITE 15, Chicago, September, 2015. (pdf)
Mo Zhou, Alison Cliff, Allen Huang, Sanjay Krishnan, Brandie Nonnecke, Kanji Uchino, Sam Joseph, Armando Fox, and Ken Goldberg. M-CAFE: Managing MOOC Student Feedback with Collaborative Filtering. In Learning@Scale 2015. (pdf)
Jay Patel, Gil Gershoni, Sanjay Krishnan, Matti Nelimarkka, Brandie Nonnecke, Ken Goldberg. A Case Study in Mobile-Optimized vs. Responsive Web Application Design. In Mobile HCI 2015. (pdf)
Sanjay Krishnan, Jay Patel, Michael J. Franklin, and Ken Goldberg. Social Influence Bias in Recommender Systems: A Methodology for Learning, Analyzing, and Mitigating Bias in Ratings. Under Review: ACM Conference on Recommender Systems (RecSys). Foster City, CA, USA. Oct 2014 (pdf)
Sanjay Krishnan, Ken Goldberg, Yuko Okubo, Kanji Uchino. Using a Social Media Platform to Explore How Social Media Can Enhance Primary and Secondary Learning. The Sixth Conference of MIT’s Learning International Networks Consortium. June 2013. (pdf)
Sanjay Krishnan, Ken Goldberg. Distributed Spectral Dimensionality Reduction for Visualizing Textual Data. ICML Workshop on Spectral Learning Methods, Atlanta, GA, June 2013. (pdf)
Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Tim Kraska, Tova Milo, Eugene Wu. SampleClean: Fast and Reliable Analytics on Dirty Data. IEEE Data Engineering Bulletin. 2015. (pdf)
Xu Chu, Ihab Ilyas, Sanjay Krishnan, Jiannan Wang. Data Cleaning: Overview and Emerging Challenges. SIGMOD 2016. (read)
Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Tim Kraska. PrivateClean: Data Cleaning and Differential Privacy. SIGMOD 2016. (pdf)
Sanjay Krishnan, Eugene Wu, Michael Franklin, Ken Goldberg, Jiannan Wang. ActiveClean: An Interactive Data Cleaning Framework For Machine Learning. SIGMOD 2016 Demo. (pdf)
Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Ken Goldberg, Eugene Wu. ActiveClean: Interactive Data Cleaning For Statistical Modeling. VLDB 2016. (pdf)
Sanjay Krishnan, Daniel Haas, Eugene Wu, Michael Franklin. Towards Reliable Interactive Data Cleaning: A User Survey and Recommendations. HILDA 2016. (pdf)
Daniel Haas, Sanjay Krishnan, Jiannan Wang, Michael J. Franklin, Eugene Wu. Wisteria: Nurturing Scalable Data Cleaning Infrastructure. VLDB 2015 Demo. (pdf)
Jiannan Wang, Sanjay Krishnan, Michael Franklin, Ken Goldberg, Tim Kraska, Tova Milo. A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data. In SIGMOD, Jun. 2014 (pdf)
Sanjay Krishnan, Animesh Garg, Sachin Patil, Colin Lea, Gregory Hager, Pieter Abbeel, Ken Goldberg. Transition State Clustering: Unsupervised Surgical Task Segmentation For Robot Learning. IJRR 2018. (read)
Caleb Chuck, Michael Laskey, Sanjay Krishnan, Ruta Joshi, Ken Goldberg. Statistical Data Cleaning for Deep Learning of Automation Tasks from Demonstrations. CASE 2017.
Adithya Murali, Animesh Garg, Sanjay Krishnan, Florian T. Pokorny, Pieter Abbeel, Trevor Darrell, Ken Goldberg. TSC-DL: Unsupervised Trajectory Segmentation of Multi-Modal Surgical Demonstrations with Deep Learning. (read)
Sanjay Krishnan, Animesh Garg, Sachin Patil, Colin Lea, Gregory Hager, Pieter Abbeel, Ken Goldberg. Transition State Clustering: Unsupervised Surgical Task Segmentation For Robot Learning. International Symposium on Robotics Research (ISRR). 2015. (read)
Liwen Sun, Sanjay Krishnan, Reynold S. Xin and Michael J. Franklin. A Partitioning Framework for Aggressive Data Skipping. VLDB 2014. (pdf)
Liwen Sun, Michael J. Franklin, Sanjay Krishnan, Reynold S. Xin. Fine-Grained Partitioning for Aggressive Data Skipping. SIGMOD 2014. (pdf)
Vatsal Patel, Sanjay Krishnan, Aimee Goncalves. SPRK: A Low-Cost Stewart Platform For Motion Study In Surgical Robotics. Under Review ISMR 2018.
Daniel Seita, Sanjay Krishnan, Roy Fox, Stephen McKinley, John F. Canny, Ken Goldberg. Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure. Under Review ICRA 2018.
Yeounoh Chung, Sanjay Krishnan, Tim Kraska. A Data Quality Metric (DQM): How to Estimate the Number of Undetected Errors in Data Sets. VLDB 2017. (pdf)
Brandie Nonnecke, Sanjay Krishnan, Jay Patel, Mo Zhou, Laura Byaruhanga, Dorothy Masinde, Maria Elena Meneses, Alejandro Martin del Campo, Camille Crittenden, Ken Goldberg. DevCAFE 1.0: A Participatory Platform for Assessing Development Initiatives in the Field. IEEE Global Humanitarian Technology Conference (GHTC). 2015 (Best Paper) (pdf)
Jeffrey Mahler, Sanjay Krishnan, Michael Laskey, Siddarth Sen, Adithyavairavan Murali, Ben Kehoe, Sachin Patil, Jiannan Wang, Mike Franklin, Pieter Abbeel, Ken Goldberg. Learning Accurate Kinematic Control of Cable-Driven Surgical Robots Using Data Cleaning and Gaussian Process Regression. CASE 2014. (read)