Data Science Research @ U Chicago. How do we use data to improve our lives? My research group studies the theory, practice, and applications of large-scale sensing. This includes the design of efficient streaming data structures, distributed/decentralized data collection, data compression, and novel applications of such systems. For example, we are exploring how these algorithms apply to data governance and network security (i.e., tracking how information flows through an organization).

Ongoing Projects

  • Decentralized Prediction. Our group is exploring how nodes on a distributed network can efficiently coordinate to make fast collective decisions. Examples include a set of servers deciding whether a cyber-attack is underway, or a set of robots identifying an object in a room.
  • Measuring Data Science. As data science efforts proliferate through every organization, tools are needed to track data movement, data access, and data usage. Such tools are a core component of the future of data governance.
  • Social Determinants of Health. The National Academy of Medicine and the Chicago Department of Public Health recognize that advancing health requires more than treating the biological mechanisms of disease. It also requires accounting for, and striving to mitigate, the adverse consequences of social, environmental, behavioral, and psychological factors – the entire lived experience that we call the “sociome” – because these sociome factors: 1) interact with human biology to exacerbate or even primarily cause disease and injury, and 2) disrupt the social, environmental, and psychological securities required for best health. Our group approaches the challenge of documenting, compiling, organizing, and querying sociome datasets in a way that builds capacity beyond the initial questions these datasets have begun to answer.


1/10/2023 New paper on multimodal machine learning to be presented at UbiComp 2023.

11/01/2022 New paper on multi-resolution compression to be presented at SIGMOD 2023. Details to follow.

9/10/2022 New paper on streaming approximation to be presented at ICDE 2023.

9/01/2022 Lab alum Stavros Sintos joins the University of Illinois-Chicago as an Assistant Professor.

3/23/2022 Lab alum Xi Liang joins Databricks.

Recent Publications

Shinan Liu, Tarun Mangla, Ted Shaowang, Jinjin Zhao, John Paparrizos, Sanjay Krishnan, Nick Feamster. AMIR: Active Multimodal Interaction Recognition from Video and Network Traffic in Connected Environment. UbiComp 2023.

Bruno Barbarioli, Gabriel Mersy, Stavros Sintos, Sanjay Krishnan. HIRE: Hierarchical Residual Encoding for Multiresolution Compression in Time-Series Data. SIGMOD 2023.

Xi Liang, Stavros Sintos, and Sanjay Krishnan. JanusAQP: Efficient Partition Tree Maintenance for Dynamic Approximate Query Processing. ICDE 2023 pdf

Ted Shaowang, Xi Liang, Sanjay Krishnan. Sensor Fusion on the Edge: Initial experiments in the EdgeServe System. Big Data in Emergent Distributed Environments 2022. pdf

Ted Shaowang, Nilesh Jain, Dennis D. Matthews, and Sanjay Krishnan. Declarative data serving: the future of machine learning inference on the edge. VLDB 2021 pdf

Past Selected Publications in Relevant Areas

Data Structures for Approximation

Xi Liang, Stavros Sintos, Zechao Shang, Sanjay Krishnan. Combining Sampling and Aggregation (Nearly) Optimally. SIGMOD 2021 pdf

John Paparrizos, Chunwei Liu, Bruno Barbarioli, James Hwang, Ikrudya Edian, Aaron Elmore, Mike Franklin, and Sanjay Krishnan. VergeDB: A Database for IoT Analytics on Edge Devices. CIDR 2021 pdf

Xi Liang, Zechao Shang, Aaron J. Elmore, Sanjay Krishnan, Mike Franklin. Fast and Reliable Missing Data Contingency Analysis with Predicate-Constraints. SIGMOD 2020 pdf

Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, and Ion Stoica. Deep Unsupervised Cardinality Estimation. VLDB 2020. pdf

Distributed and Decentralized Systems (Digital and Human)

Siyuan Xia, Zhiru Zhu, Chris Zhu, Jinjin Zhao, Kyle Chard, Aaron Elmore, Ian Foster, Michael Franklin, Sanjay Krishnan, Raul Castro Fernandez. Data Station: Delegated, Trustworthy, and Auditable Computation to Enable Data-Sharing Consortia with a Data Escrow. VLDB 2022 pdf

Nalin Ranjan, Zechao Shang, Sanjay Krishnan, and Aaron J. Elmore. Version Reconciliation for Collaborative Databases. SoCC 2021 pdf

Martin Jaggi, Virginia Smith, Martin Takáč, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, and Michael I. Jordan. Communication-efficient distributed dual coordinate ascent. NeurIPS 2014. pdf

Sanjay Krishnan, Jay Patel, Michael J. Franklin, and Ken Goldberg. Social Influence Bias in Recommender Systems: A Methodology for Learning, Analyzing, and Mitigating Bias in Ratings. Under Review: ACM Conference on Recommender Systems (RecSys). Foster City, CA, USA. Oct 2014 (pdf)

Machine Learning Applications

Vanlin Sathya, Adam Dziedzic, Monisha Ghosh, and Sanjay Krishnan. Machine Learning based detection of multiple Wi-Fi BSSs for LTE-U CSAT. ICNC 2020 pdf

Adam Dziedzic, John Paparrizos, Sanjay Krishnan, Aaron Elmore, and Michael Franklin. Band-limited training and inference for convolutional neural networks. ICML 2019. pdf

Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joe Hellerstein, and Ion Stoica. Learning to Optimize Join Queries with Deep Reinforcement Learning. 2018. pdf

Roy Fox, Richard Shin, Sanjay Krishnan, Ken Goldberg, Dawn Song, and Ion Stoica. Parametrized hierarchical procedures for neural programming. ICLR 2018.

Roy Fox, Sanjay Krishnan, Ion Stoica, and Ken Goldberg. Multi-level discovery of deep options. 2017.