Update: Starting in February 2023, I will be working full time on the development of the next generation of Knowledge Construction and Serving platforms at Apple.
I am an Assistant Professor at the Department of Computer Science at ETH Zurich where I will lead the Structured Intelligence Systems Group (part of the Systems Group). I am also one of the leads in the Knowledge Platform team at Apple.
Previously, I was a Senior Manager at Apple leading the Knowledge Platform - Graph ML team. I was also an Assistant Professor at UW-Madison and a member of the Database Group. I’ve also had the pleasure to be a co-founder of Inductiv (acquired by Apple), a company developing AI for identifying and correcting errors in data.
I am always looking for good students! If you are interested in working on the topics below, please reach out at theo.rekatsinas[at]inf.ethz.ch.
My lab focuses on the foundations of structured intelligence systems:
- Software 2.0 for Data Quality: We are exploring the fundamental connections between data cleaning and machine learning. The HoloClean project introduced machine learning to the problem of data cleaning: we showed how to model data cleaning as a statistical learning problem and how attention-based mechanisms and self-supervised learning can automate data cleaning, and we introduced multiple theoretical results on how to deal with noisy or dirty data. More recently, we have been exploring the synergies between data cleaning and machine learning deployments in the Picket project. This talk at the Stanford MLSys Seminar provides an overview.
- Deep Learning over Billion-scale Structured Data: We are developing a system to make the use of deep learning models over billion-edge structured data easier, faster, and cheaper. We have started with the Marius project, which focuses on a key bottleneck in the development of machine learning systems over large-scale graph data: data movement during training. Marius addresses this bottleneck with a novel data flow architecture that maximizes resource utilization of the entire memory hierarchy (including disk, CPU, and GPU memory). Marius is under active development and available as an open-source project. You can learn more about Marius from our recent OSDI '21 and MLOpsWorld talks.
News
- March, 2023 Congratulations to my student Jason Mohoney for becoming an Apple AI/ML Scholar.
- January, 2022 Excited to talk about Data Debugging in ML at my alma mater, ECE @ NTUA.
- June, 2021 New talk about Marius and Machine Learning Over Billion-Edge Graphs at MLOpsWorld.
- March, 2021 Excited to be talking about Software 2.0 for Data Quality at the Stanford MLSys Seminar.
- February, 2021 Excited to talk about our work on Data Quality at CMU (ML with Large Datasets).