D3M Project

The Data-Driven Discovery of Models (D3M) program aims to develop automated model discovery systems that enable users with subject matter expertise but no data science background to create empirical models of real, complex processes. At the NYU VIDA Center, we address three important challenges in automating machine learning: pipeline synthesis, model understanding and curation, and data augmentation.


AlphaD3M is an AutoML system that automatically searches for models and derives end-to-end pipelines that read and pre-process the data and train the model. AlphaD3M uses deep learning to learn how to construct these pipelines incrementally. The search progresses by self-play with iterative self-improvement.
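
To make the idea of incremental pipeline construction concrete, here is a minimal sketch in plain Python. It is not AlphaD3M's implementation: the learned deep-learning policy and self-play are replaced by a simple greedy search that swaps one pipeline step at a time and keeps the change only if accuracy improves. The toy data, the primitive names (`impute_zero`, `fit_centroid`, and so on), and the `score` and `search` functions are all assumptions made for illustration.

```python
# Toy data: (feature, label); None marks a missing feature value.
TRAIN = [(1.0, 0), (2.0, 0), (3.0, 0), (None, 0), (8.0, 1), (9.0, 1), (None, 1)]
TEST = [(1.5, 0), (8.5, 1), (2.5, 0), (7.5, 1)]

def impute_zero(rows):
    # Replace missing features with 0.0.
    return [(0.0 if x is None else x, y) for x, y in rows]

def impute_mean(rows):
    # Replace missing features with the mean of the observed features.
    observed = [x for x, _ in rows if x is not None]
    m = sum(observed) / len(observed)
    return [(m if x is None else x, y) for x, y in rows]

def fit_majority(rows):
    # Always predict the most frequent training label.
    labels = [y for _, y in rows]
    top = max(set(labels), key=labels.count)
    return lambda x: top

def fit_centroid(rows):
    # Predict the class whose mean feature value is closest to x.
    cents = {}
    for c in {y for _, y in rows}:
        xs = [x for x, y in rows if y == c]
        cents[c] = sum(xs) / len(xs)
    return lambda x: min(cents, key=lambda c: abs(x - cents[c]))

# Hypothetical primitive catalog: one slot per pipeline stage.
PRIMITIVES = {
    "impute": {"zero": impute_zero, "mean": impute_mean},
    "model": {"majority": fit_majority, "centroid": fit_centroid},
}

def score(pipeline):
    """Run the pipeline end to end and return its accuracy on TEST."""
    clean = PRIMITIVES["impute"][pipeline["impute"]](TRAIN)
    predict = PRIMITIVES["model"][pipeline["model"]](clean)
    return sum(predict(x) == y for x, y in TEST) / len(TEST)

def search(start):
    """Greedy stand-in for a learned policy: replace one step at a time."""
    best, best_score = dict(start), score(start)
    improved = True
    while improved:
        improved = False
        for slot, options in PRIMITIVES.items():
            for name in options:
                candidate = {**best, slot: name}
                s = score(candidate)
                if s > best_score:
                    best, best_score, improved = candidate, s, True
    return best, best_score

best, accuracy = search({"impute": "zero", "model": "majority"})
print(best, accuracy)
```

Starting from a weak default pipeline, the search replaces the majority-class model with the nearest-centroid model because that single edit raises the accuracy on the toy test set. A real system would search a far larger space of primitives and would need a smarter strategy than exhaustive single-step swaps.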


Visus is a system designed to support model building and the curation of ML data-processing pipelines generated by AutoML systems. Visus also integrates visual analytics techniques and allows users to perform interactive data augmentation and visual model selection.


Auctus automatically discovers datasets on the Web and, unlike existing dataset search engines, infers consistent metadata for indexing and supports join and union search queries. Auctus is already being used in a real deployment environment to improve the performance of machine learning models.
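
As an illustration of what join and union search mean, the following sketch ranks candidate tables against a query table. It does not use Auctus's API or reproduce its algorithms: the table layout (a dict mapping column names to value lists), the function names, the overlap measures, and the 0.5 threshold are all assumptions chosen for the example. A join candidate is a table with a column whose values overlap the query's key column, so it can add new columns. A union candidate is a table with a similar schema, so it can add new rows.

```python
def containment(query_values, candidate_values):
    """Fraction of distinct query values that also appear in the candidate."""
    q, c = set(query_values), set(candidate_values)
    return len(q & c) / len(q) if q else 0.0

def join_candidates(query, tables, key, threshold=0.5):
    """Return (table, column, score) triples whose column could join on `key`."""
    hits = []
    for name, table in tables.items():
        for column, values in table.items():
            s = containment(query[key], values)
            if s >= threshold:
                hits.append((name, column, s))
    return sorted(hits, key=lambda h: -h[2])

def union_candidates(query, tables, threshold=0.5):
    """Return (table, score) pairs whose column names overlap the query's."""
    q_cols = set(query)
    hits = []
    for name, table in tables.items():
        t_cols = set(table)
        s = len(q_cols & t_cols) / len(q_cols | t_cols)  # Jaccard on column names
        if s >= threshold:
            hits.append((name, s))
    return sorted(hits, key=lambda h: -h[1])

# Hypothetical query table and a tiny hypothetical index of discovered tables.
query = {"zip": ["10001", "10002", "10003"], "sales": [5, 7, 3]}
tables = {
    "income": {
        "zipcode": ["10001", "10002", "10003", "10004"],
        "median_income": [61000, 58000, 72000, 66000],
    },
    "sales_2019": {"zip": ["10005", "10006"], "sales": [2, 9]},
}

print(join_candidates(query, tables, key="zip"))
print(union_candidates(query, tables))
```

In this toy index, `income` is a join candidate because its `zipcode` column contains every query ZIP code, while `sales_2019` is a union candidate because it shares the query's column names. Matching on column names alone is a simplification; it would miss tables that hold the same kind of data under different headers.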


09/2020 AlphaD3M+PipelineProfiler was selected as one of the finalists in a machine-learning competition organized by Wells Fargo.
08/2020 Paper “PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines” has been accepted at VIS2020.
09/2019 The Visus system has been accepted to the Demo Expo at NYC Media Lab’s Annual Summit.
09/2019 AlphaD3M and Visus have been selected for a talk in the special track Automated Machine Learning and AI at IBM AI Systems Day.
05/2019 Paper “Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar” accepted at AutoML’19.
05/2019 A paper accepted at DEEM’19: “Debugging Machine Learning Pipelines”.
04/2019 New publication accepted at HILDA’19 (co-located with SIGMOD): “Visus: An Interactive System for Automatic Machine Learning Model Building and Curation”.
06/2018 Paper “AlphaD3M: Machine Learning Pipeline Synthesis” accepted at AutoML’18.