Arno Candel, is the Chief Technology Officer of H2O, a distributed and scalable open-source machine learning platform. He is also the main author of H2O’s Deep Learning. Before joining H2O, Arno was a founding Senior MTS at Skytree where he designed and implemented high-performance machine learning algorithms. He has over a decade of experience in HPC with C++/MPI and had access to the world’s largest supercomputers as a Staff Scientist at SLAC National Accelerator Laboratory where he participated in US DOE scientific computing initiatives and collaborated with CERN on next-generation particle accelerators. Arno holds a PhD and Masters summa cum laude in Physics from ETH Zurich, Switzerland. He has authored dozens of scientific papers and is a sought-after conference speaker. Arno was named “2014 Big Data All-Star” by Fortune Magazine. Follow him on Twitter: @ArnoCandel.
Driverless AI - Introduction and A Look Under the Hood
Driverless AI, H2O.ai’s latest flagship product, speeds up data science workflows by automating feature engineering, model tuning, ensembling and model deployment. Driverless AI turns Kaggle-winning recipes into production-ready code, and is specifically designed to avoid common mistakes such as under- or overfitting, data leakage or improper model validation. Avoiding these pitfalls alone can save weeks or more for each model, and is necessary to achieve high modeling accuracy. With Driverless AI, everyone can now train and deploy modeling pipelines with just a few clicks from the GUI. Advanced users can use the client/server API through a variety of languages such as Python, Java, C++, go, C# and many more. To speed up training, Driverless AI uses highly optimized C++/CUDA algorithms to take full advantage of the latest compute hardware. For example, Driverless AI runs orders of magnitudes faster on the latest Nvidia GPU supercomputers on Intel and IBM platforms, both in the cloud or on premise. There are two more product innovations in Driverless AI: statistically rigorous automatic data visualization and interactive model interpretation with reason codes and explanations in plain English. Both help data scientists and analysts to quickly validate the data and the models. In this talk, we explain how Driverless AI works and show how easy it is to reach top 5% rankings for several highly competitive Kaggle competitions.