Giovanni Seni currently leads a team of data scientists and data engineers at Amazon’s A9.com, working on computational advertising problems. An active data mining practitioner in Silicon Valley, he has over 15 years of R&D experience in statistical pattern recognition and data mining applications. He has been a member of the technical staff at large technology companies and a contributor at smaller organizations. He holds five US patents and has published over twenty conference and journal articles. His book “Ensemble Methods in Data Mining – Improving accuracy through combining predictions” was published in February 2010 by Morgan & Claypool.
Distributed Learning of Rule Ensemble Models with H2O
A Rule Ensemble (RE) is an ensemble-based model built by combining simple, readable rules. While maintaining (and often improving) the accuracy of the classic tree ensemble, the rule-based model is much more interpretable. In this talk, I present an overview of REs and demo H2O-REgo, an R-based package that enables distributed learning of RE models on very large data sets using H2O.