Leland Wilkinson is Chief Scientist at H2O and Adjunct Professor of Computer Science at the University of Illinois Chicago. He received an A.B. degree from Harvard in 1966, an S.T.B. degree from Harvard Divinity School in 1969, and a Ph.D. from Yale in 1975. Wilkinson wrote the SYSTAT statistical package and founded SYSTAT Inc. in 1984. After the company grew to 50 employees, he sold SYSTAT to SPSS in 1994 and worked there for ten years on research and development of visualization systems. Wilkinson subsequently worked at Skytree and Tableau before joining H2O. Wilkinson is a Fellow of the American Statistical Association, an elected member of the International Statistical Institute, and a Fellow of the American Association for the Advancement of Science. He has won the best speaker award at the National Computer Graphics Association and the Youden Prize for best expository paper in the statistics journal Technometrics. He has served on the Committee on Applied and Theoretical Statistics of the National Research Council and is a member of the Boards of the National Institute of Statistical Sciences (NISS) and the Institute for Pure and Applied Mathematics (IPAM). In addition to authoring journal articles, the original SYSTAT computer program and manuals, and patents in visualization and distributed analytic computing, Wilkinson is the author (with Grant Blank and Chris Gruber) of Desktop Data Analysis with SYSTAT. He is also the author of The Grammar of Graphics, the foundation for several commercial and open-source visualization systems (IBM RAVE, Tableau, ggplot2 for R, and Bokeh for Python).
Auto Visualization is the problem of producing meaningful graphics when presented with data. Relevant to this task are the strategies that expert statisticians and data analysts use to gain insights through visualization, as well as the portfolio of diagnostic methods devised by statisticians in the last 50 years. While some researchers and companies may claim to do automatic visualization, the problem is much deeper than simply producing collections of histograms, bar charts, and scatterplots. The deeper problem is determining which subset of these graphics is critical to recognizing anomalies, outliers, unusual distributions, missing values, and similar features of the data. This talk will cover aspects of this deeper problem and will introduce H2O software that implements some of these algorithms.