Event Details

The last few years have seen an avalanche of new data sources providing scientists with a unique opportunity to study human behavior. In fact, much of the data now labeled as “Big Data” is generated by us, humans, as a result of our interactions with other people and our technologies. As we incessantly tweet, email, or make online purchases, we also create new and insightful datasets. This symposium is meant to bring together leading researchers from machine learning and the social sciences, each working at the interface between these disciplines, to help us understand new opportunities and methodologies for better predicting human behavior.

This event is free and open to the public.


Dr. Susan Athey
Dr. Susan Athey is the Economics of Technology Professor at Stanford Graduate School of Business. She received her bachelor’s degree from Duke University and her Ph.D. from Stanford, and she holds an honorary doctorate from Duke University. She previously taught in the economics departments at MIT, Stanford and Harvard. Her current research focuses on the economics of the internet, marketplace design, auction theory, the statistical analysis of auction data, and the intersection of econometrics and machine learning. She has focused on several applications, including timber auctions, internet search, online advertising, the news media, and virtual currency. Susan advises governments and businesses on the design of auction-based marketplaces. She has served as a long-term consultant for Microsoft Corporation since 2007, including a period as chief economist. She also serves as a long-term advisor to the British Columbia Ministry of Forests, helping to architect and implement their auction-based pricing system.

Dr. Jason Baldridge

Dr. Jason Baldridge is co-founder and Chief Scientist of People Pattern, and he was until last year an Associate Professor of Computational Linguistics at the University of Texas at Austin. As a professor, Jason worked on probabilistic models for categorization and syntax, with a particular emphasis on low-resource languages. He also focused on methods and applications for connecting linguistic objects to geography and time. He has been active in the creation and promotion of open source software for natural language processing: he is one of the co-creators of the Apache OpenNLP Toolkit, and he has contributed many others, including ScalaNLP, Junto, TextGrounder and OpenCCG. Jason received his Ph.D. from the University of Edinburgh in 2002, where his doctoral dissertation was awarded the 2003 Beth Dissertation Prize from the European Association for Logic, Language and Information. His main academic research interests include categorial grammars, parsing, semi-supervised learning, co-reference resolution and text geolocation.

Dr. Randall Lewis
Dr. Randall Lewis is an Economics Research Scientist on the Science & Algorithms team at Netflix. As a “big data” econometrician, he combines machine learning and econometrics to develop scalable causal measurement and prediction systems. Before joining Netflix, he worked at Google and Yahoo. Randall attended MIT as a Presidential Fellow where he earned his Ph.D. in economics and attended BYU as a Presidential Scholar, graduating as a valedictorian with degrees in economics and mathematics.

Dean William Maurer
Dean Bill Maurer is a cultural anthropologist who conducts research on law, property, money and finance, focusing on technological infrastructures and social relations of exchange and payment. He has particular expertise in emerging, alternative and experimental forms of money and finance, payment technologies, and their legal implications. Dean Maurer is founding director of the Institute for Money, Technology and Financial Inclusion, funded by the Bill and Melinda Gates Foundation, and was the founding co-director of the Intel Science and Technology Center in Social Computing. In July 2013, he assumed the role of Dean of the School of Social Sciences at UC Irvine. He maintains an active side interest in the experimental history of the Irvine School of Social Sciences, and has been involved in several curatorial projects related to that history.

Dr. Padhraic Smyth
Dr. Padhraic Smyth’s research includes machine learning, pattern recognition, applied statistics, data mining, information theory and artificial intelligence. His work focuses on how to automatically extract information from large and complex data sets. His research group works on the basic theory of inference from data as well as on a variety of applications of data analytic algorithms to problems in medicine, biology, climate modeling, astronomy, planetary science and analysis of Web and text data.

Dr. Yisong Yue
Dr. Yisong Yue is an assistant professor in the Computing and Mathematical Sciences Department at the California Institute of Technology. He was previously a research scientist at Disney Research. Before that, he was a postdoctoral researcher in the Machine Learning Department and the iLab at Carnegie Mellon University. He received a Ph.D. from Cornell University and a B.S. from the University of Illinois at Urbana-Champaign. Yisong’s research interests lie primarily in the theory and application of statistical machine learning. He is particularly interested in developing novel methods for spatiotemporal reasoning, structured prediction, interactive learning systems, and learning with humans in the loop. In the past, his research has been applied to information retrieval, recommender systems, text classification, learning from rich user interfaces, analyzing implicit human feedback, data-driven animation, behavior analysis, sports analytics, policy learning in robotics, and adaptive routing & allocation problems.


Dr. Susan Athey

Machine Learning for Personalized Causal Effects and Policy Estimation

Abstract: With the advent of wide access to “big data,” machine learning has made enormous advances in supervised and unsupervised techniques.  Standard supervised ML focuses on prediction problems, but many real-world policy problems can only be partially addressed using purely predictive techniques.  A recent literature has emerged combining techniques from machine learning with tools from the literatures on program evaluation and causal inference.  In this talk, I will review recent proposals to solve three distinct but related problems: heterogeneous treatment effect estimation, average treatment effect estimation, and estimation of optimal personalized policies.  The methods draw from a variety of literatures in machine learning, and themes include the need to modify standard predictive methods to optimize for causal inference objectives and enable the construction of confidence intervals for parameter estimates, as well as the importance of incorporating insights from the econometrics literature on semi-parametric efficient estimation.  I will also highlight the extension of these methods to techniques commonly used to enable causal inference in economic applications, such as instrumental variables.

Dr. Jason Baldridge

Multifaceted Extended Demographic and Psychographic Prediction

Abstract: Companies today have access to extensive data on their consumers, but the data often lacks many of the core demographic and psychographic variables that drive many marketing functions. This is especially true of consumer data that originates in social media profiles, which typically lack structured demographic information beyond names and locations—and even these are often incomplete or fabricated. As such, there has been a surge of academic and commercial interest in predicting values for gender, age, race, location, interests, personality, and more, given some portion of the information available in a social profile.

At People Pattern, we have developed classifiers that handle these predictions at scale and depth to deliver both high precision individual predictions and accurate aggregate population measures. I will discuss our approach to training and deploying profile classifiers based on multifaceted featurization that draws on profile attributes, posts, graph connections, and images. Some of the features additionally lend themselves to distant annotation, which we exploit with label propagation to quickly bootstrap predictions for new variables of interest. I will also briefly touch on ethical considerations raised in the context of such predictions.

Dr. Randall Lewis

Ghost Ads: Improving the Economics of Measuring Online Ad Effectiveness

Abstract: To measure the effects of advertising, marketers must know how consumers would behave had they not seen the ads. We develop a methodology we call ‘Ghost Ads,’ which facilitates this comparison by identifying the control-group counterparts of the exposed consumers in a randomized experiment. We show that, relative to Public Service Announcement (PSA) and Intent-to-Treat A/B tests, ‘Ghost Ads’ can reduce the cost of experimentation, improve measurement precision, deliver the relevant strategic baseline, and work with modern ad platforms that optimize ad delivery in real time. We also describe a variant ‘Predicted Ghost Ad’ methodology that is compatible with online display advertising platforms; our implementation records more than 100 million predicted ghost ads per day. We demonstrate the methodology with an online retailer’s display retargeting campaign. We show novel evidence that retargeting can work: the ads lifted website visits by 17.2% and purchases by 10.5%. Compared to Intent-to-Treat or PSA experiments, advertisers can measure ad lift just as precisely while spending at least an order of magnitude less.

Dr. Yisong Yue

Building Predictive Behavioral Models via Large-Scale Imitation Learning

Abstract: The ongoing explosion of spatiotemporal tracking data has now made it possible to analyze and model fine-grained behaviors in a wide range of domains.  For instance, tracking data is now being collected for every NBA basketball game with players, referees, and the ball tracked at 25 Hz, along with annotated game events such as passes, shots, and fouls.  Other settings include laboratory animals, people in public spaces, professionals in settings such as operating rooms, actors speaking and performing, digital avatars in virtual environments, and many others.

In this talk, I will describe ongoing research in using imitation learning to develop predictive models of fine-grained behavior.  Imitation learning is a branch of machine learning that deals with learning to imitate dynamic demonstrated behavior.  I will provide a broad overview of the methodologies, as well as specific examples in modeling laboratory animal behavior, professional sports, and speech animation.

This is joint work with Eyrun Eyolfsdottir, Taehwan Kim, Hoang Le, Stephan Zheng, Jianhui Chen, Andrew Kang, Sarah Taylor, Kristin Branson, Iain Matthews, Peter Carr, Patrick Lucey, Jim Little, and Pietro Perona.


The symposium will take place at the Calit2 building on the UC Irvine campus.

Calit2 (Campus Map):

Directions to UC Irvine can be found here.

Organizing Team

Event Co-Chairs

Dr. Matt Harding, Dr. Sameer Singh


This event is free and open to the public. RSVP is requested.


Symposium Schedule

Date: Friday, March 10, 2017
Time: 1:15 – 6:00 p.m.
Location: UC Irvine, Calit2 Auditorium. Parking is $10 per vehicle in the Anteater Parking Structure.

Time               Description
1:15 – 1:30 p.m.   Opening Remarks: Dean William Maurer, Ph.D., School of Social Sciences, University of California, Irvine
1:30 – 2:15 p.m.   Susan Athey, Ph.D., The Economics of Technology Professor, Stanford Graduate School of Business
2:15 – 3:00 p.m.   Jason Baldridge, Ph.D., Co-founder and Chief Scientist, People Pattern
3:00 – 3:15 p.m.   Coffee Break
3:15 – 4:00 p.m.   Randall Lewis, Ph.D., Economics Research Scientist, Netflix
4:00 – 4:45 p.m.   Yisong Yue, Ph.D., Assistant Professor of Computing and Mathematical Sciences, Caltech
4:45 – 5:00 p.m.   Closing Remarks: Padhraic Smyth, Ph.D., Director, Data Science Initiative; Professor of Computer Science and Professor of Statistics, UCI
5:00 – 6:00 p.m.   Reception

This symposium is organized by Matthew Harding, Ph.D., Associate Professor of Economics, and Sameer Singh, Ph.D., Assistant Professor of Computer Science, UC Irvine.