
Watch the January 2021 Cohort’s research talks here!

January 2021 Rising Stars

Talk Title: Evaluating the Impact of Entity Resolution in Social Network Metrics

Watch Abby’s Research Lightning Talk

Talk Abstract: Modern databases are filled with potential duplicate entries, caused by misspellings, changes of address, or differences in abbreviations. The probabilistic disambiguation of entries is often referred to as entity resolution.

Entity resolution of individuals (nodes) in relational datasets is often viewed as a pre-processing step in network analysis. Studies in bibliometrics have indicated that entity resolution changes network properties in citation networks, but little research has used real-world social networks that vary in size and type. We present a novel perspective on entity resolution in networks—where we propagate error from the entity resolution process into downstream network inferences. We also seek to understand how match thresholds in unsupervised entity resolution affect both global and local network properties, such as the degree distribution, centrality, transitivity, and motifs such as stars and triangles. We propose a calibration of these network metrics given measures of entity resolution quality, such as node “splitting” and “lumping” errors.

We use a respondent-driven sample of people who use drugs (PWUD) in Appalachia and a longitudinal network study of Chicago-based young men who have sex with men (YMSM) to demonstrate the implications this has for social and public health policy.
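
For intuition, here is a minimal sketch (using networkx; the toy graph, match scores, thresholds, and merge rule are hypothetical illustrations, not the talk's data or method) of how the match threshold chosen during entity resolution shifts downstream network metrics:

```python
# Sketch: how entity-resolution merge decisions shift network metrics.
# Hypothetical example; the data, thresholds, and merge rule are illustrative only.
import networkx as nx

# Toy social network in which some nodes may refer to the same person.
G = nx.Graph([(1, 2), (2, 3), (3, 1), (4, 5), (5, 2)])

# Candidate duplicate pairs with match scores from some entity-resolution model.
candidate_pairs = {(1, 5): 0.8, (3, 4): 0.4}

def resolve(graph, pairs, threshold):
    """Merge every candidate pair whose match score meets the threshold."""
    H = graph.copy()
    for (u, v), score in pairs.items():
        if score >= threshold and u in H and v in H:
            H = nx.contracted_nodes(H, u, v, self_loops=False)
    return H

for t in (0.3, 0.5, 0.9):
    H = resolve(G, candidate_pairs, t)
    print(f"threshold={t}: nodes={H.number_of_nodes()}, "
          f"transitivity={nx.transitivity(H):.2f}, "
          f"max degree={max(d for _, d in H.degree())}")
```

Lowering the threshold merges more candidate pairs ("lumping") while raising it leaves duplicates "split," and both directions move metrics such as transitivity and maximum degree.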

Bio: Abby Smith is a Ph.D. Candidate in Statistics at Northwestern University. Her work centers around evaluating the impact of entity resolution error in social network inferences. She is particularly interested in collaborative research and data science for social good applications, and most recently served as a Solve for Good consultant at the mHealth nonprofit Medic Mobile. Abby is passionate about building community for women in statistics and data science in Chicago, and serves as a WiDS Ambassador and R-Ladies: Chicago board member. She holds a Masters in Statistical Practice and a B.S. in Mathematics, both from Carnegie Mellon.

Talk Title: Modeling the Impact of Social Determinants of Health on Covid-19 Transmission and Mortality to Understand Health Inequities

Watch Abby’s Spotlight Research Talk

Talk Abstract: The Covid-19 pandemic has highlighted drastic health inequities, particularly in cities such as Chicago, Detroit, New Orleans, and New York City. Reducing Covid-19 morbidity and mortality will likely require an increased focus on social determinants of health, given their disproportionate impact on populations most heavily affected by Covid-19. A better understanding of how factors such as household income, housing location, health care access, and incarceration contribute to Covid-19 transmission and mortality is needed to inform policies around social distancing and testing and vaccination scale-up.

This work builds upon an existing agent-based model of Covid-19 transmission in Chicago, CityCOVID. CityCOVID consists of a synthetic population that is statistically representative of Chicago’s population (2.7 million persons), along with their associated places (1.4 million places) and behaviors (13,000 activity schedules). During a simulated day, agents move from place-to-place, hour-by-hour, engaging in social activities and interactions with other colocated agents, resulting in an endogenous colocation or contact network. Covid-19 transmission is determined via a simulated epidemiological model based on this generated contact network by tuning (fitting) model parameters that result in simulation output that matches observed Covid-19 death and hospitalization data from the City of Chicago. Using the CityCOVID infrastructure, we quantify the impact of social determinants of health on Covid-19 transmission dynamics by applying statistical techniques to empirical data to study the relationship between social determinants of health and Covid-19 outcomes.

Bio: Abby Stevens is a fourth-year statistics PhD student at the University of Chicago advised by Rebecca Willett. She is interested in using data science techniques to address important social and political issues, such as climate science, public health, and algorithmic fairness. She graduated with a math degree from Grinnell College in 2014 and then worked as a data scientist at a healthcare tech company before entering graduate school. She has been involved in a number of data science for social good organizations and is a primary organizer of the annual Women in Data Science Chicago event.

Talk Title: Covariant Neural Networks for Physics Applications

Watch Alexander’s Research Lightning Talk

Talk Abstract: Most traditional neural network architectures do not respect any intrinsic structure of the input data and are instead expected to "learn" it. CNNs are the first widespread example of a symmetry, in this case the translational symmetry of images, being used to inform much more efficient and transparent network architectures. More recently, CNNs were generalized to other non-commutative symmetry groups such as SO(3). However, in physics applications one is more likely to encounter input data that belong to linear representations of Lie groups, as opposed to being functions (or "images") on a symmetric space of the group.

To deal with such problems, I will present a general feed-forward architecture that takes vectors as inputs, works entirely in the Fourier space of the symmetry group, and is fully covariant. This approach allows one to achieve equal performance with drastically fewer learnable parameters. Moreover, the models become much more physically meaningful and more likely to be interpretable. My application of choice is in particle physics, where the main symmetry is the 6-dimensional Lorentz group. I will demonstrate the success of covariant architectures compared to more conventional approaches.

Bio: I am a PhD student at the University of Chicago working on theoretical hydrodynamics problems in relation to the quantum Hall effect. In addition, I am working on developing new group-covariant machine learning tools for physics applications, such as Lorentz-covariant neural networks for particle physics. My background is in mathematical physics, in which I hold a master's degree from Saint-Petersburg University in Russia. My interests lie at the intersection of theoretical and mathematical physics and new interdisciplinary applications of such ideas.

Talk Title: Credible and Effective Data-Driven Decision-Making: Minimax Policy Learning under Unobserved Confounding

Talk Abstract: We study the problem of learning causal-effect maximizing personalized decision policies from observational data while accounting for possible unobserved confounding. Since policy value and regret may not be point-identifiable, we study a method that minimizes the worst-case estimated regret over an uncertainty set for propensity weights that controls the extent of unobserved confounding. We prove generalization guarantees that ensure our policy will be safe when applied in practice and will in fact obtain the best-possible uniform control on the range of all possible population regrets that agree with the possible extent of confounding. Finally, we assess and compare our methods on synthetic and semi-synthetic data. In particular, we consider a case study on personalizing hormone replacement therapy based on the parallel WHI observational study and clinical trial. We demonstrate that hidden confounding can hinder existing policy learning approaches and lead to unwarranted harm, while our robust approach guarantees safety and focuses on well-evidenced improvement.  This work is joint with Nathan Kallus. An earlier version was circulated as “Confounding-Robust Policy Improvement”.  Time permitting, I will highlight recent follow-up work on robust policy evaluation for infinite-horizon reinforcement learning. 
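
As a hedged sketch of the general shape of such an objective (my notation, not necessarily the paper's exact formulation):

\[
\hat{\pi} \in \arg\min_{\pi \in \Pi} \; \max_{W \in \mathcal{U}_\Gamma} \; \hat{R}_W(\pi),
\]

where \(\hat{R}_W(\pi)\) is an inverse-propensity-weighted estimate of the regret of policy \(\pi\) against a baseline, and \(\mathcal{U}_\Gamma\) collects all weight vectors consistent with true propensities whose odds differ from the estimated propensities by at most a factor \(\Gamma\), the assumed bound on unobserved confounding. The inner maximization evaluates the worst-case regret over that set; the outer minimization picks the policy that is safest under it.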

Bio: My research interests are at the intersection of statistical machine learning and operations research, with the goal of informing reliable data-driven decision-making. Specifically, I have made fundamental contributions and developed algorithmic frameworks for robust causal-effect-maximizing personalized decision rules in view of unobserved confounding, as well as methodology for credible impact evaluation for algorithmic fairness, with high potential impact in industry and policy. My work has been published in journals such as Management Science and in top-tier computer science/machine learning venues (NeurIPS/ICML), and has received an INFORMS Data Mining Section Best Paper award. My work was previously supported by an NDSEG (National Defense Science and Engineering Graduate) Fellowship.

Talk Title: AI for Population Health: Melding Data and Algorithms on Networks

Watch Bryan’s Spotlight Research Talk

Talk Abstract: As exemplified by the COVID-19 pandemic, our health and wellbeing depend on a difficult-to-measure web of societal factors and individual behaviors. Tackling social challenges with AI requires algorithmic and data-driven paradigms which span the full process of gathering costly data, learning models to understand and predict interactions, and optimizing the use of limited resources in interventions. This talk presents methodological developments at the intersection of machine learning, optimization, and social networks which are motivated by on-the-ground collaborations on HIV prevention, tuberculosis treatment, and the COVID-19 response. These projects have produced deployed applications and policy impact. For example, I will present the development of an AI-augmented intervention for HIV prevention among homeless youth. This system was evaluated in a field test enrolling over 700 youth and found to significantly reduce key risk behaviors for HIV.

Bio: Bryan Wilder is a final-year PhD student in Computer Science at Harvard University, where he is advised by Milind Tambe. His research focuses on the intersection of optimization, machine learning, and social networks, motivated by applications to population health. His work has received or been nominated for best paper awards at ICML and AAMAS, and was a finalist for the INFORMS Doing Good with Good OR competition. He is supported by the Siebel Scholars program and previously received an NSF Graduate Research Fellowship.

Talk Title: Towards Data-Driven Internet Routing Security

Talk Abstract: The Internet ecosystem is critical for the reliability of online daily life. However, key Internet protocols, such as the Border Gateway Protocol (BGP), were not designed to cope with untrustworthy parties, making them vulnerable to misconfigurations and attacks from anywhere in the network. In this talk, I will present an evidence-based data-driven approach to improve routing infrastructure security, which I use to identify and characterize BGP serial hijackers, networks that persistently hijack IP address blocks in BGP. I’ll also show how similar approaches can quantify the benefits of the RPKI security framework against prefix hijacks, and identify route leaks. This work improves our understanding about how our Internet actually works and has been used by industry and researchers for network reputation and monitoring of operational security practices.

Bio: Cecilia Testart is a PhD candidate in EECS at MIT, working with David D. Clark. Her research is at the intersection of computer networks, data science, and policy. Her doctoral thesis focuses on securing the Internet's core routing protocols, leveraging machine learning and data science approaches to understand the impact of protocol design on security, and considering both technical and policy challenges to improve the current state of the art. Cecilia holds engineering degrees from Universidad de Chile and Ecole Centrale Paris and a dual master's degree in Technology and Policy and EECS from MIT. Prior to joining MIT, she helped set up the Chilean office of Inria (the French National Institute for Research in Digital Science and Technology) and worked for the research lab of .CL, the Chilean top-level domain. She has interned at Akamai, MSR, and the OECD. Cecilia's work received a Distinguished Paper Award at the ACM Internet Measurement Conference in 2019.

Talk Title: Machine Learning for Astrophysics & Cosmology in the Era of Large Astronomical Surveys and an Application for the Discovery and Classification of Faint Galaxies

Watch Dimitrios’ Research Lightning Talk

Talk Abstract: Observational astrophysics & cosmology are entering the era of big data. Future astronomical surveys are expected to collect hundreds of petabytes of data and detect billions of objects. Machine learning will play an important role in the analysis of these surveys, with the potential to revolutionize astronomy, as well as providing challenging problems that can create opportunities for breakthroughs in the fundamental understanding of machine learning.
In this talk I will present the discovery of Low Surface Brightness Galaxies (LSBGs) from Dark Energy Survey (DES) data. LSBGs are galaxies with intrinsic brightness less than that of the dark sky, and so are hard to detect and study. At the same time, they are expected to dominate the number density of galaxies in the universe, a population that thus remains relatively unexplored. I will discuss the development of automated, deep learning-based pipelines for LSBG detection (separation of LSB galaxies from LSB artifacts present in images) and morphological classification. Such techniques will be extremely valuable with the advent of very large future surveys like the planned Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory.

Bio: Dimitrios Tanoglidis is a fifth-year PhD student in the Department of Astronomy & Astrophysics at the University of Chicago. He holds a BSc in Physics and an MSc in Theoretical Physics, both from the University of Crete, Greece. His research interests lie in cosmology, the analysis of large galaxy surveys, and data science applications in astrophysics. He has led the research for the discovery and analysis of Low Surface Brightness Galaxies from Dark Energy Survey data using machine learning. Interdisciplinary in nature, he is also pursuing a certificate in Computational Social Science.

Talk Title: Network Effects on Outcomes and Unequal Distribution of Resources

Watch Eaman’s Research Lightning Talk

Talk Abstract: We study how networks affect different groups differently and provide pathways that reinforce existing inequalities. First, we provide observational evidence for differential network advantages in access to information: individuals from the low-status group receive lower marginal benefit from networking than the high-status group. Second, we provide causal evidence for differential diffusion of a new behavior in the network, driven mainly by homophily and slight initial advantages of one group. Third, we develop a theoretical network model that captures the network structure of unequal access to opportunities. We show that any departure from the uniform distribution of links to information sources among members of a group limits the diffusion of information to the group as a whole. Fourth, we develop an online lab experiment to further study the network mechanisms that widen inter-group differences and yield different returns on social capital to different groups. We recruit individuals to play an online collaborative game in which they have to find and dig gold mines and, in the process, can pass information to their network neighbors. By changing the network structure and the composition of groups with low and high initial advantage, we generate the processes that lead to unequal distribution of opportunities, beyond what is expected from individual differences. Finally, we contribute to the literature on network structure and performance and propose the concept of bandwidth-diversity matching: individuals who match the tie strength to their contacts with their information novelty achieve truly diverse networks and better outcomes.

Bio: I am a PhD candidate in the Social and Engineering Systems program at MIT IDSS, under the supervision of Prof. Pentland and Prof. Eckles. I am also receiving a second PhD in Statistics from the Statistics and Data Science Center at MIT. I received my Bachelor's and Master's degrees in Computer Science, both from the University of Michigan – Ann Arbor.
My PhD research focuses on micro-level structural factors, such as network structure, that contribute to the unequal distribution of resources or information. As a computational social scientist, I use methods from network science, statistics, experiment design, and causal inference. I am also interested in understanding collective behavior in institutional settings and the institutional mechanisms that promote cooperative behavior in networks or, in contrast, lead to unequal outcomes for different groups.
In a previous life, I worked at Google New York City as a software engineer from 2011 to 2015. Currently, I am also a research contractor at Facebook working on how networks affect economic outcomes.

Talk Title: What and How Students Read: A Data-driven Insight

Talk Abstract: Reading is an integral part of learning. The purpose of reading to learn is to comprehend meaning from informational texts. Reading comprehension tasks require self-regulated learning (SRL) behaviors: planning, monitoring, and evaluating one's reading strategies. Students without SRL skills may struggle in reading, which in turn may inhibit them from acquiring domain-specific knowledge. Thus, understanding students' reading behavior and SRL usage is important for intervention. Digital reading platforms can provide opportunities to learn and practice SRL strategies in classroom settings. These platforms log a rich array of student and teacher interaction data. Retrospective analysis of these logged data can yield insights that can be used to support tailored interventions by instructors and students in complex learning activities. In this talk, I will discuss students' science reading and SRL behaviors and connect those behaviors with performance within a digital literacy platform, Actively Learn. The talk consists of two studies: (i) identifying patterns that differ between productive and unproductive students and (ii) analyzing the association between teachers' behavior and students' SRL usage. I will finish my talk by outlining possible future directions.

Bio: Effat Farhana is a Ph.D. Candidate in the Computer Science Department at North Carolina State University working with Dr. Collin F. Lynch in the ArgLab research group. She received her B.S. in Computer Science and Engineering from Bangladesh University of Engineering and Technology. Her research focuses on mining educational software to derive data-driven heuristics, machine learning, and designing interpretable machine learning algorithms.

Talk Title: Quantifying The Power of Mental Shortcuts in Persuasive Communication with Causal Inference from Text

Talk Abstract: The reliance of individuals on mental shortcuts based on factors such as gender, affiliation, and social status could distort the equitability of interpersonal discussions in various settings. Yet, the impact of such shortcuts in real-world discussions remains challenging to quantify. In this talk, I propose a novel quasi-experimental study that incorporates unstructured text in a principled manner to quantify the causal effect of status indicators in persuasive communication. I also examine how linguistic and rhetorical devices moderate this effect, and thus provide communication strategies to potentially reduce individuals’ reliance on mental shortcuts. I discuss implications for fair communication policies both within organizations and in society at large.

Bio: Emaad Manzoor is a PhD candidate in the Heinz College of Information Systems and Public Policy at Carnegie Mellon University, and will begin as an assistant professor of Operations and Information Management at the University of Wisconsin-Madison in Fall 2021. Substantively, he designs randomized experiments and quasi-experimental studies to quantify the persuasive power of mental shortcuts in text-based communication, and how language can be used to moderate this power. Methodologically, he develops data-mining techniques for evolving networks and statistical frameworks for causal inference with text. He is funded by a 2020 McKinsey & Company PhD Fellowship, and was a finalist for the 2019 Snap Research PhD Fellowship, the 2019 Jane Street Depth First Learning Fellowship, and the 2019 INFORMS Annual Meeting Best Paper award.

Talk Title: Machine Learning in Dynamical Systems

Talk Abstract: Many branches of science and engineering involve estimation and control in dynamical systems; consider, for example, using data to help stabilize the flight of a drone or predict the path of a hurricane. We consider control in dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing an online controller which competes with the best dynamic sequence of control actions selected in hindsight, instead of the best controller in some specific class of controllers. This formulation is attractive when the environment changes over time and no single controller achieves good performance over the entire time horizon. We derive the structure of the regret-optimal online controller using techniques from robust control theory and present a clean data-dependent bound on its regret. We also present numerical simulations which confirm that our regret-optimal controller significantly outperforms various classical controllers in dynamic environments.
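
To make the benchmark concrete (a sketch in my own notation, consistent with the description above rather than a formula taken from the talk):

\[
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} c_t(x_t, u_t) \;-\; \min_{u^\star_{1:T}} \sum_{t=1}^{T} c_t(x^\star_t, u^\star_t),
\]

where the first sum is the cost incurred by the online controller and the second is the cost of the best dynamic sequence of control actions chosen in hindsight with full knowledge of the disturbance sequence. A regret-optimal controller keeps this gap as small as possible for every admissible disturbance sequence.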

Bio: Gautam is a PhD student in the Computing and Mathematical Sciences (CMS) department at Caltech, where he is advised by Babak Hassibi. He is broadly interested in machine learning, optimization, and control, especially 1) online learning and online decision-making and 2) integrating machine learning with physics, dynamics and control. Much of his PhD work has been supported by a National Science Foundation Graduate Research Fellowship and an Amazon AWS AI Fellowship. Prior to joining Caltech, he obtained a BS in Mathematics from Georgia Tech.

Talk Title: Adversarial Collusion on the Web: State-of-the-art and Future Directions

Talk Abstract: The growth and popularity of online media have made it the most important platform for collaboration and communication among its users. Given this tremendous growth, the social reputation of an entity in online media plays an important role. This has led users to choose artificial ways to gain social reputation by means of blackmarket services, as the natural way to boost social reputation is time-consuming. We refer to such artificial ways of boosting social reputation as collusion. In this talk, we will comprehensively review recent developments in analyzing and detecting collusive entities on online media. First, we give an overview of the problem and motivate the need to detect these entities. Second, we survey the state-of-the-art models, which range from feature-based methods to more complex models, such as deep learning architectures and advanced graph concepts. Third, we detail the annotation guidelines, provide a description of tools/applications, and explain the publicly available datasets. The talk concludes with a discussion of future trends.

Bio: Hridoy Sankar Dutta is currently pursuing his Ph.D. in Computer Science and Engineering from IIIT-Delhi, India. Starting January 2021, he will be joining University of Cambridge as a Research Assistant in the Cambridge Cybercrime Centre. His current research interests include data-driven cybersecurity, social network analysis, natural language processing, and applied machine learning. He received his B.Tech degree in Computer Science and Engineering from Institute of Science and Technology, Gauhati University, India in 2013. From 2014 to 2015, he worked as an Assistant Project Engineer at the Indian Institute of Technology, Guwahati (IIT-G), India, for the project ‘Development of Text to Speech System in Assamese and Manipuri Languages’. He completed his M.Tech in Computer Science and Engineering from NIT Durgapur, India in 2015. More details can be found at https://hridaydutta123.github.io/.

Talk Title: Computer-Aided Diagnosis of Thoracic CT Scans Through Multiple Instance Transfer Learning

Talk Abstract: Computer-aided diagnosis systems have demonstrated significant potential in improving patient care and clinical outcomes by providing more extensive information to clinicians.  The development of these systems typically requires a large amount of well-annotated data, which can be challenging to acquire in medical imaging.  Several techniques have been investigated in an attempt to overcome insufficient data, including transfer learning, or the application of a pre-trained model to a new domain and/or task.  The successful translation of transfer learning models to complex medical imaging problems holds significant potential and could lead to widespread clinical implementation.

However, transfer learning techniques often fail to translate effectively because they are limited by the domain in which they were initially trained.  For example, computed tomography (CT) is a powerful medical imaging modality that leverages 3D images in clinical decision-making, but transfer learning models are typically trained on 2D images and thus cannot incorporate the additional information provided by the third dimension.  This evaluation of the available data in a CT scan is inefficient and potentially does not effectively improve clinical decisions.  In this project, the 3D information available in CT scans is incorporated into transfer learning through a multiple instance learning (MIL) scheme, which can individually assess 2D images and form a collective 3D prediction based on the 2D information, similar to how a radiologist would read a CT scan.  This approach has been applied to evaluate both COVID-19 and emphysema in thoracic CT scans and has demonstrated strong clinical potential.
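
The aggregation step can be pictured with a minimal PyTorch-style sketch (the backbone, pooling choice, and shapes are illustrative assumptions, not the exact model from this project): a shared 2D network scores each slice, and a permutation-invariant pooling over slices produces a single scan-level prediction.

```python
# Sketch of multiple-instance learning over CT slices (illustrative only).
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SliceMIL(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)      # pretrained weights would be used for transfer learning
        backbone.fc = nn.Linear(backbone.fc.in_features, 1)
        self.slice_model = backbone

    def forward(self, scan):                   # scan: (num_slices, 3, H, W)
        slice_logits = self.slice_model(scan)  # (num_slices, 1) per-slice scores
        return slice_logits.max(dim=0).values  # max-pool over instances -> scan-level logit

model = SliceMIL()
fake_scan = torch.randn(40, 3, 224, 224)       # 40 slices of one thoracic CT scan
print(model(fake_scan).shape)                  # torch.Size([1])
```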

Bio: Jordan Fuhrman is a student in the Graduate Program in Medical Physics at the University of Chicago. Since joining the program after his graduation from the University of Alabama in 2017, Jordan’s research has focused on the investigation of computer-aided diagnosis techniques for evaluating CT scans. Generally, this includes implementation of machine learning, deep learning, and computer vision algorithms to accomplish such tasks as disease detection, image segmentation, and prognosis assessments. His primary research interests lie in the development of novel approaches that incorporate the full wealth of information in CT scans to better inform clinical predictions, the exploration of explainable, interpretable outputs to improve clinical understanding of deep learning algorithm performance, and the early detection and prediction of patient progress to inform clinical decisions (e.g., most appropriate treatment) and improve patient outcomes. His work has largely focused on incidental disease assessment in low-dose CT lung screening scans, including emphysema, osteoporosis, and coronary artery calcifications, but has also included non-screening scan assessments of hypoxic ischemic brain injury and COVID-19. Jordan is a student member of both the American Association of Physicists in Medicine (AAPM) and the Society of Photo-optical Instrumentation Engineers (SPIE).

Talk Title: How to Preserve Privacy in Data Analysis?

Talk Abstract: The past decade has witnessed the tremendous success of large-scale data science. However, recent studies show that many existing powerful machine learning tools used in large-scale data science pose severe threats to personal privacy. Therefore, one of the major challenges in data analysis is how to learn effectively from enormous amounts of sensitive data without giving up on privacy. Differential Privacy (DP) has recently emerged as a new gold standard for private data analysis due to the statistical data privacy it can provide for sensitive information. Nevertheless, the adaptation of DP to data analysis remains challenging due to the complex models we often encounter in data analysis. In this talk, I will focus on two commonly used models, the centralized and the distributed/federated model, for differentially private data analysis. For the centralized model, I will present my efforts to provide strong privacy and utility guarantees in high-dimensional data analysis. For the distributed/federated model, I will discuss new efficient and effective privacy-preserving learning algorithms.
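
For reference, the standard \((\varepsilon, \delta)\)-differential privacy guarantee referred to here requires that, for any two datasets \(D\) and \(D'\) differing in a single record and any set \(S\) of outputs,

\[
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \Pr[\mathcal{M}(D') \in S] + \delta,
\]

so the randomized mechanism \(\mathcal{M}\) reveals almost nothing about whether any one individual's data was included.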

Bio: Lingxiao Wang is a final year Ph.D. student in the Department of Computer Science at the University of California, Los Angeles, advised by Dr. Quanquan Gu. Previously he obtained his MS degree in Statistics at the University of Washington. Lingxiao’s research interests are broadly in machine learning, including privacy-preserving machine learning, optimization, deep learning, low-rank matrix recovery, high-dimensional statistics, and data mining. Lingxiao aims to apply his research for social good, and he is one of the core members of the Combating COVID-19 project (https://covid19.uclaml.org/).

Talk Title: Systematic Evaluation of Privacy Risks of Machine Learning Models

Talk Abstract: Machine learning models are prone to memorizing sensitive data, making them vulnerable to membership inference attacks in which an adversary aims to guess if an input sample was used to train the model. In this talk, we show that prior work on membership inference attacks may severely underestimate the privacy risks by relying solely on training custom neural network classifiers to perform attacks and focusing only on aggregate results over data samples, such as the attack accuracy.

To overcome these limitations, we first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks and proposing a new inference attack method based on a modification of prediction entropy. Using our benchmark attacks, we demonstrate that existing membership inference defense approaches are not as effective as previously reported.

Next, we introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score. Our privacy risk score metric measures an individual sample's likelihood of being a training member, which allows an adversary to perform membership inference attacks with high confidence. We experimentally validate the effectiveness of the privacy risk score metric and demonstrate that the distribution of privacy risk scores across individual samples is heterogeneous. Our work emphasizes the importance of a systematic and rigorous evaluation of the privacy risks of machine learning models.
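
As a rough illustration of a score-based attack in this spirit (a sketch only; the exact definitions and thresholds used in the work may differ), one can compute a per-example modified prediction entropy that, unlike plain entropy, also penalizes confident but wrong predictions, and then predict "member" when the score falls below a tuned threshold:

```python
# Sketch: a per-example membership score based on a modified prediction entropy
# (illustrative of the idea; exact definitions in the talk may differ).
import numpy as np

def modified_entropy(probs, label, eps=1e-12):
    """Lower values suggest the example is more likely a training member."""
    p_y = probs[label]
    others = np.delete(probs, label)
    return -(1.0 - p_y) * np.log(p_y + eps) - np.sum(others * np.log(1.0 - others + eps))

# Hypothetical softmax outputs for true class 0.
confident_correct = np.array([0.98, 0.01, 0.01])  # typical of a memorized training point
uncertain = np.array([0.40, 0.35, 0.25])          # typical of an unseen point

print(modified_entropy(confident_correct, 0))     # small score
print(modified_entropy(uncertain, 0))             # larger score
# A simple attack predicts "member" when the score falls below a tuned threshold.
```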

Bio: Liwei Song is a fifth-year PhD student in the Department of Electrical Engineering at Princeton University, advised by Prof. Prateek Mittal. Before coming to Princeton, he received his Bachelor’s degree in Electrical Engineering from Peking University.

His current research focus is on investigating security and privacy issues of machine learning models, including membership inference attacks, evasion attacks, and backdoor attacks. His evaluation methods on membership inference have been integrated into Google’s TensorFlow Privacy library. Besides that, he has also worked on attacking voice assistants with ultrasound, which received widespread media coverage, including BBC News and New York Times.

Talk Title: Reasoning about Social Dynamics and Social Bias in Language

Watch Maarten’s Spotlight Research Talk

Talk Abstract: Humans easily make inferences to reason about the social and power dynamics of situations (e.g., stories about everyday interactions), but such reasoning is still a challenge for modern NLP systems. In this talk, I will address how we can make machines reason about social commonsense and social biases in text, and how this reasoning could be applied in downstream applications.

In the first part, I will discuss PowerTransformer, our new unsupervised model for controllable debiasing of text through the lens of connotation frames of power and agency. Trained using a combined reconstruction and paraphrasing objective, this model can rewrite story sentences such that its characters are portrayed with more agency and decisiveness. After establishing its performance through automatic and human evaluations, we show how PowerTransformer can be used to mitigate gender bias in portrayals of movie characters. Then, I will introduce Social Bias Frames, a conceptual formalism that models the pragmatic frames in which people project social biases and stereotypes onto others to reason about biased or harmful implications in language. Using a new corpus of 150k structured annotations, we show that models can learn to reason about high-level offensiveness of statements, but struggle to explain why a statement might be harmful. I will conclude with future directions for better reasoning about social dynamics and social biases.

Bio: Maarten Sap is a final year PhD student in the University of Washington’s natural language processing (NLP) group, advised by Noah Smith and Yejin Choi. His research focuses on endowing NLP systems with social intelligence and social commonsense, and understanding social inequality and bias in language. In the past, he’s interned at AI2 on project Mosaic working on social commonsense reasoning, and at Microsoft Research working on long-term memory and storytelling with Eric Horvitz.

Talk Title: Formal Logic Enhanced Deep Learning for Cyber-Physical Systems

Watch Meiyi’s Research Lightning Talk

Talk Abstract: Deep Neural Networks (DNNs) are broadly applied and have achieved outstanding results in prediction and decision-making support for Cyber-Physical Systems (CPS). However, for large-scale and complex integrated CPS with high uncertainties, DNN models are not always robust: they are often subject to anomalies and to erroneous predictions, especially when the predictions are projected into the future (uncertainty and errors grow over time). To increase the robustness of DNNs for CPS, I developed a novel formal logic enhanced learning framework with logic-based criteria that guide DNN models to satisfy critical system properties and to build well-calibrated uncertainty estimation models. Trained end-to-end with back-propagation, this framework is general and can be applied to various DNN models. Evaluation results on large-scale real-world city datasets show that my work not only improves the accuracy of predictions and the effectiveness of uncertainty estimation, but, importantly, also guarantees the satisfaction of model properties and increases the robustness of DNNs. This work can be applied to a wide spectrum of applications, including the Internet of Things, smart cities, healthcare, and many others.
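
One generic way to picture such logic-based criteria (a hedged sketch, not the framework's actual formulation) is as a differentiable penalty added to the training loss whenever a prediction violates a known system property, for example an upper bound on a monitored quantity:

```python
# Generic sketch: augmenting a prediction loss with a soft logic-style constraint.
# The bound and weighting below are hypothetical; the talk's framework is more general.
import torch
import torch.nn.functional as F

def logic_augmented_loss(pred, target, upper_bound=100.0, lam=0.1):
    mse = F.mse_loss(pred, target)
    # Penalize predictions that violate the property "output <= upper_bound".
    violation = torch.relu(pred - upper_bound).mean()
    return mse + lam * violation

pred = torch.tensor([95.0, 120.0])
target = torch.tensor([90.0, 99.0])
print(logic_augmented_loss(pred, target))
```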

Bio: Meiyi Ma is a Ph.D. candidate in the Department of Computer Science at the University of Virginia, working with Prof. John A. Stankovic and Prof. Lu Feng. Her research interest lies at the intersection of machine learning, formal methods, and cyber-physical systems. Specifically, her work integrates formal methods and machine learning and applies new integrative solutions to build safe and reliable integrated cyber-physical systems, with a focus on smart city and healthcare applications. Meiyi's research has been published in top-tier machine learning and cyber-physical systems conferences and journals, including NeurIPS, ACM TCPS, ICCPS, PerCom, etc. She has received multiple awards, including the EECS Rising Star at UC Berkeley, the Outstanding Graduate Research Award at the University of Virginia, and a Best Master's Thesis Award. She is serving as the information director for ACM Transactions on Computing for Healthcare and as a reviewer for multiple conferences and journals. She has also served on the organizing committees of several international workshops.

Talk Title: Human-AI Collaborative Decision Making on Rehabilitation Assessment

Talk Abstract: Rehabilitation monitoring systems with sensors and artificial intelligence (AI) provide an opportunity to improve current rehabilitation practices by automatically collecting quantitative data on a patient's status. However, the adoption of these systems still remains a challenge. This talk presents an interactive AI-based system that supports collaborative decision making with therapists for rehabilitation assessment. The system automatically identifies salient features of assessment to generate patient-specific analysis for therapists, and it is tuned with their feedback. In two evaluations with therapists, we found that our system supports significantly higher agreement on assessment among therapists (0.71 average F1-score) than a traditional system without analysis (0.66 average F1-score, p < 0.05). In addition, after tuning with therapists' feedback, our system significantly improves its performance (from 0.8377 to 0.9116 average F1-score, p < 0.01). This work discusses the potential of a human-AI collaborative system that supports more accurate decision making while each side learns from the other's strengths.

Bio: Min Lee is a PhD student at Carnegie Mellon University. His research interests lie at the intersection of human-computer interaction (HCI) and machine learning (ML), where he designs, develops, and evaluates human-centered ML systems to address societal problems. His thesis focuses on creating interactive hybrid intelligence systems to improve the practices of stroke rehabilitation (e.g. a decision support system for therapists and a robotic coaching system for post-stroke survivors).

Talk Title: Mathematical Models of Brain Connectivity and Behavior: Network Optimization Perspectives, Deep-Generative Hybrids, and Beyond

Talk Abstract: Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder characterized by multiple impairments and levels of disability that vary widely across the ASD spectrum. Currently, quantifying symptom severity relies almost solely on a trained clinician's evaluation. Recently, neuroimaging studies, for example using resting-state functional MRI (rs-fMRI) and Diffusion Tensor Imaging (DTI), have been gaining popularity for studying brain dysfunction. My work aims at linking the symptomatic characterization of ASD with the functional and structural organization of a patient's brain via machine learning. To set the stage, I will first introduce a joint network optimization to predict clinical severity from rs-fMRI data. Our model couples two terms, a generative matrix factorization and a discriminative regression, in a joint optimization. Next, we extend this to a deep-generative hybrid that jointly models the complementarity between structure (DTI) and functional dynamics (dynamic rs-fMRI connectivity) to extract predictive disease biomarkers. The generative part of our framework is now a structurally-regularized matrix factorization on dynamic rs-fMRI correlation matrices, guided by DTI tractography to learn anatomically informed connectivity profiles. The deep part of our framework is an LSTM-ANN, which models the temporal evolution of the scan to map to behavior. Our main novelty lies in our coupled optimization, which collectively estimates the matrix factors and the neural network weights. We outperform several state-of-the-art baselines in extracting multi-modal neural signatures of brain dysfunction. Finally, I will present our current exploration based on graph neural networks and manifold learning to better capture the underlying data geometry.
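
A hedged sketch of the general shape of such a coupled objective (my notation; the actual models differ in their regularizers and architectures):

\[
\min_{\mathbf{B},\,\{\mathbf{c}_n\},\,\mathbf{w}} \;\; \sum_{n} \Big\| \boldsymbol{\Gamma}_n - \mathbf{B}\,\mathrm{diag}(\mathbf{c}_n)\,\mathbf{B}^{\mathsf{T}} \Big\|_F^2 \;+\; \lambda \sum_{n} \big( y_n - f(\mathbf{c}_n; \mathbf{w}) \big)^2,
\]

where \(\boldsymbol{\Gamma}_n\) is patient \(n\)'s rs-fMRI correlation matrix, \(\mathbf{B}\) holds shared connectivity bases, \(\mathbf{c}_n\) are patient-specific loadings, and \(f(\cdot;\mathbf{w})\) is the regression or neural network mapping loadings to clinical severity \(y_n\); optimizing both terms jointly keeps the learned factors predictive of behavior.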

Bio: Niharika is a PhD candidate in the Department of Electrical and Computer Engineering at Johns Hopkins University. Her research interests lie at the intersection of deep learning, non-convex optimization, manifold learning, and graph signal processing applied to neuroimaging data. She has developed novel machine learning algorithms that predict behavioral deficits in patients with Autism by decoding their brain organization from their functional and structural neuroimaging scans. Prior to joining Hopkins, she obtained a bachelor's degree (B.Tech with Hons.) in Electrical Engineering with a minor in Electronics and Electrical Communications Engineering from the Indian Institute of Technology, Kharagpur.

Talk Title: Power Outage Risk Interconnection: Relationship with Social and Environmental Critical Risk Indicators

Watch Olukunle’s Research Lightning Talk

Talk Abstract: The interconnections between diverse components in a system can provide profound insights on the health and risk states of the system as a whole. Highly interconnected systems tend to accumulate risks until a large, systemic crisis hits. For example, in the 2007-09 financial crisis, the interconnection of financial institutions heightened near the collapse, suggesting the system could no longer absorb risks. Extending concepts of interconnectedness and systemic risk to coupled human-natural systems, one might expect similar behaviours of risk accumulation and heightened connectivity, leading to potential system failures. The Predictive Risk Investigation System (PRISM) for Multi-layer Dynamic Interconnection Analysis aims to explore the complex interconnectedness and systemic risks in human-natural systems.

Applying the PRISM approach, we could uncover dynamic relationships and trends in climate resilience and preparedness using energy, environmental, and social indicators. This study proposes a case-study application of the PRISM approach to the State of Massachusetts using a dataset of over 130,000 power outages in the state from 2013 to 2018. Random forests, locally weighted scatterplot smoothing (LOWESS), and generalized additive models (GAMs) are applied to understand the interconnections between power outages, population density, and environmental factors (weather indicators such as wind speed and precipitation).
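
A hedged sketch of one such modeling step (the synthetic data, column names, and parameters below are hypothetical placeholders, not the study's actual records or settings):

```python
# Sketch: relating outages to weather with a random forest and a LOWESS smoother.
# The data below are synthetic placeholders, not the study's outage records.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "wind_speed": rng.gamma(2.0, 5.0, n),             # mph
    "precipitation": rng.exponential(0.2, n),         # inches
    "population_density": rng.uniform(100, 5000, n),  # people per sq. mile
})
df["customers_out"] = 5 * df["wind_speed"] + 40 * df["precipitation"] + rng.normal(0, 20, n)

features = ["wind_speed", "precipitation", "population_density"]
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(df[features], df["customers_out"])
print(dict(zip(features, rf.feature_importances_.round(3))))

# Smooth the marginal relationship between wind speed and outages (LOWESS).
smoothed = lowess(df["customers_out"], df["wind_speed"], frac=0.3)  # (x, fitted) pairs
```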

Bio: I am a Data Scientist with domain expertise in energy: oil, gas, renewables, and power systems. With a BS in Petroleum Engineering and an MS in Sustainable Energy Systems, I have always enjoyed a data-centric approach to solving interdisciplinary problems. In my Bachelor's degree, I used neural networks to solve a practical oil-field (production engineering) problem. In my master's, I explored the potential of optimizing clean-energy microgrids in low-income, underserved communities while leveraging insights from large, messy, unstructured data. In my PhD at Tufts, I am working in an interdisciplinary team of data and domain scientists, applying data science and machine learning techniques and tools to energy, climate, financial, and ecological systems. One word to describe my experience is diversity. I am fortunate to have enjoyed a fair share of diversity in my academic and professional experience, in geography and in scope: an experience that traverses three continents, supported by a broad scientific and engineering background. This exemplifies my interest in complex, interdisciplinary, and multifaceted problems spanning science, engineering, and data science. I am enthusiastic about applying my knowledge and skills in data science to new, challenging, unfamiliar terrains to discover and garner insights and to solve problems that improve experience and affect people, communities, and organizations.

Talk Title: Data-Efficient Optimization in Reinforcement Learning

Watch Pan’s Research Lightning Talk

Talk Abstract: Optimization lies at the heart of modern machine learning and data science research. How to design data-efficient optimization algorithms that have low sample complexity while enjoying fast convergence has remained a challenging but important topic in machine learning. My research aims to answer this question from two facets: providing theoretical analysis and understanding of optimization algorithms, and developing new algorithms with strong empirical performance in a principled way. In this talk, I will introduce our recent work on developing and improving data-efficient optimization algorithms for decision-making (reinforcement learning) problems. In particular, I will introduce the variance reduction technique in optimization and show how it can improve the data efficiency of policy gradient methods in reinforcement learning. I will present the variance-reduced policy gradient algorithm, which constructs an unbiased policy gradient estimator for the value function. I will show that it provably reduces the sample complexity of vanilla policy gradient methods such as REINFORCE and GPOMDP.
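
For context, the vanilla REINFORCE gradient and the general shape of a variance-reduced update look like the following (a sketch in my notation, not the exact estimator analyzed in the talk):

\[
\nabla_\theta J(\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\!\Big[ \sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R(\tau) \Big],
\qquad
v_k \;=\; \hat{g}(\theta_k) \;-\; \omega_k\, \hat{g}(\tilde{\theta}) \;+\; \tilde{\mu},
\]

where \(\hat{g}\) is a minibatch gradient estimate, \(\tilde{\mu}\) is a larger-batch estimate at a snapshot iterate \(\tilde{\theta}\), and \(\omega_k\) is an importance weight correcting for the mismatch between trajectories sampled under \(\pi_{\theta_k}\) and \(\pi_{\tilde{\theta}}\); the correction term lowers the variance of the update relative to \(\hat{g}(\theta_k)\) alone while preserving (approximate) unbiasedness.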

Bio: Pan Xu is a Ph.D. candidate in the Department of Computer Science at the University of California, Los Angeles. His research spans the areas of machine learning, data science, and optimization, with a focus on the development and improvement of large-scale nonconvex optimization algorithms for machine learning and data science applications. Pan obtained his B.S. degree in mathematics from the University of Science and Technology of China. Pan received the Presidential Fellowship in Data Science from the University of Virginia. He has published over 20 high-quality papers at top machine learning conferences and in journals such as ICML, NeurIPS, ICLR, AISTATS, and JMLR.

Talk Title: Efficient Neural Question Answering for Heterogeneous Platforms

Watch Qingqing’s Research Lightning Talk

Talk Abstract: Natural language processing (NLP) systems power many real-world applications like Alexa, Siri, Google, and Bing. Deep learning NLP systems are becoming more effective due to increasingly larger models with multiple layers and millions to billions of parameters. It is challenging to deploy these systems because they are compute-intensive, consume much more energy, and cannot run on mobile devices. In this talk, I will present two works on optimizing efficiency in question answering systems and my current research on studying the energy consumption of large NLP models. First, I will introduce DeQA, which provides an on-device question-answering capability to help mobile users find information more efficiently without privacy issues. Deep learning based QA systems are slow and unusable on mobile devices; we design latency and memory optimizations, widely applicable to state-of-the-art QA systems, so they can run locally on mobile devices. Second, I will present DeFormer, a simple decomposition-based technique that takes pre-trained Transformer models and modifies them to enable faster inference for QA, both in the cloud and on mobile. Lastly, I will introduce how we can accurately measure the energy consumption of NLP models using hardware power meters and build reliable energy estimation models by abstracting meaningful features of the NLP workloads and profiling runtime resource usage.

Bio: Qingqing Cao is a graduating Computer Science Ph.D. candidate at Stony Brook University. His research interests include natural language processing (NLP), mobile computing, and machine learning systems. He has focused on building efficient and practical NLP systems for both edge devices and the cloud, such as on-device question answering (MobiSys 2019), faster Transformer models (ACL 2020), and accurate energy estimation of NLP models. He has two fantastic advisors: Prof. Aruna Balasubramanian and Prof. Niranjan Balasubramanian. He is looking for postdoc openings in academia or research positions in the industry.

Talk Title: Artificial Intelligence for Medical Image Analysis for Breast Cancer Multiparametric MRI

Watch Isabelle’s Spotlight Research Talk

Talk Abstract: Artificial intelligence is playing an increasingly important role in medical imaging. Computer-aided diagnosis (CADx) systems using human-engineered features or deep learning can potentially assist radiologists in image interpretation by extracting quantitative biomarkers to improve diagnostic performance and circumvent unnecessary invasive procedures. Multiparametric MRI (mpMRI) has become a part of routine clinical assessment for screening of high-risk patients for breast cancer and monitoring therapy response because it has been shown to improve diagnostic accuracy. Current CADx methods for breast lesion assessment on MRI, however, are mostly focused on one sequence, the dynamic contrast-enhanced (DCE)-MRI. Therefore, we investigated methods for incorporating three sequences in mpMRI to improve the CADx performance in differentiating benign and malignant breast lesions. We compared integrating the mpMRI information at the image level, feature level, or classifier output level. In addition, transfer learning is often employed in deep learning applications in medical imaging due to data scarcity. However, pretrained convolutional neural networks (CNNs) used in transfer learning require two-dimensional (2D) inputs, limiting the ability to utilize high-dimensional information in medical imaging. To address this problem, we investigated a transfer learning method that collapses volumetric information to 2D by taking the maximum intensity projection (MIP) at the feature level within CNNs, which outperformed a previous method of using MIPs of images themselves in the task of distinguishing between benign and malignant breast lesions. We proposed a method that combines feature fusion and feature MIP for computer-aided breast cancer diagnosis using high-dimensional mpMRI that outperforms the current benchmarks.
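
A minimal sketch of the feature-level MIP idea (illustrative backbone and shapes, not the exact model from this work): a 2D CNN encodes every slice of the volume, the element-wise maximum is taken across slices in feature space, and a small classifier head operates on the pooled feature vector.

```python
# Sketch: feature-level maximum intensity projection (MIP) within a CNN (illustrative).
import torch
import torch.nn as nn
from torchvision.models import resnet18

class FeatureMIPClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)  # a pretrained backbone would be used in practice
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # slice -> 512-d feature
        self.head = nn.Linear(512, 1)

    def forward(self, volume):                    # volume: (num_slices, 3, H, W)
        feats = self.encoder(volume).flatten(1)   # (num_slices, 512)
        pooled = feats.max(dim=0).values          # feature-level MIP across slices
        return self.head(pooled)                  # benign-vs-malignant logit

print(FeatureMIPClassifier()(torch.randn(60, 3, 224, 224)).shape)  # torch.Size([1])
```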

Bio: Isabelle is a PhD candidate in Medical Physics at the University of Chicago, supervised by Dr. Maryellen Giger. Her research is centered around developing automated methods for quantitative medical image analysis to assist in clinical decision-making. She has proposed novel methodologies to diagnose breast cancer using multiparametric MRI exams. Since the pandemic, she has also been working on AI solutions that leverage medical images to enhance the early detection and prognosis of COVID-19. She has first-hand experience tackling unique challenges faced by medical imaging applications of machine learning due to high dimensionality, data scarcity, noisy labels, etc. She loves working at the intersection of physics, medicine, and data science, and she is motivated by the profound potential impact that her research can have on improving access to high-quality care and providing a proactive healthcare system. She hopes to dedicate her career to building AI-empowered technology to transform healthcare, accelerate scientific discoveries, and improve human well-being.

Talk Title: Asymptotically Optimal Exact Minibatch Metropolis-Hastings

Talk Abstract: Metropolis-Hastings (MH) is one of the most fundamental Bayesian inference algorithms, but it can be intractable on large datasets due to requiring computations over the whole dataset. In this talk, I will discuss minibatch MH methods, which use subsamples to enable scaling. First, I will talk about existing minibatch MH methods, and demonstrate that inexact methods (i.e. they may change the target distribution) can cause arbitrarily large errors in inference. Then, I will introduce a new exact minibatch MH method, TunaMH, which exposes a tunable trade-off between its batch size and its theoretically guaranteed convergence rate. Finally, I will present a lower bound on the batch size that any minibatch MH method must use to retain exactness while guaranteeing fast convergence—the first such bound for minibatch MH—and show TunaMH is asymptotically optimal in terms of the batch size.
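
For background, a minimal sketch of standard full-batch MH with a random-walk proposal is shown below; the point of minibatch methods such as TunaMH (whose acceptance rule is not reproduced here) is to avoid the two full-data log-target evaluations in every step.

```python
# Sketch: standard (full-batch) random-walk Metropolis-Hastings for a 1-D target.
# Background only; TunaMH's minibatch acceptance rule is not shown.
import numpy as np

def log_target(theta, data):
    # Example: unnormalized Gaussian posterior for a mean with a flat prior.
    return -0.5 * np.sum((data - theta) ** 2)

def metropolis_hastings(data, n_steps=5000, step=0.2, seed=0):
    rng = np.random.default_rng(seed)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = theta + step * rng.normal()
        # Accept with probability min(1, target(proposal) / target(theta)).
        if np.log(rng.uniform()) < log_target(proposal, data) - log_target(theta, data):
            theta = proposal
        samples.append(theta)
    return np.array(samples)

data = np.random.default_rng(1).normal(2.0, 1.0, size=100)
print(metropolis_hastings(data).mean())  # close to the data mean (about 2.0)
```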

Bio: Ruqi Zhang is a fifth-year Ph.D. student in Statistics at Cornell University, advised by Professor Chris De Sa. Her research interests lie in probabilistic modeling for data science and machine learning. She currently focuses on developing fast and robust inference methods with theoretical guarantees and their applications with modern model architectures, such as deep neural networks, on real-world big data. Her work has been published in top machine learning venues such as NeurIPS, ICLR and AISTATS, and has been recognized through an Oral Award at ICLR and two Spotlight Awards at NeurIPS.

Talk Title: Towards Global-Scale Biodiversity Monitoring – Scaling Geospatial and Taxonomic Coverage Using Contextual Clues

Watch Sara’s Research Lightning Talk

Talk Abstract: Biodiversity is declining globally at unprecedented rates. We need to monitor species in real time and in greater detail to quickly understand which conservation efforts are most effective and take corrective action. Current ecological monitoring systems generate data far faster than researchers can analyze it, making scaling up impossible without automated data processing. However, ecological data collected in the field presents a number of challenges that current methods, like deep learning, are not designed to tackle. Biodiversity data is correlated in time and space, resulting in overfitting and poor generalization to new sensor deployments. Environmental monitoring sensors have limited intelligence, resulting in objects of interest that are often too close or too far, blurry, or in clutter. Further, the distribution of species is long-tailed, which results in highly imbalanced datasets. These challenges are not unique to the natural world; advances in any one of these areas will have far-reaching impact across domains. To address these challenges, we take inspiration from the value of additional contextual information for human experts and seek to incorporate it within the structure of machine learning systems. Incorporating species distributions and access to data collected within a sensor at inference time can improve generalization to new sensors without additional human data labeling. Going beyond single-sensor deployments, there is a large degree of contextual information shared across multiple data streams. Our long-term goal is to develop learning methods that efficiently and adaptively benefit from many different data streams on a global scale.

Bio: Sara Beery has always been passionate about the natural world, and she saw a need for technology-based approaches to conservation and sustainability challenges. This led her to pursue a PhD at Caltech, where she is advised by Pietro Perona and funded by an NSF Graduate Research Fellowship, a PIMCO Fellowship in Data Science, and an Amazon/Caltech AI4Science Fellowship. Her research focuses on computer vision for global-scale biodiversity monitoring. She works closely with Microsoft AI for Earth and Google Research to translate her work into usable tools, including widely-used models and benchmarks for detection and recognition of animal species in challenging camera trap data at a global scale. She has worked to bridge the interdisciplinary gap between ecology and computer science by hosting the iWildCam challenge at the FGVC Workshop at CVPR from 2018 to 2021, and through founding and managing a highly successful AI for Conservation Slack channel which provides a meeting point for experts from each community to discuss new methods and best practices for conservation technology. Sara's prior experience as a professional ballerina and a nontraditional student has taught her the value of unique and diverse perspectives in the research community. She's passionate about increasing diversity and inclusion in STEM through mentorship and outreach.

Talk Title: Promoting Worker Performance with Human-Centered Data Science

Watch Teng’s Spotlight Research Talk

Talk Abstract: Addressing real-world problems about human behavior is one of the main approaches where advances in data science techniques and social science theories achieve the greatest social impact. To approach these problems, we propose a human-centered data science framework that synergizes strengths across machine learning, causal inference, field experiment, and social science theories to understand, predict, and intervene in human behavior. In this talk, I will present three empirical studies that promote worker performance with human-centered data science. In the first project, we work with New York City’s Mayor’s Office and deploy explainable machine learning models to predict the risk of tenant harassment in New York City. In the second project, we leverage insights from social identity theory and conduct a large-scale field experiment on DiDi, a leading ride-sharing platform, showing that the intervention of bonus-free team ranking/contest systems can improve driver engagement. Third, to further unpack the effect of team contests on individual DiDi drivers, we bring together causal inference, machine learning, and social science theories to predict individual treatment effects. Insights from this study are directionally actionable to improve team recommender systems and contest design. More promising future directions will be discussed to showcase the effectiveness and flexibility of this framework.

Bio: I am a final-year Ph.D. candidate at the School of Information, University of Michigan, Ann Arbor, working with Professor Qiaozhu Mei. My research focuses on human-centered data science, where I couple data science techniques and social science theories to address real-world problems by understanding, predicting, and intervening in human behavior.

Specifically, I synergize strengths across machine learning, causal inference, field experiments, and social science theories to solve practical problems in the areas of data science for social good, the sharing economy, crowdsourcing, crowdfunding, social media, and health. For example, we have collaborated with the New York City Mayor’s Office and helped prioritize government outreach to tenants vulnerable to landlord harassment in New York City by deploying machine learning models. In collaboration with Didi Chuxing, a leading ride-sharing platform, we have leveraged field experiments and machine learning models to enhance driver engagement and intervention design. The results of my work have been integrated into real-world products that serve millions of users and have been published across data mining, social computing, and human-computer interaction venues.

Talk Title: PAPRIKA: Private Online False Discovery Rate Control

Watch Wanrong’s Spotlight Research Talk

Talk Abstract: In hypothesis testing, a false discovery occurs when a hypothesis is incorrectly rejected due to noise in the sample. When adaptively testing multiple hypotheses, the probability of a false discovery increases as more tests are performed. Thus the problem of False Discovery Rate (FDR) control is to find a procedure for testing multiple hypotheses that accounts for this effect in determining the set of hypotheses to reject. The goal is to minimize the number (or fraction) of false discoveries, while maintaining a high true positive rate (i.e., correct discoveries).
In this work, we study FDR control in multiple hypothesis testing under the constraint of differential privacy for the sample. Unlike previous work in this direction, we focus on the online setting, meaning that a decision about each hypothesis must be made immediately after the test is performed, rather than waiting for the output of all tests as in the offline setting. We provide new private algorithms based on state-of-the-art results in non-private online FDR control. Our algorithms have strong provable guarantees for privacy and statistical performance as measured by FDR and power. We also provide experimental results to demonstrate the efficacy of our algorithms in a variety of data environments.
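
For intuition, the sketch below runs a toy alpha-investing loop (in the spirit of the non-private online procedures this line of work builds on), adding Laplace noise to each test statistic as a stand-in for a privacy mechanism. It is an illustration only: it is not the PAPRIKA algorithm, and it does not carry the paper's formal privacy or FDR guarantees.

```python
# Toy online testing loop in the style of alpha-investing (Foster & Stine, 2008),
# with Laplace noise on the test statistic as a stand-in for a privacy mechanism.
# Illustration only -- NOT the PAPRIKA algorithm.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, wealth, reward = 0.05, 0.05, 0.05    # target level, initial alpha-wealth, reward per rejection
epsilon, sensitivity = 1.0, 1.0             # assumed per-test privacy budget and statistic sensitivity

rejections = []
for j in range(20):
    sample = rng.normal(loc=1.0 if j % 4 == 0 else 0.0, size=100)  # synthetic data stream
    z = np.sqrt(len(sample)) * sample.mean()                       # test statistic for H0: mean = 0
    z_private = z + rng.laplace(scale=sensitivity / epsilon)       # noised statistic
    p = 2 * stats.norm.sf(abs(z_private))                          # two-sided p-value from noised statistic

    level = wealth / (1 + wealth) / 2       # spend only a fraction of the current wealth
    if p <= level:
        rejections.append(j)
        wealth += reward                     # earn wealth back on a discovery
    else:
        wealth -= level / (1 - level)        # pay for the test when we fail to reject

print("Rejected hypotheses:", rejections)
```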

Bio: Wanrong Zhang is a PhD candidate at Georgia Tech supervised by Rachel Cummings and Yajun Mei. Her research interests lie primarily in data privacy, with connections to statistics and machine learning. Her research focuses on designing privacy-preserving algorithms for machine learning models and statistical analysis tools, as well as identifying and preventing privacy vulnerabilities in modern collaborative learning. Before joining Georgia Tech, she received her B.S. in Statistics from Peking University.

Talk Title: Towards Better Informed Extraction of Events from Documents

Watch Xinya’s Spotlight Research Talk

Talk Abstract: Large amounts of text are written and published online every day. As a result, applications that automatically read through documents to extract useful information and answer user questions are increasingly needed to help people absorb information efficiently. In this talk, I will focus on the problem of finding and organizing information about events and introduce my recent research on document-level event extraction. First, I’ll briefly summarize the high-level goal and several key challenges (including modeling context and better leveraging background knowledge), as well as my efforts to tackle them. Then I will focus on work in which we formulate event extraction as a question answering problem, both to access relevant knowledge encoded in large models and to reduce the cost of human annotation required to create training data.
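
The question-answering framing can be illustrated with an off-the-shelf extractive QA model, where each event role becomes a natural-language question asked against the document; the document, roles, and question templates below are invented, and the generic model is only a stand-in for the models trained in this work.

```python
# Toy illustration of casting event argument extraction as extractive QA.
# Uses a generic off-the-shelf QA model, not the models trained in this work;
# the document, event roles, and question templates below are invented examples.
from transformers import pipeline

qa = pipeline("question-answering")  # loads a default extractive QA model

document = (
    "Protesters gathered outside city hall on Tuesday, and police arrested "
    "three organizers after the crowd refused to disperse."
)

# One natural-language question per event role we want to fill.
role_questions = {
    "AGENT": "Who made the arrest?",
    "PERSON": "Who was arrested?",
    "TIME": "When did the arrest happen?",
}

for role, question in role_questions.items():
    answer = qa(question=question, context=document)
    print(f"{role:7s} -> {answer['answer']!r} (score {answer['score']:.2f})")
```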

Bio: Xinya Du is a Ph.D. candidate in the Computer Science Department at Cornell University, advised by Prof. Claire Cardie. He received a bachelor’s degree in Computer Science from Shanghai Jiao Tong University. His research is on natural language processing, especially methods that enable learning with fewer annotations for document-level information extraction. His work has been published in leading NLP conferences and has been covered by New Scientist and TechRepublic.

Talk Title: Understanding Success and Failure in Science and Technology

Talk Abstract: Twenty-first-century society is largely driven by science and innovation, but our quantitative understanding of why, how, and when innovators and innovations succeed or fail remains limited. Despite the long-standing interest in this topic, current science of science research relies on citation and publication records as its major data sources. Yet science functions as a complex system that is much more than published papers, and ignoring this multidimensional nature precludes a deeper examination of many fundamental elements of innovation lifecycles, from failure to scientific breakthrough, from public funding to broad impact. In this talk, I will touch on a few examples of success and failure across science and technology, hoping to illustrate a path toward a better understanding of the full innovation lifecycle. By combining various large-scale datasets and interdisciplinary analytical frameworks rooted in data mining, statistical physics, and computational social science, we discover a series of fundamental mechanisms and signals underlying the processes in which (1) individuals and organizations build on previous repeated failures towards ultimate victory or defeat in science, startups, and security; (2) scientific elites produce breakthrough discoveries in their scientific careers; and (3) scientific research gets funded and used by the general public. The uncovered patterns in these studies not only unveil regularity and predictability underlying these often-noisy social systems but also offer a new theoretical and empirical basis that is practically relevant for individual scientists, research institutes, and innovation policymakers.

Bio: Yian Yin is a Ph.D. candidate in Industrial Engineering & Management Sciences at Northwestern University, advised by Dashun Wang and Noshir Contractor. He also holds affiliations with the Northwestern Institute on Complex Systems and the Center for Science of Science and Innovation. Prior to joining Northwestern, he received his bachelor’s degrees in Statistics and Economics from Peking University in 2016.

Yian studies computational social science, with a particular focus on integrating theoretical insights from innovation studies, computational tools from data science, and modeling frameworks from complex systems to examine various fundamental elements of innovation lifecycles, from the dynamics of failure to the emergence of scientific breakthroughs, and from public funding for science to broad uses of science in public domains. His research has been published in multidisciplinary journals including Science, Nature, Nature Human Behaviour, and Nature Reviews Physics, and has been featured in Science, The Lancet, Forbes, The Washington Post, Scientific American, Harvard Business Review, and MIT Technology Review, among other outlets.

Talk Title: Towards Interpretable Machine Learning by Human Knowledge Reasoning

Talk Abstract: Despite the great success of statistical learning theories in building intelligent systems, a long-standing challenge of artificial intelligence remains: bridging the gaps between what machines know, what humans think machines know, and what humans know about the real world. To do so, we need to first ground machines’ prior knowledge in human knowledge and then perform explicit reasoning over it for various downstream tasks, yielding more interpretable machine learning.

In this talk, I will briefly present two pieces of my work that leverage human expert and commonsense knowledge reasoning to increase the interpretability and transparency of machine learning models in the field of natural language processing. First, I will show how existing cognitive theories of human memory can inspire an interpretable framework for rationalizing medical relation prediction based on expert knowledge. Second, I will introduce how we can learn better word representations based on commonsense knowledge and reasoning. Our proposed framework learns a commonsense reasoning module guided by a self-supervision task and provides word-pair and single-word representations distilled from the learned reasoning modules. Both works offer reasoning paths that justify their decisions and boost model interpretability in a way humans can understand with minimal knowledge barriers.

Bio: Zhen Wang is a Ph.D. student in the Department of Computer Science and Engineering at The Ohio State University, advised by Prof. Huan Sun. His research centers on natural language processing, data mining, and machine learning, with an emphasis on information extraction, question answering, graph learning, text understanding, and interpretable machine learning. In particular, he is interested in improving the trustworthiness and generalizability of data-driven machine learning models through interpretable and robust knowledge representation and reasoning. He has published papers in several top-tier data science conferences, such as KDD, ACL, and WSDM, as well as in journals such as Bioinformatics. He conducts interdisciplinary research that connects artificial intelligence with cognitive neuroscience, linguistics, software engineering, and medical informatics.