  • Overview

    Apply to the 2021 CDAC Summer Lab | Register: Information Session Jan. 28

    OVERVIEW

    A summer research opportunity for high school and undergraduate students focusing on rigorous, applied, interdisciplinary data science research and rooted in a cohort community.

    The Data & Computing Summer Lab is an immersive 10-week paid summer research program at the University of Chicago. In the program, high school and undergraduate students are paired with a data science mentor in various domains, including: computer science, data science, social science, climate and energy policy, public policy, materials science, and biomedical research. Through this pairing the research assistant will engage with and hone their skills in research methodologies, practices, and teamwork. We encourage participation from a broad range of students, and require no prior research experience to apply.

    BENEFITS

    Students in the program are immersed in a research lab and given unparalleled, first-hand access to impactful, applied data science research. Students will gain not only an understanding of fundamental data science methodologies but specialized training within the application areas specific to their lab’s research thrust. Students are asked to practice communicating their research findings throughout the summer, culminating in final videos. The final videos are presented during an end-of-summer symposium, which is run like a professional conference and provides students a chance to field questions about their project and share the outcomes of their research projects. Students also engage in professional development and training that can help them prepare for future careers in data science and computing. Additionally, many alumni continue research work with their mentor after the program ends.

    COHORT

    In the program, students are welcomed into a cohort of their peers who represent diverse backgrounds, interests, and ambitions. Through near-peer mentoring, social gatherings, and group work on projects, students in this cohort not only become better trained data and computational scientists, but better equipped to tackle any challenges ahead through their experience with group work and collaboration. Students meet weekly in small thematic groups called “clusters” to discuss progress, ask questions, and hear about each other’s projects.

    MOTIVATION

    Broadening participation in data science and computing, especially among underrepresented groups, is essential not only for equalizing opportunities but envisioning – and creating – a future that is truly representative of the world around us. Computational work is often stereotyped as people working alone writing code, when in reality data science is a team sport, inherently interdisciplinary, and in constant conversation with real-world issues to achieve measurable, meaningful impact. We aim to train and immerse students in the research lifecycle, and prepare them for critical transitions and sustained career paths.

    PROGRAMMING

    To supplement their research work, we provide an exciting array of programming for students during the summer. A highlight of the summer programming is a weekly speaker series featuring researchers at the forefront of data science and computing. Speakers address topics ranging from their own unique and unconventional paths to data science research, to their innovative approaches to tackling important, impactful research questions. Students have the chance not only to hear from first-class speakers but also to introduce and be in conversation with them. In the 2020 program, we hosted 26 different speakers from a wide array of data science domains. You can watch select talks from the 2020 speaker series here.

    ALUMNI

    Summer Lab alumni have been co-authors on published papers and posters, created apps and software tools used by thousands of people, and pursued a variety of future paths within research and beyond. Check out the Project Profiles to learn more about previous student cohorts, and watch videos overviewing their summer research projects. Summer Lab alumna Aarthi Koripelly (’19, ’20) shared this about her experience in the program:

    The CDAC Summer Lab was a great experience for me to have exposure to the applications of computer science in other domains and gain technical knowledge. My projects have helped me hone my research and communication skills in writing reports, presenting to others, and submitting to a conference, which would not have been possible without the opportunities CDAC has provided.

    Read about the 2020 Summer Lab program. Summer Lab graphics designed by Angela Liu.

  • Application

    Apply to the 2021 CDAC Summer Lab

    Application Timeline

    • Application Deadline: February 28th, 2021, 11:59pm CT
    • Notification Deadline: early April 2021

    Application Overview

    • Research Areas of Interest + Skills Evaluation
      • Select + explain interest in research areas (definitions provided)
      • Self-evaluation of core computational tools and skills (scale 1-5)
      • Relevant coursework (including but not limited to topics in CS, data science, statistics, math) + other experiences
    • Short Answer Questions
      • Internship Goals: What are your big picture academic goals, and what role does this program in particular play in them? This can be broad and far-reaching, but it should be clear why data and computing research plays a role. (500 words max)
      • Teamwork + Collaboration: Describe a time when you were part of a team working on a project. How did you work together and what role did you play? What was a challenge you faced as a member of that team? What’s one thing you’re proud of from that experience? (300 words max)
      • Project Description (new applicants): Describe a project you’ve undertaken. It can be a final project from a class, a side project, or one from a previous research program. In detail, describe: (1) the goals of the project and your approach; (2) tools you used and why you used them; (3) one challenge you faced and how you addressed it; (4) one achievement you’re particularly proud of; and (5) any outcomes or tangible results of the project (not required). With this question, more value will be placed on how you approached the project, rather than how advanced or technical it is. (750 words max)
      • Research Project Description (returning alumni): Describe your research project from last summer. In detail, describe: (1) the goals of the project; (2) the tools you used; (3) one challenge you faced and how you addressed it; (4) one achievement you’re particularly proud of; and (5) any outcomes or tangible results of the project. (750 words max)

    Review Criteria

    • Intellectual Curiosity: Evident interest in data science and the applied domain areas chosen.
    • Skills Baseline: Familiarity with at least one programming language, with self-evaluated skills ratings reflected in the CV, relevant coursework, or other experiences.
    • Initiative + Teamwork: Student has acted upon interest by pursuing available options and opportunities for computational and data science classes, training, and programs, and has successfully worked as part of a team before.
    • Research Aptitude: Creativity and curiosity, self-direction, goal-oriented and adaptable work ethic, resilient problem solving, time management and communication skills.
    • Program Fit: Clear why this program versus others is uniquely valuable to the applicant. Student is uniquely positioned to benefit from the program due to lack of access to similar programs at their home institution, or potential for growth.

    Due to the volume of applications we receive, we will be unable to provide individual feedback on applications that are not accepted.

    We look forward to your application. Please use the link below to apply:

    Apply to the 2021 CDAC Summer Lab

  • Project Profiles

    Watch the final videos from the 2020 Summer Lab cohort here, and read more about their projects below.

    2020 Program Cohort

    Project: Extracting Scientific Information from Free Text Articles

    Mentor: Kyle Chard, Globus Labs

    Research Area Keywords: Machine Learning & AI // Systems // Medicine & Health

    Project Description: Aarthi Koripelly is an incoming freshman at the University of Chicago studying computer science and statistics, a 2020 Coca-Cola Scholar, and a previous intern in the 2019 program. This summer, Aarthi worked in the Globus Labs research group with Dr. Kyle Chard and Zhi Hong on a project that explored scalable approaches for automatically extracting relations from scientific papers (e.g., melting point of a polymer). The project implemented a dependency parser-based relation extraction model to understand relationships without the need for a Named Entity tagger and integrated several word embedding models and custom tokenization to boost learning performance for scientific text.

    Project: Chameleon-Sage Image

    Mentors: Rajesh Sankaran & Kate Keahey, Argonne National Laboratory

    Research Area Keywords: Systems // Cloud Computing

    Project Description: Akhil Kodumuri is a sophomore at the University of Illinois at Urbana-Champaign majoring in computer engineering. This summer, he worked with Drs. Rajesh Sankaran and Kate Keahey on creating an image containing all of the Sage software stacks and edge plugins so that users can interact with Sage’s platform. This image is intended to be compatible with hardware on the Chameleon platform.

    Project: Combating Misinformation On Twitter Using NLP and Graph Structures

    Mentor: Nick Feamster, Department of Computer Science/Center for Data & Computing

    Research Area Keywords: Machine Learning & AI // Internet of Things

    Project Description: Alex Levi is a student at the University of Chicago pursuing a joint BA/MS program, majoring in mathematics in the College and beginning his first year in the Masters in Computational Social Science (MACSS) program. This summer, he worked with Prof. Nick Feamster on a project focused on building software that detects linkage structures of misinformation on Twitter. Using NLP semantics and network analysis, he built the groundwork for an algorithm that will be able to detect information divergence, both semantically and structurally. He hopes to continue developing this algorithm in the future.

    Project: Max-Flow Min-Cut Theorem with Dynamic Trees

    Mentor: Lorenzo Orecchia, Department of Computer Science

    Research Area Keywords: Machine Learning // Algorithms & Optimization // Computer Science Theory

    Project Description: Andrew Razborov is a junior at the University of Chicago Laboratory Schools. This summer, he worked with Prof. Lorenzo Orecchia and Konstantinos Ameranis on a project working to solve the max-flow min-cut problem using dynamic trees, which maintain paths from the root to the source as a forest of vertex-disjoint trees.

    Project: Developing RNAseq Pipelines

    Mentors: Jean-Baptiste Reynier & Anna Woodard, Olopade Lab

    Research Area Keywords: Medicine & Health // Scientific Computing // Systems

    Project Description: Arvind Krishnan is a senior at the University of Chicago studying molecular engineering and biological sciences. This summer, he worked with Jean-Baptiste Reynier and Anna Woodard in the Olopade Lab on a project that consisted of creating a pipeline to analyze RNAseq data composed of sequenced mRNA from breast cancer patients. The pipeline generates an expression profile, uses it to classify tumors into their subtypes, and quantifies the immune cell types in the microenvironment of the tumor.

    Project: funcX Chameleon Burstability

    Mentor: Kyle Chard, Globus Labs

    Research Area Keywords: Systems // Cloud Computing

    Project Description: Avery Schwartz is an incoming freshman at Northwestern University studying computer science, and was previously a student at the University of Chicago Lab Schools as well as a 2019 CDAC intern. This summer, Avery worked in the Globus Labs research group with Dr. Kyle Chard and Matt Baughman on designing resources and a script to allow funcX to burst out to new nodes using Chameleon Cloud.

    Project: Radiomic Texture Analysis of Immunofluorescence Images of Lupus Nephritis Biopsies to Predict Patient Progression to End Stage Renal Disease

    Mentor: Maryellen Giger, Department of Radiology

    Research Area Keywords: Machine Learning & AI // Image Analysis // Medicine & Health

    Project Description: Bradie Ferguson is a pre-med senior at the University of Washington studying bioengineering and chemistry. This summer, she continued work with Drs. Maryellen Giger and Madeleine Durkee on an image analysis project on microscopic images of lupus nephritis biopsies. The goal was to create a multi-feature classifier that can distinguish between patients that progressed to end stage renal disease (ESRD+) and those that did not progress (ESRD-). To accomplish this, radiomic texture analysis was utilized with future plans of using machine learning.

    Project: Measuring Race and Gender in Children’s Books

    Mentors: Anjali Adukia & H. Birali Runesha, Harris School of Public Policy & Research Computing Center

    Research Area Keywords: Machine Learning & AI // Image Analysis // Society & Policy

    Home Institution: The University of Chicago

    Project Description: Callista Christ is a recent graduate of the College at the University of Chicago, where she majored in Physics and Astrophysics. This summer, she continued work with Drs. Anjali Adukia and Teodora Szasz on a CDAC Discovery Grant project seeking to measure messages about race and gender in children’s books. She worked on classifying race, gender, and age in cartoons in children’s books, and analyzing how those classifications change over time. She also worked on analyzing how the sentiment around homeschooling, Trump, and COVID-19 in general has changed since January.

    Project: Spot Market Prediction

    Mentor: Kyle Chard, Globus Labs

    Research Area Keywords: Machine Learning // Scientific Computing

    Project Description: Tala Germani is a sophomore at the University of Chicago studying computational and applied mathematics (CAM) and economics. This summer, she worked with Dr. Kyle Chard and Matt Baughman in Globus Labs on a project aiming to help users predict price changes in the Amazon spot market. She developed an interactive notebook that summarizes and visualizes past pricing data for users.

    Project: Xtract NLP

    Mentor: Kyle Chard, Globus Labs

    Research Area Keywords: Machine Learning // Scientific Computing

    Project Description: Chimaobi Amanchukwu is a senior at George Bush High School. This summer, he worked with Dr. Kyle Chard and Tyler Skluzacek in Globus Labs on Xtract NLP, a software tool that takes a folder of different scientific papers and clusters them based on topics. Xtract NLP is customizable for the user and showcases different graphs for insights.

    Project: Self-Driving Trigger For Large Hadron Collider (LHC) Data

    Mentors: Yuxin Chen & David Miller, Department of Computer Science & Department of Physics

    Research Area Keywords: Machine Learning & AI // Physics & Astronomy

    Project Description: Chinmaya Mahesh is a junior at the University of Illinois at Urbana-Champaign majoring in computer science. This summer, he worked with Drs. Yuxin Chen and David Miller on a CDAC Discovery Grant research project titled, “A Data-Driven Trigger System for the Large Hadron Collider.” The project aims to build a machine learning-powered replacement for the current trigger system. The main goal of this summer’s work was to finish the first step: building an explainable AI model that can interpret and explain, in a cost-effective way, the decisions of a machine learning-based trigger system.

    Project: ML Approaches to Reduce Voice Bias

    Mentor: Ben Zhao, SAND Lab

    Research Area Keywords: Machine Learning & AI // Medicine & Health

    Project Description: Christina Tuttle is a junior at Yale University studying computer science and global affairs. This summer, she worked with Prof. Ben Zhao in the SAND Lab on a project to determine what causes voice bias, and whether machine learning can be used to reduce bias in samples.

    Project: Fairness Jupyter Study

    Mentors: Blase Ur & Nick Feamster, SUPERGroup & Department of Computer Science/Center for Data & Computing

    Research Area Keywords: Machine Learning & AI // Internet of Things // Security & Privacy

    Project Description: Christine Jacinto is a senior at Lane Tech College Prep, and a previous intern in the 2019 CDAC Summer Lab program. This summer, she continued work on two projects: one with Prof. Blase Ur in the SUPERGroup Lab focused on creating an experiment design to test a Jupyter notebook plugin; and a second one with Prof. Nick Feamster centered on capturing network traffic for IoT devices to test firewall rules. In her final video, Christine speaks to the first project on the Jupyter plugin.

    Project: Fairness in Machine Learning

    Mentor: Blase Ur, SUPERGroup

    Research Area Keywords: Machine Learning & AI // Security & Privacy

    Project Description: Daniel Serrano is a sophomore at the University of Chicago majoring in computer science. This summer, he worked with Prof. Blase Ur and Galen Harrison in the SUPERGroup Lab on a project developing a Jupyter plugin called Retrograde that can be used by data scientists to create fairer machine learning (ML) models. Rather than testing the model for fairness after the model is created, Retrograde intervenes during the ML building process helping data scientists to think about and document fairness in relation to the data they are working with. His work this summer consisted of creating a study design to help develop Retrograde with the hopes of collecting and analyzing qualitative data from interviews of different ML stakeholders.

    Project: Distance Matters

    Mentor: Eamon Duede, Knowledge Lab

    Research Area Keywords: Data Analysis // Spatial Data // Scientific Computing

    Project Description: Dimitriy Leksanov is a junior at the University of Chicago studying computational and applied mathematics (CAM) and economics. This summer, he worked with Eamon Duede in the Knowledge Lab on a project exploring how various dimensions of distance affect the influence that one academic’s work has on another. These include physical distance, cultural distance, temporal distance, and distance by knowledge practice. The project explored the last of these by using word embedding models to calculate the similarities between different academic papers.

    Project: Spatial Insights in GeoDa

    Mentor: Julia Koschinsky, Center for Spatial Data Science

    Research Area Keywords: Data Analysis // Spatial Data // Scientific Computing

    Project Description: Felix Farb is a junior at Walter Payton College Prep. This summer, he worked with Dr. Julia Koschinsky in the Center for Spatial Data Science on spatial data science research into the causes of racial diversity in Chicago. Additionally, he worked on creating a framework that guides students through the process of hypothesis creation so they can do productive research of their own.

    Project: DeepScribe

    Mentors: Sanjay Krishnan & Miller Prosser & Sandra Schloen & Susanne Paulus // Department of Computer Science, OCHRE Data Service at the Oriental Institute, Department of Near Eastern Languages & Civilizations

    Research Area Keywords: Image Analysis // Machine Learning & AI

    Project Description: Grace Su is a sophomore at Columbia University majoring in computer science. This summer, she worked with Drs. Sanjay Krishnan, Sandra Schloen, and Miller Prosser on a CDAC Discovery Grant project titled, “Deciphering Cuneiform with Artificial Intelligence.” She worked on researching and developing DeepScribe, a tool that deciphers cuneiform with artificial intelligence, using a training set of 100,000+ images from the Oriental Institute’s OCHRE data service. She developed an image classification model with Keras, built a Python module for the computer vision pipeline, and performed experiments to investigate and improve the computer vision model.

    Project: Website Templates for OCHRE Archaeology Projects

    Mentors: Miller Prosser & Sandra Schloen, OCHRE Data Service at the Oriental Institute

    Research Area Keywords: Data Analysis // Systems

    Project Description: Helena Abney-McPeek is an undergraduate at Harvard University studying computer science, and a previous intern in the 2018 and 2019 Summer Lab programs. This summer, she worked with Drs. Sandra Schloen and Miller Prosser at the OCHRE Data Service on a project creating website templates for OCHRE archaeology research projects.

    Project: Chameleon Reproducibility Project

    Mentors: Kate Keahey & Zhuo Zhen, Argonne National Laboratory

    Research Area Keywords: Systems // Cloud Computing

    Project Description: Isabel Brunkan is a junior at the Minerva Schools at KGI studying computer science. This summer, she worked with Drs. Kate Keahey and Zhuo Zhen in the Chameleon Cloud group on a project that created a digital artifact repository with experiments replicated and reproduced on Chameleon using Jupyter Notebook. She worked on replicating machine learning experiments, specifically image processing models, and created an experiment structure template to encourage reproducibility. These experiments are stored on Chameleon’s sharing portal for community use.

    Project: Learning Manifolds From Point Clouds

    Mentor: Lorenzo Orecchia, Department of Computer Science

    Research Area Keywords: Machine Learning // Algorithms & Optimization // Medicine & Health

    Project Description: Isabella DeClue is a sophomore at the University of Chicago majoring in statistics and minoring in computer science. This summer, she worked with Professor Lorenzo Orecchia and Ryan Robinett on a project investigating a version of the Moving Least Squares algorithm to estimate at what radii local hyperplane approximations for complex, higher dimensional manifolds are valid.

    Project: Ishan Malhotra

    Mentor: Brian Nord, Fermilab & Department of Astronomy and Astrophysics

    Research Area Keywords: Machine Learning & AI // Physics & Astronomy

    Project Description: Ishan Malhotra is a sophomore at the University of Chicago studying computer science and economics. This summer, he worked with Dr. Brian Nord and the DeepSkies Lab on the early stages of a project aiming to devise a self-driving telescope. His contributions to the project centered around creating a reinforcement learning model to train the self-driving telescope.

    Project: Latent Attention & Training ML Algorithms

    Mentor: Blase Ur, SUPERGroup

    Research Area Keywords: Machine Learning // Security & Privacy

    Project Description: Jamar Sullivan is an incoming freshman at the University of Chicago studying computer science and astrophysics, and a recent high school graduate from Gwendolyn Brooks College Prep. This summer, he continued work with Prof. Blase Ur in the SUPERGroup Lab to explore the difference in machine learning models’ performance when using human-collected vs. machine-learned attention. The project created a user interface that requires users to select words that they believe indicate the sentiment of a movie review, and then created a model that would learn the indicative words in a movie review dataset. It’s understood that attention can lead to greater performance in machine learning models, but collecting human information means that it is possible to collect more data from the same-sized dataset, and get high accuracy with a smaller model.

    Project: Misinformation WhatsApp

    Mentor: Marshini Chetty, SUPERGroup

    Research Area Keywords: Human-Computer Interaction // Security & Privacy

    Project Description: Jason Chee is a sophomore at the University of Chicago majoring in computer science. This summer, he continued work with Professor Marshini Chetty in the SUPERGroup Lab on a project that looked at misinformation on coronavirus news on end-to-end encrypted platforms like WhatsApp. On the qualitative side, he helped write an interview script and researched different fact-checker APIs. On the quantitative side, he designed and developed a cross-platform WhatsApp URL and metadata extraction app using JavaScript and React Native.

    Project: Fawkes

    Mentor: Heather Zheng, SAND Lab

    Research Area Keywords: Machine Learning // Image Analysis // Human-Computer Interaction

    Project Description: Jiawen Shen is a senior at Bellevue High School. This summer, she worked with Prof. Heather Zheng in the SAND Lab on developing Fawkes, a software tool that helps protect users’ privacy against unregulated third parties. She tested many different images to help the team make improvements to Fawkes.

    Project: Security & Functionality in IoT Devices Through SmartWall

    Mentors: Blase Ur & Nick Feamster, SUPERGroup & Department of Computer Science/Center for Data & Computing

    Research Area Keywords: Machine Learning & AI // Internet of Things // Security & Privacy

    Project Description: Julio Ramirez is a senior at Northside College Preparatory High School, and a previous intern in the 2019 CDAC Summer Lab program. This summer, he continued work on two projects: one with Prof. Blase Ur in the SUPERGroup Lab where he helped design an evaluation study for a Jupyter Notebook plugin created to help people who develop machine learning models better understand the data they use to train their model; and a second one with Prof. Nick Feamster where he installed smart devices at home and collected packet captures while implementing firewall rules generated for those devices, examining how the rules impact a device’s functionality and security. In his final video, Julio speaks to the second project on functionality and security in IoT devices.

    Project: Identifying Malicious Network Activity

    Mentor: Nick Feamster, Department of Computer Science/Center for Data & Computing

    Research Area Keywords: Machine Learning & AI // Internet & Communications

    Project Description: Lia Troy is a recent graduate of the College at the University of Chicago. This summer, she worked with Prof. Nick Feamster on a project working to identify malicious network activity.

    Project: Safety Guidelines for BLM Activists

    Mentor: Blase Ur, SUPERGroup

    Research Area Keywords: Security & Privacy // Society & Policy

    Project Description: Maia Boyd is a sophomore at the University of Chicago majoring in computer science and minoring in math. This summer, she worked with Prof. Blase Ur in the SUPERGroup Lab on a project that seeks to understand the computer security and technology safety concerns that Black Lives Matter (BLM) supporters have surrounding protests, and how they address those concerns. In order to achieve this goal, she helped to collect safety guides used by BLM protesters to see what advice is given to protesters. Next, the project team launched an online survey, taken by 167 BLM protesters, that asked about their concerns and if they had heard of or follow the pieces of advice that we collected from the safety guides.

    Project: Scratch Encore – Exploring Student Behavior

    Mentor: Diana Franklin, CANON Lab

    Research Area Keywords: STEM Education // Data Analysis

    Project Description: Melissa Tovar is a senior at the University of Chicago studying computer science. This summer, she worked with Prof. Diana Franklin in the CANON Research Lab on the Scratch Encore team. The project sought to adapt the Scratch Encore curriculum so that it became compatible with remote learning. This included making worksheets available in Google Forms or Google Slides. Another part of the summer was spent analyzing student responses on those worksheets from the previous school year.

    Project: Analyzing Human Behavior with Smart Home Devices

    Mentor: Nick Feamster, Department of Computer Science/Center for Data and Computing

    Research Area Keywords: Machine Learning & AI // Internet & Communications // Security & Privacy

    Project Description: Nikki Chakravarthy is a sophomore at the University of Chicago studying computer science and economics, and a previous intern in the 2019 Summer Lab program. This summer, she worked with Prof. Nick Feamster on a project aiming to understand human behavior related to smart home devices. She used Wireshark to analyze packet captures collected from a Jetson Nano and other IoT devices in her home.

    Project: Spatial Analysis of Access to MOUD (Medications for Opioid Use Disorder) Resources

    Mentors: Qinyun Lin & Marynia Kolak, Center for Spatial Data Science

    Research Area Keywords: Human-Computer Interaction // Wearables & Devices

    Project Description: Olina Liang is a junior at the University of Chicago majoring in astrophysics. This summer, she worked with Drs. Marynia Kolak and Qinyun Lin in the Center for Spatial Data Science on a project analyzing access to MOUD resources. She scraped opioid-related policy data from over 10 PDFs of around 1,000 pages each, geocoded locations of health facilities, and calculated ZIP code-level access scores.

    Project: Making Machine Learning More Human – Quantifying Parent-Child Language Alignment Using Neural Language Models

    Mentors: Allyson Ettinger, Department of Linguistics & Susan Goldin-Meadow, Department of Psychology

    Research Area Keywords: Computational Linguistics // Natural Language Processing // Data Analysis

    Project Description: Ray Fregly is a junior at the University of Chicago double majoring in linguistics and computer science. This summer, she worked with Drs. Allyson Ettinger and Susan Goldin-Meadow on a project that adapted and implemented neural network language models in PyTorch to quantify parent-child language alignment based on previously collected data. The long-term goal of the project is to use the results of this study to improve our understanding of child language acquisition.

    Project: Data Mining and NLP For Financial Markets

    Mentor: Dacheng Xiu, Booth School of Business

    Research Area Keywords: Economics & Business // Natural Language Processing // Data Analysis

    Project Description: Rachit Surana is a sophomore at the University of Chicago. This summer, he worked with Prof. Dacheng Xiu on a project that mined textual data related to financial markets using a dynamic scraper and network traffic analysis. In the future, NLP modelling will be used to perform correlational analysis with market prices and other metrics.

    Project: Promoting Explanatory Insights in GeoDa

    Mentor: Julia Koschinsky, Center for Spatial Data Science

    Research Area Keywords: Spatial Data // Data Analysis

    Project Description: R.E. Stern is a sophomore at the University of Chicago. This summer, he worked as part of a team led by Dr. Julia Koschinsky developing best practices for spatial data research in the Center for Spatial Data Science’s exploratory data analysis program GeoDa. His work focused on user interaction with their research’s underlying hypotheses, and on using quasi-experimental design to allow users to consistently develop explanatory insights rather than descriptive ones.

    Project: d-gen: Database Generation & Relational Databases

    Mentor: Raul Castro Fernandez, ChiData/Department of Computer Science

    Research Area Keywords: Data Analysis // Systems

    Project Description: Ryan Wong is a senior at Whitney Young High School, and a previous CDAC intern in the 2019 program. This year, he worked with Professor Raul Castro Fernandez on d-gen, a synthetic relational database generator. d-gen aims to help database users benchmark queries by generating data that adheres to relational database schemas.

    Project: Optimizing Thermal Dissipation of 3D-Printed Objects

    Mentor: Pedro Lopes, Human-Computer Integration Lab

    Research Area Keywords: Human-Computer Interaction // Wearables & Devices

    Project Description: Svitlana Midianko is a junior at the Minerva Schools at KGI studying human behavior. This summer, she worked with Prof. Pedro Lopes in the Human-Computer Integration Lab on a project centered around optimizing thermal dissipation of 3D-printed objects. The project included the development of a Fusion360 plugin, written in Python. With the help of this plugin, makers can optimize the heat dissipation of their hardware without much knowledge of heat dynamics. Running the plugin modifies the device’s design by adding extra holes in the upper case, a change that is optimized to minimize the temperature of the device.

    Project: Automated Experimental Design for Cosmic Discovery

    Mentor: Brian Nord & Yuxin Chen // Fermilab, Department of Astronomy and Astrophysics, Department of Computer Science

    Research Area Keywords: Machine Learning & AI // Physics & Astronomy

    Project Description: Yair Atlas is a junior at the University of Chicago studying physics and philosophy, and was a previous CDAC intern in the 2019 program. This summer he worked with Drs. Yuxin Chen and Brian Nord on a CDAC Discovery Project titled, “Automated Experimental Design for Cosmic Discovery.” This project focused on using machine learning to improve astrophysical surveys. Specifically, the project used a simulation facility to better understand how design features affect experimental results.

    Project: dextrEMS – Achieving Dexterity with Electrical Muscle Stimulation by Combining it with Brakes

    Mentor: Pedro Lopes, Human Computer Integration Lab

    Research Area Keywords: Human-Computer Interaction // Wearables & Devices // Virtual Reality

    Project Description: Yujie Tao is an incoming graduate student in the Pre-Doctoral program of the University of Chicago’s Masters Program in Computer Science, and a recent graduate of UNC Chapel Hill. This summer, she worked with Prof. Pedro Lopes in the Human Computer Integration Lab on dextrEMS, a project seeking to build an exoskeleton-EMS system that controls the user’s hand for force feedback. She created an automated sign language demo and three VR applications to showcase the strengths of the device.

    2019 Program Cohort

    Project: Using Machine Learning to Identify Blurry SEM Images

    Mentor/Lab: Ryan Chard/Globus Labs

    Research Area Keywords: Machine Learning // Image Analysis // Systems & Architecture // Databases

    I worked on a project using machine learning and computer vision to classify scanning electron microscope (SEM) images of mouse brains as either blurry or in focus. I created three different models: first, one based on Laplacian of Gaussian blob detection; then, a neural network; and finally, a convolutional neural network. I then helped push the models to DLHub for others to use.

    Project: Password Reuse & Vulnerability Detection

    Mentor/Lab: Blase Ur/SUPERGroup

    Research Area Keywords: Security & Privacy // Human-Computer Interaction

    In this project, in collaboration with UChicago IT Services, we leverage publicly available data breaches and password modification strategies to generate password guesses for current and previous UChicago accounts, and run a simulated attack to assess the vulnerability of UChicago accounts to password guessing attacks. Information is obtained from users who are identified as vulnerable in this simulated attack using anonymous online surveys. Attitudes, intentions, and planned behavior change are surveyed in an effort to design interventions that promote better password security habits.

    Project: Generating Deepfakes with Autoencoders

    Mentor/Lab: Matt Baughman/Globus Labs

    Research Area Keywords: Machine Learning // Security & Privacy

    In recent news coverage, deepfakes have been portrayed as alarmingly easy to make. The aim of my summer project was to generate deepfakes myself, to gain greater exposure to machine learning and to test the learning curve. Using an encoder-decoder architecture, we roughly swapped the faces of two program coordinators (as well as that of Nicolas Cage).

    Project: Distance Matters in Science & Scholarship — Analyzing the Impact of Scientific Citations

    Mentor/Lab: Eamon Duede/Knowledge Lab

    Research Area Keywords: Data Analysis // Spatial Data // Scientific Computing

    After extracting metadata from over 26,000 surveyed authors, we ran relational analysis on our data and found a relationship between the influence of an author’s references and the geospatial distance between the author’s institution and the referenced author’s institution. This relationship is intriguing and offers insight into what types of publications are more influential for authors.

    Project: funcX Website Design

    Mentor/Lab: Ryan Chard/Globus Labs

    Research Area Keywords: Programming Languages // Databases // Web Design

    My project was to build a website for the Globus service, funcX (a FaaS platform for science), which is designed to be applied to existing cyberinfrastructures to provide scalable, secure, and on-demand execution of short duration scientific functions. I learned how to use Bootstrap, Flask, and PSQL in order to construct a fully-functional website that scientists and researchers can use to remotely run functions with specified input data.

    Project: Fairness in Machine Learning

    Mentors/Lab: Prof. Blase Ur, Julia Hanson (BS 2018), Galen Harrison (PhD, CS)/SUPERGroup

    Research Area Keywords: Machine Learning // Human-Computer Interaction

    I worked on qualitative coding for interviews concerning cloud usage, as well as for data used in a research paper on fairness in machine learning that was submitted to FAT*. I also started a visualization using D3 for sample data from COMPAS.

    Project News: The paper that grew out of this research, titled “An Empirical Study on the Perceived Fairness of Realistic, Imperfect Machine Learning Models”, was accepted at the ACM FAT* 2020 Conference.

    Project: Data and Learning Hub for Science (DLHub)

    Mentors/Lab: Logan Ward & Ryan Chard/Globus Labs

    Research Area Keywords: Machine Learning // Systems & Architecture

    I began working on DLHub by creating a module that extracts data from various machine learning model architectures and translates them into PyTorch models. I then transitioned to learning about image segmentation, and attempted to determine whether transfer learning with image segmentation could be done efficiently.

    Project: Mobile Decision-Making for Doctors Without Internet

    Mentor/Lab: Pedro Lopes/HCI Lab

    Research Area Keywords: Medicine & Health // Human-Computer Interaction // Mobile Computing

    I worked on the Mobile Medicine project, which aims to provide Nigerian doctors with an internet-independent phone interface tool to run computational diagnostics, answer medical queries, and provide real-time knowledge on the spread of infectious disease.

    Project: Physical Backdoors in Neural Networks

    Mentor/Lab: Ben Zhao/SAND Lab

    Research Area Keywords: Artificial Intelligence // Machine Learning // Security & Privacy

    My project was about detecting physical backdoor attacks in image-detecting neural networks. I added physical triggers to traffic signs (blue tape or sticky notes along the bottom of the sign) and added these images to a training dataset for a neural network that classifies traffic signs. The goal of the project was to find a way to detect that the network had been attacked and mitigate the attack.

    Project: EMS Gesture Classification

    Mentor/Lab: Heather Zheng/SAND Lab

    Research Area Keywords: Artificial Intelligence // Human-Computer Interaction // Machine Learning // Security & Privacy

    I helped build a gesture classification model that classifies 32 different hand gestures generated by electrical muscle stimulation. In doing so, our lab hopes to develop novel and secure human-computer interaction methods and secure communication techniques.

    Project: Fairness in Machine Learning

    Mentors/Lab: Prof. Blase Ur, Julia Hanson (BS 2018), Galen Harrison (PhD, CS)/SUPERGroup

    Research Area Keywords: Machine Learning // Human-Computer Interaction

    I worked on the Fairness in Machine Learning project with SUPERGroup. We focused on analyzing empirical data to understand, among other things, what people perceive as making a machine learning model fair. I programmed in JavaScript to help build a process visualization as part of the project, and also used JavaScript to try to develop summary statistics of the data we collected. I also did qualitative coding for the survey responses used in the project.

    Project News: The paper that grew out of this research, titled “An Empirical Study on the Perceived Fairness of Realistic, Imperfect Machine Learning Models”, was accepted at the ACM FAT* 2020 Conference.

    Project: Is Climate Change Changing Clouds?

    Mentors/Lab: Ian Foster, Michael Maire, Elisabeth Moyer, Rebecca Willett/RDCEP

    Research Area Keywords: Biology & Environment // Image Analysis // Machine Learning

    My work during the program centered on climate science: because clouds introduce uncertainty into climate projections, our project uses unsupervised machine learning methods to evaluate changing cloud characteristics. In my contribution to the project, I tackled the data processing issues brought on by immense but previously under-utilized NASA satellite data: through employing web scraping and API calls as well as mapping and other data visualization techniques, I simplified and expedited the data downloading and wrangling process.

    Project News: Two papers arose from this research, and both were accepted to the American Geophysical Union Fall 2019 meeting: “Cloud Characterization With Deep Learning II” and “Developing Unsupervised Learning Models for Cloud Classification.”

    Project: Internet Tracking Transparency

    Mentor/Lab: Blase Ur/SUPERGroup

    Research Area Keywords: Human-Computer Interaction // Security & Privacy

    I worked on a browser extension that highlights the prevalence of third-party tracking on the internet, specifically focusing on drawing attention to the more sinister aspects of tracking. I also worked on a project about advertising and ad transparency on Twitter.

    Project: Research Sharing in JupyterHub with Chameleon Cloud

    Mentors/Lab: Kate Keahey & Jason Anderson/Chameleon Project Nimbus Team

    Research Area Keywords: Scientific Computing // Systems & Architecture

    Working with the Nimbus team, we integrated Chameleon’s JupyterHub with Zenodo to make it easier to share, publish, and import research notebooks. I worked on making it easier to share notebook-based research in and out of the JupyterHub environment on Argonne’s Chameleon open testbed. I wrote a full-scale publishing extension for JupyterLab with both front-end and back-end components, adjusted the JupyterHub configuration to make it easy to import notebooks, and created a Django website to allow users to share and browse research.

    Project News: The poster for this project, titled “Sharing and Replicability of Notebook-Based Research on Open Testbeds”, was accepted at the SC19 International Conference for High Performance Computing, Networking, Storage and Analysis.

    Project: Enhancing Breathability of Wearable Devices

    Mentor/Lab: Pedro Lopes/HCI Lab

    Research Area Keywords: Human-Computer Interaction // Wearable Devices

    My project was to improve the breathability of wearable devices that capture biometric data, such as Apple watches and Fitbits, by creating silicone interfaces that can be slotted between the skin and a wearable. The aim for the silicone interface is to pipe out sweat and allow trapped heat to escape, improving user comfort. I experimented with manufacturing techniques and principles of microfluidics to design a process that can be used by hobbyists who only have access to machinery typically found in a makerspace.

    Project: Kubernetes Backend for VC3 + SLATE

    Mentors/Lab: Dr. Rob Gardner, Lincoln Bryant, & Chris Weaver/MANIAC Lab

    Research Area Keywords: Databases // Systems & Architecture

    I iterated on the scientific software projects VC3 and SLATE by replacing the OpenStack backend with ‘login pods’ that can be dynamically provisioned on a lightweight Kubernetes backend, which could improve hardware usage by 2-3 times. I used tools such as Docker, Kubernetes, and Jenkins, and expanded my knowledge of Python, UNIX shell programming, and Linux authorization and authentication mechanisms.

    Project: Queue Prediction for Supercomputers

    Mentor/Lab: Ryan Chard/Globus Labs

    Research Area Keywords: Machine Learning // Systems & Architecture

    Research in many scientific domains requires significant computational power, which brings limitations such as the expense of acquiring new resources and jobs starting only after a certain number of nodes become available. It is therefore essential to incorporate intelligence into computing resource management. One of the key components of this intelligence is being able to predict queue wait times for jobs running through supercomputers using AWS and Parsl. In my project, I leveraged Amazon SageMaker, a cloud machine learning (ML) platform, to create, train, and deploy a machine learning model to predict queue wait times.

    Project: “If This Then That” — Generating Predictive Rules for Smart Devices Based on Human Behavior

    Mentors/Lab: Dr. Blase Ur & Weijia He (PhD Candidate, CS)/SUPERGroup

    Research Area Keywords: Human-Computer Interaction // Internet of Things

    I programmed features into the backend of an IoT application using the Django web framework. I helped a team of IoT researchers use this app to collect data on participants’ interactions with smart devices. Our team is working on synthesizing data to automatically generate predictive IFTTT-style rules tailored to an individual user’s behavior. I also designed a visualization feature so that individuals can better understand their usage patterns and determine whether the generated rules align with their daily behaviors.

    Project: Quantifying Social Determinants of Cardiovascular Disease

    Mentor: Corey Tabit

    Research Area Keywords: Machine Learning // Medicine & Health // Spatial Data

    Our project focused on determining spatiotemporal individual and composite social risk scores as they relate to blood pressure. This summer, we aimed to identify the optimal spatiotemporal buffer of crime that affects people’s blood pressure most.

    Project News: The paper that grew out of this research, titled, “Acute Effects of Violent Crime on Blood Pressure in Chicago,” was published in the Journal of the American College of Cardiology in March 2020.

    Project: Is Climate Change Changing Clouds?

    Mentors/Lab: Ian Foster, Michael Maire, Elisabeth Moyer, Rebecca Willett/RDCEP

    Research Area Keywords: Biology & Environment // Image Analysis // Machine Learning

    We explored how deep learning can be applied to discover new cloud features from big satellite data. To find the new cloud features, we built a data analysis pipeline in which we trained a deep neural network, extracted dimension-reduced cloud information, and then passed the compressed representation to a clustering algorithm.

    Project News: Two papers arose from this research, and both were accepted to the American Geophysical Union Fall 2019 meeting: “Cloud Characterization With Deep Learning II” and “Developing Unsupervised Learning Models for Cloud Classification.”

    Project: Extracting Metadata with XtractHub

    Mentor/Lab: Tyler Skluzacek/Globus Labs

    Research Area Keywords: Databases

    XtractHub is a dynamic metadata extraction workflow. This summer I improved the efficiency of multiple metadata extractors, integrated Docker into the XtractHub workflow, and developed a web server for processing individual file metadata.

    Project News: The paper that grew out of this research, titled “Serverless Workflows for Indexing Large Scientific Data”, was accepted at the Fifth International Workshop on Serverless Computing (WoSC) 2019.

    Project: Is Climate Change Changing Clouds?

    Mentors/Lab: Ian Foster, Michael Maire, Elisabeth Moyer, Rebecca Willett/RDCEP

    Research Area Keywords: Biology & Environment // Image Analysis // Machine Learning

    I helped develop a machine learning algorithm for identifying different cloud classes based on satellite images. We hope to use our model to better understand trends in cloud cover, which plays a critical role in climate change.

    Project News: Two papers arose from this research, and both were accepted to the American Geophysical Union Fall 2019 meeting: “Cloud Characterization With Deep Learning II” and “Developing Unsupervised Learning Models for Cloud Classification.”

    Project: Mobile Decision-Making for Doctors Without Internet

    Mentor/Lab: Pedro Lopes/HCI Lab

    Research Area Keywords: Medicine & Health // Human-Computer Interaction // Mobile Computing

    Mobile Medicine is a new approach to patient diagnostics and record-keeping. Due to restrictions in wifi growth, areas in under-developed nations lack modern diagnostic aids and electronic medical records. Mobile Medicine aims to fix that by creating an SMS-protocol-based app that both collects patient information and provides an accurate adaptive diagnosis.

    Project: funcX Website Design

    Mentor/Lab: Ryan Chard/Globus Labs

    Research Area Keywords: Databases // Programming Languages // Website Design

    This summer, I worked on the funcX web service, which is a serverless supercomputing project. It allows users to write, edit, delete, and execute functions over mobile devices with registered endpoints. I mainly worked on the user interface (the website), writing Python, HTML, CSS, and SQL code to define the design of the webpages and to extract data from the database.

    Project: Globus Automate Cloud Service

    Mentor/Lab: Ryan Chard/Globus Labs

    Research Area Keywords: Databases // Human-Computer Interaction // Systems & Architecture

    Project: Bioinformatics Workflow Migration Between Fred Hutchinson Cancer Research Center & Globus Genomics

    Mentors/Lab: Paul Davé & Alex Rodriguez/Globus Genomics

    Research Area Keywords: Medicine & Health // Systems & Architecture

    Project: Making Spatial Access Measures Accessible: A New Python Package and AWS Tool

    Mentors/Lab: Dr. James Saxon & Dr. Julia Koschinsky/Center for Spatial Data Science

    Research Area Keywords: Medicine & Health // Spatial Data

    I worked on a project aiming to help researchers calculate spatial access through a very simple interface. I made a website that lets users calculate how easily people living in a certain location can access a certain resource. This can be used to determine where there isn’t enough healthcare and other important goods.

    Project News: The PySAL (Python Spatial Analysis Library) package in which this research project culminated is now available online.

  • Mentors

    2020 Mentors

    Dr. Allyson Ettinger’s research is focused on language processing in humans and in artificial intelligence systems, motivated by a combination of scientific and engineering goals. For studying humans, her research uses computational methods to model and test hypotheses about mechanisms underlying the brain’s processing of language in real time. In the engineering domain, her research uses insights and methods from cognitive science, linguistics, and neuroscience in order to analyze, evaluate, and improve natural language understanding capacities in artificial intelligence systems. In both of these threads of research, the primary focus is on the processing and representation of linguistic meaning.

    Anjali Adukia is an assistant professor at the University of Chicago Harris School of Public Policy and the College. In her work, she is interested in understanding how to reduce inequalities such that children from historically disadvantaged backgrounds have equal opportunities to fully develop their potential.  Her research is focused on understanding factors that motivate and shape behavior, preferences, attitudes, and educational decision-making, with a particular focus on early-life influences.  She examines how the provision of basic needs—such as safety, health, justice, and representation—can increase school participation and improve child outcomes in developing contexts.

    Adukia completed her doctoral degree at the Harvard University Graduate School of Education, with an academic focus on the economics of education. Her work has been funded by organizations such as the William T. Grant Foundation, the National Academy of Education, and the Spencer Foundation.  Her dissertation won awards from the Association for Public Policy Analysis and Management (APPAM), Association for Education Finance and Policy (AEFP), and the Comparative and International Education Society (CIES). Adukia received recognition for her teaching from the University of Chicago Feminist Forum.  She completed her master of education degrees in international education policy and higher education (administration, planning, and social policy) at Harvard University and her bachelor of science degree in molecular and integrative physiology at the University of Illinois at Urbana-Champaign.  She is a faculty research fellow of the National Bureau of Economic Research and a faculty affiliate of the University of Chicago Education Lab and Crime Lab.  She is on the editorial board of Education Finance and Policy.  She was formerly a board member of the Young Nonprofit Professionals Network – San Francisco Bay Area. She continues to work with non-governmental organizations internationally, such as UNICEF and Manav Sadhna in Gujarat, India.

    Anna Woodard is a postdoctoral scholar in the Department of Computer Science at the University of Chicago, where she is part of Globus Labs.

    Ben Zhao is a Neubauer Professor of Computer Science at University of Chicago. Over the years, he’s followed his own interests in pursuing research problems that he finds intellectually interesting and meaningful. That’s led him to work on a sequence of areas from P2P networks, online social networks, SDR/open spectrum systems, graph mining and modeling, user behavior analysis, to adversarial machine learning. Since 2016, he’s mostly worked on security and privacy problems in machine learning and mobile systems. His meandering interests have led him to publish at a range of top conferences, including Usenix Security/Oakland/CCS, IMC/WWW, CHI/CSCW, and Mobicom/SIGCOMM/NSDI.

    Together with Prof. Heather Zheng, he co-directs the SAND Lab (Security, Algorithms, Networking and Data) at University of Chicago. He received his PhD in Computer Science from UC Berkeley in 2004, where he was advised by John Kubiatowicz and Anthony Joseph, and created the Tapestry distributed hash table (dissertation). He received his MS from Berkeley in 2000, and his BS in computer science from Yale in 1997. He is an ACM Distinguished Scientist, a recipient of the NSF CAREER award (2005), MIT Tech Review’s TR-35 Award (Young Innovators Under 35) (2006), IEEE Internet Technical Committee’s Early Career Award (2014), and one of ComputerWorld’s Top 40 Technology Innovators under 40. His papers have somewhere around 28,000 citations and an H-index of 66 (for whatever that’s worth). In some of his “free time,” he writes about research and PhD life on Quora.

    Blase Ur is an assistant professor of computer science at the University of Chicago. He founded the UChicago SUPERgroup, an interdisciplinary research collective with dozens of members. Their research spans computer security, privacy, and human-computer interaction (HCI). They are especially interested in using data-driven methods to help users make better security and privacy decisions, as well as to make complex computer systems more usable for non-technical users. Their work has been supported by six NSF grants, as well as grants from Mozilla Research and the Data Transparency Lab.

    He has been fortunate to receive three best paper awards (CHI 2017, USENIX Security 2016, and UbiComp 2014), the 2018 SIGCHI Outstanding Dissertation Award, the 2016 John Karat Usable Privacy and Security Student Research Award, an NDSEG fellowship, a Fulbright scholarship, and three honorable mentions for best paper (CHI 2020, CHI 2016, and CHI 2012). Jointly with the other core members of the CMU passwords group, he also received the 2020 Allen Newell Award for Research Excellence and the 2018 IEEE Cybersecurity Award for Practice. He has strong interests in teaching and K-12 outreach, particularly for broadening participation in CS. He earned his AB in computer science from Harvard University and worked for three years at Rutgers University on outreach and diversity programs.

    Brian Nord uses artificial intelligence to search for clues on the origins and development of the universe. He actively works on statistical modeling of strong gravitational lenses, the cosmic microwave background, and galaxy clusters. As leader of the Deep Skies Lab, he brings together experts in computer science and technology to study questions of cosmology, including dark energy, dark matter, and the early universe, through large-scale data analysis.

    Nord has authored or co-authored nearly 50 papers. He trains scientists in public communication, advocates for science education and funding, and works to develop equitable and just research environments. As co-leader of education and public engagement at the Kavli Institute for Cosmological Physics at UChicago, he organizes Space Explorers, a program to help underrepresented minorities in high school engage in hands-on physics experiences outside the classroom. He is an associate scientist at Fermi National Accelerator Laboratory, where he is a member of the Machine Intelligence Group.

    Dacheng Xiu’s research interests include developing statistical methodologies and applying them to financial data, while exploring their economic implications. His earlier research involved risk measurement and portfolio management with high-frequency data and econometric modeling of derivatives. His current work focuses on developing machine learning solutions to big-data problems in empirical asset pricing.

    Xiu’s work has appeared in Econometrica, the Journal of Econometrics, the Journal of the American Statistical Association, the Annals of Statistics, and the Journal of Finance. He is a Co-Editor for the Journal of Financial Econometrics, an Associate Editor for the Journal of Econometrics, the Journal of Business & Economic Statistics, the Journal of Empirical Finance, and Statistica Sinica, and also referees for several journals in the fields of econometrics, statistics, and finance. He has received several recognitions for his research, including the Fellow of the Society for Financial Econometrics, the Fellow of the Journal of Econometrics, the 2018 Swiss Finance Institute Outstanding Paper Award, the 2018 AQR Insight Award, and the Best Conference Paper Prize at the 2017 Annual Meeting of the European Finance Association.

    In 2017, Xiu launched a website that provides up-to-date realized volatilities of individual stocks, as well as equity, currency, and commodity futures. These daily volatilities are calculated from the intraday transactions and the methodologies are based on his research of high-frequency data.

    Xiu earned his PhD and MA in applied mathematics from Princeton University, where he was also a student at the Bendheim Center for Finance. Prior to his graduate studies, he obtained a BS in mathematics from the University of Science and Technology of China.

    David Miller’s research focuses on answering open questions about the fundamental structure of matter. By studying the quarks and gluons (the particles that make up everyday protons and neutrons) produced in the energetic collisions of protons at the Large Hadron Collider (LHC) at CERN in Geneva, Switzerland, Miller conducts measurements using the ATLAS Detector that will seek out the existence of never-before-seen particles, and characterize the particles and forces that we know of with greater precision. Miller’s work on the properties and measurements of the experimental signatures of these quarks and gluons, or “jets,” is an integral piece of the puzzle used in the recent discovery of the Higgs boson, searches for new massive particles that decay into boosted top quarks, as well as the hints that the elusive quark-gluon plasma may have finally been observed in collisions of lead ions.

    Besides studying these phenomena, Miller has worked extensively on the construction and operation of the ATLAS detector, including the calorimeter and tracking systems that allow for these detailed measurements. Upgrades to these systems, involving colleagues at Argonne National Laboratory, CERN, and elsewhere, present an enormous challenge and a significant amount of research over the next several years. Miller is also working with state-of-the-art high-speed electronics for quickly deciphering the data collected by the ATLAS detector.

    Miller received his PhD from Stanford University in 2011 and his BA in Physics from the University of Chicago in 2005. He was a McCormick Fellow in the Enrico Fermi Institute from 2011-2013.

    Diana Franklin is an Associate Professor in Computer Science. She leads five computer science education projects involving students ranging from 3rd grade through university. She is the lead PI for quantum computing education for EPIQC, an NSF expedition in computing. Her research agenda explores ways to create curriculum and computing environments that reach a broad audience. She is a recipient of the NSF CAREER award, NCWIT Faculty Undergraduate Mentoring Award, four teaching awards, three best paper awards (ICER ’17, IPDPS ’14, and Computing Frontiers ’13), and an Honourable Mention from CHI ’18.

    Franklin received her Ph.D. from UC Davis in 2002. She was an assistant professor (2002-2007) and associate professor with tenure (2007) in Computer Science at the California Polytechnic State University, San Luis Obispo, during which she held the Forbes Chair. From 2008-2015, she was tenured teaching faculty at UC Santa Barbara. Her research interests include computing education research, architecture involving novel technologies, and ethnic and gender diversity in computing. She is the author of “A Practical Guide to Gender Diversity for CS Faculty,” from Morgan Claypool.

    Eamon Duede is a joint PhD Candidate in the Department of Philosophy and the Committee on Conceptual and Historical Studies of Science, and was formerly the Executive Director of the Knowledge Lab. His work is broadly at the intersection of the philosophy of science and computational science of science. In the philosophy of science, he focuses on models, simulations, and artificial intelligence / machine learning in science. In computational science of science, he uses large-scale computational analysis alongside targeted intelligent surveying and field experiments to understand how institutions and communities produce knowledge.

    Hakizumwami Birali Runesha is the Director of Research Computing for the University of Chicago, where he provides leadership and vision for advancing all aspects of research computing strategies at the University. He is responsible for the design, configuration, and administration of centrally managed High-Performance Computing (HPC) systems and related services across the University. In addition, he provides access to advanced technical expertise, user support, advice and training, and access to the University’s HPC facility to the research community.

    Runesha is a seasoned professional who brings to the University of Chicago HPC management leadership and more than 17 years of experience in high performance computing and scientific software development. He earned his M.S. and Ph.D. in Civil Engineering at Old Dominion University. Prior to joining the University of Chicago, he served as Director of Scientific Computing and Applications at the University of Minnesota Supercomputing Institute (MSI), managing the scientific computing, biological computing, visualization and application development groups. In addition to overseeing strategic planning of HPC resources and leading annual procurement of supercomputing resources at MSI, Runesha created the MSI Application software development group and the MSI Scientific Data Management Laboratory to meet the evolving data management and database development needs of university researchers. Prior to joining the University of Minnesota, he was a research scholar at the Hong Kong University of Science and Technology developing parallel computing algorithms for engineering applications, a research associate for the Multidisciplinary Parallel-Vector Computer Center at Old Dominion University and an Assistant Professor at the University of Kinshasa.

    Runesha has developed open source software programs and fast parallel solvers for large-scale finite element applications. He served as principal investigator on a number of research grants and is the author of a number of journal articles, proceedings and conference papers. He has given many invited talks, seminars, courses, and workshops on various HPC topics.

    Heather Zheng is the Neubauer Professor of Computer Science at the University of Chicago. She received her PhD in Electrical and Computer Engineering from the University of Maryland, College Park in 1999. Prior to joining the University of Chicago in 2017, she spent 6 years in industry labs (Bell-Labs, NJ and Microsoft Research Asia), and 12 years at the University of California at Santa Barbara. At UChicago, she co-directs the SAND Lab (Systems, Algorithms, Networking and Data) together with Prof. Ben Y. Zhao.

    Homepage.

    Jean-Baptiste Reynier is a recent graduate from the University of Chicago (B.S. in Biology 2018, M.S. in Computer Science 2019). He is currently working as a data science analyst at the Olopade Lab. His work focuses on uncovering the characteristics of the tumor immune micro-environment in breast cancer, using genomic data.

    Julia Koschinsky is the Executive Director of the Center for Spatial Data Science at the University of Chicago and has been part of the GeoDa team for over 16 years. She has been conducting and managing research funded through federal awards of over $8 million to gain insights from the spatial dimensions of urban challenges in housing, health, and the built environment.

    Kate Keahey is one of the pioneers of infrastructure cloud computing. She created the Nimbus project, recognized as the first open source Infrastructure-as-a-Service implementation, and continues to work on research aligning cloud computing concepts with the needs of scientific datacenters and applications. To facilitate such research for the community at large, Kate leads the Chameleon project, providing a deeply reconfigurable, large-scale, and open experimental platform for Computer Science research. To foster the recognition of contributions to science made by software projects, Kate co-founded and serves as co-Editor-in-Chief of the SoftwareX journal, a new format designed to publish software contributions. Kate is a Scientist at Argonne National Laboratory and a Senior Fellow at the Computation Institute at the University of Chicago.

    Kyle Chard is a Research Assistant Professor in the Department of Computer Science at the University of Chicago and Argonne National Laboratory. He has been Program Director of the Data & Computing Summer Lab since its first iteration under CDAC in 2019, and previously oversaw the Summer Internship Program run by the former Computation Institute.

    He received his Ph.D. in Computer Science from Victoria University of Wellington in 2011. He co-leads the Globus Labs research group which focuses on a broad range of research problems in data-intensive computing and research data management. He currently leads projects related to parallel programming in Python, scientific reproducibility, and elastic and cost-aware use of cloud infrastructure.

    Lorenzo Orecchia is an assistant professor in the Department of Computer Science. His research focuses on applying mathematical techniques from discrete and continuous optimization to design algorithms for computational challenges arising in a variety of applications, including Machine Learning, Numerical Analysis and Combinatorial Optimization.

    Marshini Chetty is an assistant professor in the Department of Computer Science at the University of Chicago, where she co-directs the Amyoli Internet Research Lab or AIR lab. She specializes in human-computer interaction, usable privacy and security, and ubiquitous computing. Marshini designs, implements, and evaluates technologies to help users manage different aspects of Internet use from privacy and security to performance, and costs. She often works in resource-constrained settings and uses her work to help inform Internet policy. She has a Ph.D. in Human-Centered Computing from Georgia Institute of Technology, USA and a Masters and Bachelors in Computer Science from the University of Cape Town, South Africa. In her former lives, Marshini was on the faculty in the Computer Science Department at Princeton University and the College of Information Studies at the University of Maryland, College Park. Her work has won best paper awards at SOUPS, CHI, and CSCW and has been funded by the National Science Foundation, the National Security Agency, Intel, Microsoft, Facebook, and multiple Google Faculty Research Awards.

    Homepage.

    Maryellen L. Giger, Ph.D. is the A.N. Pritzker Professor of Radiology, Committee on Medical Physics, and the College at the University of Chicago. She is also the Vice-Chair of Radiology (Basic Science Research) and the immediate past Director of the CAMPEP-accredited Graduate Programs in Medical Physics/ Chair of the Committee on Medical Physics at the University. For over 30 years, she has conducted research on computer-aided diagnosis, including computer vision, machine learning, and deep learning, in the areas of breast cancer, lung cancer, prostate cancer, lupus, and bone diseases.

    Over her career, she has served on various NIH, DOD, and other funding agencies’ study sections, and is now a member of the NIBIB Advisory Council of NIH. She is a former president of the American Association of Physicists in Medicine and a former president of the SPIE (the International Society of Optics and Photonics) and was the inaugural Editor-in-Chief of the SPIE Journal of Medical Imaging. She is a member of the National Academy of Engineering (NAE) and was awarded the William D. Coolidge Gold Medal from the American Association of Physicists in Medicine, the highest award given by the AAPM. She is a Fellow of AAPM, AIMBE, SPIE, SBMR, and IEEE, a recipient of the EMBS Academic Career Achievement Award, and is a current Hagler Institute Fellow at Texas A&M University. In 2013, Giger was named by the International Congress on Medical Physics (ICMP) as one of the 50 medical physicists with the most impact on the field in the last 50 years. In 2018, she received the iBIO iCON Innovator award.

    She has more than 200 peer-reviewed publications (over 300 publications), holds more than 30 patents, and has mentored over 100 graduate students, residents, medical students, and undergraduate students. Her research in computational image-based analyses of breast cancer for risk assessment, diagnosis, prognosis, and response to therapy has yielded various translated components, and she is now using these image-based phenotypes, i.e., these “virtual biopsies,” in imaging genomics association studies for discovery.

    She is a cofounder, equity holder, and scientific advisor of Quantitative Insights, Inc., which started through the 2009-2010 New Venture Challenge at the University of Chicago. QI produces QuantX, the first FDA-cleared, machine-learning-driven system to aid in cancer diagnosis (CADx). In 2019, QuantX was named one of TIME magazine’s inventions of the year.

    Homepage.

    Marynia Kolak, MS, MFA, PhD, is a health geographer using open science tools and an exploratory data analytic approach to investigate issues of equity across space and time. Her research centers on how “place” impacts health outcomes in different ways, for different people, from opioid risk environments to chronic disease clusters. She focuses on quantifying and distilling the structural determinants of health across different environments, tying political ecology models of public health with geocomputational methods and quasi-experimental policy evaluation techniques. She received the 2017 Concordium Innovation Award at AcademyHealth for her open-source visualization of Chicago determinants of health, and the “Highest Impact” award in the Prevention Category at the American College of Cardiology 2019 conference for her work in connecting chronic disease rates with social determinants of health. She serves as the Co-I and spatial analytic lead in the ETHIC project investigating the opioid epidemic in Illinois. She is the Assistant Director of Health Informatics and Lecturer in GIScience at the Center for Spatial Data Science, University of Chicago, and serves as a Public Service Intern at the Chicago Department of Public Health. Marynia additionally serves as a Health and Medical Specialty Group (AAG) board member, and chair of the Chicago Public Health GIS Network. She received her Ph.D. in Geography from ASU, M.F.A. in Writing from Roosevelt University, M.S. in GIS from Johns Hopkins University, and B.S. in Geology from the University of Illinois at Urbana-Champaign.

    Miller Prosser earned a PhD in Northwest Semitic Philology from the University of Chicago in 2010, studying under the guidance of Professor Dennis G. Pardee. His PhD thesis, “Bunušu in Ugaritian Society,” addresses the socioeconomic structure of the Late Bronze Age kingdom of Ugarit. Upon completion of his degree, Dr. Prosser began work as a research professional at the Oriental Institute’s Persepolis Fortification Archive Project. He would later take a position as a Research Database Specialist at the OCHRE Data Service of the Oriental Institute, supporting dozens of research projects using the Online Cultural and Historical Research Environment. Over the course of the last decade, Dr. Prosser has presented widely at workshops and conferences and served as a lecturer in the Digital Studies Master’s Degree program at the University of Chicago.

    Nick Feamster is Neubauer Professor in the Department of Computer Science and the College. He researches computer networking and networked systems, with a particular interest in Internet censorship, privacy, and the Internet of Things. His work on experimental networked systems and security aims to make networks easier to manage, more secure, and more available.

    Homepage

    Pedro Lopes is an Assistant Professor in Computer Science at the University of Chicago, where he leads the Human Computer Integration lab. Lopes’ research group focuses on understanding how to integrate computer interfaces with the human body, creating the interface paradigm that supersedes wearable computing. They created wearable muscle stimulation devices that enable users, for example, to manipulate a tool they have never seen before, accelerate their reaction time, read and write information without using a screen, and transform their arm into a plotter so they can solve complex problems with pen and paper. Their work is published at top-tier venues (ACM CHI, ACM UIST, Cerebral Cortex). Pedro and his students have received one Best Paper award, three Best Talk Awards and two Best Paper nominations. Their work has also captured the interest of media, such as MIT Technology Review, NBC, Discovery Channel, NewScientist, and Wired, and has been shown at Ars Electronica and the World Economic Forum. (More: https://lab.plopes.org)

    Qinyun Lin is a postdoctoral fellow at the Center for Spatial Data Science. Her research interests include sensitivity analysis, causal inference, mediation analysis, social network analysis and multi-level models. Her dissertation proposes sensitivity analysis techniques for the presence of spillover effects and heterogeneous treatment effects in multi-site randomized controlled trials. Her dissertation work also looks at an unobserved mediator as a post-treatment confounder in causal mediation analysis. Her current research applies a spatial perspective to look at access to medications for opioid use disorder and how it affects opioid-related deaths and HCV infections.

    Rajesh Sankaran received his Ph.D. in electrical engineering from Louisiana State University. His research has focused on applications of electrical and computer engineering techniques toward solving problems in related science and engineering fields.

    He plays lead technical roles in the Array of Things, WxSeNet, and Smart Windows research initiatives. He is also associated with the EcoSpec projects.

    Raul Castro Fernandez is an Assistant Professor of Computer Science at the University of Chicago. In his research he builds systems for discovering, preparing, and processing data. The goal of his research is to understand and exploit the value of data. He often uses techniques from data management, statistics, and machine learning. His main effort these days is on building platforms to support markets of data. This is part of a larger research effort on understanding the Economics of Data. He’s part of ChiData, the data systems research group at The University of Chicago.

    Homepage.

    Sandra Schloen is the Manager of the OCHRE (Online Cultural and Historical Research Environment) Data Service of the Oriental Institute at the University of Chicago. She supports all of the research projects using the software. Trained in computer science and in applications of technology to educational contexts, she applies her technical background to the challenges and complexities of representing and managing cultural and historical data in all of its digital forms.

    Sanjay Krishnan is an Assistant Professor of Computer Science. His research group studies the theory and practice of building decision systems that are robust to corrupted, missing, or otherwise uncertain data. His research brings together ideas from statistics/machine learning and database systems. His research group is currently studying systems that can analyze large amounts of video, certifiable accuracy guarantees in partially complete databases, and theoretical lower-bounds for lossy compression in relational databases.

    Homepage.

    Susan Goldin-Meadow is the Beardsley Ruml Distinguished Service Professor in the Department of Psychology and Committee on Human Development at the University of Chicago. A year spent at the Piagetian Institute in Geneva while an undergraduate at Smith College piqued her interest in the relationship between language and thought, interests she continued to pursue in her doctoral work at the University of Pennsylvania (Ph.D. 1975). At Penn and in collaboration with Lila Gleitman and Heidi Feldman, she began her studies exploring whether children who lack a (usable) model for language can nevertheless create a language with their hands. She has found that deaf children whose profound hearing losses prevent them from learning the speech that surrounds them, and whose hearing parents have not exposed them to sign, invent gesture systems which are structured in language-like ways. This interest in how the manual modality can serve the needs of communication and thinking led to her current work on the gestures that accompany speech in hearing individuals. She has found that gesture can convey substantive information – information that is often not expressed in the speech it accompanies. Gesture can thus reveal secrets of the mind to those who pay attention.

    Professor Goldin-Meadow’s research has been funded by the National Science Foundation, the Spencer Foundation, the March of Dimes, the National Institute of Child Health and Human Development, and the National Institute of Neurological and Communicative Disorders and Stroke. She has served as a member of the language review panel for NIH, has been a Member-at-Large to the Section on Linguistics and Language Science in AAAS, and was part of the Committee on Integrating the Science of Early Childhood Development sponsored by the National Research Council and the Institute of Medicine and leading to the book Neurons to Neighborhoods. She is a Fellow of AAAS, APS, and APA (Divisions 3 and 7). In 2001, she was awarded a Guggenheim Fellowship and a James McKeen Cattell Fellowship which led to her two recently published books, Resilience of Language and Hearing Gesture. In addition, she edited Language in Mind: Advances in the Study of Language and Thought in collaboration with Dedre Gentner. She has received the Burlington Northern Faculty Achievement Award for Graduate Teaching and the Llewellyn John and Harriet Manchester Quantrell Award for Excellence in Undergraduate Teaching at the University of Chicago. She is currently the President of the Cognitive Development Society and the editor of the new journal sponsored by the Society for Language Development, Language Learning and Development. Professor Goldin-Meadow also serves as chair of the developmental area program.

    Teodora engages the community of researchers involved in computational image analysis at The University of Chicago across multi-disciplinary areas including: Biology, Physics, Chemistry, Economics, Public Policy, and Cancer Research.

    She serves as a catalyst for solving challenging questions in the research teams that she supports, such as predicting oxygen support for COVID-19 patients, detecting prostate cancer, and analyzing messages related to identity in official educational settings. Teodora brings to the field practical expertise in state-of-the-art advanced technology: Supercomputers, Cloud Computing, Machine Learning, Image Analysis Techniques, and Big Data.

    Prior to joining RCC, Teodora earned her doctorate degree in Computer Science at Toulouse University in France. She won international challenges (IUS PICMUS 2016) in beamforming for ultrasound medical imaging.

    Yuxin Chen is an assistant professor at the Department of Computer Science at the University of Chicago. Previously, he was a postdoctoral scholar in Computing and Mathematical Sciences at Caltech, hosted by Prof. Yisong Yue. He received his Ph.D. degree in Computer Science from ETH Zurich, under the supervision of Prof. Andreas Krause. He is a recipient of the PIMCO Postdoctoral Fellowship in Computing and Mathematical Sciences, a Swiss National Science Foundation Early Postdoc.Mobility fellowship, and a Google European Doctoral Fellowship in Interactive Machine Learning.

    His research interest lies broadly in probabilistic reasoning and machine learning. He is currently working on developing interactive machine learning systems that involve active learning, sequential decision making, interpretable models and machine teaching. You can find more information in his Google Scholar profile.

    Homepage.

    Zhuo Zhen is a cloud computing software developer at the University of Chicago and Argonne National Laboratory. She works on the Chameleon Cloud project where she built and operated the national Chameleon testbed which supports research and innovation in cloud computing.

  • DCSL Team

    DCSL 2020 Team

    Kyle Chard is a Research Assistant Professor in the Department of Computer Science at the University of Chicago and Argonne National Laboratory. He has been Program Director of the Data & Computing Summer Lab since its first iteration under CDAC in 2019, and previously oversaw the Summer Internship Program run by the former Computation Institute.

    He received his Ph.D. in Computer Science from Victoria University of Wellington in 2011. He co-leads the Globus Labs research group which focuses on a broad range of research problems in data-intensive computing and research data management. He currently leads projects related to parallel programming in Python, scientific reproducibility, and elastic and cost-aware use of cloud infrastructure.

    Katie Rosengarten is the Administrative Specialist for the Center for Data and Computing, responsible for supporting the Center’s programming, logistical, and financial operations.

    Julia Hanson is a Ph.D. student in the Department of Computer Science studying computer security and privacy. This will be Julia’s first summer as a lab coordinator for the CDAC Summer Internship Program.

    Tyler Skluzacek is a Ph.D. student in the Department of Computer Science studying scientific data management, serverless computing, and high performance computing. This will be Tyler’s fourth summer with the CDAC Summer Internship Program.

    Jinjin Zhao is a first-year PhD student studying Computer Science at the University of Chicago. She did her undergraduate studies at Princeton University and is originally from Ottawa, Canada (eh).

    Her research interests are in databases, data systems, and deep learning. Her advisor is Sanjay Krishnan, and she is a part of the ChiData Group. She is currently working on the DeepLens project and on information leakage.

    Julia Lane is the Executive Director of the Center for Data and Computing, responsible for shaping and executing the strategic vision of CDAC, building new research partnerships and outreach strategies to foster interdisciplinary collaborations, and ensuring that the University continues to broaden applications of data science and computing approaches.

  • FAQ

    To learn more about the CDAC Summer Lab, register for the information session on Thursday Jan. 28th from 5-6pm CT. A recording of and slides from the information session will be available afterwards. For more questions, contact Katie Rosengarten at krosengarten@uchicago.edu.

    • Frequently Asked Questions
    • When is the application due?

      The 2021 application is due Sunday February 28 by 11:59pm CT. Late applications will not be considered for review.

      Please subscribe to the CDAC Mailing List to receive notifications about the 2021 program and application.

    • Where can I apply?

      The application for the 2021 Data & Computing Summer Lab program can be found here. If you have any issues accessing or submitting the form, please email Katie Rosengarten (krosengarten@uchicago.edu).

    • Are there any program prerequisites?

      We do not require any previous research experience to participate in the program. Familiarity with at least one programming language (Python, Java, C++, etc.) is preferred, as well as relevant coursework in areas such as computer science, statistics, and math.

    • When will I be notified of my application decision?

      The 2021 application is due Sunday February 28. Decision notifications will be sent out in early April 2021.

      Please subscribe to the CDAC Mailing List to receive notifications about the 2021 program and application.

    • What are the program dates?

      For the 2021 program, we offer two program dates:

      1. June 14th — August 20th
      2. June 21st — August 27th.

      On the application, you will be asked to note the program that best suits your summer schedule and school calendar. If you have any conflicts with the program dates as listed here, please indicate so on your application.

    • Where does the program take place?

      The program administration will consult the recommendations of the University of Chicago and UChicago Medicine to determine the safest format for the Summer Lab 2021 program. We successfully ran the 2020 program remotely, and are prepared to do so for 2021 if necessary.

      Whenever in-person research and academic activities resume, research assistants in the program are provided work space in the John Crerar Library, home to the Center for Data and Computing (CDAC) and the Computer Science Department. Unless otherwise agreed upon by their mentor and the Program Director (Kyle Chard), all students are expected to work in the open research space provided — the goal of which is to foster problem-solving and engagement across projects, domains, ages, and skill sets.

    • Is housing provided?

      Unfortunately, we do not provide housing as part of the program.

    • If admitted, how will I be paired with a project?

      On the application, we ask for your research areas of interest, as well as self-reported experience and expertise in relevant data science and computational skills and tools. During the application review process, we will use those self-assessments, in combination with your research goals and resume, to determine your aptitude and eligibility for available research projects.

    • How many hours a week is the program?

      While students are not required to log their hours, we expect each student to work roughly a full-time schedule each week (>37.5 hrs/wk), e.g., 8am-4pm, 9am-5pm, or 10am-6pm. Schedules should be discussed with and confirmed by program mentors.

    • What are the stipend rates for the program?

      The stipend rates for the 2021 program are below:

      • High School stipend rate: $4,875
      • Undergraduate stipend rate: $5,625
    • Are letters of recommendation required?

      We do not require letters of recommendation for the application.

    • Are international students eligible to apply?

      Yes, international students are eligible to apply. However, all students must be authorized to work in the United States and provide all necessary documentation in support of their stipend. To see the documentation required to process stipends, please consult this page. We recommend that all international students check with their home institution’s international affairs office to ensure that they qualify.

    • Is there an age limit for participant eligibility?

      The program is open to all current high school and undergraduate students who will be entering another year of their degree program after Summer Lab. Specifically, current high school freshmen through current undergraduate juniors/3rd years are eligible to apply.

  • Student Resources
    • CDAC Mailing List
      • Subscribe to receive updates about the Data & Computing Summer Lab and other student research opportunities.
    • College Center for Research & Fellowships (CCRF)
      • CCRF supports undergraduates as they pursue transformative, educational experiences through scholarly undergraduate research and nationally competitive fellowships. Visit their website to subscribe to their weekly newsletter that highlights new research opportunities.
    • Department of Computer Science Job Board
      • Available to UChicago students as a resource for internship opportunities as well as part-time and full-time positions.

    Email us at cdac@uchicago.edu with any student research-related questions.