Colloquia Archives: 2017-2018

Below is the list of talks in the computer science seminar series for the 2017-2018 academic year. If you would like to receive notices about upcoming seminars, you can subscribe to the announcement listserv.

Lecture Series Overview

Fall 2017 Colloquia

Sept 11

The Pebbling Comonad in Finite Model Theory

Samson Abramsky University of Oxford and Tulane University

Abstract: Pebble games are a powerful tool in the study of finite model theory, constraint satisfaction and database theory. Monads and comonads are basic notions of category theory which are widely used in semantics of computation and in modern functional programming. We show that existential k-pebble games have a natural comonadic formulation. Winning strategies for Duplicator in the k-pebble game for structures A and B are equivalent to morphisms from A to B in the coKleisli category for this comonad. This leads on to comonadic characterisations of a number of central concepts in Finite Model Theory. These results lay the basis for some new and promising connections between two areas within logic in computer science which have largely been disjoint: (1) finite and algorithmic model theory, and (2) semantics and categorical structures of computation.

About the Speaker: Samson Abramsky is the Christopher Strachey Professor of Computing at the University of Oxford. He also is a Visiting Research Professor at Tulane, as part of the MURI project on Semantics and Tools for Higher Order Functional Quantum Programming Languages. Samson has worked in a wide range of areas in the semantics and logic of computation, including concurrency, domain theory (especially domain theory in logical form), lambda calculus, semantics of programming languages, and abstract interpretation and program analysis. He played a leading role in the development of game semantics and its applications to the semantics of programming languages, in interaction categories, and in geometry of interaction, and connections with traced monoidal categories and realizability. He and five colleagues just received the 2017 Church Award for their work in game semantics and its application to programming language semantics. He also has been a leader in establishing categorical models of quantum mechanics, and their application to quantum computing and quantum information. Samson is a Fellow of the Royal Society.
Sept 25

Modeling Spatial Auditory Attention Using Soft Constraints

Jaelle Scheuerman Tulane University

Interdisciplinary Project Presentation

Abstract: In this talk, the speaker will present an interdisciplinary effort to develop a computational model of spatial auditory attention. Spatial attention has been the focus of research in cognitive models, though much of the work has focused on studying visual attention. This model uses soft constraints and the well-established framework of constraint satisfaction problems to model how auditory attention is allocated over space. It includes three main components: a goal map, saliency map and priority map. The goal map represents attention that is allocated by choice (top-down attention). The saliency map models the distribution of salient (bottom-up) attention. Finally, a priority map combines these two maps to model the total distribution of attentional bias. The model's constraint-based approach is very flexible in terms of embedding and testing different hypotheses. The model was shown to be successful in modeling behavioral data of experiments where there is a single attended location. The framework was then extended to encompass scenarios where there may be multiple attended locations. Using the parameters learned by fitting behavioral data from a single attended location task, the model made predictions about a task where sounds are presented at multiple locations with equal probability. These predictions fit the experimental data well and provide a first example of how the computational model can be used as a predictor.
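
As a rough illustration of how a priority map might combine a goal map and a saliency map, the sketch below uses a weighted sum over a row of azimuth locations. The Gaussian map shapes, the `goal_weight` parameter, the weighted-sum rule, and all function names are assumptions made for this example; the abstract does not specify the combination rule, and the speaker's soft-constraint formulation may differ.

```python
import numpy as np

def gaussian_map(locations, center, width):
    """Attention map peaked at `center` (degrees), normalized to sum to 1."""
    m = np.exp(-0.5 * ((locations - center) / width) ** 2)
    return m / m.sum()

def priority_map(goal, saliency, goal_weight=0.7):
    """Combine top-down and bottom-up maps (assumed rule: weighted sum)."""
    combined = goal_weight * goal + (1.0 - goal_weight) * saliency
    return combined / combined.sum()

# Hypothetical setup: 13 speaker positions from -90 to +90 degrees azimuth.
locations = np.linspace(-90.0, 90.0, 13)
goal = gaussian_map(locations, center=0.0, width=30.0)       # listener attends straight ahead
saliency = gaussian_map(locations, center=60.0, width=15.0)  # a salient sound off to the right
priority = priority_map(goal, saliency)
```

Changing `goal_weight` shifts the balance between chosen (top-down) and stimulus-driven (bottom-up) attention, which is one simple way such a model could be fitted to behavioral data.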

About the Speaker: Jaelle Scheuerman is a 3rd year PhD student in the Department of Computer Science. Her research interests include artificial intelligence, cognitive models and cognitive architectures. She is particularly interested in applications of preferences and constraints to models of attention and decision making.
Oct 23

Building Scalable Machine Learning Solutions for Data Curation

Ihab Ilyas University of Waterloo

Abstract: Machine learning tools promise to help solve data curation problems. While the principles are well understood, the engineering details in configuring and deploying ML techniques are the biggest hurdle. In this talk I discuss why leveraging data semantics and domain-specific knowledge is key in delivering the optimizations necessary for truly scalable ML curation solutions. The talk focuses on two main problems: (1) entity consolidation, which is arguably the most difficult data curation challenge because it is notoriously complex and hard to scale; and (2) using probabilistic inference to suggest data repair for identified errors and anomalies using our new system called HoloClean. Both problems have been challenging researchers and practitioners for decades due to the fundamentally combinatorial explosion in the space of solutions and the lack of ground truth. There’s a large body of work on this problem by both academia and industry. Techniques have included human curation, rules-based systems, and automatic discovery of clusters using predefined thresholds on record similarity. Unfortunately, none of these techniques alone has been able to provide sufficient accuracy and scalability. The talk aims at providing deeper insight into the entity consolidation and data repair problems and discusses how machine learning, human expertise, and problem semantics collectively can deliver a scalable, high-accuracy solution.

About the Speaker: Ihab Ilyas is a professor in the Cheriton School of Computer Science at the University of Waterloo, where his main research focuses on the areas of big data and database systems, with special interest in data quality and integration, managing uncertain data, rank-aware query processing, and information extraction. Ihab is also a co-founder of Tamr, a startup focusing on large-scale data integration and cleaning. He is a recipient of the Ontario Early Researcher Award (2009), a Cheriton Faculty Fellowship (2013), an NSERC Discovery Accelerator Award (2014), and a Google Faculty Award (2014), and he is an ACM Distinguished Scientist. Ihab is an elected member of the VLDB Endowment board of trustees, elected SIGMOD vice chair, and an associate editor of the ACM Transactions on Database Systems (TODS). He holds a PhD in computer science from Purdue University, West Lafayette.
Nov 13

Clustering Correctly

Justin Eldridge Ohio State University

Abstract: Clustering is an important machine learning task whose goal is to identify the natural groups (or "clusters") in data. Given a data set, what is the correct clustering? There is no single answer to this seemingly simple question. In a statistical setting, however, where it is assumed that the data are sampled from an underlying probability distribution, it is natural to define the clusters of the distribution itself. We then say that a clustering method is "correct" if its output converges in some sense to the ideal clustering of the distribution as the size of the data grows. In this talk, I discuss the correctness of clustering methods in two settings: first, when the data are points sampled from a probability density, and second, when the data are graphs generated from a graphon -- a powerful non-parametric random graph model of much recent interest. In each case, I identify the natural hierarchical cluster structure of the distribution, formalize a strong notion of convergence to the tree of the ideal clustering, and construct efficient clustering methods which converge -- and are therefore "correct".

About the Speaker: Justin Eldridge is a Ph.D. student in the Department of Computer Science and Engineering at The Ohio State University, advised by Mikhail Belkin and Yusu Wang. His research interests focus on the foundations of learning structure from unlabeled data. Justin's work with Drs. Belkin and Wang on the statistical consistency of clustering received the best student paper award at COLT 2015 and a full oral presentation at NIPS 2016. Earlier this year, Justin was a graduate visitor at the Simons Institute for the Theory of Computation at Berkeley, and he is currently a Presidential Fellow at Ohio State.
Dec 1

User-Centric Data Management for Fun, Profit, and the Common Good

Alexandros Labrinidis University of Pittsburgh

Abstract: Big data is transforming all aspects of the human experience, be it everyday life, scientific exploration and discovery, medicine, business, law, journalism, and decision-making at all levels of government. The majority of big data management research emphasizes the systems point-of-view, which focusses on efficiency and scale. This talk will showcase multiple ways in which we consider the user point-of-view for big data management, and demonstrate its benefits. We will show how user preferences can positively influence resource allocation decisions, especially in overload situations. This is true both for traditional (static) data and for streaming data processing systems. We will conclude with new research directions that are being developed as part of new urban informatics research projects.

About the Speaker: Dr. Alexandros Labrinidis is a Professor of Computer Science at the University of Pittsburgh. He joined the Department of Computer Science in 2002, right after receiving his PhD in Computer Science from the University of Maryland, College Park. He is the co-director of the Advanced Data Management Technologies Laboratory and has an adjunct professor appointment with Carnegie Mellon University. Dr. Labrinidis' research focuses on user-centric (big) data management for scalable network-centric applications, including web-databases, data stream management systems, urban informatics, sensor networks, internet of things, and scientific data management. He has published over 100 papers at peer-reviewed journals, conferences, and workshops and was the recipient of an NSF CAREER award in 2008. Dr. Labrinidis served as the Secretary/Treasurer for ACM SIGMOD and as the Editor of SIGMOD Record. He is currently the founding Editor of the Systems/Applications Track of the Parallel and Distributed Databases Journal and an Associate Editor for the VLDB Journal. He has also served on numerous program committees of international conferences/workshops.

Spring 2018 Colloquia

Jan 12

Pleaching Pencil-&-Paper Picture Puzzles

Maarten Löffler Utrecht University

This event will be held on Friday, 1/12/2018, from 2:00 - 3:00 p.m. in Stanley Thomas, Room 302. Please note the special weekday and time for this event.

Abstract: Pencil-and-paper puzzles (e.g., Sudoku) are a popular pastime for both children and adults. The main appeal lies in the logical solving process, but in some genres the puzzler is additionally rewarded when the solved puzzle reveals a picture (e.g., Nonograms). We introduce free-form variants of classic puzzle genres containing non-rectilinear or even curved elements. We study the underlying geometry: what constraints are there on the shapes and location of puzzle elements? How can we measure aspects of puzzles like solvability, difficulty, originality, fun, etc.? Finally, we use these geometric properties to develop automatic generators of puzzles: you draw a picture, and the system gives you a puzzle that solves to that picture.

About the Speaker: Maarten Löffler is an assistant professor at Utrecht University, working in computational geometry and graph drawing. He obtained his PhD in 2009 on geometric uncertainty, and afterwards spent two years at the University of California in Irvine. He has a passion for pencil-and-paper puzzles, participating in the World Championship in 2003 and 2006.
Feb 6

Theory and Data for Better Decisions

Nicholas Mattei IBM Research - AI, T.J. Watson Research Center

This event will be held on Tuesday, 2/6/2018, from 3:30 - 4:30 p.m. in Stanley Thomas, Room 302. Please note the special weekday and time for this event.

Abstract: The internet and modern technology enable us to communicate and interact at lightning speed and across vast distances. The communities and markets created by this technology must make collective decisions, allocate scarce resources, and understand each other quickly, efficiently, and often in the presence of noisy communication channels, ever changing environments, and/or adversarial data. Many theoretical results in these areas are grounded on worst case assumptions about agent behavior or the availability of resources. Transitioning theoretical results into practice requires data driven analysis and experiment as well as novel theory with assumptions based on real world data. I'll discuss recent work that focuses on creating novel algorithms for resource allocation with applications ranging from reviewer matching to deceased organ allocation. These projects require novel algorithms and leverage data to perform detailed experiments and to create open source tools.

About the Speaker: Nicholas Mattei is a Research Staff Member with IBM Research AI - Reasoning Group at the IBM T.J. Watson Research Laboratory. His research is in artificial intelligence (AI) and its applications, largely motivated by problems that require a blend of techniques to develop systems and algorithms that support decision making for autonomous agents and/or humans. Most of his projects leverage theory, data, and experiment to create novel algorithms, mechanisms, and systems that enable and support individual and group decision making. He is the founder and maintainer of PrefLib: A Library for Preferences and the associated PrefLib:Tools, available on GitHub, and is the founder and co-chair of the Exploring Beyond the Worst Case in Computational Social Choice workshop (2014 - 2017), held at AAMAS. Nicholas was formerly a senior researcher working with Prof. Toby Walsh in the AI & Algorithmic Decision Theory Group at Data61 (formerly known as the Optimisation Group at NICTA), a lecturer in the School of Computer Science and Engineering (CSE), and member of the Algorithms Group at the University of New South Wales. He previously worked as a programmer and embedded electronics designer for nano-satellites at NASA Ames Research Center. He received his Ph.D. from the University of Kentucky under the supervision of Prof. Judy Goldsmith in 2012.
Feb 15

Machine Learning-Enhanced Visualization

Matthew Berger University of Arizona

This event will be held on Thursday, 2/15/2018, from 3:30 - 4:30 p.m. in Stanley Thomas, Room 302. Please note the special weekday and time for this event.

Abstract: Visualization is indispensable for exploratory data analysis, enabling people to interact with and make sense of data. Interaction is key for effective exploration, and is dependent on two main factors: how data is represented, and how data is visually encoded. For instance, text data may be represented as a 2D spatialization and visually encoded through graphical marks, color, and size. Typically, however, these factors do not anticipate how a user will interact with the data, which limits the set of interactions one may perform in data exploration. In this talk I will focus on how machine learning can be used to improve data representations and visual encodings for user interaction. My research is centered on building machine learning models when visualization, and in particular how a user interacts with data, is the primary objective. I will first discuss how to learn data representations for the purpose of interactive document exploration. I will demonstrate how compositional properties of neural language models, built from large amounts of text data, empower the user to semantically explore document collections. Secondly, I will show how to learn visual encodings for the purpose of exploring volumetric data. Deep generative models are used to learn the distribution of outputs produced from a volume renderer, providing the user both guidance and intuitive interfaces for volume exploration.

About the Speaker: Matthew Berger is a postdoctoral research associate in the Department of Computer Science at the University of Arizona, advised by Joshua A. Levine. Previously he was a research scientist with the Air Force Research Laboratory. He received his PhD in Computing from the University of Utah in 2013, advised by Claudio T. Silva. His research interests are at the intersection of machine learning and data visualization, focusing on the development of visualization techniques that are driven by machine learning models.
Feb 19

Predictive Modeling of Drug Effects: Learning from Biomedical Knowledge and Clinical Records

Ping Zhang IBM T. J. Watson Research Center

Abstract: Drug discovery is a time-consuming and laborious process. Lack of efficacy and safety issues are the two major reasons for which a drug fails clinical trials, each accounting for around 30% of failures. By leveraging the diversity of available molecular and clinical data, predictive modeling of drug effects could lead to a reduction in the attrition rate in drug development. In this talk, I will introduce my recent work on machine-learning techniques for analyzing and predicting clinical drug responses (i.e., efficacy and safety), including: 1) integrating multiple drug/disease similarity networks via joint matrix factorization to infer novel drug indications; and 2) revealing previously unknown effects of drugs, identified from electronic health records and drug information, on laboratory test results. Experimental results demonstrate the effectiveness of these methods and show that predictive models could serve as a useful tool to generate hypotheses on drug efficacy and safety profiles.

About the Speaker: Ping Zhang is a Research Staff Member at the Center for Computational Health, IBM T. J. Watson Research Center. His research focuses on machine learning, data mining, and their applications to biomedical informatics and computational medicine. Dr. Zhang received his PhD in Computer and Information Sciences from Temple University in 2012. He has published more than 35 peer-reviewed scientific articles in top journals and conferences (e.g., Nucleic Acids Research, BMC Bioinformatics, Journal of the American Medical Informatics Association, KDD, AAAI, ECML, SDM, and CIKM) and filed more than 15 patent applications. Dr. Zhang has served on the program committees of leading international conferences, including KDD, IJCAI, UAI, and AMIA, and on the editorial boards of CPT: Pharmacometrics & Systems Pharmacology and Journal of Healthcare Informatics Research. He won the best in-use/industrial paper award for ESWC 2016 and received a Marco Ramoni Distinguished Paper nomination at AMIA Summits 2014.
Feb 28

From Consensus Clustering to K-means Clustering

Hongfu Liu Northeastern University

This event will be held on Wednesday, 2/28/2018, from 3:00 - 4:00 p.m. in Stanley Thomas, Room 302. Please note the special weekday and time for this event.

Abstract: Consensus clustering aims to find a single partition that agrees as much as possible with existing basic partitions, and it has emerged as a promising way to find cluster structures in heterogeneous data. It has been widely recognized that consensus clustering is effective at generating robust clustering results, detecting bizarre clusters, handling noise, outliers and sample variations, and integrating solutions from multiple distributed sources of data or attributes. Unlike traditional clustering methods, which operate directly on the data matrix, consensus clustering takes a set of diverse basic partitions as its input. Therefore, consensus clustering is in essence a fusion problem rather than a traditional clustering problem. In this talk, I will introduce the categories of consensus clustering; illustrate K-means-based Consensus Clustering (KCC), which exactly transforms the consensus clustering problem into a (weighted) K-means clustering problem with theoretical support; discuss some key factors that affect consensus clustering; and extend KCC to Fuzzy C-means Consensus Clustering. The talk also covers how to employ consensus clustering for heterogeneous, multi-view, incomplete and big data clustering. Derived from consensus clustering, a partition-level constraint is proposed as new side information for semi-supervised clustering. Along this line, several interesting applications based on the partition-level constraint, such as feature selection, domain adaptation, and gene stratification, are presented to demonstrate the extensibility of consensus clustering. Some code is available for practical use.
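
To give a feel for how a consensus problem can be recast as K-means, the sketch below one-hot encodes each basic partition, concatenates the encodings into a binary matrix, and runs plain (unweighted) K-means with squared Euclidean distance on it. This is only one simple instance under stated assumptions: the function names, the farthest-first initialization, and the toy partitions are made up for illustration, and the weighting and utility functions used in KCC itself are not reproduced here.

```python
import numpy as np

def one_hot_partitions(partitions):
    """Stack one-hot encodings of each basic partition side by side."""
    blocks = []
    for labels in partitions:
        labels = np.asarray(labels)
        classes = np.unique(labels)
        blocks.append((labels[:, None] == classes[None, :]).astype(float))
    return np.hstack(blocks)

def farthest_first_centers(X, k):
    """Deterministic initialization: start at row 0, then add the farthest row."""
    centers = [X[0]]
    while len(centers) < k:
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(d.argmax())])
    return np.array(centers, dtype=float)

def consensus_kmeans(partitions, k, n_iter=100):
    """Run Lloyd's K-means on the binary matrix built from the basic partitions."""
    X = one_hot_partitions(partitions)
    centers = farthest_first_centers(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Three hypothetical basic partitions of six objects; the second one
# disagrees with the others about object 2, and label names differ.
basic_partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
consensus = consensus_kmeans(basic_partitions, k=2)
```

Note that the input is only the set of partitions, never the original data matrix, which matches the "fusion problem" view in the abstract.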

About the Speaker: Hongfu Liu is a final-year Ph.D. candidate in the Department of Electrical & Computer Engineering, Northeastern University (NEU), supervised by Prof. Yun (Raymond) Fu. Before joining NEU, he received his bachelor's and master's degrees in management from Beihang University, in 2011 and 2014, respectively, working with Prof. Junjie Wu. His research interests generally focus on data mining and machine learning, with a special interest in ensemble learning.
Mar 14

Mining in Big Networks with Incomplete Information

Xiang Li University of Florida

This event will be held on Wednesday, 3/14/2018, from 3:00 - 4:00 p.m. in Stanley Thomas, Room 302. Please note the special weekday for this event.

Abstract: Modern networking systems such as the Internet of Things and online social networks (OSNs) have grown to billion scale, with billions of nodes and edges. At the same time, most network analyses are conducted on incomplete information because 1) collecting network data is expensive and time-consuming; and 2) security and privacy concerns have made users limit the information they expose publicly. With such big and incomplete data, mining billion-scale networks becomes very challenging. In this talk, the speaker addresses these problems via two primary approaches: 1) developing novel dynamic sampling techniques with performance-bound guarantees to perform big data mining exactly in billion-scale networks; and 2) developing active learning methods to support more accurate decisions, where decisions are made based on the outcomes of previous decisions. To confirm the practical use of these mathematical techniques, she develops the first almost exact solution to an NP-complete problem, Maximum-Influence, which can run on Twitter within 3 hours. The results on active learning are used not only to unveil the near-optimal attack strategies of APTO, one of the most persistent attacks in OSNs, but also to advance the research front of adaptive stochastic optimization, including active learning with batches, adaptive theory under matroid intersection, and the first tool to analyze greedy algorithms for optimization problems where the objective function is non-submodular. Towards the end of the talk, the speaker will summarize her other research results in the field of cyber-physical systems and discuss her future research agenda.

About the Speaker: Xiang Li is currently a UF Informatics Institute Fellow and a Ph.D. candidate in the Computer and Information Science and Engineering department, University of Florida, expected to graduate in May 2018. Her research interests lie at the intersection of cyber-security, data science, and highly scalable algorithms with applications in complex networking systems. More specifically, the focus of her research is to develop models, theories, and approximation algorithms for many computationally hard problems in Online Social Networks, Device-to-Device Networks, Internet of Things, and Smart Grids. She has published 21 articles in various prestigious journals and conferences such as IEEE Transactions on Mobile Computing, IEEE Transactions on Smart Grid, IEEE INFOCOM, and IEEE ICDM, including one Best Paper Award at IEEE MSN 2014 and a Best Paper Nominee at IEEE ICDCS 2017. Xiang has served as a reviewer for several journals such as IEEE Transactions on Parallel and Distributed Systems, Journal of Combinatorial Optimization, and IEEE Transactions on Network Science and Engineering. She is a recipient of many awards, including UF Graduate School Fellowship, UF Informatics Institute Fellowship, UF Outstanding Merits Award, Travel Grant Awards at IEEE INFOCOM, IEEE ICDCS, IEEE ICDM, and IEEE INFOCOM Best-in-Section-Presentation Award.
Mar 19

Advancing Big Neuroimaging Data Analysis for Precision Diagnostics

Aristeidis Sotiras University of Pennsylvania

Abstract: Modern neurotechnologies produce massive, complex imaging data from multiple modalities that reflect brain structure and function in disease and health, leading neuroimaging to the “big data” era. Big data provides unprecedented opportunities to develop computational approaches that can deliver personalized, quantitative disease indexes of diagnostic and prognostic value. Such biomarkers have the potential to quantify the risk of developing a disease, track the disease progression or the effect of pharmacological interventions in clinical trials, and deliver patient specific diagnosis before measurable clinical effects occur. However, big data analyses also pose important challenges. Specifically, i) the high dimensionality of the data may hinder the extraction of interpretable and reproducible information; while ii) heterogeneity, which is increasingly recognized as a key feature of brain diseases, limits the use of current analytical tools. In this talk, I will discuss novel computational approaches that leverage advanced machine learning techniques to address these challenges. First, I will describe an unsupervised multivariate analysis technique based on non-negative matrix factorization that optimally summarizes high dimensional neuroimaging data through a set of highly interpretable and reproducible imaging patterns. Second, I will discuss a semi-supervised multivariate machine learning technique that aims to reveal disease heterogeneity by jointly performing disease classification and clustering of disease sub-groups. Applications of these approaches in diverse settings will be discussed to highlight their broad impact as well as their role in future directions toward precision medicine.
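
For readers unfamiliar with non-negative matrix factorization, the sketch below factors a small non-negative matrix into non-negative parts using the classic multiplicative update rules. It is a generic illustration only: the abstract does not say which NMF variant or optimizer is used, and the toy "voxels by subjects" data, the function name `nmf`, and all parameter values are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative V (features x samples) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data: 20 "voxels" x 8 "subjects" generated from 2 hidden non-negative patterns.
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 8))

W, H = nmf(V, rank=2)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each column of `W` plays the role of a spatial pattern and each row of `H` gives per-subject loadings; because nothing can be subtracted, the patterns tend to be parts-based, which is the usual argument for their interpretability.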

About the Speaker: Aristeidis Sotiras, Ph.D., is a Research Associate in the Center for Biomedical Image Computing and Analytics (CBICA) at the Radiology Department of the University of Pennsylvania, where he works with Prof. Davatzikos on multivariate pattern techniques for quantitative image analysis. Dr. Sotiras received his Ph.D. in Applied Mathematics from École Centrale Paris, where he worked on probabilistic graphical models for deformable image registration under the supervision of Prof. Paragios. His research interests are at the intersection of medical image analysis, machine learning, and computational neuroscience. He focuses on developing novel computational methods to extract information from brain images and delineate patterns in large heterogeneous data sets. The overarching goal of his work is to improve patient-specific diagnosis and advance our understanding of brain structure and function in health and disease.
Apr 16

Computer Science in Firearm Forensics: Development of a Novel 3D-Topography Imaging and Analysis System

Ryan Lilien Cadre Research Labs

Abstract: The talk will describe our work developing an accurate, fast, and low-cost 3D imaging and analysis system for firearm forensics. As portrayed in the movies, ammunition fired through a single firearm will pick up microscopic imperfections (i.e., toolmarks) unique to that firearm. Microscopic examination of these marks allows firearm examiners to assess the likelihood of common origin (e.g., linking a cartridge case found at a crime scene to a test fire from a suspect’s firearm). At Cadre, we are developing a novel 3D scanning and analysis system for cartridge cases. Our system has been validated and is now in use at the main FBI firearm and toolmark laboratory in Quantico, VA. After presenting a high-level introduction to firearm forensics, I will describe our photometric stereo-based 3D scanning system. I will then discuss imaging and matching performance on several large test sets of real-world cartridge cases where our approach achieves approximately 80% recall with no false positives. Finally, I will introduce the emerging application area of virtual comparison microscopy (VCM) and present validation work which supports the hypothesis that VCM is at least as good as, if not better than, traditional comparison microscopy. The application of novel 3D imaging and analysis methods to the field of firearm forensics has the potential to greatly impact the criminal justice system. The work is supported in part by research grants from the US National Institute of Justice.

About the Speaker: Ryan Lilien, MD/PhD. Dr. Lilien’s research focuses on the use of advanced computational methods to provide collaborating scientists informational leverage in solving their research problems. Ryan earned an MD/Ph.D. from Dartmouth Medical School and Dartmouth’s Department of Computer Science. He then joined the faculty at the University of Toronto, where he was cross-appointed between the Department of Computer Science and the Centre for Cellular and Biomolecular Research in the Faculty of Medicine. The Gates Foundation recognized Ryan’s research with a prestigious Grand-Challenges Grant for his ongoing work in Drug Discovery. Ryan is now located in Chicago where he leads Cadre's development of the TopMatch system for firearm toolmark imaging and analysis. He serves on NIST's federal standards committee for forensic science (the OSAC) and has published in the fields of Computer Science, Machine Learning, Image Analysis, Drug Discovery, Molecular Modeling, Protein Engineering, Search-and-Optimization, and Forensics.

303 Stanley Thomas Hall, New Orleans, LA 70118 504-865-5785