(Tentative Schedule)

Time & Location: All talks are on Thursdays in Dinwiddie 102 at 3:30 pm unless otherwise noted. Refreshments in Gibson 426 after the talk.

Comments indicating vacations, special lectures, or changes in location or time are noted with the corresponding talks.

Organizer: Gustavo Didier

**Abstract**:

In modern applications of high-throughput sequencing technologies, researchers may be interested in quantifying the molecular diversity of a sample (e.g. T-cell repertoire, transcriptional diversity, or microbial species diversity). In these sampling-based technologies, an important detail is often overlooked in the analysis of the data and the design of the experiments: the sampled observations often do not give a fully representative picture of the underlying population. This has long been a recognized problem in statistical ecology and in the broader statistics literature, and is commonly known as the missing species problem.

In classical settings, the size of the sample is usually small. New technologies such as high-throughput sequencing have allowed for the sampling of extremely large and heterogeneous populations at scales not previously attainable or even considered. New algorithms are required that take advantage of the scale of the data to account for heterogeneity, but that are also sufficiently fast and scale well with the size of the data. I will discuss a moment-based approach for estimating the missing species based on an extension of Chao's moment-based lower bound (Chao, 1984). We apply results from the classical moment problem to show that solutions can be obtained efficiently, allowing for estimators that are simultaneously conservative and use more information. By connecting the rich theory of the classical moment problem to the missing species problem we can also clear up issues in the identifiability of the missing species.
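For context, the classical Chao (1984) lower bound that the talk builds on is simple to compute from the frequency counts of a sample. The sketch below is my own illustration of that classical estimator, $\hat{S} = S_{obs} + f_1^2/(2 f_2)$, not the speaker's moment-based extension; the function name `chao1` is the conventional name for this estimator, not taken from the talk.

```python
from collections import Counter

def chao1(abundances):
    """Chao (1984) lower bound on total species richness.

    abundances: observed per-species counts (species with count 0 excluded).
    """
    s_obs = sum(1 for c in abundances if c > 0)   # observed species
    f1 = sum(1 for c in abundances if c == 1)     # singletons
    f2 = sum(1 for c in abundances if c == 2)     # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # bias-corrected variant when no doubletons are observed
    return s_obs + f1 * (f1 - 1) / 2

# Toy sample: 6 observed species, 3 singletons, 2 doubletons
sample = ["a", "a", "a", "b", "b", "c", "c", "d", "e", "f"]
print(chao1(list(Counter(sample).values())))  # 6 + 9/4 = 8.25
```

The estimate is deliberately conservative: it bounds the number of species the sample is likely missing, which is the quantity the moment-based extensions in the talk aim to sharpen.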

**Abstract**:

The (semiparametric) Gaussian copula model consists of distributions that have dependence structure described by Gaussian copulas but arbitrary marginals. A Gaussian copula is in turn determined by a Euclidean parameter $R$ called the copula correlation matrix. In this talk we study the normal scores (rank correlation coefficient) estimator of $R$, also known as the van der Waerden coefficient, in high dimensions. It is well known that in fixed dimensions the normal scores estimator is the optimal estimator of $R$, i.e., it has the smallest asymptotic covariance. Curiously, though, in high dimensions the preferred estimators of $R$ are nowadays usually based on Kendall's tau or Spearman's rho. We show that the normal scores estimator in fact remains the optimal estimator of $R$ in high dimensions. More specifically, we show that the approximate linearity of the normal scores estimator in the efficient influence function, which in fixed dimensions implies its optimality, holds in high dimensions as well.
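As a concrete reference point, the bivariate normal scores (van der Waerden) coefficient can be sketched in a few lines: rank each sample, map the ranks through the standard normal quantile function $\Phi^{-1}(R_i/(n+1))$, and take the Pearson correlation of the resulting scores. This is a minimal illustration of the classical estimator, not the high-dimensional matrix estimator studied in the talk; the helper names are my own.

```python
from statistics import NormalDist

def ranks(x):
    """Ranks 1..n (assumes no ties, as for continuous data)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rk, i in enumerate(order, start=1):
        r[i] = rk
    return r

def normal_scores_corr(x, y):
    """Van der Waerden coefficient: Pearson correlation of the
    normal scores Phi^{-1}(R_i / (n + 1)) of the ranks."""
    n = len(x)
    inv = NormalDist().inv_cdf
    sx = [inv(r / (n + 1)) for r in ranks(x)]
    sy = [inv(r / (n + 1)) for r in ranks(y)]
    mx, my = sum(sx) / n, sum(sy) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(sx, sy))
    vx = sum((a - mx) ** 2 for a in sx)
    vy = sum((b - my) ** 2 for b in sy)
    return cov / (vx * vy) ** 0.5

print(normal_scores_corr([1.2, 0.3, 2.5, 1.9], [3.1, 0.7, 5.0, 4.2]))  # 1.0 (comonotone data)
```

Applying the same recipe to every pair of coordinates yields an estimate of the full copula correlation matrix $R$.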

Minh Tang, Johns Hopkins

**Abstract**:

We prove a central limit theorem for the components of the eigenvectors corresponding to the d largest eigenvalues of the normalized Laplacian matrix of a finite dimensional random dot product graph. As a corollary, we show that for stochastic blockmodel graphs, the rows of the spectral embedding of the normalized Laplacian converge to multivariate normals and furthermore the mean and the covariance matrix of each row are functions of the associated vertex's block membership. Together with prior results for the eigenvectors of the adjacency matrix, we then compare, via the Chernoff information between multivariate normal distributions, how the choice of embedding method impacts subsequent inference. We demonstrate that neither embedding method dominates with respect to the inference task of recovering the latent block assignments.

**Location:** Gibson Hall 126A

**Time:** 3:30


**Abstract**:

Forecasting the trajectory of social dynamic processes such as the spread of infectious diseases poses significant challenges that call for methods that account for data and model uncertainty. Here we introduce a frequentist computational bootstrap approach that weights the uncertainty derived from a set of plausible models to build an ensemble model for sequential forecasting. The power and transparency of this approach is illustrated in the context of simple dynamic differential-equation models, which we confront with the trajectories of real and simulated outbreak data. For illustration, we generate sequential short-term ensemble forecasts of epidemic outbreaks by combining the strengths of phenomenological models that incorporate flexible epidemic growth scaling, namely the Generalized-Growth Model (GGM) and the Generalized Logistic Model (GLM). With our ensemble approach, we also address lessons from the Ebola forecasting challenge, particularly with a focus on improving short-term forecasts of outbreaks that may involve a temporary downturn in case incidence.
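The ensemble-weighting step can be sketched independently of any particular model fit. Below, two hypothetical fitted models (stand-ins for the GGM and GLM) are weighted by inverse mean squared in-sample error, one simple choice of plausibility weight; the talk's bootstrap approach derives its weights differently, so treat this purely as an illustration of combining forecasts.

```python
def inverse_mse_weights(errors_by_model):
    """Weight each plausible model by inverse mean squared in-sample error
    (a simple stand-in for bootstrap-derived model weights)."""
    inv = [1.0 / (sum(e * e for e in errs) / len(errs)) for errs in errors_by_model]
    total = sum(inv)
    return [v / total for v in inv]

def ensemble_forecast(forecasts, weights):
    """Pointwise weighted average of the individual model forecasts."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]

# Hypothetical one-step-ahead residuals for a GGM-like and a GLM-like fit
w = inverse_mse_weights([[1.0, -2.0, 1.0], [0.5, 0.5, -1.0]])
# The ensemble leans toward the lower-error model
print(ensemble_forecast([[10, 12, 15], [8, 9, 10]], w))
```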

**Abstract**:

This talk will illustrate some patterns in the homology of the configuration space F_k(M), the space of ordered k-tuples of distinct points in a manifold M. For a fixed manifold M, as k increases, we might expect the topology of these configuration spaces to become increasingly complicated. Church and others showed, however, that when M is connected and open, there is a representation-theoretic sense in which the homology groups of these spaces stabilize. In this talk I will explain these stability patterns, and describe higher-order stability phenomena -- relationships between unstable homology classes in different degrees -- established in joint work with Jeremy Miller. This project was inspired by work-in-progress of Galatius--Kupers--Randal-Williams.

Tristan Buckmaster, Princeton (Host: Glatt-Holtz)

**Abstract**:

Stochastic processing networks arise as models in manufacturing, telecommunications, transportation, computer systems, the customer service industry, and biochemical reaction networks. Common characteristics of these networks are that they have entities (jobs, packets, vehicles, customers, or molecules) that move along routes, wait in buffers, receive processing from various resources, and are subject to the effects of stochastic variability through such quantities as arrival times, processing times, and routing protocols. The mathematical theory of queueing aims to understand, analyze, and control congestion in stochastic processing networks. In this talk, we will review some of the major developments in the last century with more emphasis on some common approximations used in the last couple of decades. In particular, we will discuss broad results for control of large networks as well as more detailed results for control of specific smaller networks, under heavy traffic approximations.
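As a baseline for the heavy-traffic discussion, the classical closed-form metrics for the single-server M/M/1 queue already show how congestion explodes as utilization approaches one. This is a textbook illustration, not material from the talk itself.

```python
def mm1_metrics(lam, mu):
    """Steady-state metrics for the single-server M/M/1 queue
    with arrival rate lam and service rate mu."""
    if lam >= mu:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = lam / mu           # server utilization (traffic intensity)
    L = rho / (1 - rho)      # mean number in system (Little's law: L = lam * W)
    W = 1 / (mu - lam)       # mean time in system
    return {"utilization": rho, "mean_in_system": L, "mean_wait": W}

# Congestion blows up as the system approaches heavy traffic (rho -> 1):
for lam in (5.0, 9.0, 9.9):
    print(lam, mm1_metrics(lam, 10.0)["mean_in_system"])
```

The heavy-traffic approximations mentioned in the talk study networks precisely in this regime, where exact analysis of the full network is intractable.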

**Location:** Richardson Building 204

**Time:** 1:30

**Abstract**:

A common theme in enumerative combinatorics is counting functions that turn out to be polynomials. For example, one proves in any introductory graph theory course that the number of proper k-colorings of a given graph G is a polynomial in k, the chromatic polynomial of G. Combinatorics is abundant with polynomials that count something when evaluated at positive integers, and many of these polynomials have a (completely different) interpretation when evaluated at negative integers: these instances go by the name of combinatorial reciprocity theorems. For example, when we evaluate the chromatic polynomial of G at -1, we obtain (up to a sign) the number of acyclic orientations of G, that is, those orientations of G that do not contain a coherently oriented cycle. Combinatorial reciprocity theorems appear all over combinatorics. This talk will attempt to show some of the charm (and usefulness!) these theorems exhibit. Our goal is to weave a unifying thread through various combinatorial reciprocity theorems by looking at them through the lens of geometry.
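The chromatic reciprocity just described can be checked by brute force on small graphs. The sketch below (helper names are my own) counts proper k-colorings and acyclic orientations directly, and verifies for the triangle, whose chromatic polynomial is k(k-1)(k-2), that |P(-1)| = 6 equals the number of acyclic orientations.

```python
from itertools import product

def count_colorings(vertices, edges, k):
    """Number of proper k-colorings of the graph, by brute force."""
    return sum(
        all(c[u] != c[v] for u, v in edges)
        for c in (dict(zip(vertices, cs))
                  for cs in product(range(k), repeat=len(vertices)))
    )

def count_acyclic_orientations(vertices, edges):
    """Orientations containing no coherently oriented (directed) cycle."""
    def has_cycle(arcs):
        # A digraph is acyclic iff repeatedly deleting sources empties it.
        arcs = set(arcs)
        verts = set(vertices)
        while verts:
            sources = {v for v in verts if not any(b == v for _, b in arcs)}
            if not sources:
                return True
            verts -= sources
            arcs = {(a, b) for a, b in arcs if a not in sources}
        return False
    total = 0
    for choice in product([0, 1], repeat=len(edges)):
        arcs = [(u, v) if d == 0 else (v, u) for (u, v), d in zip(edges, choice)]
        if not has_cycle(arcs):
            total += 1
    return total

# Triangle K3: chromatic polynomial k(k-1)(k-2), so |P(-1)| = |-6| = 6
V, E = "abc", [("a", "b"), ("b", "c"), ("a", "c")]
print(count_colorings(V, E, 3))          # P(3) = 6
print(count_acyclic_orientations(V, E))  # matches |P(-1)| = 6
```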

**Abstract**:

I will introduce variational integrators for finite dimensional ODE systems based on discretizing a variational principle. The advantage of such a procedure is that, if done with care, it preserves important geometric properties of the original system. The presentation will start with simple examples showing the utility of discretizing a variational integral rather than deriving the differential equations and discretizing these. For Lagrangian systems (with convexity properties) a phase space variational principle (Hamilton's principle) can be derived, producing the Hamiltonian equations of motion, a system of first order (rather than second order) equations. Discretization must be done carefully in order to avoid obtaining a system of higher order, which can lead to parasitic instabilities. Such a discretization leads to a degenerate variational integrator, a form of symplectic integrator. I will briefly discuss discretizations for Hamiltonian systems with canonical variables as well as important examples with noncanonical variables. I will briefly discuss the extension of these integrators to those with higher order accuracy and those with adaptive time stepping.
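A minimal illustration of the geometric point, using my own example of the harmonic oscillator H = (p^2 + q^2)/2 rather than anything from the talk: explicit Euler steadily injects energy into the system, while the symplectic (semi-implicit) Euler variant keeps the energy bounded for all time.

```python
def explicit_euler(q, p, h, steps):
    """Non-geometric integrator for H = (p^2 + q^2)/2; energy drifts."""
    for _ in range(steps):
        q, p = q + h * p, p - h * q
    return q, p

def symplectic_euler(q, p, h, steps):
    """Semi-implicit (symplectic) Euler: update p first, then q with the
    new p. The map is symplectic, so energy stays bounded near its
    initial value instead of drifting."""
    for _ in range(steps):
        p = p - h * q
        q = q + h * p
    return q, p

def energy(q, p):
    return 0.5 * (q * q + p * p)

h, steps = 0.1, 1000
print(energy(*explicit_euler(1.0, 0.0, h, steps)))   # grows far above 0.5
print(energy(*symplectic_euler(1.0, 0.0, h, steps))) # stays near 0.5
```

The contrast is the whole point of the talk's careful discretization: both schemes are first-order accurate, but only the symplectic one preserves the qualitative structure of the flow over long times.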

Mathematics Department, 424 Gibson Hall, New Orleans, LA 70118 504-865-5727 math@math.tulane.edu