I am a professor of computational statistics in the MATH department at the University of Copenhagen. I co-founded the Copenhagen Causality Lab and do research at the intersection of AI and statistics. My main interest is automating the learning of causal explanations from data. I use techniques from Bayesian networks, stochastic processes, predictive modeling and machine learning to discover causal structures and to achieve explainable, robust and transportable AI.

- Causality
- Machine learning and AI
- Model selection
- Stochastic dynamic models
- Event processes

PhD in Statistics, 2004

University of Copenhagen

MSc in Mathematics, 2000

University of Copenhagen

The Robbins-Siegmund theorem gives conditions on a nonnegative almost supermartingale that ensure its almost sure convergence. It can be seen as a generalization of Doob's martingale convergence theorem for nonnegative supermartingales. A number of almost sure convergence results follow fairly easily from the Robbins-Siegmund theorem, e.g., the strong law of large numbers. In this post we show how it can be used to prove convergence of a stochastic gradient descent algorithm.
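The kind of algorithm the post has in mind can be sketched as follows. This is a minimal illustrative example, not the one from the post: SGD on the quadratic $f(x) = (x-3)^2$ with Gaussian gradient noise and step sizes $a_n = 1/n$, which satisfy the classical conditions $\sum_n a_n = \infty$ and $\sum_n a_n^2 < \infty$ under which the Robbins-Siegmund theorem yields almost sure convergence.

```python
import random

def sgd(x0=0.0, n_steps=20000, seed=0):
    """SGD on f(x) = (x - 3)^2 with noisy gradient observations.

    Step sizes a_n = 1/n satisfy sum a_n = inf and sum a_n^2 < inf,
    so the Robbins-Siegmund theorem applies to ||x_n - 3||^2 and
    gives almost sure convergence of x_n to the minimizer x* = 3.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        grad = 2 * (x - 3) + rng.gauss(0, 1)  # unbiased noisy gradient
        x -= grad / n                          # step size a_n = 1/n
    return x

print(sgd())  # close to the minimizer 3
```

Here the target function, noise model and step-size schedule are arbitrary choices made for illustration; the post treats the general scheme.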

Shapley values explain how each feature contributes to a prediction. However, what Shapley values actually explain is determined by a value function. Multiple choices of value function are possible, and each choice implies a different interpretation of the explanations.
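The role of the value function can be made concrete with a toy computation. The sketch below, which is an illustrative assumption rather than anything from the post, computes exact Shapley values for a three-feature model by enumerating all coalitions, using one particular value function: features outside the coalition are replaced by a fixed baseline.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Toy model with an interaction between features 1 and 2.
    return x[0] + 2 * x[1] * x[2]

def value(S, x, baseline):
    # One possible value function: features not in coalition S
    # are set to a fixed baseline before evaluating the model.
    z = [x[i] if i in S else baseline[i] for i in range(len(x))]
    return model(z)

def shapley(x, baseline):
    # Exact Shapley values by enumerating all coalitions;
    # feasible only for a small number of features.
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                w = factorial(k) * factorial(d - k - 1) / factorial(d)
                phi[i] += w * (value(set(S) | {i}, x, baseline)
                               - value(set(S), x, baseline))
    return phi

phi = shapley([1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
print(phi)  # sums to model(x) - model(baseline) by efficiency
```

Swapping in a different value function, e.g., one that marginalizes absent features over a data distribution instead of pinning them to a baseline, changes the numbers and hence what the explanation means.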

We develop a model-free framework based on the Local Covariance Measure for testing the hypothesis that a counting process is conditionally locally independent of another process. We propose the (cross-fitted) Local Covariance Test, and we show that its level and power can be controlled uniformly, provided that two nonparametric estimators are consistent with modest rates.

We develop a nonparametric test for conditional independence by combining the partial copula with a quantile-regression-based method for estimating the nonparametric residuals.

We develop the theory of μ-separation for directed mixed graphs, which gives a class of graphical independence models closed under marginalization. We show that there is a unique maximal element in the Markov equivalence class of directed mixed graphs, and we characterize the equivalence class via the directed mixed equivalence graph.

We derive a representation of the degrees of freedom for certain discontinuous estimators and show how it can be used to estimate the risk for Lasso-OLS.

Oracle inequalities for multivariate Hawkes processes and other point processes using $\ell_1$-penalized estimation.

An algorithm for and implementation of sparse group lasso optimization with applications to multinomial sparse group lasso classification.

We consider local alignments without gaps of two independent Markov chains from a finite alphabet, and we derive sufficient conditions for the number of essentially different local alignments with a score exceeding a high threshold to be asymptotically Poisson distributed.