Posts

Weak and strong solutions of random equations

Structural equation models with cycles may have strong solutions, which are easy to understand. However, if the solution is not unique, Kurtz’ alternatives imply the existence of spooky weak solutions that are not strong solutions.
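A minimal illustration (my own toy example, not taken from the post) of how a cyclic system can fail to have a unique solution:

```latex
% Toy cyclic structural equation system with no noise:
% each variable is defined in terms of the other.
\begin{align*}
  X &= Y, & Y &= X,
\end{align*}
% Every pair $(x, y) = (c, c)$ solves the system, so solutions
% exist but are not unique -- exactly the situation in which
% non-strong (weak) solutions can appear.
```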

Why stochastic gradient descent works

The Robbins-Siegmund theorem gives conditions on a nonnegative almost supermartingale that ensure its almost sure convergence. It can be seen as a generalization of Doob’s martingale convergence theorem for nonnegative supermartingales. It is fairly easy to derive a number of almost sure convergence results from the Robbins-Siegmund theorem, e.g., the strong law of large numbers. In this post we show how it can be used to prove convergence of a stochastic gradient descent algorithm.
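As a hedged illustration of the kind of algorithm covered by such results (my own sketch, not the post's example): SGD on the quadratic f(x) = E[(x - Z)²]/2 with step sizes aₙ = 1/n, which satisfy the classical Robbins-Monro conditions Σaₙ = ∞ and Σaₙ² < ∞ used in these convergence proofs.

```python
import random

# Toy sketch (not from the post): SGD minimizing f(x) = E[(x - Z)^2] / 2
# with Z ~ Uniform(-1, 3), whose minimizer is E[Z] = 1.
# Step sizes a_n = 1/n satisfy sum a_n = infinity and sum a_n^2 < infinity,
# the summability conditions appearing in Robbins-Siegmund-type arguments.

def sgd(steps=200_000, seed=0):
    rng = random.Random(seed)
    x = 10.0  # arbitrary starting point
    for n in range(1, steps + 1):
        z = rng.uniform(-1.0, 3.0)
        grad = x - z       # stochastic gradient of (x - z)^2 / 2
        x -= grad / n      # step size a_n = 1/n
    return x

print(sgd())  # ends up close to the true minimizer 1.0
```

With aₙ = 1/n this iteration reduces to a running average of the samples, so the almost sure convergence here also follows from the strong law of large numbers, matching the remark above.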

Explainable AI

Shapley values explain how each feature contributes to a prediction. However, precisely what Shapley values explain is determined by a value function. Multiple choices of value function are possible, and each choice implies a different interpretation of the explanations.
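To make the role of the value function concrete, here is a small hypothetical example (not from the post): exact Shapley values for a three-player game, computed by enumerating all coalitions. In the feature-attribution setting, v(S) would be a value function such as the model's expected prediction given the features in S; swapping in a different v changes the attributions.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values by enumerating all coalitions.

    Feasible only for small games: the loop visits 2^(n-1)
    coalitions per player.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            for s in combinations(others, k):
                # Shapley weight for a coalition of size k
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # marginal contribution of player i to coalition s
                total += weight * (v(set(s) | {i}) - v(set(s)))
        phi[i] = total
    return phi

# Toy value function (an assumption for this sketch): individual
# worths plus a synergy shared by players 1 and 2.
def v(s):
    base = {1: 1.0, 2: 2.0, 3: 3.0}
    synergy = 2.0 if {1, 2} <= s else 0.0
    return sum(base[p] for p in s) + synergy

print(shapley_values([1, 2, 3], v))
```

In this toy game the synergy of 2.0 is split equally between players 1 and 2, and the attributions sum to v of the full coalition, illustrating the efficiency property that makes Shapley values attractive for explanations.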