Mriganka Basu Roy Chowdhury
github mail cv blog

Hi, I am Mriganka! I am a 4th year Doctoral Candidate at the Department of Statistics, UC Berkeley, supervised by Shirshendu Ganguly. I am broadly interested in various topics in probability, including spin systems, spectra of matrices, random graphs, Gibbs measures, and mixing times of Markov chains. More recently, I have also been exploring various directions in the theory of neural networks and machine learning.

Myself
Me at Princeton, August 2024.

Feel free to click through some of the sections below. Or take a look at my (nascent) blog.

For each paper below, you will find an associated description. It is intended to be an accessible explanation of the main goal of the work. Detailed abstracts will be found in the link to the paper.
Gaussian to log-normal transition for independent sets on a percolated hypercube
2024, joint work with Shirshendu Ganguly and Vilas Winstein

Combinatorial optimization problems, such as the Hamiltonian cycle, traveling salesman, MAX-SAT, MAX-CUT, and the largest independent set, have been extensively studied in computer science literature and form the foundation of complexity theory. However, a significant limitation of this theory is its focus on the worst-case scenario. In contrast, the average-case analysis of these problems, which is arguably more relevant, has only recently gained traction and is now a vibrant area of research in probability and computer science theory. Interestingly, many of these problems are connected to statistical physics and the phenomena of phase transitions through the language of Gibbs measures.

Consider a hypercube, i.e., a graph whose vertices are the binary strings of a fixed length \(d\), and two vertices are adjacent if their corresponding strings differ in exactly one bit (see the figure for the \(d = 4\) case). The largest independent set problem on the hypercube is a problem with a long history, starting with the work of Korshunov and Sapozhenko, counting the number of independent sets, as well as describing the structure of a typical independent set. The structure is as follows. Let us divide the hypercube into an even and an odd side, depending on the number of bits that equal 1. Then, choose one side uniformly, and pick a \(\mathrm{Poisson}(1/2)\) number of them. This is called the "defect" (in the figure below, the even side at the bottom is the defect side with 2 defects in red). Then remove all neighbors of defect vertices, and choose a random subset of the other side among the vertices that survive. These vertices together form a uniform independent set.

A percolated hypercube
An instance of a percolated hypercube of dimension 4, drawn to emphasize the bipartite structure. The red vertices form an independent set.

Of course, there is no randomness in the problem description, but suppose, we now remove some constraints randomly. Say we drop each edge with probability \(p\), independently. In the figure, the solid black edges remain, and the gray edges have been dropped. In this disordered setting, how does the number of independent sets (now random) behave? How does a typical independent set look like? These questions were first asked in 2022 by a paper of Kronenberg and Spinka [KS], referring to this model as the percolated hypercube. It turns out that there is a sharp transition at \(p = 2/3\), where the count of independent sets transition from being a Gaussian to being log-normal. This was conjectured in [KS], and we establish this concretely in our paper, employing a very different approach compared to earlier works in this direction, which relied on the computation of moments. Our approach proceeds via building a probabilistic understanding of this model, also lending its way to the description of the structure of typical independent sets. It turns out that at $p = 2/3$, the defect side can no longer be picked uniformly, but instead depends crucially on the realization of the percolation.

Characterizing Gibbs states for area-tilted Brownian lines
2023, joint work with Pietro Caputo and Shirshendu Ganguly

Models in statistical physics often give rise to collections of random curves, called line ensembles. As an example, consider the classical Solid-on-Solid model, which seeks to study the interface between two solids. More specifically, it studies the interface between the two phases in a magnet whose, say, bottom surface is magnetized to be spin "down", and rest of the magnet is spin "up" (this is not representative of the actual physics of the system, the true interpretation and model comes from quantum mechanical considerations). Due to a phenomenon called entropic repulsion from the bottom surface, the interface will actually rise up from the bottom surface, resulting in many large level curves, which is a curve at constant height (think of a contour in geographic maps). Many important quantities of interest may then be examined by understanding the level curves of this object. The behavior of these curves is of great interest, and suitable probabilistic objects have been defined which capture that behavior. This is the area-tilted line-ensemble, a collection of ordered, nonintersecting random curves, which is the primary focus of the paper.

Main diagram of the paper
A quadratically growing area-tilted ensemble. Observe how only the first curve can grow.

To elaborate further, this ensemble is specified in terms of its finite domain marginals. The so-called area-tilted Brownian Gibbs property describes this behavior and is derived from the description of the Solid-on-Solid model. It is a known, nontrivial, fact that this ensemble is a well-defined mathematical object (nothing blows up or vanishes under the scaling operations performed to reach it). Until now, only one ensemble was known which satisfied the said Gibbs property. In this paper, we succeed in finding all of them. Perhaps surprisingly, there is an infinite family of such ensembles (yes, an infinite family of families of infinitely many lines) parametrized by two numbers \(L, R\) (representing left and right "rates of growth") such that their sum is \(L + R < 0\). The original ensemble corresponds to \(L = R = -\infty.\) See the figure for an example.

Universality in prelimiting tail behavior for regular subgraph counts in the Poisson regime
2023

A graph) is a mathematical object used to describe relationships between things. Since real-world graphs are very complex, it is often of interest to understand the nature of random graphs. The most famous model of random graphs is the celebrated Erdős–Rényi model, which is created by taking \(n\) vertices, and joining each pair with probability \(p\) independently - let \(G\) be such a random graph. Think of this as \(n\) people, and any two people know each other with probability \(p\) independently of the others. For \(p = 0.1\) say, each vertex is connected to \(0.1 n\) many other vertices on average, which is a huge number. Many applications require this average degree, i.e., the number of other vertices a given vertex is connected to, to be constant, say 5. A moment's thought then reveals that one may seek to emulate this behavior by modifying \(p\) with \(n\), precisely by choosing something like \(p = 5/n\). This is a so-called sparse Erdős–Rényi graph.

Several basic questions can be asked about this model. Among them, of enormous importance is the notion of subgraph count. For example, I can ask "How many triangles are there in this graph?" by which I mean "How many groups of 3 people A, B, C are there such that all of them know each other?". Our understanding of how exactly this count behaves (on a very precise level) was actually open until very recently. But as one may imagine, it is possible to ask this question for any configuration, say the count of four people A, B, C, D forming a "square". In this spirit, this paper provides a generalization of the recent triangle result, extending it to every possible "regular" subgraph, which is every configuration where each person knows the same number of people among the rest (the square example is one such - everyone knows 2 others in the group).

Some recent classes I have been a GSI (TA) for:

Prior to that, I have been a GSI for STAT 205A (Graduate Probability Theory Part 1), STAT 134 (which is an upper-division probability class for undergraduates), and STAT 88 (which is a more introductory variant of STAT 134 combined with some emphasis on statistics).

The website for the Student Probability Seminar is located here.

Apart from research, I spend a significant amount of time coding, making simulations (often probabilistic in nature!) and (simple/incomplete) games. Although I am no longer active in the scene, I used to participate quite frequently in competitive programming competitions, for example, on Codeforces (mbrc) and Codechef (mbrc). Once a year, I would also participate in the now-defunct but celebrated Google Code Jam. In 2021, I (with my team of 3) qualified for the ICPC World Finals, held in Moscow, Russia.

These more "scientific" pursuits aside, I also enjoy gaming, listening to music, reading manga, and watching anime/TV shows. I participated (as a team of one) in one of the biggest game jams, the GMTK Game Jam 2023 - a 48-hour game jam based around a theme. My entry, A Conscious Virus, can be found here.

In random order:

This website is created using flask (and by implication, python), with tailwind for styling. The majority of the content here is rendered from markdown, and structural information (for instance, the data for the papers or the seminar) is recorded as .toml files, which is parsed and loaded into jinja templates. Client-side interactivity uses Alpine.js, which also handles persisting the toggled sections into localStorage. KaTeX is used to render the \(math\).

An older version of this website was written using rust and actix. A significant amount of boilerplate was invested in type-checking .toml files, something I have since found useless since all the parsing happens at server start and hence errors can be caught at that point (surprisingly, actix itself didn't need much code, and I was pleased with the UX). As a result, the 500-line codebase is now just less than 90 lines of python (discounting the blog feature which was not present earlier).