Hello đź‘‹,
I’m Adrien Mathieu (aka. jthulhu1). I’m currently studying theoretical computer science at ENS, in Paris.
Scientific interests
I am interested in a wide variety of subjects in computer science, but life is short, so here is a list of those that particularly interest me, in no particular order:
- the design of compilers and of programming languages
- type theory, and in particular when studying models of type theories that have a mathematical relevance
- proof assistants, and their foundations
- category theory
Internships
Two-Level Type Theory in Lean, and applications to semi-simplicial types in HoTT
Under the supervision of Christian Sattler, in the Logic and Types unit, at Chalmers (Göteborg).
It is a longwidthstanding problem whether it is possible to define semi-simplicial types in Homotopy Type Theory. The goal of the internship was to formalize a conservative strenghtening of HoTT, namely Two-Level Type Theory, in Lean, and, in that strenghtening, stating the conjecture that “it is impossible to define fibrant semi-simplicial types”, which is an internalized way to say that the HoTT fragment of the theory cannot speak of semi-simplicial types.
Approximate Streaming Regular Pattern Matching
Under the supervision of Tatiana Starikovskaya, in the Talgo team, at École Normale Supérieure (Paris).
The internship was about generalizing the paper Streaming Regular Expression Membership and Pattern Matching2 — which describes how to do pattern matching with a regular expression in a streaming setting. The goal was find for approximate matches, that is, given a \(k\in\mathbb{N}\) and a regular expression \(R\), find all positions in the streamed text \(T\) that end substrings of \(T\) which are at (edit or Hamming) distance at most \(k\) of a string recognized by \(R\).
The idea underlying this generalization is that the exact version of this algorithm, described in the paper mentioned earlier, heavily uses a subroutine that does pattern recognition in a streaming fashion, that is, given a pattern \(P\), find every occurrence of \(P\) as a substring of a text \(T\). This algorithm has been generalized to an approximate version, and therefore, one might be tempted to build an approximate regular version of the pattern matching algorithm just as the exact regular version is built, but swapping out the exact pattern matching subroutine with its approximate counterpart, and hope it doesn’t break too many invariants.
Automatic Benchmark Generation
Under the supervision of Guillermo Polito, in the Evref team, at Inria (Lille).
The internship was about performing automated dynamic analysis of a given program, to generate interesting benchmark samples, that is, find inputs that make the program behave particularly poorly. The framework developed is agnostic of the evaluation of a program behavior, so that it can be used both to compare several programs, find regression issues, or just audit a program on its own. The goal is not only to find an input that makes a program behave particularly poorly, but help the developer understand why it behaves so poorly, by:
- looking for small examples, to make it human readable
- looking for minimal examples, that is, examples of which every component is useful in triggering poor behavior
- looking for example families, rather than single a single example
Specifically, the framework generates probability distributions over languages generated by a fixed context-free grammar, so that the expected cost of a run of the program with a given input generated by that grammar is maximized.
-
Because, if abysall cosmic entities programmed, it wouldn’t be in C, but in J. ↩︎
-
BartĹ‚omiej Dudek, PaweĹ‚ Gawrychowski, Garance Gourdel, and Tatiana Starikovskaya. Streaming Regular Expression Membership and Pattern Matching. ↩︎