How to Build a Python-to-C++ Compiler out of Spare Parts - and Why

xazzyfu.github.io/europython2024/

Developed by Nexedi

Performance

is a topic

Python is interpreted

The global interpreter lock

at nexedi

in the industry

  • Migration From Python (Partial) – DropBox (Go)
  • In-house Interpreters – Instagram / Meta (Cinder)
  • Not Python To Start With – Clevercloud (Rust)

Performance is a Topic

EuroPython 2024 :)

  • Yesterday – SPy (Static Python) lang: fast as C, Pythonic as Python
  • Yesterday – Accelerating Python with Rust: The PyO3 Revolution
  • Today – How we used vectorization for 1000x Python speedups (no C or Spark needed!)
  • Tomorrow – Python in Parallel: Sub-Interpreters vs. NoGIL vs. Multiprocessing

Performance is a Topic

PEPs

  • PEP 659 – Specializing Adaptive Interpreter
  • PEP 684 – A Per-Interpreter GIL
  • PEP 703 – Making the Global Interpreter Lock Optional in CPython
  • PEP 744 – JIT Compilation

Performance is a Topic

Faster Interpreters

  • Cinder (Meta) – CPython fork
  • Faster CPython (Microsoft) – CPython fork
  • PyPy – Drop-in
  • Pyston – Drop-in

Performance is a Topic

Optimizing compilers

  • Cython
  • Numba
  • Codon
  • Nuitka
  • Shed Skin

Faster Interpreters

  • Drop-in replacement
  • multiprocessing, threads, async/await

Optimizing compilers

  • Critical sections mostly
  • OpenMP, C code

Focus on concurrency and parallelism

Faster Interpreters

  • Drop-in replacement
  • multiprocessing, threads, async/await

Typon

  • Compile whole programs
  • Builtin concurrency

Optimizing compilers

  • Critical sections mostly
  • OpenMP, C code
  • Compile whole programs
  • Builtin concurrency
  • Typon ⊂ Python
  • Python interoperability

– 2022 – Writing a Scheduler

M:N Scheduler

Many concurrent tasks over N worker threads

Concurrency primitives

fork and sync


          def f():
            # - before -
            fork(lambda: child())
            # - continuation -
          

            def f():
              # - before -
              fork(lambda: child())
              # - continuation -
            

            def f():
              # - before -
              fork(lambda: child())
              # - continuation -
            

            

          def f():
            # - before -
            result = fork(lambda: child())
            # - continuation -
            sync()
            # - joined -
            return result
          

            def f():
              # - before -
              result = fork(lambda: child())
              # - continuation -
              sync()
              # - joined -
              return result
            

            def f():
              # - before -
              result = fork(lambda: child())
              # - continuation -
              sync()
              # - joined -
              return result
            

            def f():
              # - before -
              result = fork(lambda: child())
              # - continuation -
              sync()
              # - joined -
              return result
            

Structured Concurrency


          def f():
            # ...
            fork(lambda: child())
            # ...
            # implicit sync (even if exception)

          f() # no concurrency leak
          

Concurrent I/O


          def server(sock):
            while True:
              conn = sock.accept()[0]
              handle(conn) # only one connection at a time

          def handle(conn):
            msg = conn.recv(1024) # python: classic thread-blocking I/O
            #...
          

          def server(sock):
            while True:
              conn = sock.accept()[0]
              fork(lambda: handle(conn)) # C10K capable :)

          def handle(conn):
            msg = conn.recv(1024) # typon: asynchronous I/O by io_uring
            #...
          

– 2023 – Writing the Compiler

Tom Niget

Compiler Design

Compiler Design

Abstract Syntax Tree

import ast

Abstract Syntax Tree


            import ast

            source = """
            a = 1 + 2
            """

            print(
              ast.dump(
                ast.parse(source),
                annotate_fields=False,
                indent=2,
              )
            )
            

Abstract Syntax Tree


            import ast

            source = """
            a = 1 + 2
            """

            print(
              ast.dump(
                ast.parse(source),
                annotate_fields=False,
                indent=2,
              )
            )
            

            Module(
              [
                Assign(
                  [
                    Name('a', Store())],
                  BinOp(
                    Constant(1),
                    Add(),
                    Constant(2)))],
              [])
            

Abstract Syntax Tree


            import ast

            source = """
            def f(x):
              return x.y
            """

            print(
              ast.dump(
                ast.parse(source),
                annotate_fields=False,
                indent=2,
              )
            )
            

            Module(
              [
                FunctionDef(
                  'f',
                  arguments(
                    [],
                    [
                      arg('x')],
                    kwonlyargs=[],
                    kw_defaults=[],
                    defaults=[]),
                  [
                    Return(
                      Attribute(
                        Name('x', Load()),
                        'y',
                        Load()))],
                  [])],
              [])
            

Visitor Pattern


            import ast

            def compile(source: str):
              tree = ast.parse(source)
              # [...]
              TypeInferer().visit(tree)
              # [...]
              CodeGenerator().visit(tree)
            

Visitor Pattern


            from ast import NodeVisitor

            class TypeInferer(NodeVisitor):
              def visit_Module(self, node):
                # [...]

              def visit_FuncDef(self, node):
                # [...]

              def visit_Call(self, node):
                # [...]

              def visit_Name(self, node):
                # [...]
            

Visitor Pattern


            class NodeVisitor(ast.NodeVisitor):
              def visit(self, node):
                for cls in node.__class__.__mro__:
                  name = 'visit_' + class.__name__
                  if visitor := getattr(self, name, None):
                    visitor(node)
                else:
                  self.generic_visit(node)

            class TypeInferer(NodeVisitor):
              def visit_Module(self, node):
                # [...]

              def visit_FuncDef(self, node):
                # [...]

              def visit_Call(self, node):
                # [...]

              def visit_Name(self, node):
                # [...]
            

Visitor Pattern


            class NodeVisitor(ast.NodeVisitor):
              def visit(self, node):
                if isinstance(node, list):
                  for n in node:
                    self.visit(n)
                for cls in node.__class__.__mro__:
                  name = 'visit_' + class.__name__
                  if visitor := getattr(self, name, None):
                    visitor(node)
                else:
                  self.generic_visit(node)

            class TypeInferer(NodeVisitor):
            

Type Inference

Generics


          def add(a, b):
            return a + b

          add(1, 2) # ok
          add("hello", "world") # ok
          add(1, "world") # error
          

Generics


          def add(a, b):
            return a + b

          add(1, 2) # ok
          add("hello", "world") # ok
          add(1, "world") # error

          class Weird:
            def __add__(self, whatever):
              return str(whatever)

          s = add(Weird(), 1) # ok, s: str
          

Generics


          l = []      # list[T]
          l.append(1) # T = int
          

Code Generation

C++ pretending to be Python


            def f():
              pass
            

            void f() {}
            

C++ pretending to be Python


            def f():
              pass

            a = f
            

            struct {
              void operator() () {}
            } f {};

            auto a = f;
            

C++ pretending to be Python


            def f(x):
              pass

            a = f
            

            struct {
              void operator() (auto x) {}
            } f {};

            auto a = f;
            

C++ pretending to be Python


            def f(x):
              return x.y

            a = f
            

            

C++ pretending to be Python


            class X:
              pass

            def f(x):
              return x.y

            a = f
            

            

C++ pretending to be Python


            class X:
              pass

            def f(x):
              return x.y
            

            

C++ pretending to be Python


            class X:
              y : int

            def f(x):
              return x.y
            

            

C++ pretending to be Python


            class X:
              def y(self):
                pass

            def f(x):
              return x.y
            

            

C++ pretending to be Python


            class X:
              def y(self):
                pass

            def f(x):
              # what if x.y is
              # a boundmethod?
              return x.y

            x_dot_y = f(X())
            

            

C++ pretending to be Python

...And so on

  • Inheritance
  • MRO (multiple inheritance)
  • Generic classes
  • ...

Python Interoperability

Calling Python


          import numpy

          if __name__ == "__main__":

            x = [i for i in range(5)]

            y = numpy.square(x)

            z : list[int] = y

            print(x, y)
          

Calling Python


          # via pybind11, 'numpy' is a Python object
          import numpy

          if __name__ == "__main__":

            # list[int], pure C++
            x = [i for i in range(5)]

            # convert 'x' to Python, get a Python object
            y = numpy.square(x)

            # convert Python object to list[int]
            z : list[int] = y

            # pure C++
            print(x, y)
          

When Python calls you


          

Towards a Standard Library

Towards a Standard Library

Typon/C++ Bindings = Python Stubs + C++ Implementations

  • builtins: list, dict, str, print, open, ...
  • socket: io_uring
  • os (some): io_uring
  • ...
  • A work in progress :)

pypi.org/project/typon/

Thank you!