Codegenerierung für Datenverarbeitung
Contents
Code generation is a key technique for efficient program execution and data processing. This lecture will cover the following topics from a practical perspective with accompanying hands-on exercises:
- Execution models of programs (interpretation, bytecode, machine code generation, etc.)
- Program representations (source code, intermediate representations (IRs), different forms of bytecode)
- Classical techniques of code generation
- SSA and optimization techniques, exemplary described on LLVM-IR
- Machine code generation: instruction selection and register allocation
- Execution of programs in virtual machines (e.g., WebAssembly, BPF, JavaScript)
- Sandboxing and optimizations for JIT compilation
- Execution of database queries (e.g., SQL, data frame API)
- Execution models and code representations
- Execution of machine code/binary translation (e.g., RISC-V)
- Specifics when translating machine code
Organization
- Lecture with integrated exercises: Thu 10–14 (c.t., with break) in 02.11.018
- Exercises will include hands-on programming tasks
- (Basic) C++ knowledge is required
- Language: English
- Module: CIT3230001, 6 ECTS, Bachelor/Master elective
- Area "Databases and Information Systems" for Informatics, Wirtschaftsinformatik/Information Systems, Informatics: Games Engineering, Biomedical Engineering
- B1.2 "Advanced Topics in Data Engineering" for Data Engineering and Analystics
- Written exam (90 minutes), might change to oral on low registration count. Exam from winter 2022/23: [exam2223.pdf].
- Zulip stream for this lecture; private contact via e-mail
Prerequisites
The course is aimed at bachelor/master students who have taken the following (or similar) courses:
- IN0004 Introduction to Computer Architecture
- IN0008 Fundamentals of Databases
Material
Material and exercises will be regularly provided throughout the semester. [lec.pdf] [prs.pdf]
Script (updated regularly with new lecture content, includes content of slides and additional information and comments): [book.pdf]
Note: The schedule below is preliminary and is likely to change during the semester.
Date | Lecture Topic | Homework |
---|---|---|
17.10. | Overview, Motivation, Interpretation Techniques
[lec01.pdf]
Exercise: (no exercise session) | [hw1.txt] [hw1-sol.c] |
24.10. | Compiler Front-end
[lec02.pdf]
Exercise: discussion of homework | [hw2.txt] [hw2-sol.cc] |
31.10. | IR Concepts, Control Flow Graph, SSA Construction
[lec03.pdf] [prs03.pdf]
Exercise: IR design [ex03.txt] | |
07.11. | LLVM-IR
[lec04.pdf]
Exercise: getting started with LLVM [ex04.txt] | [hw3.txt] [hw3-sol.cc] |
14.11. | Analyses and Transformations
[lec05.pdf] [prs05.pdf]
Exercise: writing an LLVM-IR pass [ex05.txt] | |
21.11. | Exercise: auto-vectorization | |
28.11. | Instruction Selection
[lec07.pdf] [prs07.pdf]
Exercise: discussion of hw2 and hw3 | [hw4.txt] [hw4-sol.cc] |
05.12. | (no lecture/exercise, Dies Academicus) | |
12.12. | Register Allocation
[lec08.pdf] [prs08.pdf]
Exercise: SSA destruction | [hw5.txt] [hw5-sol.cc] |
19.12. | Object Files, Linker, Loader
[lec09.pdf]
Exercise: ELF file construction [ex09.txt] | |
09.01. | Unwinding and Debuginfo
[lec10.pdf]
Exercise: unwind info generation [ex10.txt] | [hw6.txt] |
16.01. | JIT-compilation and Sandboxing
[lec11.pdf]
Exercise: WebAssembly hands-on [ex11.txt] | |
23.01. | Vectorization
[lec06.pdf]
Exercise: Auto-vectorization hands-on [ex06.txt] | [hw7.txt] (Bonus) |
30.01. | Query Compilation
[lec13.pdf]
Exercise: SQLite hands-on [ex13.txt] | |
06.02. | Code Generation for GPUs (Nicolai Hähnle, AMD)
Exercise: none/wrap up |