#!/bin/bash
: <<'````````bash'
# Programming Exercise 7: Dynamic Binary Translation

Write a dynamic binary translator for RISC-V programs, in particular rv32i. To keep things simple, you only need to do the actual translation part and you can use a decoder library and LLVM for code generation.

## Implementation

Decode and lift RISC-V code with at least basic block granularity to LLVM-IR and use LLVM's JIT compiler.

Emulate guest memory by reserving 4 GiB of virtual memory (`mmap(NULL, 0x100000000, PROT_NONE, MAP_PRIVATE|MAP_ANON|MAP_NORESERVE, -1, 0)`) and load the guest program at a command-line configurable address inside this virtual memory (use `mprotect(address, size, PROT_READ|PROT_WRITE)`, write the code there, and call mprotect again to change the code to read-only); use the base of the memory region as base for all guest memory accesses. Initialize a sufficiently large stack (again, use mprotect) and set the stack pointer to point at the top end.

At the end of the program, dump the register state in human-readable form. Add a command-line option for single-stepping, where single instructions are translated instead of basic blocks, and for dumping the register state and program counter before entering a translated chunk.

You don't need to handle any ISA extensions, system calls, or self-modifying code (although you are encouraged to implement some system calls for memory management and I/O).

For decoding, you can use frvdec (https://git.sr.ht/~aengelke/frvdec/tree) or LLVM-MC. You may use the template below as a starting point.

## Command Line Interface

    usage: ./bt (-s) program_file [program_args...]
        RISC-V user-space binary translator.
        -s: single-stepping mode.
        program_file: statically linked rv32i ELF executable.

## Analysis

Write (and submit) some sample programs to test the functionality of your emulator. Do some profiling to determine the performance-limiting factors. What could you do to improve the performance of your binary translator?

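The guest memory setup described under Implementation (reserve 4 GiB, make the program pages writable, copy the code, drop to read-only, then map a stack) could look roughly like the sketch below. It is not part of the template: `GuestMemory` and all of its member names are hypothetical, it assumes a 64-bit Linux host, and error handling is reduced to boolean return values.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of the exercise template.
// Assumes a 64-bit host so that a 4 GiB reservation fits the address space.
struct GuestMemory {
    static constexpr uint64_t kSize = 0x100000000ull; // 4 GiB guest space
    uint8_t* base = nullptr;                          // host address of guest address 0

    // Reserve the whole guest address space as inaccessible pages.
    bool reserve() {
        void* p = mmap(nullptr, kSize, PROT_NONE,
                       MAP_PRIVATE | MAP_ANON | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED)
            return false;
        base = static_cast<uint8_t*>(p);
        return true;
    }

    // Translate a guest address into a host pointer.
    uint8_t* host(uint32_t guest_addr) const {
        assert(base != nullptr);
        return base + guest_addr;
    }

    // Change the protection of all pages covering [guest_addr, guest_addr + size).
    bool protect(uint32_t guest_addr, uint64_t size, int prot) const {
        const uint64_t page = static_cast<uint64_t>(sysconf(_SC_PAGESIZE));
        const uint64_t start = guest_addr & ~(page - 1);
        const uint64_t end = (guest_addr + size + page - 1) & ~(page - 1);
        if (size == 0 || end > kSize)
            return false;
        return mprotect(base + start, end - start, prot) == 0;
    }

    // Load an image: make the pages writable, copy, then drop to read-only.
    bool load(uint32_t guest_addr, const void* data, size_t size) const {
        if (!protect(guest_addr, size, PROT_READ | PROT_WRITE))
            return false;
        std::memcpy(host(guest_addr), data, size);
        return protect(guest_addr, size, PROT_READ);
    }

    // Map a stack that ends at `top` (exclusive) and return the initial
    // stack pointer, or 0 on failure. `top` is assumed to be suitably aligned.
    uint32_t init_stack(uint32_t top, uint32_t size) const {
        if (size > top || !protect(top - size, size, PROT_READ | PROT_WRITE))
            return 0;
        return top;
    }
};
```

Because `base` corresponds to guest address 0, every guest load or store in translated code reduces to an access at `base + guest_addr`, and accesses to unmapped guest pages fault on the host instead of silently touching host memory.
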
## Submission

- Submission deadline: 2025-01-22 23:59
- Submit using `curl --data-binary "@" 'https://db.in.tum.de/teaching/ws2425/codegen/submit.py?hw=7&matrno='`
- Write your solution in a single C++ file. (Default file name: `bt.cc`)
- Include answers to theory questions as comments at the top of the source file.
- Avoid dependencies, no build systems other than Makefile, etc.
- If you use frvdec, please copy the header and source file into your submission.
- If you need more than just the C++ file, combine all files such that this command sequence works: `split-file somedir; cd somedir; bash `
- If you write your own Makefile:
    - Use `$(LLVM_CONFIG) --cppflags --ldflags --libs` to find LLVM; note that libs must come after your object files when linking.
    - Default-initialize `LLVM_CONFIG := llvm-config`, so that `make LLVM_CONFIG=/path/to/llvm-config` overrides it.

There is no test script or automatic verification.

## Appendix: Template

    #include <llvm/ExecutionEngine/ExecutionEngine.h>
    #include <llvm/ExecutionEngine/MCJIT.h> // links in the MCJIT engine
    #include <llvm/IR/Module.h>
    #include <llvm/Support/TargetSelect.h>
    #include <llvm/Target/TargetOptions.h>

    #include <err.h>
    #include <cstdint>
    #include <memory>
    #include <string>

    // Signature of a translated chunk: takes the guest register file.
    using ChunkFunc = uint32_t(uint32_t* regs);

    // Compile function from module; module is consumed; returns nullptr on failure.
    void* compile(std::unique_ptr<llvm::Module> mod, const std::string& name) {
      std::string error;

      llvm::TargetOptions options;
      llvm::EngineBuilder builder(std::move(mod));
      builder.setEngineKind(llvm::EngineKind::JIT);
      builder.setErrorStr(&error);
      builder.setOptLevel(llvm::CodeGenOpt::None);
      builder.setTargetOptions(options);

      llvm::ExecutionEngine* engine = builder.create();
      if (!engine) // errx: the failure reason is in `error`, not in errno
        errx(1, "could not create engine: %s", error.c_str());

      return reinterpret_cast<void*>(engine->getFunctionAddress(name));
    }

    int main(int argc, char** argv) {
      llvm::InitializeNativeTarget();
      llvm::InitializeNativeTargetAsmPrinter();

      // Implement binary translator logic here.

      return 0;
    }

````````bash
set -euo pipefail
FAILED=0
CXX=g++ CXXFLAGS="-O3 -Wall -Wextra -std=c++20" make bt
exit $FAILED