This is the fourth installment in a series about the components of typical development environments. Part 1 introduces the components and the Text Editor, Part 2 covers the Interpreter, and Part 3 covers the Compiler. This part covers one of the less understood components of a toolchain: the linker. It’s easily overlooked because of the prevalence of interpreted languages, and of languages that run in a Virtual Machine, which don’t use one, but it’s an integral part of a compiled language toolkit.
If you’ve been following this series, you may remember I mentioned that usually, compiled programs don’t have all the code needed to execute them.
Where’s the rest? In libraries.
Libraries are what results when code is compiled into a form that can’t be executed on its own, but can be referenced from other code. The names used in the program’s source (function names, for example) specify which names from compiled libraries are needed for execution. The linker provides, perhaps unsurprisingly, a link between the program’s compiled code and the libraries. Put simply, it connects separately compiled pieces of code into a single whole.
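A toy sketch may help make the idea of connecting names concrete. It is not how a real linker stores anything: the dictionaries, the function `resolve`, and the library and symbol names are all made up for illustration. It only shows the bookkeeping involved, which is matching each name the program references to a library that defines it.

```python
# Toy model of name resolution. All names here are hypothetical;
# a real linker works on compiled object files, not Python dictionaries.

# Each "library" maps the names it defines to a stand-in for compiled code.
LIBRARIES = {
    "libdisplay": {"draw_text": "<machine code for draw_text>"},
    "libmath": {"square_root": "<machine code for square_root>"},
}


def resolve(referenced_names, libraries):
    """Return {name: library_name} for every name the program references."""
    resolved = {}
    for name in referenced_names:
        for library_name, symbols in libraries.items():
            if name in symbols:
                resolved[name] = library_name
                break
        else:
            # No library defines the name: roughly the situation behind
            # an "undefined reference" error from a real linker.
            raise LookupError(f"undefined reference to {name!r}")
    return resolved


if __name__ == "__main__":
    print(resolve(["draw_text", "square_root"], LIBRARIES))
```

If a referenced name is not defined anywhere, the sketch raises an error instead of producing a result, which mirrors why a build fails when a library is missing at link time.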
There are two ways to use libraries: statically and dynamically. Static linking is the easier one to explain, so I’ll cover it first.
Static linking works by outputting a single executable file which contains both the program’s executable code and the code of the libraries it uses. Executable files are often called “binary files”, and I’ll call them so from now on.
If you write a program which displays something on the screen, the binary produced by the compiler references the names of the libraries that contain the machine code to do it. If you produce a statically linked binary, the referenced libraries and the program binary are stored in a single executable file. The resulting files are big, but can be moved with ease between computers that share the same platform. As many libraries are not included in the Operating System, custom or third party libraries can be included like this for ease of distribution. If that weren’t done, and the libraries weren’t installed on the target computer, the binary couldn’t access the code it needs to perform its operation.
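As a rough illustration of the static case, here is a toy model in Python. Everything in it is an assumption made for the example: the names `link_static` and `run`, the `libdisplay` library, and the strings standing in for machine code. Real linkers copy compiled code out of library files, usually in larger units than single functions; the sketch only captures the idea that the library code ends up inside the output file.

```python
# Toy model of static linking. Names and "code" strings are hypothetical.

PROGRAM_CODE = "<program machine code>"
PROGRAM_REFERENCES = ["draw_text"]

LIBRARIES = {
    "libdisplay": {"draw_text": "<draw_text machine code>"},
}


def link_static(program_code, references, libraries):
    """Build one self-contained 'binary' holding program and library code."""
    binary = {"program": program_code, "symbols": {}}
    for name in references:
        for symbols in libraries.values():
            if name in symbols:
                # Copy the library's code into the output file itself.
                binary["symbols"][name] = symbols[name]
                break
        else:
            raise LookupError(f"undefined reference to {name!r}")
    return binary


def run(binary, name):
    """'Call' a name using only what is stored inside the binary."""
    return binary["symbols"][name]


if __name__ == "__main__":
    binary = link_static(PROGRAM_CODE, PROGRAM_REFERENCES, LIBRARIES)
    print(run(binary, "draw_text"))
```

Notice that `run` never looks at `LIBRARIES`: once linked, the toy binary carries everything it needs, which is the property that makes statically linked files bigger but easy to move around.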
Dynamic linking works, as the name states, dynamically. When the program is built, it’s not merged with certain (perhaps any) libraries; they are located and loaded when the program runs instead. This produces a small executable file which depends on the libraries being installed on the system to function.
It’s best used when no third party or custom libraries are needed, but it can tie the executable firmly to a particular environment and configuration. One of the ways this can happen is when an Operating System update changes a library, which can change the functionality or introduce new bugs. In particular, some applications are certified against a fixed environment, so system updates may have to be avoided to preserve the application’s certification. In other words: dynamic linking makes for a smaller executable file footprint, but makes the executable brittle in ways that are sometimes unacceptable for the application.
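The trade-off can be sketched with another toy model in Python. As before, this is only an illustration under made-up assumptions: `link_dynamic`, `load`, the `libdisplay` library and the "version" strings are hypothetical, and a real loader is far more involved. The point is that the binary records only the names of the libraries it needs, so its behaviour depends on whatever the system has installed when it starts.

```python
# Toy model of dynamic linking. Everything here is hypothetical.

def link_dynamic(program_code, needed_libraries):
    """Build a small 'binary' that only records which libraries it needs."""
    return {"program": program_code, "needed": list(needed_libraries)}


def load(binary, installed_libraries):
    """Resolve the needed libraries against what the system has installed."""
    loaded = {}
    for library_name in binary["needed"]:
        if library_name not in installed_libraries:
            # The library was never copied into the binary, so a missing
            # installation means the program cannot start.
            raise FileNotFoundError(f"cannot find library {library_name!r}")
        loaded.update(installed_libraries[library_name])
    return loaded


binary = link_dynamic("<program machine code>", ["libdisplay"])

system_before_update = {"libdisplay": {"draw_text": "version 1 behaviour"}}
system_after_update = {"libdisplay": {"draw_text": "version 2 behaviour"}}

# Same binary, different behaviour, because the code comes from the system.
print(load(binary, system_before_update)["draw_text"])
print(load(binary, system_after_update)["draw_text"])
```

Loading the same toy binary on a system without `libdisplay` raises an error, and loading it after the "update" silently changes what `draw_text` does, which are the two kinds of brittleness described above.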
The only linker I know about is called “ld”, and it’s present in Unix and Unix-like systems. In Windows, I’ve only ever worked with dynamic linking, which means that any required DLLs need to be installed on the target system or shipped as separate files when distributing the software.
[EDIT: There seems to be a widely available gratis Windows linker made by Microsoft, called… LINK.exe; I’ve never used it manually. Homepage]
The Rust programming language’s linker can link statically under Windows, but as far as I can tell the linker is included in the language’s compiler, rustc.
[EDIT: the compiler invokes an existing linker in your Operating System; the linker is not included in the compiler.]
That’s all for this week. Next time we’ll be looking at static code analyzers.
I want to take this end-of-the-week space to invite you to e-mail me or comment with any questions you may have; these articles are, of course, subject to revision.
Thank you for reading.