Linker

The linker combines multiple object files (and library files) into a single executable program. It’s the next step after the assembler: the assembler turns each source file into an object file with placeholders for unresolved references; the linker fills in those placeholders by matching names across files.

Why have a separate linker

A common-sense alternative would be to assemble everything into one big binary in a single pass. But that breaks down once programs get large enough to want:

Separate compilation. Edit one source file, recompile just that file’s object, re-link. Much faster than rebuilding everything.
Libraries. Code that’s useful in many programs (math functions, I/O routines) gets compiled once into a library and linked in as needed. The user doesn’t recompile the library for every project.
Modularity. Programmers can work on separate files independently, with the linker resolving cross-file references at the end.

This is separate processing: the assembler handles the per-file work of translating each source file into an object file, and the linker handles the cross-file work of resolving references between them.

What the linker does

When two source files are passed through the assembler, each becomes an object file containing machine code, data, section headers, and a symbol table. If file A’s instructions reference functions defined in file B, the assembler leaves placeholders: it processes one file at a time, so while assembling A it has no way to know where B’s symbols will end up in memory.

The extra data attached to the object file acts “like a cheat sheet” for the linker, containing:

Defined symbols (exports): which labels are defined in this file and at what offsets.
Undefined symbols (imports): which external symbols this file references and where they’re used.
Relocation entries: how to compute the actual address once the linker knows where each section gets placed.
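
As a rough illustration of that cheat sheet, here is a minimal Python sketch of what an object file carries. The class names (ObjectFile, Relocation), the field names, and the two example files are all hypothetical, and the "machine code" is just a list of numbers with 0 standing in for a placeholder; real object formats are considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class Relocation:
    offset: int  # position in the text section where the placeholder sits
    symbol: str  # name whose final address must be written there


@dataclass
class ObjectFile:
    name: str
    text: list                                        # toy "machine code", one word per entry
    exports: dict = field(default_factory=dict)       # defined symbol -> offset within text
    relocations: list = field(default_factory=list)   # placeholders to patch at link time

    def imports(self):
        """Symbols this file references but does not define itself."""
        referenced = {r.symbol for r in self.relocations}
        return referenced - set(self.exports)


# Hypothetical example: main.o calls square, which is defined in math.o.
main_o = ObjectFile("main.o", text=[10, 0, 99], exports={"main": 0},
                    relocations=[Relocation(offset=1, symbol="square")])
math_o = ObjectFile("math.o", text=[20, 21], exports={"square": 0})

print(main_o.imports())
print(math_o.imports())
```

In this sketch the imports are derived from the relocation entries rather than stored separately, which is a simplification chosen to keep the example short.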

The linker:

Reads all input object files.
Decides where each one’s .text and .data sections will be placed in memory (assigns base addresses).
Builds a unified symbol table by matching exports with imports.
Patches every placeholder in the object code with the correct address, applying relocations.
Writes out a single executable.
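
The steps above can be sketched as a toy linker in Python. This is a simplified model under stated assumptions, not how any real linker is implemented: each object file is a plain dict with hypothetical keys (name, text, exports, relocs), there is only one section per file, and every relocation simply means "write the symbol's absolute address here".

```python
def link(objects):
    """Toy linker: lay out sections, build a symbol table, patch placeholders."""
    # Assign base addresses by placing each file's text end to end.
    bases, next_addr = {}, 0
    for obj in objects:
        bases[obj["name"]] = next_addr
        next_addr += len(obj["text"])

    # Build the unified symbol table: symbol -> absolute address.
    symtab = {}
    for obj in objects:
        for sym, offset in obj["exports"].items():
            symtab[sym] = bases[obj["name"]] + offset

    # Patch every placeholder, then concatenate into one image.
    image = []
    for obj in objects:
        text = list(obj["text"])  # copy so the inputs stay untouched
        for offset, sym in obj["relocs"]:
            text[offset] = symtab[sym]
        image.extend(text)

    # "Writing the executable" is just returning the image here.
    return image, symtab


# Hypothetical inputs: main.o has a placeholder (0) at offset 1 for square.
main_o = {"name": "main.o", "text": [10, 0, 99],
          "exports": {"main": 0}, "relocs": [(1, "square")]}
math_o = {"name": "math.o", "text": [20, 21],
          "exports": {"square": 0}, "relocs": []}

image, symtab = link([main_o, math_o])
print(image)   # [10, 3, 99, 20, 21]
print(symtab)  # {'main': 0, 'square': 3}
```

The placeholder at offset 1 of main.o becomes 3 because math.o is placed right after main.o’s three words, so square lands at address 3.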

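If an imported symbol isn’t found in any object file or library, the linker reports an “unresolved external” error and refuses to produce output. The same happens if two object files define the same symbol: the linker reports a “multiple definition” error. The exact wording varies by toolchain; GNU ld, for example, reports an “undefined reference”.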
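
Both error cases fall out naturally when the symbol table is built. The following Python sketch shows one way the checks could look; LinkError, build_symbol_table, the dict keys, and the message wording are all invented for illustration and do not mirror any particular linker.

```python
class LinkError(Exception):
    """Raised when the toy linker cannot produce an executable."""


def build_symbol_table(objects):
    """Map each symbol to the file defining it, rejecting bad inputs."""
    defined_in = {}
    # First pass: collect definitions and reject duplicates.
    for obj in objects:
        for sym in obj["exports"]:
            if sym in defined_in:
                raise LinkError(
                    f"multiple definition of '{sym}' "
                    f"in {defined_in[sym]} and {obj['name']}")
            defined_in[sym] = obj["name"]
    # Second pass: every import must match some definition.
    for obj in objects:
        for sym in obj["imports"]:
            if sym not in defined_in:
                raise LinkError(
                    f"unresolved external '{sym}' referenced in {obj['name']}")
    return defined_in


# Hypothetical inputs: a.o needs helper, b.o provides it.
a_o = {"name": "a.o", "exports": ["main"], "imports": ["helper"]}
b_o = {"name": "b.o", "exports": ["helper"], "imports": []}

print(build_symbol_table([a_o, b_o]))

try:
    build_symbol_table([a_o])  # helper is defined nowhere
except LinkError as err:
    print("link failed:", err)
```

Definitions are collected in a full first pass so that an import can be satisfied by a file that appears later in the input list.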

Library files

A library file is a collection of pre-assembled object files bundled together with an index of which symbols are defined in which member. The linker, when it can’t find a symbol in the explicit object files, searches the libraries and pulls in any members that define needed symbols.
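
The search described above can be modeled as a loop that keeps pulling in members until no needed symbol can be satisfied by the library. This is a simplified Python sketch: the function select_members, the dict layout, and the member names (printf.o, write.o, qsort.o) are hypothetical, and real linkers differ in details such as the order in which libraries are searched.

```python
def select_members(objects, library):
    """Return (members pulled in, symbols still undefined)."""
    defined, undefined = set(), set()
    for obj in objects:
        defined |= set(obj["exports"])
        undefined |= set(obj["imports"])
    undefined -= defined

    # The library's index: symbol -> name of the member that defines it.
    index = {sym: m["name"] for m in library for sym in m["exports"]}
    members = {m["name"]: m for m in library}

    pulled = []
    while True:
        # Pick any needed symbol the library can satisfy (sorted for determinism).
        sym = next((s for s in sorted(undefined) if s in index), None)
        if sym is None:
            break  # whatever remains is not available in this library
        member = members[index[sym]]
        pulled.append(member["name"])
        # A pulled member may itself need further symbols.
        defined |= set(member["exports"])
        undefined |= set(member["imports"])
        undefined -= defined
    return pulled, undefined


# Hypothetical program and library contents.
main_o = {"name": "main.o", "exports": ["main"], "imports": ["printf"]}
libc = [
    {"name": "printf.o", "exports": ["printf"], "imports": ["write"]},
    {"name": "write.o", "exports": ["write"], "imports": []},
    {"name": "qsort.o", "exports": ["qsort"], "imports": []},
]

pulled, missing = select_members([main_o], libc)
print(pulled)   # ['printf.o', 'write.o']
print(missing)  # set()
```

Note that qsort.o is never pulled in: only members that define a needed symbol end up in the link, which is why linking against a large static library does not copy the whole library into the executable.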

Two flavors:

Static libraries (.a, .lib): the linker copies the relevant code into the final executable. The result is self-contained, but the code is duplicated in every program that uses the library.
Shared / dynamic libraries (.so, .dll): the executable records only a reference to the library; the actual library code is loaded by the OS at runtime. This saves disk space and lets multiple programs share one copy in memory.

Libraries are how subroutines that are useful across many programs get reused. Standard library functions (printf, malloc, strlen) live in libraries provided with the compiler/OS.

In the full toolchain

source files  →  [compiler/assembler]  →  object files  →  [linker]  →  executable
                                              ↑
                                          library files

The executable is what the Loader eventually copies into memory and runs.

Why “linker” not “combiner”

Historical: “linking” refers to resolving links between code segments, like hyperlinks between documents but for symbol references. The linker constructs the full graph of which call points go where and writes it out concretely. The name comes from early IBM systems, and it stuck.

Idriss Rami — Notes
