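- 1 Introducing linker scripts with a simple program
- 1.1 Test program
- 1.2 Compiling the test program
- 1.2.1 Compiling without a custom linker script
- Compiling without a custom linker script
- Reading the structure of objdump_test
- 1.2.2 Linking with a custom linker script
- The linker script
- Compiling with the linker script
- Reading the structure of objdump
- 2 Linker scripts
- 2.1 Basic linker script concepts
- 2.2 Linker script format
- 2.3 A simple linker script example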
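This article is based on a translation of The GNU linker manual, with some of my own understanding added.
The linker is an important component of the compilation toolchain. It combines one or more object files and library files into an executable or a library. In languages such as C and C++, the compiler generates one object file for each source file it compiles. These object files contain machine code and symbol table information; the symbol table records function names, variable names, and their addresses.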
1 Introducing linker scripts with a simple program
1.1 Test program
rlk@rlk:test$ cat objdump_test.c
#include <stdio.h>

void greet() {
    printf("Hello, World!\n");
}

int main() {
    greet();
    return 0;
}
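1.2 Compiling the test program
1.2.1 Compiling without a custom linker script
Compiling without a custom linker script
gcc -o objdump_test objdump_test.c
Reading the structure of objdump_test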
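Entry point address: 0x1050
The entry point of the executable objdump_test is 0x1050. Because the file type is DYN (a position-independent executable), this value is an offset from the base address chosen at load time rather than an absolute address.
Number of program headers: 11
The executable has 11 program headers (segments).
Number of section headers: 29
The executable has 29 section headers.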
rlk@rlk:test$ readelf -hW objdump_test
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x1050
  Start of program headers:          64 (bytes into file)
  Start of section headers:          14664 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         11
  Size of section headers:           64 (bytes)
  Number of section headers:         29
  Section header string table index: 28
1.2.2 Linking with a custom linker script
The linker script
rlk@rlk:test$ cat link.ld
/* linker_script.ld */
/* ENTRY(_start) */ /* specify the entry point (commented out) */
SECTIONS
{
    . = 0x10000000; /* set the start address */
    .init_array :
    {
        __init_array_start = .;
        KEEP(*(SORT(.init_array.*)))
        KEEP(*(.init_array))
        __init_array_end = .;
    }
    .text : { *(.text) } /* place the .text section */
    .rodata : { *(.rodata) } /* read-only data */
    .data : { *(.data) } /* initialized data */
    .bss : { *(.bss) } /* uninitialized or zero-initialized data */
}
Compiling with the linker script
gcc -o objdump objdump_test.c -T link.ld
Reading the structure of objdump
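Entry point address: 0x10000010
The entry point of the executable is 0x10000010. This is consistent with the linker script, which sets the start address to 0x10000000. Note that the ENTRY command in the script is commented out, so the script itself does not name an entry symbol.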
rlk@rlk:test$ readelf -hW objdump
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x10000010
  Start of program headers:          64 (bytes into file)
  Start of section headers:          8704 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         32
  Section header string table index: 31
2 Linker scripts
Every link is controlled by a linker script. This script is written in the linker command language.
The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file, and to control the memory layout of the output file. Most linker scripts do nothing more than this. However, when necessary, the linker script can also direct the linker to perform many other operations, using the commands described below.
The linker always uses a linker script. If you do not supply one yourself, the linker will use a default script that is compiled into the linker executable. You can use the ‘--verbose’ command-line option to display the default linker script. Certain command-line options, such as ‘-r’ or ‘-N’, will affect the default linker script.
You may supply your own linker script by using the ‘-T’ command line option. When you do this, your linker script will replace the default linker script.
You may also use linker scripts implicitly by naming them as input files to the linker, as though they were files to be linked.
2.1 Basic linker script concepts
We need to define some basic concepts and vocabulary in order to describe the linker script language.
The linker combines input files into a single output file. The output file and each input file are in a special data format known as an object file format. Each file is called an object file. The output file is often called an executable, but for our purposes we will also call it an object file. Each object file has, among other things, a list of sections. We sometimes refer to a section in an input file as an input section; similarly, a section in the output file is an output section.
Each section in an object file has a name and a size. Most sections also have an associated block of data, known as the section contents. A section may be marked as loadable, which means that the contents should be loaded into memory when the output file is run. A section with no contents may be allocatable, which means that an area in memory should be set aside, but nothing in particular should be loaded there (in some cases this memory must be zeroed out). A section which is neither loadable nor allocatable typically contains some sort of debugging information.
Every loadable or allocatable output section has two addresses. The first is the VMA, or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA, or load memory address. This is the address at which the section will be loaded. In most cases the two addresses will be the same. An example of when they might be different is when a data section is loaded into ROM, and then copied into RAM when the program starts up (this technique is often used to initialize global variables in a ROM based system). In this case the ROM address would be the LMA, and the RAM address would be the VMA.
You can see the sections in an object file by using the objdump program with the ‘-h’ option. Every object file also has a list of symbols, known as the symbol table. A symbol may be defined or undefined. Each symbol has a name, and each defined symbol has an address, among other information. If you compile a C or C++ program into an object file, you will get a defined symbol for every defined function and global or static variable. Every undefined function or global variable which is referenced in the input file will become an undefined symbol.
You can see the symbols in an object file by using the nm program, or by using the objdump program with the ‘-t’ option.
2.2 Linker script format
Linker scripts are text files.
You write a linker script as a series of commands. Each command is either a keyword, possibly followed by arguments, or an assignment to a symbol. You may separate commands using semicolons. Whitespace is generally ignored.
Strings such as file or format names can normally be entered directly. If the file name contains a character such as a comma which would otherwise serve to separate file names, you may put the file name in double quotes. There is no way to use a double quote character in a file name.
You may include comments in linker scripts just as in C, delimited by ‘/*’ and ‘*/’. As in C, comments are syntactically equivalent to whitespace.
2.3 A simple linker script example
Many linker scripts are fairly simple.
The simplest possible linker script has just one command: ‘SECTIONS’. You use the ‘SECTIONS’ command to describe the memory layout of the output file.
The ‘SECTIONS’ command is a powerful command. Here we will describe a simple use of it. Let’s assume your program consists only of code, initialized data, and uninitialized data. These will be in the ‘.text’, ‘.data’, and ‘.bss’ sections, respectively. Let’s assume further that these are the only sections which appear in your input files.
For this example, let’s say that the code should be loaded at address 0x10000, and that the data should start at address 0x8000000. Here is a linker script which will do that:
SECTIONS
{
  . = 0x10000;
  .text : { *(.text) }
  . = 0x8000000;
  .data : { *(.data) }
  .bss : { *(.bss) }
}
You write the ‘SECTIONS’ command as the keyword ‘SECTIONS’, followed by a series of symbol assignments and output section descriptions enclosed in curly braces.
The first line inside the ‘SECTIONS’ command of the above example sets the value of the special symbol ‘.’, which is the location counter. If you do not specify the address of an output section in some other way (other ways are described later), the address is set from the current value of the location counter. The location counter is then incremented by the size of the output section. At the start of the ‘SECTIONS’ command, the location counter has the value ‘0’.
The second line defines an output section, ‘.text’. The colon is required syntax which may be ignored for now. Within the curly braces after the output section name, you list the names of the input sections which should be placed into this output section. The ‘*’ is a wildcard which matches any file name. The expression ‘*(.text)’ means all ‘.text’ input sections in all input files.
Since the location counter is ‘0x10000’ when the output section ‘.text’ is defined, the linker will set the address of the ‘.text’ section in the output file to be ‘0x10000’.
The remaining lines define the ‘.data’ and ‘.bss’ sections in the output file. The linker will place the ‘.data’ output section at address ‘0x8000000’. After the linker places the ‘.data’ output section, the value of the location counter will be ‘0x8000000’ plus the size of the ‘.data’ output section. The effect is that the linker will place the ‘.bss’ output section immediately after the ‘.data’ output section in memory.
The linker will ensure that each output section has the required alignment, by increasing the location counter if necessary. In this example, the specified addresses for the ‘.text’ and ‘.data’ sections will probably satisfy any alignment constraints, but the linker may have to create a small gap between the ‘.data’ and ‘.bss’ sections.
That’s it! That’s a simple and complete linker script.