Discover the initrd concept in Linux system.
Let's dive deeper into the concept of initrd, explore why it's needed, and how the Linux Kernel boots with this `temporary rootfs`.
1. What is initrd?
First time heard of that, I have to admit that I have thought initrd
was some programs that are used to initialize ram and disk drives 😛. So let’s analysis this concept to avoid silly mistakes like me.
The wikipedia mentions initrd
concept: in Linux systems, initrd
(initial ramdisk) is a scheme for loading temporary root filesystem into memory, to be used as part of Linux startup process. Let’s delve into this concept.
The ramdisk, in short, still is a block of Physical Memory, but in this case, the computer’s software is treating it as a disk drive. In other words, the ramdisk is a virtual storage location that can be accessed the sam as a disk drive, but actually working on Physical Memory. The main purpose of ramdisk is to provide high performance temporary storage for demanding tasks.
The second thing was mentioned is temporary root filesystem. A root filesystem is the file system contained on the same disk partition on which the root directory (/
) is located. The rootfs must contain everything needed to support a full Linux System, for example, the minimum set of directories: /dev
, /proc
, /bin
, /etc
, etc. The basic set of utilities: sh
, ls
, mv
, etc. Devices: /dev/hd*
, /dev/tty*
, etc. And runtime libs to provide basic functions.
I have other blog that goes into the rootfs and building a simplest one, take a look here: building rootfs.
Back to the initrd
concept, let’s summarize, this rootfs will be loaded into RAM for temporary and as a part of booting flow. temporary means finally we still need to load the real one?. So the questions are why we need to load a temporary rootfs into RAM? why don’t we mount the real rootfs and using it directly? Let’s go to the next section to find out the answers.
2. Why we need initrd?
To answer this question, let’s discuss this scenario. After kernel is booted, and the rootfs (/) mounted, programs can run and further runtime-kernel-modules can be loaded. But to mount the rootfs, some conditions need to be met:
- The kernel needs the corresponding drivers to access the device on which the rootfs is located (especially SCSI drivers). Some solutions are suggested:
- Kernel will contain all these drivers, but those drivers might be conflict to each other.
- Also if the kernel contains all of drivers, it becomes very large.
- The idea about different kernels for each kind of these things appear, but the problem is it becomes so hard to maintain and optimize, the kernel becomes specific when it should be generic.
- The rootfs might be encrypted. In this case a password or key is needed to decrypt and mount the filesystem.
All of these things can be resolved by the concept of initrd
: running user space programs even before the rootfs is mounted. By that, the system startup is able to occur in two phases, where the kernel comes up with a minimum set of compiled-in drivers whereas additional modules are loaded from initrd.
3. Boot flow with initrd
- The bootloader loads the kernel and
initrd
to memory and pass the control to kernel. Loadinginitrd
actually has no different with loading a kernel that uses boot services to load data to RAM from the boot medium. If the bootloader can load kernel, it is also able to loadinitrd
image. Special drivers are not required. - The bootloader informs the kernel that an
initrd
exists and where it is located in memory (through kernel parameters). - If the
initrd
was compressed, the kernel decompresses and mounts it as a temporary rootfs. - The
init
(orlinuxrc
) program is then started, that call now do all the things necessary to mount the real rootfs (usingpivot_root()
syscall). - As soon as the
init
program finishes, it execs theinit
on the new rootfs. The temporary rootfs is unmounted and the boot process continues as normal with the mount of the real rootfs.
The
init
program in temporary rootfs (frominitrd
image) is different with the real rootfs. The one ininitrd
takes care loading real rootfs part, whereas the one in real rootfs is up to user’s applications.
The filesystem under
initrd
can continue to be accessible if mount point/initrd
is there.
4. Other concepts
Let’s take a look to other concepts that might confuse you:
ramfs
- A simple dynamically resizable RAM-based file system. The data is saved in RAM only and there is no backing store for it. Thus, if we unmount aramfs
mounting point every file which is stored there is lost. We can keep writing to the fs until we fill up the entire physical memory, but the VM can’t free it because the VM thinks that files should get written to backing store.ramdisk
- An older version oframfs
that create a fake block device out af an area of RAM, the device is used as backing store to fs. The problem with this is waste memory and CPU for unnecessary works. It also required a fs driver to format and interpret data.tmpfs
- A newer version oframfs
that adds size limits, and the ability to write the data to swap space.rootfs
- A temporaryramfs
ortmpfs
that is replaced by the actual rootfs from physical store.initramfs
- A newer version ofinitrd
concepts. Starting with kernel 2.5.x, the oldinitrd
protocol is getting (replaced/complemented) with the newinitramfs
protocol. So at this moment when we are talking aboutinitrd
we are actually referring toinitramfs
concept.
5. The initrd image
5.1. Why the initrd images are built as compressed cpio images?
The idea is that the initrd
need to unpacked by the kernel during boot, it means, the kernel has to include at least one format extractor to do that. So which format archive should be choose? tar, cpio, or file system types?
Some of the reasons why cpio is chosen:
- This is kernel internal format, kernel have no depends on others, and could easily make something new. Kernel provides its own tools to create and extract this format anyway, so using an existing standard was preferable, but not essential.
- cpio is a standard.
- The code to extract in kernel is simpler and cleaner than any of the various
tar
archive formats. The completeinitramfs
archive format is explained inbuffer-format.rst
, created inusr/gen_init_cpio.c
and extracted ininit/initramfs.c
. All three together come to less than 26K total text.
But why we need to compress the cpio images?
The answer is quite simple, we need it to be as small as possible to fit within the limited RAM and resource during early boot stage. gzip
provides a good balance between compression ration and decompression speed, so it might be the best choice for us.
5.2. The cpio format
cpio (copy in and out) is a general file archiver utility and its associated file format.
5.2.1. Archive Creation
cpio reads file and directory path names from its standard input channel and writes the resulting archive byte stream to its standard output. Cpio is therefore typically used other utilities that generate the list of files to be archived, such as find
command.
The resulting cpio archive is a sequence of files and directories concatenated into a single archive, separated by header sections with file meta information, such as filename, inode number, ownership, permissions, and timestamps.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
________________
| File1 Metadata |
|________________|
| File1 Content |
| |
|________________|
| File2 Metadata |
|________________|
| File2 Content |
| |
|________________|
| Dir1 Metadata |
|________________|
| ... |
|________________|
This below example using find
to list all entries in current folder and pass through cpio
’s standard input. The cpio
and then stream bytes to its standard output as a file /path/archive.cpio
.
1
find . -depth -print | cpio -o > /path/archive.cpio
cpio does not compress any content, but resulting archives are often compressed using
gzip
.
1
find . -depth -print | cpio -o | gzip > /path/archive.cpio.gz
5.2.2. Extraction
cpio reads an archive from its standard input and recreates the archived files in the OS’s file system. In initrd
case, extracting is already taken care by the kernel itself.
5.3. Making an initrd image
5.3.1. How kernel run the init program
After mounting the initrd, the kernel try to run the init
program. Take a look to this piece of kernel code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
asmlinkage __visible __init __no_sanitize_address __noreturn __no_stack_protector
void start_kernel(void)
{
...
/* Do the rest non-__init'ed, we're now alive */
rest_init();
}
static noinline void __ref __noreturn rest_init(void)
{
...
pid = user_mode_thread(kernel_init, NULL, CLONE_FS);
...
}
static int __ref kernel_init(void *unused)
{
...
if (ramdisk_execute_command) {
ret = run_init_process(ramdisk_execute_command);
if (!ret)
return 0;
pr_err("Failed to execute %s (error %d)\n",
ramdisk_execute_command, ret);
}
...
if (!try_to_run_init_process("/sbin/init") ||
!try_to_run_init_process("/etc/init") ||
!try_to_run_init_process("/bin/init") ||
!try_to_run_init_process("/bin/sh"))
return 0;
panic("No working init found. Try passing init= option to kernel. "
"See Linux Documentation/admin-guide/init.rst for guidance.");
}
The Arch
kernel boot code run first, and then jump to start_kernel()
that initialize stuffs, do mount the initrd image, and then run the first user mode thread kernel_init()
. This function will try to run the init
program in current rootfs. Start by trying to run ramdisk_execute_command
program, and then /sbin/init
, /etc/init
and so on.
The
ramdisk_execute_command
variable have the default value as/init
, the user can replace this variable’s value by passing kernel parametersinitrd=<path>
. In casenoinitrd
is passed to kernel parameters, this value will beNULL
. As a result/sbin/init
will be tried first.
Kernel executes the
init
program as its first process that is not expected to exit.
5.3.2. Contents of initrd
An initrd
is a complete self-contained rootfs for Linux. A rootfs normally includes basic file system structure, minimum set of directories (/dev
, /proc
, /etc
, /bin
, etc.), utilities, shared libs, devices, etc. Take a look to this blog to explore what is rootfs, and build a simple one: minimal rootfs.
To understand how initrd
we don’t need to build a complete rootfs. Just remember, the kernel will looking for /init
program, right?. So we build a rootfs with only one file /init
, and then compress into an compressed cpio image.
1
2
3
4
5
6
7
8
9
10
11
12
cat > init.c << EOF
#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
printf("Hi initrd!\n");
sleep(999999999);
}
EOF
aarch64-none-linux-gnu-gcc -static init.c -o init
echo init | cpio -o -H newc | gzip > test.cpio.gz
Boot the system with QEMU:
1
qemu-system-aarch64 -kernel linux/arch/arm64/boot/Image -initrd test.cpio.gz -machine virt -cpu cortex-a53 -m 1G -nographic
In fact, the
/init
should do more tasks such as loading more drivers, mounting the real rootfs, and run the init program and that.
6. Some tricky questions
6.1. Can kernel boot without initrd?
Of course, yes, initrd
is optional and the kernel can mount rootfs directly by either having fixed rootfs name built-in kernel or passing the rootfs name via the kernel parameter root=<path>
. The needed drivers for mounting the rootfs also are required to build into kernel image.
6.2. Is it possible to use initrd as the final rootfs?
Well, It depends on your purposes. The initrd
is based on RAM, so everything you write to this filesystem is temporary, that will lost when you reboot the system. So if your requirements are testing temporary things, or expect higher performance filesystem access, etc, you can use initrd
as the final rootfs. Another use case is that in embedded systems, DBAN, that could package anything they needs into the initrd
and just never do the switching. Otherwise the real rootfs that based on a disk device might be better.
6.3. How the kernel mount the initrd?
Start by loading populate_rootfs()
module.
1
2
3
4
5
6
7
8
9
10
static int __init populate_rootfs(void)
{
initramfs_cookie = async_schedule_domain(do_populate_rootfs, NULL,
&initramfs_domain);
usermodehelper_enable();
if (!initramfs_async)
wait_for_initramfs();
return 0;
}
rootfs_initcall(populate_rootfs);
6.4. What if the init program exit?
Kernel panic when we try to call exit()
in the init
program.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[ 1.622477] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
[ 1.623562] CPU: 0 UID: 0 PID: 1 Comm: init Not tainted 6.13.0-rc5-00163-gab75170520d4-dirty #4
[ 1.625479] Hardware name: linux,dummy-virt (DT)
[ 1.626507] Call trace:
[ 1.627129] show_stack+0x18/0x24 (C)
[ 1.628325] dump_stack_lvl+0x60/0x80
[ 1.628974] dump_stack+0x18/0x24
[ 1.629515] panic+0x168/0x360
[ 1.629785] make_task_dead+0x0/0x17c
[ 1.630240] do_group_exit+0x34/0x9c
[ 1.630655] __arm64_sys_exit_group+0x18/0x20
[ 1.631410] invoke_syscall+0x48/0x104
[ 1.632025] el0_svc_common.constprop.0+0x40/0xe0
[ 1.632869] do_el0_svc+0x1c/0x28
[ 1.633456] el0_svc+0x30/0xcc
[ 1.633797] el0t_64_sync_handler+0x10c/0x138
[ 1.634498] el0t_64_sync+0x198/0x19c
[ 1.635991] Kernel Offset: disabled
[ 1.636681] CPU features: 0x000,00000c00,00800000,0200421b
[ 1.637640] Memory Limit: none
[ 1.638805] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000 ]---
The code that checking for global init task before exiting a process. The function is_global_init()
is literally comparing the process id with 1
.
1
2
3
4
5
6
7
void __noreturn do_exit(long code)
{
...
if (unlikely(is_global_init(tsk)))
panic("Attempted to kill init! exitcode=0x%08x\n",
tsk->signal->group_exit_code ?: (int)code);
}
6.5. Embed an initrd image into a Linux kernel
The config option CONFIG_INITRAMFS_SOURCE
(in General Setup in menuconfig
, and living in usr/Kconfig
) can be used to specify a source for the initramfs archive, which will automatically be incorporated into the resulting binary.