Introduction

Welcome to my Kernel From Scratch (KFS) project documentation! This repository contains my implementation of the École 42 "Kernel From Scratch" curriculum, consisting of 10 progressive projects designed to explore and understand kernel architecture from the ground up.

Fungul is an experimental kernel written in Rust. Just as fungal networks in nature create vast, efficient systems for resource sharing, Fungul aims to provide a robust foundation for process communication and resource management.

This project is in early stage development. While it's exciting to experiment with, it's not yet ready for production use.

This Book

This book contains more in-depth information about my technical approach, as well as blog-style write-ups about each project describing how I got to the current point.

To find the book online, go to this page -> a page

Overview


Technical Requirements

  • Target Architecture: i386 (x86_32)
  • Build System: Custom Makefile required
  • Programming Language: Not restricted to any specific language
  • Compiler Flags:
    • -nostdlib: Prevents linking with standard library
    • -nodefaultlibs: Excludes default library functions
    • -fno-stack-protector: Disables stack protection mechanisms
  • Custom Linker Script: Writing a custom linker script is mandatory
  • No cheating! All code must be original; no copying + pasting allowed

Documentation Structure

Each project in this repository is documented with:

  1. Project Goals and Requirements
  2. Technical Approach and Implementation Details
  3. Challenges Encountered and Solutions
  4. Conclusions and Lessons Learned

Join me

Feel free to explore each project's documentation to understand the journey of building a kernel from scratch. The projects are organized sequentially, with each building upon the knowledge and components developed in previous sections.

KFS 01 - Grub, boot & screen

Introduction

Our first KFS project! I was quite nervous at first when looking at it. My previous experience with kernels was limited to Linux From Scratch, which barely scratches the surface compared to KFS.

For this project, I chose Rust as my programming language. My Rust experience is limited; I had only completed the Rustlings tutorial. I do have a few years of experience in C/C++ & Assembly, which will definitely be useful.

Goals

  • A kernel that can boot via Grub
  • An ASM bootable base
  • A basic Kernel Library
  • Show "42" on screen

Simple enough, right? (It wasn't)

Technical Approach & Implementation

My approach was quite straightforward for this project. Read, read & READ! I primarily started with OSDev, which offers good guidance on kernel development.

I started with OSDev's straightforward tutorial on booting an OS in C. Having my own libc implementation made this phase quite smooth, and I was able to get a system booting with ease.

After that, I had some basic knowledge of booting up a system via GRUB. The next challenge was converting it to Rust. Luckily, Philipp Oppermann's blog helped me immensely! It gave me more insight into how to set up a Rust environment. I just had to figure out how to adapt it to x86_32, since his tutorial targets x86_64.

After that, I noticed Mr. Oppermann has a second tutorial on VGA: how to set it up and print to it, which is one of the requirements. After finishing it, KFS_01 was mostly done. I just had to dot the i's and cross the t's.

Challenges

The biggest challenges of this project were understanding nix-shell, Rust's target system, and booting the Rust kernel in x86_32.

You may have noticed that I am using nix-shell. The reason is simply to make it easier for a developer to start in the correct environment. Once nix-shell is set up, it ensures you are always on the correct versions with the correct programs, so you get less of the "it works on my machine" problem. The main challenge was setting up the nix-shell itself, since Nix's documentation is quite limited. It was mostly trying a lot of things until it worked.

Secondly, Rust's target system was quite vague to me. The main challenge was understanding that you need a target.json with your own specifications. I do think this is the superior approach, but I was used to gcc, where you have to build a cross-compiler yourself. It took me a bit of time to understand that you do not have to compile rustc from scratch; you just hand it a target.json describing your bare-metal target.
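To make this concrete, here is a sketch of what such a target.json for a bare-metal i686 target might look like. The field values, especially data-layout, are an assumption based on adapting Philipp Oppermann's x86_64 example to 32-bit, and they must match the LLVM version your rustc ships with:

```json
{
  "llvm-target": "i686-unknown-none",
  "data-layout": "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-i128:128-f64:32:64-f80:32-n8:16:32-S128",
  "arch": "x86",
  "target-endian": "little",
  "target-pointer-width": "32",
  "target-c-int-width": "32",
  "os": "none",
  "executables": true,
  "linker-flavor": "ld.lld",
  "panic-strategy": "abort",
  "disable-redzone": true,
  "features": "-mmx,-sse,+soft-float"
}
```

You then build with something like cargo build --target ./i686-target.json (the file name here is illustrative).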

Lastly, booting up in 32-bit was such a pain. In the end, I am still not sure what went entirely wrong. The Rust code itself worked, but there was something wrong with my boot.asm & linker.ld: the linker was not set up correctly for the bootloader to find my kernel_main(). In the end, I just had to change something in my boot.asm, which worked out.

Conclusion & Lesson Learned

In the end, it went much smoother than expected. There were plenty of tutorials and understandable documentation to get me through the first project.

The lesson I learned was to not assume the same approach for each compiler. Each programming language has a different approach. I am still happy with my choice to use nix-shell. It will definitely avoid headaches in the future.

KFS 02 - GDT & Stack

Introduction

Let's proceed to the second KFS project. The first was doable and I felt confident doing the second one.

For this project, we had to implement a GDT (Global Descriptor Table). The GDT serves as a fundamental data structure in x86 architecture, playing a crucial role in memory management and protection. When our computer starts, it begins in real mode, a simple operating mode that provides direct access to memory and I/O devices. However, we need to switch to protected mode, which introduces memory protection, virtual memory, and privilege levels.

Think of protected mode as establishing different security clearance levels in a building. The GDT acts like the security system that defines who can access what. While my earlier comparison to sudo captured the basic idea of privilege levels, the reality is more sophisticated. Instead of just "admin" and "user", the x86 architecture provides four rings (0-3), where ring 0 is the most privileged (kernel space) and ring 3 is the least privileged (user space). Each ring has specific permissions and restrictions, all defined in our GDT.

The GDT is essential not just for security, but also for the basic operation of protected mode. Without a properly configured GDT, the CPU cannot execute protected mode code at all.

Goals

The project requires creating a GDT at 0x00000800 with entries for Kernel Data Space, Kernel Code Space, User Data Space, and User Code Space. Additionally, we need to add minimal PS/2 Keyboard Support and implement a basic shell with the commands reboot & gdt. The gdt command will print the GDT entries in a human-readable way.

Technical Approach & Implementation

My journey began with studying the OSDev documentation. The concepts were initially overwhelming - terms like segment descriptors, privilege levels, and descriptor flags felt like learning a new language. After watching several YouTube tutorials (here & here) about GDT implementation in Rust, things started to click.

I faced a choice: implement the GDT in Assembly or Rust. While Assembly would give more direct control, I chose Rust for its safety features and my growing familiarity with it. Here's how I structured the implementation:

The boot process begins in boot.asm, where we set up multiboot flags and prepare for the transition to protected mode. Then we call gdt_init, a Rust function that sets up our GDT:

#[no_mangle] // Ensure rustc doesn't mangle the symbol name for external linking
pub fn gdt_init() {
    // Create the GDT descriptor structure
    // size is (total_size - 1) because the limit field is the maximum addressable unit
    let gdt_descriptor = GDTDescriptor {
        size: (size_of::<GdtGates>() - 1) as u16,
        offset: 0x00000800, // Place GDT at specified address
    };
    // Call assembly function to load GDT register (GDTR)
    gdt_flush(&gdt_descriptor as *const _);
}
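The GDTDescriptor type is not shown in the snippet above; assuming the field names used there, it would be the packed 6-byte structure that the lgdt instruction expects:

```rust
// 6-byte GDTR image loaded by `lgdt`: a 16-bit limit followed by a
// 32-bit base address. `packed` prevents the compiler from inserting
// padding between the two fields.
#[repr(C, packed)]
pub struct GDTDescriptor {
    pub size: u16,   // size of the table in bytes, minus one
    pub offset: u32, // linear address of the first descriptor
}
```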

Here's how our GDT entries are structured:

pub struct Gate(pub u64);  // Each GDT entry is 64 bits

#[no_mangle]
#[link_section = ".gdt"]  // Place in special GDT section for linking
pub static GDT_ENTRIES: GdtGates = [
    // Null descriptor - Required by CPU specification
    Gate(0),
    // Kernel Code Segment: Ring 0, executable, non-conforming
    Gate::new(0, !0, 0b10011010, 0b1100),
    // Kernel Data Segment: Ring 0, writable, grow-up
    Gate::new(0, !0, 0b10010010, 0b1100),
    // User Code Segment: Ring 3, executable, non-conforming
    Gate::new(0, !0, 0b11111010, 0b1100),
    // User Data Segment: Ring 3, writable, grow-up
    Gate::new(0, !0, 0b11110010, 0b1100),
];

Each Gate::new() call takes four parameters:

  • base: The starting address of the segment (0 for flat memory model)
  • limit: The maximum addressable unit (!0 means use entire address space)
  • access: Defines segment privileges and type (explained in detail in the table below)
  • flags: Controls granularity and size (0b1100 for 32-bit protected mode)
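The kernel's actual Gate::new is not shown here, but a sketch of how such a constructor could pack those four parameters into the 64-bit descriptor layout (per the OSDev GDT page) would be:

```rust
pub struct Gate(pub u64); // mirrors the Gate newtype shown above

impl Gate {
    // Descriptor bit layout:
    //  0-15: limit[15:0]    16-39: base[23:0]   40-47: access byte
    // 48-51: limit[19:16]   52-55: flags        56-63: base[31:24]
    pub fn new(base: u32, limit: u32, access: u8, flags: u8) -> Gate {
        let mut d: u64 = 0;
        d |= (limit as u64) & 0xFFFF;              // limit, low 16 bits
        d |= ((base as u64) & 0x00FF_FFFF) << 16;  // base, low 24 bits
        d |= (access as u64) << 40;                // access byte
        d |= (((limit as u64) >> 16) & 0xF) << 48; // limit, high 4 bits
        d |= ((flags as u64) & 0xF) << 52;         // flags nibble
        d |= (((base as u64) >> 24) & 0xFF) << 56; // base, high 8 bits
        Gate(d)
    }
}
```

With a flat memory model (base = 0, limit = !0), the kernel code entry packs to the classic 0x00CF9A000000FFFF value you will see in most GDT tutorials.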

After setting up the GDT, I implemented basic keyboard support. While my current polling approach isn't ideal (it continuously checks for keystrokes), it works for our basic shell. A proper implementation would use interrupts to handle keyboard events, but that's a topic for future projects. The VGA driver from KFS_01 was adapted to create a simple shell interface, allowing for the reboot and gdt commands.

Initially, the system still triple-faulted. The solution lay in the linker script: by using #[link_section = ".gdt"] together with a matching output section, I ensured our GDT was placed at the correct memory address. The ordering is crucial: boot code with the multiboot header first, then the GDT, then the rest of our kernel.

  /* Start at 2MB */
  . = 2M;


  .gdt 0x800 : ALIGN(0x800)
    {
    gdt_start = .;
    *(.gdt)
    gdt_end = .;
  }

  /* The rest... */

Challenges

The challenges were mostly understanding the GDT. I struggled to grasp its purpose and exact workings. It took me reading several articles and watching multiple videos to finally understand what it's meant to do.

I also had no real experience with the linker. Finding the source of the triple fault was particularly frustrating, and it took quite a while before I realized the linker might not be placing the GDT at the correct address.

Conclusion & Lesson Learned

I found that I needed to reread materials multiple times to fully grasp concepts. Fortunately, there was plenty of documentation available about the GDT and its implementation. Working with the GDT motivated me to document everything extensively, like these pages. I mainly do this to ensure I truly understand the functionality of each component I'm working with.

Build Pipeline

It took me quite a long time to truly understand what the best approach is to compile my kernel. There were a few requirements I had for myself:

  1. Rust code should be in rust files, ASM code should be in assembly files
  2. Assembly should be compiled with nasm
  3. The boot should start at _start, which is in an assembly file

I do not think it is practical to have 99% of the kernel in Rust. Assembly is great for having full control of your CPU: you know exactly what your code does. Assembly should mainly handle booting and wiring systems into the kernel, while Rust should make use of the resources it gets as a kernel.

Build Process

This had a few unforeseen challenges. While compiling both Assembly & Rust separately is straightforward, combining them into one bin file was quite challenging, especially for testing.

Here's what happens when running cargo run:

  1. First, it calls a Rust build script which:

    • Compiles the necessary assembly code with nasm
    • Links it together with Rust
  2. Rust is compiled twice (in a way):

    • The kernel itself is compiled as a library
    • This library is then linked together with the assembly code into a binary file
    • The resulting binary file can be used to execute the kernel in qemu
  3. After building, cargo uses a custom runner.sh script that:

    • Launches qemu to execute the kernel binary
    • Applies different flags based on the mode (e.g., cargo test runs without opening a window)

Major Challenge

I encountered one significant challenge that took a long time to resolve. While cargo run worked without issues, cargo test would fail to find the multiboot flags.

Initially, I was convinced the issue was with cargo, not my linker. After using objdump to investigate, I discovered that the multiboot section wasn't present in the binary. The solution was adding a single line to my linker script:

KEEP(*(.multiboot))

This forces the linker to keep the multiboot section in place, resolving the issue.
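For context, here is roughly how that rule sits inside a linker script. The section name and placement mirror the OSDev Bare Bones layout and are illustrative, not my exact linker.ld:

```ld
.boot :
{
    /* KEEP() stops the linker from garbage-collecting the multiboot
       header, which nothing in the code ever references directly. */
    KEEP(*(.multiboot))
}
```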

Design Decisions

Global Descriptor Table (GDT)

What is it

The GDT and protected mode are introduced in detail in the KFS 02 chapter above. In short: the GDT is the x86 data structure that defines memory segments, their types, and their privilege rings (0-3), and without a properly configured GDT the CPU cannot execute protected mode code at all.

For more information go to OSDev

My Technical Approach

My approach was as follows: I would start in boot.asm & set up the multiboot header. This then calls gdt_init, which is a Rust function. gdt_init sets up GDT_ENTRIES & creates the descriptor struct pointer that is passed to gdt.asm, which loads the table and sets the segment registers.

The multiboot setup is crucial because it ensures our kernel is loaded correctly by the bootloader and meets the Multiboot Specification, which is a standardized way for bootloaders to load operating systems.
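Concretely, a Multiboot 1 header in nasm typically looks like the following; this mirrors the OSDev Bare Bones example rather than my exact boot.asm:

```nasm
MBALIGN  equ 1 << 0              ; align loaded modules on page boundaries
MEMINFO  equ 1 << 1              ; ask the bootloader for a memory map
MBFLAGS  equ MBALIGN | MEMINFO
MAGIC    equ 0x1BADB002          ; Multiboot 1 magic number
CHECKSUM equ -(MAGIC + MBFLAGS)  ; magic + flags + checksum must equal 0

section .multiboot
align 4
    dd MAGIC
    dd MBFLAGS
    dd CHECKSUM
```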

Here are some snippets to give you a better idea:

; Both Rust functions
extern gdt_init
extern kernel_main

_start:
    ; The bootloader has loaded us into 32-bit protected mode
    ; but we need to set up our own GDT for proper segmentation
    call   gdt_init
    call   kernel_main

#[no_mangle] // Ensure rustc doesn't mangle the symbol name for external linking
pub fn gdt_init() {
    // Create the GDT descriptor structure
    // size is (total_size - 1) because the limit field is the maximum addressable unit
    let gdt_descriptor = GDTDescriptor {
        size: (size_of::<GdtGates>() - 1) as u16,  // Size must be one less than actual size
        offset: 0x00000800,  // Place GDT at specified address in memory
    };
    // Call assembly function to load GDT register (GDTR)
    gdt_flush(&gdt_descriptor as *const _);
}

Here's how our GDT entries are structured:

// Each GDT entry is 64 bits (8 bytes)
pub struct Gate(pub u64);

#[no_mangle]
#[link_section = ".gdt"]  // Place in special GDT section for linking
pub static GDT_ENTRIES: GdtGates = [
    // Null descriptor - Required by CPU specification for error checking
    Gate(0),
    // Kernel Code Segment: Ring 0, executable, non-conforming
    // Parameters: base=0, limit=max, access=0b10011010 (present, ring 0, code), flags=0b1100 (32-bit, 4KB granularity)
    Gate::new(0, !0, 0b10011010, 0b1100),
    // Kernel Data Segment: Ring 0, writable, grow-up
    // Parameters: base=0, limit=max, access=0b10010010 (present, ring 0, data), flags=0b1100 (32-bit, 4KB granularity)
    Gate::new(0, !0, 0b10010010, 0b1100),
    // User Code Segment: Ring 3, executable, non-conforming
    // Parameters: base=0, limit=max, access=0b11111010 (present, ring 3, code), flags=0b1100 (32-bit, 4KB granularity)
    Gate::new(0, !0, 0b11111010, 0b1100),
    // User Data Segment: Ring 3, writable, grow-up
    // Parameters: base=0, limit=max, access=0b11110010 (present, ring 3, data), flags=0b1100 (32-bit, 4KB granularity)
    Gate::new(0, !0, 0b11110010, 0b1100),
];

The assembly code in gdt.asm that actually loads the GDT:

gdt_flush:
    ; Load GDT descriptor structure address from stack
    mov  eax, [esp + 4]
    lgdt [eax]

    ; Enable protected mode by setting the first bit of CR0
    mov eax, cr0
    or  eax, 1
    mov cr0, eax

    ; Set up segment registers with appropriate selectors
    ; 0x10 points to the kernel data segment (third GDT entry)
    mov eax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax

    ; Far jump to flush pipeline and load CS with kernel code selector (0x08)
    ; This is necessary to fully enter protected mode
    jmp 0x08:.flush

.flush:
    ret

I am not going in-depth on why specific things happen, but each step is crucial for properly initializing protected mode.

Considerations

I considered using inline assembly, but in the end I did not, because it had a few known issues in Rust at the time. I felt much more comfortable using actual asm files instead of doing it inline. Inline assembly in Rust can be particularly problematic when dealing with low-level CPU features, as the compiler's assumptions about register usage and calling conventions might conflict with what we need for GDT setup.

Use C or Assembly instead of Rust

Since C is a much older & better-documented language than Rust, I considered using C for gdt_init instead of Rust. In the end I did not, because I wanted to stay true to my "pure" Rust kernel & I felt it would complicate things much more. Using C would have required additional complexity in the build system to handle multiple languages and their interaction.

Linker

Force the linker to put the GDT before the rest of the code. You can do without this, but you will have issues if you want to place it at a specific address. If that is not necessary for you, you can ignore this. The specific placement of the GDT in memory can be important for some system designs, particularly when dealing with memory management and virtual memory setup. This was my approach:

  /* Start at 2MB */
  . = 2M;


  .gdt 0x800 : ALIGN(0x800)
    {
    gdt_start = .;
    *(.gdt)
    gdt_end = .;
  }

Memory Layout Considerations

When placing the GDT at a specific address, it's important to ensure that:

  1. The address is accessible during the transition to protected mode
  2. The address doesn't conflict with other important system structures
  3. The address is properly aligned for optimal performance

Debugging

Debugging is essential for finding bugs and truly understanding what your program does. While different programming languages have various debugging approaches, for low-level programs like kernels, gdb is an extremely powerful and useful tool.

Unlike regular programs where you can directly run gdb on the compiled binary, kernel debugging requires additional steps since we're running our code in a VM.

There are several debugging approaches:

  1. Create logfiles
  2. Print to the terminal (via QEMU)
  3. Use gdb

The first two methods are similar, with the main difference being the output destination - file vs terminal. Choose based on your preference.

GDB with QEMU

QEMU can be configured to wait for a GDB connection before executing any code, enabling debugging. Here's how to set it up:

  1. Run cargo debug in the src/kernel directory (this is an alias for debug = "run debug")
  2. Cargo will build and run runner.sh, which adds the appropriate QEMU flags based on the mode (debug, test, or normal)
  3. For debugging, the crucial flags are -s -S: -s makes QEMU listen for a GDB connection on localhost:1234, and -S pauses the CPU until GDB tells it to continue
  4. A window will open & QEMU will wait for GDB to connect before starting execution

Connect GDB from another terminal:

gdb
(gdb) target remote localhost:1234

Load debug symbols:

(gdb) symbol-file <kernel bin>

Example debugging session:

gdb
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()

(gdb) symbol-file kernel.b
Reading symbols from kernel.b...done.

(gdb) break kernel_main     # Add breakpoint to any kernel function
Breakpoint 1 at 0x101800: file kernel/kernel.c, line 12.

(gdb) continue # Qemu starts the kernel
Breakpoint 1, kernel_main (mdb=0x341e0, magic=0) at kernel/kernel.c:12
12      {

Rust & Debugging

Rust's name mangling can make debugging challenging since function names may be modified. To ensure a function name remains unchanged for debugging purposes, add the following attribute:

#[no_mangle]

This prevents rustc from modifying the function name, making it easier to set breakpoints.

Reference

Building from Source

You'll need:

  • nix-shell for an isolated development environment
  • QEMU for testing

# Clone the repository
git clone https://github.com/xannyxs/fungul
cd fungul

# Initiate nix-shell
nix-shell shell.nix --command "zsh"

# Build the kernel
make

# Run in QEMU
make run

References

Official Documentation

OSDev and Community Resources

Learning Resources

Development Tools