@ryanbreen ryanbreen commented Feb 11, 2026

Summary

  • ARM64 boot stages now pass 184/184 (up from 126/184)
  • Unified test architecture framework in place for both x86_64 (252 stages) and ARM64 (184 stages)
  • Fixed multiple ARM64 kernel bugs: signal delivery, clone TTBR0, filesystem writes, symlink following, blocking pipe read, socket syscalls, and more
  • Fixed userspace test bugs: async-signal-safe handlers, getdents64 directory iteration, hello_world exit code

Key changes

Kernel fixes (ARM64):

  • Signal delivery cross-block unblock (SIGCHLD + sigsuspend)
  • Clone TTBR0 fallback for child processes
  • Blocking pipe read implementation
  • File/device write, FD_CLOEXEC exec cleanup
  • 7 missing ARM64 syscalls, argv setup, clear_child_tid
  • Symlink following in sys_open, truncate superblock update

Test infrastructure:

  • Shared boot stage definitions (Phase 4B+4C)
  • Arch-generic HAL wrappers (Phase 3A+3C+3D)
  • Shared test binary list for both architectures (Phase 5A)
  • QemuConfig abstraction in xtask (Phase 4A)
  • Unified graphical boot progress display

Userspace test fixes:

  • pause_test/sigsuspend_test: async-signal-safe handlers (raw_write_str instead of println!)
  • fs_block_alloc_test: loop getdents64 for large directories, fix exit code expectation
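
The handler fix above can be illustrated with a minimal sketch. It assumes the buffered stdout behaves like a `RefCell`-guarded buffer, which is what the double-borrow panic described in the commit log suggests; `BufferedStdout`, `print_with_interrupt`, `handler_print`, and `raw_write_str_model` are hypothetical names for illustration, not the real libbreenix API.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a buffered stdout guarded by a RefCell.
pub struct BufferedStdout {
    buf: RefCell<Vec<u8>>,
}

impl BufferedStdout {
    pub fn new() -> Self {
        Self { buf: RefCell::new(Vec::new()) }
    }

    /// Models println!: the buffer stays mutably borrowed while printing.
    /// `interrupt` plays the role of a signal handler firing mid-print.
    pub fn print_with_interrupt<F: FnOnce(&Self)>(&self, msg: &str, interrupt: F) {
        let mut guard = self.buf.borrow_mut();
        guard.extend_from_slice(msg.as_bytes());
        interrupt(self); // handler runs while the borrow is still held
    }

    /// Models a handler that prints through the same buffer.
    /// Returns false where a plain borrow_mut() would panic.
    pub fn handler_print(&self, msg: &str) -> bool {
        match self.buf.try_borrow_mut() {
            Ok(mut guard) => {
                guard.extend_from_slice(msg.as_bytes());
                true
            }
            Err(_) => false,
        }
    }
}

/// Models raw_write_str: writes straight to the sink, touching no shared buffer.
pub fn raw_write_str_model(sink: &mut Vec<u8>, msg: &str) {
    sink.extend_from_slice(msg.as_bytes());
}
```

In this model a handler that reuses the shared buffer fails whenever it interrupts an in-progress print, while the raw path has no shared state to conflict with.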

Remaining work for full unification

The framework is shared, but ARM64 has 68 fewer stages than x86_64 (184 vs. 252):

  • 32 stages: kthread/workqueue/softirq tests excluded from ARM64 (need kthread_test_only feature support)
  • 4 stages: x86-specific diagnostic subtests need ARM64 equivalents
  • ~32 stages: inherently architecture-specific boot stages (PCI vs VirtIO MMIO, GDT vs GIC, etc.)

Test plan

  • ARM64 boot stages: 184/184 passed
  • x86_64 build: 0 warnings, 0 errors
  • ARM64 native boot test: PASSED
  • x86_64 boot stages (252): not re-run in this session
  • Enable kthread/workqueue/softirq tests on ARM64

🤖 Generated with Claude Code

ryanbreen and others added 16 commits February 10, 2026 19:26
…se 1+2A)

Phase 1A: Extract 6 kthread test functions from main.rs to shared
task/kthread_tests.rs module with arch-generic wrappers. All tests
now callable from both x86_64 and ARM64.

Phase 1B: Extract workqueue and softirq tests into new shared
task/workqueue_tests.rs and task/softirq_tests.rs modules. All
x86_64-specific interrupt control replaced with arch-generic
abstractions.

Phase 2A: Move 14 modules unique to main.rs into lib.rs with
appropriate cfg guards, preparing for the extern crate kernel
conversion. All modules gated on target_arch = "x86_64" with
allow(dead_code) since they're called from the binary crate.

Wire new shared tests into ARM64 entry point (main_aarch64.rs).
Both architectures build with zero warnings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace ~40 duplicate mod declarations in main.rs with a single
extern crate kernel statement, matching ARM64's main_aarch64.rs
pattern. This eliminates the separate module tree that caused every
source file to be compiled twice and prevented main.rs from calling
lib.rs functions like exit_qemu().

Key changes:
- Remove all mod declarations, add #[macro_use] extern crate kernel
- Replace all crate:: references with kernel:: (~57 occurrences)
- Remove duplicate QemuExitCode and test_exit_qemu (use kernel::)
- Add use kernel::{...} imports for frequently used modules
- Preserve aarch64_stub for non-x86_64 compilation

No logic changes - only module resolution paths changed.
Both x86_64 and ARM64 build with zero warnings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Create xtask/src/qemu_config.rs with Arch enum and QemuConfig struct
that encapsulates all platform-specific build commands, QEMU binary
selection, serial file routing, and launch flags.

Factory methods QemuConfig::for_kthread() and QemuConfig::for_btrt()
replace inline if/else branches in kthread_test() and boot_test_btrt()
with config.build() and config.spawn_qemu() calls.

x86_64 boot test verified passing after Phase 2B conversion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add workqueue_test_only and kthread_stress_test exit blocks to
  main_aarch64.rs, matching x86_64's feature-gated test modes
- Fix unused import warnings by using fully qualified paths for
  ShellState and event_type (only used in post-test-exit shell loop)
- Gate clock_gettime_test and time_test imports in main.rs to avoid
  unused import warnings in kthread/workqueue test modes
- Use cfg_attr(allow(dead_code)) for functions unreachable in test modes

Both architectures build clean with zero warnings for all feature
configurations: testing, kthread_test_only, kthread_stress_test,
and workqueue_test_only.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Merge get_boot_stages() and get_arm64_boot_stages() into single
  get_boot_stages(arch: &Arch) with match on architecture
- Create xtask/src/test_monitor.rs with reusable TestMonitor struct
  for QEMU serial output monitoring (completion marker, panic detection,
  QEMU exit, timeout)
- Refactor kthread_test() to use TestMonitor, replacing ~60-line
  manual monitoring loop with clean monitor/match pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e 5A)

- Create kernel/src/boot/test_list.rs with canonical TEST_BINARIES
  constant listing all 63 userspace test binaries
- Replace hardcoded 20-line array in ARM64's load_test_binaries_from_ext2()
  with reference to shared list
- Add comment in x86_64 main.rs noting canonical list location
- Both architectures build with zero warnings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Phase 3A+3C+3D)

- Add 9 arch-generic wrapper functions in lib.rs dispatching to
  CpuOps/TimerOps trait implementations (arch_without_interrupts,
  arch_enable/disable_interrupts, arch_halt, arch_read_timestamp, etc.)
- Replace scattered x86_64::instructions::interrupts:: calls in shared
  code (kthread.rs, workqueue.rs, scheduler.rs, spinlock.rs, serial.rs)
  with arch-generic wrappers
- Wire up btrt.rs boot_timestamp_ns() to use arch_read_timestamp()
- Net change: 10 files, +135/-115 lines, while eliminating ~80 lines of
  duplicated #[cfg(target_arch)] blocks
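
As a rough illustration of the wrapper pattern described in this commit, the sketch below shows one way `arch_without_interrupts` could dispatch through a `CpuOps` trait. The trait methods, the `&self` receiver, and `MockCpu` are assumptions made so the example runs outside the kernel; the real trait may well use associated functions and different names.

```rust
use std::cell::Cell;

/// Hypothetical shape of the CpuOps trait (method names are assumptions).
pub trait CpuOps {
    fn interrupts_enabled(&self) -> bool;
    fn enable_interrupts(&self);
    fn disable_interrupts(&self);
}

/// Arch-generic wrapper: run `f` with interrupts masked, then restore
/// the previous state rather than unconditionally re-enabling.
pub fn arch_without_interrupts<C: CpuOps, R>(cpu: &C, f: impl FnOnce() -> R) -> R {
    let was_enabled = cpu.interrupts_enabled();
    cpu.disable_interrupts();
    let result = f();
    if was_enabled {
        cpu.enable_interrupts();
    }
    result
}

/// Mock CPU so the sketch can run in userspace; a real implementation
/// would touch RFLAGS.IF on x86_64 or DAIF on ARM64.
pub struct MockCpu {
    pub enabled: Cell<bool>,
}

impl CpuOps for MockCpu {
    fn interrupts_enabled(&self) -> bool {
        self.enabled.get()
    }
    fn enable_interrupts(&self) {
        self.enabled.set(true);
    }
    fn disable_interrupts(&self) {
        self.enabled.set(false);
    }
}
```

Shared code then calls only the wrapper, so files such as kthread.rs or spinlock.rs need no per-architecture cfg blocks for this operation.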

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Document shared boot graphics architecture in both entry points
  (render_task, SHELL_FRAMEBUFFER, render_queue, terminal_manager)
- Add missing workqueue and softirq initialization to ARM64 boot path,
  bringing init sequence to parity with x86_64
- Clean up display.rs with clear documentation of x86_64/ARM64
  framebuffer availability constraints
- Remove unnecessary dead_code cfg_attr from progress.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hold

The per-stage timeout (90s) was sending SIGTERM to QEMU when no new
markers appeared. In CI with nested virtualization, serial buffers flush
slowly so QEMU was being killed while the kernel was still running and
producing output. Now on per-stage timeout we check if QEMU is still
alive and keep polling until it exits or the overall timeout expires.
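
The new polling rule can be sketched as a pure decision function. This is a simplified model, not the actual TestMonitor code: `MonitorAction`, `on_tick`, and the parameter names are hypothetical, and the overall timeout is passed in because its value is not stated here.

```rust
use std::time::Duration;

/// Hypothetical outcomes of one monitoring tick.
#[derive(Debug, PartialEq, Eq)]
pub enum MonitorAction {
    /// Markers are still arriving within the per-stage window.
    Continue,
    /// Per-stage window elapsed, but QEMU is alive: keep polling, do not kill.
    StalledButAlive,
    /// QEMU exited on its own: stop and evaluate the captured output.
    QemuExited,
    /// Overall budget exhausted: give up.
    OverallTimeout,
}

pub fn on_tick(
    since_last_marker: Duration,
    total_elapsed: Duration,
    qemu_alive: bool,
    stage_timeout: Duration,
    overall_timeout: Duration,
) -> MonitorAction {
    if !qemu_alive {
        return MonitorAction::QemuExited;
    }
    if total_elapsed >= overall_timeout {
        return MonitorAction::OverallTimeout;
    }
    if since_last_marker >= stage_timeout {
        // The old behavior sent SIGTERM here; now a live QEMU gets more time.
        return MonitorAction::StalledButAlive;
    }
    MonitorAction::Continue
}
```

Under this rule only QEMU exiting or the overall timeout ends a run; a slow serial flush merely reports a stall.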

Also removes the ARM64 minimum pass threshold (min_stages=120) which
allowed partial passes. Both architectures now require 100% of stages
to pass.

Fixes unused import warning in kthread.rs (arch_disable_interrupts).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add missing syscall dispatch for Clone, Futex, GetRandom, Nanosleep,
SetTidAddress, ExitGroup, and Getppid to ARM64 dispatch table.

Fix create_user_process() to use create_process_with_argv() so
processes get proper argc/argv on the stack.

Add clear_child_tid handling on ARM64 exit path so thread::join()
works correctly (write 0 to tid address and futex-wake joiners).

Add missing "[test] Entering scheduler idle loop" marker in
main_aarch64.rs for boot stage validation.

Improves ARM64 boot stages from 126/184 to 134/184.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t syscalls

Implement sys_write handlers for RegularFile (ext2 write with O_APPEND
support) and Device (null/zero discard, console/tty serial output).
Previously both returned ENOTSUP (error 95).

Add FD_CLOEXEC cleanup to all four exec paths (x86_64 and ARM64,
both exec_process and exec_process_with_argv) per POSIX semantics.
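
A minimal sketch of the exec-time cleanup, assuming a simple fd table keyed by descriptor number. `FdEntry` and its fields are hypothetical; the reference-count decrement mirrors the `close_cloexec` fix made in a later commit of this PR.

```rust
use std::cell::Cell;
use std::collections::BTreeMap;
use std::rc::Rc;

/// FD_CLOEXEC has the value 1 on Linux.
pub const FD_CLOEXEC: u32 = 1;

/// Hypothetical fd table entry; `pipe_refs` models a shared pipe end count.
pub struct FdEntry {
    pub flags: u32,
    pub pipe_refs: Option<Rc<Cell<usize>>>,
}

/// Close every descriptor marked close-on-exec, releasing its pipe
/// reference instead of just dropping the table entry.
pub fn close_cloexec(table: &mut BTreeMap<i32, FdEntry>) {
    let doomed: Vec<i32> = table
        .iter()
        .filter(|(_, entry)| (entry.flags & FD_CLOEXEC) != 0)
        .map(|(fd, _)| *fd)
        .collect();
    for fd in doomed {
        if let Some(entry) = table.remove(&fd) {
            if let Some(refs) = entry.pipe_refs {
                refs.set(refs.get().saturating_sub(1));
            }
        }
    }
}
```

Descriptors without the flag survive the exec untouched, which matches the POSIX behavior the commit refers to.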

Add 4 missing socket syscalls to ARM64 dispatch: getsockname,
getpeername, setsockopt, getsockopt. This achieves 100% syscall
parity between x86_64 and ARM64.

Add /tmp directory to ext2 disk creation script - all filesystem
write tests require this directory to exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Port blocking pipe read from x86_64 to ARM64's shared sys_read.
When a pipe buffer is empty but writers still exist, the reader now
blocks instead of returning EAGAIN. This fixes the Broken pipe
(EPIPE) crash in coreutil tests where the parent would exit the
read loop prematurely, close the read end, and cause the child's
first write to fail.
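
The read rule described above reduces to three cases, sketched here as a pure function. `PipeRead` and `pipe_read_outcome` are illustrative names, not the kernel's actual types.

```rust
/// Hypothetical result of a read attempt on a pipe.
#[derive(Debug, PartialEq, Eq)]
pub enum PipeRead {
    /// Bytes are available: return up to the requested count.
    Data(usize),
    /// Buffer empty and no writers remain: end of file (read returns 0).
    Eof,
    /// Buffer empty but a writer still exists: block the reader.
    Block,
}

pub fn pipe_read_outcome(buffered: usize, writers: usize, want: usize) -> PipeRead {
    if buffered > 0 {
        PipeRead::Data(buffered.min(want))
    } else if writers > 0 {
        // Previously this case returned EAGAIN, so the parent gave up early.
        PipeRead::Block
    } else {
        PipeRead::Eof
    }
}
```

With the `Block` case in place, the parent's read loop ends only when the child has closed its write end, so the read end is no longer closed underneath an active writer.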

Also fix close_cloexec to properly decrement pipe/fifo/unix-socket
reference counts instead of just dropping the fd entry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Three fixes that improve ARM64 boot stages from 143/184 to 174/184:

1. Signal wakeup: send_signal_to_process() now calls both
   unblock_for_signal() and unblock_for_child_exit() so signals can
   interrupt waitpid (EINTR). Similarly, notify_parent_of_termination
   now also unblocks parents in sigsuspend/pause state.

2. Clone TTBR0: set_next_ttbr0_for_thread() now falls back to
   process.inherited_cr3 when page_table is None, fixing clone
   children that share parent's address space.

3. Combined effect fixes 31 stages: SIGCHLD delivery, clone/exec
   child execution, Rust std thread tests, and filesystem tests.
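
The TTBR0 fallback in fix 2 amounts to an `Option` fallback. The sketch below assumes both fields hold a physical page-table base address as `Option<u64>`; the real `Process` struct and its field types may differ.

```rust
/// Hypothetical subset of the process structure (field types are assumptions).
pub struct Process {
    /// Page table owned by this process, if it has its own address space.
    pub page_table: Option<u64>,
    /// Page-table base inherited from the parent by a clone child.
    pub inherited_cr3: Option<u64>,
}

/// Pick the value to load into TTBR0 for the next thread: prefer the
/// process's own page table, else fall back to the inherited one.
pub fn next_ttbr0(process: &Process) -> Option<u64> {
    process.page_table.or(process.inherited_cr3)
}
```

A clone child that shares its parent's address space has no page table of its own, so without the fallback there would be nothing to switch to.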

Co-Authored-By: Ryan Breen <rbreen@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…UDP log level, truncate superblock

- Rewrite signal_regs_test ARM64 to use single asm block preventing
  compiler from clobbering x20-x23 between set/read operations
- Add raw_waitpid() inline asm to waitpid_test and wnohang_timing_test
  to bypass libbreenix-libc errno conversion for ECHILD checks
- Implement symlink following in ext2 resolve_path() with depth limiting
- Add resolve_path_no_follow() for readlink syscall
- Fix UDP delivery log level from debug to info for test detection
- Update truncate_file() to increment superblock free block count
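
Depth-limited symlink following can be sketched against a toy in-memory filesystem. This is a deliberately simplified model: it resolves whole paths with absolute link targets only, and `Node`, `ResolveError`, and the limit of 8 are assumptions rather than the ext2 driver's actual types or constant.

```rust
use std::collections::HashMap;

/// Hypothetical limit; the real depth bound is not stated in this PR.
pub const MAX_SYMLINK_DEPTH: usize = 8;

#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    NotFound,
    TooManyLinks,
}

/// Toy filesystem node: a regular file or a symlink to an absolute path.
pub enum Node {
    File,
    Symlink(String),
}

/// Follow symlinks until a file is reached or the depth limit is hit.
pub fn resolve_path(fs: &HashMap<String, Node>, path: &str) -> Result<String, ResolveError> {
    let mut current = path.to_string();
    for _ in 0..=MAX_SYMLINK_DEPTH {
        let next = match fs.get(&current) {
            None => return Err(ResolveError::NotFound),
            Some(Node::File) => return Ok(current),
            Some(Node::Symlink(target)) => target.clone(),
        };
        current = next;
    }
    // More than MAX_SYMLINK_DEPTH links followed: treat as a loop.
    Err(ResolveError::TooManyLinks)
}

/// Variant for readlink: return the path itself without following the link.
pub fn resolve_path_no_follow(fs: &HashMap<String, Node>, path: &str) -> Result<String, ResolveError> {
    if fs.contains_key(path) {
        Ok(path.to_string())
    } else {
        Err(ResolveError::NotFound)
    }
}
```

The depth bound is what keeps a cycle such as `/a -> /b -> /a` from looping forever, and the no-follow variant lets readlink inspect the link itself.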

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…k alloc dir read

- Add signal_handler_test and signal_return_test to TEST_BINARIES list so
  ARM64 kernel loads them (fixes stages 27, 28)
- Use net_log! macro for UDP packet delivery message so it's visible on
  ARM64 where log::info! is a no-op (fixes stage 36)
- Add raw_getdents64() syscall for directory reading in fs_block_alloc_test
  since read() on directory FDs returns EISDIR (fixes stage 140)
- Add already_pending signal check before entering WFI loop in pause/sigsuspend
  to handle race where signal arrives before BlockedOnSignal state

Improves ARM64 boot stages from 178/184 to 182/184.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t code

- pause_test/sigsuspend_test: Replace println! with raw_write_str() in
  signal handlers to avoid RefCell double-borrow panic when signal fires
  during another println. Also add race-safe path in pause_test for when
  SIGUSR1 arrives before pause() is called (common on ARM64).
- fs_block_alloc_test: Loop getdents64 calls to read all directory entries
  (was only reading first 1024 bytes, missing hello_world in /bin/ with
  100+ entries). Also fix expected exit code 42→0 since hello_world is
  now hello_std_real which exits 0. Remove unused read() FFI declaration.
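
The getdents64 loop can be sketched as follows. The record layout is the standard `linux_dirent64` one (8-byte inode, 8-byte offset, 2-byte record length, 1-byte type, then a NUL-terminated name); the closure stands in for the raw syscall, and `read_all_entries`, `parse_dirents`, and `encode_dirent` are hypothetical helpers, not the test's actual code. Little-endian byte order is assumed, which holds for both x86_64 and ARM64 here.

```rust
/// Offset of d_name inside a linux_dirent64 record.
const NAME_OFFSET: usize = 19;

/// Extract entry names from one buffer of packed linux_dirent64 records.
pub fn parse_dirents(buf: &[u8]) -> Vec<String> {
    let mut names = Vec::new();
    let mut pos = 0;
    while pos + NAME_OFFSET <= buf.len() {
        // d_reclen lives at byte offset 16 of each record.
        let reclen = u16::from_le_bytes([buf[pos + 16], buf[pos + 17]]) as usize;
        if reclen < NAME_OFFSET || pos + reclen > buf.len() {
            break; // malformed or truncated record
        }
        let name = &buf[pos + NAME_OFFSET..pos + reclen];
        let end = name.iter().position(|&b| b == 0).unwrap_or(name.len());
        names.push(String::from_utf8_lossy(&name[..end]).into_owned());
        pos += reclen;
    }
    names
}

/// Call `getdents` repeatedly until it returns 0 (end of directory).
/// A negative return is passed back as the error value.
pub fn read_all_entries<F>(mut getdents: F) -> Result<Vec<String>, isize>
where
    F: FnMut(&mut [u8]) -> isize,
{
    let mut buf = [0u8; 1024];
    let mut all = Vec::new();
    loop {
        let n = getdents(&mut buf);
        if n < 0 {
            return Err(n);
        }
        if n == 0 {
            return Ok(all);
        }
        let len = (n as usize).min(buf.len());
        all.extend(parse_dirents(&buf[..len]));
    }
}

/// Build one record for experimenting with the parser (inode, offset, type left zero).
pub fn encode_dirent(name: &str) -> Vec<u8> {
    let reclen = (NAME_OFFSET + name.len() + 1 + 7) & !7; // 8-byte aligned
    let mut rec = vec![0u8; reclen];
    rec[16..18].copy_from_slice(&(reclen as u16).to_le_bytes());
    rec[NAME_OFFSET..NAME_OFFSET + name.len()].copy_from_slice(name.as_bytes());
    rec
}
```

A single call fills at most one 1024-byte buffer, so a directory with 100+ entries needs several calls before every name has been seen.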

Verified: 184/184 ARM64 boot stages pass, x86_64 build clean (0 warnings).

Co-Authored-By: Ryan Breen <rbreen@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ryanbreen ryanbreen merged commit ab00035 into main Feb 11, 2026
2 of 3 checks passed