What does HackerNews think of xv6-public?

xv6 OS

Language: C

https://github.com/mit-pdos/xv6-public has under 10,000 lines of C and assembly including some user space programs.
xv6 was originally written for 32-bit x86; the RISC-V port is a relatively recent development. See e.g. https://github.com/mit-pdos/xv6-public for some of the earlier history.

rxv64 was written for a specific purpose: we had to ramp up professional engineers on both 64-bit x86_64 and kernel development in Rust; we were pointing them to the MIT materials, which at the time still focused on x86, but they were getting tripped up by 32-bit-isms and the original PC peripherals (e.g., accessing the IDE disk via programmed IO). Interestingly, the non sequitur about C++ aside, porting to Rust exposed several bugs or omissions in the C original; fixes were contributed back to MIT and applied (and survived into the RISC-V port).

Oh, by the way, the use of the term "SMP" predates Intel's usage by decades.

Hmm no idea. I only build 1 OS on my Mac: https://github.com/mit-pdos/xv6-public.

It obviously won't compile with Apple's gcc (which is actually a clang wrapper), so I have to install another compiler (i386-elf-gcc).

But on my Debian box, it works with the standard gcc.

I'd define an arena as the pattern where the arena itself owns N objects. So you free the arena to free all objects.
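To make that definition concrete, here is a minimal sketch of an arena backed by a single malloc'd block with bump allocation. Every name in it (`Arena`, `arena_init`, `arena_alloc`, `arena_free_all`) is hypothetical and not taken from xv6, DOOM, or any library; it assumes a C11 compiler and a fixed capacity that never grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; an illustration, not any project's API. */
typedef struct {
    unsigned char *base;  /* start of the single backing block */
    size_t cap;           /* total bytes available */
    size_t used;          /* bytes handed out so far */
} Arena;

/* Returns 0 on success, -1 if the backing block cannot be allocated. */
int arena_init(Arena *a, size_t cap) {
    assert(a != NULL);
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Bump allocation: round the offset up to max alignment, then advance it. */
void *arena_alloc(Arena *a, size_t n) {
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->cap || n > a->cap - start)
        return NULL;  /* out of space; this sketch never grows the arena */
    a->used = start + n;
    return a->base + start;
}

/* One call releases every object. Pointers handed out earlier now dangle. */
void arena_free_all(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

There is no per-object free at all: the only way to release anything is `arena_free_all`, and nothing stops old pointers from outliving it.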

My first job was at EA working on console games (PS2, GameCube, XBox, no OS or virtual memory on any of them), and while at the time I was too junior to touch the memory allocators themselves, we were definitely not malloc-ing and freeing all the time.

It was more like you load data for the level in one stage, which creates a ton of data structures in many arrays. Then you enter a loop to draw every frame quickly, and you avoid any allocation in that loop. There were many global variables.

---

Wikipedia calls it a region, zone, arena, area, or memory context, and that seems about right:

https://en.wikipedia.org/wiki/Region-based_memory_management

It describes history from 1967 (before C was invented!) and has some good examples from the 1990's: Apache web server ("pools") and Postgres database ("memory contexts").

I also just looked at these codebases:

https://github.com/mit-pdos/xv6-public (based on code from the 70's)

https://github.com/id-Software/DOOM (1997)

I looked at allocproc() in xv6, and it gives you an object from a fixed global array. This is similar to a lot of C code in the 80's and 90's -- it was essentially "kernel code" in that it didn't have an OS underneath it. Embedded systems didn't run on full-fledged OSes.
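The fixed-global-array pattern looks roughly like the sketch below. This is not the xv6 code: the names (`table`, `slot_alloc`, `slot_free`, `NSLOT`) are made up for illustration, and the locking a real kernel needs around the scan is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOT 4  /* hypothetical table size, kept small for illustration */

enum slot_state { UNUSED = 0, IN_USE };

struct slot {
    enum slot_state state;
    int id;
};

/* Fixed global table: the storage exists for the program's lifetime. */
static struct slot table[NSLOT];
static int next_id = 1;

/* Scan for a free entry and claim it; NULL when the table is full. */
struct slot *slot_alloc(void) {
    for (size_t i = 0; i < NSLOT; i++) {
        if (table[i].state == UNUSED) {
            table[i].state = IN_USE;
            table[i].id = next_id++;
            return &table[i];
        }
    }
    return NULL;
}

/* Freeing only marks the entry reusable; no memory is returned anywhere. */
void slot_free(struct slot *s) {
    assert(s >= table && s < table + NSLOT);
    s->state = UNUSED;
}
```

The capacity is decided at compile time, so "out of memory" becomes "table full", which the caller has to handle.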

DOOM tends to use a lot of what I would call "pools" -- dynamically allocated arrays of objects of a fixed size, and that's basically what I remember from EA.

Though in g_game.c, there is definitely an arena of size 0x20000 called "demobuffer". It's used with a bump allocator.

---

So I'd say

- malloc / free of individual objects was NEVER what C code looked like (aside from toy code in college)

- arena allocators were used, but global, fixed-size arrays, and dynamic pools were maybe more common.

- arenas are more or less a wash for memory safety. They help you in some ways, but hurt you in others.

The reason C programmers don't malloc/free all the time is for speed, not memory safety. Arenas are still unsafe.

When you free an arena, you have no guarantee there's nothing that points to it anymore.

Also, something that shouldn't be underestimated is that arena allocators break tools like ASAN, which use the malloc() free() interface. This was underscored to me by writing a garbage collector -- the custom allocator "broke" ASAN, and that was actually a problem:

https://www.oilshell.org/blog/2023/01/garbage-collector.html

If you want memory safety in your C code, you should be using dynamically instrumented allocators (ASAN, valgrind) and good test coverage. Depending on the app, arenas don't necessarily help, they can hurt.

An arena is a simple idea -- the problem is more whether that usage pattern actually matches your application, and apps evolve over time.

I wonder, how hard it would be to write something like xv6: https://github.com/mit-pdos/xv6-public

But in Lisp/Scheme? Preferably no C. Just Lisp/Scheme and assembly.

* tweetnacl.cr.yp.to

Crypto library in the size of a hundred tweets [ https://twitter.com/tweetnacl ] by the only person with a genuine claim to being able to write safe C & company.

* https://github.com/mit-pdos/xv6-public

&

* https://github.com/mit-pdos/xv6-riscv

UNIX v6 clone in ANSI C by influential Plan 9-era Bell Laboratories employee and now influential Google employee Russ Cox, along with influential computer virus author and son of one of the original UNIX authors Robert T. Morris; entire source code fits in under a hundred pages of well-typeset documents [ warning, old copy, you should generate a modern one: https://pdos.csail.mit.edu/6.828/2011/xv6/xv6-rev6.pdf ].

If folks want to experiment with a tiny, bare-bones unix-y OS on relatively modern platforms, xv6 (used as a teaching OS at MIT) is ~10kloc of C for the kernel and a handful of userland programs: https://github.com/mit-pdos/xv6-public

More information, docs, etc: https://pdos.csail.mit.edu/6.828/2014/xv6.html

A few years back I did a quick port of it to x86-64 as a project to learn more about 64bit intel (having been in the ARM world for some time) which was fun: https://github.com/swetland/xv6

Looking at the github repo [1] I am immediately underwhelmed by the commit messages. "nothing much," "nits," "nit". Really?

[1] https://github.com/mit-pdos/xv6-public

My personal recommendation concerning OS development is xv6: https://pdos.csail.mit.edu/6.828/2016/xv6.html

Printout of important parts of the source code: https://pdos.csail.mit.edu/6.828/2016/xv6/xv6-rev9.pdf

Book: https://pdos.csail.mit.edu/6.828/2016/xv6/book-rev9.pdf

(both are linked in the menu at the top of the page)

Review by John Regehr: http://blog.regehr.org/archives/1114

(Github Repositories:

> https://github.com/mit-pdos/xv6-public

> https://github.com/mit-pdos/xv6-book).

I ain't an expert in x86 so I'm not sure... only read some of Intel's manual but judging by the code at

https://github.com/mit-pdos/xv6-public

file "entryother.S" is asm that executes on AP's and "bootasm.S" executes on the BSP... both after the BIOS

so... maybe?

maybe that's what you're looking for?

Source available on GitHub: https://github.com/mit-pdos/xv6-public

Unless I'm missing something, the entire codebase is just 102 files (including README and such), no subdirectories, and 12,000 lines.

What a fantastic tool.