I have to ask, from a student perspective how does one acquire sufficient knowledge of writing a shell? Does one need to take OS? Or is knowing data structure enough?

The shell should be significantly easier than say writing a compiler, although I think it's less well documented.

Much of the work in writing a shell is parsing -- I would estimate that parsing is 60% of the work, whereas it might only be 10-20% of the work of a compiler. This is both because interpreters are smaller than compilers (fewer transformations), and because shell is harder to parse than most programming languages.

But as you can see from my blog, it requires a few different parsing techniques, and there are some non-obvious choices I made that I think made it easier. (e.g. Lexer modes aka lexical state are a huge win for reducing complexity.)

The shell uses surprisingly few system calls -- fork/exec/wait, open/close/read/write, pipe/dup/fcntl, and that's almost it. These are well worth studying. File descriptors are non-obvious and essential.

I never took the OS class that most people take, which shows you how C and Unix work. This thread lists a few, and also check out xv6 in the parent thread:

https://news.ycombinator.com/item?id=13112960

I learned a lot about system calls by using strace on many programs over the years. I think Python helps a lot because it is interpreted and has relatively direct bindings to Unix system calls.

FWIW, although I learned interesting things about parsing, I don't think there is that much "hard" in the computer science sense about shell. It's mainly an exercise in software engineering -- how do you keep your program from degrading into an enormous mass of poorly-debugged if statements? (I would classify bash in that category, unfortunately)

As far as data structures, you should be able to write a shell without anything fancy at all. In fact some shells are basically just tons of linked lists. Linked lists make memory management easier in C, although I'm using a more modern, high-level style.

I also spent a significant amount of time reading bash, dash, and mksh code for this project. (and to a lesser degree zsh).

Happy to answer any other questions.

laumars

To be honest the biggest "gotchyas" I stumbled across when writing my shell was handling PTYs. In the end I used someone else's module instead of writing my own because they supported several different flavours of UNIX as well so it seemed a pointless duplication of effort for me to roll my own. However it was real humbling when I discovered just how little I actually new about PTYs compared to what I thought I knew.

chubot

I don't know anything about PTYs either, since my shell is mostly a batch program like Perl right now. I'm viewing it first from a programming language perspective.

As mentioned, I'm deferring to GNU readline for interactive features. I'd be interested in what problems you had with PTYs and what module you used?

Are PTYs standardized by POSIX? It feels like you should be able to write a shell against only POSIX APIs (and ANSI C).

I'd also be interested to see your shell. There are a bunch of alternative shells and POSIX shell implementations here:

https://github.com/oilshell/oil/wiki/ExternalResources

EDIT: I saw the sibling comment linking to https://github.com/lmorg/murex. Taking a look! A few months ago I started a thread with the authors of elvish, oh, mash, and NGS shells to exchange ideas. (elvish and oh are also written in Go.) I didn't know about your shell or I would have included you!