xv6 MP1: Kernel Observations

Objectives

In this first xv6-based assignment, you will learn how to:

obtain the correct version of xv6
install and set up the build and virtualization environments
build xv6
run xv6 in an emulator
debug/inspect the running kernel

Obtaining the repository

You should've received an e-mail invitation to share a private repository with me on BitBucket --- you need to have accepted this invitation in order to continue.

Assuming you have Git installed on your machine, simply run the following command (with your own username) to clone your repository into a directory named xv6 on your machine.

$ git clone https://bitbucket.org/michaelee/cs450-fall18-USERNAME.git xv6

Next, you should add a remote named upstream that points to my public xv6 repository, just in case I push any more commits to it for upcoming machine problems.

$ cd xv6
$ git remote add upstream https://bitbucket.org/michaelee/xv6.git

You can now fetch and merge the master branch from this upstream repository into your own master branch. Nothing should happen, as I won't have pushed any new commits yet.

$ git fetch upstream
From https://bitbucket.org/michaelee/xv6
 * [new branch]      master     -> upstream/master
$ git merge upstream/master
Already up-to-date.

For future assignments, I may make changes to the codebase that you would then need to fetch and merge into your repository before beginning work. The above two commands will accomplish this --- if any manual merging is necessary we will cover the procedure for doing so in class.

Running xv6

In order to run xv6, we'll be using a combination of the VirtualBox virtualization platform and the Vagrant virtual environment management tool. Installers for both pieces of software are available for Linux, Mac OS X, and Windows. Download and install them first, then come back here --- download links are below:

VirtualBox: https://www.virtualbox.org/wiki/Downloads
Vagrant: http://www.vagrantup.com/downloads.html

While VirtualBox comes with a graphical user interface, we won't be using it directly. Instead, everything will be done through Vagrant's command line interface.

Start up your platform's command line interpreter (a terminal emulator on Linux/OS X, or Windows command prompt), change into the cloned "xv6" directory, then type the following command:

$ vagrant up

If everything's been installed properly, this command will download the necessary images from the Internet and configure them so as to set up a virtual Linux machine that you can subsequently connect to in order to build and test xv6. It'll take a while. Output will look something like this:

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'ubuntu/focal64'...
==> default: Matching MAC address for NAT networking...
==> default: Checking if box 'ubuntu/focal64' version '20210125.0.1' is up to date...
...
...

Note: If this step hangs indefinitely or you see an error mentioning the unavailability of "VT-x", you may need to turn on virtualization acceleration in your BIOS before trying again. See this link for instructions.

When done, try the command "vagrant status" --- it should produce the following output:

Current machine states:

default                   running (virtualbox)

The VM is running. To stop this VM, you can run `vagrant halt` to
shut it down forcefully, or you can run `vagrant suspend` to simply
suspend the virtual machine. In either case, to restart it again,
simply run `vagrant up`.

This means the virtual machine is now running. You can connect to it using the command "vagrant ssh", which will result in the following:

Welcome to Ubuntu 20.04.1 LTS (GNU/Linux 5.4.0-64-generic x86_64)
...

vagrant@ubuntu-focal:/vagrant$

You're now logged into the virtual machine. The "/vagrant" directory on the virtual machine is synchronized with the git repository you cloned earlier -- when you ssh in, you're automatically taken to that directory. Now we'll build xv6 and run it using the QEMU emulator that was installed for you on the virtual machine. The following commands will do this:

$ make
$ make qemu-nox

The first make command should result in a (successful) build process, while the second should end with the following output:

xv6...
cpu0: starting 0
sb: size 1000 nblocks 941 ninodes 200 nlog 30 logstart 2 inodestart 32 bmap start 58
init: starting sh
$

This last prompt is produced by xv6, which you've now booted into within QEMU (which is in turn running on the Linux VM -- whew!).

You can explore a bit here, but when you're ready to exit the xv6 session, use the key sequence "Ctrl-a x" (i.e., hit the 'a' key while holding down the control key, then release both and hit the 'x' key). Note that Ctrl-a is a special prefix key used to send QEMU an interrupt sequence. Another useful key sequence is "Ctrl-a c", which drops you into the QEMU monitor console where you can inspect and control the emulated machine.

You should now be dropped back into the virtual Linux machine. To get out of that and back onto your own machine, just type "exit". At this point, if you wish, you can terminate the virtual machine by entering "vagrant halt". To bring it back up just use "vagrant up". To SSH in to a running virtual machine, use "vagrant ssh".

Working on xv6

The great thing about using Vagrant is that it automatically gives us the ability to synchronize directory contents between the host machine and the virtual (or "guest") machine. This means you can do all your coding with tools installed on your own machine (e.g., favorite IDEs/editors) and only SSH into the virtual machine for testing purposes.

The xv6 kernel codebase is already sitting in your cloned repository, so you will simply open and edit any files in there. When done, you will do a "vagrant ssh", then "cd /vagrant" in the guest machine to get to the synchronized directory (containing your changes) so as to build and test your changes.

To re-build the xv6 image, you'll always want to first do a "make clean" before running "make", as weird errors will often arise otherwise. Get in the habit of just running the commands together with "make clean ; make" (the shell accepts multiple commands separated by semicolons).

If you wish to debug (e.g., step through and set breakpoints in) the kernel, you can use "make qemu-nox-gdb" to run the emulator in a mode that allows you to attach a gdb session to it. This will require that you "vagrant ssh" into the virtual machine twice --- once to start up the emulator and another to run gdb --- or make use of a terminal multiplexer (tmux is already installed for you).

Kernel Observations

For this assignment you won't be modifying the kernel at all. Instead, you'll simply be stepping to various points in the xv6 startup process and jotting down observations of what is going on (typically, by inspecting and annotating the contents of memory).

Each of the following exercises identifies a key xv6 data structure and/or startup milestone and asks you some questions about it. To answer them, you will need to:

Figure out how to halt the debugger (e.g., by setting a breakpoint and running to it and/or stepping past it, or manually interrupting execution in QEMU or gdb) so that you can inspect the data structure or the state of the machine at the specified milestone. Clearly document the steps you take, e.g.,
```
1. Set a breakpoint at function FOO and run to it
2. Step until line X
3. Examine the value of variable BAR to get address Y
4. Inspect 4 8-byte values starting at Y in gdb with "x /4gx Y"
```

Document the relevant memory address(es) and content(s) -- a simple ASCII diagram will suffice, e.g.,

0x8dffe000:     0x000000000dfbc007
                0x0000000000000000
                0x0000000000000000
                0x0000000000000000

Sometimes, you may be able to get away with just inspecting a variable/structure in gdb, in which case you can simply show the gdb output, e.g.,

{sz = 4096, pgdir = 0x8dffe000, kstack = 0x8dfff000 "", state =
RUNNABLE, pid = 1, parent = 0x0, tf = 0x8dffffb4, context = 0x8dffff9c, 
chan = 0x0, killed = 0, ofile = {0x0 <repeats 16 times>}, 
cwd = 0x80110a14 <icache+52>, name =
"initcode\000\000\000\000\000\000\000"}

Typically, you should show raw memory contents (and addresses) as hexadecimal values.
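Keep in mind that this xv6 is a 32-bit kernel, so each 8-byte value printed by "x /4gx" really covers two adjacent 4-byte words. The following is a minimal Python sketch (the helper name split_giant_words is my own invention, not part of gdb or xv6) showing how such output maps onto 4-byte words on a little-endian x86 machine, using the sample values from the diagram above:

```python
def split_giant_words(base, giants):
    """Split 8-byte values (as printed by gdb's "x /Ngx") into 4-byte words.

    x86 is little-endian, so the low 32 bits of each 8-byte value live at
    the lower address and the high 32 bits at the address 4 bytes above it.
    Returns a list of (address, word) pairs.
    """
    words = []
    for i, giant in enumerate(giants):
        addr = base + 8 * i
        words.append((addr, giant & 0xFFFFFFFF))   # low half, lower address
        words.append((addr + 4, giant >> 32))      # high half, next word
    return words


# Sample values borrowed from the diagram above (illustrative only).
for addr, word in split_giant_words(0x8dffe000, [0x000000000dfbc007, 0x0]):
    print(f"0x{addr:08x}:     0x{word:08x}")
```

Alternatively, you can ask gdb for 4-byte words directly with the "w" size letter (e.g., "x /8wx Y") and skip the conversion entirely.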

Annotate the relevant memory contents or structure components per the exercise requirements. Again, you can simply include your annotations next to your diagram from the previous step, e.g.,

0x8dffe000:     0x000000000dfbc007  <--- this is the THINGAMAJIG
                0x0000000000000000       used for WHATCHAMACALLIT
                0x0000000000000000
                0x0000000000000000

or, more granularly, e.g.,

0xabcd1234:     0xdeadbeef
                  bits 0-7   (  0xef): specifies some DOODAD
                  bits 8-23  (0xadbe): specifies the GIZMO
                  bits 24-31 (  0xde): specifies a DOOHICKEY
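If you'd rather not pick bit fields apart by hand, a few lines of Python will do it. This is just a sketch (the bits helper is hypothetical, named by me) that reproduces the breakdown of the made-up value 0xdeadbeef shown above:

```python
def bits(value, lo, hi):
    """Return bits lo..hi (inclusive, bit 0 = least significant) of value."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


word = 0xdeadbeef  # the made-up value from the example above
print(f"bits 0-7:   0x{bits(word, 0, 7):x}")
print(f"bits 8-23:  0x{bits(word, 8, 23):x}")
print(f"bits 24-31: 0x{bits(word, 24, 31):x}")
```

The same shift-and-mask approach works for any field layout you find in the Intel manuals. Only the bit positions change.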

Exercise 1: System call IDT entry (4 points)

Locate the entry in the IDT that specifies the gate used for handling system calls.

What is its value? Break it down per the relevant Intel manual specification.
What differentiates it from other gate entries?
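To give you an idea of what "breaking it down" might look like, here is a hedged Python sketch that splits a 64-bit IA-32 interrupt/trap gate descriptor into its fields. The function name decode_gate and the descriptor value used below are both made up for illustration (it is not the value you will find in xv6), and you should still verify the field layout against the Intel manual yourself:

```python
def decode_gate(desc):
    """Decode a 64-bit IA-32 interrupt/trap gate descriptor into its fields."""
    def bits(lo, hi):
        # Bits lo..hi (inclusive) of the descriptor, bit 0 least significant.
        return (desc >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        # Handler offset is split: low half in bits 0-15, high half in 48-63.
        "offset": (bits(48, 63) << 16) | bits(0, 15),
        "selector": bits(16, 31),   # code segment selector
        "type": bits(40, 43),       # 0xE = 32-bit interrupt gate, 0xF = 32-bit trap gate
        "dpl": bits(45, 46),        # descriptor privilege level
        "present": bits(47, 47),    # segment-present flag
    }


# Hypothetical descriptor, NOT taken from xv6.
gate = decode_gate(0x12348E0000105678)
for name, value in gate.items():
    print(f"{name:>8}: 0x{value:x}")
```

Running the same decoder on a couple of other IDT entries and comparing the results field by field is one way to approach the second question.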

Exercise 2: First system call (12 points)

Stop execution just before the first system call handler is invoked (hint: the handler is invoked in the syscall function).

What system call is about to be executed? (Explain how you know this.)
List and document the full contents of the allocated portion of the kernel stack at this juncture. (Hint: the proc structure's kstack component points to the bottom of the stack area, and KSTACKSIZE is its size.)
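The arithmetic behind the hint can be sketched as follows. I'm assuming here that KSTACKSIZE is 4096 (check param.h in your repository) and that the "allocated portion" means everything between the current stack pointer and the top of the stack area. The kstack value is borrowed from the sample gdb output earlier, and the esp value is purely hypothetical:

```python
KSTACKSIZE = 4096  # assumption: matches the definition in xv6's param.h


def kstack_in_use(kstack, esp):
    """Return (top, used_bytes) for a kernel stack that grows downward.

    kstack is the lowest address of the stack area (the proc structure's
    kstack field); the stack starts at kstack + KSTACKSIZE and grows down
    toward kstack as values are pushed.
    """
    top = kstack + KSTACKSIZE
    if not kstack <= esp <= top:
        raise ValueError("esp does not point into this kernel stack")
    return top, top - esp


# kstack from the sample output above; esp is a made-up value.
top, used = kstack_in_use(0x8dfff000, 0x8dffff40)
print(f"stack top:  0x{top:08x}")
print(f"bytes used: {used} ({used // 4} 4-byte words)")
```

Once you know how many words are in use, a gdb command along the lines of "x /Nwx $esp" (with N replaced by the word count) will dump exactly that region.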

Exercise 3: First call to `swtch` (8 points)

Stop execution at the first call to swtch.

What are the top three words on the kernel stack at this point?

Next, step up to and over the "movl %edx, %esp" instruction.

What are the top six words on the kernel stack now?

Exercise 4: VM Size after first syscall (8 points)

Stop execution right after the first system call handler returns (hint: step over the appropriate line in syscall).

How many user-mode pages are allocated to the first process at this point? (Indicate the physical page base addresses in your listing, and be sure to show how you obtain the values.)
Show the first dozen instructions of the user process. (You do not need to annotate the instructions themselves.)
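When reading page directory and page table entries, remember that on 32-bit x86 (without PAE) the upper 20 bits of an entry hold the physical page base address and the lower 12 bits hold flags. Here is a rough Python sketch of that split, along with the page-count arithmetic. The names pages_for_size and decode_pte are my own, the constants are assumed to mirror those in xv6's mmu.h, and the sample entry is simply the one from the diagram earlier in this writeup, not necessarily what you will observe:

```python
PGSIZE = 4096   # assumption: page size, as in xv6's mmu.h
PTE_P = 0x001   # present
PTE_W = 0x002   # writable
PTE_U = 0x004   # user-accessible


def pages_for_size(sz):
    """Number of pages needed to cover sz bytes (rounding up)."""
    return (sz + PGSIZE - 1) // PGSIZE


def decode_pte(entry):
    """Split a 32-bit page directory/table entry into base address and flags."""
    return {
        "base": entry & 0xFFFFF000,      # upper 20 bits: physical page base
        "present": bool(entry & PTE_P),
        "writable": bool(entry & PTE_W),
        "user": bool(entry & PTE_U),
    }


# Illustrative values only: sz from the sample proc output, entry from the
# sample memory diagram.
print("pages:", pages_for_size(4096))
pte = decode_pte(0x0dfbc007)
print(f"base: 0x{pte['base']:08x}, present={pte['present']}, "
      f"writable={pte['writable']}, user={pte['user']}")
```

Only entries with the user flag set are accessible from user mode, which is worth keeping in mind when deciding which pages to count.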

Submission

Your answers to the above exercises should all be placed in a plain text file named "assign01.txt" in the xv6 repository you cloned at the outset. Be sure to delineate your answers to each exercise clearly.

Add your file to the repository with the command "git add assign01.txt", then commit it with "git commit -am "Adding assignment 1 answers"". Finally, push your commit with "git push origin master", which should result in output like this:

Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 4 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 287 bytes | 287.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To bitbucket.org:michaelee/cs450-SEMESTER-USERNAME.git
   58f54aa..8d699ce  master -> master

You can check on BitBucket at this point to make sure that your work was correctly pushed.