Assignment on Build an Operating System from Scratch
Instructions on setting up the virtualbox image:
Step 1: Download VirtualBox for your computer
Step 2: Copy the virtual hard drive image OS.VDI to your computer. This is a very large image, over 4GB! Give yourself plenty of time and make sure you have the space. You can get it from this link:
Step 3: Follow the instructions here to import the image into virtualbox and make a new virtual machine:
In the options, select Linux as your operating system and Ubuntu 32 bit as your version.
Boot the virtual machine. The Ubuntu desktop should come up within about one to five minutes, depending on the speed of your computer.
On the desktop you should see a set of folders labeled Project A, B, C, D, E, and F. You should also see a file simulator.jar, and at the top of the screen you should see a button to open the terminal.
Double click on simulator.jar. Press Boot Floppy A: If you see a screen come up after a minute or so with the message “It Works”, then you're ready to start the project.
Project A - Booting
Build an Operating System from Scratch
When a computer is turned on, it goes through a process known as “booting.” The computer starts executing the BIOS (which comes with the computer and is stored in ROM). The BIOS loads and executes a very small program called the “bootloader,” which is located at the beginning of the disk. The bootloader then loads and executes the “kernel” - a larger program that comprises the bulk of the operating system.
In this project you will write a very small kernel that will print out “Hello World” to the screen and hang up.
This is a warm-up project intended to get you familiar with the tools and simulator that you will use in subsequent projects.
You will need the following utilities to complete this and subsequent projects: Emumaker86 - An x86 processor simulator bcc (Bruce Evan's C Compiler) - A 16-bit C compiler
as86, ld86 - A 16 bit assembler and linker that typically comes with bcc
gcc - The standard 32-bit GNU C compiler nasm - The Netwide Assembler hexedit - A utility that allows you to edit a file in hexadecimal byte-by-byte dd - A standard low-level copying utility. This generally comes with UNIX.
pico, nano, or vi - A simple text-based editor
All the utilities(except nasm) described above are standard Unix tools and come(including nasm) with our OS.VDI image file download.
Description of the tools:
Emumaker86 is a simulator of a x86 computer. It allows you to simulate an operating system without potentially wrecking a real computer in the process. It is included in our download as a java executable (simulator.jar) from https://github.com/mdblack/simulator/raw/master/simulator.jar
The second thing you will need is a disk image. A disk image is a single file containing every single byte stored on a simulated floppy disk. In this project you are writing an operating system that will run off of a 3 1/2 inch floppy disk.
The first thing that the computer does after powering on is read the bootloader from the first sector of the floppy disk into memory and start it running. A floppy disk is divided into sectors, where each sector is 512 bytes. All reading and writing to the disk must be in whole sectors - it is impossible to read or write a single byte. The bootloader is required to fit into Sector 0 of the disk, be exactly 512 bytes in size, and end with the special hexadecimal code “55 AA.” Since there is not much that can be done with a 510 byte program, the whole purpose of the bootloader is to load the larger operating system from the disk to memory and start it running.
Since bootloaders have to be very small and handle such operations as setting up registers, it does not make sense to write it in any language other than assembly. Consequently you are not required to write a bootloader in this project - one is supplied to you as (bootload.asm) in the download. You will need to assemble it, however, and start it running.
If you look at bootload.asm, you will notice that it is a very small program that does three things. First it sets up the segment registers and the stack to memory 10000 hex. This is where it puts the kernel in memory. Second, it reads 10 sectors (5120 bytes) from the disk starting at sector 3 and puts them at 10000 hex. This would be fairly complicated if it had to talk to the disk driver directly, but fortunately the BIOS already has a disk read function prewritten. This disk read function is accessed by putting the various parameters into various registers, and calling Interrupt 13 (hex). After the interrupt, the program at sectors 3-12 is now in memory at 10000. The last thing that the bootloader does is it jumps to 10000, starting whatever program it just placed there. That program should be the one that you are going to write. Notice that after the jump it fills out the remaining bytes with 0, and then sets the last two bytes to 55 AA, telling the computer that this is a valid bootloader.
To install the bootloader, you first have to assemble it. The bootloader is written in x86 assembly language understandable by the NASM assembler. To assemble it, type nasm bootload.asm. The output file bootload is the actual machine language file understandable by the computer.
You can look at the file with the hexedit utility. Type hexedit bootload. You will see a few lines of numbers, which is the machine code in hexadecimal. Below you will see a lot of 00s. At the end, you will see the magic number 55 AA.
Next you should make an image file of a floppy disk that is filled with zeros. You can do this using the dd utility. Type dd if=/dev/zero of=floppya.img bs=512 count=2880. This will copy 2880 blocks of 512 bytes each from /dev/zero and put it in file floppya.img. 2880 is the number of sectors on a 3 1/2 inch floppy, and /dev/zero is a phony file containing only zeros. What you will end up with is a 1.47 megabyte file floppya.img filled with zeros.
Finally you should copy bootload to the beginning of floppya.img. Type dd if=bootload of=floppya.img bs=512 count=1 conv=notrunc. If you look at floppya.img now with hexedit, you will notice that the contents of bootload are at the beginning of floppya.img.
If you want, you can try running floppya.img with Emumaker86. Nothing meaningful will happen, however, because the bootloader just loads and runs garbage. You need to write your program now and put it at sector 3 of floppya.img.
A Hello World Kernel
Your program in this project should be very simple. It should simply print out Hello World to the top left corner of the screen and stop running. You should write your program in C and call it kernel.c.
When writing C programs for your operating system, you should note that all the C library functions, such as printf, scanf, putchar... are unavailable to you. This is because these functions make use of services provided by Linux. Since Linux will not be running when your operating system is running, these functions will not work (or even compile). You can only use the basic C commands.
Stopping running is the simple part. After printing hello, you don't want anything else to run. The simplest way to tie up the computer is to put your program into an infinite while loop.
Printing hello is a little more difficult since you cannot use the C printf or putchar commands. Instead you have to write directly to video memory. Video memory starts at address B8000 hex. Every byte of video memory refers to the location of a character on the screen. In text mode, the screen is organized as 25 lines with 80 characters per line. Each character takes up two bytes of video memory: the first byte is the ASCII code for the character, and the second byte tells what color to draw the character. The memory is organized line-by-line. Thus, to draw the letter 'A' at the beginning of the third line down, you would have to do the following:
- Compute the address relative to the beginning of video memory: 80 * (3-1) = 160
- Multiply that by 2 bytes / character: 160 * 2 = 320
- Convert that to hexadecimal (use the calculator utility): 320 = 0x140 (0x in C means that a hexadecimal number follows)
- Add that to B8000: 0xB8000+0x140 = 0xB8140 - this is the address in memory
- Write the letter 'A' to address 0xB8140. The letter A is represented in ASCII by the hexadecimal number 0x41.
- Write the color white (0x7) to address 0xB8141
Note: Look for text colors at : http://wiki.osdev.org/Text_UI#Video_Memory
Since 16 bit C provides no built in mechanism for writing to memory, you are provided with an assembly file kernel.asm. This file contains a function putInMemory, which writes a byte to memory. The function putInMemory can be called from your C program and takes three parameters:
- The first hexadecimal digit of the address, times 0x1000. This is called the segment.
- The remaining four hexadecimal digits of the address.
- The character to be written
To write 'A' to address 0xB8140, you should call:
putInMemory(0xB800, 0x140, 'A');
To print out the letter A in white at the beginning of the third line of the screen, call:
putInMemory(0xB800, 0x140, 'A'); putInMemory(0xB800, 0x141, 0x7);
You should now be able to write a C kernel program to print out Hello World at the top left corner of the screen.
Make sure the last thing you have in your main() is the line while(1); or your program will likely crash the simulator!
To compile your C program you cannot use the standard Linux C compiler gcc. This is because gcc generates 32-bit machine code, while the computer on start up runs only 16-bit machine code (most real operating systems force the processor to make a transition from 16-bit mode to 32-bit mode, but we are not going to do this). bcc is a 16-bit C compiler. bcc is fairly primitive and requires you to use the early Kernighan and Ritchie C syntax rather than later dialects. For example, you have to use /* */ for all comments; // is not recognized. You should not expect programs that compile with gcc to necessarily compile with bcc.
To compile your kernel, type bcc -ansi -c -o kernel_c.o kernel.c. The -c flag tells the compiler not to use any preexisting C libraries. The -o flag tells it to produce an output file called kernel.o.
kernel.o is not your final machine code file, however. It needs to be linked with kernel.asm so that the final file contains both your code and the int10() assembly function. You will need to type two more lines:
as86 kernel.asm -o kernel_asm.o to assemble kernel.asm ld86 -o kernel -d kernel_c.o kernel_asm.o to link them all together and produce kernel The file kernel is your program in machine code. To run it, you will need to copy it to floppya.img at sector 3, where the bootloader is expecting to load it (in later projects you will find out why sector 3 and not sector 1). To copy it, type dd if=kernel of=floppya.img bs=512 conv=notrunc seek=3 where “seek=3” tells it to copy kernel to the third sector.
Try running Emumaker86. If your program is correct, you should see “Hello World” printed out.
Notice that producing a final floppya.img file requires you to type several lines that are very tedious to type over and over. An easier alternative is to make a Linux shell script file. A shell script is simply a sequence of commands that can be executed by typing one name.
To generate a script, put all the previous commands into a single text file, with one command per line, and call it compileOS.sh. Then type chmod +x compileOS.sh, which tells Linux that compileOS.sh is an executable. Now, when you change kernel.c and want to recompile, simple type