Project 2: Basic Multiprogramming and System Calls in Nachos

Deadline: July 27, 2003 23:59 PST

Overview

The Nachos code you have been given is capable of executing user application programs, but in an extremely limited way. In particular, at most one user application process can run at a time, and the only system call that has been implemented is the halt system call that shuts down Nachos. In this assignment, you will correct some of these deficiencies and turn Nachos into a multiprogramming operating system with a working set of basic system calls.

In the first part of this assignment, you are to modify Nachos so that it can run multiple user applications at once.

That means that you have to implement the Fork, Yield, Exit, Exec, Kill, and Join system calls. The detailed specifications for these system calls are given below.

In the second part of this assignment, you are to implement the Create, Open, Read, Write, and Close system calls.

For this assignment, you will be working on the version of Nachos in the userprog directory. You will also need to write some simple user application programs, compile them using the MIPS cross compiler, and run them under Nachos to test that your modifications to Nachos actually work. User application programs are written in ANSI C. To write them, go to the test subdirectory of Nachos. Several sample applications are already provided in that directory. The only one that will run properly on an unmodified Nachos is the halt program. When you are in the code directory and type make, the programs in the test directory are cross compiled and executable Nachos user applications are created. Check out the test directory and note the source files such as halt.c with their corresponding executables such as halt. Then go to the userprog directory. Nachos should have been built already (i.e., there is a nachos executable in this directory). If not, type make nachos. Then, type nachos -x ../test/halt. This will start Nachos and ask it to load and run the halt program. You should see a message indicating that Nachos is halting at the request of the user program.

In brief, what happens when you type nachos -x halt is as follows:

Trace through the Nachos code until you think you understand how program halt is executed.

In this assignment, you will also need to know the object file format for Nachos. This is what NOFF (Nachos Object File Format) looks like.

-----------             
| bss     | segment   
-----------           
| data    | segment   
-----------           
| code    | segment   
-----------             
| header  |      
-----------         

Noff-format files consist of four parts. The first part, the Noff header, describes the contents of the rest of the file, giving information about the program's instructions (code segment), initialised variables (data segment) and uninitialised variables (bss segment).

The Noff header resides at the very start of the file and contains pointers to the remaining sections. Specifically, the Noff header contains

--------------
|magic       |  0xbadfad
--------------

For each of the three segments

--------------
|virtual addr|  points to the location in virtual memory
--------------
|in file addr|  points to a location within the NOFF file where section begins
--------------
|size        |  size of the segment in bytes
--------------

This information about NOFF can be found in the bin/noff.h file.
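The layout described above can be sketched as a pair of C++ structs. This is only an illustration: the field names noffMagic, inFileAddr, size and uninitData are assumptions modelled on the description in this section (check your own copy of bin/noff.h), ParseNoffHeader is a hypothetical helper, and the sketch assumes the host byte order matches the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Magic number expected in the first word of a NOFF file.
constexpr std::int32_t kNoffMagic = 0xbadfad;

// One descriptor per segment (code, initialised data, bss).
struct Segment {
    std::int32_t virtualAddr;  // location of the segment in virtual memory
    std::int32_t inFileAddr;   // offset in the NOFF file where the segment begins
    std::int32_t size;         // size of the segment in bytes
};

struct NoffHeader {
    std::int32_t noffMagic;    // should equal kNoffMagic
    Segment code;              // program instructions
    Segment initData;          // initialised variables
    Segment uninitData;        // uninitialised variables (bss)
};

// Hypothetical helper: copy the header out of a raw file image and
// report whether the magic number matches.
bool ParseNoffHeader(const unsigned char* image, std::size_t len, NoffHeader* out) {
    assert(image != nullptr && out != nullptr);
    if (len < sizeof(NoffHeader)) return false;
    std::memcpy(out, image, sizeof(NoffHeader));
    return out->noffMagic == kNoffMagic;
}
```

A loader would typically reject the file when the magic check fails, before trusting any of the segment descriptors.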

When you create user programs and compile them using the MIPS cross compiler, you get COFF (Common Object File Format) files. These are normal MIPS object files (executables). For such a file to be runnable under Nachos, it has to be turned into NOFF. This is done by using bin/coff2noff, the COFF to NOFF translator. Please check the Makefile in the code/test directory to see how this is done. You will need to add code in start.s and userprog/syscall.h in order to add a new system call (Kill).

Exercises

In the first part of this assignment, you are to implement the Fork(), Yield(), Exit(), Exec(), Kill(), and Join() system calls. The function prototypes of the system calls are listed in syscall.h and act as follows:

Test your code by creating several user programs that exercise the various system calls. Be sure to test each of the system calls, and try forking up to three processes (since each has a 1024 byte stack, that's all that will fit in Nachos' 4K byte physical memory right now) and have them yield back and forth for a while to make sure everything is working. Since the facility for I/O from user programs will be implemented later in this assignment, you may initially have to rely on debugging printouts in the kernel to track what is happening. Use the DEBUG macro for this, and make sure that debugging printouts are disabled by default when you submit your code for grading.

In the second part of this assignment, you are to implement the file system calls: Create, Open, Read, Write, and Close. The semantics of these calls are specified in syscall.h. You should extend your file system implementations to handle the console as well as normal files.

To support the system calls that access the console device, you will probably find it useful to implement a SynchConsole class that provides the abstraction of synchronous access to the console. The file progtest.cc has the beginning of a SynchConsole implementation.

What to Submit

You should include a file reports/project2.txt that explains how you designed the code, how your code works, and how to run tests.

You should also indicate what does not work and explain your efforts in order to get partial credit. Also, do not forget to put your name (and the name of your team member) in the writeup file.

Grading

System calls Fork and Exec are the most complex assignments and they are 25% of the total grade each (50% for both). File system calls are 35% of the grade and Yield, Exit, Join, and Kill are the remaining 15%.

If something is not precisely specified, we expect you to make a reasonable assumption, clearly explain it in your report, and proceed with your implementation.

Turnin

  1. Go to the directory that contains the top level code directory of your nachos.
  2. Turn in your "code" directory by typing turnin project2@cs170 code.

You can turn in multiple times per project. Earlier versions will be discarded. The timestamp of the turnin has to be before midnight of the due date. Please delete core files before turning in.

Required Output Prints

In order for us to see how your program works, some debugging information must be added in your code. You should print out the following information:

  1. Which system call is called? Whenever a system call is invoked, print:
    System Call: [pid] invoked [call]
    where [pid] is the identifier of the process (SpaceID) and [call] is the name of one of the system calls that you had to implement. Just give the name without parentheses (e.g., Fork, Create, Exit).
  2. How many pages are allocated to the user program when it is loaded? The following line should be printed when a program is loaded:
    Loaded Program: [x] code | [y] data | [z] bss
    where [x], [y] and [z] are the sizes of the code, (initialized) data and bss (uninitialized) data segments in bytes.
  3. Which process has been forked and how many pages are allocated to the forked process? Whenever a new process is forked, print the following line:
    Process [pid] Fork: start at address [addr] with [numPage] pages memory
    where [pid] is the process identifier (SpaceID) of the parent process, [addr] is the virtual address (in hexadecimal format) of the function that the new (child) process starts to execute and [numPage] is the number of pages that the new process gets.
  4. What is the file name in the Exec system call? When a new process should be executed, print the following line:
    Exec Program: [pid] loading [name]
    where [pid] is the identifier of the process that executes an Exec system call and [name] is the name of the executable file to be loaded.
  5. When a thread exits, print the following line:
    Process [pid] exits with [status]
    where [pid] is the identifier of the exiting process and [status] is the exit code.
  6. When a thread is killed, print the following line:
    Process [pid] killed process [killed-pid]
    where [pid] is the identifier of the process calling the Kill system call and [killed-pid] is the pid of the killed process. If the call is unsuccessful, print:
    Process [pid] cannot kill process [killed-pid]: doesn't exist
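
One way to keep these lines consistent is to build each of them in a single place. The sketch below is a hedged illustration, not required code: all function names are hypothetical, and it assumes the bracketed names are placeholders and prints the bare value. If your graders expect literal brackets around the values, adjust the format strings accordingly.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Each helper returns one required line without the trailing newline.
std::string SyscallLine(int pid, const char* call) {
    assert(call != nullptr);
    char buf[128];
    std::snprintf(buf, sizeof buf, "System Call: %d invoked %s", pid, call);
    return buf;
}

std::string LoadedLine(int code, int data, int bss) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "Loaded Program: %d code | %d data | %d bss",
                  code, data, bss);
    return buf;
}

std::string ForkLine(int pid, unsigned addr, int numPages) {
    char buf[128];
    // The start address is printed in hexadecimal, as requested above.
    std::snprintf(buf, sizeof buf,
                  "Process %d Fork: start at address 0x%x with %d pages memory",
                  pid, addr, numPages);
    return buf;
}

std::string ExecLine(int pid, const char* name) {
    assert(name != nullptr);
    char buf[256];
    std::snprintf(buf, sizeof buf, "Exec Program: %d loading %s", pid, name);
    return buf;
}

std::string ExitLine(int pid, int status) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "Process %d exits with %d", pid, status);
    return buf;
}

std::string KillLine(int pid, int killedPid, bool success) {
    char buf[128];
    std::snprintf(buf, sizeof buf,
                  success ? "Process %d killed process %d"
                          : "Process %d cannot kill process %d: doesn't exist",
                  pid, killedPid);
    return buf;
}
```

Inside the kernel you would pass the resulting string to printf (followed by a newline) at the point where the corresponding event happens.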

Issues to Consider

Here is an outline of the major issues you will have to deal with to make Nachos into a multiprogrammed system:

Suggestions and FAQ from previous classes

FAQ - Part 1 (Fork, Exec, Join, Exit, and Yield)

  1. Question: What should I start implementing first?

    Answer: I suggest starting by creating a class/structure for a memory manager, a process control block, and a process control block table. I would then implement the system calls in the following order: Yield, Exit, Join, Exec, and Fork. Yield is very simple, so it shouldn't take you any time at all. Exit and Join are related, so it helps to do them together. Fork and Exec are the most difficult, so it helps to do them after the others. I would not start part 2 of the project until you have completely finished part 1. Remember, even though part 2's system calls are completely different from part 1's, it will be necessary to add some code to part 1's system calls to incorporate part 2's system calls.

  2. Question: How can I test part one (why doesn't printf work in my test programs)?

    Answer: Until you have part 2 done, you will have to test part 1 based on flow of execution. For example, to test whether Fork/Exec works, you Fork/Exec a function/executable and then call Halt() in the function/executable. If Nachos halts when you run your test program, then your system call is probably working. Once you have the Write system call implemented, you can further test part 1 by putting Write statements wherever you wish to print something out. This will be useful for testing Join.

  3. Question: What does the memory manager do?

    Answer: It is merely a way of assigning pages from the system's (Nachos) main memory. It does not actually allocate memory (via malloc/new or some equivalent). Basically you will want to enforce atomic (synchronized) operations on a bitmap that represents the number of pages in main memory (there are 32 pages in Nachos main memory right now).
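
    A minimal sketch of such a memory manager follows. It is a standalone illustration: the class and method names are hypothetical, and std::vector<bool> and std::mutex stand in for the bitmap and lock classes you would use inside Nachos.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <mutex>
    #include <vector>

    // Hands out physical page numbers. It never touches page contents and
    // never calls malloc/new for the pages themselves.
    class MemoryManager {
    public:
        explicit MemoryManager(int numPages)
            : used_(numPages, false), free_(numPages) {}

        // Returns a free physical page number, or -1 if none is left.
        int AllocPage() {
            std::lock_guard<std::mutex> guard(lock_);
            for (std::size_t i = 0; i < used_.size(); ++i) {
                if (!used_[i]) {
                    used_[i] = true;
                    --free_;
                    return static_cast<int>(i);
                }
            }
            return -1;
        }

        // Marks a previously allocated page as free again.
        void FreePage(int page) {
            std::lock_guard<std::mutex> guard(lock_);
            assert(page >= 0 && page < static_cast<int>(used_.size()) && used_[page]);
            used_[page] = false;
            ++free_;
        }

        // Number of pages currently available.
        int FreePages() {
            std::lock_guard<std::mutex> guard(lock_);
            return free_;
        }

    private:
        std::mutex lock_;          // makes every bitmap operation atomic
        std::vector<bool> used_;   // used_[i] is true when physical page i is taken
        int free_;
    };
    ```

    With the 32 pages mentioned above, you would construct it once as MemoryManager(32) and share it among all address spaces.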

  4. Question: What does a process control block contain?

    Answer: A process control block (PCB) contains the attributes of a process. Some of the major attributes are the thread, the pid (SpaceID), open files, etc. (remember that the thread has a pointer to the addrspace, which is an indirect attribute of a PCB). There should be a global table of process control blocks as well. Remember that you will also want to be able to get a particular process's condition so that it can be waited on in Join (if necessary) and broadcast in Exit. It may not seem like you are using the process control blocks much in part 1 of the project (except for adding and deleting them), but you will use them more in part 2 of the project.
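
    As a rough illustration, a PCB and its global table could look like the sketch below. The names PCB, PCBTable, Create, Find and Remove are hypothetical, the pid allocation scheme is an assumption, and the kernel-specific members (thread, open files, condition) are only indicated in comments.

    ```cpp
    #include <cassert>
    #include <map>
    #include <memory>

    struct PCB {
        int pid = -1;        // SpaceID handed back to user programs
        int parentPid = -1;  // -1 for the first process
        // A real PCB would also hold the Thread*, the list of open files,
        // and the condition that Join waits on and Exit broadcasts.
    };

    class PCBTable {
    public:
        // Creates a PCB with a fresh pid and records it in the table.
        PCB* Create(int parentPid) {
            int pid = nextPid_++;
            assert(table_.count(pid) == 0);
            std::unique_ptr<PCB> pcb(new PCB());
            pcb->pid = pid;
            pcb->parentPid = parentPid;
            PCB* raw = pcb.get();
            table_[pid] = std::move(pcb);
            return raw;
        }

        // Returns the PCB for pid, or nullptr if no such process exists.
        PCB* Find(int pid) {
            auto it = table_.find(pid);
            return it == table_.end() ? nullptr : it->second.get();
        }

        // Removes the PCB; returns false if pid was not in the table.
        bool Remove(int pid) { return table_.erase(pid) == 1; }

    private:
        int nextPid_ = 0;
        std::map<int, std::unique_ptr<PCB>> table_;
    };
    ```

    A lookup that returns nullptr is a natural way to detect the "doesn't exist" case for Kill and Join.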

  5. Question: How do I add a new file for this project?

    Answer: You will want to create your .h and .cc files in userprog (every new file for this project can be added in this directory). Then you will want to add the .h, .cc, and .o to the list of files in the top level Makefile.common in the correct USERPROG section. Be sure not to add your file to the end of the list of files, but instead place it in the same section as the other userprog files. In each USERPROG section (i.e., _H, _C, and _O), all the files from each directory should be placed with the other files from that directory. Once this is done, type "make depend" in the userprog directory, and then you will be able to type "make" to compile as before.

  6. Question: How do I translate the name of the executable when implementing Exec?

    Answer: You will want to put a translate function in addrspace that is similar to the translate function in Machine. You will use this function to translate a virtual address into a physical address (one page at a time).
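
    The idea can be sketched as follows. This is a simplified, standalone illustration: PageEntry, Translate and ReadUserString are hypothetical names, the page table is a plain vector rather than the Nachos structure, and the 128-byte page size is derived from the 4K bytes / 32 pages figures given in this document.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <string>
    #include <vector>

    constexpr int kPageSize = 128;  // assumption: 4096 bytes / 32 pages

    // Simplified page table entry: virtual page i maps to physicalPage.
    struct PageEntry {
        int physicalPage;
        bool valid;
    };

    // Translates a virtual address; returns -1 if it is outside the address space.
    int Translate(const std::vector<PageEntry>& pageTable, int vaddr) {
        if (vaddr < 0) return -1;
        std::size_t vpn = static_cast<std::size_t>(vaddr / kPageSize);
        if (vpn >= pageTable.size() || !pageTable[vpn].valid) return -1;
        return pageTable[vpn].physicalPage * kPageSize + vaddr % kPageSize;
    }

    // Copies a NUL-terminated string out of user memory. Every byte is
    // translated, so names that cross a page boundary are handled.
    // Returns false on a bad address or if no NUL is found within maxLen bytes.
    bool ReadUserString(const std::vector<PageEntry>& pageTable,
                        const char* mainMemory, int vaddr, int maxLen,
                        std::string* out) {
        assert(mainMemory != nullptr && out != nullptr);
        out->clear();
        for (int i = 0; i < maxLen; ++i) {
            int paddr = Translate(pageTable, vaddr + i);
            if (paddr < 0) return false;
            char c = mainMemory[paddr];
            if (c == '\0') return true;
            out->push_back(c);
        }
        return false;
    }
    ```

    In the Exec handler, vaddr would be the argument register holding the address of the file name.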

  7. Question: Is there a difference between the parameters of Exec and Fork?

    Answer: Yes, there is. Exec takes the string representing the relative path (from where you run Nachos, userprog in this case) to the test program you wish to exec [i.e., Exec("../test/myExecedProgram")]. Fork takes the name of the function you wish to fork. It is not a string but a function pointer, so it has the form Fork(myFunction), where you have implemented void myFunction() previously in your test program. In Exec you are translating the string that is passed in, and in Fork you are using the pointer to myFunction as the PCReg value for when you run the new thread. If you try to pass a string to Fork or a name (not in string form) to Exec, you will have serious problems.

  8. Question: Does a Forked thread use the same addrspace as the thread who Forked it?

    Answer: No, it does not. It uses a COPY of the addrspace of the thread that Forked it. This means you must create a new addrspace that has the same size page table as the Forking thread, and then you must copy the Forking thread's pages in memory into the Forked thread's pages in memory.

  9. Question: The project spec says we need to add a function ReadFile to Addrspace. What is this used for?

    Answer: This is used for copying the executable's code and data segments into memory (in the constructor for Addrspace that Exec calls). noffH.code.virtualAddr is the logical address of the executable's code segment and noffH.initData.virtualAddr is the logical address of the executable's data segment. You no longer want to zero out (bzero) the memory. You need to copy these sections page by page into main memory. You will use your Addrspace::Translate to translate the logical data/code address to the physical address (in memory). Then you can use the C/C++/Unix "bcopy" to copy it over.
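
    The page-by-page copy can be sketched as below. This is an illustration under stated assumptions: LoadSegment is a hypothetical name, the executable is modelled as an in-memory byte array instead of a Nachos file object, the page table is a plain vector of physical page numbers, and std::memcpy is swapped in for bcopy.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstring>
    #include <vector>

    constexpr int kPageSize = 128;  // assumption: 4096 bytes / 32 pages

    // pageTable[vpn] holds the physical page number for virtual page vpn.
    int Translate(const std::vector<int>& pageTable, int vaddr) {
        return pageTable[vaddr / kPageSize] * kPageSize + vaddr % kPageSize;
    }

    // Copies `size` bytes starting at `file + fileAddr` to virtual address
    // `vaddr`. Each memcpy stays within one page, because consecutive virtual
    // pages need not be consecutive in physical memory.
    int LoadSegment(const std::vector<int>& pageTable, char* mainMemory,
                    const char* file, int fileAddr, int vaddr, int size) {
        assert(size >= 0);
        int copied = 0;
        while (copied < size) {
            int va = vaddr + copied;
            int roomInPage = kPageSize - va % kPageSize;
            int chunk = std::min(roomInPage, size - copied);
            std::memcpy(mainMemory + Translate(pageTable, va),
                        file + fileAddr + copied, chunk);
            copied += chunk;
        }
        return copied;
    }
    ```

    You would call it once for the code segment and once for the initialised data segment, using the virtual address, in-file address and size from the NOFF header.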

  10. Question: Where should we put the global structures/classes?

    Answer: You can put these in system.h in the threads directory. Only put the global variables (their declaration) in this file. Make sure you actually instantiate the global structures (like memory manager) in the system.cc file in the initialize function under the correct #ifdef's (for userprog in this case).

  11. Question: Does the Exit system call take a parameter? What is that parameter?

    Answer: The Exit system call should take an integer parameter. This parameter is the exit value of the process. This is the same as the exit values in Unix (0 -> good, 1 -> bad). For our purposes you can use any integer here. It only matters that if another process was "Join"ing on your process, the Join call would return the value that your process exited with (as per the project spec). This means that you need to save your exit value somehow in the Exit system call so that Join can return it if necessary (even if you've already exited when someone calls Join on your process).
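
    One possible way to keep the exit value around is sketched below. It is a standalone illustration, not the required design: ExitRegistry and its methods are hypothetical names, std::mutex and std::condition_variable stand in for the Nachos lock and condition classes, and returning -1 for an unknown pid is an assumption.

    ```cpp
    #include <cassert>
    #include <condition_variable>
    #include <map>
    #include <mutex>

    class ExitRegistry {
    public:
        // Called when a process is created.
        void Register(int pid) {
            std::lock_guard<std::mutex> guard(lock_);
            records_[pid] = Record();
        }

        // Called from the Exit system call: save the status, wake any joiners.
        void Exit(int pid, int status) {
            std::lock_guard<std::mutex> guard(lock_);
            Record& r = records_[pid];
            assert(!r.exited);  // a process exits at most once
            r.exited = true;
            r.status = status;
            exited_.notify_all();
        }

        // Called from the Join system call: blocks until pid has exited and
        // returns its status. The record outlives the process, so a Join
        // issued after the Exit still sees the value.
        int Join(int pid) {
            std::unique_lock<std::mutex> guard(lock_);
            auto it = records_.find(pid);
            if (it == records_.end()) return -1;  // assumption: unknown pid
            exited_.wait(guard, [&] { return it->second.exited; });
            return it->second.status;
        }

    private:
        struct Record {
            bool exited = false;
            int status = 0;
        };
        std::mutex lock_;
        std::condition_variable exited_;
        std::map<int, Record> records_;
    };
    ```

    In this sketch records are never deleted; a real kernel would decide when an exited process's record can finally be reclaimed.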

  12. Question: What should we do when we can't allocate enough physical pages in Addrspace constructor for a new thread?

    Answer: You should let Fork/Exec know that you don't have enough memory so that it can let the user know (return a pid of -1) that it didn't Fork/Exec a new process. Make sure that you check whether there are enough free pages in physical memory for your new process before trying to allocate the physical pages for each logical page. If you do not do this, one of your allocations will return -1 and you will have to deallocate all the pages you just allocated before you can let Fork/Exec know of the error.
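
    The check-before-allocate approach can be sketched as follows. The names PagePool, AllocateAll and SpawnPid are hypothetical, and synchronization is left out to keep the example short (in the kernel the check and the allocation would have to happen under one lock).

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Free-page pool: used[i] is true when physical page i is taken.
    struct PagePool {
        std::vector<bool> used;
        explicit PagePool(int numPages) : used(numPages, false) {}

        int NumFree() const {
            int n = 0;
            for (bool u : used) n += u ? 0 : 1;
            return n;
        }
    };

    // Reserves numPages physical pages or none at all. Because the free
    // count is checked first, there is never anything to roll back.
    bool AllocateAll(PagePool& pool, int numPages, std::vector<int>* frames) {
        assert(frames != nullptr);
        frames->clear();
        if (numPages < 0 || pool.NumFree() < numPages) return false;
        for (std::size_t i = 0;
             i < pool.used.size() && static_cast<int>(frames->size()) < numPages;
             ++i) {
            if (!pool.used[i]) {
                pool.used[i] = true;
                frames->push_back(static_cast<int>(i));
            }
        }
        return true;
    }

    // Caller's view: -1 tells the user program that Fork/Exec failed.
    int SpawnPid(PagePool& pool, int numPages, int nextPid) {
        std::vector<int> frames;
        return AllocateAll(pool, numPages, &frames) ? nextPid : -1;
    }
    ```

    A failed request leaves the pool exactly as it was, so later, smaller requests can still succeed.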

  13. Question: Do Exec and Fork call thread->Fork?

    Answer: Yes, they do. After you have set all other information up (as in the project spec), you need to thread->Fork a dummy function (that's in exception.cc). In this dummy function you will want to initialize and restore the registers (for Fork and Exec) and then set up the PC registers and return register address (for Exec only). After this you will call machine->Run() from the dummy function (for both Fork and Exec). Exec may not create a new thread if you choose to implement it by replacing the current state of the thread instead of deleting it and creating a new one with the same pid.

Part 2 (Create, Open, Close, Read, Write, and console - stdin & stdout):

  1. You will want a few new data structures in this part of the project. You will need a structure that represents a process' open file, a structure that represents the list of all of a process' open files, a structure for an open file in the system, and a structure that represents all the open files in the system. Here is a brief explanation of each of these structures/classes:

  2. A system's open file: Needs to contain information such as the filename, an OpenFile object, and the count of the number of processes that have this file open. The OpenFile object is what will be used to actually read and write to this file. You can look at this in filesys/openfile.h where the stubs are contained. The count will be used for determining if a file really needs to be closed or if we should leave it open because other processes are still using it. It will also determine if a file needs to be opened or if it is already open.

  3. The system's open file list: This will contain the array of the system's open files and allow access to those objects. It will decide if a file already exists or if it really needs to be closed or not. You can use the bitmap (that you used in part one of the project) for managing the array of open files.

  4. A process' open file: Needs to contain information such as the file name, the offset into this file for this particular process, and the index into the system wide file table. The offset is used when you are reading or writing into the file. This is where you start reading from again if you do another read (or writing if you did a write). The index into the system's file table gives us access to the actual file of the system (that is synchronized from all different processes that access it). This mapping of a process' open file to the system file table allows each process to have its own offset into the file (different from other processes) and makes it appear to the user that he actually has this file open all to himself (when really there is only one open copy of this file in the system's open file table).

  5. The list of a process' open files: This will be an attribute of each PCB. It will have information such as an array of this process' open files, which include the "always present" stdin and stdout. You can make the maximum number of open files for a process anything greater than 20. Once again, you can use the bitmap for this too.
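
The relationship between the four structures described above can be sketched as below. This is a simplified illustration under stated assumptions: all class and member names are hypothetical, file ids 0 and 1 are assumed to be reserved for stdin and stdout, the limit of 32 open files is an arbitrary value greater than 20, and the actual file object and locking are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr int kFirstUserFd = 2;  // assumption: 0 and 1 are stdin and stdout

// One entry in the system wide table, shared by all processes.
struct SysOpenFile {
    std::string name;
    int refCount = 0;  // number of processes that have this file open
};

class SysFileTable {
public:
    // Returns the index for `name`, reusing the entry if it is already open.
    int Open(const std::string& name) {
        int freeSlot = -1;
        for (std::size_t i = 0; i < files_.size(); ++i) {
            if (files_[i].refCount > 0 && files_[i].name == name) {
                ++files_[i].refCount;
                return static_cast<int>(i);
            }
            if (files_[i].refCount == 0 && freeSlot < 0) freeSlot = static_cast<int>(i);
        }
        if (freeSlot < 0) {
            files_.push_back(SysOpenFile());
            freeSlot = static_cast<int>(files_.size()) - 1;
        }
        files_[freeSlot].name = name;
        files_[freeSlot].refCount = 1;
        return freeSlot;
    }

    // Drops one reference and releases the entry when nobody uses it any more.
    void Close(int index) {
        assert(index >= 0 && index < static_cast<int>(files_.size()));
        if (--files_[index].refCount == 0) files_[index].name.clear();
    }

    int RefCount(int index) const { return files_[index].refCount; }

private:
    std::vector<SysOpenFile> files_;
};

// One entry in a process' own table: a private offset plus a shared index.
struct ProcOpenFile {
    bool inUse = false;
    int sysIndex = -1;
    int offset = 0;
};

class ProcFileTable {
public:
    explicit ProcFileTable(SysFileTable* sys, int maxFiles = 32)
        : sys_(sys), files_(maxFiles) {}

    // Returns a new file id, or -1 if the process has too many open files.
    int Open(const std::string& name) {
        for (int fd = kFirstUserFd; fd < static_cast<int>(files_.size()); ++fd) {
            if (!files_[fd].inUse) {
                files_[fd].inUse = true;
                files_[fd].sysIndex = sys_->Open(name);
                files_[fd].offset = 0;
                return fd;
            }
        }
        return -1;
    }

    bool Close(int fd) {
        if (fd < kFirstUserFd || fd >= static_cast<int>(files_.size()) || !files_[fd].inUse)
            return false;
        sys_->Close(files_[fd].sysIndex);
        files_[fd] = ProcOpenFile();
        return true;
    }

    int& Offset(int fd) { return files_[fd].offset; }
    int SysIndex(int fd) const { return files_[fd].sysIndex; }

private:
    SysFileTable* sys_;
    std::vector<ProcOpenFile> files_;
};
```

Two processes that open the same file share one system entry but keep independent offsets, which is the property the description above is after.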

Here is an overview of each of the calls and how they use the above structures:

  1. Create: You will need to translate the file name passed in just as you did for the Exec system call. After doing this you can use fileSystem's Create function to create the file. This can be found in filesys/filesys.cc. You can create the file with an initial size of 0.

  2. Open: You will need to translate the file name passed in just as you did in the Exec system call. Next you will want to add a new process open file to your list of your process' open files. This will in turn check to see if the system has this file open yet. If it does, the system file table increments the counter for that file. If it does not, the system will add a new system open file which will open the file using fileSystem's Open function. This can be found in filesys/filesys.cc.

  3. Close: Using the file id passed in, you will remove the open file from your process' list of open files, and it in turn will let the system file table know to remove the file if necessary (i.e., if the count > 0 then count--, otherwise remove the file from the list and close it). To close the file you can just clear out the reference in your system open file table to the open file. There is no actual close function that needs to be called. As far as the user is concerned, you have closed the file.

  4. Read: The main thing you need to do is translate the logical buffer address that the user passed in (what you are to read into) page by page and read into the translated address in memory each time you translate a page. To read each page you will call a read function in your structure for the list of your process' open files. This will get the correct open file and handle the moving of the offset in the file. The system open file table is called from the process open file and asked to read. This is a synchronized read (since many processes could be calling this system open file's read). You need to use locks and call OpenFile's ReadAt function (this is located in filesys/openfile.h). Remember to keep track of the actual number of bytes read (ReadAt returns this) and let the user know how many were actually read.

  5. Write: You will also need to translate the logical buffer provided by the user argument page by page. You will write each page separately as you translate them. Just as Read does, Write will keep track of the offset in the process' open file and will ask the system open file table to write to the desired file. This will occur like Read except you will be calling OpenFile's WriteAt function.
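
     The page-by-page handling that Read and Write share can be sketched as below, shown here for Read. It is an illustration under stated assumptions: FakeFile is a hypothetical in-memory stand-in whose ReadAt mimics the (buffer, numBytes, position) shape of OpenFile's ReadAt, UserRead is a hypothetical name, the page table is a plain vector of physical page numbers, and locking is omitted.

     ```cpp
     #include <algorithm>
     #include <cassert>
     #include <cstring>
     #include <string>
     #include <vector>

     constexpr int kPageSize = 128;  // assumption: 4096 bytes / 32 pages

     // In-memory stand-in for an open file.
     struct FakeFile {
         std::string data;

         // Copies up to numBytes starting at position; returns bytes copied.
         int ReadAt(char* into, int numBytes, int position) const {
             int size = static_cast<int>(data.size());
             if (position >= size) return 0;
             int n = std::min(numBytes, size - position);
             std::memcpy(into, data.data() + position, n);
             return n;
         }
     };

     // Reads up to `size` bytes into the user buffer at virtual address
     // userBuf, one page-sized chunk at a time, advancing the per-process
     // offset. Returns the number of bytes actually read.
     int UserRead(const FakeFile& file, int* offset,
                  const std::vector<int>& pageTable, char* mainMemory,
                  int userBuf, int size) {
         assert(offset != nullptr && size >= 0);
         int total = 0;
         while (total < size) {
             int va = userBuf + total;
             int chunk = std::min(kPageSize - va % kPageSize, size - total);
             int paddr = pageTable[va / kPageSize] * kPageSize + va % kPageSize;
             int got = file.ReadAt(mainMemory + paddr, chunk, *offset);
             *offset += got;
             total += got;
             if (got < chunk) break;  // short read: end of file reached
         }
         return total;
     }
     ```

     Write would follow the same loop with the copy direction reversed and a WriteAt-style call in place of ReadAt.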

  6. Console Comments: To write to the console a user does not need to open or create any new files (this should be done in the constructor of the process' open file list). The user can simply call Read with "stdin" or Write with "stdout" as the file name. Writing to stdout should just print to the terminal (use printf for this). Reading should just wait for the user to type something in and then press enter (use scanf to do this). Remember you cannot read from stdout and you cannot write to stdin.

  7. Alterations to part 1 code from part 2: You will now need to close all your process' files when you exit. You will also need to add this file list to your PCB class for a process' open files.