Project 3 draft: Virtual memory management

Deadline (part 1): August 22, 2003 23:59 PST

Overview

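The goal of this project is to implement demand-paged virtual memory management on top of the basic multiprogramming kernel developed in project 2. In project 2, all the pages for a process were loaded into physical memory. This imposed two restrictions: a process could not be larger than physical memory, and the combined memory footprint of all active processes could not exceed physical memory.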

Demand-paged virtual memory removes both these restrictions. It allows a process to execute with only a fraction of its address space resident in physical memory. It also allows the total memory footprint of active processes to be larger than the physical memory. It achieves this by keeping non-resident pages on a disk-based backing store and bringing a page into physical memory only when it is referenced.

Virtual memory works in the following way. All addresses generated by a user process are virtual addresses. Address translation hardware checks every such address (read/write, instruction/data) as it tries to map it to the corresponding physical address. If the page that contains the referenced virtual address is currently resident in physical memory, the translation proceeds as before. If, on the other hand, the page is not in physical memory, a PageFaultException is generated. This causes the processor to leave user mode and switch to kernel mode. The kernel responds to a PageFaultException by running a routine (called the Page Fault Handler) whose purpose is to bring the referenced page into physical memory (for this, it may have to move some other page to the disk-based backing store). The operations that the Page Fault Handler needs to perform are listed in the Page Fault Handling section under Stage 6.

Nachos already contains code to do virtual-to-physical translation and to raise a PageFaultException. For this project, you will build the rest of the virtual memory system in the stages described below.

Suggested project stages

For this project, you will primarily be working with files in the userprog and machine directories. We suggest the following steps and urge you to keep an eye at all times on the data structures you have chosen to store and access information. The hope is that after each stage your design will become more concrete and you will have a better perspective on what you need to do and how you would like to do it.

Stage 1:
Make sure that you understand the TranslationEntry class as given in machine/translate.h. Ask yourself why each field of a page-table entry has been included. See the method Machine::Translate in machine/translate.cc to understand how these fields are used.
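
As a reading aid, the following is a minimal, self-contained sketch of the per-page checks that a translation step performs. The struct mirrors the fields of TranslationEntry; the names translateSketch, Result and kPageSize are assumptions made for this illustration only, and the real Machine::Translate does more (for example alignment checks and TLB handling), so treat this as a simplified model rather than the Nachos code.

```cpp
#include <cassert>

// Mirrors the fields of TranslationEntry in machine/translate.h.
struct TranslationEntry {
    int virtualPage;   // page number in the process's virtual address space
    int physicalPage;  // frame number in physical memory (meaningful only if valid)
    bool valid;        // false: page is not resident, any access must fault
    bool readOnly;     // true: writes to the page are not allowed
    bool use;          // set by the hardware whenever the page is referenced
    bool dirty;        // set by the hardware whenever the page is written
};

// Hypothetical names for this sketch; the page size is assumed to be 128 bytes.
const int kPageSize = 128;
enum Result { Ok, PageFault, ReadOnlyFault, AddressError };

Result translateSketch(TranslationEntry *table, int numPages,
                       int virtAddr, bool writing, int *physAddr) {
    if (virtAddr < 0) return AddressError;
    int vpn = virtAddr / kPageSize;       // virtual page number
    int offset = virtAddr % kPageSize;    // offset within the page
    if (vpn >= numPages) return AddressError;

    TranslationEntry *e = &table[vpn];
    if (!e->valid) return PageFault;                   // page not in memory
    if (writing && e->readOnly) return ReadOnlyFault;  // write to read-only page

    e->use = true;                        // record the reference
    if (writing) e->dirty = true;         // record the modification
    *physAddr = e->physicalPage * kPageSize + offset;
    return Ok;
}
```

Note how valid is the only field that distinguishes a resident page from a non-resident one, which is why later stages need extra bookkeeping to say where a non-resident page actually lives.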

Stage 2:
Trace how the MIPS simulator (in machine/mipssim.cc) executes instructions. As a part of executing an instruction, the machine accesses main memory for loading the instruction itself (code segment) and possibly for its operands (data/stack segment). Figure out how memory access is implemented and where the virtual to physical translation is performed. It is during the translation process that the machine can determine if the virtual address it is trying to access belongs to a page which does not reside in physical memory. Figure out how, when and where the PageFaultException is thrown.

Stage 3:
In stage 2, you would have figured out how, when and where the PageFaultException is raised. This Exception results in control being passed back to the kernel and is supposed to be handled in a manner similar to system calls. Currently, this exception is not handled. Add code to exception.cc to call a stub routine when this Exception is raised. This will be your page fault handler. As a part of raising the Exception, the processor saves the faulting virtual address in a register that will be used by the kernel to handle the Exception. Note that, at this point, this mechanism is not exercised by program execution as processes are loaded in their entirety and no page faults are generated.
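
The dispatch described in this stage might look roughly like the sketch below. To keep it compilable on its own, FakeMachine, lastFaultAddr, PageFaultHandlerStub and ExceptionHandlerSketch are hypothetical stand-ins, and the enum and register constants are placeholders; in your kernel you would use the definitions from machine/machine.h and extend the existing handler in exception.cc instead.

```cpp
#include <cassert>
#include <cstdio>

// Placeholders so the sketch compiles alone; use the real definitions from
// machine/machine.h in your kernel.
enum ExceptionType { NoException, SyscallException, PageFaultException,
                     ReadOnlyException };
const int BadVAddrReg = 39;    // register that holds the faulting virtual address
const int NumTotalRegs = 40;

struct FakeMachine {           // hypothetical stand-in for the Machine class
    int registers[NumTotalRegs];
    int ReadRegister(int num) const { return registers[num]; }
};

int lastFaultAddr = -1;        // hypothetical: makes the stub's effect observable

// Stub page fault handler: only records and reports the fault.
void PageFaultHandlerStub(int badVAddr) {
    lastFaultAddr = badVAddr;
    std::printf("page fault at virtual address %d\n", badVAddr);
}

void ExceptionHandlerSketch(FakeMachine *machine, ExceptionType which) {
    switch (which) {
    case PageFaultException:
        // The faulting address was saved by the processor when it raised
        // the exception.
        PageFaultHandlerStub(machine->ReadRegister(BadVAddrReg));
        break;
    default:
        // System calls and other exceptions are handled as in project 2.
        break;
    }
}
```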

Stage 4:
Figure out how to start a process with none of its pages in memory. For this, you will need to change the code that you wrote in Project 2 for process creation. You may also need to modify the pagetable structure (in the TranslationEntry class) to keep track of pages that are not in memory. Note that you are free to add fields to this structure (as long as you don't disturb the existing fields). You will need to keep track of the location from which disk-resident pages are to be loaded. Remember that, initially, the pages of a process are all in the executable file. Once a page has been brought into memory, any subsequent flush of this page to disk (during page replacement) should be to the backing store (this storage will be created in Stage 6) and not to the executable file. You will also need to allocate space in the backing store for the pages of this process. You can choose to be conservative and allocate space for the entire virtual address space of the process on the backing store at creation time. You can be even more conservative and choose to copy the entire executable file into the allocated space at startup. If you did this, you would only need to concern yourself with moving pages between the backing store and memory during page fault handling. This is only a suggestion. Alternate implementations are more than welcome - you can get up to 10 extra points.
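
One possible shape for the extra per-page bookkeeping is sketched below. PageInfo and its helper functions are hypothetical names, and the sketch assumes, as a simplification, that the executable stores the pages contiguously starting at execStart; a real executable has separate segments, and some pages (uninitialized data, stack) have no file contents at all.

```cpp
#include <cassert>

// Hypothetical per-virtual-page record kept alongside the page table.
struct PageInfo {
    bool resident;    // true while the page occupies a physical frame
    int frame;        // physical frame number, -1 when not resident
    bool inSwap;      // true once the page has been written to the backing store
    int swapSector;   // sector in the swap file, -1 until one is assigned
    int execOffset;   // byte offset of the page's initial contents in the executable
};

// At process creation no page is resident; every page still lives in the
// executable file.
void initPageInfo(PageInfo *pages, int numPages, int execStart, int pageSize) {
    for (int i = 0; i < numPages; i++) {
        pages[i].resident = false;
        pages[i].frame = -1;
        pages[i].inSwap = false;
        pages[i].swapSector = -1;
        pages[i].execOffset = execStart + i * pageSize;
    }
}

// Called after the page fault handler has read the page into a frame.
void onLoaded(PageInfo *p, int frame) {
    assert(!p->resident);
    p->resident = true;
    p->frame = frame;
}

// Called when the page is chosen as a victim. A dirty page is flushed to the
// backing store, never back to the executable file.
void onEvicted(PageInfo *p, bool wasDirty, int swapSector) {
    assert(p->resident);
    if (wasDirty) {
        p->inSwap = true;
        p->swapSector = swapSector;
    }
    p->resident = false;
    p->frame = -1;
}
```

With the conservative approach of copying the whole executable into the backing store at startup, the execOffset field and the inSwap flag become unnecessary because every page always has a swap sector.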

Stage 5:
At this point, you are all set to receive a page fault. We suggest that you make your dummy page fault handler (set up in stage 3) simply print some debugging information and return. Using this scheme, you should make sure that control actually flows to your page fault handler during program execution. You don't service the page fault at this stage. Therefore, you will run into an infinite loop where the machine keeps raising page faults.

Stage 6:
Now you are all set to implement a page replacement algorithm.

Page Fault Handling

The page fault handler gets control as a result of the machine raising a page fault exception. The handling of this exception involves the following tasks.
  • If there is no unused frame in memory, scan physical memory to select a victim page. Use the second-chance algorithm, which is described in Section 10.4.5.2 (p. 341) of the textbook.
  • If necessary, allocate space on the backing store to receive the contents of the victim page (assuming that it is dirty and needs flushing).
  • Initiate I/O to write the contents of the victim page to the backing store.
  • Adjust the pagetable for the process to which the victim page belongs to reflect the fact that it is no longer resident in memory.
  • Locate the page for which the fault was generated on the backing store; initiate I/O to load the page into the page frame selected in the previous steps.
  • Adjust the pagetable for the faulting process to reflect the fact that the desired page is now resident in memory.
  • Return to user mode and restart the instruction that caused the fault. Make sure to re-execute the instruction for which the page-fault was generated in the first place.
  • Use a single file called SWAP with 8192 sectors (1 MB) to implement the backing store. The size of a swap sector is the same as that of a physical page frame. You should use the stub implementation of the file system already provided with Nachos (look into filesys/filesys.h and filesys/openfile.h) or your own stubs from project 2. You will need a mechanism to keep track of the used and free sectors in the swap file (similar to the mechanism that keeps track of the allocation of the physical page frames in the previous assignment).
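
The victim selection in the first step can be sketched as a clock-style second-chance scan. This is one common way to realize the algorithm, not necessarily the exact formulation in the textbook; Frame and ClockSelector are hypothetical names, and the referenced flag is assumed to be kept in sync with the use bit of the page table entry that maps the frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame-table record.
struct Frame {
    bool inUse;        // frame currently holds some page
    bool referenced;   // copy of the use bit of the page occupying the frame
};

struct ClockSelector {
    std::vector<Frame> frames;
    std::size_t hand;  // next frame the clock hand will examine

    explicit ClockSelector(int numFrames)
        : frames(numFrames, Frame{false, false}), hand(0) {
        assert(numFrames > 0);
    }

    // Returns a free frame if one exists, otherwise a victim frame.
    int pickFrame() {
        for (std::size_t i = 0; i < frames.size(); i++)
            if (!frames[i].inUse) return static_cast<int>(i);

        // Every frame is occupied: sweep with the clock hand. A referenced
        // page gets a second chance (its bit is cleared and it is skipped);
        // the first unreferenced page becomes the victim. After at most one
        // full sweep all bits are clear, so the loop always terminates.
        for (;;) {
            std::size_t candidate = hand;
            hand = (hand + 1) % frames.size();
            if (frames[candidate].referenced)
                frames[candidate].referenced = false;
            else
                return static_cast<int>(candidate);
        }
    }
};
```

The caller remains responsible for the remaining steps: flushing the victim if it is dirty, updating both page tables, and marking the frame as in use by the new page.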

    Recommended Data Structures

    The page fault handler requires some auxiliary data structures to accomplish its task. The following data structures may be useful.
  • A boolean flag per virtual page indicating whether a given virtual page of a process is resident in physical memory or on backing store.
  • The mapping between a virtual page of a process and its location (whether in physical memory or on backing store).
  • A map of the backing store (swap file) to keep track of space allocation/de-allocation.
  • A table of the address space information of all active processes in the system possibly indexed by the process_ids.
  • Other designs are possible; the above data structures are just to give you an idea of how it can be done.
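
As an illustration of the backing store map, here is a minimal sketch of a sector allocator. SwapMap is a hypothetical name and the linear search is chosen for clarity; the structure you used to track physical frames in the previous assignment would work equally well.

```cpp
#include <cassert>
#include <vector>

// Hypothetical allocation map for the sectors of the swap file.
class SwapMap {
public:
    explicit SwapMap(int numSectors) : used(numSectors, false) {}

    // Returns the index of a free sector and marks it used, or -1 if the
    // swap file is full.
    int allocate() {
        for (std::size_t i = 0; i < used.size(); i++) {
            if (!used[i]) {
                used[i] = true;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Marks a previously allocated sector as free again, for example when
    // the owning process exits.
    void release(int sector) {
        assert(sector >= 0 && sector < static_cast<int>(used.size()));
        assert(used[sector]);
        used[sector] = false;
    }

    // Byte position of a sector inside the swap file.
    static int byteOffset(int sector, int sectorSize) {
        return sector * sectorSize;
    }

private:
    std::vector<bool> used;   // used[i] is true while sector i holds a page
};
```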

    Once you get this working, you should be able to execute programs normally. Launch multiple processes in Nachos simultaneously (using exec, fork) and test your code under various conditions of system load. Include a test case using one process with an address space larger than physical memory and a test case using several concurrently running processes with combined address spaces larger than physical memory. The sort program in the test directory is an example of a program designed to stress the virtual memory system.


    Stage 7:
    In this stage, you will extend your memory structures and page fault handler to support copy-on-write. This means that when you copy a page (as in a fork system call), you do not actually copy the page. Instead, you:

    1. set the new page table entry to use the same physical page.
    2. mark the physical page as shared by the new page table entry. This probably requires that you maintain a structure to determine whether a physical page is currently shared (together with information that tells you which processes share that page).
    3. set the read only bit in the new page table entry, so that the machine will generate a fault (ReadOnlyException) on attempts to write to this page. If the page wasn't already being shared, also set the read only bit on the page table entry of the other process.
    When handling a fault caused by trying to write to this read-only page, you must:
    1. verify that this is a valid page (as for any other page fault) and that it is shared
    2. allocate a new physical page
    3. copy the contents of the old physical page to the new physical page
    4. update the faulting page table entry to point to the new physical page instead of the old one.
    5. unmark the old physical page as shared by the faulting page table entry
    You must:
    1. Implement copy-on-write in your virtual memory system.
    2. Update your Fork system call to use copy-on-write for pages instead of simply copying them.
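
The sharing and write-fault steps of this stage can be sketched as follows. This is a simplified model under stated assumptions: Pte, PhysMem, sharePage and handleCowFault are hypothetical names, sharing is tracked with a per-frame reference count rather than a list of sharing processes, and pages that are genuinely read-only (such as code) are not distinguished from copy-on-write pages.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

const int kPageSize = 128;   // assumed page size for this sketch

struct Pte {                 // reduced page table entry
    int physicalPage;
    bool valid;
    bool readOnly;
};

struct PhysMem {
    std::vector<unsigned char> bytes;  // simulated physical memory
    std::vector<int> refCount;         // number of entries mapping each frame

    explicit PhysMem(int frames)
        : bytes(frames * kPageSize, 0), refCount(frames, 0) {}

    // Returns an unused frame, or -1 if none is free.
    int allocFrame() {
        for (std::size_t i = 0; i < refCount.size(); i++)
            if (refCount[i] == 0) { refCount[i] = 1; return static_cast<int>(i); }
        return -1;
    }
};

// Fork: the child maps the same frame, and both entries become read-only.
void sharePage(PhysMem &mem, Pte &parent, Pte &child) {
    child = parent;
    mem.refCount[parent.physicalPage]++;
    parent.readOnly = true;
    child.readOnly = true;
}

// Write fault on a shared page. Returns false if no frame is free; a real
// handler would evict a victim at that point instead of giving up.
bool handleCowFault(PhysMem &mem, Pte &faulting) {
    assert(faulting.valid && faulting.readOnly);
    int oldFrame = faulting.physicalPage;
    if (mem.refCount[oldFrame] == 1) {   // last sharer: no copy is needed
        faulting.readOnly = false;
        return true;
    }
    int newFrame = mem.allocFrame();
    if (newFrame < 0) return false;
    std::memcpy(&mem.bytes[newFrame * kPageSize],
                &mem.bytes[oldFrame * kPageSize], kPageSize);
    mem.refCount[oldFrame]--;            // faulting entry no longer shares it
    faulting.physicalPage = newFrame;
    faulting.readOnly = false;
    return true;
}
```

The reference count is enough to decide whether a copy is needed, but if a shared frame can be evicted you will also need to find every page table entry that maps it, which is where the list of sharing processes mentioned above becomes useful.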

    Stage 8:
    In this stage, you will implement demand zero-filled pages. This is a technique where pages are not allocated, in either main memory or in backing store, until they are accessed. This is useful for pages of uninitialized data, such as the heap or stack.

    When the page is accessed for the first time, the page fault is handled by allocating a new physical page, which is then cleared (filled with zeros). The faulting page table entry is then updated to point to the new physical page. Up until now, a page was always either in memory or in the backing store. To support demand zero-filled pages, you will need to extend your TranslationEntry to represent this new page state.

    You are to:

    1. Implement demand zero-filled pages.
    2. Update Fork and Exec system calls to allocate the process's stack as demand zero-filled pages.
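
A minimal sketch of the extra page state and the zero-fill step follows. PageState, PageEntry and serviceZeroFill are hypothetical names, the page size is an assumed constant, and choosing the frame (including any eviction) is left to the caller.

```cpp
#include <cassert>
#include <cstring>

const int kPageSize = 128;   // assumed page size for this sketch

// Hypothetical page states; ZeroFillPending is the state added in this stage.
enum PageState { Resident, OnBackingStore, ZeroFillPending };

struct PageEntry {
    PageState state;
    int physicalPage;        // meaningful only when state == Resident
};

// Services the first access to a demand zero-filled page. The frame has
// already been chosen by the caller.
void serviceZeroFill(PageEntry &e, unsigned char *physMem, int frame) {
    assert(e.state == ZeroFillPending);
    std::memset(physMem + frame * kPageSize, 0, kPageSize);  // clear the frame
    e.physicalPage = frame;
    e.state = Resident;      // from now on it behaves like any resident page
}
```

Once such a page has been written and is later evicted, it needs a swap sector like any other dirty page, so the state only ever moves away from ZeroFillPending, never back to it.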

    What to submit

    1. Go to the directory that contains the top level code directory of your nachos.
    2. Turn in your code directory by typing turnin project3@cs170 code.
    3. In addition to the code, include a file called reports/project3.txt that briefly explains the design of your code and how it works. If some parts of your code do not run, you need to say so outright in the report and describe your design, implementation and difficulties. This is needed for partial credit.

    You can run turnin multiple times per project. Earlier versions will be discarded. The timestamp of the turnin has to be before midnight of the due date.

    Required Output

    For the following outputs, [pid] is the id of the process on whose behalf the operation is performed, [virtualPage] is the virtual page number involved (i.e., the page index into the process virtual address space) and [physicalPage] is the physical page number involved (i.e., the page index into the physical memory of the Nachos virtual machine).

    1. Whenever a page is loaded into physical memory, print
      L [pid]: [virtualPage] -> [physicalPage]
    2. Whenever a page is evicted from physical memory and written to the swap area, print
      S [pid]: [physicalPage]
    3. Whenever a page is evicted from physical memory and not written to the swap area, print
      E [pid]: [physicalPage]
    4. Whenever a process writes to a shared page (and this page needs to be duplicated), print
      D [pid]: [virtualPage]
    5. Whenever a process obtains a zero-filled demand page for the first time (i.e., when you allocate and zero the page out), print
      Z [pid]: [virtualPage]
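
To keep these lines consistent, it may help to route all of them through a pair of small formatting helpers, as in the sketch below. The function names formatLoad and formatEvent are hypothetical; only the output formats come from the list above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds the "L" line: a page was loaded into physical memory.
std::string formatLoad(int pid, int virtualPage, int physicalPage) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "L %d: %d -> %d",
                  pid, virtualPage, physicalPage);
    return buf;
}

// Builds the single-page lines. tag is 'S' or 'E' (page is a physical page
// number) or 'D' or 'Z' (page is a virtual page number).
std::string formatEvent(char tag, int pid, int page) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%c %d: %d", tag, pid, page);
    return buf;
}
```

For example, std::puts(formatLoad(pid, vpn, ppn).c_str()) prints one complete load line followed by a newline.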

    Credit