8.4 Code: Scheduling

The last section looked at the internals of swtch; now let’s take swtch as a given and examine switching from one process’s kernel thread through the scheduler to another process. The scheduler takes the form of a special thread per CPU, each running the scheduler function. This function is in charge of choosing which process to run next. Each CPU has its own scheduler thread because more than one CPU may be looking for something to run at any given time. Process switching always goes through the scheduler thread, rather than directly from one process to another, to avoid situations in which there would be no stack on which to execute the scheduler (e.g. if the old process has exited, or no other process currently wants to run).

A process that wants to give up the CPU must acquire its own process lock p->lock, release any other locks it is holding, update its own state (p->state), and then call sched. You can see this sequence in yield (2629), sleep and kexit. sched calls swtch to save the current context in p->context and switch to the scheduler context in cpu->context. swtch returns on the scheduler’s stack as though scheduler’s swtch had returned (2582).

scheduler (2558) runs a loop: find a process to run, swtch() to it, and eventually the process swtch()es back to the scheduler, which continues the loop. The scheduler loops over the process table looking for a runnable process, one with p->state == RUNNABLE. Once it finds a process, it sets the per-CPU current process variable c->proc, marks the process as RUNNING, and then calls swtch to start running it (2577-2582). At some point in the past, the target process must have called swtch(); the scheduler’s call to swtch() effectively returns from that earlier call. Figure 8.2 illustrates this pattern.

    Process 1                    Scheduler                     Process 2

    acquire(&p->lock);
    ...
    p->state = RUNNABLE;
    swtch(&p->context, ...);
                                 swtch(...); // returns
                                 release(&p->lock);
                                 // find a RUNNABLE p
                                 acquire(&p->lock);
                                 p->state = RUNNING;
                                 swtch(..., &p->context);
                                                               swtch(&p->context, ...); // returns
                                                               release(&p->lock);
Figure 8.2: swtch() always has the scheduler thread as either source or destination, and the relevant p->lock is always held.

xv6 holds p->lock across calls to swtch: the caller of swtch acquires the lock, and it is released in the target thread after swtch returns. This arrangement is unusual: it is more common for the thread that acquires a lock to also release it. Xv6’s context switching must break this convention because p->state and p->context must be updated together, atomically. For example, if p->lock were released before invoking swtch, a different CPU (call it cc) might decide to run the process as soon as it saw p->state == RUNNABLE. CPU cc would then invoke swtch, which would restore registers from p->context while the original CPU was still saving into p->context. The result would be a process restored with partially-saved registers on CPU cc, and two CPUs using the same stack, which would cause chaos. Once yield has started to modify a running process’s state to make it RUNNABLE, p->lock must remain held until the process has saved all its registers and the scheduler is running on its own stack; the earliest correct release point is after scheduler (running on its own stack) clears c->proc. Similarly, once scheduler starts to convert a RUNNABLE process to RUNNING, the lock cannot be released until the process’s kernel thread is completely running (after the swtch, for example in yield).

There is one case in which the scheduler’s call to swtch does not end up in sched. allocproc sets the context ra register of a new process to forkret (2653), so that the first swtch into the process “returns” to the start of that function. forkret exists to release p->lock and to set up control registers and trapframe fields that are required in order to return to user space. At the end, forkret simulates the normal return path from a system call back to user space.