[BACK]Return to kernel.html CVS log [TXT][DIR] Up to [local] / prex-old / doc / html / doc

File: [local] / prex-old / doc / html / doc / kernel.html (download) (as text)

Revision 1.1, Tue Jun 3 09:38:43 2008 UTC (15 years, 11 months ago) by nbrk
Branch point for: MAIN

Initial revision

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <title>Prex Kernel Internals</title>
  <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
  <meta name="keywords" content="Prex, embedded, real-time, operating system, RTOS, open source, free">
  <meta name="author" content="Kohsuke Ohtani">
  <link rel="stylesheet" type="text/css" href="../default.css" media="screen">
  <link rel="stylesheet" type="text/css" href="../print.css" media="print">
</head>
<body>
<div id="top">
</div>
<div id="middle">

<table id="content" cellpadding="0" cellspacing="0">
  <tbody>

    <tr>
      <td id="header" colspan="2" valign="top">
        <table width="100%" border="0" cellspacing="0" cellpadding="0">
        <tr>
          <td id="logo">
            <a href="http://prex.sourceforge.net/">
            <img alt="Prex logo" src="../img/logo.gif" border="0"
            style="width: 281px; height: 56px;"></a>
          </td>
          <td id="brief" align="right" valign="bottom">
            An Open Source, Royalty-free,<br>
	    Real-time Operating System
          </td>
        </tr>
        </table>
      </td>
    </tr>

    <tr>
      <td id="directory" style="vertical-align: top;">
      <a href="http://prex.sourceforge.net/">Prex Home</a> >
      <a href="index.html">Document Index</a> >
      Kernel Internals
    </tr>
    <tr><td class="pad" colspan="2" style="vertical-align: top;"></td></tr>

    <tr>
      <td id="main" style="vertical-align: top;">
      <h1>Prex Kernel Internals</h1>

<i>Version 1.4, 2005/12/31</i>

<h3>Table of Contents</h3>
<ul>
  <li><a href="#over">Kernel Overview</a></li>
  <li><a href="#design">Design Policy</a></li>
  <li><a href="#thread">Thread</a></li>
  <li><a href="#task">Task</a></li>
  <li><a href="#sched">Scheduler</a></li>
  <li><a href="#memory">Memory Management</a></li>
  <li><a href="#ipc">IPC</a></li>
  <li><a href="#except">Exception Handling</a></li>
  <li><a href="#int">Interrupt Framework</a></li>
  <li><a href="#timer">Timer</a></li>
  <li><a href="#device">Device I/O Service</a></li>
  <li><a href="#mutex">Mutex</a></li>
  <li><a href="#debug">Debug</a></li>
</ul>
<br>

<h2 id="over">Kernel Overview</h2>

<h3>Kernel Structure</h3>
<p>
The following figure illustrates the Prex kernel structure.
</p>
<p>
<img alt="Kernel Components" src="img/kernel.gif" border="1"
style="width: 602px; height: 414px;"><br>

<i><b>Figure 1. Prex kernel Structure</b></i>
</p>
<p>
A kernel object belongs in one of the following groups.
</p>
<ul>
  <li><b>kern</b>: kernel core components</li>
  <li><b>mem</b>: memory managers</li>
  <li><b>ipc</b>: inter process communication (*)</li>
  <li><b>sync</b>: synchronize objects</li>
  <li><b>arch</b>: architecture dependent components</li>
</ul>
<p>
<i>*) Since all messages in Prex are transferred among threads, the name of
"IPC" is not appropriate.
However, "IPC" is still used as a general term of the message transfer
via the kernel, in Prex.</i>
</p>

<h3>Naming Convention</h3>
<p>
The name of "group/object" in figure 1 is mapped to "directory/file" in the
Prex source tree. For example, the thread related functions are located
in "kern/thread.c", and the functions for semaphore are placed
in "sync/sem.c".
</p>
<p>
In addition, there is a standard naming convention about kernel
routines. The method named <i>bar</i> for the object named <i>foo</i>
should be named "foo_bar". For example, the routine to create a new 
thread is named "thread_create", and locking mutex will be "mutex_lock".
This rule is not applied to the name of the local function.
</p>


<h2 id="design">Design Policy</h2>
<p>
The Prex kernel focuses the following points to be designed.
</p>
<ul>
  <li>Portability</li>
  <li>Scalability</li>
  <li>Reliability</li>
  <li>Interoperability</li>
  <li>Maintainability</li>
</ul>

<h3>Portability</h3>
<p>
The Prex kernel is divided into two different layers -
a common kernel layer and an architecture dependent layer.

Any routine in the common kernel layer must not access to the H/W by itself.
Instead, it must request it to the architecture dependent layer.
The interface to the architecture dependent layer is strictly defined
by the Prex kernel. This interface is designed carefully to support various
different architecture with minimum code change.
So, it is easy to port the Prex kernel to different architecture.
</p>
<p>
The following functions must be provided by the architecture dependent layer.
</p>
<ul>
  <li><b>CPU</b>: initializes processor registers before kernel boot</li>
  <li><b>Context</b>: abstracts processor and hardware context</li>
  <li><b>MMU</b>: abstracts memory management unit (*)</li>
  <li><b>Trap</b>: abstracts processor trap</li>
  <li><b>Interrupt</b>: abstracts the interrupt control unit</li>
  <li><b>Clock</b>: abstracts clock timer unit</li>
  <li><b>Misc.</b>: abstracts system reset, idle power state</li>
</ul>
<p>
<i>*) In case of no-MMU system, MMU related routines will be
defined as no-operation routine.
So, the kernel common layer can not assume MMU is always available.</i>
</p>

<h3>Scalability</h3>
<p>
In order to obtain higher scalability, the kernel does not limit the maximum
number of the kernel objects to create.
So, the resource for all kernel objects are allocated dynamically after
system boot.
This can keep the memory prerequisite smaller than the static
resource allocation.
This means that the kernel can create task, thread, object, device,
event, mutex, timer as many as usable memory remains.
</p>
<p>
The kernel supports both of MMU and No-MMU systems. So, most of the kernel
components and VM sub-system are designed carefully to work without MMU.
</p>

<h3>Reliability</h3>
<p>
When the remaining memory is exhausted, what should OS do?
If the system can stop with panic() there, the error checks of many
portions in the kernel are not necessary.
But obviously, this is not allowed with the reliable system.
Even if the memory is exhausted,
a kernel must continue processing.
So, all kernel code is checking the error status returned
by the memory allocation routine.
</p>
<p>
In addition, the kernel must not crush anytime even if any invalid parameter
is passed via kernel API. Basically, the Prex kernel code is written with
"garbage in, error out" principle.
The Prex kernel never stops even if any malicious
program is loaded.
</p>

<h3>Interoperability</h3>
<p>
Although the Prex kernel was written from scratch, its applications
will be brought from the other operating systems like BSD. So, the system 
call interface is designed with consideration to support general OS
API like POSIX or APIs for generic RTOS.
</p>
<p>
The error code for the Prex system call is defined as the same name 
with POSIX. For example, EINVAL for "Invalid argument", or ENOMEM for
"Out of memory". So, peoples do not have to study new error codes if
they already have skills about POSIX programming.
This is important point to write applications and to read
the kernel code because study of a new error scheme will cause pain
for developers.

In addition, it simplify the POSIX emulation library because it
does not have to remap the error code.
</p>

<h3>Maintainability</h3>
<p>
All kernel codes are kept clean and simple for the maintenance.
All codes are well-commented and consistent.
It is easy to add or remove new system call into the kernel.
The kernel has the debugging facility like the function trace or 
the dump of the kernel objects.
</p>


<h2 id="thread">Thread</h2>

<h3>Thread Control Block</h3>
<p>
The thread control block includes data for
owner task, scheduler, timer, IPC, exception, mutex, and context.
The following thread structure is most important definition
in the kernel codes.
</p>
<pre>
struct thread {
        int             magic;          /* magic number */
        task_t          task;           /* pointer to owner task */
        struct list     task_link;      /* link for threads in same task */
        struct queue    link;           /* linkage on scheduling queue */
        int             state;          /* thread state */
        int             policy;         /* scheduling policy */
        int             prio;           /* current priority */
        int             base_prio;      /* base priority */
        int             ticks_left;     /* remaining ticks to run */
        u_int           total_ticks;    /* total running ticks */
        int             need_resched;   /* true if rescheduling is needed */
        int             lock_count;     /* schedule lock counter */
        int             suspend_count;  /* suspend counter */
        struct event    *sleep_event;   /* sleep event */
        int             sleep_result;   /* sleep result */
        struct timer    timeout;        /* thread timer */
        struct timer    *periodic;      /* pointer to periodic timer */
        struct queue    ipc_link;       /* linkage on IPC queue */
        void            *msg_addr;      /* kernel address of IPC message */
        size_t          msg_size;       /* size of IPC message */
        struct thread   *sender;        /* thread that sends IPC message */
        struct thread   *receiver;      /* thread that receives IPC message */
        object_t        send_obj;       /* IPC object sending to */
        object_t        recv_obj;       /* IPC object receiving from */
        uint32_t        exc_bitmap;     /* bitmap of pending exceptions */
        struct list     mutexes;        /* mutexes locked by this thread */
        struct mutex    *wait_mutex;    /* mutex pointer currently waiting */
        void            *kstack;        /* base address of kernel stack */
        struct context  context;        /* machine specific context */
};</pre>

<h3>Thread Creation</h3>
New thread can be created by thread_create().
The initial states of newly created thread are as follow:
<p>
</p>

<i><b>Table 1. Initial thread state</b></i>
<table border="1" width="60%" cellspacing="0">
<tbody>
<tr>
  <th>Data type</th>
  <th>Initial state</th>
</tr>
<tr>
  <td>Owner Task</td>
  <td>Inherit from parent thread</td>
</tr>
<tr>
  <td>Thread state</td>
  <td>Suspended</td>
</tr>
<tr>
  <td>Suspend count</td>
  <td>Task suspend count + 1</td>
</tr>
<tr>
  <td>Scheduling policy</td>
  <td>Round Robin</td>
</tr>
<tr>
  <td>Scheduling Priority</td>
  <td>Default (= 200)</td>
</tr>
<tr>
  <td>Time quantum</td>
  <td>Default (= 50 msec)</td>
</tr>

<tr>
  <td>Processor registers</td>
  <td>Default value</td>
</tr>

</tbody>
</table>

<p>
Since new thread is initially set to the suspended state, thread_resume()
must be called to start it.
</p>

<p>
Creating a thread and loading its register state are isolated
in different routines. These two routines are used by fork(), exec(),
and pthread_create() in the POSIX emulation library.
</p>

<i><b>Table 2. Usage of thread_create()/thread_load()</b></i>
<table border="1" width="80%" cellspacing="0">
<tbody>
<tr>
  <th>Library routine</th>
  <th>thread_create()</th>
  <th>thread_load()</th>
</tr>
<tr>
  <td>fork()</td>
  <td align="center">O</td>
  <td align="center">X</td>
</tr>
<tr>
  <td>exec()</td>
  <td align="center">X</td>
  <td align="center">O</td>
</tr>
<tr>
  <td>pthread_create()</td>
  <td align="center">O</td>
  <td align="center">O</td>
</tr>
</tbody>
</table>

<h3>Thread Termination</h3>
<p>
The kernel will usually release all resources owned by the terminated thread.
But, there are some complicated process to release the resources.
The priority adjustment may be required if the thread inherits its
priority.
</p>
<p>
If the thread is terminated with mutex locking, all threads waiting for
that mutex does sleep forever. So, the mutex held by the terminated thread
must be unlocked, or change its mutex owner if some thread is waiting for.

</p>
<p>
In general, there is a known issue about the thread termination.
If the termination target is current thread, the kernel can not release
the context of the current thread because the
thread switching always requires the current context.
There are the following 3 solutions for this.
</p>
<ol>
 <li>Create "clean up thread" to terminate thread</li>
 <li>Add condition check in thread switching code</li>
 <li>Defer termination in next termination request</li>
</ol>
The Prex kernel is using #3.

<h3>Thread Suspension</h3>
<p>
Each thread can be set to the suspended state by using thread_suspend()
interface.
Although a thread can be suspended any number of times,
it does not start to run unless it is resumed by the same
number of suspend.
</p>

<h3>Kernel Thread</h3>
<p>
A kernel thread is always executed in kernel mode, and it does not have user
mode context.
The scheduling policy is set to SCHED_FIFO by default.
</p>
<p>
Currently, the following kernel threads are running in kernel mode.
</p>
<ul>
  <li>Interrupt Service Threads</li>
  <li>Timer Thread</li>
  <li>Idle Thread</li>
  <li>DPC Thread</li>
</ul>

<h3>Idle Thread</h3>
<p>
An idle thread runs when no other thread is active.
It has the role of cutting down the power consumption of a system.
An idle thread has FIFO scheduling policy, and it does not
have time quantum.
The lowest scheduling priority (=255) is reserved for an idle thread.
</p>

<h2 id="task">Task</h2>

<h3>Task Creation</h3>
<p>
The task can be created by using task_create().
New child task will have the same memory image with the parent task.
Especially text region and read-only region are physically
shared among them.
The parent task receives the new task ID of child task from task_create(), but
child task will receive 0 as task ID.
</p>
<p>
The initial task states are as follow:
</p>

<i><b>Table 3. Initial task state</b></i>
<table border="1" width="60%" cellspacing="0">
<tbody>
<tr>
  <th>Data type</th>
  <th>Inherit from parent task?</th>
</tr>

<tr>
  <td>Object List</td>
  <td align="center">No</td>
</tr>

<tr>
  <td>Threads</td>
  <td align="center">No</td>
</tr>

<tr>
  <td>Memory Map</td>
  <td align="center">Yes</td>
</tr>

<tr>
  <td>Suspend Count</td>
  <td align="center">No</td>
</tr>

<tr>
  <td>Exception Handler</td>
  <td align="center">Yes</td>
</tr>

</tbody>
</table>

<p>
If the parent task is specified as NULL for task_create(),
all child state are initialized to default.
This is used in exec() emulation.
</p>

<h3>Task Suspension</h3>
<p>
When the task is set to suspend state, the thread suspend count of all threads
in the task is also incremented.
A thread can start to run only when both of the thread suspend count
and the task suspend count becomes 0.
</p>

<h3>Kernel Task</h3>
<p>
The kernel task is a special task that has only an idle thread
and interrupt threads. It does not have any user mode memory.
</p>

<h2 id="sched">Scheduler</h2>
<h3>Thread Priority</h3>
<p>
The Prex scheduler is based on the algorithm known as priority based
multi level queue. Each thread is assigned the priority between
0 and 255. The lower number means higher priority like BSD unix.
It maintains 256 level run queues mapped to each priority.
The lowest priority (=255) is used only for an idle thread.
</p>
<p>
A thread has two different types of priority:
</p>
<ul>
  <li><b>Base priority:</b>
  This is a static priority which can be changed only by user mode program.
  <li><b>Current Priority:</b>
  An actual scheduling priority.
  A kernel may adjust this priority dynamically if it's needed.
</ul>
<p>
Although the base priority and the current priority are same value in almost
conditions,
kernel will sometimes change the current priority to avoid
"priority inversion".
</p>

<h3>Thread State</h3>
<p>
Each thread has one of the following states.
</p>

<ul>
  <li><b>RUN</b>     :Running or ready to run</li>
  <li><b>SLEEP</b>   :Sleep for some event</li>
  <li><b>SUSPEND</b> :Suspend count is not 0</li>
  <li><b>EXIT</b>    :Terminated</li>
</ul>
<p>
The thread is always preemptive even in kernel mode.
There are following 4 events to switch thread:
</p>

<i><b>Table 4. Events to switch thread</b></i>
<table border="1" width="90%" cellspacing="0">
<tbody>
<tr>
  <th>Event</th>
  <th>Condition</th>
  <th>Run queue position</th>
</tr>
<tr>
  <td><b>Block</b></td>
  <td>Thread sleep or suspend</td>
  <td>Move to the tail of runq</td>
</tr>
<tr>
  <td><b>Preemption</b></td>
  <td>Higher priority thread becomes runnable</td>
  <td>Keep the head of runq</td>
</tr>
<tr>
  <td><b>Quantum Expiration</b></td>
  <td>The thread consumes its time quantum</td>
  <td>Move to the tail of runq</td>
</tr>
<tr>
  <td><b>Yield</b></td>
  <td>The thread releases CPU by itself</td>
  <td>Move to the tail of runq</td>
</tr>

</tbody>
</table>

<h3>Scheduling Policy</h3>
<p>
There are following three types of scheduling policy.
</p>
<ul>
  <li><b>SCHED_FIFO</b>: First-in First-out
  <li><b>SCHED_RR</b>: Round Robin (SCHED_FIFO + timeslice)
  <li><b>SCHED_OTHER</b>: Not supported
</ul>
<p>
In early Prex development phase, SCHED_OTHER was implemented as a traditional
BSD scheduler. Since this scheduler changes the thread priority dynamically,
it is unpredictable and does not fit the real-time system.
Recently, SCHED_OTHER policy was dropped from Prex to
focus on real-time platform.
</p>

<h3>Scheduling Parameter</h3>
<p>
An application program can change the following scheduling parameters via
kernel API.
</p>
<ul>
  <li>Thread Priority</li>
  <li>Scheduling Policy</li>
  <li>Time Quantum (only for SCHED_RR)</li>
</ul>

<h3>Scheduling Lock</h3>
<p>
The thread scheduling can be disabled by locking the scheduler.
This is used to synchronize the thread execution to protect the
access to the global resources.
Since any interrupt handler can work while scheduling lock state,
it does not affect to the interrupt latency.
</p>

<h3>Named Event</h3>
<p>
The thread can sleep/wakeup for the specific event. The event works as
the queue of the sleeping threads. Since each event has its name,
it is easy to know which event the debugee is waiting for.
</p>


<h2 id="memory">Memory Management</h2>

<h3>Physical Page Allocator</h3>
<p>The physical page allocator provides the service for
page allocation/deallocation/reservation.
It works on the bottom layer for other memory managers.
</p>
<p>
<img alt="Memory Structure" src="img/memory.gif" border="1"
style="width: 448px; height: 308px;"><br>

<i><b>Figure 2. Prex Memory Structure</b></i>
</p>
<p>
The key point is that Prex kernel does not page out to
any external disk devices. This is an important design point to get
real-time performance and system simplicity.

</p>

<h3>Kernel Memory Allocator</h3>
<p>
The kernel memory allocator is optimized for the small
memory foot print system.
</p>
<p>
To allocate kernel memory, it is necessary to divide one page into
two or more blocks.
There are following 3 linked lists to manage used/free blocks for kernel
memory.
<ol>
  <li>All pages allocated for the kernel memory are linked.</li>
  <li>All blocks divided in the same page are linked.</li>
  <li>All free blocks of the same size are linked.</li>
</ol>
<p>
Currently, it can not handle the memory size exceeding one page.
Instead, a driver can use page_alloc() to allocate large memory.
<br>
When the kernel code illegally writes data into non-allocated memory,
the system will crash easily. The kmem modules are called from
not only kernel code but from various drivers. In order to
check the memory over run, each free block has a tag with magic ID.
</p>
<p>
The kernel maintains the array of the block headers for the free blocks.
The index of an array is decided by the size of each block.
All block has the size of the multiple of 16.
</p>
<pre>free_blks[0] = list for 16 byte block
free_blks[1] = list for 32 byte block
free_blks[2] = list for 48 byte block
     .
     .
free_blks[255] = list for 4096 byte block
</pre>
<p>
In generic design, only one list is used to search the free block
for a first fit algorithm.
However, the Prex kernel memory allocator is using multiple lists
corresponding to each block size.
A search is started from the list of the requested size. So,
it is not necessary to search smaller block's list wastefully.
</p>
<p>
In most of the "buddy" based memory allocators, their algorithm are
using <b>2^n</b> bytes as block size.
But, this logic will throw away much memory in case
the block size is not fit. So, this is not suitable for the
embedded systems that Prex aims to.
</p>

<h3>Virtual Memory Manager</h3>
<p>
A task owns its private virtual address space. All threads
in a same task share one memory space.
When new task is made, the address map of the parent task will be
automatically copied.
In this time, the read-only space is not copied and is shared with old map.
</p>
<p>
The kernel provide the following functions for VM:
</p>
<ul>
  <li>Allocate/deallocate memory region</li>
  <li>Change memory attribute (read/write/exec)</li>
  <li>Map another task's memory to current task</li>
</ul>
<p>
The VM allocator is using the traditional list-based algorithm.
</p>
<p>
The kernel task is a special task which has the virtual memory mapping
for kernel. All other user mode tasks will have the same kernel memory
image mapped from the kernel task. So, kernel threads can work with the
all user mode task context without switching memory map.
</p>
<p>
<img alt="Memory Mapping" src="img/memmap.gif" border="1"
style="width: 504px; height: 271px;"><br>

<i><b>Figure 3. Kernel Memory Mapping</b></i>
</p>

<p>
Since the Prex kernel does not do page out to an external storage,
it is guaranteed that the allocated memory is always continuing
and existing. Thereby, a kernel and drivers can be constructed
very simply.
</p>
<p>
<i>Note: "Copy-on-write" feature was supported with the Prex kernel before.
But, it was dropped to increase the real-time performance.</i>
</p>


<h2 id="ipc">IPC</h2>
<p>
The message passing model of Prex is very simple compared with other
modern microkernels. The Prex message is sent to the "object" from thread
to thread.
The "object" in Prex is similar concept that is called as "port"
in other microkernel.
</p>

<h3>Object</h3>
<p>
An object represents service, state, or policies etc.
For object manipulation, kernel provide 3 functions:
object_create(), object_delete(), object_lookup().
Prex task will create an object to publish its interface to other tasks.
For example, server tasks will create objects like "proc", "fs", "exec" to
allow clients to access their service.
And then, client tasks will send a request message to these objects
</p>
<p>
An actual object is stored in kernel space, and it is protected
from user mode code.
Each object data is managed with the hash table by using its name string.
Usually, an object has a unique name within a system. To send a
message to the specific object, it must obtain the target object ID
by looking up by the name.
</p>
<p>
An object can be created without its name. These objects can be used as
private objects for threads in same task.
</p>

<h3>Message</h3>
<p>
Each IPC message must have the pre-defined message header in it.
The kernel will store the sender task's ID into the message header.
This mechanism ensures the receiver task can get the exact task ID
of the sender task. Therefore, receiver task can check the sender
task's capability for various secure services.
</p>
<p>
It is necessary to recognize the pre-defined message format between
sender and receiver.
</p>
<p>
Messages are sent to the specific object using msg_send().
The transmission of a message is always synchronous. This means that
the thread which sent the message is blocked until it receives
a response from another threads. msg_receive() performs reception
of a message. msg_receive() is also blocked when no message is
reached to the target object. The receiver thread must answer the
message using msg_reply() after it finishes processing.
</p>
<p>
The receiver thread can not receive another message until it
replies to the sender. In short, a thread can receive only one
message at once. Once the thread receives message, it can send
another message to different object. This mechanism allows threads
to redirect the sender's request to another thread.
</p>
<p>
A thread can receive a message from the specific object which is
created by itself or thread in same task.
If the message has not arrived, it
blocks until any message comes in. The following figure shows the IPC transmit
sequence of Prex.
</p>

<img alt="ipc queue" src="img/msg.gif" border="1"
style="width: 505px; height: 347px;"><br>

<i><b>Figure 4. IPC Transmit Sequence</b></i>

<h3>Message Transfer</h3>
<p>
The message is copied to task to task directly without kernel
buffering. The memory region of sent message is
automatically mapped to the receiver's memory within kernel.
This mechanism allows to reduce the number of copy time while message
translation.
Since there is no page out of memory in Prex, we can
copy the message data via physical memory at anytime.
</p>
<img alt="Message transfer" src="img/ipcmap.gif" border="1"
style="width: 459px; height: 321px;"><br>

<i><b>Figure 5. IPC message transfer</b></i>

<h2 id="except">Exception Handling</h2>
<p>
A user mode task can specify its own exception handler with exception_setup().
There are two different types of exception.
</p>
<ul>
  <li><b>H/W exception</b>:
  This type of exception is caused by H/W trap &amp fault. The exception
  will be sent to the thread which caused the trap.
  If no exception handler is specified by the task, it will be
  terminated by kernel.</li>

  <li><b>S/W exception</b>:
  The user mode task can send S/W exception to another task by exception_raise().
  The exception
  will be sent to the thread that is sleeping with exception_wait().
  If no thread is waiting for the exception, the exception is sent
  to the first thread in the target task.</li>
</ul>
<p>
Kernel supports 32 types of exception.
The following pre-defined exceptions are raised by kernel itself.
</p>

<i><b>Table 5. Kernel exceptions</b></i>
<table border="1" width="80%" cellspacing="0">
<tbody>
<tr>
  <th>Exception</th>
  <th>Type</th>
  <th>Reason</th>
</tr>
<tr>
  <td>SIGILL</td>
  <td align="center">H/W</td>
  <td>Illegal instruction</td>
</tr>
<tr>
  <td>SIGTRAP</td>
  <td align="center">H/W</td>
  <td>Break point</td>
</tr>
<tr>
  <td>SIGFPE</td>
  <td align="center">H/W</td>
  <td>Math error</td>
</tr>
<tr>
  <td>SIGSEGV</td>
  <td align="center">H/W</td>
  <td>Invalid memory access</td>
</tr>
<tr>
  <td>SIGALRM</td>
  <td align="center">S/W</td>
  <td>Alarm event</td>
</tr>
</tbody>
</table>

<p>
POSIX emulation library will setup its own exception handler to convert
the Prex exceptions into UNIX signals. It will maintain its own signal mask.
And, it transfer control to the actual POSIX signal handler that is
defined by the user mode process.

</p>

<h2 id="int"> Interrupt Framework</h2>
<p>
Prex defines two different types of interrupt service to
optimize the response time of real-time operation.
</p>

<h3>Interrupt Service Routine (ISR)</h3>
<p>
ISR is started by an actual hardware interrupt. The associated
interrupt is disabled in ICU and CPU interrupt is enabled
while it runs. If ISR determines that its device generates
the interrupt, ISR must program the device to stop the interrupt.
Then, ISR should do minimum I/O operation and return control
as quickly as possible.
ISR will run within the context of current running thread at
interrupt time. So, only few kernel services are available within
ISR. IRQ_ASSERT() macro can be used to detect the invalid
function call from ISR.
</p>

<h3>Interrupt Service Thread (IST)</h3>
<p>
IST is automatically activated if ISR returns INT_CONTINUE to kernel. It
will be called when the system enters safer condition than ISR.
Any interrupt driven I/O operation should be done in IST not ISR.
Since ISR for same IRQ may be run during IST, the shared data,
resources, and device registers must be synchronized by using
irq_lock().
IST does not have to be reentrant, since it is not interrupted
by same IST itself.
</p>

<h3>Interrupt Nesting &amp Priority</h3>
<p>
Each ISR has its logical priority level, with 0 being the highest
priority. While one ISR is running, all lower priority interrupts
are masked off.
This interrupt nesting mechanism avoids delaying of
high priority interrupt events.
</p>
<p>
IST is executed as a normal thread dispatched by the scheduler. So,
the interrupt thread which has higher priority is executed first.
The driver writer can specify the thread priority of IST when
IST is attached to the specific interrupt line.
The important point is that even a user mode task can be performed
prior to an interrupt thread.
</p>
<p>
The following figure is the sample of the Prex interrupt processing.
</p>
<p>
<img alt="Interrupt Processing" src="img/irq.gif" border="1"><br>
<i><b>Figure 6. Prex Interrupt Processing</b></i>
</p>
<p>
</p>

<h3>Interrupt Locking</h3>
<p>
irq_lock() &amp irq_unlock() are used to disable all interrupts in
order to synchronize the access to the kernel or H/W resource. Since irq_lock()
increments a lock counter, irq_unlock() will automatically restore
to the original interrupt state when locking count becomes 0.
So the caller does not have to save the previous interrupt state.
</p>

<h3>Interrupt Stack</h3>
<p>
If each ISR uses the kernel stack of the current running thread, the
stack area may be over-flow when continuous interrupts are occurred at
one same thread. So the kernel stack will be switched to the dedicated stack
while ISR is running.
</p>


<h2 id="timer">Timer</h2>

<h3>Kernel Timers</h3>

<p>
The kernel timer provides the following feature.
</p>
<ul>
  <li><b>Sleep timer</b>:      Put caller thread to sleep in the specified time.</li>
  <li><b>Call back timer</b>:  Call the routine after specified time passes.</li>
  <li><b>Periodic timer</b>:   Call the routine at the specified interval.</li>
</ul>

<h3>Timer Jitter</h3>

<p>
The periodic timer is designed to minimize the deviation between desired and
actual expiration.
</p>

<h2 id="device">Device I/O Service</h2>

The Prex device driver module is separated from the kernel, and this module
is linked with the kernel at the boot time.
The kernel provides only simple and minimum services to help the 
communication between applications and drivers.

<h3>Device Object</h3>
<p>
Since the Prex kernel does not have the file system in it, the kernel
provides a device object service for I/O interface.
The device object is created by the device driver to communicate to the
application. Usually, the driver creates a device object for an existing
physical device. But, it can be used to handle logical or virtual devices.
</p>

<h3>Driver Interface</h3>
<p>
The interface between kernel and drivers are defined clearly as
"Driver Kernel Interface". The kernel provides the following services
for device drivers.
</p>
<ul>
  <li>Device object service</li>
  <li>Kernel memory allocation</li>
  <li>Physical page allocation</li>
  <li>Interrupt handling service</li>
  <li>Scheduler service</li>
  <li>Timer service</li>
  <li>Debug service</li>
</ul>

<h3>Application Interface</h3>
<p>
The kernel device I/O interface are provided to access the specific device
object which is handled by a driver. 
The Prex kernel provides the following 5 functions for applications.
</p>
<ul>
 <li>Open a device</li>
 <li>Close a device</li>
 <li>Read from a device</li>
 <li>Write to a device</li>
 <li>Device I/O control</li>
</ul>

<h2 id="mutex">Mutex</h2>
<h3>Priority Inheritance</h3>
<p>
The thread priority is automatically changed at one of the following conditions.
</p>
<ol>
  <li>
  When the current thread fails to lock the mutex and the mutex
  owner has lower priority than current thread, the priority
  of mutex owner is boosted to the current priority.
  If this mutex owner is waiting for another mutex, such related
  mutexes are also processed.
  </li>
  <li>
  When the current thread unlocks the mutex and its priority
  has already been boosted, kernel recomputes the current priority.
  In this case, the priority is set to the highest
  priority among the threads waiting for the mutexes locked by the
  current thread.
  </li>
  <li>
   When the priority is changed by the user request, the related(inherited)
   thread's priority is also changed.
  </li>
</ol>
<p>
There are following limitations about priority inheritance
with Prex mutex.
</p>
<ol>
  <li>
  If the priority is changed by the user request, the priority
  recomputation is done only when the new priority is higher
  than old priority. The inherited priority is reset to base
  priority when the mutex is unlocked.
  </li>
  <li>
  Even if thread is killed with mutex waiting, the related
  priority is not adjusted.
  </li>
</ol>
<h2 id="debug">Debug</h2>
There are following debugging support functions:
<ul>
  <li>printk(): Display the debug message in kernel.</li>
  <li>panic(): Dump processor registers and stop system.</li>
  <li>ASSERT(): If expression is false (zero), stop system and display information.</li>
  <li>trace_on(): If the kernel trace is enabled, all entry/exit of functions
  are logged.
</ul>
    </tr>
    <tr>
      <td id="footer" colspan="2" style="vertical-align: top;">
        <a href="http://sourceforge.net">
        <img src="http://sourceforge.net/sflogo.php?group_id=132028&amp;type=1"
        alt="SourceForge.net Logo" border="0" height="31" width="88"></a><br>
        Copyright&copy; 2005-2007 Kohsuke Ohtani
      </td>
    </tr>

  </tbody>
</table>

</div>
<div id="bottom"></div>

</body>
</html>