Annotation of prex-old/doc/html/doc/kernel.html, Revision 1.1.1.1.2.1

1.1       nbrk        1: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
                      2: <html>
                      3: <head>
                      4:   <title>Prex Kernel Internals</title>
                      5:   <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
                      6:   <meta name="keywords" content="Prex, embedded, real-time, operating system, RTOS, open source, free">
                      7:   <meta name="author" content="Kohsuke Ohtani">
                      8:   <link rel="stylesheet" type="text/css" href="../default.css" media="screen">
                      9:   <link rel="stylesheet" type="text/css" href="../print.css" media="print">
                     10: </head>
                     11: <body>
                     12: <div id="top">
                     13: </div>
                     14: <div id="middle">
                     15: 
                     16: <table id="content" cellpadding="0" cellspacing="0">
                     17:   <tbody>
                     18: 
                     19:     <tr>
                     20:       <td id="header" colspan="2" valign="top">
                     21:         <table width="100%" border="0" cellspacing="0" cellpadding="0">
                     22:         <tr>
                     23:           <td id="logo">
                     24:             <a href="http://prex.sourceforge.net/">
                     25:             <img alt="Prex logo" src="../img/logo.gif" border="0"
1.1.1.1.2.1! nbrk       26:             style="width: 250px; height: 54px;"></a>
1.1       nbrk       27:           </td>
                     28:           <td id="brief" align="right" valign="bottom">
                     29:             An Open Source, Royalty-free,<br>
                     30:            Real-time Operating System
                     31:           </td>
                     32:         </tr>
                     33:         </table>
                     34:       </td>
                     35:     </tr>
                     36: 
                     37:     <tr>
                     38:       <td id="directory" style="vertical-align: top;">
                     39:       <a href="http://prex.sourceforge.net/">Prex Home</a> >
                     40:       <a href="index.html">Document Index</a> >
                     41:       Kernel Internals
                     42:     </tr>
                     43:     <tr><td class="pad" colspan="2" style="vertical-align: top;"></td></tr>
                     44: 
                     45:     <tr>
1.1.1.1.2.1! nbrk       46:       <td id="doc" style="vertical-align: top;">
        !            47: 
1.1       nbrk       48:       <h1>Prex Kernel Internals</h1>
                     49: 
                     50: <i>Version 1.4, 2005/12/31</i>
                     51: 
                     52: <h3>Table of Contents</h3>
                     53: <ul>
                     54:   <li><a href="#over">Kernel Overview</a></li>
                     55:   <li><a href="#design">Design Policy</a></li>
                     56:   <li><a href="#thread">Thread</a></li>
                     57:   <li><a href="#task">Task</a></li>
                     58:   <li><a href="#sched">Scheduler</a></li>
                     59:   <li><a href="#memory">Memory Management</a></li>
                     60:   <li><a href="#ipc">IPC</a></li>
                     61:   <li><a href="#except">Exception Handling</a></li>
                     62:   <li><a href="#int">Interrupt Framework</a></li>
                     63:   <li><a href="#timer">Timer</a></li>
                     64:   <li><a href="#device">Device I/O Service</a></li>
                     65:   <li><a href="#mutex">Mutex</a></li>
                     66:   <li><a href="#debug">Debug</a></li>
                     67: </ul>
                     68: <br>
                     69: 
                     70: <h2 id="over">Kernel Overview</h2>
                     71: 
                     72: <h3>Kernel Structure</h3>
                     73: <p>
                     74: The following figure illustrates the Prex kernel structure.
                     75: </p>
                     76: <p>
                     77: <img alt="Kernel Components" src="img/kernel.gif" border="1"
                     78: style="width: 602px; height: 414px;"><br>
                     79: 
                     80: <i><b>Figure 1. Prex kernel Structure</b></i>
                     81: </p>
                     82: <p>
                     83: A kernel object belongs to one of the following groups.
                     84: </p>
                     85: <ul>
                     86:   <li><b>kern</b>: kernel core components</li>
                     87:   <li><b>mem</b>: memory managers</li>
                     88:   <li><b>ipc</b>: inter-process communication (*)</li>
                     89:   <li><b>sync</b>: synchronization objects</li>
                     90:   <li><b>arch</b>: architecture dependent components</li>
                     91: </ul>
                     92: <p>
                     93: <i>*) Since all messages in Prex are transferred between threads, the name
                     94: "IPC" is not strictly accurate.
                     95: However, Prex still uses "IPC" as the general term for message transfer
                     96: via the kernel.</i>
                     97: </p>
                     98: 
                     99: <h3>Naming Convention</h3>
                    100: <p>
                    101: Each "group/object" name in Figure 1 maps to a "directory/file" path in the
                    102: Prex source tree. For example, the thread-related functions are located
                    103: in "kern/thread.c", and the semaphore functions are placed
                    104: in "sync/sem.c".
                    105: </p>
                    106: <p>
                    107: In addition, kernel routines follow a standard naming
                    108: convention. The method named <i>bar</i> for the object named <i>foo</i>
                    109: should be named "foo_bar". For example, the routine that creates a new
                    110: thread is named "thread_create", and the routine that locks a mutex is named "mutex_lock".
                    111: This rule does not apply to the names of local functions.
                    112: </p>
                    113: 
                    114: 
                    115: <h2 id="design">Design Policy</h2>
                    116: <p>
                    117: The design of the Prex kernel focuses on the following points.
                    118: </p>
                    119: <ul>
                    120:   <li>Portability</li>
                    121:   <li>Scalability</li>
                    122:   <li>Reliability</li>
                    123:   <li>Interoperability</li>
                    124:   <li>Maintainability</li>
                    125: </ul>
                    126: 
                    127: <h3>Portability</h3>
                    128: <p>
                    129: The Prex kernel is divided into two different layers -
                    130: a common kernel layer and an architecture dependent layer.
                    131: 
                    132: No routine in the common kernel layer may access the hardware directly.
                    133: Instead, it must request the operation from the architecture dependent layer.
                    134: The interface to the architecture dependent layer is strictly defined
                    135: by the Prex kernel. This interface is carefully designed to support a variety of
                    136: architectures with minimal code changes,
                    137: which makes it easy to port the Prex kernel to a different architecture.
                    138: </p>
                    139: <p>
                    140: The following functions must be provided by the architecture dependent layer.
                    141: </p>
                    142: <ul>
                    143:   <li><b>CPU</b>: initializes processor registers before kernel boot</li>
                    144:   <li><b>Context</b>: abstracts processor and hardware context</li>
                    145:   <li><b>MMU</b>: abstracts memory management unit (*)</li>
                    146:   <li><b>Trap</b>: abstracts processor trap</li>
                    147:   <li><b>Interrupt</b>: abstracts the interrupt control unit</li>
                    148:   <li><b>Clock</b>: abstracts clock timer unit</li>
                    149:   <li><b>Misc.</b>: abstracts system reset, idle power state</li>
                    150: </ul>
                    151: <p>
                    152: <i>*) On a system without an MMU, the MMU related routines are
                    153: defined as no-operation routines.
                    154: Therefore, the common kernel layer cannot assume that an MMU is always available.</i>
                    155: </p>
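<p>
The footnote above can be illustrated with a minimal sketch. The names
CONFIG_MMU, paddr_t, mmu_map() and region_attach() are assumptions made for
this illustration and are not taken from the Prex source. The point is only
that the common layer calls one interface, and a no-MMU build turns that
call into a no-operation routine.
</p>

```c
#include <stddef.h>

/* Hypothetical build switch: define CONFIG_MMU on targets that have an MMU. */
/* #define CONFIG_MMU */

typedef unsigned long paddr_t;  /* illustrative physical address type */

#ifdef CONFIG_MMU
/* MMU build: the architecture dependent layer supplies the real routine. */
int mmu_map(void *vaddr, paddr_t paddr, size_t size);
#else
/* No-MMU build: mapping is a no-operation that always reports success. */
static inline int
mmu_map(void *vaddr, paddr_t paddr, size_t size)
{
        (void)vaddr;
        (void)paddr;
        (void)size;
        return 0;
}
#endif

/*
 * Common-layer code (hypothetical) uses the same call on both kinds of
 * system and therefore must not rely on any side effect of the mapping.
 */
static int
region_attach(void *vaddr, paddr_t paddr, size_t size)
{
        if (size == 0)
                return -1;      /* reject an empty region */
        return mmu_map(vaddr, paddr, size);
}
```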
                    156: 
                    157: <h3>Scalability</h3>
                    158: <p>
                    159: To achieve higher scalability, the kernel does not limit the maximum
                    160: number of kernel objects that can be created.
                    161: Instead, the resources for all kernel objects are allocated dynamically after
                    162: system boot.
                    163: This keeps the memory requirement smaller than static
                    164: resource allocation would.
                    165: It also means that the kernel can create tasks, threads, objects, devices,
                    166: events, mutexes, and timers as long as usable memory remains.
                    167: </p>
                    168: <p>
                    169: The kernel supports both MMU and no-MMU systems. Therefore, most of the kernel
                    170: components and the VM sub-system are carefully designed to work without an MMU.
                    171: </p>
                    172: 
                    173: <h3>Reliability</h3>
                    174: <p>
                    175: What should an OS do when the remaining memory is exhausted?
                    176: If the system could simply stop with panic() at that point, many of the error
                    177: checks in the kernel would not be necessary.
                    178: But obviously, this is not acceptable for a reliable system.
                    179: Even if the memory is exhausted,
                    180: the kernel must continue processing.
                    181: For this reason, all kernel code checks the error status returned
                    182: by the memory allocation routine.
                    183: </p>
                    184: <p>
                    185: In addition, the kernel must never crash, even if an invalid parameter
                    186: is passed via the kernel API. Basically, the Prex kernel code is written with the
                    187: "garbage in, error out" principle.
                    188: The Prex kernel is designed to keep running even if a malicious
                    189: program is loaded.
                    190: </p>
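<p>
As a rough illustration of the "garbage in, error out" principle, the
following sketch validates its arguments and checks the allocator result,
returning a POSIX-style error code in both failure cases. The names
struct ksem, ksem_create(), KSEM_MAX_VALUE and kmem_alloc_hook are
hypothetical and exist only for this example. They are not the actual
Prex routines.
</p>

```c
#include <errno.h>
#include <stdlib.h>

#define KSEM_MAX_VALUE 65535u   /* illustrative upper bound */

struct ksem {
        unsigned int value;
};

/* Hypothetical allocator hook so that the failure path can be exercised. */
static void *(*kmem_alloc_hook)(size_t) = malloc;

/* Stand-in allocator that simulates memory exhaustion. */
static void *
failing_alloc(size_t size)
{
        (void)size;
        return NULL;
}

int
ksem_create(struct ksem **out, unsigned int value)
{
        struct ksem *s;

        if (out == NULL || value > KSEM_MAX_VALUE)
                return EINVAL;  /* garbage in, error out */

        s = kmem_alloc_hook(sizeof(*s));
        if (s == NULL)
                return ENOMEM;  /* report exhaustion instead of panicking */

        s->value = value;
        *out = s;
        return 0;
}
```

<p>
The caller receives EINVAL or ENOMEM and decides how to recover. The
routine itself never stops the system.
</p>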
                    191: 
                    192: <h3>Interoperability</h3>
                    193: <p>
                    194: Although the Prex kernel was written from scratch, its applications
                    195: will be ported from other operating systems such as BSD. Therefore, the system
                    196: call interface is designed to support general OS
                    197: APIs such as POSIX and the APIs of generic RTOSes.
                    198: </p>
                    199: <p>
                    200: The error codes for Prex system calls use the same names
                    201: as POSIX. For example, EINVAL means "Invalid argument" and ENOMEM means
                    202: "Out of memory". People who already know POSIX programming
                    203: therefore do not have to learn new error codes.
                    204: This matters both for writing applications and for reading
                    205: the kernel code, because learning a new error scheme is painful
                    206: for developers.
                    207: 
                    208: In addition, it simplifies the POSIX emulation library because the library
                    209: does not have to remap the error codes.
                    210: </p>
                    211: 
                    212: <h3>Maintainability</h3>
                    213: <p>
                    214: All kernel code is kept clean and simple for ease of maintenance.
                    215: The code is well-commented and consistent.
                    216: It is easy to add a system call to the kernel or to remove one.
                    217: The kernel has debugging facilities such as a function trace and
                    218: a dump of the kernel objects.
                    219: </p>
                    220: 
                    221: 
                    222: <h2 id="thread">Thread</h2>
                    223: 
                    224: <h3>Thread Control Block</h3>
                    225: <p>
                    226: The thread control block includes data for the
                    227: owner task, scheduler, timer, IPC, exception, mutex, and context.
                    228: The following thread structure is the most important definition
                    229: in the kernel code.
                    230: </p>
                    231: <pre>
                    232: struct thread {
                    233:         int             magic;          /* magic number */
                    234:         task_t          task;           /* pointer to owner task */
                    235:         struct list     task_link;      /* link for threads in same task */
                    236:         struct queue    link;           /* linkage on scheduling queue */
                    237:         int             state;          /* thread state */
                    238:         int             policy;         /* scheduling policy */
                    239:         int             prio;           /* current priority */
1.1.1.1.2.1! nbrk      240:         int             baseprio;       /* base priority */
        !           241:         int             timeleft;       /* remaining ticks to run */
        !           242:         u_int           time;           /* total running time */
        !           243:         int             resched;        /* true if rescheduling is needed */
        !           244:         int             locks;          /* schedule lock counter */
        !           245:         int             suscnt;         /* suspend counter */
        !           246:         struct event    *slpevt;        /* sleep event */
        !           247:         int             slpret;         /* sleep result code */
1.1       nbrk      248:         struct timer    timeout;        /* thread timer */
                    249:         struct timer    *periodic;      /* pointer to periodic timer */
1.1.1.1.2.1! nbrk      250:         uint32_t        excbits;        /* bitmap of pending exceptions */
1.1       nbrk      251:         struct queue    ipc_link;       /* linkage on IPC queue */
1.1.1.1.2.1! nbrk      252:         void            *msgaddr;       /* kernel address of IPC message */
        !           253:         size_t          msgsize;        /* size of IPC message */
        !           254:         thread_t        sender;         /* thread that sends IPC message */
        !           255:         thread_t        receiver;       /* thread that receives IPC message */
        !           256:         object_t        sendobj;        /* IPC object sending to */
        !           257:         object_t        recvobj;        /* IPC object receiving from */
1.1       nbrk      258:         struct list     mutexes;        /* mutexes locked by this thread */
                    259:         struct mutex    *wait_mutex;    /* mutex pointer currently waiting */
                    260:         void            *kstack;        /* base address of kernel stack */
1.1.1.1.2.1! nbrk      261:         struct context  ctx;            /* machine specific context */
1.1       nbrk      262: };</pre>
                    263: 
                    264: <h3>Thread Creation</h3>
                    265: <p>
                    266: A new thread can be created by thread_create().
                    267: The initial state of a newly created thread is as follows:
                    268: </p>
                    269: 
                    270: <i><b>Table 1. Initial thread state</b></i>
                    271: <table border="1" width="60%" cellspacing="0">
                    272: <tbody>
                    273: <tr>
                    274:   <th>Data type</th>
                    275:   <th>Initial state</th>
                    276: </tr>
                    277: <tr>
                    278:   <td>Owner Task</td>
                    279:   <td>Inherit from parent thread</td>
                    280: </tr>
                    281: <tr>
                    282:   <td>Thread state</td>
                    283:   <td>Suspended</td>
                    284: </tr>
                    285: <tr>
                    286:   <td>Suspend count</td>
                    287:   <td>Task suspend count + 1</td>
                    288: </tr>
                    289: <tr>
                    290:   <td>Scheduling policy</td>
                    291:   <td>Round Robin</td>
                    292: </tr>
                    293: <tr>
                    294:   <td>Scheduling Priority</td>
                    295:   <td>Default (= 200)</td>
                    296: </tr>
                    297: <tr>
                    298:   <td>Time quantum</td>
                    299:   <td>Default (= 50 msec)</td>
                    300: </tr>
                    301: 
                    302: <tr>
                    303:   <td>Processor registers</td>
                    304:   <td>Default value</td>
                    305: </tr>
                    306: 
                    307: </tbody>
                    308: </table>
                    309: 
                    310: <p>
                    311: Since a new thread is initially in the suspended state, thread_resume()
                    312: must be called to start it.
                    313: </p>
                    314: 
                    315: <p>
                    316: Creating a thread and loading its register state are separated
                    317: into different routines, thread_create() and thread_load(). These two routines are used by fork(), exec(),
                    318: and pthread_create() in the POSIX emulation library.
                    319: </p>
                    320: 
                    321: <i><b>Table 2. Usage of thread_create()/thread_load()</b></i>
                    322: <table border="1" width="80%" cellspacing="0">
                    323: <tbody>
                    324: <tr>
                    325:   <th>Library routine</th>
                    326:   <th>thread_create()</th>
                    327:   <th>thread_load()</th>
                    328: </tr>
                    329: <tr>
                    330:   <td>fork()</td>
                    331:   <td align="center">O</td>
                    332:   <td align="center">X</td>
                    333: </tr>
                    334: <tr>
                    335:   <td>exec()</td>
                    336:   <td align="center">X</td>
                    337:   <td align="center">O</td>
                    338: </tr>
                    339: <tr>
                    340:   <td>pthread_create()</td>
                    341:   <td align="center">O</td>
                    342:   <td align="center">O</td>
                    343: </tr>
                    344: </tbody>
                    345: </table>
                    346: 
                    347: <h3>Thread Termination</h3>
                    348: <p>
                    349: The kernel will usually release all resources owned by the terminated thread.
                    350: However, releasing some of these resources is complicated.
                    351: For example, a priority adjustment may be required if the thread has inherited its
                    352: priority.
                    353: </p>
                    354: <p>
                    355: If a thread is terminated while it holds a mutex, all threads waiting for
                    356: that mutex would sleep forever. Therefore, a mutex held by the terminated thread
                    357: must be unlocked, or its ownership must be passed to a waiting thread if there is one.
                    358: 
                    359: </p>
                    360: <p>
                    361: In general, there is a known issue with thread termination.
                    362: If the termination target is the current thread, the kernel cannot release
                    363: the context of the current thread because
                    364: thread switching always requires the current context.
                    365: There are three possible solutions to this problem.
                    366: </p>
                    367: <ol>
                    368:  <li>Create a "clean up thread" that terminates the thread</li>
                    369:  <li>Add a condition check to the thread switching code</li>
                    370:  <li>Defer the release until the next termination request</li>
                    371: </ol>
                    372: <p>The Prex kernel uses solution 3.</p>
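<p>
The deferred approach can be sketched roughly as follows. This is a
simplified model under stated assumptions, not the actual Prex code: the
names struct kthread, cur_thread, zombie, freed_count, kthread_new(),
kthread_release() and kthread_terminate() are invented for the
illustration, and malloc()/free() stand in for the kernel allocator.
</p>

```c
#include <stdlib.h>

struct kthread {
        void *kstack;   /* kernel stack, still in use while the thread runs */
};

static struct kthread *cur_thread;  /* thread that is currently running */
static struct kthread *zombie;      /* self-terminated thread not yet freed */
static int freed_count;             /* bookkeeping for this illustration */

static struct kthread *
kthread_new(void)
{
        struct kthread *t = malloc(sizeof(*t));

        if (t != NULL)
                t->kstack = malloc(64);
        return t;
}

static void
kthread_release(struct kthread *t)
{
        free(t->kstack);
        free(t);
        freed_count++;
}

void
kthread_terminate(struct kthread *t)
{
        /* First release a thread left over from an earlier self-termination. */
        if (zombie != NULL && zombie != cur_thread) {
                kthread_release(zombie);
                zombie = NULL;
        }

        if (t == cur_thread)
                zombie = t;     /* context is still needed for the switch */
        else
                kthread_release(t);
}
```

<p>
In this model a thread that terminates itself is only recorded. Its
resources are released when a later termination request arrives from a
different running thread.
</p>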
                    373: 
                    374: <h3>Thread Suspension</h3>
                    375: <p>
                    376: Any thread can be put into the suspended state by using the thread_suspend()
                    377: interface.
                    378: A thread can be suspended any number of times, but
                    379: it does not start to run again until it has been resumed the same
                    380: number of times.
                    381: </p>
                    382: 
                    383: <h3>Kernel Thread</h3>
                    384: <p>
                    385: A kernel thread is always executed in kernel mode and does not have a user
                    386: mode context.
                    387: Its scheduling policy is set to SCHED_FIFO by default.
                    388: </p>
                    389: <p>
                    390: Currently, the following kernel threads are running in kernel mode.
                    391: </p>
                    392: <ul>
                    393:   <li>Interrupt Service Threads</li>
                    394:   <li>Timer Thread</li>
                    395:   <li>Idle Thread</li>
                    396:   <li>DPC Thread</li>
                    397: </ul>
                    398: 
                    399: <h3>Idle Thread</h3>
                    400: <p>
                    401: The idle thread runs when no other thread is active.
                    402: Its role is to reduce the power consumption of the system.
                    403: The idle thread uses the FIFO scheduling policy and does not
                    404: have a time quantum.
                    405: The lowest scheduling priority (=255) is reserved for the idle thread.
                    406: </p>
                    407: 
                    408: <h2 id="task">Task</h2>
                    409: 
                    410: <h3>Task Creation</h3>
                    411: <p>
                    412: A task can be created by using task_create().
                    413: The new child task has the same memory image as the parent task.
                    414: In particular, the text region and read-only regions are physically
                    415: shared between them.
                    416: The parent task receives the task ID of the new child task from task_create(), while
                    417: the child task receives 0 as the task ID.
                    418: </p>
                    419: <p>
                    420: The initial task state is as follows:
                    421: </p>
                    422: 
                    423: <i><b>Table 3. Initial task state</b></i>
                    424: <table border="1" width="60%" cellspacing="0">
                    425: <tbody>
                    426: <tr>
                    427:   <th>Data type</th>
                    428:   <th>Inherit from parent task?</th>
                    429: </tr>
                    430: 
                    431: <tr>
                    432:   <td>Object List</td>
                    433:   <td align="center">No</td>
                    434: </tr>
                    435: 
                    436: <tr>
                    437:   <td>Threads</td>
                    438:   <td align="center">No</td>
                    439: </tr>
                    440: 
                    441: <tr>
                    442:   <td>Memory Map</td>
                    443:   <td align="center">Yes</td>
                    444: </tr>
                    445: 
                    446: <tr>
                    447:   <td>Suspend Count</td>
                    448:   <td align="center">No</td>
                    449: </tr>
                    450: 
                    451: <tr>
                    452:   <td>Exception Handler</td>
                    453:   <td align="center">Yes</td>
                    454: </tr>
                    455: 
                    456: </tbody>
                    457: </table>
                    458: 
                    459: <p>
                    460: If the parent task is specified as NULL for task_create(),
                    461: all of the child task's state is initialized to the defaults.
                    462: This is used in the exec() emulation.
                    463: </p>
                    464: 
                    465: <h3>Task Suspension</h3>
                    466: <p>
                    467: When a task is suspended, the thread suspend count of every thread
                    468: in the task is also incremented.
                    469: A thread can start to run only when both its thread suspend count
                    470: and the task suspend count are 0.
                    471: </p>
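<p>
The suspend counting described above can be modelled with two counters.
The sketch below is only an illustration: struct ktask, struct kthr,
MAX_THREADS and all function names are hypothetical, and a fixed-size array
stands in for the kernel's thread list. The initial thread count follows
Table 1 (task suspend count + 1).
</p>

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 4   /* illustrative limit for this model only */

struct ktask;

struct kthr {
        struct ktask *task;
        int suscnt;             /* thread suspend counter */
};

struct ktask {
        int suscnt;             /* task suspend counter */
        struct kthr *threads[MAX_THREADS];
        size_t nthreads;
};

/* Attach a new thread; it starts suspended (task suspend count + 1). */
static int
kthr_attach(struct ktask *tk, struct kthr *th)
{
        if (tk->nthreads >= MAX_THREADS)
                return -1;
        th->task = tk;
        th->suscnt = tk->suscnt + 1;
        tk->threads[tk->nthreads++] = th;
        return 0;
}

/* A thread may run only when both counters are zero. */
static bool
kthr_runnable(const struct kthr *th)
{
        return th->suscnt == 0 && th->task->suscnt == 0;
}

static void
kthr_suspend(struct kthr *th)
{
        th->suscnt++;
}

static int
kthr_resume(struct kthr *th)
{
        if (th->suscnt == 0)
                return -1;      /* not suspended */
        th->suscnt--;
        return 0;
}

/* Suspending a task also increments the counter of each of its threads. */
static void
ktask_suspend(struct ktask *tk)
{
        tk->suscnt++;
        for (size_t i = 0; i < tk->nthreads; i++)
                tk->threads[i]->suscnt++;
}

static int
ktask_resume(struct ktask *tk)
{
        if (tk->suscnt == 0)
                return -1;      /* not suspended */
        tk->suscnt--;
        for (size_t i = 0; i < tk->nthreads; i++)
                tk->threads[i]->suscnt--;
        return 0;
}
```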
                    472: 
                    473: <h3>Kernel Task</h3>
                    474: <p>
                    475: The kernel task is a special task that contains only the idle thread
                    476: and the interrupt threads. It does not have any user mode memory.
                    477: </p>
                    478: 
                    479: <h2 id="sched">Scheduler</h2>
                    480: <h3>Thread Priority</h3>
                    481: <p>
                    482: The Prex scheduler is based on an algorithm known as a priority-based
                    483: multilevel queue. Each thread is assigned a priority between
                    484: 0 and 255. A lower number means a higher priority, as in BSD UNIX.
                    485: The scheduler maintains 256 run queues, one for each priority level.
                    486: The lowest priority (=255) is used only for the idle thread.
                    487: </p>
                    488: <p>
                    489: A thread has two different types of priority:
                    490: </p>
                    491: <ul>
                    492:   <li><b>Base priority:</b>
                    493:   A static priority that can be changed only by a user mode program.</li>
                    494:   <li><b>Current priority:</b>
                    495:   The actual scheduling priority.
                    496:   The kernel may adjust this priority dynamically when needed.</li>
                    497: </ul>
                    498: <p>
                    499: Although the base priority and the current priority have the same value in most
                    500: cases,
                    501: the kernel will sometimes change the current priority to avoid
                    502: "priority inversion".
                    503: </p>
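<p>
One common way to avoid priority inversion is priority inheritance, where
the owner of a contended resource temporarily runs at the priority of its
most urgent waiter. The sketch below shows how the two priority fields
could interact in that scheme. It is an assumption-laden illustration: the
names struct pthr, prio_inherit() and prio_reset() are hypothetical, and
the real kernel logic may differ.
</p>

```c
struct pthr {
        int baseprio;   /* base priority, set by the user mode program */
        int prio;       /* current priority, used for scheduling */
};

/*
 * Boost the owner to the waiter's priority if the waiter is more urgent.
 * A lower number means a higher priority.
 */
static void
prio_inherit(struct pthr *owner, const struct pthr *waiter)
{
        if (waiter->prio < owner->prio)
                owner->prio = waiter->prio;
}

/* Drop back to the base priority once the resource has been released. */
static void
prio_reset(struct pthr *th)
{
        th->prio = th->baseprio;
}
```

<p>
The base priority is never modified by the boost, so the original value
can always be restored.
</p>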
                    504: 
                    505: <h3>Thread State</h3>
                    506: <p>
                    507: Each thread has one of the following states.
                    508: </p>
1.1.1.1.2.1! nbrk      509: <p>
        !           510: <img alt="Thread States" src="img/thread.gif" border="1"
        !           511: style="width: 430px; height: 314px;"><br>
        !           512: 
        !           513: <i><b>Figure 2. Thread States</b></i>
        !           514: </p>
1.1       nbrk      515: 
                    516: <ul>
                    517:   <li><b>RUN</b>     :Running or ready to run</li>
                    518:   <li><b>SLEEP</b>   :Sleep for some event</li>
                    519:   <li><b>SUSPEND</b> :Suspend count is not 0</li>
                    520:   <li><b>EXIT</b>    :Terminated</li>
                    521: </ul>
                    522: <p>
                    523: Threads are always preemptible, even in kernel mode.
                    524: The following four events cause a thread switch:
                    525: </p>
                    526: 
                    527: <i><b>Table 4. Events to switch thread</b></i>
                    528: <table border="1" width="90%" cellspacing="0">
                    529: <tbody>
                    530: <tr>
                    531:   <th>Event</th>
                    532:   <th>Condition</th>
                    533:   <th>Run queue position</th>
                    534: </tr>
                    535: <tr>
                    536:   <td><b>Block</b></td>
                    537:   <td>Thread sleep or suspend</td>
                    538:   <td>Move to the tail of runq</td>
                    539: </tr>
                    540: <tr>
                    541:   <td><b>Preemption</b></td>
                    542:   <td>Higher priority thread becomes runnable</td>
                    543:   <td>Keep the head of runq</td>
                    544: </tr>
                    545: <tr>
                    546:   <td><b>Quantum Expiration</b></td>
                    547:   <td>The thread consumes its time quantum</td>
                    548:   <td>Move to the tail of runq</td>
                    549: </tr>
                    550: <tr>
                    551:   <td><b>Yield</b></td>
                    552:   <td>The thread releases CPU by itself</td>
                    553:   <td>Move to the tail of runq</td>
                    554: </tr>
                    555: 
                    556: </tbody>
                    557: </table>
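<p>
The priority-based multilevel queue from the Thread Priority section and
the queue positions in Table 4 can be sketched together. This is a minimal
model, not the Prex implementation: struct rthread, struct runq and the
runq_*() functions are names chosen for the illustration, and a linear
scan is used to find the highest-priority non-empty queue.
</p>

```c
#include <stddef.h>

#define NPRIO     256   /* priority levels 0..255 */
#define PRIO_IDLE 255   /* lowest priority, reserved for the idle thread */

struct rthread {
        int prio;               /* current priority, 0 is the highest */
        struct rthread *next;   /* link within one run queue */
};

struct runq {
        struct rthread *head;
        struct rthread *tail;
};

static struct runq runqs[NPRIO];        /* one FIFO queue per priority */

/* Block wake-up, quantum expiration, yield: go to the tail of the queue. */
static void
runq_enqueue(struct rthread *th)
{
        struct runq *q = &runqs[th->prio];

        th->next = NULL;
        if (q->tail != NULL)
                q->tail->next = th;
        else
                q->head = th;
        q->tail = th;
}

/* Preemption: the preempted thread keeps the head of its queue. */
static void
runq_insert_head(struct rthread *th)
{
        struct runq *q = &runqs[th->prio];

        th->next = q->head;
        q->head = th;
        if (q->tail == NULL)
                q->tail = th;
}

/* Pick the first thread of the highest-priority non-empty queue. */
static struct rthread *
runq_dequeue(void)
{
        for (int p = 0; p < NPRIO; p++) {
                struct rthread *th = runqs[p].head;

                if (th != NULL) {
                        runqs[p].head = th->next;
                        if (runqs[p].head == NULL)
                                runqs[p].tail = NULL;
                        th->next = NULL;
                        return th;
                }
        }
        return NULL;    /* nothing is runnable */
}
```

<p>
A preempted thread is put back with runq_insert_head() so that it resumes
before its peers, while a thread whose quantum expired is re-queued with
runq_enqueue() and waits behind threads of the same priority.
</p>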
                    558: 
1.1.1.1.2.1! nbrk      559: 
1.1       nbrk      560: <h3>Scheduling Policy</h3>
                    561: <p>
                    562: There are the following three types of scheduling policy.
                    563: </p>
                    564: <ul>
                    565:   <li><b>SCHED_FIFO</b>: First-in First-out</li>
                    566:   <li><b>SCHED_RR</b>: Round Robin (SCHED_FIFO + timeslice)</li>
                    567:   <li><b>SCHED_OTHER</b>: Not supported</li>
                    568: </ul>
                    569: <p>
                    570: In the early phase of Prex development, SCHED_OTHER was implemented as a traditional
                    571: BSD scheduler. Since this scheduler changes the thread priority dynamically,
                    572: it is unpredictable and does not suit a real-time system.
                    573: The SCHED_OTHER policy was later dropped from Prex to
                    574: focus on real-time platforms.
                    575: </p>
                    576: 
                    577: <h3>Scheduling Parameter</h3>
                    578: <p>
                    579: An application program can change the following scheduling parameters via
                    580: the kernel API.
                    581: </p>
                    582: <ul>
                    583:   <li>Thread Priority</li>
                    584:   <li>Scheduling Policy</li>
                    585:   <li>Time Quantum (only for SCHED_RR)</li>
                    586: </ul>
                    587: 
                    588: <h3>Scheduling Lock</h3>
                    589: <p>
                    590: Thread scheduling can be disabled by locking the scheduler.
                    591: This is used to synchronize thread execution and to protect
                    592: access to global resources.
                    593: Since interrupt handlers still run while the scheduler is locked,
                    594: the lock does not affect interrupt latency.
                    595: </p>
                    596: 
                    597: <h3>Named Event</h3>
                    598: <p>
                    599: A thread can sleep on a specific event and be woken up by it. The event works as
                    600: a queue of sleeping threads. Since each event has a name,
                    601: it is easy to tell which event a thread under debugging is waiting for.
                    602: </p>
                    603: 
                    604: 
                    605: <h2 id="memory">Memory Management</h2>
                    606: 
                    607: <h3>Physical Page Allocator</h3>
                    608: <p>The physical page allocator provides services for
                    609: page allocation, deallocation, and reservation.
                    610: It is the bottom layer on which the other memory managers are built.
                    611: </p>
                    612: <p>
                    613: <img alt="Memory Structure" src="img/memory.gif" border="1"
                    614: style="width: 448px; height: 308px;"><br>
                    615: 
1.1.1.1.2.1! nbrk      616: <i><b>Figure 3. Prex Memory Structure</b></i>
1.1       nbrk      617: </p>
                    618: <p>
                    619: The key point is that the Prex kernel does not page out to
                    620: any external disk device. This is an important design decision for
                    621: real-time performance and system simplicity.
                    622: 
                    623: </p>
                    624: 
                    625: <h3>Kernel Memory Allocator</h3>
                    626: <p>
                    627: The kernel memory allocator is optimized for systems with a small
                    628: memory footprint.
                    629: </p>
                    630: <p>
                    631: To allocate kernel memory, one page must be divided into
                    632: two or more blocks.
                    633: The following three linked lists manage the used and free blocks of kernel
                    634: memory.
                    635: <ol>
                    636:   <li>All pages allocated for the kernel memory are linked.</li>
                    637:   <li>All blocks divided in the same page are linked.</li>
                    638:   <li>All free blocks of the same size are linked.</li>
                    639: </ol>
                    640: <p>
                    641: Currently, the allocator cannot handle a request larger than one page.
                    642: Instead, a driver can use page_alloc() to allocate a large memory area.
                    643: <br>
                    644: If kernel code illegally writes data into non-allocated memory,
                    645: the system can easily crash. The kmem routines are called not only
                    646: from kernel code but also from various drivers. To
                    647: detect memory overruns, each free block has a tag with a magic ID.
                    648: </p>
                    649: <p>
                    650: The kernel maintains an array of block headers for the free blocks.
                    651: The array index is determined by the block size.
                    652: Every block size is a multiple of 16 bytes.
                    653: </p>
                    654: <pre>free_blks[0] = list for 16 byte block
                    655: free_blks[1] = list for 32 byte block
                    656: free_blks[2] = list for 48 byte block
                    657:      .
                    658:      .
                    659: free_blks[255] = list for 4096 byte block
                    660: </pre>
                    661: <p>
                    662: In a generic design, a first-fit algorithm searches a single list
                    663: for a free block.
                    664: The Prex kernel memory allocator instead uses multiple lists,
                    665: one for each block size.
                    666: A search starts from the list of the requested size, so
                    667: the lists of smaller blocks are never searched needlessly.
                    668: </p>
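<p>
The mapping from a request size to a free_blks[] index, and the search that starts
at the requested size, can be sketched as follows. This is a simplified model, not
the Prex implementation: free_count[] stands in for the real list heads, the
function names are hypothetical, and any per-block header overhead is ignored.
</p>

```c
#include <stddef.h>

#define ALIGN_SIZE  16      /* every block size is a multiple of 16 */
#define NR_LISTS    256     /* free_blks[0] .. free_blks[255] */

/* Stand-in for the list heads: number of free blocks on each list. */
static int free_count[NR_LISTS];

/* Round a request up to the next multiple of 16 bytes. */
size_t blk_round(size_t size)
{
    return (size + ALIGN_SIZE - 1) & ~(size_t)(ALIGN_SIZE - 1);
}

/* Map a rounded block size (16..4096) to its list index (0..255). */
int blk_index(size_t size)
{
    return (int)(size / ALIGN_SIZE) - 1;
}

/*
 * Find the first non-empty list that can satisfy the request.
 * The scan starts at the list for the requested size, so lists of
 * smaller blocks are never visited.
 * Returns the list index, or -1 if no block is large enough.
 */
int blk_find(size_t request)
{
    size_t size = blk_round(request);
    int i;

    if (size == 0 || size > (size_t)NR_LISTS * ALIGN_SIZE)
        return -1;
    for (i = blk_index(size); i < NR_LISTS; i++)
        if (free_count[i] > 0)
            return i;
    return -1;
}
```

<p>
For example, a 20 byte request is rounded up to 32 bytes, so the scan begins at
index 1 and skips the 16 byte list entirely.
</p>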
                    669: <p>
                    670: Most "buddy" based memory allocators
                    671: use <b>2^n</b> bytes as the block size.
                    672: However, this scheme wastes a lot of memory when
                    673: the requested size does not fit a block size well. So, it is not suitable for the
                    674: embedded systems that Prex targets.
                    675: </p>
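<p>
To make the difference concrete, the two helpers below compare rounding a request
up to a power of two with rounding it up to a multiple of 16. The function names
are illustrative only and do not appear in the Prex sources.
</p>

```c
#include <stddef.h>

/* Round up to the next power of two, as a 2^n allocator would. */
size_t round_pow2(size_t size)
{
    size_t n = 1;
    while (n < size)
        n <<= 1;
    return n;
}

/* Round up to the next multiple of 16 bytes. */
size_t round_16(size_t size)
{
    return (size + 15) & ~(size_t)15;
}
```

<p>
For a 1030 byte request, round_pow2() yields 2048 bytes (1018 bytes unused), while
round_16() yields 1040 bytes (10 bytes unused).
</p>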
                    676: 
                    677: <h3>Virtual Memory Manager</h3>
                    678: <p>
                    679: A task owns its private virtual address space. All threads
                    680: in the same task share one memory space.
                    681: When a new task is created, the address map of the parent task is
                    682: automatically copied.
                    683: At this time, read-only regions are not copied but are shared with the old map.
                    684: </p>
                    685: <p>
                    686: The kernel provides the following functions for VM:
                    687: </p>
                    688: <ul>
                    689:   <li>Allocate/deallocate memory region</li>
                    690:   <li>Change memory attribute (read/write/exec)</li>
                    691:   <li>Map another task's memory to current task</li>
                    692: </ul>
                    693: <p>
                    694: The VM allocator uses a traditional list-based algorithm.
                    695: </p>
                    696: <p>
                    697: The kernel task is a special task which holds the virtual memory mapping
                    698: for the kernel. All user mode tasks have the same kernel memory
                    699: image mapped from the kernel task. So, kernel threads can work in the
                    700: context of any user mode task without switching the memory map.
                    701: </p>
                    702: <p>
                    703: <img alt="Memory Mapping" src="img/memmap.gif" border="1"
                    704: style="width: 504px; height: 271px;"><br>
                    705: 
1.1.1.1.2.1! nbrk      706: <i><b>Figure 4. Kernel Memory Mapping</b></i>
1.1       nbrk      707: </p>
                    708: 
                    709: <p>
                    710: Since the Prex kernel does not page out to external storage,
                    711: allocated memory is guaranteed to stay contiguous
                    712: and resident. As a result, the kernel and drivers can be built
                    713: very simply.
                    714: </p>
                    715: <p>
                    716: <i>Note: A "copy-on-write" feature was supported by earlier Prex kernels.
                    717: It was dropped to improve real-time performance.</i>
                    718: </p>
                    719: 
                    720: 
                    721: <h2 id="ipc">IPC</h2>
                    722: <p>
                    723: The message passing model of Prex is very simple compared with other
                    724: modern microkernels. A Prex message is sent from thread to thread
                    725: through an "object".
                    726: The "object" in Prex is similar to the concept called a "port"
                    727: in other microkernels.
                    728: </p>
                    729: 
                    730: <h3>Object</h3>
                    731: <p>
                    732: An object represents a service, a state, a policy, and so on.
                    733: For object manipulation, the kernel provides three functions:
                    734: object_create(), object_delete(), and object_lookup().
                    735: A Prex task creates an object to publish its interface to other tasks.
                    736: For example, server tasks create objects such as "proc", "fs", and "exec" to
                    737: allow clients to access their services.
                    738: Client tasks then send request messages to these objects.
                    739: </p>
                    740: <p>
                    741: The actual object is stored in kernel space, where it is protected
                    742: from user mode code.
                    743: Objects are managed in a hash table keyed by their name strings.
                    744: Usually, an object has a unique name within the system. To send a
                    745: message to a specific object, a thread must first obtain the target object ID
                    746: by looking it up by name.
                    747: </p>
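<p>
A name-keyed hash table of this kind might look like the sketch below. The bucket
count, the hash function, the structure layout, and the obj_find()/obj_register()
names are all assumptions made for illustration; they are not the kernel's
object_lookup() or object_create(). Rejecting a duplicate name is also an assumption.
</p>

```c
#include <stddef.h>
#include <string.h>

#define OBJ_HASH_SIZE 16            /* hypothetical bucket count */

struct object {
    const char *name;
    struct object *next;            /* next object in the same bucket */
};

static struct object *obj_table[OBJ_HASH_SIZE];

/* Simple string hash; the real kernel's hash function may differ. */
static unsigned obj_hash(const char *name)
{
    unsigned h = 0;
    while (*name)
        h = h * 31 + (unsigned char)*name++;
    return h % OBJ_HASH_SIZE;
}

/* Look an object up by name; returns NULL if it is not registered. */
struct object *obj_find(const char *name)
{
    struct object *o;
    for (o = obj_table[obj_hash(name)]; o != NULL; o = o->next)
        if (strcmp(o->name, name) == 0)
            return o;
    return NULL;
}

/* Register an object; returns -1 if the name is already in use. */
int obj_register(struct object *o)
{
    unsigned h;
    if (obj_find(o->name) != NULL)
        return -1;
    h = obj_hash(o->name);
    o->next = obj_table[h];         /* push onto the bucket's chain */
    obj_table[h] = o;
    return 0;
}
```

<p>
A server would register "fs" once, and a client would then resolve the same name
to reach it.
</p>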
                    748: <p>
                    749: An object can also be created without a name. Such objects can be used as
                    750: private objects by threads in the same task.
                    751: </p>
                    752: 
                    753: <h3>Message</h3>
                    754: <p>
                    755: Each IPC message must contain the pre-defined message header.
                    756: The kernel stores the sender task's ID in the message header.
                    757: This mechanism ensures that the receiver task gets the exact task ID
                    758: of the sender task. Therefore, the receiver task can check the sender
                    759: task's capability for various secure services.
                    760: </p>
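<p>
The idea of a kernel-stamped sender ID can be sketched as follows. The header
layout, its placement at the start of the message, the field names, and the helper
functions are assumptions made for illustration; the actual Prex header may differ.
</p>

```c
typedef int task_id;                /* hypothetical task identifier type */

/* Hypothetical header layout, assumed here to lead every message. */
struct msg_header {
    task_id task;                   /* sender, filled in by the kernel */
    int code;                       /* request code chosen by the sender */
    int status;                     /* reply status */
};

/* Example request: header first, then the payload. */
struct open_msg {
    struct msg_header hdr;
    char path[32];
};

/*
 * Kernel side of a send: overwrite the task field with the real
 * sender, whatever value the sender wrote there.
 */
void stamp_sender(struct msg_header *hdr, task_id real_sender)
{
    hdr->task = real_sender;
}

/* Receiver-side check: only a trusted task may use this service. */
int sender_allowed(const struct msg_header *hdr, task_id trusted)
{
    return hdr->task == trusted;
}
```

<p>
Because the kernel overwrites the field, a sender cannot impersonate another task
by writing a forged ID into its own message.
</p>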
                    761: <p>
                    762: The sender and the receiver must agree on the message format
                    763: in advance.
                    764: </p>
                    765: <p>
                    766: Messages are sent to a specific object using msg_send().
                    767: The transmission of a message is always synchronous. This means that
                    768: the thread which sent the message is blocked until it receives
                    769: a response from another thread. msg_receive() performs reception
                    770: of a message. msg_receive() also blocks when no message has
                    771: reached the target object. The receiver thread must answer the
                    772: message using msg_reply() after it finishes processing.
                    773: </p>
                    774: <p>
                    775: The receiver thread cannot receive another message until it
                    776: replies to the sender. In short, a thread can receive only one
                    777: message at a time. Once the thread has received a message, it can send
                    778: another message to a different object. This mechanism allows threads
                    779: to redirect the sender's request to another thread.
                    780: </p>
                    781: <p>
                    782: A thread can receive a message only from an object which was
                    783: created by itself or by a thread in the same task.
                    784: If no message has arrived, it
                    785: blocks until a message comes in. The following figure shows the IPC transmit
                    786: sequence of Prex.
                    787: </p>
                    788: 
                    789: <img alt="ipc queue" src="img/msg.gif" border="1"
                    790: style="width: 505px; height: 347px;"><br>
                    791: 
1.1.1.1.2.1! nbrk      792: <i><b>Figure 5. IPC Transmit Sequence</b></i>
1.1       nbrk      793: 
                    794: <h3>Message Transfer</h3>
                    795: <p>
                    796: A message is copied directly from task to task without kernel
                    797: buffering. The memory region of the sent message is
                    798: automatically mapped into the receiver's memory within the kernel.
                    799: This mechanism reduces the number of copies during message
                    800: transfer.
                    801: Since Prex never pages memory out, the kernel can
                    802: copy the message data via physical memory at any time.
                    803: </p>
                    804: <img alt="Message transfer" src="img/ipcmap.gif" border="1"
                    805: style="width: 459px; height: 321px;"><br>
                    806: 
1.1.1.1.2.1! nbrk      807: <i><b>Figure 6. IPC message transfer</b></i>
1.1       nbrk      808: 
                    809: <h2 id="except">Exception Handling</h2>
                    810: <p>
                    811: A user mode task can specify its own exception handler with exception_setup().
                    812: There are two different types of exception.
                    813: </p>
                    814: <ul>
                    815:   <li><b>H/W exception</b>:
                    816:   This type of exception is caused by a H/W trap or fault. The exception
                    817:   is sent to the thread which caused the trap.
                    818:   If the task has not specified an exception handler, the task is
                    819:   terminated by the kernel.</li>
                    820: 
                    821:   <li><b>S/W exception</b>:
                    822:   A user mode task can send a S/W exception to another task with exception_raise().
                    823:   The exception
                    824:   is sent to a thread that is sleeping in exception_wait().
                    825:   If no thread is waiting for the exception, it is sent
                    826:   to the first thread in the target task.</li>
                    827: </ul>
                    828: <p>
                    829: The kernel supports 32 types of exception.
                    830: The following pre-defined exceptions are raised by the kernel itself.
                    831: </p>
                    832: 
                    833: <i><b>Table 5. Kernel exceptions</b></i>
                    834: <table border="1" width="80%" cellspacing="0">
                    835: <tbody>
                    836: <tr>
                    837:   <th>Exception</th>
                    838:   <th>Type</th>
                    839:   <th>Reason</th>
                    840: </tr>
                    841: <tr>
                    842:   <td>SIGILL</td>
                    843:   <td align="center">H/W</td>
                    844:   <td>Illegal instruction</td>
                    845: </tr>
                    846: <tr>
                    847:   <td>SIGTRAP</td>
                    848:   <td align="center">H/W</td>
                    849:   <td>Breakpoint</td>
                    850: </tr>
                    851: <tr>
                    852:   <td>SIGFPE</td>
                    853:   <td align="center">H/W</td>
                    854:   <td>Math error</td>
                    855: </tr>
                    856: <tr>
                    857:   <td>SIGSEGV</td>
                    858:   <td align="center">H/W</td>
                    859:   <td>Invalid memory access</td>
                    860: </tr>
                    861: <tr>
                    862:   <td>SIGALRM</td>
                    863:   <td align="center">S/W</td>
                    864:   <td>Alarm event</td>
                    865: </tr>
                    866: </tbody>
                    867: </table>
                    868: 
                    869: <p>
                    870: The POSIX emulation library sets up its own exception handler to convert
                    871: Prex exceptions into UNIX signals. It maintains its own signal mask
                    872: and transfers control to the actual POSIX signal handler that is
                    873: defined by the user mode process.
                    874: 
                    875: </p>
                    876: 
                    877: <h2 id="int"> Interrupt Framework</h2>
                    878: <p>
                    879: Prex defines two different types of interrupt service to
                    880: optimize the response time of real-time operation.
                    881: </p>
                    882: 
                    883: <h3>Interrupt Service Routine (ISR)</h3>
                    884: <p>
                    885: An ISR is started by an actual hardware interrupt. While it runs, the
                    886: associated interrupt is disabled in the ICU and CPU interrupts are
                    887: enabled. If the ISR determines that its device generated
                    888: the interrupt, it must program the device to stop the interrupt.
                    889: Then, the ISR should do the minimum I/O operation and return control
                    890: as quickly as possible.
                    891: An ISR runs within the context of the thread that was running at
                    892: interrupt time. So, only a few kernel services are available within
                    893: an ISR. The IRQ_ASSERT() macro can be used to detect an invalid
                    894: function call from an ISR.
                    895: </p>
                    896: 
                    897: <h3>Interrupt Service Thread (IST)</h3>
                    898: <p>
                    899: An IST is automatically activated if the ISR returns INT_CONTINUE to the kernel. It
                    900: is called once the system is in a safer state than during the ISR.
                    901: Any interrupt-driven I/O operation should be done in the IST, not the ISR.
                    902: Since the ISR for the same IRQ may run during the IST, access to shared data,
                    903: resources, and device registers must be synchronized by using
                    904: irq_lock().
                    905: An IST does not have to be reentrant, since it is not interrupted
                    906: by the same IST.
                    907: </p>
                    908: 
                    909: <h3>Interrupt Nesting &amp; Priority</h3>
                    910: <p>
                    911: Each ISR has a logical priority level, with 0 being the highest
                    912: priority. While one ISR is running, all lower priority interrupts
                    913: are masked off.
                    914: This interrupt nesting mechanism avoids delaying
                    915: high priority interrupt events.
                    916: </p>
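<p>
One way to derive the set of interrupts to mask while an ISR runs is sketched
below. The number of IRQ lines, the irq_prio[] table, and the isr_mask() function
are hypothetical. Masking the running ISR's own line follows the ISR description
above; how equal-priority lines are treated is not stated in this document, and
this sketch leaves them unmasked.
</p>

```c
#include <stdint.h>

#define NR_IRQS 16                  /* hypothetical number of IRQ lines */

/* Logical priority of each IRQ line; 0 is the highest priority. */
static int irq_prio[NR_IRQS];

/*
 * Build the mask to apply while the ISR for 'irq' runs: bit n set
 * means IRQ n is masked. The line itself and every line with a
 * lower priority (a larger number) are masked.
 */
uint32_t isr_mask(int irq)
{
    uint32_t mask = (uint32_t)1 << irq;
    int n;

    for (n = 0; n < NR_IRQS; n++)
        if (irq_prio[n] > irq_prio[irq])
            mask |= (uint32_t)1 << n;
    return mask;
}
```

<p>
Lines that remain unmasked can still interrupt the running ISR, which is what
keeps high priority events from being delayed.
</p>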
                    917: <p>
                    918: An IST is executed as a normal thread dispatched by the scheduler. So,
                    919: the interrupt thread with the higher priority is executed first.
                    920: The driver writer can specify the thread priority of an IST when
                    921: it is attached to a specific interrupt line.
                    922: The important point is that even a user mode task can run
                    923: ahead of an interrupt thread.
                    924: </p>
                    925: <p>
                    926: The following figure shows an example of Prex interrupt processing.
                    927: </p>
                    928: <p>
                    929: <img alt="Interrupt Processing" src="img/irq.gif" border="1"><br>
1.1.1.1.2.1! nbrk      930: <i><b>Figure 7. Prex Interrupt Processing</b></i>
1.1       nbrk      931: </p>
                    934: 
                    935: <h3>Interrupt Locking</h3>
                    936: <p>
                    937: irq_lock() and irq_unlock() are used to disable all interrupts in
                    938: order to synchronize access to a kernel or H/W resource. Since irq_lock()
                    939: increments a lock counter, irq_unlock() automatically restores
                    940: the original interrupt state when the lock count drops to 0.
                    941: So the caller does not have to save the previous interrupt state.
                    942: </p>
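<p>
The counter-based behavior can be modeled in a few lines. This is a user-space
simulation, not the kernel code: cpu_int_enabled stands in for the CPU interrupt
flag, and the *_sketch names are used to avoid implying that this is the real
irq_lock() or irq_unlock().
</p>

```c
/* Simulated CPU interrupt-enable flag (1 = enabled). Hypothetical. */
static int cpu_int_enabled = 1;

static int lock_count;      /* nesting depth of the lock */
static int saved_state;     /* interrupt state before the first lock */

void irq_lock_sketch(void)
{
    int state = cpu_int_enabled;

    cpu_int_enabled = 0;            /* disable interrupts first */
    if (lock_count++ == 0)
        saved_state = state;        /* remember the outermost state */
}

void irq_unlock_sketch(void)
{
    /* Only the outermost unlock restores the saved state. */
    if (lock_count > 0 && --lock_count == 0)
        cpu_int_enabled = saved_state;
}
```

<p>
Nested lock/unlock pairs therefore leave interrupts disabled until the outermost
unlock, and a caller that was already running with interrupts disabled stays that way.
</p>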
                    943: 
                    944: <h3>Interrupt Stack</h3>
                    945: <p>
                    946: If each ISR used the kernel stack of the currently running thread, the
                    947: stack could overflow when interrupts occur in succession within the
                    948: same thread. So the kernel switches to a dedicated stack
                    949: while an ISR is running.
                    950: </p>
                    951: 
                    952: 
                    953: <h2 id="timer">Timer</h2>
                    954: 
                    955: <h3>Kernel Timers</h3>
                    956: 
                    957: <p>
                    958: The kernel timer provides the following features.
                    959: </p>
                    960: <ul>
                    961:   <li><b>Sleep timer</b>:      Put the caller thread to sleep for the specified time.</li>
                    962:   <li><b>Call back timer</b>:  Call the routine after the specified time passes.</li>
                    963:   <li><b>Periodic timer</b>:   Call the routine at the specified interval.</li>
                    964: </ul>
                    965: 
                    966: <h3>Timer Jitter</h3>
                    967: 
                    968: <p>
                    969: The periodic timer is designed to minimize the deviation between desired and
                    970: actual expiration.
                    971: </p>
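<p>
This document does not describe how the deviation is minimized. One common
technique, shown here purely as an assumption, is to compute each new deadline from
the previous deadline rather than from the time the callback actually ran. The
types and names below are hypothetical, and tick counter wrap-around is not handled.
</p>

```c
typedef unsigned long tick_t;

struct ptimer {
    tick_t expire;      /* tick at which the timer should fire next */
    tick_t interval;    /* period in ticks; must be greater than 0 */
};

/*
 * Re-arm a periodic timer after it has fired at tick 'now'.
 * The next deadline is derived from the previous deadline, not from
 * 'now', so a late callback does not shift later expirations.
 * If whole periods were missed, skip ahead to the next future one.
 */
void ptimer_rearm(struct ptimer *t, tick_t now)
{
    do {
        t->expire += t->interval;
    } while (t->expire <= now);
}
```

<p>
With a 10 tick period and a deadline of 100, a callback that runs late at tick 103
is still re-armed for tick 110, not 113.
</p>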
                    972: 
                    973: <h2 id="device">Device I/O Service</h2>
                    974: 
                    975: The Prex device driver module is separate from the kernel, and it
                    976: is linked with the kernel at boot time.
                    977: The kernel provides only simple, minimal services to support
                    978: communication between applications and drivers.
                    979: 
                    980: <h3>Device Object</h3>
                    981: <p>
                    982: Since the Prex kernel does not include a file system, the kernel
                    983: provides a device object service as its I/O interface.
                    984: A device object is created by a device driver to communicate with
                    985: applications. Usually, the driver creates a device object for an existing
                    986: physical device, but it can also be used to handle logical or virtual devices.
                    987: </p>
                    988: 
                    989: <h3>Driver Interface</h3>
                    990: <p>
                    991: The interface between the kernel and drivers is clearly defined as the
                    992: "Driver Kernel Interface". The kernel provides the following services
                    993: for device drivers.
                    994: </p>
                    995: <ul>
                    996:   <li>Device object service</li>
                    997:   <li>Kernel memory allocation</li>
                    998:   <li>Physical page allocation</li>
                    999:   <li>Interrupt handling service</li>
                   1000:   <li>Scheduler service</li>
                   1001:   <li>Timer service</li>
                   1002:   <li>Debug service</li>
                   1003: </ul>
                   1004: 
                   1005: <h3>Application Interface</h3>
                   1006: <p>
                   1007: The kernel device I/O interface is provided to access a specific device
                   1008: object which is handled by a driver.
                   1009: The Prex kernel provides the following five functions for applications.
                   1010: </p>
                   1011: <ul>
                   1012:  <li>Open a device</li>
                   1013:  <li>Close a device</li>
                   1014:  <li>Read from a device</li>
                   1015:  <li>Write to a device</li>
                   1016:  <li>Device I/O control</li>
                   1017: </ul>
                   1018: 
                   1019: <h2 id="mutex">Mutex</h2>
                   1020: <h3>Priority Inheritance</h3>
                   1021: <p>
                   1022: The thread priority is automatically changed under any of the following conditions.
                   1023: </p>
                   1024: <ol>
                   1025:   <li>
                   1026:   When the current thread fails to lock a mutex and the mutex
                   1027:   owner has a lower priority than the current thread, the priority
                   1028:   of the mutex owner is boosted to the current thread's priority.
                   1029:   If the mutex owner is itself waiting for another mutex, the related
                   1030:   mutexes are processed in the same way.
                   1031:   </li>
                   1032:   <li>
                   1033:   When the current thread unlocks a mutex and its priority
                   1034:   has been boosted, the kernel recomputes the current priority.
                   1035:   In this case, the priority is set to the highest
                   1036:   priority among the threads waiting for the mutexes locked by the
                   1037:   current thread.
                   1038:   </li>
                   1039:   <li>
                   1040:    When the priority is changed by a user request, the priority of the
                   1041:    related (inherited) thread is also changed.
                   1042:   </li>
                   1043: </ol>
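<p>
The first condition, including the chain of related mutexes, can be sketched as
below. The structures and the prio_inherit() name are hypothetical, and the sketch
assumes that a smaller number means a higher thread priority, which this document
does not state for threads.
</p>

```c
#include <stddef.h>

struct mutex;

struct thread {
    int prio;                   /* current priority; smaller = higher (assumed) */
    struct mutex *wait_mutex;   /* mutex this thread is blocked on, or NULL */
};

struct mutex {
    struct thread *owner;       /* current holder, or NULL */
};

/*
 * Called when 'waiter' fails to lock 'm'. Boost the owner to the
 * waiter's priority if the owner's priority is lower, then follow
 * the chain if that owner is itself waiting for another mutex.
 */
void prio_inherit(struct thread *waiter, struct mutex *m)
{
    int prio = waiter->prio;

    waiter->wait_mutex = m;
    while (m != NULL && m->owner != NULL) {
        struct thread *owner = m->owner;

        if (owner->prio <= prio)
            break;              /* owner already runs at least as high */
        owner->prio = prio;     /* boost the owner */
        m = owner->wait_mutex;  /* continue along the wait chain */
    }
}
```

<p>
The loop stops as soon as it meets an owner that already has an equal or higher
priority, so it also terminates if the wait chain forms a cycle.
</p>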
                   1044: <p>
                   1045: Priority inheritance with the Prex mutex has the following
                   1046: limitations.
                   1047: </p>
                   1048: <ol>
                   1049:   <li>
                   1050:   If the priority is changed by a user request, the priority
                   1051:   recomputation is done only when the new priority is higher
                   1052:   than the old priority. The inherited priority is reset to the base
                   1053:   priority when the mutex is unlocked.
                   1054:   </li>
                   1055:   <li>
                   1056:   Even if a thread is killed while waiting for a mutex, the related
                   1057:   priority is not adjusted.
                   1058:   </li>
                   1059: </ol>
                   1060: <h2 id="debug">Debug</h2>
                   1061: The following debugging support functions are available:
                   1062: <ul>
1.1.1.1.2.1! nbrk     1063:   <li>printf(): Display a debug message from the kernel.</li>
1.1       nbrk     1064:   <li>panic(): Dump the processor registers and stop the system.</li>
                   1065:   <li>ASSERT(): If the expression is false (zero), stop the system and display information.</li>
                   1066:   <li>trace_on(): If the kernel trace is enabled, every function entry and exit
                   1067:   is logged.</li>
                   1068: </ul>
                   1069:     </tr>
                   1070:     <tr>
                   1071:       <td id="footer" colspan="2" style="vertical-align: top;">
                   1072:         <a href="http://sourceforge.net">
                   1073:         <img src="http://sourceforge.net/sflogo.php?group_id=132028&amp;type=1"
                   1074:         alt="SourceForge.net Logo" border="0" height="31" width="88"></a><br>
                   1075:         Copyright&copy; 2005-2007 Kohsuke Ohtani
                   1076:       </td>
                   1077:     </tr>
                   1078: 
                   1079:   </tbody>
                   1080: </table>
                   1081: 
                   1082: </div>
                   1083: <div id="bottom"></div>
                   1084: 
                   1085: </body>
                   1086: </html>
