Annotation of prex-old/doc/html/doc/kernel.html, Revision 1.1.1.1
1.1 nbrk 1: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
2: <html>
3: <head>
4: <title>Prex Kernel Internals</title>
5: <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
6: <meta name="keywords" content="Prex, embedded, real-time, operating system, RTOS, open source, free">
7: <meta name="author" content="Kohsuke Ohtani">
8: <link rel="stylesheet" type="text/css" href="../default.css" media="screen">
9: <link rel="stylesheet" type="text/css" href="../print.css" media="print">
10: </head>
11: <body>
12: <div id="top">
13: </div>
14: <div id="middle">
15:
16: <table id="content" cellpadding="0" cellspacing="0">
17: <tbody>
18:
19: <tr>
20: <td id="header" colspan="2" valign="top">
21: <table width="100%" border="0" cellspacing="0" cellpadding="0">
22: <tr>
23: <td id="logo">
24: <a href="http://prex.sourceforge.net/">
25: <img alt="Prex logo" src="../img/logo.gif" border="0"
26: style="width: 281px; height: 56px;"></a>
27: </td>
28: <td id="brief" align="right" valign="bottom">
29: An Open Source, Royalty-free,<br>
30: Real-time Operating System
31: </td>
32: </tr>
33: </table>
34: </td>
35: </tr>
36:
37: <tr>
38: <td id="directory" style="vertical-align: top;">
39: <a href="http://prex.sourceforge.net/">Prex Home</a> >
40: <a href="index.html">Document Index</a> >
41: Kernel Internals
42: </tr>
43: <tr><td class="pad" colspan="2" style="vertical-align: top;"></td></tr>
44:
45: <tr>
46: <td id="main" style="vertical-align: top;">
47: <h1>Prex Kernel Internals</h1>
48:
49: <i>Version 1.4, 2005/12/31</i>
50:
51: <h3>Table of Contents</h3>
52: <ul>
53: <li><a href="#over">Kernel Overview</a></li>
54: <li><a href="#design">Design Policy</a></li>
55: <li><a href="#thread">Thread</a></li>
56: <li><a href="#task">Task</a></li>
57: <li><a href="#sched">Scheduler</a></li>
58: <li><a href="#memory">Memory Management</a></li>
59: <li><a href="#ipc">IPC</a></li>
60: <li><a href="#except">Exception Handling</a></li>
61: <li><a href="#int">Interrupt Framework</a></li>
62: <li><a href="#timer">Timer</a></li>
63: <li><a href="#device">Device I/O Service</a></li>
64: <li><a href="#mutex">Mutex</a></li>
65: <li><a href="#debug">Debug</a></li>
66: </ul>
67: <br>
68:
69: <h2 id="over">Kernel Overview</h2>
70:
71: <h3>Kernel Structure</h3>
72: <p>
73: The following figure illustrates the Prex kernel structure.
74: </p>
75: <p>
76: <img alt="Kernel Components" src="img/kernel.gif" border="1"
77: style="width: 602px; height: 414px;"><br>
78:
79: <i><b>Figure 1. Prex kernel Structure</b></i>
80: </p>
81: <p>
82: A kernel object belongs in one of the following groups.
83: </p>
84: <ul>
85: <li><b>kern</b>: kernel core components</li>
86: <li><b>mem</b>: memory managers</li>
87: <li><b>ipc</b>: inter process communication (*)</li>
88: <li><b>sync</b>: synchronize objects</li>
89: <li><b>arch</b>: architecture dependent components</li>
90: </ul>
91: <p>
92: <i>*) Since all messages in Prex are transferred among threads, the name of
93: "IPC" is not appropriate.
94: However, "IPC" is still used as a general term of the message transfer
95: via the kernel, in Prex.</i>
96: </p>
97:
98: <h3>Naming Convention</h3>
99: <p>
100: The name of "group/object" in figure 1 is mapped to "directory/file" in the
101: Prex source tree. For example, the thread related functions are located
102: in "kern/thread.c", and the functions for semaphore are placed
103: in "sync/sem.c".
104: </p>
105: <p>
106: In addition, there is a standard naming convention about kernel
107: routines. The method named <i>bar</i> for the object named <i>foo</i>
108: should be named "foo_bar". For example, the routine to create a new
109: thread is named "thread_create", and locking mutex will be "mutex_lock".
110: This rule is not applied to the name of the local function.
111: </p>
112:
113:
114: <h2 id="design">Design Policy</h2>
115: <p>
116: The Prex kernel focuses the following points to be designed.
117: </p>
118: <ul>
119: <li>Portability</li>
120: <li>Scalability</li>
121: <li>Reliability</li>
122: <li>Interoperability</li>
123: <li>Maintainability</li>
124: </ul>
125:
126: <h3>Portability</h3>
127: <p>
128: The Prex kernel is divided into two different layers -
129: a common kernel layer and an architecture dependent layer.
130:
131: Any routine in the common kernel layer must not access to the H/W by itself.
132: Instead, it must request it to the architecture dependent layer.
133: The interface to the architecture dependent layer is strictly defined
134: by the Prex kernel. This interface is designed carefully to support various
135: different architecture with minimum code change.
136: So, it is easy to port the Prex kernel to different architecture.
137: </p>
138: <p>
139: The following functions must be provided by the architecture dependent layer.
140: </p>
141: <ul>
142: <li><b>CPU</b>: initializes processor registers before kernel boot</li>
143: <li><b>Context</b>: abstracts processor and hardware context</li>
144: <li><b>MMU</b>: abstracts memory management unit (*)</li>
145: <li><b>Trap</b>: abstracts processor trap</li>
146: <li><b>Interrupt</b>: abstracts the interrupt control unit</li>
147: <li><b>Clock</b>: abstracts clock timer unit</li>
148: <li><b>Misc.</b>: abstracts system reset, idle power state</li>
149: </ul>
150: <p>
151: <i>*) In case of no-MMU system, MMU related routines will be
152: defined as no-operation routine.
153: So, the kernel common layer can not assume MMU is always available.</i>
154: </p>
155:
156: <h3>Scalability</h3>
157: <p>
158: In order to obtain higher scalability, the kernel does not limit the maximum
159: number of the kernel objects to create.
160: So, the resource for all kernel objects are allocated dynamically after
161: system boot.
162: This can keep the memory prerequisite smaller than the static
163: resource allocation.
164: This means that the kernel can create task, thread, object, device,
165: event, mutex, timer as many as usable memory remains.
166: </p>
167: <p>
168: The kernel supports both of MMU and No-MMU systems. So, most of the kernel
169: components and VM sub-system are designed carefully to work without MMU.
170: </p>
171:
172: <h3>Reliability</h3>
173: <p>
174: When the remaining memory is exhausted, what should OS do?
175: If the system can stop with panic() there, the error checks of many
176: portions in the kernel are not necessary.
177: But obviously, this is not allowed with the reliable system.
178: Even if the memory is exhausted,
179: a kernel must continue processing.
180: So, all kernel code is checking the error status returned
181: by the memory allocation routine.
182: </p>
183: <p>
184: In addition, the kernel must not crush anytime even if any invalid parameter
185: is passed via kernel API. Basically, the Prex kernel code is written with
186: "garbage in, error out" principle.
187: The Prex kernel never stops even if any malicious
188: program is loaded.
189: </p>
190:
191: <h3>Interoperability</h3>
192: <p>
193: Although the Prex kernel was written from scratch, its applications
194: will be brought from the other operating systems like BSD. So, the system
195: call interface is designed with consideration to support general OS
196: API like POSIX or APIs for generic RTOS.
197: </p>
198: <p>
199: The error code for the Prex system call is defined as the same name
200: with POSIX. For example, EINVAL for "Invalid argument", or ENOMEM for
201: "Out of memory". So, peoples do not have to study new error codes if
202: they already have skills about POSIX programming.
203: This is important point to write applications and to read
204: the kernel code because study of a new error scheme will cause pain
205: for developers.
206:
207: In addition, it simplify the POSIX emulation library because it
208: does not have to remap the error code.
209: </p>
210:
211: <h3>Maintainability</h3>
212: <p>
213: All kernel codes are kept clean and simple for the maintenance.
214: All codes are well-commented and consistent.
215: It is easy to add or remove new system call into the kernel.
216: The kernel has the debugging facility like the function trace or
217: the dump of the kernel objects.
218: </p>
219:
220:
221: <h2 id="thread">Thread</h2>
222:
223: <h3>Thread Control Block</h3>
224: <p>
225: The thread control block includes data for
226: owner task, scheduler, timer, IPC, exception, mutex, and context.
227: The following thread structure is most important definition
228: in the kernel codes.
229: </p>
230: <pre>
231: struct thread {
232: int magic; /* magic number */
233: task_t task; /* pointer to owner task */
234: struct list task_link; /* link for threads in same task */
235: struct queue link; /* linkage on scheduling queue */
236: int state; /* thread state */
237: int policy; /* scheduling policy */
238: int prio; /* current priority */
239: int base_prio; /* base priority */
240: int ticks_left; /* remaining ticks to run */
241: u_int total_ticks; /* total running ticks */
242: int need_resched; /* true if rescheduling is needed */
243: int lock_count; /* schedule lock counter */
244: int suspend_count; /* suspend counter */
245: struct event *sleep_event; /* sleep event */
246: int sleep_result; /* sleep result */
247: struct timer timeout; /* thread timer */
248: struct timer *periodic; /* pointer to periodic timer */
249: struct queue ipc_link; /* linkage on IPC queue */
250: void *msg_addr; /* kernel address of IPC message */
251: size_t msg_size; /* size of IPC message */
252: struct thread *sender; /* thread that sends IPC message */
253: struct thread *receiver; /* thread that receives IPC message */
254: object_t send_obj; /* IPC object sending to */
255: object_t recv_obj; /* IPC object receiving from */
256: uint32_t exc_bitmap; /* bitmap of pending exceptions */
257: struct list mutexes; /* mutexes locked by this thread */
258: struct mutex *wait_mutex; /* mutex pointer currently waiting */
259: void *kstack; /* base address of kernel stack */
260: struct context context; /* machine specific context */
261: };</pre>
262:
263: <h3>Thread Creation</h3>
264: New thread can be created by thread_create().
265: The initial states of newly created thread are as follow:
266: <p>
267: </p>
268:
269: <i><b>Table 1. Initial thread state</b></i>
270: <table border="1" width="60%" cellspacing="0">
271: <tbody>
272: <tr>
273: <th>Data type</th>
274: <th>Initial state</th>
275: </tr>
276: <tr>
277: <td>Owner Task</td>
278: <td>Inherit from parent thread</td>
279: </tr>
280: <tr>
281: <td>Thread state</td>
282: <td>Suspended</td>
283: </tr>
284: <tr>
285: <td>Suspend count</td>
286: <td>Task suspend count + 1</td>
287: </tr>
288: <tr>
289: <td>Scheduling policy</td>
290: <td>Round Robin</td>
291: </tr>
292: <tr>
293: <td>Scheduling Priority</td>
294: <td>Default (= 200)</td>
295: </tr>
296: <tr>
297: <td>Time quantum</td>
298: <td>Default (= 50 msec)</td>
299: </tr>
300:
301: <tr>
302: <td>Processor registers</td>
303: <td>Default value</td>
304: </tr>
305:
306: </tbody>
307: </table>
308:
309: <p>
310: Since new thread is initially set to the suspended state, thread_resume()
311: must be called to start it.
312: </p>
313:
314: <p>
315: Creating a thread and loading its register state are isolated
316: in different routines. These two routines are used by fork(), exec(),
317: and pthread_create() in the POSIX emulation library.
318: </p>
319:
320: <i><b>Table 2. Usage of thread_create()/thread_load()</b></i>
321: <table border="1" width="80%" cellspacing="0">
322: <tbody>
323: <tr>
324: <th>Library routine</th>
325: <th>thread_create()</th>
326: <th>thread_load()</th>
327: </tr>
328: <tr>
329: <td>fork()</td>
330: <td align="center">O</td>
331: <td align="center">X</td>
332: </tr>
333: <tr>
334: <td>exec()</td>
335: <td align="center">X</td>
336: <td align="center">O</td>
337: </tr>
338: <tr>
339: <td>pthread_create()</td>
340: <td align="center">O</td>
341: <td align="center">O</td>
342: </tr>
343: </tbody>
344: </table>
345:
346: <h3>Thread Termination</h3>
347: <p>
348: The kernel will usually release all resources owned by the terminated thread.
349: But, there are some complicated process to release the resources.
350: The priority adjustment may be required if the thread inherits its
351: priority.
352: </p>
353: <p>
354: If the thread is terminated with mutex locking, all threads waiting for
355: that mutex does sleep forever. So, the mutex held by the terminated thread
356: must be unlocked, or change its mutex owner if some thread is waiting for.
357:
358: </p>
359: <p>
360: In general, there is a known issue about the thread termination.
361: If the termination target is current thread, the kernel can not release
362: the context of the current thread because the
363: thread switching always requires the current context.
364: There are the following 3 solutions for this.
365: </p>
366: <ol>
367: <li>Create "clean up thread" to terminate thread</li>
368: <li>Add condition check in thread switching code</li>
369: <li>Defer termination in next termination request</li>
370: </ol>
371: The Prex kernel is using #3.
372:
373: <h3>Thread Suspension</h3>
374: <p>
375: Each thread can be set to the suspended state by using thread_suspend()
376: interface.
377: Although a thread can be suspended any number of times,
378: it does not start to run unless it is resumed by the same
379: number of suspend.
380: </p>
381:
382: <h3>Kernel Thread</h3>
383: <p>
384: A kernel thread is always executed in kernel mode, and it does not have user
385: mode context.
386: The scheduling policy is set to SCHED_FIFO by default.
387: </p>
388: <p>
389: Currently, the following kernel threads are running in kernel mode.
390: </p>
391: <ul>
392: <li>Interrupt Service Threads</li>
393: <li>Timer Thread</li>
394: <li>Idle Thread</li>
395: <li>DPC Thread</li>
396: </ul>
397:
398: <h3>Idle Thread</h3>
399: <p>
400: An idle thread runs when no other thread is active.
401: It has the role of cutting down the power consumption of a system.
402: An idle thread has FIFO scheduling policy, and it does not
403: have time quantum.
404: The lowest scheduling priority (=255) is reserved for an idle thread.
405: </p>
406:
407: <h2 id="task">Task</h2>
408:
409: <h3>Task Creation</h3>
410: <p>
411: The task can be created by using task_create().
412: New child task will have the same memory image with the parent task.
413: Especially text region and read-only region are physically
414: shared among them.
415: The parent task receives the new task ID of child task from task_create(), but
416: child task will receive 0 as task ID.
417: </p>
418: <p>
419: The initial task states are as follow:
420: </p>
421:
422: <i><b>Table 3. Initial task state</b></i>
423: <table border="1" width="60%" cellspacing="0">
424: <tbody>
425: <tr>
426: <th>Data type</th>
427: <th>Inherit from parent task?</th>
428: </tr>
429:
430: <tr>
431: <td>Object List</td>
432: <td align="center">No</td>
433: </tr>
434:
435: <tr>
436: <td>Threads</td>
437: <td align="center">No</td>
438: </tr>
439:
440: <tr>
441: <td>Memory Map</td>
442: <td align="center">Yes</td>
443: </tr>
444:
445: <tr>
446: <td>Suspend Count</td>
447: <td align="center">No</td>
448: </tr>
449:
450: <tr>
451: <td>Exception Handler</td>
452: <td align="center">Yes</td>
453: </tr>
454:
455: </tbody>
456: </table>
457:
458: <p>
459: If the parent task is specified as NULL for task_create(),
460: all child state are initialized to default.
461: This is used in exec() emulation.
462: </p>
463:
464: <h3>Task Suspension</h3>
465: <p>
466: When the task is set to suspend state, the thread suspend count of all threads
467: in the task is also incremented.
468: A thread can start to run only when both of the thread suspend count
469: and the task suspend count becomes 0.
470: </p>
471:
472: <h3>Kernel Task</h3>
473: <p>
474: The kernel task is a special task that has only an idle thread
475: and interrupt threads. It does not have any user mode memory.
476: </p>
477:
478: <h2 id="sched">Scheduler</h2>
479: <h3>Thread Priority</h3>
480: <p>
481: The Prex scheduler is based on the algorithm known as priority based
482: multi level queue. Each thread is assigned the priority between
483: 0 and 255. The lower number means higher priority like BSD unix.
484: It maintains 256 level run queues mapped to each priority.
485: The lowest priority (=255) is used only for an idle thread.
486: </p>
487: <p>
488: A thread has two different types of priority:
489: </p>
490: <ul>
491: <li><b>Base priority:</b>
492: This is a static priority which can be changed only by user mode program.
493: <li><b>Current Priority:</b>
494: An actual scheduling priority.
495: A kernel may adjust this priority dynamically if it's needed.
496: </ul>
497: <p>
498: Although the base priority and the current priority are same value in almost
499: conditions,
500: kernel will sometimes change the current priority to avoid
501: "priority inversion".
502: </p>
503:
504: <h3>Thread State</h3>
505: <p>
506: Each thread has one of the following states.
507: </p>
508:
509: <ul>
510: <li><b>RUN</b> :Running or ready to run</li>
511: <li><b>SLEEP</b> :Sleep for some event</li>
512: <li><b>SUSPEND</b> :Suspend count is not 0</li>
513: <li><b>EXIT</b> :Terminated</li>
514: </ul>
515: <p>
516: The thread is always preemptive even in kernel mode.
517: There are following 4 events to switch thread:
518: </p>
519:
520: <i><b>Table 4. Events to switch thread</b></i>
521: <table border="1" width="90%" cellspacing="0">
522: <tbody>
523: <tr>
524: <th>Event</th>
525: <th>Condition</th>
526: <th>Run queue position</th>
527: </tr>
528: <tr>
529: <td><b>Block</b></td>
530: <td>Thread sleep or suspend</td>
531: <td>Move to the tail of runq</td>
532: </tr>
533: <tr>
534: <td><b>Preemption</b></td>
535: <td>Higher priority thread becomes runnable</td>
536: <td>Keep the head of runq</td>
537: </tr>
538: <tr>
539: <td><b>Quantum Expiration</b></td>
540: <td>The thread consumes its time quantum</td>
541: <td>Move to the tail of runq</td>
542: </tr>
543: <tr>
544: <td><b>Yield</b></td>
545: <td>The thread releases CPU by itself</td>
546: <td>Move to the tail of runq</td>
547: </tr>
548:
549: </tbody>
550: </table>
551:
552: <h3>Scheduling Policy</h3>
553: <p>
554: There are following three types of scheduling policy.
555: </p>
556: <ul>
557: <li><b>SCHED_FIFO</b>: First-in First-out
558: <li><b>SCHED_RR</b>: Round Robin (SCHED_FIFO + timeslice)
559: <li><b>SCHED_OTHER</b>: Not supported
560: </ul>
561: <p>
562: In early Prex development phase, SCHED_OTHER was implemented as a traditional
563: BSD scheduler. Since this scheduler changes the thread priority dynamically,
564: it is unpredictable and does not fit the real-time system.
565: Recently, SCHED_OTHER policy was dropped from Prex to
566: focus on real-time platform.
567: </p>
568:
569: <h3>Scheduling Parameter</h3>
570: <p>
571: An application program can change the following scheduling parameters via
572: kernel API.
573: </p>
574: <ul>
575: <li>Thread Priority</li>
576: <li>Scheduling Policy</li>
577: <li>Time Quantum (only for SCHED_RR)</li>
578: </ul>
579:
580: <h3>Scheduling Lock</h3>
581: <p>
582: The thread scheduling can be disabled by locking the scheduler.
583: This is used to synchronize the thread execution to protect the
584: access to the global resources.
585: Since any interrupt handler can work while scheduling lock state,
586: it does not affect to the interrupt latency.
587: </p>
588:
589: <h3>Named Event</h3>
590: <p>
591: The thread can sleep/wakeup for the specific event. The event works as
592: the queue of the sleeping threads. Since each event has its name,
593: it is easy to know which event the debugee is waiting for.
594: </p>
595:
596:
597: <h2 id="memory">Memory Management</h2>
598:
599: <h3>Physical Page Allocator</h3>
600: <p>The physical page allocator provides the service for
601: page allocation/deallocation/reservation.
602: It works on the bottom layer for other memory managers.
603: </p>
604: <p>
605: <img alt="Memory Structure" src="img/memory.gif" border="1"
606: style="width: 448px; height: 308px;"><br>
607:
608: <i><b>Figure 2. Prex Memory Structure</b></i>
609: </p>
610: <p>
611: The key point is that Prex kernel does not page out to
612: any external disk devices. This is an important design point to get
613: real-time performance and system simplicity.
614:
615: </p>
616:
617: <h3>Kernel Memory Allocator</h3>
618: <p>
619: The kernel memory allocator is optimized for the small
620: memory foot print system.
621: </p>
622: <p>
623: To allocate kernel memory, it is necessary to divide one page into
624: two or more blocks.
625: There are following 3 linked lists to manage used/free blocks for kernel
626: memory.
627: <ol>
628: <li>All pages allocated for the kernel memory are linked.</li>
629: <li>All blocks divided in the same page are linked.</li>
630: <li>All free blocks of the same size are linked.</li>
631: </ol>
632: <p>
633: Currently, it can not handle the memory size exceeding one page.
634: Instead, a driver can use page_alloc() to allocate large memory.
635: <br>
636: When the kernel code illegally writes data into non-allocated memory,
637: the system will crash easily. The kmem modules are called from
638: not only kernel code but from various drivers. In order to
639: check the memory over run, each free block has a tag with magic ID.
640: </p>
641: <p>
642: The kernel maintains the array of the block headers for the free blocks.
643: The index of an array is decided by the size of each block.
644: All block has the size of the multiple of 16.
645: </p>
646: <pre>free_blks[0] = list for 16 byte block
647: free_blks[1] = list for 32 byte block
648: free_blks[2] = list for 48 byte block
649: .
650: .
651: free_blks[255] = list for 4096 byte block
652: </pre>
653: <p>
654: In generic design, only one list is used to search the free block
655: for a first fit algorithm.
656: However, the Prex kernel memory allocator is using multiple lists
657: corresponding to each block size.
658: A search is started from the list of the requested size. So,
659: it is not necessary to search smaller block's list wastefully.
660: </p>
661: <p>
662: In most of the "buddy" based memory allocators, their algorithm are
663: using <b>2^n</b> bytes as block size.
664: But, this logic will throw away much memory in case
665: the block size is not fit. So, this is not suitable for the
666: embedded systems that Prex aims to.
667: </p>
668:
669: <h3>Virtual Memory Manager</h3>
670: <p>
671: A task owns its private virtual address space. All threads
672: in a same task share one memory space.
673: When new task is made, the address map of the parent task will be
674: automatically copied.
675: In this time, the read-only space is not copied and is shared with old map.
676: </p>
677: <p>
678: The kernel provide the following functions for VM:
679: </p>
680: <ul>
681: <li>Allocate/deallocate memory region</li>
682: <li>Change memory attribute (read/write/exec)</li>
683: <li>Map another task's memory to current task</li>
684: </ul>
685: <p>
686: The VM allocator is using the traditional list-based algorithm.
687: </p>
688: <p>
689: The kernel task is a special task which has the virtual memory mapping
690: for kernel. All other user mode tasks will have the same kernel memory
691: image mapped from the kernel task. So, kernel threads can work with the
692: all user mode task context without switching memory map.
693: </p>
694: <p>
695: <img alt="Memory Mapping" src="img/memmap.gif" border="1"
696: style="width: 504px; height: 271px;"><br>
697:
698: <i><b>Figure 3. Kernel Memory Mapping</b></i>
699: </p>
700:
701: <p>
702: Since the Prex kernel does not do page out to an external storage,
703: it is guaranteed that the allocated memory is always continuing
704: and existing. Thereby, a kernel and drivers can be constructed
705: very simply.
706: </p>
707: <p>
708: <i>Note: "Copy-on-write" feature was supported with the Prex kernel before.
709: But, it was dropped to increase the real-time performance.</i>
710: </p>
711:
712:
713: <h2 id="ipc">IPC</h2>
714: <p>
715: The message passing model of Prex is very simple compared with other
716: modern microkernels. The Prex message is sent to the "object" from thread
717: to thread.
718: The "object" in Prex is similar concept that is called as "port"
719: in other microkernel.
720: </p>
721:
722: <h3>Object</h3>
723: <p>
724: An object represents service, state, or policies etc.
725: For object manipulation, kernel provide 3 functions:
726: object_create(), object_delete(), object_lookup().
727: Prex task will create an object to publish its interface to other tasks.
728: For example, server tasks will create objects like "proc", "fs", "exec" to
729: allow clients to access their service.
730: And then, client tasks will send a request message to these objects
731: </p>
732: <p>
733: An actual object is stored in kernel space, and it is protected
734: from user mode code.
735: Each object data is managed with the hash table by using its name string.
736: Usually, an object has a unique name within a system. To send a
737: message to the specific object, it must obtain the target object ID
738: by looking up by the name.
739: </p>
740: <p>
741: An object can be created without its name. These objects can be used as
742: private objects for threads in same task.
743: </p>
744:
745: <h3>Message</h3>
746: <p>
747: Each IPC message must have the pre-defined message header in it.
748: The kernel will store the sender task's ID into the message header.
749: This mechanism ensures the receiver task can get the exact task ID
750: of the sender task. Therefore, receiver task can check the sender
751: task's capability for various secure services.
752: </p>
753: <p>
754: It is necessary to recognize the pre-defined message format between
755: sender and receiver.
756: </p>
757: <p>
758: Messages are sent to the specific object using msg_send().
759: The transmission of a message is always synchronous. This means that
760: the thread which sent the message is blocked until it receives
761: a response from another threads. msg_receive() performs reception
762: of a message. msg_receive() is also blocked when no message is
763: reached to the target object. The receiver thread must answer the
764: message using msg_reply() after it finishes processing.
765: </p>
766: <p>
767: The receiver thread can not receive another message until it
768: replies to the sender. In short, a thread can receive only one
769: message at once. Once the thread receives message, it can send
770: another message to different object. This mechanism allows threads
771: to redirect the sender's request to another thread.
772: </p>
773: <p>
774: A thread can receive a message from the specific object which is
775: created by itself or thread in same task.
776: If the message has not arrived, it
777: blocks until any message comes in. The following figure shows the IPC transmit
778: sequence of Prex.
779: </p>
780:
781: <img alt="ipc queue" src="img/msg.gif" border="1"
782: style="width: 505px; height: 347px;"><br>
783:
784: <i><b>Figure 4. IPC Transmit Sequence</b></i>
785:
786: <h3>Message Transfer</h3>
787: <p>
788: The message is copied to task to task directly without kernel
789: buffering. The memory region of sent message is
790: automatically mapped to the receiver's memory within kernel.
791: This mechanism allows to reduce the number of copy time while message
792: translation.
793: Since there is no page out of memory in Prex, we can
794: copy the message data via physical memory at anytime.
795: </p>
796: <img alt="Message transfer" src="img/ipcmap.gif" border="1"
797: style="width: 459px; height: 321px;"><br>
798:
799: <i><b>Figure 5. IPC message transfer</b></i>
800:
801: <h2 id="except">Exception Handling</h2>
802: <p>
803: A user mode task can specify its own exception handler with exception_setup().
804: There are two different types of exception.
805: </p>
806: <ul>
807: <li><b>H/W exception</b>:
808: This type of exception is caused by H/W trap & fault. The exception
809: will be sent to the thread which caused the trap.
810: If no exception handler is specified by the task, it will be
811: terminated by kernel.</li>
812:
813: <li><b>S/W exception</b>:
814: The user mode task can send S/W exception to another task by exception_raise().
815: The exception
816: will be sent to the thread that is sleeping with exception_wait().
817: If no thread is waiting for the exception, the exception is sent
818: to the first thread in the target task.</li>
819: </ul>
820: <p>
821: Kernel supports 32 types of exception.
822: The following pre-defined exceptions are raised by kernel itself.
823: </p>
824:
825: <i><b>Table 5. Kernel exceptions</b></i>
826: <table border="1" width="80%" cellspacing="0">
827: <tbody>
828: <tr>
829: <th>Exception</th>
830: <th>Type</th>
831: <th>Reason</th>
832: </tr>
833: <tr>
834: <td>SIGILL</td>
835: <td align="center">H/W</td>
836: <td>Illegal instruction</td>
837: </tr>
838: <tr>
839: <td>SIGTRAP</td>
840: <td align="center">H/W</td>
841: <td>Break point</td>
842: </tr>
843: <tr>
844: <td>SIGFPE</td>
845: <td align="center">H/W</td>
846: <td>Math error</td>
847: </tr>
848: <tr>
849: <td>SIGSEGV</td>
850: <td align="center">H/W</td>
851: <td>Invalid memory access</td>
852: </tr>
853: <tr>
854: <td>SIGALRM</td>
855: <td align="center">S/W</td>
856: <td>Alarm event</td>
857: </tr>
858: </tbody>
859: </table>
860:
861: <p>
862: POSIX emulation library will setup its own exception handler to convert
863: the Prex exceptions into UNIX signals. It will maintain its own signal mask.
864: And, it transfer control to the actual POSIX signal handler that is
865: defined by the user mode process.
866:
867: </p>
868:
869: <h2 id="int"> Interrupt Framework</h2>
870: <p>
871: Prex defines two different types of interrupt service to
872: optimize the response time of real-time operation.
873: </p>
874:
875: <h3>Interrupt Service Routine (ISR)</h3>
876: <p>
877: ISR is started by an actual hardware interrupt. The associated
878: interrupt is disabled in ICU and CPU interrupt is enabled
879: while it runs. If ISR determines that its device generates
880: the interrupt, ISR must program the device to stop the interrupt.
881: Then, ISR should do minimum I/O operation and return control
882: as quickly as possible.
883: ISR will run within the context of current running thread at
884: interrupt time. So, only few kernel services are available within
885: ISR. IRQ_ASSERT() macro can be used to detect the invalid
886: function call from ISR.
887: </p>
888:
889: <h3>Interrupt Service Thread (IST)</h3>
890: <p>
891: IST is automatically activated if ISR returns INT_CONTINUE to kernel. It
892: will be called when the system enters safer condition than ISR.
893: Any interrupt driven I/O operation should be done in IST not ISR.
894: Since ISR for same IRQ may be run during IST, the shared data,
895: resources, and device registers must be synchronized by using
896: irq_lock().
897: IST does not have to be reentrant, since it is not interrupted
898: by same IST itself.
899: </p>
900:
901: <h3>Interrupt Nesting & Priority</h3>
902: <p>
903: Each ISR has its logical priority level, with 0 being the highest
904: priority. While one ISR is running, all lower priority interrupts
905: are masked off.
906: This interrupt nesting mechanism avoids delaying of
907: high priority interrupt events.
908: </p>
909: <p>
910: IST is executed as a normal thread dispatched by the scheduler. So,
911: the interrupt thread which has higher priority is executed first.
912: The driver writer can specify the thread priority of IST when
913: IST is attached to the specific interrupt line.
914: The important point is that even a user mode task can be performed
915: prior to an interrupt thread.
916: </p>
917: <p>
918: The following figure is the sample of the Prex interrupt processing.
919: </p>
920: <p>
921: <img alt="Interrupt Processing" src="img/irq.gif" border="1"><br>
922: <i><b>Figure 6. Prex Interrupt Processing</b></i>
923: </p>
924: <p>
925: </p>
926:
927: <h3>Interrupt Locking</h3>
928: <p>
929: irq_lock() & irq_unlock() are used to disable all interrupts in
930: order to synchronize the access to the kernel or H/W resource. Since irq_lock()
931: increments a lock counter, irq_unlock() will automatically restore
932: to the original interrupt state when locking count becomes 0.
933: So the caller does not have to save the previous interrupt state.
934: </p>
935:
936: <h3>Interrupt Stack</h3>
937: <p>
938: If each ISR uses the kernel stack of the current running thread, the
939: stack area may be over-flow when continuous interrupts are occurred at
940: one same thread. So the kernel stack will be switched to the dedicated stack
941: while ISR is running.
942: </p>
943:
944:
945: <h2 id="timer">Timer</h2>
946:
947: <h3>Kernel Timers</h3>
948:
949: <p>
950: The kernel timer provides the following feature.
951: </p>
952: <ul>
953: <li><b>Sleep timer</b>: Put caller thread to sleep in the specified time.</li>
954: <li><b>Call back timer</b>: Call the routine after specified time passes.</li>
955: <li><b>Periodic timer</b>: Call the routine at the specified interval.</li>
956: </ul>
957:
958: <h3>Timer Jitter</h3>
959:
960: <p>
961: The periodic timer is designed to minimize the deviation between desired and
962: actual expiration.
963: </p>
964:
965: <h2 id="device">Device I/O Service</h2>
966:
967: The Prex device driver module is separated from the kernel, and this module
968: is linked with the kernel at the boot time.
969: The kernel provides only simple and minimum services to help the
970: communication between applications and drivers.
971:
972: <h3>Device Object</h3>
973: <p>
974: Since the Prex kernel does not have the file system in it, the kernel
975: provides a device object service for I/O interface.
976: The device object is created by the device driver to communicate to the
977: application. Usually, the driver creates a device object for an existing
978: physical device. But, it can be used to handle logical or virtual devices.
979: </p>
980:
981: <h3>Driver Interface</h3>
982: <p>
983: The interface between kernel and drivers are defined clearly as
984: "Driver Kernel Interface". The kernel provides the following services
985: for device drivers.
986: </p>
987: <ul>
988: <li>Device object service</li>
989: <li>Kernel memory allocation</li>
990: <li>Physical page allocation</li>
991: <li>Interrupt handling service</li>
992: <li>Scheduler service</li>
993: <li>Timer service</li>
994: <li>Debug service</li>
995: </ul>
996:
997: <h3>Application Interface</h3>
998: <p>
999: The kernel device I/O interface are provided to access the specific device
1000: object which is handled by a driver.
1001: The Prex kernel provides the following 5 functions for applications.
1002: </p>
1003: <ul>
1004: <li>Open a device</li>
1005: <li>Close a device</li>
1006: <li>Read from a device</li>
1007: <li>Write to a device</li>
1008: <li>Device I/O control</li>
1009: </ul>
1010:
1011: <h2 id="mutex">Mutex</h2>
1012: <h3>Priority Inheritance</h3>
1013: <p>
1014: The thread priority is automatically changed at one of the following conditions.
1015: </p>
1016: <ol>
1017: <li>
1018: When the current thread fails to lock the mutex and the mutex
1019: owner has lower priority than current thread, the priority
1020: of mutex owner is boosted to the current priority.
1021: If this mutex owner is waiting for another mutex, such related
1022: mutexes are also processed.
1023: </li>
1024: <li>
1025: When the current thread unlocks the mutex and its priority
1026: has already been boosted, kernel recomputes the current priority.
1027: In this case, the priority is set to the highest
1028: priority among the threads waiting for the mutexes locked by the
1029: current thread.
1030: </li>
1031: <li>
1032: When the priority is changed by the user request, the related(inherited)
1033: thread's priority is also changed.
1034: </li>
1035: </ol>
1036: <p>
1037: There are following limitations about priority inheritance
1038: with Prex mutex.
1039: </p>
1040: <ol>
1041: <li>
1042: If the priority is changed by the user request, the priority
1043: recomputation is done only when the new priority is higher
1044: than old priority. The inherited priority is reset to base
1045: priority when the mutex is unlocked.
1046: </li>
1047: <li>
1048: Even if thread is killed with mutex waiting, the related
1049: priority is not adjusted.
1050: </li>
1051: </ol>
1052: <h2 id="debug">Debug</h2>
1053: There are following debugging support functions:
1054: <ul>
1055: <li>printk(): Display the debug message in kernel.</li>
1056: <li>panic(): Dump processor registers and stop system.</li>
1057: <li>ASSERT(): If expression is false (zero), stop system and display information.</li>
1058: <li>trace_on(): If the kernel trace is enabled, all entry/exit of functions
1059: are logged.
1060: </ul>
1061: </tr>
1062: <tr>
1063: <td id="footer" colspan="2" style="vertical-align: top;">
1064: <a href="http://sourceforge.net">
1065: <img src="http://sourceforge.net/sflogo.php?group_id=132028&type=1"
1066: alt="SourceForge.net Logo" border="0" height="31" width="88"></a><br>
1067: Copyright© 2005-2007 Kohsuke Ohtani
1068: </td>
1069: </tr>
1070:
1071: </tbody>
1072: </table>
1073:
1074: </div>
1075: <div id="bottom"></div>
1076:
1077: </body>
1078: </html>
CVSweb