Changes between Version 41 and Version 42 of Developer/SMP


Timestamp:
03/03/14 18:32:00 (10 years ago)
Author:
Sh
Comment:

/* Requirements */ Add some requirements

  • Developer/SMP

    v41 v42  
    9 9
    10 10 The [http://en.wikipedia.org/wiki/Symmetric_multiprocessing SMP] support for RTEMS is work in progress.  Basic support is available for ARM, PowerPC, SPARC and Intel x86.
    11 =  Requirements  =
    12 
    13 
    14 No public requirements exist currently.
     11
     12 *  Implementation Language
     13  *  The implementation language shall be C11 (ISO/IEC 9899:2011) or assembler.
     14  *  The CPU architecture shall support lock-free atomic operations for unsigned long integers.
     15 *  SMP Synchronization Primitives
     16  *  Atomic operations and fences shall be used to implement higher-level synchronization primitives.
     17  *  An SMP lock which ensures mutual exclusion and FIFO ordering shall be provided.
     18  *  An SMP read-write lock shall be provided which offers phase-fair ordering.
     19  *  A re-usable SMP barrier shall be provided.
     20  *  SMP synchronization primitives may execute infinitely without progress in case other processors execute erroneous code.
     21 *  System Initialization and Shutdown
     22  *  Before execution starts on the entry symbol on at least one processor, a boot loader must load the text and read-only sections.
     23  *  A read-only application configuration shall select the boot processor.
     24  *  The boot processor shall initialize the data and BSS sections if not already performed by a boot loader.
     25  *  A CPU architecture or BSP specific method shall ensure that objects resident in the data or BSS section are not accessed before the boot processor or boot loader has initialized these sections.
     26  *  The boot processor shall execute the serial system initialization.
     27  *  It shall be possible to shut down the system at any time after data and BSS section initialization from any processor.
     28  *  The fatal extension handler shall be invoked during system shutdown on each processor.
     29  *  Invocation of the shutdown procedure while holding SMP synchronization primitives may lead to dead-lock.
     30 *  Thread Life Cycle
     31  *  Asynchronous thread deletion shall be possible.
     32  *  Concurrent thread deletion shall be possible.  At most one thread shall start the deletion process successfully, other threads shall observe an error status.
     33  *  Usage of thread identifiers which are re-used by newly created threads with the aim to access a deleted thread may lead to unpredictable results.
     34  *  Threads shall have a method to change and restore the ability for asynchronous thread deletion of the executing thread.
     35  *  The POSIX cleanup handler shall be available for all RTEMS build configurations.
     36  *  The POSIX cleanup handler shall execute in the context of the deleted thread.
     37  *  The POSIX cleanup handler shall execute in case a thread is re-started in the context of the re-started thread.
     38  *  The POSIX keys shall be available for all RTEMS build configurations.
     39  *  The POSIX key destructor shall execute in the context of the deleted thread.
     40  *  The POSIX key destructor shall execute in case a thread is re-started in the context of the re-started thread.
     41  *  In case a thread owns operating system provided resources after the cleanup procedure, then this shall result in a fatal error.
     42 *  Non-Thread Object Life Cycle
     43  *  Concurrent object deletion may have unpredictable results.
     44  *  Usage of objects during deletion of this object may have unpredictable results.
     45 *  Classic API
     46  *  Usage of task variables shall lead to a compile time error.
     47  *  Usage of task non-preempt mode shall lead to a compile time error.
     48  *  Usage of interrupt disable/enable shall lead to a run-time error if RTEMS debug is enabled and the executing context is unsafe.
     49  *  All other RTEMS object services shall behave like in the single-processor configuration.
     50 *  Profiling
     51  *  Profiling shall be an RTEMS build configuration option.
     52  *  It shall be possible to measure time intervals up to the system tick interval with little overhead in every execution context.
     53  *  It shall be possible to obtain profiling information for the lowest-level system operations like thread dispatch disabled sections, interrupt processing and SMP locks.
     54  *  There shall be a method for the application to retrieve profiling information of the system.
     55 *  Interrupt Support
     56  *  Interrupts shall have an interrupt affinity to the boot processor by default.
     57  *  It shall be possible to set the interrupt affinity of interrupt sources.
     58 *  Clustered/Partitioned Scheduling
     59  *  RTEMS shall allow the set of processors in a system to be partitioned into pairwise disjoint subsets.  Each subset of processors shall be owned by exactly one scheduler instance.
     60  *  A clustered/partitioned fixed-priority scheduler shall be provided.
     61  *  The application configuration shall provide a set of processors which may be used to run the application.
     62  *  The application configuration shall define the scheduler instances.
     63  *  The CPU architecture or BSP shall be able to reduce the set of processors provided by the application configuration to reflect the actual hardware.
     64  *  The application configuration of a scheduler instance shall specify if the set of processors can be reduced.
     65  *  The application configuration of a scheduler instance shall specify if the set of processors can be expanded with processors available by the actual hardware and not assigned to other scheduler instances.
     66 *  Fine Grained Locking
     67  *  No giant lock protecting the system state shall be necessary.
     68  *  Non-blocking operations shall use only an object specific lock.
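
The requirement for an SMP lock with mutual exclusion and FIFO ordering, built from atomic operations on unsigned long integers, could be met by a ticket lock.  The following is a minimal sketch in C11, not the RTEMS implementation; the names `ticket_lock`, `ticket_lock_init`, `ticket_lock_acquire` and `ticket_lock_release` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical ticket lock, not the RTEMS SMP lock API. */
typedef struct {
  atomic_ulong next_ticket; /* next ticket handed out to an acquirer */
  atomic_ulong now_serving; /* ticket that may enter the critical section */
} ticket_lock;

static void ticket_lock_init(ticket_lock *lock)
{
  atomic_init(&lock->next_ticket, 0);
  atomic_init(&lock->now_serving, 0);
}

/* Returns the ticket drawn, so that callers can observe the FIFO order. */
static unsigned long ticket_lock_acquire(ticket_lock *lock)
{
  unsigned long my_ticket =
    atomic_fetch_add_explicit(&lock->next_ticket, 1, memory_order_relaxed);

  /* Spin until it is our turn; tickets are served in the order drawn. */
  while (
    atomic_load_explicit(&lock->now_serving, memory_order_acquire)
      != my_ticket
  ) {
    /* busy wait */
  }

  return my_ticket;
}

static void ticket_lock_release(ticket_lock *lock)
{
  /* Only the lock holder writes now_serving, so a relaxed load suffices. */
  unsigned long current =
    atomic_load_explicit(&lock->now_serving, memory_order_relaxed);

  atomic_store_explicit(&lock->now_serving, current + 1, memory_order_release);
}
```

Note that such a lock spins forever if the current holder never releases it, which matches the requirement that the primitives may execute infinitely without progress in case other processors execute erroneous code.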
    15 69 =  Application Impact of SMP  =
    16 70
     
    1459 1513 PSAC control.  The PSAC operations like addition to, removal from and iteration
    1460 1514 over the chain are protected by the corresponding thread lock.  Each action
    1461 will have a local context.  The heir thread will execute the action handlers on
    1462 behalf of the thread of interest.  Since thread dispatching is disabled, action
    1463 handlers cannot block.
    1464 
    1465 The execution time of post-switch actions increases the worst-case thread
    1466 dispatch latency since the heir thread must do work for another thread.
    1467 
    1468 On-demand post-switch actions help to implement the Multiprocessor Resource
    1469 Sharing Protocol (MrsP) proposed by Burns and Wellings.  Threads executing a
    1470 global critical section can add a post-switch action which will trigger the
    1471 thread migration in case of pre-emption by a local high-priority thread.
    1472 
    1473  thread_dispatch:
    1474         again = true
    1475         while again:
    1476                 level = ISR.disable()
    1477                 current_cpu = get_current_cpu()
    1478                 current_cpu.disable_thread_dispatch()
    1479                 ISR.enable(level)
    1480                 executing = current_cpu.get_executing()
    1481                 current_cpu.acquire()
    1482                 if current_cpu.is_thread_dispatch_necessary():
    1483                         heir = current_cpu.get_heir()
    1484                         current_cpu.set_thread_dispatch_necessary(false)
    1485                         current_cpu.set_executing(heir)
    1486                         executing.set_executing(false)
    1487                         heir.set_executing(true)
    1488                         if executing != heir:
    1489                                 last = switch(executing, heir)
    1490                                 current_cpu = get_current_cpu()
    1491                                 actions = last.get_actions()
    1492                                 if actions.is_empty():
    1493                                         again = false
    1494                                 else:
    1495                                         current_cpu.release()
    1496                                         last.acquire()
    1497                                         if last.get_cpu() == current_cpu:
    1498                                                 while !actions.is_empty():
    1499                                                         action = actions.pop()
    1500                                                         action.do(current_cpu, last)
    1501                                         last.release()
    1502                                         current_cpu.enable_thread_dispatch()
    1503         current_cpu.enable_thread_dispatch()
    1504         current_cpu.release()
    1505 
    1506 It is important to check that the thread is still assigned to the current
    1507 processor, since after the release of the per-processor lock we have a new
    1508 executing thread and the thread of interest may have migrated to another processor
    1509 already.  Since the heir thread now has a reference to the thread of interest
    1510 we have to make sure that deletion requests are deferred until the post-switch
    1511 actions have been executed.
    1512 
    1513 An efficient way to get the last executing thread (the thread of interest)
    1514 throughout the context switch is to return the context pointer of the last
    1515 executing thread.  With a simple offset operation we get the thread control
    1516 block.
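
The offset operation mentioned above can be sketched in C with `offsetof`.  This is an illustration under assumptions: the types `context_control` and `thread_control`, the member `registers` and the function `thread_from_context` are hypothetical stand-ins, not the actual RTEMS structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU context saved and restored by the context switch. */
typedef struct {
  unsigned long stack_pointer;
} context_control;

/* Hypothetical thread control block which embeds the context. */
typedef struct {
  int id;
  context_control registers;
} thread_control;

/*
 * Recover the thread control block from a pointer to its embedded context by
 * subtracting the offset of the context member.
 */
static thread_control *thread_from_context(context_control *context)
{
  return (thread_control *)
    ((char *) context - offsetof(thread_control, registers));
}
```

If the context switch routine returns the context pointer of the last executing thread, then a helper like this yields the thread of interest without an extra lookup.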
    1517 =  Thread Delete/Restart  =
    1518 
    1519 ==  Reason  ==
    1520 
    1521 
    1522 Deletion of threads may be required by some parallel libraries.
    1523 ==  RTEMS API Changes  ==
    1524 
    1525 
    1526 None.
    1527 ==  Implementation  ==
    1528 
    1529 
    1530 The current implementation to manage a thread life-cycle in RTEMS has some
    1531 weaknesses that turn into severe problems on SMP.  It also leads to POSIX and
    1532 C++ standard conformance defects in some cases.  Currently the thread
    1533 life-cycle changes are protected by the thread dispatch disable level and some
    1534 parts by the allocator mutex.  Since the thread dispatch disable level is
    1535 actually a giant mutex on SMP this leads in combination with the allocator
    1536 mutex to lock order reversal problems.
    1537 
    1538 The usage of unified work areas is also broken at the moment
    1539 [https://www.rtems.org/bugzilla/show_bug.cgi?id=2152].
    1540 
    1541 There is also an outstanding thread cancellation bug
    1542 [https://www.rtems.org/bugzilla/show_bug.cgi?id=2035].
    1543 
    1544 One problematic path is the destruction of threads.  Here we have currently the
    1545 following sequence:
    1546 
    1547 <ol>
    1548 <li>Obtain the allocator mutex.</li>
    1549 <li>Disable thread dispatching.</li>
    1550 <li>Invalidate the object identifier.</li>
    1551 <li>Enable thread dispatching.</li>
    1552 <li>Call the thread delete extensions in the context of the deleting thread
    1553 (not necessarily the deleted thread).  The POSIX cleanup handlers are called
    1554 here from the POSIX delete extension.  POSIX mandates that the cleanup handlers
    1555 are executed in the context of the corresponding thread.  So here we have a
    1556 POSIX violation
    1557 [http://pubs.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_09.html#tag_02_09_05_03].
    1558 </li>
    1559 <li>Remove the thread from the scheduling and watchdog resources.</li>
    1560 <li>Delete scheduling, floating-point, stack and extensions resources.  Now the
    1561 deleted thread may execute on a freed thread stack!</li>
    1562 <li>Free the object.  Now the object (thread control block) is available for
    1563 re-use, but it is still used by the thread!  Only the disabled thread
    1564 dispatching prevents chaos.</li>
    1565 <li>Release the allocator mutex.  Now we have a lock order reversal (see steps 1
    1566 and 2).</li>
    1567 <li>Enable thread dispatching.  Here a deleted executing thread disappears.  On
    1568 SMP we also have a race condition here.  In detail, this step looks like this:
    1569 {{{
    1570 if ( _Thread_Dispatch_decrement_disable_level() == 0 ) {
    1571         /*
    1572          * Here another processor may re-use resources of a deleted executing
    1573          * thread, e.g. the stack.
    1574          */
    1575         _Thread_Dispatch();
    1576 }
    1577 }}}
    1578 </li>
    1579 </ol>
    1580 
    1581 To overcome the issues we need considerable implementation changes in Score.
    1582 The thread life-cycle state must be explicit and independent of the thread
    1583 dispatch disable level and allocator mutex protection.
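
One way to make the life-cycle state explicit is an atomic state variable per thread that is changed with a compare-and-exchange operation, so that at most one of several concurrent deletion requests succeeds.  The following C11 sketch only illustrates this idea; the names `life_state`, `thread_life`, `thread_life_init` and `thread_life_request_delete` and the two states are assumptions, not the Score implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical life-cycle states; a real implementation needs more. */
typedef enum {
  LIFE_NORMAL,
  LIFE_TERMINATING
} life_state;

/* Hypothetical per-thread life-cycle control. */
typedef struct {
  atomic_int state;
} thread_life;

static void thread_life_init(thread_life *life)
{
  atomic_init(&life->state, LIFE_NORMAL);
}

/*
 * Try to start the deletion process.  Returns true for exactly one caller;
 * all other callers observe false, since the state is no longer LIFE_NORMAL.
 */
static bool thread_life_request_delete(thread_life *life)
{
  int expected = LIFE_NORMAL;

  return atomic_compare_exchange_strong(
    &life->state,
    &expected,
    LIFE_TERMINATING
  );
}
```

The state transition does not depend on the thread dispatch disable level or the allocator mutex, so it introduces no lock order constraints of its own.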
    1584 
    1585 The thread life-cycle is determined by the following actions:
    1586 
    1587 ; CREATE : A thread is created.
    1588 ; START : Starts a thread.  The thread must be dormant to get started.
    1589 ; RESTART : Restarts a thread.  The thread must not be dormant to get restarted.
    1590 ; SUSPEND
     1515will have a local cont