Changeset b033e39 in rtems-docs


Timestamp:
Feb 2, 2017, 9:46:05 AM
Author:
Sebastian Huber <sebastian.huber@…>
Branches:
master
Children:
2e0a2a0
Parents:
7b1c63c
Message:

c-user: Add SMP application issues section

Files:
3 edited

  • c-user/glossary.rst

    r7b1c63c rb033e39  
    2626
    2727:dfn:`atomic operations`
    28     Atomic operations are defined in terms of *ISO/IEC 9899:2011*.
     28    Atomic operations are defined in terms of :ref:`C11 <C11>`.
    2929
    3030:dfn:`awakened`
     
    6161:dfn:`buffer`
    6262    A fixed length block of memory allocated from a partition.
     63
     64.. _C11:
     65
     66:dfn:`C11`
     67    The standard ISO/IEC 9899:2011.
     68
     69.. _C++11:
     70
     71:dfn:`C++11`
     72    The standard ISO/IEC 14882:2011.
    6373
    6474:dfn:`calling convention`
     
    702712    The application defined unit of time in which the processor is allocated.
    703713
     714.. _TLS:
     715
     716:dfn:`TLS`
     717    An acronym for Thread-Local Storage :cite:`Drepper:2013:TLS`.  TLS is
     718    available in :ref:`C11 <C11>` and :ref:`C++11 <C++11>`.  The support for
     719    TLS depends on the CPU port :cite:`RTEMS:CPU`.
     720
    704721:dfn:`TMCB`
    705722    An acronym for Timer Control Block.
  • c-user/symmetric_multiprocessing_services.rst

    r7b1c63c rb033e39  
    272272the system depends on the maximum resource tree size of the application.
    273273
    274 Critical Section Techniques and SMP
    275 -----------------------------------
    276 
    277 As discussed earlier, SMP systems have opportunities for true parallelism which
    278 was not possible on uniprocessor systems. Consequently, multiple techniques
    279 that provided adequate critical sections on uniprocessor systems are unsafe on
    280 SMP systems. In this section, some of these unsafe techniques will be
    281 discussed.
    282 
    283 In general, applications must use proper operating system provided mutual
    284 exclusion mechanisms to ensure correct behavior. This primarily means the use
    285 of binary semaphores or mutexes to implement critical sections.
    286 
    287 Disable Interrupts and Interrupt Locks
    288 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    289 
    290 A low overhead means to ensure mutual exclusion in uni-processor configurations
    291 is to disable interrupts around a critical section.  This is commonly used in
    292 device driver code and throughout the operating system core.  In SMP
    293 configurations, however, disabling the interrupts on one processor has no
    294 effect on other processors.  So, this is insufficient to ensure system wide
    295 mutual exclusion.  The macros
    296 
    297 - ``rtems_interrupt_disable()``,
    298 
    299 - ``rtems_interrupt_enable()``, and
    300 
    301 - ``rtems_interrupt_flush()``
    302 
    303 are disabled in SMP configurations and its use will lead to compiler warnings
    304 and linker errors.  In the unlikely case that interrupts must be disabled on
    305 the current processor, then the
    306 
    307 - ``rtems_interrupt_local_disable()``, and
    308 
    309 - ``rtems_interrupt_local_enable()``
    310 
    311 macros are now available in all configurations.
    312 
    313 Since disabling of interrupts is not enough to ensure system wide mutual
    314 exclusion on SMP, a new low-level synchronization primitive was added - the
    315 interrupt locks.  They are a simple API layer on top of the SMP locks used for
    316 low-level synchronization in the operating system core.  Currently they are
    317 implemented as a ticket lock.  On uni-processor configurations they degenerate
    318 to simple interrupt disable/enable sequences.  It is disallowed to acquire a
    319 single interrupt lock in a nested way.  This will result in an infinite loop
    320 with interrupts disabled.  While converting legacy code to interrupt locks care
    321 must be taken to avoid this situation.
    322 
    323 .. code-block:: c
    324     :linenos:
    325 
    326     void legacy_code_with_interrupt_disable_enable( void )
    327     {
    328         rtems_interrupt_level level;
    329         rtems_interrupt_disable( level );
    330         /* Some critical stuff */
    331         rtems_interrupt_enable( level );
    332     }
    333 
    334     RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" );
    335 
    336     void smp_ready_code_with_interrupt_lock( void )
    337     {
    338         rtems_interrupt_lock_context lock_context;
    339         rtems_interrupt_lock_acquire( &lock, &lock_context );
    340         /* Some critical stuff */
    341         rtems_interrupt_lock_release( &lock, &lock_context );
    342     }
    343 
    344 The ``rtems_interrupt_lock`` structure is empty on uni-processor
    345 configurations.  Empty structures have a different size in C
    346 (implementation-defined, zero in case of GCC) and C++ (implementation-defined
    347 non-zero value, one in case of GCC).  Thus the
    348 ``RTEMS_INTERRUPT_LOCK_DECLARE()``, ``RTEMS_INTERRUPT_LOCK_DEFINE()``,
    349 ``RTEMS_INTERRUPT_LOCK_MEMBER()``, and ``RTEMS_INTERRUPT_LOCK_REFERENCE()``
    350 macros are provided to ensure ABI compatibility.
    351 
    352 Highest Priority Task Assumption
    353 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    354 
    355 On a uniprocessor system, it is safe to assume that when the highest priority
    356 task in an application executes, it will execute without being preempted until
    357 it voluntarily blocks. Interrupts may occur while it is executing, but there
    358 will be no context switch to another task unless the highest priority task
    359 voluntarily initiates it.
    360 
    361 Given the assumption that no other tasks will have their execution interleaved
    362 with the highest priority task, it is possible for this task to be constructed
    363 such that it does not need to acquire a binary semaphore or mutex for protected
    364 access to shared data.
    365 
    366 In an SMP system, it cannot be assumed there will never be a single task
    367 executing. It should be assumed that every processor is executing another
    368 application task. Further, those tasks will be ones which would not have been
    369 executed in a uniprocessor configuration and should be assumed to have data
    370 synchronization conflicts with what was formerly the highest priority task
    371 which executed without conflict.
    372 
    373 Disable Preemption
    374 ~~~~~~~~~~~~~~~~~~
    375 
    376 On a uniprocessor system, disabling preemption in a task is very similar to
    377 making the highest priority task assumption. While preemption is disabled, no
    378 task context switches will occur unless the task initiates them
    379 voluntarily. And, just as with the highest priority task assumption, there are
    380 N-1 processors also running tasks. Thus the assumption that no other tasks will
    381 run while the task has preemption disabled is violated.
    382 
    383 Task Unique Data and SMP
    384 ------------------------
    385 
    386 Per task variables are a service commonly provided by real-time operating
    387 systems for application use. They work by allowing the application to specify a
    388 location in memory (typically a ``void *``) which is logically added to the
    389 context of a task. On each task switch, the location in memory is stored and
    390 each task can have a unique value in the same memory location. This memory
    391 location is directly accessed as a variable in a program.
    392 
    393 This works well in a uniprocessor environment because there is one task
    394 executing and one memory location containing a task-specific value. But it is
    395 fundamentally broken on an SMP system because there are always N tasks
    396 executing. With only one location in memory, N-1 tasks will not have the
    397 correct value.
    398 
    399 This paradigm for providing task unique data values is fundamentally broken on
    400 SMP systems.
    401 
    402 Classic API Per Task Variables
    403 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    404 
    405 The Classic API provides three directives to support per task variables. These are:
    406 
    407 - ``rtems_task_variable_add`` - Associate per task variable
    408 
    409 - ``rtems_task_variable_get`` - Obtain value of a per task variable
    410 
    411 - ``rtems_task_variable_delete`` - Remove per task variable
    412 
    413 As task variables are unsafe for use on SMP systems, the use of these services
    414 must be eliminated in all software that is to be used in an SMP environment.
    415 The task variables API is disabled on SMP. Its use will lead to compile-time
    416 and link-time errors. It is recommended that the application developer consider
    417 the use of POSIX Keys or Thread Local Storage (TLS). POSIX Keys are available
    418 in all RTEMS configurations.  For the availability of TLS on a particular
    419 architecture please consult the *RTEMS CPU Architecture Supplement*.
    420 
    421 The only remaining user of task variables in the RTEMS code base is the Ada
    422 support.  So basically Ada is not available on RTEMS SMP.
    423 
    424274OpenMP
    425275------
     
    522372prologue before the stack is switched to the interrupt stack.
    523373
     374Application Issues
     375==================
     376
     377Most operating system services provided by the uni-processor RTEMS are
     378available in SMP configurations as well.  However, applications designed for an
     379uni-processor environment may need some changes to correctly run in an SMP
     380configuration.
     381
     382As discussed earlier, SMP systems have opportunities for true parallelism which
     383was not possible on uni-processor systems. Consequently, multiple techniques
     384that provided adequate critical sections on uni-processor systems are unsafe on
     385SMP systems. In this section, some of these unsafe techniques will be
     386discussed.
     387
     388In general, applications must use proper operating system provided mutual
     389exclusion mechanisms to ensure correct behavior.
     390
     391Task variables
     392--------------
     393
     394Task variables are ordinary global variables with a dedicated value for each
     395thread.  During a context switch from the executing thread to the heir thread,
     396the value of each task variable is saved to the thread control block of the
     397executing thread and restored from the thread control block of the heir thread.
     398This is inherently broken if more than one executing thread exists.
     399Alternatives to task variables are POSIX keys and :ref:`TLS <TLS>`.  All use
     400cases of task variables in the RTEMS code base were replaced with alternatives.
     401The task variable API has been removed in RTEMS 4.12.
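The two alternatives named above can be sketched in portable C as follows. This is a minimal illustration, not code from the RTEMS sources: the names ``tls_value``, ``worker()`` and ``per_thread_values_are_independent()`` are hypothetical, and TLS availability on a given target is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* C11 thread-local storage: every thread gets its own instance. */
static _Thread_local int tls_value;

/* POSIX key: every thread associates its own pointer with the key. */
static pthread_key_t key;

/* Hypothetical worker: stores per-thread values and reads them back. */
static void *worker( void *arg )
{
  int error;

  tls_value = (int) (intptr_t) arg;
  error = pthread_setspecific( key, arg );
  assert( error == 0 );

  /* Report whether this thread still sees its own values. */
  return (void *) (intptr_t)
    ( tls_value == (int) (intptr_t) arg && pthread_getspecific( key ) == arg );
}

/* Returns 1 if the values of all threads stayed independent. */
int per_thread_values_are_independent( void )
{
  pthread_t threads[ 2 ];
  void *results[ 2 ];
  int error;
  int i;

  error = pthread_key_create( &key, NULL );
  assert( error == 0 );

  tls_value = 100; /* Value owned by the calling thread */

  for ( i = 0; i < 2; ++i ) {
    error = pthread_create(
      &threads[ i ], NULL, worker, (void *) (intptr_t) ( i + 1 )
    );
    assert( error == 0 );
  }

  for ( i = 0; i < 2; ++i ) {
    error = pthread_join( threads[ i ], &results[ i ] );
    assert( error == 0 );
  }

  error = pthread_key_delete( key );
  assert( error == 0 );

  /* The workers must not have disturbed the value of the calling thread. */
  return results[ 0 ] != NULL && results[ 1 ] != NULL && tls_value == 100;
}
```

Unlike a task variable, neither mechanism relies on the context switch to swap a single memory location, so both remain correct when several threads execute at the same time.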
     402
     403Highest Priority Thread Never Walks Alone
     404-----------------------------------------
     405
     406On a uni-processor system, it is safe to assume that when the highest priority
     407task in an application executes, it will execute without being preempted until
     408it voluntarily blocks. Interrupts may occur while it is executing, but there
     409will be no context switch to another task unless the highest priority task
     410voluntarily initiates it.
     411
     412Given the assumption that no other tasks will have their execution interleaved
     413with the highest priority task, it is possible for this task to be constructed
     414such that it does not need to acquire a mutex for protected access to shared
     415data.
     416
     417In an SMP system, it cannot be assumed that only a single task is ever
     418executing. It should be assumed that every processor is executing another
     419application task. Further, those tasks will be ones which would not have been
     420executed in a uni-processor configuration and should be assumed to have data
     421synchronization conflicts with what was formerly the highest priority task
     422which executed without conflict.
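As a minimal sketch of the remedy, the formerly lock-free task can protect its shared data with a mutex like every other task. The example uses the POSIX mutex API; the names ``first``, ``second``, ``update_shared_data()`` and ``run_updaters()`` are hypothetical, and an RTEMS application could equally use a Classic API mutex.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 10000

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* Shared data with the invariant first == second. */
static long first;
static long second;

/* Every task, including the highest priority one, takes the mutex. */
static void *update_shared_data( void *arg )
{
  int error;
  int i;

  (void) arg;

  for ( i = 0; i < ITERATIONS; ++i ) {
    error = pthread_mutex_lock( &mutex );
    assert( error == 0 );
    ++first;
    ++second;
    error = pthread_mutex_unlock( &mutex );
    assert( error == 0 );
  }

  return NULL;
}

/* Runs two updaters in parallel and returns the final value of first. */
long run_updaters( void )
{
  pthread_t threads[ 2 ];
  int error;
  int i;

  first = 0;
  second = 0;

  for ( i = 0; i < 2; ++i ) {
    error = pthread_create( &threads[ i ], NULL, update_shared_data, NULL );
    assert( error == 0 );
  }

  for ( i = 0; i < 2; ++i ) {
    error = pthread_join( threads[ i ], NULL );
    assert( error == 0 );
  }

  assert( first == second );
  return first;
}
```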
     423
     424Disabling of Thread Pre-Emption
     425-------------------------------
     426
     427A thread which disables pre-emption prevents a higher priority thread from
     428taking over its processor involuntarily.  In uni-processor configurations,
     429this can be used to ensure mutual exclusion at thread level.  In SMP
     430configurations, however, more than one executing thread may exist.  Thus, it
     431is impossible to ensure mutual exclusion using this mechanism.  To prevent
     432applications that disable pre-emption for this purpose from misbehaving,
     433this feature is disabled in SMP configurations and its use causes run-time
     434errors.
     435
     436Disabling of Interrupts
     437-----------------------
     438
     439A low-overhead means to ensure mutual exclusion in uni-processor
     440configurations is to disable interrupts around a critical section.  This
     441is commonly used in device driver code.  In SMP configurations, however,
     442disabling the interrupts on one processor has no effect on other processors.
     443So, this is insufficient to ensure system-wide mutual exclusion.  The macros
     444
     445* :ref:`rtems_interrupt_disable() <rtems_interrupt_disable>`,
     446
     447* :ref:`rtems_interrupt_enable() <rtems_interrupt_enable>`, and
     448
     449* :ref:`rtems_interrupt_flash() <rtems_interrupt_flash>`
     450
     451are disabled in SMP configurations and their use will cause compile-time warnings
     452and link-time errors.  In the unlikely case that interrupts must be disabled on
     453the current processor, the
     454
     455* :ref:`rtems_interrupt_local_disable() <rtems_interrupt_local_disable>`, and
     456
     457* :ref:`rtems_interrupt_local_enable() <rtems_interrupt_local_enable>`
     458
     459macros are now available in all configurations.
     460
     461Since disabling of interrupts is insufficient to ensure system-wide mutual
     462exclusion on SMP, a new low-level synchronization primitive was added --
     463interrupt locks.  The interrupt locks are a simple API layer on top of the SMP
     464locks used for low-level synchronization in the operating system core.
     465Currently, they are implemented as a ticket lock.  In uni-processor
     466configurations, they degenerate to simple interrupt disable/enable sequences by
     467means of the C pre-processor.  It is disallowed to acquire a single interrupt
     468lock in a nested way.  This will result in an infinite loop with interrupts
     469disabled.  While converting legacy code to interrupt locks, care must be taken
     470to avoid this situation.
     471
     472.. code-block:: c
     473    :linenos:
     474
     475    #include <rtems.h>
     476
     477    void legacy_code_with_interrupt_disable_enable( void )
     478    {
     479      rtems_interrupt_level level;
     480
     481      rtems_interrupt_disable( level );
     482      /* Critical section */
     483      rtems_interrupt_enable( level );
     484    }
     485
     486    RTEMS_INTERRUPT_LOCK_DEFINE( static, lock, "Name" )
     487
     488    void smp_ready_code_with_interrupt_lock( void )
     489    {
     490      rtems_interrupt_lock_context lock_context;
     491
     492      rtems_interrupt_lock_acquire( &lock, &lock_context );
     493      /* Critical section */
     494      rtems_interrupt_lock_release( &lock, &lock_context );
     495    }
     496
     497An alternative to the RTEMS-specific interrupt locks are POSIX spinlocks.  The
     498:c:type:`pthread_spinlock_t` is defined as a self-contained object, i.e. the
     499user must provide the storage for this synchronization object.
     500
     501.. code-block:: c
     502    :linenos:
     503
     504    #include <assert.h>
     505    #include <pthread.h>
     506
     507    pthread_spinlock_t lock;
     508
     509    void smp_ready_code_with_posix_spinlock( void )
     510    {
     511      int error;
     512
     513      error = pthread_spin_lock( &lock );
     514      assert( error == 0 );
     515      /* Critical section */
     516      error = pthread_spin_unlock( &lock );
     517      assert( error == 0 );
     518    }
     519
     520In contrast to the POSIX spinlock implementations on Linux or FreeBSD, it is not
     521allowed to call blocking operating system services inside the critical section.
     522A recursive lock attempt is a severe usage error resulting in an infinite loop
     523with interrupts disabled.  Nesting of different locks is allowed.  The user
     524must ensure that no deadlock can occur.  As a non-portable feature the locks
     525are zero-initialized, e.g. statically initialized global locks reside in the
     526``.bss`` section and there is no need to call :c:func:`pthread_spin_init`.
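Code that must also run on other POSIX systems cannot rely on the zero-initialization described above. The following sketch shows the portable variant, which initializes the lock explicitly; the names ``counter``, ``increment()`` and ``count_with_spinlock()`` are hypothetical, and the feature test macro is only an assumption about what a strict C compiler mode needs to expose the spinlock API.

```c
/* Assumption: request the POSIX spinlock API in strict C modes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <pthread.h>

#define INCREMENTS 10000

static pthread_spinlock_t lock;
static long counter;

/* Short critical section without blocking calls or recursive locking. */
static void *increment( void *arg )
{
  int error;
  int i;

  (void) arg;

  for ( i = 0; i < INCREMENTS; ++i ) {
    error = pthread_spin_lock( &lock );
    assert( error == 0 );
    ++counter;
    error = pthread_spin_unlock( &lock );
    assert( error == 0 );
  }

  return NULL;
}

/* Increments the counter from two threads and returns the final value. */
long count_with_spinlock( void )
{
  pthread_t threads[ 2 ];
  int error;
  int i;

  /* Portable code initializes the lock instead of relying on .bss. */
  error = pthread_spin_init( &lock, PTHREAD_PROCESS_PRIVATE );
  assert( error == 0 );
  counter = 0;

  for ( i = 0; i < 2; ++i ) {
    error = pthread_create( &threads[ i ], NULL, increment, NULL );
    assert( error == 0 );
  }

  for ( i = 0; i < 2; ++i ) {
    error = pthread_join( threads[ i ], NULL );
    assert( error == 0 );
  }

  error = pthread_spin_destroy( &lock );
  assert( error == 0 );

  return counter;
}
```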
     527
     528Interrupt Service Routines Execute in Parallel With Threads
     529-----------------------------------------------------------
     530
     531On a machine with more than one processor, interrupt service routines (this
     532includes timer service routines installed via :ref:`rtems_timer_fire_after()
     533<rtems_timer_fire_after>`) and threads can execute in parallel.  Interrupt
     534service routines must take this into account and use proper locking mechanisms
     535to protect critical sections from interference by threads (interrupt locks or
     536POSIX spinlocks).  This likely requires code modifications in legacy device
     537drivers.
     538
     539Timers Do Not Stop Immediately
     540------------------------------
     541
     542Timer service routines run in the context of the clock interrupt.  On
     543uni-processor configurations, it is sufficient to disable interrupts and remove
     544a timer from the set of active timers to stop it.  In SMP configurations,
     545however, the timer service routine may already be running and waiting on an SMP lock
     546owned by the thread which is about to stop the timer.  This opens the door to
     547subtle synchronization issues.  During destruction of objects, special care
     548must be taken to ensure that timer service routines cannot access (partly or
     549fully) destroyed objects.
     550
     551False Sharing of Cache Lines Due to Objects Table
     552-------------------------------------------------
     553
     554The Classic API and most POSIX API objects are indirectly accessed via an
     555object identifier.  The user-level functions validate the object identifier and
     556map it to the actual object structure which resides in a global objects table
     557for each object class.  So, unrelated objects are packed together in a table.
     558This may result in false sharing of cache lines.  The effect of false sharing
     559of cache lines can be observed with the `TMFINE 1
     560<https://git.rtems.org/rtems/tree/testsuites/tmtests/tmfine01>`_ test program
     561on a suitable platform, e.g. QorIQ T4240.  High-performance SMP applications
     562need full control of the object storage :cite:`Drepper:2007:Memory`.
     563Therefore, self-contained synchronization objects are now available for RTEMS.
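The layout effect behind false sharing can be illustrated in plain C11. In this sketch, which is not taken from the RTEMS sources, the cache line size of 64 bytes is an assumption that must be checked for the actual target, and the structure and function names are hypothetical. Once the application controls the object storage, it can place each frequently written object on its own cache line.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines, check this for the actual target. */
#define CACHE_LINE_SIZE 64

/* Neighbouring array elements share a cache line. */
struct packed_counter {
  unsigned long value;
};

/* The alignment forces each array element onto its own cache line. */
struct padded_counter {
  alignas( CACHE_LINE_SIZE ) unsigned long value;
};

static_assert(
  sizeof( struct padded_counter ) % CACHE_LINE_SIZE == 0,
  "padded counters must occupy whole cache lines"
);

static struct packed_counter packed[ 4 ];
static struct padded_counter padded[ 4 ];

/* Distance in bytes between two neighbouring packed counters. */
size_t packed_stride( void )
{
  return (size_t) ( (char *) &packed[ 1 ] - (char *) &packed[ 0 ] );
}

/* Distance in bytes between two neighbouring padded counters. */
size_t padded_stride( void )
{
  return (size_t) ( (char *) &padded[ 1 ] - (char *) &padded[ 0 ] );
}
```

If each processor updates one element of ``packed``, the processors compete for the same cache line although they never touch the same counter. With ``padded``, each update stays on a separate line at the cost of additional memory.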
     564
    524565Directives
    525566==========
  • common/refs.bib

    r7b1c63c rb033e39  
    285285  url       = {https://hal.archives-ouvertes.fr/hal-01295194/document},
    286286}
     287@misc{RTEMS:CPU,
     288  title     = {{RTEMS CPU Architecture Supplement}},
     289  url       = {https://docs.rtems.org/branches/master/cpu-supplement.pdf},
     290}