source: rtems-docs/porting/code_tuning.rst @ 7497f5e

Last change on this file since 7497f5e was 7497f5e, checked in by Joel Sherrill <joel@…>, on 10/28/16 at 20:57:11

porting: Review and tidy up multiple formatting issues.

.. comment SPDX-License-Identifier: CC-BY-SA-4.0

.. COMMENT: COPYRIGHT (c) 1988-2002.
.. COMMENT: On-Line Applications Research Corporation (OAR).
.. COMMENT: All rights reserved.

Code Tuning Parameters
######################

Inline Thread_Enable_dispatch
=============================

Should the calls to _Thread_Enable_dispatch be inlined?

- If ``TRUE``, then they are inlined.

- If ``FALSE``, then a subroutine call is made.

This is an example of the classic trade-off of size versus speed.
Inlining the call (``TRUE``) typically increases the size of RTEMS while
speeding up the enabling of dispatching.

NOTE: In general, the _Thread_Dispatch_disable_level will only be 0 or 1
unless you are in an interrupt handler and that interrupt handler invokes
the executive.  When not inlined, a caller invokes _Thread_Enable_dispatch,
which in turn calls _Thread_Dispatch.  If the enable dispatch is inlined,
then one subroutine call is avoided entirely.

.. code-block:: c

    #define CPU_INLINE_ENABLE_DISPATCH       FALSE

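The trade-off described above can be sketched as follows.  This is a
simplified illustration under stated assumptions, not the RTEMS source:
every ``sketch_`` name, the counter, and the stub dispatcher are
hypothetical stand-ins for the executive's own state and routines.

.. code-block:: c

    #ifndef TRUE
    #define TRUE  1
    #define FALSE 0
    #endif

    /* Port-specific choice; FALSE selects the out-of-line version. */
    #define CPU_INLINE_ENABLE_DISPATCH FALSE

    /* Hypothetical stand-ins for the dispatch disable level and dispatcher. */
    static unsigned sketch_dispatch_disable_level = 0;
    static unsigned sketch_dispatch_count = 0;

    static void sketch_thread_dispatch(void)
    {
      ++sketch_dispatch_count;
    }

    #if CPU_INLINE_ENABLE_DISPATCH == TRUE
    /* Inlined: the decrement and test are expanded at every call site. */
    static inline void sketch_enable_dispatch(void)
    {
      if (--sketch_dispatch_disable_level == 0)
        sketch_thread_dispatch();
    }
    #else
    /* Not inlined: one copy of the code, one extra subroutine call per use. */
    void sketch_enable_dispatch(void)
    {
      if (--sketch_dispatch_disable_level == 0)
        sketch_thread_dispatch();
    }
    #endif

With the disable level at 2, the first call to ``sketch_enable_dispatch``
only lowers the level; the second call lowers it to 0 and runs the
dispatcher.
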
Inline Thread_queue_Enqueue_priority
====================================

Should the body of the search loops in _Thread_queue_Enqueue_priority be
unrolled one time?  If unrolled, each iteration of the loop examines two
"nodes" on the chain being searched.  Otherwise, only one node is examined
per iteration.

- If ``TRUE``, then the loops are unrolled.

- If ``FALSE``, then the loops are not unrolled.

The primary factor in making this decision is the cost of disabling and
enabling interrupts (_ISR_Flash) versus the cost of the rest of the body of
the loop.  On some CPUs, the flash is more expensive than one iteration of
the loop body.  In this case, it might be desirable to unroll the loop.
Note, however, that on some CPUs this code is the longest interrupt
disable period in RTEMS, and unrolling lengthens it.  So it is necessary
to strike a balance when setting this parameter.

.. code-block:: c

    #define CPU_UNROLL_ENQUEUE_PRIORITY      TRUE

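The effect of unrolling can be illustrated with a minimal sketch.  It is
not the RTEMS implementation: the node type, the ``sketch_`` functions, and
the flash counter are hypothetical, and a run-time ``unroll`` flag stands
in for the compile-time macro so that both variants can be compared.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical chain node; RTEMS uses its own chain and thread types. */
    typedef struct sketch_node {
      struct sketch_node *next;
      unsigned            priority;
    } sketch_node;

    /* Counts how often interrupts would be briefly re-enabled. */
    static unsigned sketch_flash_count = 0;

    /* Hypothetical stand-in for _ISR_Flash. */
    static void sketch_isr_flash(void)
    {
      ++sketch_flash_count;
    }

    /* Link count nodes into a chain with priorities 1, 2, ..., count. */
    static void sketch_build_chain(sketch_node *nodes, size_t count)
    {
      for (size_t i = 0; i < count; ++i) {
        nodes[i].priority = (unsigned) (i + 1);
        nodes[i].next = (i + 1 < count) ? &nodes[i + 1] : NULL;
      }
    }

    /*
     * Return the last node whose priority is <= the given priority, or
     * NULL if there is none.  A nonzero unroll examines up to two nodes
     * between flashes instead of one.
     */
    static sketch_node *sketch_search(
      sketch_node *head,
      unsigned     priority,
      int          unroll
    )
    {
      sketch_node *prev = NULL;
      sketch_node *node = head;

      while (node != NULL && node->priority <= priority) {
        prev = node;
        node = node->next;
        if (unroll && node != NULL && node->priority <= priority) {
          prev = node;
          node = node->next;
        }
        sketch_isr_flash();
      }
      return prev;
    }

In this sketch the unrolled search performs about half as many flashes for
the same chain, at the price of examining two nodes with interrupts
disabled between flashes.
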
Structure Alignment Optimization
================================

The following macro may be defined to the attribute setting used to force
alignment of critical RTEMS structures.  On some processors it may make
sense to have these aligned on tighter boundaries than the minimum
requirements of the compiler in order to have as much of the critical data
area as possible in a cache line.  This ensures that the first access of
an element in that structure fetches most, if not all, of the data
structure and places it in the data cache.  Modern CPUs often have cache
lines of at least 16 bytes and thus a single access implicitly fetches
some surrounding data and places that unreferenced data in the cache.
Taking advantage of this allows RTEMS to essentially prefetch critical
data elements.

The placement of this macro in the declaration of the variables is based
on the syntactic requirements of the GNU C "__attribute__" extension.
For another toolset, the placement of this macro could be incorrect.  For
example, with GNU C, use the following definition of
CPU_STRUCTURE_ALIGNMENT to force a structure to a 32 byte boundary.

.. code-block:: c

    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

To benefit from using this, the data must be heavily used so it will stay
in the cache and used frequently enough in the executive to justify
turning this on.  NOTE:  Because of this, only the Priority Bit Map table
currently uses this feature.

The following illustrates how the CPU_STRUCTURE_ALIGNMENT is defined on
ports which require no special alignment for optimized access to data
structures:

.. code-block:: c

    #define CPU_STRUCTURE_ALIGNMENT

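As an illustration of the GNU C placement mentioned above, the sketch
below applies the 32 byte definition to a variable declaration and checks
the resulting address.  It assumes a GNU C compatible compiler; the table
name and the helper function are hypothetical and do not come from RTEMS.

.. code-block:: c

    #include <stdint.h>

    /* The 32 byte GNU C definition from the text above. */
    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

    /*
     * Hypothetical heavily used table.  With GNU C the attribute follows
     * the declarator; 16 entries of 16 bits fill exactly 32 bytes.
     */
    static uint16_t sketch_bit_map[16] CPU_STRUCTURE_ALIGNMENT;

    /* Return nonzero if the table starts on a 32 byte boundary. */
    static int sketch_bit_map_is_aligned(void)
    {
      return ((uintptr_t) sketch_bit_map % 32) == 0;
    }

On a port that defines the macro as empty, the same declaration compiles
unchanged and the table simply receives the compiler's default alignment.
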
Data Alignment Requirements
===========================

Data Element Alignment
----------------------

The CPU_ALIGNMENT macro should be set to the CPU's strictest alignment
requirement for data types, expressed in bytes.  This is typically the
alignment requirement for a C double.  This alignment does not take into
account the requirements for the stack.

The following sets the CPU_ALIGNMENT macro to 8, which indicates that there
is a basic C data type for this port which must be aligned to an 8 byte
boundary.

.. code-block:: c

    #define CPU_ALIGNMENT              8

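One way to sanity-check a chosen value is sketched below.  It assumes a
C11 compiler (for ``_Alignof`` and ``_Static_assert``); the helper function
is hypothetical and is not part of the port interface.

.. code-block:: c

    #include <stdint.h>

    #define CPU_ALIGNMENT 8

    /* Compile-time checks: these basic types must fit the chosen value. */
    _Static_assert(
      _Alignof(double) <= CPU_ALIGNMENT,
      "double needs a stricter CPU_ALIGNMENT"
    );
    _Static_assert(
      _Alignof(long long) <= CPU_ALIGNMENT,
      "long long needs a stricter CPU_ALIGNMENT"
    );

    /* Return nonzero if the address satisfies CPU_ALIGNMENT. */
    static int sketch_is_cpu_aligned(uintptr_t address)
    {
      return (address % CPU_ALIGNMENT) == 0;
    }

If a basic type on the target required more than 8 bytes, the build would
stop at the corresponding assertion instead of failing at run time.
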
Heap Element Alignment
----------------------

The CPU_HEAP_ALIGNMENT macro is set to indicate the byte alignment
requirement for data allocated by the RTEMS Core Heap Handler.  This
alignment requirement may be stricter than that for the data types
alignment specified by CPU_ALIGNMENT.  It is common for the heap to follow
the same alignment requirement as CPU_ALIGNMENT.  If the CPU_ALIGNMENT is
strict enough for the heap, then this should be set to CPU_ALIGNMENT.  This
macro is necessary to ensure that allocated memory is properly aligned for
use by high level language routines.

The following example illustrates how the CPU_HEAP_ALIGNMENT macro is set
when the required alignment for elements from the heap is the same as the
basic CPU alignment requirements.

.. code-block:: c

    #define CPU_HEAP_ALIGNMENT         CPU_ALIGNMENT

NOTE:  This does not have to be a power of 2.  It does have to be greater
than or equal to CPU_ALIGNMENT.

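The two constraints in the note can be sketched as follows.  This is an
illustration only, not the heap handler's code: the rounding helper is
hypothetical, and it uses the remainder operator rather than bit masking
so that it stays correct when the alignment is not a power of 2.

.. code-block:: c

    #include <stddef.h>

    #define CPU_ALIGNMENT      8
    #define CPU_HEAP_ALIGNMENT CPU_ALIGNMENT

    /* The heap alignment may be stricter than CPU_ALIGNMENT, never looser. */
    _Static_assert(
      CPU_HEAP_ALIGNMENT >= CPU_ALIGNMENT,
      "CPU_HEAP_ALIGNMENT must be >= CPU_ALIGNMENT"
    );

    /* Round a request size up to the next multiple of CPU_HEAP_ALIGNMENT. */
    static size_t sketch_heap_round_up(size_t size)
    {
      size_t remainder = size % CPU_HEAP_ALIGNMENT;

      return remainder == 0 ? size : size + (CPU_HEAP_ALIGNMENT - remainder);
    }
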
Partition Element Alignment
---------------------------

The CPU_PARTITION_ALIGNMENT macro is set to indicate the byte alignment
requirement for memory buffers allocated by the RTEMS Partition Manager
that is part of the Classic API.  This alignment requirement may be
stricter than that for the data types alignment specified by
CPU_ALIGNMENT.  It is common for the partition to follow the same
alignment requirement as CPU_ALIGNMENT.  If the CPU_ALIGNMENT is strict
enough for the partition, then this should be set to CPU_ALIGNMENT.  This
macro is necessary to ensure that allocated memory is properly aligned for
use by high level language routines.

The following example illustrates how the CPU_PARTITION_ALIGNMENT macro is
set when the required alignment for elements from the RTEMS Partition
Manager is the same as the basic CPU alignment requirements.

.. code-block:: c

    #define CPU_PARTITION_ALIGNMENT    CPU_ALIGNMENT

NOTE:  This does not have to be a power of 2.  It does have to be greater
than or equal to CPU_ALIGNMENT.
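
To see why the alignment matters for partition buffers, consider the
sketch below.  It assumes, purely for illustration, that buffers are
carved back to back out of an area whose start is already aligned; the
text above does not describe how the Partition Manager lays out its
buffers, and both helper functions are hypothetical.

.. code-block:: c

    #include <stddef.h>

    #define CPU_ALIGNMENT           8
    #define CPU_PARTITION_ALIGNMENT CPU_ALIGNMENT

    /*
     * Return nonzero if back-to-back buffers of this size all stay
     * aligned, which holds when the size is a nonzero multiple of
     * CPU_PARTITION_ALIGNMENT.
     */
    static int sketch_buffer_size_is_valid(size_t buffer_size)
    {
      return buffer_size != 0
        && (buffer_size % CPU_PARTITION_ALIGNMENT) == 0;
    }

    /* Byte offset of the buffer with the given index within the area. */
    static size_t sketch_buffer_offset(size_t index, size_t buffer_size)
    {
      return index * buffer_size;
    }

Under these assumptions a 16 byte buffer size keeps every buffer on an
8 byte boundary, while a 12 byte buffer size would misalign every second
buffer.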