.. comment SPDX-License-Identifier: CC-BY-SA-4.0

.. COMMENT: COPYRIGHT (c) 1988-2002.
.. COMMENT: On-Line Applications Research Corporation (OAR).
.. COMMENT: All rights reserved.

Code Tuning Parameters
######################

Inline Thread_Enable_dispatch
=============================

Should the calls to ``_Thread_Enable_dispatch`` be inlined?

- If ``TRUE``, then they are inlined.

- If ``FALSE``, then a subroutine call is made.

This is an example of the classic trade-off of size versus speed. Inlining
the call (``TRUE``) typically increases the size of RTEMS while speeding up
the enabling of dispatching.

NOTE: In general, the ``_Thread_Dispatch_disable_level`` will only be 0 or 1
unless you are in an interrupt handler and that interrupt handler invokes
the executive.

When the call is not inlined, the caller invokes ``_Thread_Enable_dispatch``,
which in turn calls ``_Thread_Dispatch``. If the enable dispatch is inlined,
then one subroutine call is avoided entirely.

.. code-block:: c

    #define CPU_INLINE_ENABLE_DISPATCH FALSE

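The effect of this parameter can be sketched as follows. This is a minimal
illustration and not the RTEMS source: the names ``dispatch_disable_level``,
``dispatch_count``, ``thread_dispatch``, and ``thread_enable_dispatch`` are
hypothetical stand-ins, and the real executive does considerably more work
in each routine.

.. code-block:: c

    #ifndef TRUE
    #define TRUE  1
    #define FALSE 0
    #endif

    /* Port-specific choice: FALSE favors size, TRUE favors speed. */
    #define CPU_INLINE_ENABLE_DISPATCH FALSE

    /* Hypothetical stand-ins for executive state (not RTEMS names). */
    static unsigned dispatch_disable_level = 1;
    static unsigned dispatch_count = 0;

    /* Hypothetical dispatcher; a real one would switch threads here. */
    static void thread_dispatch(void)
    {
        ++dispatch_count;
    }

    #if CPU_INLINE_ENABLE_DISPATCH == TRUE
    /* Inlining requested: the compiler may expand the body at each
       call site, leaving only the call to thread_dispatch(). */
    static inline void thread_enable_dispatch(void)
    {
        if (--dispatch_disable_level == 0)
            thread_dispatch();
    }
    #else
    /* Not inlined: each use costs a call to this routine plus the
       call to thread_dispatch(). */
    void thread_enable_dispatch(void)
    {
        if (--dispatch_disable_level == 0)
            thread_dispatch();
    }
    #endif

In this sketch both variants behave identically. Only the number of
subroutine calls per use and the amount of code emitted at each call site
differ.
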
Inline Thread_queue_Enqueue_priority
====================================

Should the body of the search loops in ``_Thread_queue_Enqueue_priority`` be
unrolled one time? If unrolled, each iteration of the loop examines two
"nodes" on the chain being searched. Otherwise, only one node is examined
per iteration.

- If ``TRUE``, then the loops are unrolled.

- If ``FALSE``, then the loops are not unrolled.

The primary factor in making this decision is the cost of disabling and
enabling interrupts (``_ISR_Flash``) versus the cost of the rest of the body
of the loop. On some CPUs, the flash is more expensive than one iteration of
the loop body. In this case, it might be desirable to unroll the loop.
Note that on some CPUs, this code is the longest interrupt disable period
in RTEMS, so it is necessary to strike a balance when setting this
parameter.

.. code-block:: c

    #define CPU_UNROLL_ENQUEUE_PRIORITY TRUE

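The shape of an unrolled search loop can be sketched as follows. This is a
simplified illustration under stated assumptions, not the RTEMS
implementation: the ``node`` type, the ``search`` function, and the
``isr_flash`` counter are hypothetical, and the chain is modeled as a
singly linked list sorted by ascending priority value.

.. code-block:: c

    #include <stddef.h>

    #define CPU_UNROLL_ENQUEUE_PRIORITY 1   /* 1 stands for TRUE here */

    /* Hypothetical chain node, sorted by ascending priority value. */
    typedef struct node {
        struct node *next;
        unsigned     priority;
    } node;

    static unsigned flash_count = 0;

    /* Hypothetical stand-in for _ISR_Flash: count each flash. */
    static void isr_flash(void)
    {
        ++flash_count;
    }

    /* Return the first node whose priority value is greater than
       `priority`, or NULL if there is no such node. */
    static node *search(node *head, unsigned priority)
    {
        node *n = head;

        while (n != NULL) {
            if (n->priority > priority)
                break;
    #if CPU_UNROLL_ENQUEUE_PRIORITY
            /* Unrolled: examine a second node before the flash. */
            n = n->next;
            if (n == NULL || n->priority > priority)
                break;
    #endif
            isr_flash();
            n = n->next;
        }
        return n;
    }

With unrolling, the sketch flashes interrupts once per two nodes examined
instead of once per node. That reduces the total flash overhead but
lengthens the stretch between flashes, which is the balance described
above.
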
Structure Alignment Optimization
================================

The following macro may be defined to the attribute setting used to force
alignment of critical RTEMS structures. On some processors it may make
sense to have these aligned on stricter boundaries than the minimum
requirements of the compiler in order to have as much of the critical data
area as possible in a cache line. This ensures that the first access of
an element in that structure fetches most, if not all, of the data
structure and places it in the data cache. Modern CPUs often have cache
lines of at least 16 bytes and thus a single access implicitly fetches
some surrounding data and places that unreferenced data in the cache.
Taking advantage of this allows RTEMS to essentially prefetch critical
data elements.

The placement of this macro in the declaration of the variables is based
on the syntactic requirements of the GNU C ``__attribute__`` extension.
For another toolset, the placement of this macro could be incorrect. For
example, with GNU C, use the following definition of
``CPU_STRUCTURE_ALIGNMENT`` to force structures to a 32-byte boundary.

.. code-block:: c

    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

To benefit from this, the data must be used heavily enough in the executive
that it stays in the cache, which is what justifies turning this on. NOTE:
Because of this, only the Priority Bit Map table currently uses this
feature.

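As an illustration of where the macro sits in a declaration, the sketch
below applies the 32-byte definition to a table. It assumes GNU C or a
compatible compiler; the ``bit_map_table`` variable and the
``table_is_aligned`` helper are hypothetical and only stand in for a
heavily used executive structure.

.. code-block:: c

    #include <stdint.h>

    /* GNU C specific; assumes GCC or a compatible compiler. */
    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

    /* Hypothetical 32-byte table; the macro follows the declarator,
       as the GNU C attribute syntax allows. */
    static uint16_t bit_map_table[16] CPU_STRUCTURE_ALIGNMENT;

    /* Return nonzero if the table starts on a 32-byte boundary. */
    static int table_is_aligned(void)
    {
        return ((uintptr_t) bit_map_table % 32) == 0;
    }

Because the table in this sketch is exactly 32 bytes long and starts on a
32-byte boundary, it fits within a single cache line on a CPU whose cache
lines are 32 bytes or longer.
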
The following illustrates how ``CPU_STRUCTURE_ALIGNMENT`` is defined on
ports which require no special alignment for optimized access to data
structures:

.. code-block:: c

    #define CPU_STRUCTURE_ALIGNMENT

Data Alignment Requirements
===========================

Data Element Alignment
----------------------

The ``CPU_ALIGNMENT`` macro should be set to the CPU's strictest alignment
requirement, in bytes, for data types. This is typically the alignment
requirement for a C double. This alignment does not take into account the
requirements for the stack.

The following sets the ``CPU_ALIGNMENT`` macro to 8, which indicates that
there is a basic C data type for this port which must be aligned to an
8-byte boundary.

.. code-block:: c

    #define CPU_ALIGNMENT 8

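One way to sanity check such a setting is a compile-time assertion. The
sketch below assumes a C11 compiler and, as suggested above, treats
``double`` as the most strictly aligned basic type; the ``is_cpu_aligned``
helper is hypothetical and not part of RTEMS.

.. code-block:: c

    #include <assert.h>     /* static_assert (C11) */
    #include <stdalign.h>   /* alignof, alignas (C11) */
    #include <stdint.h>

    #define CPU_ALIGNMENT 8

    /* Assumption: double has the strictest alignment on this port. */
    static_assert(CPU_ALIGNMENT >= alignof(double),
                  "CPU_ALIGNMENT is too small for double");

    /* Hypothetical helper: nonzero if p lies on a CPU_ALIGNMENT
       boundary. */
    static int is_cpu_aligned(const void *p)
    {
        return ((uintptr_t) p % CPU_ALIGNMENT) == 0;
    }
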
Heap Element Alignment
----------------------

The ``CPU_HEAP_ALIGNMENT`` macro is set to indicate the byte alignment
requirement for data allocated by the RTEMS Core Heap Handler. This
alignment requirement may be stricter than the data type alignment
specified by ``CPU_ALIGNMENT``. It is common for the heap to follow the
same alignment requirement as ``CPU_ALIGNMENT``. If ``CPU_ALIGNMENT`` is
strict enough for the heap, then this should be set to ``CPU_ALIGNMENT``.
This macro is necessary to ensure that allocated memory is properly aligned
for use by high-level language routines.

The following example illustrates how the ``CPU_HEAP_ALIGNMENT`` macro is
set when the required alignment for elements from the heap is the same as
the basic CPU alignment requirements.

.. code-block:: c

    #define CPU_HEAP_ALIGNMENT CPU_ALIGNMENT

NOTE: This does not have to be a power of 2. It does have to be greater
than or equal to ``CPU_ALIGNMENT``.

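Because the value is not required to be a power of 2, code that rounds
sizes up to this alignment cannot rely on bit masking. The following is a
minimal sketch of a rounding helper that uses the remainder instead; the
``heap_align_up`` function is hypothetical and is not the Heap Handler's
own code.

.. code-block:: c

    #include <assert.h>   /* static_assert (C11) */
    #include <stddef.h>

    #define CPU_ALIGNMENT      8
    #define CPU_HEAP_ALIGNMENT CPU_ALIGNMENT

    /* The heap alignment may be stricter, but never weaker. */
    static_assert(CPU_HEAP_ALIGNMENT >= CPU_ALIGNMENT,
                  "CPU_HEAP_ALIGNMENT must be >= CPU_ALIGNMENT");

    /* Hypothetical helper: round size up to the next multiple of
       CPU_HEAP_ALIGNMENT. The remainder is used rather than a bit
       mask so that values that are not a power of 2 also work. */
    static size_t heap_align_up(size_t size)
    {
        size_t remainder = size % CPU_HEAP_ALIGNMENT;

        if (remainder == 0)
            return size;
        return size + (CPU_HEAP_ALIGNMENT - remainder);
    }
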
Partition Element Alignment
---------------------------

The ``CPU_PARTITION_ALIGNMENT`` macro is set to indicate the byte alignment
requirement for memory buffers allocated by the RTEMS Partition Manager
that is part of the Classic API. This alignment requirement may be
stricter than the data type alignment specified by ``CPU_ALIGNMENT``. It is
common for the partition to follow the same alignment requirement as
``CPU_ALIGNMENT``. If ``CPU_ALIGNMENT`` is strict enough for the partition,
then this should be set to ``CPU_ALIGNMENT``. This macro is necessary to
ensure that allocated memory is properly aligned for use by high-level
language routines.

The following example illustrates how the ``CPU_PARTITION_ALIGNMENT`` macro
is set when the required alignment for elements from the RTEMS Partition
Manager is the same as the basic CPU alignment requirements.

.. code-block:: c

    #define CPU_PARTITION_ALIGNMENT CPU_ALIGNMENT

NOTE: This does not have to be a power of 2. It does have to be greater
than or equal to ``CPU_ALIGNMENT``.

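For fixed-size buffers carved from one contiguous area, every buffer is
aligned when both the starting address and the buffer size are multiples
of the alignment. The sketch below checks those two conditions; the
``partition_layout_is_aligned`` function is hypothetical and does not
describe the checks that the Partition Manager itself performs.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    #define CPU_ALIGNMENT           8
    #define CPU_PARTITION_ALIGNMENT CPU_ALIGNMENT

    /* Hypothetical check: nonzero if buffers of buffer_size bytes
       carved consecutively from start all begin on a
       CPU_PARTITION_ALIGNMENT boundary. */
    static int partition_layout_is_aligned(const void *start,
                                           size_t buffer_size)
    {
        return ((uintptr_t) start % CPU_PARTITION_ALIGNMENT) == 0
            && (buffer_size % CPU_PARTITION_ALIGNMENT) == 0;
    }
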