.. COMMENT: Source: rtems-docs/porting/code_tuning.rst @ 0aa6200

.. COMMENT: Last change on this file since 0aa6200 was 6733466, checked in by Amar Takhar <amar@…> on 01/17/16 at 00:08:48: "Split document into separate files by section."

Code Tuning Parameters
######################

Inline Thread_Enable_dispatch
=============================

Should the calls to _Thread_Enable_dispatch be inlined?

If TRUE, then they are inlined.

If FALSE, then a subroutine call is made.

This is an example of the classic trade-off of size versus speed.
Inlining the call (TRUE) typically increases the size of RTEMS while
speeding up the enabling of dispatching.

NOTE:  In general, the _Thread_Dispatch_disable_level will only be 0 or 1
unless you are in an interrupt handler and that interrupt handler invokes
the executive.  When not inlined, the caller invokes
_Thread_Enable_dispatch, which in turn calls _Thread_Dispatch.  If the
enable dispatch is inlined, then one subroutine call is avoided entirely.

.. code:: c

    #define CPU_INLINE_ENABLE_DISPATCH       FALSE
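The difference between the two settings can be pictured with a minimal sketch.  Every name prefixed with sketch_ or SKETCH_ is a hypothetical stand-in written for this illustration and is not the RTEMS source.  The sketch assumes that enabling dispatching decrements the disable level and invokes the dispatcher once the level reaches zero.

```c
#include <assert.h>

/* Hypothetical stand-in for CPU_INLINE_ENABLE_DISPATCH: 1 = TRUE, 0 = FALSE. */
#define SKETCH_INLINE_ENABLE_DISPATCH 1

/* Models _Thread_Dispatch_disable_level (assumed: 0 means dispatching is on). */
static unsigned sketch_disable_level;

/* Counts how often the dispatcher stand-in ran. */
static unsigned sketch_dispatch_calls;

/* Stand-in for _Thread_Dispatch; a real dispatcher would switch threads. */
static void sketch_thread_dispatch(void)
{
    ++sketch_dispatch_calls;
}

static void sketch_disable_dispatch(void)
{
    ++sketch_disable_level;
}

#if SKETCH_INLINE_ENABLE_DISPATCH
/* TRUE: the body is expanded at each call site, so the only subroutine
 * call left is the one to the dispatcher itself. */
static inline void sketch_enable_dispatch(void)
#else
/* FALSE: one shared copy of the body; callers pay an extra subroutine
 * call, but the code is emitted only once. */
static void sketch_enable_dispatch(void)
#endif
{
    assert(sketch_disable_level > 0);
    if (--sketch_disable_level == 0) {
        sketch_thread_dispatch();
    }
}
```

With nested disables, only the outermost enable reaches the dispatcher stand-in.  An optimizing compiler may still inline a static function on its own, so in a real port the out-of-line copy would normally live in a separate translation unit.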

Inline Thread_queue_Enqueue_priority
====================================

Should the body of the search loops in _Thread_queue_Enqueue_priority be
unrolled one time?  If unrolled, each iteration of the loop examines two
"nodes" on the chain being searched.  Otherwise, only one node is examined
per iteration.

If TRUE, then the loops are unrolled.

If FALSE, then the loops are not unrolled.

The primary factor in making this decision is the cost of disabling and
enabling interrupts (_ISR_Flash) versus the cost of the rest of the body of
the loop.  On some CPUs, the flash is more expensive than one iteration of
the loop body.  In this case, it might be desirable to unroll the loop.
Note, however, that on some CPUs this code is the longest interrupt
disable period in RTEMS, and unrolling examines two nodes between flashes
instead of one.  So it is necessary to strike a balance when setting this
parameter.

.. code:: c

    #define CPU_UNROLL_ENQUEUE_PRIORITY      TRUE
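The effect of unrolling on the number of flashes can be sketched as follows.  This is an illustration under stated assumptions, not the RTEMS search code: the node type, the ascending priority order, the flash counter, and all sketch_ names are hypothetical, and the flash stand-in only counts calls instead of touching the interrupt state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chain node, kept in ascending priority order. */
typedef struct sketch_node {
    struct sketch_node *next;
    unsigned            priority;
} sketch_node;

/* Counts calls to the _ISR_Flash stand-in. */
static unsigned sketch_flash_count;

/* Stand-in for _ISR_Flash: a real port would briefly enable and then
 * disable interrupts here. */
static void sketch_isr_flash(void)
{
    ++sketch_flash_count;
}

/* Not unrolled: one node is examined per flash.  Returns the first node
 * whose priority value exceeds the given one, or NULL if none does. */
static sketch_node *sketch_search(sketch_node *head, unsigned priority)
{
    sketch_node *node = head;

    while (node != NULL && node->priority <= priority) {
        node = node->next;
        sketch_isr_flash();
    }
    return node;
}

/* Unrolled once: two nodes are examined per flash. */
static sketch_node *sketch_search_unrolled(sketch_node *head, unsigned priority)
{
    sketch_node *node = head;

    while (node != NULL && node->priority <= priority) {
        node = node->next;
        if (node == NULL || node->priority > priority) {
            break;
        }
        node = node->next;
        sketch_isr_flash();
    }
    return node;
}

/* Both variants must agree on the insertion point. */
static void sketch_check_agreement(sketch_node *head, unsigned priority)
{
    assert(sketch_search(head, priority) ==
           sketch_search_unrolled(head, priority));
}
```

For a five-node chain with priorities 1 through 5 and a new priority of 4, the first variant flashes four times and the unrolled one twice, while both stop at the same node.  A real search also has to cope with the chain changing while interrupts are enabled during a flash, which this sketch ignores.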

Structure Alignment Optimization
================================

The following macro may be defined to the attribute setting used to force
alignment of critical RTEMS structures.  On some processors it may make
sense to have these aligned on tighter boundaries than the minimum
requirements of the compiler in order to have as much of the critical data
area as possible in a cache line.  This ensures that the first access of
an element in that structure fetches most, if not all, of the data
structure and places it in the data cache.  Modern CPUs often have cache
lines of at least 16 bytes and thus a single access implicitly fetches
some surrounding data and places that unreferenced data in the cache.
Taking advantage of this allows RTEMS to essentially prefetch critical
data elements.

The placement of this macro in the declaration of the variables is based
on the syntactic requirements of the GNU C "__attribute__" extension.
For another toolset, the placement of this macro could be incorrect.  For
example, with GNU C, use the following definition of
CPU_STRUCTURE_ALIGNMENT to force structures to a 32-byte boundary.

.. code:: c

    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

For this to pay off, the data must be used heavily enough to stay in the
cache and frequently enough in the executive to justify turning this on.
NOTE:  Because of this, only the Priority Bit Map table currently uses
this feature.

The following illustrates how CPU_STRUCTURE_ALIGNMENT is defined on ports
which require no special alignment for optimized access to data
structures:

.. code:: c

    #define CPU_STRUCTURE_ALIGNMENT
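As a minimal sketch of how such an attribute macro is applied, the following assumes a GNU C compatible compiler.  The table, its element type, and the sketch_ and SKETCH_ names are hypothetical and only stand in for a heavily used structure; they are not taken from the RTEMS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for CPU_STRUCTURE_ALIGNMENT (GNU C syntax). */
#define SKETCH_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

/* Hypothetical hot table.  The macro follows the declarator, which is
 * where GNU C accepts a variable attribute. */
static uint16_t sketch_bit_map[16] SKETCH_STRUCTURE_ALIGNMENT;

/* Returns nonzero if p lies on a multiple of alignment. */
static int sketch_is_aligned(const void *p, uintptr_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t) p % alignment) == 0;
}
```

Sixteen two-byte entries occupy 32 bytes, so on a CPU whose cache lines are 32 bytes the aligned table sits in a single line and the first access brings in all of it.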

Data Alignment Requirements
===========================

Data Element Alignment
----------------------

The CPU_ALIGNMENT macro should be set to the CPU’s worst-case alignment
requirement for data types, expressed in bytes.  This is typically the
alignment requirement for a C double.  This alignment does not take into
account the requirements for the stack.

The following sets the CPU_ALIGNMENT macro to 8, which indicates that there
is a basic C data type for this port which must be aligned to an 8-byte
boundary.

.. code:: c

    #define CPU_ALIGNMENT              8
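One way to sanity-check such a setting against the compiler is sketched below.  It assumes a C11 compiler for _Alignof and static_assert, and the SKETCH_ and sketch_ names are hypothetical; a port is of course free to choose its value by other means.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the port's CPU_ALIGNMENT setting. */
#define SKETCH_CPU_ALIGNMENT 8

/* Compile-time check (C11): the setting must cover a C double. */
static_assert(SKETCH_CPU_ALIGNMENT % _Alignof(double) == 0,
              "SKETCH_CPU_ALIGNMENT does not cover double");

/* Returns nonzero if an address aligned to port_alignment is also
 * suitably aligned for a type that needs type_alignment. */
static int sketch_alignment_covers(size_t port_alignment, size_t type_alignment)
{
    assert(type_alignment != 0);
    return port_alignment % type_alignment == 0;
}
```

The same check can be repeated for any other basic type the port cares about, so that a value that is too small fails at build time.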

Heap Element Alignment
----------------------

The CPU_HEAP_ALIGNMENT macro is set to indicate the byte alignment
requirement for data allocated by the RTEMS Code Heap Handler.  This
alignment requirement may be stricter than the data type alignment
specified by CPU_ALIGNMENT.  It is common for the heap to follow the same
alignment requirement as CPU_ALIGNMENT.  If CPU_ALIGNMENT is strict enough
for the heap, then this should be set to CPU_ALIGNMENT.  This macro is
necessary to ensure that allocated memory is properly aligned for use by
high level language routines.

The following example illustrates how the CPU_HEAP_ALIGNMENT macro is set
when the required alignment for elements from the heap is the same as the
basic CPU alignment requirements.

.. code:: c

    #define CPU_HEAP_ALIGNMENT         CPU_ALIGNMENT

NOTE:  This does not have to be a power of 2.  It does have to be greater
than or equal to CPU_ALIGNMENT.
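Because the alignment need not be a power of 2, rounding a size or address up to it cannot rely on bit masking.  The following is a minimal sketch of a modulo-based round-up together with a compile-time check of the constraint in the note; the SKETCH_ and sketch_ names are hypothetical and this is not the heap handler's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the port's settings. */
#define SKETCH_CPU_ALIGNMENT       8
#define SKETCH_CPU_HEAP_ALIGNMENT  SKETCH_CPU_ALIGNMENT

/* The heap alignment must be at least the basic CPU alignment. */
static_assert(SKETCH_CPU_HEAP_ALIGNMENT >= SKETCH_CPU_ALIGNMENT,
              "heap alignment is weaker than the CPU alignment");

/* Rounds value up to the next multiple of alignment.  Uses the remainder
 * instead of a bit mask, so alignment may be any nonzero value. */
static uintptr_t sketch_align_up(uintptr_t value, uintptr_t alignment)
{
    uintptr_t remainder;

    assert(alignment != 0);
    remainder = value % alignment;
    return remainder == 0 ? value : value + (alignment - remainder);
}
```

For example, a request of 13 bytes rounds up to 16 with an alignment of 8, and to 24 with an alignment of 12.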

Partition Element Alignment
---------------------------

The CPU_PARTITION_ALIGNMENT macro is set to indicate the byte alignment
requirement for memory buffers allocated by the RTEMS Partition Manager
that is part of the Classic API.  This alignment requirement may be
stricter than the data type alignment specified by CPU_ALIGNMENT.  It is
common for the partition to follow the same alignment requirement as
CPU_ALIGNMENT.  If CPU_ALIGNMENT is strict enough for the partition, then
this should be set to CPU_ALIGNMENT.  This macro is necessary to ensure
that allocated memory is properly aligned for use by high level language
routines.

The following example illustrates how the CPU_PARTITION_ALIGNMENT macro is
set when the required alignment for elements from the RTEMS Partition
Manager is the same as the basic CPU alignment requirements.

.. code:: c

    #define CPU_PARTITION_ALIGNMENT    CPU_ALIGNMENT

NOTE:  This does not have to be a power of 2.  It does have to be greater
than or equal to CPU_ALIGNMENT.
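To see why the alignment matters for fixed-size buffers, consider a partition whose buffers are laid out back to back.  The sketch below assumes that layout and uses hypothetical sketch_ names; it does not reproduce the checks the Partition Manager itself performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if every buffer in a back-to-back layout starts on a
 * multiple of alignment: the first buffer must be aligned and the buffer
 * size must be a multiple of the alignment. */
static bool sketch_partition_layout_ok(uintptr_t start, size_t buffer_size,
                                       uintptr_t alignment)
{
    assert(alignment != 0);
    return buffer_size != 0 &&
           start % alignment == 0 &&
           buffer_size % alignment == 0;
}

/* Address of the buffer with the given index in a back-to-back layout. */
static uintptr_t sketch_partition_buffer_address(uintptr_t start,
                                                 size_t buffer_size,
                                                 size_t index)
{
    return start + (uintptr_t) buffer_size * index;
}
```

With an alignment of 8, a buffer size of 24 keeps every buffer aligned, while a buffer size of 20 leaves the second buffer 4 bytes off an 8-byte boundary.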

.. COMMENT: COPYRIGHT (c) 1988-2002.

.. COMMENT: On-Line Applications Research Corporation (OAR).

.. COMMENT: All rights reserved.