Code Tuning Parameters
######################

Inline Thread_Enable_dispatch
=============================

Should the calls to _Thread_Enable_dispatch be inlined?

If TRUE, then they are inlined.

If FALSE, then a subroutine call is made.

This is an example of the classic trade-off of size versus speed. Inlining
the call (TRUE) typically increases the size of RTEMS while speeding up the
enabling of dispatching.

[NOTE: In general, the _Thread_Dispatch_disable_level will only be 0 or 1
unless you are in an interrupt handler and that interrupt handler invokes
the executive.] When not inlined, the caller invokes _Thread_Enable_dispatch,
which in turn calls _Thread_Dispatch. If the enable dispatch is inlined,
then one subroutine call is avoided entirely.

.. code:: c

    #define CPU_INLINE_ENABLE_DISPATCH FALSE

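
As a rough illustration of the trade-off, the following sketch shows how a
TRUE/FALSE parameter of this kind could select between an inlined body and a
subroutine call. The names demo_dispatch_disable_level, demo_dispatch_count,
demo_thread_dispatch, and demo_thread_enable_dispatch are hypothetical
stand-ins and are not the actual RTEMS implementation.

.. code:: c

    #include <assert.h>

    #define TRUE  1
    #define FALSE 0

    /* Hypothetical port setting; mirrors the parameter described above. */
    #define CPU_INLINE_ENABLE_DISPATCH FALSE

    /* Hypothetical stand-ins for the executive's dispatching state. */
    static unsigned demo_dispatch_disable_level = 1;
    static unsigned demo_dispatch_count = 0;

    /* Stand-in for the routine that performs the actual dispatch. */
    static void demo_thread_dispatch(void)
    {
        ++demo_dispatch_count;
    }

    #if CPU_INLINE_ENABLE_DISPATCH == TRUE
    /* Inlined: the body is expanded at each call site (larger, faster). */
    static inline void demo_thread_enable_dispatch(void)
    #else
    /* Not inlined: each call site makes one extra subroutine call. */
    static void demo_thread_enable_dispatch(void)
    #endif
    {
        /* Dispatching must currently be disabled. */
        assert(demo_dispatch_disable_level > 0);

        /* Dispatch only when the disable level drops back to zero. */
        if (--demo_dispatch_disable_level == 0) {
            demo_thread_dispatch();
        }
    }

In this sketch both variants share one body. A real port would more likely
keep the non-inlined version in a separate source file so that the compiler
cannot inline it on its own.
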
Inline Thread_queue_Enqueue_priority
====================================

Should the body of the search loops in _Thread_queue_Enqueue_priority be
unrolled one time? If unrolled, each iteration of the loop examines two
"nodes" on the chain being searched. Otherwise, only one node is examined
per iteration.

If TRUE, then the loops are unrolled.

If FALSE, then the loops are not unrolled.

The primary factor in making this decision is the cost of disabling and
enabling interrupts (_ISR_Flash) versus the cost of the rest of the body of
the loop. On some CPUs, the flash is more expensive than one iteration of
the loop body. In this case, it might be desirable to unroll the loop.
It is important to note that on some CPUs, this code is the longest
interrupt disable period in RTEMS, so it is necessary to strike a balance
when setting this parameter.

.. code:: c

    #define CPU_UNROLL_ENQUEUE_PRIORITY TRUE

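
The effect of unrolling can be sketched as follows. This is a simplified
illustration only: struct demo_node, demo_isr_flash, demo_flash_count, and
demo_search are hypothetical names, and the real search loop in RTEMS does
more work per node than this one.

.. code:: c

    #include <stddef.h>

    #define TRUE  1
    #define FALSE 0

    #define CPU_UNROLL_ENQUEUE_PRIORITY TRUE

    /* Hypothetical node on a priority-ordered chain. */
    struct demo_node {
        struct demo_node *next;
        unsigned          priority;
    };

    /* Counts flashes so the cost of each variant can be compared. */
    static unsigned demo_flash_count = 0;

    /* Stand-in for _ISR_Flash: briefly enable, then disable interrupts. */
    static void demo_isr_flash(void)
    {
        ++demo_flash_count;
    }

    /* Return the first node whose priority is >= priority, or NULL. */
    static struct demo_node *demo_search(struct demo_node *head,
                                         unsigned priority)
    {
        struct demo_node *node = head;

        while (node != NULL) {
            if (node->priority >= priority) {
                break;
            }
            node = node->next;
    #if CPU_UNROLL_ENQUEUE_PRIORITY == TRUE
            /* Unrolled once: examine a second node before the flash. */
            if (node == NULL || node->priority >= priority) {
                break;
            }
            node = node->next;
    #endif
            /* One flash per iteration bounds the interrupt disable period. */
            demo_isr_flash();
        }

        return node;
    }

For a chain of five nodes with priorities 1 through 5, searching for priority
5 costs two flashes with the loop unrolled and four without. The price is
that interrupts stay disabled for two node examinations at a time instead of
one.
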
Structure Alignment Optimization
================================

The following macro may be defined to the attribute setting used to force
alignment of critical RTEMS structures. On some processors it may make
sense to have these aligned on tighter boundaries than the minimum
requirements of the compiler in order to have as much of the critical data
area as possible in a cache line. This ensures that the first access of
an element in that structure fetches most, if not all, of the data
structure and places it in the data cache. Modern CPUs often have cache
lines of at least 16 bytes and thus a single access implicitly fetches
some surrounding data and places that unreferenced data in the cache.
Taking advantage of this allows RTEMS to essentially prefetch critical
data elements.

The placement of this macro in the declaration of the variables is based
on the syntactic requirements of the GNU C "__attribute__" extension.
For another toolset, the placement of this macro could be incorrect. For
example, with GNU C, use the following definition of
CPU_STRUCTURE_ALIGNMENT to force a structure onto a 32 byte boundary:

.. code:: c

    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

To benefit from using this, the data must be heavily used so that it stays
in the cache, and it must be used frequently enough in the executive to
justify turning this on. NOTE: Because of this, only the Priority Bit Map
table currently uses this feature.

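
With GNU C, a declaration that uses the macro might look like the following
sketch. The table demo_bit_map and the helper demo_bit_map_is_aligned are
hypothetical and only illustrate where the macro is placed. The example
relies on the GNU C "aligned" attribute and is not portable to toolsets
without it.

.. code:: c

    #include <stdint.h>

    /* GNU C definition: force a 32 byte boundary. */
    #define CPU_STRUCTURE_ALIGNMENT __attribute__ ((aligned (32)))

    /* Hypothetical heavily used table; the macro follows the declarator. */
    static uint16_t demo_bit_map[16] CPU_STRUCTURE_ALIGNMENT;

    /* Return nonzero if the table starts on a 32 byte boundary. */
    static int demo_bit_map_is_aligned(void)
    {
        return ((uintptr_t) demo_bit_map % 32) == 0;
    }

Here the table occupies 32 bytes, so on a CPU with 32 byte cache lines the
first access to any element would bring the whole table into the cache.
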
The following illustrates how the CPU_STRUCTURE_ALIGNMENT is defined on
ports which require no special alignment for optimized access to data
structures:

.. code:: c

    #define CPU_STRUCTURE_ALIGNMENT

Data Alignment Requirements
===========================

Data Element Alignment
----------------------

The CPU_ALIGNMENT macro should be set to the CPU's worst-case (strictest)
alignment requirement for data types, expressed in bytes. This is typically
the alignment requirement for a C double. This alignment does not take into
account the requirements for the stack.

The following sets the CPU_ALIGNMENT macro to 8, which indicates that there
is a basic C data type for this port which must be aligned to an 8 byte
boundary.

.. code:: c

    #define CPU_ALIGNMENT 8

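
One way to sanity check such a setting is sketched below, assuming a C11
compiler. The compile-time check and the helper demo_is_aligned are
hypothetical additions for illustration and are not part of RTEMS.

.. code:: c

    #include <assert.h>
    #include <stdint.h>

    #define CPU_ALIGNMENT 8

    /* Compile-time check (C11): the setting must cover a C double. */
    static_assert(CPU_ALIGNMENT >= _Alignof(double),
                  "CPU_ALIGNMENT is too small for a double");

    /* Return nonzero if address meets the basic alignment requirement. */
    static int demo_is_aligned(const void *address)
    {
        return ((uintptr_t) address % CPU_ALIGNMENT) == 0;
    }

If a port's double needed stricter alignment than the configured value, the
build would fail instead of producing misaligned accesses at run time.
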
Heap Element Alignment
----------------------

The CPU_HEAP_ALIGNMENT macro is set to indicate the byte alignment
requirement for data allocated by the RTEMS Core Heap Handler. This
alignment requirement may be stricter than the data type alignment
specified by CPU_ALIGNMENT. It is common for the heap to follow
the same alignment requirement as CPU_ALIGNMENT. If the CPU_ALIGNMENT is
strict enough for the heap, then this should be set to CPU_ALIGNMENT. This
macro is necessary to ensure that allocated memory is properly aligned for
use by high level language routines.

The following example illustrates how the CPU_HEAP_ALIGNMENT macro is set
when the required alignment for elements from the heap is the same as the
basic CPU alignment requirements.

.. code:: c

    #define CPU_HEAP_ALIGNMENT CPU_ALIGNMENT

NOTE: This does not have to be a power of 2. It does have to be greater
than or equal to CPU_ALIGNMENT.

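
Because the alignment need not be a power of 2, code that rounds sizes up to
it cannot rely on bit masking. The following is a minimal sketch of such
rounding, assuming a C11 compiler. The helper demo_heap_align_up is
hypothetical and does not show how the heap handler itself is implemented.

.. code:: c

    #include <assert.h>
    #include <stddef.h>

    #define CPU_ALIGNMENT      8
    #define CPU_HEAP_ALIGNMENT CPU_ALIGNMENT

    /* Compile-time check (C11) of the constraint stated in the NOTE. */
    static_assert(CPU_HEAP_ALIGNMENT >= CPU_ALIGNMENT,
                  "CPU_HEAP_ALIGNMENT must not be below CPU_ALIGNMENT");

    /*
     * Round size up to the next multiple of CPU_HEAP_ALIGNMENT. The
     * remainder is used instead of a bit mask, so the alignment does
     * not have to be a power of 2.
     */
    static size_t demo_heap_align_up(size_t size)
    {
        size_t remainder = size % CPU_HEAP_ALIGNMENT;

        return remainder == 0 ? size
                              : size + (CPU_HEAP_ALIGNMENT - remainder);
    }
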
Partition Element Alignment
---------------------------

The CPU_PARTITION_ALIGNMENT macro is set to indicate the byte alignment
requirement for memory buffers allocated by the RTEMS Partition Manager
that is part of the Classic API. This alignment requirement may be
stricter than the data type alignment specified by
CPU_ALIGNMENT. It is common for the partition to follow the same
alignment requirement as CPU_ALIGNMENT. If the CPU_ALIGNMENT is strict
enough for the partition, then this should be set to CPU_ALIGNMENT. This
macro is necessary to ensure that allocated memory is properly aligned for
use by high level language routines.

The following example illustrates how the CPU_PARTITION_ALIGNMENT macro is
set when the required alignment for elements from the RTEMS Partition
Manager is the same as the basic CPU alignment requirements.

.. code:: c

    #define CPU_PARTITION_ALIGNMENT CPU_ALIGNMENT

NOTE: This does not have to be a power of 2. It does have to be greater
than or equal to CPU_ALIGNMENT.

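
A partition hands out fixed-size buffers laid end to end, so every buffer
stays aligned only if the buffer size is a multiple of the alignment. The
sketch below illustrates that reasoning. The helpers
demo_partition_buffer_size_is_valid and demo_partition_buffer_offset are
hypothetical and are not the Partition Manager's actual checks.

.. code:: c

    #include <assert.h>
    #include <stddef.h>

    #define CPU_ALIGNMENT           8
    #define CPU_PARTITION_ALIGNMENT CPU_ALIGNMENT

    /* Return nonzero if buffers of this size keep every buffer aligned. */
    static int demo_partition_buffer_size_is_valid(size_t buffer_size)
    {
        return buffer_size != 0
            && (buffer_size % CPU_PARTITION_ALIGNMENT) == 0;
    }

    /* Byte offset of buffer number index from the partition start. */
    static size_t demo_partition_buffer_offset(size_t index,
                                               size_t buffer_size)
    {
        assert(demo_partition_buffer_size_is_valid(buffer_size));
        return index * buffer_size;
    }

Provided the partition's starting address is itself aligned, each offset
computed this way is a multiple of CPU_PARTITION_ALIGNMENT, so every buffer
begins on a suitable boundary.
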
.. COMMENT: COPYRIGHT (c) 1988-2002.

.. COMMENT: On-Line Applications Research Corporation (OAR).

.. COMMENT: All rights reserved.