| 32 | = Processor Affinity = |
| 33 | |
| 34 | == Status == |
| 35 | |
| 36 | |
| 37 | Thread processor affinity is not supported. |
| 38 | == Future Directions == |
| 39 | |
| 40 | |
| 41 | Implement the missing feature. This consists of two parts |
| 42 | |
| 43 | * the application level API, and |
| 44 | * the scheduler support. |
| 45 | |
| 46 | There is no POSIX API available that covers thread processor affinity. Linux |
| 47 | is the de facto standard system for high performance computing so it is |
| 48 | obvious to choose a Linux compatible API for RTEMS. Linux provides |
| 49 | [http://man7.org/linux/man-pages/man3/CPU_SET.3.html CPU_SET(3)] to manage sets |
| 50 | of processors. Thread processor affinity can be controlled via |
| 51 | [http://man7.org/linux/man-pages/man2/sched_setaffinity.2.html SCHED_SETAFFINITY(2)] |
| 52 | and |
| 53 | [http://man7.org/linux/man-pages/man3/pthread_setaffinity_np.3.html PTHREAD_SETAFFINITY_NP(3)]. |
| 54 | RTEMS should provide these APIs and implement them. It is not possible to use |
| 55 | the Linux files directly since they have a pure GPL license. |
| 56 | |
| 57 | The scheduler support for arbitrary processor affinities is a major challenge. |
| 58 | Schedulability analysis for schedulers with arbitrary processor affinities is a |
| 59 | current research topic <ref name="Gujarati2013>Gujarati2013: Arpan Gujarati, Felipe Cerqueira, and Björn Brandenburg. Schedulability Analysis of the Linux Push and Pull Scheduler with Arbitrary Processor Affinities, Proceedings of the 25th Euromicro Conference on Real-Time Systems (ECRTS 2013), July 2013.</ref>. |
| 60 | |
| 61 | Support for arbitrary processor affinities may lead to massive thread |
| 62 | migrations in case a new thread is scheduled. Suppose we have M processors |
| 63 | 0, ..., M - 1 and M + 1 threads 0, ..., M. Thread i can run on processors i |
| 64 | and i + 1 for i = 0, ..., M - 2. The thread M - 1 runs only on processor |
| 65 | M - 1. The M runs only on processor 0. A thread i has a higher priority than |
| 66 | thread i + 1 for i = 0, ..., M, e.g. thread 0 is the highest priority thread. |
| 67 | Suppose at time T0 threads 0, ..., M - 2 and M are currently scheduled. The |
| 68 | thread i runs on processor i + 1 for i = 0, ..., M - 2 and thread M runs on |
| 69 | processor 0. Now at time T1 thread M - 1 gets ready. It casts out thread M |
| 70 | since this is the lowest priority thread. Since thread M - 1 can run only on |
| 71 | processor M - 1, the threads 0, ..., M - 2 have to migrate from processor i to |
| 72 | processor i - 1 for i = 0, ..., M - 2. So one thread gets ready and the |
| 73 | threads of all but one processor must migrate. The threads forced to migrate |
| 74 | all have a higher priority than the new ready thread M - 1. |
| 75 | {| class="wikitable" |
| 76 | |+ align="bottom" | Example for M = 3 |
| 77 | ! Time |
| 78 | ! Thread |
| 79 | ! Processor 0 |
| 80 | ! Processor 1 |
| 81 | ! Processor 2 |
| 82 | |- |
| 83 | | rowspan="4" | T0 |
| 84 | | 0 |
| 85 | | |
| 86 | | style="background-color:green" | |
| 87 | | |
| 88 | |- |
| 89 | | 1 |
| 90 | | |
| 91 | | |
| 92 | | style="background-color:yellow" | |
| 93 | |- |
| 94 | | 2 |
| 95 | | |
| 96 | | |
| 97 | | |
| 98 | |- |
| 99 | | 3 |
| 100 | | style="background-color:red" | |
| 101 | | |
| 102 | | |
| 103 | |- |
| 104 | | rowspan="4" | T1 |
| 105 | | 0 |
| 106 | | style="background-color:green" | |
| 107 | | |
| 108 | | |
| 109 | |- |
| 110 | | 1 |
| 111 | | |
| 112 | | style="background-color:yellow" | |
| 113 | | |
| 114 | |- |
| 115 | | 2 |
| 116 | | |
| 117 | | |
| 118 | | style="background-color:purple" | |
| 119 | |- |
| 120 | | 3 |
| 121 | | |
| 122 | | |
| 123 | | |
| 124 | |- |
| 125 | |} |
| 126 | |
| 127 | In the example above the Linux Push and Pull scheduler would not find a |
| 128 | processor for thread M - 1 = 2 and thread M = 3 would continue to execute even |
| 129 | though it has a lower priority. |
| 130 | |
| 131 | The general scheduling problem with arbitrary processor affinities is a |
| 132 | matching problem in a bipartite graph. There are two disjoint vertex sets. |
| 133 | The set of ready threads and the set of processors. The edges indicate that a |
| 134 | thread can run on a processor. The scheduler must find a maximum matching |
| 135 | which fulfills in addition other constraints. For example the highest priority |
| 136 | threads should be scheduled first. The performed thread migrations should be |
| 137 | minimal. The augmenting path algorithm needs O(VE) time to find a maximum |
| 138 | matching, here V is the ready thread count plus processor count and E is the |
| 139 | number of edges. It is particularly bad that the time complexity depends on |
| 140 | the ready thread count. It is an open problem if an algorithm exits that is |
| 141 | good enough for real-time applications. |
| 142 | |
| 143 | [wiki:File:rtems-smp-affinity.png File:rtems-smp-affinity.png] |