
Version 19 (modified by André Marques, on 05/09/14 at 15:29:09)


Atomic Operations

Mentors:

  • Chris Johns

Students:

  • Deng Hengyi

Status:

Introduction:

Different Architecture atomic support

Candidate APIs / Implementations

Goal: To provide RTEMS with core services to solve synchronization problems on multicore platforms.

Requirements:

  • Advanced C and assembly language programming
  • Familiarity with RTEMS kernel software architecture
  • Understanding of concurrency problems and solutions

Resources:

Acknowledgements

References of atomic implementation

ConcurrencyKit

A candidate for this project is ConcurrencyKit (ck). Among other targets, ck works on the 32-bit x86 architecture, so a first step would be to try compiling ck with RTEMS for pc386 and run a sample application under QEMU. Once a sample application is working, the next step would be to get as much of the ck regression suite running as possible. Beyond that there are many possible directions. Potential students should ask on the mailing list and work with potential mentors to design a project that suits their abilities and goals.

FreeBSD Atomic

The FreeBSD Atomic Operations API defines a set of atomic operations that can be used to build solutions to concurrency problems. The FreeBSD implementation supports many target architectures, so there is more freedom to pick a starting point.

NetBSD Atomic

The NetBSD kernel implements seven classes of atomic memory operations. If the architecture provides compare and swap (CAS), each atomic operation is built on CAS. If the architecture does not provide hardware support for atomic CAS, atomicity is provided by a restartable sequence or by a spinlock.
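The idea of building every operation on CAS can be sketched as a retry loop. This is only an illustration written with C11 <stdatomic.h>, not NetBSD code; the helper name cas_fetch_add is hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical helper: an atomic add built only from compare-and-swap.
 * Returns the value the object held before the addition. */
static inline unsigned int cas_fetch_add(atomic_uint *p, unsigned int v)
{
    unsigned int old = atomic_load_explicit(p, memory_order_relaxed);

    /* When the exchange fails, the current value is written back into
     * 'old', so each retry recomputes old + v from fresh data. */
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, old + v,
               memory_order_seq_cst, memory_order_relaxed)) {
        /* another CPU changed *p in the meantime; try again */
    }
    return old;
}
```

The same loop shape works for or, and, and similar read-modify-write operations by changing the expression that computes the new value.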

C11 and C++11 Atomic

The end of 2011 brought new releases of both the C and C++ standards, and for the first time both contain a set of atomic types and operations. Earlier versions of C and C++ had no support for atomic operations at all. Older versions of GCC and Clang provide the __sync_* family of built-in functions, which offer some atomic operations support. GCC 4.7 (or newer) and recent versions of Clang provide built-in functions that approximately match the requirements of the C++11 memory model.
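As a rough comparison of the three generations mentioned above, the following sketch performs the same fetch-and-add with a legacy __sync builtin, a GCC 4.7-style __atomic builtin, and the C11 generic function. The wrapper names are hypothetical and exist only for illustration.

```c
#include <stdatomic.h>

/* Legacy builtin: always a full barrier, returns the previous value. */
static inline int legacy_fetch_add(int *p, int v)
{
    return __sync_fetch_and_add(p, v);
}

/* GCC 4.7+ / Clang builtin: the memory order is an explicit argument. */
static inline int builtin_fetch_add(int *p, int v)
{
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* C11 generic function operating on an atomic type. */
static inline int c11_fetch_add(atomic_int *p, int v)
{
    return atomic_fetch_add_explicit(p, v, memory_order_seq_cst);
}
```

All three return the value held before the addition; only the newer two let the caller choose a weaker memory order.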

Linux Atomic

The Linux kernel mainly implements two classes of atomic primitives: those that return no value and those that return a value. In the Linux kernel, any atomic operation that modifies some state in memory and returns information about that state (old or new) implies an SMP-conditional general memory barrier (smp_mb()) on each side of the actual operation (with the exception of explicit lock operations).

Comparison of Atomic Implementation

The GSoC 2012 project proposal makes a detailed comparison of all the atomic implementations above. In conclusion, the atomic operations for RTEMS will be based on the FreeBSD atomic implementation, but the API design should cover most of the ISO C11 atomic definitions and follow that standard as the API evolves.

Comparison of C11 and FreeBSD Atomic API definition

The comparison shows that the C11 atomic API and the FreeBSD kernel atomic API have the same or similar semantics. The C11 functions whose names do not end in _explicit have the semantics of the corresponding _explicit functions called with memory_order_seq_cst, so only the _explicit forms are listed here.

  1. The atomic_store generic functions

     C11: void atomic_store_explicit(volatile A *object, C desired, memory_order order);
     FreeBSD: void atomic_store_rel_<type>(volatile _type_ *p, _type_ v);

  • The FreeBSD API has the same semantics as the C11 API called with memory_order_release.

  2. The atomic_load generic functions

     C11: C atomic_load_explicit(volatile A *object, memory_order order);
     FreeBSD: _type_ atomic_load_acq_<type>(volatile _type_ *p);

  • The FreeBSD API has the same semantics as the C11 API called with memory_order_acquire.

  3. The atomic_fetch-and-modify generic functions

     C11:
     (1) C atomic_fetch_add_explicit(volatile A *object, M operand, memory_order order);
     (2) C atomic_fetch_sub_explicit(volatile A *object, M operand, memory_order order);
     (3) C atomic_fetch_or_explicit(volatile A *object, M operand, memory_order order);
     (4) C atomic_fetch_xor_explicit(volatile A *object, M operand, memory_order order);
     (5) C atomic_fetch_and_explicit(volatile A *object, M operand, memory_order order);

     FreeBSD:
     (1) void atomic_add_[acq_|rel_]<type>(volatile _type_ *p, _type_ v);
     (2) void atomic_subtract_[acq_|rel_]<type>(volatile _type_ *p, _type_ v);
     (3) void atomic_set_[acq_|rel_]<type>(volatile _type_ *p, _type_ v);
     (4) void atomic_clear_[acq_|rel_]<type>(volatile _type_ *p, _type_ v);

  • FreeBSD functions (1), (2) and (3) have the same semantics as C11 functions (1), (2) and (3) called with memory_order_acquire or memory_order_release ([acq_|rel_]). FreeBSD has no counterpart to C11 function (4), but FreeBSD function (4) is similar to C11 function (5): C11 computes &= while FreeBSD computes &= ~, so this part is easy to adapt. Unlike the C11 functions, none of the FreeBSD functions return a value.

  4. The atomic_exchange generic functions

     C11: C atomic_exchange_explicit(volatile A *object, C desired, memory_order order);
     FreeBSD: XXX
     NetBSD: _type_ atomic_swap_<type>(volatile _type_ *ptr, _type_ new);

  • The FreeBSD API does not support atomic exchange functions, but the NetBSD function has semantics similar to the C11 API, only with memory_order_relaxed.

  5. The atomic_compare_exchange generic functions

     C11:
     _Bool atomic_compare_exchange_weak_explicit(volatile A *object, C *expected, C desired, memory_order success, memory_order failure);
     _Bool atomic_compare_exchange_strong_explicit(volatile A *object, C *expected, C desired, memory_order success, memory_order failure);

     FreeBSD: int atomic_cmpset_[acq_|rel_]<type>(volatile _type_ *dst, _type_ old, _type_ new);

  • The FreeBSD function atomically compares the value stored at *dst with old and, if the two values are equal, updates *dst with new. It returns zero if the compare failed and nonzero otherwise. The C11 functions atomically compare the value pointed to by object with the value in expected; if they are equal, the value pointed to by object is replaced with desired, and if not, expected is updated with the value pointed to by object.
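The mapping described in this comparison can be illustrated by wrapping C11 calls in FreeBSD-shaped functions. This is a sketch under assumptions, not the FreeBSD implementation: the sketch_ names are hypothetical, the functions take atomic_uint rather than FreeBSD's volatile u_int, and the memory orders chosen for the clear and cmpset wrappers are one plausible reading of the text above.

```c
#include <stdatomic.h>

/* atomic_store_rel_int: a store with release ordering. */
static inline void sketch_store_rel_int(atomic_uint *p, unsigned int v)
{
    atomic_store_explicit(p, v, memory_order_release);
}

/* atomic_load_acq_int: a load with acquire ordering. */
static inline unsigned int sketch_load_acq_int(atomic_uint *p)
{
    return atomic_load_explicit(p, memory_order_acquire);
}

/* atomic_clear_int: *p &= ~v. C11 fetch_and computes *p &= operand,
 * so the operand is inverted and the fetched value is discarded. */
static inline void sketch_clear_int(atomic_uint *p, unsigned int v)
{
    (void)atomic_fetch_and_explicit(p, ~v, memory_order_relaxed);
}

/* atomic_cmpset_acq_int: nonzero on success, zero on failure.
 * 'old' is passed by value, so the caller never sees the value that
 * C11 writes back into 'expected' when the compare fails. */
static inline int sketch_cmpset_acq_int(atomic_uint *dst,
                                        unsigned int old,
                                        unsigned int new_value)
{
    return atomic_compare_exchange_strong_explicit(
        dst, &old, new_value,
        memory_order_acquire, memory_order_relaxed);
}
```

The cmpset wrapper makes the main semantic difference visible: C11 reports the observed value through expected on failure, while the FreeBSD-style interface only reports success or failure.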

Design of Atomic operations API

The first part is a directory structure chart: atomic.h is the API definition file and cpuatomic.h is the implementation file.

/cpukit
|
score
|
|----include
|    |
|    ----rtems
|        |
|        ----score
|            |
|            ----atomic.h
|
|----cpu
     |
     ----architecture
         |
         ----rtems
             |
             ----score
                 |
                 ----cpuatomic.h
                 |
                 ----cpu.h

Most atomic operations are implemented with assembly instructions; where that is not the case they can be implemented with inline C code. The architecture-independent atomic API definitions are therefore placed in atomic.h, which is visible to other RTEMS components such as score, drivers, etc. The architecture-dependent atomic implementations are placed in cpuatomic.h, which exists in every architecture-specific directory as shown above. The API is tied to an implementation as follows, taking the general atomic load function as an example: int Atomic_Load_Acq_Int(volatile int *p)

  1. In the implementation file cpuatomic.h it is implemented like this:

     static inline int _Atomic_Load_Acq_Int(volatile int *p)
     {
       /* embedded assembly code */
     }

  2. In the API definition file atomic.h it is defined like this: #define Atomic_Load_Acq_Int(p) _Atomic_Load_Acq_Int((volatile int *)(p))
  3. cpuatomic.h must be included by atomic.h directly or indirectly. If it is included directly, its file name must be fixed and identical for every architecture. If it is included indirectly, its file name is free to differ between architectures, but it must be included by an architecture-dependent header file (such as cpu.h, shown above) that the architecture-independent score header atomic.h includes.
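The layering described in these steps can be sketched in a single file, with comments marking which header each part would live in. As an assumption for illustration only, the GCC/Clang builtin __atomic_load_n stands in for the per-architecture embedded assembly, and int is used consistently for the parameter type.

```c
/* ---- would live in cpuatomic.h (architecture-dependent) ---- */

/* Assumption: a compiler builtin replaces the architecture's inline
 * assembly so that the sketch compiles on any GCC or Clang target. */
static inline int _Atomic_Load_Acq_Int(volatile int *p)
{
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

/* ---- would live in atomic.h (architecture-independent) ---- */

/* The public name forwards to whatever implementation the included
 * architecture header provides. */
#define Atomic_Load_Acq_Int(p) _Atomic_Load_Acq_Int((volatile int *)(p))
```

Callers only ever use the Atomic_ name, so swapping the body of the underscore-prefixed function for real assembly does not affect them.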

Synopsis:

The following is the atomic operation API definition in RTEMS style:

  1. The atomic_store generic functions

     void _Atomic_Store_Rel_<_type_>(volatile _type_ *p, _type_ v);

  2. The atomic_load generic functions

     _type_ _Atomic_Load_Acq_<_type_>(volatile _type_ *p);

  3. The atomic_fetch-and-modify generic functions

     void _Atomic_Fetch_Add_[Acq_|Rel_]<_type_>(volatile _type_ *p, _type_ v);
     void _Atomic_Fetch_Sub_[Acq_|Rel_]<_type_>(volatile _type_ *p, _type_ v);
     void _Atomic_Fetch_Or_[Acq_|Rel_]<_type_>(volatile _type_ *p, _type_ v);
     void _Atomic_Fetch_And_[Acq_|Rel_]<_type_>(volatile _type_ *p, _type_ v);

  4. The atomic_compare_exchange generic functions

     int _Atomic_cmpset_[acq_|rel_]<_type_>(volatile _type_ *dst, _type_ old, _type_ new);
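One way such a per-type family could be generated is with a macro that stamps out the functions for each type suffix. This is a hedged sketch: the SKETCH_ATOMIC_OPS macro is hypothetical, the bodies use GCC/Clang __atomic builtins instead of architecture assembly, the capitalized suffixes (Int, Long, Ptr) are assumed from the Atomic_Load_Acq_Int example, and only three of the listed operations are shown. The 64-bit variant is left out because it may need extra library support on 32-bit targets.

```c
#include <stdint.h>

/* Hypothetical generator: defines store, load and fetch-add for one
 * type suffix NAME operating on objects of type TYPE. */
#define SKETCH_ATOMIC_OPS(NAME, TYPE)                                     \
    static inline void _Atomic_Store_Rel_##NAME(volatile TYPE *p, TYPE v) \
    {                                                                     \
        __atomic_store_n(p, v, __ATOMIC_RELEASE);                         \
    }                                                                     \
    static inline TYPE _Atomic_Load_Acq_##NAME(volatile TYPE *p)          \
    {                                                                     \
        return __atomic_load_n(p, __ATOMIC_ACQUIRE);                      \
    }                                                                     \
    static inline void _Atomic_Fetch_Add_Acq_##NAME(volatile TYPE *p,     \
                                                    TYPE v)               \
    {                                                                     \
        /* the fetched value is discarded, matching the void synopsis */  \
        (void)__atomic_fetch_add(p, v, __ATOMIC_ACQUIRE);                 \
    }

SKETCH_ATOMIC_OPS(Int, unsigned int)
SKETCH_ATOMIC_OPS(Long, unsigned long)
SKETCH_ATOMIC_OPS(Ptr, uintptr_t)
SKETCH_ATOMIC_OPS(32, uint32_t)
```

Each expansion yields names such as _Atomic_Load_Acq_Int or _Atomic_Load_Acq_32, so adding a type is a one-line change.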

Description:

  1. _types_:

Each atomic operation operates on a specific type. The type to use is indicated in the function name. The available types that can be used are:

int     unsigned integer
long    unsigned long integer
ptr     unsigned integer the size of a pointer
32      unsigned 32-bit integer
64      unsigned 64-bit integer

Because RTEMS is used on many SoCs with an 8-bit or 16-bit bus, some architectures may consider providing operations for types smaller than int. In FreeBSD only some architectures provide these types:

char    unsigned character
short   unsigned short integer
8       unsigned 8-bit integer
16      unsigned 16-bit integer

  2. Acq and Rel:

"Acq" denotes an acquire memory barrier, which ensures that the effects of this operation are completed before the effects of any later data accesses.

"Rel" denotes a release memory barrier, which ensures that all effects of all previous data accesses are completed before this operation takes place.
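The usual way the two barriers are paired is a publish/consume handshake: the producer writes ordinary data and then sets a flag with release ordering, and the consumer reads the flag with acquire ordering before touching the data. The sketch below uses C11 <stdatomic.h>; the names payload, ready, publish and try_consume are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;            /* ordinary, non-atomic data          */
static atomic_int ready = 0;   /* flag that guards access to payload */

/* Producer side: the release store guarantees that the write to
 * payload is visible before the flag is seen as set. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer side: the acquire load guarantees that the read of payload
 * happens after the flag was observed as set. */
static bool try_consume(int *out)
{
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return false;          /* nothing published yet */
    *out = payload;
    return true;
}
```

With relaxed ordering on either side, a consumer on another CPU could observe the flag as set and still read a stale payload.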

Build and Test HOWTO

How to build the RTEMS to test atomic operation:

The test cases will be installed in the rtems-install directory.

How to run atomic test cases on qemu:

Before you run the script, you must have an RTEMS boot image under ${HOME}/qemu/pc386_fda, and you must create the directory ${HOME}/qemu/hd to store your test case executables.

References

  • TBD
