#3755 new defect

leak in realloc

Reported by: Jeffrey Hill Owned by:
Priority: normal Milestone:
Component: score Version: 4.11
Severity: normal Keywords: realloc leak
Cc: Blocked By:
Blocking:

Description

We are running Lua in an RTEMS system which, by default, uses realloc for memory allocation. We observe a memory leak on RTEMS unless we replace the default Lua allocator on RTEMS as follows. The net result being avoidance of any calls to realloc, and instead fall back to Lua memory allocation based solely on malloc and free.

nios2-rtems4.11-gcc (GCC) 4.9.3 20150626 (RTEMS 4.11, RSB no-repo, Newlib 2.2.0.20150423)

configure --target=nios2-rtems4.11 --prefix=/ade/rtems/install/rtems-4-11 --disable-itron --disable-tests --enable-posix --enable-cxx --enable-rtemsbsp=altera-sys-config-S43X-TDAQ-dev --enable-networking

This is occurring in 4.11.1.99. We definitely need to sync up, however our cursory search in the bug database indicated that this might need to be reported.

#if defined ( CAS_LUA_DONT_USE_REALLOC )
#pragma message("installing RTEMS workaround for realloc bugs")
void * RspThr :: LuaHandle :: m_alloc ( void * const pPriv, void * const pOld, 
                                    const size_t osize, const size_t nsize ) 
{
    void * pNew;
    if ( nsize > 0u ) {
        if ( pOld ) {
            if ( osize == nsize ) {
                pNew = pOld;
            }
            else {
                pNew = malloc ( nsize );
                if ( pNew ) {
                    memcpy ( pNew, pOld, std :: min ( osize, nsize ) );
                    free ( pOld );
                }
            }
        }
        else {
            pNew = malloc ( nsize );
        }
    }
    else {
        free ( pOld );
        pNew = NULL;
    }
    return pNew;
}
#else
void * RspThr :: LuaHandle :: m_alloc ( void * const pPriv, void * const pOld, 
                                const size_t osize, const size_t nsize ) 
{
    if ( nsize == 0 ) {
        free ( pOld );
        return NULL;
    }
    else {
        void * const pNew = realloc ( pOld, nsize );
        return pNew;
    }
}
#endif /* ifdef CAS_LUA_DONT_USE_REALLOC */

Change History (12)

comment:1 Changed on Jun 5, 2019 at 4:27:21 PM by Jeffrey Hill

Component: admintool/newlib

comment:2 Changed on Jun 6, 2019 at 7:04:24 AM by Sebastian Huber

Component: tool/newlibscore

The realloc() is implemented in RTEMS (cpukit/libcsupport/src/realloc.c). Do you have a self-contained test case for this resource leak? Could it be also a memory fragmentation issue?

comment:3 Changed on Jun 6, 2019 at 12:47:22 PM by Joel Sherrill

I am teaching an RTEMS class this week and strangely someone mentioned this problem. They have a low memory target so we suspect that compounds the issue. We had a few ideas. First was to set a breakpoint at all exit paths from realloc() that return NULL and ensure Lua handles that correctly. Returning NULL means the memory was not resized but still exists at the same size. This could be incorrectly handled by Lua and a leak.

Another idea was to use the resource_snapshot or just the workspace command/method to check that all memory from Lua is actually being freed after each command invocation.

https://git.rtems.org/rtems/tree/cpukit/libcsupport/src/realloc.c

If turning on Lua's own memory pool management addresses the issue, then it is either increased fragmentation in RTEMS or they may just be holding on to previous allocations more efficiently and not losing memory.

comment:4 Changed on Jun 6, 2019 at 1:25:36 PM by Sebastian Huber

We had some trouble with the first fit allocator of RTEMS in applications which use a lot of malloc/free. We switched to TLSF and the fragmentation issues disappeared.

Since the CAS_LUA_DONT_USE_REALLOC fixes the problems, there could be an error in _Heap_Resize_block().

comment:5 in reply to:  4 Changed on Jun 7, 2019 at 2:16:01 AM by Chris Johns

Replying to Sebastian Huber:

We had some trouble with the first fit allocator of RTEMS in applications which use a lot of malloc/free. We switched to TLSF and the fragmentation issues disappeared.

What is involved in switching to TLSF? Can this code be merged into RTEMS so users can make a suitable selection?

comment:6 Changed on Jun 7, 2019 at 4:59:44 AM by Sebastian Huber

I started to work on that in #3582, but I am now busy with other things. A proper integration is quite a bit of work. For the applications we use just an adopted

https://github.com/mattconte/tlsf/blob/master/tlsf.c

and link it to the application. No RTEMS patch is necessary.

comment:7 in reply to:  6 ; Changed on Jun 7, 2019 at 5:22:50 AM by Chris Johns

Replying to Sebastian Huber:

I started to work on that in #3582, but I am now busy with other things. A proper integration is quite a bit of work. For the applications we use just an adopted

Is the adopted version publicly available?

https://github.com/mattconte/tlsf/blob/master/tlsf.c

and link it to the application. No RTEMS patch is necessary.

We should write this up somewhere.

comment:8 in reply to:  7 ; Changed on Jun 7, 2019 at 11:50:23 AM by Sebastian Huber

Replying to Chris Johns:

Replying to Sebastian Huber:

I started to work on that in #3582, but I am now busy with other things. A proper integration is quite a bit of work. For the applications we use just an adopted

Is the adopted version publicly available?

No, I had no time to clean it up. One problem is that the file has no license header. This makes it a bit difficult to distribute separately.

https://github.com/mattconte/tlsf/blob/master/tlsf.c

and link it to the application. No RTEMS patch is necessary.

We should write this up somewhere.

It is not a high priority item for me.

comment:9 Changed on Jun 8, 2019 at 12:52:11 AM by Jeffrey Hill

I should add that we see linear 5kB leaked each Lua context create/destroy cycle. When a Lua context runs for extended period there isn't a leak. Maybe it doesn't look like a fragmentation issue due to the linear 5kB per cycle. Admittedly our heapSpace command line diagnostic maybe isn't showing info about fragmentation. We create a fresh Lua context when a client connects to the EPICS server, and destroy it at client disconnect.

I did log all of the Lua allocations and frees by pointer value and verified that they were balanced using a map in a python script a few weeks back. I don't recall now if there were nil pointers in there in that mix. Its a good idea, and something to look for, I will have a look when spare time arrives :-).

Also I suppose that a standalone test playing back the various allocations requests, shrinking and enlarging existing blocks, simulating the lua calls to realloc, but just following input from my log files might reproduce the issue standalone?

comment:10 in reply to:  8 Changed on Jun 8, 2019 at 1:36:51 AM by Chris Johns

Replying to Sebastian Huber:

One problem is that the file has no license header. This makes it a bit difficult to distribute separately.

I have raised an issue in github asking if the BSD license can be added to the source.

It is not a high priority item for me.

Sure. Posting the code as is somewhere would help and would not take long to do. :)

Others can clean it up and bring it into the kernel. There is clearly a need and the current state of the NFS client in libbsd where it uses the heap is an example. I am sure there are others.

comment:11 in reply to:  9 Changed on Jun 8, 2019 at 1:38:18 AM by Chris Johns

Replying to Jeffrey Hill:

Also I suppose that a standalone test playing back the various allocations requests, shrinking and enlarging existing blocks, simulating the lua calls to realloc, but just following input from my log files might reproduce the issue standalone?

This is a great idea and it would make a good testsuite test if it exposes an issue.

comment:12 Changed on Jun 8, 2019 at 8:40:02 AM by Sebastian Huber

I added a ticket for the alternative allocator: #3757.

Note: See TracTickets for help on using tickets.