umm_malloc's Introduction

umm_malloc - Memory Manager For Small(ish) Microprocessors

This is a memory management library specifically designed to work with the ARM7 embedded processor, but it should work on many other 32 bit processors, as well as 16 and 8 bit devices.

You can even use it on a bigger project where a single process might want to manage a large number of smaller objects, and using the system heap might get expensive.

Acknowledgements

Joerg Wunsch and the avr-libc provided the first malloc() implementation that I examined in detail.

http://www.nongnu.org/avr-libc

Doug Lea's paper on malloc() was another excellent reference and provides a lot of detail on advanced memory management techniques such as binning.

http://gee.cs.oswego.edu/dl/html/malloc.html

Bill Dittman provided excellent suggestions, including macros to support using these functions in critical sections, and for optimizing realloc() further by checking to see if the previous block was free and could be used for the new block size. This can help to reduce heap fragmentation significantly.

Yaniv Ankin suggested that a way to dump the current heap condition might be useful. I combined this with an idea from plarroy to also allow checking a free pointer to make sure it's valid.

Dimitry Frank contributed many helpful additions to make things more robust including a user specified config file and a method of testing the integrity of the data structures.

GitHub user @devyte provided useful feedback on the nesting of functions as well as a fix for the problem that separates out the core free and malloc functionality.

GitHub users @d-a-v and @devyte provided great input on establishing a heap fragmentation metric which they graciously allowed to be used in umm_malloc.

Katherine Whitlock (@stellar-aria) extended the library for usage in scenarios where more than one heap or memory space is needed.

Usage

This library is designed to be included in your application as a submodule that has default configuration that can be overridden as needed by your application code.

The umm_malloc library can be initialized two ways. The first is at link time:

  • Set UMM_MALLOC_CFG_HEAP_ADDR to the symbol representing the starting address of the heap. The heap must be aligned on the natural boundary size of the processor.
  • Set UMM_MALLOC_CFG_HEAP_SIZE to the size of the heap in bytes. The heap size must be a multiple of the natural boundary size of the processor.

This is how the umm_init() call handles initializing the heap.

We can also call umm_init_heap(void *pheap, size_t size) where the heap details are passed in manually. This is useful in systems where you can allocate a block of memory at run time - for example in Rust.

Multiple heaps

For usage in a scenario that requires multiple heaps, the heap type umm_heap is exposed. Each API function (malloc, free, realloc, etc.) has a corresponding umm_multi_* variant that takes a pointer to this type as its first parameter.

Much like standard initialization, there are two methods:

  • umm_multi_init(umm_heap *heap), which initializes a given heap using linker symbols
  • umm_multi_init_heap(umm_heap *heap, void *ptr, size_t size), which will initialize a given heap using a known address and size.

Automated Testing

umm_malloc is designed to be testable in standalone mode using ceedling. To run the test suite, just make sure you have ceedling installed and then run:

ceedling clean
ceedling test:all

Configuration

โš ๏ธ You MUST provide a file called umm_malloc_cfgport.h somewhere in your app, even if it's blank

The reason for this is the way the configuration override hierarchy works. The priority for configuration overrides is as follows:

  1. Command line defines using -D UMM_xxx
  2. A custom config filename using -D UMM_CFGFILE="<filename.cfg>"
  3. The default config filename umm_malloc_cfgport.h
  4. The default configuration in src/umm_malloc_cfg.h

The following #defines are set to useful defaults in src/umm_malloc_cfg.h and can be overridden as needed.

The fit algorithm is defined as either:

  • UMM_BEST_FIT which scans the entire free list and looks for either an exact fit or the smallest block that will satisfy the request. This is the default fit method.
  • UMM_FIRST_FIT which scans the free list and stops at the first block that satisfies the request.

The following #defines are disabled by default and should remain disabled for production use. They are helpful when testing allocation errors (which are normally due to bugs in the application code) or for running the test suite when making changes to the code.

  • UMM_INFO is used to include code that allows dumping the entire heap structure (helpful when there's a problem).

  • UMM_INTEGRITY_CHECK is used to include code that performs an integrity check on the heap structure. It's up to you to call the umm_integrity_check() function.

  • UMM_POISON_CHECK is used to include code that adds some bytes around the memory being allocated that are filled with known data. If the data is not intact when the block is checked, then someone has written outside of the memory block they have been allocated. It is up to you to call the umm_poison_check() function.
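
As an illustration, a minimal umm_malloc_cfgport.h for a debug build might look like the sketch below. The macro names come from the lists above. The include guard name is made up for this example, and the sketch assumes that defining UMM_FIRST_FIT here is enough to replace the default fit method.

```c
/* umm_malloc_cfgport.h: hypothetical application-side overrides (a sketch) */
#ifndef APP_UMM_MALLOC_CFGPORT_H   /* guard name is an assumption */
#define APP_UMM_MALLOC_CFGPORT_H

/* Ask for first-fit instead of the default best-fit. */
#define UMM_FIRST_FIT

/* Debug-only helpers; leave these undefined in production builds. */
#define UMM_INFO
#define UMM_INTEGRITY_CHECK
#define UMM_POISON_CHECK

#endif /* APP_UMM_MALLOC_CFGPORT_H */
```

For a production build the file can simply be left empty.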

API

The following functions are available for your application:

void *umm_malloc(size_t size)
void *umm_calloc(size_t num, size_t size)
void *umm_realloc(void *ptr, size_t size)
void  umm_free(void *ptr)

They have exactly the same semantics as the corresponding standard library functions.

To initialize the library there are two options:

void  umm_init(void)
void  umm_init_heap(void *ptr, size_t size)

Multi-Heap API

For the case of multiple heaps, corresponding umm_multi_* functions are provided.

void *umm_multi_malloc(umm_heap *heap, size_t size)
void *umm_multi_calloc(umm_heap *heap, size_t num, size_t size)
void *umm_multi_realloc(umm_heap *heap, void *ptr, size_t size)
void  umm_multi_free(umm_heap *heap, void *ptr)

As with the standard API, there are two options for initialization:

void  umm_multi_init(umm_heap *heap)
void  umm_multi_init_heap(umm_heap *heap, void *ptr, size_t size)

Background

The memory manager assumes the following things:

  1. The standard POSIX compliant malloc/calloc/realloc/free semantics are used
  2. All memory used by the manager is allocated at link time, it is aligned on a 32 bit boundary, it is contiguous, and its extent (start and end address) is filled in by the linker.
  3. All memory used by the manager is initialized to 0 as part of the runtime startup routine. No other initialization is required.

The fastest linked list implementations use doubly linked lists so that it's possible to insert and delete blocks in constant time. This memory manager keeps track of both free and used blocks in a doubly linked list.

Most memory managers use a list structure made up of pointers to keep track of used - and sometimes free - blocks of memory. In an embedded system, this can get pretty expensive as each pointer can use up to 32 bits.

In most embedded systems there is no need for managing a large quantity of memory blocks dynamically, so a full 32 bit pointer based data structure for the free and used block lists is wasteful. A block of memory on the free list would use 16 bytes just for the pointers!

This memory management library sees the heap as an array of blocks, and uses block numbers to keep track of locations. The block numbers are 15 bits - which allows for up to 32767 blocks of memory. The high order bit marks a block as being either free or in use, which will be explained later.

The result is that a block of memory on the free list uses just 8 bytes instead of 16.

In fact, we go even one step further when we realize that the free block index values are available to store data when the block is allocated.

The overhead of an allocated block is therefore just 4 bytes.

Each memory block holds 8 bytes, and there are up to 32767 blocks available, for about 256K of heap space. If that's not enough, you can always add more data bytes to the body of the memory block at the expense of free block size overhead.
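
The arithmetic above can be checked with a small sketch. The type and function names here (sk_block, sk_blocks_needed, sk_max_heap_bytes) are made up for illustration and are not the library's own definitions. The sketch assumes 16 bit index fields and an 8 byte block whose last 4 bytes hold either the free list indexes or application data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8 byte block: 4 bytes of header, 4 bytes of body. */
typedef struct {
    uint16_t next;                 /* n, high bit doubles as the free flag */
    uint16_t prev;                 /* p */
    union {
        struct { uint16_t next_free, prev_free; } links; /* nf, pf when free */
        uint8_t data[4];           /* application data when allocated */
    } body;
} sk_block;

enum { SK_MAX_BLOCKS = 32767 };    /* 15 bit block numbers */

/* Largest heap that 15 bit block numbers can describe. */
static size_t sk_max_heap_bytes(void) {
    return (size_t)SK_MAX_BLOCKS * sizeof(sk_block);
}

/* Blocks needed for a request: the first block contributes only its
 * body, every following block contributes all of its bytes. */
static size_t sk_blocks_needed(size_t size) {
    const size_t first = sizeof(((sk_block *)0)->body);
    if (size <= first) {
        return 1;
    }
    return 1 + (size - first + sizeof(sk_block) - 1) / sizeof(sk_block);
}
```

Under these assumptions a 4 byte request fits in one block, a 12 byte request needs two, and the largest heap is 32767 * 8 = 262136 bytes, which is the "about 256K" mentioned above.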

There are a lot of little features and optimizations in this memory management system that make it especially suited to small systems, and the best way to appreciate them is to review the data structures and algorithms used, so let's get started.

Detailed Description

We have a general notation for a block that we'll use to describe the different scenarios that our memory allocation algorithm must deal with:

   +----+----+----+----+
c  |* n |  p | nf | pf |
   +----+----+----+----+

Where:

  • c is the index of this block
  • * is the indicator for a free block
  • n is the index of the next block in the heap
  • p is the index of the previous block in the heap
  • nf is the index of the next block in the free list
  • pf is the index of the previous block in the free list

The fact that we have forward and backward links in the block descriptors means that malloc() and free() operations can be very fast. It's easy to either allocate the whole free item to a new block or to allocate part of the free item and leave the rest on the free list without traversing the list from front to back first.

The entire block of memory used by the heap is assumed to be initialized to 0. The very first block in the heap is special - it's the head of the free block list. It is never assimilated with a free block (more on this later).

Once a block has been allocated to the application, it looks like this:

  +----+----+----+----+
c |  n |  p |   ...   |
  +----+----+----+----+

Where:

  • c is the index of this block
  • n is the index of the next block in the heap
  • p is the index of the previous block in the heap

Note that the free list information is gone because it's now being used to store actual data for the application. If we had even 500 items in use, that would be 2,000 bytes for free list information. We simply can't afford to waste that much.

The address of the ... area is what is returned to the application for data storage.
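
A minimal sketch of how a block index maps to the address handed to the application, and back again, is shown here. The names sk_block, sk_data_ptr and sk_block_index are hypothetical, and the 8 byte layout is the one assumed throughout this description rather than the library's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8 byte block: n and p, then the shared body. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    union {
        struct { uint16_t next_free, prev_free; } links;
        uint8_t data[4];
    } body;
} sk_block;

/* Address of the "..." area of block c, as returned to the application. */
static void *sk_data_ptr(sk_block *heap, uint16_t c) {
    return heap[c].body.data;
}

/* Recover the block index from a pointer previously handed out.
 * Integer division discards the offset of the body within the block. */
static uint16_t sk_block_index(const sk_block *heap, const void *ptr) {
    size_t offset = (size_t)((const uint8_t *)ptr - (const uint8_t *)heap);
    return (uint16_t)(offset / sizeof(sk_block));
}
```

The reverse mapping is what lets a free operation find the block header from nothing more than the pointer the application passes in.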

The following sections describe the scenarios encountered during the operation of the library. There are two additional notation conventions:

?? inside a pointer block means that the data is irrelevant. We don't care about it because we don't read or modify it in the scenario being described.

... between memory blocks indicates zero or more additional blocks are allocated for use by the upper block.

While we're talking about "upper" and "lower" blocks, we should make a comment about addresses. In the diagrams, a block higher up in the picture is at a lower address. The blocks grow downwards: as you move down the picture, the block index increases, as does the physical address.

Finally, there's one very important characteristic of the individual blocks that make up the heap - there can never be two consecutive free memory blocks, but there can be consecutive used memory blocks.

The reason is that we always want to have a short free list of the largest possible block sizes. By always assimilating a newly freed block with adjacent free blocks, we maximize the size of each free memory area.

Operation of malloc right after system startup

As part of the system startup code, all of the heap has been cleared.

During the very first malloc operation, we start traversing the free list starting at index 0. The index of the next free block is 0, which means we're at the end of the list!

At this point, the malloc has a special test that checks if the current block index is 0, which it is. This special case initializes the free list to point at block index 1 and then points block 1 to the last block (lf) on the heap.

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
0  |  0 |  0 |  0 |  0 |           0  |  1 |  0 |  1 |  1 |
   +----+----+----+----+              +----+----+----+----+
                                      +----+----+----+----+
                                   1  |*lf |  0 |  0 |  0 |
                                      +----+----+----+----+
                                               ...
                                      +----+----+----+----+
                                   lf |  0 |  1 |  0 |  0 |
                                      +----+----+----+----+

The heap is now ready to complete the first malloc operation.
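
The AFTER picture can be reproduced with a few assignments. This is only a sketch: sk_block, SK_FREE_FLAG and sk_first_malloc_init are hypothetical names, and the heap is assumed to be an already zeroed array of num_blocks blocks.

```c
#include <assert.h>
#include <stdint.h>

#define SK_FREE_FLAG  0x8000u  /* high bit of the next index marks a free block */
#define SK_INDEX_MASK 0x7FFFu

typedef struct {
    uint16_t next;       /* n, plus the free flag */
    uint16_t prev;       /* p */
    uint16_t next_free;  /* nf */
    uint16_t prev_free;  /* pf */
} sk_block;

/* Set up the free list on a zeroed heap so that it matches the AFTER
 * diagram: block 0 is the list head, block 1 is one big free block and
 * the last block (lf) terminates the heap. */
static void sk_first_malloc_init(sk_block *heap, uint16_t num_blocks) {
    assert(num_blocks >= 3 && num_blocks <= SK_INDEX_MASK);
    uint16_t lf = (uint16_t)(num_blocks - 1);

    heap[0].next = 1;
    heap[0].prev = 0;
    heap[0].next_free = 1;
    heap[0].prev_free = 1;

    heap[1].next = (uint16_t)(lf | SK_FREE_FLAG);
    heap[1].prev = 0;
    heap[1].next_free = 0;   /* 0 means end of the free list */
    heap[1].prev_free = 0;

    heap[lf].next = 0;
    heap[lf].prev = 1;
}
```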

Operation of malloc when we have reached the end of the free list and there is no block large enough to accommodate the request.

This happens at the very first malloc operation, or any time the free list is traversed and no free block large enough for the request is found.

The current block pointer will be at the end of the free list, and we know we're at the end of the list because the nf index is 0, like this:

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
pf |*?? | ?? | cf | ?? |           pf |*?? | ?? | lf | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
 p | cf | ?? |   ...   |            p | cf | ?? |   ...   |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+  
cf |  0 |  p |  0 | pf |            c | lf |  p |   ...   | 
   +----+----+----+----+              +----+----+----+----+
                                      +----+----+----+----+
                                   lf |  0 | cf |  0 | pf |
                                      +----+----+----+----+

As we walk the free list looking for a block of size b or larger, we get to cf, which is the last item in the free list. We know this because the next index is 0.

So we're going to turn cf into the new block of memory, and then create a new block that represents the last free entry (lf) and adjust the prev index of lf to point at the block we just created. We also need to adjust the next index of the new block (c) to point to the last free block.

Note that the next free index of the pf block must point to the new lf because cf is no longer a free block!

Operation of malloc when we have found a block (cf) that will fit the current request of b units exactly

This one is pretty easy, just clear the free list bit in the current block and unhook it from the free list.

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
pf |*?? | ?? | cf | ?? |           pf |*?? | ?? | nf | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
 p | cf | ?? |   ...   |            p | cf | ?? |   ...   |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+  Clear the free
cf |* n |  p | nf | pf |           cf |  n |  p |   ..    |  list bit here
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
 n | ?? | cf |   ...   |            n | ?? | cf |   ...   |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
nf |*?? | ?? | ?? | cf |           nf | ?? | ?? | ?? | pf |
   +----+----+----+----+              +----+----+----+----+

Unhooking from the free list is accomplished by adjusting the next and prev free list index values in the pf and nf blocks.
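
As a sketch, the exact-fit case comes down to three assignments. The names sk_block, SK_FREE_FLAG and sk_take_exact are made up for this example, and the field layout is the one assumed in the block notation above.

```c
#include <assert.h>
#include <stdint.h>

#define SK_FREE_FLAG  0x8000u
#define SK_INDEX_MASK 0x7FFFu

typedef struct {
    uint16_t next;       /* n, plus the free flag */
    uint16_t prev;       /* p */
    uint16_t next_free;  /* nf */
    uint16_t prev_free;  /* pf */
} sk_block;

/* Turn free block cf into an allocated block of the same size. */
static void sk_take_exact(sk_block *heap, uint16_t cf) {
    assert(heap[cf].next & SK_FREE_FLAG);

    uint16_t pf = heap[cf].prev_free;
    uint16_t nf = heap[cf].next_free;

    heap[pf].next_free = nf;          /* pf now skips over cf */
    heap[nf].prev_free = pf;          /* nf now points back at pf */
    heap[cf].next &= SK_INDEX_MASK;   /* clear the free list bit */
}
```

The heap-order links (n and p) are untouched, because the block stays where it is and keeps its size.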

Operation of malloc when we have found a block that will fit the current request of b units with some left over

We'll allocate the new block at the END of the current free block so we don't have to change ANY free list pointers.

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
pf |*?? | ?? | cf | ?? |           pf |*?? | ?? | cf | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
 p | cf | ?? |   ...   |            p | cf | ?? |   ...   |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
cf |* n |  p | nf | pf |           cf |* c |  p | nf | pf |
   +----+----+----+----+              +----+----+----+----+
                                      +----+----+----+----+ This is the new
                                    c |  n | cf |   ..    | block at cf+b
                                      +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
 n | ?? | cf |   ...   |            n | ?? |  c |   ...   |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
nf |*?? | ?? | ?? | cf |           nf | ?? | ?? | ?? | pf |
   +----+----+----+----+              +----+----+----+----+

This one is pretty easy too, except we don't need to mess with the free list indexes at all because we'll allocate the new block at the end of the current free block. We do, however, have to adjust the indexes in cf, c, and n.
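
A sketch of this split follows. It places the new block so that it ends exactly where the old free block ended, which is one reading of the diagram; the names sk_block, SK_FREE_FLAG and sk_take_tail are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_FREE_FLAG  0x8000u
#define SK_INDEX_MASK 0x7FFFu

typedef struct {
    uint16_t next;       /* n, plus the free flag */
    uint16_t prev;       /* p */
    uint16_t next_free;  /* nf */
    uint16_t prev_free;  /* pf */
} sk_block;

/* Carve b blocks off the END of free block cf and return the index of
 * the new allocated block. cf stays on the free list, just shorter. */
static uint16_t sk_take_tail(sk_block *heap, uint16_t cf, uint16_t b) {
    uint16_t n = (uint16_t)(heap[cf].next & SK_INDEX_MASK);

    assert(heap[cf].next & SK_FREE_FLAG);
    assert(b > 0 && b < n - cf);      /* something must be left over */

    uint16_t c = (uint16_t)(n - b);
    heap[c].next = n;                 /* new block links forward to n */
    heap[c].prev = cf;                /* and back to the shrunken cf */
    heap[n].prev = c;
    heap[cf].next = (uint16_t)(c | SK_FREE_FLAG);
    return c;
}
```

Only the n and p indexes of cf, c and n change; the nf and pf fields are never written.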

That covers the initialization and all possible malloc scenarios, so now we need to cover the free operation possibilities...

Free Scenarios

The operation of free depends on the position of the current block being freed relative to free list items immediately above or below it. The code works like this:

if next block is free
    assimilate with next block already on free list
if prev block is free
    assimilate with prev block already on free list
else
    put current block at head of free list

Step 1 of the free operation checks if the next block is free, and if it is, assimilates the next block with this one.

Note that c is the block we are freeing up, cf is the free block that follows it.

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
pf |*?? | ?? | cf | ?? |           pf |*?? | ?? | nf | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
 p |  c | ?? |   ...   |            p |  c | ?? |   ...   |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+ This block is
 c | cf |  p |   ...   |            c | nn |  p |   ...   | disconnected
   +----+----+----+----+              +----+----+----+----+ from free list,
   +----+----+----+----+                                    assimilated with
cf |*nn |  c | nf | pf |                                    the next, and
   +----+----+----+----+                                    ready for step 2
   +----+----+----+----+              +----+----+----+----+
nn | ?? | cf | ?? | ?? |           nn | ?? |  c |   ...   |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
nf |*?? | ?? | ?? | cf |           nf |*?? | ?? | ?? | pf |
   +----+----+----+----+              +----+----+----+----+

Take special note that the newly assimilated block (c) is completely disconnected from the free list, and it does not have its free list bit set. This is important as we move on to step 2 of the procedure...

Step 2 of the free operation checks if the prev block is free, and if it is, assimilates it with this block.

Note that c is the block we are freeing up, pf is the free block that precedes it.

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+ This block has
pf |* c | ?? | nf | ?? |           pf |* n | ?? | nf | ?? | assimilated the
   +----+----+----+----+              +----+----+----+----+ current block
   +----+----+----+----+
 c |  n | pf |   ...   |
   +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
 n | ?? |  c |   ...   |            n | ?? | pf | ?? | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
nf |*?? | ?? | ?? | pf |           nf |*?? | ?? | ?? | pf |
   +----+----+----+----+              +----+----+----+----+

Nothing magic here, except that when we're done, the current block (c) is gone since it's been absorbed into the previous free block. Note that the previous step guarantees that the next block (n) is not free.

Step 3 of the free operation only runs if the previous block is not free. It just inserts the current block at the head of the free list.

Remember, 0 is always the first block in the memory heap, and it's always head of the free list!

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
 0 | ?? | ?? | nf |  0 |            0 | ?? | ?? |  c |  0 |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
 p |  c | ?? |   ...   |            p |  c | ?? |   ...   |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
 c |  n |  p |   ..    |            c |* n |  p | nf |  0 |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
 n | ?? |  c |   ...   |            n | ?? |  c |   ...   |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
nf |*?? | ?? | ?? |  0 |           nf |*?? | ?? | ?? |  c |
   +----+----+----+----+              +----+----+----+----+

Again, nothing spectacular here, we're simply adjusting a few pointers to make the most recently freed block the first item in the free list.

That's because finding the previous free block would mean a reverse traversal of blocks until we found a free one, and it's just easier to put it at the head of the list. No traversal is needed.
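
Putting the three steps together, a simplified sketch of the whole free operation might look like this. It works on block indexes rather than pointers and leaves out critical sections and error checks; sk_block, SK_FREE_FLAG and sk_free_block are hypothetical names, not the library's code.

```c
#include <assert.h>
#include <stdint.h>

#define SK_FREE_FLAG  0x8000u
#define SK_INDEX_MASK 0x7FFFu

typedef struct {
    uint16_t next;       /* n, plus the free flag */
    uint16_t prev;       /* p */
    uint16_t next_free;  /* nf */
    uint16_t prev_free;  /* pf */
} sk_block;

/* Free allocated block c, merging with free neighbours where possible. */
static void sk_free_block(sk_block *h, uint16_t c) {
    uint16_t n = (uint16_t)(h[c].next & SK_INDEX_MASK);
    assert(c != 0 && !(h[c].next & SK_FREE_FLAG));

    /* Step 1: if the next block is free, unhook it from the free list
     * and absorb it into c. */
    if (h[n].next & SK_FREE_FLAG) {
        h[h[n].prev_free].next_free = h[n].next_free;
        h[h[n].next_free].prev_free = h[n].prev_free;
        n = (uint16_t)(h[n].next & SK_INDEX_MASK);
        h[n].prev = c;
        h[c].next = n;
    }

    uint16_t p = h[c].prev;
    if (h[p].next & SK_FREE_FLAG) {
        /* Step 2: the previous block is free and already on the free
         * list, so it simply grows to cover c. */
        h[p].next = (uint16_t)(n | SK_FREE_FLAG);
        h[n].prev = p;
    } else {
        /* Step 3: put c at the head of the free list (after block 0). */
        uint16_t old_first = h[0].next_free;
        h[old_first].prev_free = c;
        h[c].next_free = old_first;
        h[c].prev_free = 0;
        h[0].next_free = c;
        h[c].next = (uint16_t)(n | SK_FREE_FLAG);
    }
}
```

Freeing every allocated block in a small heap with this sketch brings it back to a single free block after block 0, which is the no-two-consecutive-free-blocks property described earlier.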

Realloc Scenarios

Finally, we can cover realloc, which has the following basic operation.

The first thing we do is assimilate up with the next free block of memory if possible. This step might help if we're resizing to a bigger block of memory. It also helps if we're downsizing and creating a new free block with the leftover memory.

First we check to see if the next block is free, and we assimilate it to this block if it is. If the previous block is also free, and if combining it with the current block would satisfy the request, then we assimilate with that block and move the current data down to the new location.

Assimilating with the previous free block and moving the data works like this:

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
pf |*?? | ?? | cf | ?? |           pf |*?? | ?? | nf | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
cf |* c | ?? | nf | pf |            c |  n | ?? |   ...   | The data gets
   +----+----+----+----+              +----+----+----+----+ moved from c to
   +----+----+----+----+                                    the new data area  
 c |  n | cf |   ...   |                                    in cf, then c is
   +----+----+----+----+                                    adjusted to cf
   +----+----+----+----+              +----+----+----+----+
 n | ?? |  c |   ...   |            n | ?? |  c | ?? | ?? |
   +----+----+----+----+              +----+----+----+----+
            ...                                ...
   +----+----+----+----+              +----+----+----+----+
nf |*?? | ?? | ?? | cf |           nf |*?? | ?? | ?? | pf |
   +----+----+----+----+              +----+----+----+----+

Once that's done, there are three scenarios to consider:

  1. The current block size is exactly the right size, so no more work is needed.

  2. The current block is bigger than the new required size, so carve off the excess and add it to the free list.

  3. The current block is still smaller than the required size, so malloc a new block of the correct size and copy the current data into the new block before freeing the current block.

The only one of these scenarios that involves an operation that has not yet been described is the second one, and it's shown below:

   BEFORE                             AFTER

   +----+----+----+----+              +----+----+----+----+
 p |  c | ?? |   ...   |            p |  c | ?? |   ...   |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+              +----+----+----+----+
 c |  n |  p |   ...   |            c |  s |  p |   ...   |
   +----+----+----+----+              +----+----+----+----+
                                      +----+----+----+----+ This is the
                                    s |  n |  c |   ..    | new block at
                                      +----+----+----+----+ c+blocks
   +----+----+----+----+              +----+----+----+----+
 n | ?? |  c |   ...   |            n | ?? |  s |   ...   |
   +----+----+----+----+              +----+----+----+----+

Then we call free() with the address of the data portion of the new block (s) which adds it to the free list.
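
The split for the shrinking case can be sketched as follows, with the hypothetical names sk_block, SK_INDEX_MASK and sk_split_for_shrink. The function only relinks the heap-order indexes and returns s; handing s to free is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define SK_INDEX_MASK 0x7FFFu

typedef struct {
    uint16_t next;       /* n, plus the free flag */
    uint16_t prev;       /* p */
    uint16_t next_free;  /* nf */
    uint16_t prev_free;  /* pf */
} sk_block;

/* Keep the first `blocks` blocks of allocated block c and split the
 * rest off as a new block s, which is returned to the caller. */
static uint16_t sk_split_for_shrink(sk_block *heap, uint16_t c, uint16_t blocks) {
    uint16_t n = (uint16_t)(heap[c].next & SK_INDEX_MASK);
    assert(blocks > 0 && blocks < n - c);   /* there must be an excess */

    uint16_t s = (uint16_t)(c + blocks);
    heap[s].next = n;     /* s sits between c and n in heap order */
    heap[s].prev = c;
    heap[n].prev = s;
    heap[c].next = s;
    return s;
}
```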

umm_malloc's Issues

dbglog.h is missing

Compiler build error! There was a comment somewhere in the commits that macros from dbglog.h are moved to umm_malloc.c. It is not the case. The folder c-helper-macros is empty in the original repo.

ceedling build does not work on Linux

Building this library on Linux has a number of issues:

  • Warnings in the DBGLOG macros due to %x expecting unsigned int but argument is void *
  • Undefined reference to sqrtf() probably due to a missing -lm linker option under Linux

Fix these issues before moving on
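
For the first warning, one common fix is to print pointers with %p and a cast to void *. The sketch below uses plain snprintf() rather than the DBGLOG macros, and sk_format_ptr is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a pointer portably: %p expects a void *, while %x expects an
 * unsigned int and is the source of the warning on 64 bit hosts. */
static int sk_format_ptr(char *buf, size_t len, const void *ptr) {
    return snprintf(buf, len, "ptr=%p", (void *)ptr);
}
```

The sqrtf() link error is a separate matter of adding the math library to the link step.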

add stddef.h to umm_malloc.h

size_t is not defined by default and depending on how you build a project that includes umm_malloc, you can run into issues when umm_malloc.h tries to use it before stddef.h gets included elsewhere.

block list on .bss and memory data elsewhere

Hi,
I'm developing a project that has a DDR2 memory that is used only to store a specific type of data. I would like to manage this data with your code, but I couldn't find a way to keep the block list in on-chip memory (.bss) while it manages data pointers that live in another specific memory region.

Do you think that is possible?

umm_realloc() won't expand chunk located at the end of the heap

I somehow downloaded an older version dated 2014-04-02, but it's already deployed so not changing it.
However, I looked at the latest source and this issue doesn't seem to have been addressed.
umm_realloc() always hits the last case and calls malloc to move the chunk when the chunk to enlarge is at the end of the heap.
This happens because, in this case, umm_assimilate_up() does nothing.
I didn't put a lot of thought into it, but this seems to have fixed it for me. I made assimilate_up return true if its main condition passes:

   if ( !umm_assimilate_up( c, umm_heap ) && // did assimilate up fail?
        !(UMM_NBLOCK(UMM_NBLOCK(c)) & UMM_BLOCKNO_MASK) // are we at the end of the heap?
      ) {
      // Assimilate up as much as needed or as much as possible.
      unsigned short int upBlocks = blocks;
      if ( UMM_NUMBLOCKS <= c+upBlocks+1 )
         upBlocks = UMM_NUMBLOCKS-c-2; // not enough blocks available: grab as many as possible
      if ( upBlocks > 0 ) {
         // Copied this logic from malloc.
         UMM_NFREE(UMM_PFREE(c)) = c+upBlocks;
         memcpy( &UMM_BLOCK(c+upBlocks), &UMM_BLOCK(UMM_NBLOCK(c)), sizeof(umm_block) );
         UMM_NBLOCK(c) = c+upBlocks;
         UMM_PBLOCK(c+upBlocks) = c;
      }
   }

I'd appreciate any thoughts if someone has a better idea or sees a problem.
Thanks!

realloc has nested critical sections

realloc calls both malloc and free. The code path for malloc is ok, but the code for free has complications. The call to free is within a critical section guarded by UMM_CRITICAL_ENTRY and UMM_CRITICAL_EXIT. The free function body has its own critical section, which means it is nested within the one in the realloc body.
For small(ish) systems, such as the ESP8266, the ENTRY and EXIT guards enable/disable interrupts. In such cases, the code after the EXIT guard in free() and the final EXIT guard in realloc won't be guarded as is expected:

realloc()
  //some code
  UMM_CRITICAL_ENTRY //disable interrupts
  //some critical code
  call free()
    //some code
    UMM_CRITICAL_ENTRY //disable interrupts again (does nothing)
    //some critical code
    UMM_CRITICAL_EXIT //enable interrupts (code point A)
    //more code
    return from free()
  //more critical code
  UMM_CRITICAL_EXIT //enable interrupts again (does nothing, code point B)
  //final code
  return from realloc()

The code between code points A and B is expected to be guarded because of the ENTRY and EXIT, but it is not.

It is possible to fix this within the ESP compiling environment with a lock class instance for ENTRY, because there umm is being compiled as .cpp. In that case, each class instance saves the previous interrupt state and restores it on destruction. However, that is not viable in other systems where umm is being built as .c.

I think it would be best to implement a fix here by removing the nested critical sections. The solution would be to refactor free() into two functions: an internal unguarded core function, and a wrapper that calls the core function between ENTRY and EXIT guards. Then, the free() call in realloc() would be replaced with the unguarded function.

Edit: the code path for malloc is also not ok. While implementing #16 I realized that malloc is also called from within the critical section, so I refactored that as well.
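
The problem and the proposed refactoring can be reproduced on a desktop with a small simulation. Everything here is a stand-in: sk_irq_enabled models the interrupt flag, the SK_CRITICAL_* macros model guards that blindly disable and enable interrupts, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static bool sk_irq_enabled = true;      /* stand-in for the CPU interrupt flag */
static bool sk_seen_unguarded = false;  /* set if critical code ran unguarded */

/* Guards that do not remember the previous state, as on small targets. */
#define SK_CRITICAL_ENTRY() (sk_irq_enabled = false)
#define SK_CRITICAL_EXIT()  (sk_irq_enabled = true)

/* Any code that must only run with interrupts disabled. */
static void sk_critical_work(void) {
    if (sk_irq_enabled) {
        sk_seen_unguarded = true;
    }
}

/* Unguarded core: the caller is responsible for the critical section. */
static void sk_free_core(void) {
    sk_critical_work();
}

/* Public wrapper: guards the core for ordinary callers. */
static void sk_free(void) {
    SK_CRITICAL_ENTRY();
    sk_free_core();
    SK_CRITICAL_EXIT();
}

/* Original shape: the nested EXIT inside sk_free() re-enables interrupts
 * too early, so the work after it runs unguarded. */
static void sk_realloc_nested(void) {
    SK_CRITICAL_ENTRY();
    sk_critical_work();
    sk_free();
    sk_critical_work();   /* between code points A and B */
    SK_CRITICAL_EXIT();
}

/* Refactored shape: call the unguarded core from inside the section. */
static void sk_realloc_fixed(void) {
    SK_CRITICAL_ENTRY();
    sk_critical_work();
    sk_free_core();
    sk_critical_work();
    SK_CRITICAL_EXIT();
}
```

Calling sk_realloc_nested() sets sk_seen_unguarded, while sk_realloc_fixed() does not, which is the behaviour the refactoring is meant to achieve without relying on C++ lock objects.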

realloc strategy change

I have been working with a very old branch of umm_malloc that is in the Arduino ESP8266 project. I am in the process of adapting your current version into the project. I have a few thoughts I will share over time. For now, I want to focus on a change I noticed that might have been intentional or an oversight. A little background is needed. In the old version, I saw the comment:

"Bill Dittman provided excellent suggestions, including macros to support using these functions in critical sections, and for optimizing realloc() further by checking to see if the previous block was free and could be used for the new block size. This can help to reduce heap fragmentation significantly."

Of interest here is the "reduce heap fragmentation". While the top-level comment mentioned this, the code where it happens did not identify that it was doing it. The logic flow was to always try to assimilate up, then look to see if an assimilate down was possible. Always assimilating down when possible resulted in locally defragmenting the allocation. This process also applied to allocations that were being shrunk. This, of course, comes at the price of a memory copy.

After some searching, I found where the enhancement was removed: 13008fa (around line 531 of umm_malloc.c left side). The additional code cleanup changed the logic to one of reducing copy on realloc.

I can see where some would prefer to reduce memory copy and others would prefer the defragmentation. IMHO for the Arduino ESP8266 with very limited heap space, the local defragmentation when possible would be preferred.

Add speed test to framework

The current umm_malloc test framework does not have any speed tests. The recent addition of inline vs one-time fragmentation metric is an example of the need for performance testing of the algorithms used in umm_malloc.
The challenge is that the test conditions will vary depending on the hardware used, and umm_malloc is used on a variety of platforms, from ARM Cortex M[0..7] to ESP8266 to Intel processors (for fast local heaps).
Initially I will write the test case for both 32 and 64 bit MinGW compilers running on my Dell 7150 Pro - a small (and older) 2 in 1 tablet.
In the long run I would like to be able to run the test suite on a variety of devices, so the test may eventually have an input for the platform the test is running on and platform specific pass/fail criteria.

umm_init

Hi.

I'm having problems with umm_malloc on the cc3220sf.

umm_malloc.c has this code:

  /* init heap pointer and size, and memset it to 0 */
  umm_heap = (umm_block *)UMM_MALLOC_CFG__HEAP_ADDR;
  umm_numblocks = (UMM_MALLOC_CFG__HEAP_SIZE / sizeof(umm_block));
  memset(umm_heap, 0x00, UMM_MALLOC_CFG__HEAP_SIZE); 

If I printf(&umm_heap), I get an address in the data section, e.g. 0x20004939.
If I printf(UMM_MALLOC_CFG__HEAP_ADDR), I get an address in the heap, e.g. 0x20005be3.

From the main program I ran this test:

char *test = umm_malloc(20);
printf(&test);
out = 0x20004d25

Is this right?

Also, I sometimes see in the logs that umm tries to free a pointer whose address is in the data or bss section, e.g. free(0x20004de8), and does nothing because this pointer is null and was not allocated with umm_malloc.

Where is the problem? Or am I wrong?

umm_malloc_cfg.h conflicts with user's config file

Hi

Dunno if this is a bug or feature, but umm_malloc_cfg.h that is committed into your library conflicts with a file created by the user. This way users have to modify your umm_malloc_cfg.h file to change settings or remove it and provide their own somewhere else. And then if the users (like myself) are using your library as a git submodule, git reports modifications to the umm_memory directory which cannot be pushed unless they fork the umm_malloc. I think that only umm_malloc_cfg_example.h should exist in the repo. Please let me know if I'm missing something.

Move to full ceedling integration

Currently umm_malloc has the Unity (TDD for Embedded) framework as a submodule. This is OK but we end up not being able to take advantage of some of the features like mocking, lightweight exception handling and a test oriented build system.

Remove the Unity submodule, and update this repo to have the full ceedling environment.

Make all conditionally compiled functions upper case

In some cases, it is not clear if a function is really part of the source. For example the fragmentation metric can be calculated on-the-fly or on-demand. In the on-the-fly case, there are calls like this:

umm_fragmentation_metric_init();
umm_fragmentation_metric_add(c);

When on-the-fly calculation is disabled these are simply #defined as empty strings and have no effect on the generated code.

Unfortunately they look like normal functions. To give the reader a hint that they are in fact macros, without resorting to inline #ifdef guards that make the code messy, we will make all code that can be #defined out into upper case macros, like this:

#ifdef UMM_METRICS
#define UMM_FRAGMENTATION_METRIC_ADD(c) umm_fragmentation_metric_add(c)
#else
#define UMM_FRAGMENTATION_METRIC_ADD(c)
#endif // UMM_METRICS

There is no check if heap size is within configured limits

According to the documentation, max heap size = block_size * uint15 max, so for 8-byte blocks (by default), it's:

8 * 32767 = 262136 B

This limitation is not mentioned in the "configuration" section of the README. I would consider this information important enough to mention there, but even more important would be adding a runtime check to verify if the numbers align.

I would consider adding an assert inside umm_init_heap() to check if requested size of the heap is within limits.

@rhempel what do you think about it? I can submit a PR if you agree it makes sense.
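
A check along the lines suggested above might look like the sketch below. It follows the formula quoted in this issue (block size times the largest 15-bit block number); the `demo_` names and the placement inside a stand-in init routine are assumptions for illustration, not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCK_SIZE      8u       /* default block size quoted above */
#define DEMO_MAX_BLOCK_INDEX 0x7FFFu  /* largest 15-bit block number: 32767 */

/* Largest heap, in bytes, that 15-bit block numbers can address.
 * Computed in 32 bits so it also works where size_t is only 16 bits wide. */
static uint32_t demo_max_heap_size(void) {
    return (uint32_t)DEMO_BLOCK_SIZE * (uint32_t)DEMO_MAX_BLOCK_INDEX;
}

/* Returns 1 when the requested heap size fits, 0 otherwise. */
static int demo_heap_size_ok(size_t heap_size) {
    return heap_size <= demo_max_heap_size();
}

/* Hypothetical init routine showing where the assert would sit. */
void demo_checked_init_heap(void *heap, size_t heap_size) {
    assert(heap != NULL);
    assert(demo_heap_size_ok(heap_size));
    (void)heap; /* the real block-list setup would follow here */
}
```

For 8-byte blocks the limit works out to 8 * 32767 = 262136 bytes, just under 256 KiB.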

malloc readme inconsistency?

Trying to understand how the initialization works. In https://github.com/rhempel/umm_malloc/blame/master/README.md#L214 you mention that

The current block pointer will be at the end of the free list, and we know we're at the end of the list because the nf index is 0, like this:

In the example just above, https://github.com/rhempel/umm_malloc/blame/master/README.md#L175 the nf index is 1 though:

   BEFORE                             AFTER

               nf   pf                            nf   pf
   +----+----+----+----+              +----+----+----+----+
0  |  0 |  0 |  0 |  0 |           0  |  1 |  0 |  1 |  0 |
   +----+----+----+----+              +----+----+----+----+
   +----+----+----+----+
1  |  0 |  0 |  0 |  0 |
   +----+----+----+----+

Is #175 potentially wrong or am I missing something?

How does UMM_REDEFINE_MEM_FUNCTIONS work?

I read the config file, and there I found that define. But I do not find anything in the src, so I do not understand how that should replace the normal malloc.

many thanks
cheers
mathias

dbglog missing (and not submodule?)

Hi,

I'm banging my head against a brick wall trying to get the DBGLOG stuff to work ... everything is fine if I put some null defines in to replace it, but it's not part of the repo and doesn't seem to be a submodule???

Previous issues raised suggest it's a submodule and has been documented ... but I've tried a --recurse-submodules clone, and I've also tried a submodule init and submodule update ... but there's nothing! And there's no references anywhere I can find??

What am I missing?

Inconsistently fails for oversized allocation requests

I understand this won't generally be an issue for the intended use of this library, but it did cause me some trouble and think it's worth fixing. The reason it's an issue for me is that (for my application) large allocations must fail when requested of umm before the system moves on to another memory manager.

The issue is that when size_t is 32 bit, the block count computed in umm_blocks() overflows its 16-bit return type for large request sizes:
uint16_t umm_blocks(size_t size)

So with my umm configuration, if I request 8MB from umm (which, obviously it can't and shouldn't serve), it will succeed and return a ptr to some unknown chunk size.

The simple solution for me was to make umm_blocks() return zero when size is out of range.

I replaced this line:
return 2 + size / (UMM_BLOCKSIZE);
With this:
size /= (UMM_BLOCKSIZE);
if (size > 0xFFFF-2) {
  return 0;
}
return 2 + size;

Then test for zero as an error condition in alloc cores:
blocks = umm_blocks(size);
if (blocks == 0)
  return (void *)NULL;
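
To make the truncation concrete, here is a small stand-alone sketch. It is a simplified stand-in for umm_blocks() (it ignores the body-size adjustment the real function performs), and the `demo_` names and the 8-byte block size are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCKSIZE 8u /* assumed block size for this sketch */

/* Simplified form of the original calculation: the block count is
 * silently truncated to 16 bits when it does not fit. */
static uint16_t demo_blocks_unchecked(uint32_t size) {
    return (uint16_t)(2 + size / DEMO_BLOCKSIZE);
}

/* Range-checked variant following the fix described above:
 * a return value of 0 means the request is too large. */
static uint16_t demo_blocks_checked(uint32_t size) {
    size /= DEMO_BLOCKSIZE;
    if (size > 0xFFFFu - 2u) {
        return 0;
    }
    return (uint16_t)(2 + size);
}
```

An 8 MB request is 1048576 blocks of 8 bytes; adding 2 gives 0x100002, which truncates to 2 in a uint16_t. The unchecked version therefore reports a tiny allocation, while the checked version returns 0 so the caller can fail the request.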

Improve umm_malloc integration with Rust

From @mattico

I made a few changes to umm_malloc to help with integration:

  1. Runtime initialization. The standard Rust Embedded linker scripts don't put the heap at a fixed location, so I made umm_init take heap_addr and size. Not sure what a good general solution would be that preserves the ability to statically define the address/size.

  2. Support extern functions for critical section entry/exit. Rust projects may have arbitrary locking mechanisms, and rather than require users to translate those into C to #define I use extern functions for critical section entry and exit. Being unable to inline those is not ideal, but I couldn't think of a better solution. I do intend to provide an assembly solution for cortex-m that just enables/disables interrupts which should work for 80% of cases.
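
The extern-function approach in point 2 could be wired up on the C side roughly as sketched here. All names are hypothetical. In a real build the two hook functions would only be declared in C and implemented by the application (for example in Rust); they are defined in this file purely so the sketch links on its own.

```c
#include <assert.h>

/* Hypothetical hooks supplied by the application. */
void demo_critical_entry(void);
void demo_critical_exit(void);

/* The allocator code only ever sees these macros. */
#define DEMO_CRITICAL_ENTRY() demo_critical_entry()
#define DEMO_CRITICAL_EXIT()  demo_critical_exit()

/* Stand-in for a real lock or interrupt mask. */
static int demo_lock_depth = 0;

void demo_critical_entry(void) {
    demo_lock_depth++;
}

void demo_critical_exit(void) {
    assert(demo_lock_depth > 0); /* EXIT without a matching ENTRY is a bug */
    demo_lock_depth--;
}

/* Example of an allocator entry point using the hooks.
 * Returns the lock depth observed inside the guarded region. */
int demo_guarded_operation(void) {
    int depth_inside;
    DEMO_CRITICAL_ENTRY();
    depth_inside = demo_lock_depth;
    DEMO_CRITICAL_EXIT();
    return depth_inside;
}
```

The cost noted above still applies: because the hooks live in another translation unit, the compiler generally cannot inline them without link-time optimization.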

Some additional ideas that could help:

  1. Rust's GlobalAlloc API takes a size and alignment. I'm currently just asserting that the requested alignment is less than or equal to 8, the size of umm_block. In practice this works fine, requiring a larger alignment than that is pretty rare. Ideally umm_malloc would provide a memalign like API, though.

  2. I do want to expose umm_malloc's diagnostics APIs to Rust users easily. I have to do more investigation to see what is required, though.

umm_malloc version

Hello rhempel,

Thanks for the umm_malloc library. Could you please add version numbers for it?

Thanks

Separate stress testing from the poison block testing

Currently the stress testing mechanism is built into the poison block testing. This needs an awkward data structure to manage calling the correct version of the allocation functions. Even worse, the stress test results are slightly different for 8 byte blocks and higher block sizes due to the extra memory that must be allocated to handle the poison areas.
Ideally, we need to separate the stress testing from poison testing, and also update the poison data area size to correctly allocate extra memory depending on the block size so that the number of blocks allocated during poison testing remains the same regardless of the block size.

heap size

Hi
with the "#define UMM_MALLOC_CFG_HEAP_SIZE" I can set the heap size to a user-defined size. However, the heap size is limited to 256kB due to the maximum number of blocks. Is it possible to increase the heap to 1MB or more? Where do I have to make changes to achieve this?
I have read that it is possible to add more data bytes to the body of the memory block at the expense of free block size overhead.
Where can I add more data bytes to the body of the memory block?

Internal random number generator

Compiling the test suite under mingw or Cygwin produces different results during random allocations, even with the same seed number.

We believe this is due to the fact that each run-time library uses a slightly different random number generator algorithm.

To work around this problem, we will include in the test suite a random number generator that is platform independent.

https://en.wikipedia.org/wiki/Linear_congruential_generator
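
A platform-independent generator of this kind can be very small. The sketch below is one possible form, not necessarily the one used in the test suite: the `demo_` names are hypothetical, and the multiplier and increment are the widely published "Numerical Recipes" constants listed in the article linked above, with the modulus 2^32 coming from 32-bit wraparound.

```c
#include <stdint.h>

#define DEMO_LCG_A UINT32_C(1664525)
#define DEMO_LCG_C UINT32_C(1013904223)

typedef struct {
    uint32_t state;
} demo_lcg;

/* Start a reproducible sequence from the given seed. */
static void demo_lcg_seed(demo_lcg *g, uint32_t seed) {
    g->state = seed;
}

/* Advance the generator: state = (a * state + c) mod 2^32. */
static uint32_t demo_lcg_next(demo_lcg *g) {
    g->state = (uint32_t)(g->state * DEMO_LCG_A + DEMO_LCG_C);
    return g->state;
}

/* Map the next value into [0, bound) using the high bits, which are
 * the better-distributed bits of an LCG with a power-of-two modulus. */
static uint32_t demo_lcg_below(demo_lcg *g, uint32_t bound) {
    return (uint32_t)(((uint64_t)demo_lcg_next(g) * bound) >> 32);
}
```

Because the arithmetic uses only fixed-width unsigned types, the same seed yields the same sequence whichever C runtime library the tests are linked against.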

Use explicitly sized values - uintxx_t

In the current source, created about 10 years ago, we used standard C types such as unsigned short int to force the size of struct elements and other values.

This is NOT portable to 64 bit architectures so we need to use explicitly sized values where possible:

  • uint8_t
  • uint16_t
  • uint32_t

Warnings

VS2017RC: there are 17 warnings for x86 and 25 warnings for x64

Couple improvements

Hi guys,

I noticed a couple of things in the master branch:

  • umm_info -> Compiler error -> line 165/171. Variable umm_metrics does not exist (I think it should be ummHeapInfo)
  • umm_integrity_check -> use of printf instead of DBGLOG_INFO functions
  • umm_poison -> get_unpoisoned() wrong type of variable: uint8_t c must be uint16_t c

Best regards,
Douwe

Allow umm_malloc to initialize the heap in different ways

In some use scenarios, there may not be linker supplied values to tell us where the heap should be or how big it is. In this case we will need an API call umm_init_heap() that lets us pass in the start address and size manually. The old umm_init() API is still there, it just calls umm_init_heap() with the linker supplied values.
While we are at it, we can let the user choose whether they want a check for an initialized heap before any heap operation, and if so, to allow this condition to be handled as needed.
Finally, we clean up the override mechanism for umm_malloc to allow compile time -D flags to take priority over a user supplied config file. If none of these are present then the system default umm_malloc_cfg.h file takes precedence.
To implement this successfully we will need the following incremental changes:

  • Add a way to either auto-initialize a heap or call an error trap if any allocations are called on an uninitialized heap
  • Implement the concept of one or more heap control blocks to allow multiple heaps
  • Improve the umm_malloc compile time configuration system
  • Implement heap control block initialization for integration into Rust
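
The relationship between the two init calls described above, together with the optional "is the heap initialized?" check, might be sketched like this. Every `demo_` name is hypothetical, and a static array stands in for the linker-supplied heap region; the real signatures and control block layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap control block. */
typedef struct {
    void  *base;
    size_t size;
    int    initialized;
} demo_heap_ctl;

static demo_heap_ctl demo_heap = { NULL, 0, 0 };

/* Stand-ins for the values a linker script would normally supply. */
static uint8_t demo_linker_heap[256];
#define DEMO_HEAP_ADDR ((void *)demo_linker_heap)
#define DEMO_HEAP_SIZE (sizeof(demo_linker_heap))

/* Explicit form: the caller passes the start address and size. */
void demo_init_heap(void *ptr, size_t size) {
    demo_heap.base = ptr;
    demo_heap.size = size;
    demo_heap.initialized = 1;
}

/* Legacy form: forwards the linker-supplied values. */
void demo_init(void) {
    demo_init_heap(DEMO_HEAP_ADDR, DEMO_HEAP_SIZE);
}

/* Optional check before a heap operation: this sketch auto-initializes,
 * but a build could call an error trap here instead. */
void *demo_heap_base(void) {
    if (!demo_heap.initialized) {
        demo_init();
    }
    return demo_heap.base;
}
```

Keeping the old entry point as a thin wrapper means existing callers keep working while runtime-configured users call the explicit form directly.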

Add gcov feature to testing framework

Currently we do not have any code coverage metrics - and that's about to change.
Add support for gcov tasks in project.yml
In general, I think we want to run each configuration separately when doing coverage testing.
