As a follow-up to the previous article, here are some tips to determine the use of the heap by your program.
In C, it is quite difficult to estimate heap usage because the allocator functions are type-erased, i.e. the return type of the allocator is void*
, so it is implicitly converted to the final type.
So you can't easily characterize the allocations made by type. At best, you can make a histogram according to the allocation size, but not much more.
If you absolutely must qualify allocations by type, it is recommended that you replace (with the preprocessor) malloc
with another function that does not return an void*
like this:
// In break_malloc.c
struct NotConstructible {};
NotConstructible * breakMalloc(const size_t size) { return ::malloc(size); }
// In break_malloc.h
NotConstructible * breakMalloc(const size_t size);
#define malloc breakMalloc
When you'll build your firmware, force the inclusion of the break_malloc.h
file (with -include
flag), the compiler will error out on each allocation and you'll be able to modify the call to something more verbose like:
void * malloc_type(const size_t size, const char * type);
// With type extracted from the error line:
// replace: MyStruct * ptr = malloc(sizeof(MyStruct) * 4);
// by: MyStruct * ptr = malloc_type(sizeof(MyStruct) * 4, "MyStruct");
After that, you have to store the allocations according to the given type in a hash table for example. See below for a possible code.
In C++, there is no implicit conversion of void*
to another type. It is easier to capture the type when calling the allocator by using a template function.
However, the allocation happens systematically via the void * operator new(size_t)
function, which has also lost its initial type. It is therefore not possible to capture the allocated type here (or else, with difficulty, by generating a backtrace and parsing it, which we cannot afford in the embedded world).
It is possible to do the same trick as in C mode with the preprocessor:
template <typename T>
struct tag {
const std::size_t size = { sizeof(T) };
static const char * name() { return typeid(T).name(); }
};
template<typename T>
void * operator new(size_t s, const tag<T> t) {
Registry::credit(t::name(), t::size, s);
return ::malloc(s);
}
#define new bad_new
And replace all calls to new
with a placement new new(tag<MyStruct>{})
. Normally, a C++ code should not have too many calls to new
.
If you use STLs, it is possible to act at a higher level, via allocators. In this case, we will write an allocator that will replace the default allocator and capture the allocations made by your program. The code is a bit more complex, so here it is, bit by bit:
#include <unordered_map>
#include <typeinfo>
#include <limits>
#include <cstdio>
#include <cstring>
// Containers we inject in the allocator
#include <string>
struct Registry
{
struct AllocTrack
{
size_t count;
size_t max;
size_t current;
AllocTrack() : count(0), max(0), current(0) {}
};
std::unordered_map<const char *, AllocTrack> registry;
static Registry & get_instance() { static Registry a; return a; }
void credit(const char * key, std::size_t count, std::size_t size) {
AllocTrack & track = registry[key];
track.count += count;
track.max += size;
track.current += size;
}
void debit(const char * key, std::size_t count, std::size_t size) {
AllocTrack & track = registry[key];
track.count -= count;
track.current -= size;
}
void dump() {
// Not using C++ iostream here to avoid triggering the allocator itself
printf("Allocator registry:\n");
for (auto it : registry) {
char buffer[256];
strncpy(buffer, it.first, 256);
char * t = strstr(buffer, ";");
if (!t) t = strstr(buffer, "]");
if (t) *t = 0;
printf("%s: %lu instances %lu bytes, max usage %lu\n", buffer, it.second.count, it.second.current, it.second.max);
}
}
template<typename T>
static const char * getTypeName() {
const char * templateType = __PRETTY_FUNCTION__;
const char * ptr = strstr(templateType, "=") + 2;
return ptr;
}
~Registry() {
dump();
}
};
The AllocTrack
structure captures allocation statistics, i.e. the number of allocations in progress, the maximum size allocated (total), the size of an allocation unit.
The Registry
structure is a registry which associates a type (given by its name) with its statistics. We can perform 2 operations on this database, credit or debit an allocation.
The getTypeName()
function is a trick to avoid compiling your program with exceptions (-frtti
), it uses the preprocessor to extract the type from the given template parameter.
Then, it's quite simple, we replace the allocator with our own:
template <class T>
class accounting_allocator {
public:
// type definitions
typedef T value_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T& reference;
typedef const T& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
// rebind allocator to type U
template <class U>
struct rebind {
typedef accounting_allocator<U> other;
};
// return address of values
pointer address (reference value) const {
return &value;
}
const_pointer address (const_reference value) const {
return &value;
}
/* constructors and destructor
* - nothing to do because the allocator has no state
*/
accounting_allocator() throw() {
}
accounting_allocator(const accounting_allocator&) throw() {
}
template <class U>
accounting_allocator (const accounting_allocator<U>&) throw() {
}
~accounting_allocator() throw() {
}
// return maximum number of elements that can be allocated
size_type max_size () const throw() {
// std::cout << "max_size()" << std::endl;
return std::numeric_limits<std::size_t>::max() / sizeof(T);
}
// allocate but don't initialize num elements of type T
pointer allocate (size_type num, const void* = 0) {
pointer ret = (pointer)(::operator new(num*sizeof(T)));
Registry::get_instance().credit(Registry::getTypeName<T>(), num, num * sizeof(T));
return ret;
}
// initialize elements of allocated storage p with value value
void construct (pointer p, const T& value) {
// initialize memory with placement new
new((void*)p) T(value);
}
// destroy elements of initialized storage p
void destroy (pointer p) {
// destroy objects by calling their destructor
p->~T();
}
// deallocate storage p of deleted elements
void deallocate (pointer p, size_type num) {
// print message and deallocate memory with global delete
::operator delete((void*)p);
Registry::get_instance().debit(Registry::getTypeName<T>(), num, num * sizeof(T));
}
};
template<>
class accounting_allocator<void>
{
public:
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef void* pointer;
typedef const void* const_pointer;
typedef void value_type;
template<typename _Tp1>
struct rebind
{ typedef std::allocator<_Tp1> other; };
};
// return that all specializations of this allocator are interchangeable
template <class T1, class T2>
bool operator== (const accounting_allocator<T1>&, const accounting_allocator<T2>&) throw() { return true; }
template <class T1, class T2>
bool operator!= (const accounting_allocator<T1>&, const accounting_allocator<T2>&) throw() { return false; }
#ifdef DEBUG_ALLOCATOR
typedef std::basic_string<char, std::char_traits<char>, accounting_allocator<char> > VString;
namespace std
{
template<>
struct hash<VString>
{
typedef VString argument_type;
typedef std::size_t result_type;
result_type operator()(argument_type const& in) const
{
return std::_Hash_impl::hash(in.data(), in.length());
}
};
}
typedef std::basic_stringstream<char, std::char_traits<char>, accounting_allocator<char> > VStringStream;
typedef std::basic_ostringstream<char, std::char_traits<char>, accounting_allocator<char> > VOStringStream;
inline bool operator== (const VString & a, const std::basic_string<char, std::char_traits<char>, std::allocator<char>> & b) throw() { return a.size() == b.size() && memcmp(a.data(), b.data(), a.size()) == 0; }
#else
typedef std::string VString;
typedef std::stringstream VStringStream;
typedef std::ostringstream VOStringStream;
#endif
There is an example here for the std::string
type but it is possible to do the same with all STL containers. You have to replace in your code std::string
by VString
, then compile and run your program.
When you want to know the consumption of a container in your project, you call Registry::dump()
and you get:
Allocator registry:
char: 512 instances 512 bytes, max usage 1785