Categories
c++ memory-management placement-new standards

Array placement-new requires unspecified overhead in the buffer?

68

5.3.4 [expr.new] of the C++11 Feb draft gives the example:

new T[5] results in a call of operator new[](sizeof(T)*5+x), and new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).

Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. —end example ]

Now take the following example code:

void* buffer = malloc(sizeof(std::string) * 10);
std::string* p = ::new (buffer) std::string[10];

According to the above quote, the second line new (buffer) std::string[10] will internally call operator new[](sizeof(std::string) * 10 + y, buffer) (before constructing the individual std::string objects). The problem is that if y > 0, the pre-allocated buffer will be too small!

So how do I know how much memory to pre-allocate when using array placement-new?

void* buffer = malloc(sizeof(std::string) * 10 + how_much_additional_space);
std::string* p = ::new (buffer) std::string[10];

Or does the standard somewhere guarantee that y == 0 in this case? Again, the quote says:

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.

16

  • 3

I don’t think you can know that at all. I think placement new was always thought of more as a tool for plugging in your own memory manager than as something that lets you pre-allocate memory. Anyway, why don’t you simply loop through the array, constructing each element with regular (single-object) placement new? I don’t think it will influence performance much, because placement new is basically a no-op, and the constructors for all objects in the array have to be called separately anyway.

    – j_kubik

    Jan 4, 2012 at 1:01

  • 3

@j_kubik that’s not as simple as it looks! If one of the constructors throws midway through the loop, you have to clean up the objects you already constructed, something the array-new forms do for you (see the sketch after these comments). But everything seems to indicate placement-array-new cannot be safely used.

    Jan 4, 2012 at 1:07


  • 2

    @FredOverflow: Thanks a ton for clarifying the question.

    Jan 4, 2012 at 1:09


  • 1

That is what would make sense and how I thought it was done. However, if that were the case, it should be an implementation detail of the operator new[] and operator delete[] in whatever scope they are located to deal with this extra overhead internally, rather than having this overhead passed along with the minimal required space. I think that was the original intent, but if a constructor throws an exception, this can cause a problem if it’s not known how many elements have been constructed. What’s really missing from C++ is a way to define how to construct an array of elements.

    – Adrian

    Jun 24, 2013 at 18:23
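
For reference, here is a minimal sketch of the per-element approach discussed in the comments above: construct each element with single-object placement new (which involves no array overhead) and clean up the already-constructed elements if a constructor throws. The element type, count, and fill value are only illustrative.

#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

int main()
{
    void* buffer = std::malloc(sizeof(std::string) * 10);
    std::string* p = static_cast<std::string*>(buffer);

    std::size_t constructed = 0;
    try
    {
        // Single-object placement new: no array cookie is involved.
        for (; constructed < 10; ++constructed)
            ::new (p + constructed) std::string("hello");
    }
    catch (...)
    {
        // Destroy whatever was constructed before re-throwing,
        // which is what the array-new form would have done for us.
        while (constructed > 0)
            p[--constructed].~basic_string();
        std::free(buffer);
        throw;
    }

    // ... use p[0] .. p[9] ...

    for (std::size_t i = 10; i > 0; --i)
        p[i - 1].~basic_string();
    std::free(buffer);
}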


51

Update

Nicol Bolas correctly points out in the comments below that this has been fixed such that the overhead is always zero for operator new[](std::size_t, void* p).

This fix was done as a defect report in November 2019, which makes it retroactive to all versions of C++.
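With that resolution, the code from the question is fine as written: the buffer needs exactly sizeof(std::string) * 10 bytes, because the overhead for ::operator new[](std::size_t, void*) is now required to be zero. A minimal sketch (the manual destruction at the end is only illustrative):

#include <cstdlib>
#include <memory>
#include <new>
#include <string>

int main()
{
    void* buffer = std::malloc(sizeof(std::string) * 10);
    std::string* p = ::new (buffer) std::string[10];   // zero overhead, so p == buffer

    // ... use p[0] .. p[9] ...

    // Placement-new'd objects must still be destroyed by hand; delete[] would be wrong here.
    std::destroy(p, p + 10);   // C++17; otherwise call ~basic_string() in a loop
    std::free(buffer);
}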

Original Answer

Don’t use operator new[](std::size_t, void* p) unless you know a priori the answer to this question. The answer is an implementation detail and can change with compiler/platform, though it is typically stable for any given platform; e.g. it is specified by the Itanium ABI.

If you don’t know the answer to this question, write your own placement array new that can check this at run time:

#include <cstddef>
#include <iostream>
#include <new>
#include <string>

inline
void*
operator new[](std::size_t n, void* p, std::size_t limit)
{
    // n is sizeof(T) * count plus the platform's array overhead y;
    // accept the request only if it fits in the space we actually have.
    if (n <= limit)
        std::cout << "life is good\n";
    else
        throw std::bad_alloc();
    return p;
}

int main()
{
    alignas(std::string) char buffer[100];
    std::string* p = new(buffer, sizeof(buffer)) std::string[3];
}

By varying the array size and inspecting n in the example above, you can infer y for your platform. For my platform y is one word, and the size of a word depends on whether I’m compiling for a 32-bit or 64-bit architecture.
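For instance, one way to infer y directly is to pass an out-parameter through a placement form and subtract the payload size. The std::size_t* parameter below is my own illustrative addition, not anything the standard provides:

#include <cstddef>
#include <iostream>
#include <new>
#include <string>

inline
void*
operator new[](std::size_t n, void* p, std::size_t* requested)
{
    *requested = n;   // record exactly what the compiler asked for
    return p;
}

int main()
{
    alignas(std::string) char buffer[1000];
    std::size_t requested = 0;
    std::string* p = new(buffer, &requested) std::string[3];
    std::cout << "y = " << requested - 3 * sizeof(std::string) << '\n';
    for (int i = 2; i >= 0; --i)
        p[i].~basic_string();   // placement-new'd objects are destroyed by hand
}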

16

  • 1

    @Kerrek SB: You are correct that I was careless with alignment. I’ve added alignas to the client code to make things right. The placement new expression should take care of alignment with respect to the “cookie” and the “data” respectively. For example here is how the Itanium ABI does it (sourcery.mentor.com/public/cxx-abi/abi.html#array-cookies). And yes, you can infer y as you suggest. Be aware that y may be dependent on the alignment of the new’d type, and on whether or not that type has a trivial destructor (and other platforms may have other details).

    Jan 5, 2012 at 2:11

  • 5

    @HowardHinnant: I’m still baffled that the placement version requires any cookie at all. What’s it for? What’s in it? After all, the only way you can destroy those array elements is by hand, isn’t it? Your link even says that there’s no cookie for the placement version (size_t, void*). Do you think the non-zeroness of the cookie should be a defect report?

    – Kerrek SB

    Jan 5, 2012 at 2:14


  • 2

@Kerrek SB: Well that’s a good question and I’m not sure I have a good answer for it. I suppose that some hypothetical user-written placement delete, which is called in case there is an exception thrown during the default construction of each element, might make use of the cookie during clean up. But I don’t have a good example of such a case in my back pocket. And even if such a hypothetical user-written placement delete existed, it would necessarily be platform dependent. On the bright side, it is legal for y to be 0. 🙂

    Jan 5, 2012 at 3:50

  • 2

    If you would like to submit a defect report on this, it should be aimed at the CWG (as opposed to the LWG). Here is the CWG issues list: open-std.org/jtc1/sc22/wg21/docs/cwg_active.html . And your best strategy for submitting an issue is to email the author of that list. I don’t know if an issue demanding y == 0 always would be successful if for no other reason than backwards compatibility with established ABI’s such as the Itanium ABI. Breaking ABI at this low level is very daunting.

    Jan 5, 2012 at 3:54

  • 5

It seems that there is already a defect report on this matter! D’oh…

    – Kerrek SB

    Jan 5, 2012 at 4:56

9

Update: After some discussion, I understand that my answer no longer applies to the question. I’ll leave it here, but a real answer is definitely still called for.

I’ll be happy to support this question with some bounty if a good answer isn’t found soon.

I’ll restate the question here as far as I understand it, hoping that a shorter version might help others understand what’s being asked. The question is:

Is the following construction always correct? Is arr == addr at the end?

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

We know from the standard that #1 causes the call ::operator new[](???, addr), where ??? is an unspecified number no smaller than N * sizeof(T), and we also know that that call only returns addr and has no other effects. We also know that arr is offset from addr correspondingly. What we do not know is whether the memory pointed to by addr is sufficiently large, or how we would know how much memory to allocate.


You seem to confuse a few things:

  1. Your example calls operator new[](), not operator new().

  2. The allocation functions do not construct anything. They allocate.

What happens is that the expression T * p = new T[10]; causes:

  1. a call to operator new[]() with size argument 10 * sizeof(T) + x,

  2. ten calls to the default constructor of T, effectively ::new (p + i) T().

The only peculiarity is that the array-new expression asks for more memory than what is used by the array data itself. You don’t see any of this and cannot make use of this information in any way other than by silent acceptance.


If you are curious how much memory was actually allocated, you can simply replace the array allocation functions operator new[] and operator delete[] and make it print out the actual size.
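
A minimal sketch of such a replacement (the logging and the malloc-based implementation are only illustrative):

#include <cstdio>
#include <cstdlib>
#include <new>
#include <string>

void* operator new[](std::size_t n)
{
    std::printf("operator new[](%zu)\n", n);   // prints 10 * sizeof(std::string) + x below
    if (void* p = std::malloc(n))
        return p;
    throw std::bad_alloc();
}

void operator delete[](void* p) noexcept
{
    std::free(p);
}

int main()
{
    std::string* p = new std::string[10];
    delete[] p;
}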


Update: As a random piece of information, you should note that the global placement-new functions are required to be no-ops. That is, when you construct an object or array in-place like so:

T * p = ::new (buf1) T;
T * arr = ::new (buf10) T[10];

Then the corresponding calls to ::operator new(std::size_t, void*) and ::operator new[](std::size_t, void*) do nothing but return their second argument. However, you do not know what buf10 is supposed to point to: It needs to point to 10 * sizeof(T) + y bytes of memory, but you cannot know y.

8

  • You should expand on the difference between what new does and the operator new function. Until the linked convo, I thought new was simply syntactic sugar. Also, the calls to operator new instead of operator new[] were typos. I did it AGAIN in this comment 🙁

    Jan 4, 2012 at 0:08


  • @MooingDuck: I, and others, have done so countless times before on SO. I recommend a Good Book, or searching SO.

    – Kerrek SB

    Jan 4, 2012 at 0:09

  • 3

    But what about new(buf) T[10]? How do you make buf big enough? (Coming from the chat discussion I know this is the actual intended question, but it was not made clear 🙁 )

    Jan 4, 2012 at 0:11


  • @R.MartinhoFernandes: You’re absolutely right; I’ve amended the answer, and basically I don’t have an answer to the question now. I won’t delete this unless someone takes exception to it, but we definitely need a proper answer still.

    – Kerrek SB

    Jan 4, 2012 at 1:03

  • 3

    @GMan: No! On the contrary: We have no idea how much memory is required by ::new (buf) T[n]! That’s what the initial quote from 5.3.4 says: We call ::operator new[](sizeof(T) * n + y, buf), with no knowledge about y.

    – Kerrek SB

    Jan 4, 2012 at 1:32


7

Calling any version of operator new[] () won’t work too well with a fixed size memory area. Essentially, it is assumed that it delegates to some real memory allocation function rather than just returning a pointer to the allocated memory. If you already have a memory arena where you want to construct an array of objects, you want to use std::uninitialized_fill() or std::uninitialized_copy() to construct the objects (or some other form of individually constructing the objects).

You might argue that this means that you have to destroy the objects in your memory arena manually as well. However, calling delete[] array on the pointer returned from the placement new won’t work: it would use the non-placement version of operator delete[] ()! That is, when using placement new you need to manually destroy the object(s) and release the memory.
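
A minimal sketch of that approach, constructing the objects with std::uninitialized_fill in a pre-allocated buffer and destroying them by hand afterwards (the element type, count, and fill value are only illustrative; std::destroy is C++17, otherwise call the destructors in a loop):

#include <cstdlib>
#include <memory>
#include <string>

int main()
{
    // Exactly enough raw storage: no array new-expression, hence no cookie.
    void* buffer = std::malloc(sizeof(std::string) * 10);
    std::string* first = static_cast<std::string*>(buffer);

    // Construct the ten strings in place; uninitialized_fill destroys the
    // already-constructed elements if one of the constructors throws.
    std::uninitialized_fill(first, first + 10, std::string("hello"));

    // ... use the objects ...

    // Destroy manually and release the raw memory; delete[] must not be used here.
    std::destroy(first, first + 10);
    std::free(buffer);
}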

5

  • 1

    Good point about placement operator delete[](). @Mooing Duck: pay attention to it.

    Jan 4, 2012 at 6:17

  • 1

I’m aware that placement-newed objects have to be destroyed manually. uninitialized_fill is a good idea, but you seem to be saying that the overloaded operator new for arrays that takes a buffer in the C++ spec won’t work for what it’s designed for. Is that what you’re saying? (That is what chat determined.)

    Jan 4, 2012 at 6:49

  • 2

placement operator new[]() does what it is intended for: allocating memory using additional arguments and constructing objects in that memory. What doesn’t seem to work portably is the version which only takes a void* to already-allocated memory. Given that you wouldn’t know where the objects end up, it seems questionable anyway.

    Jan 4, 2012 at 7:01

  • 2

The entire point is that only the standard delete[] operator requires the information that is stored in the extra bytes (both for walking the array to invoke each element’s destructor, and for passing the size of the array to the deallocation function, if it needs it). The interesting question for me is now whether the standard actually says so, or if we’ve found a defect.

    Jan 4, 2012 at 8:08

  • I don’t think this qualifies as a defect. However, I agree that the standard may be enhanced to remove the possibility of using more memory than the objects need.

    Jan 4, 2012 at 8:22