std::vector with a not-so-movable type

Consider this:

class some_class {
    public:
        some_class(some_class && x) = delete;
    // ....
};

Let’s say that we do not want some class type to prohibit move operations, and for some reason, we want to keep a std::vector<T> type of these. Unfortunately this doesn’t work as-is.

C++ 101 : what is a ‘move’ operation?

Consider this:

std::vector<int> get_power_vect(int size) {
    std::vector<int> ret;
    ret.reserve(size);
    for(int x=0; x<size; x++) {
        ret.push_back(x);
    }
    return ret;
}

On classic C++, this kind of code resulted in a horrible *facepalm*. This is because the final return statement results in creating a sizable vector, and then returning a copy of the vector to the caller, and then immediately destroying the function’s stack, resulting in destroying the original vector.

This resulted in a lot of old C++ code doing something like this:

void get_power_vect(int size, std::vector<int> & ret) {
    ...
}

This works without performance overhead, but there are a couple of problems:

It isn’t clear if an argument (especially if it’s a reference type) is an input or an output. Either it has to be clearly documented (which we know that people are bad at) or somebody has to spend time reading the implementation.
The function has to think of what the input vector would look like – e.g., what will happen if the vector ret was non-empty?

C++11 changed this by introducing ‘rvalue’ references which ends up being ‘moved’ instead of being ‘copied’. I won’t go into too much details, but to put it simply, what happens is that a variable gets moved instead of copied if the source variable is going to be destroyed right away.

So, the statement ‘return ret’ results in the caller’s vector contents being moved to the callee’s context, so none of that silly copy operation happens. Thus, modern C++ recommends even returning huge STL data structures because this is going to result in only a couple of pointer copy operations.

A move constructor looks something like this:

some_class::some_class(some_class && x) {...}

See the ‘&&’ operation – that is the ‘rvalue reference’ type, which (to put simply) means that the variable is a rvalue type which will be destroyed right after this call, so you don’t need to do any deep copying.

So, what can possibly go wrong?

A move operation is a godsend in many cases, but is going to definitely be confusing in other cases. The biggest problem is that it can screw over pointers and references. Consider this:

std::vector<int> x;
std::vector<int> * xp = &x;
...
std::vector<int> y = std::move(x); // force a move operation.  the value of x
                                   // is now undefined
...
return xp->size() + 3; // undefined behavior

So, once we did a move, any pointers pointing to the original object gets to point to invalid locations. This may be a bit easy to avoid when the object is confined to a small piece of block, but if your class spans multiple hundreds of files… then good luck avoiding all this trouble.

Thus, if there is an object type that should never be moved around even accidently because there will be many pointers or references of that object everywhere, then it’s better to just prohibit move operations (or even copy operations) by disallowing it:

some_class(some_class && x) = delete;

But… but… but…

The main problem I had was that prohibiting move operations mean that they are no longer usable on many STL data structures (notably std::vector). In many cases I ended up having to keep a vector of std::unique_ptr simply because these things were not movable.

But why does std::vector require a move operation?

Remember that std::vector is a ‘elastic’ array. Add something – you construct that on the end of the buffer. The problem is that if you run out of buffer, std::vector has is to construct a new buffer and then move each individual object to the new buffer. Append something on the front – then std::vector shifts everything by one (involving move) and then creates the new object on the newly created spot.

But if we don’t need those operations, we should be able to use std::vector. It’s perfectly valid that we create a vector of size 100 (or some arbitrary size) and then keep it at that size until it gets destroyed. The only problem is that the compiler can’t statically know if the move will be ever required or not. So it disallows it.

Solution : make it a runtime error

I ended up with a very simple solution: redefine the move operation to be a runtime error. Something like this:

some_class(some_class && x) {
    throw std::runtime_error("Move operations not allowed");
}

This results in the code compiling but crash if a move operation is ever invoked. Except… it changes a compile-time error to a runtime error which is much harder to debug. Do this on a sparsely-used data structure might be okay, but it’s going to look bad if there is something like 1000 places using that class.

So, an alternative – simply create a child class that defines the move type as a runtime error, and use that type for std::vector.

template <typename T>
class assert_fail_move : public T {
    using T::T;
public:
    assert_fail_move(assert_fail_move && x) {
        throw std::runtime_error("Did not expect move operations to happen");
    }
};

template <typename T> using unmovable_vector =  
                            std::vector<assert_fail_move<T>>;

Example usage:

// Test example
class testclass {
private:
    int _w;
    int _x;
public:
    testclass(int w, int x) {
        _w = w; _x = x;
    }
    testclass() {
        _w = 0; _x = 0;
    }
    testclass(testclass && x) = delete;
    int sumsum() const {
        return _w + _x;
    }
    void setval(int w, int x) {
        _w = w;
        _x = x;
    }
};

void print(const testclass * x) {
    std::cout << x->sumsum() << std::endl;
}
void print(testclass & x) {
    std::cout << x.sumsum() << std::endl;
}
int main() {
    unmovable_vector<testclass> v(10);
    for(int i=0; i<10; i++) {
        v[i].setval(i, i+3);
    }

    unmovable_vector<testclass> v2;
    v2.reserve(10);
    // will result in a runtime error if we reserved too less
    for(int i=0; i<10; i++) {
        v2.emplace_back(i, i+3);
    }
    for(auto & x : v) {
        print(&x);
    }
    for(auto & x : v2) {
        print(x);
    }
    return 0;
}

C++ 101 : what is a ‘move’ operation?

So, what can possibly go wrong?

But… but… but…

Solution : make it a runtime error

Leave a Reply Cancel reply