Professional way to process a large problem without filling up huge arrays: C++, free memory from part of an array


I’m developing a physics simulation, and as I’m rather new to programming, I keep running into problems when my programs have to handle large amounts of data (memory issues, mainly). I know about dynamic memory allocation and deallocation (new / delete, etc.), but I need a better approach to how I structure the program.

Let’s say I’m simulating an experiment which is running for a few days, with a very large sampling rate. I’d need to simulate a billion samples, and run over them.

As a super-simplified version, we’ll say a program takes voltages V[i], and sums them in fives:

i.e. NewV[0] = V[0] + V[1] + V[2] + V[3] + V[4]

then NewV[1] = V[1] + V[2] + V[3] + V[4] + V[5]

then NewV[2] = V[2] + V[3] + V[4] + V[5] + V[6]
…and this goes on for a billion samples.

In the end, I’d have V[0], V[1], …, V[1000000000], when instead the only ones I’d need to store for the next step are the last 5 V[i]s.

How would I delete / deallocate part of the array so that the memory is free to use again (say V[0] after the first part of the example where it is no longer needed)? Are there alternatives to how to structure such a program?

I’ve heard about malloc / free, but I’ve also heard that they shouldn’t be used in C++ and that there are better alternatives.

Thanks very much!

tldr; what to do with parts of arrays (individual elements) I don’t need anymore that are taking up a huge amount of memory?

13

What you describe, “smoothing by fives”, is a finite impulse response (FIR) digital filter. Such filters are implemented with circular buffers. You keep only the last N values, you keep an index into the buffer that tells you where the oldest value is, you overwrite the current oldest value with the newest one at each step, and you step the index, circularly, each time.
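As a minimal sketch (the class name and the window size of 5 are just illustrative), such a circular buffer might look like this:

#include <array>
#include <cstddef>
#include <numeric>

// Holds only the last N samples; older ones are overwritten in place.
template <std::size_t N>
class SampleWindow {
public:
    // Overwrite the oldest sample with the newest one and step the index circularly.
    void push(double sample) {
        buffer[index] = sample;
        index = (index + 1) % N;
        if (count < N) ++count;
    }

    bool full() const { return count == N; }

    // Sum of the buffered samples (meaningful once the window is full).
    double sum() const {
        return std::accumulate(buffer.begin(), buffer.end(), 0.0);
    }

private:
    std::array<double, N> buffer{};
    std::size_t index = 0;  // slot of the oldest sample, i.e. the next one to overwrite
    std::size_t count = 0;  // samples seen so far (only matters while the buffer first fills)
};

Streaming the billion samples through a SampleWindow<5>, calling push on each one and writing sum out once full returns true, keeps only five doubles in memory at any time.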

You keep your collected data, that you are going to crunch down, on disk.

Depending on your environment, this may be one of those places where you’re better off getting experienced help. At a university, you put a note up on the bulletin board in the Computer Science Department, offering student wages (or even student consulting rates) for a few hours of work, to help you crunch your data. Or maybe you offer Undergraduate Research Opportunity points. Or something.

5

Every problem can be solved by adding an additional level of indirection. So do that.

You can’t delete part of an array in C++. But you can create a new array holding just the data you want to keep, then delete the old one. So you can build a data structure that allows you to “remove” elements you don’t want from the front. What it will actually do is create a new array and copy the unremoved elements to the new one, then delete the old.

Or you could just use std::deque, which can effectively do this already. deque, or “double-ended queue”, is a data structure intended for cases where you’re deleting elements from one end while adding elements to the other.
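A minimal sketch of that idea with std::deque (the file names and the window size of 5 are only illustrative):

#include <deque>
#include <fstream>
#include <numeric>

int main() {
    std::ifstream in("V.txt");      // hypothetical input file, one sample per line
    std::ofstream out("NewV.txt");  // hypothetical output file

    std::deque<double> window;
    double v;
    while (in >> v) {
        window.push_back(v);        // newest sample enters at the back
        if (window.size() > 5)
            window.pop_front();     // oldest sample leaves at the front
        if (window.size() == 5)
            out << std::accumulate(window.begin(), window.end(), 0.0) << "\n";
    }
}

Only five samples are ever held in memory; everything else stays on disk.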

9

The FIR and SMA answers you got are good for your case; however, I’d like to take the opportunity to present a more generic approach.

What you have here is a stream of data: instead of structuring your program in 3 big steps (load data, compute, output result), which requires loading all the data in memory at once, you can structure it as a pipeline.

A pipeline starts with a stream, transforms it, and pushes it to a sink.

In your case, the pipeline looks like:

  1. Read items from disk, emit items one at a time
  2. Receive items one at a time, for each item received emit the last 5 received (where your circular buffer comes in)
  3. Receive items 5 at a time, for each group compute the result
  4. Receive the result, write it to disk

C++ tends to use iterators rather than streams, but to be honest streams are easier to model (there is a proposal for ranges which would be similar to streams):

#include <cstddef>
#include <fstream>
#include <string>
#include <boost/optional.hpp>

// Abstract stream: next() yields the next element, or an empty optional
// once the stream is exhausted.
template <typename T>
class Stream {
public:
    virtual boost::optional<T> next() = 0;
    virtual ~Stream() {}
};

// Reads Items from a file, one at a time.
class ReaderStream: public Stream<Item> {
public:
    explicit ReaderStream(const std::string& fileName);

    boost::optional<Item> next() override final;

private:
    std::ifstream file;
};

// Groups consecutive Items into windows of the requested size;
// this is where the circular buffer comes in.
class WindowStream: public Stream<Window> {
public:
    WindowStream(Stream<Item>& items, std::size_t windowSize);

    boost::optional<Window> next() override final;

private:
    Window window;
    Stream<Item>& items;
};

// Computes one Result per Window.
class ResultStream: public Stream<Result> {
public:
    explicit ResultStream(Stream<Window>& windows);

    boost::optional<Result> next() override final;

private:
    Stream<Window>& windows;
};

And then the pipeline looks like this:

ReaderStream itemStream("input.txt");
WindowStream windowStream(itemStream, 5);
ResultStream resultStream(windowStream);
std::ofstream results("output.txt");

while (boost::optional<Result> result = resultStream.next()) {
    results << *result << "\n";
}

Streams are not always applicable (they do not work when you need random access to the data), but when they are, they rock: by operating on a very small amount of memory you keep it all in CPU cache.


On another note: your problem sounds “embarrassingly parallel”. You may want to split your big file into chunks (keeping in mind, for windows of 5, that adjacent chunks must share 4 elements at each boundary) and then process the chunks in parallel.

If CPU is the bottleneck (and not I/O), you can speed things up by splitting the file into roughly equal chunks and launching one process per core.
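A rough sketch of that idea, assuming the samples have already been split into in-memory chunks that overlap by 4 elements, and using threads rather than separate processes for brevity (the chunking and the file I/O are omitted here):

#include <cstddef>
#include <thread>
#include <vector>

// Sum every window of 5 inside one chunk; adjacent chunks must share
// 4 samples at the boundary so no window is lost.
std::vector<double> processChunk(const std::vector<double>& chunk) {
    std::vector<double> sums;
    for (std::size_t i = 0; i + 5 <= chunk.size(); ++i)
        sums.push_back(chunk[i] + chunk[i + 1] + chunk[i + 2] + chunk[i + 3] + chunk[i + 4]);
    return sums;
}

// One worker thread per chunk (size the chunks so there is roughly one per core).
std::vector<std::vector<double>> processAll(const std::vector<std::vector<double>>& chunks) {
    std::vector<std::vector<double>> results(chunks.size());
    std::vector<std::thread> workers;
    for (std::size_t c = 0; c < chunks.size(); ++c)
        workers.emplace_back([&, c] { results[c] = processChunk(chunks[c]); });
    for (auto& worker : workers) worker.join();
    return results;
}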
