I disagree. It's more than "allocate memory", it's: "give me some room on
the buffer". When we don't care about downtime (like now), it's enough to
kmalloc a temporary buffer and flush (write to the FD) immediately.
But when we will care about downtime, "the buffer" will be memory in kernel
where the entire checkpoint image data will be kept while the container is
frozen (memory contents need only be marked copy-on-write, not explicitly
buffered).
In this setting, cr_kwrite() will not really write the data to the FD, but
only record somewhere that the data exists. Only after the container resumes
execution will we eventually write the data to the FD and free the buffer.
(This is also useful in case you want to keep the checkpoint image entirely
in memory without flushing to any FD; and yes, there are use-cases for that).
So for this, instead of using kmalloc() to allocate a temp-buf, fill it with
data, and then copy that data to the actual (big) buffer, the mechanism
provides a shortcut to allocate space directly on the buffer, and save the
copy.
But you miss the point: the writing to the file descriptor in the (future)
optimized version will _not_ happen while the container is frozen. Only
much later after the container resumes, so at the end it will be:
(1) freeze container (2) dump data to in-kernel buffer (mark dirty pages
copy-on-write) (3) unfreeze container (4) write in-kernel buffer to FD.
By deferring step #4 until after the freeze-period, you can save tens and
hundreds of milliseconds, and seconds. Now the goal is to make step #2 be
as fast as possible, and every millisecond (and less) counts, e.g. to take
fast checkpoints of interactive desktops. And my experience is that in
such "workload" repetitive kmalloc/kfree inside that dump loop has a
measurable impact on performance (of course, for those objects which are
abundant).
Actually it simplifies *many* error paths -- in *all* cr_write_...()
functions - including the future ones (and there are many to come).
You can see how it's used in the code that dumps files. Or look at the next
incarnation of the patch.
Oren.
--