International conference of developers
and users of free software

Linux Userspace Checkpoint/Restore: From Dreams to Reality

Andrew Vagin, Moscow, Russia

LVEE 2013

Checkpoint/restore is a feature that freezes a set of running processes and saves their complete state to disk. Unfortunately, many attempts to merge such functionality into the upstream Linux kernel have failed, mostly because of code complexity. The OpenVZ kernel development team found a way around this obstacle by implementing most of the required pieces in userspace, with minimal changes to the kernel.

Checkpoint/restore is a feature that freezes a set of running processes and saves their complete state to disk. This state can later be restored, so that the processes resume exactly as they were running before. This opens up a whole set of possibilities, from live migration to fast startup of huge applications.

Unfortunately, many attempts to merge such functionality into the upstream Linux kernel have failed, mostly because of code complexity. That leaves the Linux community with the poor option of using out-of-tree kernel patches, available from e.g. OpenVZ or Oren Laadan.

The OpenVZ kernel development team has recently found a way around this obstacle: implement most of the required pieces in userspace, with minimal changes to the kernel.

The project started about a year ago, but it is already quite capable: CRIU (Checkpoint/Restore In Userspace) can now dump an LXC container running Apache and MySQL.
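
As a rough illustration of the dump-then-restore workflow, the sketch below only builds the two command lines and prints them; it does not run anything. The `criu` binary name and the `--tree`, `--images-dir`, and `--shell-job` options come from the CRIU command-line tool as documented in later releases and are an assumption here, since this abstract does not describe the interface. The helper names, the PID, and the directory are hypothetical.

```python
def dump_command(pid, images_dir):
    """Build a command that checkpoints the process tree rooted at pid."""
    return [
        "criu", "dump",
        "--tree", str(pid),          # root of the process tree to freeze
        "--images-dir", images_dir,  # where the state images are written
        "--shell-job",               # assumed: the task was started from a shell
    ]


def restore_command(images_dir):
    """Build a command that restores processes from previously saved images."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


if __name__ == "__main__":
    # Hypothetical PID and directory, for illustration only.
    print(" ".join(dump_command(1234, "/tmp/ckpt")))
    print(" ".join(restore_command("/tmp/ckpt")))
```

Keeping the commands as argument lists means they could be passed to `subprocess.run` unchanged on a machine where the tool is installed and the caller has the required privileges.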

Many interesting problems were solved during CRIU development.
For example, CRIU needed the ability:

  • to inject parasite code (solved by Tejun Heo)
  • to dump and restore pending data in sockets
  • to dump and restore shared objects
  • etc.
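
To make the socket item more concrete, here is a minimal sketch of one general technique for saving pending data: reading a socket's receive queue with `MSG_PEEK`, which copies the bytes without consuming them, and later re-sending them into a fresh socket. The abstract does not say how CRIU itself handles this, so the approach and the helper names `peek_pending`, `requeue`, and `demo` are illustrative assumptions. The sketch uses a local `socketpair` on Linux.

```python
import socket


def peek_pending(sock, max_bytes=65536):
    """Copy queued bytes out of sock without removing them from its queue."""
    try:
        return sock.recv(max_bytes, socket.MSG_PEEK | socket.MSG_DONTWAIT)
    except BlockingIOError:
        return b""  # nothing is queued


def requeue(writer, data):
    """Refill the peer's receive queue by sending the saved bytes again."""
    if data:
        writer.sendall(data)


def demo():
    # "Dump" side: data has been sent but not yet read by the receiver.
    sender, receiver = socket.socketpair()
    sender.sendall(b"in flight")
    saved = peek_pending(receiver)      # copy the pending bytes
    still_queued = receiver.recv(65536)  # the original queue is intact
    sender.close()
    receiver.close()

    # "Restore" side: a new pair of sockets gets the saved bytes back.
    new_sender, new_receiver = socket.socketpair()
    requeue(new_sender, saved)
    restored = new_receiver.recv(65536)
    new_sender.close()
    new_receiver.close()
    return saved, still_queued, restored


if __name__ == "__main__":
    print(demo())
```

Peeking rather than reading matters because a dump that leaves the processes running must not change what they will see on their next `recv`.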

The technology should be useful to system and distro developers, advanced users, and anyone interested in containers, virtualization, and high-availability systems.

Abstract licensed under Creative Commons Attribution-ShareAlike 3.0 license
