Checkpoints¶
To achieve resilience and fault tolerance, Apache Flink® uses checkpoints with stateful functions. These checkpoints allow Flink to recover the state and position within the stream and provide failure-free execution for applications.
Essentially, a checkpoint creates a snapshot of the data stream and stores it. The snapshots then provide a mechanism to recover from unexpected job failures. Compared to traditional database systems, checkpoints are closer to recovery logs than to backups.
graph TD;
id5[/Older records in data stream/]-->id4[[Checkpoint barrier n-1]];
id4 --- id3[(Checkpoint operator state)];
id3 --- id2[[Checkpoint barrier n]];
id2-->id1[/Newer records in data stream/];
id3-->id6[(State backend)];
For more information, see the Apache Flink® documentation on checkpoints.