educative.io

What's the difference between FsImage and image file snapshot?

The lesson says: "At periodic intervals, the EditLog and FsImage files are merged to create a new image file snapshot, and the edit log is cleared out."
Does this mean that FsImage and the image file snapshot are the same thing?
If not, what is the difference between them? What exactly are they?


Course: Grokking the Advanced System Design Interview - Learn Interactively
Lesson: Fault Tolerance - Grokking the Advanced System Design Interview

Hi @HSU_TZUJEN,

The FsImage is stored in the NameNode’s local file system as a file.

Snapshots allow you to save a read-only copy of data as it was at a particular time. A snapshot of the complete file system can also be taken. Taking a snapshot does not copy any data; it only records metadata such as file sizes and block information for a snapshottable directory.
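To make the "metadata only" point concrete, here is a minimal toy sketch in Python. It is not an HDFS API: `ToyDirectory`, `FileMeta`, and `create_snapshot` are hypothetical names made up for this illustration, and real HDFS tracks snapshot differences far more efficiently than the full metadata copy shown here.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class FileMeta:
    size: int      # file length in bytes
    blocks: tuple  # IDs of the blocks that hold the file's data


class ToyDirectory:
    """Hypothetical stand-in for a snapshottable directory (not an HDFS API)."""

    def __init__(self):
        self.files = {}      # live view: file name -> FileMeta
        self.snapshots = {}  # snapshot name -> read-only metadata view

    def write(self, name, size, blocks):
        # Record new metadata for a file in the live view.
        self.files[name] = FileMeta(size, tuple(blocks))

    def create_snapshot(self, snap_name):
        # Copy only the metadata records; no block contents are read or copied.
        frozen = MappingProxyType(dict(self.files))
        self.snapshots[snap_name] = frozen
        return frozen


d = ToyDirectory()
d.write("a.txt", 128, ["blk_1"])
snap = d.create_snapshot("s0")
d.write("a.txt", 256, ["blk_1", "blk_2"])  # the live view moves on
```

After the last line, `snap` still describes `a.txt` as it was when the snapshot was taken, while `d.files` reflects the later write. The snapshot view is read-only, so trying to assign into it raises a `TypeError`.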

In simple terms, FsImage stores the file system metadata, such as which blocks make up each file and other related information, whereas a snapshot is a read-only image of the data/file system at a particular point in time.
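For the merge quoted in the question (EditLog plus FsImage producing a new image, after which the edit log is cleared), a rough sketch may help. This is a simplified model and not Hadoop code: the namespace is assumed to be a plain dict from file name to block list, and `apply_edit`, `checkpoint`, and the `(op, name, blocks)` edit format are hypothetical.

```python
def apply_edit(namespace, edit):
    """Apply one logged operation to a namespace dict (file name -> block list)."""
    op, name, blocks = edit
    if op == "create":
        namespace[name] = list(blocks)
    elif op == "delete":
        namespace.pop(name, None)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(fsimage, edit_log):
    """Replay the edit log on a copy of the image; return (new_image, empty_log)."""
    new_image = {name: list(blocks) for name, blocks in fsimage.items()}
    for edit in edit_log:
        apply_edit(new_image, edit)
    # The edits are now reflected in the new image, so the log starts over empty.
    return new_image, []


old_image = {"a.txt": ["blk_1"]}
log = [("create", "b.txt", ["blk_2"]), ("delete", "a.txt", None)]
new_image, log = checkpoint(old_image, log)
```

In this model the "new image file snapshot" from the lesson is simply the next FsImage: the old image with every logged change applied. That is a different thing from the user-facing, read-only snapshots of a snapshottable directory described above.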

I hope this explains the difference. Please let me know if you still have any confusion.

Thank You.

Could I think of a snapshot as a read-only FsImage, whereas an FsImage can be modified by applying the EditLog?
A snapshot captures all of the metadata, right?

From your response, I still can’t tell what the difference between them is, other than that a snapshot is read-only and an FsImage is not.

Could I say that HDFS periodically stores FsImages, which are modifiable snapshots, so that we could use the FsImages and EditLogs to produce a snapshot for any particular timestamp we want?