sync is a standard system call in the Unix operating system that commits to disk all data in the kernel filesystem buffers, i.e., data that has been scheduled for writing via low-level I/O system calls. POSIX only requires that the writes be scheduled, so sync() may return before they complete; Linux waits for them to finish. Higher-level I/O layers such as stdio may maintain separate buffers of their own, which sync does not flush.
As a C function, sync() is typically declared as void sync(void) in <unistd.h>. The same operation is available through a command-line utility, also called sync, and through similarly named functions in other languages such as Perl.
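The distinction between stdio buffers and kernel buffers can be illustrated with a minimal sketch. The helper name flush_stream_then_sync is hypothetical and not part of any library; the sketch assumes a POSIX system where <unistd.h> declares sync().

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: push one stdio stream's buffer into the kernel,
 * then ask the kernel to write out all of its dirty buffers. */
int flush_stream_then_sync(FILE *fp)
{
    /* fflush() empties the user-space stdio buffer into the kernel. */
    if (fflush(fp) == EOF)
        return -1;

    /* sync() takes no arguments and returns nothing, so it cannot
     * report an error to the caller. */
    sync();
    return 0;
}
```

Calling sync() alone would leave any data still held in the stdio buffer untouched, which is why fflush() comes first.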
The related system call fsync() commits only the buffered data relating to a specified file descriptor. fdatasync() writes out only the changes made to the data in the file, and not necessarily the file's related metadata such as timestamps; metadata needed to read the data back, such as the file size, is still written.
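A minimal sketch of how a program might choose between the two calls follows. The function write_and_sync and its data_only flag are hypothetical names introduced for illustration, and the sketch assumes a system such as Linux that provides fdatasync().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer to fd, then flush it.
 * data_only != 0 selects fdatasync(); otherwise fsync() is used. */
int write_and_sync(int fd, const void *buf, size_t len, int data_only)
{
    const char *p = buf;

    /* write() may transfer fewer bytes than requested, so loop. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Both calls return 0 on success and -1 on error. */
    return data_only ? fdatasync(fd) : fsync(fd);
}
```

The return value of the flush call is worth checking: an error here is often the only notification that buffered data failed to reach the disk.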
Unix systems typically run some kind of flush or update daemon, which calls the sync function on a regular basis. On some systems, the cron daemon does this; on Linux it was handled by the pdflush daemon until kernel 2.6.32, which replaced pdflush with per-device flusher threads. Buffers are also flushed when filesystems are unmounted or remounted read-only, for example prior to system shutdown.
To provide proper durability, databases need to use some form of sync to make sure that written information has reached non-volatile storage rather than sitting in a memory-based write cache that would be lost if power failed. PostgreSQL, for example, may use a variety of different sync calls, including fsync() and fdatasync(), to make commits durable. Unfortunately, for any single client writing a series of records, a rotating hard drive can only commit once per rotation, which allows at best a few hundred such commits per second. Turning off the fsync requirement can therefore greatly improve commit performance, but at the expense of potentially introducing database corruption after a crash.
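The "few hundred commits per second" ceiling follows from simple arithmetic. As a rough worked example, assuming each commit has to wait for one full platter rotation and taking 7200 and 15000 RPM as illustrative spindle speeds (not figures from any particular drive):

```latex
\[
\text{commits per second} \;\le\; \frac{\text{RPM}}{60},
\qquad
\frac{7200}{60} = 120,
\qquad
\frac{15000}{60} = 250 .
\]
```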
Databases also employ log files (typically much smaller than the main data files) that record recent changes, so that those changes can be reliably redone after a crash; the main data files can then be synced less often.
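The division of labor between the log and the data files can be sketched as follows. This is a simplified illustration under stated assumptions, not how any particular database implements its log: the function names log_commit, data_apply, and checkpoint are hypothetical, and log_fd is assumed to have been opened with O_APPEND.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: append one record to the log and make it durable.
 * A short write is treated as failure to keep the sketch brief. */
int log_commit(int log_fd, const void *rec, size_t len)
{
    if (write(log_fd, rec, len) != (ssize_t)len)
        return -1;
    return fsync(log_fd);   /* the commit is durable once this returns 0 */
}

/* Hypothetical: apply the change to the main data file WITHOUT syncing;
 * the log already holds enough information to redo it after a crash. */
int data_apply(int data_fd, const void *rec, size_t len, off_t offset)
{
    if (pwrite(data_fd, rec, len, offset) != (ssize_t)len)
        return -1;
    return 0;
}

/* Hypothetical: called occasionally to flush the main data file, after
 * which older log records are no longer needed for recovery. */
int checkpoint(int data_fd)
{
    return fsync(data_fd);
}
```

Only log_commit sits on the critical path of each commit; the more expensive sync of the large data file happens once per checkpoint rather than once per change.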
Hard disks may default to using their own volatile write cache to buffer writes, which greatly improves performance while introducing a potential for lost writes. (Tools such as hdparm -F will instruct the HDD controller to flush the on-drive write cache buffer.) The performance impact of turning caching off is so large that even the normally conservative FreeBSD community rejected disabling write caching by default in FreeBSD 4.3.
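Even when fsync is used correctly on a file, data loss may still occur in operations on directory entries, such as file creation, file renaming, or linking. Calling fsync() on a file does not necessarily ensure that the entry in the directory containing the file has also reached the disk; for that, fsync() must also be called on a file descriptor for the directory.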
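One common way to handle this is to sync the file first and then sync its parent directory after the rename. The sketch below is an illustration under stated assumptions rather than a complete recipe: sync_dir and replace_file_durably are hypothetical names, the caller supplies the directory and both paths, and the filesystem is assumed to allow fsync() on a directory opened read-only, as Linux filesystems generally do.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: flush a directory's entries to disk. */
static int sync_dir(const char *dir_path)
{
    int dfd = open(dir_path, O_RDONLY);
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc;
}

/* Hypothetical helper: write buf to tmp_path, make it durable, rename it
 * over final_path, then make the rename itself durable. Both paths are
 * assumed to live inside dir_path. */
int replace_file_durably(const char *dir_path, const char *tmp_path,
                         const char *final_path,
                         const void *buf, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Step 1: file contents reach the disk before the name changes. */
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0)
        return -1;

    /* Step 2: switch the name over to the new contents. */
    if (rename(tmp_path, final_path) != 0)
        return -1;

    /* Step 3: the directory entry change reaches the disk. */
    return sync_dir(dir_path);
}
```

Skipping step 3 leaves a window in which the file's data is on disk but the directory still records the old name, or no name at all for a newly created file.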
libnofsync is a hack that replaces fsync with a dummy function that does nothing.
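The idea can be illustrated with a few lines of C. This is a sketch of the general technique, not libnofsync's actual source: it assumes the replacement is compiled into a shared object and interposed ahead of the C library by the dynamic linker (for example through LD_PRELOAD on Linux).

```c
#include <unistd.h>

/* Same signature as the real fsync(), but performs no I/O and always
 * reports success. Any durability guarantee is lost. */
int fsync(int fd)
{
    (void)fd;   /* argument intentionally ignored */
    return 0;
}
```

Because the dummy always returns 0, callers cannot tell that their data was never forced to disk, so this is suitable only where losing data after a crash is acceptable, such as throwaway test runs.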