The read-of-non-persistent-write problem is found for lock-free programs on persistent memory. As compare-and-swap (CAS) operations do not persist the written values to persistent memory, the modified data can be made visible by the cache coherence protocol to a concurrent observer before the modified data can be observed by a crash observer at persistent memory. If a power failure happens right after the write is made visible but not yet persistent, the read-of-non-persistent-write problem can occur, i.e., a data variable that is modified by a compare-and-swap operation can be made visible to a concurrent observer before a crash observer, causing potential crash inconsistencies.
To illustrate the problem: for a singly linked lock-free list, a node can be inserted by a producer thread A after the head node, the next pointer of the head node gets atomically switched (CAS) to point to the new node A, however, this CAS is not persisted. Then, another node gets inserted by producer thread B after node A, as CAS for node A is already visible to all concurrent threads. CAS atomically switches the next pointer of node A to point to node B, and this CAS gets persisted. If a power failure happens at this point, the application that uses the linked list would be left in an inconsistent state, with both node A and node B lost, as the next pointer from the head node to node A has not been persisted. As node B has been published but can’t be accessed after a reboot, and other data may have been persisted that are accessed through or dependent on node B, all subsequent accesses to such data will not be possible, causing data loss.3
The read-of-non-persistent-write problem is not limited to lock-free linked lists, it can be found in any lock-free data structures where the potential gap between concurrent visibility and persistent visibility can exist. For instance, a similar problem can occur with persistent circular buffers.4
Satish M. Thatte. 1986. Persistent memory: a storage architecture for object-oriented database systems. In Proceedings on the 1986 international workshop on Object-oriented database systems (OODS '86). IEEE Computer Society Press, Los Alamitos, CA, USA, 148-159. ↩
P. Mehra and S. Fineberg, "Fast and flexible persistence: the magic potion for fault-tolerance, scalability and performance in online data stores," 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings., Santa Fe, NM, USA, 2004, pp. 206-. doi: 10.1109/IPDPS.2004.1303232 ↩
Wang, William; Diestelhorst, Stephan (June 17, 2019). "Persistent Atomics for Implementing Durable Lock-Free Data Structures for Non-Volatile Memory (Brief Announcement)". The 31st ACM Symposium on Parallelism in Algorithms and Architectures. Association for Computing Machinery. pp. 309–311. doi:10.1145/3323165.3323166. ISBN 9781450361842. S2CID 195064876 – via ACM Digital Library. 9781450361842 ↩
Wolczko, Mario (April 26, 2019). "Non-Volatile Memory and Java: Part 2". Medium. https://medium.com/@mwolczko/non-volatile-memory-and-java-part-2-c15954c04e11 ↩