It is often desirable to have concurrent read-write access to a database when there is no need for full recoverability or transaction semantics. For this class of applications, Berkeley DB provides an interface supporting deadlock-free, multiple-reader/single-writer access to the database. This means that at any instant, there may be either multiple readers accessing data or a single writer modifying data. The application is entirely unaware of which is happening, and Berkeley DB implements the necessary locking and blocking to ensure this behavior.
To create Berkeley DB Concurrent Data Store applications, you must first initialize an environment by calling DB_ENV->open. You must specify the DB_INIT_CDB and DB_INIT_MPOOL flags to that interface. It is an error to specify any of the other DB_ENV->open subsystem or recovery configuration flags, for example, DB_INIT_LOCK, DB_INIT_TXN, or DB_RECOVER. All databases must, of course, be created in this environment by using the db_create interface or Db constructor, and specifying the environment as an argument.
Berkeley DB performs appropriate locking in its interface so that safe enforcement of the deadlock-free, multiple-reader/single-writer semantic is transparent to the application. However, a basic understanding of Berkeley DB Concurrent Data Store locking behavior is helpful when writing Berkeley DB Concurrent Data Store applications.
Berkeley DB Concurrent Data Store avoids deadlocks without the need for a deadlock detector by performing all locking on an entire database at once (or on an entire environment in the case of the DB_CDB_ALLDB flag), and by ensuring that at any given time only one thread of control is allowed to simultaneously hold a read (shared) lock and attempt to acquire a write (exclusive) lock.
All open Berkeley DB cursors hold a read lock, which serves as a guarantee that the database will not change beneath them; likewise, all non-cursor DB->get operations temporarily acquire and release a read lock that is held during the actual traversal of the database. Because read locks will not conflict with each other, any number of cursors in any number of threads of control may be open simultaneously, and any number of DB->get operations may be concurrently in progress.
To enforce the rule that only one thread of control at a time can attempt to upgrade a read lock to a write lock, however, Berkeley DB must forbid multiple cursors from attempting to write concurrently. This is done using the DB_WRITECURSOR flag to the DB->cursor interface. This is the only difference between access method calls in Berkeley DB Concurrent Data Store and in the other Berkeley DB products. The DB_WRITECURSOR flag causes the newly created cursor to be a "write" cursor; that is, a cursor capable of performing writes as well as reads. Only cursors thus created are permitted to perform write operations (either deletes or puts), and only one such cursor can exist at any given time.
Any attempt to create a second write cursor, or to perform a non-cursor write operation, while a write cursor is open will block until that write cursor is closed. Read cursors may be opened and may perform reads without blocking while a write cursor is open. However, any attempt to actually perform a write, either using the write cursor or directly using the DB->put or DB->del methods, will block until all read cursors are closed. This is how the multiple-reader/single-writer semantic is enforced, and it prevents reads from seeing an inconsistent database state that may be an intermediate stage of a write operation.
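The blocking rules above can be summarized as a small state model. The following is a toy sketch in plain C, not Berkeley DB code: the struct cds_model type and the model_* function names are invented for illustration, and the model tracks only how many read cursors and whether a write cursor are open on one database.

```c
#include <stdbool.h>

/* Toy model of the cursor state of one database (illustrative only). */
struct cds_model {
	int read_cursors;	/* number of open read cursors */
	bool write_cursor;	/* true while a write cursor is open */
};

/* Opening a read cursor does not block, even with a write cursor open. */
bool
model_open_read_cursor_blocks(const struct cds_model *m)
{
	(void)m;
	return (false);
}

/* A second write cursor blocks until the first one is closed. */
bool
model_open_write_cursor_blocks(const struct cds_model *m)
{
	return (m->write_cursor);
}

/* A write through the write cursor waits for all read cursors to close. */
bool
model_cursor_write_blocks(const struct cds_model *m)
{
	return (m->read_cursors > 0);
}

/*
 * A non-cursor write (DB->put or DB->del) waits both for an open write
 * cursor and for all read cursors to close.
 */
bool
model_direct_write_blocks(const struct cds_model *m)
{
	return (m->write_cursor || m->read_cursors > 0);
}
```

Note that the model has no notion of which thread owns a cursor: a direct write is reported as blocking whenever any read cursor is open, regardless of who opened it.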
With these behaviors, Berkeley DB can guarantee deadlock-free concurrent database access, so that multiple threads of control are free to perform reads and writes without needing to handle synchronization themselves or having to run a deadlock detector. Because Berkeley DB has no knowledge of which cursors belong to which threads, however, some care must be taken to ensure that applications do not inadvertently block themselves, causing the application to hang and be unable to proceed. Some common mistakes include the following:
Copyright Sleepycat Software