9. Transaction Management

Why do we care?

Transaction Support

What is a transaction?

Properties of Transactions

There are four basic (ACID) properties that define a transaction:

Concurrency Control

The process of managing simultaneous operations on the database without having them interfere with one another

Lost Update Problems

A successfully completed update is overridden by another user
Lost Update Problem.png
- Loss of T2’s update avoided by preventing T1 from reading balx until after update

Uncommitted Dependency Problem

Occurs when one transaction can see intermediate results of another transaction before it has committed
Uncommitted dependency problem.png
- Problem avoided by preventing T3 from reading balx until after T4 commits or aborts

Inconsistent Analysis Problems

Occurs when a transaction reads several values but a second transaction updates some of them during the execution of the first
Inconsistent Analysis Problem.png
- Problem avoided by preventing T6 from reading balx and balz until after T5 completed updates

Serialisability

The objective of a concurrency control protocol is to schedule non-interfering transactions

Concurrency Control Techniques

Two basic concurrency control techniques

Locking

Transactions use locks to deny access to other transactions and to therefore prevent incorrect updates
Two types

Locks are used in the following way:

Two-Phase Locking (2PL)

A transaction follows the 2PL protocol if all locking operations precede the first unlock operations in the transaction
Two phases for the transaction:

Cascading Rollback
If every transaction in a schedule follows 2PL, the schedule is serialisable

Deadlock

An impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released
Deadlock.png
There is only one way to break a deadlock: Abort all but one of the transactions
A deadlock should be transparent to the user, so the DBMS should restart the transaction(s)
However, in practice the DBMS cannot restart the aborted transaction since it is unaware of transaction logic even if it was aware of the transaction history (unless there is no user input in the transaction or the input is not a function of the database state)

Three general techniques for handling a deadlock:

Time-stamping

A different approach to locking, with timestamps

Basic Timestamp Ordering
Consider a transaction T with timestamp ts(T) requesting to read(x)

Consider a transaction T with timestamp ts(T) requesting to write(x)

Strict time-stamping…
When writing or reading, the transaction must also wait for the last transaction to write to the data item to abort/commit to avoid a cascading rollback

Thomas’ Write Rule
A variation to the basic timestamp ordering protocol that increase concurrency by relaxing strict serialisability

Optimistic Techniques

In some environments, conflicts are rare and it is more efficient to proceed without delays

Read Phase

Extends from the start until immediately before the commit

Validation Phase

After the read phase,
For a read-only transaction, it checks that the data it read is still the same values. If no interference, the transaction is committed, otherwise it is aborted and restarted

Each transaction T is assigned a time-stamp at

  1. finish(Tx)<start(Ty), or
  2. start(Ty)<finish(Tx), and
    1. The set of data items written by Tx are not the ones read by Ty; and
    2. start(Ty)<finish(Tx)<validation(Ty)
Write Phase

After a successful validation phase for an update transaction, the updates made to the local copy are applied to the database

Granularity of Data Items

The size of data items chosen as the unit of protection by the concurrency control protocol
Ranging from coarse to fine:

Hierarchy of Granularity

When a node is locked, all its descendants are also locked

Multiple-Granularity Locking

Intention Shared (IS): explicit locking at a lower level of the tree but only with shared locks
Intention Exclusive (IX): explicit locking at a lower level with exclusive or shared locks
Shared and Intention Exclusive (SIX): the sub tree rooted by that node is locked explicitly in shared mode and explicit locking is being done at a lower level with exclusive mode locks
Multiple-Granularity Locking.png
Granular example.png|500
4 Transactions T1, T2, T3, T4
T1 wants an S lock on Record2
T2 wants an X lock on Record2
T3 wants an X lock on Record1
T4 wants an S lock on Page2

Types of Failures

System Crashes: due to hardware/software errors, loss in main memory
Media Failure: such as head crashes or unreadable media, resulting in a loss of parts of secondary storage
Application Software Errors: such as logical errors
Natural Physical Disasters: such as fires, floods, etc…
Carelessness or Unintentional Destruction of Data or Facilities
Sabotage: intentional corruption/destruction

Transactions and Recovery

Transactions represent the basic unit of recovery

Recovery Facilities

The DBMS should provide the following facilities to assist with recovery:

Log File

Contains information about all updates to the database:

Checkpointing
Recovery Techniques

If the database has been damaged: