PostgreSQL and typeorm - Practical transactions

Lesson goal
To understand what a transaction does for us and how to choose when one is required in our code.
I will also show how to add transactions using typeorm.
Database course index
This is part of a full course on persistence in postgres with typeorm and sql!
There is a github repo to go with this course. See part 2 for instructions.
You can see examples of typeorm in a real app on Use Miller on GitHub
The problem with multi-user apps
We need databases to be multi-user. That is, we need to allow more than one customer to read from it and write to it at one time. Otherwise they wouldn't be very suitable for web applications!
This leads us to having to deal with "concurrent" access.
1. Concurrency control in databases
When multiple users are accessing and modifying the same data in a database at the same time, there is the potential for conflicts to arise.
For example, In a banking application we might have an automatic bill payment being paid AND the account owner withdrawing money from the ATM at the same time. Both processes would read the current balance to check if there are enough funds before proceeding.
The bill payment would write the new balance before the ATM transaction did because it's a computerised transaction while the customer is taking time trying to remember the amount of cash that they really need for lunch.
So the ATM transaction would later attempt to write the old balance - the ATM withdrawal amount. Now we have inconsistent data.
Note: This is a simplified example. In reality, the ATM transaction would be rejected because the software would use some sort of atomic update mechanism and it would detect that the balance had changed. But the point is that the data is inconsistent at the point of attempting the persistence of the ATM balance transaction.
This situation is acute in most programs using ORMs because we often select a bunch of data out to our program, make a decision based on the data and then write back modified data using the domain models.
const myStateRecord = repo.find({ id: 1 })
if (myStateRecord.state === TIME_TO_DO_SOMETHING) {
myStateRecord.state = DOING_THE_THING
repo.save(myStateRecord)
//fire off an event to do the thing
}
Transactions are what we use to ensure that the steps in our program logic are all executed as one "unit". If any part of the execution fails we can revert all the changes and do something to fix the issue.
A transaction is a sequence of database operations that are treated as a single unit. This means that either all of the operations in the transaction are completed successfully, or none of them are.
In order to use transactions we often need to "lock" the data while we're using it. A "lock" means that other users can't access the data until we're done with it. Note that transactions don't automatically lock data - that depends on the isolation level and specific operations you perform.
Unfortunately this leads to another issue - deadlocks.
A deadlock is a situation in which two or more transactions are waiting for each other to release locks they have acquired on data in the database.
This can lead to a situation in which none of the transactions can make progress, and the system can become "deadlocked" until the deadlock is resolved.
Most database systems kill a transaction that has been waiting for a lock for too long. This is called a "timeout".
// Pseudo-code to illustrate the concept
const myStateRecord = repo.find({ id: 1 })
if (myStateRecord.state === TIME_TO_DO_SOMETHING) {
myStateRecord.state = DOING_THE_THING
repo.save(myStateRecord)
transaction.commitTransaction()
//fire off an event to do the thing
}
Glossary: Isolation levels
We can control what the database does when there are updates to data that affect other transactions that are in progress.
We can have one transaction pay close attention to how another transaction has modified data it is also using, or we can have a transaction completely ignore updates made by another transaction.
Another option would be to just error if any any data used by a transaction was modified by another transaction.
These are referred to as "isolation levels".
It's important to note that different databases on the same server can have different default isolation levels set. You need to confirm the default isolation level before making any assumptions about it.
It's also important to note that different database clients - e.g. typeorm - can have a different default isolation level to the database! For example .NET data access libraries will often use serializable as the default isolation level even though SQL Server's default is READ_COMMITTED.
In general it's better to always specify the isolation level in your code so you're protected from changes later.
| isolation level | description |
|---|---|
| READ UNCOMMITTED | In postgres this is the same as read committed |
| READ COMMITTED | A concurrent transaction can modify data you have previously read, your transaction will only see the updated values if the other transaction commits. This is the default isolation level in Postgres, oracle and ms sql server. |
| REPEATABLE READ | Your transaction sees a snapshot of the data as it was at the start of the transaction. Other transactions can modify data, but you won't see their changes. |
| SERIALIZABLE | Uses Serializable Snapshot Isolation (SSI) in Postgres. Transactions are checked for conflicts and will fail if they would violate serializability. Front end devs: This is the default isolation level in SQLite |
There's two important things to note here. The default isolation level in postgres is READ COMMITTED. This means that, your transaction won't read other transaction's changes until they are committed. Once they are committed they can be read. This means other transactions can modify data while your transaction is running
-- the select * is just illustrative. don't do this.
select * from my_table where ...
--do a few other commands
update my_table...
-- right here another transaction commits and modifies the data you were working with
select * from my_table -- this will contain the new data unless you changed it!
You can hint to postgres that you will be modifying the rows with FOR UPDATE.
select * from my_table FOR UPDATE where ...
--do a few other commands
-- right here another transaction commits - IT WILL ERROR because you have locked those rows
Note that you could consider using a serializable isolation level. Postgres uses Serializable Snapshot Isolation (SSI) which doesn't actually lock rows - instead it tracks dependencies and fails transactions that would violate serializability. Just be aware of the concurrency technique you need for a given scenario.
Using transactions in typeorm
The simplest way to use transactions in TypeORM is with the callback-based API:
await dataSource.transaction(async (manager) => {
await manager.save(petOwner)
await manager.save(pet1)
await manager.save(pet2)
})
// Transaction is automatically committed on success, rolled back on error
If you need more control, you can use the queryRunner method to manage the transaction yourself:
const queryRunner = dataSource.createQueryRunner()
const petOwner = new PetOwner()
petOwner.name = 'tx_owner'
const pet1 = new Pet()
pet1.name = 'tx pet 1'
pet1.owner = petOwner
pet1.type = 'cat'
const pet2 = new Pet()
pet2.name = 'tx pet 2'
pet2.owner = petOwner
pet2.type = 'dog'
await queryRunner.connect()
await queryRunner.startTransaction()
try {
await queryRunner.manager.save(petOwner)
await queryRunner.manager.save(pet1)
await queryRunner.manager.save(pet2)
await queryRunner.commitTransaction()
} catch {
// since we have errors lets rollback the changes we made
await queryRunner.rollbackTransaction()
} finally {
// you need to release a queryRunner which was manually instantiated
await queryRunner.release()
}
Summary
If you're selecting some data to make a decision, and then writing back new information based on that decision, you likely need a transaction.
If you're only selecting data you almost certainly don't need a transaction.
You should understand what READ_COMMITTED does for you in Postgres (https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED).