PostgreSQL and typeorm - Practical transactions

Published on June 13, 2022

Lesson goal

To understand what a transaction does for us and how to choose when one is required in our code.

I will also show how to add transactions using typeorm.

Database course index

This is part of a full course on persistence in postgres with typeorm and sql!

There is a github repo to go with this course. See part 2 for instructions.

The problem with multi-user apps

We need databases to be multi-user. That is, we need to allow more than one customer to read from it and write to it at one time. Otherwise they wouldn’t be very suitable for web applications!

This leads us to having to deal with “concurrent” access.

1. Concurrency control in databases

When multiple users are accessing and modifying the same data in a database at the same time, there is the potential for conflicts to arise.

For example, In a banking application we might have an automatic bill payment being paid AND the account owner withdrawing money from the ATM at the same time. Both processes would read the current balance to check if there are enough funds before proceeding.

Concurrency issue banking

The bill payment would write the new balance before the ATM transaction did because it’s a computerised transaction while the customer is taking time trying to remember the amount of cash that they really need for lunch.

So the ATM transaction would later attempt to write the old balance - the ATM withdrawal amount. Now we have inconsistent data.

Note: This is a simplified example. In reality, the ATM transaction would be rejected because the software would use some sort of atomic update mechanism and it would detect that the balance had changed. But the point is that the data is inconsistent at the point of attempting the persistence of the ATM balance transaction.

This situation is acute in most programs using ORMs because we often select a bunch of data out to our program, make a decision based on the data and then write back modified data using the domain models.

const myStateRecord = repo.find({ id: 1 })
if ((myStateRecord.state = TIME_TO_DO_SOMETHING)) {
  myStateRecord.state = DOING_THE_THING
  repo.save(myStateRecord)
  //fire off an event to do the thing
}

Transactions are what we use to ensure that the steps in our program logic are all executed as one “unit”. If any part of the execution fails we can revert all the changes and do something to fix the issue.

A transaction is a sequence of database operations that are treated as a single unit. This means that either all of the operations in the transaction are completed successfully, or none of them are.

In order to use transactions we “lock” the data while we’re using it. A “lock” means that other users can’t access the data until we’re done with it.

Unfortunately this leads to another issue - deadlocks.

A deadlock is a situation in which two or more transactions are waiting for each other to release locks they have acquired on data in the database.

This can lead to a situation in which none of the transactions can make progress, and the system can become “deadlocked” until the deadlock is resolved.

Most database systems kill a transaction that has been waiting for a lock for too long. This is called a “timeout”.

const transaction = repo.createTransaction()
transaction.start()
const myStateRecord = repo.find({ id: 1 })
if ((myStateRecord.state = TIME_TO_DO_SOMETHING)) {
  myStateRecord.state = DOING_THE_THING
  repo.save(myStateRecord)
  transaction.commitTransaction()
  //fire off an event to do the thing
}

Glossary: Isolation levels

We can control what the database does when there are updates to data that affect other transactions that are in progress.

We can have one transaction pay close attention to how another transaction has modified data it is also using, or we can have a transaction completely ignore updates made by another transaction.

Another option would be to just error if any any data used by a transaction was modified by another transaction.

These are referred to as “isolation levels”.

It’s important to note that different databases on the same server can have different default isolation levels set. You need to confirm the default isolation level before making any assumptions about it.

It’s also important to note that different database clients - e.g. typeorm - can have a different default isolation level to the database! For example .NET data access libraries will often use serializable as the default isolation level even though SQL Server’s default is READ_COMMITTED.

In general it’s better to always specify the isolation level in your code so you’re protected from changes later.

isolation level description
READ UNCOMMITTED In postgres this is the same as read committed
READ COMMITTED A concurrent transaction can modify data you have previously read, your transaction will only see the updated values if the other transaction commits. This is the default isolation level in Postgres, oracle and ms sql server.
REPEATABLE READ A concurrent transaction can modify data you have previously read, the transaction will fail if the transactions could have resulted in different states if committed in a different order
SERIALIZABLE If a concurrent transaction modifies data your transaction is trying to commit, your transaction will fail. Front end devs: This is the default isolation level in SQLite

There’s two important things to note here. The default isolation level in postgres is READ COMMITTED. This means that, your transaction won’t read other transaction’s changes until they are committed. Once they are committed they can be read. This means other transactions can modify data while your transaction is running


-- the select * is just illustrative. don't do this.

select * from my_table where ...
--do a few other commands
update my_table...
-- right here another transaction commits and modifies the data you were working with
select * from my_table -- this will contain the new data unless you changed it!

You can hint to postgres that you will be modifying the rows with FOR UPDATE.

select * from my_table FOR UPDATE where ...
--do a few other commands
-- right here another transaction commits - IT WILL ERROR because you have locked those rows

Note that you could consider using a serializable isolation level which would “lock” all the rows anyway, if using serializable isolation level you don’t need the FOR UPDATE, but you will always lock things. Just be aware of the concurrency technique you need for a given scenario.

Using transactions in typeorm

There are multiple ways to use transactions with typeorm but if you use the queryRunner method you can control the transaction rollback yourself.

const connection = await AppDataSource.connection()
const queryRunner = connection.createQueryRunner()
const petOwner = new PetOwner()
petOwner.name = 'tx_owner'

const pet1 = new Pet()
pet1.name = 'tx pet 1'
pet1.owner = petOwner
pet1.type = 'cat'

const pet2 = new Pet()
pet2.name = 'tx pet 2'
pet2.owner = petOwner
pet2.type = 'dog'

await queryRunner.connect()
await queryRunner.startTransaction()
try {
  ;async (transactionalEntityManager: EntityManager) => {
    await transactionalEntityManager.save(petOwner)
    await transactionalEntityManager.save(pet1)
    await transactionalEntityManager.save(pet2)

    await transactionalEntityManager.commitTransaction()
  }
} catch {
  // since we have errors lets rollback the changes we made
  await queryRunner.rollbackTransaction()
} finally {
  // you need to release a queryRunner which was manually instantiated
  await queryRunner.release()
}

Summary

If you’re selecting some data to make a decision, and then writing back new information based on that decision, you likely need a transaction.

If you’re only selecting data you almost certainly don’t need a transaction.

You should understand what READ_COMMITTED does for you in Postrgres (https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED).

Lesson index

Darragh ORiordan

Hi! I'm Darragh ORiordan.

I live and work in Sydney, Australia building and supporting happy teams that create high quality software for the web.

I also make tools for busy developers! Do you have a new M1 Mac to setup? Have you ever spent a week getting your dev environment just right?

My Universal DevShell tooling will save you 30+ hours of configuring your Windows or Mac dev environment with all the best, modern shell and dev tools.

Get DevShell here: ✨ https://devshell.darraghoriordan.com


Read more articles like this one...

List of article summaries

#engineering

Building an AI generated game with Stable Diffusion and data from Wikipedia

Last week I released a game called Doodle:ai.

In the game you’re shown AI generated images and you have to guess the Wikipedia topic it used to create the game.

#engineering

Easiest way to optimise images for web

Here is how I optimise all pngs and jpgs in a folder for publishing to the web.

#developer-experience

Start tracking DORA metrics for your team in just 15 minutes with Apache Dev Lake

DORA (DevOps Research and Assessment) metrics are an excellent way for engineering organisations to measure and improve their performance.

Up until now, monitoring the DORA metrics across Github, Jira, Azure Devops etc required custom tooling or a tedious manual process.

With Apache Dev Lake you can get beautiful reporting for DORA metrics on your local machine in as little as 15 minutes (honestly!).

From Google Sheets to Grafana
From Google Sheets to Grafana

#engineering

A summary of NDC Sydney 2022 Developer Conference

I attended my first in-person conference for more than 3 years last week! NDC is one of the more well-known developer conferences in Australia and New Zealand. It’s a 5 day conference with 3 days of talks and 2 days of workshops.

There’s so much to learn across all the streams so I try to take notes for each of the talks to quickly reference them later. This post contains all my notes. I’ll add the relevant videos to talks later if they’re released.

A reminder that these notes are just my notes. They’re paraphrased and summarised from what the speaker actually said. Each speakers would have provided must more clarity and went into more detail during their pressos!

Comments