PostgreSQL and typeorm - Practical transactions

Published on June 13, 2022 - Tagged: #engineering

Follow me on twitter for more posts like this

Lesson goal

To understand what a transaction does for us and how to choose when one is required in our code.

I will also show how to add transactions using typeorm.

Database course index

This is part of a full course on persistence in postgres with typeorm and sql!

There is a github repo to go with this course. See part 2 for instructions.

  1. Intro to persistence
  2. Creating a local instance
  3. Storing data in postgres
  4. Getting into relations
  5. Advanced querying
  6. Transactions (you are here)
  7. 9 tips, tricks and common issues
  8. Database administration glossary
  9. Intro to caching

The problem with multi-user apps

We need databases to be multi-user. That is, we need to allow more than one customer to read from it and write to it at one time. Otherwise they wouldn’t be very suitable for web applications!

An issue with this is that when working in a database we often work on data that’s tightly coupled or we try to modify records concurrently.

In a banking application we might want to debit one balance and credit another balance to transfer funds for example. If one of these actions fails, both should fail. Otherwise money would be either missing or created out of thin air!

This is acute in most programs using ORMs because we often select a bunch of data out to our program, make a decision based on the data and then write back modified data using the domain models.

const myStateRecord = repo.find({ id: 1 })
if ((myStateRecord.state = TIME_TO_DO_SOMETHING)) {
  myStateRecord.state = DOING_THE_THING
  repo.save(myStateRecord)
  //fire off an event to do the thing
}

Transactions are what we use to ensure that the steps in our program logic are all executed as one “unit”. If any part of the execution fails we can revert all the changes and do something to fix the issue.

const transaction = repo.createTransaction()
transaction.start()
const myStateRecord = repo.find({ id: 1 })
if ((myStateRecord.state = TIME_TO_DO_SOMETHING)) {
  myStateRecord.state = DOING_THE_THING
  repo.save(myStateRecord)
  transaction.commit()
  //fire off an event to do the thing
}

Glossary: isolation levels

We can control what the database does when there are conflicting updates to our data detected. We can have it essentially ignore conflicts or error if there is a conflict or a specific type of conflict.

It’s important to note that different databases on the same server can have different default isolation levels set. You need to confirm the default isolation level before making any assumptions about it.

It’s also important to note that different database clients - e.g. typeorm - can have a different default isolation level to the database! For example .NET data access libraries will often use serializable as the default isolation level even though SQL Server’s default is READ_COMMITTED.

In general it’s better to always specify the isolation level in your code so you’re protected from changes later.

isolation level description
READ UNCOMMITTED In postgres this is the same as read committed
READ COMMITTED A concurrent transaction can modify data you have previously read, your transaction will only see the updated values if the other transaction commits. This is the default isolation level in Postgres, oracle and ms sql server.
REPEATABLE READ A concurrent transaction can modify data you have previously read, the transaction will fail if the transactions could have resulted in different states if committed in a different order
SERIALIZABLE If a concurrent transaction modifies data your transaction is trying to commit, your transaction will fail. Front end devs: This is the default isolation level in SQLite

There’s two important things to note here. The default isolation level in postgres is READ COMMITTED. This means that, your transaction won’t read other transaction’s changes until they are committed. Once they are committed they can be read. This means other transactions can modify data while your transaction is running


-- the select * is just illustrative. don't do this.

select * from my_table where ...
--do a few other commands
update my_table...
-- right here another transaction commits and modifies the data you were working with
select * from my_table -- this will contain the new data unless you changed it!

You can hint to postgres that you will be modifying the rows with FOR UPDATE.

select * from my_table FOR UPDATE where ...
--do a few other commands
-- right here another transaction commits - IT WILL ERROR because you have locked those rows

Note that you could consider using a serializable isolation level which would “lock” all the rows anyway, if using serializable isolation level you don’t need the FOR UPDATE, but you will always lock things. Just be aware of the concurrency technique you need for a given scenario.

Using transactions in typeorm

There are multiple ways to use transactions with typeorm but if you use the queryRunner method you can control the transaction rollback yourself.

const connection = await AppDataSource.connection()
const queryRunner = connection.createQueryRunner()
const petOwner = new PetOwner()
petOwner.name = 'tx_owner'

const pet1 = new Pet()
pet1.name = 'tx pet 1'
pet1.owner = petOwner
pet1.type = 'cat'

const pet2 = new Pet()
pet2.name = 'tx pet 2'
pet2.owner = petOwner
pet2.type = 'dog'

await queryRunner.connect()
await queryRunner.startTransaction()
try {
  ;async (transactionalEntityManager: EntityManager) => {
    await transactionalEntityManager.save(petOwner)
    await transactionalEntityManager.save(pet1)
    await transactionalEntityManager.save(pet2)
  }
} catch {
  // since we have errors lets rollback the changes we made
  await queryRunner.rollbackTransaction()
} finally {
  // you need to release a queryRunner which was manually instantiated
  await queryRunner.release()
}

Summary

If you’re selecting some data to make a decision, and then writing back new information based on that decision, you likely need a transaction.

If you’re only selecting data you almost certainly don’t need a transaction.

You should understand what READ_COMMITTED does for you in Postrgres (https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED).

Lesson index

  1. Intro to persistence
  2. Creating a local instance
  3. Storing data in postgres
  4. Getting into relations
  5. Advanced querying
  6. Transactions (you are here)
  7. 9 tips, tricks and common issues
  8. Database administration glossary
  9. Intro to caching
Darragh ORiordan

Hi! I'm Darragh ORiordan.

I live and work in Sydney, Australia building supporting happy teams that create high quality software for the web.

I also make tools for busy developers! Do you have a new M1 Mac to setup? Have you ever spent a week getting your dev environment just right?

My DevShell tooling will save you 30+ hours configuring your dev environment with all the best modern tools. Get it here

https://darraghoriordan.gumroad.com/l/devshell


Read more articles like this one...

List of article summaries

#engineering

PostgreSQL and typeorm - Caching

With most web applications you can drastically increase performance by using caching for data that’s frequently read across network boundaries. This lesson will explore some common caching techniques, you’ll learn how some common tools and libraries provide caching for us.

While caching helps with performance it can also cause some surprises and bugs in applications and i’ll discuss some of those too.

#engineering

How to run Monica personal CRM on Dokku

I left my home country right after university and I worked and lived in a few countries since then. I’ve met lots of amazing people but I’ve always struggled to remember contact details and important dates for everyone.

#engineering

Find 20% of missing site traffic with plausible analytics and some proxying

Google Analytics (GA) has been a force in web site metrics since 2005. The metrics have always been incredibly useful but it’s a “free” product so you pay for it by providing all your site data to Google for tracking and advertising.

With Google Analytics your metrics are tightly coupled with tracking and advertising so when ad-blockers kick in to block tracking they also block your metrics!

The good news is that this is all fixable!

#engineering

Open Telemetry in NestJs (and React)

Open Telemetry is good enough to use in production projects now and most cloud providers and telemetry services have integrated open telemetry into their products.