We work with Aurora MySQL and have observed significant replica lag on our read clusters. A Stack Overflow thread ("AWS Aurora cluster: strong or eventual consistency?") confirms the same behaviour for PostgreSQL as well. The AWS documentation claims an average lag of less than 100 ms - so can Aurora still be considered strongly consistent? Thanks
Aurora has many operational modes, and the choice of storage engine complicates the picture further. You are right to point out that, unfortunately, the Aurora documentation does not clearly state which consistency models can be achieved with which configurations, leaving everyone to interpret different statements differently (as in the Stack Overflow thread you mentioned).
In your setup, if a client writes something and the system acknowledges the write as successful, and after that some other client reads the mutated data item and gets a stale value (at least for some time), then the consistency model cannot be linearizable (strong consistency).
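To make that concrete, here is a toy simulation (not Aurora's actual replication protocol) of a primary that acknowledges a write before an asynchronously replicated reader has applied it. The `Primary`/`Replica` names and the lag value are illustrative assumptions only:

```python
import threading
import time

class Replica:
    """A toy read replica that applies writes only after a replication delay."""
    def __init__(self, lag_seconds):
        self.lag_seconds = lag_seconds
        self.value = None

    def apply_async(self, value):
        # Replication is asynchronous: the replica sees the write later.
        def apply():
            time.sleep(self.lag_seconds)
            self.value = value
        threading.Thread(target=apply).start()

class Primary:
    def __init__(self, replica):
        self.replica = replica
        self.value = None

    def write(self, value):
        self.value = value           # durable on the primary
        self.replica.apply_async(value)
        return "ack"                 # acknowledged before the replica catches up

replica = Replica(lag_seconds=0.1)   # ~100 ms lag, as the docs suggest
primary = Primary(replica)

assert primary.write("v1") == "ack"
stale = replica.value                # read immediately after the ack
time.sleep(0.3)
fresh = replica.value                # read after the lag window has passed
print(stale, fresh)                  # None v1 -> the first read was stale
```

The first read returns a stale value even though the write was already acknowledged, which is exactly the behaviour that rules out linearizability.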
With replicas in the Availability Zones of a single region, the storage layer of Aurora is shared, and in managed service mode all reads funnel through a front-end service that routes each request to an up-to-date replica. In such a scenario:
(1) No client should see stale reads once a write has been acknowledged as a success by the writer client.
(2) The total order of mutations should be the same on all replicas.
It is not clear whether the above two properties hold when replicas are in different AWS regions, but it appears they do not always hold, as you mentioned.
Thank you for sharing your operational experience; in light of this discussion, it is best to replace this example with Google's Spanner (which at least claims to provide linearizability for some of its operations).
Thank you @Asmat_Batool for the reply.
No client should see stale reads once a write has been acknowledged as a success by the writer client
I will explore your above point to see whether we can check that status in any way. Our setup is in a single AWS region, so it would be great if we could get this working (our current workaround is to force the read to the writer cluster whenever the absolute latest value is required). Thanks again.
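For what it's worth, that workaround can be kept in one small routing helper so callers only state whether they need the latest value. The endpoint names below are placeholders for your cluster's actual DNS names, and the lag query is the `information_schema.replica_host_status` table that Aurora MySQL exposes (worth verifying against your engine version before relying on it):

```python
# Hypothetical endpoints -- substitute your cluster's actual DNS names.
WRITER_ENDPOINT = "mycluster.cluster-xxxx.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-xxxx.us-east-1.rds.amazonaws.com"

# Aurora MySQL reports per-replica lag here (assumption: check your engine version).
LAG_QUERY = (
    "SELECT server_id, replica_lag_in_milliseconds "
    "FROM information_schema.replica_host_status"
)

def choose_endpoint(require_latest: bool) -> str:
    """Route reads that must see the latest acknowledged write to the writer;
    everything else can go to the reader endpoint and tolerate some lag."""
    return WRITER_ENDPOINT if require_latest else READER_ENDPOINT

print(choose_endpoint(require_latest=True))   # writer endpoint
print(choose_endpoint(require_latest=False))  # reader endpoint
```

Running `LAG_QUERY` against the cluster periodically would also let you monitor how far the replicas actually fall behind in practice.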