
We’re experiencing frequent long-running queries (>43 secs) in our PostgreSQL production DB, and they often get stuck on:

wait_event_type = LWLock
wait_event = BufferMapping

This seems to indicate contention on shared buffers. The queries are usually simple SELECTs (e.g., on the level_asr_asrlog table), but during peak usage they slow down drastically and sometimes get auto-killed after 60 seconds (our statement_timeout).
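
A snapshot of pg_stat_activity along these lines shows which backends are stuck on that wait and for how long (a minimal sketch; these are standard PostgreSQL 14 columns):

    -- backends currently waiting on LWLock:BufferMapping, longest-running first
    SELECT pid,
           now() - query_start AS runtime,
           state,
           wait_event_type,
           wait_event,
           left(query, 80) AS query
    FROM pg_stat_activity
    WHERE wait_event_type = 'LWLock'
      AND wait_event = 'BufferMapping'
    ORDER BY runtime DESC;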

Instance Configuration:
PostgreSQL version: 14
RAM: 104 GB (≈95 GB usable for Postgres)
vCPUs: 16
SSD Storage: GCP auto-scaled from 10TB → 15TB over the last year
Shared Buffers: 34.8 GB
work_mem: 4 MB
maintenance_work_mem: 64 MB
autovacuum_work_mem: -1 [I think this means it falls back to maintenance_work_mem]
temp_buffers: 8 MB
effective_cache_size: ~40 GB
max_connections: 800

Observations

  • VACUUM processes often take >10 minutes.
  • Memory is almost fully utilized (free memory <5%).
  • CPU spikes >95% and correlates with memory pressure.
  • The system appears to be thrashing, swapping data instead of doing useful work.
  • The wait event BufferMapping implies the backend is stuck trying to associate a block with a buffer, likely due to memory contention (a quick buffer-hit check follows below).
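
To put a number behind the memory observations, a query along these lines (a minimal sketch; level_asr_asrlog is the table mentioned above) shows how much of that table's traffic is served from shared buffers versus read in from outside:

    -- shared-buffer hit ratio for the table in question; note that "read"
    -- blocks may still come from the OS page cache rather than disk
    SELECT relname,
           heap_blks_read,
           heap_blks_hit,
           round(100.0 * heap_blks_hit
                 / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_pct
    FROM pg_statio_user_tables
    WHERE relname = 'level_asr_asrlog';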

I need help with the following:

  • How to further diagnose LWLock:BufferMapping contention? (A sketch of what I sample today follows this list.)
  • Is increasing work_mem or shared_buffers a safe direction?
  • Should I implement PgBouncer to reduce the impact of max_connections on memory?
  • How to confirm whether the OS is thrashing, and if so, how to resolve it?
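
The closest I get to a wait-event profile today is repeated sampling of pg_stat_activity during a slowdown (a minimal sketch; I don't know how to go deeper than this):

    -- run every few seconds during a spike and compare the counts
    SELECT wait_event_type, wait_event, count(*) AS backends
    FROM pg_stat_activity
    WHERE state = 'active'
    GROUP BY wait_event_type, wait_event
    ORDER BY backends DESC;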

3 Answers


First measure: see if the disks are overloaded. If yes, get more I/O power and tune your statements.
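
If pg_stat_statements is installed (an assumption; it is an extension, not on by default), the statements that drive most of the buffer reads can be listed like this (a sketch; column names as of PostgreSQL 13+):

    -- top statements by blocks read into shared buffers
    SELECT queryid,
           calls,
           shared_blks_hit,
           shared_blks_read,
           round(total_exec_time::numeric, 1) AS total_ms,
           left(query, 60) AS query
    FROM pg_stat_statements
    ORDER BY shared_blks_read DESC
    LIMIT 10;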

Second measure: reduce shared_buffers to 16GB or 8GB.
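
If you try that, the change itself is a one-liner but takes effect only after a restart (16GB here is simply the value suggested above, not a measured optimum):

    ALTER SYSTEM SET shared_buffers = '16GB';
    -- shared_buffers cannot be changed with a reload; restart the instance afterwards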

Third measure: reduce the number of connections with a reasonably sized connection pool (you don't have to reduce max_connections).
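
Before sizing the pool, it is worth checking how many of the 800 allowed connections are actually doing work at any moment (a minimal sketch):

    -- client connections by state; lots of 'idle' rows argue for a small pool
    SELECT state, count(*) AS connections
    FROM pg_stat_activity
    WHERE backend_type = 'client backend'
    GROUP BY state
    ORDER BY connections DESC;

If most of them are idle, a pool sized at a small multiple of your 16 vCPUs is usually plenty.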


maintenance_work_mem is very low for a database server with 104 GB of RAM and 15 TB of storage. I would change this one and give it at least a few GB of RAM.
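
A sketch of the change, assuming you settle on 2GB (a reload is enough; new sessions and newly started autovacuum workers pick it up):

    ALTER SYSTEM SET maintenance_work_mem = '2GB';
    SELECT pg_reload_conf();
    -- note: in PostgreSQL 14, VACUUM's dead-tuple array is still capped at 1GB,
    -- but index builds and other maintenance operations can use the full amount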

But also check the rest of your configuration and statistics. I expect your current observations are just the tip of the iceberg regarding performance issues.


This issue might arise from I/O-heavy operations such as sequential scans or VACUUM, which pull new pages into shared buffers and can make queries that would otherwise run smoothly appear slow. Use a monitoring tool or the GCP console to check buffer usage and I/O during the spikes. Also, review the slow-query log to identify and tune the problematic queries.
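
One way to check buffer usage from inside the database is the pg_buffercache extension (a minimal sketch, adapted from the example in the PostgreSQL documentation):

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    -- which relations currently occupy the most shared buffers
    SELECT c.relname,
           count(*) AS buffers,
           pg_size_pretty(count(*) * 8192) AS size
    FROM pg_buffercache b
    JOIN pg_class c
      ON b.relfilenode = pg_relation_filenode(c.oid)
     AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;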

Since you have about 95 GB for Postgres, set effective_cache_size accordingly to improve query planning. Increasing maintenance_work_mem or autovacuum_work_mem to 1–2 GB can boost vacuum performance.
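
The corresponding changes could look like this (70GB and 1GB are illustrative values, not tuned for your workload; effective_cache_size should roughly reflect shared_buffers plus whatever the OS can cache, and a reload is sufficient for both settings):

    ALTER SYSTEM SET effective_cache_size = '70GB';
    ALTER SYSTEM SET autovacuum_work_mem = '1GB';
    SELECT pg_reload_conf();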

Implementing PgBouncer with transaction pooling is a solid move; just ensure your app doesn't rely on session-level features such as prepared statements, session-level advisory locks, or temporary tables.
