Connection pooling on the DE interview

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

Why interviewers care

Connection management looks boring on paper and quietly kills production at 2am. The first incident a data engineer owns at a Postgres-backed shop is rarely a missing index — it is a wave of FATAL: sorry, too many clients already errors after a deploy fanned out 20 microservices each holding 30 idle connections. That is why the question keeps showing up in DE loops, usually phrased as "walk me through what pgbouncer does and when transaction pooling will break your app".

The interviewer is checking three things at once: do you know why a Postgres connection is expensive, can you reason about the trade-offs between session, transaction, and statement pooling, and do you understand that an application-level pool plus an external pooler are not redundant — they solve different problems.

Load-bearing trick: the right answer is almost never "use a bigger machine." It is "cap incoming connections at the pooler, keep the database serving a small fixed backend, and let the app queue."

The fastest way to fail this question is to describe pooling as "reusing connections" and stop there.

Why pooling exists

A Postgres connection is not a lightweight TCP handle. Through PG 16, each connection is backed by a dedicated OS process. Even with the threading work landing in 17, the production reality most teams operate is process-per-connection. Each backend allocates 5–10 MB of resident memory for catalogs, query state, and per-backend buffers. Startup is not free either — backend creation, authentication, role resolution, and search-path setup take 50–100 ms.

Multiply that out. Imagine 40 microservices, 6 pods each, 20 connections per pod to the primary. That is 4,800 backends. At 8 MB apiece that is roughly 38 GB of memory the database holds open before a single query runs.

App ─→ Pool (20 connections) ─→ DB

The pool sits in the middle and presents a cheap, in-process handle to the application. The application "opens" and "closes" pool handles on every request — that is a hashmap lookup, microseconds. The physical TCP connection and Postgres backend underneath get reused across thousands of requests.

Cost component Without pooling With pooling
Backend memory 8 MB × every client 8 MB × pool size only
Connection setup latency 50–100 ms per request One-time, then reused
Max throughput ceiling Bound by max_connections Bound by query time
Failure mode under spike FATAL: too many clients Queue at the pool

That last row is the one to memorize. Without a pool, traffic spikes turn into hard errors. With a pool, they turn into latency you can observe and shed.

pgbouncer in three modes

pgbouncer is the canonical lightweight pooler in the Postgres world — a single-threaded, event-driven C program that terminates client connections, holds a small fixed pool of upstream backends, and multiplexes one onto the other. A single instance handles 10,000 incoming connections on modest hardware.

[App pod 1] ─┐
[App pod 2] ─┼─→ pgbouncer (10,000 incoming) ─→ Postgres (50 backend)
[App pod 3] ─┘

The trick is the asymmetry. Postgres caps out at 50 actual backends, but every application in the cluster believes it has its own 10,000-connection pipe. The pooler holds the queue.

pgbouncer offers three pooling modes, and which one you pick has serious application-level consequences:

Mode Connection assigned for... Throughput What breaks
Session Entire client session Lowest Nothing (acts like a direct connection)
Transaction One transaction Highest, typical default Prepared statements, SET, advisory locks
Statement One statement Highest Multi-statement transactions, almost everything

Statement pooling is rarely used in production — it forbids explicit transactions, which means no BEGIN; ... COMMIT;. It exists for very specific analytical workloads where every query is a single autocommitted statement.

Transaction vs session pooling

This is the comparison the interviewer is waiting for.

Session pooling is conservative. Once a client connects, it owns a backend until it disconnects. Anything set on that backend — temp tables, prepared statements, advisory locks, session GUCs — sticks around for the life of the session. The cost is throughput: a sleepy client holding a backend means one fewer backend available for everyone else.

Transaction pooling is aggressive. The pooler hands the backend back after every COMMIT or ROLLBACK. The next transaction from the same client may land on a different backend. Throughput goes way up — a busy backend serves dozens of clients per second — but the application has to behave as if session state does not exist.

Gotcha: every application bug caused by transaction pooling is a hidden assumption that something survives a COMMIT. Find those assumptions before the pooler does.

The concrete caveats interviewers like to hear listed:

  • Prepared statements are not persistent. If your driver issues PREPARE foo and then EXECUTE foo later, the second call may land on a backend that has never heard of foo. Either disable server-side prepared statements, or use pgbouncer 1.21+ with prepared_statements = on.
  • Session-level SET is lost. SET search_path, SET timezone, SET statement_timeout — none survive a commit. Use SET LOCAL or set defaults on the role.
  • Advisory locks become unreliable. Use pg_advisory_xact_lock (transaction-scoped) instead of pg_advisory_lock.
  • LISTEN / NOTIFY is effectively disabled. Run notification consumers through session pooling on a separate port.
  • Temporary tables disappear between transactions.

The default for most OLTP workloads is transaction pooling. The throughput gain — often 5–10× more clients per backend — is large enough that it is worth auditing the codebase for the bullets above.

The real-world pattern at scale is two pgbouncer ports: a transaction-mode port for the hot path, and a session-mode port for the few services that need stateful sessions.

Train for your next tech interview
1,500+ real interview questions across engineering, product, design, and data — with worked solutions.
Join the waitlist

Application-level pools and HikariCP

A common interview trap: "if pgbouncer is in front, do I still need a pool inside my application?" The answer is yes.

App pool (10 conn) ─→ pgbouncer (1,000) ─→ Postgres (50 backend)

The application-level pool exists for latency, not resource control. Every new connection costs a TCP handshake, auth round-trip, and driver setup — 0.5–2 ms on the same host, 3–10 ms across a VPC. For a 50 ms SLO you do not want to spend 10% of it on setup per request.

The pooler downstream exists for resource control — it protects the database from the application. Without it, every Kubernetes restart slams the database with new backends.

HikariCP is the consensus best-in-class JVM pool: microsecond acquisition, small, opinionated. Spring Boot default since 2.0. Typical config: maximumPoolSize = 10, connectionTimeout = 30s, idleTimeout = 10 minutes, maxLifetime = 30 minutes.

Layer Tool Pool size per instance Purpose
App (Java/Kotlin) HikariCP 10 Avoid per-request handshake
App (Python) psycopg_pool, asyncpg pool, sqlalchemy.pool 5–10 Same
App (Node) pg-pool 10 Same
Cluster pgbouncer 50–200 backends Protect the database
Database Postgres max_connections = 200 Hard ceiling

The math has to add up. 30 app pods × Hikari pool of 10 = 300 incoming to pgbouncer. That is fine for pgbouncer, but it must not forward 300 backends to a database sized for 100.

Sizing the pool

Two formulas every DE should be able to quote.

The Little's Law version: pool size ≈ throughput × average latency. At 500 qps and 20 ms average latency, you need 500 × 0.02 = 10 connections. Add 30–50% headroom and you land on 15.

The HikariCP rule of thumb: connections = ((core_count × 2) + effective_spindle_count). For an 8-core DB server on SSD, that is 17 backends. Past that, queries spend more time fighting each other than getting served.

Sanity check: if active_connections climbs while tps stays flat, the pool is too big.

A database with 500 backends under load is almost always slower than the same database with 50 backends and a queue.

Common pitfalls

The first and most expensive pitfall is mismatching pgbouncer mode with application behavior. A team turns on transaction pooling, deploys, and three days later a billing job silently double-charges customers because an advisory lock that "obviously" worked locally no longer pins to the same backend. The fix is to audit every SET, PREPARE, LISTEN, and pg_advisory_lock call before flipping to transaction mode, and add a CI check that flags new occurrences.

A second classic is stacking pools without doing the multiplication. Engineers set HikariCP to maximumPoolSize = 10, scale to 50 pods, and are shocked when pgbouncer reports 500 client connections and starts queuing. Always work the worst case: pods × app_pool_size = max_clients_to_pgbouncer, and pgbouncer.default_pool_size × pools < postgres.max_connections. Write the numbers down before the deploy, not during the incident.

The third pitfall is idle-in-transaction holding the world hostage. A long-running query inside an open transaction holds its pooled backend even under transaction mode, because the transaction has not committed. One slow ORM call holding a transaction for 90 seconds can starve the entire pool. Set idle_in_transaction_session_timeout = 30s on the role, and alert when transactions exceed a few seconds.

A fourth, often missed: prepared statements quietly disabled because of pgbouncer. Many teams set prepareThreshold = 0 to make transaction pooling work, then forget they have given up 15–30% of query performance. Upgrade to pgbouncer 1.21+ and re-enable prepared_statements = on — it tracks names per client.

The fifth, and the one that costs jobs: assuming pgbouncer is highly available because it is "just a proxy." It is a single-threaded process. If it dies, every client connecting through it dies with it. Run two or three instances behind a TCP load balancer (or use multithreaded pgcat), monitor pooler latency separately from database latency, and have a runbook for the pooler being the bottleneck.

If you want to drill questions exactly like this — pooling, replication lag, query plans, Kafka semantics — NAILDD is launching with 500+ data engineering interview problems mapped to real loops.

FAQ

Do I still need pgbouncer on Postgres 17 with the threaded backend?

Almost certainly yes. The PG 17 work reduces per-connection memory but does not eliminate setup cost, role resolution, or the max_connections ceiling. pgbouncer also gives you a queue, separate scaling, and a clean place to enforce per-tenant limits. Treat threading as a nice-to-have, not a replacement.

When would I prefer session pooling over transaction pooling?

Pick session pooling when the app genuinely depends on session state — LISTEN / NOTIFY consumers, long-lived advisory locks, services that rely on temp tables across transactions, or legacy code where auditing every SET is more expensive than the throughput you give up. In practice, run a session-mode port just for those services and keep transaction mode as the default for everything else.

How is pgcat different and is it production ready?

pgcat is a Rust rewrite with multi-threading, sharding, query mirroring, and built-in failover. The single-threaded ceiling of pgbouncer is the main reason teams evaluate it. As of 2026 it runs in production at several large shops but has a smaller ecosystem; for most teams pgbouncer 1.21+ behind a load balancer is still the lower-risk choice.

Should every microservice share one pgbouncer?

For most setups, one shared pgbouncer (or a small HA cluster) per database is simpler to operate. Per-service poolers give isolation but multiply operational overhead. The common compromise is one shared pgbouncer with per-database or per-user pool limits in pgbouncer.ini.

What pool size should I configure in HikariCP?

Start with 10 and tune from observability. More than (cores × 2) + spindles rarely helps and often hurts. If your service is mostly waiting on the database, 5–10 is usually correct. Reaching for 50+ is almost always papering over a slow query or missing index.