API Gateway vs BFF for systems analysts
Why interviewers ask this
If you are interviewing for a systems analyst role at any company past the seed stage — Stripe, Uber, DoorDash, Airbnb, Notion, Vercel — you will get a microservices architecture question. The two patterns that come up in almost every loop are the API Gateway and the Backend for Frontend (BFF). Interviewers ask because the answer separates people who memorized a Sam Newman chapter from people who have actually argued about where to put authentication logic at 11pm during an outage.
The classic prompts sound like this: "Why do you need a gateway at all?", "How is a BFF different from an API Gateway?", "Should aggregation happen on the gateway, the BFF, or the client?" The wrong answer is to recite definitions. The right answer connects each pattern to a concrete responsibility — auth, fan-out, client shape — and explains the tradeoff you accept when you reach for it.
Load-bearing trick: Treat API Gateway as platform infrastructure owned by a central team, and BFF as a product surface owned by the client team. Most candidates blur this line and lose the question.
API Gateway in one diagram
An API Gateway is a single entry point for clients into a microservice system. Every external request hits it first, and it decides where to route, who is allowed in, and how often they can knock.

    [Mobile]
    [Web]     --> [API Gateway] --> [User Service]
    [Partner]                   --> [Order Service]
                                --> [Payment Service]

The responsibilities you should be able to list in 30 seconds:
- Routing: map `/users/*` to the User service, `/orders/*` to the Order service.
- Authentication: validate JWT or OAuth tokens before anything reaches a backend.
- Authorization: enforce coarse application-level permissions via RBAC claims.
- Rate limiting: throttle abusive clients per IP, per token, per route.
- Caching: store hot GET responses for seconds to minutes.
- Observability: centralized request logs, latency metrics, error rates.
- Transformation: translate between protocols or response shapes (REST to gRPC, XML to JSON).
- Lightweight aggregation: occasionally fan out to a few services and merge.
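
If the interviewer pushes on routing, a tiny sketch helps. The one below is a minimal illustration of prefix-based routing, not how any particular gateway product implements it. The route table and the service names are made up for the example.

```python
from typing import Optional

# Hypothetical route table: path prefix -> internal service name.
ROUTES = {
    "/users/": "user-service",
    "/orders/": "order-service",
    "/payments/": "payment-service",
}


def resolve_route(path: str) -> Optional[str]:
    """Return the service for the longest matching prefix, or None if nothing matches."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        return None
    # Longest prefix wins, so a more specific route can shadow a general one.
    return ROUTES[max(matches, key=len)]
```

Calling `resolve_route("/users/42")` picks the User service, and an unmapped path returns `None`, which a gateway would typically turn into a 404.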
The upside is consolidation. Backends do not each implement their own auth; clients do not need to know your internal topology. The downside is that an unhealthy gateway is a single point of failure for everything, and any logic that creeps into it becomes a bottleneck shared by every team in the company.
This is also why a gateway should be boring infrastructure with a strict charter — every "small exception" added by a team you cannot say no to becomes someone's incident a year later.
BFF in one diagram
A Backend for Frontend is the opposite philosophy: give each client type its own backend so the API exactly fits what that client renders.

    [Mobile]  --> [Mobile BFF]  --> [User Svc, Order Svc, ...]
    [Web]     --> [Web BFF]     --> [User Svc, Order Svc, ...]
    [Partner] --> [Partner API] --> [User Svc, Order Svc, ...]

The job of a BFF is to do the fan-out and shaping that a generic API cannot. A mobile screen on a flaky 4G network wants one round trip with 12 fields. A web dashboard wants 80 fields and is happy with three parallel calls. A partner integration wants stable, versioned, contract-tested endpoints. Trying to satisfy all three with one shared API ends in a Frankenstein contract that no one is happy with.
Real responsibilities of a BFF:
- Aggregation: one client request becomes five backend calls; the BFF merges the responses.
- Client-shaped payloads: strip fields mobile does not need, denormalize for the web.
- UI-driven endpoints: `GET /home-screen` returns exactly what the home screen renders.
- Client-specific business logic: `if mobile and on-cellular, prefer cached image variant`.
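
Aggregation and client-shaped payloads are easier to explain with a sketch. The example below assumes two stand-in backend calls (`fetch_user`, `fetch_orders`) with invented fields. A real BFF would make HTTP or gRPC requests there, and the `home_screen` response shape is illustrative only.

```python
import asyncio


# Stand-ins for backend calls; the field names are hypothetical.
async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0)  # placeholder for network latency
    return {"id": user_id, "name": "Ada", "email": "ada@example.com", "locale": "en"}


async def fetch_orders(user_id: int) -> list:
    await asyncio.sleep(0)
    return [{"id": 1, "total_cents": 1999, "status": "delivered", "warehouse": "W3"}]


async def home_screen(user_id: int) -> dict:
    """Sketch of GET /home-screen: fan out in parallel, return only what mobile renders."""
    user, orders = await asyncio.gather(fetch_user(user_id), fetch_orders(user_id))
    return {
        "greeting": f"Hi, {user['name']}",
        # Drop fields the mobile screen never shows (totals, warehouse, email).
        "recent_orders": [{"id": o["id"], "status": o["status"]} for o in orders],
    }
```

The point to make out loud: the client pays for one round trip, the BFF pays for the fan-out, and the payload contains nothing the screen does not render.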
The cost is duplication. You now have three teams writing similar code in three repos, and you must ensure they do not each reinvent the wheel for things like pagination or error envelopes.
Side-by-side comparison
Here is the comparison most candidates want but never get on a whiteboard. Memorize this table before your interview.
| Dimension | API Gateway | BFF |
|---|---|---|
| Purpose | Single external entry point | One backend per client surface |
| Owner | Platform / infra team | Frontend / client team |
| Cross-cutting (auth, rate limit, logging) | Yes, centralized | No, delegates to gateway |
| Heavy aggregation | Avoid — keep it thin | Yes, this is the job |
| Client-specific logic | Minimal | Encouraged |
| Typical latency budget | < 5 ms overhead | 20-80 ms per request (fan-out heavy) |
| Failure blast radius | Whole company | One client surface |
| Common tooling | Kong, Traefik, AWS API Gateway, Apigee | Custom Node.js, Go, GraphQL Federation |
The two patterns are complementary, not competing. A mature stack at a company like DoorDash or Airbnb usually looks like this:

    Client -> API Gateway (auth, rate limit) -> BFF (aggregation, shaping) -> Backend services

Sanity check: If your gateway is doing aggregation and your BFF is doing rate limiting, someone has confused the two and you have a refactor coming.
When to pick which
A clean way to think about it: you almost always want a gateway first, then you add BFFs only when client diversity creates pressure on a single shared API.
Reach for an API Gateway when you have any non-trivial number of microservices behind any kind of external API. Even at five services, the cost of putting auth, rate limiting, and routing into each service individually exceeds the cost of a managed gateway. Public APIs, partner APIs, B2B integrations — gateway is mandatory.
Reach for a BFF when you have two or more client surfaces with materially different needs — say, a mobile app and a web app and a smart TV — and you find your frontend teams writing painful adapter code or doing three round trips to render one screen. The trigger is real client diversity, not architectural taste. A single React web app does not need a BFF; calling shared services directly through the gateway is fine.
You skip both when you have a monolith or a single backend and a single frontend. Adding a gateway to a two-service system is over-engineering theater, and adding a BFF when you have one client is a guaranteed maintenance burden with no upside.
Tooling worth naming
When the interviewer asks "what tools have you seen", having two or three concrete names per category beats a hand-wave. Group them so you sound like you have actually picked one before.
Gateway tools (managed and self-hosted):
- Kong — open-source, large plugin ecosystem, common in self-hosted Kubernetes shops.
- Traefik — Kubernetes-native, good ingress story, lightweight.
- AWS API Gateway — managed, deep integration with Lambda and IAM, pay-per-request.
- Apigee — Google's enterprise option, heavy on policy management.
- Envoy — used as a building block by service meshes and modern gateways.
BFF tooling patterns:
- A custom Node.js or Go service behind the gateway: most common, full control.
- GraphQL (Apollo Federation with Apollo Router, or Hasura): single endpoint, type-safe, the schema becomes the contract.
- tRPC for full-stack TypeScript shops where the BFF and client share types.
If you say "I would default to Kong for the gateway and a Node BFF with a typed contract for our web client" you sound like someone who has shipped this. If you say "we would use Apigee, GraphQL, Istio and Kafka" you sound like someone who read a Medium post.
Common pitfalls
The pitfalls below are the ones that come up in real interview debriefs. They are paragraphs, not bullets, because each trap deserves a real explanation.
Putting business logic on the API Gateway. The fastest way to corrupt a clean architecture is to add a "small" pricing rule onto the gateway because it sits in front of three services that need it. Six months later, the gateway has a domain model and the platform team is on-call for finance bugs. The fix is ruthless scope: the gateway does routing, auth, rate limiting, observability, and transformation. Business rules belong in services.
One shared BFF for every client. The whole reason a BFF exists is that mobile and web have different needs. If you build a single shared BFF and put both client teams on it, you have reinvented a worse version of the gateway. Each client surface should own its BFF; if two clients genuinely have identical needs, they probably do not need a BFF at all and can call the gateway directly.
Gateway without high availability. Running a single gateway instance is a textbook single point of failure. The fix is at least two or three replicas behind a load balancer, with health checks and graceful drain. Managed gateways like AWS API Gateway handle this for you; self-hosted Kong or Traefik clusters need explicit work. If you cannot survive losing one node, you do not have a gateway — you have a tripwire.
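
To make "survive losing one node" concrete, here is a minimal sketch of what the load balancer in front of the gateway does: rotate across replicas and skip any that failed a health check. The `ReplicaPool` class and the replica names are hypothetical. In practice a cloud load balancer or Kubernetes Service handles this for you.

```python
from itertools import cycle


class ReplicaPool:
    """Round-robin over gateway replicas, skipping ones marked unhealthy."""

    def __init__(self, replicas):
        self.health = {replica: True for replica in replicas}
        self._ring = cycle(replicas)
        self._size = len(replicas)

    def mark(self, replica, healthy: bool) -> None:
        # A health checker would call this after probing each replica.
        self.health[replica] = healthy

    def pick(self):
        # Try each replica at most once per call before giving up.
        for _ in range(self._size):
            replica = next(self._ring)
            if self.health[replica]:
                return replica
        raise RuntimeError("no healthy gateway replicas")
```

With three replicas, marking one unhealthy just removes it from rotation. With one replica, the same event is an outage.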
Duplicating auth logic in every service. A common anti-pattern is to keep token validation inside each microservice "just in case the gateway is misconfigured". This doubles the bug surface and makes rotating signing keys a coordination nightmare. The contract should be: the gateway validates tokens and forwards verified claims as headers; services trust those headers because network policy guarantees that nothing reaches them except through the gateway.
Skipping rate limiting. No rate limit means any client mistake — an infinite-loop bug in a partner integration, a misconfigured cron job, a scraper — can take down the system. Rate limiting per token, per IP, per route is cheap to add at the gateway and expensive to retrofit during an incident. Set sensible defaults on day one even when traffic is tiny.
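
A token bucket is one common way to implement this, though the article does not prescribe an algorithm. The sketch below is a minimal in-memory version keyed by whatever you choose (token, IP, route). The class name and the capacity and refill values are illustrative, and a real gateway would keep this state in a shared store.

```python
import time


class TokenBucket:
    """Per-key token bucket: each request spends one token; tokens refill over time."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._state = {}  # key -> (tokens_left, last_seen_timestamp)

    def allow(self, key: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        tokens, last = self._state.get(key, (float(self.capacity), now))
        # Refill for the elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1:
            self._state[key] = (tokens - 1, now)
            return True
        self._state[key] = (tokens, now)
        return False
```

A bucket of capacity 2 refilling at 1 token per second lets a client burst twice, then holds it to one request per second, without affecting any other key.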
Refusing to cache. A gateway sees every request, which means it is the ideal place to short-circuit repeat traffic. Idempotent GET endpoints with TTLs of 10-60 seconds can absorb traffic spikes that would otherwise hit your databases. Candidates who say "we don't cache because the data must be fresh" usually have not measured how stale "fresh" actually needs to be — most product surfaces tolerate 30-second staleness fine.
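
A short-TTL cache for GET responses can be sketched in a few lines. This is a simplified illustration with a hypothetical `TTLCache` class that keys on the path alone. A real gateway cache would also key on query parameters and on anything that changes the response per caller, such as auth context.

```python
import time


class TTLCache:
    """Cache GET responses by path for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # path -> (expires_at, response)

    def get_or_fetch(self, method: str, path: str, fetch, now=None):
        now = time.monotonic() if now is None else now
        if method != "GET":
            return fetch()  # never cache non-idempotent requests
        hit = self._store.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: the backend is not called
        response = fetch()
        self._store[path] = (now + self.ttl, response)
        return response
```

With a 30-second TTL, a spike of identical GETs costs the backend one call per 30 seconds per path, which is the absorption effect described above.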
FAQ
Is a service mesh a replacement for an API Gateway?
No, they solve different problems. A service mesh (Istio, Linkerd, Consul Connect) handles service-to-service traffic inside the cluster — mTLS, retries, circuit breaking, traffic shifting between versions. An API Gateway handles external-to-service traffic — what hits your system from the public internet or a partner network. In production at companies like Lyft or Airbnb you typically have both: the gateway terminates external traffic and the mesh governs everything east-west behind it.
Does GraphQL replace the BFF?
Often, yes — a GraphQL endpoint built with Apollo Federation or Hasura is a BFF for many teams. It aggregates underlying services, lets each client ask for exactly the fields it needs, and centralizes the schema as a contract. The honest tradeoff is that you take on GraphQL operational complexity (query depth limits, persisted queries, caching nuances) in exchange for not writing one bespoke BFF per client. For two clients with sharply different needs, federated GraphQL is usually the cleanest answer. For one client, plain REST behind a gateway is fine.
How do you do versioning across gateway and BFF?
The gateway exposes major versions in the URL — /v1/, /v2/ — and routes them to appropriate backend versions. The BFF can version its own schema independently, since its contract is with one client team that ships in lockstep with it. The trap is trying to version backend services with the same scheme as gateway URLs; backend services should evolve via additive changes (new fields, new optional params) and deprecate slowly, not by branching the codebase per version.
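
Both halves of that answer fit in a small sketch: URL-based version routing at the gateway, and a consumer that tolerates additive backend changes. The upstream names and the `eta_minutes` field are hypothetical, chosen only for illustration.

```python
# Hypothetical mapping from URL major version to an upstream target.
VERSION_ROUTES = {"v1": "orders-api-v1", "v2": "orders-api-v2"}


def route_versioned(path: str):
    """Split '/v2/orders/7' into (upstream, '/orders/7'); reject unknown versions."""
    parts = path.strip("/").split("/", 1)
    version = parts[0]
    if version not in VERSION_ROUTES:
        raise LookupError(f"unsupported API version: {version!r}")
    rest = "/" + parts[1] if len(parts) > 1 else "/"
    return VERSION_ROUTES[version], rest


def read_order(payload: dict) -> dict:
    """Consumer that survives additive changes: new optional fields default to None."""
    return {
        "id": payload["id"],
        "status": payload["status"],
        "eta_minutes": payload.get("eta_minutes"),  # added later, optional
    }
```

The consumer reads old and new payloads with the same code, which is what lets the backend evolve without a new URL version.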
Where does authentication actually live?
In a healthy setup, the gateway validates the token (signature, expiry, issuer) and forwards a verified-claims header or context to backend services and BFFs. Services trust this header because nothing reaches them outside the gateway. Authorization — "can this user do this action on this resource?" — can live either on the gateway (for coarse RBAC) or in the service (for fine-grained, resource-aware checks). The BFF rarely owns auth itself; it inherits the verified context from upstream.
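
The coarse versus fine-grained split is easier to defend with an example. In the sketch below, the gateway check only looks at roles and route, while the service check needs the resource itself. The permission table, role names, and the `owner_id` field are assumptions made for illustration.

```python
# Hypothetical coarse permission table enforced at the gateway:
# (HTTP method, path prefix) -> role required.
REQUIRED_ROLE = {
    ("GET", "/orders/"): "customer",
    ("DELETE", "/orders/"): "admin",
}


def gateway_allows(method: str, path: str, roles: set) -> bool:
    """Coarse RBAC: does the caller hold the role this route requires?"""
    for (route_method, prefix), role in REQUIRED_ROLE.items():
        if route_method == method and path.startswith(prefix):
            return role in roles
    return False  # deny by default when no rule matches


def service_allows(user_id: str, order: dict) -> bool:
    """Fine-grained check inside the Order service: only the owner may read it."""
    return order["owner_id"] == user_id
```

The gateway cannot answer the second question without loading the order, which is the reason resource-aware checks stay in the service.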
Can the BFF call other BFFs?
It can, but it should not. BFFs are leaf nodes — they call backend services, not each other. If your Mobile BFF needs something the Web BFF computes, that logic belongs in a shared backend service, not chained between BFFs.
Is this an official taxonomy?
No, it is consensus from practice. The terms were popularized by microservices.io and Sam Newman's Building Microservices and are widely used at companies such as Stripe, Netflix, and Uber, but no standards body owns them. Different teams draw the line slightly differently. In an interview, define your terms upfront so the interviewer knows which version you mean.