May 18, 2026·13 min read

North Star metric for product managers

Contents:

What a North Star metric actually is
Criteria of a good North Star
Industry examples
How to pick a North Star
Metric hierarchy under the NSM
Common pitfalls
Related reading
FAQ

What a North Star metric actually is

A North Star Metric (NSM) is the single product number that captures the value users get from your product. It is not revenue, it is not DAU, and it is not a board-deck vanity number — it is a count of value delivered, sampled at a cadence your team can actually act on. The whole point is to give a 40-person product org one shared rallying number that beats arguing about a dashboard with sixty tiles.

The NSM does three things. First, it ends quarterly priority debates: every initiative either moves the NSM or it does not. Second, it connects today's shipped feature to next year's revenue in terms anyone can follow without a finance background. Third, it survives reorgs: the org chart changes, the NSM does not.

The North Star is a count, not a ratio. Ratios disguise growth — a flat ratio with a doubling base is enormous progress, and the executive review will not see it. Counts make growth legible.
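The count-versus-ratio point can be checked with a few lines of arithmetic. The sketch below uses made-up numbers (a user base that doubles while the share of users reaching the value event stays flat); the field names are hypothetical.

```python
# Hypothetical numbers: the base doubles while the ratio stays flat.
quarters = [
    {"name": "Q1", "active_users": 10_000, "users_with_value_event": 4_000},
    {"name": "Q2", "active_users": 20_000, "users_with_value_event": 8_000},
]


def ratio(quarter):
    """Share of active users who hit the value event."""
    return quarter["users_with_value_event"] / quarter["active_users"]


def count(quarter):
    """Absolute number of users who hit the value event."""
    return quarter["users_with_value_event"]


# The ratio view reports no change; the count view reports a doubling.
ratio_change = ratio(quarters[1]) - ratio(quarters[0])
count_growth = count(quarters[1]) / count(quarters[0]) - 1

print(f"ratio change: {ratio_change:+.1%}")
print(f"count growth: {count_growth:+.0%}")
```

A review that only shows the ratio sees a flat line, while the count shows twice as many users receiving value.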

Walk into a senior PM interview at Stripe, Notion, or Linear and expect this exact question: "What North Star would you choose for our product, and why?" The interviewer is testing whether you can separate a proxy for delivered value from a lagging financial output.

Criteria of a good North Star

A defensible NSM passes five tests, in this order:

1. It reflects delivered value, not activity. "Logged in this week" is activity; "completed a workout" is value. The verb matters.
2. It correlates with long-term retention. Users who hit the NSM more often in week 1 should still be around at D30 and D90. If they are not, you picked an engagement-bait metric.
3. It is measurable in your warehouse without a data engineering project. If computing the NSM takes a 200-line dbt model with three CTEs, nobody will look at it daily.
4. A non-technical teammate can explain it in one sentence. Slack's "messages sent within teams that crossed the 2,000-message threshold" is famously specific but still one sentence. That is the bar.
5. Product decisions move it, not marketing or PR. If a paid acquisition campaign can spike your NSM, it is a top-of-funnel metric, not a North Star.

Load-bearing rule: If you cannot draw a straight line from "a designer shipped a better empty state" to "the NSM moves," your NSM is too far downstream. Pull it earlier in the value chain.

Industry examples

The most-cited NSMs are public knowledge at this point, pieced together from talks and company case studies. Look at the verb in each one, because that is where the team's product theory lives.

Company	North Star metric	Why this verb, not another
Spotify	Time spent listening	Listening time correlates with subscription retention better than DAU; passive plays do not count.
Airbnb	Nights booked	A booking is the value moment for both guest and host; views and searches are noise.
Notion	Weekly active teams creating content	Teams, not seats — collaborative creation is the moat, solo note-taking is not.
Stripe	Payment volume processed for live businesses	"Live" excludes test mode; volume aligns with the business model without becoming pure revenue.
Airtable	Weekly active bases with collaborators	A base with one editor is a spreadsheet; a base with three is a workflow.
Linear	Issues closed per active team per week	The product exists to ship work; closing issues is the verb that proves it.
Figma	Multiplayer files edited per week	Single-player edits are Sketch; multiplayer is the wedge.
Slack	Messages sent within teams that crossed the 2,000-message threshold	The 2,000-message marker is the empirical retention cliff.

Notice how most of these metrics carry a qualifier: "within teams that crossed the 2,000-message threshold," "with collaborators," "for live businesses." The qualifier is what separates a North Star from a vanity count. Without it, you ship features that pad the number; with it, you ship features that add value.

This is also why "DAU" alone fell out of fashion at Meta — shallow engagement scaled while satisfaction did not.

How to pick a North Star

The selection algorithm is mechanical once you have a value hypothesis.

1. Name the core value moment. Not a feature, but an outcome. For a food-delivery marketplace, the outcome is that the meal the customer wanted arrives hot. Not "an order is placed."
2. Find the count of that moment. "Orders with a 4-or-5-star rating per week" counts the moment. "GMV" counts the dollars, which is a consequence.
3. Test the retention correlation. Bucket users by their week-1 NSM count and compare D30 retention across buckets. If a user with 5 NSM events in week 1 retains at 70% but a user with 1 retains at 22%, the metric is predictive. If the curves are flat, the metric is engagement theater.
4. Test the revenue correlation. Cohorts with rising NSM should have rising LTV 90 days later. If LTV is flat while NSM climbs, you are gaming yourselves.
5. Pressure-test with the team. Read the candidate metric to engineering, design, and support. If they cannot name three things they would build to move it, it is too abstract.
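The retention test in the steps above can be sketched in a few lines. This is a minimal illustration, not a production query: the rows, the bucket boundaries, and the function names are all assumptions, and in practice the input would come from your warehouse.

```python
from collections import defaultdict

# Hypothetical rows: (user_id, NSM events in week 1, still active at D30?)
users = [
    ("u1", 5, True), ("u2", 5, True), ("u3", 5, False),
    ("u4", 1, False), ("u5", 1, False), ("u6", 1, True), ("u7", 1, False),
    ("u8", 0, False), ("u9", 0, False),
]


def bucket(week1_events):
    """Assign a user to a week-1 bucket (boundaries are an assumption)."""
    if week1_events == 0:
        return "0"
    if week1_events <= 2:
        return "1-2"
    return "3+"


def d30_retention_by_bucket(rows):
    """Return {bucket: share of users in that bucket retained at D30}."""
    retained = defaultdict(int)
    total = defaultdict(int)
    for _user_id, events, is_retained in rows:
        key = bucket(events)
        total[key] += 1
        retained[key] += int(is_retained)
    return {key: retained[key] / total[key] for key in total}


curve = d30_retention_by_bucket(users)
for key in ("0", "1-2", "3+"):
    print(f"week-1 events {key}: D30 retention {curve[key]:.0%}")
```

If retention rises clearly from the lowest bucket to the highest, the candidate metric is predictive; if the buckets look alike, it is not.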

A worked example for the food-delivery marketplace, with three candidate metrics and how they score:

Candidate	Reflects value?	Drives retention?	Movable by product?	Verdict
Gross merchandise value (GMV)	No — counts dollars, not joy	Weakly	Partially	Reject — it is a financial output
Orders per week per active user	Partial	Yes	Yes	Promising but ignores quality
Orders with 4-5 star rating per week	Yes	Yes (strongest)	Yes	Adopt

The third candidate wins because a bad order — cold food, missing item, rude driver — should not count. Counting all orders rewards volume; counting good orders rewards the product fixing the parts of the experience that customers actually feel.
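To make the difference between the three candidates concrete, here is a small sketch that computes each one from the same week of orders. The order rows and the 4-star cutoff variable are hypothetical; only the three metric definitions come from the table above.

```python
# Hypothetical week of orders: (user_id, order_value_usd, star_rating)
orders = [
    ("u1", 30.0, 5),
    ("u1", 22.0, 2),  # cold food: counts toward GMV, not toward good orders
    ("u2", 18.0, 4),
    ("u3", 40.0, 1),  # missing item
]

GOOD_RATING = 4  # "4-5 star" cutoff

# Candidate 1: GMV counts dollars regardless of experience.
gmv = sum(value for _user, value, _rating in orders)

# Candidate 2: orders per active user counts volume regardless of quality.
active_users = {user for user, _value, _rating in orders}
orders_per_active_user = len(orders) / len(active_users)

# Candidate 3: only orders the customer rated 4 or 5 stars.
good_orders = sum(1 for _user, _value, rating in orders if rating >= GOOD_RATING)

print(f"GMV: ${gmv:.2f}")
print(f"Orders per active user: {orders_per_active_user:.2f}")
print(f"4-5 star orders: {good_orders}")
```

In this toy week, half the orders were bad experiences. GMV and order volume still count them; the third candidate does not.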

Metric hierarchy under the NSM

The NSM is the apex; underneath sits a tree of drivers and operational metrics. The hierarchy is what turns the NSM from a poster into a planning tool.

North Star: 4-5★ orders per week
│
├── Driver: Acquisition
│   └── New users with first order
├── Driver: Activation
│   └── Share of users with 2+ orders in week 1
├── Driver: Retention
│   ├── D30 return rate
│   └── D90 return rate
├── Driver: Frequency
│   └── Orders per month per active user
└── Driver: Quality
    ├── Share of orders rated 4-5★
    ├── On-time delivery rate
    └── Order accuracy rate

Each driver has an owner. Acquisition is usually a growth PM, activation belongs to onboarding, retention sits with lifecycle, frequency with merchandising, quality with operations. The senior PM who owns the NSM does not own every driver — they own the portfolio decision of which driver to push this quarter.

Sanity check: If two drivers move in opposite directions and the NSM stays flat, you do not have a problem with the NSM; you have a coordination problem between two teams. The hierarchy is doing its job by surfacing it.
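A rough way to see this sanity check in numbers is sketched below. It assumes a simple multiplicative decomposition of the NSM (active users times weekly order frequency times 4-5 star share). That decomposition and all the figures are illustrative assumptions, not something the hierarchy above prescribes.

```python
import math


def good_orders_per_week(active_users, orders_per_user, good_share):
    """Assumed decomposition: NSM = active users x weekly frequency x 4-5 star share."""
    return active_users * orders_per_user * good_share


# Hypothetical quarters: frequency rises while quality falls.
last_quarter = {"active_users": 10_000, "orders_per_user": 1.0, "good_share": 0.90}
this_quarter = {"active_users": 10_000, "orders_per_user": 1.125, "good_share": 0.80}

nsm_before = good_orders_per_week(**last_quarter)
nsm_after = good_orders_per_week(**this_quarter)

frequency_up = this_quarter["orders_per_user"] > last_quarter["orders_per_user"]
quality_down = this_quarter["good_share"] < last_quarter["good_share"]
# Treat the NSM as flat if it moved by less than 1% (tolerance is an assumption).
nsm_flat = math.isclose(nsm_before, nsm_after, rel_tol=0.01)

if nsm_flat and frequency_up and quality_down:
    print("NSM flat: frequency gains are being cancelled by a quality drop")
```

Here the frequency team and the quality team each have a real story to tell, and the flat NSM is the signal that the two stories need to be reconciled.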

A common second-level breakdown looks like the table below. The numbers are illustrative for a Series B consumer marketplace, not benchmarks to copy.

Driver	Healthy quarterly trend	Warning threshold	Owner
Activation rate (2+ orders in week 1)	+3 to +5 points	Flat or declining 2 quarters in a row	Onboarding PM
D30 retention	+1 to +3 points	Drop of >2 points QoQ	Lifecycle PM
Frequency per active	+0.2 to +0.5 orders/mo	Flat with rising acquisition spend	Merchandising PM
4-5★ share	≥85% sustained	Drop below 80%	Ops + Quality

The point of the table is not the numbers — it is that each row has an owner and a threshold. A driver without an owner is a driver nobody is moving.
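One way to keep owners and thresholds attached to each driver is to store them together and check them mechanically. The sketch below encodes three of the warning rules from the table; the `Driver` class, the rule functions, and the quarterly histories are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Driver:
    name: str
    owner: str
    history: List[float]  # quarterly values in points, oldest first
    is_warning: Callable[[List[float]], bool]


def flat_or_declining_two_quarters(history):
    """Two consecutive quarter-over-quarter changes that are zero or negative."""
    return len(history) >= 3 and history[-1] <= history[-2] <= history[-3]


def dropped_more_than_two_points(history):
    """Latest quarter is more than 2 points below the previous one."""
    return len(history) >= 2 and history[-2] - history[-1] > 2.0


def below_80_percent(history):
    """Latest value has fallen under the 80% floor."""
    return history[-1] < 80.0


drivers = [
    Driver("Activation rate", "Onboarding PM", [31.0, 31.0, 30.5],
           flat_or_declining_two_quarters),
    Driver("D30 retention", "Lifecycle PM", [42.0, 41.0],
           dropped_more_than_two_points),
    Driver("4-5 star share", "Ops + Quality", [86.0, 79.0],
           below_80_percent),
]

# Every warning comes out paired with the person who has to act on it.
warnings = [(d.name, d.owner) for d in drivers if d.is_warning(d.history)]
for name, owner in warnings:
    print(f"WARNING: {name} -> {owner}")
```

With these made-up histories, activation and 4-5 star share trip their thresholds, while the 1-point retention dip does not.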

Common pitfalls

Equating NSM with revenue. Revenue is the lagging consequence of delivered value, not value itself. If your NSM is ARR, you have built a finance dashboard, not a product compass. The fix is to write down the user outcome that produces revenue and count that outcome — ARR climbs as a side effect, and you know why it climbed.

Picking an NSM nobody can recite. "Weekly active users who completed five events including one purchase and rated four-plus" is a query, not a metric. If the PM org cannot recite the NSM at standup without reading it, the metric is dead. Slack's version is one sentence with one qualifier — that is the maximum complexity humans tolerate.

Never revisiting the NSM. Products evolve and the value hypothesis from year one is often wrong by year three. Notion shifted from "active users" to "active teams creating content" as the company pivoted from solo notes to collaboration. Schedule a yearly NSM review to force the question "is this still the right verb?" and document the answer either way.

Letting marketing influence the NSM. If a paid campaign can spike your North Star by 20% in a week, your NSM is too top-of-funnel. The metric should be downstream of marketing's reach and upstream of finance's revenue — the middle stretch where product actually lives.

Maintaining more than one North Star. A North Star is singular by definition; the moment you have three, you have OKRs. Multi-star orgs end up with teams optimizing numbers that fight each other — engagement minutes versus paid conversion, with no shared metric to reconcile them. Pick one number, and let the drivers handle the nuance.

If you want to drill product-sense and metrics questions like this on a daily cadence — including North Star teardowns from Stripe, Notion, Linear, and Airbnb — NAILDD is launching with 500+ PM and analytics interview problems shaped exactly around this pattern.

FAQ

How is the NSM different from an OKR?

The NSM is a permanent strategic compass; an OKR is a time-boxed quarterly or annual commitment. An OKR can take the NSM as its Objective, and the Key Results are usually drivers underneath the NSM. The NSM should not change every quarter. If it does, you have not picked an NSM; you have picked a moving KPI dressed up in starry language. A healthy company typically keeps the same NSM for 2 to 4 years: it reviews the metric yearly but changes it only after a major product pivot.

How is the NSM different from a KPI?

A KPI is any indicator a team tracks; you have many. An NSM is exactly one per product, sitting at the top of the hierarchy. Every KPI in the product org should ladder up to the NSM, otherwise the KPI is measuring something the company has not decided to care about. Practically: KPIs are how individual teams talk about their progress; the NSM is how the whole product org talks about progress to the CEO.

Can a B2B SaaS use the same playbook as a consumer app?

Yes, with one substitution — count active accounts or active teams, not active users. A 10-seat team with 3 active users is often a healthier signal than a 100-seat team with 4 active users, and per-user metrics will mislead you. Notion, Linear, Figma, and Slack all use team-level NSMs for exactly this reason.
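The seat example can be worked through directly. In this sketch, the definition of an "active team" (at least 20% of seats active) is a hypothetical threshold chosen for illustration; your own cutoff should come from retention data.

```python
# Hypothetical cutoff: a team counts as active if at least 20% of seats are active.
ACTIVE_SHARE_THRESHOLD = 0.20

teams = [
    {"team": "A", "seats": 10, "active_users": 3},
    {"team": "B", "seats": 100, "active_users": 4},
]


def active_share(team):
    """Fraction of a team's seats that were active in the period."""
    return team["active_users"] / team["seats"]


def is_active_team(team):
    return active_share(team) >= ACTIVE_SHARE_THRESHOLD


# Per-user view: team B looks slightly better (4 active users versus 3).
total_active_users = sum(t["active_users"] for t in teams)

# Team-level view: only team A has real adoption.
active_teams = [t["team"] for t in teams if is_active_team(t)]

for t in teams:
    print(f"team {t['team']}: {active_share(t):.0%} of seats active")
print(f"active users: {total_active_users}, active teams: {active_teams}")
```

The per-user count hides the fact that team B has bought 100 seats and barely uses them.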

Should the NSM be a leading or a lagging indicator?

Leading, but not so leading that it becomes activity. The NSM sits between activation (very leading, very early) and revenue (very lagging, very late). The sweet spot is roughly the value moment — the action where the user gets what they came for, repeated often enough to be measured weekly. If the NSM only moves quarterly, it is too lagging; if it spikes from a tutorial completion, it is too leading.

What if my product has two genuinely distinct user types?

Marketplaces are the classic case — Airbnb has guests and hosts, DoorDash has eaters and merchants and dashers. You still pick one NSM, but it is usually the transactional verb that requires both sides: nights booked, orders delivered. That metric only moves if both sides of the marketplace are healthy, which is the whole point of using one NSM instead of two.

How long should I wait before declaring a new NSM "the" NSM?

Run it in parallel with your existing primary metric for one to two quarters. Watch whether the new NSM predicts retention and revenue better in cohort backtests. If it does, retire the old metric publicly and migrate dashboards. Switching the NSM without a parallel period loses the trust of engineering and design.
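A cohort backtest during the parallel period can be as simple as asking which metric better separates retained users from churned ones. The sketch below uses a deliberately crude measure (retention of the top half minus retention of the bottom half, ranked by each metric); the rows, column order, and function name are assumptions, and a real backtest would use larger cohorts and a proper statistical test.

```python
# Hypothetical cohort: (old_metric_value, new_nsm_value, retained_at_d90)
cohort = [
    (12, 0, False),
    (10, 1, False),
    (3, 4, True),
    (11, 5, True),
    (2, 0, False),
    (4, 6, True),
]


def retention_gap(rows, metric_index):
    """Retention of the top half minus the bottom half, ranked by one metric."""
    ranked = sorted(rows, key=lambda row: row[metric_index], reverse=True)
    half = len(ranked) // 2
    top, bottom = ranked[:half], ranked[half:]

    def retention(group):
        return sum(1 for row in group if row[2]) / len(group)

    return retention(top) - retention(bottom)


gap_old = retention_gap(cohort, metric_index=0)
gap_new = retention_gap(cohort, metric_index=1)

print(f"old metric retention gap: {gap_old:+.2f}")
print(f"new NSM retention gap:    {gap_new:+.2f}")
```

A larger positive gap means high scorers on that metric retain better than low scorers, which is the evidence you want before retiring the old metric.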

Is the NSM an official term from a specific framework?

No. The phrase was popularized by Sean Ellis and the early growth-hacking community around 2010 to 2015, and the public examples were assembled from talks and company case studies. What matters is the practice: one count of delivered value, owned by product, ladderable to revenue.