These benchmarks come from 6–12 months of field service management (FSM) and call data across comparable field service operations. They show what top-quartile looks like, what average looks like, and how wide the gap is in every category that matters to EBITDA.
These are not survey benchmarks. They are not self-reported numbers from conference attendees or vendor case studies. They come from FSM and call data analysis across field service operations running ServiceTitan, Housecall Pro (HCP), FieldEdge, and Jobber: $20M–$100M operators in HVAC and residential services.
Gross margin per job is calculated from job cost records: labor hours pulled from dispatch records multiplied by each tech’s burdened rate, plus actual parts cost from purchase records, compared against invoiced job totals. Callback rate is derived from dispatch records with callback flags and job-type cross-references — not from manually reported counts. CSR booking rate comes from call recording analysis matched against FSM booking outcomes: answered calls that produced a scheduled job versus answered calls that did not.
The result is benchmark ranges built from actual operational records across comparable operators — not what operators think their numbers are, and not what vendors report as “industry averages.” Most operators, when they run the same calculation on their own data, find their real numbers are 5–12 points worse than their monthly P&L suggests. That gap is what these benchmarks are designed to surface.
Most operators know their company-wide averages. Very few have measured their own numbers against the benchmark ranges that define top-quartile execution. The gap between knowing your average and knowing where you stand is where $400K–$1M in recoverable margin hides.
Top-quartile field service techs close at 36–42% GM per job. Company averages typically run 29–33%. The gap: 8–14 points on identical job types, same pricebook, same zip code. The difference is diagnostic thoroughness, option presentation, and pricebook compliance. On a 50-tech operation with $35M revenue, that spread represents $400K–$875K in recoverable annual margin.
8–14 pt GM spread between top and bottom performers
Industry average callback rate for residential HVAC: 8–14% of completed jobs. Top-quartile operators run 3–5%. That gap represents $150K–$250K/year in direct cost for a 50-tech shop, plus displaced schedule capacity and customer attrition. The causes: incomplete diagnostics, wrong-part arrivals, dispatch mismatches. All of them show up in FSM job records before they compound.
Average: 8–14% | Top quartile: 3–5% | Gap: $150K–$250K/year
Average inbound booking rate for field service CSRs: 60–68%. Top-quartile CSRs book 74–78% of answered calls, with top-10% operators reaching 82%+. The spread between your best and worst CSR is worth $300–$800 per missed call in lost jobs. During peak season, low booking rates cost $40K–$100K/month in revenue that went to competitors.
Median: 60–68% | Top quartile: 74–78% | Top 10%: 82%+
Source: FSM and call data analysis across $20M–$100M field service operators. Ranges represent 2025–2026 operational data. Not survey data.
For the three benchmarks with the highest EBITDA impact, here is the specific operational mechanism that produces the improvement — not general advice, but the lever that actually moves the number.
The median-to-top-quartile move on GM is almost never a pricing problem. Pricebook rates are usually adequate. The problem is pricebook compliance and diagnostic completeness — techs who skip options, underquote labor, or close jobs below the approved price floor without a flag ever reaching the office.
The mechanism: margin guardrails embedded in the job workflow. Every closed job is compared against the expected GM range for that job type and tech, based on actual invoiced vs. cost data. Jobs that close below threshold are flagged before the invoice is finalized — not discovered three weeks later on the P&L. Over 60–90 days, this creates a feedback loop that tightens compliance without requiring a new pricing conversation. The 8–14 point spread between your best and worst techs on identical job types is not a compensation or tenure issue. It is a real-time visibility issue.
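A guardrail check of this kind can be sketched in a few lines. This is a minimal illustration, not the production workflow: the `ClosedJob` fields, the `GM_FLOOR_PCT` values, and the function names are all hypothetical, and the floor is keyed on job type only, where a fuller version would also key on tech.

```python
from dataclasses import dataclass


@dataclass
class ClosedJob:
    job_id: str
    tech: str
    job_type: str
    invoice_total: float
    job_cost: float  # burdened labor plus actual parts cost


# Hypothetical GM floors (percent) per job type. In practice these would be
# derived from the operation's own invoiced-vs-cost history.
GM_FLOOR_PCT = {"ac_repair": 33.0, "furnace_tuneup": 30.0}


def below_floor(job: ClosedJob) -> bool:
    """True when the job's GM % is under the floor for its job type."""
    gm = (job.invoice_total - job.job_cost) * 100 / job.invoice_total
    return gm < GM_FLOOR_PCT[job.job_type]


def flag_for_review(jobs: list[ClosedJob]) -> list[str]:
    """IDs of jobs to hold for review before the invoice is finalized."""
    return [job.job_id for job in jobs if below_floor(job)]


closed = [
    ClosedJob("J1", "tech_a", "ac_repair", 500.0, 300.0),  # 40% GM, passes
    ClosedJob("J2", "tech_b", "ac_repair", 500.0, 360.0),  # 28% GM, flagged
]
flagged = flag_for_review(closed)
```

Running the check at job close, rather than at month end, is what turns the threshold into a feedback loop.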
Callback rate at the median means roughly 1 in 8 jobs comes back. At top quartile, it is 1 in 12. The difference is not that top-quartile techs are more experienced — it is that their diagnostic paths for high-callback job types are documented and transferable.
The mechanism: an operational knowledge graph that maps how your top-quartile techs execute the specific job types where your callback rate is highest. If your callback rate on a particular equipment failure mode drops to 4% when your best tech runs the job, that is not luck — it is a diagnostic sequence that can be extracted, documented, and made the standard for every other tech. The knowledge graph converts individual expertise into operational infrastructure. It is why new techs at top-quartile shops ramp in 3–5 months instead of 9–12: they have a structured execution path, not tribal knowledge.
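Finding the job types where one tech's callback rate sits far below another's starts with a simple segmentation. The sketch below assumes a hypothetical export where each row is a completed job tagged with its job type, its tech, and whether it later produced a callback. The row layout and the `callback_rates` name are illustrative, not an FSM API.

```python
from collections import defaultdict


def callback_rates(rows):
    """Callback rate (%) per (job_type, tech) pair.

    rows: iterable of (job_type, tech, was_callback) tuples, one per
    completed job. This layout is an assumption about the export.
    """
    totals = defaultdict(int)
    callbacks = defaultdict(int)
    for job_type, tech, was_callback in rows:
        key = (job_type, tech)
        totals[key] += 1
        callbacks[key] += int(was_callback)
    return {key: callbacks[key] * 100 / totals[key] for key in totals}


# Hypothetical data: same job type, two techs, 25 jobs each.
rows = (
    [("capacitor_failure", "tech_a", False)] * 24
    + [("capacitor_failure", "tech_a", True)] * 1
    + [("capacitor_failure", "tech_b", False)] * 21
    + [("capacitor_failure", "tech_b", True)] * 4
)
rates = callback_rates(rows)
```

In this made-up sample, `tech_a` runs 4% callbacks and `tech_b` runs 16% on the same job type. That pairing is the signal that a diagnostic sequence is worth extracting.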
The median CSR books 6.5 out of 10 answered calls. The top-quartile CSR books 7.6. That difference, at $300–$800 per missed job during peak season, is $40K–$100K per month. And the gap almost always comes from a handful of specific call patterns: price objection handling, urgency framing, and what happens in the last 90 seconds of calls that end without a scheduled job.
The mechanism: call scoring against a defined booking standard, applied to recorded calls at the rep level, weekly. Not a general coaching program — a structured analysis of where each CSR’s calls break down, matched against a call-type-specific standard. The operators who move from median to top quartile on booking rate do it by identifying the 3–4 call patterns where their lowest-performing CSRs consistently lose bookings, and running targeted coaching on those patterns specifically. It takes 4–6 weeks to see movement. It does not require new staff or new call scripts from scratch.
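Before any call scoring, the rep-level booking rate itself has to be measured. A minimal sketch, assuming a hypothetical call log where each record carries the CSR, whether the call was answered, and whether it produced a scheduled job. The 74% target is the lower edge of the top-quartile range quoted above. The record layout and function names are illustrative only.

```python
from collections import defaultdict


def booking_rates(calls):
    """Booking rate (%) per CSR, counting answered calls only.

    calls: iterable of (csr, answered, booked) tuples. This layout is an
    assumption about how call recordings are matched to FSM bookings.
    """
    answered = defaultdict(int)
    booked = defaultdict(int)
    for csr, was_answered, was_booked in calls:
        if not was_answered:
            continue  # unanswered calls are excluded from the denominator
        answered[csr] += 1
        booked[csr] += int(was_booked)
    return {csr: booked[csr] * 100 / answered[csr] for csr in answered}


def below_target(rates, target_pct):
    """CSRs whose booking rate is under the target, sorted by name."""
    return sorted(csr for csr, rate in rates.items() if rate < target_pct)


# Hypothetical week of calls for two reps.
calls = (
    [("csr_a", True, True)] * 8
    + [("csr_a", True, False)] * 2
    + [("csr_b", True, True)] * 6
    + [("csr_b", True, False)] * 4
    + [("csr_b", False, False)] * 1
)
rates = booking_rates(calls)
needs_coaching = below_target(rates, target_pct=74.0)
```

The reps this surfaces are the ones whose recorded calls are worth reviewing first for the specific patterns that lose bookings.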
Benchmarks are only useful if you know your actual numbers. Most operators have company-wide monthly averages. Those averages hide the variance that benchmarks are designed to expose.
To calculate your real gross margin per job, you need a cost-side data pull from your FSM: all completed jobs over the last 6–12 months, with tech assignment, labor hours, job type, and invoice total. Match those labor hours against each tech’s burdened rate — base pay plus burden: payroll taxes, benefits, vehicle allocation, and insurance. Add actual parts cost from purchase records, not the pricebook cost. That sum versus the invoice total is your real GM per job. Run it by tech. Run it by job type. The spread across both dimensions will be wider than your monthly P&L suggests.
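The calculation described above can be sketched as follows. Everything named here is hypothetical: the `Job` fields stand in for whatever your FSM export provides, and the `BURDENED_RATE` values and sample jobs are made-up numbers chosen only to show the arithmetic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    tech: str
    job_type: str
    labor_hours: float    # from dispatch records
    parts_cost: float     # actual cost from purchase records, not pricebook
    invoice_total: float


# Hypothetical burdened hourly rates: base pay plus payroll taxes,
# benefits, vehicle allocation, and insurance.
BURDENED_RATE = {"tech_a": 48.0, "tech_b": 52.0}


def job_cost(job: Job) -> float:
    """Burdened labor plus actual parts for one job."""
    return job.labor_hours * BURDENED_RATE[job.tech] + job.parts_cost


def gm_pct(job: Job) -> float:
    """Gross margin for one job as a percentage of the invoice total."""
    return (job.invoice_total - job_cost(job)) * 100 / job.invoice_total


def gm_by(jobs: list[Job], key: str) -> dict[str, float]:
    """Revenue-weighted GM % grouped by a Job attribute ('tech' or 'job_type')."""
    revenue = defaultdict(float)
    cost = defaultdict(float)
    for job in jobs:
        group = getattr(job, key)
        revenue[group] += job.invoice_total
        cost[group] += job_cost(job)
    return {g: (revenue[g] - cost[g]) * 100 / revenue[g] for g in revenue}


# Same job type, same invoice, two techs.
jobs = [
    Job("J1", "tech_a", "ac_repair", 2.0, 204.0, 500.0),
    Job("J2", "tech_b", "ac_repair", 3.0, 194.0, 500.0),
]
by_tech = gm_by(jobs, "tech")
by_job_type = gm_by(jobs, "job_type")
```

In this sample the job-type figure blends both techs into a single number, while the by-tech view shows a 10-point spread on an identical invoice. That is the variance a company-wide average hides.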
Most operators who go through this exercise find that 20–30% of their techs are generating the majority of their margin gap. They also find that the job types with the worst GM are not random — they cluster by type, by season, and by which techs are assigned to them. That is the starting point for using the benchmark table above as an actionable diagnostic rather than a vanity comparison.
If your FSM is ServiceTitan, HCP, FieldEdge, or Jobber, the data pull is straightforward via API export. The analysis itself takes 2–3 days of structured work. If you want to see what your numbers look like against these benchmarks before committing to a full engagement, that is exactly what the 45-minute diagnostic is for.
Industry benchmarks tell you where the line is. Your FSM data tells you where you stand. Most operators have never run both numbers together.
Company-wide average. Doesn’t show by tech, job type, or branch. Looks acceptable. Hides the spread.
Total callbacks per month. Doesn’t segment by tech, job type, or root cause. Too aggregated to fix.
Lags by 30–45 days. Shows the outcome, not the driver.
“Marcus is good.” “Kevin has a lot of callbacks.” Nobody has pulled the data.
See exactly which techs and which job types are above and below your target range. The 3 techs generating 40% of your margin gap become visible on Day 1.
Know which job types are running 18% callbacks when top operators run 4%. Know the root cause category. Know the intervention.
See which rep is at 58% when top performers hit 85%. Measure against the benchmark that matters, not a company average that masks the spread.
Benchmark gaps are cross-system. The callback that starts with a dispatch mismatch and ends in a missed follow-up isn’t visible in any single report.
Four tools. Full benchmark map in 30 days, daily monitoring after that.
Pulls all completed job records, invoice data, and callback flags from your FSM via API. Cross-references your GM per job, callback rate, and pricebook compliance against the benchmark ranges above — by tech, by job type, by branch, by season. Shows you exactly where you stand before anything changes.
Integrates call recording data with FSM booking outcomes. Maps your CSR booking rate by rep, by day, by call type against industry benchmark. Surfaces the rep-level gaps, the call volume patterns that stress booking performance, and the specific objection types that break the booking rate most.
Benchmarks alone don’t tell you how to close the gap. The operational knowledge graph documents how your top-quartile techs execute the specific job types where you’re below benchmark. Their diagnostic path becomes the standard for the techs who are running at average. Not industry best practices — your best people’s actual methods.
Once your benchmarks are established, AI monitors every tech’s GM per job, callback rate, and booking rate daily against the target. Flags when a metric drifts below threshold — 3–4 weeks before the monthly P&L shows the problem. The benchmark becomes a living standard, not a one-time report.
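The simplest version of drift detection is a rolling average compared against the target. The sketch below is a plain stand-in for illustration, not the monitoring described above: the `drift_alerts` name, the window length, and the sample values are all assumptions.

```python
from statistics import mean


def drift_alerts(daily_values, target, window=7):
    """Return (day_index, rolling_mean) for each day the rolling mean
    over the last `window` days is below `target`."""
    alerts = []
    for i in range(window - 1, len(daily_values)):
        rolling = mean(daily_values[i - window + 1 : i + 1])
        if rolling < target:
            alerts.append((i, rolling))
    return alerts


# Hypothetical daily GM % for one tech, sliding one point per day.
daily_gm = [38.0, 37.0, 36.0, 35.0, 34.0, 33.0, 32.0, 31.0, 30.0]
alerts = drift_alerts(daily_gm, target=34.0, window=3)
```

With a 3-day window and a 34% target, the first alert in this sample fires on day index 6. A daily check like this can catch a slide that a monthly report would only show after close.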
A top-quartile field service operation runs 52–55% gross margin per completed job. Median operators sit at 43–47%. The bottom quartile falls below 38%. The spread is almost never about pricing — it comes from labor cost variance, pricebook non-compliance, and incomplete diagnostics that lead to missed add-on opportunities. On a $35M operation, closing a 10-point GM gap is worth $400K–$875K in recoverable annual margin. The operators who reach top quartile are not charging more; they are closing more of the jobs they are already running at the rate the pricebook says.
The median callback rate for HVAC and residential field service operations is 12–15% of completed jobs. Top-quartile operators run 7–9%, and top-10% operators are below 5%. The bottom quartile runs above 18%. A 50-tech shop at 15% callbacks is spending $150K–$250K per year on direct rework cost alone — before accounting for displaced schedule capacity and customer attrition. The callbacks that cost the most are not the obvious repeat failures; they are the systematic failures on specific job types that never get attributed to a root cause because they are measured as a company total rather than by job type and tech.
Target a CSR booking rate of 74–78% at top quartile, with top-10% operators reaching 82% or higher. The median is 62–68%. If your CSR team is below 62%, you are losing $300–$800 per missed inbound call in jobs that went to competitors. The gap between your best and worst CSR is almost always larger than the gap between your average CSR and the benchmark — which is why rep-level measurement matters more than a team average. The specific call patterns where booking rates break down are consistent across operations: price objection handling, urgency framing, and the last 90 seconds of calls that end without a scheduled job.
True cost per job requires pulling labor hours from dispatch records and multiplying by each tech’s burdened labor rate — base pay plus burden: payroll taxes, benefits, vehicle allocation, and insurance. Add actual parts cost from purchase records, not the pricebook cost. Compare that total against the invoiced job amount. Most operators use monthly P&L averages for this calculation, which hides tech-level variance of 15–20 points on identical job types. A tech who takes 3.5 hours on a job your best tech runs in 2 hours, on the same flat-rate invoice, is running at 30–40% worse GM — and it only shows up in your monthly margin as a fraction of a percent because it is averaged across all jobs.
Gross margin per job covers revenue minus direct job cost: field labor (burdened), parts, and direct materials. Net margin subtracts overhead — office staff, software, marketing, owner compensation, facilities. In field service, gross margin is the operational benchmark because it is controllable at the tech and job-type level. Net margin is an accounting outcome. Operators who focus on net margin as their primary metric typically cannot identify which jobs, techs, or job types are driving the problem. Gross margin per completed job, measured at the individual job level, is what separates a diagnostic benchmark from a lagging financial report. You can act on a GM-per-job variance this week. You cannot act on net margin that is reported 30–45 days after the month closes.
Start with a 6–12 month pull from your FSM: all completed job records with invoice totals, tech assignments, job types, and callback flags. Match those against your actual labor cost (hours times burdened rate) and parts cost at the job level. That gives you GM per job by tech and job type. Layer in your call recording data to calculate CSR booking rate by rep. Then compare both sets of numbers against the benchmark ranges in the table above. The operators who get the most from benchmarking are the ones who measure at the individual level first — company averages mask the variance that benchmarks are designed to expose. If this process sounds like a significant data lift, the 45-minute diagnostic is where we start: we pull a recent export, show you your numbers against these ranges, and scope what a full benchmark analysis would surface.
Our Full-Operation Audit (Days 1–30) maps every revenue leak — field and back of house. If we don’t identify at least $200,000 in recoverable annual revenue, we refund Phase 1 in full. You keep all audit deliverables.
After kickoff, we ask for about 30 minutes a week of your ops leader’s time.
We’ll start with a recent export or sample call data from your FSM and call system, show you the biggest leaks, and scope the engagement. Full access happens only if you proceed to the audit.