How It Works
TrimIndex Score Methodology
The TrimIndex Score is a composite rating built from five quality dimensions, with Value reported alongside it as a separate ratio. Every number is derived from primary data sources (government databases, EPA testing, and aggregated owner reports), with no editorial guesswork.
Five Quality Dimensions
The composite TrimIndex Score blends five quality dimensions. Weights reflect how much each attribute affects long-term ownership satisfaction for a battery-electric vehicle. Users can adjust weights via sliders — the adjusted composite is recomputed client-side and shareable via URL. Value is computed separately as a Value-to-Score Ratio (1-10), shown alongside the composite.
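The slider-adjusted composite can be pictured as a weighted average that is renormalized whenever a weight changes. The sketch below is only an illustration: the weight values, the dimension keys, and the `w_` query-parameter names are assumptions, since the page does not publish them, and the real recomputation runs client-side rather than in Python.

```python
from urllib.parse import urlencode

# Hypothetical default weights; the actual values are not published on this page.
DEFAULT_WEIGHTS = {
    "battery_health": 0.30,
    "owner_satisfaction": 0.20,
    "build_quality": 0.20,
    "range_efficiency": 0.15,
    "software_tech": 0.15,
}


def composite_score(scores: dict, weights: dict) -> float:
    """Weighted average of 0-100 dimension scores, normalized by total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(scores[name] * w for name, w in weights.items()) / total


def share_query(weights: dict) -> str:
    """Encode slider weights as a URL query string (parameter names are made up)."""
    return urlencode({f"w_{name}": f"{w:.2f}" for name, w in weights.items()})
```

Dividing by the total weight means the sliders do not have to sum to 1, so moving one slider never requires rebalancing the others.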
Battery Health: Pack-level chemistry (LFP / NMC / NCA), warranty years remaining, thermal management, and connector network durability. The pack catalog deduplicates 1,132 vehicles into ~70 unique pack designs. An AI modifier adjusts the score within ±8 points based on owner-reported degradation patterns.
Owner Satisfaction: Forum and Reddit sentiment averaged across owner_satisfaction-tagged excerpts. An AI modifier adjusts the score based on cross-source consensus and class-peer comparison.
Build Quality: NHTSA recall severity (DO_NOT_DRIVE / PARK_OUTSIDE advisories), crash-involved complaint rate, safety-dimension complaint counts, and build-quality forum sentiment. The volume-driven complaint penalty is capped to prevent popular-vehicle bias.
Range & Efficiency: EPA-certified range percentile × 65 plus EPA efficiency percentile × 40, with both percentiles computed within class. Forum sentiment about range adjusts the score by up to ±10 when N≥5 excerpts are available. Class-leading vehicles cap at 100; the class median scores ~53.
Software & Tech: OTA update cadence, software-dimension complaint rate, and tech-related forum sentiment. An AI modifier captures recently shipped features that lag the database refresh.
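The Range & Efficiency arithmetic above can be checked with a short sketch. It assumes percentiles are expressed as fractions between 0 and 1 and that the sentiment adjustment is applied before the final cap; neither detail is stated on this page, and the function and parameter names are illustrative.

```python
def range_efficiency_score(
    range_pct: float,
    efficiency_pct: float,
    sentiment_adjust: float = 0.0,
    n_excerpts: int = 0,
) -> float:
    """Percentile-driven baseline with an optional bounded sentiment adjustment."""
    # Baseline: range percentile x 65 plus efficiency percentile x 40.
    score = range_pct * 65 + efficiency_pct * 40
    # Sentiment only counts when at least 5 range-tagged excerpts exist,
    # and is limited to +/-10 points.
    if n_excerpts >= 5:
        score += max(-10.0, min(10.0, sentiment_adjust))
    # The two weights sum to 105, so the result is capped at 100.
    return max(0.0, min(100.0, score))
```

A vehicle at the class median on both measures gets 0.5 × 65 + 0.5 × 40 = 52.5, which matches the stated median of about 53, and a class leader on both measures reaches 105 before the cap brings it to 100.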
Hybrid Scoring
Four dimensions use a hybrid approach: a deterministic formula baseline plus a bounded AI modifier (±8 points maximum). The AI modifier is applied by the rate-dimension skill running via the Claude CLI at scan time.
Evidence requirements are designed to guard against hallucination: a modifier of ±3 or more requires at least 2 cited sources (complaint IDs, forum threads, recall campaigns), and a modifier of ±6 or more requires at least 3. Low-data vehicles are capped at ±3. The AI never overrides hard data; it can only shade the formula baseline within the allowed band.
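One way to read the evidence rules is as a limit on the modifier that grows with the number of cited sources. The sketch below assumes modifiers are whole points, that an over-limit proposal is clamped rather than rejected, and that "low-data" is a flag supplied by the caller; the pipeline's actual behavior on these points is not documented here.

```python
MAX_MODIFIER = 8  # overall bound on the AI modifier, in points


def modifier_limit(n_sources: int, low_data: bool = False) -> int:
    """Largest modifier magnitude permitted for a given amount of evidence."""
    if n_sources < 2:
        # Fewer than 2 sources is "Insufficient" confidence: no modifier at all.
        return 0
    # Exactly 2 sources permits magnitudes below 6; 3 or more permits the full band.
    limit = 5 if n_sources == 2 else MAX_MODIFIER
    if low_data:
        limit = min(limit, 3)
    return limit


def apply_modifier(baseline: float, proposed: int, n_sources: int,
                   low_data: bool = False) -> float:
    """Shade the deterministic baseline by a bounded, evidence-gated modifier."""
    limit = modifier_limit(n_sources, low_data)
    modifier = max(-limit, min(limit, proposed))
    return max(0.0, min(100.0, baseline + modifier))
```

For example, a proposed +8 on a baseline of 70 yields 78 with three sources, 75 with two, and 70 with one.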
Data Sources
NHTSA: Full recall and complaint databases, updated daily. Provides recall campaign details, complaint narratives, DO_NOT_DRIVE and PARK_OUTSIDE advisory flags, and vPIC vehicle specifications.
EPA: Official EPA-tested range (miles) and efficiency (kWh/100mi) per vehicle trim. Fetched once per model year from fueleconomy.gov.
KBB: Original MSRP and Fair Purchase Price per trim, scraped via Firecrawl. Used to compute the Value dimension and market delta.
J.D. Power: Consumer Verified score (0–100, from owner surveys) and Verified Fair Price at model level, denormalized across all trims. Displayed as independent third-party validation alongside the TrimIndex Score.
Forums and Reddit: Owner excerpts classified by dimension and sentiment using the classify-sentiment skill. Sources include r/electricvehicles, r/teslamotors, r/BoltEV, r/BMWI4, and 10+ model-specific forums. ~2,300 classified excerpts per canary scan.
Value Score Formula
The Value score is pure math, with no AI involved. It compares the original MSRP to the current composite market value from KBB and J.D. Power; delta in the formula below denotes the resulting market delta:
raw_score = clamp(delta × 200 + 50, 0, 100)
value_score = age_adjust(raw_score, vehicle_age_years)
The age adjustment prevents penalizing vehicles that have depreciated normally — a 5-year-old car is expected to be worth 40–60% of MSRP. The curve is calibrated against class-average depreciation rates.
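The first line of the formula is easy to check by hand. The sketch below implements only `raw_score`; the shape of `age_adjust` and the exact definition of `delta` are not given on this page, so both are left out rather than guessed.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def raw_value_score(delta: float) -> float:
    """raw_score = clamp(delta * 200 + 50, 0, 100), before any age adjustment."""
    return clamp(delta * 200 + 50, 0.0, 100.0)
```

A delta of 0 maps to the neutral score of 50, and the score saturates at 100 and 0 once delta reaches +0.25 and -0.25 respectively.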
Class Benchmarks
Every dimension score is calibrated against class peers sharing the same body type. This prevents a pickup truck from being penalized relative to a compact hatchback with different baseline expectations. Class benchmarks are recomputed at each scan from all scored vehicles in the fleet.
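As a rough illustration of class-relative calibration, the sketch below ranks each vehicle only against peers with the same body type. The percentile definition (share of peers at or below the vehicle's value) and the record fields `id` and `body_type` are assumptions made for the example, not the pipeline's documented schema.

```python
from collections import defaultdict


def class_percentiles(vehicles: list, key: str) -> dict:
    """Return {vehicle id: percentile in (0, 1]} computed within each body type."""
    # Group the metric values by body type so classes never mix.
    by_class = defaultdict(list)
    for v in vehicles:
        by_class[v["body_type"]].append(v[key])

    result = {}
    for v in vehicles:
        peers = by_class[v["body_type"]]
        at_or_below = sum(1 for x in peers if x <= v[key])
        result[v["id"]] = at_or_below / len(peers)
    return result


fleet = [
    {"id": "pickup-a", "body_type": "pickup", "range_mi": 300},
    {"id": "pickup-b", "body_type": "pickup", "range_mi": 400},
    {"id": "hatch-c", "body_type": "hatchback", "range_mi": 250},
]
percentiles = class_percentiles(fleet, "range_mi")
```

Here the hatchback tops its own class even though its range is lower than either pickup's, which is the effect the class benchmarks are meant to produce.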
Update Cadence
Scores are recomputed monthly via an automated pipeline. NHTSA data refreshes daily. Each rescan runs the full pipeline: NHTSA ingestion → EPA specs → KBB/J.D. Power pricing → forum sentiment classification → hybrid dimension scoring → intelligence synthesis. Score history is retained — you can see if a vehicle improved or degraded over time.
Confidence Levels
Each hybrid dimension displays a confidence level: High (10+ evidence sources), Medium (5–9), Low (2–4), or Insufficient (<2). When confidence is insufficient, the AI modifier is not applied — only the deterministic formula baseline is used. New or rare vehicles default to the formula score until enough owner data accumulates.
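The thresholds above translate directly into a small lookup. The function names are illustrative; only the cut-offs come from this page.

```python
def confidence_level(n_sources: int) -> str:
    """Map an evidence-source count to the displayed confidence label."""
    if n_sources >= 10:
        return "High"
    if n_sources >= 5:
        return "Medium"
    if n_sources >= 2:
        return "Low"
    return "Insufficient"


def modifier_applies(n_sources: int) -> bool:
    """The AI modifier is skipped when confidence is Insufficient."""
    return confidence_level(n_sources) != "Insufficient"
```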
Recent Calibration Notes
2026-04-27 (v3.4) — Per-VIN complaint normalization. Build Quality and Software & Tech now normalize NHTSA complaint counts by US fleet size when known (sourced from Argonne National Lab + Tesla quarterly + InsideEVs/CleanTechnica trackers, covering 91% of the catalog). Tesla Model Y at 1,035 complaints / ~1.5M US fleet correctly scores at ~7 per 10K (healthy) instead of getting the -10 popularity penalty. Niche models with high per-VIN rates correctly get bigger penalties. The v3.3 cap remains as a fallback for vehicles without sales data.
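The Model Y figure in the v3.4 note can be reproduced with a one-line rate calculation. The sketch returns `None` when fleet size is unknown, standing in for the fallback to the v3.3 cap; how the rate is then converted into a penalty is not described here, so that step is omitted.

```python
from typing import Optional


def complaints_per_10k(complaints: int, us_fleet: Optional[int]) -> Optional[float]:
    """Complaints per 10,000 vehicles, or None when fleet size is unknown."""
    if not us_fleet:
        # No sales data: the caller falls back to the capped volume penalty.
        return None
    return complaints / us_fleet * 10_000


model_y_rate = complaints_per_10k(1_035, 1_500_000)
```

With 1,035 complaints over roughly 1.5M vehicles, the rate works out to about 6.9 per 10K, in line with the ~7 quoted above.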
2026-04-27 (v3.3) — Range & Efficiency rebased. Prior formula capped at ~70 without forum-sentiment data; sentiment was weighted as a 0-30 contributor that effectively penalized vehicles with no range-tagged excerpts. New formula uses percentile-driven baseline (range × 65 + efficiency × 40, median car ~53, top of class 100) with sentiment as a ±10 modifier requiring N≥5 excerpts.
2026-04-27 — Build Quality complaint cap. Volume-rate complaint penalty capped at -10 (was uncapped at -25). High-fleet vehicles like Tesla Model Y were being punished for sales volume, not per-VIN defect rate. Severity-driven penalties (crash-involved, safety-dimension, recall-campaign severity) carry the rest of the weight.
2026-04-27 — NCA chemistry recalibrated. NCA cells (Panasonic 2170 in 2018-2023 Teslas) raised from 50 → 75, matching NMC parity. Original 50 score had no service-data justification; both chemistries show similar real-world longevity per OEM warranty data and pack-catalog measured curves.
2026-04-27 — Owner Satisfaction confidence blend. Pure rescaled-sentiment score now blends toward 50 (neutral) when sample size is small (N<30). Catches both unfairly-low cases (vehicles with a handful of grumpy posts being pulled below median) AND unfairly-high cases (vehicles with 4 cherry-picked positive posts scoring at 70+). N≥30 owner-tagged excerpts → no blend, full pure score.
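A minimal sketch of the confidence blend, assuming the pull toward 50 is linear in sample size; the note above gives the endpoints (no blend at N≥30, neutral at 50) but not the curve, so the linear weight is an assumption.

```python
def blended_satisfaction(raw_score: float, n_excerpts: int,
                         full_n: int = 30, neutral: float = 50.0) -> float:
    """Shrink a sentiment score toward neutral when few excerpts back it."""
    # Weight is 0 with no excerpts and reaches 1 at full_n or more.
    weight = min(max(n_excerpts, 0), full_n) / full_n
    return neutral + (raw_score - neutral) * weight
```

Under this assumption a raw score of 80 from 15 excerpts blends to 65, while the same score from 30 or more excerpts stays at 80.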
2026-04-27 — Brand-bankruptcy floor (Fix E). Vehicles from manufacturers that have ceased operations (currently Fisker) get their composite hard-capped at 30. Battery Health additionally treats stated warranty terms as voided. Reflects that ownership outcome is dominated by no-warranty, no-service-network reality, not by EPA range or chemistry.
2026-04-27 — Multi-DND systemic-safety amplifier (Fix F). Build Quality applies an extra -25 penalty when a vehicle has 2+ unremediated Do-Not-Drive recalls. Uses date heuristic to avoid punishing remediated DNDs (e.g. Bolt 21V-650 closed by 2023 doesn't trigger this; Fisker 24V-186 + 24V-486 still open does).
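Fixes E and F are both simple threshold rules and can be sketched together. The `Recall` record and its `remediated` flag are hypothetical: the pipeline derives remediation status from a date heuristic that is not specified here, so the example takes the flag as given.

```python
from dataclasses import dataclass


@dataclass
class Recall:
    campaign: str
    do_not_drive: bool
    remediated: bool  # assumed input; the real value comes from a date heuristic


def multi_dnd_penalty(recalls: list) -> int:
    """Extra Build Quality penalty for 2+ unremediated Do-Not-Drive recalls."""
    open_dnd = sum(1 for r in recalls if r.do_not_drive and not r.remediated)
    return -25 if open_dnd >= 2 else 0


def apply_bankruptcy_cap(composite: float, manufacturer_defunct: bool) -> float:
    """Hard-cap the composite at 30 when the manufacturer has ceased operations."""
    return min(composite, 30.0) if manufacturer_defunct else composite


bolt = [Recall("21V-650", do_not_drive=True, remediated=True)]
fisker = [
    Recall("24V-186", do_not_drive=True, remediated=False),
    Recall("24V-486", do_not_drive=True, remediated=False),
]
```

With these inputs the Bolt recall list incurs no extra penalty, while the two open Fisker campaigns trigger the -25 amplifier.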
2026-04-27 — Software Tech severity sensitivity (Fix G). Software complaints now penalized by severity in addition to count. Catches "key fob locked owner inside vehicle" class issues that the count-rate formula previously missed (a vehicle with 2 severity-5 software complaints over 24 months would score 100 under the old formula).
2026-04-25 — Value removed from composite. Value was previously 15% of composite; the depreciation-aware formula was penalizing used EVs trading above the curve (most of the catalog). Split into composite (quality only) + Value-to-Score Ratio (1-10 deal-quality signal shown alongside).