Why State Status Vocabularies Cannot Be Unified Into a Clean Binary
The surface-level problem is that 50 states produce status strings independently, each optimized for its own regulatory framework. The deeper problem is that the underlying operational states themselves are not uniform, so a clean binary mapping loses information that actually matters for credit decisions.
The three classes of status variance
• Label variance: Same operational state, different string. "Active" (most states), "Good Standing" (several), "Current" (a few), "In Good Standing" (Delaware, a few others). Easy to normalize mechanically.
• Reason variance: Same directional state (e.g., "not active"), different reasons. California's FTB Suspended (tax non-payment) versus SOS Suspended (failure to file) versus Dissolved (terminated) carry different risk profiles and different remediation paths. Normalizing all three to "Inactive" loses signal.
• Scope variance: Some states expose granular subtypes that other states do not publish. Texas alone recognizes roughly 18 distinct entity statuses. A normalization schema designed around the 4-status minimum (e.g., Florida, Nebraska, Minnesota) forces Texas subtypes into generic buckets.
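Label variance is the mechanical case. A minimal sketch, assuming a hypothetical synonym set built only from the strings quoted above (the name `ACTIVE_LABELS` and the helper are illustrative, not any registry's or vendor's API):

```python
# Hypothetical synonym set taken from the label-variance examples above.
ACTIVE_LABELS = {"active", "good standing", "current", "in good standing"}


def is_active_label(raw_status: str) -> bool:
    """Return True when the raw string is one of the known 'active' labels."""
    # Collapse case and repeated whitespace before comparing.
    cleaned = " ".join(raw_status.split()).lower()
    # Exact match, not substring match, so "Not In Good Standing" is not misread.
    return cleaned in ACTIVE_LABELS
```

This handles only label variance. Reason variance and scope variance need the extra fields discussed later, because the information is not in the label at all.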
What a binary normalization loses
A verification response that returns only "active" or "inactive" cannot distinguish a Delaware LLC in Good Standing from a California LLC that is FTB Suspended but operating. Under a naive normalization that treats any string other than "Active" as inactive, both might come back as "inactive." The underwriting decision for each is different. The regulatory remediation path is different. The audit artifact should reflect the distinction.
The cross-state query problem
The commercial appeal of normalization is cross-state queryability. A risk team running a portfolio sweep for "all inactive entities" wants one flag, not 50 state-specific status lists. That is a real operational need. The resolution is to expose both: a normalized status for cross-state querying, and the raw state-specific status and reason code for state-specific underwriting logic.
How the Top Volume States Handle Status Strings
Here is the operational shorthand for the states that represent the majority of U.S. formation volume.
California: the FTB versus SOS distinction
California returns multiple statuses depending on entity type and suspension reason: Active, Suspended - SOS/FTB, Suspended - FTB, Suspended - SOS, Forfeited - SOS/FTB, Dissolved, and several others. The FTB (Franchise Tax Board) and SOS (Secretary of State) suspensions have different causes and different remediation paths. FTB Suspended means the entity has fallen behind on state franchise tax. SOS Suspended means the entity has failed to file required statements (typically the Statement of Information). Forfeited is a more severe status that follows prolonged non-compliance. Dissolved is terminal.
Texas: the 18-subtype taxonomy
Texas recognizes roughly 18 distinct entity statuses including In Existence, Forfeited Existence, Voluntarily Dissolved, Involuntarily Dissolved, Terminated, Expired, and several others. Series LLCs, Professional Corporations, and foreign registrations each carry state-specific status variants. Normalizing Texas into 4 buckets drops information that matters for underwriting on professional services, multi-state operators, and franchise operations.
Delaware: the Good Standing language
Delaware uses "In Good Standing" as its active equivalent. The operational distinction: an entity in Delaware is either in good standing (tax current, filings current) or it is not. Delaware does not expose the same breadth of intermediate statuses that California and Texas do, partly because Delaware's franchise tax and annual reporting regime is more streamlined. The Delaware-specific gotcha is that confirming the good-standing flag requires a paid status request.
Florida: the 4-status simplicity
Florida keeps it simple: Active, Inactive, Withdrawn, or Dissolved. Florida's formation volume (665,668 in 2025, the largest in absolute terms) runs on a relatively clean status taxonomy.[1] The normalization boundary is easy for Florida; it is harder for Texas and California, which together dominate overall schema complexity.
New York: varies by entity type
New York returns different status vocabularies depending on whether the entity is a Domestic Business Corporation, Foreign Business Corporation, LLC, LP, or LLP. "Active," "Inactive," "Dissolved," and several variants appear across entity types, with the Department of State tracking and the Tax Department tracking not always synchronized.
A Normalization Schema That Preserves Information
The pattern used by several production KYB platforms, and the one recommended here, is a two-layer schema: normalized status for cross-state comparison, raw status for state-specific logic.
The normalized status set
A defensible normalized status set covers roughly 7 values:
• Active: Operating, in compliance with state filing requirements, not in any suspended or dissolved state.
• Inactive: Not currently in good standing. Not yet terminal. Typically recoverable through remediation.
• Suspended: Explicitly suspended by state action (tax, filing, or regulatory). May be recoverable.
• Dissolved: Entity has been terminated, most often voluntarily by the entity itself. Recoverable in limited circumstances (reinstatement windows vary by state).
• Forfeited: Involuntarily terminated by state action, typically for prolonged non-compliance. Recoverable in some states with additional steps.
• Withdrawn: Foreign entity has ended its registration in the state. The underlying entity may still exist in its home state.
• Unknown: Status could not be retrieved or does not map to any of the above.
Seven values balance the cross-state querying benefit (enough granularity to drive risk rules) with the schema-consistency benefit (not exploding into 50-state-specific variants).
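A fixed vocabulary is easiest to enforce when it lives in a type rather than in free-form strings. A minimal sketch of the seven values as an enum; the class name and the `parse_normalized` helper are illustrative assumptions:

```python
from enum import Enum


class NormalizedStatus(str, Enum):
    """The seven-value cross-state vocabulary listed above."""

    ACTIVE = "Active"
    INACTIVE = "Inactive"
    SUSPENDED = "Suspended"
    DISSOLVED = "Dissolved"
    FORFEITED = "Forfeited"
    WITHDRAWN = "Withdrawn"
    UNKNOWN = "Unknown"


def parse_normalized(value: str) -> NormalizedStatus:
    """Coerce a string into the fixed vocabulary, falling back to UNKNOWN."""
    try:
        return NormalizedStatus(value.strip().title())
    except ValueError:
        return NormalizedStatus.UNKNOWN
```

Falling back to `UNKNOWN` instead of raising keeps an unexpected value visible in the data rather than dropping the record.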
The raw status preservation
Alongside the normalized status, the schema should include the raw status string as returned by the state registry plus, where available, the reason code or sub-status. The raw field is state-specific; the normalized field is cross-state. Consumers can query the normalized field for portfolio-level analysis and drop to the raw field for state-specific underwriting decisions.
The schema example
```json
{
  "entity_id": "abc123",
  "state": "CA",
  "normalized_status": "Suspended",
  "raw_status": "Suspended - FTB",
  "status_reason_code": "FTB",
  "status_reason_description": "Franchise Tax Board suspension for non-payment",
  "status_as_of": "2026-04-22T14:32:11Z",
  "status_source": "live_registry_query"
}
```
The `normalized_status` drives cross-state queries. The `raw_status` and `status_reason_code` drive California-specific rules. The `status_as_of` and `status_source` drive audit defense.
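To make the two-layer usage concrete, here is a rough sketch of a portfolio sweep that filters on the normalized field and keeps the raw string for follow-up. The records are made-up illustrations shaped like the schema example, and `sweep` is a hypothetical helper:

```python
from collections import defaultdict

# Illustrative records shaped like the schema example; the values are made up.
records = [
    {"entity_id": "abc123", "state": "CA",
     "normalized_status": "Suspended", "raw_status": "Suspended - FTB"},
    {"entity_id": "def456", "state": "TX",
     "normalized_status": "Forfeited", "raw_status": "Forfeited Existence"},
    {"entity_id": "ghi789", "state": "FL",
     "normalized_status": "Active", "raw_status": "Active"},
]


def sweep(records, statuses):
    """Group records whose normalized status is in `statuses` by state."""
    by_state = defaultdict(list)
    for rec in records:
        if rec["normalized_status"] in statuses:
            # Keep the raw string so state-specific rules can run afterwards.
            by_state[rec["state"]].append((rec["entity_id"], rec["raw_status"]))
    return dict(by_state)


flagged = sweep(records, {"Suspended", "Forfeited"})
```

The cross-state filter never inspects `raw_status`; it only carries it along for the state-specific step.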
Normalization Mapping: State-by-State Examples
Here is the mapping for a handful of representative states.
| State | Raw Status | Normalized Status | Risk Implications |
|---|---|---|---|
| CA | Active | Active | Standard underwriting |
| CA | Suspended - SOS/FTB | Suspended | Investigate remediation path; decline default |
| CA | Suspended - FTB | Suspended | Tax non-payment; remediation possible with back-tax payment |
| CA | Suspended - SOS | Suspended | Missed filing; remediation possible by submitting the required filing |
| CA | Forfeited - SOS/FTB | Forfeited | Prolonged non-compliance; difficult remediation |
| CA | Dissolved | Dissolved | Terminal |
| TX | In Existence | Active | Standard underwriting |
| TX | Forfeited Existence | Forfeited | Involuntary; check for reinstatement option |
| TX | Voluntarily Dissolved | Dissolved | Terminal |
| TX | Involuntarily Dissolved | Dissolved | Terminal; investigate cause |
| DE | In Good Standing | Active | Standard underwriting (paid query required to confirm) |
| DE | Not In Good Standing | Inactive | Tax or filing gap; remediation typically possible |
| FL | Active | Active | Standard underwriting |
| FL | Inactive | Inactive | Investigate reason |
| FL | Dissolved | Dissolved | Terminal |
| NY | Active | Active | Standard underwriting |
| NY | Inactive | Inactive | Multiple possible causes; check raw status reason |
The table is representative, not exhaustive. The pattern: each state maps its raw status strings into the 7-value normalized set via a state-specific mapping table, and the raw status is preserved alongside the normalized status.
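The state-specific mapping table can be sketched as a lookup keyed on state and raw string. The entries below are transcribed from a few rows of the table; the names `STATUS_MAP` and `normalize` are assumptions for illustration:

```python
# (state, raw status) -> normalized status, transcribed from rows of the table.
STATUS_MAP = {
    ("CA", "Active"): "Active",
    ("CA", "Suspended - FTB"): "Suspended",
    ("CA", "Forfeited - SOS/FTB"): "Forfeited",
    ("TX", "In Existence"): "Active",
    ("TX", "Forfeited Existence"): "Forfeited",
    ("DE", "In Good Standing"): "Active",
    ("DE", "Not In Good Standing"): "Inactive",
    ("FL", "Inactive"): "Inactive",
}


def normalize(state: str, raw_status: str) -> dict:
    """Map a raw registry string to the normalized set, preserving the raw value."""
    key = (state.upper(), raw_status.strip())
    return {
        "state": state.upper(),
        # Unmapped strings become "Unknown" instead of being guessed at.
        "normalized_status": STATUS_MAP.get(key, "Unknown"),
        "raw_status": raw_status,
    }
```

Keying on the pair rather than the string alone matters because the same label can carry different meaning in different states.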
Where State Normalization Gets Hard
The mapping table above looks clean. The edge cases are not.
Statuses with multiple reasons
California's "Suspended - SOS/FTB" means suspended by both SOS and FTB, which is operationally worse than suspended by either alone. The mapping has to expose both reason codes; a single reason field loses information.
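One way to expose both reason codes is to store them as a list. A minimal sketch, assuming the " - " and "/" delimiters seen in the California strings quoted in this article; other states would need their own parsing, and `split_reasons` is a hypothetical helper:

```python
def split_reasons(raw_status: str) -> tuple[str, list[str]]:
    """Split 'Suspended - SOS/FTB' into ('Suspended', ['SOS', 'FTB'])."""
    base, sep, tail = raw_status.partition(" - ")
    if not sep:
        # No reason suffix present, e.g. "Dissolved".
        return base.strip(), []
    reasons = [code.strip() for code in tail.split("/") if code.strip()]
    return base.strip(), reasons
```

A rule can then treat two reason codes as worse than one without re-parsing the raw string each time.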
Status changes that do not propagate uniformly across state agencies
New York's Department of State and Department of Taxation and Finance do not always surface the same status at the same time. An entity can appear as Active in one and Suspended in the other. The normalization layer has to decide which source of truth to use (typically DOS for formation and status, Tax for compliance), and the policy should document the choice.
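The documented choice can live in code next to the resolver that applies it. A rough sketch, where the policy table, the agency codes `DOS` and `TAX`, and the output field names are all assumptions for illustration:

```python
# Hypothetical policy table: which agency is authoritative for which question.
SOURCE_OF_TRUTH = {"NY": {"status": "DOS", "compliance": "TAX"}}


def resolve_status(state: str, views: dict) -> dict:
    """Pick the authoritative raw status and record every agency's view.

    `views` maps an agency code to the raw status string it returned.
    """
    primary = SOURCE_OF_TRUTH[state]["status"]
    return {
        "raw_status": views[primary],
        "status_source_agency": primary,
        # Keep all views so the audit trail shows what was overridden.
        "agency_views": dict(views),
        "agencies_disagree": len(set(views.values())) > 1,
    }
```

Recording the disagreement flag alongside the chosen value lets a reviewer see that a conflict existed, not just how it was resolved.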
State-specific subtypes that do not map cleanly
Texas's Series LLC, Professional Corporation, and various foreign registrations each carry status variants that do not fit the 7-value normalized set perfectly. The typical fix is to use the normalized value as the primary field and expose the state-specific entity type (`entity_type`) and status qualifier (`status_qualifier`) in secondary fields.
Historical status transitions
A business that was previously Suspended and is now Active carries different risk than one that has been continuously Active. The status history matters. The schema should capture status transitions with timestamps, not just the current status.
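A status history can be as simple as an append-only list of observations. A minimal sketch, with hypothetical names (`StatusTransition`, `record_observation`, `ever_in`) and made-up dates:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StatusTransition:
    normalized_status: str
    raw_status: str
    observed_at: datetime


def record_observation(history: list, observation: StatusTransition) -> list:
    """Append the observation only when the raw status actually changed."""
    if not history or history[-1].raw_status != observation.raw_status:
        history.append(observation)
    return history


def ever_in(history: list, normalized_status: str) -> bool:
    """True if the entity was ever observed in the given normalized status."""
    return any(t.normalized_status == normalized_status for t in history)


# Illustrative timeline: suspended, re-checked with no change, then active.
history: list = []
for status, raw, day in [
    ("Suspended", "Suspended - FTB", 10),
    ("Suspended", "Suspended - FTB", 20),
    ("Active", "Active", 30),
]:
    obs = StatusTransition(status, raw, datetime(2025, 1, day, tzinfo=timezone.utc))
    record_observation(history, obs)
```

With this shape, a rule can ask whether a currently Active entity was ever Suspended without storing every repeated observation.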
Handling Normalization in the Verification Waterfall
The status normalization layer sits between the vendor response and the risk-rule evaluation. It is a distinct stage with its own failure modes.
Where normalization can drift
If the vendor's normalization logic evolves (adding a new status code, collapsing previously separate ones), your downstream risk rules may make incorrect decisions on entities whose normalized status changed as a result. The same risk applies if your own normalization layer evolves without coordination with the rules team.
Audit trail requirements
For audit defense, the record should capture both the normalized status and the raw state response at the moment of the decision. If the examiner asks why an entity in California showed as "Inactive" when the state website today shows "Active," the answer is in the timestamp: the verification captured the state's view at that moment, and the state has since changed it.
The vendor-reliance question
Some shops rely on the vendor's normalization; others implement their own layer on top of the vendor's raw response. The trade-off: vendor normalization is cheaper to integrate but locks you into the vendor's choices. Self-normalization is more work but preserves optionality. The defensible pattern is to request raw status from the vendor API and apply your own normalization mapping, with the vendor's normalization as a reference rather than a dependency.
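Treating the vendor's value as a reference can be sketched as computing your own normalized status and recording where the two disagree. The function and field names below are assumptions, and the mapping is a stand-in for whatever table you maintain:

```python
def normalize_with_reference(state, raw_status, vendor_normalized, mapping):
    """Apply an in-house mapping and keep the vendor's value for comparison."""
    own = mapping.get((state, raw_status), "Unknown")
    return {
        "normalized_status": own,  # the value risk rules consume
        "raw_status": raw_status,
        "vendor_normalized_status": vendor_normalized,  # reference only
        "disagrees_with_vendor": own != vendor_normalized,
    }


# Illustrative mapping entry and a vendor response that collapses to "Inactive".
example = normalize_with_reference(
    "CA", "Suspended - FTB", "Inactive", {("CA", "Suspended - FTB"): "Suspended"}
)
```

Disagreements are worth reviewing rather than discarding: they show where the vendor's choices and yours have diverged.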
The Normalization Data-Model Checklist
Before the next schema review, run this checklist against your KYB data model.
• Normalized status as a primary field with a fixed vocabulary (7 values recommended).
• Raw status string preserved alongside, state-specific.
• Reason code where the state provides one (FTB, SOS, tax, filing).
• Reason description for human-readable audit trail.
• Status-as-of timestamp capturing the moment of verification.
• Status source (live registry, cached vendor response, manual entry).
• Entity type (LLC, Corp, LP, Series LLC, etc.) preserved for state-specific rules.
• Status qualifier for state-specific subtype information.
• Status history with transition timestamps for perpetual KYB.
• Source-of-truth documentation where multiple state agencies return different statuses (NY DOS versus Tax).
The schema is more verbose than the naive binary. It is also the schema that survives a compliance review and a state-mix variance audit.
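The checklist can be turned into a quick structural check on a record. A rough sketch: the first six field names come from the schema example, `entity_type` and `status_qualifier` from the Texas discussion, and `status_history` is a hypothetical name for the transition log:

```python
REQUIRED_FIELDS = (
    "normalized_status",
    "raw_status",
    "status_reason_code",
    "status_reason_description",
    "status_as_of",
    "status_source",
    "entity_type",
    "status_qualifier",
    "status_history",  # hypothetical field name for the transition log
)


def missing_fields(record: dict) -> list:
    """Return checklist fields absent from the record.

    Only key presence is checked: a reason code may legitimately be None
    when the state does not provide one.
    """
    return [name for name in REQUIRED_FIELDS if name not in record]
```

Run against the schema example earlier in this article, the check would report the three fields that example omits.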
Examples of Normalization Drift and How to Catch It
Three patterns cause normalization drift in production:
The vendor-upgrade surprise
A vendor adds a new status code ("Administratively Dissolved" appears alongside existing "Involuntarily Dissolved"). Your normalization layer does not have a mapping. The status returns as Unknown. Risk rules treat Unknown as a decline. Good entities get declined silently until the mapping is updated.
Catch it: subscribe to vendor release notes, review any schema changes, add new mappings before they appear in production traffic.
The state-policy-shift surprise
A state changes its status vocabulary. Delaware rebrands "In Good Standing" to "Current Good Standing" (hypothetical). The aggregator passes the new string through. Your normalization mapping does not recognize it. Same outcome.
Catch it: monitor the distribution of raw status strings. A new string appearing in your traffic is a signal to investigate.
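Monitoring the distribution can start with counting raw strings and flagging any that the mapping has never seen. A minimal sketch, with a hypothetical helper name and made-up traffic:

```python
from collections import Counter


def unseen_statuses(observed, known):
    """Count raw status strings in traffic that are not in the known set."""
    counts = Counter(observed)
    return {status: n for status, n in counts.items() if status not in known}


# Illustrative traffic sample containing the hypothetical rebranded string.
known = {"In Good Standing", "Not In Good Standing"}
traffic = ["In Good Standing", "Current Good Standing", "Current Good Standing"]
new_strings = unseen_statuses(traffic, known)
```

Any non-empty result is the signal to investigate before the string starts mapping to Unknown at volume.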
The cache-transition gap
An entity that was Active in the cache shows up as Inactive on live-ping because of a recent status change. The cache normalization ran correctly against the old status; the live normalization runs against the new one. The risk rules see two different normalized values for the same entity within a short window. Without audit logging of both, the decision path is opaque.
Catch it: log both the cached and live normalized statuses when the waterfall escalates. The divergence is the signal.
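Logging both values can be sketched with the standard `logging` module. The logger name, function name, and event fields are assumptions; the inputs are dicts shaped like the schema example:

```python
import logging

logger = logging.getLogger("kyb.waterfall")  # hypothetical logger name


def compare_cache_and_live(entity_id: str, cached: dict, live: dict) -> dict:
    """Build an audit event holding both statuses and flag any divergence."""
    event = {
        "entity_id": entity_id,
        "cached_normalized_status": cached["normalized_status"],
        "cached_status_as_of": cached["status_as_of"],
        "live_normalized_status": live["normalized_status"],
        "live_status_as_of": live["status_as_of"],
        "diverged": cached["normalized_status"] != live["normalized_status"],
    }
    if event["diverged"]:
        # The divergence itself is the signal worth surfacing.
        logger.warning("status divergence: %s", event)
    return event
```

Returning the event as well as logging it lets the caller attach it to the decision record.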











