Tips for working with approvals data

A first-time user’s guide to what’s actually in the data and how to read it.


Garage conversions. Subdivision maps. Right-of-way agreements. Accessory dwelling units added to backyards. These are some of the things the Development Services Department processes, and the work gets published in the Approvals dataset.

That breadth is what makes the dataset valuable. It is also what makes it easy to misread.

This guide walks through what the dataset contains, how its lifecycle works, and where first-time users typically get tripped up.

Note: The dataset begins in 2003. That is the earliest point for which the Development Services Department has been able to publish reliable digital records. Anything before that is out of scope.

Approvals include more than just permits

The dataset is named “Approvals” for a reason. Open the data and you will find permits, but you will also find tentative maps, parcel maps, agreements, and a long tail of other approval types. Building a single-family home requires a permit. Subdividing a lot generates a map. A street vacation requires an agreement. They all show up here, alongside each other, with their own categories and their own rules.

If building permits issued in San Diego are what you are after, you are in roughly the right place. The file just contains more than that.

When you go to filter by type, expect to filter for more than one at a time. The dataset records type names as they appear in the source system, and those names have drifted over time. The same kind of approval may show up under an abbreviation, under a former name, or under a legacy code from before a system change. Filter for the full set of labels that represent what you are studying, not just the most current name.

Processing time varies widely too. Some approval types are issued the same day with no review required. Others take years to move through a multi-step review. Averaging time-to-issue across the whole file mixes both ends of that spectrum.

Note: The first thing to do with this dataset is filter to the approval types you care about. Skipping that step is the most common mistake.
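A minimal sketch of that first filtering step in pandas. The column names here ("approval_type", "approval_id") and the label strings are assumptions for illustration; check the data dictionary for the real field names and the full set of labels in use:

```python
import pandas as pd

# Toy stand-in for the Approvals CSV. Real column names and type labels
# may differ -- the data dictionary is the authoritative source.
df = pd.DataFrame({
    "approval_id": [1, 2, 3, 4],
    "approval_type": [
        "Building Permit",
        "Bldg Permit (legacy)",   # hypothetical drifted label for the same thing
        "Parcel Map",
        "Right-of-Way Agreement",
    ],
})

# Filter for every label that represents what you are studying,
# not just the most current name.
building_labels = {"Building Permit", "Bldg Permit (legacy)"}
building = df[df["approval_type"].isin(building_labels)]
```

Building the label set usually means eyeballing the distinct values first (for example with `df["approval_type"].value_counts()`) before deciding which ones belong together.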

An approval is a record, not a verdict

The word “approval” refers only to the type of record being tracked and is not a statement that the City said yes.

A record’s status field is what tells you whether it was issued, withdrawn, expired, cancelled, or still under review. All of those are still approvals in the dataset, because “approval” here describes the kind of record, not its outcome.

This is one of the main reasons a raw row count can mislead. A count of rows is a count of approval records. A count of things that were granted is something else, and it requires filtering on status, and sometimes on milestone dates as well.

Take a common question: how many building permits have been issued? It is tempting to count rows where the status reads “Issued.” That misses most of the answer. Once a permit is issued, the lifecycle continues, and the status field moves with it. A permit that was issued and then completed reads “Completed,” not “Issued,” but it is still an issued permit. To count issued permits reliably, filter on the approval type and on whether the issuance milestone date has a value, not on the current status alone.

That same date is also what you use to narrow the count to a specific window. Issued in a particular month or year? Filter the issuance date to that period.
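The counting logic above can be sketched like this, again with hypothetical field names ("status", "date_approval_issued") standing in for whatever the data dictionary actually calls them:

```python
import pandas as pd

# Four toy permit records at different lifecycle stages.
df = pd.DataFrame({
    "approval_type": ["Building Permit"] * 4,
    "status": ["Issued", "Completed", "Created", "Withdrawn"],
    "date_approval_issued": pd.to_datetime(
        ["2023-03-01", "2022-11-15", None, None]
    ),
})

# Count issued permits by the presence of the issuance milestone date,
# not by current status: the "Completed" permit was still issued.
issued = df[df["date_approval_issued"].notna()]

# Narrow to a window by filtering that same date.
issued_2023 = issued[issued["date_approval_issued"].dt.year == 2023]
```

Filtering on `status == "Issued"` alone would find one of the two issued permits here; the milestone date finds both.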

One row is one approval, not one project

The second most common mistake is treating each row as a project.

A single development project can generate many approvals. A new building might require a building permit for the structure, separate electrical and mechanical permits, and a grading permit. Each of those is a row. Counting rows tells you how many approvals were processed, not how many projects were built.
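As a sketch of the difference, assuming a shared project identifier can be constructed (the published file has no pre-built project rollup, so "project_id" here is hypothetical and would have to come from whatever linking keys the data offers):

```python
import pandas as pd

# One project, three approvals.
df = pd.DataFrame({
    "approval_id": [101, 102, 103],
    "project_id": ["P-1", "P-1", "P-1"],  # hypothetical shared key
    "approval_type": ["Building Permit", "Electrical Permit", "Grading Permit"],
})

approvals_processed = len(df)          # counts rows: 3 approvals
projects = df["project_id"].nunique()  # counts distinct projects: 1
```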

The dataset also covers two distinct tracks of approvals: discretionary approvals (case-by-case decisions like conditional use permits, neighborhood development permits, or planned development permits) and ministerial, by-right approvals (the permits issued when a project meets code without case-by-case review). A given project follows one track or the other, and the dataset has a filter that lets you distinguish them. Picking the right track for your question matters.

The dataset gives you the building blocks. It does not pre-aggregate them for you.

Created, Issued, and Closed are three views of the same lifecycle

The data downloads page lists three views: Created, Issued, and Closed. They are not three different datasets. They are three different lenses on the same approvals, filtered by which date in the lifecycle you care about.

Created is the broadest of the three. It includes every approval that has been entered into the system, regardless of whether it was ever issued or closed.

Issued is the subset of approval records that have actually been granted. Each application produces one or more approval records, and each record moves through the lifecycle on its own. An approval record can be created and never make it to issued, because it was withdrawn or is still in review.

Closed is the subset that has been finalized in some way. Closed does not necessarily mean approved. The most common path to closed is that all required inspections have been completed and the approval has run its full course. But closed can also mean the approval was withdrawn, expired, or cancelled.

If you want to count approval records that started in a given year, the Created view is the right starting point. If you want to count approval records actually granted in that year, use Issued. If you want to study what happened to approval records that were finalized in a given year, use Closed.

Picking the wrong view is one of the easier ways to publish a number that does not mean what you think it means.

One more thing about these views. They are filters, not interpretations. Every approval record carries a status field along with a set of milestone dates: when it was created, when it was issued, when it was closed. The status read alongside those dates is the most reliable indicator of what actually happened to a given record. The view tells you which records are in the file; the status and milestone dates tell you the story of each one.

Views by year

The full Created file is over half a gigabyte. That is more than most desktop spreadsheet tools handle well. Most analysts do not need every approval going back to the beginning of the data; they need a window.

The year-filtered files exist to make that window easy to grab. Each of the three views is published as a smaller, year-by-year subset for recent years alongside the full historical file. Pick the year you care about, pick the lifecycle view that matches your question, and you have what you need.

Note: Files inside the same series can be combined. Files across series cannot, at least not without thinking carefully, because the same approval can appear in Created, Issued, and Closed for different years. Stacking a Created file and an Issued file from the same year and treating the result as “all approvals in that year” will double-count any approval that was both created and issued during the year.
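A small demonstration of the double-count, and one way to deduplicate when combining really is necessary. "approval_id" is an assumed identifier column; the real unique key is listed in the data dictionary:

```python
import pandas as pd

# Approval 7 was both created and issued during the year,
# so it appears in both files.
created = pd.DataFrame({"approval_id": [7, 8], "status": ["Created", "Created"]})
issued = pd.DataFrame({"approval_id": [7], "status": ["Issued"]})

stacked = pd.concat([created, issued], ignore_index=True)  # 3 rows: 7 counted twice

# keep="last" retains the later (Issued) row for approval 7,
# which reflects its more advanced lifecycle state.
deduped = stacked.drop_duplicates("approval_id", keep="last")
```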

Questions this data is built to answer

The dataset is structured around volume and lifecycle, so it can answer volume and lifecycle questions, such as:

  • How many approvals of a given type were created, issued, or closed in a year, a month, or a fiscal quarter
  • How that volume breaks down by Council District, community planning area, or zip code
  • How long it takes, on average, for an approval of a given type to move from created to issued
  • Whether volumes are trending up or down over multiple years
  • Where, geographically, development is concentrated
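The time-to-issue question from the list above reduces to subtracting two milestone dates. A minimal sketch, with assumed field names ("date_approval_created", "date_approval_issued"):

```python
import pandas as pd

# Two toy permits of the same type.
df = pd.DataFrame({
    "approval_type": ["Building Permit", "Building Permit"],
    "date_approval_created": pd.to_datetime(["2023-01-01", "2023-02-01"]),
    "date_approval_issued": pd.to_datetime(["2023-01-11", "2023-03-01"]),
})

# Elapsed days from created to issued, per record, then the average.
days_to_issue = (df["date_approval_issued"] - df["date_approval_created"]).dt.days
average_days = days_to_issue.mean()  # (10 + 28) / 2 = 19.0
```

In practice this only makes sense after filtering to one approval type, for the reasons covered earlier: averaging across same-day and multi-year types mixes both ends of the spectrum.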

These are the questions the dataset was designed to support. They are also the questions that most often drive policy conversations, planning analyses, and journalism.

Questions this data is not built to answer

The dataset is also a snapshot of a transactional system, which means it has hard limits on what it can tell you.

  • It cannot tell you the real-time status of any specific approval record. The published file lags the live system; anything that changed since the last refresh will not show up until the next one.
  • It cannot tell you why a decision was made. The fields describe what happened, not the reasoning behind it.
  • It cannot tell you anything that is not encoded as a field. Square footage and valuation are reported when they are required by the application; design intent and architectural detail are not.
  • It cannot give you a clean project-level rollup without external work. There is no pre-built project view in the dataset. If you want to study projects rather than approvals, that view has to be built on top of the data.
  • It cannot tell you anything about approvals processed before 2003. Long-run trend analysis that needs decades of history is not what this dataset is for.

A dataset is most useful when its limits are understood up front. These are the limits.

The pitfalls that catch first-time users

Four mistakes show up over and over in analysis built on this data.

Treating “Closed” as “Approved”

A closed approval can mean inspections were completed and the work is finished, but it can also mean the approval was withdrawn, expired, or cancelled. They are not the same thing, and conflating them inflates the count of approvals that were actually granted.

Counting approvals as projects

Already covered above. Worth repeating because it is that common.

Mixing fiscal and calendar year

The City operates on a fiscal year that runs July through June. Some published metrics use fiscal year, some use calendar year. The dataset gives you the dates; staying consistent is up to the analyst.
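Staying consistent is easier with an explicit helper. This sketch assumes the common convention of labeling a fiscal year by its ending calendar year, so FY 2024 runs July 2023 through June 2024; confirm the labeling convention against whatever published metric you are comparing to:

```python
import pandas as pd

def fiscal_year(d: pd.Timestamp) -> int:
    """July-June fiscal year, labeled by the ending calendar year (assumed convention)."""
    return d.year + 1 if d.month >= 7 else d.year

# One day apart, but they land in different fiscal years.
dates = pd.to_datetime(["2023-06-30", "2023-07-01"])
fys = [fiscal_year(d) for d in dates]  # [2023, 2024]
```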

Combining files without checking for overlap

Stacking Created, Issued, and Closed produces a file that contains the same approval more than once. The fix is to pick one view based on the question and stay there, or to deduplicate on the approval identifier when combining is genuinely necessary.

How the data gets here

The data does not appear on the portal directly. It moves through a pipeline, and knowing the pipeline helps explain what is on the screen.

An application is submitted to the Development Services Department’s permitting system, where it moves through pre-screening, review, issuance, construction, inspections, and eventual closure. State changes update the record in the system. On a regular schedule, a process pulls the latest state of every approval out of the permitting system, formats it for publication, and posts the resulting CSV files to the open data portal.

The practical implications are worth holding onto.

The published file lags the live system. To know how current it is, check the “last updated” timestamp on the dataset page. For most analytical questions, the typical lag is not a concern. For anything that needs the latest state, it is.

The fields on the portal are a curated subset of what exists in the permitting system. Some operational fields are intentionally not published.

The published file is lightly processed before it lands on the portal. Most of the processing is cosmetic, like consistent formatting and encoding, with some light normalization for readability. It is not a raw export.

Historical corrections happen. If a record was entered incorrectly and is later fixed, the corrected version appears in the next refresh. Re-running an analysis a month later can produce slightly different numbers than the original run, especially for recent records.

Where to go next

The data dictionary on the dataset page lists every field in the file with a short description. That is the next stop after this article.

After that, the most productive way to learn this dataset is to download a year-filtered file, open it in any tool that handles CSVs, filter to a single approval type, and look at how the dates, statuses, and geographies behave for that one slice. The patterns become much clearer once the noise from the other types is removed.

The Approvals dataset rewards patience. The reader who treats it as a window into the workflow of the Development Services Department will get more out of it than the reader who treats it as a list of buildings.

Skip the analysis and check out one of our dashboards

These tips should make any analysis easier, but another option is to check out existing City dashboards.

The Permitting Center Dashboard provides a comprehensive overview of the Development Services Department’s (DSD) permit data, inspections, and code enforcement.