Data reliability

Overview

Data reliability is the measure of how accurate and complete a set of data is. Reliable data produces reproducible outcomes and performs consistently across repeated projections, analyses, and testing. Data reliability is crucial to the Federal Audit Clearinghouse: the more reliable our data, the more trust we build with our Federal partners and the public.

What data do we provide?

The Federal Audit Clearinghouse (the FAC) collects Single Audit report packages, as required by the Single Audit Act. This act mandates an annual audit of all non-Federal entities, including Tribes, that spend $1,000,000 or more of Federal financial assistance (Federal grant dollars) in a fiscal year. A Single Audit report package comprises two parts: the audit report PDF and Form SF-SAC. The audit report PDF is prepared by an independent auditor and presents both an organization's financial statements and its compliance with Federal award requirements. Form SF-SAC is a set of Excel worksheets that collect specific data pulled from the report PDF. The data requested in Form SF-SAC may vary based on the reporting entity and its award requirements.

Where does our data come from?

Single Audit report packages must be uploaded to fac.gov by the recipient of the Federal grant funding (the entity) and are independently certified as accurate by both the entity and the auditor who conducted the audit before being published to our searchable database. Tribal entities may choose to partially suppress their data from public view; in that case, the full report can still be viewed by Federal employees with special permissions. We confirm the identities of these parties by requiring a Login.gov account to upload reporting packages. Login.gov is free to use and maintains the highest standards of security, including identity verification and two-factor authentication.

How do we ensure data quality and reliability?

Data quality checks are built into every stage of the FAC data pipeline, from intake to dissemination to ongoing curation and monitoring. We think about data reliability in four distinct timeframes: collection, dissemination, ongoing curation, and periodic monitoring.

Our Collection

To ensure the data we collect is reliable and consistent, we’ve built multiple levels of validation into the FAC intake system. For more information on how we collect data, visit our audit resources page. Here are the main kinds of validation a user will encounter when submitting an audit report:

  1. Web form validations. These validations check that each of the web-based form fields on fac.gov contain the expected inputs. For example, a form validation might ensure phone numbers and dates are entered in the correct format or that required fields are filled out.
  2. Built-in workbook validations. Because we collect much of our SF-SAC data via Excel workbooks, we can take advantage of the built-in data validation features commonly found in spreadsheet software. These validations are designed to help users understand our requirements and guide them toward entering appropriate information. Spreadsheet software imposes some limitations, so we can’t rely on these validations alone, but they remain a useful tool.
  3. Upload validations. When someone uploads a PDF or workbook file to the FAC, we check the file for security issues, completeness, and validity. If the file fails any of these steps, we provide an error message and next steps, but do not allow the file’s contents to proceed. If the file is a workbook, we run it through a more complex validation and either add it to the corresponding audit report or return a list of validation problems that must be resolved.
    • If you are interested in learning more about these more complex validations, additional documentation is available in our open source GitHub repository.
  4. Cross validation. Some of our validation checks span multiple workbooks or forms and can only run after the user has provided a complete set of data. For example, the auditee’s UEI must be the same in every workbook, and the total number of findings reported in the federal awards workbook must match the number of entries in the audit findings workbook. These cross-validation checks must pass before an audit report can move into the certification and submission phase (a simplified sketch of this kind of check appears after this list).
  5. Eligibility checks. This category covers miscellaneous checks that confirm an entity is eligible to submit an audit report, such as validating the UEI and verifying that the entity spent at least $750,000 during its audit period (the threshold for audit periods before the increase to $1,000,000 noted above).
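
To make these layers concrete, here is a minimal sketch, in Python, of a web-form-style field check (item 1) and a cross-validation check (item 4). This is not the FAC’s actual code; every field, workbook, and function name below is illustrative.

```python
import re

# Illustrative field-format check, similar in spirit to the FAC's
# web form validations. Names and rules here are hypothetical.
PHONE_RE = re.compile(r"^\d{10}$")

def validate_phone(value: str) -> list[str]:
    """Return a list of error messages; an empty list means the field passes."""
    if not PHONE_RE.match(value):
        return ["Phone number must be exactly ten digits with no punctuation."]
    return []

def cross_validate(workbooks: dict[str, dict]) -> list[str]:
    """Cross-validation over a complete set of workbooks (illustrative)."""
    errors = []

    # The auditee's UEI must be identical in every workbook.
    ueis = {name: wb["auditee_uei"] for name, wb in workbooks.items()}
    if len(set(ueis.values())) > 1:
        errors.append(f"UEI mismatch across workbooks: {ueis}")

    # The findings total reported in the federal awards workbook must
    # match the number of entries in the audit findings workbook.
    reported = workbooks["federal_awards"]["total_findings_count"]
    actual = len(workbooks["audit_findings"]["entries"])
    if reported != actual:
        errors.append(
            f"Federal awards workbook reports {reported} findings, but the "
            f"audit findings workbook contains {actual} entries."
        )
    return errors
```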

Dissemination

Before an audit can become searchable (via the FAC’s search tool or API), we have to transform it from a collection of forms and workbooks into a shape that fits our data distribution model. This final step shouldn’t fail as long as the previous validation steps worked as expected, but there is always a chance of things going wrong. Rather than allow inconsistent audits to be disseminated, the FAC is built to reject audits that don’t meet this final validation step, alerting the FAC team to investigate the issue and re-trigger dissemination once it has been resolved.
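
As a rough sketch of this fail-closed pattern, not the FAC’s actual implementation (all names below are stand-ins), dissemination publishes only when the final validation step returns no errors:

```python
import logging

logger = logging.getLogger("fac.dissemination")

class DisseminationError(Exception):
    """Raised when a completed audit fails final validation."""

def final_validation(audit: dict) -> list[str]:
    """Stand-in for the final pre-dissemination checks."""
    return [] if audit.get("general_information") else ["Missing general information."]

def transform(audit: dict) -> dict:
    """Stand-in: reshape forms and workbooks into the distribution model."""
    return {"report_id": audit["report_id"]}

def publish(record: dict) -> None:
    """Stand-in: write the record to the searchable database."""
    print(f"published {record['report_id']}")

def disseminate(audit: dict) -> None:
    """Publish an audit only if final validation passes; otherwise fail closed."""
    errors = final_validation(audit)
    if errors:
        # Never publish an inconsistent audit: alert the team to investigate,
        # then re-trigger dissemination once the issue is resolved.
        logger.error("Audit %s rejected: %s", audit.get("report_id"), errors)
        raise DisseminationError(errors)
    publish(transform(audit))
```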

Curation and auditable edit checks

While we strongly prefer validating data as it’s submitted, there are a few points in the pipeline where manual intervention may be required. To ensure a clear and auditable history for each record, each manual intervention automatically generates metadata and attaches it to the audit report in question. Below are the manual interventions currently supported by the FAC, each of which generates its own record including, at minimum, a timestamp and the user who initiated the action.

Not all of the above curation records and edit checks are publicly available. Some records are for internal purposes and are unrelated to the audits themselves. For the rest, we plan to design and build a publication pipeline in the coming months.
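
For illustration, here is a minimal sketch of what one of these auditable curation records might contain, modeled as a simple dataclass; the schema, action name, and identifiers are hypothetical, not the FAC’s actual data model:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class CurationRecord:
    """Metadata generated whenever a manual intervention occurs.

    Hypothetical schema: the real records capture, at minimum,
    a timestamp and the user who initiated the action.
    """
    report_id: str      # the audit report being modified
    action: str         # the intervention performed (name is illustrative)
    initiated_by: str   # the user who initiated the action
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

# Example: recording a hypothetical intervention on a report.
record = CurationRecord(
    report_id="2023-06-EXAMPLE-0000000001",
    action="unlock_for_correction",
    initiated_by="analyst@example.gov",
)
```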

Monitoring

The FAC has implemented a still-evolving schedule (we’re always improving!) for ongoing manual and automated analysis of our data collection. Our testing is designed to monitor these five essential quality metrics:

Test frequency varies based on the individual metric and procedure, with our most frequent tests running automatically once a week.
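
As a hypothetical illustration, not the FAC’s actual test suite, an automated check on such a schedule might compare intake records against the disseminated tables to catch completeness gaps:

```python
def check_dissemination_completeness(
    accepted_ids: set[str], disseminated_ids: set[str]
) -> list[str]:
    """Flag accepted audits that never reached the public tables.

    Hypothetical completeness check: the two ID sets would come from
    the intake and dissemination data stores, respectively.
    """
    missing = accepted_ids - disseminated_ids
    if missing:
        sample = ", ".join(sorted(missing)[:5])
        return [f"{len(missing)} accepted audit(s) not disseminated, e.g. {sample}"]
    return []
```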

Continuous Improvement

We’re always working to improve the quality of our data. We ticket known issues and bugs in our product backlog on GitHub as they are discovered; because these tickets are updated often, the backlog is the best way to review our current issues. Once you’ve opened the backlog, select a Roadmap Item from the left-hand column (for example, Data Quality) to view all existing tickets associated with that topic. Click the title of an individual ticket for more detail and to see our remediation plans. To learn what work we have planned next, view the fac.gov product roadmap at the top of our Updates page.

Working with data from the FAC

We welcome your questions and feedback via our Helpdesk. The FAC help desk is staffed by our own implementation team, so rest assured that this is a reliable and direct way to contact us. We're currently designing more comprehensive instructional materials for working with our data; we appreciate your patience as we determine the best path forward.
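
In the meantime, here is a minimal example of pulling data from the FAC API. This sketch assumes the API’s documented PostgREST-style conventions (an X-Api-Key header and eq. filters); confirm the endpoint, filter, and field names against the current API documentation on fac.gov before relying on them.

```python
import requests

# The FAC API expects an API key in the X-Api-Key header; the value
# below is a placeholder. See fac.gov for how to obtain a key.
API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.fac.gov"

# Fetch a small sample of general audit records for one audit year.
# PostgREST-style filtering is assumed (e.g. "eq." for equality).
response = requests.get(
    f"{BASE_URL}/general",
    params={"audit_year": "eq.2023", "limit": 10},
    headers={"X-Api-Key": API_KEY},
    timeout=30,
)
response.raise_for_status()
for record in response.json():
    print(record.get("report_id"), record.get("auditee_name"))
```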