Data Verification, Reporting and Validation
The goal of any analytical activity is to obtain the best data for the needs of the project. The effort involves steps taken by the laboratory, as well as those taken by the project managers and data users. The laboratory must document and report the results of all analyses to ensure the correct evaluation and interpretation of the data by the end user. Data verification, data reporting and data validation are the steps in the process.
Data Verification
The first step is for the laboratory to verify the results to ensure that they represent a true picture of the analytical process. This verification step may involve the analytical laboratory supervisor or a qualified individual at the laboratory who is independent of the analytical process. The laboratory supervisor should review all laboratory results and calculations before submitting the data to the program manager. Any errors identified during this review should be referred back to the analyst for correction before the data package is submitted.
After the errors are corrected, the laboratory supervisor should verify that the final package is complete and compliant with the project requirements and sign each data submission to certify that the package was reviewed and determined to be compliant. The laboratory also should supply a data package narrative that discusses any issues that occurred from sample receipt through the final analysis of the samples, including any quality control failures and associated corrective actions.
Data Reporting
The second step is the actual reporting of the sample results. The format of the data to be delivered must be specified in the Quality Assurance Project Plan (QAPP) or in the contract or other agreement with the laboratory. Data are commonly provided as electronic data deliverables in a spreadsheet format (e.g., a *.csv file or other format compatible with common spreadsheet programs such as Microsoft Excel) and in a PDF file with copies of all supporting raw data.
Some established fish monitoring programs may have developed reporting forms that laboratories may use for reporting. However, project managers need to understand that such requests involve costs for the laboratory to customize their existing reporting procedures to those of the project and those costs will be passed along to the project in the per-sample analysis price. The need for customized reporting forms also may limit the interest of new laboratories in bidding on projects for organizations that must solicit competitive bids.
Given the availability of data in spreadsheet-compatible formats, it may be easier at both ends of the project to work with electronic data and forego data reporting forms and the associated data transcription process altogether. It may be more practical and less costly to specify the information content that the project requires rather than a rigid reporting format. That way, a laboratory can provide the needed information without significant programming costs to produce a project-specific format. The laboratory should be able to provide one or more examples of their existing electronic data deliverable formats to the project manager for review before delivering the actual project analytical results. However, if the project manager determines that important information is missing from the laboratory’s existing format, the project manager can work with the laboratory to add one or more fields to the electronic data deliverable format to resolve the issue.
Although the exact data elements depend on the type of analysis and the analytical method, for fish tissue analyses the typical information content for a given sample may include:
- Client Site ID (if applicable)
- Client Sample ID
- Lab ID (used to associate the final results with the raw data)
- Sample matrix (e.g., tissue)
- Receipt date
- Batch ID
- Preparation date
- Analysis date
- Analyte name or abbreviation
- CAS number (where available)
- Concentration found
- Units (e.g., ng/g)
- Method detection limit
- Quantitation limit
- Lab qualifier flags
- Sample size (e.g., mass for tissue samples)
- Sample size units
- Dilution factor (if any)
- Method ID
- Surrogate or labeled compound recoveries (where applicable)
For QC sample results, the file also must include:
- QC sample type
- QC sample ID
- Spike amount (if applicable)
- Spiked compound recovery
- Relative percent difference (RPD) for laboratory duplicate samples
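As an illustration of the last QC element above, the relative percent difference for a pair of laboratory duplicate results is the absolute difference divided by the mean of the pair, expressed as a percentage. A minimal sketch (the example values are illustrative, not project data):

```python
def relative_percent_difference(result_1: float, result_2: float) -> float:
    """Return the RPD (%) for a pair of laboratory duplicate results."""
    mean = (result_1 + result_2) / 2.0
    if mean == 0:
        raise ValueError("Mean of duplicate results is zero; RPD is undefined.")
    return abs(result_1 - result_2) / mean * 100.0

# Example: duplicate mercury results of 120 and 130 ng/g
rpd = relative_percent_difference(120.0, 130.0)
print(f"RPD = {rpd:.1f}%")  # RPD = 8.0%
```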
The exact field names in the electronic data deliverable usually can be left to the laboratory’s discretion (e.g., “Receipt date” vs “Sample receipt date”). Likewise, the exact order in which the data elements are presented may not be critical as long as the information content meets the needs of the project. The laboratory also should include any additional information that it may find necessary, provided that all the field names are clearly defined.
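Because field names may vary from one laboratory to another, a project can normalize an incoming deliverable to its own canonical names before further processing. The sketch below assumes a hypothetical alias table; the field names shown are only examples:

```python
import csv
import io

# Hypothetical mapping from laboratory field names to the project's
# canonical names; extend this table as new laboratories are added.
FIELD_ALIASES = {
    "Sample receipt date": "Receipt date",
    "Receipt date": "Receipt date",
    "Laboratory sample ID": "Lab ID",
    "Lab ID": "Lab ID",
}

def normalize_header(fieldnames):
    """Map laboratory field names onto canonical project names."""
    return [FIELD_ALIASES.get(name, name) for name in fieldnames]

# Example deliverable with a laboratory-specific header row
edd = io.StringIO("Sample receipt date,Laboratory sample ID\n2024-05-01,L-0042\n")
reader = csv.reader(edd)
header = normalize_header(next(reader))
print(header)  # ['Receipt date', 'Lab ID']
```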
The specific contents of the PDF of the data package also will vary with the type of analysis and the analytical method, but should include:
- Analysis dates and times
- Analysis sequence or run logs
- Sample dilution (if required)
- Final dilution prior to injection
- Data from diluted sample analyses, if required
- Instrument calibration results
- Chromatograms, ion current profiles, bar graph spectra, library search results, or direct instrument readouts (e.g., strip charts, printer tapes) applicable to the instrumentation involved in the analysis
- Quantitation reports, including results for quantitation ions and confirmation ions for mass spectrometry methods that use those as identification criteria for the target analytes
- Data system outputs and other data to link the raw data to the summary results reported
- Laboratory bench sheets and copies of all pertinent logbook pages for all sample preparation and cleanup steps, and for all other parts of the analytical determination
Data Validation
The third step in the process is data validation or data review. It should be conducted by project personnel (not the laboratory) or a third party skilled in data review procedures. Some projects will require more effort than others, and as with data reporting, project managers must balance the level of effort needed for data validation against the goals of the project and the risk of making the wrong decision with the data at hand.
The purpose of the data validation is to evaluate the actual results against the agreed-upon data quality specifications (e.g., detection and quantitation limits, precision, accuracy) and other performance criteria established in the Quality Assurance Project Plan. At the minimum, the validation effort must determine if the data are complete. Were all of the samples analyzed for all of the parameters of interest? Did the laboratory provide all of the data required, whatever the format and extent of reporting was agreed to? If not, then the laboratory should be contacted to resolve any completeness errors as soon as practical.
Once they have determined that the data are complete, the data validators focus on evaluating the field sample results and the QC sample results against the technical specifications in the methods and the QAPP. The goal is to have data of known and documented quality.
It may be necessary to qualify reported data values. Any qualifiers applied by the laboratory are a starting point in the data qualification process, but the data reviewer needs to go one step further and answer the “So what?” question. For example, the laboratory may flag some field sample results to indicate that the analyte was found in the associated method blank prepared alongside the field samples. The next step is for the data validator to compare the blank result against the field sample result and assess the risk to the project. Was the analyte found in the sample and the associated blank at very similar levels? That suggests that there was some source of contamination in the analytical laboratory that affected both the blank and the field sample, and the sample result may not be usable. Conversely, if the field sample result was orders of magnitude above that in the blank, the issue of blank contamination is moot.
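The blank-versus-sample comparison described above can be sketched as a simple ratio test. The factor-of-ten threshold used here is an illustrative assumption, not a regulatory criterion; a project would set its own criterion in the QAPP:

```python
def assess_blank_contamination(sample_result: float, blank_result: float,
                               ratio_threshold: float = 10.0) -> str:
    """Suggest how seriously a method blank detection affects a sample result.

    The 10x threshold is an assumption for illustration only; projects
    should define their own acceptance criterion in the QAPP.
    """
    if blank_result <= 0:
        return "no blank contamination"
    if sample_result >= ratio_threshold * blank_result:
        return "blank contamination negligible"
    return "result may be unusable; review against project criteria"

print(assess_blank_contamination(500.0, 2.0))  # blank contamination negligible
print(assess_blank_contamination(3.0, 2.5))    # result may be unusable; ...
```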
The project QAPP should identify the person (by name or title) who will be conducting the data validation effort and to what extent (e.g., every sample result, some fraction of the sample results, or some other approach) and what the final product of the validation effort will be. One common approach is to add columns to the laboratory’s electronic data deliverable and insert any data validation qualifiers into that file, along with explanatory comments, as needed. Such an approach offers the advantage of preserving the laboratory’s original data submission while clearly providing evidence of the validation process. Data need not be perfect to be useful. The presence of qualifiers applied by either the laboratory or the data validator is not intended to suggest that data are not useable; rather, the qualifiers are intended to caution the user about an aspect of the data that does not meet the acceptance criteria established for the project.
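Appending validation qualifiers as new columns, as described above, can be done without altering the laboratory's original fields. A minimal sketch, in which the field names, qualifier, and comment are all hypothetical:

```python
import csv
import io

# Original laboratory deliverable (hypothetical fields and values)
lab_edd = io.StringIO(
    "Client Sample ID,Analyte,Result,Units\n"
    "FT-001,Mercury,120,ng/g\n"
)

reader = csv.DictReader(lab_edd)
rows = list(reader)

# Preserve the laboratory's columns and append two validation columns
fieldnames = reader.fieldnames + ["Validation Qualifier", "Validation Comment"]
for row in rows:
    row["Validation Qualifier"] = "J"  # hypothetical qualifier: estimated value
    row["Validation Comment"] = "Surrogate recovery below lower limit"

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```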
After all of the analytical data have been validated and the sample information compiled, jurisdictions may want to post monitoring data to Tribal, state, or territory websites. Data can also be submitted to the Water Quality Exchange (WQX), which is the mechanism for data partners to submit water quality data to the U.S. Environmental Protection Agency. The Water Quality Portal (WQP) uses the WQX data format to share more than 380 million water quality data records sourced from at least 900 federal, Tribal, state, and other partners. The WQP is the nation’s largest source of water quality monitoring data, and anyone can use it to retrieve monitoring data.
Requiring that the laboratory provide you with an electronic data deliverable will facilitate the process of uploading your data to the WQX, which will not accept a PDF. The WQX webpage has FAQs and factsheets to lead you through the data upload process.
WQX includes a biological template that can be used to upload fish tissue contaminant concentration data.
| Commonly Used Fields | Value Examples |
|---|---|
| Biological Intent | Tissue |
| Characteristic Name | Mercury |
| Characteristic Name User Supplied | Mercury |
| Result Analytical Method Context | USEPA |
| Result Analytical Method ID | 1631E |
| Result Unit | ng/g |
| Result Value | |
| Result Value Type | Actual |
To retrieve your data or data from other jurisdictions from WQP, consult the WQP User Guide.