Biological test method for measuring terrestrial plants exposed to contaminants in soil: chapter 5


Section 5 - Specific Procedures for Testing Field-Collected Soil or Similar Particulate Material

This section provides specific instructions for preparing and testing samples of field-collected (site) soil or similar particulate material, in addition to the procedures discussed in Section 4.

Detailed guidance for the collection, handling, transport, storage, and analyses of field-collected soil is given in a number of reports specific to these subjects (e.g., van Ee et al., 1990; Webster and Oliver, 1990; USEPA, 1991; Keith, 1992; Klute, 1986; Carter, 1993; OMAFRA, 1999). In the absence of guidance specific to these subjects from Environment Canada, such reports should be consulted and followed (in addition to the guidance provided here) when collecting samples of field-collected soil and preparing them for toxicity tests with terrestrial plants using the biological test method described herein.

5.1 Sample Collection

Crépin and Johnson (1993) provide a useful summary of field-sampling design and appropriate techniques for sample collection. Field surveys of soil toxicity using biological tests with terrestrial plants and/or other suitable, soil-associated test organisms (e.g., EC, 2004c, 2005c) are frequently part of more comprehensive surveys (e.g., Callahan et al., 1991; Menzie et al., 1992; Saterbak et al., 2000). Such surveys could include a test battery to evaluate the toxicity of soil together with tests for bioaccumulation of contaminants, chemical analyses, biological surveys of epifaunal and/or infaunal organisms, and perhaps the compilation of geological and hydrographic data. Statistical correlation can be improved and costs reduced if the samples are taken concurrently for these tests, analyses, and data acquisitions.

Samples of soil to be used in the biological test method described herein (Section 4) might be taken quarterly, semiannually, or annually from a number of contaminated or potentially contaminated sites for monitoring and compliance purposes. Samples of soil might also be collected on one or more occasions during field surveys of sites for spatial (i.e., horizontal or vertical) or temporal definition of soil quality. One or more sites should be sampled for reference (presumably clean) soil during each field collection. (Footnote 62)

The number of stations to be sampled at a study site and the number of replicate samples per station will be specific to each study. This will involve, in most cases, a compromise between logistical and practical constraints (e.g., time and cost) and statistical considerations. Webster and Oliver (1990), Crépin and Johnson (1993) and OMAFRA (1999) provide guidance on the sampling design; van Ee et al. (1990) and USEPA (1991) address issues related to quality assurance and quality control.

For certain monitoring and regulatory purposes, multiple replicates (i.e., separate samples from different grabs or cores taken at the same site) should be taken at each sampling station, including one or more reference stations. Each of these field replicates should be tested for its toxicity to terrestrial plants using five or more test vessels per replicate sample (Section 4.1). The use of power analysis (see Section 5.5.2) with endpoint data obtained in previous tests of the same type, performed with previous samples from the same or similar sites, will assist in determining if additional laboratory replicates need to be tested with each field replicate. Also, some of the statistical tests have requirements for a minimum number of replicates. For certain other purposes (e.g., preliminary or extensive surveys of the spatial distribution of toxicity), the survey design might include only one sample from each station, in which case the sample would normally be homogenized and split between five replicate test vessels. The latter approach precludes any determination of mean toxicity at a given sampling location (station), and completely prevents any conclusion on whether a station is different from the control or reference, or from another location. It does, however, allow a statistical comparison of the toxicity of that particular sample with the reference or control or with one or more samples from other locations. It is important to realize that any conclusion(s) about differences, which arise from testing single field samples lacking replication, cannot be extended to make any conclusion(s) about the sampling locations.

Sites for collecting reference soil should be sought where the geochemical properties of the soil are similar to soil characteristics encountered at the test sites. Matching of total organic carbon content (%) or organic matter content (%) might not be warranted in cases where pollution (e.g., from or within sewage or industrial sludge) is responsible for the high organic carbon content of test soils. Preliminary surveys to assess the toxicity and geochemical properties of soil within the region(s) of concern and at neighbouring sites are useful for selecting appropriate sites at which to collect reference soil.

Samples of municipal or industrial sludge (e.g., sewage sludge, dewatered mine tailings, or biosolids from an industrial clarifier or settling pond) might be collected for the assessment of their toxic effect(s) on plants, and for geochemical and contaminant analyses. Other particulate wastes being considered for land disposal might also be collected for toxicity and physicochemical evaluation.

Guidance for various soil sampling plans and procedures is available in the technical literature (e.g., Petersen and Calvin, 1986; Keith, 1992; Crépin and Johnson, 1993). Procedures used for sample collection (i.e., core, grab, or composite) will depend on the study objectives and the nature of the soil or other particulate material being collected. A shovel, auger, or soil corer (preferably stainless steel) is frequently used for collecting soil samples. The surface of the location where each sample is to be collected should be cleared of debris such as twigs, leaves, stones, thatch, and litter. If the location is an area of grass or other herbaceous plant material, the plants should be cut to ground level and removed before the sample is collected. Removal of the vegetation should be done such that removal of soil particles with the roots is minimal. Dense root masses (e.g., grasses) should be removed and then shaken vigorously to remove soil particles adhering to the roots. The soil sample to be collected for toxicity and evaluation of chemistry should be taken from one or more depths that represent the layer(s) of concern (e.g., a surficial layer of soil, or one or more deeper layers of soil or subsoil if there is concern about historical deposition of contaminants).

The required volume of soil per sample should be calculated before a sampling program is initiated. This calculation should take into account the quantity of soil required to prepare laboratory replicates for soil toxicity tests, as well as that required for particle size characterization, total organic carbon content (%), organic matter content (%), moisture content (%), and specific chemical analyses. A volume of at least 5 to 7 L of soil per sample is normally required, although this will depend on the study objectives/design (e.g., single-concentration or multi-concentration test) and the nature of the chemical analyses to be performed, and possibly also on the nature of the soil (e.g., need for removal of excess water and/or debris in the laboratory, which can reduce the sample volume). To obtain the required sample volume, it is frequently necessary to combine subsamples retrieved using the sampling device. The same collection procedure should be used at all field sites sampled.

5.2 Sample Labelling, Transport, Storage, and Analyses

Containers for transport and storage of samples of field-collected soil or similar particulate material must be made of nontoxic material. The choice of container for transporting and storing samples depends on both sample volume and the potential end uses of the sample. The containers must either be new, thoroughly cleaned, or lined with high-quality plastic. Thick (e.g., 4 mil) plastic bags are routinely used for sample transport and storage. If plastic bags are used, it is recommended that each be placed into a second clean, opaque sample container (e.g., a cooler or a plastic pail with a lid) to prevent tearing, to support the weight of the sample, and to maintain darkened conditions during sample transport (ASTM, 1999b). Plastic containers or liners should not be used if there are concerns about the plastic affecting the characteristics of the soil (e.g., compounds from plastic leaching into the soil).

Following sample addition, the air space in each container used for sample transport and storage should be minimized (e.g., by collapsing and taping a filled or partially filled plastic bag). Immediately after filling, each sample container must be sealed, and labelled or coded. Labelling and accompanying records made at this time must include at least a code or description that identifies sample type (e.g., grab, core, composite), source, precise location, land use information, replicate number, and date of collection; and should include the name and signature of sampler(s). Persons collecting samples of soil should also keep records that describe details of:

Soil samples should not freeze or become overheated during transport or storage. It is recommended that samples be kept in darkness (i.e., held in light-tight, opaque transfer containers such as coolers or plastic pails with lids) during transport, especially if they might contain PAHs or other chemicals or chemical products that could be photoactivated or otherwise altered due to exposure to sunlight. As necessary, gel packs, regular ice, or other means of refrigeration should be used to assure that the temperature of the sample(s) remains cool (e.g., 7 ± 3 °C) during transit.

The date the sample(s) is received at the laboratory must be recorded. Sample temperature upon receipt at the laboratory should also be measured and recorded. Samples to be stored for future use must be held in airtight containers. If volatile contaminants are in the soil or of particular concern, any air “headspace” in the storage container should be purged with nitrogen gas, before capping tightly. Samples must not freeze or partially freeze during transport or storage (unless they are frozen when collected), and must not be allowed to dehydrate. If, however, one or more samples are saturated with excess water upon arrival at the laboratory (e.g., sampling occurred during a significant rainfall event), the sample(s) may be transferred to plastic sheeting for a brief period (e.g., one or more hours) to enable the excess water to run off or evaporate. Thereafter, the sample(s) should be returned to the transport container(s) or transferred to one or more airtight containers for storage.

It is recommended that samples be stored in darkness at 4 ± 2 °C. These storage conditions must be applied in instances where PAHs or other light-sensitive contaminants are present, or if the samples are known to contain unstable volatiles of concern. It is also recommended that samples of soil or similar particulate material be tested as soon as possible after collection. The soil toxicity test(s) should begin within two weeks of sampling, and preferably within one week. The test must begin within six weeks, unless it is known that the soil contaminants are aged and/or weathered and therefore considered stable.
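The holding-time limits above can be checked mechanically when a test is scheduled. The following is a minimal sketch only: the function name and the status labels are hypothetical, and "one week", "two weeks", and "six weeks" are assumed to mean 7, 14, and 42 calendar days counted from the collection date.

```python
from datetime import date


def holding_time_status(collected, test_start, contaminants_stable=False):
    """Classify the interval between sample collection and test start.

    Status labels are illustrative only (not taken from the method):
      "preferred"     - test begins within one week of sampling
      "recommended"   - test begins within two weeks of sampling
      "permitted"     - test begins within six weeks, or later if the
                        contaminants are known to be aged/weathered (stable)
      "exceeds limit" - more than six weeks and stability is not established
    """
    days = (test_start - collected).days
    if days < 0:
        raise ValueError("test_start precedes the collection date")
    if days <= 7:
        return "preferred"
    if days <= 14:
        return "recommended"
    if days <= 42 or contaminants_stable:
        return "permitted"
    return "exceeds limit"


# Hypothetical dates: 13 days between collection and test start.
print(holding_time_status(date(2024, 5, 1), date(2024, 5, 14)))
```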

Dry sieving (i.e., press sieving; not wet sieving) of samples through a coarse-mesh sieve is desirable to remove large particles (see Section 5.3). This procedure may be performed in the field. Undesirable coarse material (e.g., large gravel or stones, large debris, large indigenous macroinvertebrates, or large plant material) may also be removed in the field before sample transport.

In the laboratory, each sample of field-collected soil should be thoroughly mixed (Section 5.3), and representative subsamples taken for physicochemical characterization. Each sample (including all samples of negative control soil and reference soil) must be characterized by analyzing subsamples for at least the following:

Additionally, the following analyses should be performed:

Other analyses could include:

Unless indicated otherwise, identical chemical, physical, and toxicological analyses should be performed with subsamples representative of each replicate sample of field-collected soil (including reference soil) taken for a particular survey of soil quality, together with one or more subsamples of negative control soil.

5.3 Preparing Sample for Testing

Field-collected soil or similar particulate waste material must not be sieved with water, as this would remove contaminants present in the interstitial water or loosely sorbed to particulate material. Large gravel or stones, debris, indigenous macroinvertebrates, or plant material should normally be removed using forceps or a gloved hand. If a sample contains a large quantity of debris (e.g., plant material, wood chips, glass, plastic, large gravel) or large macroinvertebrates, these may be removed by pressing the soil through a coarse sieve (e.g., mesh size of 4 - 9 mm; EC, 2000).

Qualitative descriptions of each sample of field-collected test soil should be made and recorded at the testing laboratory, including information on sample colour, texture, and the presence and description of roots, leaves, and macroscopic soil organisms. Unless research or special study objectives dictate otherwise, each sample of field-collected test material should be homogenized in the laboratory before use (USEPA, 1989). (Footnote 63) Mixing can affect the concentration and bioavailability of contaminants in the soil, and sample homogenization might not be desirable for all purposes.

As indicated in Section 3.7, one or more samples of field-collected test soil might either be tested at a single concentration only (typically, 100%), or evaluated for toxicity in a multi-concentration test whereby a series of concentrations are prepared by mixing measured quantities with either negative control soil or reference soil. When performing a multi-concentration test, the following series of concentrations of test soil (mixed in negative control soil or reference soil), which spans the range of 100% to 1% test soil using nine concentrations plus a 0% treatment, might prove suitable: 100%, 80%, 65%, 50%, 30%, 15%, 7.5%, 3%, 1%, and 0%. Guidance on other concentration series that might prove as or more suitable is found in Section 6.2, along with that for preparing test mixtures which might apply equally when performing a multi-concentration test with one or more samples of field-collected soil. Refer to Section 4.1 for additional guidance when selecting test concentrations. In each instance, the test must include a treatment comprised solely of negative control soil (see Section 3.4).
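As an illustration of preparing such a series, the sketch below computes how much test soil and diluent soil (negative control soil or reference soil) would go into each treatment. It assumes that concentrations are expressed as a percentage of total mixture mass; the function names and the 1000 g batch size are hypothetical and are not prescribed by the method.

```python
# Concentration series (% test soil) listed in the text.
# The 0% treatment consists of the diluent soil alone.
SERIES_PERCENT = [100, 80, 65, 50, 30, 15, 7.5, 3, 1, 0]


def mixture_masses(total_mass_g, percent_test_soil):
    """Return (test soil mass, diluent soil mass) in grams for one treatment.

    Assumes the percentage is expressed on a mass basis.
    """
    if not 0 <= percent_test_soil <= 100:
        raise ValueError("percent_test_soil must be between 0 and 100")
    test_mass = total_mass_g * percent_test_soil / 100.0
    return test_mass, total_mass_g - test_mass


def mixing_plan(total_mass_g, series=SERIES_PERCENT):
    """Return a list of (percent, test soil g, diluent soil g) tuples."""
    return [(pct, *mixture_masses(total_mass_g, pct)) for pct in series]


# Hypothetical batch of 1000 g of mixture per treatment.
for pct, test_g, diluent_g in mixing_plan(1000.0):
    print(f"{pct:>5}% test soil: {test_g:7.1f} g test soil + {diluent_g:7.1f} g diluent")
```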

To achieve a homogeneous sample, transfer it to a clean, rigid mixing container (e.g., a large stainless steel or plastic bowl) or, for larger volumes of soil, to clean plastic sheets spread out on the floor. The sample should be mixed manually (using a gloved hand or a nontoxic device such as a stainless steel spoon) or mechanically (e.g., using a domestic hand-held mixer with beaters at low speed, or a hand-held wire egg beater) until its texture and colour are homogeneous. While mixing, care should be taken to ensure that the impact of mixing on soil structure is minimal and that the structure is not destroyed entirely. As soon as the texture and colour of the sample appear to be homogeneous, mixing should be discontinued.

For each sample included in a test, mixing conditions including duration and temperature must be as similar as possible. If there is concern about the effectiveness of sample mixing, subsamples of the soil should be taken after mixing, and analyzed separately to determine the homogeneity of particle sizes, chemical(s) of interest, etc. Any moisture that separates from a sample during its transport and/or storage must be remixed into it, if possible.

The moisture content of a given sample of field-collected test soil should be standardized during its preparation by determining its water-holding capacity (WHC) and then hydrating the soil to an optimal moisture content based on a percentage of this value. The optimal percentage of the WHC for each sample of field-collected soil must be determined before sample preparation and test initiation. In order to do so, the moisture content of each homogenized sample (i.e., each sample of test soil, including the negative control soil) must be determined (Sections 4.1 and 4.6). Thereafter, the WHC of each sample must be determined using a recognized standard procedure (see following three paragraphs). A subsample of each soil sample is then hydrated to a homogeneous, crumbly consistency with clumps approximately 3 to 5 mm in diameter. (Footnote 64) Based on the initial moisture content of the sample, the WHC of the sample, and the amount of water added to achieve the desired soil consistency, the sample’s optimal moisture content can be calculated and expressed as a percentage of the WHC for each soil. Once this target (or optimal) percentage of the WHC has been determined, the moisture content of each sample of test soil (including the negative control soil) can be standardized to the selected (sample-specific) moisture content. Test water (i.e., de-ionized or distilled water) should be added to each sample with a moisture content that is less than the pre-determined optimal percentage of its WHC, until this moisture content is achieved (Footnote 65) (Aquaterra Environmental, 1998a). If a sample is too wet, it should be spread as a thin layer on a clean sheet of plastic (e.g., a new plastic garbage bag) or a clean, non-reactive (e.g., stainless steel or plastic) tray, and allowed to dry by evaporation at ambient (~20 °C) room temperature. Rehydration to the pre-determined optimal percentage of its WHC might be necessary.

Upon adjustment of a sample’s moisture content to the desired percentage of its WHC, the moisture content (%) of the hydrated soil must be determined and the percent WHC and percent moisture content recorded and reported.

The WHC (and the percent WHC that is optimal for biological testing) of a particular soil is generally unique to each soil type, and is ultimately the result of the interaction of many variables associated with soil structure (e.g., micro/macro-aggregation, pore space, bulk density, texture, organic matter content). There are a number of methods that can be used to determine WHC; however, most of these methods require measurements to be made on an intact soil sample (e.g., soil core) where characteristics (structural aggregations, pore space, bulk density, texture, and organic matter content) are preserved during collection. The USEPA (1989) has described an appropriate method for toxicity testing using unconsolidated materials (such as samples of field-collected soils that have been dried, sieved, and homogenized; or samples of soil formulated in the laboratory from constituents). (Footnote 66) This method is outlined here.

For this method, ~130 g (wet wt) of sample is placed into an aluminum pan or petri dish (15 × 1 cm), and dried at 105 °C until a constant weight is achieved (this usually takes a minimum of 24 h). Thereafter, 100 g of the oven-dried soil is placed into a 250-mL glass beaker with 100 mL of distilled or de-ionized water. The resulting slurry is mixed thoroughly with a glass stir rod. A folded filter paper (185-mm diameter Fisherbrand P8 coarse porosity, qualitative creped filter paper; catalogue no. 09-790-12G) is placed into a glass funnel (with a top inside diameter of 100 mm and a stem length of 95 mm). The folded filter paper should be level with the top of the glass funnel. Using a pipette, up to 9 mL of distilled or de-ionized water is slowly added to the filter paper to wet the entire surface. The funnel and hydrated filter paper are then weighed. To obtain the initial weight for the mass of the funnel plus hydrated filter paper plus dried soil (see “I” in Equation 1), the weight of the dried soil (100 g) is added to the weight of the funnel and the wet filter paper.

The funnel is then placed into a 500-mL Erlenmeyer flask and the soil slurry is slowly poured onto the hydrated filter paper held in the funnel. Any soil remaining on the beaker and stir rod is rinsed into the funnel with the least amount of water necessary to ensure that all of the solid material has been washed onto the filter. The funnel is then tightly covered with aluminum foil and allowed to drain for three hours at room temperature. After three hours, the funnel containing the hydrated filter paper and wet soil is weighed. This weighing represents the final weight for the mass of the funnel plus hydrated filter paper plus (wet) soil (see “F” in Equation 1).

The water-holding capacity for the subsample of soil in the funnel, expressed as percentage of soil dry mass, is then calculated using the following equation:

WHC = ([F - I] ÷ D) × 100 (Equation 1)

where:

WHC  = water-holding capacity (%)
F = mass of funnel + hydrated filter paper + wet mass of soil
I = mass of funnel + hydrated filter paper + dry mass of soil
D = 100 g (i.e., dry mass of soil)

The WHC of each sample of test soil should be determined in triplicate, using three subsamples.
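Equation 1 and the triplicate determination can be sketched as follows. This is an illustration only: the function names are hypothetical, the masses in the usage example are invented, and taking the arithmetic mean of the three subsamples is an assumption about how the triplicate values are combined.

```python
from statistics import mean


def water_holding_capacity(final_mass_g, initial_mass_g, dry_soil_mass_g=100.0):
    """Equation 1: WHC (% of soil dry mass) = ((F - I) / D) * 100.

    final_mass_g    F: funnel + hydrated filter paper + wet soil
    initial_mass_g  I: funnel + hydrated filter paper + dry soil
    dry_soil_mass_g D: dry mass of soil (100 g in the method described)
    """
    if dry_soil_mass_g <= 0:
        raise ValueError("dry_soil_mass_g must be positive")
    return (final_mass_g - initial_mass_g) / dry_soil_mass_g * 100.0


def mean_whc(replicates, dry_soil_mass_g=100.0):
    """Mean WHC over (F, I) pairs, e.g., three subsamples of one soil."""
    return mean(
        water_holding_capacity(f, i, dry_soil_mass_g) for f, i in replicates
    )


# Invented masses (g) for three subsamples of one soil sample.
triplicate = [(412.0, 352.0), (410.0, 352.0), (414.0, 352.0)]
print(f"Mean WHC = {mean_whc(triplicate):.1f}% of soil dry mass")
```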

The percentage of water (i.e., PW) that is added to a sample of field-collected soil to achieve the desired hydration (i.e., the optimal percentage of the WHC) can be calculated as follows: (Footnote 67)

PW = [WHC × (PWHC / 100)] - MCi (Equation 2)

where:

PW = percentage of water to add to the soil (%)
WHC = water-holding capacity (%)
PWHC = desired (optimal) percentage of the water-holding capacity (%)
MCi = initial moisture content of the soil (%)

The volume of water (i.e., VW) that should be added to a sample of field-collected soil to achieve the desired hydration (i.e., the optimal percentage of the sample’s water-holding capacity) can be calculated as follows:

VW = (PW × M)/100 (Equation 3)

where:

VW = volume of water to add to the soil (mL)
PW = percentage of water to add to the soil (%)
M = total mass of soil required for test (expressed as dry wt) (Footnote 68)
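Equations 2 and 3 can be chained to obtain the volume of test water for a given batch of soil. The sketch below is illustrative: the function names and example values are hypothetical, the initial moisture content is assumed to be expressed on the same basis as the WHC (% of soil dry mass), and M is assumed to be in grams so that VW is in millilitres.

```python
def percent_water_to_add(whc_pct, target_pct_of_whc, initial_moisture_pct):
    """Equation 2: PW = [WHC x (PWHC / 100)] - MCi.

    A negative result means the soil is already wetter than the target,
    in which case it would need drying rather than added water.
    """
    return whc_pct * (target_pct_of_whc / 100.0) - initial_moisture_pct


def volume_water_to_add(percent_water, dry_mass_g):
    """Equation 3: VW = (PW x M) / 100, with M as dry mass in grams (assumed)."""
    return percent_water * dry_mass_g / 100.0


# Invented example: WHC of 60%, target of 50% of WHC,
# initial moisture content of 12%, 2000 g (dry wt) of soil.
pw = percent_water_to_add(60.0, 50.0, 12.0)
vw = volume_water_to_add(pw, 2000.0)
print(f"PW = {pw:.1f}%, VW = {vw:.0f} mL")
```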

Except for research-oriented toxicity tests intended to determine the influence of pH on sample toxicity, the pH of samples of field-collected soil must not be adjusted. Studies intending to investigate the effect of pH on sample toxicity should conduct two side-by-side tests, whereby one or more sets of treatments are adjusted to a fixed pH value using calcium carbonate or a suitable acid or base, and the pH of one or more duplicate sets of treatments is not adjusted.

Immediately following sample hydration (or dehydration) and mixing, subsamples of test material required for the toxicity test and for physicochemical analyses must be removed and placed into labelled test vessels (see Section 4.1), and into the labelled containers required for the storage of subsamples for subsequent physicochemical analyses. Any remaining portions of the homogenized sample that might be required for additional toxicity tests using plants or other test organisms (e.g., according to EC, 2004c and EC, 2005c) should also be transferred to labelled containers at this time. All subsamples to be stored should be held in sealed containers with minimal air space, and must be stored in darkness at 4 ± 2 °C (Section 5.2) until used or analyzed. Just before it is analyzed or used in the toxicity test, each subsample must be thoroughly remixed to ensure that it is homogeneous.

5.4 Test Observations and Measurements

A qualitative description of each field-collected test material should be made at the time the test is set up. This might include observations of sample colour, texture, and homogeneity, and the presence of plants or macroinvertebrates. Any changes in the appearance of the test material observed during the test or upon its termination should be noted and reported.

Section 4.6 provides guidance and requirements for the observations and measurements to be made during or at the end of each test. These observations and measurements apply and must be made when performing the soil toxicity test described herein using one or more samples of field-collected (site) soil.

Depending on the test objectives and experimental design, additional test vessels might be set up at the beginning of the test (Section 4.1) to monitor soil chemistry. These could be destructively sampled during and at the end of the test. Test organisms might or might not be added to these extra test vessels, depending on the study’s objectives. Measurements of chemical concentrations in the soil within these vessels can be made by removing aliquots of the soil for the appropriate analyses (see Section 5.2).

5.5 Test Endpoints and Calculations

The common theme for interpreting the results of tests with one or more samples of field-collected test soil, is a comparison of the biological effects for the test (site) soil(s) with the effects found in a reference soil. The reference sample should be used for comparative purposes whenever possible or appropriate, because this provides a site-specific evaluation of toxicity (EC, 1997a, b, 2001, 2004c). Sometimes the reference soil might be unsuitable for comparison because of toxicity or atypical physicochemical characteristics. In such cases, it would be necessary to compare the test soils with the negative control soil. Results for the negative control soil will assist in distinguishing contaminant effects from noncontaminant effects caused by soil physicochemical properties such as particle size, total organic carbon content (%), and organic matter content (%). Regardless of whether the reference soil or negative control soil is used for the statistical comparisons, the results from negative control soil must be used to judge the validity and acceptability of the test (see Section 4.4).

Analyses of the results will differ according to the purposes and particular designs of the test. This section covers the analytical procedures, starting with the simplest design and proceeding to the more complex designs. Standard statistical procedures are generally all that is needed for analyzing the results. Investigators should consult EC (2004a) for guidance on the appropriate statistical endpoints and their calculation. As always, the advice of a statistician familiar with toxicology should be sought for the design and analyses of tests.

Analysis of variance (ANOVA) involving multiple comparisons of endpoint data derived for single-concentration tests involving field replicates of field-collected soil from more than one sampling location is commonly used for statistical interpretation of the significance of findings from soil toxicity tests. This hypothesis-testing approach is subject to appreciable weaknesses. Notably, any increased variability within the test will weaken its power to distinguish toxic effects (i.e., less toxicity is concluded). Similarly, use of only a few replicates instead of many replicates will weaken the discrimination of a test and will lead to a conclusion of less apparent toxicity, other things being equal (see Section 5.5.2). There is no alternative to hypothesis testing when comparing toxicity data for multiple samples of field-collected soil (i.e., field replicates of soil from more than one sampling location) that use only one concentration (usually full strength, i.e., 100% sample). There are alternatives for comparing point estimates of toxicity if various concentrations of each sample of field-collected soil are tested and multiple endpoint values for ICp or EC50 are determined (see Section 6.4). Section 9 in EC (2004a) should be consulted for guidance when comparing multiple ICps or multiple EC50s.

The parametric analyses involving ANOVA for comparative data from single-concentration tests with multiple samples of field-collected soil (i.e., field replicates of soil from more than one sampling location) assume that the data are normally distributed, that the treatments are independent, and that the variance is homogeneous among the different treatments. As the first step in analysis, these assumptions should be tested using the Shapiro-Wilk test for normality and Bartlett's test for homogeneity of variance (Eisenhart et al., 1947; Sokal and Rohlf, 1969). If the data satisfy these assumptions, analysis may proceed. If not, data could be transformed (e.g., as square roots, logarithms, or as arcsine square root for quantal data which are to be used in statistical analysis; Mearns et al., 1986). After transformation, the data will often conform to the assumptions of normality and homogeneity of variance. Assumptions should be re-tested following any transformation of data. Parametric tests are reasonably robust in the face of moderate deviations from normality and equality of variance; therefore, parametric analysis (e.g., ANOVA and multiple comparison) should proceed, even if moderate nonconformity continues after transformation. Excluding a data set for minor irregularities might lose a satisfactory and sensitive analysis and forgo the detection of real effects of toxicity. (Footnote 69) Analysis by nonparametric statistical procedures should also proceed in parallel, with the more sensitive (lower endpoint) of the two analyses providing the final estimates of toxicity. Section 3 in EC (2004a) should be consulted for guidance when comparing the findings for single-concentration tests involving field replicates of samples from multiple locations, using parametric or non-parametric tests.
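The three transformations named above can be sketched with the standard library alone. The function names are hypothetical, the choice of base-10 logarithms and the optional offset for zero values are assumptions, and the arcsine square root transformation is applied here to proportions between 0 and 1 (e.g., proportion of seeds emerged). The assumption tests themselves would be re-run on the transformed values using statistical software.

```python
import math


def sqrt_transform(values):
    """Square-root transformation (values must be non-negative)."""
    return [math.sqrt(v) for v in values]


def log_transform(values, offset=0.0):
    """Base-10 log transformation; `offset` is an optional constant
    added before taking logs (an assumption, e.g., to handle zeros)."""
    return [math.log10(v + offset) for v in values]


def arcsine_sqrt_transform(proportions):
    """Arcsine square-root transformation for quantal data expressed
    as proportions between 0 and 1; results are in radians."""
    for p in proportions:
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")
    return [math.asin(math.sqrt(p)) for p in proportions]


# Invented emergence proportions for five replicate test vessels.
print(arcsine_sqrt_transform([0.9, 1.0, 0.8, 0.7, 1.0]))
```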

Guidance in Section 6 (including that in Section 6.2 for performing range-finding tests, and that in Section 4.8 for calculating test endpoints) should be followed if a multi-concentration test is performed using one or more samples of field-collected soil diluted with negative control soil or clean reference soil. Section 9 in EC (2004a) should be consulted when comparing such point estimates of toxicity for multiple samples of field-collected soil.

5.5.1 Variations in Design and Analysis

A very preliminary survey might have only one sample of test soil (i.e., contaminated or potentially contaminated site soil) and one sample of reference soil, without replication. Simple inspection of the results might provide guidance for designing more extensive studies.

If there is a single test sample and a single reference sample, with equal replication for each, a standard Student's t-test would be suitable for analysis (Paine and McPherson, 1991; EC, 1997a, b, 2001). The t-test is fairly robust and handles unequal numbers of replicates in the test and reference samples, as well as moderately unequal variances in the two groups (Newman, 1995; USEPA, 1995).
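A minimal sketch of the pooled-variance Student's t statistic for this two-sample case is given below. The function name and the example data are hypothetical. Only the statistic and its degrees of freedom are computed; the resulting value would then be compared with a critical value of the t distribution from tables or statistical software.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(test_values, reference_values):
    """Return (t, degrees of freedom) for a two-sample Student's t-test
    with pooled variance. Replicate numbers may be unequal."""
    n1, n2 = len(test_values), len(reference_values)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two replicates")
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n1 - 1) * variance(test_values)
        + (n2 - 1) * variance(reference_values)
    ) / df
    if pooled_var == 0:
        raise ValueError("no within-group variability; t is undefined")
    std_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(test_values) - mean(reference_values)) / std_error, df


# Invented shoot lengths (mm): test soil versus reference soil.
t, df = pooled_t_statistic([41.0, 38.5, 44.2, 40.1, 39.7],
                           [52.3, 49.8, 55.0, 51.1, 50.6])
print(f"t = {t:.2f} with {df} degrees of freedom")
```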

A preliminary evaluation might conceivably be conducted with samples from many stations, but without either field replicates or laboratory (within-sample) replicates. The objective might be to identify a reduced number of sampling stations deserving of more detailed and further study. Opportunities for statistical analysis would be limited. The nonreplicated test data could be compared with the reference data using outlier detection methods (USEPA, 1994a; Newman, 1995; EC, 1997a, b, 2001, 2004a, c). A sample would be considered toxic if its result was rejected as an extreme value when considered as part of the data for the reference soil and/or the negative control soil.

A more usual survey of soils would involve the collection of replicate samples from several places by the same procedures, and their comparison with replicate samples of a single reference soil and/or negative control soil. There are several pathways for analysis, depending on the type and quality of data, but often there would be an analysis of variance (ANOVA) followed by one of the multiple-comparison tests. In the ANOVA, the reference soil would also be treated as that from a “location”.
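The one-way ANOVA described here, with the reference soil treated as one more "location", can be sketched as follows. The function name and the example data are hypothetical. The sketch computes only the F statistic and its degrees of freedom; significance would be judged against the F distribution using tables or statistical software, and a multiple-comparison test would follow.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of replicate endpoint values
    per sampling location (the reference soil counts as a location).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need >= 2 non-empty groups and some replication")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variability among location means.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variability among replicates within each location.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variability; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented endpoint data: reference soil plus two sampling locations.
f_stat, df_b, df_w = one_way_anova_f([
    [50.1, 52.3, 49.8, 51.0, 50.6],   # reference soil
    [48.9, 47.5, 50.2, 49.1, 48.0],   # location A
    [39.7, 41.2, 38.5, 40.3, 42.0],   # location B
])
print(f"F = {f_stat:.2f} (df = {df_b}, {df_w})")
```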

In these multi-location surveys, the type of replication would influence the interpretation of results. If field replicates were collected at each of the sampling locations, and no laboratory replicates were used, a one-way ANOVA would evaluate the overall difference in test results with respect to sampling location, over and above the combined variability of sampling the location and running the test. It would be unusual, but much more powerful, to have field replicates for all sampling locations and also laboratory replicates of each field replicate. If that were done, the laboratory replicates would become the replicates in a nested one-way ANOVA, and would be the base of variability for comparing differences in the samples. The ANOVA could be used to determine (a) if there was an overall difference in test results for samples with respect to their sampling location, and (b) whether there was an overall difference in replicates taken at the various locations. After an ANOVA, the analysis would proceed to one or more types of multiple-comparison test, as described in the following text.
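The simpler of the two designs described above (field replicates at each location, no laboratory replicates) might be analyzed as sketched below. The Python calculates the F ratio for a one-way ANOVA, with the reference soil treated as one more location. The location names, replicate values, and function name are hypothetical; the nested design is not shown, and the F ratio would still have to be compared with a tabulated critical value.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a location name to its list of replicate results.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = statistics.mean(all_values)

    # Variability of the location means around the grand mean.
    ss_between = sum(
        len(values) * (statistics.mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variability of the replicates around their own location mean.
    ss_within = sum(
        (v - statistics.mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical field-replicate results; the reference soil is one "location".
survey = {
    "reference": [52.1, 49.8, 55.0, 51.3],
    "location_A": [41.0, 38.5, 44.2, 40.3],
    "location_B": [50.2, 47.9, 53.1, 49.0],
}
f_ratio, df_between, df_within = one_way_anova(survey)
print(f"F = {f_ratio:.2f} on {df_between} and {df_within} degrees of freedom")
```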

If only laboratory replicates and no field replicates were tested, there could be no conclusions about differences due to sampling location (see also Section 5.1). The laboratory replicates would only show any differences in the samples that were greater than the baseline variability in the within-laboratory procedures for setting up and running the test. Sample variability due to location would not really be assessed in the statistical analysis, except that it would contribute to any difference in test results associated with sampling location.

If it were desired to compare the test results for the replicate samples from each sampling location with those for the reference soil, to see if the toxicity of the two sources of soil (locations) differed, Dunnett's test should be used. It assumes normality and equal variance, and is based on an experiment-wise value of α (the probability of declaring a significant difference when none actually exists). If replication was unequal, investigators could use the Dunn-Sidak modification of the t-test, or alternatively the Bonferroni adjustment of the t-test (p. 189 in Newman, 1995; Appendix D in USEPA, 1995; Section 7.5.1 in EC, 2004a).
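The two adjustments mentioned above both reduce the level of significance applied to each individual t-test so that the experiment-wise α is maintained. As a brief sketch, assuming the standard forms of the adjustments (α/k for Bonferroni and 1 - (1 - α)^(1/k) for Dunn-Sidak, where k is the number of comparisons with the reference soil), the per-comparison values could be calculated as follows. The function names are illustrative only.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni adjustment."""
    return alpha / n_comparisons


def dunn_sidak_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Dunn-Sidak adjustment."""
    return 1 - (1 - alpha) ** (1 / n_comparisons)


# Example: four sampling locations, each compared with the reference soil.
experimentwise_alpha = 0.05
k = 4
bonf = bonferroni_alpha(experimentwise_alpha, k)
sidak = dunn_sidak_alpha(experimentwise_alpha, k)
print(f"Bonferroni: {bonf:.4f}  Dunn-Sidak: {sidak:.4f}")
```

The Dunn-Sidak value is slightly larger than the Bonferroni value, so it is marginally less conservative for the same number of comparisons.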

In a multi-location survey, an investigator might wish to know which of the samples from various sampling locations showed results that differed statistically from others as well as knowing which ones were different from the reference and/or negative control sample(s). Such a situation might involve sampling from a number of locations at progressively greater distances from a point source of contamination, in which instance the investigator might want to know which sampling locations provided samples that had significantly higher toxicity than others, and thus which locations were particularly deserving of cleanup. Tukey's test is designed for such an analysis; this test is commonly found in statistical packages and can deal with unequal sample sizes.Footnote70

If it were desired to compare the toxicity of the samples from each sampling location with that for the reference sample(s), but the data did not conform to the requirements of normality and equal variance, the ANOVA and subsequent tests would be replaced by nonparametric tests. Steel's Many-One Rank test would be used if replication were equal, while unequal replication would require use of the Wilcoxon Rank Sum test with Bonferroni's adjustment.

5.5.2 Power Analysis

An important factor to consider in the analysis of the results for toxicity tests with soil is the potential for declaring false positives (i.e., calling a clean site contaminated; Type I error) or false negatives (i.e., calling a contaminated site clean; Type II error). Scientists are usually cautious in choosing the level of significance (α) for tolerating false positive results (Type I error), and usually set it at P = 0.05 or 0.01. Recently, toxicologists have been urged to report both α and statistical power (1 - β), i.e., the probability of correctly rejecting the null hypothesis (H0) and not making a Type II error. There are several factors that influence statistical power, including:

Environment Canada’s guidance document on statistical methods for environmental toxicity tests (EC, 2004a) provides further information and guidance on errors of Types I and II.

Power analysis can be used a priori to determine the magnitude of the Type II error and the probability of false negative results. It can also be used to ascertain the appropriate number of field and laboratory replicates for subsequent surveys involving this test, or to assist in the selection of future sampling sites. It is always prudent to include as many replicates in the test design as is economically and logistically warranted (see Section 5.1); power analysis will assist in this determination. A good explanation of the power of a test, and how to assess it, can be found in USEPA (1994a). Guidance on power analysis is provided in EC (2004a).
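As a rough illustration of an a priori calculation, the following Python estimates the number of replicates per group needed for a two-sided comparison of two means. It is a sketch only: it assumes the normal approximation n = 2[(z(1 - α/2) + z(1 - β))σ/δ]², where σ is the expected standard deviation of replicates and δ is the difference to be detected. This approximation tends to understate the number needed when replication is low, and it is not necessarily the procedure given in USEPA (1994a) or EC (2004a). The function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect_size, std_dev, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-sided, two-sample comparison.

    effect_size: smallest difference between means worth detecting
    std_dev:     expected standard deviation among replicates
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type II error rate
    n = 2 * ((z_alpha + z_beta) * std_dev / effect_size) ** 2
    return math.ceil(n)


# Hypothetical planning values (same units as the test endpoint).
print(replicates_per_group(effect_size=10.0, std_dev=5.0))
print(replicates_per_group(effect_size=5.0, std_dev=5.0))
```

Halving the difference to be detected roughly quadruples the replication required, which shows why the effect size of interest should be settled before sampling begins.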

Many investigators have difficulty with power analysis, and do not apply it due to its perceived complexity and the differing formulae specific to various statistical tests. In view of this complexity, the Minimum Significant Difference may be applied as an alternative approach (i.e., as an “index of power”; see EC, 2004a for guidance).
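One common formulation of the Minimum Significant Difference (MSD) multiplies the critical value of the test statistic by the standard error of the difference between a sample mean and the reference mean. The sketch below assumes that formulation, with the within-group mean square error (MSE) taken from the ANOVA and the critical value supplied by the investigator from the appropriate table (e.g., Dunnett's or Student's t). EC (2004a) should be consulted for the formulation recommended there. The numerical values and function name are hypothetical.

```python
import math


def minimum_significant_difference(critical_value, mse, n_reference, n_test):
    """MSD = critical value x sqrt(MSE) x sqrt(1/n_reference + 1/n_test)."""
    return critical_value * math.sqrt(mse) * math.sqrt(1 / n_reference + 1 / n_test)


# Hypothetical inputs: critical value from a table, MSE from the ANOVA.
msd = minimum_significant_difference(
    critical_value=2.447, mse=5.2, n_reference=4, n_test=4
)
reference_mean = 52.0  # hypothetical mean for the reference soil
print(f"MSD = {msd:.2f} ({100 * msd / reference_mean:.1f}% of the reference mean)")
```

Expressing the MSD as a percentage of the reference mean gives a simple index of the smallest effect that the test design could have detected.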
