Biological test method for measuring terrestrial plants exposed to contaminants in soil: chapter 4


Universal Test Procedures

General procedures and conditions described in this section for toxicity tests with terrestrial plants apply when testing the toxicity of samples of soil, particulate waste (e.g., sludge), or chemical, and also apply to their associated reference toxicity tests. More specific procedures for conducting tests with field-collected samples of soil or other similar particulate material (e.g., sludge, de-watered mine tailings, drilling mud residue, compost, biosolids) are provided in Section 5. Guidance and specific procedures for conducting tests with negative control soil or other soil spiked (amended) experimentally with chemical(s) or chemical product(s) are given in Section 6.

All aspects of the test system described in Section 3 must be incorporated into these universal test procedures. Those conditions and procedures described in Section 2 for seed storage, handling, and sorting in preparation for soil toxicity tests, also apply. A summary checklist in Table 2 describes recommended conditions and procedures to be universally applied to each test with samples of contaminated or potentially contaminated soil, as well as those for testing specific types of test materials or substances. These could include samples of site soil, biosolids (e.g., dredged material, sludge from a sewage treatment plant, composted material, or manure), or negative control soil (or other soil, contaminated or clean) spiked in the laboratory with one or more test chemicals or chemical products.

This biological test method uses terrestrial plant seeds as test organisms, and measures inhibition of seedling emergence and growth (shoot and root length and dry mass) as the biological endpoints. Test organisms are chosen from a list of 12 species approved for use in this test method (see Section 1.2). Test duration is 14 or 21 daysFootnote21, depending on the species chosen and the biomass needed for determination of the endpoint measurement(s) (see Section 4.3). The test soils are hydrated during the test, but not renewed. This definitive test method was applied and validated by six participating laboratories in a series of concurrent 14-day multi-concentration tests using red clover (Trifolium pratense) in artificial soil spiked with boric acid (EC, 2005a).Footnote22

Table 2 - Checklist of Recommended Conditions and Procedures for Conducting Definitive Tests of Soil Toxicity Using Terrestrial Plants

Universal
Test type whole soil toxicity test; no renewal (static test)
Test duration 14 days for barley, cucumber, durum wheat, lettuce, radish, red clover, or tomato; and
- 21 days for alfalfa, blue grama grass, carrot, northern wheatgrass, or red fescue
Approved test species monocotyledons: barley (Hordeum vulgare), blue grama grass (Bouteloua gracilis), durum wheat (Triticum durum), northern wheatgrass (Elymus lanceolatus; formerly named Agropyron dasystachyum), and red fescue (Festuca rubra);
- dicotyledons: alfalfa (Medicago sativa), carrot (Daucus carota), cucumber (Cucumis sativus), lettuce (Lactuca sativa), radish (Raphanus sativus), red clover (Trifolium pratense), and tomato (Lycopersicon esculentum)
Number of Concentrations minimum of 9, plus negative control; recommend 11, plus negative control
Number of Replicates

For single-concentration test (e.g., site soil tested at 100% concentration only):
- ≥ 5 replicates/treatment
For multi-concentration test:
- ≥ 4 replicates/treatment for equal replicate test design; or
- regression design; unequal replicates among test treatments:

  • 6 replicates for negative control soil
  • 4 replicates for lowest 4 - 6 test concentrations
  • 3 replicates for highest 5 test concentrations
Number of seeds per replicate 5 seeds/vessel for barley, cucumber, durum wheat, lettuce, northern wheatgrass, radish, red clover, red fescue, or tomato; and
- 10 seeds/vessel for alfalfa, blue grama grass, or carrot
Negative control soil depends on study design and objectives; clean field-collected soil or artificial soil if testing site soils; recommend artificial soil for tests with chemical(s) or chemical product(s) spiked in soil
Test vessel polypropylene cups (1 L), covered for 7 days or until plants reach top of container
Amount of soil/test vessel identical wet wt, equivalent to a volume of ~500 mL; ~350 g dry wt if artificial soil
Moisture content, test soils for soil preparation, hydrate to the optimal % of its water-holding capacity (WHC) if field-collected soil (see Section 5.3), or to ~70% of WHC if artificial soil; during test, hydrate to saturation
Air temperature daily range, constant 24 ± 3 °C; alternatively, day: 24 ± 3 °C, night: 15 ± 3 °C
Humidity ≥ 50%
Lighting full spectrum fluorescent: mimic natural light spectrum (e.g., Vita Lite® by Duro-Test®); 300 ± 100 µmol/(m² · s) adjacent to the level of the soil surface; 16 h light:8 h dark
Watering hydration water sprayed over soil surface until saturation, about every two days when covered and once per day after covers are removed, or whenever soil appears dry; weak nutrient solution might be necessary depending on fertility of soil and length of test
Measurements during test soil moisture content in each treatment/concentration at start; pH in each treatment/concentration at start and end; temperature in test facility, daily or continuously; humidity in test facility; light intensity once during test
Observations during test number of emerged seedlings at end of test in each test vessel; shoot/root length and shoot/root dry mass at test end; number of surviving plants at test end showing an atypical appearance (e.g., chlorosis, lesions); optionally, Day-7 seedling emergence (%) and shoot/root wet mass at test end
Biological endpoints emergence of seedlings during test; length of longest shoot and longest root at test end; dry weight of entire shoot and root structures (oven-dried at 90 °C until constant mass) at test end; appearance of surviving plants at test end; optionally, wet weight of shoot and root at test end
Statistical endpoints mean (± SD) percent emergence in each treatment/concentration at test end (Day 14 or Day 21); mean (± SD) length of longest shoots and roots in each treatment at test end (Day 14 or Day 21); mean (± SD) dry wt of shoots and roots in each treatment at test end (Day 14 or Day 21); if multi-concentration test: 14- or 21-day EC50 for inhibition of % emergence, data permitting; 14- or 21-day ICp for each of mean shoot length, root length, shoot dry wt, and root dry wt of individual plants surviving in each concentration at test end
Test validity

invalid if any of the following occurs in negative control soil at test end:

  • mean % emergence is <60% for carrot, cucumber, or tomato; <70% for alfalfa, barley, blue grama grass, lettuce, northern wheatgrass, red clover, or red fescue; <80% for durum wheat; or <90% for radish
  • mean % survival of emerged seedlings in negative control soil at test end is <90%
  • mean percentage of control seedlings exhibiting phytotoxicity or developmental anomalies is >10%
  • mean root length is <40 mm for tomato; <70 mm for blue grama grass, red clover, or red fescue; <80 mm for carrot; <100 mm for lettuce; <110 mm for northern wheatgrass or radish; <120 mm for alfalfa or cucumber; or <170 mm for barley; or <200 mm for durum wheat
  • mean shoot length is <20 mm for lettuce; <30 mm for red clover; <40 mm for alfalfa; <45 mm for carrot; < 50 mm for blue grama grass, radish, or tomato; < 60 mm for cucumber; <80 mm for red fescue; <100 mm for northern wheatgrass; <150 mm for barley; or <160 mm for durum wheat
Test with reference toxicant must perform at least once every two months, or in conjunction with definitive test(s) with soil samples; use boric acid; prepare and test ≥ 5 concentrations plus a negative control, using artificial soil as a substrate; ≥ 3 replicates/concentration; 5 or 10 seeds per replicate (i.e., species-specific); follow procedures and conditions for a reference toxicity test described in Section 4.9; determine % emergence and 7-day or 10-day (species dependent) ICp for shoot length (including 95% confidence limits); express as mg boric acid/kg, dry wt
Field-Collected Soil
 
Transport and storage seal in plastic and minimize air space; transport in darkness (e.g., using an opaque cooler, plastic pail or other light-tight container); do not freeze or overheat during transportation; store in dark at 4 ± 2 °C; test should start within two weeks, and must start within six weeks unless soil contaminants are known to be stable
Negative control soil either natural, uncontaminated field-collected soil or artificial soil, for which previous plant tests have shown that all criteria for test validity could be regularly met
Reference soil one or more samples for tests with field-collected soil; ideally taken from site(s) presumed to be clean but near sites of test soil collection; characteristics including percent organic matter, particle size distribution, and pH similar to test soil(s)
Characterization of test soils at least percent moisture, WHC, pH, conductivity, percent total organic carbon (TOC), percent organic matter, and particle sizes (% sand, % silt, % clay); optionally, contaminants of concern [e.g., metals, polycyclic aromatic hydrocarbons (PAHs), pesticides]
Preparation of test soils if necessary, remove debris and indigenous macro-organisms using forceps; if necessary, press through a sieve of suitable mesh size (e.g., 4 - 6 mm); mix; determine soil moisture content; hydrate with de-ionized or distilled water (or, if and as necessary, dehydrate) to optimal percentage of its WHC (see Section 5.3); mix; dilute with control or reference soil if multi-concentration test; ensure homogeneity
Soil Spiked with Chemical(s) or Chemical Substance(s)
Negative control soil recommend artificial soil, or a clean field-collected soil
Characterization of chemical(s) or chemical substance(s) information on stability, water solubility, vapour pressure, purity, and biodegradability of chemical(s) or chemical substance(s) spiked into negative control soil should be known beforehand
Solvent de-ionized water is the preferred solvent; if an organic solvent is used, the test must include a solvent control
Preparation of mixtures procedure depends on the nature of the test substance(s) and the test design and objectives; chemical/soil mixtures may be prepared manually or by mechanical agitation; test substance(s) may be added as measured quantities in solution (i.e., in water or an organic solvent), directly as a liquid substance, or as a solid material comprised partly or completely of the test substance(s); ensure homogeneity
Concentration of chemical(s) or chemical substance(s) added normally measure at beginning and end of test, in high, medium, and low concentrations as a minimum

4.1 Preparing Test Soils

Each test vessel (see Section 3.2.2) placed within the test facility must be clearly coded or labelled to enable identification of the sample and (if diluted) its concentration. The date and time when the test is started must be recorded, either directly on the labels or on separate data sheets dedicated to the test. The test vessels should be positioned such that observations and measurements can be made easily. Treatments should be positioned randomly within the test facility (EC, 1997a, b, 2001, 2004c) and rotated regularly (e.g., while watering).

On the day of the start of the test, which is the day the seeds are initially exposed to samples of test material or substance (i.e., Day 0), each sample or subsample of test soil or similar particulate material, including negative control soil and, if used, reference soil, should be mixed thoroughlyFootnote23 (see Sections 5.3 and 6.2) to provide a homogeneous mixture consistent in colour, texture, and moisture.

If field-collected samples of site soil are being prepared for testing, large particles (stones, thatch, sticks, debris) should be removed before mixing, along with any vegetation or macroinvertebrates observed (see Section 5.3).

Test soils for terrestrial plant testing are prepared on the day of test initiation (i.e., Day 0). The quantity of each test soil mixed as a batch should be enough to set up the replicates of that treatment (see Table 2) plus an additional amount for the physicochemical analyses to be performed (Section 4.6) and a surplus to account for the unused soil that adheres to the sides of the mixing container. The moisture content (%) of each test soil should be known or determined, and adjustments made as necessary by mixing in test water (or, if and as necessary, by dehydrating the sample) until the desired moisture level is achieved (see Sections 5.3 and 6.2). Quantitative measures of the homogeneity of a batch might be made by taking aliquots of the mixture for measurements such as particle size analysis, total organic carbon (%), organic matter content (%), moisture content (%), and concentration of one or more specific chemicals.

Immediately following the mixing of a batch, an identical wet weight of test soil equivalent to a volume of ~500 mL should be transferred to each replicate test vessel.Footnote24 The soil added to each test vessel should be smoothed (but not compressed) using a spoon, by gently shaking the vessel back and forth horizontally, or by gently tapping the vessel ≥ 3 times on the benchtop or with a hand.

For a single-concentration test [e.g., site soil tested at 100% concentration only; a particular concentration of test soil; or a chemical tested at one concentration (e.g., Maximum Label Rate)], a minimum of five replicate test vessels and five replicate negative control vessels must be set up by adding an identical wet weight (equivalent to a volume of ~500 mL) of the same batch to each replicate vessel. For a multi-concentration test, either equal or unequal replication across treatments can be used. If replication is equal across treatments, at least four replicate test vessels must be set up for each treatment. If replication is unequal across treatments (see Section 4.8), six replicate vessels should be prepared for the negative control soil, four replicate vessels should be prepared for the lowest 4 - 6 test concentrations, and three replicate vessels should be prepared for the highest five test concentrations.Footnote25 For any test that is intended to estimate the ICp in a definitive soil test (see Section 4.8), at least nine concentrations plus a negative control soil must be prepared, and more (≥ 11) are recommended to improve the likelihood of bracketing each endpoint sought.Footnote26
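The unequal-replicate design described above can be tallied with a short sketch. The function name `replicate_plan` is hypothetical, and limiting the design to 9 to 11 test concentrations is an assumption drawn from the "lowest 4 - 6" and "highest five" wording; it is an illustration, not a procedure prescribed by this method.

```python
def replicate_plan(n_concentrations):
    """Return replicates per treatment for the unequal-replicate design.

    Index 0 is the negative control; indices 1..n run from the lowest
    to the highest test concentration.
    """
    # Assumption: 4-6 low concentrations plus 5 high ones gives 9-11 in total.
    if not 9 <= n_concentrations <= 11:
        raise ValueError("sketch covers 9 to 11 test concentrations only")
    n_low = n_concentrations - 5          # the lowest 4-6 concentrations
    return [6] + [4] * n_low + [3] * 5    # control, low group, high group


plan = replicate_plan(9)
print(plan)        # replicates per treatment, control first
print(sum(plan))   # total number of test vessels to prepare
```

Summing the plan gives the number of vessels, and therefore the amount of soil and seed, to prepare before Day 0.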

Concentrations should be chosen to span a wide range, including a low concentration that produces effects similar to those in the negative control, and a high concentration that results in “complete” or severe effects. It is a common mistake to anticipate the endpoint and bracket it with a closely spaced series of concentrations, all of which might turn out to be either too low or too high. To keep the wide range of concentrations, and also obtain the important mid-range effects, it might be necessary to use additional treatments in order to split the selected range more finely. In any case, a consistent geometric series should be used. Additional guidance on selecting test concentrations that applies here is found in EC (2004a).
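As an illustration of a consistent geometric series, the sketch below generates concentrations in which each value is a fixed multiple of the one before it. The function name, the top concentration of 1000 mg/kg, and the dilution factor of 2 are all hypothetical; appropriate values depend on the test substance and on any range-finding results.

```python
def geometric_series(highest, factor, n):
    """Return n concentrations, lowest first, each `factor` times the previous."""
    if highest <= 0 or factor <= 1 or n < 1:
        raise ValueError("need highest > 0, factor > 1, and n >= 1")
    # Divide the highest concentration by factor**i, then list lowest first.
    return [highest / factor ** i for i in range(n - 1, -1, -1)]


# Hypothetical example: 9 concentrations, halving down from 1000 mg/kg dry wt.
series = geometric_series(1000.0, 2.0, 9)
print([round(c, 2) for c in series])
```

A larger factor widens the range covered by the same number of treatments; a smaller factor splits the same range more finely.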

4.1.1 Range-Finding Test

In the case of appreciable uncertainty about sample toxicity, it is often beneficial to run a range-finding test for the sole purpose of establishing more closely the concentrations to be used for the definitive test, in which instance the number of replicates per concentration could be reduced (see Section 6.2). Conditions and procedures for the range-finding test are similar to the definitive test (see Table 2); however, the experimental design differs.

The range-finding test is a short-term test (7 days for barley, cucumber, durum wheat, lettuce, red clover, radish, and tomato; and 10 days for alfalfa, blue grama grass, carrot, northern wheatgrass, and red fescue), with ≥ 6 concentrations of test chemical or test soilFootnote27, and only duplicate vessels (i.e., two replicates) per treatment. The test species must be the same as that to be used in the definitive test (see Section 2.1), and the number of seeds per replicate should be the same as those used in the definitive test (see Table 2 and Section 4.2). Negative control soil, air temperature and lighting conditions, percent moisture of soils, watering, and measurements during the test, are the same as those described for the definitive test (see Table 2). Shoot length and root length can be used to predict where the sublethal endpoints for growth will be in the definitive test.Footnote28 In most cases, the endpoints for growth in the definitive test will be at lower concentrations than those observed for the range-finding test, due to the longer test duration in the definitive test. The number of emerged seedlings at the end of the range-finding test should also be observed and recorded to determine whether the test validity criteria for seedling emergence in the definitive test are likely to be met (see Section 4.4).

4.2 Beginning the Test

Following the addition of test soil to each test vessel, 5 or 10 sorted seeds, depending on the species (see Section 2.1) are planted in the soil within each test vessel, in order of increasing test concentration. For species requiring only five seeds (i.e., barley, cucumber, durum wheat, lettuce, northern wheatgrass, radish, red clover, red fescue, and tomato), four seeds are distributed equally around one seed within the centre of the soil in each test vessel. For alfalfa, blue grama grass, and carrot, which require 10 seeds per test vessel, nine seeds are distributed equidistant around one centre seed. Using fine forceps, each seed should be planted to a depth that is twice the diameter of the seed itself. The seeds are covered with the surrounding test substrate by tapping the test substrate with a stainless steel spatula or glass rod.Footnote29 After the seeds have been added to each test vessel, the vessels are hydrated by spraying the soil surface with hydration water using a fine-mist spray bottle. Enough water is added to bring the moisture content of the soils close to saturation (see Section 4.5). Following hydration, lids (see Section 3.2.2) should be placed on the test vessels, to minimize loss of moisture.
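The species-specific seed counts above lend themselves to a small lookup when working out how many sorted seeds a test requires. This is a minimal sketch; the function names and the example of 37 vessels are hypothetical.

```python
TEN_SEED_SPECIES = {"alfalfa", "blue grama grass", "carrot"}
FIVE_SEED_SPECIES = {
    "barley", "cucumber", "durum wheat", "lettuce", "northern wheatgrass",
    "radish", "red clover", "red fescue", "tomato",
}


def seeds_per_vessel(species):
    """Return 5 or 10 seeds per test vessel for an approved species."""
    name = species.strip().lower()
    if name in TEN_SEED_SPECIES:
        return 10
    if name in FIVE_SEED_SPECIES:
        return 5
    raise ValueError(f"not an approved test species: {species!r}")


def seeds_needed(species, n_vessels):
    """Total sorted seeds to plant across all test vessels."""
    return seeds_per_vessel(species) * n_vessels


# Hypothetical example: red clover planted in 37 test vessels.
print(seeds_needed("red clover", 37))
```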

4.3 Test Conditions

  • This is a 14- or 21-day soil toxicity test, during which the soil in each test vessel is not renewed. The test duration for barley, cucumber, durum wheat, lettuce, radish, red clover, and tomato is 14 days, whereas that for alfalfa, blue grama grass, carrot, northern wheatgrass, and red fescue (i.e., species that produce less phytomass and/or take longer to germinate) is 21 days.
  • The test vessel is a 1-L clear polypropylene container. Its contents (i.e., a 500-mL volume of test soil) are covered with a clear polypropylene lid (see Section 3.2.2).
  • For a single-concentration test, at least five replicate test vessels must be set up for each test soil (i.e., each treatment). For a multi-concentration test, the use of an unequal number of replicate test vessels per test concentration and control, depending on concentration and treatment, is recommended. A minimum of six replicates for controls, four replicates in the lowest 4 - 6 test concentrations, and three replicates in the highest five test concentrations, should be prepared (see Section 4.1).Footnote25
  • The test must be conducted at a constant mean air temperature of 24 ± 3 °C; or a daily mean air temperature of 24 ± 3 °C and a nightly mean air temperature of 15 ± 3 °C for those facilities that can accommodate daily changes in test temperatures (see Section 3.1).
  • Test vessels must be illuminated with a 16-h light and 8-h dark daily photoperiod. Full-spectrum fluorescent lights or equivalent which mimic a natural light spectrum (e.g., Vita Lite® by Duro-Test®) should be used. Light intensity adjacent to the surface of the soil in each test vessel must be 300 ± 100 µmol/(m² · s) (i.e., equivalent to 18 750 ± 6250 lux) (see Section 3.3).
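The equivalence stated for light intensity implies a conversion of 18 750 ÷ 300 = 62.5 lux per µmol/(m² · s). The sketch below applies that implied factor to a lux-meter reading; the function names are hypothetical, and the factor is only an approximation for the full-spectrum fluorescent lighting described here, since the true ratio depends on the lamp spectrum.

```python
# Implied by "300 µmol/(m² · s) equivalent to 18 750 lux"; lamp-dependent.
LUX_PER_UMOL = 18750 / 300  # 62.5


def lux_to_umol(lux):
    """Convert a lux reading to an approximate fluence rate, µmol/(m² · s)."""
    return lux / LUX_PER_UMOL


def light_within_range(umol):
    """True if the fluence rate falls within 300 ± 100 µmol/(m² · s)."""
    return 200 <= umol <= 400


reading_lux = 16000  # hypothetical reading at the soil surface
print(lux_to_umol(reading_lux), light_within_range(lux_to_umol(reading_lux)))
```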

4.4 Criteria for a Valid Test

For a valid test, each of the following five test criteria must be achievedFootnote30:

  • The mean percent emergence for individual plant species grown in negative control soil for the duration of the test must be:
    • ≥ 60% for carrot, cucumber, or tomato;
    • ≥ 70% for alfalfa, barley, blue grama grass, lettuce, northern wheatgrass, red clover, or red fescue;
    • ≥ 80% for durum wheat; or
    • ≥ 90% for radish.
  • The mean percent survival for emerged seedlings grown in negative control soil for the duration of the test must be ≥ 90%.Footnote31
  • The mean percentage of seedlings grown in negative control soil for the duration of the test, that exhibit phytotoxicity and/or developmental anomalies, must be ≤ 10%.Footnote32
  • The mean root length for individual plant species grown in negative control soil for the duration of the test must be:
    • ≥ 40 mm for tomato;
    • ≥ 70 mm for blue grama grass, red clover, or red fescue;
    • ≥ 80 mm for carrot;
    • ≥ 100 mm for lettuce;
    • ≥ 110 mm for northern wheatgrass or radish;
    • ≥ 120 mm for alfalfa or cucumber;
    • ≥ 170 mm for barley; or
    • ≥ 200 mm for durum wheat.
  • The mean shoot length for individual plant species grown in negative control soil for the duration of the test must be:
    • ≥ 20 mm for lettuce;
    • ≥ 30 mm for red clover;
    • ≥ 40 mm for alfalfa;
    • ≥ 45 mm for carrot;
    • ≥ 50 mm for blue grama grass, radish, or tomato;
    • ≥ 60 mm for cucumber;
    • ≥ 80 mm for red fescue;
    • ≥ 100 mm for northern wheatgrass;
    • ≥ 150 mm for barley; or
    • ≥ 160 mm for durum wheat.
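The five validity criteria above can be collected into a lookup table and checked against the negative control results. The following is a minimal sketch: the table values are transcribed from the lists above, while the names `CONTROL_MINIMA` and `validity_failures` and the example inputs are hypothetical.

```python
# species: (min % emergence, min root length in mm, min shoot length in mm)
CONTROL_MINIMA = {
    "alfalfa": (70, 120, 40),
    "barley": (70, 170, 150),
    "blue grama grass": (70, 70, 50),
    "carrot": (60, 80, 45),
    "cucumber": (60, 120, 60),
    "durum wheat": (80, 200, 160),
    "lettuce": (70, 100, 20),
    "northern wheatgrass": (70, 110, 100),
    "radish": (90, 110, 50),
    "red clover": (70, 70, 30),
    "red fescue": (70, 70, 80),
    "tomato": (60, 40, 50),
}


def validity_failures(species, emergence_pct, survival_pct, anomaly_pct,
                      root_mm, shoot_mm):
    """Return the list of criteria not met by the negative control means."""
    min_emergence, min_root, min_shoot = CONTROL_MINIMA[species]
    failures = []
    if emergence_pct < min_emergence:
        failures.append(f"mean emergence < {min_emergence}%")
    if survival_pct < 90:
        failures.append("mean survival of emerged seedlings < 90%")
    if anomaly_pct > 10:
        failures.append("seedlings with phytotoxicity or anomalies > 10%")
    if root_mm < min_root:
        failures.append(f"mean root length < {min_root} mm")
    if shoot_mm < min_shoot:
        failures.append(f"mean shoot length < {min_shoot} mm")
    return failures


# Hypothetical control results for a radish test; an empty list means valid.
print(validity_failures("radish", 92, 95, 5, 115, 55))
```

Returning the list of unmet criteria, rather than a single pass or fail flag, makes it clear which condition invalidated a test.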

4.5 Hydration of Test Soils During the Test

Test soils are hydrated to “near-saturation” as needed, throughout the test. Hydrating to near-saturation means, in this instance, that water is added to the surface of the soil until ~0.5 cm of water is temporarily (≤ 1 h) visible pooling at the bottom of the test vessel following its addition. Hydration water, at 24 ± 3 °C, should be sprayed onto the surface of the soil using a fine-mist spray bottle on Day 0, just after the seed has been added to the test vessels, and again every 48 hours, or as needed, until the lids of the test vessels are removed (see Section 4.6). Thereafter, and if and as required, water should be added at least once every 24 hours to achieve near-saturation daily throughout the test.Footnote33 A weak nutrient solution [e.g., half-strength Hoagland’s nutrient solution (Hoagland and Arnon, 1950)] should be used to water each test vessel if it is determined that the test soil is too deficient in nutrients to sustain healthy plant growth in the negative control soil for the duration of the experiment.Footnote34
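The baseline watering frequency can be laid out as a list of test days. The sketch below is only a planning aid under stated assumptions: lids come off on Day 7 unless another day is given, no watering is scheduled on the final day when the test ends, and the function name is hypothetical. Additional watering "as needed" always takes precedence over any fixed schedule.

```python
def minimum_watering_days(test_days, lid_removal_day=7):
    """Return the test days (Day 0 = start) with a scheduled hydration."""
    if not 0 < lid_removal_day <= test_days:
        raise ValueError("lid removal day must fall within the test")
    # Every 48 h while the vessels are covered, starting on Day 0.
    covered = list(range(0, lid_removal_day, 2))
    # At least every 24 h once the lids are off (assumed: not on the end day).
    uncovered = list(range(lid_removal_day, test_days))
    return covered + uncovered


print(minimum_watering_days(14))   # 14-day species, lids removed on Day 7
```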

The location of the test vessels in the environmental chamber or the testing area should be randomly varied each time that water is added to test vessels, so that the test organisms within these vessels are randomly exposed to any slight variations in test conditions (i.e., lighting, temperature, humidity, or ventilation) that might exist in the testing area.
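One way to vary vessel locations randomly is to shuffle the vessel labels across the available positions at each watering. This is a minimal sketch using Python's standard `random` module; the function name, the label format, and the numbering of bench positions from 1 are hypothetical.

```python
import random


def randomize_positions(vessel_labels, seed=None):
    """Assign each vessel label to a randomly chosen bench position (1..n)."""
    rng = random.Random(seed)       # pass a seed to make the layout repeatable
    shuffled = list(vessel_labels)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return {position: label for position, label in enumerate(shuffled, start=1)}


# Hypothetical labels: treatment code plus replicate number.
labels = ["C-1", "C-2", "T1-1", "T1-2", "T2-1", "T2-2"]
print(randomize_positions(labels))
```

Recording the seed or the resulting layout on the data sheet keeps the placement traceable.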

4.6 Observations and Measurements During the Test

The biological endpoints for the test are seedling emergence, root and shoot length, and root and shoot dry mass at the end of the test (i.e., on Day 14 or Day 21, depending on the test species). Determining the number of seedlings emerged in each test vessel on Day 7 is also useful and frequently done (i.e., to determine the 7-day emergence rate for a single-concentration test, or the 7-day EC50 for a multi-concentration test), although such observations are optional. Depending on the study objectives, root and shoot wet mass might also be determined at the end of the test; however, these endpoints are also optional. Throughout the test, observations should be made and recorded of the number of emerged seedlings and the state or condition of the emerged plants, each time the test soils are hydrated (see Section 4.5).

Seedling emergence is measured visually by counting the number of seedlings that have emerged 3 mm above the soil surface in each test vessel. The lids must be removed from all of the test vessels for the remainder of the test on Day 7 or just before the seedlings reach a height that would contact the lid of the test vessel, whichever occurs first.
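Counts of emerged seedlings are converted to percent emergence per vessel and then summarized per treatment as a mean (± SD). The sketch below shows that arithmetic with Python's standard `statistics` module; the function name and the counts are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def percent_emergence(emerged_counts, seeds_per_vessel):
    """Percent emergence in each replicate vessel of one treatment."""
    for count in emerged_counts:
        if not 0 <= count <= seeds_per_vessel:
            raise ValueError("count must be between 0 and seeds planted")
    return [100.0 * count / seeds_per_vessel for count in emerged_counts]


# Hypothetical counts for six control vessels planted with 5 seeds each.
pct = percent_emergence([5, 4, 5, 4, 5, 5], seeds_per_vessel=5)
print(pct)
print(round(mean(pct), 1), round(stdev(pct), 1))  # mean and sample SD
```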

A visual assessment of the health and condition of the plants (e.g., phytotoxicity) in each test vessel should also be made and recorded when the plants first appear, as well as each time a test vessel is watered thereafter.Footnote35 Observations might include:

  • chlorosis (loss of pigment),
  • necrosis (localized dead tissue),
  • defoliation (loss of leaves),
  • desiccation (dried leaves or stems),
  • malformation (structural defects),
  • mottling (marked or spotted),
  • staining (discolouration),
  • wilting (limp),
  • withering (in the process of drying),
  • discoloured or deformed leaves or stem,
  • overt signs of delayed emergence, or
  • impaired development and/or growth.

The number of seedlings in the control test vessels that are alive at the end of the test should be counted, to determine whether the test validity criterion for percent survival of emerged plants in negative control soil has been met (see Section 4.4).

Air temperature in the test facility (Section 3.1) must be measured daily (e.g., using a maximum/minimum thermometer) or continuously (e.g., using a continuous chart recorder). The humidity should be measured periodically (Section 3.1).

The light fluence rate must be measured at least once during the test period at points approximately the same distance from the light source as the soil surface and at several locations in the test area (see Section 3.3).

In at least one replicate of each treatment (including the negative control soil and, if used, reference soil), the pH must be measured and recorded at the beginning and end of the test, and the moisture content must be measured and recorded at the beginning of the test only.Footnote36 The initial (Day 0) measurements should be made using subsamples of each batch of test soil used to set up replicates of a particular treatment (see Section 4.1).Footnote37 The final (i.e., Day 14 or Day 21) measurements should be made using subsamples of the replicates of each treatment to which plants were exposed, following the end-of-test observations of plant emergence, condition, and growth.

Soil pH should be measured using a calcium chloride (CaCl2) slurry method (modified from Hendershot et al., 1993).Footnote38 For these analyses, 4 g of hydrated soilFootnote39 is placed into a 30-mL glass beaker (~3 cm in diameter and ~7 cm high) with 20 mL of 0.01 M CaCl2.Footnote40 The suspension should be stirred intermittently for 30 min (e.g., once every 6 min). The slurry should then be left undisturbed for ~1 h. Thereafter, a pH probe is immersed into the supernatant and the pH recorded once the meter reading is constant.

The moisture content of each test soil is measured by placing a 3 - 5 g subsample of each test soil into a pre-weighed aluminum weighing pan, and measuring and recording the wet weight of the subsample. Each subsample should then be placed into a drying oven at 105 °C until a constant weight is achieved; this usually requires a minimum of 24 hours. The dry weight of each subsample should then be measured and recorded. Soil moisture content must be calculated (on a dry-weight basis) by expressing the moisture content as a percentage of the soil dry weight:

Moisture content (%) = ([wet weight (g) - dry weight (g)] ÷ dry weight (g)) × 100

It is important that the moisture content (%) calculation be based on dry weight (not wet weight), since the results of these calculations are used with calculations of water-holding capacity (also calculated based on dry weight) to express the optimal moisture content in test soils (see Section 5.3).
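The dry-weight calculation can be sketched as follows, including subtraction of the pre-weighed pan. The function name and the example weights are hypothetical.

```python
def moisture_content_pct(pan_g, pan_plus_wet_g, pan_plus_dry_g):
    """Soil moisture content (%) expressed on a dry-weight basis."""
    wet_g = pan_plus_wet_g - pan_g   # wet weight of the soil subsample
    dry_g = pan_plus_dry_g - pan_g   # weight after drying at 105 °C
    if dry_g <= 0 or wet_g < dry_g:
        raise ValueError("check weights: need wet >= dry > 0")
    return (wet_g - dry_g) / dry_g * 100.0


# Hypothetical subsample: 1.5 g pan, 5.0 g wet soil, 4.0 g after drying.
print(moisture_content_pct(1.5, 6.5, 5.5))
```

For this subsample, the 1.0 g of water lost is 25% of the 4.0 g dry weight; dividing by the 5.0 g wet weight instead would give 20%, which is why the basis must be stated and kept consistent with the WHC calculation.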

Depending on the nature of the test and the study design, concentrations of chemical(s) or chemical product(s) of concern might be measured for test soils or selected concentrations thereof, at the beginning and end of the test. For a test using a sample of field-collected site soil, the chemical(s) or chemical product(s) measured will depend on the contaminant(s) of concern (see Section 5.4). For a multi-concentration test with chemical-spiked soil, such measurements should be made for the high, medium, and low strengths tested, as a minimum (see Section 6.3). Aliquots for these analyses should be taken as described previously for pH and moisture content; analyses should be according to proven and recognized (e.g., SAH, 1992; Carter, 1993) analytical techniques.

4.7 Ending the Test

The test is terminated after 14 days of exposure for barley, cucumber, durum wheat, lettuce, radish, red clover, and tomato; and after 21 days of exposure for alfalfa, blue grama grass, carrot, northern wheatgrass, and red fescue. At that time, the number of live and apparently dead plants in each test vessel should be determined and recorded, and any abnormal patterns in morphology, growth, and development (i.e., relative to the plants in the negative control soil) also recorded. Photographs might be taken to visually record the concentration-response relationship in the above-ground phytomass. Even if no shoots are visible above the soil surface, the soil should be checked for root material in case roots developed from the seed but no shoot material was produced. These observations are for qualitative purposes only (i.e., for this test method, a seedling must emerge 3 mm above the soil surface to be considered “emerged”) and, if roots develop, where no shoot material was produced, it should be noted. Thereafter, each test vessel must be processed separately to keep the seedlings within each replicate isolated from those in each of the other replicate test vessels.

The plants must be carefully separated from the test soil and from the roots of the other plants. This can be achieved by gently loosening the soil and root matrix from the test vessel and removing all soil that can be easily removed without disturbing the root matrix. In some cases, roots can be more easily separated from the soil after the soil is first saturated with water and allowed to soak for several minutes. The remaining soil and plant mass are placed into a pan of water. The roots can then be held under a gentle stream of tap water, or they can be sprayed with water from a spray bottle, to gently dislodge as many of the remaining soil particles as possible. This also aids in separating the roots of the plants from each other. The plants are then placed onto a moistened, labelled sheet of paper towel, one for each test vessel, and covered with plastic to minimize water loss until measurements can be made and recorded. Measurements of shoot and root lengths are made from the transition point between the hypocotyl and the root to the longest leaf tip when the leaves are gently straightened, and to the tip of the longest root when the roots are gently straightened. Shoot and root length for each plant in each replicate are measured with a ruler, and recorded in millimetres.

The shoots and roots are then separated from each other at the point at which there is a discernible transition between root and shoot tissue, and from the seed itself, using a scalpel. The remaining seed is discarded. The shoot and root structures from each replicate test vessel are weighed separately, as two groups (i.e., shoots and roots). The entire rinsed shoot biomass from each test vessel must be transferred as a group to a damp paper towel or blotting paper. Thereafter, the shoots should be placed into a clean aluminum weighing pan (1 - 2.5 g) that has been previously numbered, weighed, and held in a desiccator.Footnote41 This process is repeated with the entire rinsed root biomass from each test vessel. If wet mass is being determined, the aluminum pans containing shoots and roots are weighed immediately with an analytical balance that measures consistently to 0.1 mg. The dry mass must be determined in a similar way, after the plants are dried in an oven at 90 °C until a constant weight is achieved (this usually takes a minimum of 24 h) (Aquaterra Environmental and ESG, 2000). Upon removal from the oven, the weighing pans are moved immediately to a desiccator. Once cooled, each weighing pan should be individually and randomly removed from the desiccator and weighed immediatelyFootnote42 to the nearest 0.1 mg on a balance that measures accurately to this limit. Mean dry weight per surviving plant is calculated for each replicate (see Section 4.8.3).
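The mean dry weight per surviving plant in a replicate follows from the pooled pan weighings. The sketch below assumes the calculation is simply the pooled dry mass divided by the number of surviving plants; the function name and the weights are hypothetical, and Section 4.8.3 remains the authority for the calculation.

```python
def mean_dry_mass_mg(pan_mg, pan_plus_dry_mg, n_surviving):
    """Mean dry mass (mg) per surviving plant in one replicate vessel."""
    if n_surviving <= 0:
        raise ValueError("no surviving plants in this replicate")
    pooled_mg = pan_plus_dry_mg - pan_mg   # shoots (or roots) weighed as a group
    return pooled_mg / n_surviving


# Hypothetical replicate: 1500.0 mg pan, 1520.0 mg with dried shoots, 5 plants.
print(mean_dry_mass_mg(1500.0, 1520.0, 5))
```

The same function applies to the root pan for that replicate.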

Although it is the intention of Environment Canada to use mean shoot dry weight and mean root dry weight as additional test validity criteria for definitive tests, there are insufficient data at this time on which to base minimum weight requirements for control plants. It is recommended, however, that for definitive tests:

  • The mean shoot dry weight per surviving plant, for individual plant species grown in negative control soil for the duration of the test be:
    • ≥ 1.0 mg for red fescue;
    • ≥ 1.5 mg for blue grama grass;
    • ≥ 2.0 mg for carrot;
    • ≥ 2.5 mg for lettuce;
    • ≥ 4.0 mg for red clover;
    • ≥ 5.0 mg for tomato;
    • ≥ 7.0 mg for northern wheatgrass;
    • ≥ 8.0 mg for alfalfa;
    • ≥ 20 mg for radish;
    • ≥ 25 mg for durum wheat;
    • ≥ 35 mg for barley; or
    • ≥ 40 mg for cucumber, and
  • The mean root dry weight per surviving plant, for individual plant species grown in negative control soil for the duration of the test be:
    • ≥ 0.2 mg for tomato;
    • ≥ 0.5 mg for blue grama grass, carrot, or red fescue;
    • ≥ 1.0 mg for lettuce or red clover;
    • ≥ 3.0 mg for northern wheatgrass or radish;
    • ≥ 4.0 mg for alfalfa;
    • ≥ 7.0 mg for cucumber; or
    • ≥ 25.0 mg for barley or durum wheat.
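
The recommended minimum control dry weights listed above lend themselves to a simple lookup. The following Python sketch is illustrative only; the dictionary and function names are assumptions made for this example and are not part of the test method.

```python
# Recommended minimum mean dry weights (mg per surviving plant) for plants
# grown in negative control soil, transcribed from the list above.
# All names below are hypothetical and chosen for illustration.
MIN_SHOOT_DRY_MG = {
    "red fescue": 1.0, "blue grama grass": 1.5, "carrot": 2.0,
    "lettuce": 2.5, "red clover": 4.0, "tomato": 5.0,
    "northern wheatgrass": 7.0, "alfalfa": 8.0, "radish": 20.0,
    "durum wheat": 25.0, "barley": 35.0, "cucumber": 40.0,
}
MIN_ROOT_DRY_MG = {
    "tomato": 0.2, "blue grama grass": 0.5, "carrot": 0.5,
    "red fescue": 0.5, "lettuce": 1.0, "red clover": 1.0,
    "northern wheatgrass": 3.0, "radish": 3.0, "alfalfa": 4.0,
    "cucumber": 7.0, "barley": 25.0, "durum wheat": 25.0,
}


def meets_recommended_control_weights(species, mean_shoot_mg, mean_root_mg):
    """Return True if both control means reach the recommended minima."""
    return (mean_shoot_mg >= MIN_SHOOT_DRY_MG[species]
            and mean_root_mg >= MIN_ROOT_DRY_MG[species])
```

Because these values are recommendations rather than validity criteria, a result of False would flag a test for closer scrutiny rather than invalidate it.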

During the series of dry-weight determinations for the groups of plants from a test, the first weighing pan should be returned to the desiccator, and weighed again at the end of all weighings. This serves as a check on any sequential gain of water by the weighing pans in the desiccator over time, which can occur when each weighing pan is removed for its weight determination. The change in weight of the first weighing pan over time should not be >5%; if it is, all weighing pans should be re-dried for ≥ 2 h and then re-weighed.
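
The desiccator check described above reduces to a single comparison. The sketch below assumes that the percent change is expressed relative to the first weighing of the check pan, which the text does not state explicitly; the function names are hypothetical.

```python
def pan_drift_percent(first_weight_mg, final_weight_mg):
    """Percent change in the check pan's weight between its first and last weighing.

    Assumption: the change is expressed relative to the first weighing.
    """
    return abs(final_weight_mg - first_weight_mg) / first_weight_mg * 100.0


def needs_redrying(first_weight_mg, final_weight_mg, limit_percent=5.0):
    """True if the change exceeds the limit, so all pans should be re-dried for >= 2 h."""
    return pan_drift_percent(first_weight_mg, final_weight_mg) > limit_percent
```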

Following the removal of plants from each test vessel, subsamples of each test soil (including the negative control soil and, if included in the test, reference soil) should be taken for pH determination (Section 4.6). Analyses for other chemical constituents (i.e., concentrations of contaminants) should also be made at this time using representative subsamples of each test soil (Section 4.6).Footnote43

4.8 Test Endpoints and Calculations

The percent emergence in each test vessel at the end of the test (Day 14 for barley, cucumber, durum wheat, lettuce, radish, red clover, and tomato; and Day 21 for alfalfa, blue grama grass, carrot, northern wheatgrass, and red fescue) must be calculated and reported for each test. The mean (± SD) percent emergence for all replicate groups of plants exposed to each treatment for 14 or 21 days must also be calculated and reported. Any optional observations of emergence taken on Day 7 (see Section 4.6) should also be calculated and reported as percent emergence in each test vessel, as well as mean (± SD) percent emergence for each treatment.

For a single-concentration test (see Section 4.1), the mean (± SD) value for the percent emergence of plants at test end, as determined for each treatment, is compared with that for the sample(s) of reference soil or, as necessary and appropriate, compared with that for the negative control soil (see Section 5.5). For a multi-concentration test (see Sections 4.1, 5.3, and 6.2), the 14-day or 21-day EC50 for emergence must be calculated and reported (data permitting).Footnote44 If 7-day observations of percent emergence in each concentration were made during a multi-concentration test, it is recommended that the 7-day EC50 for emergence also be calculated and reported (data permitting). Environment Canada’s guidance document on statistical methods for estimating endpoints of toxicity tests (EC, 2004a) provides definitive direction and advice for calculating EC50s, which should be followed (see Section 4.8.2, herein).

The growth endpoints for this test are based on shoot and root length, as well as shoot and root dry weight, of surviving plants in each replicate and each treatment as measured at the end of the 14- or 21-day test period. Shoot and root wet weight are additional (optional, but recommended) endpoints. A significant reduction in the length or weight of the plants is considered indicative of an adverse toxic effect of the treatment on the growth of test plants. For a single-concentration test (see Sections 5.3 and 6.2), the mean (± SD) values for shoot and root length, and shoot and root dry weight, of plants surviving in the test soil at test end are determined and compared to those values for the sample(s) of reference soil or, as necessary and appropriate, compared to those values for the negative control soil. A Student’s t-test or other appropriate statistic (EC, 2004a) should be used for this comparison. For a multi-concentration test (see Sections 5.3 and 6.2), the 14- or 21-day ICp for growth inhibition represented by each endpoint measurement (i.e., decreased mean length of individual plant shoots and roots, and decreased mean dry weights of individual plant shoots and roots) must be calculated and reported (data permitting).Footnote45

Environment Canada (2004a) provides direction and advice for calculating ICps, which should be followed; Section 4.8.3 (including Appendix I) gives further guidance in this regard. Initially, regression techniques (see Section 4.8.3.1) must be applied to multi-concentration data intended for calculation of an ICp.Footnote46 In the event that the data do not lend themselves to calculating the 14- or 21-day ICps for the growth inhibition using the appropriate regression analysis (see Appendix I), linear interpolation of these data using the program ICPIN should be applied in an attempt to derive an ICp (see Section 4.8.3.2).

4.8.1 Percent Emergence

The mean and standard deviation of seedling emergence are calculated for each test concentration. The percent effect is then calculated for each treatment using the following formula:

Percent effect = (mean treatment emergence - mean control emergence) × 100 ÷ mean control emergence

The percent effect is then plotted against test concentration in a histogram, with the median line representing the control response, or the 0% effect. All histogram bars above the median line (+ve percent effect) indicate that there is an enhanced emergence relative to that in the control, and histogram bars below the median line (-ve percent effect) indicate that there is an inhibition of emergence relative to that in the control. The magnitude and consistency of the percent effect among treatments indicate whether or not there is a concentration-response relationship. If there is an obvious, visible adverse effect in an exposure-dependent manner (i.e., there is a visual concentration-response relationship), the 14-day or 21-day EC50 must be calculated and reported (data permitting).
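
As a worked illustration of the percent effect formula, the following Python sketch uses hypothetical replicate values for percent emergence; the variable and function names are assumptions made for this example.

```python
from statistics import mean, stdev


def percent_effect(mean_treatment, mean_control):
    """Percent effect relative to the control (negative values indicate inhibition)."""
    return (mean_treatment - mean_control) * 100.0 / mean_control


# Hypothetical percent emergence in three replicate test vessels.
control = [90.0, 100.0, 80.0]
treatment = [50.0, 40.0, 45.0]

treatment_mean = mean(treatment)   # mean emergence for the treatment
treatment_sd = stdev(treatment)    # sample standard deviation for the treatment
effect = percent_effect(treatment_mean, mean(control))
```

With these values the treatment mean is half the control mean, so the percent effect is negative and would appear as a bar below the 0% line.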

4.8.2 EC50

When a multi-concentration test with soil mixtures is conducted (Section 6.2), the quantal seedling emergence data for a specific period of exposure must be used to calculate (data permitting) the appropriate median effect concentration (EC50) for inhibition of percent emergence, together with its 95% confidence limits. For barley, cucumber, durum wheat, lettuce, radish, red clover, and tomato, a multi-concentration test must determine the 14-day EC50 for inhibition of percent emergence (at test end); and for alfalfa, blue grama grass, carrot, northern wheatgrass, and red fescue, the 21-day EC50 must be determined (at test end). The seven-day EC50 (i.e., that based on emergence data collected on Day 7 of the test) for inhibition of percent emergence might also be determined and reported, data permitting (see Section 4.6).

To estimate an EC50, emergence data at the specified period of exposure are combined for all replicates at each concentration (including the replicate control groups). If inhibition of emergence is not ≥ 50% in at least one concentration, the EC50 cannot be estimated. If there is complete emergence at a specific concentration, that information is used as a 0% effect of emergence. However, if successive concentrations yield a series of 100% emergence, only the highest concentration of the series should be used in estimating the EC50 (i.e., the zero-effect that is “closest to the middle” of the distribution of data). Similarly, if there was a series of successive complete inhibition of emergence at the high concentrations in the test, only one value of 100% effect would be used, i.e., the one at the lowest concentration. Use of only one 0% and one 100% effect applies to any form of statistical analysis and to plotting on a graph.
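
The pooling of replicates and the use of only one 0% and one 100% effect can be sketched as follows. This is a minimal illustration, assuming that the effect is expressed as the percentage of seeds planted that failed to emerge (with no adjustment for control emergence); the function names are hypothetical.

```python
def pooled_percent_inhibition(replicates):
    """Pool replicates for one concentration.

    replicates: list of (seeds_emerged, seeds_planted) tuples.
    Assumption: effect = percent of planted seeds that did not emerge.
    """
    emerged = sum(e for e, _ in replicates)
    planted = sum(n for _, n in replicates)
    return 100.0 * (planted - emerged) / planted


def trim_quantal(concs, effects):
    """Keep only one 0% effect (highest such concentration of the low-end run)
    and one 100% effect (lowest such concentration of the high-end run)."""
    pairs = sorted(zip(concs, effects))
    start = 0
    while start + 1 < len(pairs) and pairs[start][1] == 0 and pairs[start + 1][1] == 0:
        start += 1
    end = len(pairs) - 1
    while end - 1 >= start and pairs[end][1] == 100 and pairs[end - 1][1] == 100:
        end -= 1
    return pairs[start:end + 1]
```

For example, effects of 0, 0, 20, 40, 90, 100, and 100% at seven increasing concentrations would be reduced to the five middle values before analysis or plotting.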

The guidance provided by Environment Canada (2004a) on choosing statistical test methods to be applied to quantal (e.g., EC50) data should be consulted when choosing the statistical test to be applied to such data for toxicity tests using plants. Probit and/or logit regressions are the “preferred” methods (EC, 2004a), provided that two or more concentrations showing partial effects are included in the data. The probit analysis also gives the slope of the line, which should be reported. If probit or logit do not work because of only one partial effect, use the Spearman-Kärber method with no trim. If no partial effect is evident, use the binomial method. The binomial estimate might differ somewhat from the others, and this estimate should only be used as a last resort. Formal confidence limits are not estimated using the binomial method; instead, outer limits of a range are provided, within which the EC50 and its true confidence limits would lie.
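
The choice among methods described above depends on the number of concentrations showing a partial effect. The following sketch encodes only that decision rule; it does not perform any of the analyses, and the function name is an assumption made for this example.

```python
def choose_quantal_method(effects_percent):
    """Suggest an EC50 method from the number of partial effects (0% < effect < 100%)."""
    partial = sum(1 for e in effects_percent if 0 < e < 100)
    if partial >= 2:
        return "probit or logit regression"
    if partial == 1:
        return "Spearman-Kärber, no trim"
    return "binomial (last resort)"
```

For the hypothetical data of Figure 2 (0, 20, 40, 90, and 100% inhibition), three partial effects are present, so probit or logit regression would be indicated.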

Various computer programs may be used to calculate the EC50. Stephan (1977) developed a program to estimate EC50s using probit, moving average, and binomial methods, and adapted it for the IBM-compatible personal computer. This program, which was modified in 1989 to include estimates using the Spearman-Kärber method with no “trimming” (i.e., with no deletion of data from the calculations), is available on disketteFootnote47 from Environment Canada (address in Appendix B), and its use is recommended. Other satisfactory computer and manual methods may be used (e.g., SAS, 1988 or version 3.5 of TOXSTAT, 1996; see EC, 2004a for additional information). Programs using the trimmed Spearman-Kärber method are available for personal computers; however, this method (with trimming) should be applied cautiously to EC50 estimates according to EC (2004a), because divergent results might be obtained by operators who are unfamiliar with the implications of trimming ends of the concentration-response data. However, there are situations where application of the trimmed Spearman-Kärber method is warranted (see EC, 2004a for guidance).

Any computer-derived EC50 should be checked by examining a plot, on logarithmic-probability scales, of percent emergence at a defined period of exposure for the various test concentrations (EC, 2004a). Any major disparity between the estimated EC50 derived from this plot and the computer-derived EC50 must be resolved. A hand-plotted graph is recommended for this check (EC, 2004a). A computer-generated plot (e.g., SigmaPlot™; Version 8.0.2 or later)Footnote48 could be used if it were based on logarithmic-probability scales. If there has been an error in entering the data, however, a computer-generated plot would contain the same error as the mathematical analysis, and so the investigator should carefully check for correct placement of points (EC, 2004a).

A manual plot of emergence (mortality)/concentration data to derive an estimated EC50 is illustrated in Figure 2. This (hypothetical) figure is based on test concentrations of 1.8, 3.2, 5.6, 10, and 18 mg chemical/kg soil (dry-weight basis) causing emergence inhibition of 0, 20, 40, 90, and 100% of seedlings exposed to the respective concentrations for a specified period of time. The concentration expected to inhibit the emergence of 50% of the seedlings can be read by following across from 50% (broken line) to the intersection with the best-fit line, then down to the horizontal axis for an estimated EC50 (5.6 mg/kg, dry wt).

In fitting a line such as that in Figure 2, more emphasis should be assigned to points that are near 50% inhibition of emergence. Logarithmic-probability paper (log-probit, as in Figure 2) can be purchased in good technical bookstores, ordered through them, or photocopied (see blank graph in EC, 2004a).
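
The graphical estimate can be approximated numerically as a rough cross-check. The sketch below fits an unweighted least-squares line to the partial effects on logarithmic-probability scales and reads off the concentration at 50% inhibition. It is not a formal probit analysis: it gives equal weight to all points rather than emphasizing those near 50%, and it yields no confidence limits. The function name is an assumption made for this example.

```python
import math
from statistics import NormalDist


def log_probit_ec50(concs, effects_percent):
    """Estimate the EC50 from a straight line on log-probability scales.

    Only partial effects (0% < effect < 100%) can be placed on a probability
    scale, so the 0% and 100% points are left out of the fit.
    """
    points = [(math.log10(c), NormalDist().inv_cdf(e / 100.0))
              for c, e in zip(concs, effects_percent) if 0 < e < 100]
    if len(points) < 2:
        raise ValueError("at least two partial effects are needed to fit a line")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # 50% inhibition corresponds to a normal deviate of zero.
    return 10 ** (-intercept / slope)


# Hypothetical data described for Figure 2 (mg chemical/kg soil, dry weight).
ec50 = log_probit_ec50([1.8, 3.2, 5.6, 10, 18], [0, 20, 40, 90, 100])
```

For these data the fitted line crosses 50% inhibition between 5 and 6 mg/kg, in line with the graphical estimate.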

Figure 2: Estimating a median effective concentration by plotting emergence on logarithmic-probability paper

Description

This figure is a plot of percent inhibition of emergence vs concentration - both axes are log-scale. Four data points are plotted with a straight line of best fit drawn through them. Dotted lines are used to illustrate how one can read across from 50% inhibition to the intersection point with the line of best fit and then down to the concentration axis to obtain the corresponding value.

For the regular set of data in Figure 2, computer programs gave very similar estimates to the graphic one. Some of the computed EC50s (and 95% confidence limits) were:

Stephan (1977) method:

  • probit: 5.58 (4.24 and 7.37)
  • moving average: 5.58 (4.24 and 7.33)
  • binomial: 6.22 (between 1.8 and 10)

SAS (1988) probit analysis: 5.58 (4.26 and 7.40)

TOXSTAT (1996) method (version 3.5)

  • probit: 5.58 (4.38 and 7.12)
  • Spearman-Kärber, zero trim: 4.64 (4.40 and 7.23)
  • logit: 5.63 (4.39 and 7.22)

Table 4.2 in EC (2004a) provides additional examples of computed data for acute quantal tests using various computer programs.

4.8.3 ICp

When a multi-concentration test for effects of exposure of terrestrial plants to spiked soil mixtures is conducted, the quantitative continuous data representing growth inhibition (i.e., shoot and root length, and shoot and root dry mass) must be used to calculate an ICp (inhibiting concentration for a specified percent effect) for each of these four endpoints, data permitting (see introductory paragraphs of Sections 4.8 and 6.2). The ICp is a quantitative estimate of either:

  1. the concentration causing a fixed percent reduction in the mean length of individual plant shoots at test end;
  2. the concentration causing a fixed percent reduction in the mean length of individual plant roots at test end;
  3. the concentration causing a fixed percent reduction in the mean dry weight of individual plant shoots at test end; or
  4. the concentration causing a fixed percent reduction in the mean dry weight of individual plant roots at test end.

The ICp is calculated as a specified percent reduction for each endpoint (e.g., the IC25 and/or IC20, for a 25% and/or 20% reduction, respectively). The desired value of p is selected by the investigator, and 25% or 20% is currently favoured. Any ICp that is calculated and reported must include the 95% confidence limits.

In the analyses of growth, the length and weight measurements of individual shoots or roots in each replicate (test vessel) are pooled for each of these measurements, and the means of these lengths and weights are used in the analyses. For length measurements, the mean length of individual shoots (or roots) in each replicate is calculated. For dry weight measurements, the mean weight of individual shoots (or roots) in each replicate is calculated as the total dry weight of all of the plant shoots (or roots) that survived in the test vessel divided by the number of plants that survived in that vessel to the end of the test.Footnote49Footnote50

The mean lengths and weights from all the replicates of a given treatment (concentration) are used to calculate the average for the treatment; this is the average individual shoot and root length and shoot and root dry weight of surviving plants per concentration. These data are compared to the average individual lengths of shoots and roots and the average individual weights of shoots and roots in the negative control, obtained by the same procedure. If there are no emerged plants in a replicate (test vessel), that replicate does not contribute to the average for the treatment. If there are no emerged plants in all replicates at a given concentration, that concentration does not have an average length or weight of emerged plants and cannot be used in the analysis for comparison with the average length or weight in the negative control.
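
The averaging steps described above can be sketched as follows. The function names are hypothetical; a replicate with no surviving plants is represented here by None so that it does not contribute to the treatment average.

```python
from statistics import mean


def replicate_mean_dry_weight(total_dry_weight_mg, n_surviving):
    """Mean dry weight per surviving plant in one test vessel, or None if none survived."""
    if n_surviving == 0:
        return None
    return total_dry_weight_mg / n_surviving


def treatment_average(replicate_means):
    """Average of replicate means for one concentration, ignoring empty replicates.

    Returns None when no replicate had emerged plants, in which case the
    concentration cannot be compared with the negative control.
    """
    usable = [m for m in replicate_means if m is not None]
    return mean(usable) if usable else None
```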

As indicated in the introductory paragraphs of Section 4.8, separate ICps for individual shoot and root length and shoot and root dry mass must be calculated and reported (data permitting) upon completion of the test. These calculations must be made using the appropriate linear or nonlinear regression analyses (see Section 4.8.3.1). If, however, regression analyses fail to provide meaningful ICps for shoot/root length or shoot/root dry weight, the ICPIN analyses described in Section 4.8.3.2 should be applied to the corresponding data.

4.8.3.1 Use of regression analysis.

Upon completion of a definitive 14- or 21-day multi-concentration test, separate ICps (including their respective 95% confidence limits) for the individual mean lengths and dry weights of shoots and roots must be calculated using linear and/or nonlinear regression procedures. These values may be calculated using a series of linear and nonlinear regression models (data permitting) proposed by Stephenson et al. (2000b) that have been re-parameterized, based on techniques applied by van Ewijk and Hoekstra (1993), to automatically generate the ICp and its 95% confidence limits for any value of ‘p’ (e.g., IC25 or IC50). The proposed models for application consist of one linear model, and the following four nonlinear regression models: exponential, Gompertz, logistic, and logistic adjusted to accommodate hormesisFootnote51. Further guidance on the use of these linear and nonlinear regression models for calculating ICps is provided by Stephenson (2003a) and Stephenson et al. (2000b). The reader is also strongly advised to consult EC (2004a) for additional guidance on the general application of linear and non-linear regression for the analysis of quantitative toxicity data. Instruction for the appropriate application of linear and non-linear regression, using Version 11.0 of the statistical program SYSTATFootnote52, is provided in Appendix I. However, any statistical software capable of linear and nonlinear regression may be used when calculating the respective ICps and their associated 95% confidence limits. Appendix I provides instruction on the use of regression models to derive the most appropriate ICps for reduced plant growth, assessed using shoot and root length and dry weight metrics.

The five models recommended for application follow. Further information on these specific models is presented in Appendix I.

Exponential model:

Y = a × (1 - p)^(C ÷ ICp)

where:
Y = root or shoot length or dry mass
a = the y-intercept (i.e., the control response)
p = desired value for ‘p’ (e.g., 0.25 for a 25% inhibition)
C = the test concentration as a logarithm
ICp = the ICp for the data set

Gompertz model:

Y = t × exp[log(1 - p) × (C ÷ ICp)^b]

where:
Y = root or shoot length or dry mass
t = the y-intercept (i.e., the control response)
exp = the exponent of the base of the natural logarithm
p = desired value for ‘p’ (e.g., 0.25 for a 25% inhibition)
C = the test concentration as a logarithm
ICp = the ICp for the data set
b = a scale parameter (estimated to be between 1 and 4) that defines the shape of the equation

Hormesis model:

Y = t × [1 + (h × C)] ÷ {1 + [(p + (h × C)) ÷ (1 - p)] × (C ÷ ICp)^b}

where:
Y = root or shoot length or dry mass
t = the y-intercept (i.e., the control response)
h = describes the hormetic effect (estimated to be small, usually between 0.1 and 1)
C = the test concentration as a logarithm
p = desired value for ‘p’ (e.g., 0.25 for a 25% inhibition)
ICp = the ICp for the data set
b = a scale parameter (estimated to be between 1 and 4) that defines the shape of the equation

Linear model:

Y = [(-b × p) ÷ ICp] × C + b

where:
Y = root or shoot length or dry mass
b = the y-intercept (i.e., the control response)
p = desired value for ‘p’ (e.g., 0.25 for a 25% inhibition)
ICp = the ICp for the data set
C = the test concentration as a logarithm

Logistic model:

Y = t ÷ {1 + [p ÷ (1 - p)] × (C ÷ ICp)^b}

where:
Y = root or shoot length or dry mass
t = the y-intercept (i.e., the control response)
p = desired value for ‘p’ (e.g., 0.25 for a 25% inhibition)
C = the test concentration as a logarithm
ICp = the ICp for the data set
b = a scale parameter (estimated to be between 1 and 4) that defines the shape of the equation
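
The five models can be written as plain Python functions, as in the sketch below. The sketch reads each model with (C ÷ ICp) as the exponent of (1 - p) in the exponential model and with b as the exponent of (C ÷ ICp) in the Gompertz, hormesis, and logistic models, and it takes "log" in the Gompertz model to be the natural logarithm. Under that reading every model returns (1 - p) × the control response when C equals ICp, which is a useful check on the re-parameterization. The function names are assumptions, and the functions only evaluate the models; fitting them to data requires regression software.

```python
import math

# C and ICp must be on the same (logarithmic) concentration scale, and
# C / ICp is assumed to be non-negative so that raising it to b is defined.


def exponential(C, a, p, ICp):
    return a * (1 - p) ** (C / ICp)


def gompertz(C, t, p, ICp, b):
    return t * math.exp(math.log(1 - p) * (C / ICp) ** b)


def hormesis(C, t, h, p, ICp, b):
    return t * (1 + h * C) / (1 + ((p + h * C) / (1 - p)) * (C / ICp) ** b)


def linear(C, b, p, ICp):
    return (-b * p / ICp) * C + b


def logistic(C, t, p, ICp, b):
    return t / (1 + (p / (1 - p)) * (C / ICp) ** b)
```

For example, with a control response of 40 mm and p = 0.25, each function evaluates to 30 mm at C = ICp.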

The general process for the statistical analysis and selection of the most appropriate regression model (linear or non-linear) for quantitative toxicity data is outlined in Figure 3. The selection process begins with an examination of a scatter plot or line graph of the test data to determine the shape of the concentration-response curve. The shape of the curve is then compared to available models so that one or more appropriate model(s) that best suits the data is (are) selected for further examination (see Figure I.1, Appendix I, for an example of five potential models).

Figure 3: The general process for the statistical analysis and selection of the most appropriate model for quantitative toxicity data

Description

This figure is a flow chart that can be used to select an appropriate statistical model for a particular dataset. The general process is to plot the data and attempt to fit potential models to them. Each fitted model is then assessed to determine whether its residuals meet the assumptions of normality and homoscedasticity. The tree branches at this point to account for data sets that are not normally distributed or homoscedastic, and proceeds downwards if both assumptions are met. A check for outliers is then made. If outliers are present, the investigator is directed to repeat the entire process with the outliers removed, to determine whether they should be excluded from the data set. Once the check for outliers is complete, the smallest residual mean square error is used as the final model selection criterion.

June 2007 Amendments to Environment Canada's Biological Test Method EPS 1/RM/45

Once the appropriate model(s) is (are) selected for further consideration, assumptions of normality and homoscedasticity of the residuals are assessed. If the regression procedure for one or more of the examined models meets the assumptions, the data (and regression) are examined for the presence of outliers. If an outlier has been observed, the test records and experimental conditions should be scrutinized for human error. If there are one or more outliers present, the analysis should be performed with and without the outlier(s), and the results of the analyses compared to examine the effect of the outlier(s) on the regression. Thereafter, a decision must be made as to whether the outlier(s) should be removed from the final analysis. The decision should take into consideration natural biological variation, and biological reasons that might have caused the apparent anomaly. Additional guidance on the presence of outliers and unusual observations is provided in Appendix I (Section I.2.4), as well as in EC (2004a). If there are no outliers present or none are removed from the final analysis, the model that demonstrates the smallest residual mean square error is selected as the model of best choice. Additional guidance from a statistician familiar with dealing with outlier data is also advised.
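
The final selection step, choosing the model with the smallest residual mean square error, can be sketched as follows. The sketch assumes that the residual mean square is the residual sum of squares divided by the number of observations minus the number of fitted parameters; the function names and the structure of the `fits` argument are hypothetical.

```python
def residual_mean_square(observed, predicted, n_parameters):
    """Residual sum of squares divided by residual degrees of freedom (n - k)."""
    sse = sum((o - f) ** 2 for o, f in zip(observed, predicted))
    return sse / (len(observed) - n_parameters)


def best_model(observed, fits):
    """Pick the model name with the smallest residual mean square.

    fits: {model_name: (predicted_values, n_parameters)} for the candidate
    models that already met the normality and homoscedasticity assumptions.
    """
    return min(fits, key=lambda name: residual_mean_square(observed, *fits[name]))
```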

Normality should be assessed using the Shapiro-Wilk’s test as described in EC (2004a). A normal probability plot of the residuals may also be used during the regression procedure, but is not recommended as a stand-alone test for normality as the detection of a ‘normal’ or ‘non-normal’ distribution depends on the subjective assessment of the user. If the data are not normally distributed, then the user is advised to try another model, consult a statistician for further guidance on model selection, or to perform the less-desirable linear interpolation (using ICPIN, see Section 4.8.3.2) method of analysis.

Homoscedasticity of the residuals should be assessed using Levene’s test as described in EC (2004a), and by examining the graphs of the residuals against the actual and predicted (estimated) values. Levene’s test provides a definite indication of whether the data are homogeneous (e.g., as in Figure I.2A of Appendix I) or not. If the data (as indicated by Levene’s test) are heteroscedastic (i.e., not homogeneous), then the graphs of the residuals should be examined. If there is a significant change in the variance and the graphs of the residuals produce a distinct fan or ‘V’ pattern (refer to Figure I.2B, Appendix I for an example), then the data analysis should be repeated using weighted regression. Before choosing the weighted regression, the standard error of the ICp is compared to that derived from the unweighted regression. If there is a difference of greater than 10% between the two standard errorsFootnote53, then the weighted regression is selected as the regression of best choice. However, if there is less than a 10% difference in the standard error between the weighted and unweighted regressions, then the user should consult a statistician for the application of additional models, given the test data, or the data could be re-analyzed using the less-desirable linear interpolation (using ICPIN, see Section 4.8.3.2) method of analysis. This comparison between weighted and unweighted regression is completed for each of the selected models while proceeding through the process of final model selection (i.e., model and regression of best choice). Some non-divergent patterns might be indicative of an inappropriate or incorrect model (refer to Figure I.2C, Appendix I, for an example), and the user is again urged to consult a statistician for further guidance on the application of additional models.

4.8.3.2 Linear interpolation using ICPIN.

If regression analyses of the endpoint data (see Section 4.8.3.1) fail to provide an acceptable ICp for growth, linear interpolation using the computer program called ICPIN should be applied. This program (Norberg-King, 1993; USEPA, 1994b, 1995) is not proprietary, is available from the USEPA, and is included in most computer software for environmental toxicology, including TOXSTAT. The original instructions for ICPIN from USEPA are clearly written and make the program easy to use (Norberg-King, 1993).Footnote54 An earlier version was called BOOTSTRP.

Analysis by ICPIN does not require equal numbers of replicates in different concentrations. The ICp is estimated by smoothing of the data as necessary, then using the two data-points adjacent to the selected ICp (USEPA, 1994b, Appendix L; USEPA, 1995, Appendix L). The ICp cannot be calculated unless there are test concentrations both lower and higher than the ICp; both those concentrations should have an effect reasonably close to the selected value of p, preferably within 20% of it. At present, the computer program does not use a logarithmic scale of concentration, and so Canadian users of the program must enter the concentrations as logarithms. Some commercial computer packages have the logarithmic transformation as a general option, but investigators should make sure that it is actually retained when proceeding to ICPIN. ICPIN estimates confidence limits by a special “bootstrap” technique since usual methods would not be valid. Bootstrapping performs many resamplings from the original measurements. The investigator must specify the number of resamplings, which can range from 80 to 1000. At least 400 is recommended here, and 1000 would be beneficial.Footnote55
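
The smoothing and interpolation steps described above can be illustrated with the following sketch. It is not the ICPIN program: it is a simplified illustration that pools adjacent treatment means so that they never increase with concentration, then interpolates on a log10 concentration scale between the two concentrations that bracket the target response. Bootstrap confidence limits are omitted, and the function names are assumptions made for this example.

```python
import math


def smooth_means(means):
    """Pool adjacent means so the sequence never increases with concentration."""
    blocks = [[m, 1] for m in means]  # [block mean, number of pooled means]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i + 1][0] > blocks[i][0]:
            total = blocks[i][0] * blocks[i][1] + blocks[i + 1][0] * blocks[i + 1][1]
            size = blocks[i][1] + blocks[i + 1][1]
            blocks[i:i + 2] = [[total / size, size]]
            i = max(i - 1, 0)  # re-check against the previous block
        else:
            i += 1
    smoothed = []
    for block_mean, size in blocks:
        smoothed.extend([block_mean] * size)
    return smoothed


def icp_by_interpolation(concs, means, p):
    """Interpolate the ICp on a log10 concentration scale.

    concs[0] is the control (concentration 0); the remaining concentrations
    are ascending and > 0. Returns None when the target response is not
    bracketed by two non-zero test concentrations.
    """
    smoothed = smooth_means(means)
    target = smoothed[0] * (1 - p)
    for j in range(1, len(smoothed) - 1):
        upper, lower = smoothed[j], smoothed[j + 1]
        if upper > lower and upper >= target >= lower:
            fraction = (upper - target) / (upper - lower)
            log_lo, log_hi = math.log10(concs[j]), math.log10(concs[j + 1])
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    return None
```

Returning None when the target lies between the control and the lowest test concentration reflects the requirement that there be test concentrations both lower and higher than the ICp.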

If there are several adjacent high concentrations with no emerged plants, only the lowest of that string of concentrations should be used in the analysis (i.e., the concentration closest to the middle of the series of concentrations used in the test). Normally, there is no particular benefit to including the additional concentrations, since they offer nothing to the analysis (i.e., the data consist only of zero mean weights and zero mean lengths).

Besides determining and reporting the computer-derived ICps for length and weight of individual plants at test end, separate graphs of percent reduction for each of shoot and root lengths and shoot and root dry weights should be plotted against the logarithm of concentration, to check the mathematical estimations and to provide visual assessments of the nature of the data (EC, 2004a).

If the ICPIN program is used when there is a hormetic effect, an inherent smoothing procedure could change the control value and bias the estimate of ICp. Accordingly, before statistical analysis, hormetic values at low concentration(s) should be arbitrarily replaced by the control value. This is considered a temporary expedient until a superior approach is established (EC, 2004a). The correction is applied for any test concentration in which the average effect (i.e., the geometric average of the replicate means) is higher (“better”) than the average for the control. To apply this correction, replace the observed mean weights (or mean lengths) of the replicates in the hormetic concentration(s), with the means of replicates in the control. The geometric average for that/those concentration(s) will then be the same as that for the control.

4.9 Tests with a Reference Toxicant

Table 13 of Appendix E summarizes the guidance for performing reference toxicity tests given in other documents describing procedures and conditions for conducting emergence-and-growth tests in soil using plants. Described herein are the procedures and conditions to be followed when performing reference toxicity tests in conjunction with a 14-day or 21-day test of emergence and growth using plants. The procedures herein also apply to tests for assessing the acceptability and suitability of batches of seed for use in soil toxicity tests; and should be applied to assess intralaboratory precision when a laboratory is inexperienced with the biological test method defined in this document and is initially setting up to perform it (see Sections 2.5 and 3.2.1).

The routine use of a reference toxicant is necessary to assess, under standardized test conditions, the relative sensitivity of a lot of terrestrial plant seed being used. Tests with a reference toxicant also serve to demonstrate the precision and reliability of data produced by the laboratory personnel for that reference toxicant, under standardized test conditions. A reference toxicity test, conducted according to the procedures and conditions described herein, must be performed according to one of the following regimes:

  1. at least once every two months using the same lot of seed being used to provide test organisms for soil toxicity tests over an extended period (i.e., ≥ 2 months); or
  2. at the same time as the definitive soil toxicity test(s), using seed taken from the same lot number as those used for the definitive test(s) (see Section 2.5).

A laboratory that frequently performs soil toxicity tests using terrestrial plants might choose to routinely (e.g., every two months) monitor the sensitivity of their seed to one or more reference toxicants, while including a reference toxicity test using a portion of the seeds used to start a definitive soil toxicity test. Alternatively, a laboratory might choose to monitor the sensitivity of their seed to a reference toxicant less frequently, and to perform a reference toxicity test at the time that each definitive soil toxicity test is performed.

Each reference toxicity test performed in conjunction with the definitive test for soil toxicity must be conducted as a static multi-concentration growth test. The duration of the reference toxicity test is seven days if the species of organisms is alfalfa, barley, cucumber, durum wheat, lettuce, radish, red clover, or tomato. For reference toxicity tests with blue grama grass, carrot, northern wheatgrass, and red fescue, the test duration is 10 days. In each instance, the ICp for shoot length is determined at the end of the test. A summary checklist in Table 3 describes the conditions and procedures that must be applied to each reference toxicity test. Additional conditions and procedures described in Section 4 for performing a multi-concentration test with samples of test soil apply equally to each reference toxicity test. Procedures given in Section 6 for the preparation and testing of chemicals spiked in negative control soil also apply here, and should be referred to for further information. Environment Canada’s guidance document on using negative control sediment spiked with a reference toxicant (EC, 1995) provides useful information that is also applicable when performing reference toxicity tests with negative control soil spiked with a reference toxicant.

The reference toxicity test should be performed using 1-L polypropylene containers as test vessels (Section 3.2.2) and a 500-mL aliquot of test soil representing each treatment (concentration) in each test vessel. The number of replicate test vessels per concentration must be ≥ 3. The number of seeds per vessel is species-specific, and differs slightly from that specified for definitive tests. Reference toxicity tests with barley, cucumber, durum wheat, lettuce, radish, red clover, red fescue, and tomato must include five seeds per vessel, whereas for alfalfa, blue grama grass, carrot, and northern wheatgrass, 10 seeds per vessel are required (see Table 3).
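The species-specific durations and seed counts described above can be captured in a small lookup. The following Python sketch is illustrative only; the function name `reference_test_design` and the dictionary keys are assumptions made for this example and are not part of the test method.

```python
# Species groupings transcribed from the text above (reference toxicity tests only).
SEVEN_DAY_SPECIES = {
    "alfalfa", "barley", "cucumber", "durum wheat",
    "lettuce", "radish", "red clover", "tomato",
}
TEN_DAY_SPECIES = {"blue grama grass", "carrot", "northern wheatgrass", "red fescue"}
TEN_SEED_SPECIES = {"alfalfa", "blue grama grass", "carrot", "northern wheatgrass"}

MIN_REPLICATES = 3    # replicate test vessels per concentration (must be >= 3)
SOIL_VOLUME_ML = 500  # aliquot of test soil per 1-L vessel


def reference_test_design(species):
    """Return duration, seeds per vessel, and replicate minimum for one species."""
    name = species.strip().lower()
    if name in SEVEN_DAY_SPECIES:
        duration_days = 7
    elif name in TEN_DAY_SPECIES:
        duration_days = 10
    else:
        raise ValueError(f"'{species}' is not one of the 12 approved species")
    # Ten seeds per vessel for four species; five seeds for all others.
    seeds_per_vessel = 10 if name in TEN_SEED_SPECIES else 5
    return {
        "duration_days": duration_days,
        "seeds_per_vessel": seeds_per_vessel,
        "min_replicates": MIN_REPLICATES,
        "soil_volume_mL": SOIL_VOLUME_ML,
    }


if __name__ == "__main__":
    for sp in ("alfalfa", "red fescue", "barley"):
        print(sp, reference_test_design(sp))
```

Note that duration and seed count vary independently: alfalfa is a 7-day species with 10 seeds per vessel, whereas red fescue is a 10-day species with five seeds per vessel.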

Procedures for starting and ending a reference toxicity test should be consistent with those described in Sections 4.2 and 4.7. Test conditions for temperature and light, described in Section 4.3, apply. Observations and measurements should be as described in Section 4.6; however, only percent emergence and individual shoot length should be determined at the end of the test.

To be valid, the mean percent emergence at the end of the test for plants held in the control treatment used in a particular reference toxicity test must be:

  • ≥ 60 % for tomato;
  • ≥ 70 % for blue grama grass, carrot, lettuce, northern wheatgrass, red clover, or red fescue;
  • ≥ 80 % for alfalfa, barley, cucumber, or durum wheat; or
  • ≥ 90 % for radish.

Additionally, the mean shoot length for individual plant species grown in negative control soil for the duration of the test must be:

  • ≥ 10 mm for lettuce or red clover;
  • ≥ 20 mm for alfalfa, blue grama grass, or tomato;
  • ≥ 40 mm for carrot, cucumber, radish, or red fescue;
  • ≥ 50 mm for northern wheatgrass;
  • ≥ 100 mm for barley; or
  • ≥ 120 mm for durum wheat.

Table 3. Checklist of Recommended Conditions and Procedures for Conducting Reference Toxicity Tests on Soil Using Terrestrial Plants

Test type - whole soil reference toxicity test; no renewal (static test)
Test duration - 7 days for alfalfa, barley, cucumber, durum wheat, lettuce, radish, red clover, or tomato; and
- 10 days for blue grama grass, carrot, northern wheatgrass, or red fescue
Approved test species - monocotyledons: barley (Hordeum vulgare), blue grama grass (Bouteloua gracilis), durum wheat (Triticum durum), northern wheatgrass (Elymus lanceolatus; formerly named Agropyron dasystachyum), and red fescue (Festuca rubra);
- dicotyledons: alfalfa (Medicago sativa), carrot (Daucus carota), cucumber (Cucumis sativus), lettuce (Lactuca sativa), radish (Raphanus sativus), red clover (Trifolium pratense), and tomato (Lycopersicon esculentum)
Number of concentrations - minimum of five test concentrations, plus a negative control
Number of replicates - ≥ 3 replicates/concentration
Number of seeds per replicate - 5 seeds/vessel for barley, cucumber, durum wheat, lettuce, radish, red clover, red fescue, or tomato; and
- 10 seeds/vessel for alfalfa, blue grama grass, carrot, or northern wheatgrass
Negative control soil - artificial soil
Test vessel - polypropylene cups (1 L), covered for seven days or until plants reach top of container
Amount of soil/ test vessel - identical wet wt, equivalent to a volume of ~500 mL; ~350 g dry wt for artificial soil
Moisture content - for soil preparation, hydrate to ~70% of water-holding capacity (WHC) for artificial soil; during test, hydrate to saturation, as needed
Air temperature - daily range, constant 24 ± 3 °C; alternatively, day: 24 ± 3 °C, night: 15 ± 3 °C
Humidity - ≥50 %
Lighting - full spectrum fluorescent: mimic natural light spectrum (e.g., Vita Lite® by DuroTest®); 300 ± 100 µmol/(m² · s) adjacent to the level of the soil surface; 16 h light:8 h dark
Watering - hydration water sprayed over soil surface until saturation, about every two days when covered and once per day after covers are removed, or whenever soil appears dry
Measurements during test - soil moisture content in each treatment/concentration at start; pH in each treatment/concentration at start and end; temperature in test facility, daily or continuously; humidity in test facility; light intensity once during test
Observations during test - number of emerged seedlings at the end of the test in each test vessel and shoot length at test end; number of surviving plants showing an atypical appearance (e.g., chlorosis, lesions)
Biological endpoints - emergence of seedlings during test and length of longest shoot at test end; appearance of surviving plants at test end
Statistical endpoints - mean (± SD) percent emergence in each treatment/concentration at test end (i.e., on Day 7 or Day 10); mean (± SD) length of longest shoots in each treatment/test concentration at test end (Day 7 or Day 10); 7-day or 10-day ICp for shoot length
Test validity - invalid if any of the following occurs in negative control soil at test end:

  • mean percent emergence is <60% for tomato; <70% for blue grama grass, carrot, lettuce, northern wheatgrass, red clover, or red fescue; <80% for alfalfa, barley, cucumber, or durum wheat; or <90% for radish
  • mean percent survival of emerged seedlings is <90%
  • mean percentage of control seedlings exhibiting phytotoxicity or developmental anomalies is >10%
  • mean shoot length is <10 mm for lettuce or red clover; <20 mm for alfalfa, blue grama grass, or tomato; <40 mm for carrot, cucumber, radish, or red fescue; <50 mm for northern wheatgrass; <100 mm for barley; or <120 mm for durum wheat

In addition, the mean percent survival of emerged seedlings in the negative control soil must be ≥ 90 % at the end of the test; and the mean percentage of seedlings grown in negative control soil that exhibit phytotoxicity and/or developmental anomalies must be ≤ 10% in order for the results of a reference toxicity test to be declared valid.
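The four validity criteria for the negative control (emergence, shoot length, survival of emerged seedlings, and phytotoxicity or developmental anomalies) can be checked together. The Python sketch below transcribes the thresholds stated in this section; the function name `control_validity_failures` and its argument names are hypothetical and chosen only for illustration.

```python
# Minimum mean percent emergence in negative control soil, by species.
MIN_EMERGENCE_PCT = {
    "tomato": 60,
    "blue grama grass": 70, "carrot": 70, "lettuce": 70,
    "northern wheatgrass": 70, "red clover": 70, "red fescue": 70,
    "alfalfa": 80, "barley": 80, "cucumber": 80, "durum wheat": 80,
    "radish": 90,
}

# Minimum mean shoot length (mm) in negative control soil, by species.
MIN_SHOOT_MM = {
    "lettuce": 10, "red clover": 10,
    "alfalfa": 20, "blue grama grass": 20, "tomato": 20,
    "carrot": 40, "cucumber": 40, "radish": 40, "red fescue": 40,
    "northern wheatgrass": 50,
    "barley": 100,
    "durum wheat": 120,
}

MIN_SURVIVAL_PCT = 90  # mean survival of emerged control seedlings
MAX_ANOMALY_PCT = 10   # control seedlings with phytotoxicity or anomalies


def control_validity_failures(species, emergence_pct, shoot_mm,
                              survival_pct, anomaly_pct):
    """Return a list of failed criteria; an empty list means the test is valid."""
    name = species.strip().lower()
    if name not in MIN_EMERGENCE_PCT:
        raise ValueError(f"'{species}' is not one of the 12 approved species")
    failures = []
    if emergence_pct < MIN_EMERGENCE_PCT[name]:
        failures.append(f"emergence {emergence_pct}% < {MIN_EMERGENCE_PCT[name]}%")
    if shoot_mm < MIN_SHOOT_MM[name]:
        failures.append(f"shoot length {shoot_mm} mm < {MIN_SHOOT_MM[name]} mm")
    if survival_pct < MIN_SURVIVAL_PCT:
        failures.append(f"survival {survival_pct}% < {MIN_SURVIVAL_PCT}%")
    if anomaly_pct > MAX_ANOMALY_PCT:
        failures.append(f"anomalies {anomaly_pct}% > {MAX_ANOMALY_PCT}%")
    return failures


if __name__ == "__main__":
    # Hypothetical control results for a barley reference toxicity test.
    print(control_validity_failures("barley", 78, 110, 95, 5))
```

Returning the list of failed criteria, rather than a single true or false value, makes it easier to record why a given reference toxicity test was declared invalid.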

Test endpoints to be calculated and reported include the mean percent emergence in each treatment on Day 7 or Day 10, depending on the species. The 7-day or 10-day ICp (including its 95% confidence limits) for shoot length must also be calculated. Results for a reference toxicity test should be expressed as mg reference chemical/kg soil, on a dry-wt basis.
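As an illustration of the first endpoint, the mean (± SD) percent emergence for one treatment can be computed from the replicate counts. This Python sketch assumes that percent emergence is calculated per test vessel and then averaged across replicates, and that SD is the sample standard deviation; the function name `summarize_emergence` is hypothetical. Calculation of the ICp itself is not shown.

```python
from statistics import mean, stdev


def summarize_emergence(emerged_per_vessel, seeds_per_vessel):
    """Return (mean, SD) of percent emergence across replicate vessels."""
    if not emerged_per_vessel:
        raise ValueError("at least one replicate vessel is required")
    percents = [100.0 * n / seeds_per_vessel for n in emerged_per_vessel]
    # The sample SD needs two or more replicates; report 0.0 for a single vessel.
    sd = stdev(percents) if len(percents) > 1 else 0.0
    return mean(percents), sd


if __name__ == "__main__":
    # Hypothetical counts: three replicate vessels, five seeds planted in each.
    m, sd = summarize_emergence([5, 4, 5], seeds_per_vessel=5)
    print(f"mean emergence = {m:.1f}% (SD {sd:.1f})")
```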

Appropriate criteria for selecting the reference toxicant to be used in conjunction with a definitive test for soil toxicity include the following (EC, 1995):

  • chemical readily available in pure form;
  • stable (long) shelf life of chemical;
  • can be interspersed evenly throughout clean substrate;
  • good concentration-response curve for test organism;
  • stable in aqueous solution and in soil;
  • minimal hazard posed to user; and
  • concentration easily analyzed with precision.

The reference toxicity test requires a minimum of six treatments (i.e., negative control soil and five concentrations of reference toxicant). Reagent-grade boric acid (H3BO3)Footnote56 is recommended for use as the reference toxicant when performing soil toxicity tests with plants, although other chemicals may be used if they prove suitable.Footnote57 Each test concentration should be made up according to the guidance in Sections 4.1 and 6.2, using artificial soil (Section 3.4.2) as substrate.

Routine reference toxicity tests (e.g., those performed once every two months or in conjunction with each definitive test for soil toxicity) using boric acid [or another suitable reference chemical, such as copper sulphate (CuSO4)] spiked in negative control soil should consistently apply the same test conditions and procedures described herein. A series of test concentrations should be chosenFootnote58, based on preliminary tests, to bracket the ICp and enable calculation of the 7-day or 10-day ICp for shoot length.

Once sufficient data are available (EC, 1995), all comparable ICps for a particular reference toxicant derived from these toxicity tests must be plotted successively on a warning chart. A separate warning chart must be prepared for each plant species used in definitive toxicity tests. Each new ICp for the same reference toxicant should be examined to determine whether it falls within ± 2 SD of values obtained in previous comparable tests using the same reference toxicant and test procedure (EC, 1997a, 1997b, 2001). A separate warning chart must be prepared and updated for each dissimilar procedure (e.g., differing plant species or differing reference toxicants). The warning chart should plot logarithm of concentration on the vertical axis against date of the test or test number on the horizontal axis. Each new ICp for the reference toxicant should be compared with established limits of the chart; the ICp is acceptable if it falls within the warning limits.

The logarithm of concentration (including ICp as a logarithm) should be used in all calculations of mean and standard deviation, and in all plotting procedures. This simply represents continued adherence to the assumption by which each endpoint value was estimated based on logarithms of concentrations. The warning chart may be constructed by plotting the logarithmic values of the mean and ± 2 SD on arithmetic paper, or by converting them to arithmetic values and plotting those on the logarithmic scale of semi-log paper. If it were demonstrated that the ICps failed to fit a log-normal distribution, an arithmetic mean and SD might prove more suitable.

The mean of the available values of log (ICp), together with the upper and lower warning limits (± 2 SD), should be recalculated with each successive ICp for the reference toxicant until the statistics stabilize (EC, 1995, 1997a, b, 2001). If a particular ICp fell outside the warning limits, the sensitivity of the test organisms and the performance and precision of the test would be suspect. Since this might occur 5% of the time due to chance alone, an outlying ICp would not necessarily indicate abnormal sensitivity of the seed, nor unsatisfactory precision of toxicity data. Rather, it would provide a warning that there might be a problem. A thorough check of all test conditions and procedures should be carried out. Depending on the findings, it might be necessary to repeat the reference toxicity test or purchase new seed before undertaking further soil toxicity tests.
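The warning-chart calculation described above (mean and ± 2 SD of the logarithms of historic ICps) can be sketched as follows. This Python example assumes base-10 logarithms and the sample standard deviation, neither of which is specified in this section; the function names and the ICp values in the usage example are hypothetical.

```python
import math
from statistics import mean, stdev


def warning_limits(historic_icps):
    """Compute the mean and +/- 2 SD warning limits on log10(ICp).

    Returns the limits on the log scale and back-transformed to the
    original concentration units (e.g., mg/kg dry wt).
    """
    if len(historic_icps) < 2:
        raise ValueError("at least two historic ICps are needed to compute an SD")
    logs = [math.log10(icp) for icp in historic_icps]
    log_mean = mean(logs)
    log_sd = stdev(logs)
    log_lower = log_mean - 2 * log_sd
    log_upper = log_mean + 2 * log_sd
    return {
        "log_mean": log_mean, "log_lower": log_lower, "log_upper": log_upper,
        # Back-transformed values; the central value is a geometric mean.
        "mean": 10 ** log_mean, "lower": 10 ** log_lower, "upper": 10 ** log_upper,
    }


def within_warning_limits(new_icp, historic_icps):
    """True if log10(new_icp) lies within +/- 2 SD of the historic mean."""
    limits = warning_limits(historic_icps)
    return limits["log_lower"] <= math.log10(new_icp) <= limits["log_upper"]


if __name__ == "__main__":
    history = [400.0, 500.0, 450.0, 550.0, 480.0]  # hypothetical ICps, mg/kg
    print(warning_limits(history))
    print(within_warning_limits(470.0, history))
```

Because the limits are computed on the log scale, the back-transformed limits are not symmetric about the back-transformed mean. An ICp outside the limits is a warning that prompts a check of test conditions and procedures, not an automatic rejection of the result.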

Results that remained within the warning limits might not necessarily indicate that a laboratory was generating consistent results. Extremely variable historic data for a reference toxicant would produce wide warning limits; a new data point could be within the warning limits but still represent undesirable variation in test results. A CV of no more than 30%, and preferably 20% or less, has been suggested as a reasonable limit by Environment Canada (EC, 1995, 2004a) for the mean of the available values of log (ICp) (see preceding paragraph). For this biological test method, the CV for mean historic data derived for reference toxicity tests performed using boric acid should not exceed 30%.
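The CV limit can be checked with a short calculation. This section does not give the exact formula, so the Python sketch below assumes the conventional definition, CV = 100 × SD ÷ mean, using the sample standard deviation, and applies it to whatever series of values the laboratory supplies (e.g., the historic values of log ICp). The function names are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%), assumed here to be 100 * sample SD / mean."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to compute an SD")
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)


def cv_acceptable(values, limit_pct=30.0):
    """True if the CV does not exceed the stated limit (30% by default)."""
    return cv_percent(values) <= limit_pct


if __name__ == "__main__":
    historic = [8.0, 10.0, 12.0]  # hypothetical values for illustration
    print(f"CV = {cv_percent(historic):.1f}%")
    print("within the 30% limit:", cv_acceptable(historic))
    print("within the preferred 20% limit:", cv_acceptable(historic, limit_pct=20.0))
```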

Concentrations of reference toxicant in all stock solutions should be measured chemically using appropriate methods (e.g., analytical methods involving AES with ICAP scan, for concentration of boron). Test concentrations of reference toxicant in soil are prepared by adding a measured quantity of the stock solution to negative control soilFootnote59, and mixing thoroughly.Footnote60 Upon preparation of the mixtures of the reference toxicant in soil, aliquots should be taken from at least the negative control soil as well as the low, middle, and high concentrations.Footnote61 Each aliquot should either be analyzed directly, or stored for future analysis (i.e., at the end of the test) if the ICp for shoot length, based on nominal concentrations, was found to be outside the warning limits. If stored, sample aliquots must be held in the dark at 4 ± 2 °C. Stored aliquots requiring chemical measurement should be analyzed promptly upon completion of the reference toxicity test. The 7-day or 10-day ICp for shoot length should be calculated based on the measured concentrations if they are appreciably (i.e., ≥ 20%) different from nominal ones and if the accuracy of the chemical analyses is satisfactory.
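The decision on whether to base the ICp on measured concentrations can be expressed as a simple comparison. The Python sketch below assumes that the difference is expressed as a percentage of the nominal concentration, and that a difference of ≥ 20% in any one of the analyzed aliquots is enough to trigger use of measured values; both are assumptions, since this section does not state them. The function names are hypothetical, and the negative control (nominal concentration of zero) is excluded from the comparison.

```python
def percent_difference(nominal, measured):
    """Absolute difference between measured and nominal, as % of nominal."""
    if nominal <= 0:
        raise ValueError("nominal concentration must be greater than zero")
    return 100.0 * abs(measured - nominal) / nominal


def use_measured_for_icp(pairs, threshold_pct=20.0, analysis_accurate=True):
    """Decide whether the ICp should be based on measured concentrations.

    pairs: iterable of (nominal, measured) concentrations in mg/kg dry wt
           for the spiked treatments that were analyzed (e.g., low, middle, high).
    analysis_accurate: whether the accuracy of the chemical analyses is
           judged satisfactory.
    """
    if not analysis_accurate:
        return False
    return any(percent_difference(n, m) >= threshold_pct for n, m in pairs)


if __name__ == "__main__":
    # Hypothetical nominal and measured boric acid concentrations (mg/kg dry wt).
    aliquots = [(100.0, 95.0), (400.0, 310.0), (1600.0, 1550.0)]
    for nominal, measured in aliquots:
        print(nominal, measured, f"{percent_difference(nominal, measured):.1f}%")
    print("use measured concentrations:", use_measured_for_icp(aliquots))
```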

If boric acid is used as a reference toxicant, the following analytical method applies (OMEE, 1996). A 1-5 g subsample of soil spiked with boric acid is dried at 105 °C to constant weight. A 1-g aliquot is then extracted using a 0.01 M solution of CaCl2, by boiling a slurry of soil in 50 mL of this extraction solution and then re-adjusting the final volume to 50 mL using more extraction solution. The 50-mL extract is then filtered through a #4 Whatman™ filter, and diluted to a final volume of 100 mL. A blank sample is prepared in a similar manner. The filtrate is analyzed for elemental boron using ICAP/AES. The boric acid concentration in the soil is then calculated using the following equation:

\[
\text{Boric acid (mg/kg, dry wt)} = \frac{\mu\text{g B/mL (measured)} \times \text{final volume (mL)} \times \dfrac{MW_{\text{boric acid}}}{MW_{\text{boron}}}}{1000\ (\mu\text{g}) \times \text{weight of sample (mg dry wt)}} \times 10^{6}
\]

The analytical limit of detection for boric acid in soil is reportedly 1 mg boric acid/kg soil dry wt in most instances (Stephenson, 2003b).
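The equation for converting a measured boron concentration in the extract to a boric acid concentration in soil can be worked through step by step. The Python sketch below uses molecular weights of 61.83 g/mol for boric acid (H3BO3) and 10.81 g/mol for boron; the function name and the boron reading in the usage example are hypothetical.

```python
MW_BORIC_ACID = 61.83  # g/mol, H3BO3
MW_BORON = 10.81       # g/mol, B


def boric_acid_mg_per_kg(boron_ug_per_mL, final_volume_mL, sample_mg_dry):
    """Convert measured boron in the extract to mg boric acid/kg soil (dry wt)."""
    if sample_mg_dry <= 0:
        raise ValueError("sample weight must be greater than zero")
    # Total mass of boron in the diluted extract (ug).
    boron_ug = boron_ug_per_mL * final_volume_mL
    # Equivalent mass of boric acid (ug), by ratio of molecular weights.
    boric_acid_ug = boron_ug * (MW_BORIC_ACID / MW_BORON)
    # Convert ug to mg (1000 ug per mg).
    boric_acid_mg = boric_acid_ug / 1000.0
    # mg boric acid per mg soil, scaled by 10^6 mg/kg.
    return boric_acid_mg / sample_mg_dry * 1e6


if __name__ == "__main__":
    # Hypothetical reading: 1.0 ug B/mL in a 100-mL final volume from a
    # 1-g (1000 mg dry wt) aliquot of soil.
    result = boric_acid_mg_per_kg(1.0, 100.0, 1000.0)
    print(f"{result:.1f} mg boric acid/kg soil dry wt")
```

The units cancel as follows: µg B/mL × mL gives µg B; the molecular weight ratio converts this to µg boric acid; division by 1000 gives mg boric acid; and division by the sample weight in mg, multiplied by 10^6 mg/kg, gives mg boric acid/kg soil.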
