Comment Notes from Excel File: EstimatingProbabilityOfAccidentalExtinction.xls

Current version: February 15, 2013

Estimating (in cell A1)
Pobability of (in cell A2)
Accidental Extinction (in cell A3)
version 1 (in cell A4)

Introduction (in cell A5)
Organismic population counts are notorious for their irregular pattern. When plotted on a chart in sequential order, the result is a jagged, apparently chaotic time series. Occasionally there can be an unusually high or low count which has no apparent reason. If the drop is to a level below the survival threshold the population will become extinct. The probability of such an event can be estimated from the distribution of the counts in the record provided that: 1) the fluctuations are random, 2) the series has no trend and 3) the ambient conditions that may affect the counts remain about the same. If these conditions are satisfied, and if the counts' distribution can be assumed to be either normal (symmetric) or lognormal (skewed to the right), at least as an approximation, then the probability of accidental extinction can be estimated by the procedure used in this worksheet, at least theoretically.

Notice that this Excel workbook can also be used to estimate the probability of encountering counts of probiotic mirobes or spores in industrial products that are below specified thresholds. It can also be used to calculate the probability of encountering an entry falling below set thresholds in fluctuating nonbiological systems such as those encountered in quality control records.

Auto-Update---> (in cell A6)
Auto-Update: Use it for quick results.
1) Select the 100 example Count data values in column H (cells H2 to H101) and choose 'Clear> Contents' from the 'Edit' menu.
2) Manually enter new Count data directly into column H or paste data copied from another source into column H starting at cell H2 using the 'Paste Special...Values' command on the 'Edit' menu.
3) Enter up to 5 Survival Threshold values applicable to your own data into cells E2, E4, E6, E8 & E10.
4) Click the 'Update Worksheet' button to run the AutoUpdate macro.
5) Compare the Correlation Coefficient of the Count data (cell C8) with the Correlation Coefficient of the Ln(Count) data (cell D8). Choose the distribution (Normal or Lognormal) whose Correlation Coefficient is closest to 1. If neither Correlation Coefficient is close to 1, do not use this program. Verify your choice visually by looking at the two Quantile-Quantile Plots. The chosen distribution should be the one whose Q-Q plotted data most closely resemble a straight line.
6) For the chosen distribution, columns F (Normal) and G (Lognormal) contain probabilities (expressed in %) that a count will fall below the Survival Thresholds listed in column E.
Note: This worksheet applies only to Count data that are nomally or lognormally distributed. There are other formal statistical tests for normality not included here. For more details, read the 9-step Manual Update procedure described below. For additional background information, consult the References.

Manual Update: (in cell A7)
Manual Update:
Detailed 9-step instructions to manually update affected cells and charts in the worksheet when the Count data have changed. The AutoUpdate macro performs all these steps automatically.

Step 1 (in cell A8)
Step 1: List the experimental data in sequence and plot them.
a) Clear the example Count data values from column H (labeled 'Count (C)' and colored rose) by selecting them and choosing 'Clear> Contents' from the 'Edit' menu.
b) Enter your microbial Count data directly into column H or copy your data (1 column of numeric ASCII text values, 1 measurement per line, no gaps) from another source and choose 'Paste Special...Values' from the 'Edit' menu to paste them into the 'Count (C)' column starting at cell H2. The number (up to 16384) of microbial Count measurements that you pasted is shown in cell B2. If you pasted more than 16384 values you must change the 16385 in the formula in cell B2 to refer to a row number that is not less than the number of the row containing the last value that was pasted.
c) The Count data are plotted against their index numbers (i) on the line chart called 'Food Product #1 SPC'. Select the title and change it to refer to your own data.
d) Click on the chart and choose 'Source Data...' from the 'Chart' menu. Click on the 'Series' tab. Be sure the 'Values:' field specifies every row of Count data that you want to appear on the chart by setting the last row of the 'Values:' field (122 in the example chart) to a row that is 1 greater than the last row containing a Count data value (because row 2, not row 1, contains the first data point and we want to be sure the last point appears on the chart).

Step 2 (in cell A9)
Step 2: The natural logarithms of the Count data in column H are automatically calculated in column M (colored light yellow).

Step 3 (in cell A10)
Step 3: Plot a histogram bar chart of the Count data and of their natural logarithms. The cells related to the two histograms are in rows 161 through 181 of columns A through G, just below the two histogram bar charts. Click on the Count Histogram bar chart and choose 'Source Data...' from the 'Chart' menu. Click on the 'Series' tab. B167:B173 are the Category (X) cells from the 'Bin Max Value' column and D167:D173 are the Values (Y) cells from the 'Frequency (%)' column used to draw the chart. Modify the last row (173 in the example) to be 166+# of Histogram Bins shown in cell B4. E167:E173 are the Category (X) cells and G167:G173 are the Values (Y) cells used in the Ln(Count) Histogram bar chart. Modify those two last row references as you did for the Count Histogram bar chart. (See reference 6 for more information on deciding how many histogram bins to use.)

Step 4 (in cell A11)
Step 4: The arithmetic mean (AVERAGE), standard deviation (STDEV), skewness (SKEW) and correlation coefficient (CORREL) of the Count and Ln(Count) data are updated automatically. These are calculated in cells C2, C4, C6 & C8 (for the Count data) and D2, D4, D6 & D8 (for the Ln(Count) data).

Step 5 (in cell A12)
Step 5: Calculate the autocorrelation function of the data and plot it for lags from 1 to up to 25. The absence of any significant autocorrelation is an indication that the Counts are really independent. The method will most probably work satisfactorily even if "mild" autocorrelations exist, but the reliability of the estimates may be affected. (See References. Also, see for more about autocorrelation and autocorrelation plots.)
a) Non-empty white colored cells apply to autocorrelation calculations.
b) If the number of Lags (shown in cell B9) is less than the allowed maximum (25), click on the Autocorrelation Function chart and choose 'Source Data...' from the 'Chart' menu. Click on the 'Series' tab and set the last referenced row (initially 26) to be 1 greater than the value shown in cell B9.
c) Columns M through column (Q+# of Lags) must contain the same number of rows as column H. Clear the excess rows from those columns if they contain more rows than column H. If they contain fewer rows, copy and paste the last row from each column into the number of new rows needed to make each column the same length as column H.

Step 6 (in cell A13)
Step 6: Calculate the needed values for and plot a Quantile-Quantile (Q-Q) Plot of the Count and Ln(Count) data. Column I contains the Count data from column K and column J contains the Ln(Count) data from column M but columns I and J are also sorted into ascending order. Column N contains the index number (i) of each data value. Column O contains the inverse of the standard normal cumulative distribution of (i-0.5)/n, where mean=0, standard deviation=1 and n is the # of Count data values (cell B2). The Normal Probability Q-Q Plot plots column I (Sorted Count) vs column O and the Lognormal Probability Q-Q Plot plots column J (Sorted Ln(Count)) vs column O. In order to compare the two Q-Q plots their y-axes must be the same length relative to the data that are plotted on them. Double-click on the chart's y-axis and set the y-axis Scale Minimum and y-axis Scale Maximum of the normal Q-Q Plot to the Min(Count) value (cell B163) and Max(Count) value (cell B165), respectively. Set the y-axis Scale Minimum and y-axis Scale Maximum of the lognormal Q-Q Plot to the Min(Ln(Count)) value (cell E163) and Max(Ln(Count)) value (cell E165), respectively.

Step 7 (in cell A14)
Step 7: Specify up to 5 Survival Threshold values for the population in question. Five different example Thresholds are located in cells E2, E4, E6, E8 & E10. Replace them with Threshold values that apply to your own data.

Step 8 (in cell A15)
Step 8: Choose the distribution that matches your data.
1) If the Count Histogram has a symmetric bell shape, your data probably have a normal distribution. If the Ln(Count) Histogram has a symmetric bell shape, your data probably have a lognormal distribution.
2) Compare the Quantile-Quantile Plot of the Count data (normal distribution) with the Q-Q plot of the Ln(Count) data (lognormal distribution). Choose the distribution whose Q-Q Plot looks most like a straight line.
Compare the correlation coefficient of the Count data (cell C8) with the correlation coefficient of the Ln(Count) data (cell D8). Choose the distribution whose correlation coefficient is closest to one. If neither correlation coefficient is close to one, do not use this program.
(See for more about skewness. There are other formal statistical tests for normality or lognormality that are not included in this worksheet.)

Step 9 (in cell A16)
Step 9: The worksheet calculates the probability (expressed in %) that a count will fall below the Survival Threshold(s) specified in step 7 (provided that the Counts' distribution remain unchanged). Calculations in cells F2, F4, F6, F8 & F10 assume the Count data have a normal distribution. Calculations in cells G2, G4, G6, G8 & G10 assume the Count data have a lognormal distribution. Note that the results are only rough estimates of the order of magnitude of the probabilities of accidental extinction with the entered Survival Threshold values.

** WARNING ** (in cell A17)
*** Note that the calculated probabilities DO NOT INDICATE WHEN EXTINCTION WILL OCCUR. ***

References: (in cell A18)
1) Peleg, M. and Horowitz, J. 2000. On estimating the probability of aperiodic outbursts of microbial populations from their fluctuating counts. Bulletin Mathematical. Biology 62:17-35.

2) Peleg, M., Nussinovitch, A. and Horowitz, J. 2000. Interpretation and extraction of useful information from irregular fluctuating industrial microbial counts. J. Food Science 65:740-747.

3) Horowitz, J., Normand, M.D. and Peleg, M. 1999. On modeling the irregular fluctuations in microbial counts. Critical Reviews Food Science Nutrition 39(6):503-517.

4) Corradini, M.G., Normand, M.D., Nussinovitch A., Horowitz, J. and Peleg, M. 2001. Estimating the frequency of high microbial counts in industrial food products using various distribution functions. J. Food Prot. 65(3):674-681.

5) Peleg, M. 2006. Advanced quantitative microbiology for food and biosystems: Models for predicting growth and inactivation. CRC Press, Boca Raton FL.

6) Rosenkrantz, W.A. 2000. Confidence bands for quantile functions: A parametric and graphic alternative for testing goodness of fit. The American Statistician 54(3):185-190.

Related web pages: (in cell A19)
Related web pages:
Estimating the Probability of High Microbial Counts Using Excel:

Failure Probabilities from Quality Control Charts:

Simulating Microbial Count Records with an Expanded Fermi Solution Model:

A Probabilistic Model for Population Extinction:

Authors: (in cell A20)
Model by:
Micha Peleg         Tel: 413-545-5852
Professor           Fax: 413-545-1262
Worksheet by:
Mark D. Normand     Tel: 413-545-2365
EDP Programmer II   Fax: 413-545-1262
Both located at:
Department of Food Science
Chenoweth Laboratory
University of Massachusetts
Amherst, MA  01003
Worksheet on the Web at:

Last modified: February 15, 2013

Acknowledgements: (in cell A21)
The research on which this worksheet is based was initially supported by the USDA NRI (Food Safety) Program and is currently supported by the Massachusetts Agricultural Experiment Station at Amherst.

[Return to ] Estimating the Probability of Accidental Extinction Using Excel

Content last updated: February 15, 2013