ABSTRACTS OF PAPERS DELIVERED AT THE 14TH INTERNATIONAL
CONFERENCE ON QUANTITATIVE METHODS FOR ENVIRONMENTAL SCIENCES 3-7 NOVEMBER 2003
Challenges in Developing an Information Management System
for Lake Erie – a case study
Neela M. Akhouri, Jignesh M. Patel and Stephen L. Goldman
Lake Erie Research Center, University of Toledo, Oregon, Ohio, USA
Lake Erie, one of the largest freshwater lakes in the world, was moribund in the
early 1970s. With the enactment of the Clean Water Act of 1972 and the Great
Lakes Water Quality Agreement between Canada and the United States, water
quality management agencies have been largely successful in
controlling nutrient and contaminant levels in the ecosystem, and a resurgence
of fish and wildlife has been evident. The present state of Lake Erie reported
by the US Environmental Protection Agency (2002 LaMP, USEPA) indicates that land use,
nutrient enrichment, habitat degradation and loss, pathogen and toxic chemical
contamination, and human health continue to be priority management
issues.
To understand the various factors influencing sustainability of the lake ecosystem
at a larger spatial and temporal scale (e.g. entire length of the river or creek)
and to mitigate any detrimental changes, management goals and objectives should
be based on the current state of its ecosystem components. To establish these
management goals, it is necessary to form a baseline understanding of change
over time in the watershed ecosystems. To develop baseline indices, analyze
trends, changes and adaptations, and establish relationships between factors
influencing ecological changes, massive sets of data and a tool to “re-purpose”
previously existing data are required. However, historically, ecological data
have been collected by a single investigator or a small group of investigators in plots of ≤ 1 m²
over relatively short periods of time (Kareiva and Anderson, 1988; Brown and
Roughgarden, 1990). Therefore, it is not possible for any one project or program
to accrue data at large spatial and temporal scales.
For the past several decades, many disjointed programs and policies have existed at
the federal, state, and local levels to record and analyze the processes
occurring at different spatial and temporal scales in Lake Erie and its
drainage basin. Even though the need for extensive data sets
(Meyer et al., 1999) and an integrated analysis framework (Hardy, 1998) has been
established, the daunting task of integrating and synthesizing voluminous and
complex data (Great Lakes Commission, 2003) has been stymied by (1) the
spatial dispersion of the data, (2) their structural and semantic heterogeneity, and (3) a lack of
detailed information about what data exist and whether they are available. Details
of these problems and our approach to finding a solution using technology tools
will be presented in this paper.
W.J.R. Alexander
Department of Civil and
Biosystems Engineering, University of Pretoria, South Africa.
A comprehensive database consisting of 11 804 years of South African annual
hydrometeorological data from 183 sites was assembled and analysed. The
principal conclusion was that there has been a beneficial increase in the mean
annual rainfall over South Africa by at least 9% during the 78-year period of
record. There was some indication of an acceleration of the increase during the
past two decades. It was also established that there was a corresponding
increase in open water surface evaporation during this period. It is postulated
that the concurrent increase in these two processes is due to naturally
occurring global warming. There is also a statistically significant 18 to
22-year periodicity in many South African rainfall and river flow records. It
is characterised by sudden reversals from periods of below average conditions
to conditions of high rainfall and floods. It is probably related to solar
activity. The analyses produced no evidence to support the views that climate
change will result in a meaningful reduction in rainfall, or an increase in the
frequency or severity of floods and droughts within the next 50 years. All
evidence was to the contrary.
Statistical analysis of extreme floods
W.J.R. Alexander
Department of Civil and
Biosystems Engineering, University of Pretoria, South Africa
In this presentation it is demonstrated that
widespread, severe floods are caused by infrequent, but not rare,
meteorological phenomena, including tropical cyclones and cut-off low pressure
systems. The magnitude of the severe floods relative to the series of annual
maximum floods at any one site can be readily determined, but all direct
statistical analysis methods seriously under-estimate their frequency of
occurrence. This is demonstrated by the statistical analyses of wide area,
severe flood-producing rainfall. The statistical analyses confirm that the widely
used log Pearson type 3 distribution using conventional moment estimators
remains the preferred method for the statistical analysis of hydrological and
meteorological data, but that no direct statistical analysis methods can be
used with confidence for return periods exceeding 50 years. It is also
demonstrated that there is no justification for making any allowances for
future climate change in design flood estimation procedures, as any increases,
should they occur, will be indistinguishable against the background of the
large natural variability.
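As an illustration of what a log Pearson type 3 fit by conventional moment estimators involves, the sketch below computes a design flood for a given return period. It is a minimal sketch, not the author's procedure: the Wilson-Hilferty approximation for the frequency factor is an assumption introduced here, the function names are hypothetical, and the annual maxima are made-up numbers.

```python
from math import log10
from statistics import NormalDist, mean, stdev

def sample_skew(x):
    """Conventional (moment) estimator of the skewness coefficient."""
    n = len(x)
    m, s = mean(x), stdev(x)
    return n * sum((v - m) ** 3 for v in x) / ((n - 1) * (n - 2) * s ** 3)

def frequency_factor(skew, return_period):
    """Wilson-Hilferty approximation to the Pearson type 3 frequency factor
    (an assumption of this sketch; reasonable only for moderate skew)."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(skew) < 1e-9:
        return z  # zero skew reduces to the log-normal case
    return (2.0 / skew) * ((1.0 + skew * z / 6.0 - skew ** 2 / 36.0) ** 3 - 1.0)

def lp3_quantile(annual_maxima, return_period):
    """Flood magnitude with the given return period (years)."""
    logs = [log10(q) for q in annual_maxima]
    k = frequency_factor(sample_skew(logs), return_period)
    return 10 ** (mean(logs) + k * stdev(logs))

# Hypothetical annual maximum flood peaks (m^3/s), for illustration only.
floods = [120, 340, 95, 560, 210, 1800, 150, 430, 275, 90, 640, 310]
q50 = lp3_quantile(floods, 50)
```

In keeping with the abstract's caution, such a calculation would not be pushed beyond return periods of about 50 years.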
Stochastic rainfall generation for rainfall-runoff modelling
Enrica Bellone
University College London
enrica@stts.ucl.ac.uk
Rainfall and evaporation are important inputs to the rainfall-runoff models used for
flood design. We give an overview of the
stochastic simulation methods being studied at IC/UCL. For single-site
rainfall, a wide range of Poisson-based models has been investigated. Simulations from these models may be
combined with synthetic evaporation series, derived using a Generalized Linear
Model (GLM)-based approach, to obtain the necessary fine scale inputs for
runoff generation. A spatial extension of the Poisson-based models (fitted to
UK radar data) provides a continuous space - continuous time approach to the
rainfall simulation problem. Alternatively, GLMs have been applied to daily
raingauge records, where radar data are not available or the underlying
hypotheses of the point process models are not satisfied. While the
Poisson-based models require both spatial and temporal stationarity, the GLM
approach does not involve a stationarity assumption. Combining the two
approaches, for regions where both radar and gauge data are available, may be a
way to allow spatial and temporal non-stationarities, including climate change
scenarios, in continuous simulations.
A Statistician's Take on the Benthic Index of Biotic
Integrity
Grace Chiu
Pacific Institute for the Mathematical Sciences (PIMS) and Department of Statistics,
University of Washington
Stream health is often measured by the multimetric
"benthic index of biotic integrity" (B-IBI) developed by Kerans and Karr
(1994). B-IBI metrics quantify the well-being of benthic inhabitants of the
stream. For the Puget Sound Lowland streams, such metrics include the total
number of taxa, the proportion of predatory taxa, etc. Each metric is converted
into a score of 1, 3, or 5, where a higher value indicates a healthier stream
with respect to the metric. Summing the metric scores yields the B-IBI.
Conventional scoring requires subjective and space/time-dependent input on the cutoff points
for the metric-to-score conversion. To a statistician, simple standardization
(centering and division by the standard deviation) may be a more natural,
non-study-specific conversion. Our more statistically
oriented B-IBI (SOBIBI) is the sum of all scores computed this way.
Currently, we are comparing the performance of our SOBIBI to the reported
properties of the B-IBI via plots and bootstrap simulations.
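The two scoring schemes can be contrasted with a short sketch. This is a minimal illustration under stated assumptions: the metric names, values and cutoff points are hypothetical, every metric is assumed to increase with stream health, and the function names are invented here.

```python
from statistics import mean, stdev

def conventional_score(value, low_cut, high_cut):
    """Convert a metric value to 1, 3 or 5 using two (subjective) cutoffs."""
    if value < low_cut:
        return 1
    if value < high_cut:
        return 3
    return 5

def standardized_scores(values):
    """Center by the sample mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def b_ibi(metrics, cutoffs):
    """Sum of conventional 1/3/5 scores for each site."""
    n_sites = len(next(iter(metrics.values())))
    return [sum(conventional_score(metrics[m][i], *cutoffs[m]) for m in metrics)
            for i in range(n_sites)]

def sobibi(metrics):
    """Sum of standardized scores for each site."""
    z = {m: standardized_scores(v) for m, v in metrics.items()}
    n_sites = len(next(iter(metrics.values())))
    return [sum(z[m][i] for m in z) for i in range(n_sites)]

# Hypothetical data: one value per site for each metric.
metrics = {
    "total_taxa": [12, 20, 31, 25],
    "prop_predator_taxa": [0.05, 0.10, 0.22, 0.15],
}
cutoffs = {"total_taxa": (15, 28), "prop_predator_taxa": (0.08, 0.18)}

conventional = b_ibi(metrics, cutoffs)
standardized = sobibi(metrics)
```

The standardized version needs no cutoff points, but its values are relative to the set of sites used for centering and scaling.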
In
this talk, I will introduce the background of the project, discuss our current
progress, and invite ideas from the audience.
A Quantitative Decision Support Instrument for Water Quality
Management
Marius Claassen, Alta de Waal, Xolani Hadebe and Thabo Sekonyela
CSIR, South Africa
mclaassen@csir.co.za, adewaal@csir.co.za, xhadebe@csir.co.za, and tsekonyela@csir.co.za
The
promulgation of the National Water Act (NWA), 1998 (Act No. 36 of 1998) and
various other acts, as well as the publication of various policies and White
Papers (such as Water Supply and Sanitation, 1994, etc.) have given a new
direction to water resources management, and specifically Water Quality
Management (WQM), in South Africa. The purpose of the NWA is to ensure that
South Africa’s water resources are protected, used, developed, conserved,
managed and controlled in an equitable, efficient and sustainable manner. This
necessitates a change in the approach of WQM to an integrated source, resource
and remediation focused management approach.
Currently, no quantitative technique exists to assist licence officers in making decisions
regarding the allocation of water licences. Licence officers are frustrated
by the lack of structure and support available to ease
their tasks. To make this integrated approach operational, management
instruments are needed for regional offices to include resource-directed water
quality issues in licence allocations. This calls for a decision support tool
that will guide regional offices in making important licence allocation
decisions. The decision support tool must be able to facilitate a decision
despite incomplete, imprecise, and highly variable data. Also, the decision
needs to be based on multiple criteria such as socio-economic factors, race and
gender considerations, alignment with catchment strategy, etc.
We suggest that Bayesian networks provide a possible solution to this problem.
Bayesian networks have proven to be an extremely powerful technique for
reasoning under uncertainty, and their graphical structure explicitly represents
cause-and-effect assumptions between variables. In this article we explain the
role of Bayesian networks in licence allocation decision making. The procedure
consists of identifying the indicators and considerations relevant to WQM. The
Bayesian network links the indicators and considerations and enables us to
calculate a value (within some probability bounds) for each consideration,
given the values of indicators. These values are then used to calculate the
probability of allocating the licence (or not). This methodology allows
decision support, even under circumstances of severe uncertainty and lack of
information.
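The indicator, consideration and decision chain described above can be sketched as a very small discrete Bayesian network evaluated by enumeration. Everything specific in this sketch is hypothetical: the node names, the network structure and all probabilities are invented for illustration and are not taken from the study.

```python
from itertools import product

# Prior probabilities of the indicators (used when an indicator is unobserved).
P_INDICATOR = {"quality_ok": 0.6, "benefit_high": 0.5}

# P(consideration is satisfied | its parent indicator).
P_IMPACT_OK = {True: 0.9, False: 0.2}   # parent: quality_ok
P_SOCIO_OK = {True: 0.8, False: 0.3}    # parent: benefit_high

# P(allocate licence | impact_ok, socio_ok).
P_ALLOCATE = {
    (True, True): 0.95, (True, False): 0.5,
    (False, True): 0.3, (False, False): 0.05,
}

def bern(p, value):
    """Probability that a binary node with P(True) = p takes the given value."""
    return p if value else 1.0 - p

def prob_allocate(evidence=None):
    """P(allocate | evidence on indicators), by summing over the joint distribution."""
    evidence = evidence or {}
    num = den = 0.0
    for q, b, impact, socio in product([True, False], repeat=4):
        state = {"quality_ok": q, "benefit_high": b}
        if any(state[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the observed indicators
        joint = (bern(P_INDICATOR["quality_ok"], q)
                 * bern(P_INDICATOR["benefit_high"], b)
                 * bern(P_IMPACT_OK[q], impact)
                 * bern(P_SOCIO_OK[b], socio))
        den += joint
        num += joint * P_ALLOCATE[(impact, socio)]
    return num / den

p_full = prob_allocate({"quality_ok": True, "benefit_high": True})
p_none = prob_allocate()  # still defined when no indicator has been observed
```

The second call shows the property emphasised in the abstract: the network still returns a probability when indicator values are missing, by averaging over their prior distribution.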
Pesticides risk assessment of Goulburn-Murray Water's (G-MW) irrigation supply channels,
Victoria, Australia
Ray Correll¹, Mary Barnes¹, Rai Kookana¹, Golam Kibria²*, Peter Butcher²
¹CSIRO Contaminant Interactions & Remediation, Urrbrae, Adelaide, Australia
²Goulburn-Murray Rural Water Authority, Tatura, Victoria, Australia
Goulburn-Murray Rural Water Authority (G-MW) is a
major rural water authority in Australia.
It supplies water for irrigation, domestic and stock drinking, and
aquaculture. A preliminary survey (2001) found that more than 70 pesticides
were used as herbicides, insecticides, and fungicides. G-MW suspected that
water supply channels contaminated with pesticides could be unfit for human
consumption, stock drinking, irrigation, food processing and aquaculture and
could have an adverse impact on aquatic biota (including native fish) living in
natural waterways.
A first tier (1st) assessment was made of
the risk from pesticides to water quality and through water quality to human,
stock & domestic supply, food industry, pastures, aquaculture and aquatic
ecosystems. Risk was measured using various exposure pathways to channels
(drift, accidental spills, unlawful acts, and drainage discharge) to different
receptors (human, stock, domestic use, food processing industry, aquaculture,
aquatic ecosystem, pastures, and crops).
Various recommendations were made to reduce risks to
water supply channels. The recommendations include monitoring of pesticides in
fruit, vegetables, leguminous pastures and aquaculture production sites; the
use of best-management practices; a new survey to obtain pesticide-use
data in the new and expanding farming zones (not covered in the first-tier
assessment); and the collection of further ecotoxicological information on
harmful pesticides.
A methodology for seismic risk assessment with an application to the insurance industry
Nicholas Davies and Andrzej Kijko
Hannover Re Africa Ltd, Johannesburg and Council for Geoscience, Private Bag X112, Pretoria 0001, South Africa
The study
concentrates on a methodology for probabilistic seismic hazard and seismic risk
assessment. It starts with an introduction and a historical perspective on the
estimation of seismic damage to buildings. The method for the estimation of
expected damage from a probabilistic point of view is then presented. The work
closes with an application of the described methodology to several sites around
South Africa.
Using the Worsley likelihood ratio test to define
changes in longitudinal profiles of rivers
ESJ Dollar¹ and LH Dollar²
¹CSIR Environmentek, Pretoria and ²School of Civil and Environmental Engineering, University of the Witwatersrand
The questions of ‘which’ and ‘how many’ rivers need to be conserved in South Africa
in order to meet the objectives of sustainability and use, and in order to
protect ‘river biodiversity’, remain unanswered. The South African National
Water Act (No. 36 of 1998) makes provision for the classification of water
resources in order to achieve a balance between protection and use. Currently,
a multi-disciplinary project aiming to answer the aforementioned questions is
underway. One of the objectives of this project is to divide the longitudinal
profiles of South African rivers into ‘macro-reaches’. Macro-reaches are defined as
stretches of river of variable length within which inherited controls (such as
lithology, structure, tectonics and so on) are sufficiently similar to result
in a uniform channel type. This has been achieved for only a small number of South
African rivers, as the process is time consuming and data intensive. In this
paper we explore the possibility of using a method (Worsley,
1979) for identifying significant changes in river longitudinal profiles that
can be applied to all rivers in South Africa.
Worsley’s method (1979) defines a single change point in a set of data. This was
applied sequentially to all data points known along a river profile in order to
identify multiple change points. Change point positions derived
from the statistical method were compared to longitudinal profiles that had
previously been sub-divided using survey data, geological maps, aerial
photographs and video footage. The assumption was that if the statistical method
could identify the same change points, then the method could be applied with
reasonable certainty to rivers for which detailed assessments have not been
performed. An example from the Crocodile River is presented. It was found that the
usefulness of the method depended on the assumptions made in applying the Worsley
likelihood ratio. Various questions are raised to determine whether
there are more appropriate methods of finding change points and how the data
set should be sub-divided to provide statistically valid answers that are also
useful to decision makers.
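A single change point search of this kind, and one way of applying it repeatedly, can be sketched as follows. This is a sketch under assumptions rather than the procedure used in the study: the statistic is the maximum absolute two-sample t statistic over all split positions (the form the likelihood ratio takes for a shift in the mean of normal data), repeated application is implemented as binary segmentation, and the threshold is a hypothetical constant standing in for sample-size-dependent critical values. The data are invented.

```python
from math import sqrt

def max_t_change_point(y):
    """Return (k, |t|) for the split y[:k] / y[k:] with the largest pooled
    two-sample t statistic; k is None if no split shows any difference."""
    n = len(y)
    best_k, best_t = None, 0.0
    for k in range(2, n - 1):  # keep at least two points on each side
        left, right = y[:k], y[k:]
        m1, m2 = sum(left) / k, sum(right) / (n - k)
        ss = sum((v - m1) ** 2 for v in left) + sum((v - m2) ** 2 for v in right)
        if ss == 0.0:
            t = float("inf") if m1 != m2 else 0.0
        else:
            s = sqrt(ss / (n - 2))
            t = abs(m1 - m2) / (s * sqrt(1.0 / k + 1.0 / (n - k)))
        if t > best_t:
            best_k, best_t = k, t
    return best_k, best_t

def find_change_points(y, threshold, offset=0):
    """Binary segmentation: split at the best change point while it exceeds
    the (hypothetical) threshold, then search each side again."""
    if len(y) < 4:
        return []
    k, t = max_t_change_point(y)
    if k is None or t < threshold:
        return []
    return (find_change_points(y[:k], threshold, offset)
            + [offset + k]
            + find_change_points(y[k:], threshold, offset + k))

# Hypothetical channel-slope values along a profile, with one abrupt change.
profile = [1.0, 1.1, 0.9, 1.0, 1.05, 3.0, 3.1, 2.9, 3.0, 2.95]
changes = find_change_points(profile, threshold=5.0)
```

The returned indices mark the first data point of each new segment; the results depend strongly on the threshold and on the assumption of independent, normally distributed values.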
Modified likelihood function with environmental applications
Abdel H. El-Shaarawi
National Water
Research Institute, Burlington, Ontario, Canada and Department of Chemical
Engineering, University of Genoa, Italy
Inferences about a vector of parameters θ using the likelihood
function L(θ) based on independent observations x1, ..., xn
remain unchanged under permutations of the elements of x = (x1, ..., xn).
This fact leads to a simple approximation of L(θ) using order statistics.
The approximation produces closed-form estimates for the parameters of the
location and scale family of distributions and a single non-linear equation
for estimating the shape parameter. It is then used to fit
the generalized extreme value distribution to several environmental data sets.
Simulation results will also be presented to study the performance of the method.
Modeling the
accumulation of contaminants in the aquatic environment
Abdel H. El-Shaarawi and Abdalla Elbergali
National Water Research Institute, Burlington, Ontario,
Canada
The main dynamic features of the growth of an organism or a community are usually
modeled as a set of deterministic differential equations. Models of this type
do not account for natural environmental and/or biological variability. This
paper discusses how to account for the variability by including stochastic
components in such models and how to use the Fokker-Planck or diffusion
equation to compute the moments of the associated random process. Expressions
for the first and second order moments are required for the application of
the quasi-likelihood method to make inferences about the process parameters. The
results of using this approach to model the accumulation of contaminants in the
tissue of fish from the Great Lakes will be presented and discussed.
Modeling
particulate matter data in sparse monitoring networks
A. Fasso and O. Nicolis
University of Bergamo, Italy
In many countries, particulate matter has been monitored only recently. Since
these monitoring networks are often sparsely distributed over the land, it is
of interest to assess the spatio-temporal correlation between particulate matter
and other quantities. In this talk, some models describing the relation between
PM10 and Oxide Nitrate are developed and compared using data from northern Italy.
Modeling waterborne infectious outbreaks: when, where, and how bad will
they be?
Nina H. Fefferman and Elena N. Naumova
Tufts University
We offer mathematically rigorous definitions of previously ambiguous epidemiological
concepts of waterborne infections. These definitions include notions of
environmental exposure, disease temporal patterns, and outbreak signatures. We
define an outbreak as an occurrence of either of the following: the probability
of exposure is more than one standard deviation higher than the norm in at
least one location, or else the size of the population open to possible
exposure to infection at a given time is more than one standard deviation
higher than the norm. By using the specific properties of the disease as
parameters to generate a disease temporal pattern, we portray an outbreak as a composition
of these patterns that form a unique outbreak signature. Based on the proposed
definitions, we derive a sequential combinatorial model of symptomatic
manifestation of disease outbreaks. By considering the observed number of new
cases per unit time as a composite function of the probabilities of exposure,
infection, and the timing of the onset of symptoms, we are able to
differentiate between spatial and temporal spreads of sustained exposure.
Finally, we use the demographic distribution of affected populations in order
to further decompose an outbreak signature. We demonstrate our model
performance using a simulated hypothetical outbreak. Using ten years of
Massachusetts surveillance data on laboratory-confirmed cryptosporidiosis, we
compile a set of parameters essential for modeling. Then, using the proposed
model, we attempt to describe a suspected outbreak of cryptosporidiosis that occurred
in Worcester, MA in 1995.
The proposed modeling procedure allows us (1) to decompose an outbreak signature and differentiate
between different disease spread scenarios, (2) to predict endemic Poisson-like
fluctuations prior to an outbreak, and (3) to examine the likely reporting
errors associated with all phases of an outbreak manifestation.
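The outbreak definition given in this abstract (exposure probability in at least one location, or the size of the exposed population, more than one standard deviation above the norm) can be written as a simple rule. In this sketch the "norm" is taken to be the mean of a historical series, which is an assumption; the function names and all numbers are hypothetical.

```python
from statistics import mean, stdev

def exceeds_norm(history, current):
    """True if current is more than one standard deviation above the
    historical mean (the mean is this sketch's reading of 'the norm')."""
    return current > mean(history) + stdev(history)

def is_outbreak(exposure_history, exposure_now, susceptible_history, susceptible_now):
    """Outbreak if exposure is elevated in at least one location, or if the
    population open to exposure is elevated."""
    exposure_flag = any(
        exceeds_norm(exposure_history[loc], exposure_now[loc])
        for loc in exposure_now
    )
    population_flag = exceeds_norm(susceptible_history, susceptible_now)
    return exposure_flag or population_flag

# Hypothetical exposure probabilities by location and population sizes.
exposure_history = {
    "A": [0.010, 0.020, 0.015, 0.012],
    "B": [0.030, 0.028, 0.031, 0.029],
}
susceptible_history = [1000, 1100, 1050, 980]

flag_elevated = is_outbreak(exposure_history, {"A": 0.014, "B": 0.050},
                            susceptible_history, 1020)
flag_baseline = is_outbreak(exposure_history, {"A": 0.014, "B": 0.0296},
                            susceptible_history, 1020)
```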
Towards a more integrated approach to the management of
land, water and external inputs at a catchment scale
W Enkerlin, A Fajgelj, IG Ferris, L Gourcy, K Gross, LK
Heng, L Foglund, O Perera, J Turner, G Voigt, F Zapata
Food and
Environmental Protection Section, Joint FAO/IAEA Division of Nuclear Techniques
in Food and Agriculture; Soil and Water Management and Crop Nutrition Section,
Joint FAO/IAEA Division of Nuclear Techniques in Food and Agriculture; and Soil
Science Unit, Agency’s Laboratories, International Atomic Energy Agency
I.Ferris@iaea.org F.Zapata@iaea.org and L.K.Heng@iaea.org
One of the key challenges for sustainable
intensification of agricultural production is to ensure sufficient, safe and
nutritious food production for an ever growing population on limited land and
water resources while promoting natural resource conservation (Lal, 2000;
Walling, 2001). The recently held World Summit on Sustainable Development
(WSSD) in September 2002 reaffirmed land degradation as one of the major global
environment and sustainable development challenges of the 21st
century. Agricultural lands have been more severely affected by degradation
processes. About one third of agricultural land has been degraded during the
last 50 years. Land degradation affects soil quality through several processes
including water or wind erosion, waterlogging, salinization, acidification,
soil compaction, nutrient mining and soil organic matter (organic carbon)
depletion. The main causes of land degradation are inappropriate land use,
unsustainable farming practices, deforestation, and overgrazing. Another
serious challenge, albeit less visible, is environmental pollution that degrades
soil and water quality (Lal, 2000; UNEP, 2000). Pesticides are often perceived
as a cause of groundwater and surface water pollution. Both challenges require
an integrated approach at the landscape level and linkages between production
and monitoring activities (OECD, 2001). These integrated studies must be
planned and implemented by multi-disciplinary teams with a range of knowledge
and expertise that is often spread across several institutions (Chalk et al., 2002). To change current
practices it is imperative that all stakeholders, including decision and
policy makers (farmers and regulatory officials), are involved from the outset
(OECD, 2001).
Lizelle Fletcher
Department of Statistics, University of South Africa
The Water Research Commission (WRC) came into being in 1971 to address the
management and utilisation of South Africa’s meagre water resources in a
sustainable and responsible manner. Cloud seeding research in this country,
focusing on rainfall enhancement, was thus driven by the WRC and the S A
Weather Bureau (SAWB) from the early 1980s onwards.
The Bethlehem Precipitation Research Project (BPRP),
conducted from 1984 to 1989, was an exploratory experiment initiated with the
primary objective to investigate natural and artificially modified
precipitation processes within summertime convective clouds. Simultaneously, a
similar cloud-seeding experiment, the Program for Atmospheric Water Supply
(PAWS), was independently conducted by a private company CloudQuest at Carolina
in the eastern part of the country.
The National Precipitation Research Programme (NPRP)
followed on the BPRP, and was the result of the amalgamation in 1990 of the
above two projects, combining the efforts of the Bethlehem meteorologists and
the Carolina researchers. This project came to an abrupt end in April 1997, at
the height of progress and success, when the then Minister of Water Affairs and
Forestry withdrew the WRC’s funding for rainfall enhancement research in South
Africa. It was during this period that a novel hygroscopic seeding technique
was developed which was successfully implemented not only in South Africa, but
also in Mexico, under the auspices of the US National Center for Atmospheric
Research (NCAR) as well as in Thailand.
Early in 1995 the Northern Province government
approached the WRC and the SAWB to employ cloud seeding at very short notice as
an emergency response to drought in the province. This operational project ran
until 1997.
The South African Rainfall Enhancement Program
(SAREP) proposal was drafted in the second half of 1997. This semi-operational
project was conducted from December 1997, building on the previous work of the
NPRP. Funding for this project was unfortunately unstable, leading to its
demise in April 2001.
Estimating space-time trends combining observations with
output of numerical models
Montserrat Fuentes
Statistics Department, North Carolina State University
Estimating spatio-temporal trends of air pollution levels is vital for air quality
management, and presents statistical problems typical of many environmental and
spatial applications. Ideally, such trends would be based on a dense network of
monitoring stations, but this does not always exist. Instead, there are generally
two main sources of information about
pollution levels: one is pollution measurements at a sparse set of monitoring
stations and the other is the output of
regional-scale air quality models.
Here we develop formal methods for combining sources of information with different
spatial resolutions for space-time trend estimation. We formulate this problem
using a hierarchical model, in which scientific information and the output of
numerical models are introduced to improve the trend estimation.
We also offer a review of the current literature and approaches that use the output of
numerical models as a prior for trend estimation.
We present applications to air quality and wind field trend estimation using the
output of numerical air quality models and numerical weather prediction models.
Modeling Several Levels of Uncertainty in Hearing Threshold
Data
Byron J. Gajewski
University of Kansas
According to the World Health Organization (WHO) and many other organizations, noise
pollution is a very important environmental problem. The WHO also
states that there is not enough knowledge of the “dose-response”
relationship between noise pollution and its effects on humans. To better understand human
hearing loss, for example, good statistical methods for analyzing this type of
data are necessary. Over the past few years we have been involved in analyzing
data sets using hearing threshold as the primary response variable. In general,
the distribution of hearing thresholds (y)
tends to have heavy tails and to be more peaked than normal or log-normal
distributions. A typical remedy is to transform the hearing
threshold by taking the natural logarithm of the response plus a parameter, ln(y + alpha), where alpha is an unknown parameter.
In the past, some researchers have set alpha to 20 a priori, while others
have estimated alpha by maximum likelihood. We use a Bayesian approach
to estimate alpha.
Additionally, we wish to obtain the probability of a 15 decibel (dB) drop in subjects’
hearing threshold over time. Furthermore, as a nuisance, the data typically
suffer from right censoring, grouping and missing observations. Therefore, we
desire a method that simultaneously accounts for the uncertainty in (1) the usual
parameters associated with a parametric model; (2) the added parameter, alpha;
(3) the transformation to a probability measure; and (4) the nuisance tendencies
of the data. To account for this list of uncertainties, we formulate a Bayesian method. We then
demonstrate the method with a case study. The method can be extended for use in
many types of hearing threshold studies.
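To make the role of alpha concrete, the sketch below computes a grid posterior for alpha when ln(y + alpha) is treated as normally distributed. It is a simplified illustration and not the model of the talk: it assumes a flat prior for alpha on the grid and the standard noninformative prior for the normal mean and variance (which are integrated out analytically), and it ignores censoring, grouping and missing observations. The thresholds and the grid are hypothetical.

```python
from math import exp, log

def log_marginal(alpha, y):
    """Log marginal likelihood of alpha, up to a constant, with the normal
    mean and variance integrated out under the prior p(mu, sigma^2) ∝ 1/sigma^2.
    The final term is the Jacobian of the transformation z = ln(y + alpha)."""
    z = [log(v + alpha) for v in y]
    n = len(z)
    zbar = sum(z) / n
    s = sum((v - zbar) ** 2 for v in z)
    return -0.5 * (n - 1) * log(s) - sum(z)

def grid_posterior(y, alphas):
    """Posterior probabilities of alpha on a grid, assuming a flat prior."""
    logs = [log_marginal(a, y) for a in alphas]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical hearing thresholds in dB; every y + alpha must be positive.
thresholds = [-5, 0, 5, 10, 10, 15, 20, 35, 60]
alpha_grid = [10, 15, 20, 25, 30, 40]
posterior = grid_posterior(thresholds, alpha_grid)
```

Unlike fixing alpha at 20 or plugging in a maximum likelihood estimate, the grid of posterior probabilities carries the uncertainty about alpha into any later calculation.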
Isotonic regression for the normalisation of environmental
quality data
Anders Grimvall
Department of Mathematics, Linköping University, SE-58183 Linköping, Sweden
angri@mai.liu.se
Trends in environmental quality can
emerge more clearly if the collected data are normalised by removing
meteorologically driven fluctuations and other forms of natural variation. The
most widely used normalisation procedures are based on linear or non-linear
regression models. In this paper, we discuss how we can incorporate prior
knowledge about monotone response to one or more explanatory variables. In
particular, we propose a technique that facilitates estimation of monotone
temporal trends in the presence of seasonal variation and one or more
covariates. The basic idea is simple. First, the seasonal pattern is decomposed
into increasing and decreasing components prevailing during different parts of
the year, and then an algorithm for isotonic regression is employed to extract
a monotone temporal trend. Because existing algorithms for least squares
isotonic regression can be used only for problems involving a small number of
explanatory variables, we propose a computationally simpler two-step procedure
in which a smooth function is first fitted to the data and then made monotone. More
precisely, we suggest that a kernel smoother is used to produce a response
surface for a grid of values of the explanatory variables and that a simple
averaging algorithm developed by Mukerjee and Stern (1994) and improved by
Strand (2003) is then used to produce an increasing or decreasing response
surface. The performance of our method is tested on water and air quality data
collected over several seasons.
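The two-step idea (smooth first, then make monotone) can be sketched for a single explanatory variable. This is an illustration under assumptions: a Gaussian Nadaraya-Watson smoother is used for the first step, and the classical pool-adjacent-violators algorithm is substituted for the Mukerjee and Stern averaging algorithm in the second step. The data, grid and bandwidth are hypothetical.

```python
from math import exp

def kernel_smooth(x, y, grid, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel at each grid point."""
    fitted = []
    for g in grid:
        w = [exp(-0.5 * ((xi - g) / bandwidth) ** 2) for xi in x]
        fitted.append(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    return fitted

def pava_increasing(values):
    """Least-squares non-decreasing fit by pooling adjacent violators."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the current block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

# Hypothetical yearly observations with noise around an upward trend.
years = [0, 1, 2, 3, 4, 5, 6, 7]
conc = [2.0, 2.6, 2.1, 3.0, 2.8, 3.9, 3.5, 4.2]
smooth = kernel_smooth(years, conc, grid=years, bandwidth=1.0)
trend = pava_increasing(smooth)
```

For a decreasing trend the same function can be applied to the negated values and the result negated again.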
Using wavelet tools to estimate and assess trends in atmospheric data
Peter Guttorp
University of Washington, USA
Recently, wavelet methods have been applied to a variety of geophysical time series. I
will describe some of these tools (the modified wavelet transform, multiscale
decomposition, wavelet variance, etc.) and show how they can be used to estimate
long-term temporal structure with attendant standard errors. The methodology
will be illustrated in two contexts: estimation of hemispherical mean
temperature, and an experiment measuring atmospheric turbulence. Possible
extensions to space-time data will also be discussed.
Developing ecosystem health monitoring programs for rivers and streams
Bronwyn D. Harch, Maree O’Sullivan, Ross Sparks
CSIRO Mathematical and Information Sciences
CMIS, Qld Bioscience Precinct, 306 Carmody Road, St Lucia 4067, Australia
Bronwyn.Harch@csiro.au
During the last decade, the long-term ecological health of Australian rivers and streams has
emerged as one of the biggest national natural resource management objectives.
The National Water Quality Management Strategy forms a national program in
Australia to achieve ecologically sustainable use of water resources by
protecting and enhancing their quality, while maintaining economic and social
development.
The main focus
for monitoring the ‘health’ of rivers and streams has progressed from only
measuring water chemistry attributes to measuring aspects of ecosystem health –
chemical, physical and biological. Ecosystem health for freshwaters is now
increasingly being diagnosed using aquatic macro-invertebrates (water bugs),
fish, frogs, nutrients (forms of N and P), algae, macrophytes, the production
and consumption of organic carbon, sediment bacteria and the more traditional
water chemistry parameters.
Examples of
our involvement with the development of ecosystem health programs in South-East
Qld, South-West Qld and the Sydney area will be presented with an emphasis on
the contributions made by statistics. Aspects of sampling program design,
statistical analysis and reporting of ecosystem health assessment will be
highlighted.
(Poster presentation)
Kernel Estimates of Hazard Functions and Application
Ivana Horova,
Jiri Zelinka and Marie Budikova
In
recent years, considerable attention has been paid to methods for analyzing
data on events over time and to the study of factors associated with recurrence
rates for these events.
In
summarizing survival data, there are two functions of central interest, namely
the survival and the hazard functions. The well-known product-limit estimator
of the survival function was proposed by Kaplan and Meier in 1958. A single
sample of survival data may also be summarized through the hazard function. We
focus on nonparametric estimates of the hazard function and its derivatives
under random censoring, based on kernel estimates applied to the Nelson
estimator of the cumulative hazard function. Kernel estimation is one of the
most effective nonparametric methods. An automatic procedure for the
simultaneous choice of the parameters of kernel estimates is applied to the
estimation of hazard functions for two carcinoma data sets kindly provided by
the Masaryk Memorial Cancer Institute in Brno. Attention is also paid to the
points of most rapid change of these hazard functions.
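As a rough illustration of the idea of kernel smoothing the Nelson estimator, the sketch below convolves the jumps of the cumulative hazard estimate with a kernel. The Epanechnikov kernel, the fixed bandwidth, the assumption of distinct observation times and the toy data are all assumptions made here for illustration; the automatic parameter selection described in the abstract is not reproduced.

```python
def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_hazard(t, times, events, bandwidth):
    """Kernel-smoothed hazard estimate at time t.

    times  : observed times (event or censoring), assumed distinct
    events : 1 if the time is an observed event, 0 if it is censored
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(times)
    total = 0.0
    for rank, i in enumerate(order):
        at_risk = n - rank                  # subjects still under observation
        increment = events[i] / at_risk     # jump of the Nelson estimator
        total += epanechnikov((t - times[i]) / bandwidth) * increment
    return total / bandwidth


# Hypothetical toy data: four subjects, the third one censored.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
print(kernel_hazard(2.0, times, events, bandwidth=1.0))
```

Censored observations contribute no jump but still count in the risk set of earlier times, which is how random censoring enters the estimate.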
A Class of Linear Space-Time Models Applied to Estimation of
Fish Populations
Gudmund Høst
Norwegian Computing Centre, Norway
gudmund.host@nr.no
Joint
modeling of population dynamics in space and time may be important when
estimating population abundance. In estimating abundance of marine populations,
both the data collection and the population may have space-time
characteristics. We will present a class of models for population density
derived from a linear partial differential equation with random forcing. We
show that the model has an equivalent and more familiar representation as
smoothing of uncorrelated noise.
We
estimate parameters by Markov Chain Monte Carlo (MCMC) and apply the model to a
herring population in Norway.
The power of the test in analysis of variance of Poisson
distributed variables
Zuzana Hrdličkova
Masaryk University Brno, Czech Republic
Department of Applied Mathematics
The
power of the test of a linear hypothesis in the classical ANOVA model with
normal errors can be found by means of the noncentral F distribution. Knowledge
of the power of the test is very useful for planning biometrical experiments.
If
the assumptions of the classical linear model are violated, generalized linear
models are often used instead. The tests are then based on Pearson's chi^2
statistic and on the deviance or scaled deviance. In this situation it is
generally very complicated to derive the exact power of the test, although in
some situations the asymptotic power can be used. In this paper we therefore
focus on ANOVA models in which the error terms have a Poisson distribution. We
aim to calculate the power of the test for particular parameter values by
means of simulation. The results obtained are applied to environmental data.
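A minimal sketch of how such a power simulation could be set up is given below, assuming a one-way layout with three groups and the deviance (likelihood-ratio) test compared with its asymptotic chi^2 distribution. The group means, sample sizes and number of replications are hypothetical, and the three-group restriction is chosen only because the chi^2 quantile with 2 degrees of freedom has the closed form -2 log(alpha).

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate (Knuth's multiplication method, for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def deviance_statistic(groups):
    """Likelihood-ratio (deviance) statistic for H0: all group means are equal."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    stat = 0.0
    for g in groups:
        total = sum(g)
        if total > 0:                       # 0 * log(0) is taken as 0
            stat += total * math.log((total / len(g)) / grand_mean)
    return 2.0 * stat


def simulated_power(means, n_per_group, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated data sets in which H0 is rejected."""
    if len(means) != 3:
        raise ValueError("this sketch hard-codes 2 degrees of freedom (3 groups)")
    critical = -2.0 * math.log(alpha)       # exact chi^2 quantile for 2 df
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        groups = [[poisson_draw(rng, m) for _ in range(n_per_group)] for m in means]
        if deviance_statistic(groups) > critical:
            rejections += 1
    return rejections / n_sim


size = simulated_power([3.0, 3.0, 3.0], n_per_group=10)    # under H0
power = simulated_power([3.0, 3.0, 5.0], n_per_group=10)   # under one alternative
print(size, power)
```

Running the same function under the null hypothesis gives an empirical check of the test size, which indicates how far the asymptotic approximation can be trusted at the chosen sample size.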
Diagnosing Hydroclimatic Variations and Change based on a
Quantile Regression framework
Shaleen Jain
NOAA-CIRES
Climate Diagnostics Center and Cooperative Institute for Research in
Environmental Sciences, University of Colorado, Boulder, CO 80305-3328
Shaleen.Jain@noaa.gov
Decadal
and longer-term variations in regional hydroclimate directly impact the
sustainability of water-dependent natural systems (e.g., stream ecosystems), as
well as the planning and operations of water-related infrastructure (dams,
reservoirs, levees, and water supply systems). Diagnosis of the variations and
change from observational data is often made difficult by the presence of
nonstationarities, outliers, and departures from normality. Quantile regression
methodologies, employed on full and moving-window subsets of the data, largely
overcome these problems and allow a robust characterization of the sensitivity
of the full empirical probability distribution. Results are presented for an
analysis framework to diagnose long-term variations across various quantiles
for the major river systems along the North American west coast. Preliminary
results are also presented for extending the diagnosis framework to a
multivariate setting. Finally, we discuss the relevance of the framework toward
the characterization of climate-related vulnerability of water resources.
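The robustness referred to above comes from the check (pinball) loss that quantile regression minimizes. The sketch below shows only the simplest, intercept-only case, in which the minimizer is a sample quantile; the flow values are hypothetical, and the regression on covariates and the moving-window analysis described in the abstract are not reproduced.

```python
def check_loss(residual, tau):
    """Quantile-regression (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def constant_quantile_fit(values, tau):
    """Intercept-only quantile regression: the data value minimising total check loss."""
    return min(values, key=lambda q: sum(check_loss(y - q, tau) for y in values))


# Hypothetical annual flows with one extreme year.
flows = [12.0, 15.0, 11.0, 14.0, 13.0, 95.0, 16.0]
median_fit = constant_quantile_fit(flows, 0.5)
upper_fit = constant_quantile_fit(flows, 0.9)
print(median_fit, upper_fit)
```

The median fit is unaffected by the extreme year, while the upper quantile responds to it, which is the kind of quantile-specific sensitivity the framework is designed to diagnose.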
Seismological Outliers: L1 or Adaptive Lp‑Norm Application
Andrzej Kijko
Council for Geoscience,
Private Bag X112, Pretoria 0001, South Africa
In the early 1930's Jeffreys recognized that the errors
in determination of arrival times of seismic waves do not follow the Gaussian
distribution required by least squares (L2‑norm)
location techniques. Large errors in the arrival times can displace the least‑squares
estimate of the earthquake hypocenter far away from the true location. As an alternative
to the least‑squares procedure, the minimization of the sum of absolute
residuals (L1‑norm),
as protection against the effects of outliers, is often used. In this note a
modification of inversion procedures is proposed which incorporates an adaptive
algorithm for Lp‑norm
estimation. A detailed study by
numerical simulation demonstrates that in the presence of outlying
observations, an Lp‑norm
procedure can select the proper value of p,
which may not necessarily equal 1. The advantage of this procedure is that no
a priori decision to use the L1‑norm
needs to be made; instead, the data prescribe the most appropriate value of p.
In on‑line seismic networks where the system is
fully automated, and where there is no chance for manual intervention, it is suggested
that the adaptive Lp‑norm
offers the most reliable method for data processing.
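The effect of the norm on an outlier can be seen in the simplest possible setting, a single location parameter. The sketch below minimizes the sum of |residual|^p by ternary search for a fixed p; the residual values are hypothetical, and the adaptive rule for choosing p from the data is not reproduced here.

```python
def lp_location(values, p, tol=1e-7):
    """Location estimate minimising sum(|x - m|**p) for p >= 1, by ternary search."""
    def cost(m):
        return sum(abs(x - m) ** p for x in values)

    lo, hi = min(values), max(values)
    while hi - lo > tol:                    # the cost is convex for p >= 1
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) < cost(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Hypothetical arrival-time residuals in seconds, with one gross outlier.
residuals = [0.1, -0.2, 0.0, 0.3, 5.0]
l2_estimate = lp_location(residuals, 2.0)   # pulled towards the outlier
l1_estimate = lp_location(residuals, 1.0)   # stays with the bulk of the data
print(l2_estimate, l1_estimate)
```

With p = 2 the estimate coincides with the mean and with p = 1 with the median; intermediate values of p trade efficiency under Gaussian errors against resistance to outliers.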
Keywords: Outliers, adaptive Lp norm
Probability Modeling of the Water-Quality: Assessment in Mid-Atlantic Region
M. Liu (1), N. K. Neerchal (1), E. A. Greene (2), and A. E. LaMotte (2)
(1) Department of Mathematics and Statistics, UMBC, Baltimore, MD 21250
(2) United States Geological Survey, Baltimore, MD 21237
Nitrate
concentration level is an important indicator of water quality. It is therefore
useful to relate the levels of nitrate concentration in a watershed to various
indicators of stress, such as fertilizer applications and the percentage of
agricultural land in the watershed. In this project, existing nitrate
concentration data from the U.S. Geological Survey (USGS) National Water
Quality Assessment (NAWQA) were used in conjunction with geographic data to
develop logistic regression equations to predict the presence of elevated
levels of nitrate concentration. The resulting logistic-regression equations
were transformed to determine the probability of nitrate concentration
exceeding a threshold relating to a specific designated use. Generalized
Estimating Equations methodology is used to estimate models that account for
intra-watershed correlation. Statistical tests are also performed to
investigate possible overdispersion resulting from intra-watershed correlation.
Even though this model was developed for nitrate contamination, the method can
be used for other water-quality parameters that may affect human or
environmental health.
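The transformation from a fitted logistic-regression equation to an exceedance probability is the inverse logit. The sketch below shows that step only; the coefficients and predictor values are hypothetical and are not taken from the NAWQA analysis.

```python
import math


def exceedance_probability(intercept, slopes, predictors):
    """Inverse-logit transform of a fitted logistic-regression linear predictor."""
    linear = intercept + sum(b * x for b, x in zip(slopes, predictors))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and watershed values, for illustration only.
intercept = -2.0
slopes = [0.03, 0.01]       # per % agricultural land, per unit of fertilizer applied
watershed = [50.0, 40.0]    # 50 % agricultural land, 40 units of fertilizer
probability = exceedance_probability(intercept, slopes, watershed)
print(probability)
```

A separate set of coefficients would be fitted for each nitrate threshold of interest, so that each designated use has its own probability equation.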
Functional
data analysis methods in the environmental sciences
Wendy Meiring
University of California, Santa Barbara
Functional
data are observations from curves or surfaces. Many environmental data sets may
be considered as functional data. For example a balloon-based ozonesonde
measures the ozone partial pressure profile as the sonde ascends through the
atmosphere. Each launch time provides observations from one ozone partial
pressure profile as a function of altitude. A sequence of sonde launches provides
observations from a time series of ozone profile "curves", with each
curve a function of altitude. The shape of these curves evolves over time in
response to complex dynamical and chemical processes. We present functional
data analysis methodology to estimate altitude-dependent non-linear time trends
and other space-time modes of variability. We illustrate throughout with a
study of altitude-dependent variation in stratospheric ozone partial pressures
over Hohenpeissenberg in Germany, based on a time series of ozonesonde flights.
We
illustrate the value of functional data analysis methods for studying
altitude-dependent non-linear time trends and ozone variation related to the
Quasi-Biennial Oscillation (QBO). Due to the large number of observations, our
analysis combines dimension-reducing functional basis approximations with
flexible additive models on the low-dimensional basis function coefficient scale.
We
fit additive coefficient models composed of cubic and periodic splines, as
special cases of smoothing spline ANOVA (SSANOVA) models. We provide
preliminary estimates of the uncertainty of altitude-dependent QBO effects and
non-linear time trends, building on SSANOVA Bayesian confidence intervals.
We
discuss the importance of non-linear time trends to study ozone recovery as
stratospheric chlorine levels decrease.
We
indicate continued improvements of the uncertainty measures as well as model
extensions for additive and interactive effects of additional atmospheric
explanatory variables.
Analysis of Multidimensional Discrete data with Rasch type
Models. Application to the WHO Housing and Health Survey.
Mounir Mesbah
Université de
Bretagne-Sud, Vannes, France.
Most
general population surveys nowadays include, alongside measurements of
various manifest observed variables, sets of questions giving categorical
ordinal responses. Each set is inserted to measure a specific latent trait,
i.e. an unobserved causal variable.
Rasch
type models relate an unobserved continuous latent variable (ability) to
manifest observed ordinal unidimensional items. This model, which belongs to
the item response theory family, was first developed in psychometrics to
measure a univariate trait of students in educational testing.
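For orientation, the simplest member of the Rasch family is the dichotomous model, in which the probability of a positive response depends only on the difference between the person's latent trait and the item's difficulty. The sketch below shows that special case with hypothetical parameter values; the ordinal and multidimensional extensions used in this work are not reproduced.

```python
import math


def rasch_probability(ability, difficulty):
    """P(positive response) under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def pattern_log_likelihood(ability, difficulties, responses):
    """Log-likelihood of a 0/1 response pattern, items independent given ability."""
    total = 0.0
    for difficulty, response in zip(difficulties, responses):
        p = rasch_probability(ability, difficulty)
        total += math.log(p if response == 1 else 1.0 - p)
    return total


# Hypothetical item difficulties and one respondent's answers.
difficulties = [-1.0, 0.0, 1.5]
responses = [1, 1, 0]
print(pattern_log_likelihood(0.5, difficulties, responses))
```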
In
our work, we use graphical and Rasch type models to show how one can
construct a valid multivariate distribution to deal with such complex situations.
Then,
we present data from a large pan-European survey, the “Housing and Health
Survey”, which includes various sets of questions measuring Quality of Life,
Immediate Environment, Health, Housing conditions and other related indicators.
Finally,
using this large data set, we show how to build and validate a complex
causal model predicting Health Related Quality of Life from housing condition
variables.
References:
X. Bonnefoy et al. “Housing and Health in Europe: Preliminary
Results of a Pan-European Study”, American Journal of Public Health,
September 2003.
J.B. Hardouin and M. Mesbah. “Clustering binary variables in subscales
using an extended Rasch model and Akaike information criterion”. Submitted.
M.L. Feddag and M. Mesbah. “Generalized Estimating Equations for
Longitudinal Mixed Rasch Model”. Submitted.
Jaroslav Michalek,
Masaryk University Brno,
Czech Republic
Data from populations with a negative binomial distribution (NBD) are very
frequent in environmental applications. The NBD can be considered as a mixture
of Poisson distributions with a Gamma distribution of the means. The
distribution has two parameters, m and k, where m is the expectation and the
variance of the distribution is given by m + m^2/k. Thus the parameter k is
related to the so-called over-dispersion relative to the Poisson distribution.
If k is known, the standard GLM technique can be used. If k is unknown, it can
be estimated. The data analysis is then often performed in the same way as in
the case where k is known. The situation is more complicated if k is unknown
and the estimate of m depends on covariates. In that case, the general method
of maximum likelihood can be applied, and asymptotic results can be used only
if the sample size is large enough.
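The mixture representation and the variance formula m + m^2/k can be checked numerically. The sketch below draws Poisson counts whose means follow a Gamma distribution with shape k and scale m/k; the parameter values, sample size and seed are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def poisson_draw(rng, mean):
    """Draw one Poisson variate (Knuth's multiplication method, for moderate means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def negative_binomial_draw(rng, m, k):
    """Draw from the NBD as a Poisson whose mean is Gamma(shape=k, scale=m/k)."""
    return poisson_draw(rng, rng.gammavariate(k, m / k))


rng = random.Random(2003)
m, k = 4.0, 2.0
sample = [negative_binomial_draw(rng, m, k) for _ in range(20000)]
sample_mean = statistics.mean(sample)
sample_variance = statistics.pvariance(sample)
print(sample_mean, sample_variance, m + m * m / k)
```

As k grows the Gamma distribution concentrates at m and the variance approaches the Poisson value m, which is why small k corresponds to strong over-dispersion.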
In this contribution, simulated data will be used to
demonstrate different approaches to estimating and testing the parameters of the
NBD. The robustness of the applied methods will be compared. The results of a
statistical analysis of ecological data using the NBD will be presented in
the second part of the paper. The data were obtained from an
ecological study aimed at verifying the hypothesis that the distribution of
large herbivores in a forest depends on the presence of a shrub layer.
Estimating
rainfall in Zimbabwe
Kingstone Mutsonziwa
School of Statistics and
Actuarial Science, University of the Witwatersrand
Annual rain gauge readings from stations in Zimbabwe for 1953-1993 were used
to estimate, by kriging, the amount of rainfall received in areas where no
readings were made. Rainfall in Zimbabwe is correlated with latitude and
longitude, being approximately constant in the NW-SE direction. Areas towards
the NE receive substantial amounts of rainfall in comparison with the rest
of Zimbabwe. Some limitations of the analysis will be discussed
and some recommendations made.
Evaluation of effects of environmental temperature on
seasonality of infections
Elena N.
Naumova, PhD and Jyotsna Jagai, MS
Tufts
University School of Medicine
To
explore long-term, mid-term, and short-term effects of environmental stressors,
and extreme events in particular, on incidence of infectious diseases,
analytical tools have to be adapted to the specific properties of a health
outcome. Infectious diseases typically manifest as periods of low incidence
alternating with periods of outbreak clusters, which form a unique seasonal
pattern. Each infectious disease has specific
clinical properties, such as incubation time, and dominant routes of
exposure/transmission, that can be influenced by environmental factors. While climatic
conditions typically define or restrict a habitat area of a pathogen,
meteorological factors affect timing and intensity of infectious
outbreaks. Extreme weather events may
have pronounced immediate effects, induce a shift in seasonal pattern, and have
long-term consequences.
The
objective of this study was to develop a methodology for assessing long-term,
mid-term, and short-term effects of ambient temperature on the temporal pattern
of enteric infections (EI). We
discuss the notion of seasonality and approaches to parametric modeling of a
seasonal pattern. We suggest a two-stage hierarchical regression modeling
procedure. The first stage provides estimates of seasonal characteristics,
which are used in the second stage to estimate the degree of association among
those characteristics.
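One common parametric description of a seasonal pattern is a single annual harmonic, from which seasonal characteristics such as amplitude and peak timing can be read off. The sketch below assumes that form, which is a choice made here for illustration and is not confirmed as the authors' first-stage model; the synthetic series is hypothetical.

```python
import math


def harmonic_fit(daily_values, period=365):
    """Least-squares fit of mean + a*sin + b*cos to a series covering whole periods.

    Returns (mean, amplitude, peak_day), with peak_day counted from day 0.
    """
    n = len(daily_values)
    if n % period != 0:
        raise ValueError("series must cover a whole number of periods")
    omega = 2.0 * math.pi / period
    mean = sum(daily_values) / n
    # Over whole periods the sine and cosine regressors are orthogonal,
    # so the least-squares coefficients reduce to these sums.
    a = 2.0 / n * sum(y * math.sin(omega * t) for t, y in enumerate(daily_values))
    b = 2.0 / n * sum(y * math.cos(omega * t) for t, y in enumerate(daily_values))
    amplitude = math.hypot(a, b)
    peak_day = (math.atan2(a, b) / omega) % period
    return mean, amplitude, peak_day


# Synthetic two-year series peaking on day 200 of each year.
omega = 2.0 * math.pi / 365
series = [10.0 + 5.0 * math.cos(omega * (t - 200)) for t in range(730)]
fitted_mean, fitted_amplitude, fitted_peak = harmonic_fit(series)
print(fitted_mean, fitted_amplitude, fitted_peak)
```

Applying the same fit to a disease series and to the temperature series gives two peak days whose difference is one simple measure of the delay in peak timing.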
Using
45,816 records of reported cases of six enteric infections collected over a
ten-year period (1992-2001) in Massachusetts, we described the diseases'
temporal patterns. We linked these temporal
patterns to daily measures of ambient temperature abstracted from databases of
the National Oceanic and Atmospheric Administration. All EI except one
exhibited well-defined seasonal patterns. Two
diseases almost mimicked the seasonal temperature curve, and in three EI
we observed significant delays in peak timing relative to the peak in ambient
temperature. Extreme temperature predicted the intensity of the seasonal
increase and the timing of the seasonal peak in waterborne infections.
Foodborne infection incidence was the most
sensitive to short-term temperature fluctuations, rising significantly
2, 8 and 15 days after an increase in temperature. Our findings offer insights
into the possible dominant routes of
transmission for these diseases.
We
suggest that weather forecasts can be considered for forecasting EI, so that
public health measures to prevent these diseases can be better targeted and
focused.
MFWIS: A
Surveillance System For Waterborne Infections
Elena N.
Naumova and Ian B. MacNeill
Tufts
University School of Medicine, Boston, MA 02111 and Department of Statistical
and Actuarial Sciences, University of Western Ontario, London, Ontario,
Canada N6A 5B7
elena.naumova@tufts.edu and ibmacneill@hotmail.com
MFWIS
is a system for monitoring case count data on infections with a view to early
detection of outbreaks and to forecasting the extent of detected
outbreaks. Historical data are smoothed
using a loess-type smoother. Upon receipt of a new datum, the smoothing is
updated and estimates are made of the first two derivatives of the smooth curve
and these are used for near-term forecasting.
Recent data and the near-term forecasts are used to compute a warning
index. The algorithm for computing the
warning index and the interpretation of the index have been designed to effect
a balance between type I errors (false prediction of an epidemic) and type II
errors (failure to correctly predict an epidemic). If the warning index signals a sufficiently high probability of
an epidemic, then a forecast of the possible size of the outbreak is made. This longer term forecast is made by fitting
a “signature” curve to the available data.
The effectiveness of the forecast depends upon the extent to which the
signature curve captures the shape of outbreaks of the infection under
consideration.
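One plausible way to turn estimates of the first two derivatives into a near-term forecast is a second-order Taylor extrapolation from the latest smoothed value. The sketch below assumes that approach and estimates the derivatives by one-sided finite differences; it does not reproduce the loess-type smoother, the warning index or the signature-curve fit of MFWIS, and the input values are hypothetical.

```python
def near_term_forecast(smoothed, steps_ahead=1):
    """Extrapolate a smoothed series using one-sided estimates of its first two derivatives."""
    if len(smoothed) < 3:
        raise ValueError("need at least three smoothed values")
    y0, y1, y2 = smoothed[-1], smoothed[-2], smoothed[-3]
    first = (3.0 * y0 - 4.0 * y1 + y2) / 2.0    # backward difference, first derivative
    second = y0 - 2.0 * y1 + y2                 # backward difference, second derivative
    h = float(steps_ahead)
    return y0 + h * first + 0.5 * h * h * second


# Hypothetical smoothed weekly case counts that are accelerating.
smoothed_counts = [1.0, 4.0, 9.0, 16.0]
print(near_term_forecast(smoothed_counts, steps_ahead=1))
```

These difference formulas are exact for a quadratic curve, so the forecast error comes only from higher-order behaviour of the smoothed series and from noise remaining after smoothing.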
Crossing Problems
Jiri Neubauer
University of Ostrava
When
observing a random process, it is of interest to know the probability that the
process crosses a given level during a particular time interval. It is also
useful to know how much time the process spends above a given level.
This contribution is dedicated to the crossing problem: to the question of the
mean number of times the process crosses a given level and to the question of
the time spent above that level. The spectral analysis of
time series, particularly the estimation of the spectral density function, was
needed to solve these problems. Results will be presented for simulated time
series and for time series of some air pollutants measured in the
town of Vyskov in the Czech Republic.
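The two quantities of interest have simple empirical counterparts for an observed series, which can serve as a check on theoretical values. The sketch below counts level crossings and the share of observations above the level; the series and the level are hypothetical, and the spectral-density-based calculation described in the abstract is not reproduced.

```python
def crossing_summary(series, level):
    """Count level crossings and the share of observations above the level."""
    above = [x > level for x in series]
    crossings = sum(1 for a, b in zip(above, above[1:]) if a != b)
    share_above = sum(above) / len(series)
    return crossings, share_above


# Hypothetical pollutant concentrations and a limit value.
concentrations = [0.0, 2.0, 0.0, 2.0, 0.0]
print(crossing_summary(concentrations, level=1.0))
```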
Power of non-parametric trend tests – a semi-parametric
approach
Anders Nordgaard
Linköping University, Sweden
Non-parametric
tests for monotone trends in environmental quality data have become widespread
during the past decades. The basic ideas were formulated by Mann (1945) and
Kendall (1975). More specific features, such as the handling of serial
correlation and adjustments involving covariates, have been discussed by Hirsch
and Slack (1984), Libiseller and Grimvall (2002), and several other authors.
However, despite the popularity of non-parametric trend tests, there is no
generally accepted procedure to calculate the power of such tests. In this
paper, we develop a semi-parametric method for such calculations. The method is
based on simulations involving resampling from measured data. It is
non-parametric in the sense that it does not require any specific model of the
random variation in collected data. However, it is parametric in the sense that
the trend scenarios are expressed as parametric functions. The method is
applied to measured concentrations of nutrients in Swedish rivers.
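A simplified version of such a calculation is sketched below: the random variation is resampled with replacement from measured values, a parametric (here linear) trend is added, and the share of rejections by the Mann-Kendall test is recorded. Independent resampling, the linear trend, the normal approximation without tie correction and the data values are all assumptions made here for illustration; serial correlation and covariates are ignored.

```python
import math
import random
from statistics import NormalDist


def mann_kendall_z(values):
    """Mann-Kendall test statistic, normal approximation without tie correction."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(variance)
    if s < 0:
        return (s + 1) / math.sqrt(variance)
    return 0.0


def resampled_power(observed, slope, alpha=0.05, n_sim=500, seed=1):
    """Share of resampled series with an added linear trend that the test rejects."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n = len(observed)
    rejections = 0
    for _ in range(n_sim):
        noise = [rng.choice(observed) for _ in range(n)]     # resample with replacement
        series = [noise[t] + slope * t for t in range(n)]    # parametric trend scenario
        if abs(mann_kendall_z(series)) > critical:
            rejections += 1
    return rejections / n_sim


# Hypothetical detrended nutrient concentrations (mg/l), one value per year.
observed = [2.1, 1.8, 2.5, 2.0, 1.9, 2.7, 2.3, 1.6, 2.2, 2.4,
            1.7, 2.6, 2.0, 2.8, 1.5, 2.1, 2.3, 1.9, 2.5, 2.2]
no_trend = resampled_power(observed, slope=0.0)
with_trend = resampled_power(observed, slope=0.05)
print(no_trend, with_trend)
```

Repeating the calculation over a grid of slopes traces out an empirical power function for the given series length and noise distribution.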
Keywords: Time series,
Mann-Kendall test, power function, resampling, bootstrap
References
Hirsch, R.M., Slack, J.R. (1984): A nonparametric trend test for season