371b Cluster Analysis to Investigate Air Quality Trends at Houston, TX

S. Pakalapati1, Jose A. Romagnoli1, S. Beaver2, and A. Palazoglu2. (1) Chemical Engineering, Louisiana State University, Cain Department of Chemical Engineering, South Stadium Road, Baton Rouge, LA 70803, (2) Department of Chemical Engineering and Materials Science, University of California, Davis, One Shields Avenue, Davis, CA 95616

Ozone is a secondary pollutant formed by photochemical reactions of volatile organic compounds (VOCs) with oxides of nitrogen (NOx), and it has adverse effects on human health, agriculture, and the environment. The Clean Air Act of 1970 requires the federal Environmental Protection Agency (EPA) to set National Ambient Air Quality Standards (NAAQS) for harmful pollutants, including ground-level ozone. State governments are in turn required to attain these standards, and developing effective regulatory control strategies requires detailed knowledge of the physical and chemical processes affecting ozone accumulation. Houston, TX is a moderate nonattainment area under the new 8-hr ozone NAAQS, which is based on the daily maximum of the 8-hr running average ozone concentration at a given geographical location. The root causes of episodes in which the NAAQS threshold of 85 ppb is exceeded are unique to Houston. Rapid ozone formation occurs in this region because of both highly variable meteorology, owing to its proximity to the Gulf Coast, and intense emissions from the petrochemical industry.
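The 8-hr metric described above can be illustrated with a short Python sketch. It is only a simplified illustration, not the regulatory data-handling procedure: it looks only at 8-hr windows that lie entirely within one 24-hr record, and the names `daily_max_8hr`, `is_episode`, and the `min_valid` completeness rule are assumptions introduced here.

```python
from typing import Optional, Sequence

EXCEEDANCE_PPB = 85.0  # threshold quoted in the text


def daily_max_8hr(hourly_ppb: Sequence[Optional[float]],
                  min_valid: int = 6) -> Optional[float]:
    """Largest 8-hr running mean over one day of hourly ozone values.

    None marks a missing hour. A window is used only if it holds at
    least `min_valid` readings (an assumed completeness rule).
    Returns None when no window qualifies.
    """
    best = None
    for start in range(len(hourly_ppb) - 7):
        window = [v for v in hourly_ppb[start:start + 8] if v is not None]
        if len(window) < min_valid:
            continue
        mean = sum(window) / len(window)
        if best is None or mean > best:
            best = mean
    return best


def is_episode(hourly_ppb: Sequence[Optional[float]],
               threshold: float = EXCEEDANCE_PPB) -> bool:
    """Flag a day whose daily maximum 8-hr mean reaches the threshold.

    Counting a value exactly equal to the threshold is an assumption.
    """
    peak = daily_max_8hr(hourly_ppb)
    return peak is not None and peak >= threshold
```

Applied to each monitoring site and day, a function of this kind would yield the per-site values from which a spatial ozone field for each episode day could be assembled.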

Researchers typically study air pollution dynamics using Air Quality Models (AQMs), which solve differential equations representing atmospheric transport and reaction processes over a gridded, three-dimensional study domain. AQMs require a very large amount of time to develop and are specific to a relatively short period of time for a particular geographic region. Given the large volume of air quality data produced by monitoring networks, which record hourly ground-level air quality and meteorological parameters, statistical analysis offers a means of investigating ozone pollution on a large scale. In this study, we consider cluster analysis as a tool for investigating air quality data sets spanning multiple years. Cluster analysis is an unsupervised statistical method that identifies recurring patterns among a set of observations. When the relevant distance and time scales are carefully considered in setting up the cluster analysis, the results identify groups of days sharing similar spatio-temporal patterns, which point to recurring physical processes affecting local ozone levels.

We apply to the Houston area two advanced clustering algorithms that we developed previously for air quality studies. The first algorithm is an extension of the widely used k-means algorithm, in which N days are assigned among k clusters. By analyzing the spatial field of daily maximum 8-hr ozone on episode days, we identify recurring spatial distributions of ozone, which can be used to directly infer various regimes for ozone production. The second clustering algorithm, intended for time series measurements exhibiting a diurnal (daily) cycle, is applied to the hourly wind field measurements and identifies a small number of recurring mesoscale flow patterns that have a large bearing on regional ozone levels.
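For readers unfamiliar with the baseline method, the following Python sketch shows plain k-means (Lloyd's algorithm) assigning N days among k clusters. It is not the extended algorithm used in this study, whose details are not given here. Each day is assumed to be represented by a vector of daily maximum 8-hr ozone values, one entry per monitoring site; the function names and the random initialization are illustrative choices.

```python
import random
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(days: Sequence[Sequence[float]], k: int,
           n_iter: int = 100, seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Plain k-means: returns (cluster label per day, cluster centroids)."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct, randomly chosen days.
    centroids = [list(d) for d in rng.sample(list(days), k)]
    labels = [-1] * len(days)
    for _ in range(n_iter):
        # Assignment step: each day joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(d, centroids[j]))
            for d in days
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its member days.
        for j in range(k):
            members = [d for d, lab in zip(days, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical values (ppb) at three sites on four days, for illustration only.
example_days = [[90, 60, 50], [88, 62, 52], [50, 60, 92], [52, 58, 90]]
example_labels, example_centroids = kmeans(example_days, k=2)
```

In this toy input, the first two days peak at the first site and the last two at the third site, so the two groups end up in separate clusters, and each centroid is the mean spatial ozone field of its cluster.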

Our study focuses on the period 1 April through 31 October of the years 2001 to 2004. Measurements are available from two separate networks of ground-level monitoring stations operated by the Texas Commission on Environmental Quality (TCEQ). All data are reported hourly, though missing measurements are a common problem in the analysis of environmental measurements over such an extended observation period. The first network, the Continuous Air Monitoring Stations (CAMS), monitors ozone and NOx concentrations. From these air quality data, the daily maximum 8-hr ozone levels can readily be calculated for 145 episode days. The spatial field of daily maximum 8-hr ozone is clustered using the first algorithm to identify episodes sharing similar spatial distributions of regional ozone levels, which suggests that those episodes share a common cause. A second, meteorological monitoring network records hourly ground-level wind speed, wind direction, and temperature data. The time series clustering algorithm is applied to these continuous wind field measurements to determine the meteorological regimes affecting regional air quality.

To interpret the observed spatial and temporal variation of the clusters, we calculate the cluster means at each monitoring site for the daily maximum 1-hr and 8-hr ozone concentrations, the morning NO spike capturing rush hour emissions, and various meteorological parameters. When plotted geospatially, these means indicate localized activity of the respective parameters within each cluster. The association between the mesoscale wind flow patterns and ozone concentration is identified from snapshots of time-dependent wind direction distribution plots. By clustering the episode days based on recirculation factors calculated from the wind data, and by comparing these clusters with the ozone clusters, we identify the atmospheric transport characteristics of the pollutants. The average daily 500-hPa and 850-hPa weather maps for each cluster show the synoptic meteorological patterns in the upper atmosphere. Cluster-averaged time series of wind speed, wind direction, temperature, and concentrations of ozone precursors are analyzed for the episode days and the days preceding them to observe the variation in diurnal pattern at each site.
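The recirculation factor is not defined in this abstract. As a hedged illustration, the Python sketch below uses one common single-station definition, R = 1 - L/S, where L is the net (resultant) transport distance and S is the total wind run over the period; whether this matches the calculation used in the study is an assumption. Wind direction is taken in the meteorological convention (degrees clockwise from north, direction the wind blows from), and the function name is hypothetical.

```python
import math
from typing import Optional, Sequence


def recirculation_factor(speeds: Sequence[float],
                         directions_deg: Sequence[float]) -> Optional[float]:
    """Assumed definition: R = 1 - (net transport distance) / (wind run).

    Inputs are equally spaced (e.g. hourly) wind speeds and directions,
    so the time step cancels in the ratio. R near 0 means steady,
    straight-line transport; R near 1 means air returns to its origin.
    Returns None when the wind is calm throughout the period.
    """
    sum_u = sum_v = wind_run = 0.0
    for speed, direction in zip(speeds, directions_deg):
        theta = math.radians(direction)
        sum_u += -speed * math.sin(theta)  # eastward component
        sum_v += -speed * math.cos(theta)  # northward component
        wind_run += speed
    if wind_run == 0.0:
        return None
    return 1.0 - math.hypot(sum_u, sum_v) / wind_run


# Idealized cases, for illustration only.
steady = recirculation_factor([3.0] * 24, [180.0] * 24)
reversing = recirculation_factor([3.0] * 24, [180.0] * 12 + [0.0] * 12)
```

A steady southerly wind gives a value close to 0, while a flow that reverses halfway through the day, as in an idealized land/sea breeze cycle, gives a value close to 1.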

The cluster-averaged NOx, temperature, and wind speeds show no significant cluster-to-cluster variation. This indicates that ventilation and emissions of oxides of nitrogen have a relatively small impact on ozone formation. The wind patterns identified by the wind field clustering lie near ground level and are not strongly influenced by the upper atmosphere at the 500-hPa level. Distinguishable patterns observed at the 850-hPa level indicate the importance of low-altitude flow and the land/sea breeze cycle for the Houston area, as opposed to synoptic conditions driven by the upper atmosphere. Some of the wind patterns are much more conducive to ozone formation than others; however, some variability in ozone levels remains that cannot be related to mesoscale flow patterns.

Our statistical methods have allowed rapid analysis of the Houston area, starting with little advance knowledge of the study region. Additionally, the scale of the study, spanning four years of historical data, is larger than can practically be handled using traditional AQM simulation methods. Our results are promising and reveal several unique aspects of Houston ozone dynamics; however, the research will clearly benefit from the incorporation of VOC data in our future work.