Cleaning Arctic Aerosol Datasets From Pollution

Ivo Beck

Not to publish the Pollution Detection Algorithm would be a lost opportunity. By making it publicly available, we can help make data treatment processes more transparent.

Ivo Beck, IIE-EERL

What is the project about?

The Pollution Detection Algorithm (PDA) identifies local pollution emissions in aerosol and trace gas datasets from remote locations. Measurements regularly include emissions from the research vessel’s stack, and this pollution is not representative of those remote environments. To develop the algorithm, we used aerosol number concentration data collected during the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition in the central Arctic Ocean from September 2019 to October 2020. A description of the algorithm was published in Atmospheric Measurement Techniques (AMT), an open access journal.

Photo: Ivo Beck

The PDA is written in Python, and the code is interactive, so it can also be run by users who are not familiar with Python. The code is licensed under a Creative Commons Attribution 4.0 International license.


Why Open?

Cleaning datasets of local pollution is a task faced by many scientists in atmospheric research. The PDA provides an objective way to accomplish this and can be applied to many datasets. Not publishing it would be a lost opportunity; by making it publicly available, we can help make data treatment processes more transparent. We also chose to publish the validation of the PDA in an open access journal (AMT).


Who benefits from it? 

The main target group is atmospheric researchers who analyse data from remote areas. There is no common standard method for removing pollution influences from atmospheric datasets, and ancillary data are often used for this purpose. The PDA relies only on the target dataset itself and can therefore be applied easily. It also makes cleaned datasets more comparable, which is another benefit for the atmospheric science community.
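To illustrate what relying only on the target dataset can look like, here is a minimal sketch that flags polluted points from the concentration time series alone. It is not the PDA itself: the function name flag_pollution, the rate-of-change rule and the threshold value are assumptions made for this example, and the actual steps of the algorithm are described in the AMT publication.

```python
def flag_pollution(concentrations, timestep_s=60.0, max_rate=5.0):
    """Return one boolean per data point, True where the point looks polluted.

    Hypothetical helper for illustration only, not the published PDA.
    Assumed rule: a steep rise in concentration starts a polluted period,
    and a steep drop ends it. No ancillary data (wind, position) is used.

    concentrations: particle number concentrations (cm^-3), evenly spaced
    timestep_s:     time between consecutive points, in seconds
    max_rate:       illustrative threshold on the rate of change (cm^-3 s^-1)
    """
    flags = [False] * len(concentrations)
    polluted = False
    for i in range(1, len(concentrations)):
        rate = (concentrations[i] - concentrations[i - 1]) / timestep_s
        if rate > max_rate:
            polluted = True      # sudden increase: pollution plume arrives
        elif rate < -max_rate:
            polluted = False     # sudden decrease: plume has passed
        flags[i] = polluted
    return flags


# Made-up example series with a short plume in the middle.
series = [100.0, 102.0, 101.0, 5000.0, 4800.0, 103.0, 99.0]
flags = flag_pollution(series)
cleaned = [c for c, polluted in zip(series, flags) if not polluted]
```

With this made-up series, the two high values in the middle are flagged and dropped from cleaned. A real dataset would need a threshold tuned to the instrument and the site.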


How did you make it Open Software? 

The algorithm is published on Zenodo, where the code will be updated regularly. The dataset will be available on PANGAEA, an open access data publisher, at the beginning of 2023.


This project has been supported by the Swiss National Science Foundation (grant no. 188478) and the Swiss Polar Institute (grant no. DIRCR-2018-004).


Contact: Ivo Beck, Julia Schmale