Measuring NO2 with a Winsen ZE25-O3 Ozone Sensor

Ozone belongs high up in the atmosphere, but sadly, due to various things we humans do, we can sometimes get quite a lot of it here at ground level. We’ve written a little about ozone before in case you are curious, but otherwise, let’s get technical and try to measure ozone…well, as it turns out, NO2.

Not into reading? Jump straight to the code!

About the Sensor

The first step is to understand the sensor a little. The Winsen ZE25-O3 is a fairly cheap sensor (around 33 USD or 28 EUR on AliExpress). It is an electrochemical sensor, which means that it reacts with the target gas, ozone in our case, producing a current with a known relation to the gas concentration.

The sensor is cross-sensitive to NO2, i.e. it will also react if NO2 is present in the air. It has a stated resolution of 0.01 ppm (10 ppb) and a detection range of 0 to 10 ppm. It has an analog output as well as a digital one (UART). A very brief introduction to electrochemical sensors can be found in Introduction to Electrochemical (EC) Gas Sensors. It is not about our sensor specifically, but it conveys the basic idea quite well.

On paper this sensor looks pretty nice, but what about reality? We do not have a lab but rather a very limited budget, and this is our first experiment with a sensor. This limits what we can learn about it, but it also makes things more fun, as we are forced to be a bit creative to learn anything at all.

Collecting Data and Correlating with Nearest Government Air Quality Station

We don’t have a calibrated reference sensor that we can simply place next to the sensor under test. Instead, we take advantage of the fact that the city of Berlin measures air quality at various places in the city. We try different stations based on how close they are to where our indoor measurement is performed, and pick the “best” one by measuring the correlation between our measurements and each station’s.

As we know that sensors can be cross-sensitive, and our sensor’s datasheet states that it is cross-sensitive to NO2, we make sure to correlate it against NO2 as well, not only O3.

Setup

We followed the sensor’s datasheet to connect it correctly. Unfortunately, the provided cable wasn’t ideal for our Arduino, so we had to cut the connector and connect the individual cables directly to the Arduino. This seems to work fine. For the digital value (UART) there’s a checksum, which we calculated and it matched, indicating that the connection is fine.
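The checksum check can be sketched in a few lines. The snippet is in Python rather than Arduino C++ so it can be tried on a laptop. It assumes the 9-byte frame layout and checksum rule that Winsen documents for its ZE-series modules (start byte 0xFF, concentration in bytes 4 and 5, checksum equal to the two's complement of the sum of bytes 1 to 7), so please verify these against the ZE25-O3 datasheet. The function names and the example frame are ours.

```python
from typing import Optional

FRAME_LENGTH = 9   # assumed frame size (check the datasheet)
START_BYTE = 0xFF  # assumed start-of-frame marker


def winsen_checksum(frame: bytes) -> int:
    """Two's complement of the sum of bytes 1..7, truncated to one byte."""
    if len(frame) != FRAME_LENGTH:
        raise ValueError("expected a 9-byte frame")
    return (-sum(frame[1:8])) & 0xFF


def parse_ppb(frame: bytes) -> Optional[int]:
    """Return the concentration in ppb, or None if the frame is invalid."""
    if len(frame) != FRAME_LENGTH or frame[0] != START_BYTE:
        return None
    if winsen_checksum(frame) != frame[8]:
        return None
    # Assumed layout: high byte at index 4, low byte at index 5
    return frame[4] * 256 + frame[5]


# Hypothetical frame built for illustration, not captured from a sensor
payload = bytes([0xFF, 0x2A, 0x04, 0x00, 0x00, 0x19, 0x27, 0x10])
frame = payload + bytes([(-sum(payload[1:])) & 0xFF])
print(parse_ppb(frame))
```

A frame with a single flipped bit fails the comparison and is rejected, which is why a matching checksum is a reasonable sign that the wiring is sound.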

Figure 1. Sensor connected to an Arduino Uno, which in turn is connected to a laptop’s USB port.

We connected 5V and ground to the corresponding pins on the Arduino. The sensor’s TX is connected to digital pin 2 on the Arduino. The sensor’s analog output is connected to analog input 0. That’s all that is needed on the hardware side of things. Now let’s move on to software.
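A side note on what the analog reading means: the Arduino Uno has a 10-bit ADC, so the analog input gives an integer between 0 and 1023 rather than a voltage. A minimal sketch of the conversion, in Python for easy experimenting, assuming the default 5 V analog reference and the 1023 divisor used in Arduino's own analog voltage tutorial (the function name is ours):

```python
ADC_MAX = 1023  # Arduino Uno: 10-bit ADC, readings range from 0 to 1023
V_REF = 5.0     # assumption: default 5 V analog reference


def adc_to_volts(analog_val: int) -> float:
    """Convert a raw ADC reading to an approximate voltage."""
    if not 0 <= analog_val <= ADC_MAX:
        raise ValueError("reading outside the 10-bit ADC range")
    return analog_val * V_REF / ADC_MAX


print(round(adc_to_volts(90), 3))
```

One ADC step is just under 5 mV, which limits how finely the analog output can be resolved with this setup.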

Code

With the connections done, it’s time for ones and zeroes, or well, maybe just source code to make the ones and zeroes flow. The main piece of code for collecting data with the Arduino is:

int ppb = sensor.readPPB();
int analogVal = analogRead(A0);

if (ppb > 0) {
 Serial.print(ppb);
 Serial.print(",");
 Serial.println(analogVal);
 delay(1000);
}

For the complete code, see GitHub.

Then there’s the other piece for analyzing the data, which thanks to pandas and other Python libraries is simple and concise. The main code is:

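import pandas as pd
import scipy.stats
import seaborn as sn
from sklearn.linear_model import LinearRegression

# Rows arrive about once per second, so 3600 rows is roughly one hour
hourlyData = (
 ozoneData
 .groupby(ozoneData.index // 3600)
 .mean()
)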

sensorAndBerlinData = pd.concat(
 [hourlyData, berlinData],
 axis=1
)

model = LinearRegression().fit(
 sensorAndBerlinData[['analogVal']],
 sensorAndBerlinData['no2']
)

# ...print out model coefficients...
# correlations calculated using scipy as:
scipy.stats.pearsonr(
 sensorAndBerlinData['analogVal'],
 sensorAndBerlinData['no2']
)
scipy.stats.spearmanr(
 sensorAndBerlinData['analogVal'],
 sensorAndBerlinData['no2']
)
scipy.stats.kendalltau(
 sensorAndBerlinData['analogVal'],
 sensorAndBerlinData['no2']
)

# Show fancy plot with regression line
sn.lmplot(
 x='analogVal',
 y='no2',
 data=sensorAndBerlinData,
 fit_reg=True
)

For the complete code, see GitHub.
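The grouping on ozoneData.index // 3600 relies on the Arduino writing roughly one line per second, so the row number doubles as a timestamp in seconds and 3600 consecutive rows form one hour. Here is a dependency-free sketch of the same bucketing, using made-up readings and a function name of our own:

```python
from collections import defaultdict
from statistics import mean


def hourly_means(samples, samples_per_hour=3600):
    """Average consecutive samples in buckets of samples_per_hour rows."""
    buckets = defaultdict(list)
    for i, value in enumerate(samples):
        # Integer division maps rows 0..3599 to hour 0, 3600..7199 to hour 1, ...
        buckets[i // samples_per_hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Made-up readings: two full hours and one half hour
readings = [90] * 3600 + [100] * 3600 + [110] * 1800
print(hourly_means(readings))
```

Note that the last bucket is averaged even when it is incomplete, and that any dropped lines shift later samples into earlier hours, so the alignment with the hourly station data is only approximate.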

With the code written, we just need to collect some data. Ideally we’d do it in stable conditions, but as that is a bit tricky for us, we just left the computer and Arduino in a corner on the kitchen table for roughly two days - and hoped our toddler wouldn’t go play with it. As we just dump the data from the sensor to the serial port, we can read and store it (on Arch Linux) with:

cat /dev/ttyACM0 > \
 home_ppb_analog_2021-01-25T07:00:00Z

To stop reading data, press Ctrl+C in the terminal. Once we have some data (the more the better, of course), we run the Python analysis code with:

source ./env/bin/activate
cd src
python main

We have to run it from the right folder, as we hard-coded the path to the data. If you wonder about source ./env/bin/activate, see Virtual Environments and Packages; it’s just a convenient way to run your Python code in isolation.
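The file written by cat holds one ppb,analogVal pair per line, but the capture may start mid-line and the serial link can occasionally drop characters, so a loader should tolerate malformed lines. A minimal sketch, assuming the two-column format printed by the Arduino snippet (the function name and sample lines are ours):

```python
def parse_log(lines):
    """Parse 'ppb,analogVal' lines into (ppb, analog_val) tuples.

    Lines that do not contain exactly two integers are skipped.
    """
    rows = []
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue
        try:
            rows.append((int(parts[0]), int(parts[1])))
        except ValueError:
            # Empty or non-numeric field, e.g. a line cut off mid-write
            continue
    return rows


# Made-up sample, including a garbage line and a truncated one
sample = ["24,91\n", "25,90\r\n", "garbage\n", "12,\n", "30,92\n"]
print(parse_log(sample))
```

A truncated line that still happens to look like two integers cannot be detected this way, so the very first line of a capture is worth checking by eye.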

Result

Running our Python analysis code gave us the following picture:

Figure 2. A value representing the sensor’s output voltage (analogVal) plotted against the outdoor NO2 concentration, together with a regression line and 95% confidence interval.

Where the model (regression line) has the following equation:

f(x) = 7.52x - 636.96

Its R2 score is 0.36.

We get the following correlations between “analogVal” (a number representing the sensor’s output voltage) and NO2 concentration reported by the city air quality monitor station (MC174):

Pearson: 0.60, p-value=0.0002
Spearman: 0.57, p-value=0.0005
Kendall's Tau: 0.37, p-value=0.003
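For a straight-line fit with a single predictor, the R2 score equals the square of the Pearson coefficient, which gives a handy sanity check: 0.60 squared is 0.36, matching the R2 score of our model. The sketch below demonstrates the identity without any dependencies. The data points are made up for illustration and are not our measurements.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line."""
    slope, intercept = linear_fit(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Made-up example: analogVal-like readings vs. NO2-like values
xs = [88, 89, 90, 91, 92, 93]
ys = [20.0, 35.0, 28.0, 51.0, 44.0, 60.0]
print(r_squared(xs, ys), pearson_r(xs, ys) ** 2)
```

The two printed numbers agree up to floating-point rounding, so reporting both R2 and Pearson for a one-variable linear model carries the same information twice.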

Naturally, we tried a lot of different ways of looking at the data. As both the data and the code are on GitHub, we encourage you to play around with them yourself if you are curious.

You might have noticed that we use analogVal rather than the ozone concentration in PPB calculated by the sensor (which we also store). That is because we got a stronger correlation when correlating analogVal with the city’s data. This also means the Arduino code could be just a few lines, but we left the code for reading the PPB value from the sensor’s UART output in case someone wants to play around with it.

Using the results to estimate indoor NO2

So far we have tried to predict outdoor NO2 levels based on indoor measurements. We’re exploiting the fact that outdoor levels tend to influence indoor levels [1] [2] [3] to estimate the outdoor level given the indoor level. This can naturally be turned around: we can estimate indoor levels given outdoor levels.

In [1] they found the indoor/outdoor ratio to be 0.73, and in [2] they found it to vary in the range from 0.88 to 1. Using this, we could estimate indoor NO2 directly from Berlin’s outdoor measurements. But as we’re often experimenting with air purification, this wouldn’t be useful: if we succeeded in reducing indoor NO2, an estimate based only on outdoor data would not show it.

So, for potential usefulness, or perhaps mainly for fun (as the analogVal would suffice for us), we updated our code to report estimated indoor NO2. We estimate indoor levels by taking the output of our fitted model and multiplying it by 0.73. To get the result in PPB (Berlin’s data that we used as reference is in μg/m3) we divide by 1.88:

double estimateIndoorNO2PPB(int analogVal) {
  return ((7.51826819982211 * analogVal
         - 636.9571501802109) * 0.73)
         / 1.88;
}

See the code over at GitHub.
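The same arithmetic can be ported to Python for use alongside the analysis code, together with a check of where 1.88 comes from. The factor is the molar mass of NO2 (46.0055 g/mol) divided by the molar volume of an ideal gas. The reference conditions of 25 °C and 1 atm are our assumption, as are the function names and the example reading of 90.

```python
R_L_ATM = 0.082057        # ideal gas constant in L*atm/(mol*K)
NO2_MOLAR_MASS = 46.0055  # g/mol


def ugm3_per_ppb(molar_mass, temp_c=25.0, pressure_atm=1.0):
    """Micrograms per cubic metre that correspond to 1 ppb of a gas."""
    molar_volume = R_L_ATM * (temp_c + 273.15) / pressure_atm  # L/mol
    return molar_mass / molar_volume


def estimate_indoor_no2_ppb(analog_val, io_ratio=0.73, factor=1.88):
    """Python port of estimateIndoorNO2PPB from the Arduino code."""
    # Fitted model: outdoor NO2 in ug/m3 as a function of analogVal
    outdoor_ugm3 = 7.51826819982211 * analog_val - 636.9571501802109
    # Scale to indoor levels, then convert ug/m3 to ppb
    return outdoor_ugm3 * io_ratio / factor


print(round(ugm3_per_ppb(NO2_MOLAR_MASS), 2))
print(round(estimate_indoor_no2_ppb(90), 1))
```

Because the model is a straight line with a negative intercept, readings below roughly 85 give a negative estimate, so the result is only meaningful within the range of analogVal values seen during fitting.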

Luckily, we previously bought an air quality monitor that can measure NO2. By assuming that there should be a strong correlation with outdoor levels, and the stronger the better, we have an evaluation metric between our home-made setup and this commercial air quality monitor. It is not perfect, but it is also not completely made up, as both we and other studies have detected correlation between indoor and outdoor NO2 concentrations. We also don’t have any known NO2 or O3 sources in our home, which means both pollutants are likely mainly coming in with the outdoor air.

Figure 3. NO2 measured by an Uhoo Air Quality monitor plotted against the outdoor NO2 concentration. Note that the units are different and therefore not directly comparable.

Figure 3 shows that the relation between indoor and outdoor NO2 is quite weak. It is definitely less clear than in Figure 2. When fitting a regression line to the data, we get an R2 score of 0.06. This can be compared to 0.36 which we got with the Winsen ZE25-O3 sensor. We get the following correlations between indoor and outdoor NO2 concentration:

Pearson: -0.24, p-value=0.178
Spearman: -0.48, p-value=0.005
Kendall's Tau: -0.34, p-value=0.008

as compared to our results with Winsen ZE25-O3:

Pearson: 0.60, p-value=0.0002
Spearman: 0.57, p-value=0.0005
Kendall's Tau: 0.37, p-value=0.003

For the code testing the Uhoo, see GitHub.
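The gap between Pearson (-0.24) and Spearman (-0.48) for the Uhoo is easier to read with the definitions in mind. Spearman is simply Pearson computed on ranks, so it responds to any monotonic relation and is less affected by a few extreme values. A dependency-free sketch on made-up data (not our measurements) follows; tied values get their average rank.

```python
from math import sqrt


def ranks(values):
    """1-based ranks, with ties receiving the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up data: perfectly monotonic but clearly not linear
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(pearson(xs, ys), spearman(xs, ys))
```

Here Spearman reaches 1 while Pearson stays below it, the same kind of divergence (in the opposite direction of sign) that shows up in the Uhoo numbers.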

Note how we have a stronger correlation with outdoor levels than our commercial air quality monitor. This doesn’t prove that our sensor is better of course, but it is a nice hint that that might well be the case. It certainly makes us a bit sad that we paid 312.55EUR (~379USD) for the Uhoo, when one of the main reasons we got it was that it can measure NO2.

It is worth mentioning that the Uhoo app does not show values with decimals, even though you can see in the graphs that they are stored. There’s also no way to export the data as far as we know, so we simply copied the data from the graphs (using the value given by the Uhoo app when clicking on a point). And naturally, we could have a faulty unit, or something else might have gone wrong. That doesn’t make us feel much better, but it is important to keep in mind that our tests are far from perfect.

Conclusion

We started out wanting to measure ozone and planned to use the nearest city air quality monitor as reference. But we saw a stronger correlation between sensor voltage and NO2. The datasheet of our sensor (Winsen ZE25-O3) does state that it is cross-sensitive to NO2, so we chose to measure NO2 instead. Our theory is that NO2 is simply dominating, perhaps as we don’t get so much sun during the winter here in Berlin.

We fitted a linear model from our sensor’s voltage value to Berlin’s NO2 measurements and used this to predict outdoor levels given indoor measurements. Then we adjusted the value by multiplying with an indoor/outdoor ratio we found for NO2, namely 0.73, to get an estimate of indoor NO2.

We’re quite happy with the results. We managed to beat a commercial air quality monitoring device in terms of correlating with outdoor NO2, suggesting that we measure NO2 with greater precision. And we suspect there are some fairly simple things we learned along the way that we could try to improve our results further.

But perhaps we’re even more excited about the future, where we see some potential in using machine learning or artificial neural networks (sometimes called deep learning) to do better calibration (e.g. taking wind, temperature and humidity into account), perhaps continuous sensor monitoring (to signal when sensors need to be re-calibrated or exchanged) or just improving precision.

If time permits, we’re also a bit keen on actually measuring ozone with an ozone sensor, using what we learnt here to handle the cross sensitivity with NO2 by kind of creating a “software filter”, to get better precision.

References

[1] S. Hwang and W. Park, “Indoor air concentrations of carbon dioxide (CO2), nitrogen dioxide (NO2), and ozone (O3) in multiple healthcare facilities,” Environmental Geochemistry and Health, vol. 42, May 2020, doi: 10.1007/s10653-019-00441-0.

[2] P. Blondeau, V. Iordache, O. Poupard, D. Genin, and F. Allard, “Relationship between outdoor and indoor air quality in eight French schools,” Indoor Air, vol. 15, pp. 2–12, Mar. 2005, doi: 10.1111/j.1600-0668.2004.00263.x.

[3] A. Hagenbjörk-Gustafsson, B. Forsberg, G. Hestvik, D. Karlsson, S. Wahlberg, and T. Sandström, “Measurements of indoor and outdoor nitrogen dioxide concentrations using a diffusive sampler,” The Analyst, vol. 121, no. 9, pp. 1261–1264, 1996, doi: 10.1039/an9962101261.