A resident installs an air-quality sensor outside their home, hoping to contribute to a growing network of citizen-collected environmental data. But before making the information public, they drag the sensor’s location pin on the digital map slightly down the street.
That small act of digital self-protection is at the center of new research from Yue Lin, a professor in the Department of Geography and Geographic Information Science at the University of Illinois Urbana-Champaign.
Lin’s recent study examined “location masking” — the practice of intentionally obscuring a sensor’s true location before sharing data publicly — using a national dataset of PurpleAir sensors, a popular crowdsourced air-quality monitoring platform. The findings, recently published in The Professional Geographer, reveal that the data increasingly used by researchers and government agencies is shaped not only by technology but also by the people who contribute to it and their concerns about privacy, trust, and surveillance.
For Lin, the findings connect to larger questions about how human behavior and social context shape the data increasingly used in modern technologies.
“Sometimes when people think about biases, the problem becomes purely technical,” Lin said. “But a large part of my research is to answer how these biases are not purely technical.”
Those questions have become increasingly important as artificial intelligence and machine learning systems rely on enormous quantities of data gathered from online platforms and public contributions.
“In an ideal world, everything can be collected as data,” she added. “But in reality, our world is political; it’s not neutral.”
Mapping environmental data
PurpleAir sensors are consumer air-quality monitors installed at homes, schools, intersections, and community spaces. Users can share data via an online map, contributing to a crowdsourced network that researchers use to study localized air pollution.
“In the past, government agencies were the primary force in purchasing and deploying these sensing devices and technologies,” Lin said. “Right now, literally everybody can purchase their own sensors and install them.”
When users register a sensor, they are asked to place a pin on a map marking its location. But some users intentionally move that pin away from the sensor’s actual location before sharing the data.
“For geomasking, it means that some people don’t want people just to locate the sensor in their backyard when they’re viewing the web map,” Lin said. “They may want to drag it or just place the pin a little bit away from their home.”
“If a researcher wants to grab data from the PurpleAir sensing platform and do environmental research, especially place-based research, then the data and location accuracy affect the results,” Lin said.
Privacy and participation
Lin’s interest in location privacy began during her PhD research studying census and geographic data privacy protections. Later, while living in Chicago, she became interested in environmental justice and community-based sensing initiatives that sought to fill gaps left by traditional environmental monitoring systems.
The overlap between those interests led her to PurpleAir data and, eventually, to findings that challenge previous studies of mobile phones and wearable devices. Those studies found that people in urban areas mask their locations more often because of heightened feelings of surveillance. Lin’s study found the opposite pattern with environmental sensors.
Instead of seeing higher masking in dense urban environments, the study found that users in cities were often more willing to share accurate sensor locations. Lin suggested this could be linked to the “hidden-in-the-crowd effect,” where people may feel more anonymous because they are part of a larger population.
The findings suggest privacy behaviors are shaped not only by individual preference but also by neighborhood demographics, including income, educational attainment, age, and racial composition.
“We found that location masking is not just about where people put the sensors, but also about who puts them," Lin said. "Neighborhoods with higher levels of educational attainment, higher income, and older populations have lower levels of location masking behaviors.”
The study also found lower levels of masking in neighborhoods with larger proportions of non-white and Hispanic residents, which Lin partly linked to environmental activism efforts in communities of color.
Rethinking citizen science data
For Lin, the study’s broader implications extend far beyond air-quality sensors.
Lin hopes the study encourages people to think more critically about how crowdsourced data is created and what limitations it carries.
“What is shown on the PurpleAir web map is not a reflection of the truth,” Lin said. “The location is self-reported.”
The study also connects to Lin’s broader research on geospatial technologies, artificial intelligence, and data ethics. Much of her work examines how social and political biases become embedded in datasets and the technologies built from them.
As crowdsourced environmental sensing continues to grow, Lin hopes people recognize both its potential and its limitations.
“Citizen science data may not fully substitute for traditional sensing methods, but it can serve as a valuable complement to them,” Lin said.