The crowdsourcing campaign was organized as a competition, with prizes offered to those who contributed the most, based on a combination of quality and quantity. The Geo-Wiki platform (www.geo-wiki.org), a web platform dedicated to citizen engagement in environmental monitoring, was used to carry out the campaign. A custom user interface was prepared for the campaign (Fig. 2), in which participants viewed a random location in the tropics (here broadly defined as the area between 30 degrees north and south of the equator, i.e. including part of the subtropics), with a 1 × 1 km blue box indicating the area to be interpreted visually. The Global Forest Change (GFC) tree loss map (v1.7)10 was overlaid on the imagery to show all areas where tree loss was detected at any time between 2008 and 2019. The areas of tree loss were shaded in red, and the map itself was aggregated to 100 m resolution for fast rendering.
The year 2008 was chosen as the start date because the EU Renewable Energy Directive (RED) specifies this date as the cut-off year for the conversion of high carbon stock areas, i.e. forests, to other land uses7. In order to capture the main drivers of forest loss, but also to include potential additional drivers such as the existence of roads as precursors to deforestation, participants were asked to perform three steps: 1) select the predominant driver of tree loss visible inside the tree loss pixels in the blue box from a list of nine specific drivers; 2) select all other drivers of tree loss visible inside the tree loss pixels in the blue box from a list of five more general drivers; and 3) indicate whether roads, trails or buildings were visible inside the blue box. The lists of specific and general drivers, along with their definitions, are presented in Table 1. The Geo-Wiki interface allowed participants to switch between different background images, such as ESRI, Google Maps and Bing Maps, as well as Sentinel-2 satellite imagery. The different image sources allowed participants to see the location at different resolutions and at different points in time. The interface also provided participants with information about the current country and continent, as well as the dates of the background images. Additionally, it provided links to view NDVI and Sentinel time series, and to view the location and explore historical imagery using the Google Earth platform. All of these tools were intended to facilitate the identification of drivers of forest loss by allowing participants to examine locations at different times and at different spatial resolutions.
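As a concrete illustration, the three-step task described above can be captured in a simple record structure. This is a sketch only: the driver labels used below are placeholders, since the actual nine specific and five general drivers are defined in Table 1.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Interpretation:
    """One participant's interpretation of a single 1 x 1 km location."""
    location_id: int                                  # sampled blue-box location
    primary_driver: str                               # step 1: one of nine specific drivers (Table 1)
    secondary_drivers: FrozenSet[str] = frozenset()   # step 2: any of five general drivers (Table 1)
    infrastructure_visible: bool = False              # step 3: roads, trails or buildings visible

# Example record; the driver names here are illustrative placeholders.
record = Interpretation(
    location_id=101,
    primary_driver="commercial agriculture",
    secondary_drivers=frozenset({"roads"}),
    infrastructure_visible=True,
)
```

One record of this shape per visited location would be the minimal unit of data the campaign collects.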
At the start of the campaign, each participant received a quick start guide explaining the interface and the required tasks. As shown in Fig. 2, this quick start guide could be accessed again at any time during the campaign. Figure 2 also shows that the interface had buttons for four other functions. The first was to view the sample gallery, with access to pre-loaded video tutorials and sample images describing each driver of forest loss and how to visually interpret and select each one (available at https://application.geo-wiki.org/Application/modules/drivers_forest_change/drivers_forest_change_gallery.html). An illustration of the gallery of examples presented to participants is shown in Figure S1. The second function was to request help from experts, which automatically sent IIASA experts an email regarding a specific location. The third was to join the expert chat, which took participants to a dedicated chat interface on the Discord messaging platform. Here, participants could ask questions and interact directly with IIASA staff and other participants. Finally, there was a button to view the leaderboard as well as the campaign objectives, rules and prizes (available at https://application.geo-wiki.org/Application/modules/drivers_forest_change/drivers_forest_change.html). When participants started the campaign, they were shown 10 initial practice locations where they could try out the user interface (UI); these acted as checkpoints showing participants how to identify the different drivers of forest loss. This set of videos, images and training locations, together with the image gallery, was developed to train participants before and during the campaign.
Campaign set-up and data quality
As the aim of the campaign was to determine the drivers of tree loss across the tropics, sample locations were selected from the GFC tree loss layer10 for the tropics (between 30 degrees north and south of the equator). No stratification was used because a completely random sample across the tropics was considered the most accurate representation of tree loss and its corresponding drivers. The previous map of deforestation drivers6 used a sample of 5,000 10 × 10 km grid cells to produce a global map. Here, the sample size was largely determined by the estimated capacity of the crowd; we therefore aimed to visually interpret 150,000 1 × 1 km locations across the tropics, a considerably larger sample size than that of Curtis et al.6. In order to reduce noise, the GFC tree loss layer10 was first aggregated from the original 30 m resolution to 100 m, and 150,000 centroids were then randomly selected. Of these, a subsample of 5,000 random locations was selected for visual interpretation by six IIASA experts (with backgrounds in remote sensing, agronomy, forestry and geography). Due to time constraints, only 2,001 locations were assessed by at least three different experts. For these locations, disagreements were discussed and, once a consensus was reached, these locations became the final control, or expert, data set. The control locations were then used to produce quality scores for each participant as the campaign progressed, in order to rank them and determine the eventual winners. The list of prizes offered to the top 30 participants is presented in Table S1 of the Supplementary Information (SI), and a list and ranking of the motivations mentioned by the participants are presented in Figure S2 in the SI.
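The random sampling step described above can be sketched as follows. This is a minimal illustration with a toy grid and a toy sample size, assuming the aggregated GFC loss layer is available as a boolean raster; the campaign itself used the real 100 m layer and drew 150,000 locations.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the GFC tree loss layer aggregated to 100 m: a boolean grid
# where True marks a cell with detected tree loss (toy data, ~10% loss).
loss_grid = rng.random((200, 300)) < 0.1

# Indices of all cells with detected loss: these are the candidate centroids.
rows, cols = np.nonzero(loss_grid)

# Randomly select n_samples distinct loss cells (toy value; campaign: 150,000).
n_samples = 500
idx = rng.choice(rows.size, size=n_samples, replace=False)
sample_rc = np.column_stack([rows[idx], cols[idx]])

# Convert cell indices to centroid coordinates, using a toy affine transform
# (origin at (0, 0), 100 m cell size) in place of real geographic referencing.
centroids = (sample_rc + 0.5) * 100.0
```

Sampling without replacement from the set of loss cells, rather than from the whole grid, ensures every selected centroid actually contains detected tree loss.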
Control locations were randomly presented to participants at a ratio of approximately 2 control locations for every 20 uncontrolled locations visited. If participants correctly selected the predominant driver of tree loss (in step 1), they were awarded 20 points; if they chose the wrong answer, they lost 15 points. If participants confused grazing with commercial agriculture, or forest fires with other natural disturbances, they lost only 10 points instead of 15. Additionally, they could earn up to 8 further points by selecting the correct secondary drivers in step 2. If a mixture of correct and incorrect answers was provided in step 2, participants gained 2 points for each correct choice and lost 2 points for each incorrect choice, with a minimum of 0 points for the step. Finally, participants could earn 2 additional points by correctly reporting the existence of roads, trails or buildings in step 3. The scoring system was based on experience from previous Geo-Wiki campaigns and was designed to emphasize correct selection of the main driver. The points were used to produce a leaderboard ranking participants by their total number of points. Additionally, a relative quality score (RQS) was derived from the score received by a user and the potential score that could have been obtained if all control locations had been interpreted correctly, as shown in Eq. (1):

RQS = SumScore / (NCP × 30)    (1)

where RQS ranges between 0 and 1, NCP is the number of control points visited, SumScore is the total number of points obtained, and 30 is the maximum score attainable per control point (20 + 8 + 2 from steps 1–3).
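The scoring rules and the RQS can be sketched as follows. The point values and the confusable driver pairs are taken from the text; the treatment of partially correct step-2 answers (the full 8 points only for an exact match, otherwise ±2 per choice floored at 0) is one plausible reading of the description, and the driver labels are placeholders for those in Table 1.

```python
# Pairs of easily confused drivers, which incur a reduced penalty in step 1.
CONFUSABLE = {
    frozenset({"grazing", "commercial agriculture"}),
    frozenset({"forest fires", "other natural disturbances"}),
}

MAX_SCORE_PER_CHECKPOINT = 30  # 20 (step 1) + 8 (step 2) + 2 (step 3)

def score_checkpoint(answer, control):
    """Score one control location. answer/control are dicts with keys
    'primary' (str), 'secondary' (set of str) and 'infrastructure' (bool)."""
    score = 0
    # Step 1: predominant driver (+20 correct, -15 wrong, -10 if confusable).
    if answer["primary"] == control["primary"]:
        score += 20
    elif frozenset({answer["primary"], control["primary"]}) in CONFUSABLE:
        score -= 10
    else:
        score -= 15
    # Step 2: secondary drivers (+8 if exactly right; otherwise +/-2 per
    # choice, floored at 0 for the step).
    if answer["secondary"] == control["secondary"]:
        score += 8
    else:
        n_correct = len(answer["secondary"] & control["secondary"])
        n_wrong = len(answer["secondary"] - control["secondary"])
        score += max(0, 2 * n_correct - 2 * n_wrong)
    # Step 3: +2 for correctly reporting roads, trails or buildings.
    if answer["infrastructure"] == control["infrastructure"]:
        score += 2
    return score

def relative_quality_score(sum_score, ncp):
    """Eq. (1): points obtained divided by the maximum attainable points."""
    return sum_score / (ncp * MAX_SCORE_PER_CHECKPOINT)

control = {"primary": "commercial agriculture", "secondary": {"roads"}, "infrastructure": True}
perfect = {"primary": "commercial agriculture", "secondary": {"roads"}, "infrastructure": True}
confused = {"primary": "grazing", "secondary": set(), "infrastructure": True}

print(score_checkpoint(perfect, control))    # 30
print(score_checkpoint(confused, control))   # -10 + 0 + 2 = -8
print(relative_quality_score(900, 40))       # 0.75
```

Note that because step 1 can yield negative points, an RQS near 1 requires consistently correct main-driver selections, which is exactly the behaviour the scoring was designed to promote.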
The RQS was crucial in understanding each participant’s performance in terms of the quality of their visual interpretations, as it was independent of the number of locations interpreted. After the campaign ended, a minimum average RQS was used as the criterion for participants to receive a prize, regardless of their position on the leaderboard. In addition, all users who submitted a substantial number of interpretations, i.e. more than 1,000 with the minimum required RQS, were invited to become co-authors of the current manuscript, whether they received a monetary prize or not. All of these co-authors also contributed to the editing and revision of this manuscript. Future users of the dataset can likewise use the RQS as a key indicator of data quality.
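The eligibility rule above can be sketched as a simple filter. The minimum RQS threshold is not stated in the text, so the value used below is purely illustrative.

```python
MIN_INTERPRETATIONS = 1000   # stated in the text: more than 1,000 interpretations
MIN_AVG_RQS = 0.7            # hypothetical threshold; the actual minimum is not given here

def coauthor_eligible(n_interpretations, avg_rqs,
                      min_n=MIN_INTERPRETATIONS, min_rqs=MIN_AVG_RQS):
    """A user qualifies only with both sufficient quantity and quality."""
    return n_interpretations > min_n and avg_rqs >= min_rqs

print(coauthor_eligible(1500, 0.85))  # True: enough interpretations, good quality
print(coauthor_eligible(5000, 0.40))  # False: quantity alone is not sufficient
print(coauthor_eligible(800, 0.95))   # False: quality alone is not sufficient
```

Requiring both conditions prevents high-volume, low-quality contributions from qualifying, mirroring the campaign's combined quality-and-quantity incentive.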
After the campaign, post-processing of the data included the elimination of interpretations made by users who violated any of the contest rules. Additionally, during the campaign, some users communicated with IIASA staff using the “Ask Experts” button and pointed out that some control locations were incorrectly labelled. Where such corrections were made, the corresponding points lost were added back to the participants’ final scores. A total of 18,742 interpretations from one participant were removed before the campaign ended, and the user was disqualified because their account was deemed to be shared between multiple people and computers, which was not allowed. Another user’s submissions (38,502 out of 40,828) were also removed due to inconsistencies, but that user remained in the competition. Before the prizes were awarded to the top 30 users, a questionnaire was administered to all users in order to gather information on the characteristics of the participants and to assess their motivations. Completing the questionnaire was mandatory for the top 30 users. A summary of participant backgrounds is provided in Figure S3 of the SI.