Review of selected papers related to Climate change and Machine learning


Review by Gereltuya Bayanmunkh
Group 22.М08-мм, Program “Artificial Intelligence and Data Science”
St. Petersburg State University

Paper selection by Dmitry Alekseevich Grigoriev
Associate Professor of Computer Science
St. Petersburg State University

Using Twitter to Understand Public Interest in Climate Change: The Case of Qatar [1]

Problem: In recent years, attention to climate change has spread from exclusively scientific circles to the general public, creating a need to understand public conversations on the topic. For Qatar in particular, the highest CO2 emitter per capita in the world as of 2011 [1a], it was unclear how to pursue the goal of limiting the country’s carbon footprint at the population scale. Because traditional surveys are costly and do not scale, the authors examined public awareness of climate change and related topics in Qatar’s Twittersphere.

Data: The two main datasets are tweets related to environmental issues posted by people living in Qatar, and daily weather conditions data for Qatar. Data collection steps: identifying 117K users in Qatar using a Qatar location gazetteer of the authors’ creation, Twitter Decahose [6a], and FollowerWonk.com [6b]; collecting 109.6M tweets from more than 98,066 distinct users through the Twitter Historical API [6c]; and collecting daily weather conditions data for Qatar through the Weather Underground API [1d]. Data processing steps: keeping data from a fixed period (1st January 2011 - 1st January 2016); extracting tweets using a taxonomy of topics related to climate change [1c]; and qualitatively analyzing the extracted tweets to keep a clean set of 36,612 relevant tweets by 8,470 distinct users. Additional data on 8K users were collected on an ad hoc basis through the Twitter API [6d].
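The two processing steps described above (restricting tweets to the study period and matching them against a topic taxonomy) can be sketched as follows. This is only an illustration under stated assumptions: the `TAXONOMY` dictionary is a hypothetical stand-in for the authors’ much larger taxonomy [1c], the function names are invented for this sketch, simple substring matching is assumed, and the period is treated as a half-open interval that excludes 1st January 2016.

```python
from datetime import datetime

# Hypothetical mini-taxonomy (topic -> keywords); not the authors' taxonomy [1c].
TAXONOMY = {
    "climate": {"climate change", "global warming"},
    "energy": {"solar", "renewable"},
}

# Study period from the review; treating the end date as exclusive is an assumption.
START = datetime(2011, 1, 1)
END = datetime(2016, 1, 1)


def match_topics(text):
    """Return the sorted list of topics with at least one keyword in the text."""
    lowered = text.lower()
    return sorted(
        topic
        for topic, keywords in TAXONOMY.items()
        if any(keyword in lowered for keyword in keywords)
    )


def filter_tweets(tweets):
    """Keep (timestamp, text, topics) for in-period tweets that match a topic."""
    kept = []
    for timestamp, text in tweets:
        if not (START <= timestamp < END):
            continue  # outside the study period
        topics = match_topics(text)
        if topics:
            kept.append((timestamp, text, topics))
    return kept
```

Calling `filter_tweets` on a list of `(datetime, text)` pairs returns only the tweets that fall inside the period and mention at least one taxonomy keyword. The manual qualitative cleaning step described in the paper is not modeled here.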

Methods: The authors built a co-occurrence network in which each concept is a node with an associated quantity (the proportion of tweets in which the concept appears), and each co-occurrence of two concepts is a link between the corresponding nodes with an associated weight (the proportion of tweets in which the two concepts appear together). To uncover relevant information from the network, the authors applied the network backbone extraction algorithm proposed by Serrano et al. [1b] and visualized the resulting graphs with Gephi [1e]. To measure the linear correlation between pairs of environmental topics and weather variables, the authors used the Pearson product-moment correlation coefficient.
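As a rough illustration of the two quantities defined above, the sketch below computes node and link weights as proportions of tweets and a plain Pearson product-moment coefficient. The function names and the input format (one set of concepts per tweet) are assumptions made for this example, not the authors’ code, and the backbone extraction step of Serrano et al. [1b] is not included.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_network(tweet_concepts):
    """Return (node_weights, edge_weights), both as proportions of all tweets."""
    n = len(tweet_concepts)
    nodes, edges = Counter(), Counter()
    for concepts in tweet_concepts:
        unique = sorted(set(concepts))  # count each concept once per tweet
        nodes.update(unique)
        edges.update(combinations(unique, 2))  # alphabetically ordered pairs
    node_weights = {concept: count / n for concept, count in nodes.items()}
    edge_weights = {pair: count / n for pair, count in edges.items()}
    return node_weights, edge_weights


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)  # undefined (division by zero) for constant input
```

For four toy tweets tagged `{"climate", "energy"}`, `{"climate"}`, `{"energy", "water"}`, and `{"climate", "energy"}`, the node weight of `"climate"` is 0.75 and the link weight of `("climate", "energy")` is 0.5. In the paper’s setting, `pearson` would be applied to two aligned daily series, such as the tweet volume for a topic and a weather variable.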

Results: Temporal dynamics analysis showed that activity around environmental topics fluctuates much more than general Twitter activity, and that the time series for different environmental topics follow different trends. Content analysis produced word clouds of hashtags and backbone graphs showing the dominant and cluster-central hashtags for each environmental topic. User analysis took short biographies into account and identified two tiers of environment-aware users: the top 1% and the top 5% most prolific users. Correlation analysis revealed a set of highly correlated topics, a well-correlated group of topics, and dependencies that have intuitive explanations in terms of events. The authors found that public interest in climate change topics is mainly driven by widely covered global events or by local events with a direct impact on people’s daily lives, and concluded that organizing big events is not enough to raise lasting public awareness of climate change.

Comparing Events Coverage in Online News and Social Media: The Case of Climate Change [2]

Problem: How does the coverage of climate change, one of the most urgent global challenges [2a], in social media differ from that in mainstream news? Is there a gap between what the general public shares on social media and what the news media publish online when it comes to climate change discourse? In this paper, the authors compared climate change agendas in both types of media to detect differences and similarities in terms of triggers, actions, and news values.

Data: Defining discourses on climate change according to the UNFCCC [2b] and news articles about climate change according to the frame by Kuypers [2c], the authors used news data collected by GDELT [2d] over 17 months (1st April 2013 - 30th September 2014, excluding January 2014) and English-language social media data posted on Twitter during the same period. The news dataset amounted to 561,644 URLs, while the social media dataset, provided by Twitter’s Sample API [2e], amounted to 482,615 tweets. The authors then identified coverage peaks in both types of media (218 for news and 428 for social media) using the attention patterns described by Lehmann et al. [2h], and applied manual and n-gram-assisted annotation to identify 195 candidate events in GDELT and 202 in Twitter.

Methods: After data collection, both datasets went through similar iterative filtering processes to keep only data relevant to climate change: URLs were filtered using GDELT themes/taxonomies and their co-occurrence graph, and tweets were filtered using climate change terms from Pearce et al. [2f] and Kirilenko and Stepchenkova [2g] together with candidate term selection. Two authors manually removed false positives from the candidate event pool and compared their results with those obtained from workers on the Crowdflower platform [2i]; the third author of the paper then trimmed the pool further. The remaining events were annotated with types and sub-types and their related categories, following the same pipeline (two authors, then crowdsource workers, then the third author). Finally, crowdsource workers annotated the events according to six chosen news values, with varying degrees of confidence for different references.

Results: Event type analysis revealed significant differences in the coverage of disasters (mainstream media, abbreviated MSM, 20% vs. Twitter 7%), media-triggered events (MSM 1% vs. Twitter 9%), and individual actions (MSM 4% vs. Twitter 14%). Distribution analysis of types and sub-types showed important similarities and differences between the types of actors and actions covered in the two media, which became the actor-action basis of the paper’s concluding recommendations. News values analysis indicated that both media types similarly cover events that are extraordinary, unpredictable, high-magnitude, and negative; the exceptions were the values “conflictive” and “related to elite persons”. Together, the event type and news values analyses formed the news magnitude-values basis of the paper’s concluding recommendations.

Climate Change Communication on Facebook, Twitter, Sina Weibo, and Other Social Media Platforms [3]

Problem: Social media platforms [3c] are used by scientists, activists, journalists, policymakers, and citizens to share and receive information; to discuss climate change issues and criticize policies and media coverage; to mobilize communities to provide rescue and relief in the aftermath of climate change-related disasters; and to organize movements and campaigns. Initial research on climate change communication focused on traditional media [3a] with a top-down approach [3b], while later research on social media has been based on quantitative analysis of tweets from Western countries. The authors aimed to explore the increasingly important use of social media in climate change communication across a variety of platforms, cultures, and media systems.

Data: The authors did not present new quantitative data; instead, they analyzed qualitative data in the form of previous research papers on topics including climate change, social media platforms, environmental communication, and climate change communication.

Methods: The authors identified a correspondence between research topics in environmental communication and those in climate change communication. The identified pairs are (knowledge, information), (attitude, opinion), and (behavior, mobilization), where the first element of each pair belongs to environmental communication and the second to climate change communication. The authors then reviewed the existing literature on each research topic, individually and in comparison, to make observations and derive the main thesis of the paper.

Results: Blogging helped climate scientists bridge the gap [3a] between how academia and the news media discuss climate change, and reach the public directly and more efficiently through accessible language and visuals [3d]. Climate activists, nongovernmental organizations, and climate change skeptics likewise use blogging to communicate their stances and further their causes. Social media platforms such as Twitter [3e] and Facebook [3f] are also used as reporting media during natural disasters, which allows first-hand and second-hand information to reach the wider public in real time. For research, social media platforms provide a less intrusive way of collecting data on climate change communication, classifying users’ environmental behavior stages, and identifying their sentiments toward certain events; the limitation is that not all populations around the world use social media. Users on social media platforms, including Sina Weibo [3g], react to traditional media coverage of climate change issues and communicate their feelings after natural disasters. Although these platforms are hubs for like-minded communities, they also give their users a space for wider discussion of climate change. As for framing, several studies of Twitter data independently conclude that attitudes toward climate change are polarized, and that the different use of terms such as “climate change” and “global warming” reveals the different stances of climate change activists and skeptics. Social media platforms including Facebook and YouTube [3h] help small and large environmental groups reach more people with campaigns intended to influence the actions of individuals and institutions.

Natural disasters detection in social media and satellite imagery: a survey [4]

Problem: Natural disaster detection based on data retrieved from social media, and disaster analysis based on satellite imagery, are useful when news agencies are unable to report on an event in a timely manner, and can further be used for the prediction of the stock market and of climate change. However, both approaches face several challenges: the relevance and authenticity of content shared on social media are hard to verify; there are not enough large-scale annotated datasets; machine learning techniques applied to social media and satellite data have not been widely evaluated; and satellite imagery is affected by external factors and is produced at a low temporal frequency.

Data: The authors grouped the datasets used in the relevant literature into four categories: databases on disaster loss and damage, hazard catalogs, socio-economic indicators, and exposure datasets. They also mentioned generic datasets of historical events. Specific textual datasets include CrisisLex, ChileEarthquakeT1, SoSItalyT4, SandyHurricaneGeoT1, and ClimateCovE350 (all [4d]). Specific image datasets include DIRSM, YFCC100M, and Landsat (all [4e]), and FDSI [4f].

Methods: To conduct a comprehensive analysis of different approaches for disaster detection, retrieval, summarization, and analysis, the authors categorized existing literature into sub-groups that correspond to specific tasks within the natural disaster analysis field, and identified relevant references. Furthermore, they compared inputs and outputs of different approaches within the sub-groups, identified state-of-the-art methods as well as comprehensive, publicly available datasets, and identified opportunities for improvements within various aspects.

Results: The authors identified and compared relevant literature on disaster detection from Twitter text (8 references [4a]), disaster detection from social media images (14 references [4b]), and benchmark competitions on disaster analysis in social media. They also identified relevant works on disaster detection in satellite images (11 references [4c]) and benchmark competitions on flood detection in satellite imagery, and compared their methods and results. Current trends in disaster analysis in the recent literature were discussed as well.

Beyond climate and conflict relationships: new evidence from copulas analysis [5]

Problem: This paper contributes to the emerging climate-society literature, a multi-disciplinary approach that uses quantitative and empirical methods to investigate how climate is likely to affect society, by using time-varying copulas to reinvestigate the association between climate and conflicts. Special attention was paid to identifying a long-term, time-varying link between climate proxies and social disturbances and war, in order to better contribute to the literature on the long-term impact of climate change.

Data: The study focused on pre-industrial Europe during the Little Ice Age (1500 - 1800). The datasets fall into three categories: temperature and precipitation data (including EUR_TEMP [5d], MannNH [5e], and Temp2006 [5f]), ENSO and NAO teleconnection data, and conflict data (including a conflict catalogue [5g] that documented a total of 582, and a dataset that documented a total of 205 social disturbances during the focus period). The authors also noted a potential sampling bias, since the investigation covered only documented conflicts.

Methods: Hypothesizing that human populations do not differ much from animal populations in the behavior they adopt when resources are limited [5a], the authors introduced a causal scheme [5b] detailing how climate affects social disturbances and war. Then, using different copulas (including the Gaussian, Student, Gumbel, and Clayton copulas) and different distributions, they looked for mean dependence as well as tail dependence between climate and conflicts. Following the methodology proposed by Bedoui et al. [5c], the authors estimated the margins using the empirical distribution, applied canonical maximum likelihood to estimate the copula parameter in each case, and used a rolling window to account for the time variation of the copula models.
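The estimation pipeline described above (empirical margins, canonical maximum likelihood, rolling window) can be sketched for the Gaussian copula only. This is a simplified illustration, not the authors’ implementation: the function names are invented, ties in the data are ranked in an arbitrary order, and a coarse grid search over the correlation parameter stands in for a proper numerical optimizer. The Student, Gumbel, and Clayton copulas used in the paper are not covered.

```python
from math import log
from statistics import NormalDist


def pseudo_observations(xs):
    """Empirical-margin transform: rank / (n + 1), so values lie in (0, 1)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])  # ties keep input order
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def gaussian_copula_loglik(rho, zx, zy):
    """Log-likelihood of the bivariate Gaussian copula at normal scores zx, zy."""
    d = 1.0 - rho * rho
    return sum(
        -0.5 * log(d) - (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * d)
        for a, b in zip(zx, zy)
    )


def fit_gaussian_copula(xs, ys):
    """Canonical ML estimate of rho via a grid search over [-0.99, 0.99]."""
    inv = NormalDist().inv_cdf
    zx = [inv(u) for u in pseudo_observations(xs)]
    zy = [inv(u) for u in pseudo_observations(ys)]
    grid = [k / 100 for k in range(-99, 100)]
    return max(grid, key=lambda rho: gaussian_copula_loglik(rho, zx, zy))


def rolling_rho(xs, ys, window):
    """Re-estimate rho on every window of consecutive observations."""
    return [
        fit_gaussian_copula(xs[start:start + window], ys[start:start + window])
        for start in range(len(xs) - window + 1)
    ]
```

Given two aligned annual series, for example a temperature proxy and a conflict count, `rolling_rho(temperature, conflicts, window)` returns one dependence estimate per window position, which is the kind of time-varying parameter path the paper analyzes. The window length is a free choice in this sketch.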

Results: Copula analysis identified a positive dependence between low temperatures and conflicts, and a negative or positive dependence between precipitation anomalies and conflicts; however, the relationship is not uniform. The periods in which the climate-conflict links were genuinely active during the Little Ice Age were 1600, 1630-1650, and 1730. Constructing the global causal scheme from climate to social disturbances and war helped illustrate a potential self-sustaining vicious circle with climate-related economic burden, social disturbance events, and war as its moving parts. The authors also used time-varying copula analysis for the first time in this context.

References

1. Abbar, S., Zanouda, T., Berti-Equille, L., & Borge-Holthoefer, J. (2021). Using Twitter to Understand Public Interest in Climate Change: The Case of Qatar. Proceedings of the International AAAI Conference on Web and Social Media, 10(2), 168-177. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/14849

a) WB. 2015. CO2 emissions (metric tons per capita). http://data.worldbank.org/indicator/EN.ATM.CO2E.PC?order=wbapi_data_value_2011+wbapi_data_value+wbapi_data_value-first&sort=desc [Online; accessed 15-Feb-2016].

b) Serrano, M. Á.; Boguñá, M.; and Vespignani, A. 2009. Extracting the multiscale backbone of complex weighted networks. Proceedings of the National Academy of Sciences 106(16):6483–6488.

c) Zanouda, T., and Abbar, S. 2016. Qatar Environment Taxonomy. http://scdev5.qcri.org/qenv/taxonomy/ [Online; accessed 15-Feb-2016].

d) Underground, W. 2016. History API. http://www.wunderground.com/weather/api/ [Online; accessed 15-Feb-2016].

e) Gephi. https://gephi.org/

2. Olteanu, A., Castillo, C., Diakopoulos, N., & Aberer, K. (2021). Comparing Events Coverage in Online News and Social Media: The Case of Climate Change. Proceedings of the International AAAI Conference on Web and Social Media, 9(1), 288-297. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/14626

a) Hoornweg, D. 2011. Cities and climate change: responding to an urgent agenda. World Bank Publications.

b) http://unfccc.int/key_documents/the_convention/items/2853.php accessed 03.2015.

c) Kuypers, J. A. 2009. Framing Analysis. Lexington Press. 181+.

d) http://www.gdeltproject.org accessed 03.2015

e) These tweets are collected via Twitter’s Sample API and can be found in the Internet Archive: https://archive.org/details/twitterstream accessed 01.2015.

f) Pearce, W.; Holmberg, K.; Hellsten, I.; and Nerlich, B. 2014. Climate change on Twitter: Topics, communities and conversations about the 2013 IPCC Working Group 1 Report. PLOS ONE.

g) Kirilenko, A. P., and Stepchenkova, S. O. 2014. Public microblogging on climate change: One year of Twitter worldwide. Global Environ. Change.

h) Lehmann, J.; Gonçalves, B.; Ramasco, J. J.; and Cattuto, C. 2012. Dynamical classes of collective attention in Twitter. In Proc. of WWW.

i) http://www.crowdflower.com/

3. Tandoc, Edson C., and Nicholas Eng. “Climate Change Communication on Facebook, Twitter, Sina Weibo, and Other Social Media Platforms.” Oxford Research Encyclopedia of Climate Science. 26 Apr. 2017; Accessed 22 Oct. 2022. https://oxfordre.com/climatescience/view/10.1093/acrefore/9780190228620.001.0001/acrefore-9780190228620-e-361

a) Russill, C., & Nyssa, Z. (2009). The tipping point trend in climate change communication. Global Environmental Change, 19(3), 336-344.

b) Nerlich, B., Koteyko, N., & Brown, B. (2010). Theory and language of climate change communication. Wiley Interdisciplinary Reviews: Climate Change, 1(1), 97-110.

c) boyd, d. m., & Ellison, N. B. (2007). Social network sites: Definition, history, and scholarship. Journal of Computer-Mediated Communication, 13(1), 210-230.

d) Thorsen, E. (2013). Blogging on the ice: Connecting audiences with climate-change sciences. International Journal of Media & Cultural Politics, 9(1), 87-101.

e) Acar, A., & Muraki, Y. (2011). Twitter for crisis communication: Lessons learned from Japan’s tsunami disaster. International Journal of Web Based Communities, 7(3), 392-402.

f) Tandoc, E., & Takahashi, B. (2016). Log in if you survived: Collective coping on social media in the aftermath of Typhoon Haiyan in the Philippines. New Media & Society (pp. 1-16). Advance online publication.

g) Qu, Y., Huang, C., Zhang, P., & Zhang, J. (2011). Microblogging after a major disaster in China: A case study of the 2010 Yushu earthquake. Proceedings of the ACM 2011 Conference on Computer-Supported Cooperative Work (pp. 25-34). Hangzhou, China.

h) Dosemagen, S. (2016). Social media and saving the environment: Clicktivism or real change? Huffington Post. Retrieved from http://www.huffingtonpost.com/shannon-dosemagen-/social-media-and-saving-t_b_9100362.html

4. Said, N., Ahmad, K., Riegler, M. et al. Natural disasters detection in social media and satellite imagery: a survey. Multimed Tools Appl 78, 31267–31302 (2019). https://doi.org/10.1007/s11042-019-07942-1

a) Table 1. Summary of some relevant works in disaster detection in Twitter text in terms of event types, modality (Single, multi-modal) of information, datasets used for the evaluations, and a brief description of the method

b) Table 2. Summary of some relevant works in disaster detection in single images in terms of event types, modality (single, multi-modal) of the information and the dataset used for the evaluations, and a brief description of the method

c) Table 3. Summary of some relevant works on disaster detection in satellite images in terms of event types, dataset, and a brief description of the method

d) Table 4. Summary of the benchmark datasets for natural disasters detection in Twitter

e) Table 5. Summary of the benchmark datasets for natural disasters detection in images from social media. All the datasets are annotated and "B" indicates whether the dataset has been part of a benchmark competition or not

f) Table 6. Summary of the benchmark datasets for disaster detection in satellite imagery

5. Olivier Damette & Stephane Goutte, 2020. “Beyond climate and conflict relationships: new evidence from copulas analysis,” Working Papers of BETA 2020-19, Bureau d’Economie Théorique et Appliquée, UDS, Strasbourg; Accessed 22 Oct. 2022. https://ideas.repec.org/p/ulp/sbbeta/2020-19.html

a) Zhang DD., Lee HF, Wang C., Li B., Pei Q., Zhang J., An Y. (2011), “The causality analysis of climate change and large-scale human crisis”, Proceedings of the National Academy of Sciences (PNAS), 108, 42, 17296-17301

b) Figure 1: Theoretical hypotheses and causal scheme

c) Bedoui, R., Braeik, S., Goutte, S., and Guesmi, K. (2018), “On the study of conditional dependence structure between oil, gold and USD exchange rates”, International Review of Financial Analysis, 59, 134-146.

d) Zhang DD., Brecke P., Lee HF, He Y-Q., Zhang J. (2007), “Global climate change, war, and population decline in recent human history”, Proceedings of the National Academy of Sciences, 104, 49, 19214-19219

e) Mann ME, Jones PD (2003), “Global surface temperatures over the past two millennia”

f) Büntgen, U., Tegel, W., Nicolussi, K., McCormick, M., Frank, D., Trouet, V., Kaplan, J.O., Herzig, F., Heussner, K.U., Wanner, H., Luterbacher, J., Esper, J. (2011), “2500 years of European climate variability and human susceptibility”, Science, 331, 578-582

g) Sorokin PA. (1937), Social and cultural dynamics. American book, New York

6. Additional references

a) Twitter Decahose API. https://developer.twitter.com/en/docs/twitter-api/enterprise/decahose-api/overview/decahose

b) FollowerWonk.com. https://followerwonk.com/

c) Twitter Historical API. https://developer.twitter.com/en/docs/tutorials/choosing-historical-api

d) Twitter API. https://developer.twitter.com/en/docs/twitter-api

e) Earth Month Club – Mongolia climate Q&A. https://earthmonth.club/mongolia#emc-climate-mongolia

Reviewer notes

Using Twitter to Understand Public Interest in Climate Change: The Case of Qatar [1]

  1. The source of the tweets is not explicitly stated; the reviewer assumes that it is the Twitter API [6d].
  2. Tweet data collection and processing logic regarding language and translation is not explicitly stated. However, both Arabic and English names for locations were used in the gazetteer, and removing stop words in both Arabic and English is also mentioned in the data cleaning section.
  3. In the Correlations section, Figure 6 is mentioned as showing the Pearson correlation score between pairs; in fact, Figure 6 shows hashtag co-occurrence graphs and Figure 7 shows the intended matrix of pairwise correlations between variable pairs.
  4. While the paper did include the context of the study and its challenges, it could be improved by adding a dedicated section on limitations.
  5. Mongolia was ranked the 3rd biggest fossil CO2 emitter per capita in 2020 [6e], but the most at-risk group (~30% of the total population, working predominantly in agriculture) is not on Twitter, let alone tweeting about climate change. So, how can we examine public awareness of climate change with minimal sampling bias?

Comparing Events Coverage in Online News and Social Media: The Case of Climate Change [2]

  1. Since news data and taxonomies used to process it heavily depended on the GDELT dataset, it may be worth looking further into the biases and limitations of the dataset beyond what is mentioned in the “challenges and limitations” list if we are to follow up on the results of the paper.
  2. The arbitrary values used in extracting climate change themes/taxonomies for the news data and detecting events could be potential focus points of experiments in examining their effects on data processing.
  3. What other attention pattern classifications are there?
  4. Could there be a pattern to duplicate activity peaks on social media?

Climate Change Communication on Facebook, Twitter, Sina Weibo, and Other Social Media Platforms [3]

  1. Climate change communication seems to be at the intersection of climate change and communication.
  2. Climate change communication could be included within environmental communication.
  3. In environmental communication, there are some important concepts such as environmental citizenship, environmental knowledge, attitude, and behavior.
  4. “Environmental knowledge, attitude, and behavior corresponds to information, opinion, and mobilization with regard to climate change communication.” - which is very interesting.
  5. There are stages of environmental behavior - which is also very interesting.

Natural disasters detection in social media and satellite imagery: a survey [4]

  1. Does an ensemble approach containing multiple models count as single-modal?
  2. Some new concepts, methods, and tools found include ontology, co-training, STA/LTA, rapid assessment, HSV low-level color features, early and late fusion methods, deep features, CEDD, CL, Gabor, Kernel Discriminant Analysis, DBpedia Spotlight, combMax, DELEF, and GloVe.
  3. Seriousness analysis in textual data from Twitter is interesting. Would it be hard to define seriousness in various contexts?
  4. Two versions of the publicly-available multimedia evaluation benchmark were mentioned in the paper: MediaEval 2017 and 2018. Since then, the benchmark has been updated - so it would be interesting to look into the updates.
  5. Natural disaster detection on image datasets resulted mostly in flood detection methods. Is there any way to enrich satellite data to detect other types of natural disasters effectively?

Beyond climate and conflict relationships: new evidence from copulas analysis [5]

  1. “Considering Tol and Wagner (2010), the economies in the 1600’s were very similar (in income comparisons) to some vulnerable developing countries today. Since countries are particularly concerned by probable future climate change, a cliometric investigation can help policy makers warn populations and implement suitable climate change mitigation policies.” – Are economies defined mainly by their income levels?
  2. A reference was missing for [Büntgen et al. 2006], but at a different point [Büntgen et al. 2011] was cited for the same term, so I used the latter as the reference.