Dataset. The US National Oceanic and Atmospheric Administration (NOAA) publishes the Integrated Surface Data collection, which contains measurements from weather stations around the world over the past decades. We have mirrored the ~205 GB of compressed weather measurements provided on the FTP server; documentation of the format is also available there. In addition, this year we added the Historical Land-Cover Change and Land-Use Conversions Global Dataset, which tracks land use in the USA across a series of snapshots starting in the 18th century.
N3: Urbanisation vs Climate Change. Rising temperatures at a measurement station over the years can be ascribed to general global warming, but might also reflect the local effect of increasing urbanization around the station. Using the USA land-use dataset, detect whether urbanization around measurement stations is correlated with temperature increase, and attempt to compare the contribution of urbanization with the overall picture of global warming.
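As a rough illustration of the correlation step in this task, the sketch below computes a Pearson correlation between the urban fraction around a single station and its mean temperature across land-use snapshots. Everything in it is an assumption made for illustration: the helper name pearson, the tuple layout, and the station values are hypothetical and are not taken from the NOAA data or from any project's code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)

# Hypothetical station: (snapshot year, urban fraction, mean temperature in deg C).
# These numbers are invented for the example.
station_snapshots = [
    (1950, 0.10, 14.0),
    (1970, 0.20, 14.5),
    (1990, 0.30, 15.0),
    (2010, 0.40, 15.5),
]

urban = [u for _, u, _ in station_snapshots]
temps = [t for _, _, t in station_snapshots]
r = pearson(urban, temps)
print(f"correlation between urban fraction and temperature: {r:.3f}")
```

A per-station coefficient like this says nothing about causation, and a station whose surroundings never change yields no coefficient at all. A real analysis would also need to compare urbanizing stations against rural ones to separate the local effect from the global trend.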
Summary: This project combines information from the climate dataset with a land-use dataset to investigate one of the arguments used by those who think that climate change is a hoax: namely, that the increasing temperatures are due to measurement stations being located in places that become more urbanized over time, such that the higher temperatures should be ascribed to "urban heat islands". The Spark-based analysis in this project first shows that only 12% of all measurement stations are situated in urban environments, severely undercutting this narrative. The authors did identify a number of urban places where temperature increases could be correlated with increased urbanization. Interestingly, these are all cities in warm areas, which suggests the hypothesis that these correlations may be related to heat-producing services such as air-conditioning.
Data curiosity: *** Paper writing: **** Technical difficulties mastered: ** Visualization coolness: ****