LSDE: Large Scale Data Engineering 2021

Dataset. Following a Freedom of Information Law (FOIL) request, records of all taxi rides in New York City were made public. Since then, NYC has published all taxi and other ride data. In these projects, though, we focus on yellow cabs in the years from 2009 until July 2016, after which the geographical level of detail was reduced.

T2: Heat Map. Create a traffic heat map that visualizes busy and empty street segments per time of day. Use a route planner to determine which road segments each trip traversed. When there are alternative routes, use the recorded trip distance to choose among them.
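One way to use the recorded trip distance when a route planner returns several alternatives is to keep the candidate whose length is closest to it. The sketch below is only an illustration of that idea, not the method the project used. The `CandidateRoute` type, the `pick_route` function and the segment identifiers are hypothetical, and it assumes the recorded distance is in miles while route lengths are in meters.

```python
from dataclasses import dataclass
from typing import Sequence

MILES_TO_METERS = 1609.344


@dataclass(frozen=True)
class CandidateRoute:
    """One alternative returned by a route planner (hypothetical shape)."""
    segment_ids: Sequence[str]  # road segments in travel order
    length_m: float             # total route length in meters


def pick_route(candidates: Sequence[CandidateRoute],
               recorded_miles: float) -> CandidateRoute:
    """Return the candidate whose length best matches the recorded distance.

    Assumption: the trip record stores distance in miles.
    """
    if not candidates:
        raise ValueError("no candidate routes")
    target_m = recorded_miles * MILES_TO_METERS
    # Smallest absolute difference between route length and recorded distance.
    return min(candidates, key=lambda r: abs(r.length_m - target_m))


if __name__ == "__main__":
    short = CandidateRoute(("s1", "s2"), 3000.0)
    long_ = CandidateRoute(("s1", "s3", "s4"), 4100.0)
    chosen = pick_route([short, long_], recorded_miles=2.5)
    print(chosen.segment_ids)
```

A closest-length rule is simple but can pick a wrong route when two alternatives have nearly equal lengths, so a real pipeline might also compare the recorded trip duration.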

Summary. The visualization below was created by reconstructing the likely routes of 1.3 billion taxi trips with the GraphHopper route planner. GraphHopper was wrapped in Spark to parallelize the routing, to decompose the computed routes into road segments, and to aggregate counts by segment and time. The data is sliced by time (year, month, season), by time of day, and by weather, using an additional dataset.
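The decompose-and-aggregate step can be illustrated on a single machine with plain Python. This is a minimal sketch under stated assumptions, not the project's Spark code: the function name `count_segment_hours` and the input shape (pickup time plus a list of segment identifiers per trip) are hypothetical, and each trip is assigned to the hour of its pickup time.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Sequence, Tuple

Trip = Tuple[datetime, Sequence[str]]  # (pickup time, traversed segment ids)


def count_segment_hours(trips: Iterable[Trip]) -> Counter:
    """Count traversals per (segment id, hour of day).

    Assumption: the whole trip is attributed to the hour of its pickup time.
    """
    counts: Counter = Counter()
    for pickup_time, segment_ids in trips:
        hour = pickup_time.hour
        # Decompose the route: emit one key per traversed segment.
        for segment_id in segment_ids:
            counts[(segment_id, hour)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        (datetime(2015, 1, 1, 8, 5), ["a", "b"]),
        (datetime(2015, 1, 2, 8, 40), ["b", "c"]),
        (datetime(2015, 1, 2, 17, 0), ["b"]),
    ]
    for key, n in sorted(count_segment_hours(sample).items()):
        print(key, n)
```

In Spark the same shape would typically be expressed as a `flatMap` that emits one `((segment, hour), 1)` pair per traversed segment, followed by a `reduceByKey` that sums the pairs.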

Data curiosity: ****
Writing: ****
Technical difficulties mastered: ***
Visualization coolness: ***


Taxi Heat Map -- Dongqi Pu, Chang Liu and Linxiao Zhu (paper)