In my group at Microsoft, we have worked with the United Nations, Guide Dogs for the Blind in the UK, and Ströer in Germany on a number of projects involving high-scale data.
In this talk, I'll share some of the best practices and patterns that have come out of those experiences: storing and indexing geospatial data at scale, incremental ingestion and slice processing of that data, and efficiently building and presenting progressive levels of detail on web and mobile.
The audience will walk away with an understanding of how to efficiently summarize data over a geographic area, general methods for making incremental updates to large-scale datasets with Apache Spark, and best practices for precomputing high-scale frontend data views.