Session Outline

Goldman Sachs Data Lake is a large data lake implementation with 2500 warehouses and 140 000 live datasets, and it is growing exponentially. Handling big data at this scale demanded the development of bespoke methods to manage and regulate the platform. This presentation will cover the latest techniques that engineering teams at Goldman Sachs use to govern our Data Lake.

Key Takeaways

  • A method to vectorise the representation of warehouses, and its benefits
  • How to model asymmetric workloads on a big data platform to manage client expectations

Speaker Bio

Mikael Lang – Managing Director, Data Engineering | Goldman Sachs
Mikael is global head of Data Lake Reliability Engineering, Data Quality Engineering and regional head of Stockholm Core Engineering. Previously, he held various leadership roles in Prime Services Technology in London, Hong Kong and Bengaluru. Mikael joined Goldman Sachs in 2006 as an analyst in Prime Services Risk Technology and was named managing director in 2019. Prior to joining the firm, Mikael worked at Sony Broadcast and Professional Research Laboratories. Mikael earned an MSc in Mechanical Engineering from the KTH Royal Institute of Technology in Stockholm in 2003.

October 15 @ 11:00
11:00 — 11:20 (20′)

Day 2 | M3 | Data Engineering Stage