In this session we’ll look at some fundamental concepts of distributed computing with Apache Spark, the different big data file formats, why and when to use big data, and best practices for working with distributed computing frameworks such as Spark.