Posts

Showing posts from October, 2018

Spark learning journey - Part 1

I need to get acquainted with Spark. This is both because I will need it in my new project and because it is quite a successful platform these days, and those of us working on data-related projects need to know the bits and pieces of cluster processing systems. I'll start a series of notes on my learning curve; this is the first post. I hope some of you who are new to this platform will find them useful, so I'll write them for beginners, simply as my notes. First, you should try to get an overall understanding of what Spark is and which industry problems it solves. Among the many very good documents I browsed online, I would say this link provides a good overview that is easy to read and understand. Step 1 - Document yourself. To document myself, the first thing I did was start reading the book Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka by Raul Estrada (Author), Isaac Ru...