The volume of data grows every day as the service expands. At some point, the processing time will exceed the time available for it. We need to avoid this situation.
Apache Spark is an in-memory distributed processing system. It provides horizontal scalability for our batch system.
In this session, I will talk about why we should use Apache Spark and how to get started with it.
Let's start scaling out our batch system!