Catalyst
Objective:
- Stop teams across the organization from reinventing the wheel in big data development
- Build a software library/framework that provides abstractions and tools to ease their development lifecycle
Approach:
- Consulted different teams to understand their needs
- Gathered a list of:
  - tools (such as CI/CD pipelines, build files, Nexus, ...)
  - abstractions (such as IO utils, config parsers, ...)
- Kept the software design minimal, leaving room for more features to be added
- Adopted the most commonly used design patterns, functional or object-oriented, depending on the problem at hand
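
The kind of abstraction listed above (a config parser, an IO utility) and the mix of object-oriented and functional patterns can be illustrated with a minimal sketch. It is not taken from the framework itself: the names `parse_config`, `Source`, `InMemorySource`, and `pipeline`, as well as the `key=value` config format, are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, Iterator


def parse_config(text: str) -> Dict[str, str]:
    """Hypothetical config parser: reads 'key=value' lines,
    skipping blank lines and lines that start with '#'."""
    config: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {raw!r}")
        config[key.strip()] = value.strip()
    return config


class Source(ABC):
    """Object-oriented side: one small interface per kind of input.
    A real project might add file, Kafka, or HBase implementations."""

    @abstractmethod
    def read(self) -> Iterable[str]:
        ...


class InMemorySource(Source):
    """Illustrative implementation backed by a list of records."""

    def __init__(self, records: Iterable[str]) -> None:
        self._records = list(records)

    def read(self) -> Iterator[str]:
        return iter(self._records)


def pipeline(*steps: Callable[[str], str]) -> Callable[[Iterable[str]], Iterable[str]]:
    """Functional side: compose per-record transformations, applied left to right."""

    def run(records: Iterable[str]) -> Iterable[str]:
        for step in steps:
            records = map(step, records)
        return records

    return run


if __name__ == "__main__":
    config = parse_config("# demo settings\nupper = true\n")
    steps = [str.strip] + ([str.upper] if config["upper"] == "true" else [])
    source = InMemorySource([" spark", "kafka "])
    print(list(pipeline(*steps)(source.read())))
```

In this sketch the input side is modelled with a class hierarchy, so new sources can be added without touching callers, while record transformations are plain functions that compose freely. This is one possible reading of choosing a pattern per problem, not a description of the actual design.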
Results:
- Small, portable framework that can be incorporated into any big data project
- Supports Apache Spark, Apache Hadoop, Apache Kafka, Apache HBase, MongoDB, … and other big data technologies
- Supports automation through scripting and DevOps tools such as Docker, Jenkins, Maven/Gradle/sbt, …