Summit Africa Recruitment
Job title: Hadoop Infrastructure (Admin) Engineer
Employment type: Full Time
Experience: 4 to 7 years
Salary: R80,000 to R100,000
Job published: 21 May 2020
Job reference no: 919725365

Job Description

Our client values a Hadoop Infrastructure (Admin) Engineer as someone who works behind the scenes to deploy, manage and maintain the various technologies of the Hadoop Ecosystem for various consumers, in ways and forms that make sense and add value. This definition is deliberately broad, as the role falls into the data engineering field, which is just as broad.

You must be the type of individual who lives and breathes the admin/infrastructure side of the Hadoop Ecosystem: someone who can make fixes, enhancements and changes, and deploy them through multiple environments, and who therefore has skills and experience in its configuration, deployment and administration.


You have the following technical competencies

· Hadoop 3 or Hive 3
· Supporting, configuring, upgrading and maintaining multiple Hadoop clusters
· Configuring and deploying Apache Hadoop and other Apache components from scratch on VMs, Docker and/or Kubernetes
· Configuring and deploying HiveServer LLAP on bare metal or Docker/Kubernetes
· Installation / setup of YARN using the Capacity and Fair Schedulers
· Performance tuning and scaling HiveServer LLAP using Tez in a production environment
· Spark 2.4
· Spark Warehouse Connector
· Ranger 2.0
· Atlas 2.0
· Installation / setup of Hadoop in a Linux environment
· Deployment and maintenance of a Hadoop cluster
· Health checks of a Hadoop cluster, monitoring that it is up and running at all times
· Analysing storage data volumes and allocating space in HDFS
· Resource management in a cluster environment
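As an illustration of the scheduler work the role involves, here is a minimal capacity-scheduler.xml sketch for YARN's CapacityScheduler; the queue names (etl, adhoc) and percentages are hypothetical, not taken from the listing:

```xml
<!-- Minimal capacity-scheduler.xml sketch; queue names "etl" and "adhoc"
     and the capacity values below are illustrative only -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>etl,adhoc</value>
  </property>
  <property>
    <!-- Guaranteed share of cluster resources for the ETL queue -->
    <name>yarn.scheduler.capacity.root.etl.capacity</name>
    <value>70</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
    <value>30</value>
  </property>
  <property>
    <!-- Allow the ad-hoc queue to borrow idle capacity up to this ceiling -->
    <name>yarn.scheduler.capacity.root.adhoc.maximum-capacity</name>
    <value>60</value>
  </property>
</configuration>
```

Queue capacities under a parent must sum to 100; maximum-capacity caps elastic borrowing when other queues are idle.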


You have knowledge and/or experience with the following concepts

Hadoop Ecosystem

· Apache Hadoop
· Apache Spark
· Apache Hive
· Apache Zookeeper
· Apache Solr

Integration features

· Apache Atlas
· Apache Ranger
· Apache Zeppelin
· Writing high-performance, reliable and maintainable modular code
· Data pipelining knowledge - data extraction and transformation
· Knowledge of MapReduce and related data processing paradigms
· Hands-on experience in HiveQL
· Hadoop development and implementation
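To illustrate the kind of hands-on HiveQL the listing refers to, a small hypothetical example (table and column names are invented) creating a partitioned, ORC-backed table and querying a single partition:

```sql
-- Hypothetical example: partitioned, ORC-backed Hive table
CREATE TABLE IF NOT EXISTS web_logs (
  user_id BIGINT,
  url     STRING,
  ts      TIMESTAMP
)
PARTITIONED BY (log_date DATE)
STORED AS ORC;

-- Filtering on the partition column lets Hive prune partitions,
-- so only the one day's data is scanned
SELECT url, COUNT(*) AS hits
FROM web_logs
WHERE log_date = DATE '2020-05-21'
GROUP BY url
ORDER BY hits DESC
LIMIT 10;
```

Partitioning by date and storing as ORC are common choices because they keep scans narrow and enable Tez/LLAP to read columnar data efficiently.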


You have the following personal competencies

· The ability to solve problems
· The ability to approach a problem from different angles, to see whether solutions can be found in different ways
· The ability to work in an ever-changing, unstructured environment
· The ability to work as part of a team with vastly differing skill sets and opinions
· The ability to contribute ideas to the group
· The ability to mentor and provide guidance to other team members
· A systems approach to thinking, as opposed to a siloed approach; the candidate needs to understand how their work affects the greater system
· The ability to work without supervision and take accountability for the work they deliver
· The ability to liaise with a client, sifting through the fluff and extracting the actual requirements