Data Engineer - Sandton
The Data Engineer will report to the Data Engineering Lead and will design, implement and maintain data pipelines that are scalable, repeatable and secure, and that can serve multiple users within the business. In addition, they will manage all aspects of the data platform. The Data Engineer will source data, assess its quality, check it for accuracy prior to reporting and analysis, and ensure that users can access the data they require. This individual will develop prototypes and proofs of concept for data science solutions. They will also implement complex data projects with a focus on collecting, parsing, managing, modelling and visualising large sets of data using multiple platforms. The Data Engineer will need to understand how to apply technologies to solve big data problems and develop big data solutions. This may involve distributed computing systems and distributed machine learning deployment.
Assist with research on developments and trends in the area of Data Engineering.
Ensure knowledge of industry standards as well as best practice and identify gaps.
Partner with business and data colleagues to determine and refine requirements, understand business needs and translate needs into technical solutions.
Ensure that all aspects of the data platform are adequately managed, i.e. data management, storage, access to data, data science and traditional reporting tools.
Identify and source data that will meet stakeholder requirements.
Clean data as required to ensure accuracy.
Convert data into an easily understandable format so that stakeholders can interpret it in a way that meets the objectives of their requests.
Code, test and document new or modified data systems to create robust and scalable applications for data analytics.
Design key and indexing schemes as well as partitioning.
Manage daily, weekly and monthly data processing pipelines, quality checks and automated distribution.
Ensure that the quality assured data is made available to Data Scientists and other statistical analysts for further use and preparation for model development.
Identify process improvements to streamline data collection and data processing pipelines.
Identify appropriate data cleaning approaches to rectify data quality problems and fulfil business requirements.
Oversee the collation, distribution and presentation of data and statistics as required by business.
Ensure full compliance with statutory regulations, policies, procedures, best practice and professional standards, as well as the strategy.
Analyse, retrieve and consolidate data to produce weekly, monthly, quarterly and annual reports utilising various data sources.
Partner with the data quality management team to develop and maintain quality assurance processes prior to reporting.
Analyse data to ensure reliability, accuracy, alignment and correlation with previous reports.
Ability to analyse statistics and other data, interpret and evaluate results and create reports and presentations for use by others.
Bachelor’s Degree at the appropriate NQF level in the area of computer science, engineering, mathematics, statistics and/or a combination of these.
Certifications relevant to data engineering, in areas such as Python, Microsoft, AWS, Hadoop, big data and cloud infrastructure, are advantageous.
A minimum of 5 years’ experience in data engineering.
Experience with SQL and with working with large-scale data.
Experience with distributed data processing technologies such as Hadoop, Spark, Kafka, Hive, NiFi and HBase is advantageous.
Experience with Cloud Data Engineering technologies is advantageous.
SAP BW ETL experience is advantageous.
Experience in operationalising data science solutions, or similar product development experience in a high-scale production environment, is advantageous.
Project management or consulting experience applied in cross-functional projects is advantageous.