Senior Data Engineer – Kingston Stanley – Dubai – UAE
Kingston Stanley invites applications for a Senior Data Engineer in Dubai, UAE.
Job Title:
Senior Data Engineer, Data Lake, Big Data, Apache NiFi – Dubai, 6-month contract
You must be a Senior Data Engineer, currently on a Freelance Visa in Dubai. You will be joining a growing security company, heavily involved in AI and Big Data products sold to public- and private-sector clients.
This is a 6-month contract ONLY, with the possibility of extension at a later date.
Working hours for this role are:
- Monday to Thursday, 7:30am – 3:30pm
- Friday, 7am – 1pm
- Alternate Saturdays, 7:30am – 12:30pm
- They work on a “one Saturday on, two Saturdays off” model, so any candidate would work no more than two Saturdays a month.
Responsibilities:
- Solid background in software development, with strong Python coding skills and the ability to solve challenging problems.
- Developing data pipelines spanning cloud services and on-premises data centers.
- Web crawling, data cleaning, data annotation, data ingestion and data processing.
- Reading and collating complex data sets.
- Creating and maintaining data pipelines.
- Continual focus on process improvement to drive efficiency and productivity within the team.
- Use of Python, SQL, Elasticsearch, Shell, etc. to build the infrastructure required for optimal extraction, transformation, and loading of data (a minimal Python sketch follows this list).
- Provide insights into key business performance metrics by building analytical tools that utilize the data pipeline.
- Support the wider business with their data needs on an ad hoc basis.
- Open to extensive international business travel as and when required, and for extended periods.
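For illustration only, here is a minimal sketch of the kind of Python ETL pipeline the responsibilities above describe. It is not the employer's actual code: the source URL, DSN, table, and column names are all hypothetical, and it assumes Python 3.9+ with the `requests` and `psycopg2` libraries available.

```python
"""Minimal extract-transform-load sketch in Python.

Hypothetical example only: the source URL, DSN, table, and column
names are invented for illustration, not taken from the job posting.
"""
import requests
import psycopg2

SOURCE_URL = "https://example.com/api/events"  # hypothetical REST source
DSN = "dbname=analytics user=etl"              # hypothetical Postgres DSN


def extract() -> list[dict]:
    """Pull raw records from the source API."""
    resp = requests.get(SOURCE_URL, timeout=30)
    resp.raise_for_status()
    return resp.json()


def transform(records: list[dict]) -> list[tuple]:
    """Clean and normalise records: drop rows missing an id,
    lower-case the event name, keep only the fields we load."""
    rows = []
    for r in records:
        if not r.get("id"):
            continue
        rows.append((r["id"], r.get("name", "").strip().lower(), r.get("ts")))
    return rows


def load(rows: list[tuple]) -> None:
    """Upsert the cleaned rows into a hypothetical Postgres staging table."""
    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO staging_events (id, name, ts) VALUES (%s, %s, %s) "
            "ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, ts = EXCLUDED.ts",
            rows,
        )


if __name__ == "__main__":
    load(transform(extract()))
```

The upsert (`ON CONFLICT ... DO UPDATE`) keeps the load idempotent, so re-running the pipeline does not duplicate rows; in practice a job like this would be scheduled and monitored by an orchestrator such as Apache Airflow, sketched after the Qualifications list below.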
Qualifications:
- 6+ years of programming experience, with solid coding skills in Python, Shell, and Java.
- Bachelor’s degree in Computer Engineering, Computer Science, or Electrical Engineering and Computer Sciences.
- Strong practical knowledge in data processing and migration tools, such as Apache NiFi, Kafka, and Spark.
- Experience designing, building, and maintaining data processing on Cloudera Data Platform (CDP) Private Cloud.
- Experience developing and maintaining data workflows with Apache Airflow (see the DAG sketch after this list).
- Experience with HDFS or similar object storage.
- Strong understanding of distributed computing and distributed systems.
- Experience with web crawling and data cleaning.
- Experience with solution architecture, data ingestion, query optimization, data segregation, ETL, ELT, AWS (EC2, S3, SQS, Lambda), Elasticsearch, Redshift, and CI/CD frameworks and workflows.
- Working knowledge of data platform concepts: data lake, data warehouse, ETL, big data processing (designing and supporting for variety, velocity, and volume), real-time processing architecture for data platforms, and scheduling and monitoring of ETL/ELT jobs.
- Experience with PostgreSQL and programming (preferably Java or Python); proficiency in understanding data, entity relationships, structured and unstructured data, and SQL and NoSQL databases.
- Knowledge of best practices in optimizing columnar and distributed data processing systems and infrastructure.
- Experience designing and implementing dimensional models.
- Knowledge of machine learning and data mining techniques in one or more areas of statistical modelling, text mining, and information retrieval.
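As a hedged illustration of the Airflow qualification above, here is a minimal DAG sketch, assuming Apache Airflow 2.4+ (earlier 2.x versions use `schedule_interval` instead of `schedule`). The DAG id, schedule, and task callables are invented placeholders, not the employer's actual workflow.

```python
"""Minimal Apache Airflow 2.4+ DAG sketch.

Hypothetical example: the DAG id, schedule, and task bodies are
invented for illustration only.
"""
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def crawl():
    """Placeholder for a web-crawling step."""
    print("crawling sources...")


def clean():
    """Placeholder for a data-cleaning step."""
    print("cleaning raw data...")


def ingest():
    """Placeholder for loading cleaned data into the data lake."""
    print("ingesting into the data lake...")


with DAG(
    dag_id="daily_crawl_pipeline",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_crawl = PythonOperator(task_id="crawl", python_callable=crawl)
    t_clean = PythonOperator(task_id="clean", python_callable=clean)
    t_ingest = PythonOperator(task_id="ingest", python_callable=ingest)

    # Declare ordering: crawl runs before clean, which runs before ingest.
    t_crawl >> t_clean >> t_ingest
```

The `>>` operator declares task dependencies, giving Airflow the scheduling and monitoring of ETL/ELT jobs that the qualifications list calls for.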