Service Availability DevOps Engineer Job in Kenya

Job Title: Service Availability DevOps Engineer

Hiring Organization: Safaricom
Location – Locality: Nairobi
Location – Region: Kenya
Industry: Telecommunication
Job Type: Full Time
Salary: KES
Date Posted: 04/23/2024

Reporting to the Engineering Lead – Service Availability, the position holder will be responsible for monitoring and observability, and for improving the operational aspects of all in-scope systems within DIT. The role also drives automation and DevOps practices across the different domains, and fosters service monitoring through proactive initiatives such as AIOps and machine learning, among other available channels.

Responsibilities


  • Proactively build and implement monitoring services, including end-to-end monitoring, scripting and automation, modern tooling and maintenance software.
  • Use AI and machine learning to perform log analysis and create predictive models that help identify potential failures.
  • Develop and execute automation scripts and maintenance jobs.
  • Develop automation around monitoring.
  • Onboard DIT systems to the service monitoring tools (APMs like ELK).
  • Clearly document any monitoring gaps noted and collaborate with the relevant teams to ensure timely closure.
  • Perform application error analysis and follow up to ensure an optimal customer experience.
  • Deploy planned and operational changes on in-scope systems.
  • Support all Digital squads to ensure new products are monitored.
  • Support Zero Touch Operations initiatives.
  • Support the development of collectors and agents.

Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, Electrical and Communication Engineering, Business Information Systems, or a relevant field in telecommunications.
  • Domain knowledge in at least two of the following areas: system administration (especially Linux), orchestration (Kubernetes), the Linux kernel, OpenTelemetry.
  • Good understanding of back-end programming languages such as Python and Rust.
  • Technical understanding of SRE concepts and DevOps practices for providing stable services to customers, adhering to availability KPIs, Service Level Objectives and Service Level Indicators, and conforming to the target monthly error budget.
  • Well versed in one or more modern monitoring tools such as ELK, Prometheus, Dynatrace, AppDynamics, New Relic or Splunk.
  • Good understanding of microservice architecture and an appreciation of traditional/classic SOA.
  • Ability to manage a team, with leadership skills, ownership of issues, and an analytical, problem-solving approach.
  • Ability to implement a strict change management policy.
  • Conversant with agile ways of working.