Twiga Senior Site Reliability Engineer Jobs in Kenya

About Twiga

Twiga is a B2B e-commerce company that builds fair and reliable markets for agricultural producers, food manufacturers and retailers based on transparency and efficiency. Our mission is to build a closed ecosystem for African retail, anchored on affordable access to food and grocery across urban cities. Our ambition is to leverage technology, the ubiquity of mobile phones, modern distribution and logistics to modernize African retail.

Senior Site Reliability Engineer Vacancy

The role holder will be responsible for leading the end-to-end design, development and deployment of engineering solutions to run scalable, distributed and fault-tolerant software systems for Twiga Foods. The role holder will lead the implementation of automated solutions to ensure uptime, reliability and improvement of Twiga Foods' systems in line with set service level objectives.

He/she will be required to provide leadership in determining software engineering needs from product/engineering requirements and collaborating across the organisation to clarify requirements and expected outcomes.

They are also accountable for work assigned, ensuring that it is broken down into a plan with estimates, priorities and deliverables, ensuring adherence to the plan, and communicating when any adjustments to scope are needed to meet deadlines.

Additionally, he/she will contribute to the wellbeing of the Twiga technology ecosystem by tracking production systems’ capacity and performance, fixing issues and taking on-call responsibilities.

Key Responsibilities

Site Reliability

Collaborate with other cross-functional teams to design, develop, and deliver required software

Develop, manage and support SRE tools and applications.

Lead/own and drive the development/implementation of SRE tools within the Product/Technical Requirements Document.

Develop or review technical specification documents within the SRE team and wider engineering team.

Lead the deployment, training, and rollout of major/minor SRE tools across various engineering/tech teams.

Deliver feature work consistently and on time whilst still tackling tech debt. Ensure that code meets agreed accuracy, testability, efficiency, and style guidelines. Deliver software systems that meet agreed SLOs for performance and reliability.

Produce a work breakdown structure with estimates, deadlines, and deliverables. Own features from technical specification through implementation to deployment into production.

Engage in improving the software development lifecycle, providing feedback on requirements, architecture, designs, and solutions.

Build resilience into systems so underlying failures are handled gracefully and do not impact end users.

Develop automated predictive analysis of future capacity needs and proactively work on efficiency and capacity planning to set clear requirements and reduce system resource usage.

Manage individual priorities, deadlines, and deliverables.

Defend and challenge technical decisions made through solution design and code review feedback

Finalise and own technical documentation for the developed features

On-Call Technical Support

Monitor application availability and performance, take steps to improve overall application performance and stability, and follow through with implementation

Participate in on-call technical support rotation, respond to all incidents, and lead minor/major incidents in collaboration with relevant engineering/product stakeholders.

Triage system issues and debug, track, and resolve them by analysing their sources and offering corrective measures, through end-to-end incident response and management.

Drive efficiencies through systems improvement and root cause analysis, resulting in improved service delivery, maturity, and scalability.

Analyze logs and telemetry data by writing monitoring and automation code.

Identify and automate repetitive, manual, and non-tactical work that impacts software development and deployment.

Innovation

Investigate site reliability technologies and their applicability to the Twiga ecosystem.

Identify significant projects that result in substantial improvements in reliability, cost savings and/or revenue.

Provide reports on findings, with recommendations and a viable plan of action.

Lead design reviews with peers and stakeholders to decide amongst available technologies

Evaluate and review existing systems, SRE processes, and tools.

Develop and lead implementation of a viable technical specification document in collaboration with members of SRE or engineering team.

Contribute to the definition of SLOs for services/applications.

In-Team Collaboration

Work with peers to build a stronger engineering team

Lead process improvements that boost productivity and quality of Twiga engineering

Regularly contribute improvements to existing documentation and codebase as per agreed standards.

Review code developed by others and provide feedback to ensure adherence to Twiga Engineering best practices.

Contribute regular knowledge shares through a variety of mediums including lunch and learn sessions.

Provide mentorship for SRE engineers and interns in the section.

Mentor/Coach/Train engineers on system design, reliability, monitoring, and availability concepts to help improve the overall system quality.

Develop and maintain relationships with various engineering teams and their members.

Acquire and maintain an understanding of multiple engineering teams' processes and tools.

Influence the engineering roadmap and work with engineering and/or product counterparts to improve the resiliency and reliability of Twiga systems.

Maintain deep domain knowledge and share that knowledge through recorded demos, technical presentations, discussions, and incident reviews.

Self-management

Model Twiga’s culture and way of working.

Deliver the performance objectives set for the team. Hold monthly 1-on-1 performance reviews with the line manager, and institute corrective action where performance falls below expectations.

Proactively manage own learning and development

Adhere to the annual leave plan agreed with the line manager

Adhere to people management policies

Compliance

Comply with all organization policies, procedures, and statutory guidelines. Minimize and mitigate risks to the organization and enforce zero tolerance for non-compliance.

Close gaps/lapses identified as an outcome of audits, risk and/or other compliance reviews, investigations, or other assessment mechanisms, and take corrective/preventive actions within the agreed timelines.

Minimum Qualifications & Requirements

Degree in Engineering, Computer Science, Information Technology or a related discipline, or demonstrated equivalent skill/competence.

Minimum of 5 years of relevant experience in:

Observability and monitoring of infrastructure, applications, services, and networks

Troubleshooting issues across the entire stack (hardware, software, network etc.)

Writing infrastructure as code and automation scripts

Building and maintaining CI/CD pipelines

Building, running, and optimising containers with Docker or containerd

Setting up, running, and managing virtual machines, Kubernetes clusters, databases, and virtual private networks

Operating highly available and reliable infrastructure

At least 3 years’ experience working with relational databases (Postgres, MySQL or Microsoft SQL Server), non-relational databases, and in-memory data stores

At least 2 years' experience creating/managing SLIs/SLOs/Error Budgets.

Strong technical understanding of Android, front-end and backend development

Experience in designing, implementing and securing distributed systems

Strong experience with:

Analysing logs, metrics and traces.

Creating system reports and system alerts.

The use, maintenance and configuration of monitoring, observability, telemetry, metrics and logging infrastructure (Prometheus, Grafana, ELK, or Sentry)

Understanding of Agile/Scrum development principles

Understanding of ITIL incident and problem management practices

Can work accurately and quickly, to ensure key project milestones are achieved within set timelines, even when working under pressure.

Always have a positive attitude and approach to the role and team.

How to Apply

For more information and job application details, see: Twiga Senior Site Reliability Engineer Jobs in Kenya
