Job Details

Site Reliability Engineer (SRE)

J&M Group Inc Urgent Hiring

Job Description

Platform Engineering & Operations

• Design, deploy, and manage Kafka clusters across on-prem, cloud, and Kubernetes environments

• Ensure high availability, fault tolerance, disaster recovery, and capacity planning

• Implement Kafka ecosystem tools: Kafka Connect, Schema Registry, ksqlDB

• Automate provisioning using Terraform, Ansible, Helm, and scripting

• Configure monitoring and observability: Prometheus, Grafana, Splunk, ELK, Datadog

• Tune performance (partitions, replication, retention, ISR, broker configs)

Security & Governance

• Implement authentication & authorization (SASL, ACLs, RBAC)

• Enforce encryption standards (TLS, data security)

• Manage schema governance and lifecycle policies

Required Skills

• Strong experience with Apache Kafka & event streaming

• Hands-on experience with Kafka ecosystem tools (Connect, Schema Registry, ksqlDB)

• Experience with Terraform, Ansible, Kubernetes (Helm)

• Knowledge of monitoring tools (Prometheus, Grafana, ELK, Splunk)

• Strong security practices (SASL, TLS, RBAC)

• Experience with API platforms (Akana API Management preferred)

Job Overview

  • Job Type: Contract
  • Work Mode: Hybrid
  • Deadline: Apply by Apr 19, 2026
  • Job Location: Kitchener
  • Category: Engineering & Infrastructure
  • Hourly Rate:

