Site Reliability Engineer
San Jose, CA
About HCLTech:
HCLTech is a global technology company, home to 221,000+ people across 60 countries, delivering industry-leading capabilities centered around digital, engineering and cloud, powered by a broad portfolio of technology services and products. We work with clients across all major verticals, providing industry solutions for Engineering Services, Manufacturing, Life Sciences and Healthcare, Technology and Services, Telecom and Media, Retail and CPG, and Public Services. To learn how we can supercharge progress for you, visit https://www.hcltech.com/about-us.
Job Description:
12+ years of proven experience in compute platform engineering with a focus on automation.
Experience designing and deploying virtualization architectures, including VMware, OpenShift, or KubeVirt platforms.
Proven experience evaluating existing application architectures and identifying opportunities for containerization to improve scalability, reliability, and efficiency.
Strong analytical skills with the ability to define and track key performance metrics.
Experience developing tools for data analysis and performance profiling, and developing with Terraform and configuration management tools.
Proficiency in programming languages such as Go and/or Python.
Experience running large environments consisting of bare metal, large-scale virtualized environments with tens of thousands of VMs, and cloud infrastructure.
Ways to stand out from the crowd:
Deep understanding of other infrastructure components such as storage, DNS, AD, and security tools.
Hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud Platform.
Solid understanding of microservices architecture, infrastructure as code (IaC) and configuration management tools.
Understanding of AIOps and how to leverage LLMs to automate optimization initiatives.