Senior Reliability Engineer Jobs
By Digital Diagnostics, Inc. At Remote
Location – Chicago, IL | Coralville, IA | or Remote-US
What We Have to Offer
Lead or participate in deploying updates or improvements as needed.
Lead or participate in support activities.
Identify performance and scalability bottlenecks in Digital Diagnostics’ global technical infrastructure.
Identify and work to eliminate waste in cloud infrastructure costs.
Senior Site Reliability Engineer/DevOps Engineer
By Zillow At Remote
Knowledge and experience working with microservices
Leverage your knowledge to build technical consensus around architecture and technology choices
Build and manage StreetEasy's cloud infrastructure, contributing to our commitment to reliability and efficiency
A Bachelor's degree in Computer Science or a related technical field, or equivalent practical experience
1-3 years of experience in site reliability engineering, DevOps, or a related field
Experience with cloud service providers, preferably AWS
Site Reliability Engineer (SRE) - Mid/Senior
By Vanilla Technologies Inc. At Remote
Project management tools such as Jira, Git, and Confluence
Accounting for and addressing software vulnerabilities
Securing infrastructure, applications, and code
Ensuring high SLA for uptime & security
Quick, continuous automation and deployment of updates
Preserving infrastructure and stability of code
Site Reliability Engineer - Kubernetes
By Avantage Entertainment At Remote $115,000 - $130,000 a year
Strong detail orientation, time management skills, dependability, and flexibility required (our team spans at least 12 time zones).
Support our DevOps team with management of application deployments using GitOps tooling in the Kubernetes environment.
Proactively researches new capabilities and trends and reports findings to senior leadership.
Bachelor's degree in computer science or equivalent occupational experience.
Experience in an AWS or other cloud environment.
In-depth experience in Kubernetes (Red Hat OpenShift preferred).
Principal Site Reliability Engineer
By GoDaddy At Remote $168,000 - $252,000 a year
Process improvement, management, and development experience.
Translate core architecture and business requirements into technical cloud infrastructure solutions that consist of platform, network, software, cloud automation, security, etc.
3+ years of experience in complex distributed networking, system performance tuning, and monitoring.
Experience with CI/CD development using Kubernetes, Docker, etc.
Experience in virtualization technologies such as KVM and OpenStack.
Experience with back-end services, highly distributed and scalable services, and deployment automation.
Site Reliability Engineer (SRE)
By Synchronoss Technologies At Remote
Proven ability to deliver a superior operations support experience working directly with corporate clients’ technology teams and associated change management.
Experience in monitoring tools such as Prometheus, Thanos, and Grafana
Experience with Terraform and Ansible.
Experience with Cloud platforms such as AWS
Excellent verbal, written and analytical skills, with the ability to tailor communication to the intended audience.
Experience working with ticketing systems.
Site Reliability Engineer (SRE)
By Synchronoss Technologies At Remote
Experience with Configuration Management Automation tools (Chef or Puppet).
Deploy and manage Kubernetes (EKS) based Docker applications in AWS/OCI.
Solid experience in building a solution on AWS or Oracle Cloud or other public cloud services using Terraform.
Knowledge in Infrastructure monitoring tools (ELK stack, Prometheus, Grafana, or similar)
Knowledge of AWS/OCI best practices. Keen to learn new technologies and flexible enough to work on new platforms, environments, and models such as Agile/Scrum.
Excellent written and verbal skills.
Site Reliability Engineer - Cloud Infrastructure
By Lambda At Remote $147,000 - $229,000 a year
Have experience with configuration management and infrastructure-as-code tooling
Experience building and maintaining internal tools and infrastructure (secrets management, artifact storage, CI/CD platforms, identity management)
Build abstractions that simplify and unify the management of development and staging environments
Work with multiple engineering teams to gather requirements and translate them into tooling and infrastructure projects
Have experience building and maintaining CI/CD pipelines
Have experience deploying and monitoring infrastructure in public cloud environments
SRE - Site Reliability Engineer (Ambra Team)
By Intelerad At Remote
Experience with Systems Lifecycle Management Products (Foreman, Katello, RedHat Satellite)
Demonstrated knowledge of configuration management tools like Puppet, Chef and Ansible
Own system designs, documentation, platform management, and capacity planning for Enterprise Imaging Systems in your area of responsibility
University or college education in science, technology, engineering, or equivalent industry experience
Build software and systems to manage platform infrastructure and applications
Excellent verbal and written communication skills and ability to communicate technical subjects to a broad range of stakeholders
Site Reliability/DevOps Engineer
By Axoni At Remote
Experience with automation and configuration management tools (Terraform, Ansible, Salt, Chef, Puppet)
Experience troubleshooting issues on a remote distributed system
Manage and configure all pre-production, production, and client facing infrastructure
Coordinate with the Applications team to satisfy all non-functional project requirements (security, performance, scalability, and resiliency)
Experience with at least one of the following scripting languages: Bash and/or Python
Experience with Docker (Docker Compose, YAML files, etc.)
AWS Site Reliability Engineer
By Derivative Path At Remote
Excellent communication, organizational and time-management skills
Work closely with architects, software engineers, quality engineers, product owners, and management to design scalable, robust systems using cloud architecture
Participate in system design consulting, platform management, and capacity planning
Proficient with AWS; certification preferred
Prior experience within the Capital Markets, Financial Services, and IT & Services industries
Design and implement fully automated CI/CD Pipelines using industry tools
Software Engineer, Site Reliability
By Packback Inc At Remote $108,000 - $140,000 a year
2+ years of DevOps experience using Docker and Kubernetes
Experience with CI/CD pipelines, containerization, and orchestration
Experience reviewing code to both give and receive constructive feedback.
Experience with Helm and Terraform
Experience working on highly scalable cloud infrastructures
Startup or small company experience
Senior Site Reliability Engineer
By Lumin Digital At Remote $170,000 - $200,000 a year
Expert-level knowledge of at least one configuration management system (Chef, Ansible, Puppet, etc.).
Exceptional full stack and environment troubleshooting skills.
Exceptional written and verbal communication skills.
Experience with a microservice architecture running in containers (Docker or other containerization technology).
Experience with Terraform and Kubernetes
2+ years of experience as a software engineer. C#, Angular, JavaScript preferred.
Senior Site Reliability Engineer
By Adyen At Chicago
Have a good understanding of Infrastructure as Code and experience with configuration management and automation tools such as Puppet and Ansible;
Strong familiarity with SRE practices and methodologies such as defining SLOs, change management processes and incident response;
Together with the team, lead the way in continuously improving our incident management and on-call processes
Have experience with building, operating and troubleshooting large-scale distributed systems spanning multiple data centers across the globe;
Skilled in one or more programming or scripting languages such as Python, Java, or Bash;
We use SLOs to drive platform stability and innovation
Senior Engineer II - Digital Site Reliability
By Lululemon At Seattle $132,300 - $173,500 a year
Contribute to engineering automation, management or development of pre-prod and production systems
Mentor and guide junior team members, sharing knowledge and expertise to foster a culture of learning and continuous improvement.
Eight+ years of engineering experience
Five+ years experience with CI/CD tools, GitLab preferred
Proficiency in at least one programming language (e.g., Python, Go, Java) and experience with scripting and automation.
Acknowledge the presence of choice in every moment and take personal responsibility for your life.
Senior Site Reliability Engineer, Trello
By Atlassian At San Francisco
3+ years of hands-on experience with public cloud offerings such as AWS, GCP, or Azure
Familiarity with Incident management, post-incident analysis and participation in on-call rotation
3+ years experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, tweaking dashboards, defining alerts, writing runbooks, etc.
Engineering microservices and tools across one or more programming languages (e.g. Go, Python, Bash)
Automation and Infrastructure-as-Code projects and tooling (e.g. Ansible, Puppet, Terraform)
Build and maintain a continuous integration and delivery pipeline (e.g. Bamboo, Bitbucket Pipelines, GitHub Actions)
Reliability CAE Senior Engineer I
By Honda Dev. and Mfg. of Am., LLC At Raymond
Experience in data analysis and communication of complex information to engineering management is desired.
Experience with the following software or similar is desired
Ability to communicate concerns and ideas in a remote work environment
Education reimbursement for continued learning
BS in Mechanical / Automotive Engineering
Proficient in Microsoft Excel, Word, and PowerPoint
Cloud Senior Site Reliability Engineer
By Bank of America At New York, NY
Perform deep dives into systemic and latent reliability issues, incident management, problem management
Understanding of cost management, inventory management, FinOps model
Identifying, analyzing, and resolving infrastructure vulnerabilities and application deployment issues.
Evaluating and automating the scaling and capacity requirements within Azure environments
BS/MS degree in Computer Science or related technical field involving systems or equivalent practical experience.
Minimum 8+ years of hands-on experience maintaining cloud platforms on a major cloud service provider.
Senior Site Reliability Engineer
By NVIDIA At California, United States
BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent experience
Technical leadership beyond development that includes scoping, requirements capturing, leading and influencing multiple teams of engineers on broad development initiatives.
Experience with the ELK and Prometheus stacks as a power user and administrator.
Prior experience driving production issues and helping with on-call support.
Experience with CUDA, PyTorch, TensorRT, TensorFlow, and/or Triton.
Experience with StackStorm and similar automation platforms is a bonus.
Senior Site Reliability Engineer
By Business Wire At United States
Strong experience with AWS cloud infrastructure and container orchestration (Kubernetes, Docker)
Strong experience with monitoring and alerting systems such as Prometheus, Grafana, Nagios, etc.
Strong experience with at least one programming language. Java is highly preferred but other languages such as Python will be considered
Advanced experience with Linux system administration, Java based applications, and network architecture
Ability to work remotely 100%
Excellent health benefits that begin on your first day of employment
Senior Site Reliability Engineer (Remote)
By The Hartford At Hartford, CT
Progressively implement preventative controls and drive increased automation and self-healing capabilities. Continue to improve cost efficiency baselines
Hands-on experience with performance and observability tools such as Dynatrace, Splunk, TrueSight, CloudWatch, CloudTrail, and related tools.
Experience with continuous integration and DevOps methodologies; preferred tools include GitHub, Jenkins, Nexus, Rally, SonarQube, etc.
Knowledge of complex traditional and modern enterprise architectures and systems (understand more than the component itself).
Strong hybrid cloud experience (private and public) across various service delivery models – IaaS, PaaS, SaaS.
Strong communication (verbal and written), collaboration, and negotiation skills, working in a diverse team across business units

Are you an experienced Senior Site Reliability Engineer looking for a new challenge? We are looking for a motivated individual to join our team and help us ensure our systems are running smoothly and efficiently. You will be responsible for developing and maintaining our infrastructure, monitoring system performance, and troubleshooting any issues that arise. If you are passionate about technology and have a keen eye for detail, this could be the perfect opportunity for you!

What Skills Are Required for a Senior Site Reliability Engineer?

•Strong knowledge of Linux/Unix administration
•Experience with scripting languages such as Bash, Python, Ruby, etc.
•Experience with automation/configuration management using tools such as Chef, Puppet, Ansible, etc.
•Experience with cloud technologies such as AWS, Azure, Google Cloud Platform, etc.
•Experience with container technologies such as Docker, Kubernetes, etc.
•Experience with monitoring tools such as Nagios, Zabbix, etc.
•Experience with version control systems such as Git, SVN, etc.
•Strong troubleshooting and problem-solving skills
•Excellent written and verbal communication skills

What Qualifications Does a Senior Site Reliability Engineer Need?

•Bachelor’s degree in Computer Science, Information Technology, or related field
•5+ years of experience in a Site Reliability Engineer role
•Experience with DevOps practices and tools
•Experience with database technologies such as MySQL, PostgreSQL, etc.

What Knowledge Does a Senior Site Reliability Engineer Need?

•Knowledge of ITIL best practices
•Knowledge of network protocols and technologies
•Knowledge of security best practices
•Knowledge of software development lifecycle

What Experience Does a Senior Site Reliability Engineer Need?

•Experience with large-scale distributed systems