Site Reliability Engineer (SRE) - Evening Shift
By Brightspot At , Chicago $100,000 - $115,000 a year
Automate manual tasks and build tools for system monitoring, deployment, and configuration management.
2+ years of relevant experience in Cloud Operations
Proven troubleshooting and problem-solving skills in a cloud-based application environment
Outstanding communication skills with the ability to work in a client-facing role
Monitor the availability, performance, and reliability of our systems and applications during the evening shift.
Investigate and resolve incidents, troubleshooting any issues that arise and ensuring prompt resolution to minimize downtime.
Senior Site Reliability Engineer
By Adyen At , Chicago
Have a good understanding of Infrastructure as Code and experience with configuration management and automation tools such as Puppet and Ansible;
Strong familiarity with SRE practices and methodologies such as defining SLOs, change management processes and incident response;
Together with the team, lead the way in continuously improving our incident management and on-call processes
Have experience with building, operating and troubleshooting large-scale distributed systems spanning multiple data centers across the globe;
Skilled in one or more programming or scripting languages such as Python, Java or bash;
We use SLOs to drive platform stability and innovation
Sr. Site Reliability Engineer - Remote US
By SitusAMC At , Remote $100,000 - $125,000 a year

SitusAMC is where the best and most passionate people come to transform our clients’ businesses and their own careers. Whether you’re a real estate veteran, a passionate technologist, or looking to ...

Senior Site Reliability Engineer/DevOps Engineer
By Zillow At , Remote
Knowledge and experience working with microservices
Leverage your knowledge to build technical consensus around architecture and technology choices
Build and manage StreetEasy's cloud infrastructure, contributing to our commitment to reliability and efficiency
A Bachelor's degree in Computer Science or a related technical field, or equivalent practical experience
1-3 years of experience in site reliability engineering, DevOps, or a related field
Experience with cloud service providers, preferably AWS
Site Reliability Engineer II - Remote
By Akamai At , Remote $93,656 - $140,803 a year
Defining requirements as part of the product lifecycle to influence the new designs and standards
Have 2 years of relevant experience and a Bachelor's degree or its equivalent
Have proven experience as a systems performance/site reliability or DevOps engineer
Have experience working with NoSQL databases, such as Cassandra or Redis
Have experience with orchestration tools, e.g., Chef and/or Ansible
Join our highly skilled Security team
Lead SRE (Site Reliability Engineer)
By Concentrix At , Remote
Team lead experience with offshore resources
Expected experience even if not deep in these areas:
Nice to have experience (not required):
Ability to create structure and process for a greenfield dev team
React.js & responsive web app dev
DevOps & CI/CD: specific tooling is related to full-stack Java and automation
CDN Site Reliability Engineer (L5) - Open Connect
By Netflix At , Remote
Knowledge of and proven experience with CDNs and HTTP cache/proxy technologies
Service Reliability/Operational experience running large scale, high performance systems & internet services with focus on security and reliability
Expert-level knowledge of Unix or Linux system administration at scale. We happen to use FreeBSD
Knowledge of networking concepts and application protocols, especially TCP/IP, BGP, HTTP/S and DNS
Experience with distributed analytic processing technologies (Hive, Presto/Trino, Spark SQL, etc)
Some experience with container and container orchestration technologies (Docker, Kubernetes)
Sr. Site Reliability Engineer
By eHealth At , Remote $113,500 - $141,900 a year
A security certification and/or knowledge of DevSecOps would be a plus
5+ years of experience as a systems engineer or SRE (DevOps culture)
Strong Linux skills and excellent skills in one major programming language (Python or Java would be great).
Hands-on experience implementing and maintaining a container stack with all the security and compliance considerations.
Experience managing hybrid infrastructure and configuration using tools like Terraform, Ansible and Puppet.
Understanding of CI/CD and experience with Jenkins, Pipeline as code
Reliability & Maintainability Engineer (R&M) (Remote) - Huntsville, AL
By Davidson Technologies, Inc. At , Remote
Ensure the associated requirements and tasks are properly flowed through specifications and SOWs
Primary focus will be developing Failure Modes and Effects Analysis (FMEA)/Critical Items List (CIL) as defined in SLS SOW paragraph 5.8.2.
Perform all tasks in accordance with SLS-RQMT-014 and SLS-RQMT-016
Participate in the R&M working group, R&M team meetings, and customer meetings as required
All hardware failure modes will be considered in the analysis
For each postulated failure mode, potential failure causes will be identified and documented
Site Reliability Engineer (SRE) - Mid/Senior
By Vanilla Technologies Inc. At , Remote
Project management tools such as Jira, Git, and Confluence
Accounting for and addressing software vulnerabilities
Securing infrastructure, applications, and code
Ensuring high SLA for uptime & security
Quick, continuous automation and deployment of updates
Preserving infrastructure and stability of code
Site Reliability Engineer, Netflix Technology
By Netflix At , Remote
Experience with incident management and response
Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks
Read signals in aggregate to develop deeper insights into the quality of experience for our users to help inform business decisions
Experience with complex sociotechnical systems and their successful operations at scale
Experience conducting blame-aware incident reviews
Strong analytical and problem-solving skills
Director Of Engineering, Site Reliability
By OneStudyTeam At , Remote
Experience implementing security controls for AWS environments, including setup and management of authentication controls, VPNs, KMS, etc.
Be the product manager for your vertical, defining the roadmap, requirements, goals and acceptance criteria
Learn more about our global benefits offerings on our careers site: https://careers.onestudyteam.com/us-benefits
Manage vendors, contracts and spend associated with operational infrastructure
Experience managing a team of 5+ SREs
Experience managing a global AWS footprint
Lead Site Reliability Engineer (Remote)
By IQVIA At , Remote
Bachelor’s Degree in Computer Science, Software Engineering, or equivalent professional experience
Significant (7+ years) experience building, managing, and supporting cloud-based IT infrastructure (IaC)
Thorough knowledge of Unix and/or Linux fundamentals and system administration
Experience with infrastructure-as-code (IaC) tools or technologies (notably Terraform)
Solid foundational knowledge of TCP/IP networking
Knowledge of source control systems and workflow (notably git)
Site Reliability Engineer (SRE)
By Luxoft At , Remote
5+ years of experience administering Linux and at least 2 years supporting production environments;
Fluent developer skills in any popular programming language (C++ / Python / Java / Go; Java is preferred);
Experience designing large-scale distributed solutions, including their capacity planning;
Experience with monitoring and alerting tools like Grafana, Datadog, Prometheus, etc.;
Strong knowledge of virtualization and containerization principles including orchestration tools;
Experience with relational and NoSQL DBMS
Backend Engineer (Site-Reliability) Jobs
By Terraform Labs At , Remote
In-depth knowledge of database management systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., Cassandra, MongoDB, Redis).
Collaborate with cross-functional teams to understand requirements and translate them into technical designs and implementation plans.
An interest in DeFi, or background in finance / Fintech
3+ years of professional work experience
Proven experience as a Backend Software Engineer, with a focus on site reliability and DevOps.
Experience with containerization and orchestration technologies such as Docker and Kubernetes.
Site Reliability Engineer II
By Exact Sciences Corporation At , Remote $82,000 - $130,000 a year
Support and comply with the company’s Quality Management System policies and procedures.
3+ years of experience in systems engineering
3+ years of work and/or formal classroom experience with modern application design and cloud environments
3+ years of work and/or formal classroom experience working with software development and operations teams
1+ years of experience developing highly available systems architecture using modern technologies.
AWS Solutions Architect, AWS SysOps Administrator, or AWS Developer certification.
Reliability Engineer (Open to US Remote)
By Cargill At , Swedesboro, 08085, NJ $103,000 - $119,000 a year

Want to build a stronger, more sustainable future and cultivate your career? Join Cargill's global team of 160,000 employees who are committed to safe, responsible and sustainable ways to nourish the ...

Site Reliability Engineer - Kubernetes
By Avantage Entertainment At , Remote $115,000 - $130,000 a year
Strong detail orientation, time management skills, dependability, and flexibility required (our team spans at least 12 time zones).
Support our DevOps team with management of application deployments using GitOps tooling in the Kubernetes environment.
Proactively researches new capabilities and trends and reports findings to senior leadership.
Bachelor's degree in computer science or equivalent occupational experience.
Experience in an AWS or other cloud environment.
In-depth experience in Kubernetes (Red Hat OpenShift preferred).
Principal Site Reliability Engineer
By GoDaddy At , Remote $168,000 - $252,000 a year
Process improvement, management, and development experience.
Translate core architecture and business requirements into technical cloud infrastructure solutions that consist of platform, network, software, cloud automation, security, etc.
3+ years of experience in complex distributed networking, system performance tuning, and monitoring.
Experience with CI/CD development using Kubernetes, Docker, etc.
Experience in virtualization technologies such as KVM and OpenStack.
Experience with back-end services, highly distributed and scalable services, and deployment automation.
Site Reliability Engineer *SRE*
By Synchronoss Technologies At , Remote
Proven ability to deliver a superior operations support experience working directly with corporate clients’ technology teams and associated change management.
Experience with monitoring tools such as Prometheus, Thanos and Grafana
Experience with Terraform and Ansible.
Experience with Cloud platforms such as AWS
Excellent verbal, written and analytical skills, with the ability to tailor communication to the intended audience.
Experience working with ticketing systems.

Are you looking for a challenging and rewarding role as a Remote Site Reliability Engineer? We are looking for a talented individual to join our team and help us ensure our systems are reliable and secure. You will be responsible for monitoring, troubleshooting, and resolving issues with our systems, as well as developing and implementing strategies to improve system performance. If you have a passion for technology and a desire to make a difference, this is the job for you!

Overview:

A Remote Site Reliability Engineer is responsible for ensuring the reliability, availability, and scalability of a company’s remote systems and services. This role requires a combination of technical and operational skills to ensure that the remote systems are running optimally and securely. The Remote Site Reliability Engineer will work with the development, operations, and security teams to ensure that the remote systems are reliable, secure, and available.

Detailed Job Description:

The Remote Site Reliability Engineer will be responsible for the following tasks:

• Design, implement, and maintain remote systems and services.
• Monitor and troubleshoot remote systems and services.
• Develop and maintain automation and configuration management systems.
• Develop and maintain security policies and procedures.
• Develop and maintain system and service performance metrics.
• Develop and maintain system and service availability metrics.
• Develop and maintain system and service scalability metrics.
• Develop and maintain system and service reliability metrics.
• Develop and maintain system and service security metrics.
• Develop and maintain system and service documentation.
• Develop and maintain system and service monitoring and alerting systems.
• Develop and maintain system and service backup and recovery systems.
• Develop and maintain system and service disaster recovery plans.
• Develop and maintain system and service capacity planning.
• Develop and maintain system and service performance tuning.
• Develop and maintain system and service patching and upgrades.
• Develop and maintain system and service security hardening.
• Develop and maintain system and service change management processes.
• Develop and maintain system and service incident response plans.
• Develop and maintain system and service root cause analysis processes.

What Skills Are Required for a Remote Site Reliability Engineer Job?

• Expertise in remote systems and services.
• Expertise in automation and configuration management systems.
• Expertise in security policies and procedures.
• Expertise in system and service performance metrics.
• Expertise in system and service availability metrics.
• Expertise in system and service scalability metrics.
• Expertise in system and service reliability metrics.
• Expertise in system and service security metrics.
• Expertise in system and service documentation.
• Expertise in system and service monitoring and alerting systems.
• Expertise in system and service backup and recovery systems.
• Expertise in system and service disaster recovery plans.
• Expertise in system and service capacity planning.
• Expert