Principal Site Reliability Engineer Jobs in Costa Mesa, Orange, California, Employment

SaaS Site Reliability Engineer and Automation Developer

By Siemens Digital Industries Software At Costa Mesa, 92627 $116,900 - $210,400 a year

Develop and maintain automation tools, scripts, and frameworks to streamline deployment, configuration management, and monitoring processes.

Design and implement infrastructure solutions using configuration management tools, such as Ansible, Puppet, or Chef.

Proficiency in automation and configuration management tools (e.g., Ansible, Puppet, Chef).

In-depth knowledge and hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud and their scalability features.

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).

Strong programming skills in languages such as Python, Go, or Ruby.

Site Reliability Engineer, Product - USDS

By TikTok At Los Angeles $119,000 - $289,000 a year

Gain a solid understanding of the various components and services that power the TikTok experience

Maintain services to meet service-level agreements (SLAs) and service-level objectives (SLOs) by measuring and monitoring availability, performance, and overall system health

Scale systems sustainably through mechanisms such as automation; evolve systems' reliability, efficiency, and velocity by pushing for changes

Provide user support, incident response, and postmortems

Site Reliability Engineer, Systems

By Anthropic At San Francisco, CA

Automate operations and infrastructure management

Have significant experience with Kubernetes and cloud-native infrastructure

Have strong communication skills to work with a range of technical and non-technical colleagues

Python and Linux SysAdmin skills

Significant experience with Kubernetes architecture and administration

Strong Linux skills and cloud infrastructure expertise

Site Reliability Engineer (L4/5) - Core

By Netflix At Los Gatos, CA

Experience in risk management and/or analysis

Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks

Read signals and metrics to develop deeper insights into our customers’ quality of experience to help inform business decisions

Strong writing and presentation skills

Development experience with Java, JavaScript/Node.js, Python, Go

Knowledge of cloud platforms (i.e. AWS, GCP, etc.) and microservices architecture

Site Reliability Engineer Jobs

By Sohum Inc At San Francisco Bay Area, United States

Full time opportunity that offers excellent benefits.

• Configuration Management and IaC - Salt, Pulumi (Terraform will work)

• Bachelor’s degree in CS / other highly technical discipline, or equivalent experience

• 5+ years of experience, including 3+ years as a Site Reliability Engineer

• Strong networking and firewall knowledge

• Exceptional problem solving and troubleshooting skills

Site Reliability Engineer Jobs

By WalkWater Technologies At Cupertino, CA, United States

Experience with SSL/mTLS and certificate management

Hands-on experience with cloud orchestration platforms such as Kubernetes or Nomad

Setting up CI/CD pipelines using GitHub hooks, TeamCity, Docker, and Artifactory

Familiarity with load balancers, traffic-envoys, and proxies

Familiarity with Java runtime / JVM

Familiarity with observability systems such as Prometheus or OpenMetrics

Staff Site Reliability Engineer

By Netskope At Santa Clara, CA

You will be part of a high caliber engineering team in the exciting space of cloud tools and infrastructure management.

Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis.

You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills

Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure

Gain deep knowledge of our application stack

Experience improving the performance of microservices and solving scaling/performance issues

Site Reliability Engineer Jobs

By Lawrence Berkeley National Laboratory At San Francisco Bay Area, CA $9,739 - $11,905 a month

Minimum of three years of experience in UNIX or Linux, Networking, IT infrastructure environment and management experience in a distributed-computing environment.

Knowledge of the processes for standard operating procedures, and best practices for implementation and change management.

Past experience with Incident Management and a good understanding of IT service management.

Experience with network security: configuring/maintaining ACLs, knowledge of firewalls

Bachelor’s degree in Computer Science or a similar discipline, or equivalent years of experience.

Strong hands-on knowledge of the Linux shell and working in a command-line (e.g. SSH) environment.

Principal Site Reliability Engineer

By Oracle At Redwood City, 94065, CA

Develop and implement various database life-cycle management flows.

Certification of Database products for cloud integration

Participate in Product Feature Review, Certification experiments and User Document reviews.

Research and acquire skills on new technologies as needed from time to time

6-14 years of Oracle database administration experience on large production environments

Hands-on database skills, especially in database and system troubleshooting and administration

Senior Site Reliability Engineer (SRE)

By Apple At San Diego, CA

Experience in a DevOps or SRE role

Experience with modern web-scale services including servers, VIPs, load balancers, proxies

Highly experienced with one of these: Puppet, Chef, SaltStack, Ansible

Bonus: Native Kubernetes implementation including CNI, Kafka, etcd experience

Bonus: Experience with Cisco, Juniper, or Arista routing and switching hardware (+OS), including wireless

Able to write software needed to build and operate a large scale platform 24x7 including the development and staging platforms.

Sr. Site Reliability Engineer

By Rockset At San Mateo, CA $140,000 - $185,000 a year

Experience with Terraform, Salt, Chef, Packer, or similar configuration management tools

Willing to learn new skills and technologies

Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience

Experience as an SRE for 3+ years

Experience building and operating public-facing 24x7 web applications at scale

Experience working with cloud infrastructure and patterns (AWS preferred)

Staff Site Reliability Engineer

By Collective Health At San Mateo, 94401, CA $140,000 - $210,000 a year

Expertise in management and use of relational databases.

10+ years of work experience in DevOps, Site Reliability Engineering, or Software Engineering.

Experience creating and monitoring SLIs and SLOs in order to set and remain within error budgets.

Experience in supporting customer-facing production systems and responding to incidents as part of an on-call rotation.

Knowledge of data structures, algorithms, distributed systems, and information retrieval.

Experience in diagnosing and resolving incidents that involve application, OS, network, infrastructure, partners, people, and process.

Manager, Site Reliability Engineer - Remote

By KPMG-UnitedStates At San Diego, CA

Experience in supporting various enterprise class solutions and services including Windows server administration and security issue remediation

Be the Technical Lead representing the SRE/Tier 3 team for operational initiatives or project support

Improve reliability, quality, and time-to-market of our suite of software solutions

Create sustainable systems and services through automation and uplifts

Bachelor's degree from an accredited college or university is preferred

Site Reliability Engineer - Entry Level (Technology Rotational Development Program)

By Equifax At Alpharetta, GA, United States

Bachelor’s Degree in Computer Science, Information Technology, Project Management, or equivalent field; Completion of coursework by May 2024.

Ability to gain experience by cross-training in the various areas within the Technology organization and other key related functions.

Excellent leadership, teamwork and service skills.

Excellent oral and written communication skills.

Experience working with and developing in Java

Exposure/knowledge of cloud technologies (Google Cloud Platform (GCP), Amazon Web Services (AWS), or Azure)

AWS Site Reliability Engineer

By Zeektek At United States

Help set up and manage our AWS EKS environment.

Help set up and manage our GitLab CI/CD pipeline.

Can engage and manage the heterogeneous CI/CD and deployment environments of the teams we collaborate with

Site Reliability Engineer, DevOps manager

1.5+ years experience in SRE/DevOps or equivalent role

Work with other teams to assist in deploying our microservices and code into their environments (on prem and AWS)

Sr. Site Reliability Engineer - Remote US

By SitusAMC At Remote $100,000 - $125,000 a year

SitusAMC is where the best and most passionate people come to transform our clients’ businesses and their own careers. Whether you’re a real estate veteran, a passionate technologist, or looking to ...

Site Reliability Engineer Jobs

By Adobe At Lehi, 84043 $92,100 - $161,000 a year

What you need to succeed:

An understanding of SRE standard methodologies:

Infrastructure Site Reliability Engineer

By CVS Health At Hartford $75,400 - $162,700 a year

A year or more experience with incident management, performance monitoring, and capacity planning tools.

Multiple years’ demonstrated proficiency in at least one configuration management tool such as Ansible, Puppet, or Chef.

Minimum of 5 years of experience in Infrastructure Engineering, System Administration, or related roles.

Multiple years’ experience with cloud platforms (e.g., Amazon Web Services, Microsoft Azure) and infrastructure-as-code tools (e.g., Terraform, CloudFormation).

Multiple years’ experience with containerization technologies such as Docker and container orchestration platforms like Kubernetes.

Multiple years’ demonstrated knowledge of networking principles and protocols, including TCP/IP, DNS, load balancing, and firewalls.

Site Reliability Engineer (SRE) - Evening Shift

By Brightspot At Chicago $100,000 - $115,000 a year

Automate manual tasks and build tools for system monitoring, deployment, and configuration management.

2+ years of relevant experience in Cloud Operations

Proven troubleshooting and problem-solving skills in a cloud-based application environment

Outstanding communication skills with the ability to work in a client-facing role

Monitor the availability, performance, and reliability of our systems and applications during the evening shift.

Investigate and resolve incidents, troubleshooting any issues that arise and ensuring prompt resolution to minimize downtime.

(Remote) - Sr Site Reliability Engineer

By First American Financial Corporation At Santa Ana $87,945 - $182,655 a year

Bachelor's degree in Computer Science, Information Technology, or equivalent education and experience.

Strong understanding of SRE practices: incident response, change/release management, capacity planning, infrastructure automation, elastic environments, chaos engineering and blameless postmortems.

Skilled in defining service level objectives, measuring service level indicators, and setting up error budgets.

Experienced in creating an SRE adoption framework and onboarding procedures.

What You’ll Bring (At least 5-7 years' experience)

Maintain and improve reliability of core software systems.

Site Reliability Engineer Jobs

By Fisker Inc At Manhattan Beach $60,900 - $169,650 a year

Experience with artifact management (Artifactory, Nexus)

Experience with strict security requirements and implementation

Design, provision, deploy, and manage Kubernetes clusters and resources

Bachelor’s degree in computer science or related technical field or equivalent experience

5+ years of SRE / DevOps Engineer experience

Experience with cloud infrastructure (AWS, GCP, Azure)

Site Reliability Engineer Jobs

By Zscaler At San Jose

Strong CentOS/UNIX skills; FreeBSD-specific experience is a plus.

5-7 years of experience in a SaaS/Cloud/Distributed environment growing at a rapid scale.

Minimum of 3 years of scripting experience in Python is required.

Hands-on experience with infrastructure as code and automation tools (Ansible, Chef, Puppet, Terraform).

Basic Networking skills (TCP/IP, DNS, LACP, CARP) for testing and troubleshooting are required.

Competitive salary and benefits, including equity

Site Reliability Engineer (SRE)

By Agama Solutions At San Jose

5+ years of US experience in an SRE role

Good communication (and listening) skills.

Some experience administering Linux “web” servers, at scale.

Working knowledge of DNS, HTTP, TLS, web security.

Experience with network troubleshooting using tools such as tcpdump.

Well versed in *nix Operating Systems (we use CentOS and Ubuntu LTS).

Site Reliability Engineer Jobs

By Ascendion At Alpharetta

Knowledge of the cloud and managed services such as MS Flex Server or AWS RDS.

Strong experience as a database administrator.

Strong experience in PostgreSQL and/or MySQL.

Automation skill in Bash, Golang, Python a plus.

Knowledge of IaC and CI/CD tools such as Terraform and GitHub Actions a plus.

Experience in query optimization and performance improvement.

Sr. Software Engineer - Site Reliability (Remote)

By Home Depot / THD At Atlanta, 30301 $160,000 a year

Knowledge of configuration management tools (e.g., Ansible, Puppet, or Chef)

This position typically reports to Software Engineer Manager or Sr. Manager

2-4 years of relevant work experience

Experience with cloud platforms (e.g., AWS, Azure, or GCP)

Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack)

Knowledge of version control systems (e.g., Git)

Site Reliability Engineer - Remote

By Sheetz At Claysburg, 16625

(Equivalent combinations of education, licenses, certifications and/or experience may be considered)

Responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning for their assigned system(s).

A four-year degree in Computer Science, Management Information Systems, or Computer Engineering is preferred.

6 years of applicable experience in a technology environment, preferably with time spent in an engineering capacity, is required.

Coding experience beyond simple scripts is required.

A four-year degree that includes courses or training in computer programming, systems analysis, system development, or systems engineering is required.

Site Reliability Engineer II - Remote

By Akamai At Remote $93,656 - $140,803 a year

Defining requirements as part of the product lifecycle to influence the new designs and standards

Have 2 years of relevant experience and a Bachelor's degree or its equivalent

Have proven experience as a systems performance/site reliability or DevOps engineer

Have experience working with NoSQL databases, such as Cassandra or Redis

Have experience with orchestration tools, e.g., Chef and/or Ansible

Join our highly skilled Security team

Lead SRE (Site Reliability Engineer)

By Concentrix At Remote

Team lead experience with offshore resources

Expected experience even if not deep in these areas:

Nice to have experience (not required):

Ability to create structure and process for a greenfield dev team

React.js & responsive web app dev

DevOps & CI/CD - specific tooling is related to full-stack Java and automation

CDN Site Reliability Engineer (L5) - Open Connect

By Netflix At Remote

Knowledge of and proven experience with CDNs and HTTP cache/proxy technologies

Service Reliability/Operational experience running large scale, high performance systems & internet services with focus on security and reliability

Expert-level knowledge of Unix or Linux system administration at scale. We happen to use FreeBSD.

Knowledge of networking concepts and application protocols, especially TCP/IP, BGP, HTTP/S and DNS

Experience with distributed analytic processing technologies (Hive, Presto/Trino, Spark SQL, etc)

Some experience with container and container orchestration technologies (Docker, Kubernetes)

Sr. Site Reliability Engineer

By eHealth At Remote $113,500 - $141,900 a year

A security certification and/or knowledge of DevSecOps would be a plus

5+ years of experience as System engineer or SRE engineer (DevOps culture)

Strong Linux skills and excellent skills in one major programming language (Python, Java would be great.)

Hands-on experience implementing and maintaining a container stack with all the security and compliance considerations.

Experience managing Hybrid infrastructure and configuration using tools like Terraform, Ansible and Puppet.

Understanding of CI/CD and experience with Jenkins, Pipeline as code

Site Reliability Engineer Jobs

By eBay At San Jose, 95125, CA $168,400 - $262,900 a year

Develop automation systems for implementing eBay Traffic management

Manage eBay’s traffic infrastructure including SLB, CDN, etc.

Solid programming experience in languages like Golang, Java, C/C++

Experience with Kubernetes and Docker is a must

Experience working with public cloud is a plus

Experience with software load balancers (IPVS, Envoy, Istio, Cilium, etc.) is a plus

Site Reliability Engineer - Remote

By Regal Rexnord At Morehead, 40351, KY

Experience and understanding in DevOps, Cloud Resiliency, Performance Engineering, Release Engineering, Application Performance Management and Capacity Planning, Caching, JavaScript, and .NET

Responsible for Application Performance Monitoring tool administration and management by monitoring availability and taking a holistic view of system health

Reduce organizational ‘toil’ via automation, scripting, and implementation and management of toolsets

Understand business / technical requirements and the overall business objectives of applications

2+ years of experience in software application development or test automation

5+ years of Performance Engineer or related experience with high-traffic, large-scale distributed systems, client-server architectures both on-prem and cloud (Primarily Azure)

Are you looking for an opportunity to join a fast-paced and innovative team as a Principal Site Reliability Engineer? We are looking for a highly motivated individual to join our team and help us build and maintain reliable, secure, and scalable systems. You will be responsible for developing and implementing strategies to ensure the availability, performance, and security of our systems. If you have a passion for technology and a drive to make a difference, this is the job for you!

Overview:

A Principal Site Reliability Engineer is responsible for ensuring the reliability, availability, and scalability of a company’s IT infrastructure. They are responsible for developing, implementing, and maintaining systems and processes that ensure the highest levels of performance and reliability. They must be able to troubleshoot and resolve complex technical issues quickly and efficiently.

Detailed Job Description:

The Principal Site Reliability Engineer designs, develops, and maintains systems and processes that ensure the highest levels of performance and reliability. The role involves troubleshooting and resolving complex technical issues quickly and efficiently, identifying potential problems and developing solutions to prevent them, working with other teams to ensure that systems and processes are properly implemented and maintained, and providing technical guidance and support to those teams.

What Skills Are Required for a Principal Site Reliability Engineer Job?

• Strong technical knowledge of IT infrastructure, including hardware, software, and networking

• Knowledge of system and process design

• Knowledge of system and process automation

• Knowledge of system and process monitoring

• Knowledge of system and process optimization

• Knowledge of system and process security

• Knowledge of system and process scalability

• Knowledge of system and process troubleshooting

• Ability to work independently and as part of a team

• Ability to work under pressure and meet deadlines

• Excellent problem-solving and analytical skills

• Excellent communication and interpersonal skills

What Are the Qualifications for a Principal Site Reliability Engineer Job?

• Bachelor’s degree in Computer Science, Information Technology, or related field

• 5+ years of experience in IT infrastructure and in system and process design, automation, monitoring, optimization, security, scalability, and troubleshooting

• Experience with cloud technologies such as AWS, Azure, or GCP

• Experience with scripting languages such as Python, Bash, or PowerShell

• Experience with configuration management tools such as Chef, Puppet, or Ansible

• Experience with monitoring tools such as Nagios, Zabbix, or Splunk

• Experience with container technologies such as Docker or Kubernetes

What Knowledge Does a Principal Site Reliability Engineer Job Require?

• Knowledge of IT infrastructure, including hardware, software, and networking

• Knowledge of system and process design

• Knowledge of system and process automation

• Knowledge of system and process monitoring

• Knowledge of system and process optimization
