SaaS Site Reliability Engineer and Automation Developer
By Siemens Digital Industries Software At Costa Mesa, 92627 $116,900 - $210,400 a year
Develop and maintain automation tools, scripts, and frameworks to streamline deployment, configuration management, and monitoring processes.
Design and implement infrastructure solutions using configuration management tools, such as Ansible, Puppet, or Chef.
Proficiency in automation and configuration management tools (e.g., Ansible, Puppet, Chef).
In-depth knowledge and hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud and their scalability features.
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).
Strong programming skills in languages such as Python, Go, or Ruby.
Senior Site Reliability Engineer, Trello
By Atlassian At San Francisco
3+ years of hands-on experience with public cloud offerings such as AWS, GCP, or Azure
Familiarity with incident management, post-incident analysis, and participation in an on-call rotation
3+ years experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, tweaking dashboards, defining alerts, writing runbooks, etc.
Engineering microservices and tools across one or more programming languages (e.g. Go, Python, Bash)
Automation and Infrastructure-as-Code projects and tooling (e.g. Ansible, Puppet, Terraform)
Build and maintain a continuous integration and delivery pipeline (e.g. Bamboo, Bitbucket Pipelines, GitHub Actions)
Site Reliability Engineer, Product - USDS
By TikTok At Los Angeles $119,000 - $289,000 a year
Gain a solid understanding of the various components and services that power the TikTok experience
Maintain services to meet service-level agreements (SLAs) and service-level objectives (SLOs) by measuring and monitoring availability, performance, and overall system health
Scale systems sustainably through mechanisms such as automation; evolve systems reliability, efficiency, and velocity by pushing for changes
Provide user support, incident responses and postmortems
Site Reliability Engineer, Systems
By Anthropic At San Francisco, CA
Automate operations and infrastructure management
Have significant experience with Kubernetes and cloud-native infrastructure
Have strong communication skills to work with a range of technical and non-technical colleagues
Python and Linux SysAdmin skills
Significant experience with Kubernetes architecture and administration
Strong Linux skills and cloud infrastructure expertise
Site Reliability Engineer (L4/5) - Core
By Netflix At Los Gatos, CA
Experience in risk management and/or analysis
Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks
Read signals and metrics to develop deeper insights into our customers’ quality of experience to help inform business decisions
Strong writing and presentation skills
Development experience with Java, JavaScript/Node.js, Python, Go
Knowledge of cloud platforms (e.g. AWS, GCP) and microservices architecture
Site Reliability Engineer Jobs
By Sohum Inc At San Francisco Bay Area, United States
Full time opportunity that offers excellent benefits.
• Configuration Management and IaC - Salt, Pulumi (Terraform will work)
• Bachelor’s degree in CS / other highly technical discipline, or equivalent experience
• 5+ years of experience, including 3+ years as a Site Reliability Engineer
• Strong networking and firewall knowledge
• Exceptional problem solving and troubleshooting skills
Site Reliability Engineer Jobs
By WalkWater Technologies At Cupertino, CA, United States
Experience with SSL/mTLS and certificate management
Hands-on experience with cloud orchestration platforms such as Kubernetes or Nomad
Setting up CI/CD pipelines using GitHub hooks, TeamCity, Docker, and Artifactory
Familiarity with load balancers, traffic-envoys, and proxies
Familiarity with Java runtime / JVM
Familiarity with observability systems such as Prometheus or OpenMetrics
Staff Site Reliability Engineer
By Netskope At Santa Clara, CA
You will be part of a high caliber engineering team in the exciting space of cloud tools and infrastructure management.
Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis.
You will solve complex, exciting challenges and improve the depth and breadth of your technical and analytical skills
Partner closely with our development teams and product managers to architect and build features that are highly available, performant and secure
Gain deep knowledge of our application stack
Experience improving the performance of microservices and solving scaling/performance issues
Site Reliability Engineer Jobs
By Lawrence Berkeley National Laboratory At San Francisco Bay Area, CA $9,739 - $11,905 a month
Minimum of three years of experience in UNIX or Linux, Networking, IT infrastructure environment and management experience in a distributed-computing environment.
Knowledge of the processes for standard operating procedures, and best practices for implementation and change management.
Past experience with Incident Management and a good understanding of IT service management.
Experience with network security: configuring/maintaining ACLs, knowledge of firewalls
Bachelor’s degree in Computer Science or a similar discipline, or equivalent years of experience.
Strong hands-on knowledge of the Linux shell and working in a command-line (e.g. SSH) environment.
Principal Site Reliability Engineer
By Oracle At Redwood City, 94065, CA
Develop and implement various database life-cycle management flows.
Certification of Database products for cloud integration
Participate in Product Feature Review, Certification experiments and User Document reviews.
Research and acquire skills on new technologies as needed from time to time
6-14 years of Oracle database administration experience on large production environments
Hands-on database skills, especially in database and system troubleshooting and administration
Senior Site Reliability Engineer (SRE)
By Apple At San Diego, CA
Experience in a DevOps or SRE role
Experience with modern web-scale services including servers, VIPs, load balancers, proxies
Highly experienced with one of these: Puppet, Chef, SaltStack, Ansible
Bonus: Native Kubernetes implementation including CNI, Kafka, etcd experience
Bonus: Experience with Cisco, Juniper, or Arista routing and switching hardware (+OS), including wireless
Able to write software needed to build and operate a large scale platform 24x7 including the development and staging platforms.
Sr. Site Reliability Engineer
By Rockset At San Mateo, CA $140,000 - $185,000 a year
Experience with Terraform, Salt, Chef, Packer, or similar configuration management tools
Willing to learn new skills and technologies
Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience
Experience as an SRE for 3+ years
Experience building and operating public-facing 24x7 web applications at scale
Experience working with cloud infrastructure and patterns (AWS preferred)
Staff Site Reliability Engineer
By Collective Health At San Mateo, 94401, CA $140,000 - $210,000 a year
Expertise in management and use of relational databases including.
10+ years of work experience in DevOps, Site Reliability Engineering, or Software Engineering.
Experience creating and monitoring SLIs and SLOs in order to set and remain within error budgets.
Experience in supporting customer-facing production systems and responding to incidents as part of an on-call rotation.
Knowledge of data structures, algorithms, distributed systems, and information retrieval.
Experience in diagnosing and resolving incidents that involve application, OS, network, infrastructure, partners, people, and process.
Manager, Site Reliability Engineer - Remote
By KPMG-UnitedStates At San Diego, CA
Experience in supporting various enterprise class solutions and services including Windows server administration and security issue remediation
Be the Technical Lead representing the SRE/Tier 3 team for operational initiatives or project support
Improve reliability, quality, and time-to-market of our suite of software solutions
Create sustainable systems and services through automation and uplifts
Bachelor's degree from an accredited college or university is preferred

Are you looking for an exciting opportunity to gain hands-on experience as a Site Reliability Engineer Intern? We are looking for a motivated individual to join our team and help us ensure our systems are reliable and secure. As an intern, you will have the chance to work on a variety of projects, from developing automation tools to troubleshooting complex issues. You will also have the opportunity to learn from experienced engineers and gain valuable experience in the field of site reliability engineering. If you are passionate about technology and want to make a difference, this is the perfect opportunity for you!

Overview:

A Site Reliability Engineer internship involves working with a team of engineers to ensure the reliability, scalability, and performance of a company’s web-based applications and services. The intern will be responsible for monitoring and troubleshooting system performance, developing automation solutions, and providing technical support.

Detailed Job Description:

The Site Reliability Engineer Intern will be responsible for the following tasks:

• Monitor system performance and identify potential issues
• Develop automation solutions to improve system reliability
• Troubleshoot system issues and provide technical support
• Assist in the development and implementation of system architecture
• Assist in the development and maintenance of system documentation
• Assist in the development and deployment of new features and services
• Assist in the development and maintenance of system security
• Assist in the development and maintenance of system scalability
• Assist in the development and maintenance of system availability

What Skills Are Required for a Site Reliability Engineer Internship?

• Knowledge of system architecture, design, and development
• Knowledge of system security, scalability, and availability
• Knowledge of system monitoring and troubleshooting
• Knowledge of automation and scripting
• Knowledge of web-based applications and services
• Knowledge of system documentation
• Knowledge of system performance metrics
• Knowledge of system maintenance and optimization
• Knowledge of system backup and recovery
• Knowledge of system disaster recovery
• Knowledge of system capacity planning
• Knowledge of system troubleshooting
• Knowledge of system debugging
• Knowledge of system testing
• Knowledge of system deployment
• Knowledge of system integration
• Knowledge of system monitoring
• Knowledge of system security
• Knowledge of system scalability
• Knowledge of system availability
• Knowledge of system performance
• Knowledge of system reliability
• Knowledge of system optimization
• Knowledge of system administration
• Knowledge of system configuration
• Knowledge of system management