Site Reliability Engineering (SRE) and NOC Manager Job at DAT, Beaverton, OR

  • DAT
  • Beaverton, OR

Job Description

About DAT

DAT is an award-winning employer of choice and a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year by deploying a suite of software solutions to millions of customers every day - customers who depend on DAT for the most relevant data and most accurate insights to help them make smarter business decisions and run their companies more profitably. We operate the largest marketplace of its kind in North America, with 400 million freights posted in 2022, and a database of $150 billion of annual global shipment market transaction data. Our headquarters are in Denver, CO, with additional offices in Missouri, Oregon, and Bangalore, India. For additional information, see

Job Application Deadline: 09/30/2024

The Opportunity

DAT is looking for a Site Reliability Engineering (SRE) and NOC Manager to join our team in Portland, OR; Denver, CO; or remotely.

We're looking for a leader to manage a team of Site Reliability Engineers and a team of NOC engineers. These two teams work to ensure the reliability, availability, and performance of DAT's systems and services. This role combines technical expertise with leadership skills to oversee the development and implementation of reliability strategies and operational practices.

This is your opportunity to own and leverage our observability tooling to implement best practices for gathering and reporting Site Reliability metrics and our Incident Management processes. We need you to build out the ways in which SRE ingests and manages work across our products and services and to own the incident management tooling and process and remediation reporting end to end.

Candidate profile
  • Experienced leader of SRE and NOC teams with hands-on experience triaging and remediating incidents.
  • Demonstrated track record of driving improvements in incident management processes and reporting.
  • Experience administering observability tools like New Relic, Datadog, CloudWatch, or others.
  • Driven to determine where in the stack a problem lies, and able to lead engineers to do the same.
  • Ability to debug code in Java, Python, Go, and C.
  • Constantly looking to improve processes, systems, and reliability.

What You'll Do
  • Lead, mentor, and manage a team of SREs and a team of NOC engineers, providing guidance, support, and career development opportunities.
  • Develop and execute strategies to enhance system reliability, availability, and performance.
  • Refine and improve our incident management process, providing collection and reporting for all relevant MTTx metrics.
  • Collaborate with engineering teams to design, deploy, and maintain scalable and robust systems.
  • Oversee the creation and maintenance of monitoring, alerting, and incident remediation processes.
  • Manage and prioritize operational tasks, incidents, and system improvements.
  • Drive continuous improvement initiatives, including post-incident reviews and root cause analysis.
  • Ensure adherence to best practices and industry standards for SRE and operational excellence.
  • Develop and track key performance metrics and service level objectives (SLOs).
  • Participate in the on-call rotation and incident management as needed.
  • Foster a culture of collaboration, transparency, and accountability within the team and across the organization.
The Skills You'll Need
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • Proven experience as an SRE or in a similar role, with a track record of managing and leading technical teams.
  • Strong knowledge of system design, networking, and cloud infrastructure.
  • Proficiency in scripting or programming languages (e.g., Python, Go, Shell).
  • Proficiency in AWS, GCP, or another cloud platform.
  • Experience with monitoring, logging, and observability tools (e.g., Datadog, New Relic, Prometheus, Grafana, ELK Stack).
  • Familiarity with containerization and orchestration platforms (e.g., Docker, Kubernetes).
  • Excellent leadership, communication, and interpersonal skills.
  • Strong problem-solving abilities and attention to detail.
  • Ability to work in a fast-paced environment and manage multiple priorities effectively.

Bonus Skills
  • Experience working in legacy software systems and modernizing the stack.
  • Lift-and-shift experience migrating from on-prem to AWS.

Why DAT? DAT is an award-winning employer of choice.

For starters, we have a hybrid work environment, but we also know what makes a great workplace. We have a time-tested and resolute set of operating values predicated on integrity, mutual respect, open communication, and executing with excellence. These values inform our strategic vision as much as any one of our products does. We've been an employer of choice in the Portland metropolitan area for four decades, and within one year of opening our Denver office, DAT was #26 on Built In Colorado's 100 Best Places to Work In Colorado.
  • Medical, Dental, Vision, Life, and AD&D insurance
  • Parental Leave
  • Up to 20 days of paid time off starting in year one
  • An additional 10 holidays of paid time off per calendar year
  • 401k matching (immediately vested)
  • Employee Stock Purchase Plan
  • Short- and Long-term disability sick leave
  • Flexible Spending Accounts
  • Health Savings Accounts
  • Tuition Reimbursement Program
  • Employee Assistance Program
  • Additional programs - Employee Referral, Internal Recognition, and Wellness
  • Free TriMet transit pass (Beaverton Office)
  • Competitive salary and benefits package
  • Work on impactful projects in a cutting-edge environment
  • Collaborative and supportive team culture
  • Opportunity to make a real difference in the trucking industry
  • Employee Resource Groups

This position is not eligible for Visa sponsorship.

For Colorado-based candidates, in compliance with Colorado's Equal Pay for Equal Work Act, the minimum salary for this role is $140,000.00 + benefits. The maximum compensation for this role can vary significantly depending on your job-related skills and experience. DAT considers factors such as scope and responsibilities of the position, candidate's work experience, education and training, core skills, internal equity, and market and business elements when extending an offer.

DAT embraces the value of a diverse workforce, and believes it is a core strength of our company that we encourage those values in every DAT employee, at every level of our organization, regardless of tenure or rank. We provide equal employment opportunities (EEO) to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran in accordance with applicable federal, state, and local laws.

Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities

The contractor will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or (c) consistent with the contractor's legal duty to furnish information. 41 CFR 60-1.35(c)

