Sr Edge Platforms SRE

Northwestern University
$115,000 to $132,750
remote work
United States, Illinois, Evanston
633 Clark Street
Jun 06, 2025
Job ID
52172
Location
Evanston, Illinois

Department: NAISE - NU ANL Inst Sci Eng
Salary/Grade: ITS/82

Job Summary:

This is an SRE role focused on maintaining and improving operations of the edge fleet, cloud infrastructure, and data pipeline associated with multiple NSF- and DOE-funded projects. At this time, the projects collectively operate nearly 200 remote edge devices, each running Linux and a local Kubernetes cluster to host user applications. We expect this number to grow by around 300 devices over the next 5 years as part of Sage Grande, our latest NSF-funded project, bringing the fleet to nearly 500 devices. (See NSF award for more information.)

The incumbent will work closely with the software team to understand the existing design, requirements, and prior issues, and will use that understanding to decide whether monitoring tooling should be selected from existing options or built in-house. The incumbent will also work with key collaborators (various universities, national labs, industry partners, tribal partners, and other non-profit organizations) to ensure that their expectations for nodes and data are being met.

Finally, we expect that this role will provide good opportunities for career growth. First, our fleet will continue to grow, so we expect multiple iterations on designing and implementing ideas and new technologies as they become available. Second, we anticipate additional cloud infrastructure and backends as we support more projects. This will provide plenty of time to understand cloud infrastructure and work with the software team to learn useful patterns for instrumentation and monitoring. Last, the unique nature of our fleet deployment means the incumbent will likely develop software engineering and data analysis skills through implementing novel tooling for addressing issues at scale.

This is a one-year term position. Opportunity for renewal will be based on performance and available funding.

The primary work location is Argonne National Laboratory. This position is primarily on-site, with the possibility of occasional remote work depending on job responsibilities and with management approval. Some travel to other sites is required.

*Note: Not all aspects of the job are covered by this job description.

Specific Responsibilities:

  • Addressing software and minor hardware issues in the edge fleet in a timely manner and escalating issues which need attention from the deployment team and/or on-site staff.
  • Selecting, developing, and managing tooling and infrastructure for monitoring and alerting. Due to the unique aspects of our edge deployment, it is expected that you will develop substantial software tooling to address gaps that existing tools do not cover.
  • Developing relevant dashboards for the software team to understand how well services are performing.
  • Performing routine maintenance such as software upgrades and minor tasks such as renewing domain certificates annually.
  • Setting up and managing support ticket systems for platform and device issues.
  • Leading a small team (1-2 people) of junior SREs as we grow the SRE team.

Miscellaneous

Perform other duties as assigned.

MINIMUM QUALIFICATIONS (EDUCATION, EXPERIENCE, CERTIFICATIONS, SKILLS)

  • Successful completion of a full 4-year course of study in an accredited college or university leading to a bachelor's or higher degree in a major such as computer science, information technology, or related; OR appropriate combination of education and experience.
  • 4-5 years of direct experience supporting code, services, and deployments in production.
  • Demonstrated experience in Linux, including fundamentals of scripting, user management, networking, package management, SSH, and debugging.
  • Experience in software engineering and Python.
  • Familiarity with Kubernetes, particularly using Kubernetes for deployments, and being familiar with deploying and administering Kubernetes clusters.
  • Familiarity with monitoring and data collection tooling such as Prometheus, Grafana, Fluent Bit, and Loki.
  • Familiarity with basic cybersecurity best practices such as how to securely deploy a web service.
  • Strong willingness to learn new tools and technologies on the job.
  • Strong communication skills.

PREFERRED QUALIFICATIONS (EDUCATION, EXPERIENCE, CERTIFICATIONS, SKILLS)

  • Familiarity with embedded Linux devices such as Raspberry Pi or the NVIDIA Jetson family (including Orin).
  • Familiarity with basic cloud infrastructure concepts such as time series databases (e.g. InfluxDB), S3 storage, message brokers (e.g. RabbitMQ), caching (e.g. Redis), and web services.
  • Familiarity with Infrastructure as Code and config management tooling such as Ansible.
  • Familiarity with basic data analysis and visualization in Python, with a strong ability to communicate issues using these tools.
  • A B.S. or M.S. degree in CS or related fields
  • Linux Operating System
  • Puppet/Chef/Ansible
  • SQL/MySQL/Postgres
  • Python
  • Shell Scripting

Target hiring range for this position will be between $115,000 and $132,750 per year. Offered salary will be determined by the applicant's education, experience, knowledge, skills and abilities, as well as internal equity and alignment with market data.

Benefits:
At Northwestern, we are proud to provide meaningful, competitive, high-quality health care plans, retirement benefits, tuition discounts and more! Visit us at https://www.northwestern.edu/hr/benefits/index.html to learn more.

Work-Life and Wellness:
Northwestern offers comprehensive programs and services to help you and your family navigate life's challenges and opportunities, and adopt and maintain healthy lifestyles.
We support flexible work arrangements where possible and programs to help you locate and pay for quality, affordable childcare and senior/adult care. Visit us at https://www.northwestern.edu/hr/benefits/work-life/index.html to learn more.

Professional Growth & Development:
Northwestern supports employee career development in all circumstances whether your workspace is on campus or at home. If you're interested in developing your professional potential or continuing your formal education, we offer a variety of tools and resources. Visit us at https://www.northwestern.edu/hr/learning/index.html to learn more.

Northwestern University is an Equal Opportunity Employer and does not discriminate on the basis of protected characteristics, including disability and veteran status. View Northwestern's non-discrimination statement. Job applicants who wish to request an accommodation in the application or hiring process should contact the Office of Civil Rights and Title IX Compliance. View additional information on the accommodations process.
