Assignment Description

Our client in the automotive industry is currently seeking an HPC/HTC Engineer.

In our client’s team, you will work at the cutting edge of High-Performance Computing (HPC) infrastructure for Autonomous Drive. You will ensure the smooth daily operation of their HPC environments and support their business users so that our client gets as much value and performance as possible out of the solution.

The team manages an existing cluster and is now in the project phase for a major next-generation High-Throughput Cluster (HTC) with over 100 PiB of storage and CPU/GPU nodes.

You will work closely with Autonomous Drive and Advanced Driver Assistance Systems (AD and ADAS) developers and engineers as an HPC Subject Matter Expert (SME): helping optimize code and strategies for running on HPC, recommending ways of working, arranging workshops and training sessions, capturing requirements, identifying solutions, setting up proofs of concept, helping manage workloads, and handling other related tasks. You will also be involved in the continuous improvement and extension of the Storage and Compute (S&C) solutions, as well as in ongoing operations together with the client's S&C suppliers. You will be given a high degree of trust and be empowered to solve issues so that the business receives the HPC services it needs.

Main Responsibilities:

  • Work with technical engineers in the business areas, assisting with methods/workflow, application optimization, automation, etc., to improve the overall user experience and efficiency.
  • Act as a local Subject Matter Expert (SME) for everything related to HPC/HTC, including Linux, configuration management, schedulers, hardware, applications, etc.
  • Assist in developing standards and procedures and work with the client's external providers to implement them.
  • Participate in both internal IT and business projects as an SME.
  • Collaborate with external service providers on advanced troubleshooting, reviewing and signing off on root cause analysis reports, etc.
  • Participate as an SME and team member in the development of the client's next-generation HPC/HTC solution.

Preferred Qualifications:

  • Relevant Linux experience.
  • Experience with HPC administration in an enterprise environment, including administration of networks, servers, storage, clients, etc.
  • Degree in IT, Computer Science, or equivalent experience.
  • Experience with storage concepts such as scale-out NAS (NFS), SMB, parallel filesystems (Lustre), POSIX, and object storage (to name but a few).
  • Experience with parallel computing (HPC/HTC), performance troubleshooting and optimization, and operating environments at large scale.
  • Communicative and positive attitude.
  • Fluent in English at a professional level.
  • Organized, with a drive for results and quality.
  • Ability to coordinate and handle more than one task at the same time.
  • A curious approach to new things and a willingness to learn and improve.
  • Experience with Agile ways of working.

To be successful in this position, you are a Linux native and fluent in Python and/or other scripting languages. You have experience working with HPC technologies and are familiar with schedulers such as Slurm, PBS, or LSF, container tools such as Docker or Singularity, and infrastructure automation tools such as Terraform or Ansible.

Details

Reference: 43210

Location: Göteborg

Extent: 100%

Start date: 2024-04-15

End date: 2025-04-15

Consultant broker

This position is no longer open for applications.