Assignment description
At our client's team, you will work at the cutting edge of High-Performance Computing (HPC) infrastructure for Autonomous Drive. You will ensure the smooth daily operation of their HPC environments and support their business users so that our client gets the most value and performance possible out of the solution.
Their team manages an existing cluster and is now in the project phase for its next-generation major High Throughput Cluster (HTC), with over 100 PiB of storage and CPU/GPU nodes.
You will work closely with Autonomous Drive and Advanced Driver Assistance Systems (AD and ADAS) developers and engineers as an HPC Subject Matter Expert (SME): helping optimize code and strategies for running on HPC, recommending ways of working, arranging workshops and training sessions, capturing requirements, identifying solutions, setting up proofs of concept, helping manage workloads, and handling other related tasks. You will also be involved in the continuous improvement and extension of the Storage and Compute (S&C) solutions, as well as in ongoing operations together with their S&C suppliers. You will be given a high degree of trust and be empowered to solve issues to ensure that the business receives the HPC services it needs.
Your main tasks will be to:
- Work with technical engineers in the business areas, assisting with methods/workflow, application optimization, automation, etc., to improve the overall user experience and efficiency.
- Act as a local Subject Matter Expert (SME) for everything related to HPC/HTC, including Linux, configuration management, schedulers, hardware, applications, etc.
- Assist in developing standards and procedures, and work with their external providers to implement them.
- Participate in both internal IT and business projects as an SME.
- Work with their external service providers to assist in advanced troubleshooting, reviewing and signing off on root cause analysis reports, etc.
- Participate as an SME and team member in the development of their next-generation HPC/HTC solution.
To be successful in this position, you are a Linux native and fluent in Python and/or other scripting languages. You have experience working with HPC technologies and are familiar with tools such as Slurm/PBS/LSF or similar, Docker/Singularity, and Terraform/Ansible or other infrastructure automation tools.
Preferred Qualifications:
- Relevant Linux experience
- Experience with HPC administration in an enterprise environment, including administration of networks, servers, storage, clients, etc.
- Degree in IT, Computer Science or equivalent experience
- Experience with storage concepts such as scale-out NAS (NFS), SMB, parallel filesystems (Lustre), POSIX, and object storage (to name but a few)
- Experience with parallel computing (HPC/HTC), performance troubleshooting and optimization, and operating environments at large scale
- Communicative, with a positive attitude
- Able to understand and speak English at a professional level
- Organized, striving for results and quality
- Ability to coordinate and handle more than one task at the same time
- A curious approach to new things and a willingness to learn and improve
- Experience with Agile ways of working
Details
Reference: 35450
Location: Göteborg
Extent: 100%
Start date: 2024-02-01
End date: 2025-01-30
Consultant broker
Ida Ingeström
This position is no longer open for applications.