To support teaching and research, IT Services (ITS) of ETH Zurich operates as a service organization in a large and complex IT environment. Scientific IT Services (SIS, a section of ITS) aims to bridge the gap between computational research and the provisioning of IT services and infrastructure, and provides a stimulating, flexible and family-friendly working environment. SIS works closely with ETH researchers across the wide area of scientific computing, supporting data management and analysis, and the development of scientific software, through the operation of high-performance and cloud computing infrastructures.
To strengthen our team of experts for the engineering, provisioning and operation of our research IT platforms, SIS is looking for an experienced System Engineer / Linux System Administrator with a focus on operational infrastructure and service operation.
This position will be part of a growing team of expert system engineers and system administrators who ensure the engineering, development and operation of customer-tailored data management systems (e.g. openBIS), data analysis workflows, and data science IT platforms for various scientific domains.
While your primary focus will be engineering and supporting the baseline operational infrastructure needed to automate and scale a large number of heterogeneous services, your tasks will range from devising end-to-end solutions that meet customers' use cases and needs, to ensuring day-to-day operation of the provided services, to incident handling, to constantly improving the automation and efficiency of the support infrastructure. Experimenting with new technologies, from proof of concept to full production, will also be an integral and crucial part of your role.
This position requires proven experience in service operation in production environments. You must be an experienced Linux system administrator, familiar with networking as well as open-source code development, and have experience automating complex, repetitive tasks using Bash or Python scripts. You should demonstrate experience with configuration management and application deployment systems (such as Ansible, CFEngine, Puppet, etc.). Good knowledge of incident management, monitoring tools, process automation, fail-over, and disaster recovery is required. Prior experience with IT security and with virtualization technologies (cloud environments, containers and container orchestration through Kubernetes) is a plus. Besides solid technical abilities, this position also requires an aptitude for finding good solutions to complex IT problems, together with a "focused on the finished product" mindset. Good communication skills and the ability to collaborate with other IT specialists are necessary.