EISSN 1726-3522
Language: ru


DEVELOPING A MODEL FOR HOLISTIC WORKLOAD ANALYSIS OF LARGE SUPERCOMPUTER SYSTEMS (2021)
Issue: Vol. 22, No. 1 (2021)
Authors: Швец Павел Артёмович, Воеводин Вадим Владимирович, Жуматий Сергей Анатольевич

Any modern supercomputer has an extremely complex architecture, and using its resources efficiently is often very difficult, even for experienced users. At the same time, demand for high-performance computing keeps growing, which makes efficient utilization of supercomputers an urgent issue. Users therefore need to know everything important about the performance of their jobs running on a supercomputer in order to optimize them, and administrators need to be able to monitor and analyze every aspect of how efficiently such systems function. However, there is currently no complete understanding of which data should be studied, and how they should be analyzed, to obtain a full picture of the state of a supercomputer and the processes taking place on it. In this paper, we make a first attempt to answer this question. To do this, we are developing a model that describes all the potential factors that may be important when analyzing the performance of supercomputer applications and of the HPC system as a whole. The paper provides both a detailed description of this model for users and administrators and several real-life examples discovered on the Lomonosov-2 supercomputer using a software implementation based on the proposed model.
