Posts from 2024/06 (5)
HPE CRAY Resource Sharing
How to resolve the error when running the rocminfo command as a regular user account
Error message:
[sylee@cray ~]$ rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: Permission denied
sylee is not member of "video" group, the default DRM access group. Users must be a member of the "video" group or another DRM access group in order for ROCm applications to run successfully.
- When a regular account (e.g. sylee) runs the AMD GPU rocminfo command, permission on the /dev/kfd device..
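The error message above points at two conditions: membership in the "video" group and read-write access to /dev/kfd. A minimal diagnostic sketch in Python follows, assuming a Linux host. The helper names `is_member` and `check_kfd_access` are hypothetical and not part of ROCm, and the script only reports the state; it does not change any group membership.

```python
import getpass
import grp
import os
import pwd
from typing import Sequence


def is_member(user: str, group_members: Sequence[str],
              primary_gid: int, group_gid: int) -> bool:
    """True if the user is listed in the group or has it as primary group."""
    return user in group_members or primary_gid == group_gid


def check_kfd_access(user: str, group: str = "video",
                     device: str = "/dev/kfd") -> dict:
    """Report group membership and device access (hypothetical helper).

    Note: os.access() tests the *current process*, so run this script
    as the account being diagnosed.
    """
    try:
        group_entry = grp.getgrnam(group)
        user_entry = pwd.getpwnam(user)
        in_group = is_member(user, group_entry.gr_mem,
                             user_entry.pw_gid, group_entry.gr_gid)
    except KeyError:
        # The group or the user does not exist on this host.
        in_group = False
    return {
        "in_group": in_group,
        "device_rw": os.access(device, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    print(check_kfd_access(getpass.getuser()))
```

If `in_group` is false for the affected account, that matches the condition reported by rocminfo. Group changes generally take effect only in a new login session.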
Default Python version by OS

RHEL version | Python version | Notes |
---|---|---|
Red Hat Enterprise Linux 6 | Python 2.6 | |
Red Hat Enterprise Linux 7 | Python 2.7 | |
Red Hat Enterprise Linux 8 | Python 3.6 | |
Red Hat Enterprise Linux 9 | Python 3.9 | |

SLES version | Python version | Notes |
---|---|---|
SUSE Linux Enterprise Server 15 | Python 3.6 | |
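The version table above can be expressed as a small lookup, which may be handy when a script needs to know which Python to expect on a node. The sketch below is illustrative only: `DEFAULT_PYTHON`, `parse_os_release`, and `default_python` are hypothetical names, the versions are taken from the table, and it assumes the host identifies itself through `ID` and `VERSION_ID` fields in /etc/os-release.

```python
from typing import Dict, Optional

# (os-release ID, major version) -> default Python version, from the table.
# RHEL 6 may not ship /etc/os-release; its entry is kept for completeness.
DEFAULT_PYTHON = {
    ("rhel", "6"): "2.6",
    ("rhel", "7"): "2.7",
    ("rhel", "8"): "3.6",
    ("rhel", "9"): "3.9",
    ("sles", "15"): "3.6",
}


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, dropping comments and surrounding quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def default_python(os_release_text: str) -> Optional[str]:
    """Return the default Python version for the OS, or None if unknown."""
    fields = parse_os_release(os_release_text)
    major = fields.get("VERSION_ID", "").split(".")[0]
    return DEFAULT_PYTHON.get((fields.get("ID", ""), major))


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        print(default_python(handle.read()))
```

Returning `None` for an unlisted OS keeps the lookup honest: the table covers only the releases named above.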
1. Patch 11793 - HPCM 1.10: cfirmware updates
1.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-2435e54955e04bfa
1.2. Patch list
- HPCM-1765 add FW flashing support for Cray XD2000 computes
- HPCM-2589 add support for iLO firmware upgrade via cfirmware
- HPCM-5186 add new async_apis rpm
- HPCM-5225 python library needs requests-toolbelt 1.0.0
- HPCM-5297 asyncio_cmdb..
1. Patch 11778 - HPCM 1.9: XD2000 platform and remote support
1.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-69f55e3aa8624d88
1.2. Patch list
- HPCM-2906 XD2000 support
- HPCM-2907 Cray XD2000 nodes have a special bmc to query
- HPCM-2908 table of Sensor Type to Rest URI
- HPCM-2909 parse HW collection to serial numbers
- HPCM-2925 use redfish to query for FRU..
1. Patch 11754 - HPCM 1.8: slingshot 2.0 monitoring / alerting update
1.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-728d575ec0234330
1.2. Patch list
- HPCM-2237 slingshot BER dashboard does not have any data
- HPCM-2776 Slingshot Error Reporting - Handle single line message
- HPCM-2868 Redundant Kafka topic for slingshot_CrayFabricHealthTelemetry
- HPCM-2869 ..