HPE CRAY Resource Sharing
[Patch] HPCM 1.8 Patch List
1. Patch 11754 - HPCM 1.8: slingshot 2.0 monitoring / alerting update
1.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-728d575ec0234330
1.2. Patch list
HPCM-2237 slingshot BER dashboard does not have any data
HPCM-2776 Slingshot Error Reporting - Handle single line message
HPCM-2868 Redundant Kafka topic for slingshot_CrayFabricHealthTelemetry
HPCM-2869 Some health events at Kafka bus, "PhysicalContext" has duplicate category string
HPCM-2870 FM doesn't populate NumberValue and TimestampLong fields, so they can be removed
HPCM-2871 SS Health events for some categories wrongly filtered out
HPCM-2872 Some Slingshot Indices are not getting created in Elastic DB for health monitors
HPCM-2886 CLI to enable/disable Slingshot features based on Slingshot version
HPCM-2895 HPCM monitoring: Show slingshot metric rxCongestion
HPCM-2896 HPCM monitoring: Show slingshot metric Congestion idle
HPCM-2930 Unable to view device alert for slingshotswitch
HPCM-2988 MessageID should not be removed completely as it is affecting other slingshot indices
HPCM-2991 TimeScaleDB dashboards query needs to accommodate new change in metric_name under label key table
HPCM-2988 MessageID should not be removed; affects other slingshot indices
HPCM-3024 cm monitoring slingshot get version has "None" output
HPCM-3026 slingshot_CrayFabricHealthTelemetry still reports in Kafka topic
HPCM-3027 slingshot_CrayFabricHealth default index not created under ELK
HPCM-3033 Slingshot rxBW/txBW/idle dashboard (ELK) has to be modified to consider MessageID
HPCM-3056 Slingshot fabric monitoring requires hpe-clusterview plugin
HPCM-3059 Slingshot cray fabric health alert is referring to wrong ELK index
HPCM-3112 Slingshot - increase kafka-connect memory to 6Gb
HPCM-3161 Slingshot - unable to view device alert for slingshotswitch
2. Patch 11755 - HPCM 1.8: recommended update #1
2.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-1876ef8075164a97
2.2. Patch list
HPCM-2781 Consoles are slow to reconnect on frontier after booting up system
HPCM-2874 cm node set -n leader1 --image --kernel command failed with timescaledb error
HPCM-2877 Post-upgrade admin reboot, not all mount points are remounted
HPCM-2936 cm-configuration service starts before network is online
HPCM-2937 Report generation not working for HPCG
HPCM-2956 Copyright character in slingshot-fabric-check causes python scripts to fail
HPCM-2961 slingshot-fabric-check: ensure header exists in perfstats.dat
HPCM-2983 Unable to flash RM on EX235a nodes
HPCM-2984 su-leader-nodes missing pixz dependency needed for new cm-logrotate-parallel
HPCM-2995 Allow MPI diags to run locally with and without fabric
HPCM-3028 Unable to flash newer models of slingshot switches
HPCM-3042 HPCM 1.8 cexec doesn't protect commands with semi-colons
HPCM-3054 cm monitoring elk restart fails to restart
HPCM-3056 Add hpe-grafana-clusterview-panel
HPCM-3057 Revert change that skipped miniroot cluster configuration for NFS based nodes
HPCM-3098 Reduce number of calls to power API in slurm monitoring script
HPCM-3117 Fix 15-network-setup renames data nic when HSN net is assigned to node
HPCM-3185 Add 'run all' script to run all diags locally on node
HPCM-3189 Add support for slurm 22.x to SPANK power plugin
HPCM-3195 Add error message when pxe boot file write would fail on full disk
HPCM-3196 Add script to map xnames to simple names
HPCM-3223 Add 95% disk warning to brick-and-ctdb-health-check.pl
HPCM-3225 Fix monitor mode regression
HPCM-3254 Update Intel MPI to fix errors in online diags and health checks
3. Patch 11763 - HPCM 1.8: recommended alerta update
3.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-8f495406aab14d07
3.2. Patch list
HPCM-3258 Add alerta-logstash plugin to alerta 8.x version
4. Patch 11765 - HPCM 1.8: monitoring updates
4.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-076d3137606145b8
4.2. Patch list
HPCM-3257: TypeError: index() missing required positional argument: 'doc_type'
HPCM-3288: Disable confluent-ksqldb
HPCM-3292: Timescale dashboards are not filtering on time
HPCM-3343: Integrate slurm monitoring enhancements and fixes for slurm 22.x compatibility
HPCM-3370: Curator not running due to outdated cron.d
HPCM-3423: cm monitoring fails to start
HPCM-3524: Add time filter for system monitoring timescale dashboards
HPCM-3632: Proliant IML alerts not seen in "cm health alert compute" due to hashlib.sha224() issue in ImlAlertRule.py
HPCM-3634: alerta environment variable for IML alerts (host) has been appended as bytes ('b) and causing the device-level alerts invisible
5. Patch 11771 - HPCM 1.8: quorum high availability update
5.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-4086133e99bf4df9
5.2. Patch list
HPCM-3278: Q-HA tooling (sles-cluster-configuration) assumes head is a /16 network
HPCM-3453: Q-HA adminvm image creation can fail for large image files
6. Patch 11774 - HPCM 1.8: HA-RLC and syslog updates
6.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-28f2f797ebe24c17
6.2. Patch list
HPCM-3582 Fix ICE compute syslog
HPCM-3813 HARLC: rhel8: Error while setting up resources using pcs (includes CAST-32282)
HPCM-4058 HARLC: rhel8: blademon cannot restart dhcpd using pcs
7. Patch 11777 - HPCM 1.8: cluster health check, yume, remlog-collect, grafana-dashboard updates
7.1. Patch information URL
https://support.hpe.com/connect/s/softwaredetails?language=en_US&collectionId=MTX-ccd2229ffd7a44c7
7.2. Patch list
HPCM-3247 sapoweru accumulator.update iterator regression
HPCM-3376 Inventory leaves defunct processes if ssh commands hang
HPCM-3873 On busy admin, image creation spends time in zypper-launched lsof
HPCM-3965 Rewrite slurm mon script with threadpool executors
HPCM-3966 Fix remlog-collect slurm/pbs monitoring from making multiple calls
HPCM-4192 hpe_clmgr_power_api: Provide API to get all node power
HPCM-4260 Fabric dashboard improvement
HPCM-4348 slurm_install_path and pbs_install_path parameter is missing in remlog-collect polling
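
Some issue IDs in the lists above appear under more than one patch (for example, HPCM-3056 is listed in both Patch 11754 and Patch 11755). The following is a minimal sketch of how such overlaps could be found by parsing the plain-text layout used in this post. The function name `index_issues` and both regular expressions are illustrative assumptions, not part of any HPE tool, and the sample text is a short excerpt of the entries above.

```python
import re
from collections import defaultdict

# Assumption: entries look like "HPCM-2237 text" or "HPCM-3257: text".
ENTRY_RE = re.compile(r"^(HPCM-\d+):?\s+(.*)$")
# Assumption: patch headings look like "1. Patch 11754 - HPCM 1.8: ...".
PATCH_RE = re.compile(r"^\d+\.\s+Patch\s+(\d+)\s+-\s+(.*)$")


def index_issues(text):
    """Map each issue ID to the list of patch numbers that mention it."""
    issues = defaultdict(list)
    current_patch = None
    for raw in text.splitlines():
        line = raw.strip()
        heading = PATCH_RE.match(line)
        if heading:
            current_patch = heading.group(1)
            continue
        entry = ENTRY_RE.match(line)
        if entry and current_patch is not None:
            issues[entry.group(1)].append(current_patch)
    return dict(issues)


# Short excerpt of the lists above, used only to exercise the parser.
sample = """\
1. Patch 11754 - HPCM 1.8: slingshot 2.0 monitoring / alerting update
HPCM-3056 Slingshot fabric monitoring requires hpe-clusterview plugin
2. Patch 11755 - HPCM 1.8: recommended update #1
HPCM-3056 Add hpe-grafana-clusterview-panel
4. Patch 11765 - HPCM 1.8: monitoring updates
HPCM-3288: Disable confluent-ksqldb
"""

index = index_issues(sample)
# Keep only issue IDs that are mentioned by more than one distinct patch.
shared = {k: v for k, v in index.items() if len(set(v)) > 1}
print(shared)
```

Run against the full text of this post, the same function would give a quick check of which fixes are spread across several patches before they are applied.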