All posts (98)
HPE CRAY Resource Sharing
- Switch Management Software for NVIDIA InfiniBand NDR 64-port OSFP Managed Power to Connector Airflow Switch (HPE Part Number P45692-B21)
Date | MLNX-OS version | Path |
---|---|---|
2023.10.03 | 3.11.1014 | https://support.hpe.com/connect/s/softwaredetails?language=en_US&softwareId=MTX_1488dac1f95b4d4a84b4ec264c |
- NDR Switch ※ Note: the front photo shows the Unmanaged Switch. ※ The switch actually has 32 OSFP ports.
- NDR Cables
- Reference: https://docs.nvidia.co..
HPE CRAY XD 670 product page
- GPU Driver & Fabric Manager: as of 2023.11.14, the HPE-recommended version is 525.125.06.
- CUDA Toolkit: the minimum version for Hopper (H100) is 11.8, and HPE recommends the 12.x series.
CUDA Toolkit Download Link
12.2.1 https://developer.download.nvidia.com/compute/cuda/12.2.1/local_installers/cuda_12.2.1_535.86.10_linux.run
12.2.2 https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda_..
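The version rule in this excerpt (CUDA 11.8 as the minimum for Hopper) can be checked with a few lines of shell. This is a minimal sketch, not something taken from the post: the function name `cuda_supports_hopper` is hypothetical, and the version string is passed in by hand rather than read from an installed toolkit.

```shell
# Return success (0) when a CUDA Toolkit version string is at least 11.8,
# the Hopper (H100) minimum quoted above. Hypothetical helper for illustration.
cuda_supports_hopper() {
    ver="$1"
    major="${ver%%.*}"              # text before the first dot
    case "$ver" in
        *.*) rest="${ver#*.}"       # text after the first dot
             minor="${rest%%.*}" ;; # second component only
        *)   minor=0 ;;             # bare major such as "12"
    esac
    [ "$major" -gt 11 ] || { [ "$major" -eq 11 ] && [ "$minor" -ge 8 ]; }
}
```

For example, `cuda_supports_hopper 12.2.1 && echo ok` prints `ok`, while `cuda_supports_hopper 11.7.1` returns a non-zero status.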
Notes on installing slurm + pyxis + enroot on RHEL 8.6
1. Install dependency packages
# yum groupinstall "Development Tools"
# yum install jna python3-docutils python3-devel kernel-rpm-macros \
  gcc-gfortran golang bzip2-devel pam-devel readline-devel java-1.8.0-openjdk-devel \
  python39 python39-devel python39-pip libatomic libatomic-static \
  mariadb mariadb-server mariadb-devel tcl-devel tk-devel libseccomp-devel \
  perl perl..
※ A simple example for using slurm gres.conf
- Install the CUDA toolkit
$ wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
$ sudo sh cuda_11.8.0_520.61.05_linux.run
- Add the "--with-nvml" option to rpmbuild
$ rpmbuild --define "_with_nvml --with-nvml=/usr/local/cuda-11.8" -ta slurm-22.05.6.tar.bz2
- Check that the GPU library is included
$ cd ${HOME}/rpmbuild/RPMS/x86_64
$ rpm -qlp slur..
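The excerpt stops before showing the gres.conf itself, so the following is only a sketch of what such a file commonly contains, not the post's actual configuration. `AutoDetect=nvml` and `Name=gpu File=...` are standard gres.conf parameters; the helper name `make_gres_conf` and the GPU counts are assumptions for illustration.

```shell
# Print a minimal gres.conf body (hypothetical helper).
#   nvml   : rely on an NVML-enabled Slurm build to discover GPUs
#   manual : list the NVIDIA device files explicitly for N GPUs
make_gres_conf() {
    mode="$1"; ngpu="${2:-1}"
    case "$mode" in
        nvml)
            echo "AutoDetect=nvml" ;;
        manual)
            if [ "$ngpu" -eq 1 ]; then
                echo "Name=gpu File=/dev/nvidia0"
            else
                echo "Name=gpu File=/dev/nvidia[0-$((ngpu - 1))]"
            fi ;;
        *)
            echo "usage: make_gres_conf nvml|manual [ngpu]" >&2
            return 1 ;;
    esac
}
```

The output could be redirected to a file for review, for example `make_gres_conf manual 8 > gres.conf.sample`. The matching slurm.conf entries (`GresTypes=gpu` and a `Gres=gpu:N` field on the node line) are still needed separately.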
How to install Lustre 2.15 on CentOS 7.9
Install dependency packages
# yum groupinstall "Development Tools"
# yum install yum-utils kernel-devel zlib-devel libyaml-devel
# yum install libmount-devel libnl3-devel libnl-devel libselinux-devel libselinux-static
rpm build
$ unzip lustre-cray-2.15.B1.gb93242.zip
$ rpm -ivh lustre-2.15.0.4_rc2_cray_134_gb93242d-1.src.rpm
$ cd $HOME/rpmbuild/SOURCES
$ tar xvzf lustre-2.1..
[Compare nm-settings with ifcfg-* directives (IPv4)]
nmcli con mod | ifcfg-* file | Effect |
---|---|---|
ipv4.method manual | BOOTPROTO=none | IPv4 address configured statically |
ipv4.method auto | BOOTPROTO=dhcp | Will look for configuration settings from a DHCPv4 server |
ipv4.address "192.168.0.10/24" | IPADDR=192.168.0.10 PREFIX=24 | Set static IPv4 address, network prefix |
ipv4.gateway 192.168.0.1 | GATEWAY=192.168.0.1 | Set IPv..
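The static-address rows of this comparison can be turned into a single nmcli command. The sketch below only prints the command instead of running it; the function name `ifcfg_to_nmcli` and the connection name `eth0` are assumptions, and the property names are the ones listed in the comparison.

```shell
# Print the `nmcli con mod` command equivalent to a static ifcfg-* stanza
# (BOOTPROTO=none, IPADDR, PREFIX, GATEWAY). Hypothetical helper; nothing is applied.
ifcfg_to_nmcli() {
    con="$1"; ipaddr="$2"; prefix="$3"; gateway="$4"
    # The address is set before the method so that "manual" never appears
    # without an address on the command line.
    echo "nmcli con mod ${con} ipv4.address ${ipaddr}/${prefix} ipv4.gateway ${gateway} ipv4.method manual"
}
```

Calling `ifcfg_to_nmcli eth0 192.168.0.10 24 192.168.0.1` prints the command for review; after running it by hand, `nmcli con up eth0` would normally be needed to activate the change.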
1. Install dependency packages
# yum groupinstall "Development Tools"
# yum install gcc-gfortran golang tcl-devel tk-devel
2. Environment Modules Source Build
- Source Download page: https://modules.sourceforge.net
- Source Build
# wget https://sourceforge.net/projects/modules/files/Modules/modules-5.2.0/modules-5.2.0.tar.gz/download -O modules-5.2.0.tar.gz
# tar xvzf modules-5.2.0.tar.gz
# cd modules-5.2.0
# ..
# ipmitool
No command provided!
Commands:
  raw      Send a RAW IPMI request and print response
  i2c      Send an I2C Master Write-Read command and print response
  spd      Print SPD info from remote I2C device
  lan      Configure LAN Channels
  chassis  Get chassis status and set power state
  power    Shortcut to chassis power commands
  event    Send pre-defined events to MC
  mc       Management Controller status and global enables
  sdr ..
- Number of open files on Linux
# cat /proc/sys/fs/file-nr
- drop_caches
Clear the pagecache
# echo 1 > /proc/sys/vm/drop_caches
Clear the dentries and inodes cache
# echo 2 > /proc/sys/vm/drop_caches
Clear the pagecache, dentries and inodes cache
# echo 3 > /proc/sys/vm/drop_caches
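The two commands above are easier to use with a little interpretation. `/proc/sys/fs/file-nr` holds three numbers (allocated file handles, allocated but unused handles, and the system-wide maximum), and the drop_caches modes only free clean objects, so running `sync` first is the usual advice. The helpers below are a sketch with hypothetical names; the second one only prints the command and never writes to `/proc`.

```shell
# Summarize file-nr. With no argument the live value is read;
# an argument such as "1024 24 100000" can be passed for testing.
file_nr_summary() {
    fn_line="${1:-$(cat /proc/sys/fs/file-nr)}"
    set -- $fn_line                      # split on whitespace into $1 $2 $3
    fn_in_use=$(( $1 - $2 ))             # allocated minus unused
    echo "in_use=${fn_in_use} max=$3 pct=$(( fn_in_use * 100 / $3 ))"
}

# Print (do not execute) the command for a drop_caches mode.
drop_caches_plan() {
    case "$1" in
        1) dc_what="pagecache" ;;
        2) dc_what="dentries and inodes" ;;
        3) dc_what="pagecache, dentries and inodes" ;;
        *) echo "mode must be 1, 2 or 3" >&2; return 1 ;;
    esac
    echo "sync && echo $1 > /proc/sys/vm/drop_caches  # clears ${dc_what}"
}
```

Running `file_nr_summary` with no argument reports the current system state; `drop_caches_plan 3` prints the full clear command so it can be reviewed before being run as root.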
How to use MLDE 0.19.8 in an offline environment with conda
Exporting packages with conda pack
$ conda create -n mlde_0.19.8 python=3.8
$ source activate mlde_0.19.8
$ conda install conda-pack
$ pip install "determined==0.19.8" "msrest==0.6.21" "backoff==1.10.0" "azure_core==1.22.1"
$ conda pack -n mlde_0.19.8 -o mlde_0.19.8.tar.gz
$ conda deactivate
Installing packages with conda unpack
$ mkdir -p mlde_0.19.8
$ cd mlde_0.19.8
$ ..