CEP's Blog -by Kage Park: Computer/HPC category post list
http://kaget.cep.kr/blog/
2022-08-20T01:15:31+09:00
Textcube 1.10.7 : Tempo primo
KGT (Kage Engineer Tools)
Kage Park
http://kaget.cep.kr/blog/304
2019-12-27T08:06:45+09:00
2019-09-30T11:44:39+09:00
<h1 style="box-sizing: border-box; line-height: 1.25; padding-bottom: 0.3em; border-bottom: 1px solid #eaecef; color: #24292e; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji'; margin: 0px !important 0px 16px 0px;">kgt</h1>
<p style="box-sizing: border-box; margin-top: 0px; margin-bottom: 16px; color: #24292e; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji'; font-size: 16px;">Kage Engineer Tools. The original code is being moved to the github.com site. It is released under the GPL license (open source).</p>
<p style="box-sizing: border-box; margin-top: 0px; margin-bottom: 16px; color: #24292e; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji'; font-size: 16px;">Install:</p>
<h1 style="box-sizing: border-box; margin: 24px 0px 16px; line-height: 1.25; padding-bottom: 0.3em; border-bottom: 1px solid #eaecef; color: #24292e; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji';"><a id="user-content-git-clone-httpsgithubcomkageparkkgtgit" class="anchor" style="box-sizing: border-box; background-color: initial; color: #0366d6; text-decoration-line: none; float: left; padding-right: 4px; margin-left: -20px; line-height: 1;" href="https://github.com/kagepark/kgt#git-clone-httpsgithubcomkageparkkgtgit"></a>git clone <a style="box-sizing: border-box; background-color: initial; color: #0366d6; text-decoration-line: none;" href="https://github.com/kagepark/kgt.git">https://github.com/kagepark/kgt.git</a></h1>
KxCAT for HPC using xCAT easy interface
Kage Park
http://kaget.cep.kr/blog/303
2019-12-27T08:06:35+09:00
2019-09-30T11:41:51+09:00
<span style="color: #24292e; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji'; font-size: 16px;">KxCAT is based on xCAT, an open-source project from IBM (</span><a style="box-sizing: border-box; color: #0366d6; text-decoration-line: none; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji'; font-size: 16px;" href="https://sourceforge.net/projects/xcat/" rel="nofollow">https://sourceforge.net/projects/xcat/</a><span style="color: #24292e; font-family: -apple-system, system-ui, 'Segoe UI', Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji'; font-size: 16px;">). It is simply an easy-to-use interface to the xCAT commands, written in the Bash shell scripting language. xCAT commands used to be hard for me, so in 2012 I started writing scripts for the installation and for some of the commands I use. It now covers many xCAT functions and reduces an engineer's mistakes with xCAT commands and procedures. If you want to get experience with an HPC system, it will help you see how HPC works and what it is. It was tested on a CentOS 7.4 base; it supports CentOS 7.x with both diskless and diskful compute nodes.</span><br />Wiki site: <a href="https://github.com/kagepark/kxcat/wiki">https://github.com/kagepark/kxcat/wiki</a>
Filesystems
Kage Park
http://kaget.cep.kr/blog/297
2018-03-24T07:58:44+09:00
2018-03-24T07:58:18+09:00
* Ceph (server environment)<br />- Physical storage clustering<br />- only one access point (a mount point on a single node) can access a given storage device<br />- useful<br /> - increase I/O performance over a single server's I/O<br /> - a storage device for an HA server<br /> - provide a single storage device (rbd) per mount point in a storage pool<br />- does not support a single storage device (rbd) with multiple mount points (no multi-node support)<br /><br />* Lustre (HPC computing environment)<br />- http://lustre.org/<br />- https://downloads.hpdd.intel.com/public/lustre/<br />- Physical storage clustering<br />- Supports TCP/IP (Ethernet/IPoIB)<br />
<div>- multiple access points (mount points on multiple nodes) can access the same storage device<br />- Supports the Ethernet network protocol (Ethernet/IP over IB)</div>
<div>- useful</div>
<div> - increase I/O performance</div>
<div><span style="font-size: 15.008px;"> - provide a single storage device to mount points on multiple hosts (similar to an NFS mount)</span></div>
<div></div>
<div><span style="font-size: 15.008px;">* BeeGFS (HPC computing environment)<br /></span>- https://www.beegfs.io/<br /><span style="font-size: 15.008px;">- Physical storage clustering<br /></span>- Supports TCP/IP, IB<br />
<div style="font-size: 15.008px;">- multiple access points (mount points on multiple nodes) can access the same storage device</div>
<div style="font-size: 15.008px;">- useful</div>
<div style="font-size: 15.008px;"> - increase I/O performance</div>
<div style="font-size: 15.008px;"><span style="font-size: 15.008px;"> - provide a single storage device to mount points on multiple hosts (similar to an NFS mount)</span></div>
</div>
<br /><span style="font-size: 15.008px;">* GlusterFS (HPC computing environment)<br /></span>- https://www.gluster.org/<br /><span style="font-size: 15.008px;">- Physical storage clustering<br /></span>- Supports TCP/IP, IB, Sockets Direct Protocol<br />
<div style="font-size: 15.008px;">- multiple access points (mount points on multiple nodes) can access the same storage device</div>
<div style="font-size: 15.008px;">- useful</div>
<div style="font-size: 15.008px;"> - increase I/O performance</div>
<div style="font-size: 15.008px;"> - provide a single storage device to mount points on multiple hosts (similar to an NFS mount)</div>
(basic) installation of hpl-2.0 with ATLAS, gcc and mpich
Kage Park
http://kaget.cep.kr/blog/266
2015-10-31T15:34:21+09:00
2011-04-05T05:45:17+09:00
requirements :<br> an MPI program and environment on the system.<br> the attached file is based on the MPI (mvapich/openmpi) of OFED.<br><br>download hpl-2.0 & ATLAS 3.8.3<br>http://sourceforge.net/projects/math-atlas/files/Stable/3.8.3/<br><br><br>make a temporary directory for compiling/setup<br>$ mkdir -p /kage<br>copy hpl & atlas to /kage<br><br>extract ATLAS<br>$ tar zxvf atlas-xxxx.tar.gz<br>$ cd ATLAS<br><br>create a configuration file<br>$ vi opt.conf<br>--------------------------------------------------------------------------------------------------------------<br>#http://math-atlas.sourceforge.net/atlas_install/<br>arch=Linux_Xeon_SSE2<br>mkdir -p $arch<br>cd $arch<br>../configure -b 64 -D -c -DPentiumCPS=240 --prefix=/kage/hpl/atlas<br><br>#../configure -b 32 \ # Currently the BCCD only supports 32-bit<br># -t -1 \ # -1 tells ATLAS to try to autodetect th<br># -Si cputhrchk 0 \ # Do not check for CPU throttling<br># --prefix=$HOME/hpl/atlas \ # Could be anywhere, but note this path,<br># --nof77 \ # Don't worry about FORTRAN<br># --cc=/usr/bin/gcc \ # Use gcc<br># -C ic /usr/bin/gcc # Really, use gcc (see doc for explainat<br><br><br>mkdir -p /kage/hpl/atlas<br>mkdir -p /kage/hpl/lib/atlas<br>make build<br>make check<br>make time<br>make install<br>--------------------------------------------------------------------------------------------------------------<br><br>compile ATLAS<br>$ sh opt.conf<br><br>extract hpl-2.0<br>$ tar zxvf hpl-2.0.xxx.tar.gz<br>$ cd hpl-2.0<br><br>create a configuration file<br>$ vi setup/Make.Linux_ATHLON_CBLAS<br>--------------------------------------------------------------------------------------------------------------<br>
SHELL = /bin/sh<br>CD = cd<br>CP = cp<br>LN_S = ln -s<br>MKDIR = mkdir<br>RM = /bin/rm -f<br>TOUCH = touch<br>HOME = /kage<br>ARCH = Linux_ATHLON_CBLAS<br>TOPdir = $(HOME)/hpl-2.0<br>INCdir = $(TOPdir)/include<br>BINdir = $(TOPdir)/bin/$(ARCH)<br>LIBdir = $(TOPdir)/lib/$(ARCH)<br>HPLlib = $(LIBdir)/libhpl.a<br><br>#MPdir = /usr/local/mpi<br>#MPdir = /usr/mpi/gcc/mvapich-1.2.0<br>MPdir = /usr/mpi/gcc/openmpi-1.4.1<br>MPinc = -I$(MPdir)/include<br>#MPlib = $(MPdir)/lib/libmpich.a<br>MPlib = $(MPdir)/lib64/libmpi.so<br>#MPlib = $(MPdir)/lib64/libvt.mpi.a<br>#LAdir = $(HOME)/netlib/ARCHIVES/Linux_ATHLON<br>LAdir = $(HOME)/hpl/atlas<br>LAinc =<br>LAlib = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a<br>F2CDEFS =<br><br># ----------------------------------------------------------------------<br># - HPL includes / libraries / specifics <br># ----------------------------------------------------------------------<br>HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)<br>HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)<br>HPL_OPTS = -DHPL_CALL_CBLAS<br>HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)<br><br># ----------------------------------------------------------------------<br># - Compilers / linkers - Optimization flags <br># ----------------------------------------------------------------------<br>CC = /usr/bin/gcc<br>CCNOOPT = $(HPL_DEFS)<br>CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops -W -W<br>LINKER = /usr/bin/gcc<br>LINKFLAGS = $(CCFLAGS)<br>ARCHIVER = ar<br>ARFLAGS = r<br>RANLIB = echo<br>--------------------------------------------------------------------------------------------------------------<br>
<br>$ ln -s setup/Make.Linux_ATHLON_CBLAS .<br>$ vi opt.conf<br>--------------------------------------------------------------------------------------------------------------<br>#make arch=Linux_ATHLON_CBLAS clean<br>make arch=Linux_ATHLON_CBLAS<br>
--------------------------------------------------------------------------------------------------------------<br>
<br>compile<br>$ sh opt.conf<br><br>test run hpl with 4 processes on localhost<br>$ cd /kage/hpl-2.0/bin/Linux_ATHLON_CBLAS<br>
$ vi hostlist<br>--------------------------------------------------------------------------------------------------------------<br>localhost<br>localhost<br>localhost<br>localhost<br>--------------------------------------------------------------------------------------------------------------<br>
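Instead of typing the hostlist by hand, the same four-slot localhost machinefile can be generated; a minimal sketch in plain POSIX sh (the file name `hostlist` matches the one used above):

```shell
#!/bin/sh
# Generate a 4-slot MPI machinefile pointing at localhost,
# equivalent to the hand-written hostlist above.
NSLOTS=4
: > hostlist                 # truncate/create the file
i=1
while [ "$i" -le "$NSLOTS" ]; do
    echo localhost >> hostlist
    i=$((i + 1))
done
```

Set NSLOTS to the number of MPI ranks you intend to pass to mpirun with -np.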
$ /usr/mpi/gcc/openmpi-1.4.1/bin/mpirun -np 4 -machinefile ./hostlist ./xhpl >& hpl.out<br><br><br>check output<br>
$ tail -f hpl.out<br>
<br>~~~~~~~~<br>================================================================================<br>T/V N NB P Q Time Gflops<br>--------------------------------------------------------------------------------<br>WR00R2R4 35 4 4 1 0.00 2.112e-01<br>--------------------------------------------------------------------------------<br>||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0217524 ...... PASSED<br>================================================================================<br><br>Finished 864 tests with the following results:<br> 864 tests completed and passed residual checks,<br> 0 tests completed and failed residual checks,<br> 0 tests skipped because of illegal input values.<br>--------------------------------------------------------------------------------<br><br>End of Tests.<br>================================================================================<br><br>this is about 0.21 GFlops, since neither the hardware nor the HPL input data is optimized<br><br><br>If you want a full hardware stress test:<br>$ vi stress.sh<br>--------------------------------------------------------------------------------------------------------------<br>MPI_BIN=/usr/mpi/gcc/openmpi-1.4.1/bin<br>[ -f hostlist ] && rm -f hostlist<br>
for i in $(seq 1 $(cat /proc/cpuinfo | grep "^processor" | wc -l)); do<br> echo $(hostname) >> hostlist<br>done<br>${MPI_BIN}/mpirun -np $(cat hostlist|wc -l) -machinefile hostlist ./xhpl<br>--------------------------------------------------------------------------------------------------------------<br>
<br><br>If you want an install script, download the attached file below,<br>modify the first few lines for the PATH,<br>and run it like "sh hpl_install.sh"; it will install HPL automatically.<br><br><div class="imageblock center" style="text-align: center; clear: both;"><a class="extensionIcon" href="http://www.cep.kr/blog/attachment/1000224398.tgz"><img src="http://www.cep.kr/blog/resources/image/extension/unknown.gif" alt="" /> kage_hpl-2.0.tgz</a></div><br>simple procedure)<br>1. download hpl-2.0.tar.gz<br>2. download atlas3.8.3.tar.gz<br>3. download the kage_hpl-2.0.tgz attached above<br>4. make a temporary directory<br>5. copy the 3 files to that directory<br>6. modify the first few lines for the path in hpl_install.sh<br>7. run hpl_install.sh<br>8. go to the hpl/bin directory<br>9. create the input data, e.g. "./configure.sh"<br>10. run HPL, e.g. "./run.sh"<br>11. you can see the result<br><br><br>*)<br>check the CPU number when there is a problem with the number of CPUs:<br> 1. configure.sh<br> CPU=<br> 2. run.sh<br> NP=<br><br>If "CPU=" has a digit, then "NP=" must have the same number.<br>If "CPU=" has no digit, then "NP=" must have no digit.<br>You can choose an even number for "CPU=".<br><br>
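When creating the HPL input data, a common rule of thumb for the problem size — my assumption, not part of the attached script — is to pick N so that the N×N double-precision matrix (8 bytes per element) fills about 80% of total memory, rounded down to a multiple of the block size NB:

```shell
#!/bin/sh
# Rule-of-thumb HPL problem size: the N x N double-precision matrix
# (8 bytes/element) should fill ~80% of RAM; N is then aligned to NB.
hpl_n() {
    mem_bytes=$1
    nb=$2
    awk -v m="$mem_bytes" -v nb="$nb" 'BEGIN {
        n = int(sqrt(0.8 * m / 8))   # largest N fitting in 80% of memory
        n = int(n / nb) * nb         # round down to a multiple of NB
        print n
    }'
}

hpl_n $((16 * 1024 * 1024 * 1024)) 192   # e.g. 16 GiB of RAM, NB=192
```

Leave more headroom than 80% if other daemons on the node use significant memory.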
Install PBS Pro 11.0
Kage Park
http://kaget.cep.kr/blog/265
2015-10-31T15:34:20+09:00
2011-03-31T01:06:20+09:00
Altair<br>
Download license daemon & PBS file<br>
<br>
license daemon : altair_licensing_11.0.linux_x64.bin<br>
PBS : PBSPro_11.0.0-RHEL5_x86_64.tar.gz<br>
<br>
Install<br>
<br>
1. Create license : <br>
type : LM-X<br>
<br>
2. Install license daemon to license server:<br>
# sh altair_licensing_11.0.linux_x64.bin<br>
<br>
start daemon<br># chkconfig --level 35 altairlmxd on<br>
# /etc/init.d/altairlmxd start<br>
<br>
check license<br>
# ps -ef |grep lm<br>
root 3863 1 0 09:57 ? 00:00:00 /opt/pbs/licensing11.0/bin/lmx-serv-altair -b -c /opt/pbs/licensing11.0/altair-serv.cfg<br>
<br>
debug <br>
# tail -f /opt/pbs/licensing11.0/logs/<hostname>.log<br>
<br>
<br>
<br>
3. Install PBS server in front-end server<br>
required daemons : pbs_sched, pbs_server.bin, postgres<br>
<br>
# useradd altair <br>
# tar zxvf PBSPro_11.0.0-RHEL5_x86_64.tar.gz<br>
# cd PBSPro_XXXX<br>
# ./INSTALL<br>
***<br>
Execution directory? [/opt/pbs/11.0.0.103450] <enter><br>
***<br>
Home directory? [/var/spool/PBS] <enter><br>
***<br>
PBS Installation:<br>
1. Server, execution and commands <= front end server<br>
2. Execution only <= compute node<br>
3. Commands only <= Just run command node (not submit)<br>
(1|2|3)?1 <enter><br>
PBS Professional version 9.0 and later is licensed<br>
via the Altair License Manager.<br>
<br>
The Altair License Manager can be downloaded from:<br>
http://www.pbspro.com/UserArea/Software/<br>
<br>
For more information, please refer to the PBS<br>
Professional Administrator's Guide, or contact pbssupport@altair.com.<br>
<br>
Continue with the installation ([y]|n)? <enter><br>
Please enter the list of Altair License file location(s)<br>
in a colon-separated list of entries of the form<br>
<port>@<host><br>
@<host><br>
<license file path><br>
<br>
Examples:<br>
7788@fest<br>
7788@tokyo:7788@madrid:7788@rio<br>
@perikles:27000@aspasia<br>
@127.3.4.5<br>
/usr/local/altair/security/altair_lic.dat<br>
Enter License File Location(s):@pbs_license_server <enter><br>
***<br>
Switch to the new version of PBS (y/n)?y <enter><br>
***<br>
Would you like to start PBS now (y|[n])?n <enter><br>
***<br>
<br>
# vi /etc/pbs.conf<br>
-------------------------------------<br>
PBS_EXEC=/opt/pbs/default<br>
PBS_HOME=/var/spool/PBS<br>
PBS_START_SERVER=1<br>
PBS_START_MOM=0 <== change from 1 to 0<br>
PBS_START_SCHED=1<br>
PBS_SERVER=home<br>
PBS_DATA_SERVICE_USER=altair<br>
-------------------------------------<br>
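The pbs.conf above is easy to get wrong; this is my own quick sanity check (not part of the PBS distribution) that a server-side config has the server and scheduler enabled and the MoM disabled, run here against a hypothetical example file in /tmp:

```shell
#!/bin/sh
# Check a front-end server's pbs.conf: the server and scheduler
# daemons should start, but the MoM (execution daemon) should not.
check_server_conf() {
    conf=$1
    grep -q '^PBS_START_SERVER=1' "$conf" &&
    grep -q '^PBS_START_SCHED=1'  "$conf" &&
    grep -q '^PBS_START_MOM=0'    "$conf"
}

# hypothetical example file for illustration
cat > /tmp/pbs.conf.test <<'EOF'
PBS_EXEC=/opt/pbs/default
PBS_HOME=/var/spool/PBS
PBS_START_SERVER=1
PBS_START_MOM=0
PBS_START_SCHED=1
PBS_SERVER=home
EOF

check_server_conf /tmp/pbs.conf.test && echo "server pbs.conf OK"
```

Point check_server_conf at /etc/pbs.conf on the real front-end server.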
<br>
start daemon<br># chkconfig --level 35 pbs on<br>
# /etc/init.d/pbs start<br>
<br>
check log<br>
# tail -f /var/spool/PBS/server_logs/<date><br>
<br>
<br>
4. Install PBS on the compute node<br>
required daemon : pbs_mom<br>
required remote shell : default (rsh, rcp), available (ssh, scp)<br>
* using ssh as the remote shell<br>
<br>
# vi quick<br>
---------------------------------------------<br>
<enter><br>
<enter><br>
2 <enter><br>
y<br>
<server host name> <enter><br>
y <enter><br>
y <enter><br>
n <enter><br>
----------------------------------------------<br>
# ./INSTALL < quick<br>
<br>
# vi /var/spool/PBS/pbs_environment<br>
------------------------------------------------<br>
TZ=America/Chicago<br>
PATH=/bin:/usr/bin<br>
PBS_RSHCOMMAND=ssh <== add this line but not must.<br>
------------------------------------------------<br>
or<br>
------------------------------------------------<br>
TZ=America/Chicago<br>
PATH=/bin:/usr/bin<br>
------------------------------------------------<br>
<br>
<br>
# vi /etc/pbs.conf<br>
------------------------------------------------<br>
PBS_EXEC=/opt/pbs/default<br>
PBS_HOME=/var/spool/PBS<br>
PBS_START_SERVER=0<br>
PBS_START_MOM=1<br>
PBS_START_SCHED=0<br>
PBS_SERVER=<pbs server hostname><br>
PBS_SCP=/usr/bin/scp <== add this line (must)<br>
------------------------------------------------<br>
<br># chkconfig --level 35 pbs on<br>
# /etc/init.d/pbs start<br>
<br>
debug<br>
# tail -f /var/spool/PBS/mom_logs/<date><br>
<br>
<br>
5. Test<br>
<br>
$ echo "sleep 60; hostname; pwd; date" | qsub<br>
$ qstat -an<br>
$ cat STDIN.o<job id><br>
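Beyond piping a command into qsub, the same test can be submitted as a job script; a minimal sketch (the script name test_job.pbs and the select/ncpus resource request are my examples — adjust them to your site):

```shell
#!/bin/sh
# Generate a minimal PBS Pro job script matching the test above.
cat > test_job.pbs <<'EOF'
#!/bin/sh
#PBS -N hello_test
#PBS -l select=1:ncpus=1
#PBS -j oe
sleep 60
hostname
pwd
date
EOF

# submit with:  qsub test_job.pbs    (then check with: qstat -an)
```

With -j oe, stdout and stderr land in a single hello_test.o&lt;job id&gt; file.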
<br>
<br>
6. Useful commands<br>
PBS node list<br> $ pbsnodes -a<br><br>Trace Job<br>$ tracejob <job id><br>
<br>Queue state<br>$ qstat -an<br><br>Delete a job<br>$ qdel <job id><br>$ qdel -W force <job id><br><br>Show queue configuration, license information, ...<br>$ qstat -fB<br>$ qmgr -c "list server"<br><br>Add a compute node to the server<br>$ qmgr -c "create node <host name>"<br>
$ qmgr -c "create node <host name> resources_available.ncpus=2"<br>
<br>Delete compute node <br>$ qmgr -c "delete node <host name>"<br><br>change License information<br>License server<br>$ qmgr -c "set server pbs_license_info=<port1>@<host1>"<br>
$ qmgr -c "set server pbs_license_info=<port1>@<host1>:...:<port#>@<host#>"<br>File<br>$ qmgr -c "set server pbs_license_info=<path license file>"<br>
$ qmgr -c "set server pbs_license_info=<path license file1>:..:<path license file2>"<br>
unset<br>$ qmgr -c "unset server pbs_license_info"<br>
<br>Server configuration<br>$ qmgr -c "print nodes @default"<br><br>Move all jobs within a queue<br>$ <new pbs path>/qmove <queue name>@<new server host name>:15001 $(<old pbs path>/qselect -q <queue name>@<old server host name>:13001)<br><br><br><br>
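The "create node" command above can be scripted for a rack of compute nodes. This sketch only prints the qmgr commands (the node names node01..node04 are hypothetical) so they can be reviewed before being piped to sh:

```shell
#!/bin/sh
# Print 'qmgr -c "create node ..."' commands for a range of nodes.
# Review the output, then run it, e.g.:  gen_create_nodes node 1 4 | sh
gen_create_nodes() {
    prefix=$1 first=$2 last=$3
    i=$first
    while [ "$i" -le "$last" ]; do
        printf 'qmgr -c "create node %s%02d"\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

gen_create_nodes node 1 4
```

Printing first and piping to sh second keeps a typo from registering a batch of bogus nodes.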
<br>
7. Remove PBS<br>
# rpm -qa |grep pbs<br>
# rpm -e pbs-xxxx<br>
# rm -fr /var/spool/PBS<br>
# rm -f /etc/pbs.conf<br>
# rm -fr /opt/pbs/11.0XXXXX<br>
<br>
<br>
<br>
If you find a ulimit problem in a PBS queue, modify /etc/init.d/pbs on the compute node:<br>add "ulimit -l unlimited" before the pbs_mom daemon is started.<br>