Differences

This shows you the differences between two versions of the page.

--- tcr_sri002 [2020/04/26 10:33] – tmatejuk
+++ tcr_sri002 [2023/08/01 01:08] (current) – external edit 127.0.0.1
@@ Line 35: / Line 35: @@
 ==== sbatch example ====
-Use Quantum Espresso 6.5. This description assumes that the path to the binaries is ''/home/users/${USER}/soft/qe-6.5/bin''. Use the ''.sbatch'' file below to run the computation.
+Use Quantum Espresso 6.5. This description assumes that the path to the binaries is ''/home/users/${USER}/soft/qe-6.5/bin''. Use the ''qe-test.sbatch'' file below to run the computation.
+  #!/bin/bash -l
   #SBATCH --job-name="qe-test_N2_n32"
   #SBATCH --nodes=2                   # number of computing_nodes
@@ Line 47: / Line 47: @@
   WORKDIR="/home/users/${USER}/soft_tests/qe_run_`date +%s`_${RANDOM}/"
   mkdir -p ${WORKDIR}
+  cd ${WORKDIR}
   export BIN_DIR="/home/users/${USER}/soft/qe-6.5/bin"
   export PATH=${BIN_DIR}:$PATH
   export PSEUDO_DIR=${WORKDIR}
   export TMP_DIR="/tmp"
-  mkdir -p ${WORKDIR}
-  cd ${WORKDIR}
   #copy input files and pseudo files to ${WORKDIR}
   cp /home/users/${USER}/downloads/quantum_espresso_input_files/* ${WORKDIR}
@@ Line 70: / Line 70: @@
 The results below show computation time as a function of the resources used (computational scalability) for a specific computational task run with the ''pw.x'' program.
-Assigning a larger number of computing nodes does not always lead to an (efficient) reduction in computing time (wall-time of the job). To find the most appropriate number of nodes for a specific type of job, it is essential to run one's own benchmarks. In general, parallel jobs should scale to at least 70% efficiency for the sake of other TCR users. One user using twice the resources to squeeze out 10% more performance may be keeping other users from working at all.
+Assigning a larger number of computing nodes does not always lead to an (efficient) reduction in computing time (wall-time of the job). To find the most appropriate number of nodes for a specific type of job, it is essential to run one's own benchmarks. In general, parallel <color #22b14c>jobs should scale to at least 70% efficiency for the sake of other TCR users</color>. One user using twice the resources to squeeze out 10% more performance may be keeping other users from working at all.
-To automate these tests, two files were prepared (''run_qe-6.4.1_tests.sh'', ''qe-6.4.1.batch''). Tests are run with the script ''run_qe-6.4.1_tests.sh'', which prepares parameters and starts a single batch job (''qe-6.4.1.batch''). The file ''qe-6.4.1.batch'' uses [[https://slurm.schedmd.com/job_array.html|Slurm job arrays]], which turn each submitted job into 3 separate computational jobs (this is more efficient than submitting 3 separate jobs).
 The results below should be considered results of this specific computational task on this specific hardware (TCR cluster), not an overall benchmark of the Quantum Espresso software suite.
+{{ tcr:qe65_tcr_result_001.png?nolink|}}
+^nodes	^min [s] 	^avg [s] 	^median [s] 	^max [s] 	^efficiency [%] ^
+|1	|1862	|2065.75	|2126	|2149	|<color #22b14c>100.00%</color>|
+|2	|1105	|1192.75	|1157.5	|1351	|<color #22b14c>84.25%</color>|
+|3	|763	|770	|768.5	|780	|<color #22b14c>81.35%</color>|
+|4	|798	|1026.25	|868.5	|1570	|58.33%|
+|5	|571	|589.75	|574	|640	|65.22%|
+|6	|488	|536	|525.5	|605	|63.59%|
+|7	|372	|502.25	|414.5	|808	|<color #22b14c>71.51%</color>|
+|8	|375	|489.75	|408.5	|767	|62.07%|
+|9	|324	|403.75	|346.5	|598	|63.85%|
+|10	|327	|456.25	|465.5	|567	|56.94%|
+|12	|285	|429.4	|375	|609	|54.44%|
+|16	|228	|274.25	|281.5	|306	|51.04%|
+*) values (min, avg, median, max, efficiency) do not include failed runs \\
+*) efficiency is computed as ''t1 / ( nodes * tn )'' (where t1 is the minimum computation time on one node and tn is the minimum computation time on N nodes) \\
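
The efficiency column can be reproduced from the min times in the table. The following is a minimal sketch, not part of the original page: the function names ''efficiency'' and ''meets_guideline'' are hypothetical helpers introduced here for illustration, and the sketch assumes ''awk'' is available on the login node. It applies the formula ''t1 / ( nodes * tn )'' and checks the result against the 70% guideline mentioned above.

```shell
#!/bin/bash
# Hypothetical helper (not from the original page):
# parallel efficiency in percent, computed as 100 * t1 / (nodes * tn).
#   $1 = t1    : min wall-time on one node [s]
#   $2 = nodes : number of nodes used
#   $3 = tn    : min wall-time on that number of nodes [s]
efficiency() {
  local t1="$1" nodes="$2" tn="$3"
  awk -v t1="$t1" -v n="$nodes" -v tn="$tn" \
    'BEGIN { printf "%.2f\n", 100 * t1 / (n * tn) }'
}

# Hypothetical helper: returns success (exit status 0) only if the
# efficiency reaches the 70% guideline.
meets_guideline() {
  local eff
  eff="$(efficiency "$1" "$2" "$3")"
  awk -v e="$eff" 'BEGIN { exit !(e >= 70) }'
}

# Values taken from the table above (t1 = 1862 s on one node).
efficiency 1862 2 1105   # prints 84.25
efficiency 1862 4 798    # prints 58.33

if meets_guideline 1862 7 372; then
  echo "7 nodes: meets the 70% guideline"
fi
```

Running one's own benchmark on a few node counts and feeding the min wall-times into such a function is probably the quickest way to see where a given job stops scaling acceptably.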
