PCG Tutorial Basis
05
This benchmark of basis sets was done because calculation time depends strongly on the basis set. There are many different sets that describe chemical behaviour, some approximating the real functions well and some giving only an acceptable description. Small systems can be handled with large basis sets, but to study a molecule of more typical size we have to make some compromises. To see which smaller basis sets give descriptions comparable to the bigger ones at lower computational cost, some benchmarks were run on typical molecules. We will also see when it is necessary to use bigger sets. This may help to get a feeling for how calculation time rises with large basis sets. M. Checinski
This chapter collects the benchmarks I made to decide which basis sets and correlation corrections are useful (in quality and time consumption) for typical laboratory questions, and which should be used for educational purposes.
To make a general statement about a good basis set / correlation correction for a smaller computer or cluster, it is useful to compare different chemical environments. We will study typical organic and inorganic molecules to find out which basis sets are not advisable for some structures.
Readers who just want to see the result should jump to the end of this chapter, where there is a kind of summary.
First we need an idea of how basis sets influence computation time. Because so many factors influence computation time, it is impossible to say, for example, that one set needs two and a half times more computation time than another. To see the tendency, I made a comparison of n-alkanes that shows the influence of each additional CH2 group.
[Figure: computation time for the n-alkanes C2 to C8, one curve per basis set; vertical axis 0 to 800.]
Here we see that different sets have different slopes. There are huge differences in computation time for a C8 alkane computed with the MINI set and with the cc-pVTZ set (here 1:100).
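One rough way to see why the slopes differ is to count the basis functions per molecule. The sketch below assumes the usual contraction sizes (minimal set: 5 functions on C and 1 on H; 3-21G: 9 and 2; cc-pVTZ with spherical d and f functions: 30 and 14). These counts are standard textbook values and are not taken from the benchmark output; the helper name is hypothetical.

```python
# Basis functions per atom (assumed standard contraction sizes,
# spherical d/f functions for cc-pVTZ).
FUNCS_PER_ATOM = {
    "MINI":    {"C": 5,  "H": 1},
    "3-21G":   {"C": 9,  "H": 2},
    "cc-pVTZ": {"C": 30, "H": 14},
}


def alkane_basis_functions(n_carbon, basis):
    """Number of basis functions for the n-alkane C_n H_(2n+2)."""
    per_atom = FUNCS_PER_ATOM[basis]
    n_hydrogen = 2 * n_carbon + 2
    return n_carbon * per_atom["C"] + n_hydrogen * per_atom["H"]


for n in range(2, 9):
    counts = {b: alkane_basis_functions(n, b) for b in FUNCS_PER_ATOM}
    print(f"C{n}: {counts}")
```

Under these assumptions each additional CH2 group adds 7, 13 and 58 functions respectively, so the cc-pVTZ curve has to climb much faster than the MINI curve.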
[Figure: computation time for the n-alkanes C2 to C8, detail view; vertical axis -50 to 250.]
Here we see that a pd-polarized split-valence set needs more computation time than an unpolarized triple-valence set.
Benchmarks of Basis/Correlation Correction
To compare the basis sets qualitatively we need some properties that we can compare. As we have seen, the absolute value of the total energy is not the only important information; the change in total energy on stretching a bond can give a good hint about the quality.
Another property is the dipole moment, which depends on the bond partners and the bond length. But we can only compare computed dipole moments with real ones if the molecule was measured in the gas phase.
PC-Gamess gives us thermodynamic properties, too. Here again we should only compare with molecules measured under comparable (gas-phase) conditions.
First we will discuss the behaviour of the simplest alkane.
In previous chapters we discussed the differences between RHF and UHF, and HF in general. We discussed the cheap Møller-Plesset correlation correction and the popular hybrid DFT calculation (especially Becke3-Lee-Yang-Parr, B3LYP). Now we will try to compare them qualitatively.
For that we study the energy change on stretching a C-H bond. We compare how HF, HF/MP2 and B3LYP influence the description of this system. As basis sets we use the small split-valence set 3-21 and the huge triple-valence set aug-cc-pVTZ with additional diffuse and polarization functions.
After that we will compare how the popular sets describe bond-stretching of many different molecules at UHF/B3LYP level.
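The geometries for such a bond-stretch scan can be generated by moving one atom along the bond vector. The following is only a minimal sketch: the methane coordinates are an idealized tetrahedron with an assumed C-H length of 1.09 Å (not the optimized geometry used for the benchmarks), the function name is hypothetical, and the scan range of -0.30 to 1.30 Å in 0.05 Å steps is read off the plot axes and the text.

```python
import math


def stretch_bond(coords, fixed, moved, delta):
    """Return a copy of coords in which atom `moved` is shifted along the
    fixed->moved bond vector so that the bond becomes `delta` longer."""
    ax, ay, az = coords[fixed]
    bx, by, bz = coords[moved]
    vx, vy, vz = bx - ax, by - ay, bz - az
    length = math.sqrt(vx * vx + vy * vy + vz * vz)
    scale = (length + delta) / length
    new_coords = list(coords)
    new_coords[moved] = (ax + vx * scale, ay + vy * scale, az + vz * scale)
    return new_coords


# Idealized methane, C-H = 1.09 Angstrom (assumption for illustration).
d = 1.09 / math.sqrt(3)
methane = [
    (0.0, 0.0, 0.0),  # C
    (d, d, d),        # H, the atom that is moved
    (d, -d, -d),      # H
    (-d, d, -d),      # H
    (-d, -d, d),      # H
]

# Stretch values from -0.30 to 1.30 Angstrom in 0.05 Angstrom steps.
deltas = [round(-0.30 + 0.05 * i, 2) for i in range(33)]
scan = [stretch_bond(methane, 0, 1, delta) for delta in deltas]

for delta, geometry in zip(deltas[:3], scan[:3]):
    print(delta, round(math.dist(geometry[0], geometry[1]), 3))
```

Each geometry in `scan` would then be written into its own single-point input; only the moved hydrogen changes, all other atoms stay where they are.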
[Figure: energy versus C-H bond stretch; horizontal axis -0,30 to 1,30, vertical axis 0,00 to 0,20.]
What we can see is that the description at HF level is significantly different. On the other hand, all calculations agree that the equilibrium bond distance (at 0.05 Å stepping) has the lowest relative energy. We will see later that this is not always the case, but this system is easy to describe. And we should not forget that all basis sets were fitted on such common molecules to give good functions.
[Figure: detail view; vertical axis 0,085 to 0,105.]
Here we see that we have three groups (HF, HF/MP2 and DFT). For this system the MP2 correction is comparable to the computationally heavier B3LYP calculation.
[Figure: detail view; vertical axis 0,085 to 0,105.]
But now the MP2 correction is not as good as in the previous case.
[Figure: detail view; vertical axis 0,165 to 0,185.]
Here we see that the ACCT set is much better than 3-21, but this is nothing unexpected =) if we know that the ACCT set offers three functions per orbital that can be mixed, plus additional polarization and diffuse functions, to make the resulting orbital fit this chemical environment better. Don't forget that this is a relative description; absolute values for ACCD are much lower than for 3-21.
Here we see a big difference between the calculations with ACCT and 3-21. At UHF/B3LYPx level, where the CPU utilizations are most comparable, we see a difference in computation time of 70:1! OK, 12 minutes is not so long, but this is just a very small molecule. For usual molecules it is a huge difference.
At this point I have to say what B3Lx means. As mentioned, PC-Gamess gives us great flexibility in controlling calculations. Some less accurate settings make the description noticeably worse, while others have little effect on the qualitative description but a big effect on computational effort. I tested some settings with different molecules to check how to save computation time with a small loss of accuracy. In another chapter I will summarize these settings. The only difference between B3Lx and B3L is a smaller value of NRAD in the $DFT group. This is, by the way, one reason why I made these benchmarks: to check which sets and settings give the best compromise between accuracy and time consumption.
[Figure: total energy of methane versus bond stretch; horizontal axis -0,30 to 1,30, vertical axis -40,40 to -38,90.]
To show the absolute differences, here are some impressions for methane. There are huge differences between STO2/MINI and a multiple-valence set. Differences between the bigger sets are on another scale.
[Figure: total energy of methane versus bond stretch, detail view; horizontal axis -0,30 to 1,30, vertical axis -40,30 to -39,80.]
Here we see the difference between HF, HF-MP2 and HF-B3LYP. The basis is 3-21. There are no differences between RHF and UHF or between RHF/B3L and UHF/B3L, and only small differences between RHF/B3L, UHF/B3L and UHF/B3Lx.
[Figure: energy relative to ground state E versus C-C bond stretch of ethane; overview from -0,10 to 1,30 and a second panel from -0,30 to 1,30.]
First we see that the STO-2G and MINI sets are significantly different from the others. The ethane C-C bond is a very simple system, and if these sets have such quality problems, we should use them only for didactic purposes or for fast geometry pre-optimization.
[Figure: detail view at small stretch. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, ACCD, ACCT.]
For small stretch lengths we see that sets without polarization functions do not seem to be very accurate. Another piece of information is that for the sets from STO-2 to DZV the relative energy minimum lies 0.05 Å away from the equilibrium geometry of the PM3 optimization.
[Figure: detail view; vertical axis 0,052 to 0,062. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, ACCD, ACCT.]
At a stretch of around 0.5 Å we get another picture. The order of the sets is somewhat mixed. ACCT, ACCD, 6-311, TZV and DZV form a close group. 6-31 and 3-21 are not far away. The MIDI set lies a bit further away and seems to describe this environment worse than 3-21.
[Figure: detail view; vertical axis 0,128 to 0,138. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, ACCD, ACCT.]
At a stretch of about 1 Å we can compare this situation with the interaction of two methyl radicals. Here we see that the huge sets give the lowest-energy configuration. For such a special environment it is not unusual that a set with so many additional functions, including polarization and diffuse functions, can describe the situation better. Not far away are the split-valence sets 6-31 and 3-21.
The time factor in computational chemistry should not be underestimated. For this computation we can say that MINI and STO-2 are qualitatively different. The MIDI and 3-21 sets are qualitatively comparable to the bigger sets. For bigger molecules or a fast preview (or a slow CPU) they are a good compromise. The time relationship shown is not exactly transferable to other calculations, because so many parameters influence the calculations, but the tendency is acceptable.
Most CPU utilizations were about 100% per CPU, but the bigger the molecule and the basis set, the larger the number of stored integrals. If they need more than the given RAM size, they have to be stored on the hard disk, with the consequence that the CPU utilization breaks down significantly and the computation time rises a lot, as seen for the ACCT calculation. To give a feeling for how important this can be, we can compare the same calculation with different amounts of RAM (MW stands for $SYSTEM MWORDS=xxx $END).
ACCT, 2 CPU, MW=180 and MW=380 (columns: Time, Util): 17524,6 38,77%
~ x times faster
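To get a feeling for when the stored integrals outgrow the memory, one can compare an upper bound on the number of unique two-electron integrals with the memory requested through MWORDS. This is only a rough sketch under stated assumptions: no integral screening, 8 bytes per stored integral without index labels, and MWORDS counted as 10^6 words of 8 bytes. The basis-set sizes in the loop are arbitrary illustrations and the function names are hypothetical.

```python
def unique_two_electron_integrals(n_basis):
    """Upper bound on the symmetry-unique two-electron integrals
    (8-fold permutational symmetry, no screening)."""
    pairs = n_basis * (n_basis + 1) // 2
    return pairs * (pairs + 1) // 2


def mwords_to_bytes(mwords, bytes_per_word=8):
    """Memory for a given MWORDS value, assuming 10**6 words of 8 bytes."""
    return mwords * 1_000_000 * bytes_per_word


for n_basis in (50, 100, 500):
    n_integrals = unique_two_electron_integrals(n_basis)
    print(f"{n_basis:4d} functions: {n_integrals:>14,d} integrals, "
          f"{n_integrals * 8 / 1e9:8.3f} GB")

for mwords in (180, 380):
    print(f"MWORDS={mwords}: {mwords_to_bytes(mwords) / 1e9:.2f} GB")
```

The integral count grows roughly with the fourth power of the number of basis functions, so a few hundred functions can already exceed a memory of a few gigabytes, which fits the drop in CPU utilization described above.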
[Figure: energy relative to ground state E versus bond stretch of acrolein; overview from -0,10 to 1,30 and a second panel from -0,30 to 1,30. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, CCD, CCT, ACCD, ACCT.]
First we see, again, that the STO-2G and MINI sets are significantly different from the others. With such quality problems, we should use them only for didactic purposes or for fast geometry pre-optimization (organic molecules only).
[Figure: detail view at small stretch. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, CCD, CCT, ACCD, ACCT.]
At bond lengths near the equilibrium geometry of PM3-optimized acrolein we see that many sets do not have their minima there.
[Figure: detail view; vertical axis 0,052 to 0,062. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, CCD, CCT, ACCD, ACCT.]
Here we see the known groups. For a better interpretation we should look at the total energies given further below.
[Figure: detail view; vertical axis 0,113 to 0,118. Basis sets shown: STO-2, MINI, MIDI, 3N21, DZV-pd, 6N31-2pd, TZV-2pd, 6N311-3p2d, CCD, CCT, ACCD, ACCT.]
The same situation holds at long distances. The triple-valence sets (TZV, 6-311, CCT) and some split-valence sets (6-31, DZV, ACCD) run parallel. STO-2G, MINI and MIDI are far away or, like 3-21, behave qualitatively differently. For a better comparison we should look at the total energy at a stretch of 1 Å.
Basis set     E at 1 Å
MIDI          -190,67140
3N21          -190,73610
DZV-pd        -191,83155
6N31-2pd      -191,80680
TZV-2pd       -191,87138
6N311-3p2d    -191,86636
CCD           -191,81550
CCT           -191,87677
ACCD          -191,83076
ACCT          -191,88065
Here we see again how bigger sets raise computation time and what additional effect CPU utilization can have on wall-clock time. If we compare the energies of triple- and split-valence sets we see a small difference. This is not unusual, because the bigger set applies to both inner and outer orbitals. If the triple-valence set gives better inner orbitals, we will always find a lower energy, even if the outer, chemically relevant orbitals are like those from a split-valence set.
With this study of acrolein we can draw a first conclusion. STO-2G, MINI and MIDI are good for fast calculations; we will see later that even these sets describe, for example, the oxidation of ethane with peroxyacetic acid correctly. But for more serious research we should use at least a 6-31(pd) set.
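The total energies of acrolein at a stretch of 1 Å listed above are easier to compare as gaps to the largest set. The short sketch below copies the tabulated values (decimal commas converted to points) and prints each set's energy relative to ACCT; taking ACCT as the reference is a choice made here for illustration, and the units are those of the table (presumably Hartree).

```python
# Total energies at a stretch of 1 Angstrom, copied from the table above.
E_AT_1A = {
    "MIDI": -190.67140,
    "3N21": -190.73610,
    "DZV-pd": -191.83155,
    "6N31-2pd": -191.80680,
    "TZV-2pd": -191.87138,
    "6N311-3p2d": -191.86636,
    "CCD": -191.81550,
    "CCT": -191.87677,
    "ACCD": -191.83076,
    "ACCT": -191.88065,
}

reference = E_AT_1A["ACCT"]  # largest set, used as reference (assumption)
gaps = {name: energy - reference for name, energy in E_AT_1A.items()}

# Print the sets ordered from the lowest to the highest total energy.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:11s} {gap:8.5f}")
```

The triple-valence sets end up within about 0.015 of ACCT, the polarized split-valence sets within about 0.05 to 0.075, and MIDI and 3N21 more than 1.1 above it.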
[Figure: energy relative to ground state E versus bond stretch; overview from -0,10 to 1,30, a second panel from -0,30 to 1,30, and two detail views. Basis sets shown: STO-2, MINI, MIDI, 3N21, 6N31, 6N31-pd, DZV-pd, TZV-2p2d, 6N311-2p2d, CCD, CCT, ACCD, ACCT.]
Here is the corresponding data. The CPU utilization is comparable, so we can now compare computation times better. For example, we get a much better description with the aug-cc-pVTZ set than with 3-21, but it took 200 times more computation time.
Basis set     E at 1 Å
MIDI          -574,3811
3N21          -574,4115
DZV-pd        -577,2856
6N31-pd       -577,2496
TZV-2pd       -577,3339
6N311-3p2d    -577,3258
CCD           -577,2884
CCT           -577,3425
ACCD          -577,3028
ACCT          -577,3464
Now let's take a look at a classical inorganic molecule.
[Figure: (OC)3Ni-CO stretch of Ni(CO)4; energy relative to ground state E versus bond stretch, overview from -0,10 to 1,30 and a second panel from -0,30 to 1,30. Basis sets shown: MINI, MIDI, 3N21, 6N31, 6N31-dp, 6N31-2d2p, TZV, TZV-dp, TZV-2d2p, Mhs-tm, Mhs-ptm, Cct.]
The structure was taken from a TZV-dp UHF/B3Lx optimization. Here we see the problem of today's basis sets in inorganic chemistry. So many orbitals change in the metal sphere that we need a huge choice of functions for every orbital to find a good approximation of the real one.
[Figure: detail view at small stretch. Basis sets shown: MINI, MIDI, 3N21, 6N31, 6N31-dp, 6N31-2d2p, TZV, TZV-dp, TZV-2d2p, Mhs-tm, Mhs-ptm, Cct.]
When we examine the effect of polarization functions, we see differences even at small bond stretches.
[Figure: detail view; vertical axis 0,028 to 0,038. Basis sets shown: MINI, MIDI, 3N21, 6N31, 6N31-dp, 6N31-2d2p, TZV, TZV-dp, TZV-2d2p, Mhs-tm, Mhs-ptm, Cct.]
We also see that splitting the polarization functions into a second set has a smaller effect than adding the first set. But if we compare the computational effort, each step roughly doubles the CPU time.
Basis set     Time      Util
6N31          225,7     198,48%
6N31-dp       582,6     197,18%
6N31-2d2p     1382,7    197,71%
TZV           739,2     197,85%
TZV-dp        1314,1    197,73%
TZV-2d2p      2462,6    195,55%
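The "kind of doubling" can be checked directly from these timings. The sketch below copies the times from the table (decimal commas converted to points; the time unit is not stated in the source) and computes the ratio for each added set of polarization functions.

```python
# Times copied from the table above (unit as in the source, not stated).
TIMES = {
    "6N31": 225.7,
    "6N31-dp": 582.6,
    "6N31-2d2p": 1382.7,
    "TZV": 739.2,
    "TZV-dp": 1314.1,
    "TZV-2d2p": 2462.6,
}

# Each step adds one set of polarization functions.
STEPS = [
    ("6N31", "6N31-dp"),
    ("6N31-dp", "6N31-2d2p"),
    ("TZV", "TZV-dp"),
    ("TZV-dp", "TZV-2d2p"),
]

ratios = {(small, big): TIMES[big] / TIMES[small] for small, big in STEPS}
for (small, big), ratio in ratios.items():
    print(f"{small:10s} -> {big:10s} x{ratio:.2f}")
```

The four ratios lie between about 1.8 and 2.6, so every additional set of polarization functions costs roughly a factor of two here.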
In the previous calculations we saw that, except for STO-2 and MINI, most basis sets are acceptably good for H, C, O calculations. For research we should use at least 6-31(pd) there. With Ni(CO)4 we see that even a good split-valence set with polarization functions has problems. So for calculations involving transition metals one should use at least a triple-valence set.
We know that calculation time depends strongly on the chosen basis set, so we should look for some compromises. One strategy is to use hybrid basis sets, with huge sets for the transition metals and smaller sets for the elements of the first two rows. On the other hand, this creates a kind of artefact, because different sets have different strategies for describing orbitals, and a hybrid gives no consistent description. But everyone can decide what accuracy is needed. For didactic or private use 6-31 is acceptable, but for research one should use at least a triple-valence set.
Here is another benchmark of a complex: Cl-Mn(CO)5, stretching the Cl-Mn bond.
[Figures: Cl-Mn(CO)5; energy relative to ground state E versus Cl-Mn bond stretch, overview from -0,10 to 0,60, a second panel from -0,15 to 0,85, and two detail views.]
Here is a benchmark of Ni(CO)4 again, with additional basis sets and new hardware.
[Figures: (OC)3Ni-CO stretch of Ni(CO)4; energy relative to ground state E versus bond stretch, overview from -0,10 to 1,30, a second panel from -0,15 to 0,85, and two detail views.]
Basis set     Energy Cis   Energy Trans   Diff       Dipol Cis   Dipol Trans   Time Cis   Time Trans
STO-2         -150,7977    -150,7994      0,001680   0,164597    0,000196      1,8        1,8
MINI          -156,2157    -156,2168      0,001119   0,229661    0,000370      2,3        2,2
MIDI          -156,3047    -156,3067      0,002003   0,179939    0,000298      4,1        3,5
3N21          -156,3717    -156,3736      0,001948   0,202529    0,000283      4,0        3,8
6N31          -157,1904    -157,1925      0,002071   0,195935    0,000281      5,6        5,0
6N31-dp       -157,2370    -157,2392      0,002135   0,197658    0,000254      13,5       11,8
DZV-dp        -157,2521    -157,2539      0,001797   0,228780    0,000283      16,3       14,5
6N311-2p2d    -157,2812    -157,2833      0,002122   0,212560    0,000245      40,5       38,0
TZV-2p2d      -157,2892    -157,2912      0,002040   0,249976    0,000247      49,9       43,5
ACCT          -157,2928    -157,2949      0,002022   0,262794    0,000252      1024,6     1038,8
Here we see the same tendency: the STO-2 and MINI sets are significantly different from the split- or triple-zeta sets.
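The cis/trans energy differences in the table above can be recomputed from the total energies and converted to a more familiar unit. This is a small sketch that assumes the energies are given in Hartree (the source does not state the unit) and uses the usual conversion factor of about 2625.5 kJ/mol per Hartree; the energies are copied from the table with decimal commas converted to points.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996  # assumes the tabulated energies are in Hartree

# (basis set, energy cis, energy trans), copied from the table above.
ROWS = [
    ("STO-2", -150.7977, -150.7994),
    ("MINI", -156.2157, -156.2168),
    ("MIDI", -156.3047, -156.3067),
    ("3N21", -156.3717, -156.3736),
    ("6N31", -157.1904, -157.1925),
    ("6N31-dp", -157.2370, -157.2392),
    ("DZV-dp", -157.2521, -157.2539),
    ("6N311-2p2d", -157.2812, -157.2833),
    ("TZV-2p2d", -157.2892, -157.2912),
    ("ACCT", -157.2928, -157.2949),
]

# Positive difference means the cis form lies above the trans form.
diffs = {basis: e_cis - e_trans for basis, e_cis, e_trans in ROWS}
for basis, diff in diffs.items():
    print(f"{basis:11s} {diff:8.4f}  {diff * HARTREE_TO_KJ_PER_MOL:6.2f} kJ/mol")
```

Every set places the trans form lower, by about 0.0011 to 0.0022 (roughly 3 to 6 kJ/mol under the Hartree assumption); MINI gives the smallest difference.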
Here we see again that there is some inconsistency in the description of the electron density. In some cases the nitrogen of the nitrile group carries a partial negative charge, in others a partial positive one. We would expect a partial negative charge, so this is a good example that bigger sets do not automatically mean better or more realistic descriptions.
... to be continued
- more benchmarks of metal-organic molecules
- benchmarks of excited states
- benchmarks of different hybrid basis sets
This document is freely available. It may be used for private or educational purposes. It must not be used for commercial purposes without the agreement of the author. It is the intellectual property of Marek Pawel Checinski.
http://www.catalysis.de/ http://www.chemie.hu-berlin.de/ mail: marek.checinski catalysis.de