Page 1 of 1

computing partial charge of a molecule

Posted: Mon Dec 19, 2011 3:41 pm
by ericjang
Hello,

I am trying to compute the partial charge of each atom in a molecule. I found some source code online but it's quite convoluted and perhaps I should try to figure out the algorithm before I go about trying to debug. I am trying to understand the OpenBabel package but I am unable to find the Gasteiger charge implementation.

Suppose I have a chemical structure and I know which atoms are connected together, what kinds of element they are (N, C, O, H, etc), and I know what kinds of bonds (single, double, triple). How do I compute the partial charge, say, the Gasteiger algorithm? I have a table of VDW radii, if that helps too.

Thanks!

Re: computing partial charge of a molecule

Posted: Mon Dec 19, 2011 10:03 pm
by ericjang
ah, I found the algorithm within openbabel, it is located at: http://fossies.org/dox/openbabel-2.3.1/ ... ource.html
A similar algorithm exists for the Chemistry Development Kit implemented in Java (http://cdk.sourcearchive.com/documentat ... ource.html), but since I am adhering to OpenBabel I want to make sure that I am adhering to the exact same values (in case the algorithms differ slightly).

I suppose if someone could explain to me step by step in layman's terms how the Gasteiger algorithm computes partial charges, that would be great! :D

Re: computing partial charge of a molecule

Posted: Tue Dec 20, 2011 9:00 am
by deleted-71882
Hello ericjang,

I am most definitely not an expert on "Gasteiger-Marsili partial charges" but I can offer a few tips on how you can figure it out.

I downloaded the OpenBabel executable for Windows and found that indeed the Gasteiger charge code was mentioned as a plugin.

To get the code, I downloaded the source code from http://openbabel.org/wiki/Install. I strongly recommend that you use this first-hand source instead of a version you find on some other web site. The second-hand code may have been modified for better or worse, and it's likely to refer to still other code you have to search for. By getting the source directly from OpenBable, you minimize such problems.

A search of the files included in the source download revealed many, many files with the phrase "Gasteiger" in them. One file I found in the "src" directory was, indeed, the root function for computing the partial charges. It referred to another function called "AssignPartialCharges" and so forth. This process will allow you to have the complete set of sources needed to run the partial charge computation.

I suggest that you be sure you can run the code as-is before you try any changes. Then do something simple like add a line that says,

cout << "This is jang's code.";

and recompile and run to make sure you can rebuild the executable. Then you're ready to try real changes.

Study the comments in the code carefully to find any references that may help you understand the algorithm. I also see that there's a book, Chemoinformatics: A Textbook, by Johann Gasteiger (Editor) and Thomas Engel (Editor) that looks like it has a whole Ph.D's worth of stuff in it.

Also search Google Scholar for "Gasteiger partial charge." It turns up many of Gasteiger's original research papers.

Good luck and happy hunting. Let me know if you make some progress. WW

Re: computing partial charge of a molecule

Posted: Tue Dec 20, 2011 12:55 pm
by ericjang
Hi Wendellwiggins,

Thanks for your reply. I did download the src and step through it with a debugger, but I think that my understanding would be a lot stronger if someone could explain why and how the Gasteiger charges are computed in theory first. I ask this because I am implementing my own chemical model so I will not be able to use Babel's molecule object class.
Once I understand the theoretical algorithm, it will be much easier for me to understand the semantic meaning of the code and decide which parts I need to port over (for example, I am not using gradients and so there are some parts of molchrg.cpp that I will not need).

The only good explanation that I could find of the Gasteiger partial charge calculation is in the textbook: "Essentials of Computational Chemistry: Theories and Models", in Section 9.1.3.1 [Class I charges].

for an atom in a molecule:
partial charge = q = Z - Q

apparently Z - Q is non-trivial to calculate, so the Gasteiger-Marsilli model is as follows:
Image

Where X is electronegativity and a, b, c have to be optimized. But then the document goes on to describe that the "partial equalization of orbital electronegativity then proceeds as a convergent, iterative process". The equation that follows is rather confusing to me:
Image

where f is a damping factor raised to the nth power (n = number of iterations), approximated by about 0.5 but actually computed by

Image

^ What does this function mean?

So the new electronic populations are given by
Image

And presumably subtracting from Z gives us the partial charge Q. This is where I wanted to double check my logic, because it seems like the iterative step for any atom k is basically subtracting the sum of deltaQ for all k' atoms with higher electronegativity than k, and then adding the sum of deltaQ for all atoms that have less electronegativity than K.

I see some for-loops reminiscent of this in the molchrg.cpp, but I think I can understand better if I see the calculation worked out on paper (easier to know what I am doing too).

For example, suppose I have an ammonia molecule, NH3. Could you compute the gasteiger charge for me?

Thanks!

Re: computing partial charge of a molecule

Posted: Tue Dec 20, 2011 12:59 pm
by ericjang
furthermore, it seems like electronegativity can be -q or q (2 zeros of the quadratic formula). How do I know which value to use?

Re: computing partial charge of a molecule

Posted: Tue Dec 20, 2011 5:05 pm
by deleted-71882
ericjang,

I have to leave you to your own resources now. I know only a little general knowledge about computing molecular configurations, and I don't have the time to learn more now.

Be sure to research what's already been done thoroughly. It's bad in terms of efficiency and personal pride to find that you've spent many hours figuring out something that was already known.

I hope you have access to the research papers you can find on Google Scholar, and I hope you get copies of them for your reading.

Good luck, WW

Re: computing partial charge of a molecule

Posted: Wed Dec 21, 2011 2:09 am
by ericjang
Hello *anybody out there*:

I've made significant progress! I understand the paper a lot better now, but if someone could send me a private message with the original Gasteiger paper that would be wonderful. I find that the CDK implementation of Gasteiger charges is much simpler to understand, except that annoyingly enough, it sticks all of the computations into a single one-dimensional array and improperly names deltaQ as q. Could be worse though...

I have implemented the algorithm in pseudocode in case anybody else gets stuck on this: see GasteigerMarsiliPartialCharges.java to understand where variables come from. Whew!

//given a collection of atoms (e.g. molecule, two molecules, three molecules, two atoms, etc)

DEOC_HYDROGEN = 20.02;
MX_DAMP = 0.5;
MX_ITERATIONS = 20;
STEP_SIZE = 5;//amount of array space reserved for each atom's a,b,c,deoc, chi, q values. Slightly crude, but fast. Also, it would be smarter to make the STEP_SIZE = 6 (after all, 6 values), and then instead of [STEP_SIZE * j + j + 1] we just do [STEP_SIZE * j + 1]. I think that a multidimensional array wouldn't hurt either to make things simpler to understand. But I digress.

assignGasteigerMarsiliSigmaPartialCharges(molecule) {
- alpha = 1;
- set all the partial charges in every atom to 0;
- set up a,b,c values for every atom based on neighborhood interactions
label out: {
loop for MX_ITERATIONS {
- scale alpha by MX_DAMP;
loop across all the atoms {
- q = what value is q assigned???
- difference = abs(q_old)-abs(q);
- q_old = q;
- chi = a + b*q+c*q^2; (where chi, a, b, c correspond to this particular atom).
}
- if q stayed the same across all of the atoms (isDifferent checking), break out. Here is where I got confused as to where exactly 'out:' ends in the source code, so please let me know if my own placement here is wrong. I assumed that this is just a test for convergence so that we don't necessarily have to complete ALL of the MX_ITERATIONS.

//time to distribute the charges!
- deltaQ = (chi of atom 1 - chi of atom 2)/deoc
- decrement partial charge q for atom 1 by deltaQ*alpha
- increment partial charge q for atom 2 by deltaQ*alpha

}
}//end loop for MX_iterations
}//end 'out:' label here?


loop across bonds {
- atom 1 and atom 2 in bond
if (chi of atom 1 >= chi of atom 2) {
if (atom 2 is Hydrogen) {
deoc of atom 2 = DEOC_HYDROGEN = 20.02
}
} else {
//atom 2 is more electronegative
if (atom 1 is hydrogen) {
deoc of atom 1 = DEOC_HYDROGEN = 20.02
}
}



loop across all atoms in molecule {
- partial charge of atom = assign the values stored in the array to the original molecule object or Atom Container, whatever atom system floats your boat!
}

}

The function assignGasteigerSigmaMarsiliFactors() is pretty easy to understand. Simply assign accordign to rules. OpenBabel basically has the same ruleset, except that hybridization numbers are pre-calculated by using fancy SMARTS string matching.