"Chemical space" is an abstract, multidimensional, discrete space whose coordinates distinguish every possible unique molecule.

There are many ways one could construct a model of chemical space (and some models give very different answers, especially depending on definitions). The easiest (for me) to imagine is a series of nested discrete, multidimensional spaces, but there are other ways (presumably) with either less nesting, or one single space that contains all of the possible molecules.

For a nested case, imagine a multidimensional space where each axis represents one of the chemical elements (one could separate out all the isotopes as well, but for now I will only consider atomic number). In this case, all the points within the space represent the range of chemical formulae, but this is very far from mapping out the entirety of chemical space!!
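To make "formula space" concrete, here is a minimal sketch in Python. The truncated element table and the name `formula_point` are my own illustrative assumptions, not a standard API. It shows that two different molecules with the same formula land on exactly the same point, which is why formula space alone is not enough.

```python
# Hypothetical, truncated element table: symbol -> atomic number.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}

def formula_point(counts, n_axes=8):
    """Return formula-space coordinates: one axis per atomic number 1..n_axes."""
    point = [0] * n_axes
    for symbol, n in counts.items():
        point[ATOMIC_NUMBER[symbol] - 1] = n
    return tuple(point)

# Ethanol and dimethyl ether are different molecules with the formula C2H6O.
ethanol = formula_point({"C": 2, "H": 6, "O": 1})
dimethyl_ether = formula_point({"C": 2, "H": 6, "O": 1})

print(ethanol)                    # (6, 0, 0, 0, 0, 2, 0, 1)
print(ethanol == dimethyl_ether)  # True: two isomers, one point
```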

At each point of this "formula space" there are likely to be several unique arrangements (isomers) of the set of atoms defined by the coordinates in "formula space." Therefore, each point in "formula space" corresponds to a multidimensional manifold, where the number of dimensions is determined by that set of atoms. The simplest and broadest (upper limit) way to define the dimensionality of the manifolds is: each (spherical) atom has 3 (translational) degrees of freedom (in our own 3-dimensional space), but 3 of these only move the center of the molecule as a whole without changing the arrangement, so for N atoms there are 3N – 3 degrees of freedom (dimensions). (Discarding overall rotations as well would give the familiar 3N – 6, or 3N – 5 for linear molecules.)
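As a quick sanity check on the counting: each atom contributes 3 coordinates, and the 3 that only translate the molecule as a whole are subtracted, giving 3N - 3. The sketch below (the function name and the optional rotation handling are my own additions) just does that arithmetic.

```python
def manifold_dimensions(n_atoms, remove_rotation=False, linear=False):
    """Upper-bound dimensionality of the isomer manifold for n_atoms point-like atoms."""
    if n_atoms < 1:
        raise ValueError("need at least one atom")
    dims = 3 * n_atoms - 3  # 3 coordinates per atom, minus overall translation
    if remove_rotation and n_atoms > 1:
        # Linear arrangements (including any two-atom molecule) have only
        # 2 overall rotations; everything else has 3.
        dims -= 2 if (linear or n_atoms == 2) else 3
    return dims

print(manifold_dimensions(5))                        # methane, CH4: 12
print(manifold_dimensions(5, remove_rotation=True))  # 9
```

Even for methane that is a 12-dimensional manifold sitting at a single point of formula space.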

I think this system is greatly lacking. It captures all cases, and represents an upper limit, but it is enormously larger than necessary, since most of its points are geometries that could not actually be stable molecules. Also, there is no obvious way to glean any sort of chemical understanding from the model. There are several constraints one could put on this model, reducing the upper bound and the dimensionality of the space, but at the cost of more complex definitions and algorithms.

One could also try a bottom-up approach, by having an algorithm that can calculate the number of possible (reasonably) stable isomers given a set of atoms (determined by coordinates in formula space), but this can get quite complicated.

~~~~~~

For instance, just look at a 2-dimensional formula space (X, Y), where X denotes the number of carbon atoms in the molecule and Y denotes the number of pairs of H atoms (almost all stable hydrocarbon molecules have an even number of H atoms). Assume that carbon always makes four bonds (4 singles; 2 singles and a double; 2 doubles; or 1 single and 1 triple) and that hydrogen always makes one (single) bond.
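These valence rules already pin down some bookkeeping before any isomers are counted. The X carbons offer 4X bonding slots and the 2Y hydrogens use up 2Y of them, so the C-C bond orders must sum to (4X - 2Y)/2 = 2X - Y. A connected molecule needs at least X - 1 C-C bonds, so whatever is left over, X + 1 - Y, must be rings or extra (double/triple) bonds. A small sketch, with a function name of my own choosing:

```python
def hydrocarbon_bookkeeping(x, y):
    """For C_x H_2y, return (sum of C-C bond orders, rings + extra bonds).

    Returns None when there are too many hydrogens for any connected
    structure. A non-None result is only a necessary condition, not a
    guarantee that a valid structure exists.
    """
    if x < 1 or y < 0:
        raise ValueError("need x >= 1 and y >= 0")
    cc_order_sum = 2 * x - y   # (4x - 2y) / 2
    unsaturation = x + 1 - y   # rings plus extra (pi) bonds
    if unsaturation < 0:
        return None
    return cc_order_sum, unsaturation

print(hydrocarbon_bookkeeping(4, 5))  # butane, C4H10: (3, 0)
print(hydrocarbon_bookkeeping(6, 3))  # benzene, C6H6: (9, 4)
print(hydrocarbon_bookkeeping(1, 3))  # "CH6": None
```

So in this (X, Y) plane, nothing above the line Y = X + 1 can be a molecule at all under these assumptions.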

For a molecule with X carbon atoms and 2Y hydrogen atoms, C_{X}H_{2Y}, how does one construct a logical (computer-solvable) algorithm for counting the number of possible isomers? I have some thoughts on this, but want to give you guys a chance to think about it before writing another long post about this :-)
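In the meantime, here is a naive brute-force baseline to check any cleverer idea against. It is not meant as the answer to the question, and all the function names are my own. It tries every assignment of C-C bond orders (0 to 3), keeps the ones that respect carbon's four bonds and form one connected piece, and throws away duplicates that differ only by relabeling the carbons. It counts connectivity only: no stereoisomers, and no judgment about ring strain or stability. It is also only practical for a handful of carbons.

```python
from itertools import combinations, permutations, product

def connected(x, bond):
    """True if every carbon can be reached from carbon 0 via C-C bonds."""
    reached, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(x):
            if j not in reached and bond.get((min(i, j), max(i, j)), 0) > 0:
                reached.add(j)
                frontier.append(j)
    return len(reached) == x

def canonical(x, bond):
    """Smallest bond-order tuple over all relabelings of the carbons."""
    pairs = list(combinations(range(x), 2))
    return min(
        tuple(bond[tuple(sorted((p[i], p[j])))] for i, j in pairs)
        for p in permutations(range(x))
    )

def count_isomers(x, y, max_order=3):
    """Count distinct connected carbon skeletons for C_x H_2y."""
    if x < 1 or y < 0:
        raise ValueError("need x >= 1 and y >= 0")
    pairs = list(combinations(range(x), 2))
    target = 2 * x - y  # required sum of C-C bond orders
    seen = set()
    for orders in product(range(max_order + 1), repeat=len(pairs)):
        if sum(orders) != target:
            continue
        bond = dict(zip(pairs, orders))
        used = [0] * x  # bonding slots each carbon spends on other carbons
        for (i, j), order in bond.items():
            used[i] += order
            used[j] += order
        if max(used) > 4 or not connected(x, bond):
            continue
        # Hydrogens fill the remaining slots, so the skeleton fixes the molecule.
        seen.add(canonical(x, bond))
    return len(seen)

print(count_isomers(4, 5))  # C4H10: 2 (butane, isobutane)
print(count_isomers(4, 4))  # C4H8: 5
```

The cost grows roughly like 4^(X(X-1)/2) times X!, which is exactly why a smarter counting scheme is worth thinking about.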