I have been thinking about designing molecular shape descriptors for virtual screening, such that two molecules have similar shapes if and only if their descriptors are similar.

They could be used on their own, or to complement e.g. pharmacophore descriptors.

They should be optimized for ligands, which are usually elongated and flat.

Hence I am considering the following approach:

- normalize rotation (using principal component analysis),

- describe bending (usually one coefficient is sufficient),

- describe how the cross-section evolves, for example as an evolving ellipse.
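The first step (rotation normalization) could be sketched roughly as below. This is only a minimal Python/NumPy illustration under my own assumptions, not the Mathematica implementation: the function name `pca_align` is hypothetical, all atoms are weighted equally, and the sign ambiguity of the principal axes is not resolved (only right-handedness is enforced).

```python
import numpy as np

def pca_align(coords):
    """Center atom coordinates and rotate them onto their principal axes.

    Returns the aligned coordinates (largest variance along x, smallest
    along z) and the variances along the three axes.
    Assumption: every atom has equal weight.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)

    # Covariance matrix of the atom positions (3 x 3, symmetric).
    cov = centered.T @ centered / len(centered)

    # eigh returns eigenvalues in ascending order; reverse to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    axes = eigvecs[:, order]

    # Keep a right-handed frame so the result is a rotation, not a reflection.
    if np.linalg.det(axes) < 0:
        axes[:, 2] *= -1

    return centered @ axes, eigvals[order]
```

After this step the long axis of an elongated ligand lies along x, so bending and cross-section can be described as functions of the x coordinate. The remaining sign ambiguity (flipping pairs of axes) would still need some convention.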

In the end, the example shape is described by 8 real coefficients: 1 for length, 1 for bending, and 6 for the evolution of the cross-section ellipse. The descriptor captures the bending and the fact that this molecule is approximately circular on the left and flat on the right.

preprint:

http://arxiv.org/pdf/1509.09211

slides:

https://dl.dropboxusercontent.com/u/12405967/shape_sem.pdf

Mathematica implementation:

https://dl.dropboxusercontent.com/u/12405967/shape.nb

Have you come across something like this? Is it a reasonable approach?

I am comparing it with USR (ultrafast shape recognition) and (rotationally invariant) spherical harmonics - have you seen other approaches of this type?
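For reference, USR can be sketched roughly as follows. This is a simplified Python/NumPy version and not a reference implementation: the function names are hypothetical, and the exact moment conventions differ between implementations (here I assume mean, standard deviation, and cube root of the third central moment of the atom distances from four reference points).

```python
import numpy as np

def usr_descriptor(coords):
    """12-number USR-style descriptor: 3 moments x 4 reference points."""
    coords = np.asarray(coords, dtype=float)

    # Reference points: centroid, atom closest to it, atom farthest from it,
    # and the atom farthest from that farthest atom.
    ctd = coords.mean(axis=0)
    d_ctd = np.linalg.norm(coords - ctd, axis=1)
    cst = coords[np.argmin(d_ctd)]
    fct = coords[np.argmax(d_ctd)]
    ftf = coords[np.argmax(np.linalg.norm(coords - fct, axis=1))]

    desc = []
    for ref in (ctd, cst, fct, ftf):
        d = np.linalg.norm(coords - ref, axis=1)
        mean = d.mean()
        dev = d - mean
        std = np.sqrt(np.mean(dev ** 2))
        skew = np.cbrt(np.mean(dev ** 3))  # assumed convention
        desc.extend([mean, std, skew])
    return np.array(desc)

def usr_similarity(a, b):
    """Similarity in (0, 1]: inverse of 1 + mean absolute difference."""
    return 1.0 / (1.0 + np.mean(np.abs(a - b)))
```

Since it uses only interatomic distances, this descriptor is invariant to rotation and translation without any alignment step, which is the main contrast with the PCA-normalized approach above.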