The gradient is supposed to "live" in position space, since it tells you in what spatial direction the function is increasing fastest.

I'm not sure I follow your question about adding the derivative with respect to r to itself, but I probably should define the notation I used. Bold text indicates unit direction vectors. Each term of the gradient tells you how much a function changes with respect to one of its arguments, and you then multiply that by a unit vector so that it also tells you in what direction it's changing. Hence you get a term

**r**∂/∂r

which means "find out how much the function is changing as we change r (i.e. take the derivative), and then make this value point in the direction it's changing (i.e. multiply it by the **r** unit vector)." You don't need to scale it: r is a spatial coordinate, so the derivative is already measuring how the function changes in space.
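To make the radial term concrete, here's a minimal sketch (the function f = r² and the sample point are just ones I picked for illustration). It steps a small physical distance h along the **r** unit vector and compares the resulting rate of change with ∂f/∂r = 2r:

```python
import math

def f(x, y, z):
    # Hypothetical example function: f = r^2, written in Cartesian coordinates.
    return x * x + y * y + z * z

def radial_rate(x, y, z, h=1e-6):
    """Rate of change of f per unit distance moved along the r unit vector."""
    r = math.sqrt(x * x + y * y + z * z)
    ux, uy, uz = x / r, y / r, z / r  # components of the r unit vector
    forward = f(x + h * ux, y + h * uy, z + h * uz)
    backward = f(x - h * ux, y - h * uy, z - h * uz)
    return (forward - backward) / (2 * h)  # central difference

# Sample point with r = 3, so df/dr = 2r = 6.
print(radial_rate(1.0, 2.0, 2.0))
```

The printed value should come out very close to 6, with no extra factor needed, because a step of h in r really is a step of length h in space.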

If you have a function in spherical coordinates, you'll want to do the same thing in the θ and φ directions. The problem is that θ and φ are angles, not lengths. You start off using the same form for your derivative terms, **θ**∂/∂θ (and likewise **φ**∂/∂φ), where the boldface indicates a unit vector pointing either latitudinally or longitudinally from **r**. However, you can't simply add these terms to **r**∂/∂r, since they're measuring angles, not spatial coordinates [you can check units here: the unit vectors are unitless, ∂/∂r is 1/length, and ∂/∂θ and ∂/∂φ are 1/angle (which is unitless)]. To get these angular derivatives to measure how a function is changing in spatial position, you need to throw those scaling factors in (1/r on the θ term and 1/(r sin θ) on the φ term, if θ is the angle measured from the pole), which gives every term units of 1/length.
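If you want to convince yourself numerically, here's a rough sketch. It assumes θ is the polar angle measured from the +z axis and φ is the azimuth (MathWorld swaps the two symbols, so watch for that), in which case the full gradient is **r**∂f/∂r + **θ**(1/r)∂f/∂θ + **φ**(1/(r sin θ))∂f/∂φ. The test function, the sample point, and all the helper names are made up for illustration. The script builds the gradient from spherical derivatives, with and without the scaling factors, and compares both against a plain Cartesian gradient:

```python
import math

def to_cartesian(r, theta, phi):
    # Assumed convention: theta = polar angle from +z, phi = azimuth.
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def f_cart(x, y, z):
    # Hypothetical test function, chosen only for illustration.
    return x * x * y + z

def f_sph(r, theta, phi):
    return f_cart(*to_cartesian(r, theta, phi))

def unit_vectors(theta, phi):
    # Cartesian components of the r, theta and phi unit vectors.
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return (st * cp, st * sp, ct), (ct * cp, ct * sp, -st), (-sp, cp, 0.0)

def grad_spherical(r, theta, phi, scaled=True, h=1e-6):
    # Central-difference derivatives with respect to each coordinate.
    d_r = (f_sph(r + h, theta, phi) - f_sph(r - h, theta, phi)) / (2 * h)
    d_theta = (f_sph(r, theta + h, phi) - f_sph(r, theta - h, phi)) / (2 * h)
    d_phi = (f_sph(r, theta, phi + h) - f_sph(r, theta, phi - h)) / (2 * h)
    if scaled:
        d_theta /= r                    # arc length along theta is r * dtheta
        d_phi /= r * math.sin(theta)    # arc length along phi is r * sin(theta) * dphi
    r_hat, theta_hat, phi_hat = unit_vectors(theta, phi)
    return tuple(d_r * a + d_theta * b + d_phi * c
                 for a, b, c in zip(r_hat, theta_hat, phi_hat))

def grad_cartesian(x, y, z, h=1e-6):
    return ((f_cart(x + h, y, z) - f_cart(x - h, y, z)) / (2 * h),
            (f_cart(x, y + h, z) - f_cart(x, y - h, z)) / (2 * h),
            (f_cart(x, y, z + h) - f_cart(x, y, z - h)) / (2 * h))

def close(u, v, tol=1e-5):
    return all(abs(a - b) < tol for a, b in zip(u, v))

r, theta, phi = 2.0, 0.7, 1.2
g_cart = grad_cartesian(*to_cartesian(r, theta, phi))
print(close(grad_spherical(r, theta, phi), g_cart))                # with scaling factors
print(close(grad_spherical(r, theta, phi, scaled=False), g_cart))  # without them
```

With the scaling factors in place the first comparison should print True, and the unscaled version should print False, which is the whole point: the raw angular derivatives aren't rates of change per unit distance until you divide by the arc length each angle sweeps out.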

Here's a couple of links that explain things as well (they show where the scaling factors come from, though they're pretty math-intense):

http://mathworld.wolfram.com/Gradient.html
http://mathworld.wolfram.com/Laplacian.html
http://mathworld.wolfram.com/CurvilinearCoordinates.html