Change of variables

Algebraic substitution

One of the standard contexts in which the term “change of variables” is used is algebraic substitution. That is, we have some equation defined using a known set of variables (say $x$, in the univariate case) along with a new variable $u = g(x)$ defined as a function of the existing variable $x$. The transformation $g$ must be invertible so that we can solve for $x = g^{-1}(u)$ and substitute $g^{-1}(u)$ for each instance of $x$. Note, however, that when the form $g(x)$ appears directly in the target equation, we can replace it with $u$ directly; there is no point in applying the inverse and then reapplying $g$ to get $u$ back.

The purpose of this substitution is usually to simplify the equation and make it easier to work with for the task at hand.
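As a small illustration of the procedure, here is a sketch that solves $x^6 - 9x^3 + 8 = 0$ over the reals using $u = g(x) = x^3$. The equation and the helper names (`real_cbrt`, `solve_sextic_by_substitution`) are chosen here for illustration and are not part of the notes above. Since $x^3$ appears directly in the equation, it is replaced by $u$ without going through the inverse; $g^{-1}$ is only needed at the end to recover $x$.

```python
import math


def real_cbrt(u: float) -> float:
    """Real cube root, i.e. the inverse of g(x) = x**3 (handles negative u)."""
    return math.copysign(abs(u) ** (1.0 / 3.0), u)


def solve_sextic_by_substitution() -> list[float]:
    """Real solutions of x**6 - 9*x**3 + 8 = 0 via the substitution u = x**3."""
    # After substituting u = x**3 the equation becomes u**2 - 9*u + 8 = 0.
    a, b, c = 1.0, -9.0, 8.0
    disc = b * b - 4.0 * a * c
    sqrt_disc = math.sqrt(disc)
    u_roots = [(-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)]
    # Undo the substitution: x = g^{-1}(u) = cbrt(u).
    return [real_cbrt(u) for u in u_roots]


roots = solve_sextic_by_substitution()
```

The quadratic in $u$ has roots $u = 1$ and $u = 8$, so the real solutions in $x$ should come out as approximately $1$ and $2$.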

Coordinate transformations

In some cases we might want to rephrase a given problem in a new coordinate system, either to reveal insights about the problem’s structure or to make it less tedious to solve. Here we have a known diffeomorphism $g(a,b) = (x,y) = (g_x(a,b), g_y(a,b))$ from the new coordinate system to our original one. This mapping provides the information needed to transform points between the two coordinate systems.

Just as described for substitution, we can express the equation $S(x,y)$ in the new coordinate system as $T(a,b)$ by replacing instances of $x$ and $y$ in $S$ with $g_x(a,b)$ and $g_y(a,b)$, respectively. That is, $T(a,b) = S(g(a,b)) = S(g_x(a,b), g_y(a,b))$. In the new coordinate system we only have access to the values $a$ and $b$; instead of computing $x$ and $y$ from them and plugging those into $S$, $T$ performs the transformation internally and takes only $a$ and $b$. (This has always been particularly confusing for me with all the notation swapping and overloading.)

The resulting equation describes the same object as before: mapping the new coordinates through the known transformation recovers exactly what $S$ describes in the original coordinate system. Its form in the transformed coordinate system, however, will likely look significantly different, and it is in this setting that we can approach the problem from a new perspective.

Example

From the example given in the Wikipedia article, suppose we have the function

$$S(x,y) = (x^2 + y^2) \sqrt{1-\frac{x^2}{x^2+y^2}}$$

and we want to apply a transformation to polar coordinate space. With the known mapping $g(r, \theta) = (x,y) = (r\cos(\theta), r\sin(\theta))$, we can replace instances of $x$ and $y$ with their representations in terms of $r$ and $\theta$:

$$T(r, \theta) = S(g_x(r,\theta), g_y(r,\theta)) = r^2 \sqrt{1-\frac{r^2\cos^2\theta}{r^2}}$$

$T$ is a new function that effectively maps $r$ and $\theta$ through the coordinate transformation $g$ and subsequently queries $S$. $T$ wraps this intuitive procedure up into a single closed form.
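The relationship between $S$, $g$, and $T$ can be checked numerically. The sketch below writes $T$ in its simplified form, using $r^2\sqrt{1 - \cos^2\theta} = r^2\,|\sin\theta|$, and compares it against "map through $g$, then query $S$" at a few sample points. The sample points and the name `max_gap` are arbitrary choices made for illustration.

```python
import math


def S(x: float, y: float) -> float:
    """Original function in Cartesian coordinates (undefined at the origin)."""
    ratio = x**2 / (x**2 + y**2)
    # max(0, .) guards against tiny negative values caused by rounding.
    return (x**2 + y**2) * math.sqrt(max(0.0, 1.0 - ratio))


def g(r: float, theta: float) -> tuple[float, float]:
    """Polar-to-Cartesian map g(r, theta) = (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


def T(r: float, theta: float) -> float:
    """Closed form of S(g(r, theta)): r^2 * sqrt(1 - cos^2(theta)) = r^2 * |sin(theta)|."""
    return r**2 * abs(math.sin(theta))


# Compare the closed form T against the explicit composition S(g(r, theta)).
samples = [(0.5, 0.3), (1.0, 2.0), (2.5, 4.0), (3.0, 5.5)]
max_gap = max(abs(T(r, th) - S(*g(r, th))) for r, th in samples)
```

If the substitution was carried out correctly, `max_gap` should be at the level of floating-point rounding error.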

Integration

Integrals have the intuitive interpretation of summing up area under curves. There are many instances where an integral can be drastically simplified with a simple change of coordinate system, e.g. from Cartesian coordinates to polar coordinates. As we’ve seen above, we can substitute values according to the transformation mapping to yield an equation defined in terms of $r$ and $\theta$. However, in the context of integration we’re not done after this step. Because the notion of size (see Measure theory) can warp under a transformation, we must track how area changes at every location between the two coordinate systems. So while we might carry out the integral computation in the transformed coordinate system (because it simplifies the structure of the problem), the original question still asks for the area under that form in the original coordinate system. As a result, we must account for the way in which area has changed between spaces. Here we use the determinant of the Jacobian matrix, whose absolute value at a point gives the factor by which the transformation expands or shrinks volume (the generalization of length and area) near that point. Thus, we have the general change of variables theorem

$$\int_{T(U)} f(v)\,dv = \int_U f(T(u))\,|\det(DT)(u)|\,du$$

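where $T$ is our transformation mapping from one coordinate space to the other (the role played by $g$ in the sections above, not the composed function $T$ from the example) and $DT$ is its Jacobian matrix. Note that some formulations of the change of variables theorem replace $T$ with $T^{-1}$, depending on the direction in which the mapping is assumed to be defined.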
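To see why the Jacobian factor matters, consider the polar map $T(r, \theta) = (r\cos\theta, r\sin\theta)$, for which $|\det(DT)(r, \theta)| = r$. The sketch below approximates integrals over a disk with a midpoint sum in $(r, \theta)$, with and without that factor. The function `polar_integral`, its parameters, and the grid sizes are illustrative assumptions, not a reference implementation.

```python
import math


def polar_integral(f, r_max: float, n_r: int = 200, n_theta: int = 200,
                   use_jacobian: bool = True) -> float:
    """Midpoint-rule integral of f(x, y) over the disk of radius r_max,
    evaluated on a grid in (r, theta)."""
    dr = r_max / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            # |det DT| = r for the polar map; dropping it ignores how area warps.
            weight = r if use_jacobian else 1.0
            total += f(x, y) * weight * dr * dtheta
    return total


# Area of the unit disk (f = 1), with and without the Jacobian factor.
area_with = polar_integral(lambda x, y: 1.0, 1.0)
area_without = polar_integral(lambda x, y: 1.0, 1.0, use_jacobian=False)

# A non-constant integrand: exp(-(x^2 + y^2)) over the unit disk.
gauss = polar_integral(lambda x, y: math.exp(-(x * x + y * y)), 1.0)
```

With the factor, `area_with` should be close to $\pi$, the true area of the unit disk. Without it, the sum merely measures the rectangle $[0, 1] \times [0, 2\pi]$ in $(r, \theta)$ space and lands near $2\pi$. The Gaussian example can be compared against the exact value $\pi(1 - e^{-1})$.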

Probability densities

Sources