It can also be written in a shorter way, where the subscript names the variable with respect to which we partially differentiate the function g:
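In LaTeX, this subscript notation would presumably read as follows. This is a sketch for a two-variable g(x, y); the angle-bracket vector notation is my assumption, not necessarily the one used in the original formula.

```latex
\[
  \nabla g = \left\langle \frac{\partial g}{\partial x},\ \frac{\partial g}{\partial y} \right\rangle
           = \langle g_x,\ g_y \rangle
\]
```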
Why is the gradient so important? It has many useful properties. First, let's look at an example where the gradient makes a calculation easier and clearer: the chain rule. Imagine we have a function g(x, y) whose variables x and y are themselves functions x(t) and y(t), so that we get g(x(t), y(t)). (If x and y were functions of more variables, e.g. x(u, v), the same idea would work, except that we would partially differentiate g with respect to u in one equation and with respect to v in another.) We want to calculate the rate of change of g(x, y), i.e. its derivative, with respect to t (instead of x and y).
So we have a two-variable function g(x, y) whose variables x and y depend on t. We make a vector r whose components are x(t) and y(t). If we use the chain rule to get the rate of change of g with respect to t, it looks like this:
So in the chain rule we always first partially differentiate g with respect to one variable and then differentiate that variable with respect to t, and we add up these products over all variables. It is important to see the difference between the ordinary derivative dg/dt and partial differentiation. Partial differentiation comes into play only when a function has more than one variable (in this case g(x, y)); if a function depends on only one variable, we differentiate it in the ordinary way (dx/dt and dy/dt, as well as dg/dt, since g ends up depending only on t).
Now the use of the gradient is clear. The chain rule can be written as the dot product of the gradient and the derivative of the vector r, dr/dt.
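Written out in LaTeX, the chain rule for g(x(t), y(t)) is the standard formula below (a reconstruction from the surrounding text, using the same symbols g, x, y and t):

```latex
\[
  \frac{dg}{dt}
  = \frac{\partial g}{\partial x}\,\frac{dx}{dt}
  + \frac{\partial g}{\partial y}\,\frac{dy}{dt}
\]
```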
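As a LaTeX sketch, with r(t) = (x(t), y(t)) as defined above, the dot-product form of the chain rule reads:

```latex
\[
  \frac{dg}{dt}
  = \nabla g \cdot \frac{d\mathbf{r}}{dt}
  = \left\langle g_x,\ g_y \right\rangle \cdot
    \left\langle \frac{dx}{dt},\ \frac{dy}{dt} \right\rangle
  = g_x\,\frac{dx}{dt} + g_y\,\frac{dy}{dt}
\]
```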
But now to the properties of the gradient. One of the most important is that the gradient is perpendicular to the level curve at any given point. We can imagine a curve and, at a certain point, a tangent line (or a tangent plane in 3D). The gradient vector is perpendicular to this tangent line, and we can prove this with the dot product of the two vectors.
So we take the function at a certain level, w = c. The level curve is traced by a position vector r that depends on a variable t. If we differentiate it we get dr/dt, which is the velocity and is tangent to the curve. This is the basic geometric picture of a level curve.
Now through the chain rule (as mentioned already above) we can write this equation and solve it:
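A LaTeX sketch of that argument, assuming the curve r(t) stays on the level curve w = c for all t: since w(r(t)) is constant, its derivative with respect to t is zero, and a zero dot product means the two vectors are perpendicular.

```latex
\[
  w(\mathbf{r}(t)) = c
  \quad\Longrightarrow\quad
  \frac{dw}{dt} = \nabla w \cdot \frac{d\mathbf{r}}{dt} = \frac{d}{dt}\,c = 0
  \quad\Longrightarrow\quad
  \nabla w \perp \frac{d\mathbf{r}}{dt}
\]
```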
Therefore the gradient of the function is perpendicular to the tangent line/plane, whose direction is given by the velocity.
Another very important use of the gradient is in solving problems with Lagrange multipliers. Lagrange multipliers are used in problems where we need to find the maximum or minimum of a function subject to a constraint. That means we have to find the max/min of a function among the points that also satisfy some other equation (e.g. choose the sides of a rectangle so that its area is maximal for a fixed perimeter). This is widely used in economics, engineering etc.
So let's have a function w(x, y) and a constraint g(x, y) = c (a circle in the case below). We want to find the maximum of w among the points that satisfy the constraint g.
source: http://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/2.-partial-derivatives/part-b-chain-rule-gradient-and-directional-derivatives/session-36-proof/MIT18_02SC_notes_19.pdf
We know that the point P satisfies both of these conditions: if we moved to a level curve lower than w = 14, it wouldn't be a maximum, and if we went above it to a level curve w > 14, that curve would no longer touch the constraint.
Now we know that the tangent lines to g and w are identical at the point P. Therefore the gradients of both functions have to be parallel at that point, which means that one gradient is a multiple of the other. The constant of proportionality is called the Lagrange multiplier (its symbol is the Greek letter lambda).
source: http://en.wikipedia.org/wiki/Lagrange_multiplier
And here is how it is used in mathematical problems. The gradient equation can be written as three separate equations (if we have a function of three variables), plus one more equation for the constraint. Here are two equivalent ways of writing the same system of equations.
This is the theory behind it, so now let's try an example.
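In LaTeX, a sketch of the general system for a function w(x, y, z) with a constraint g(x, y, z) = c (the same symbols as above, extended to three variables) would be:

```latex
\[
  \nabla w = \lambda\,\nabla g
  \qquad\Longleftrightarrow\qquad
  \begin{cases}
    w_x = \lambda\, g_x \\
    w_y = \lambda\, g_y \\
    w_z = \lambda\, g_z
  \end{cases}
  \qquad\text{together with}\qquad
  g(x, y, z) = c
\]
```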
An example problem from MIT:
A rectangular box is placed in the first octant so that one corner Q is at the origin and the three sides adjacent to Q lie in the coordinate planes. The corner P diagonally opposite Q lies on the surface f(x, y, z) = c. Using Lagrange multipliers, tell for which point P the box will have the largest volume, and tell how you know it gives a maximum point, if the surface is the plane x + 2y + 3z = 18.
Here is how I solved it. We have the volume function (height * width * depth = x * y * z) and the constraint, the plane on which the point P has to lie.
z = (18 - x - 2y) / 3
We can sketch the problem symbolically this way. The Lagrange condition states that the gradients of both functions are proportional, with the constant of proportionality lambda. So if we take the partial derivatives of both functions and multiply one side of the equation by lambda, we get a system of equations. We also have to add the constraint equation.
So we have a system of four equations with four unknowns. That's absolutely fine, and we solve the system. I found it easiest to multiply each of the first three equations by the variable it is missing, so we get three equations of the form xyz = "something". My result was that the ideal point is P = (6, 3, 2) and the Lagrange multiplier is lambda = 6, which gives a volume of 36. We know it is a maximum because the volume is zero wherever x, y or z is zero, i.e. on the edges of the triangle that the first octant cuts out of the plane, and positive inside it, so the only critical point inside has to be the largest value.
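For this problem the system can be written out in LaTeX as follows. It is a sketch, with V = xyz as the volume and g = x + 2y + 3z as the constraint function (the names V and g are my labels):

```latex
\[
  \nabla V = \lambda\,\nabla g:
  \qquad
  \begin{cases}
    yz = \lambda \\
    xz = 2\lambda \\
    xy = 3\lambda \\
    x + 2y + 3z = 18
  \end{cases}
\]
```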
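The result can be double-checked numerically. Below is a minimal Python sketch, not part of the original solution: the function names `lagrange_residuals` and `grid_search` are made up for illustration. It plugs a candidate point into the four equations and, separately, scans a grid of first-octant points on the plane for the largest volume.

```python
# Sketch: numerical sanity check of the box example.
# All names here are illustrative, not from the original post.

def lagrange_residuals(x, y, z, lam):
    """Left side minus right side of yz = lam, xz = 2*lam, xy = 3*lam
    and of the constraint x + 2y + 3z = 18. All zeros means a solution."""
    return (y * z - lam, x * z - 2 * lam, x * y - 3 * lam,
            x + 2 * y + 3 * z - 18)

def grid_search(step=0.1):
    """Scan points on the plane with x, y, z > 0 and return
    (best volume, x, y, z) found on the grid."""
    best = (0.0, 0.0, 0.0, 6.0)
    nx = round(18 / step)  # x ranges over (0, 18) on the plane
    ny = round(9 / step)   # y ranges over (0, 9) on the plane
    for i in range(1, nx):
        x = i * step
        for j in range(1, ny):
            y = j * step
            z = (18 - x - 2 * y) / 3  # constraint solved for z
            if z <= 0:
                break  # larger y only pushes z further below zero
            v = x * y * z
            if v > best[0]:
                best = (v, x, y, z)
    return best

if __name__ == "__main__":
    print("residuals at P = (6, 3, 2), lambda = 6:",
          lagrange_residuals(6, 3, 2, 6))
    print("grid search (volume, x, y, z):", grid_search())
```

The grid search is only a rough check, since its accuracy is limited by the step size, but it should land close to the point found with the Lagrange multiplier.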
So this is knowledge that I gained recently about the gradients and Lagrange multipliers. I find it very useful and definitely worth knowing. Hopefully my pictures and explanation were sort of understandable. I am definitely not an expert so I tried explaining these theorems the way I understand them.
Lukas Cerny, 6. 7. 2014