Saturday, January 11, 2020

Fake Complex Derivatives

Introduction

Having reduced (a part of) the research problem I was working on to finding a gradient, I was just about to be relieved when I realised I had no clue how to find “gradients” in the space of Hermitian matrices I was working in. And the function I wanted to find the gradient for wasn’t too simple looking either.

I turned to math.stackexchange in despair, and it turns out the gradient of my function was super simple and elegant!

Now I owed it to complex matrix differentiation to go study that shit. While the excitement lasts, at least.

The answerer of the question suggested the book ‘Complex-Valued Matrix Differentiation’, which I started reading, hoping to fill in holes in complex differentiation and linear algebra on the way.

So, first things first. The word for “differentiable” for complex valued functions of complex numbers is analytic. Before we get into finding derivatives, we need to check whether the function is analytic in the domain we’re interested in.

Taking our function $f(x + iy) = u + iv$, and requiring that the derivative be the same whichever direction a point is approached from, gives us the Cauchy-Riemann equations:

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \text{ , } \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

which are a necessary condition for a function to be analytic at a given point, and a sufficient one too when the partial derivatives are continuous in a neighbourhood of that point.
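To get a feel for the equations, here is a minimal sketch that checks them numerically. It assumes central finite differences are accurate enough for the purpose; the helper names (`partials`, `cauchy_riemann_residuals`) and the two test functions are made up for this illustration and are not from the book.

```python
def partials(f, x, y, h=1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for f(x + iy) = u + iv."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)  # u_x + i v_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)  # u_y + i v_y
    return fx.real, fy.real, fx.imag, fy.imag


def cauchy_riemann_residuals(f, x, y):
    """Return (u_x - v_y, u_y + v_x); both are close to 0 where the equations hold."""
    ux, uy, vx, vy = partials(f, x, y)
    return ux - vy, uy + vx


def square(z):
    return z * z          # u = x^2 - y^2, v = 2xy


def conjugate(z):
    return z.conjugate()  # u = x, v = -y


print(cauchy_riemann_residuals(square, 1.0, 2.0))     # both residuals close to 0
print(cauchy_riemann_residuals(conjugate, 1.0, 2.0))  # first residual close to 2
```

The square passes at the sample point, while the conjugate fails the first equation, since $u_x = 1$ but $v_y = -1$.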

Aside: Changing independent variables changes derivatives - and we get to choose these!

Consider the following functions:
$$\begin{aligned} p(x,y) &= x + y \\ q(x,y) &= y \end{aligned}$$

We can now write $x$ and $y$ in terms of $p$ and $q$:
$$\begin{aligned} x(p,q) &= p - q \\ y(p,q) &= q \end{aligned}$$

Now, consider $\frac{\partial p}{\partial q}$:
$$\frac{\partial p}{\partial q} = \frac{\partial (x + y)}{\partial y} = 1$$
orrrr,
$$\frac{\partial p}{\partial q} = \frac{\partial p}{\partial x}\frac{\partial x}{\partial q} + \frac{\partial p}{\partial y}\frac{\partial y}{\partial q} = (1)(-1) + (1)(1) = 0$$

So, which is the right partial derivative?

It turns out both are correct. In the first case, $x$ and $y$ are our independent variables, so $p$ turns out to be dependent on $q$. In the second case, $p$ and $q$ are considered independent variables, so the partial derivative is 0.

So, these really depend on our choice of variables. Math, unlike CS, sadly attaches no meaning to whether something sits on the left or the right of the $=$ sign, so we’re left to state our independent variables explicitly whenever we’re working in a system.
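The two answers can be reproduced numerically. This is only a sketch, assuming a finite-difference partial derivative is good enough; the helper `d` and the function names are hypothetical, chosen to mirror the symbols above.

```python
def d(f, args, i, h=1e-6):
    """Partial derivative of f with respect to args[i], holding the other arguments fixed."""
    hi, lo = list(args), list(args)
    hi[i] += h
    lo[i] -= h
    return (f(*hi) - f(*lo)) / (2 * h)


# System 1: x and y are independent; p and q are functions of them.
def p_of_xy(x, y):
    return x + y


# Since q = y, nudging q here means nudging y while x stays fixed.
print(d(p_of_xy, (3.0, 4.0), 1))  # close to 1


# System 2: p and q are independent; x and y are functions of them.
def x_of_pq(p, q):
    return p - q


def y_of_pq(p, q):
    return q


def p_of_pq(p, q):
    # Rebuild p from x(p, q) and y(p, q): nudging q moves x and y in opposite directions.
    return p_of_xy(x_of_pq(p, q), y_of_pq(p, q))


print(d(p_of_pq, (7.0, 4.0), 1))  # close to 0
```

The only thing that changed between the two calls is which quantities are held fixed, which is exactly the choice of independent variables.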

Real valued complex functions aren’t differentiable :(

When we do matrix calculus, we want to be able to differentiate everything. Scalar functions of vectors, matrix functions of scalars, scalar functions of matrices – everything.

However, just take the simple case of a real valued complex function. This means $v$ is always 0, so the first Cauchy-Riemann equation is not satisfied unless $u$ is constant with respect to $x$, and the second is not satisfied unless $u$ is constant with respect to $y$ as well. So, $\frac{d f}{d z}$ can’t be found for an arbitrary real valued function: only the constant ones are analytic.

Perfect. The first thing we learn is we can’t find derivatives of real valued functions, let alone the other grandiose plans involving matrices and vectors.

So, what do we do?

This is where we cheat. If we change the independent variables from $x$ and $y$ to $z$ and $z^*$, where now,
$$x = \frac{z + z^*}{2} , \quad y = \frac{z - z^*}{2i}$$

Then we can take the partial derivative with respect to $z$!
For instance, for the real valued function $f = z^*z$, $\frac{\partial f}{\partial z}$ is just $z^*$.

The partial derivatives with respect to $z$ and $z^*$ are called formal derivatives – which we have to satisfy ourselves with.
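Applying the chain rule to the substitution for $x$ and $y$ gives $\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)$ and $\frac{\partial}{\partial z^*} = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)$; these are often called Wirtinger derivatives. That formula is not spelled out above, so treat the following as a sketch under that assumption: it estimates both formal derivatives by finite differences (the name `formal_derivatives` is hypothetical) and checks the $f = z^*z$ example.

```python
def formal_derivatives(f, z, h=1e-6):
    """Estimate (df/dz, df/dz*) from the real partials df/dx and df/dy."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # step along the real axis
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # step along the imaginary axis
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


def f(z):
    return z.conjugate() * z  # f = z* z, real valued


z0 = 1.0 + 2.0j
d_dz, d_dzstar = formal_derivatives(f, z0)
print(d_dz, z0.conjugate())  # d_dz is close to z*
print(d_dzstar, z0)          # d_dzstar is close to z
```

Note that nothing here needs $f$ to be analytic: only the ordinary real partial derivatives with respect to $x$ and $y$ have to exist.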

TL;DR: we couldn’t find the derivative with respect to $z$, so we changed the system such that $z$ became an independent variable, and made do with taking the partial derivative with respect to $z$.

Aside: Derivatives for multivariable real valued functions?

One might ask, aren’t real valued complex functions analogous to having a real 2-variable function? What exactly happens there?

Firstly, there is no one-big-derivative defined for such functions. We can take the partial derivative with respect to each variable, and put them together in a gradient vector to find the direction in which the function slopes the most, but there is no one ‘number’ that is the derivative.

And the gradient exists if all the partial derivatives exist.

Which is the fundamental difference between complex differentiation and the above - the atomic unit we’re dealing with.

(You could still argue that the gradient is actually one complex number, I guess - still need to think about that!)
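For comparison, here is a rough sketch of the real two-variable picture: two partial derivatives estimated by finite differences and put together into a gradient. The function `u` is just $x^2 + y^2$, i.e. $z^*z$ written in real variables, and the helper name `gradient` is made up for this example.

```python
def gradient(f, x, y, h=1e-6):
    """Central-difference gradient (df/dx, df/dy) of a real function of two real variables."""
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return gx, gy


def u(x, y):
    return x * x + y * y  # same function as z* z, in real variables


gx, gy = gradient(u, 1.0, 2.0)
print(gx, gy)           # close to 2 and 4, i.e. (2x, 2y)
print(complex(gx, gy))  # the same two numbers packed into one complex number
```

For this particular function the packed number is close to $2z$, which is twice the formal derivative $\frac{\partial f}{\partial z^*} = z$. That follows from the chain-rule expression $\frac{\partial}{\partial z^*} = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)$, and it hints that the parenthetical above might have something to it.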

Thanks to Siddharth Bhat for the intuition on changing independent variables and Jayitha for the reasoning about real valued functions.
