# Is convolution linear?


Artificial Intelligence


## Aren't all discrete convolutions (not just 2D) linear transforms?


*(Image: screenshot from the linked article, showing 2D convolution written as a matrix multiplication.)*

The image above, a screenshot from this article, describes discrete 2D convolution as a linear transform. The idea used, as far as I understand, is to represent the 2-dimensional n×n input grid as a vector of length n², and the m×m output grid as a vector of length m². I don't see why this can't be generalized to higher-dimensional convolutions, since a transformation matrix can be constructed from one vector to another, no matter the lengths (right?)

My question: Aren't all discrete convolutions (not just 2D) linear transforms?

Are there cases where such a transformation matrix cannot be found?

`convolutional-neural-networks` · `signal-processing`


asked Mar 30, 2020 at 12:47 by connected-subgroup; edited Mar 31, 2020 at 8:54

- Are you sure you want to talk about continuous cases? Continuous operations cannot be denoted by matrices, and thus would be non-linear by your definition, but linear by the usual definition. – user9947, Mar 30, 2020 at 14:40
- What do you mean by "higher-dimensional convolutions"? Can you provide a link to a definition of them? – nbro, Mar 30, 2020 at 18:16
- Could you elaborate, @DuttaA? Maybe post it as an answer. – connected-subgroup, Mar 31, 2020 at 3:46
- @nbro I would think of a higher-dimensional convolution as starting with a 3D grid of input, say n×n×n, and mapping it to a smaller 3D grid using a kernel, the same as in the case of 2D grids. This may generalize to even higher dimensions. – connected-subgroup, Mar 31, 2020 at 3:48
- @jughead.ai Well, I had edited the question to discrete convolutions, but you edited it back to convolution. Convolution is a very broad term, and its properties can't all be fitted into one answer. – user9947, Mar 31, 2020 at 7:08

## 2 Answers

Convolution has become a much-misused term since the advent of CNNs.

Short answer: convolution is a linear operator (check here), but what is defined in the context of CNNs is not convolution; it is cross-correlation, which is also linear in the case of images (it is a dot product).

**Convolution:** Computers work only on discrete values, so discrete-time convolution is defined as:

$$y[n] = \sum_{k=-\infty}^{\infty} x[k]\,h[n-k]$$

which has the nice property of

$$Y(e^{j\omega}) = X(e^{j\omega})\,H(e^{j\omega})$$

where each $A(e^{j\omega})$ is the discrete-time Fourier transform of the corresponding sequence $a[n]$. So this is the basic idea of discrete convolution. The point to illustrate here is that the convolution operation has some nice properties and is very useful.
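Both the defining sum and the frequency-domain product can be verified numerically. A minimal NumPy sketch (the name `direct_convolve` is mine, not from any library):

```python
import numpy as np

def direct_convolve(x, h):
    """Evaluate y[n] = sum_k x[k] h[n-k] straight from the definition."""
    n_out = len(x) + len(h) - 1
    y = np.zeros(n_out)
    for n in range(n_out):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y

x = np.array([1.0, 2.0, 3.0])
h = np.array([0.5, -1.0])
y = direct_convolve(x, h)
assert np.allclose(y, np.convolve(x, h))

# Convolution theorem: pointwise product of (zero-padded) DFTs
# gives the same linear convolution.
N = len(x) + len(h) - 1
y_fft = np.fft.ifft(np.fft.fft(x, N) * np.fft.fft(h, N)).real
assert np.allclose(y_fft, y)
```

The zero-padding to length `len(x) + len(h) - 1` is what makes the DFT product reproduce *linear* (rather than circular) convolution.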

Now there are 2 main types of convolution (as far as I know): linear convolution and circular convolution. For a 2D signal, linear convolution is defined as:

$$y[m,n] = \sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} x[i,j]\,h[m-i,n-j]$$

Circular convolution is the same, except that the input sequence is finite (defined from 0 to N−1), which gives rise to periodicity; the convolution operation is a little different because the indices wrap around modulo N.
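The wrap-around can be made concrete with a short NumPy sketch (`circular_convolve` is an assumed name): circular convolution computed in the n domain with modular indices matches the pointwise product of same-length DFTs.

```python
import numpy as np

def circular_convolve(x, h):
    """Length-N circular convolution via the DFT (pointwise product of FFTs)."""
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 0.0, 0.0, 1.0])
N = len(x)

# Direct n-domain check: y[n] = sum_k x[k] h[(n-k) mod N]
y_direct = np.array([sum(x[k] * h[(n - k) % N] for k in range(N))
                     for n in range(N)])
assert np.allclose(circular_convolve(x, h), y_direct)
```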

So these are the actual definitions of convolution. It has a linearity property (clearly, since it is exactly how the output of an LTI system is calculated), and it can also be expressed as a matrix multiplication, which is convenient when we want a computer to do the calculation for us. I don't know a lot here, but there are many clever manipulations, e.g. the FFT algorithm, an indispensable tool in signal processing (used to convert signals to the frequency domain at a given sampling rate). In the same spirit, circular convolutions can be written in terms of circulant matrices and the like (only if you want to process in the n domain; it is much easier to process in the frequency domain).

For example, a 1D circular convolution between 2 signals in the n domain, $y[n] = h[n] \circledast x[n]$, can be written in matrix form as:

$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} h_0 & h_3 & h_2 & h_1 \\ h_1 & h_0 & h_3 & h_2 \\ h_2 & h_1 & h_0 & h_3 \\ h_3 & h_2 & h_1 & h_0 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

The same result can be obtained by converting both signals to the frequency domain, multiplying, and converting back to the time domain.
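The circulant structure above can be checked directly. A sketch with a hand-rolled `circulant` helper (SciPy's `scipy.linalg.circulant` builds the same matrix from its first column):

```python
import numpy as np

def circulant(h):
    """Circulant matrix C with C[i, j] = h[(i - j) mod N], as in the 4x4 example."""
    N = len(h)
    return np.array([[h[(i - j) % N] for j in range(N)] for i in range(N)])

h = np.array([1.0, 2.0, 3.0, 4.0])
x = np.array([1.0, 0.0, -1.0, 2.0])

y_matrix = circulant(h) @ x                               # n-domain matrix product
y_fft = np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)).real  # frequency-domain route
assert np.allclose(y_matrix, y_fft)
```

Both routes compute the same circular convolution; the matrix form shows explicitly that it is a linear transform of `x`.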

When we move to the 2D domain, a similar equation is formed, except that (in the n domain) instead of the scalar $h_0$ we have a circulant block $H_0$, so the full operator is block-circulant and is built with Kronecker products (I don't know the justification or proofs of these equations; it presumably satisfies the convolution theorem and is fast when run on a computer). Again, it is much easier to do this in the frequency domain.

When we move to multiple dimensions, it is called multidimensional discrete convolution, as per this Wikipedia article. As the article suggests, the multiplicative property in the frequency domain,

$$Y(k_1,k_2,\ldots,k_M) = H(k_1,k_2,\ldots,k_M)\,X(k_1,k_2,\ldots,k_M)$$

still holds. When we do convolutions in the n domain, things get tricky and require the kind of clever matrix manipulations shown in the example above; as stated before, everything is much easier in the frequency domain.
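That multidimensional identity can also be checked numerically with `np.fft.fftn`. A small 3D sketch, assuming periodic (circular) boundaries on a cubic grid:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 4))  # 3D "signal"
h = rng.standard_normal((4, 4, 4))  # 3D "filter" on the same periodic grid

# Frequency domain: elementwise product of the N-dimensional DFTs.
y_freq = np.fft.ifftn(np.fft.fftn(x) * np.fft.fftn(h)).real

# n domain: circular convolution with wrap-around (modular) indices.
y_direct = np.zeros_like(x)
for m in np.ndindex(x.shape):
    for k in np.ndindex(x.shape):
        shifted = tuple((mi - ki) % s for mi, ki, s in zip(m, k, x.shape))
        y_direct[m] += x[k] * h[shifted]

assert np.allclose(y_freq, y_direct)
```

The brute-force n-domain loop is O(N⁶) here, which is exactly why the FFT route is preferred in practice.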

It is counter-intuitive that a picture has a frequency domain, and in general these are spatial frequencies rather than frequencies in the traditional temporal sense. But in DSP we use many filters whose mathematical properties are similar to those of classical frequency-domain filters, so the calculations are the same as in the 1D example above.

The point is that convolution is, by definition, a linear operation. Check here if the explanations are unconvincing. There are also linear time-varying (LTV) systems whose output may be determined by convolution, but then the equation is used at a certain point in time, i.e.:

$$Y(e^{j\omega}, t) = H(e^{j\omega}, t)\,X(e^{j\omega}).$$

Whether an LTV system can be represented by matrix products in the n domain, I cannot say. But, generalizing, it should be possible; it will just involve increasingly complex matrix properties and operations.
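For ordinary linear (time-invariant) convolution, by contrast, the matrix always exists and is Toeplitz, which is exactly what makes discrete convolution a linear transform. A sketch with a hand-rolled `convolution_matrix` (SciPy ships an equivalent `scipy.linalg.convolution_matrix`):

```python
import numpy as np

def convolution_matrix(h, n_in):
    """Toeplitz matrix T with T[i, j] = h[i - j], so T @ x == np.convolve(x, h)."""
    n_out = n_in + len(h) - 1
    T = np.zeros((n_out, n_in))
    for i in range(n_out):
        for j in range(n_in):
            if 0 <= i - j < len(h):
                T[i, j] = h[i - j]
    return T

h = np.array([1.0, -2.0, 1.0])
x = np.array([3.0, 1.0, 4.0, 1.0])
T = convolution_matrix(h, len(x))
assert np.allclose(T @ x, np.convolve(x, h))
```

Since `T` depends only on `h`, the map `x ↦ T @ x` is linear in `x` by construction, for any input length.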

**Cross-correlation:** Cross-correlation is not the same operation as convolution; it has a bunch of interesting properties of its own, which is why I do not like the two being mixed together. In signal processing it is mainly related to finding energy (auto-correlation) and has other applications in communication channels.

In statistics, it is related to finding the correlation (a misused term, since it can take any value, whereas the correlation coefficient lies between −1 and 1) between 2 stochastic processes at different time instants, where we have multiple samplings, or ensembles, of the signal over which the expectation is taken (see here). In CNNs, one way to see this would be: the image dataset is an ensemble of stochastic processes, the filter is another, fixed process, and moving the filter over the image is a time delay.

In the context of digital image processing, cross-correlation is just a sliding dot product, and hence it can be represented by matrices. Whether cross-correlation is linear in every setting is difficult to say (at least I don't know), but in the context of image processing it is linear. The bottom line is that it can definitely be implemented with matrices, since it is a dot product. The more intuitive way to view filters in the context of CNNs may be as a bunch of shared weights, for better regularization, rather than as cross-correlation or convolution.
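The sliding-dot-product view, and its linearity, can be sketched directly. The helper `cross_correlate2d` below is hand-rolled (the same "valid" operation a stride-1, no-padding CNN layer applies; SciPy's `scipy.signal.correlate2d` computes it too):

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel and take dot products."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(1)
a, b = rng.standard_normal((2, 5, 5))
k = rng.standard_normal((3, 3))

# Linearity: correlating a linear combination of images equals the same
# linear combination of the individual correlations.
lhs = cross_correlate2d(2.0 * a - 3.0 * b, k)
rhs = 2.0 * cross_correlate2d(a, k) - 3.0 * cross_correlate2d(b, k)
assert np.allclose(lhs, rhs)

# True 2D convolution is cross-correlation with the kernel flipped in both axes.
conv = cross_correlate2d(a, k[::-1, ::-1])
```

The flipped-kernel line makes the answer's point concrete: for a fixed filter the two operations differ only by that flip, so both are linear in the image.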