In the U-Net CNN, what do the channels correspond to?

I'm studying the U-Net CNN architecture. I'm new to CNNs and am confused about the "number of channels". Referring to the U-Net diagram, the 572x572 input image is convolved with a 3x3 mask, which generates a 570x570 output. This output is then convolved again with a 3x3 mask to produce a 568x568 signal. However, what do the 64's correspond to? The U-Net description says something about a multi-channel feature map. But how does convolving an image with a 3x3 mask result in a "64"?

Answered by Amit jaisawal

In this example the input is a grayscale image of size 572x572 with 1 (gray) channel. The first convolution consists of 64 filters of size 3x3, each with 1 channel. The number of channels in a filter always matches the number of channels of the previous layer (here: the input). Each filter produces one feature map, so the output of the first convolution is 64 feature maps (channels), each of size 570x570. In the second convolution of this architecture, you again use 64 filters of size 3x3. This time each filter has 64 channels, matching the 64 feature maps/channels output by the previous layer. Each filter sums over all 64 input channels to produce a single feature map, so the output of the second convolution again consists of 64 feature maps, one per filter. The "64" is therefore the number of filters in the layer, not something a single 3x3 mask produces.
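The shape bookkeeping can be checked with a small NumPy sketch. This is only an illustration, not the U-Net implementation: `conv2d_valid` is a hypothetical helper written for this answer, the weights are random, and a 12x12 image stands in for the 572x572 input so it runs quickly. The arithmetic is the same, since each unpadded 3x3 convolution shrinks height and width by 2.

```python
import numpy as np

def conv2d_valid(x, w):
    """Unpadded multi-channel convolution (cross-correlation, as CNN layers compute it).

    x: input feature maps, shape (C_in, H, W)
    w: filter bank, shape (C_out, C_in, kH, kW)
    returns: output feature maps, shape (C_out, H - kH + 1, W - kW + 1)
    """
    c_out, c_in, kh, kw = w.shape
    assert x.shape[0] == c_in, "filter depth must match the input's channel count"
    # windows has shape (C_in, H - kH + 1, W - kW + 1, kH, kW)
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(1, 2))
    # Each output channel o sums over all input channels c and kernel positions (h, w).
    return np.einsum("cijhw,ochw->oij", windows, w)

rng = np.random.default_rng(0)
image = rng.standard_normal((1, 12, 12))   # 1 gray channel (12x12 stands in for 572x572)
w1 = rng.standard_normal((64, 1, 3, 3))    # 64 filters, each 3x3 with 1 channel
w2 = rng.standard_normal((64, 64, 3, 3))   # 64 filters, each 3x3 with 64 channels

out1 = conv2d_valid(image, w1)   # one feature map per filter
out2 = conv2d_valid(out1, w2)    # again one feature map per filter

print(out1.shape)  # (64, 10, 10)
print(out2.shape)  # (64, 8, 8)
```

With the real 572x572 input the same shapes would be (64, 570, 570) and (64, 568, 568): the leading 64 comes from the number of filters, while the spatial size comes from the 3x3 kernel without padding.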




