Machine Learning. Neural Networks (Part 3): Convolutional Network Under the Microscope. Exploring the TensorFlow.js API

See also:

Machine Learning. Neural Networks (Part 1): The Perceptron Learning Process
Machine Learning. Neural Networks (Part 2): Modeling OR and XOR with TensorFlow.js

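In the previous articles we used only one type of neural network layer: the dense (fully connected) layer, in which every neuron is connected to all neurons of the previous layer.

For example, to process a 24x24 black-and-white image, we would have to convert its matrix representation into a vector of 24x24 = 576 elements. With such a transformation we lose an important property: the relative position of pixels along the vertical and horizontal axes. Besides, in most cases a pixel in the upper-left corner of the image has hardly any logically explainable influence on a pixel in the lower-right corner.

To eliminate these drawbacks, convolutional layers (CNN) are used for image processing.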

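The main purpose of a CNN is to extract small parts of the source image that contain characteristic features, such as edges, contours, arcs, or facets. At the next level of processing, more complex repeatable fragments of textures (circles, squares, and so on) can be recognized from these edges, and these in turn combine into even more complex textures (part of a face, a car wheel, and so on).

For example, consider a classic problem: recognizing images of digits. Each digit has its own set of typical shapes (circles, lines). In turn, each circle or line can be composed of smaller edges (Figure 1).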

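Figure 1 - How sequentially connected convolutional layers work, with the features extracted at each layer. Each next layer in a chain of CNN layers extracts more complex patterns based on the patterns recognized before it.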

1. Convolutional layer

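A CNN layer slides a small matrix of weights (a filter, or kernel) across the image and computes a convolution at each position. The kernel weights are what the CNN learns during training.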

For example, for a 2x2 kernel (matrix K) and a 2x2 image patch (matrix N), the convolution is computed as:

$\left[\begin{matrix}n_{11}&n_{12}\\n_{21}&n_{22}\\\end{matrix}\right]\ast\left[\begin{matrix}k_{11}&k_{12}\\k_{21}&k_{22}\\\end{matrix}\right]=n_{11}k_{11}+n_{12}k_{12}+n_{21}k_{21}+n_{22}k_{22}$

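In other words, matching elements are multiplied and the products are summed.

This is the same weighted sum that a neuron computes in fully-connected (dense) layers: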

$sum=\vec{X}^T\vec{W}=\sum_{i=1}^{4}x_iw_i=x_1w_1+x_2w_2+x_3w_3+x_4w_4$

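The difference is that a dense neuron is connected to every input, whereas a convolutional neuron is connected only to the small patch of the input covered by the kernel (and the same kernel weights are reused at every position).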
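The equivalence between the two formulas can be checked with a few lines of plain JavaScript. This is only an illustrative sketch: the helper names `convolvePatch` and `dot` and the sample values of N and K are assumptions made for the example, not part of the article or of TensorFlow.js.

```javascript
// Convolution of one image patch with a kernel:
// multiply matching elements and sum the products.
function convolvePatch(patch, kernel) {
  let sum = 0;
  for (let i = 0; i < kernel.length; i++) {
    for (let j = 0; j < kernel[i].length; j++) {
      sum += patch[i][j] * kernel[i][j];
    }
  }
  return sum;
}

// The same numbers seen as a dense neuron:
// flatten both matrices and take the dot product.
function dot(x, w) {
  return x.reduce((acc, xi, i) => acc + xi * w[i], 0);
}

const N = [[1, 2], [3, 4]];   // hypothetical 2x2 image patch
const K = [[1, 0], [0, -1]];  // hypothetical 2x2 kernel

const asConvolution = convolvePatch(N, K);      // 1*1 + 2*0 + 3*0 + 4*(-1)
const asDenseNeuron = dot(N.flat(), K.flat());  // same sum over flattened values

console.log(asConvolution, asDenseNeuron); // -3 -3
```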


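The kernel size (kernel size) is usually 3, 5, or 7.

If the kernel (kernel) has size [k_h, k_w] and the input image has size [n_h, n_w], then the output has the following size (Figure 3):

$c_w=n_w-k_w+1; c_h=n_h-k_h+1$

With this scheme the output is smaller than the input, and the more convolutional layers are stacked, the more the image shrinks. Pixels at the border also take part in fewer convolutions than pixels in the center, so information at the edges is underused.

To compensate, the input is padded (padding): extra rows and columns, usually filled with zeros, are added around the image. If p_h rows and p_w columns are added in total, the output size becomes:

$c_w=n_w+p_w-k_w+1; c_h=n_h+p_h-k_h+1$

To keep the output the same size as the input, the padding is usually chosen as:

$p_w=k_w-1; p_h=k_h-1$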

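In the examples above the kernel is shifted by one pixel at a time. The shift can be larger, and its size, set separately for the horizontal and vertical directions, is called the stride (stride).

With a horizontal stride s_w and a vertical stride s_h, the output size is: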

$c_w=\left \lfloor (n_w+p_w-k_w+s_w)/s_w \right \rfloor; c_h=\left \lfloor (n_h+p_h-k_h+s_h)/s_h \right \rfloor$
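The formula is easy to try out in code. The function below is a minimal sketch of it for one dimension; the name `convOutputSize` and the sample sizes are assumptions for illustration, and `p` is treated as the total padding along that dimension, as in the formula.

```javascript
// Output size along one dimension:
// floor((n + p - k + s) / s)
//   n - input size, k - kernel size,
//   p - total padding, s - stride
function convOutputSize(n, k, p = 0, s = 1) {
  return Math.floor((n + p - k + s) / s);
}

console.log(convOutputSize(9, 3));       // no padding, stride 1 -> 7
console.log(convOutputSize(9, 3, 2));    // padding p = k - 1 keeps the size -> 9
console.log(convOutputSize(9, 3, 2, 2)); // same padding, stride 2 -> 5
```

With p = 0 and s = 1 the expression reduces to n - k + 1, and with p = k - 1 and s = 1 it returns n, so the output keeps the input size.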

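A convolutional layer usually contains several filters, and each filter produces its own output channel (feature map). For example, suppose the first layer (CONV1) receives a 9x9x1 input (the last dimension is the number of channels, one for a black-and-white image) and has 2 filters with a 1x1 stride (stride) and padding (padding) chosen so that the output keeps the width and height of the input. The output of this layer is 9x9x2, where 2 is the number of filters (see Figure 6). The filters of the next layer, CONV2, are 2x2 in width and height, but because the input now has depth 2, each filter actually has size 2x2x2. The output of this layer (CONV2) is 9x9x4, where 4 is the number of filters.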

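In general, if a filter of size k_w x k_h is applied to an input of size n_w x n_h x n_d, where n_d is the number of input channels, the filter actually has size k_w x k_h x n_d (Figure 6, CONV2).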

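Figure 7 shows the computation for an input image with three RGB channels and a 3x3 filter. Because the input has depth 3 (3 channels), the filter has size 3x3x3.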
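The effect of the input depth on the filter size can be made concrete by counting weights. The sketch below is illustrative only: the helper names are hypothetical, the weight shape is written in the [height, width, inputChannels, filters] order used by the setWeights call in this article, and one bias value per filter is assumed.

```javascript
// Shape of the weight tensor of a convolutional layer.
function convWeightShape(kH, kW, inChannels, filters) {
  return [kH, kW, inChannels, filters];
}

// Number of trainable values: every filter spans all input channels,
// plus one bias per filter (an assumption of this sketch).
function convParamCount(kH, kW, inChannels, filters) {
  return kH * kW * inChannels * filters + filters;
}

// CONV2 from the example: 2x2 filters over a 2-channel input, 4 filters.
console.log(convWeightShape(2, 2, 2, 4)); // [ 2, 2, 2, 4 ]
console.log(convParamCount(2, 2, 2, 4));  // 2*2*2*4 + 4 = 36

// A single 3x3 filter over an RGB image is really 3x3x3.
console.log(convParamCount(3, 3, 3, 1));  // 27 + 1 = 28
```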

Figure 7 - Computation in a convolutional layer when the input image has three RGB channels

TensorFlow.js

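To create a convolutional layer, call tf.layers.conv2d. It takes a configuration object; its most important fields are:

- filters – number – the number of filters in the layer

- kernelSize – number | number[] – the size of the filter; a single number gives a square filter, an array sets the height and width separately

- strides – number | number[] – the stride; optional, defaults to [1,1], so the filter moves one pixel at a time in each direction

- padding – 'same', 'valid' – the zero-padding mode; defaults to 'valid'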

'same'

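In 'same' mode the input is padded with zeros so that the output size equals the input size divided by the stride (stride), rounded up. For example, if the input is 13 pixels wide and the stride is 5, then 13/5 = 2.6, which rounds up to 3 (Figure 8).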

Figure 8 - How the 'valid' and 'same' padding modes work with kernelSize=6 and strides=5

With stride=1 the output has the same size as the input (Figure 9); with a larger stride it is smaller (Figure 8).

Figure 9 - How the 'valid' and 'same' padding modes work with kernelSize=3 and strides=1

'valid'

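In 'valid' mode no padding is added, so if the kernel and strides do not fit the input exactly, the pixels left over at the border are dropped, as shown in Figure 8.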
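The two modes can be compared numerically. The sketch below follows the sizing convention documented for TensorFlow ('valid': ceil((n - k + 1) / s), 'same': ceil(n / s), with the extra padding cell going to the end when the total is odd). The function name `paddingModeOutput` is hypothetical, and the code assumes the kernel is not larger than the input.

```javascript
// Output size and padding along one dimension for the two padding modes.
//   n - input size, k - kernel size, s - stride
function paddingModeOutput(n, k, s, mode) {
  if (mode === 'valid') {
    // No padding: positions where the kernel does not fit are dropped.
    return { out: Math.ceil((n - k + 1) / s), padBefore: 0, padAfter: 0 };
  }
  // 'same': pad with zeros so that the output is ceil(n / s).
  const out = Math.ceil(n / s);
  const total = Math.max((out - 1) * s + k - n, 0);
  const padBefore = Math.floor(total / 2);
  return { out, padBefore, padAfter: total - padBefore };
}

// Input of 13, kernelSize = 6, strides = 5:
console.log(paddingModeOutput(13, 6, 5, 'valid')); // { out: 2, padBefore: 0, padAfter: 0 }
console.log(paddingModeOutput(13, 6, 5, 'same'));  // { out: 3, padBefore: 1, padAfter: 2 }

// kernelSize = 3, strides = 1: 'same' keeps the input size.
console.log(paddingModeOutput(9, 3, 1, 'same'));   // { out: 9, padBefore: 1, padAfter: 1 }
```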

TensorFlow.js

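Let us apply a convolutional layer in practice and use it to highlight edges in an image. We will use two filters:

- a filter that highlights vertical edges:

$\left[\begin{matrix}1&0&-1\\1&0&-1\\1&0&-1\\\end{matrix}\right]$

- a filter that highlights horizontal edges:

$\left[\begin{matrix}1&1&1\\0&0&0\\-1&-1&-1\\\end{matrix}\right]$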
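Before moving to TensorFlow.js, the behavior of these two filters can be checked on a tiny image in plain JavaScript. This is a sketch under stated assumptions: `conv2dValid` is a hypothetical helper that slides the kernel without padding and without flipping it (the usual behavior of CNN layers), and the 4x6 toy image is invented for the example.

```javascript
// Slide the kernel over a 2D image (no padding, stride 1) and
// return the matrix of weighted sums.
function conv2dValid(image, kernel) {
  const kH = kernel.length;
  const kW = kernel[0].length;
  const outH = image.length - kH + 1;
  const outW = image[0].length - kW + 1;
  const out = [];
  for (let y = 0; y < outH; y++) {
    const row = [];
    for (let x = 0; x < outW; x++) {
      let sum = 0;
      for (let i = 0; i < kH; i++) {
        for (let j = 0; j < kW; j++) {
          sum += image[y + i][x + j] * kernel[i][j];
        }
      }
      row.push(sum);
    }
    out.push(row);
  }
  return out;
}

const verticalEdges = [[1, 0, -1], [1, 0, -1], [1, 0, -1]];
const horizontalEdges = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]];

// Toy image: bright left half (1), dark right half (0),
// so it contains one vertical edge and no horizontal edges.
const toyImage = [
  [1, 1, 1, 0, 0, 0],
  [1, 1, 1, 0, 0, 0],
  [1, 1, 1, 0, 0, 0],
  [1, 1, 1, 0, 0, 0],
];

console.log(conv2dValid(toyImage, verticalEdges));   // [ [ 0, 3, 3, 0 ], [ 0, 3, 3, 0 ] ]
console.log(conv2dValid(toyImage, horizontalEdges)); // [ [ 0, 0, 0, 0 ], [ 0, 0, 0, 0 ] ]
```

The vertical-edge filter responds only where brightness changes from left to right, while the horizontal-edge filter stays at zero because every row of the toy image is identical.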

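First we load the image into a tensor with tf.browser.fromPixels. The page contains an img element with the source image and a canvas element where the result will be drawn.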

<img src="./sources/itechart.png" alt="Init image" id="target-image"/>
<canvas id="output-image-01"></canvas>

<script>
   const imgSource = document.getElementById('target-image');
   const image = tf.browser.fromPixels(imgSource, 1);
</script>

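Next we define a model that consists of a single convolutional layer with one 3x3 filter, 'same' padding, and the 'relu' activation: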

const model = tf.sequential({
    layers: [
        tf.layers.conv2d({
            inputShape: image.shape,
            filters: 1,
            kernelSize: 3,
            padding: 'same',
            activation: 'relu'
        })
    ]
});

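The model expects an input tensor of shape [NUM_SAMPLES, HEIGHT, WIDTH, CHANNEL], while tf.browser.fromPixels returns a tensor of shape [HEIGHT, WIDTH, CHANNEL], so we add one more dimension in front (its size is 1, since there is only one image):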

const input = image.reshape([1].concat(image.shape));

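By default the weights of the layer are initialized randomly. To set our own filter, we use the setWeights method of the Layer class and pass it the kernel and the bias: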

model.getLayer(null, 0).setWeights([
    tf.tensor([
         1,  1,  1,
         0,  0,  0,
        -1, -1, -1
    ], [3, 3, 1, 1]),
    tf.tensor([0])
]);

Now we pass the image through the model, scale the output to the 0-255 range, and remove the NUM_SAMPLES dimension:

const output = model.predict(input);

const max = output.max().arraySync();
const min = output.min().arraySync();

const outputImage = output.reshape(image.shape)
    .sub(min)
    .div(max - min)
    .mul(255)
    .cast('int32');

To draw the resulting tensor on the canvas, we use tf.browser.toPixels:

tf.browser.toPixels(outputImage, document.getElementById('output-image-01'));

2. Pooling layer

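The output of a convolutional layer (especially one with many filters) can be large, and much of it is redundant. To reduce the amount of data, a convolutional layer is usually followed by a pooling layer (pooling layer, subsample layer), which shrinks the feature maps. MaxPooling is the most commonly used kind.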

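It works as follows. A filter (kernel) moves across the input; with a 1x1 stride (stride) it shifts one cell at a time. At each position MaxPooling outputs the largest of the values covered by the filter (see Figure 10).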

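For example, for a 4x4 input, a 2x2 filter whose stride (stride) equals the filter size produces a 2x2 output, halving each dimension.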
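The 4x4 example can be reproduced with a short plain JavaScript sketch. The function name `maxPool2d` and the sample values are assumptions for illustration; the stride defaults to the pool size and no padding is applied.

```javascript
// Max pooling over a 2D matrix with a square window.
function maxPool2d(input, poolSize, stride = poolSize) {
  const outH = Math.floor((input.length - poolSize) / stride) + 1;
  const outW = Math.floor((input[0].length - poolSize) / stride) + 1;
  const out = [];
  for (let y = 0; y < outH; y++) {
    const row = [];
    for (let x = 0; x < outW; x++) {
      let best = -Infinity;
      // Take the largest value inside the current window.
      for (let i = 0; i < poolSize; i++) {
        for (let j = 0; j < poolSize; j++) {
          best = Math.max(best, input[y * stride + i][x * stride + j]);
        }
      }
      row.push(best);
    }
    out.push(row);
  }
  return out;
}

const featureMap = [
  [1, 3, 2, 4],
  [5, 6, 1, 2],
  [7, 2, 9, 0],
  [1, 8, 3, 4],
];

console.log(maxPool2d(featureMap, 2)); // [ [ 6, 4 ], [ 8, 9 ] ]
```

Calling `maxPool2d(featureMap, 2, 1)` instead moves the window one cell at a time and gives a 3x3 output.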

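Besides reducing the amount of data, pooling smooths out small spatial shifts (Figure 11). If the input image is shifted slightly, the output of the MaxPooling layer changes much less than the input does. This property is called translation invariance (translation invariance). In the illustrated example, the MaxPooling outputs before and after the shift coincide by 50%. The smaller the shift relative to the window, the more of the MaxPooling output stays the same.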

Figure 11 - Smoothing of spatial shifts after the MaxPooling layer

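A similar reduction in size can also be achieved without a pooling layer, by increasing the stride (stride) of the convolutional layer.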

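Besides MaxPooling there is AveragePooling, which works the same way but takes the average of the values in the window instead of the maximum. In practice, MaxPooling is used more often. AveragePooling smooths the features, whereas MaxPooling keeps the most pronounced ones.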
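The difference between the two kinds of pooling shows up on a single window. In this minimal sketch the helper `poolWindow` and the sample values are hypothetical.

```javascript
// Reduce the values of one pooling window to a single number.
function poolWindow(values, mode) {
  if (mode === 'max') {
    return Math.max(...values);
  }
  // 'average': arithmetic mean of the window
  return values.reduce((a, b) => a + b, 0) / values.length;
}

// A 2x2 window (flattened) with one strong activation.
const windowValues = [0, 0, 8, 0];

console.log(poolWindow(windowValues, 'max'));     // 8
console.log(poolWindow(windowValues, 'average')); // 2
```

The maximum keeps the strong activation intact, while the average dilutes it with the surrounding zeros.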

TensorFlow.js (pooling layer)

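To create a pooling layer, use tf.layers.maxPooling2d or tf.layers.averagePooling2d. Each takes a configuration object; its most important fields are:

- poolSize – number | number[] – the size of the filter; a single number gives a square filter, an array sets the height and width separately

- strides – number | number[] – the stride; optional, defaults to the specified poolSize

- padding – 'same', 'valid' – the zero-padding mode; defaults to 'valid'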
