# Artificial Neural Networks - Perceptron Training

# Problem Description

The following training set is linearly separable:

| x | y | z | Output |
|---|---|---|--------|
| 1 | 0 | 0 | 1 |
| 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 0 |
| 0 | 0 | 1 | 0 |
| 1 | 0 | 1 | 1 |

Train a linear threshold unit on this training set by hand. Your unit has four inputs, including the input that implements the threshold. Set all initial weights to 0, and train the unit with the fixed-increment error-correction procedure until a solution is found.
Record the weight set after each training pass. Sketch a three-dimensional cube whose vertices are the inputs above, and sketch the separating plane determined by the final weight set.

# Training Procedure

# Initialization

Let the learning rate $\mu = 0.1$, the threshold $\theta = 0.0$, and the initial weights $\omega_1 = \omega_2 = \omega_3 = 0$.
The activation function is given in Equation (1).

$$ f_h(x)= \begin{cases} 1,&{x \geq 0} \\ 0,&{x < 0} \end{cases} \tag{1} $$

The output is given in Equation (2).

$$ o = f_h(\omega_1x+\omega_2y+\omega_3z+\theta) \tag{2} $$

The weight update rule is given in Equation (3), where $input$ is the corresponding input component and $\Delta\omega = \mu(t - o)$, with $t$ the target output.

$$ \omega' = \omega + \Delta\omega * input \\ \theta' = \theta + \Delta\omega \tag{3} $$
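Equations (1) through (3) can be sketched directly in Python (the function and variable names below are my own, not from the appendix code): $\Delta\omega$ is $+\mu$ when the target is 1 but the output is 0, $-\mu$ in the opposite case, and 0 when the output is already correct.

```python
MU = 0.1  # learning rate, mu in the text

def f_h(x):
    # Equation (1): threshold activation
    return 1 if x >= 0 else 0

def output(w, theta, vec):
    # Equation (2): o = f_h(w1*x + w2*y + w3*z + theta)
    return f_h(sum(wi * xi for wi, xi in zip(w, vec)) + theta)

def update(w, theta, vec, target):
    # Equation (3): fixed-increment error correction
    delta = MU * (target - output(w, theta, vec))  # +0.1, -0.1, or 0
    return [wi + delta * xi for wi, xi in zip(w, vec)], theta + delta

# one correction step on the sample (0, 1, 1) -> 0, starting from zero weights
w, theta = update([0.0, 0.0, 0.0], 0.0, (0, 1, 1), 0)
print(w, theta)  # -> [0.0, -0.1, -0.1] -0.1
```

This single step reproduces the first weight correction of the hand computation below.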

The perceptron is shown in Figure 1.

Figure 1: Perceptron diagram

# First Pass

Feed each input into the perceptron in turn and compute its output.

$$ \begin{aligned} &o|_{(1, 0, 0)}=f_h(0+0+0+0.0)=1, &\text{correct} \\ &o|_{(0, 1, 1)}=f_h(0+0+0+0.0)=1, &\text{should be 0} \\ \end{aligned} $$

Update the weights with $\Delta\omega=-0.1$: $\omega_1 = 0 + (-0.1) \times 0 = 0.0$, and likewise $\omega_2 = -0.1$, $\omega_3 = -0.1$, $\theta = -0.1$.

$$ \begin{aligned} &o|_{(1, 1, 0)}=f_h(0-0.1+0-0.1)=0, &\text{should be 1} \end{aligned} $$

Update the weights with $\Delta\omega=0.1$: $\omega_1 = 0.1, \omega_2 = 0.0, \omega_3 = -0.1, \theta = 0.0$.

$$ \begin{aligned} &o|_{(1, 1, 1)}=f_h(0.1+0-0.1+0.0)=1, &\text{should be 0} \end{aligned} $$

Update the weights with $\Delta\omega=-0.1$: $\omega_1 = 0.0, \omega_2 = -0.1, \omega_3 = -0.2, \theta = -0.1$.

$$ \begin{aligned} &o|_{(0, 0, 1)}=f_h(0+0-0.2-0.1)=0, &\text{correct} \\ &o|_{(1, 0, 1)}=f_h(0.0+0-0.2-0.1)=0, &\text{should be 1} \\ \end{aligned} $$

Update the weights with $\Delta\omega=0.1$: $\omega_1 = 0.1, \omega_2 = -0.1, \omega_3 = -0.1, \theta = 0.0$.
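The first pass above can be replayed mechanically. The small loop below (mine, not the appendix code) runs one pass over the six samples in the same order and prints the weights after each sample; it ends with $\omega = (0.1, -0.1, -0.1)$ and $\theta = 0.0$, matching the hand computation.

```python
def f_h(x):
    # threshold activation, Equation (1)
    return 1 if x >= 0 else 0

samples = [((1, 0, 0), 1), ((0, 1, 1), 0), ((1, 1, 0), 1),
           ((1, 1, 1), 0), ((0, 0, 1), 0), ((1, 0, 1), 1)]

w, theta, mu = [0.0, 0.0, 0.0], 0.0, 0.1
for vec, target in samples:  # one full pass, same order as the text
    o = f_h(sum(wi * xi for wi, xi in zip(w, vec)) + theta)
    delta = mu * (target - o)
    w = [wi + delta * xi for wi, xi in zip(w, vec)]
    theta += delta
    print(vec, o, target, w, theta)
```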

Figure 2: Perceptron after the first pass

# Second Pass

$$ \begin{aligned} &o|_{(1, 0, 0)}=f_h(0.1+0+0+0.0)=1, &\text{correct} \\ &o|_{(0, 1, 1)}=f_h(0-0.1-0.1+0.0)=0, &\text{correct} \\ &o|_{(1, 1, 0)}=f_h(0.1-0.1+0+0.0)=1, &\text{correct} \\ &o|_{(1, 1, 1)}=f_h(0.1-0.1-0.1+0.0)=0, &\text{correct} \\ &o|_{(0, 0, 1)}=f_h(0+0-0.1+0.0)=0, &\text{correct} \\ &o|_{(1, 0, 1)}=f_h(0.1+0-0.1+0.0)=1, &\text{correct} \\ \end{aligned} $$

Every sample is now classified correctly, so the perceptron has converged and training stops.
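Convergence can be confirmed in code: with $\omega = (0.1, -0.1, -0.1)$ and $\theta = 0.0$ held fixed, every sample is already classified correctly, so a further pass would change nothing. A small check (a sketch, not the appendix code):

```python
def f_h(x):
    # threshold activation, Equation (1)
    return 1 if x >= 0 else 0

w, theta = (0.1, -0.1, -0.1), 0.0
samples = [((1, 0, 0), 1), ((0, 1, 1), 0), ((1, 1, 0), 1),
           ((1, 1, 1), 0), ((0, 0, 1), 0), ((1, 0, 1), 1)]

for vec, target in samples:
    o = f_h(sum(wi * xi for wi, xi in zip(w, vec)) + theta)
    print(vec, o, target, o == target)  # every line ends with True
```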

Figure 3: Final perceptron diagram

# Conclusion

The separating plane determined by the final weight set is sketched in Figure 4. With $\omega = (0.1, -0.1, -0.1)$ and $\theta = 0.0$, the decision boundary is $0.1x - 0.1y - 0.1z = 0$, i.e. the plane $z = x - y$.

The sketch was produced with the following MATLAB script (the original omitted the `mesh` and `hold on` calls needed to draw the plane and keep both scatter plots on the same axes):

```matlab
clear all
x = -2:0.1:2;
y = -2:0.1:2;
[X, Y] = meshgrid(x, y);
Z = X - Y;                                 % separating plane z = x - y
mesh(X, Y, Z)
hold on
scatter3([1,1,1], [0,1,0], [0,0,1], 'rp')  % positive samples (output 1)
scatter3([0,1,0], [1,1,0], [1,1,1], 'k')   % negative samples (output 0)
hold off
```


Figure 4: Separating plane
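As a cross-check of the sketch, dividing the decision boundary by 0.1 gives the plane $x - y - z = 0$; the snippet below (mine, independent of the MATLAB script above) verifies which side of that plane each cube vertex from the training set falls on.

```python
positive = [(1, 0, 0), (1, 1, 0), (1, 0, 1)]  # samples with output 1
negative = [(0, 1, 1), (1, 1, 1), (0, 0, 1)]  # samples with output 0

for x, y, z in positive + negative:
    s = x - y - z  # signed side of the plane x - y - z = 0
    print((x, y, z), s, 'output 1' if s >= 0 else 'output 0')
```

All positive samples lie on the non-negative side and all negative samples on the strictly negative side, so the plane separates the two classes.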

# Appendix

# Perceptron Training Code

```python
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

# ----------model import----------

from functools import reduce

# ----------end model import----------


# ----------global variables----------


# ----------end global variables----------


# ----------function definition----------

def f(x):
    '''
    activation function
    '''
    return 1 if x >= 0 else 0

# ----------end function definition----------


# ----------class definition----------

class Perceptron(object):
    def __init__(self, input_num, activator):
        '''
        initiate
        '''
        self.activator = activator
        self.weights = [0.0] * input_num
        self.bias = 0.0

    def __str__(self):
        '''
        print weights and bias
        '''
        return 'weights: %s\nbias: %f\n' % (list(self.weights), self.bias)

    def predict(self, input_vec):
        '''
        predict function
        '''
        return self.activator(
            reduce(lambda a, b: a + b,
                   map(lambda x_w: x_w[0] * x_w[1],
                       zip(input_vec, self.weights)), 0.0) + self.bias)

    def train(self, input_vecs, labels, iteration, rate):
        '''
        train the perceptron and print weights and bias for each iteration
        '''
        for i in range(iteration):
            print('iterations: %d' % (i + 1))
            self._one_iteration(input_vecs, labels, rate)

    def _one_iteration(self, input_vecs, labels, rate):
        '''
        one pass over all samples
        '''
        samples = zip(input_vecs, labels)
        for (input_vec, label) in samples:
            output = self.predict(input_vec)
            self._update_weights(input_vec, output, label, rate)
            print(input_vec, output, label)
            print(self)

    def _update_weights(self, input_vec, output, label, rate):
        '''
        update weights
        '''
        delta = label - output
        self.weights = list(map(
            lambda x_w: x_w[1] + rate * delta * x_w[0],
            zip(input_vec, self.weights)))
        self.bias += rate * delta

# ----------end class definition----------


# ----------main function----------

if __name__ == '__main__':
    p = Perceptron(3, f)
    input_vecs = [[1, 0, 0], [0, 1, 1], [1, 1, 0],
                  [1, 1, 1], [0, 0, 1], [1, 0, 1]]
    labels = [1, 0, 1, 0, 0, 1]

    p.train(input_vecs, labels, 10, 0.1)

    for x, y, z in input_vecs:
        print('input: %d %d %d predict: %d' % (x, y, z, p.predict([x, y, z])))

# ----------end main function----------
```