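Model training fails to converge in Python
Recently, while working on facial expression recognition, I ran into a problem: once the model was written and training started, it did not converge. The situation was as follows:
With a problem like this it is hard to know where to start, so I checked each stage step by step. I checked the model first and found nothing wrong with it. Part of the model code: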
def train_model():
    # Build the model ----------------------------------------------------------
    x = tf.placeholder(tf.float32, [None, 128, 128, 1])
    y_ = tf.placeholder(tf.int32, [None, ])
    y_out, logits = deepnn(x)
    loss = loss_value(y_out, y_)
    train_step = tf.train.AdamOptimizer(learning_rate=0.0005).minimize(loss)
    correct_prediction = tf.equal(tf.cast(tf.argmax(y_out, 1), tf.int32), y_)
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Next I checked the loss function and found nothing wrong with that part either. The loss function code:
def loss_value(logit, labels):
    labels = tf.cast(labels, tf.int64)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logit,
        labels=labels,
        name='cross_entropy_per_example')
    # Cross-entropy loss
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    # Weight loss: add the cross-entropy to the 'losses' collection, then sum the collection
    tf.add_to_collection('losses', cross_entropy_mean)
    return tf.add_n(tf.get_collection('losses'), name='total_loss')
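For reference, the per-example value this loss computes is -log(softmax(logits)[label]). That gives a useful number to compare against when training does not converge: a model that spreads probability evenly over all classes scores ln(num_classes), so a loss that stays near that value suggests the network is learning nothing from its inputs. The following is a pure-Python sketch of the formula, not TensorFlow's implementation. The function names and the class count of 7 are assumptions made for illustration, since the post does not state how many expression classes it uses.

```python
import math


def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy of one example: -log(softmax(logits)[label])."""
    # Subtract the largest logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def chance_level_loss(num_classes):
    """Loss of a model that gives every class the same probability."""
    return math.log(num_classes)


# num_classes = 7 is an assumed value for illustration only.
num_classes = 7
uniform_logits = [0.0] * num_classes
loss = sparse_softmax_cross_entropy(uniform_logits, label=3)
print(loss, chance_level_loss(num_classes))
```

If the printed training loss hovers around chance_level_loss(num_classes) from the first step onward, the input pipeline is a good place to look next.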
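Normally, once these two parts have been ruled out, the next thing to check is whether the data is being read correctly and whether it is actually being loaded at all. The data-loading code: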
def load_data(image_path):
    cate = [image_path+f for f in os.listdir(image_path) if os.path.isdir(image_path+f)]
    imgs = []
    labels = []
    for idx, folder in enumerate(cate):
        for im in glob.glob(folder+'/*.jpg'):
            print('reading the images:%s' % (im))
            img = io.imread(im)
            imgs.append(img)
            labels.append(idx)
    return np.asarray(imgs, np.float32), np.asarray(labels, np.int32)
image, label = load_data(image_path)
Running the data-loading code on its own showed that no data was being read in at all.
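A loader that silently returns nothing is easy to miss, so it can help to make the empty case fail loudly before training starts. Below is a minimal sketch that assumes the same layout as load_data, with one sub-folder per class containing .jpg files. The helper name summarize_dataset and its pattern parameter are hypothetical and not part of the original code.

```python
import glob
import os


def summarize_dataset(image_path, pattern="*.jpg"):
    """Return {class_folder_name: image_count}; raise if nothing can be loaded."""
    if not os.path.isdir(image_path):
        raise FileNotFoundError("image_path is not a directory: %r" % image_path)
    counts = {}
    for name in sorted(os.listdir(image_path)):
        folder = os.path.join(image_path, name)
        # Only sub-folders count as classes, mirroring load_data.
        if os.path.isdir(folder):
            counts[name] = len(glob.glob(os.path.join(folder, pattern)))
    if sum(counts.values()) == 0:
        raise ValueError("no images matching %r under %r" % (pattern, image_path))
    return counts
```

Calling summarize_dataset(image_path) before load_data shows at a glance whether the path is wrong, whether a class folder is empty, or whether the file extension does not match.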
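The main cause was that I had forgotten to set the path, image_path = 'XXX path'.
After adding the path there was still a problem, this time with the image labels: printing label showed that it was empty.
Fix: in the image-reading loop, add the following line, which resizes every image to the 128x128x1 shape that the model's input placeholder expects: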
img = transform.resize(img, (128, 128,1))
After adding it, print label as well to check that it is correct:
import glob
import os

import numpy as np
from skimage import io, transform

image_path = "D:/2.0/project0/data/"
# Define a function that reads the images and resizes each one to width*height
def load_data(image_path):
    cate = [image_path+f for f in os.listdir(image_path) if os.path.isdir(image_path+f)]
    imgs = []
    labels = []
    for idx, folder in enumerate(cate):
        for im in glob.glob(folder+'/*.jpg'):
            print('reading the images:%s' % (im))
            img = io.imread(im)
            # img = color.rgb2gray(img)
            img = transform.resize(img, (128, 128, 1))
            print(img.shape)
            imgs.append(img)
            labels.append(idx)
    return np.asarray(imgs, np.float32), np.asarray(labels, np.int32)
image, label = load_data(image_path)
print(label)
Printing the shape of img and the contents of label, as in the figure below, confirms that the fix is correct.
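Reading printed shapes by eye works for a small dataset. For a larger one, the same check can be automated. The sketch below takes the image shapes and labels and raises an error if anything would not fit the model's [None, 128, 128, 1] input. The helper name check_images_and_labels is hypothetical, and the default expected shape is taken from the placeholder in train_model.

```python
def check_images_and_labels(shapes, labels, expected_shape=(128, 128, 1)):
    """Raise ValueError if the loaded data cannot be fed to the model as-is."""
    if len(shapes) == 0:
        raise ValueError("no images were loaded")
    if len(shapes) != len(labels):
        raise ValueError("%d images but %d labels" % (len(shapes), len(labels)))
    for i, shape in enumerate(shapes):
        if tuple(shape) != tuple(expected_shape):
            raise ValueError("image %d has shape %r, expected %r"
                             % (i, tuple(shape), tuple(expected_shape)))
    # Count how many examples each class index received.
    counts = {}
    for lab in labels:
        counts[int(lab)] = counts.get(int(lab), 0) + 1
    return counts
```

With the arrays returned by load_data, a call such as check_images_and_labels([img.shape for img in image], label) returns the number of examples per class index, which also makes a heavily imbalanced or missing class easy to spot.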
Next, train the model again.
Now it works.