tensorboard


The visdom module was introduced earlier, but visdom cannot visualize a network's structure, while tensorboard can, so this post covers the tensorboard module. It is adapted from the help documentation for using tensorboard with PyTorch; the official GitHub repository is https://github.com/lanpa/tensorboardX

Visualizing with tensorboard takes four steps:

  • install tensorboard
  • create a writer instance
  • add the charts and other data you need
  • run the events log

I. Installing tensorboard

Two packages are needed: tensorboardX, and tensorboard (which serves the web UI). Install them with pip install tensorboardX and pip install tensorboard.

II. Creating a writer instance

Before adding text or creating charts, you first need to create a writer instance. Per the official documentation below, if the writer is created without arguments (like writer2), the logs go under the runs directory by default and the subfolder is named after the creation date; the subfolder name can also be set manually, as with writer and writer3.

```python
from tensorboardX import SummaryWriter  # SummaryWriter encapsulates everything

# creates writer object; the log will be saved in 'runs/exp-1'
writer = SummaryWriter('runs/exp-1')

# creates writer2 object with an auto-generated directory name,
# e.g. 'runs/Aug20-17-20-33'
writer2 = SummaryWriter()

# creates writer3 object with an auto-generated directory name; the comment
# is appended, e.g. 'runs/Aug20-17-20-33-3x learning rate'
writer3 = SummaryWriter(comment='3x learning rate')
```

III. Adding data

1. Logging scalars

(1) add_scalar

add_scalar(tag, scalar_value, global_step, walltime)

  • tag: data label
  • scalar_value: the value to record (the y-axis value)
  • global_step: the global step count (the x-axis value)
  • walltime: timestamp of the event, optional
```python
from tensorboardX import SummaryWriter
import time

writer = SummaryWriter("runs/scalar")
for i in range(100):
    time.sleep(0.1)
    writer.add_scalar('y=2x', i * 2, i, walltime=time.time())
writer.close()
```

After running this, go to the parent directory of runs, run tensorboard --logdir=runs/scalar, and open http://localhost:6006/ to see the visualization.

(2) add_scalars

add_scalars(main_tag, tag_scalar_dict, global_step, walltime)

  • main_tag: the parent label for the group
  • tag_scalar_dict: a dict mapping tags to values (the y-axis values)
  • global_step: the global step count (the x-axis value)
  • walltime: timestamp of the event, optional
```python
from tensorboardX import SummaryWriter
import numpy as np

writer = SummaryWriter()
r = 5
for i in range(100):
    # adds three values to the same scalar plot with the tag
    # 'run_14h' in TensorBoard's scalar section
    writer.add_scalars('run_14h', {'xsinx': i * np.sin(i / r),
                                   'xcosx': i * np.cos(i / r),
                                   'tanx': np.tan(i / r)}, i)
writer.close()
```

2. Logging text

add_text(tag, text_string, global_step, walltime)

  • tag: data label
  • text_string: the string to record
  • global_step: the global step count (the x-axis value)
  • walltime: timestamp of the event, optional
```python
from tensorboardX import SummaryWriter
import time

writer = SummaryWriter("runs/text")
for i in range(20):
    # the text dashboard shows at most ten entries per tag
    writer.add_text('text', "This is epoch {}".format(i), i)
    time.sleep(0.1)
writer.close()
```

3. Logging images

(1) add_image

add_image(tag, img_tensor, global_step, walltime, dataformats)

  • tag: label
  • img_tensor: a uint8 or float Tensor of shape [channel, height, width], where channel is 1, 3, or 4. The elements of img_tensor can have values in [0, 1] (float32) or [0, 255] (uint8); users are responsible for scaling the data to the correct range/type.
  • global_step: records the epoch
  • walltime: records the time
  • dataformats: the meaning of each tensor dimension, e.g. CHW for [channel, height, width]

img_tensor: the default shape is (3, H, W). You can use torchvision.utils.make_grid() to convert a batch of tensors into 3xHxW format, or use add_images() and let the library do the job. Tensors of shape (1, H, W), (H, W), or (H, W, 3) are also suitable as long as the corresponding dataformats argument is passed, e.g. CHW, HW, HWC.

```python
from tensorboardX import SummaryWriter
import numpy as np

img = np.zeros((3, 100, 100))
img[0] = np.arange(0, 10000).reshape(100, 100) / 10000
img[1] = 1 - np.arange(0, 10000).reshape(100, 100) / 10000

img_HWC = np.zeros((100, 100, 3))
img_HWC[:, :, 0] = np.arange(0, 10000).reshape(100, 100) / 10000
img_HWC[:, :, 1] = 1 - np.arange(0, 10000).reshape(100, 100) / 10000

writer = SummaryWriter("runs/image")
writer.add_image('my_image', img, 0)

# If you have a non-default dimension order, set the dataformats argument.
writer.add_image('my_image_HWC', img_HWC, 0, dataformats='HWC')
writer.close()
```

(2) add_images

add_images(tag: str, img_tensor, global_step, walltime, dataformats: Optional[str] = 'NCHW')

  • tag: label
  • img_tensor: a uint8 or float Tensor holding a batch of images, of shape [batch, channel, height, width], where channel is 1, 3, or 4. The elements can have values in [0, 1] (float32) or [0, 255] (uint8); users are responsible for scaling the data to the correct range/type.
  • global_step: records the epoch
  • walltime: records the time
  • dataformats: the meaning of each tensor dimension, e.g. NCHW for [number, channel, height, width]

img_tensor: the default shape is (N, 3, H, W). If dataformats is specified, other shapes are accepted, e.g. NCHW or NHWC.

```python
from tensorboardX import SummaryWriter
import numpy as np

img_batch = np.zeros((16, 3, 100, 100))
for i in range(16):
    img_batch[i, 0] = np.arange(0, 10000).reshape(100, 100) / 10000 / 16 * i
    img_batch[i, 1] = (1 - np.arange(0, 10000).reshape(100, 100) / 10000) / 16 * i

writer = SummaryWriter("runs/images")
writer.add_images('my_image_batch', img_batch, 0)
writer.close()
```

4. Adding a computation graph

add_graph(model, input_to_model=None, verbose=False)

  • model: the model to draw
  • input_to_model: a tensor or a tuple of tensors to feed the model
  • verbose: whether to print the graph structure to the console
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from tensorboardX import SummaryWriter


class Net1(nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)
        self.bn = nn.BatchNorm2d(20)

    def forward(self, x):
        x = F.max_pool2d(self.conv1(x), 2)
        x = F.relu(x) + F.relu(-x)
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = self.bn(x)
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        x = F.softmax(x, dim=1)
        return x


dummy_input = torch.rand(13, 1, 28, 28)

model = Net1()
# with a with statement, there is no need to call w.close()
# with SummaryWriter(comment='Net1') as w:
with SummaryWriter("runs/graph") as w:
    # first argument: the model to record
    # second argument: the model input, as a tuple
    w.add_graph(model, (dummy_input,), verbose=True)
```

IV. Running tensorboard

Once the program has written an events log, visualize it with tensorboard --logdir=<your_log_dir>. For example, if tensorboard/runs contains two folders, linear and graph, cd into the tensorboard folder and run tensorboard --logdir=runs to serve both; to serve only one of them, run tensorboard --logdir=runs/linear.

V. The official tensorboardX demo

To run it, place the mnist/MNIST dataset in the same directory as demo.py.

```python
import torch
import torchvision.utils as vutils
import numpy as np
import torchvision.models as models
from torchvision import datasets
from tensorboardX import SummaryWriter

resnet18 = models.resnet18(False)
writer = SummaryWriter("runs/demo")
sample_rate = 44100
freqs = [262, 294, 330, 349, 392, 440, 440, 440, 440, 440, 440]

for n_iter in range(100):
    dummy_s1 = torch.rand(1)
    dummy_s2 = torch.rand(1)
    # data grouping by `slash`
    writer.add_scalar('data/scalar1', dummy_s1[0], n_iter)
    writer.add_scalar('data/scalar2', dummy_s2[0], n_iter)

    writer.add_scalars('data/scalar_group', {'xsinx': n_iter * np.sin(n_iter),
                                             'xcosx': n_iter * np.cos(n_iter),
                                             'arctanx': np.arctan(n_iter)}, n_iter)

    dummy_img = torch.rand(32, 3, 64, 64)  # output from network
    if n_iter % 10 == 0:
        x = vutils.make_grid(dummy_img, normalize=True, scale_each=True)
        writer.add_image('Image', x, n_iter)

        dummy_audio = torch.zeros(sample_rate * 2)
        for i in range(x.size(0)):
            # amplitude of sound should be in [-1, 1]
            dummy_audio[i] = np.cos(freqs[n_iter // 10] * np.pi * float(i) / float(sample_rate))
        writer.add_audio('myAudio', dummy_audio, n_iter, sample_rate=sample_rate)

        writer.add_text('Text', 'text logged at step:' + str(n_iter), n_iter)

        for name, param in resnet18.named_parameters():
            writer.add_histogram(name, param.clone().cpu().data.numpy(), n_iter)

        # needs tensorboard 0.4RC or later
        writer.add_pr_curve('xoxo', np.random.randint(2, size=100), np.random.rand(100), n_iter)

dataset = datasets.MNIST('mnist', train=False, download=False)
images = dataset.data[:100].float()
label = dataset.targets[:100]

features = images.view(100, 784)
writer.add_embedding(features, metadata=label, label_img=images.unsqueeze(1))

# export scalar data to JSON for external processing
writer.export_scalars_to_json("./all_scalars.json")
writer.close()
```