pytorch 状态字典:state_dict使用详解-侯体宗的博客

pytorch 状态字典:state_dict使用详解
Python / 管理员发布于 5年前 441

pytorch 中的 state_dict 是一个简单的python的字典对象,将每一层与它的对应参数建立映射关系.(如model的每一层的weights及偏置等等)

(注意,只有那些参数可以训练的layer才会被保存到模型的state_dict中,如卷积层,线性层等等)

优化器对象Optimizer也有一个state_dict,它包含了优化器的状态以及被使用的超参数(如lr, momentum,weight_decay等)

备注：

1) state_dict是在定义了model或optimizer之后pytorch自动生成的,可以直接调用.常用的保存state_dict的格式是".pt"或'.pth'的文件,即下面命令的 PATH="./***.pt"

torch.save(model.state_dict(), PATH)

2) load_state_dict 也是model或optimizer之后pytorch自动具备的函数,可以直接调用

model = TheModelClass(*args, **kwargs)model.load_state_dict(torch.load(PATH))model.eval()

注意：model.eval() 的重要性,在2)中最后用到了model.eval(),是因为,只有在执行该命令后,"dropout层"及"batch normalization层"才会进入 evalution 模态. 而在"训练(training)模态"与"评估(evalution)模态"下,这两层有不同的表现形式.

模态字典(state_dict)的保存(model是一个网络结构类的对象)

1.1)仅保存学习到的参数,用以下命令

 torch.save(model.state_dict(), PATH)

1.2)加载model.state_dict,用以下命令

 model = TheModelClass(*args, **kwargs) model.load_state_dict(torch.load(PATH)) model.eval()

备注：model.load_state_dict的操作对象是一个具体的对象,而不能是文件名

2.1)保存整个model的状态,用以下命令

torch.save(model,PATH)

2.2)加载整个model的状态,用以下命令:

   # Model class must be defined somewhere model = torch.load(PATH) model.eval()

state_dict 是一个python的字典格式,以字典的格式存储,然后以字典的格式被加载,而且只加载key匹配的项

如何仅加载某一层的训练的到的参数(某一层的state)

If you want to load parameters from one layer to another, but some keys do not match, simply change the name of the parameter keys in the state_dict that you are loading to match the keys in the model that you are loading into.

conv1_weight_state = torch.load('./model_state_dict.pt')['conv1.weight']

加载模型参数后,如何设置某层某参数的"是否需要训练"(param.requires_grad)

for param in list(model.pretrained.parameters()): param.requires_grad = False

注意: requires_grad的操作对象是tensor.

疑问:能否直接对某个层直接之用requires_grad呢?例如:model.conv1.requires_grad=False

回答:经测试,不可以.model.conv1 没有requires_grad属性.

全部测试代码:

#-*-coding:utf-8-*-import torchimport torch.nn as nnimport torch.nn.functional as Fimport torch.optim as optim   # define modelclass TheModelClass(nn.Module): def __init__(self):  super(TheModelClass,self).__init__()  self.conv1 = nn.Conv2d(3,6,5)  self.pool = nn.MaxPool2d(2,2)  self.conv2 = nn.Conv2d(6,16,5)  self.fc1 = nn.Linear(16*5*5,120)  self.fc2 = nn.Linear(120,84)  self.fc3 = nn.Linear(84,10)  def forward(self,x):  x = self.pool(F.relu(self.conv1(x)))  x = self.pool(F.relu(self.conv2(x)))  x = x.view(-1,16*5*5)  x = F.relu(self.fc1(x))  x = F.relu(self.fc2(x))  x = self.fc3(x)  return x # initial modelmodel = TheModelClass() #initialize the optimizeroptimizer = optim.SGD(model.parameters(),lr=0.001,momentum=0.9) # print the model's state_dictprint("model's state_dict:")for param_tensor in model.state_dict(): print(param_tensor,'\t',model.state_dict()[param_tensor].size()) print("\noptimizer's state_dict")for var_name in optimizer.state_dict(): print(var_name,'\t',optimizer.state_dict()[var_name]) print("\nprint particular param")print('\n',model.conv1.weight.size())print('\n',model.conv1.weight) print("------------------------------------")torch.save(model.state_dict(),'./model_state_dict.pt')# model_2 = TheModelClass()# model_2.load_state_dict(torch.load('./model_state_dict'))# model.eval()# print('\n',model_2.conv1.weight)# print((model_2.conv1.weight == model.conv1.weight).size())## 仅仅加载某一层的参数conv1_weight_state = torch.load('./model_state_dict.pt')['conv1.weight']print(conv1_weight_state==model.conv1.weight) model_2 = TheModelClass()model_2.load_state_dict(torch.load('./model_state_dict.pt'))model_2.conv1.requires_grad=Falseprint(model_2.conv1.requires_grad)print(model_2.conv1.bias.requires_grad)

以上这篇pytorch 状态字典:state_dict使用详解就是小编分享给大家的全部内容了，希望能给大家一个参考，也希望大家多多支持。

上一条：
Pytorch 计算误判率,计算准确率,计算召回率的例子
下一条：
基于Pycharm加载多个项目过程图解

0条评论 (评论内容有缓存机制,请悉知!)

最新最热

近期文章
在windows10中升级go版本至1.24后LiteIDE的Ctrl+左击无法跳转问题解决方案(0个评论)
智能合约Solidity学习CryptoZombie第四课:僵尸作战系统(0个评论)
智能合约Solidity学习CryptoZombie第三课:组建僵尸军队(高级Solidity理论)(0个评论)
智能合约Solidity学习CryptoZombie第二课:让你的僵尸猎食(0个评论)
智能合约Solidity学习CryptoZombie第一课:生成一只你的僵尸(0个评论)
在go中实现一个常用的先进先出的缓存淘汰算法示例代码(0个评论)
在go+gin中使用"github.com/skip2/go-qrcode"实现url转二维码功能(0个评论)
在go语言中使用api.geonames.org接口实现根据国际邮政编码获取地址信息功能(1个评论)
在go语言中使用github.com/signintech/gopdf实现生成pdf分页文件功能(0个评论)
gmail发邮件报错:534 5.7.9 Application-specific password required...解决方案(0个评论)

近期评论
122 在
学历：一种延缓就业设计，生活需求下的权衡之选中评论工作几年后，报名考研了，到现在还没认真学习备考，迷茫中。作为一名北漂互联网打工人..
123 在
Clash for Windows作者删库跑路了，github已404中评论按理说只要你在国内，所有的流量进出都在监控范围内，不管你怎么隐藏也没用，想搞你分..
原梓番博客在
在Laravel框架中使用模型Model分表最简单的方法中评论好久好久都没看友情链接申请了，今天刚看，已经添加。..
博主在
佛跳墙vpn软件不会用?上不了网?佛跳墙vpn常见问题以及解决办法中评论 @1111老铁这个不行了，可以看看近期评论的其他文章..
1111 在
佛跳墙vpn软件不会用?上不了网?佛跳墙vpn常见问题以及解决办法中评论网站不能打开，博主百忙中能否发个APP下载链接，佛跳墙或极光..

Top