@szagoruyko (Contributor) commented May 17, 2019

I trained WRN-50-2 and WRN-101-2 with torchvision master, which now allows building WRN models with a simple width_per_group argument. I did not use the standard ResNet training procedure, though. Here are the differences:

  • SGD with a cosine learning rate schedule and warm restarts for 256 epochs (adds roughly 0.2% top-1 accuracy)
  • FP16 training with batchnorm kept in FP32, using apex O2

The checkpoints are therefore in torch.float16, which saves space.
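
For readers unfamiliar with the schedule mentioned above, here is a minimal sketch of a cosine learning rate with warm restarts (SGDR-style). The base learning rate, minimum learning rate, and restart period are illustrative assumptions; the PR only states that training ran for 256 epochs, not which values were used.

```python
import math


def sgdr_lr(epoch, base_lr=0.1, min_lr=0.0, period=64):
    """Cosine-annealed learning rate with warm restarts.

    base_lr, min_lr and period are hypothetical values chosen for
    illustration; they are not taken from this PR.
    """
    # Number of epochs elapsed since the most recent restart.
    t_cur = epoch % period
    # Cosine decay from base_lr towards min_lr within one period.
    cosine = 1.0 + math.cos(math.pi * t_cur / period)
    return min_lr + 0.5 * (base_lr - min_lr) * cosine


# One learning rate per epoch over 256 epochs (four restarts with period=64).
schedule = [sgdr_lr(epoch) for epoch in range(256)]
```

With these assumed values the rate starts at base_lr, decays along a cosine curve, and jumps back to base_lr every 64 epochs. In PyTorch itself, torch.optim.lr_scheduler.CosineAnnealingWarmRestarts provides a comparable schedule.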

model      top-1 error (%)  top-5 error (%)
WRN-50-2   21.49            5.91
WRN-101-2  21.16            5.72

I'm not sure whether we want these in torchvision. I could put them in wide-residual-networks instead.

@codecov-io commented May 17, 2019

Codecov Report

Merging #912 into master will increase coverage by 0.09%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #912      +/-   ##
==========================================
+ Coverage   63.87%   63.97%   +0.09%     
==========================================
  Files          66       66              
  Lines        5273     5279       +6     
  Branches      793      793              
==========================================
+ Hits         3368     3377       +9     
+ Misses       1673     1671       -2     
+ Partials      232      231       -1

Impacted Files                        Coverage Δ
torchvision/models/resnet.py          88.27% <100%> (+0.45%) ⬆️
torchvision/transforms/transforms.py  82.73% <0%> (+0.6%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8616227...fdcb388.

@fmassa (Member) left a comment

Thanks a lot for the PR Sergey!

This is almost good to merge.
Can you add an entry for it in https://github.com/pytorch/vision/blob/master/docs/source/models.rst, and include the accuracies in the table there?

Then I'll copy the model weights to the PyTorch website and let you know.

@szagoruyko (Contributor, Author) commented

Thanks for looking at it, @fmassa. I added:

  • docstrings for wide_resnet50_2 and wide_resnet101_2 functions (with explanation of diff vs ResNet)
  • docs at docs/source/models.rst
  • wide_resnet50_2 and wide_resnet101_2 are now properly imported
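
As a rough illustration of the difference versus ResNet mentioned in the list above, the sketch below reproduces the channel arithmetic of a bottleneck block for a given width_per_group. The helper name bottleneck_widths is hypothetical; the width rule follows my reading of torchvision's Bottleneck block and should be checked against resnet.py.

```python
def bottleneck_widths(planes, width_per_group=64, groups=1, expansion=4):
    """Return (inner_width, output_channels) for one bottleneck block.

    Hypothetical helper for illustration; it is not part of torchvision.
    """
    # The inner 3x3 convolution width scales with width_per_group.
    inner_width = int(planes * (width_per_group / 64.0)) * groups
    # The outer 1x1 convolutions keep the usual planes * expansion channels.
    output_channels = planes * expansion
    return inner_width, output_channels


# Base planes of the four residual stages in ResNet-50 style networks.
stage_planes = [64, 128, 256, 512]

for name, width_per_group in [("resnet50", 64), ("wide_resnet50_2", 128)]:
    widths = [bottleneck_widths(p, width_per_group) for p in stage_planes]
    print(name, widths)
```

Doubling width_per_group from 64 to 128 doubles only the inner bottleneck width, so the last stage goes from 2048-512-2048 to 2048-1024-2048 channels while the block outputs stay the same.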

@fmassa (Member) commented Jun 24, 2019

Sorry for the delay in replying.

I've uploaded the pre-trained weights to:

https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth
https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth

Can you update the URLs and fix the conflicts? Then it's good to go!
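
The weight filenames above end in a short hex suffix. Assuming this follows the torch.hub convention where the suffix is a prefix of the file's SHA-256 digest (an assumption, not something stated in this thread), a downloaded checkpoint could be checked with a sketch like the following. Both helper names are hypothetical.

```python
import hashlib
import re

# Matches the hex suffix in names such as "wide_resnet50_2-95faca4d.pth".
HASH_RE = re.compile(r"-([a-f0-9]+)\.pth$")


def expected_hash_prefix(url):
    """Extract the hex suffix from a checkpoint URL, or None if absent."""
    match = HASH_RE.search(url)
    return match.group(1) if match else None


def file_matches_prefix(path, prefix, chunk_size=1 << 20):
    """Check whether the SHA-256 digest of the file starts with prefix.

    Assumes the suffix is a SHA-256 prefix; this is not confirmed by the PR.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(prefix)
```

For example, expected_hash_prefix applied to the first URL above yields the string 95faca4d, which can then be passed to file_matches_prefix together with the local path of the downloaded file.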

@szagoruyko (Contributor, Author) commented

@fmassa done, updated links and rebased on master

@fmassa (Member) left a comment

LGTM, thanks!

Just waiting for CI to finish before merging the PR.

@fmassa fmassa merged commit 2b6da28 into pytorch:master Jun 26, 2019

@fmassa (Member) commented Jun 26, 2019

Thanks Sergey!
