RL Training Parameter Experiments

limiao edited this page Dec 12, 2017 · 18 revisions

Directions

  • Whether to freeze the Image Model

limiao: so far, all runs keep the Image Model frozen.
heda: fine-tunes the Image Model.

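  • Try training with Adam

limiao: Adam is currently being tried with learning rates of 1e-2, 1e-3, 1e-4, and 1e-5. Both 1e-2 and 1e-3 are problematic: training does not converge.

Config 1: init_lr=1e-4, decay=0.6, decayed once every 8 epochs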

epoch Bleu_4 CIDEr METEOR ROUGE_L
643194 0.5345 1.6569 0.3905 0.6682
662220 0.5677 1.7897 0.4016 0.6843
681211 0.5647 1.8022 0.4020 0.6821
700125 0.5715 1.8214 0.4046 0.6857
720293 0.5736 1.8287 0.4039 0.6857
741736 0.5767 1.8431 0.4057 0.6879
760705 0.5797 1.8491 0.4061 0.6879
780968 0.5778 1.8439 0.4072 0.6887
802434 0.5779 1.8509 0.4066 0.6889
821201 0.5810 1.8578 0.4074 0.6887
841051 0.5812 1.8597 0.4077 0.6897
860083 0.5781 1.8594 0.4074 0.6893
880657 0.5787 1.8617 0.4069 0.6883
900000 0.5809 1.8604 0.4071 0.6890

Config 2: init_lr=1e-5, decay=0.6, decayed once every 8 epochs

epoch Bleu_4 CIDEr METEOR ROUGE_L
641934 0.5675 1.7896 0.4095 0.6870
660532 0.5711 1.8089 0.4045 0.6878
680054 0.5732 1.8246 0.4051 0.6879
700059 0.5742 1.8276 0.4050 0.6887
720200 0.5773 1.8405 0.4067 0.6898
740028 0.5784 1.8411 0.4066 0.6906
760831 0.5780 1.8379 0.4067 0.6902
780655 0.5776 1.8394 0.4065 0.6897
800441 0.5780 1.8399 0.4066 0.6896
820207 0.5799 1.8479 0.4071 0.6910
841853 0.5806 1.8521 0.4074 0.6909
860381 0.5791 1.8485 0.4072 0.6906
881337 0.5797 1.8461 0.4071 0.6910
900000 0.5804 1.8482 0.4074 0.6906

Config 3: init_lr=5e-5, decay=0.8, decayed once every 3 epochs

epoch Bleu_4 CIDEr METEOR ROUGE_L
642973 0.5602 1.7642 0.3975 0.6818
660958 0.5702 1.8143 0.4051 0.6864
680028 0.5768 1.8324 0.4054 0.6887
700438 0.5781 1.8390 0.4053 0.6891
719491 0.5788 1.8437 0.4065 0.6901
721901 0.5778 1.8427 0.4072 0.6900
739994 0.5789 1.8487 0.4066 0.6894
741193 0.5804 1.8541 0.4073 0.6903
761743 0.5793 1.8545 0.4072 0.6898
780236 0.5784 1.8461 0.4057 0.6886
800265 0.5818 1.8583 0.4074 0.6901
820801 0.5808 1.8567 0.4076 0.6901
840423 0.5825 1.8650 0.4077 0.6905
861866 0.5806 1.8603 0.4079 0.6903
880121 0.5828 1.8654 0.4084 0.6914
900000 0.5808 1.8623 0.4081 0.6902
920853 0.5857 1.8882 0.4108 0.6936
940510 0.5870 1.8902 0.4107 0.6939
961431 0.5859 1.8906 0.4108 0.6938
981077 0.5861 1.8897 0.4106 0.6934
1002022 0.5866 1.8898 0.4104 0.6934
1020615 0.5860 1.8901 0.4107 0.6934
1040198 0.5862 1.8915 0.4108 0.6938
1061261 0.5860 1.8911 0.4108 0.6938
1080903 0.5865 1.8919 0.4108 0.6942
1100000 0.5863 1.8922 0.4108 0.6939
1121657 0.5865 1.8923 0.4108 0.6939
1141293 0.5864 1.8919 0.4108 0.6939
1162079 0.5862 1.8917 0.4108 0.6939
1181385 0.5863 1.8917 0.4108 0.6939
1200151 0.5861 1.8913 0.4108 0.6938
1221827 0.5861 1.8911 0.4108 0.6938
1240831 0.5862 1.8914 0.4109 0.6939
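
The three configurations above differ only in their learning-rate schedule. A minimal sketch of that schedule follows, assuming a staircase decay (the rate is multiplied by `decay` once every `decay_every` epochs) and epochs counted from zero. The page does not say whether the decay is staircase or continuous, nor how the first table column maps to decay epochs, so both are assumptions. The names `decayed_lr`, `CONFIGS`, and `schedule` are hypothetical and not taken from the training code.

```python
def decayed_lr(init_lr, decay, decay_every, epoch):
    """Staircase decay: multiply by `decay` once every `decay_every` epochs.

    Assumption: `epoch` is a zero-based epoch index, not the counter
    shown in the first column of the tables.
    """
    return init_lr * decay ** (epoch // decay_every)


# Hyperparameters copied from Config 1-3; the dict layout is illustrative.
CONFIGS = {
    "config1": dict(init_lr=1e-4, decay=0.6, decay_every=8),
    "config2": dict(init_lr=1e-5, decay=0.6, decay_every=8),
    "config3": dict(init_lr=5e-5, decay=0.8, decay_every=3),
}


def schedule(name, num_epochs):
    """Return the learning rate for each of the first `num_epochs` epochs."""
    cfg = CONFIGS[name]
    return [decayed_lr(epoch=e, **cfg) for e in range(num_epochs)]


# Print the rate at every 8th epoch to compare how fast each config decays.
for name in CONFIGS:
    print(name, ["%.2e" % lr for lr in schedule(name, 25)[::8]])
```

Under these assumptions Config 3 starts lower than Config 1 but shrinks in smaller, more frequent steps, which makes it easier to compare the schedules side by side when reading the tables.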
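
  • Try different learning rates

limiao: SGD is currently being tried with two learning rates, 0.1 and 0.01, with the decay factor still at 0.6. Judging only from how the CIDEr score changes on the training set, a learning rate of 0.1 "seems" slightly better; this conclusion still needs further verification.

  • Try adding a Multi-Task loss

Fine-tune Image Model; SGD learning with decay, lr=1.0, decay=0.66; RL starts from 600000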

epoch Bleu_4 CIDEr METEOR ROUGE_L
500396 0.5617 1.7343 0.4076 0.6827
521053 0.5639 1.7425 0.4087 0.6839
541263 0.5638 1.7492 0.4078 0.6845
560347 0.5631 1.7471 0.4080 0.6836
580619 0.5644 1.7509 0.4085 0.6845
600000 0.5624 1.7435 0.4080 0.6832
602046 0.5499 1.7083 0.4005 0.6766
604529 0.5494 1.7087 0.3985 0.6761
606388 0.5490 1.7189 0.3963 0.6747
608213 0.5508 1.7101 0.3965 0.6749
610376 0.5486 1.7130 0.3955 0.6745
620474 0.5538 1.7318 0.3977 0.6776
640705 0.5586 1.7573 0.3990 0.6797
664577 0.5599 1.7562 0.3991 0.6806
680642 0.5658 1.7843 0.4012 0.6830
704343 0.5740 1.8129 0.4051 0.6871
720317 0.5733 1.8121 0.4045 0.6869
744213 0.5772 1.8250 0.4056 0.6888
760304 0.5767 1.8283 0.4057 0.6885
784556 0.5771 1.8326 0.4053 0.6889
792001 0.5786 1.8369 0.4062 0.6895
800000 0.5780 1.8362 0.4064 0.6894

beam search sample approximation RL

epoch Bleu_4 CIDEr METEOR ROUGE_L
642190 0.5775 1.8618 0.4105 0.6883
661032 0.5809 1.8665 0.4094 0.6916
680574 0.5799 1.8738 0.4086 0.6916
700271 0.5834 1.8748 0.4099 0.6917
725510 0.5825 1.8846 0.4111 0.6931
740265 0.5814 1.8898 0.4117 0.6924
760240 0.5825 1.8909 0.4116 0.6937
780165 0.5827 1.8936 0.4120 0.6941
800000 0.5851 1.8985 0.4121 0.6947
820638 0.5854 1.8981 0.4118 0.6939
860621 0.5852 1.9008 0.4119 0.6944
881359 0.5855 1.9015 0.4125 0.6948
900481 0.5845 1.9009 0.4121 0.6941
920828 0.5841 1.8994 0.4125 0.6947
940024 0.5837 1.9013 0.4123 0.6939
960276 0.5846 1.9008 0.4121 0.6938
980426 0.5840 1.9004 0.4121 0.6942
1000000 0.5835 1.8998 0.4124 0.6941
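
The page does not describe how the beam search sample approximation is implemented. One plausible reading, sketched here purely as an assumption, is that the K beam candidates stand in for samples from the model: their probabilities are renormalized over the beam, the expected reward under that distribution serves as the baseline, and each candidate's log-probability is weighted by its advantage. The function `beam_pg_loss` and its arguments are hypothetical, and NumPy is used only to show the arithmetic; in a real training graph the weights and advantages would be treated as constants so that only the log-probabilities carry gradient.

```python
import numpy as np


def beam_pg_loss(log_probs, rewards):
    """Surrogate policy-gradient loss estimated from K beam candidates.

    log_probs: shape (K,), sequence log-probability of each candidate.
    rewards:   shape (K,), e.g. the CIDEr score of each candidate.
    Assumption: this is an illustrative estimator, not the wiki's code.
    """
    log_probs = np.asarray(log_probs, dtype=np.float64)
    rewards = np.asarray(rewards, dtype=np.float64)

    # Renormalize the model probabilities over the beam (stable softmax).
    shifted = log_probs - log_probs.max()
    weights = np.exp(shifted) / np.exp(shifted).sum()

    # Baseline: expected reward under the renormalized beam distribution.
    baseline = float(np.dot(weights, rewards))
    advantages = rewards - baseline

    # Minimizing this (with weights and advantages held constant) follows
    # the gradient of the expected reward over the beam.
    loss = -float(np.sum(weights * advantages * log_probs))
    return loss, weights, advantages


# Example with three candidates; the numbers are made up for illustration.
loss, weights, advantages = beam_pg_loss(
    np.log([0.5, 0.3, 0.2]), [1.0, 2.0, 3.0]
)
print(loss, weights, advantages)
```

Because the baseline is the weighted mean reward, the weighted advantages sum to zero, so candidates that all score the same contribute no gradient.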