
RL Training Parameter Experiments

Heda Wang edited this page Nov 29, 2017 · 18 revisions

Directions to explore

  • Whether to freeze the Image Model

limiao: so far the Image Model has been frozen in every run.

heda: fine-tune the Image Model.
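One common way to freeze a sub-model is to hand the optimizer only the variables outside that sub-model. The sketch below is framework-agnostic and purely illustrative: the `image_model/` prefix and the variable names are hypothetical, not taken from the actual training code.

```python
def split_trainable(variable_names, frozen_prefix="image_model/"):
    """Split variable names into (trainable, frozen) by name prefix.

    `frozen_prefix` is a hypothetical scope name chosen for illustration.
    """
    frozen = [n for n in variable_names if n.startswith(frozen_prefix)]
    trainable = [n for n in variable_names if not n.startswith(frozen_prefix)]
    return trainable, frozen


# Hypothetical variable names for a captioning model.
all_variables = [
    "image_model/conv1/weights",
    "image_model/conv1/biases",
    "caption_model/lstm/kernel",
    "caption_model/logits/weights",
]

# Frozen Image Model: only the caption-model variables go to the optimizer.
trainable, frozen = split_trainable(all_variables)
print("trainable:", trainable)
print("frozen:", frozen)

# Fine-tuning the Image Model: an empty frozen set, so everything is trained.
finetune_trainable = list(all_variables)
```
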

  • Try training with Adam

limiao: Adam is currently being tried with learning rates of 1e-2, 1e-3, 1e-4, and 1e-5. Both 1e-2 and 1e-3 are problematic: training does not converge.

Config 1: init_lr=1e-4, decay=0.6, decayed once every 8 epochs

epoch Bleu_4 CIDEr METEOR ROUGE_L
643194 0.5345 1.6569 0.3905 0.6682
662220 0.5677 1.7897 0.4016 0.6843
681211 0.5647 1.8022 0.4020 0.6821
700125 0.5715 1.8214 0.4046 0.6857
720293 0.5736 1.8287 0.4039 0.6857
741736 0.5767 1.8431 0.4057 0.6879
760705 0.5797 1.8491 0.4061 0.6879
780968 0.5778 1.8439 0.4072 0.6887
802434 0.5779 1.8509 0.4066 0.6889
821201 0.5810 1.8578 0.4074 0.6887
841051 0.5812 1.8597 0.4077 0.6897
860083 0.5781 1.8594 0.4074 0.6893
880657 0.5787 1.8617 0.4069 0.6883
900000 0.5809 1.8604 0.4071 0.6890
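As a rough illustration of what these settings mean, the learning rate can be modeled as a staircase exponential decay. This is a minimal sketch that assumes the decay is applied in discrete steps (once per interval) rather than continuously; the function name `decayed_lr` is made up for this example and is not from the training code.

```python
def decayed_lr(init_lr, decay, epoch, decay_every):
    """Staircase decay: multiply the rate by `decay` once per `decay_every` epochs."""
    return init_lr * decay ** (epoch // decay_every)


# Config 1 from this page: init_lr=1e-4, decay=0.6, one decay every 8 epochs.
for epoch in (0, 7, 8, 16, 24):
    lr = decayed_lr(init_lr=1e-4, decay=0.6, epoch=epoch, decay_every=8)
    print(f"epoch {epoch:2d}: lr = {lr:.3e}")
```

The same function covers the other configurations by changing `init_lr`, `decay`, and `decay_every`.
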

Config 2: init_lr=1e-5, decay=0.6, decayed once every 8 epochs

epoch Bleu_4 CIDEr METEOR ROUGE_L
641934 0.5675 1.7896 0.4095 0.6870
660532 0.5711 1.8089 0.4045 0.6878
680054 0.5732 1.8246 0.4051 0.6879
700059 0.5742 1.8276 0.4050 0.6887
720200 0.5773 1.8405 0.4067 0.6898
740028 0.5784 1.8411 0.4066 0.6906
760831 0.5780 1.8379 0.4067 0.6902
780655 0.5776 1.8394 0.4065 0.6897
800441 0.5780 1.8399 0.4066 0.6896
820207 0.5799 1.8479 0.4071 0.6910
841853 0.5806 1.8521 0.4074 0.6909
860381 0.5791 1.8485 0.4072 0.6906
881337 0.5797 1.8461 0.4071 0.6910
900000 0.5804 1.8482 0.4074 0.6906

Config 3: init_lr=5e-5, decay=0.8, decayed once every 3 epochs

epoch Bleu_4 CIDEr METEOR ROUGE_L
642973 0.5602 1.7642 0.3975 0.6818
660958 0.5702 1.8143 0.4051 0.6864
680028 0.5768 1.8324 0.4054 0.6887
700438 0.5781 1.8390 0.4053 0.6891
719491 0.5788 1.8437 0.4065 0.6901
721901 0.5778 1.8427 0.4072 0.6900
739994 0.5789 1.8487 0.4066 0.6894
741193 0.5804 1.8541 0.4073 0.6903
761743 0.5793 1.8545 0.4072 0.6898
780236 0.5784 1.8461 0.4057 0.6886
800265 0.5818 1.8583 0.4074 0.6901
820801 0.5808 1.8567 0.4076 0.6901
840423 0.5825 1.8650 0.4077 0.6905
861866 0.5806 1.8603 0.4079 0.6903
880121 0.5828 1.8654 0.4084 0.6914
900000 0.5808 1.8623 0.4081 0.6902
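To compare the three Adam configurations at a glance, the final rows (checkpoint 900000) of the tables above can be parsed and ranked by CIDEr. This is only a small helper sketch; the `parse_row` function and the `config1` to `config3` keys are names introduced here for illustration.

```python
HEADER = "epoch Bleu_4 CIDEr METEOR ROUGE_L"

# Final row of each table above, copied verbatim.
FINAL_ROWS = {
    "config1": "900000 0.5809 1.8604 0.4071 0.6890",
    "config2": "900000 0.5804 1.8482 0.4074 0.6906",
    "config3": "900000 0.5808 1.8623 0.4081 0.6902",
}


def parse_row(header, row):
    """Turn one whitespace-separated result row into a dict keyed by column name."""
    keys = header.split()
    values = row.split()
    parsed = {keys[0]: int(values[0])}
    parsed.update({k: float(v) for k, v in zip(keys[1:], values[1:])})
    return parsed


final = {name: parse_row(HEADER, row) for name, row in FINAL_ROWS.items()}

# Rank configurations by CIDEr at the final checkpoint, best first.
ranking = sorted(final, key=lambda name: final[name]["CIDEr"], reverse=True)
for name in ranking:
    print(name, final[name]["CIDEr"])
```

At the final checkpoint the differences are small (CIDEr 1.8623, 1.8604, and 1.8482), so a single-checkpoint ranking should be read with caution.
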

  • Try different learning rates

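limiao: SGD is currently being tried with two learning rates, 0.1 and 0.01; the decay factor is still 0.6. Judging only from how the CIDEr score changes on the training set, 0.1 "seems" slightly better, but this conclusion still needs further verification.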

  • Try adding a Multi-Task loss

Fine-tune the Image Model; SGD with learning-rate decay, lr=1.0, decay=0.66; RL starts from 600000

epoch Bleu_4 CIDEr METEOR ROUGE_L
500396 0.5617 1.7343 0.4076 0.6827
521053 0.5639 1.7425 0.4087 0.6839
541263 0.5638 1.7492 0.4078 0.6845
560347 0.5631 1.7471 0.4080 0.6836
580619 0.5644 1.7509 0.4085 0.6845
600000 0.5624 1.7435 0.4080 0.6832
602046 0.5499 1.7083 0.4005 0.6766
604529 0.5494 1.7087 0.3985 0.6761
606388 0.5490 1.7189 0.3963 0.6747
608213 0.5508 1.7101 0.3965 0.6749
610376 0.5486 1.7130 0.3955 0.6745
612209 0.5491 1.7147 0.3962 0.6739
614071 0.5472 1.7197 0.3961 0.6742
616523 0.5509 1.7202 0.3956 0.6755
618034 0.5537 1.7303 0.3966 0.6771
620474 0.5538 1.7318 0.3977 0.6776
622296 0.5532 1.7372 0.3972 0.6776
624427 0.5559 1.7440 0.3973 0.6780
626226 0.5550 1.7452 0.3976 0.6780
628071 0.5561 1.7479 0.3983 0.6787
630541 0.5594 1.7490 0.3982 0.6791
632081 0.5587 1.7541 0.3987 0.6796
634226 0.5601 1.7549 0.3997 0.6801
640705 0.5586 1.7573 0.3990 0.6797
648463 0.5631 1.7620 0.3994 0.6818
656147 0.5603 1.7657 0.4005 0.6811
664577 0.5599 1.7562 0.3991 0.6806
672414 0.5653 1.7805 0.4003 0.6826
680642 0.5658 1.7843 0.4012 0.6830
688809 0.5666 1.7888 0.4023 0.6839
696679 0.5663 1.7870 0.4012 0.6835
704343 0.5740 1.8129 0.4051 0.6871
712363 0.5732 1.8137 0.4044 0.6871
720317 0.5733 1.8121 0.4045 0.6869
728089 0.5737 1.8151 0.4047 0.6873
736091 0.5753 1.8198 0.4049 0.6877
744213 0.5772 1.8250 0.4056 0.6888
752369 0.5756 1.8260 0.4057 0.6883
760304 0.5767 1.8283 0.4057 0.6885
768095 0.5771 1.8276 0.4053 0.6889
776060 0.5774 1.8327 0.4054 0.6888
784556 0.5771 1.8326 0.4053 0.6889
792001 0.5786 1.8369 0.4062 0.6895
800000 0.5780 1.8362 0.4064 0.6894
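The table shows CIDEr dropping right after RL starts at 600000 and then recovering. The sketch below quantifies that using the rows from 600000 to 626226 copied from the table; the variable names are introduced here for illustration only.

```python
RL_START = 600000

# (checkpoint, CIDEr) pairs copied from the table above.
cider_by_checkpoint = [
    (600000, 1.7435), (602046, 1.7083), (604529, 1.7087), (606388, 1.7189),
    (608213, 1.7101), (610376, 1.7130), (612209, 1.7147), (614071, 1.7197),
    (616523, 1.7202), (618034, 1.7303), (620474, 1.7318), (622296, 1.7372),
    (624427, 1.7440), (626226, 1.7452),
]

baseline = dict(cider_by_checkpoint)[RL_START]

# Change in CIDEr at the first checkpoint evaluated after RL begins.
first_after = next(c for step, c in cider_by_checkpoint if step > RL_START)
initial_drop = first_after - baseline

# First checkpoint after RL start whose CIDEr exceeds the value at RL start.
recovered_at = next(
    (step for step, c in cider_by_checkpoint if step > RL_START and c > baseline),
    None,
)

print(f"initial change: {initial_drop:+.4f}")
print(f"CIDEr exceeds the value at {RL_START} again at checkpoint {recovered_at}")
```
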