
Detailed Eval Results (Attention-Related Models)

Heda Wang edited this page Nov 3, 2017 · 1 revision

ShowAndTell Advanced Model (Semantic and Visual Attention) finetune with decay

Semantic attention here is the word-hashed variant. The model is first trained for 105k steps, then trained with train_inception_with_decay to 600k steps, initlr=1.0, decay=0.6.

step Bleu_4 CIDEr METEOR ROUGE_L
107001 0.2638 0.5634 0.2726 0.5011
120053 0.3978 1.0530 0.3307 0.5805
140132 0.4233 1.1623 0.3475 0.6029
161037 0.4408 1.2202 0.3524 0.6113
180265 0.5000 1.4703 0.3783 0.6471
200351 0.4991 1.4741 0.3794 0.6467
221452 0.4823 1.3863 0.3664 0.6336
240649 0.5144 1.5623 0.3851 0.6557
260617 0.5163 1.5325 0.3880 0.6560
281553 0.5354 1.6334 0.3953 0.6670
300457 0.5403 1.6507 0.3962 0.6694
320328 0.5432 1.6586 0.3983 0.6726
341305 0.5571 1.7124 0.4042 0.6803
361672 0.5533 1.7025 0.4043 0.6777
381049 0.5517 1.7009 0.4013 0.6763
400866 0.5582 1.7284 0.4059 0.6810
421739 0.5571 1.7205 0.4052 0.6798
440685 0.5588 1.7378 0.4071 0.6828
460547 0.5644 1.7443 0.4085 0.6837
481648 0.5641 1.7476 0.4081 0.6839
500313 0.5610 1.7364 0.4066 0.6820
521550 0.5642 1.7599 0.4082 0.6838
541261 0.5640 1.7546 0.4078 0.6836
560138 0.5645 1.7573 0.4084 0.6844
580973 0.5647 1.7596 0.4086 0.6843
600000 0.5620 1.7554 0.4090 0.6841
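
The scores in this table change little over the last 100k steps, so picking a checkpoint by eye is error-prone. A minimal sketch of selecting the best checkpoint programmatically, assuming the rows are available as whitespace-separated text. The helper names `parse_rows` and `best_checkpoint`, and the choice of CIDEr as the default selection metric, are illustrative and not part of the original evaluation scripts. Only the last six rows of the table are reproduced.

```python
from typing import Dict, List

# Column names are an assumption for this sketch; the first column of the
# table holds the training step of the evaluated checkpoint.
COLUMNS = ("step", "Bleu_4", "CIDEr", "METEOR", "ROUGE_L")

# Last six rows copied from the table above.
ROWS_TEXT = """\
500313 0.5610 1.7364 0.4066 0.6820
521550 0.5642 1.7599 0.4082 0.6838
541261 0.5640 1.7546 0.4078 0.6836
560138 0.5645 1.7573 0.4084 0.6844
580973 0.5647 1.7596 0.4086 0.6843
600000 0.5620 1.7554 0.4090 0.6841
"""


def parse_rows(text: str) -> List[Dict[str, float]]:
    """Turn whitespace-separated rows into dicts keyed by column name."""
    rows = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def best_checkpoint(rows: List[Dict[str, float]], metric: str = "CIDEr") -> Dict[str, float]:
    """Return the row with the highest value of `metric`."""
    return max(rows, key=lambda r: r[metric])


rows = parse_rows(ROWS_TEXT)
best = best_checkpoint(rows)
print(int(best["step"]), best["CIDEr"])
```

Passing `metric="Bleu_4"` selects a different checkpoint from the same rows, which is a reminder that the choice of metric matters when the differences are this small.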

ShowAndTell Advanced Model (Visual Attention with Lexical Embedding) finetune with decay

POS-tag lexical embedding, 2-layer LSTM, visual attention. First trained for 105k steps, then fine-tuned with decay to 600k steps, initlr=1.0, decay=0.6.

step Bleu_4 CIDEr METEOR ROUGE_L
106444 0.1572 0.2445 0.2275 0.4225
120285 0.4341 1.1887 0.3544 0.6082
140008 0.4307 1.1691 0.3491 0.6058
161137 0.4396 1.2197 0.3511 0.6100
180322 0.4971 1.4785 0.3786 0.6475
200463 0.4903 1.4322 0.3792 0.6427
221334 0.4948 1.4457 0.3793 0.6451
240785 0.5174 1.5341 0.3850 0.6573
260122 0.5150 1.5426 0.3877 0.6565
280134 0.5314 1.6125 0.3925 0.6663
300901 0.5384 1.6313 0.3956 0.6689
320198 0.5364 1.6359 0.3977 0.6686
340154 0.5469 1.6830 0.4002 0.6753
360815 0.5469 1.6758 0.4003 0.6757
380105 0.5498 1.6942 0.4014 0.6781
400157 0.5573 1.7235 0.4051 0.6823
420870 0.5560 1.7224 0.4050 0.6805
440042 0.5555 1.7293 0.4055 0.6810
461377 0.5589 1.7350 0.4060 0.6828
480581 0.5569 1.7328 0.4080 0.6836
500472 0.5583 1.7315 0.4062 0.6824
521185 0.5612 1.7414 0.4071 0.6842
540437 0.5609 1.7391 0.4070 0.6837
560336 0.5611 1.7437 0.4075 0.6840
580885 0.5611 1.7469 0.4078 0.6841
600000 0.5592 1.7408 0.4074 0.6835

ShowAndTell Advanced Model (Visual Attention and Lexical Embedding) train-inception-with-decay

From scratch, initlr=1.0, decay=0.6. POS tags and characters are used as the lexical embedding. Train to 600k steps.

step Bleu_4 CIDEr METEOR ROUGE_L
662 0.1006 0.0782 0.1988 0.3861
20444 0.3686 0.9388 0.3247 0.5674
40285 0.4090 1.0830 0.3431 0.5937
60787 0.4251 1.1379 0.3460 0.6005
80579 0.4212 1.1714 0.3453 0.6002
101101 0.4387 1.1698 0.3543 0.6085
120180 0.4804 1.3721 0.3718 0.6345
140363 0.4522 1.2478 0.3532 0.6149
160288 0.4805 1.3862 0.3733 0.6315
180557 0.5082 1.5104 0.3818 0.6523
200282 0.5153 1.5135 0.3845 0.6544
220415 0.5132 1.5177 0.3854 0.6556
240764 0.5206 1.5473 0.3859 0.6576
260994 0.5221 1.5665 0.3888 0.6597
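
The runs on this page are described only by `initlr` and `decay`; the decay interval, and whether the decay is stepwise or continuous, are not stated. The following is a sketch of one common reading, staircase exponential decay, and not necessarily what the training script does. The function name and the `decay_steps` value of 100,000 are purely hypothetical.

```python
def decayed_learning_rate(step: int,
                          init_lr: float = 1.0,
                          decay: float = 0.6,
                          decay_steps: int = 100_000) -> float:
    """Staircase exponential decay.

    The rate is multiplied by `decay` once every `decay_steps` steps.
    `decay_steps` is a hypothetical value; the page does not give it.
    """
    return init_lr * decay ** (step // decay_steps)


for step in (0, 100_000, 300_000, 600_000):
    print(step, decayed_learning_rate(step))
```

Under this assumption the learning rate stays constant within each interval and drops by a factor of 0.6 at each boundary.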

ShowAndTell Advanced Model (Semantic Attention) top-k word prediction

From scratch, initlr=1.0, decay=0.6, to 600k steps. POS tags and characters are used for semantic attention. Embedding size is 256, POS-tag embedding size is 32, and char embedding size is 64.

step Bleu_4 CIDEr METEOR ROUGE_L
140392 0.4197 1.1379 0.3433 0.5992
160688 0.4517 1.2643 0.3633 0.6236
180498 0.4702 1.3399 0.3670 0.6312
200574 0.4843 1.3998 0.3731 0.6405
220920 0.4701 1.3529 0.3666 0.6301
240485 0.4980 1.4508 0.3800 0.6472
261336 0.5037 1.4850 0.3820 0.6528
280991 0.5161 1.5351 0.3888 0.6608
300618 0.5124 1.5300 0.3872 0.6573
320690 0.5194 1.5533 0.3886 0.6612
340050 0.5285 1.5827 0.3934 0.6663
360974 0.5289 1.5946 0.3936 0.6667
380127 0.5267 1.5974 0.3949 0.6666
400226 0.5314 1.6072 0.3942 0.6681
420000 0.5349 1.6174 0.3966 0.6707
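
The four runs on this page stop at different points, so their last rows are not directly comparable. A rough sketch of a fairer comparison, using the evaluation closest to 260k steps from each table, the last region all four runs cover. The short run labels are made up for this example, and the evaluations fall on slightly different steps, so the ranking is only indicative.

```python
# (label, step, Bleu_4, CIDEr, METEOR, ROUGE_L), copied from the tables on
# this page at the evaluation nearest 260k steps. Labels are illustrative.
NEAR_260K = [
    ("semantic+visual attention, finetune", 260617, 0.5163, 1.5325, 0.3880, 0.6560),
    ("visual attention + lexical, finetune", 260122, 0.5150, 1.5426, 0.3877, 0.6565),
    ("visual attention + lexical, scratch", 260994, 0.5221, 1.5665, 0.3888, 0.6597),
    ("semantic attention, top-k", 261336, 0.5037, 1.4850, 0.3820, 0.6528),
]

CIDER_INDEX = 3  # position of CIDEr in each tuple

# Sort runs from highest to lowest CIDEr.
ranking = sorted(NEAR_260K, key=lambda row: row[CIDER_INDEX], reverse=True)

for label, step, bleu4, cider, meteor, rouge_l in ranking:
    print(f"{label:40s} step={step} CIDEr={cider:.4f} Bleu_4={bleu4:.4f}")
```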