I figured out how to cram GPT-2 1.5B onto a single TPU core with Adam optimizer #23
Comments
Also memory-saving gradients + checkpointing every layer.
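(For context: "checkpointing every layer" means recomputing each layer's activations during the backward pass instead of storing them all, so peak activation memory scales with one layer rather than all 48. The repo itself is TensorFlow; purely as an illustration of the idea, here is a minimal sketch using PyTorch's torch.utils.checkpoint with a stand-in block and shrunken dimensions. Nothing below is the repo's actual code.)

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """Stand-in transformer block -- the real GPT-2 block differs in detail."""
    def __init__(self, width, heads):
        super().__init__()
        self.ln1 = nn.LayerNorm(width)
        self.attn = nn.MultiheadAttention(width, heads, batch_first=True)
        self.ln2 = nn.LayerNorm(width)
        self.mlp = nn.Sequential(
            nn.Linear(width, 4 * width), nn.GELU(), nn.Linear(4 * width, width)
        )

    def forward(self, x):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, need_weights=False)
        x = x + a                          # residual around attention
        return x + self.mlp(self.ln2(x))   # residual around the MLP

# GPT-2 1.5B is width=1600, heads=25, n_layer=48; smaller numbers here so
# the sketch runs quickly on CPU.
width, heads, n_layer = 256, 8, 4
blocks = nn.ModuleList(Block(width, heads) for _ in range(n_layer))

def forward(h):
    for block in blocks:
        # checkpoint() drops the block's intermediate activations after the
        # forward pass and recomputes them during backward, so peak
        # activation memory is roughly one layer's worth instead of all of
        # them -- at the cost of about one extra forward pass per step.
        h = checkpoint(block, h, use_reentrant=False)
    return h

x = torch.randn(2, 16, width, requires_grad=True)
forward(x).sum().backward()               # gradients still flow through the checkpoints
```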
With and without these modifications, how much resource is needed to do a simple run against, say, the input text "I am very happy because this model is great!"? Whenever I try it, starting out with a CUDA-enabled GPU with 4 GB of memory mostly free, and 64 GB of general-purpose RAM mostly free, it always crashes with an OOM. At first I thought maybe it needs 256 GB like the training set? I have no idea, I'm just throwing numbers around...
The model itself is about 5.6 GB (1558M parameters × 4 bytes per float32), so your best bet is to sample from the model using a Colab notebook. They usually give you a GPU with 16 GB.
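To put rough numbers on that (a back-of-envelope estimate, ignoring activations and framework overhead; the parameter count is the only figure taken from the thread):

```python
# Rough back-of-envelope memory estimate for GPT-2 1.5B in float32.
params = 1_558_000_000               # 1558M parameters
bytes_per_param = 4                  # float32

weights = params * bytes_per_param   # the checkpoint itself
grads   = params * bytes_per_param   # one gradient per weight
adam_m  = params * bytes_per_param   # Adam first-moment estimate
adam_v  = params * bytes_per_param   # Adam second-moment estimate

gib = 1024 ** 3
print(f"weights only (sampling):   {weights / gib:.1f} GiB")                              # ~5.8 GiB
print(f"weights + Adam (training): {(weights + grads + adam_m + adam_v) / gib:.1f} GiB")  # ~23 GiB
```

So a 4 GB GPU cannot even hold the weights for sampling, and training with Adam needs roughly four times the weight memory before counting activations, which is why fitting the 1.5B model plus Adam onto a single TPU core is notable.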
It comes down to tensor shape. 2D = good, 3D = bad.
Relevant commit: shawwn/gpt-2@4d766e9
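(I can't say exactly what that commit changes, but the usual way to keep a projection "2-D" is to flatten the batch and sequence dimensions into one before the matmul and reshape back afterwards, rather than handing the compiler a batched 3-D matmul. A hypothetical TensorFlow sketch of that pattern, not taken from the commit; `project_2d` and all shapes here are made up for illustration:)

```python
import tensorflow as tf

def project_2d(x, w, b):
    """Apply a [n_in, n_out] projection to x of shape [batch, seq, n_in]
    as a strictly 2-D matmul by flattening the leading dimensions."""
    n_in, n_out = w.shape
    batch, seq = tf.shape(x)[0], tf.shape(x)[1]
    h = tf.reshape(x, [-1, n_in])             # [batch*seq, n_in] -- 2-D
    h = tf.matmul(h, w) + b                   # plain 2-D matmul
    return tf.reshape(h, [batch, seq, n_out])

# The 3-D alternative would lean on a batched / broadcast matmul instead:
#   h = tf.einsum('bsi,io->bso', x, w) + b

x = tf.random.normal([2, 8, 16])
w = tf.random.normal([16, 32])
b = tf.zeros([32])
print(project_2d(x, w, b).shape)              # (2, 8, 32)
```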