Training:At training steps 100, training MLE loss is 2.1784725362062454, train CRF loss is 12.934807199835777 Training:At training steps 200, training MLE loss is 2.0982215950638055, train CRF loss is 12.300238718390466 Training:At training steps 300, training MLE loss is 2.0392028558254243, train CRF loss is 11.546043652296067 Training:At training steps 400, training MLE loss is 1.9793093436770142, train CRF loss is 10.874783922359347 Training:At training steps 500, training MLE loss is 1.926870346069336, train CRF loss is 10.30856076514721 Validation:At training steps 500, training MLE loss is 1.926870346069336, train CRF loss is 10.30856076514721, validation MLE loss is 1.7011445776412362, validation ppl is 5.48, validation CRF loss is 7.628421112110741, validation BLEU is 2.17 Training:At training steps 600, training MLE loss is 1.6453405766189098, train CRF loss is 7.60670139580965 Training:At training steps 700, training MLE loss is 1.63284099817276, train CRF loss is 7.421994224190712 Training:At training steps 800, training MLE loss is 1.6440962411959965, train CRF loss is 7.273698962877194 Training:At training steps 900, training MLE loss is 1.6543213388882578, train CRF loss is 7.1638311782479285 Training:At training steps 1000, training MLE loss is 1.662141635477543, train CRF loss is 7.0699912850856785 Validation:At training steps 1000, training MLE loss is 1.662141635477543, train CRF loss is 7.0699912850856785, validation MLE loss is 1.6142848832042593, validation ppl is 5.024, validation CRF loss is 6.718102128882157, validation BLEU is 13.77 Training:At training steps 1100, training MLE loss is 1.6860213121399283, train CRF loss is 6.558073410093784 Training:At training steps 1200, training MLE loss is 1.6879597918875515, train CRF loss is 6.476125329062342 Training:At training steps 1300, training MLE loss is 1.6901934181898832, train CRF loss is 6.400528302540382 Training:At training steps 1400, training MLE loss is 1.6874116013851017, train CRF 
loss is 6.323282389268279 Training:At training steps 1500, training MLE loss is 1.685257480673492, train CRF loss is 6.2425215153992175 Validation:At training steps 1500, training MLE loss is 1.685257480673492, train CRF loss is 6.2425215153992175, validation MLE loss is 1.6475643758711063, validation ppl is 5.194, validation CRF loss is 6.161781003600673, validation BLEU is 36.95 Training:At training steps 1600, training MLE loss is 1.649856363348663, train CRF loss is 5.814491910338401 Training:At training steps 1700, training MLE loss is 1.6471128077618777, train CRF loss is 5.770473006144166 Training:At training steps 1800, training MLE loss is 1.6595591168105601, train CRF loss is 5.715045610864957 Training:At training steps 1900, training MLE loss is 1.6628169665671886, train CRF loss is 5.660444211177528 Training:At training steps 2000, training MLE loss is 1.66742782895267, train CRF loss is 5.606660210967064 Validation:At training steps 2000, training MLE loss is 1.66742782895267, train CRF loss is 5.606660210967064, validation MLE loss is 1.764728539868405, validation ppl is 5.84, validation CRF loss is 5.423004655461562, validation BLEU is 41.7 Training:At training steps 2100, training MLE loss is 1.717916771657765, train CRF loss is 5.346369043886662 Training:At training steps 2200, training MLE loss is 1.6950387885607778, train CRF loss is 5.31025716394186 Training:At training steps 2300, training MLE loss is 1.6802360815927386, train CRF loss is 5.256107118974129 Training:At training steps 2400, training MLE loss is 1.6720353521779179, train CRF loss is 5.206417094878852 Training:At training steps 2500, training MLE loss is 1.671622773014009, train CRF loss is 5.165145996421575 Validation:At training steps 2500, training MLE loss is 1.671622773014009, train CRF loss is 5.165145996421575, validation MLE loss is 1.9742059707641602, validation ppl is 7.201, validation CRF loss is 5.105836190675435, validation BLEU is 43.74 Training:At training steps 
2600, training MLE loss is 1.6571612641215325, train CRF loss is 4.902831239402294 Training:At training steps 2700, training MLE loss is 1.6445588554814459, train CRF loss is 4.872643101140857 Training:At training steps 2800, training MLE loss is 1.6433387485146522, train CRF loss is 4.834612847864628 Training:At training steps 2900, training MLE loss is 1.6397519523371011, train CRF loss is 4.795379407741129 Training:At training steps 3000, training MLE loss is 1.6356980685442686, train CRF loss is 4.762988601058722 Validation:At training steps 3000, training MLE loss is 1.6356980685442686, train CRF loss is 4.762988601058722, validation MLE loss is 1.6534317606373836, validation ppl is 5.225, validation CRF loss is 4.8574807079214795, validation BLEU is 43.58 Training:At training steps 3100, training MLE loss is 1.6360853765904904, train CRF loss is 4.485567877143621 Training:At training steps 3200, training MLE loss is 1.6284693580120801, train CRF loss is 4.457970159947872 Training:At training steps 3300, training MLE loss is 1.6227223824088772, train CRF loss is 4.441106905713678 Training:At training steps 3400, training MLE loss is 1.6225302474945784, train CRF loss is 4.403005280159414 Training:At training steps 3500, training MLE loss is 1.6200870532393457, train CRF loss is 4.353320182204246 Validation:At training steps 3500, training MLE loss is 1.6200870532393457, train CRF loss is 4.353320182204246, validation MLE loss is 1.8913114964962006, validation ppl is 6.628, validation CRF loss is 4.564021948136781, validation BLEU is 45.8 Training:At training steps 3600, training MLE loss is 1.6040248465910554, train CRF loss is 4.044802532866597 Training:At training steps 3700, training MLE loss is 1.6018272586539388, train CRF loss is 4.0397713960707184 Training:At training steps 3800, training MLE loss is 1.5963242715224624, train CRF loss is 4.0197391292452815 Training:At training steps 3900, training MLE loss is 1.5921865471638739, train CRF loss is 
3.9750617011077702 Training:At training steps 4000, training MLE loss is 1.5853307132795453, train CRF loss is 3.940721734583378 Validation:At training steps 4000, training MLE loss is 1.5853307132795453, train CRF loss is 3.940721734583378, validation MLE loss is 2.1705432879297355, validation ppl is 8.763, validation CRF loss is 4.453233875726399, validation BLEU is 45.63 Training:At training steps 4100, training MLE loss is 1.5391007668152452, train CRF loss is 3.6718646658957006 Training:At training steps 4200, training MLE loss is 1.5435833856463432, train CRF loss is 3.615952889546752 Training:At training steps 4300, training MLE loss is 1.5476558462902903, train CRF loss is 3.591468147709966 Training:At training steps 4400, training MLE loss is 1.5343777189403773, train CRF loss is 3.553918971568346 Training:At training steps 4500, training MLE loss is 1.5221277109012008, train CRF loss is 3.520597518607974 Validation:At training steps 4500, training MLE loss is 1.5221277109012008, train CRF loss is 3.520597518607974, validation MLE loss is 2.176476232315365, validation ppl is 8.815, validation CRF loss is 4.167857879086545, validation BLEU is 48.68 Training:At training steps 4600, training MLE loss is 1.4212393000349401, train CRF loss is 3.2431894658505915 Training:At training steps 4700, training MLE loss is 1.4229580554924905, train CRF loss is 3.1964455591887235 Training:At training steps 4800, training MLE loss is 1.4213595109681287, train CRF loss is 3.161690682694316 Training:At training steps 4900, training MLE loss is 1.4253752730600535, train CRF loss is 3.1181843662355093 Training:At training steps 5000, training MLE loss is 1.4164781229123473, train CRF loss is 3.077226140663028 Validation:At training steps 5000, training MLE loss is 1.4164781229123473, train CRF loss is 3.077226140663028, validation MLE loss is 2.561228219615786, validation ppl is 12.952, validation CRF loss is 4.061976834347374, validation BLEU is 51.0 Training:At training 
steps 5100, training MLE loss is 1.3682063813507557, train CRF loss is 2.809998641014099 Training:At training steps 5200, training MLE loss is 1.3568593801930546, train CRF loss is 2.797075958047062 Training:At training steps 5300, training MLE loss is 1.3430961011039715, train CRF loss is 2.763538802030186 Training:At training steps 5400, training MLE loss is 1.3248187417769806, train CRF loss is 2.712593943467364 Training:At training steps 5500, training MLE loss is 1.3149318660907447, train CRF loss is 2.680907035868615 Validation:At training steps 5500, training MLE loss is 1.3149318660907447, train CRF loss is 2.680907035868615, validation MLE loss is 2.6075468753513538, validation ppl is 13.566, validation CRF loss is 4.010923859320189, validation BLEU is 51.85 Training:At training steps 5600, training MLE loss is 1.2547741066105664, train CRF loss is 2.462791693750769 Training:At training steps 5700, training MLE loss is 1.2338031995482743, train CRF loss is 2.4133313175290825 Training:At training steps 5800, training MLE loss is 1.2256498014740647, train CRF loss is 2.3840066698255638 Training:At training steps 5900, training MLE loss is 1.2132700395653955, train CRF loss is 2.347328111664392 Training:At training steps 6000, training MLE loss is 1.2074419733490795, train CRF loss is 2.3193258109502493 Validation:At training steps 6000, training MLE loss is 1.2074419733490795, train CRF loss is 2.3193258109502493, validation MLE loss is 3.0970318191929866, validation ppl is 22.132, validation CRF loss is 4.178590947075894, validation BLEU is 52.92 Training:At training steps 6100, training MLE loss is 1.148437958834693, train CRF loss is 2.145907217897475 Training:At training steps 6200, training MLE loss is 1.1234108827030287, train CRF loss is 2.098066014042124 Training:At training steps 6300, training MLE loss is 1.1075852538490047, train CRF loss is 2.0537152569927275 Training:At training steps 6400, training MLE loss is 1.098778738884721, train CRF loss 
is 2.0261904539656825 Training:At training steps 6500, training MLE loss is 1.0817914879024029, train CRF loss is 1.9971989628262818 Validation:At training steps 6500, training MLE loss is 1.0817914879024029, train CRF loss is 1.9971989628262818, validation MLE loss is 3.1845500437836898, validation ppl is 24.156, validation CRF loss is 4.200787600718047, validation BLEU is 55.56 Training:At training steps 6600, training MLE loss is 1.0046104171779007, train CRF loss is 1.8302255751192569 Training:At training steps 6700, training MLE loss is 0.9805403192806988, train CRF loss is 1.786233987575397 Training:At training steps 6800, training MLE loss is 0.9647217171732336, train CRF loss is 1.7629264024148386 Training:At training steps 6900, training MLE loss is 0.9564048809395171, train CRF loss is 1.7310621694475412 Training:At training steps 7000, training MLE loss is 0.9456058407053352, train CRF loss is 1.7042760421559215 Validation:At training steps 7000, training MLE loss is 0.9456058407053352, train CRF loss is 1.7042760421559215, validation MLE loss is 3.5988758378907253, validation ppl is 36.557, validation CRF loss is 4.369878254438701, validation BLEU is 55.46 Training:At training steps 7100, training MLE loss is 0.8745595926418901, train CRF loss is 1.5353786450391635 Training:At training steps 7200, training MLE loss is 0.8637740414449945, train CRF loss is 1.526656605375465 Training:At training steps 7300, training MLE loss is 0.8642973381187766, train CRF loss is 1.5084321306242297 Training:At training steps 7400, training MLE loss is 0.8620728149765636, train CRF loss is 1.4821778037201148 Training:At training steps 7500, training MLE loss is 0.8497014679973945, train CRF loss is 1.4562445906521753 Validation:At training steps 7500, training MLE loss is 0.8497014679973945, train CRF loss is 1.4562445906521753, validation MLE loss is 3.8774175738033496, validation ppl is 48.299, validation CRF loss is 4.473550234970293, validation BLEU is 56.48 
Training:At training steps 7600, training MLE loss is 0.7817190407169983, train CRF loss is 1.2995845413696951 Training:At training steps 7700, training MLE loss is 0.7713232061709278, train CRF loss is 1.283740378561197 Training:At training steps 7800, training MLE loss is 0.7612530946467693, train CRF loss is 1.2598780168849044 Training:At training steps 7900, training MLE loss is 0.7544622603757307, train CRF loss is 1.234763264853682 Training:At training steps 8000, training MLE loss is 0.749508782430552, train CRF loss is 1.213887683926383 Validation:At training steps 8000, training MLE loss is 0.749508782430552, train CRF loss is 1.213887683926383, validation MLE loss is 4.134637098563345, validation ppl is 62.467, validation CRF loss is 4.616520097381191, validation BLEU is 57.42 Training:At training steps 8100, training MLE loss is 0.7026818502834067, train CRF loss is 1.0739362951274962 Training:At training steps 8200, training MLE loss is 0.6899540470377542, train CRF loss is 1.0639628748688847 Training:At training steps 8300, training MLE loss is 0.6796083650008465, train CRF loss is 1.0466389678604902 Training:At training steps 8400, training MLE loss is 0.6697699141688644, train CRF loss is 1.0287308476210455 Training:At training steps 8500, training MLE loss is 0.6597233803421259, train CRF loss is 1.007862571743084 Validation:At training steps 8500, training MLE loss is 0.6597233803421259, train CRF loss is 1.007862571743084, validation MLE loss is 4.2928627760786755, validation ppl is 73.176, validation CRF loss is 4.9051362119222945, validation BLEU is 56.26 Training:At training steps 8600, training MLE loss is 0.6251462790905498, train CRF loss is 0.9242439621849917 Training:At training steps 8700, training MLE loss is 0.6101812116347719, train CRF loss is 0.8881829871854279 Training:At training steps 8800, training MLE loss is 0.6015447341487743, train CRF loss is 0.8656682099937462 Training:At training steps 8900, training MLE loss is 
0.5966220863728086, train CRF loss is 0.8501351060831803 Training:At training steps 9000, training MLE loss is 0.5868371502426453, train CRF loss is 0.8315429028719664 Validation:At training steps 9000, training MLE loss is 0.5868371502426453, train CRF loss is 0.8315429028719664, validation MLE loss is 4.624423397214789, validation ppl is 101.944, validation CRF loss is 5.085879824663463, validation BLEU is 56.55 Training:At training steps 9100, training MLE loss is 0.5432243651268073, train CRF loss is 0.7433415025146678 Training:At training steps 9200, training MLE loss is 0.5461192704923451, train CRF loss is 0.7240626674419036 Training:At training steps 9300, training MLE loss is 0.5424998440073493, train CRF loss is 0.7079840987469167 Training:At training steps 9400, training MLE loss is 0.5395023194732493, train CRF loss is 0.6914269380346013 Training:At training steps 9500, training MLE loss is 0.5275486172467936, train CRF loss is 0.6773570795102569 Validation:At training steps 9500, training MLE loss is 0.5275486172467936, train CRF loss is 0.6773570795102569, validation MLE loss is 4.863465105232439, validation ppl is 129.472, validation CRF loss is 5.159960232282939, validation BLEU is 55.36 Training:At training steps 9600, training MLE loss is 0.48099258104339243, train CRF loss is 0.5647649196404382 Training:At training steps 9700, training MLE loss is 0.47229832061188065, train CRF loss is 0.5556708462028473 Training:At training steps 9800, training MLE loss is 0.4702873572286141, train CRF loss is 0.5412080482273207 Training:At training steps 9900, training MLE loss is 0.46143423402289047, train CRF loss is 0.5281206803090754 Training:At training steps 10000, training MLE loss is 0.4555416349087609, train CRF loss is 0.5167036624765023 Validation:At training steps 10000, training MLE loss is 0.4555416349087609, train CRF loss is 0.5167036624765023, validation MLE loss is 5.264748996809909, validation ppl is 193.398, validation CRF loss is 
5.550742416005385, validation BLEU is 56.55 Training:At training steps 10100, training MLE loss is 0.4247302026383113, train CRF loss is 0.4552066841124906 Training:At training steps 10200, training MLE loss is 0.4133478711656062, train CRF loss is 0.4482459806257975 Training:At training steps 10300, training MLE loss is 0.4147153708092325, train CRF loss is 0.4382082056369836 Training:At training steps 10400, training MLE loss is 0.4114508805204241, train CRF loss is 0.425686695934146 Training:At training steps 10500, training MLE loss is 0.4075417493577697, train CRF loss is 0.41719312600653213 Validation:At training steps 10500, training MLE loss is 0.4075417493577697, train CRF loss is 0.41719312600653213, validation MLE loss is 5.318483948707581, validation ppl is 204.074, validation CRF loss is 5.541174521571712, validation BLEU is 58.7 Training:At training steps 10600, training MLE loss is 0.3699938302190276, train CRF loss is 0.3690635350090088 Training:At training steps 10700, training MLE loss is 0.3654350729330326, train CRF loss is 0.36312849806008674 Training:At training steps 10800, training MLE loss is 0.3579249342769617, train CRF loss is 0.3588216792953972 Training:At training steps 10900, training MLE loss is 0.3475873048710491, train CRF loss is 0.3457045231673601 Training:At training steps 11000, training MLE loss is 0.34709290405298815, train CRF loss is 0.3383693556598664 Validation:At training steps 11000, training MLE loss is 0.34709290405298815, train CRF loss is 0.3383693556598664, validation MLE loss is 5.622063484631087, validation ppl is 276.459, validation CRF loss is 5.85330164118817, validation BLEU is 58.39 Training:At training steps 11100, training MLE loss is 0.3272928967757616, train CRF loss is 0.2913328892843856 Training:At training steps 11200, training MLE loss is 0.31757975337037353, train CRF loss is 0.2867831326254964 Training:At training steps 11300, training MLE loss is 0.31277060515169675, train CRF loss is 
0.2816899385668512 Training:At training steps 11400, training MLE loss is 0.307950277968921, train CRF loss is 0.2758089397009826 Training:At training steps 11500, training MLE loss is 0.30890544422413224, train CRF loss is 0.26996647533129725 Validation:At training steps 11500, training MLE loss is 0.30890544422413224, train CRF loss is 0.26996647533129725, validation MLE loss is 5.841583399396193, validation ppl is 344.324, validation CRF loss is 5.895392342617638, validation BLEU is 56.88 Training:At training steps 11600, training MLE loss is 0.2881745820424112, train CRF loss is 0.24491363806271693 Training:At training steps 11700, training MLE loss is 0.29300937523252285, train CRF loss is 0.2399480794579813 Training:At training steps 11800, training MLE loss is 0.2943728491971221, train CRF loss is 0.24425970093803698 Training:At training steps 11900, training MLE loss is 0.28964038355135924, train CRF loss is 0.23905303974128855 Training:At training steps 12000, training MLE loss is 0.28682241859695934, train CRF loss is 0.23301122970025245 Validation:At training steps 12000, training MLE loss is 0.28682241859695934, train CRF loss is 0.23301122970025245, validation MLE loss is 6.121181839390805, validation ppl is 455.403, validation CRF loss is 6.000786436231513, validation BLEU is 58.83 Training:At training steps 12100, training MLE loss is 0.2773310349567328, train CRF loss is 0.20967423537481408 Training:At training steps 12200, training MLE loss is 0.26523506515848566, train CRF loss is 0.20374638494204647 Training:At training steps 12300, training MLE loss is 0.25673826232910263, train CRF loss is 0.19812067757966056 Training:At training steps 12400, training MLE loss is 0.2577871160298946, train CRF loss is 0.19648294928629412 Training:At training steps 12500, training MLE loss is 0.25290717545010555, train CRF loss is 0.19159533007063329 Validation:At training steps 12500, training MLE loss is 0.25290717545010555, train CRF loss is 
0.19159533007063329, validation MLE loss is 6.192658364772797, validation ppl is 489.145, validation CRF loss is 6.376368986932855, validation BLEU is 61.0 Training:At training steps 12600, training MLE loss is 0.23055440933261706, train CRF loss is 0.17231995385714982 Training:At training steps 12700, training MLE loss is 0.2293779178587647, train CRF loss is 0.16654334977430152 Training:At training steps 12800, training MLE loss is 0.21947098807768878, train CRF loss is 0.16401278129318597 Training:At training steps 12900, training MLE loss is 0.21445209587054706, train CRF loss is 0.1608177019982537 Training:At training steps 13000, training MLE loss is 0.21362188525735837, train CRF loss is 0.15905371428897752 Validation:At training steps 13000, training MLE loss is 0.21362188525735837, train CRF loss is 0.15905371428897752, validation MLE loss is 5.995989708524001, validation ppl is 401.814, validation CRF loss is 6.241055705045399, validation BLEU is 59.39 Training:At training steps 13100, training MLE loss is 0.20180555783415913, train CRF loss is 0.14134813913276958 Training:At training steps 13200, training MLE loss is 0.1955210721498679, train CRF loss is 0.1405336221260586 Training:At training steps 13300, training MLE loss is 0.1992277227852719, train CRF loss is 0.14390699789039899 Training:At training steps 13400, training MLE loss is 0.1992623662795131, train CRF loss is 0.14422308118859972 Training:At training steps 13500, training MLE loss is 0.1999259954685367, train CRF loss is 0.1418878367754769 Validation:At training steps 13500, training MLE loss is 0.1999259954685367, train CRF loss is 0.1418878367754769, validation MLE loss is 6.236251953401063, validation ppl is 510.94, validation CRF loss is 6.372128056852441, validation BLEU is 59.39 Training:At training steps 13600, training MLE loss is 0.1883058279097895, train CRF loss is 0.125663155334787 Training:At training steps 13700, training MLE loss is 0.19002788537301513, train CRF loss is 
0.1270971595252877 Training:At training steps 13800, training MLE loss is 0.18502907073395364, train CRF loss is 0.12442975914201801 Training:At training steps 13900, training MLE loss is 0.1820061389997113, train CRF loss is 0.12127588374995128 Training:At training steps 14000, training MLE loss is 0.1789755030403303, train CRF loss is 0.12046926028769962 Validation:At training steps 14000, training MLE loss is 0.1789755030403303, train CRF loss is 0.12046926028769962, validation MLE loss is 6.466351640851874, validation ppl is 643.133, validation CRF loss is 6.612944377096076, validation BLEU is 58.5 Training:At training steps 14100, training MLE loss is 0.15747668566562426, train CRF loss is 0.10318211679904607 Training:At training steps 14200, training MLE loss is 0.15883921808467677, train CRF loss is 0.10610216247380436 Training:At training steps 14300, training MLE loss is 0.15630672404250012, train CRF loss is 0.10377752934908206 Training:At training steps 14400, training MLE loss is 0.1559991072965613, train CRF loss is 0.10381915502338472 Training:At training steps 14500, training MLE loss is 0.15679464786147218, train CRF loss is 0.10301892519814737 Validation:At training steps 14500, training MLE loss is 0.15679464786147218, train CRF loss is 0.10301892519814737, validation MLE loss is 6.3078018207299085, validation ppl is 548.837, validation CRF loss is 6.540762302122618, validation BLEU is 60.44 Training:At training steps 14600, training MLE loss is 0.14836147996603358, train CRF loss is 0.09121755739179208 Training:At training steps 14700, training MLE loss is 0.14612986552433996, train CRF loss is 0.09022963507594568 Training:At training steps 14800, training MLE loss is 0.14520430684613833, train CRF loss is 0.0897221003162243 Training:At training steps 14900, training MLE loss is 0.14400671402339354, train CRF loss is 0.08938064130901921 Training:At training steps 15000, training MLE loss is 0.1400690131079841, train CRF loss is 
0.08755445748720377 Validation:At training steps 15000, training MLE loss is 0.1400690131079841, train CRF loss is 0.08755445748720377, validation MLE loss is 6.399616479873657, validation ppl is 601.614, validation CRF loss is 6.700088833507738, validation BLEU is 60.09 Training:At training steps 15100, training MLE loss is 0.12360495931567585, train CRF loss is 0.08529563566562047 Training:At training steps 15200, training MLE loss is 0.12195537306163146, train CRF loss is 0.08345298356503861 Training:At training steps 15300, training MLE loss is 0.12113295527466184, train CRF loss is 0.08035242358639834 Training:At training steps 15400, training MLE loss is 0.11986786501527888, train CRF loss is 0.07846641463791741 Training:At training steps 15500, training MLE loss is 0.12104202742373445, train CRF loss is 0.07812822067772374 Validation:At training steps 15500, training MLE loss is 0.12104202742373445, train CRF loss is 0.07812822067772374, validation MLE loss is 6.864885016491539, validation ppl is 958.036, validation CRF loss is 7.037457914728868, validation BLEU is 60.1 Training:At training steps 15600, training MLE loss is 0.11447354335467025, train CRF loss is 0.06221918804419602 Training:At training steps 15700, training MLE loss is 0.11298380131838712, train CRF loss is 0.061468606447449475 Training:At training steps 15800, training MLE loss is 0.11155751374867577, train CRF loss is 0.06231634511456377 Training:At training steps 15900, training MLE loss is 0.10842128410562055, train CRF loss is 0.0619604719164505 Training:At training steps 16000, training MLE loss is 0.10728312725172783, train CRF loss is 0.06118294412442674 Validation:At training steps 16000, training MLE loss is 0.10728312725172783, train CRF loss is 0.06118294412442674, validation MLE loss is 6.857525091422231, validation ppl is 951.01, validation CRF loss is 7.087279881301679, validation BLEU is 58.77 Training:At training steps 16100, training MLE loss is 0.10090909428246278, train 
CRF loss is 0.06604414603452483 Training:At training steps 16200, training MLE loss is 0.09925672267543974, train CRF loss is 0.062138224358846944 Training:At training steps 16300, training MLE loss is 0.10090418465669813, train CRF loss is 0.06426452727546225 Training:At training steps 16400, training MLE loss is 0.09848108276016092, train CRF loss is 0.06113000371849957 Training:At training steps 16500, training MLE loss is 0.09730565325446697, train CRF loss is 0.06009272731914967 Validation:At training steps 16500, training MLE loss is 0.09730565325446697, train CRF loss is 0.06009272731914967, validation MLE loss is 7.191787688355697, validation ppl is 1328.476, validation CRF loss is 7.3239968826896265, validation BLEU is 59.12 Training:At training steps 16600, training MLE loss is 0.09701449829760236, train CRF loss is 0.055532499634776966 Training:At training steps 16700, training MLE loss is 0.09239072546978434, train CRF loss is 0.05350442541579753 Training:At training steps 16800, training MLE loss is 0.08995584327442885, train CRF loss is 0.05382314547814739 Training:At training steps 16900, training MLE loss is 0.09131666521482032, train CRF loss is 0.055646031491279044 Training:At training steps 17000, training MLE loss is 0.08822444755904076, train CRF loss is 0.05407587670630608 Validation:At training steps 17000, training MLE loss is 0.08822444755904076, train CRF loss is 0.05407587670630608, validation MLE loss is 6.89798106645283, validation ppl is 990.273, validation CRF loss is 7.152861507315385, validation BLEU is 59.75 Training:At training steps 17100, training MLE loss is 0.07665106416889103, train CRF loss is 0.049780751858712335 Training:At training steps 17200, training MLE loss is 0.07914235677835706, train CRF loss is 0.045109050205802956 Training:At training steps 17300, training MLE loss is 0.07879692089515719, train CRF loss is 0.045889903980815445 Training:At training steps 17400, training MLE loss is 0.07926124805453934, train CRF 
loss is 0.046934111295499924 Training:At training steps 17500, training MLE loss is 0.0766879546859216, train CRF loss is 0.044920390413207674 Validation:At training steps 17500, training MLE loss is 0.0766879546859216, train CRF loss is 0.044920390413207674, validation MLE loss is 7.49906010063071, validation ppl is 1806.344, validation CRF loss is 7.559512593244252, validation BLEU is 59.67 Training:At training steps 17600, training MLE loss is 0.07305128398085145, train CRF loss is 0.045187681112636824 Training:At training steps 17700, training MLE loss is 0.07202580113984858, train CRF loss is 0.04605540263076125 Training:At training steps 17800, training MLE loss is 0.07372386656331993, train CRF loss is 0.0463684781668036 Training:At training steps 17900, training MLE loss is 0.07258614651050213, train CRF loss is 0.045449953905836636 Training:At training steps 18000, training MLE loss is 0.07190056971744227, train CRF loss is 0.044600839302275604 Validation:At training steps 18000, training MLE loss is 0.07190056971744227, train CRF loss is 0.044600839302275604, validation MLE loss is 7.302204273248973, validation ppl is 1483.567, validation CRF loss is 7.3966263030704695, validation BLEU is 60.19 Training:At training steps 18100, training MLE loss is 0.07104298689114216, train CRF loss is 0.043254005014250085 Training:At training steps 18200, training MLE loss is 0.06832526516072715, train CRF loss is 0.03915011136846819 Training:At training steps 18300, training MLE loss is 0.06701130104221142, train CRF loss is 0.03872835903643745 Training:At training steps 18400, training MLE loss is 0.06746382509210312, train CRF loss is 0.03920891508111934 Training:At training steps 18500, training MLE loss is 0.06635269111216435, train CRF loss is 0.038398729862256155 Validation:At training steps 18500, training MLE loss is 0.06635269111216435, train CRF loss is 0.038398729862256155, validation MLE loss is 7.568509923784356, validation ppl is 1936.253, validation CRF 
loss is 7.812963416701869, validation BLEU is 60.3 Training:At training steps 18600, training MLE loss is 0.05477254355117516, train CRF loss is 0.03199583519917511 Training:At training steps 18700, training MLE loss is 0.05367387758511876, train CRF loss is 0.030250844246016068 Training:At training steps 18800, training MLE loss is 0.05292981302730338, train CRF loss is 0.03072116440443086 Training:At training steps 18900, training MLE loss is 0.054997139789662565, train CRF loss is 0.03145089166077986 Training:At training steps 19000, training MLE loss is 0.055170523317550936, train CRF loss is 0.03233011078386979 Validation:At training steps 19000, training MLE loss is 0.055170523317550936, train CRF loss is 0.03233011078386979, validation MLE loss is 7.649217338938462, validation ppl is 2099.002, validation CRF loss is 7.775305942485207, validation BLEU is 60.42 Training:At training steps 19100, training MLE loss is 0.0475691871517946, train CRF loss is 0.027079495979746328 Training:At training steps 19200, training MLE loss is 0.045850706506555775, train CRF loss is 0.026501422045348805 Training:At training steps 19300, training MLE loss is 0.04543964020217655, train CRF loss is 0.026232702633176147 Training:At training steps 19400, training MLE loss is 0.045147004949221504, train CRF loss is 0.026357885943784094 Training:At training steps 19500, training MLE loss is 0.04522244256720454, train CRF loss is 0.026251692250174073 Validation:At training steps 19500, training MLE loss is 0.04522244256720454, train CRF loss is 0.026251692250174073, validation MLE loss is 7.843798129182113, validation ppl is 2549.871, validation CRF loss is 8.05462728676043, validation BLEU is 59.46 Training:At training steps 19600, training MLE loss is 0.04998258007190994, train CRF loss is 0.032368516072011036 Training:At training steps 19700, training MLE loss is 0.046934845224371594, train CRF loss is 0.029802016789532343 Training:At training steps 19800, training MLE loss is 
0.04586911797331892, train CRF loss is 0.027980513513720513 Training:At training steps 19900, training MLE loss is 0.04440008558393292, train CRF loss is 0.027382685870628868 Training:At training steps 20000, training MLE loss is 0.04302087818062409, train CRF loss is 0.027495210384584027 Validation:At training steps 20000, training MLE loss is 0.04302087818062409, train CRF loss is 0.027495210384584027, validation MLE loss is 8.016214201324864, validation ppl is 3029.686, validation CRF loss is 8.206905239506773, validation BLEU is 60.65 Training:At training steps 20100, training MLE loss is 0.0380755607514022, train CRF loss is 0.025368185887666143 Training:At training steps 20200, training MLE loss is 0.03745514133182198, train CRF loss is 0.024685286896606828 Training:At training steps 20300, training MLE loss is 0.036929618129418335, train CRF loss is 0.02403564739660404 Training:At training steps 20400, training MLE loss is 0.0354188182138462, train CRF loss is 0.02288692815242975 Training:At training steps 20500, training MLE loss is 0.03517499287056841, train CRF loss is 0.022216567980479586 Validation:At training steps 20500, training MLE loss is 0.03517499287056841, train CRF loss is 0.022216567980479586, validation MLE loss is 8.00203766320881, validation ppl is 2987.038, validation CRF loss is 8.154047263296027, validation BLEU is 60.84 Training:At training steps 20600, training MLE loss is 0.02925657658560392, train CRF loss is 0.018767972310874157 Training:At training steps 20700, training MLE loss is 0.03146484537655619, train CRF loss is 0.02053292332994104 Training:At training steps 20800, training MLE loss is 0.03144789602711273, train CRF loss is 0.02064126753971745 Training:At training steps 20900, training MLE loss is 0.03011835956081351, train CRF loss is 0.01961544704134187 Training:At training steps 21000, training MLE loss is 0.029521299328669134, train CRF loss is 0.019208216029459064 Validation:At training steps 21000, training MLE loss 
is 0.029521299328669134, train CRF loss is 0.019208216029459064, validation MLE loss is 8.252566450520566, validation ppl is 3837.462, validation CRF loss is 8.319485639270983, validation BLEU is 59.24 Training:At training steps 21100, training MLE loss is 0.026284580850983282, train CRF loss is 0.016797429682867335 Training:At training steps 21200, training MLE loss is 0.025966182992733032, train CRF loss is 0.016694613206412138 Training:At training steps 21300, training MLE loss is 0.02509393142821284, train CRF loss is 0.015821357016575693 Training:At training steps 21400, training MLE loss is 0.025433296899051897, train CRF loss is 0.016066151486012713 Training:At training steps 21500, training MLE loss is 0.025222979960104436, train CRF loss is 0.016224831299708312 Validation:At training steps 21500, training MLE loss is 0.025222979960104436, train CRF loss is 0.016224831299708312, validation MLE loss is 8.496889142613663, validation ppl is 4899.503, validation CRF loss is 8.709328814556724, validation BLEU is 60.33 Training:At training steps 21600, training MLE loss is 0.020850242441363145, train CRF loss is 0.013305847390869312 Training:At training steps 21700, training MLE loss is 0.021390414299735486, train CRF loss is 0.013433363544656367 Training:At training steps 21800, training MLE loss is 0.020336338820841345, train CRF loss is 0.012535774282661123 Training:At training steps 21900, training MLE loss is 0.019721552840758255, train CRF loss is 0.011984550611433657 Training:At training steps 22000, training MLE loss is 0.019632916727652668, train CRF loss is 0.012070516738564782 Validation:At training steps 22000, training MLE loss is 0.019632916727652668, train CRF loss is 0.012070516738564782, validation MLE loss is 8.543186438711066, validation ppl is 5131.67, validation CRF loss is 8.780698769970945, validation BLEU is 60.77 Training:At training steps 22100, training MLE loss is 0.01708879556636876, train CRF loss is 0.011361660406617301 Training:At 
training steps 22200, training MLE loss is 0.016529187374230484, train CRF loss is 0.010577494024541596 Training:At training steps 22300, training MLE loss is 0.015766017189489022, train CRF loss is 0.009932244687171824 Training:At training steps 22400, training MLE loss is 0.014558933463291355, train CRF loss is 0.009187226699030661 Training:At training steps 22500, training MLE loss is 0.0145342320271107, train CRF loss is 0.009426976598033758 Validation:At training steps 22500, training MLE loss is 0.0145342320271107, train CRF loss is 0.009426976598033758, validation MLE loss is 8.745189004822782, validation ppl is 6280.4, validation CRF loss is 8.893932825640627, validation BLEU is 62.12 Training:At training steps 22600, training MLE loss is 0.015995134230351923, train CRF loss is 0.009668788914679505 Training:At training steps 22700, training MLE loss is 0.014752098881845928, train CRF loss is 0.009257284775200145 Training:At training steps 22800, training MLE loss is 0.015017593309206134, train CRF loss is 0.00897139206669961 Training:At training steps 22900, training MLE loss is 0.013974721534551047, train CRF loss is 0.00869132072634059 Training:At training steps 23000, training MLE loss is 0.013818781181257048, train CRF loss is 0.008666806319037294 Validation:At training steps 23000, training MLE loss is 0.013818781181257048, train CRF loss is 0.008666806319037294, validation MLE loss is 8.83764239988829, validation ppl is 6888.733, validation CRF loss is 9.040750861167908, validation BLEU is 60.91 Training:At training steps 23100, training MLE loss is 0.010893949910771493, train CRF loss is 0.006767193456546475 Training:At training steps 23200, training MLE loss is 0.010344168947062932, train CRF loss is 0.006716342901716028 Training:At training steps 23300, training MLE loss is 0.011225300991815804, train CRF loss is 0.007229011012391566 Training:At training steps 23400, training MLE loss is 0.010930192259508588, train CRF loss is 0.007030810294898444 
Training:At training steps 23500, training MLE loss is 0.010444937532260229, train CRF loss is 0.006499980457502648 Validation:At training steps 23500, training MLE loss is 0.010444937532260229, train CRF loss is 0.006499980457502648, validation MLE loss is 9.095906119597586, validation ppl is 8918.706, validation CRF loss is 9.25924493764576, validation BLEU is 61.37 Training:At training steps 23600, training MLE loss is 0.00854926366876226, train CRF loss is 0.005252689561927308 Training:At training steps 23700, training MLE loss is 0.009848597727082591, train CRF loss is 0.006129381914407052 Training:At training steps 23800, training MLE loss is 0.009537629893066876, train CRF loss is 0.006008059269318267 Training:At training steps 23900, training MLE loss is 0.00977002863171048, train CRF loss is 0.0059465492234650925 Training:At training steps 24000, training MLE loss is 0.009341851843956608, train CRF loss is 0.0057164490867902575 Validation:At training steps 24000, training MLE loss is 0.009341851843956608, train CRF loss is 0.0057164490867902575, validation MLE loss is 9.22791019866341, validation ppl is 10177.251, validation CRF loss is 9.330364371600904, validation BLEU is 60.76 Training:At training steps 24100, training MLE loss is 0.0070669642506022865, train CRF loss is 0.004761354542571756 Training:At training steps 24200, training MLE loss is 0.006731424178663055, train CRF loss is 0.004393131263731968 Training:At training steps 24300, training MLE loss is 0.007176450324751449, train CRF loss is 0.004387882051342456 Training:At training steps 24400, training MLE loss is 0.006822508174459342, train CRF loss is 0.004343009586493624 Training:At training steps 24500, training MLE loss is 0.006879484821118949, train CRF loss is 0.004352027078910941 Validation:At training steps 24500, training MLE loss is 0.006879484821118949, train CRF loss is 0.004352027078910941, validation MLE loss is 9.181007159383674, validation ppl is 9710.928, validation CRF loss 
is 9.4167528340691, validation BLEU is 60.93 Training:At training steps 24600, training MLE loss is 0.006067505877877339, train CRF loss is 0.0034238513446854133 Training:At training steps 24700, training MLE loss is 0.005961891537091258, train CRF loss is 0.003641504988970803 Training:At training steps 24800, training MLE loss is 0.005845536018323509, train CRF loss is 0.003280170556028733 Training:At training steps 24900, training MLE loss is 0.006264998472165391, train CRF loss is 0.0034415040783312366 Training:At training steps 25000, training MLE loss is 0.006382254267350268, train CRF loss is 0.003588787785934205 Validation:At training steps 25000, training MLE loss is 0.006382254267350268, train CRF loss is 0.003588787785934205, validation MLE loss is 9.296922539409838, validation ppl is 10904.41, validation CRF loss is 9.471397061096994, validation BLEU is 61.53 Training:At training steps 25100, training MLE loss is 0.006390881068127352, train CRF loss is 0.0043867170969898025 Training:At training steps 25200, training MLE loss is 0.005759002213721522, train CRF loss is 0.004048751630091307 Training:At training steps 25300, training MLE loss is 0.005213465684461215, train CRF loss is 0.0037105780512518144 Training:At training steps 25400, training MLE loss is 0.005071283572279083, train CRF loss is 0.003449735242757648 Training:At training steps 25500, training MLE loss is 0.004869693233557966, train CRF loss is 0.003250637154131857 Validation:At training steps 25500, training MLE loss is 0.004869693233557966, train CRF loss is 0.003250637154131857, validation MLE loss is 9.323140031413027, validation ppl is 11194.076, validation CRF loss is 9.546792431881553, validation BLEU is 61.18 Training:At training steps 25600, training MLE loss is 0.006097755166905916, train CRF loss is 0.0034005597275779034 Training:At training steps 25700, training MLE loss is 0.00544383708806347, train CRF loss is 0.0032628180245032378 Training:At training steps 25800, training 
MLE loss is 0.004789618071103132, train CRF loss is 0.0029963603431185323 Training:At training steps 25900, training MLE loss is 0.0043610505598147145, train CRF loss is 0.002722119748955417 Training:At training steps 26000, training MLE loss is 0.004147443499419626, train CRF loss is 0.0026305440272084936 Validation:At training steps 26000, training MLE loss is 0.004147443499419626, train CRF loss is 0.0026305440272084936, validation MLE loss is 9.695976006357293, validation ppl is 16252.077, validation CRF loss is 9.820894071930333, validation BLEU is 61.11 Training:At training steps 26100, training MLE loss is 0.004089213970077287, train CRF loss is 0.0022049149694302896 Training:At training steps 26200, training MLE loss is 0.004218602038502442, train CRF loss is 0.0023022754695598024 Training:At training steps 26300, training MLE loss is 0.004078990325424419, train CRF loss is 0.0023782401809648752 Training:At training steps 26400, training MLE loss is 0.003797782350861504, train CRF loss is 0.002407004669573295 Training:At training steps 26500, training MLE loss is 0.0034111233692437565, train CRF loss is 0.002144823188078777 Validation:At training steps 26500, training MLE loss is 0.0034111233692437565, train CRF loss is 0.002144823188078777, validation MLE loss is 9.777539510475961, validation ppl is 17633.213, validation CRF loss is 9.864710682316831, validation BLEU is 60.33 Training:At training steps 26600, training MLE loss is 0.0025737685462320246, train CRF loss is 0.0015534839076083396 Training:At training steps 26700, training MLE loss is 0.003205704940532994, train CRF loss is 0.002018898155101667 Training:At training steps 26800, training MLE loss is 0.0030357713276044198, train CRF loss is 0.0019466419320310843 Training:At training steps 26900, training MLE loss is 0.0030505345841814846, train CRF loss is 0.0020245876066625457 Training:At training steps 27000, training MLE loss is 0.002895322879319602, train CRF loss is 0.0019930873298347647 
Validation:At training steps 27000, training MLE loss is 0.002895322879319602, train CRF loss is 0.0019930873298347647, validation MLE loss is 9.86074472101111, validation ppl is 19163.155, validation CRF loss is 9.964679473324827, validation BLEU is 60.59 Training:At training steps 27100, training MLE loss is 0.003471623843662576, train CRF loss is 0.0021299945885364657 Training:At training steps 27200, training MLE loss is 0.0028345625170627525, train CRF loss is 0.0016867184574495808 Training:At training steps 27300, training MLE loss is 0.002797800329679133, train CRF loss is 0.0015972316200198538 Training:At training steps 27400, training MLE loss is 0.0030614189446456362, train CRF loss is 0.0018242224681735942 Training:At training steps 27500, training MLE loss is 0.0028990908890238207, train CRF loss is 0.0017641736082247067 Validation:At training steps 27500, training MLE loss is 0.0028990908890238207, train CRF loss is 0.0017641736082247067, validation MLE loss is 9.736431241035461, validation ppl is 16923.039, validation CRF loss is 9.967844417220668, validation BLEU is 60.77 Training:At training steps 27600, training MLE loss is 0.0022520684002095316, train CRF loss is 0.001219500389917978 Training:At training steps 27700, training MLE loss is 0.0020582025998751143, train CRF loss is 0.0012399876222221851 Training:At training steps 27800, training MLE loss is 0.0018988354374955977, train CRF loss is 0.0012656622689387195 Training:At training steps 27900, training MLE loss is 0.0020499655078100305, train CRF loss is 0.0013761961409575153 Training:At training steps 28000, training MLE loss is 0.0021000170492742358, train CRF loss is 0.0014408990253076005 Validation:At training steps 28000, training MLE loss is 0.0021000170492742358, train CRF loss is 0.0014408990253076005, validation MLE loss is 10.05707279631966, validation ppl is 23320.144, validation CRF loss is 10.216041163394326, validation BLEU is 60.87 Training:At training steps 28100, training MLE 
loss is 0.0030015912698652812, train CRF loss is 0.0015753465051522575 Training:At training steps 28200, training MLE loss is 0.0021553401237937035, train CRF loss is 0.0013385489776878501 Training:At training steps 28300, training MLE loss is 0.0023101436348520908, train CRF loss is 0.0014178949422297604 Training:At training steps 28400, training MLE loss is 0.0021658204908319347, train CRF loss is 0.0013128533926620566 Training:At training steps 28500, training MLE loss is 0.0020536117997030337, train CRF loss is 0.0012449698579116762 Validation:At training steps 28500, training MLE loss is 0.0020536117997030337, train CRF loss is 0.0012449698579116762, validation MLE loss is 10.128378541846024, validation ppl is 25043.724, validation CRF loss is 10.258555976968063, validation BLEU is 61.15 Training:At training steps 28600, training MLE loss is 0.0024479896528925723, train CRF loss is 0.0017714147669991266 Training:At training steps 28700, training MLE loss is 0.002281338145934391, train CRF loss is 0.001563111144292233 Training:At training steps 28800, training MLE loss is 0.0020702301666679886, train CRF loss is 0.0014385161749518598 Training:At training steps 28900, training MLE loss is 0.002008903879271698, train CRF loss is 0.001372129466636055 Training:At training steps 29000, training MLE loss is 0.0019216985737273184, train CRF loss is 0.001337039505882072 Validation:At training steps 29000, training MLE loss is 0.0019216985737273184, train CRF loss is 0.001337039505882072, validation MLE loss is 10.165614906110262, validation ppl is 25993.841, validation CRF loss is 10.272295782440587, validation BLEU is 61.57 Training:At training steps 29100, training MLE loss is 0.0021004996795419132, train CRF loss is 0.001512581290372621 Training:At training steps 29200, training MLE loss is 0.001769671635924029, train CRF loss is 0.00131976620627537 Training:At training steps 29300, training MLE loss is 0.0017995481166565741, train CRF loss is 0.0011791506613501271 
Training:At training steps 29400, training MLE loss is 0.0018358306837782357, train CRF loss is 0.0012368177915022293 Training:At training steps 29500, training MLE loss is 0.0017529008837771855, train CRF loss is 0.0012212560362883673 Validation:At training steps 29500, training MLE loss is 0.0017529008837771855, train CRF loss is 0.0012212560362883673, validation MLE loss is 10.237551042908116, validation ppl is 27932.636, validation CRF loss is 10.402361838441147, validation BLEU is 61.1 Training:At training steps 29600, training MLE loss is 0.001221103362411673, train CRF loss is 0.0010092335449366496 Training:At training steps 29700, training MLE loss is 0.0010336451689000447, train CRF loss is 0.0009480215501744627 Training:At training steps 29800, training MLE loss is 0.0012007679662480045, train CRF loss is 0.0008372467470420685 Training:At training steps 29900, training MLE loss is 0.0014192314204120474, train CRF loss is 0.0009195845096815358 Training:At training steps 30000, training MLE loss is 0.0014702532679053875, train CRF loss is 0.000938574109893505 Validation:At training steps 30000, training MLE loss is 0.0014702532679053875, train CRF loss is 0.000938574109893505, validation MLE loss is 10.249113741673922, validation ppl is 28257.487, validation CRF loss is 10.354402673871894, validation BLEU is 60.47 Training:At training steps 30100, training MLE loss is 0.0010683440049524743, train CRF loss is 0.0004603646420962848 Training:At training steps 30200, training MLE loss is 0.0009929577583443002, train CRF loss is 0.0005059560729364354 Training:At training steps 30300, training MLE loss is 0.0010612938884031105, train CRF loss is 0.0006914804397244471 Training:At training steps 30400, training MLE loss is 0.0010913732225761327, train CRF loss is 0.0007712223775558436 Training:At training steps 30500, training MLE loss is 0.001146334385863262, train CRF loss is 0.000809068615499827 Validation:At training steps 30500, training MLE loss is 
0.001146334385863262, train CRF loss is 0.000809068615499827, validation MLE loss is 10.296753607298198, validation ppl is 29636.252, validation CRF loss is 10.407992237492612, validation BLEU is 61.66 Training:At training steps 30600, training MLE loss is 0.0012141171151793236, train CRF loss is 0.0005131602229355581 Training:At training steps 30700, training MLE loss is 0.001347488975399609, train CRF loss is 0.0006856354009069387 Training:At training steps 30800, training MLE loss is 0.0012347654854029692, train CRF loss is 0.0007486440666295883 Training:At training steps 30900, training MLE loss is 0.0012704704979926958, train CRF loss is 0.000782448985703611 Training:At training steps 31000, training MLE loss is 0.0012557242631412112, train CRF loss is 0.0008162036603596085 Validation:At training steps 31000, training MLE loss is 0.0012557242631412112, train CRF loss is 0.0008162036603596085, validation MLE loss is 10.327420316244426, validation ppl is 30559.177, validation CRF loss is 10.501715741659465, validation BLEU is 61.48 Training:At training steps 31100, training MLE loss is 0.0012381376905844265, train CRF loss is 0.0010613700668697446 Training:At training steps 31200, training MLE loss is 0.0013575888809920721, train CRF loss is 0.0009497780362935849 Training:At training steps 31300, training MLE loss is 0.0014416057644118753, train CRF loss is 0.0009399143159723477 Training:At training steps 31400, training MLE loss is 0.0012270713445975902, train CRF loss is 0.0008238808439036471 Training:At training steps 31500, training MLE loss is 0.0012134538603784763, train CRF loss is 0.0007785439245142962 Validation:At training steps 31500, training MLE loss is 0.0012134538603784763, train CRF loss is 0.0007785439245142962, validation MLE loss is 10.45793444859354, validation ppl is 34819.556, validation CRF loss is 10.583544248028806, validation BLEU is 60.83 Training:At training steps 31600, training MLE loss is 0.0019435942114647189, train CRF loss is 
0.001385372487654233 Training:At training steps 31700, training MLE loss is 0.001587936618189051, train CRF loss is 0.0010103360994991296 Training:At training steps 31800, training MLE loss is 0.001580630807659642, train CRF loss is 0.0010822541530072286 Training:At training steps 31900, training MLE loss is 0.0015256975561359393, train CRF loss is 0.0009780940454692112 Training:At training steps 32000, training MLE loss is 0.0014858926603979101, train CRF loss is 0.0009503163854977332 Validation:At training steps 32000, training MLE loss is 0.0014858926603979101, train CRF loss is 0.0009503163854977332, validation MLE loss is 10.214229263757405, validation ppl is 27288.735, validation CRF loss is 10.346967534015054, validation BLEU is 61.1 Training:At training steps 32100, training MLE loss is 0.0009594346613645296, train CRF loss is 0.0008121515492382647 Training:At training steps 32200, training MLE loss is 0.0008866923681047571, train CRF loss is 0.0006580232175866052 Training:At training steps 32300, training MLE loss is 0.0010013789902749003, train CRF loss is 0.0006493612621822717 Training:At training steps 32400, training MLE loss is 0.0009270331079260985, train CRF loss is 0.0006038494669043714 Training:At training steps 32500, training MLE loss is 0.0009404302349078937, train CRF loss is 0.000582582718504745 Validation:At training steps 32500, training MLE loss is 0.0009404302349078937, train CRF loss is 0.000582582718504745, validation MLE loss is 10.32527554662604, validation ppl is 30493.705, validation CRF loss is 10.481086329409951, validation BLEU is 61.0 Training:At training steps 32600, training MLE loss is 0.0006843339490139118, train CRF loss is 0.0005805088098744537 Training:At training steps 32700, training MLE loss is 0.0009006773443623784, train CRF loss is 0.0005286427181350728 Training:At training steps 32800, training MLE loss is 0.0009620828586010747, train CRF loss is 0.0005757921176861588 Training:At training steps 32900, training MLE 
loss is 0.0008745517744167128, train CRF loss is 0.0005068571203108718 Training:At training steps 33000, training MLE loss is 0.0007652360343484515, train CRF loss is 0.0004514577508804347 Validation:At training steps 33000, training MLE loss is 0.0007652360343484515, train CRF loss is 0.0004514577508804347, validation MLE loss is 10.160235605741802, validation ppl is 25854.388, validation CRF loss is 10.350101728188363, validation BLEU is 61.39 Training:At training steps 33100, training MLE loss is 0.0004951131574512869, train CRF loss is 0.0003280192510444113 Training:At training steps 33200, training MLE loss is 0.0005746110610972981, train CRF loss is 0.0004136154058865671 Training:At training steps 33300, training MLE loss is 0.0005972413244040838, train CRF loss is 0.00041764271308401363 Training:At training steps 33400, training MLE loss is 0.0005467381848096739, train CRF loss is 0.00034375473353755794 Training:At training steps 33500, training MLE loss is 0.0006404933656252765, train CRF loss is 0.00035115311487577826 Validation:At training steps 33500, training MLE loss is 0.0006404933656252765, train CRF loss is 0.00035115311487577826, validation MLE loss is 10.376764504533066, validation ppl is 32104.918, validation CRF loss is 10.546200576581453, validation BLEU is 61.28 Training:At training steps 33600, training MLE loss is 0.0003336527621318504, train CRF loss is 0.00028805335895167874 Training:At training steps 33700, training MLE loss is 0.0006406832208406914, train CRF loss is 0.0004008439461712965 Training:At training steps 33800, training MLE loss is 0.0006892377188250071, train CRF loss is 0.0004663916823390283 Training:At training steps 33900, training MLE loss is 0.0006914420112294242, train CRF loss is 0.0004362154118048078 Training:At training steps 34000, training MLE loss is 0.000665798256021813, train CRF loss is 0.000427728479259156 Validation:At training steps 34000, training MLE loss is 0.000665798256021813, train CRF loss is 
0.000427728479259156, validation MLE loss is 10.381663429109674, validation ppl is 32262.583, validation CRF loss is 10.53336181138691, validation BLEU is 61.18 Training:At training steps 34100, training MLE loss is 0.0007476670282768818, train CRF loss is 0.00047058710356248754 Training:At training steps 34200, training MLE loss is 0.00047604926121749824, train CRF loss is 0.000293347556846677 Training:At training steps 34300, training MLE loss is 0.0005142965599058465, train CRF loss is 0.00033110406122135674 Training:At training steps 34400, training MLE loss is 0.0005042159683452877, train CRF loss is 0.0003126211475192553 Training:At training steps 34500, training MLE loss is 0.000559373581031626, train CRF loss is 0.00032471816586299607 Validation:At training steps 34500, training MLE loss is 0.000559373581031626, train CRF loss is 0.00032471816586299607, validation MLE loss is 10.497041570512872, validation ppl is 36208.225, validation CRF loss is 10.653609470317239, validation BLEU is 61.17 Training:At training steps 34600, training MLE loss is 0.0007020744674679349, train CRF loss is 0.000386726713762231 Training:At training steps 34700, training MLE loss is 0.0006465682019583706, train CRF loss is 0.00035439893129221025 Training:At training steps 34800, training MLE loss is 0.0005804774317089441, train CRF loss is 0.0003428007267852366 Training:At training steps 34900, training MLE loss is 0.0005483068213814661, train CRF loss is 0.00031828239730037454 Training:At training steps 35000, training MLE loss is 0.000523968372537583, train CRF loss is 0.00031291436005815676 Validation:At training steps 35000, training MLE loss is 0.000523968372537583, train CRF loss is 0.00031291436005815676, validation MLE loss is 10.464112815104032, validation ppl is 35035.349, validation CRF loss is 10.617857017015156, validation BLEU is 60.94 Training:At training steps 35100, training MLE loss is 0.0004374585101303647, train CRF loss is 0.00019844847269725907 Training:At 
training steps 35200, training MLE loss is 0.0005195529329054298, train CRF loss is 0.0002970732559103162 Training:At training steps 35300, training MLE loss is 0.0004531196436313076, train CRF loss is 0.00028284150284443014 Training:At training steps 35400, training MLE loss is 0.00041869693346293995, train CRF loss is 0.0002921552559068885 Training:At training steps 35500, training MLE loss is 0.0003513890073936278, train CRF loss is 0.00025479151705684356 Validation:At training steps 35500, training MLE loss is 0.0003513890073936278, train CRF loss is 0.00025479151705684356, validation MLE loss is 10.424302421118083, validation ppl is 33667.977, validation CRF loss is 10.60277481455552, validation BLEU is 61.75 Training:At training steps 35600, training MLE loss is 0.00010917394967663897, train CRF loss is 9.631098986642605e-05 Training:At training steps 35700, training MLE loss is 0.0004001802556816281, train CRF loss is 0.00014389165113262603 Training:At training steps 35800, training MLE loss is 0.00044105041714968715, train CRF loss is 0.0002491539129414866 Training:At training steps 35900, training MLE loss is 0.00040956877087343005, train CRF loss is 0.00023482881154008318 Training:At training steps 36000, training MLE loss is 0.00037723928021020934, train CRF loss is 0.00021931962241638115 Validation:At training steps 36000, training MLE loss is 0.00037723928021020934, train CRF loss is 0.00021931962241638115, validation MLE loss is 10.46721223153566, validation ppl is 35144.107, validation CRF loss is 10.652010516116494, validation BLEU is 60.81 Training:At training steps 36100, training MLE loss is 0.00023385882508546827, train CRF loss is 0.0002121308577734915 Training:At training steps 36200, training MLE loss is 0.0002849562236281854, train CRF loss is 0.0002475564964435195 Training:At training steps 36300, training MLE loss is 0.0002719033861800748, train CRF loss is 0.0001877095197993217 Training:At training steps 36400, training MLE loss is 
0.00030668883926663444, train CRF loss is 0.00021689228186178356 Training:At training steps 36500, training MLE loss is 0.0003032129816013761, train CRF loss is 0.00021090761785577073 Validation:At training steps 36500, training MLE loss is 0.0003032129816013761, train CRF loss is 0.00021090761785577073, validation MLE loss is 10.431996853728043, validation ppl is 33928.032, validation CRF loss is 10.648132989281102, validation BLEU is 61.65 Training:At training steps 36600, training MLE loss is 0.0003202365969073301, train CRF loss is 0.0001841494292267809 Training:At training steps 36700, training MLE loss is 0.0005660416504842554, train CRF loss is 0.000334394457185625 Training:At training steps 36800, training MLE loss is 0.00047791387620135323, train CRF loss is 0.0002938281493715363 Training:At training steps 36900, training MLE loss is 0.000412010529142706, train CRF loss is 0.00026809153523981724 Training:At training steps 37000, training MLE loss is 0.0004400115734673943, train CRF loss is 0.00029728693774078163 Validation:At training steps 37000, training MLE loss is 0.0004400115734673943, train CRF loss is 0.00029728693774078163, validation MLE loss is 10.475453928897256, validation ppl is 35434.951, validation CRF loss is 10.644454290992336, validation BLEU is 61.47 Training:At training steps 37100, training MLE loss is 0.0002564120675499242, train CRF loss is 7.890545272571714e-05 Training:At training steps 37200, training MLE loss is 0.00034700474682673244, train CRF loss is 0.0001744734052685759 Training:At training steps 37300, training MLE loss is 0.0002792704583223041, train CRF loss is 0.0001334023774398115 Training:At training steps 37400, training MLE loss is 0.00026664697044105775, train CRF loss is 0.00013039709743654538 Training:At training steps 37500, training MLE loss is 0.00027018196315115, train CRF loss is 0.00013008480328863926 Validation:At training steps 37500, training MLE loss is 0.00027018196315115, train CRF loss is 
0.00013008480328863926, validation MLE loss is 10.436691328098899, validation ppl is 34087.681, validation CRF loss is 10.593820873059725, validation BLEU is 60.91 Training:At training steps 37600, training MLE loss is 0.00046610050702748196, train CRF loss is 0.0002395525646397223 Training:At training steps 37700, training MLE loss is 0.00037808071560469013, train CRF loss is 0.00020786307528599223 Training:At training steps 37800, training MLE loss is 0.0003354244545598042, train CRF loss is 0.00019520557817120625 Training:At training steps 37900, training MLE loss is 0.00030825755816788923, train CRF loss is 0.00019238618173055165 Training:At training steps 38000, training MLE loss is 0.00029721190532471626, train CRF loss is 0.00018046758518377003 Validation:At training steps 38000, training MLE loss is 0.00029721190532471626, train CRF loss is 0.00018046758518377003, validation MLE loss is 10.466337994525308, validation ppl is 35113.396, validation CRF loss is 10.634784083617362, validation BLEU is 61.56 Training:At training steps 38100, training MLE loss is 4.107880875894386e-05, train CRF loss is 7.62385002362942e-05 Training:At training steps 38200, training MLE loss is 0.00013529233656286085, train CRF loss is 8.776119391726623e-05 Training:At training steps 38300, training MLE loss is 0.00018297991218177935, train CRF loss is 0.00012430561338416506 Training:At training steps 38400, training MLE loss is 0.0001550121932831603, train CRF loss is 0.00010504843900549288 Training:At training steps 38500, training MLE loss is 0.00017431867965484547, train CRF loss is 0.00011097325093365917 Validation:At training steps 38500, training MLE loss is 0.00017431867965484547, train CRF loss is 0.00011097325093365917, validation MLE loss is 10.474471161240025, validation ppl is 35400.144, validation CRF loss is 10.626767340459322, validation BLEU is 62.33 Training:At training steps 38600, training MLE loss is 0.00027576457968991414, train CRF loss is 
9.575708831839335e-05 Training:At training steps 38700, training MLE loss is 0.0003051884739380745, train CRF loss is 0.00013856402404776702 Training:At training steps 38800, training MLE loss is 0.0002676623144180142, train CRF loss is 0.00010214313321335044 Training:At training steps 38900, training MLE loss is 0.00023456877374571455, train CRF loss is 8.364348769686725e-05 Training:At training steps 39000, training MLE loss is 0.0002007715116392144, train CRF loss is 8.697337575754816e-05 Validation:At training steps 39000, training MLE loss is 0.0002007715116392144, train CRF loss is 8.697337575754816e-05, validation MLE loss is 10.448917075207358, validation ppl is 34506.986, validation CRF loss is 10.592580023564791, validation BLEU is 61.89 Training:At training steps 39100, training MLE loss is 0.00017273576407736218, train CRF loss is 0.00011082062429894179 Training:At training steps 39200, training MLE loss is 0.00010595403907953133, train CRF loss is 8.258485301590346e-05 Training:At training steps 39300, training MLE loss is 9.603587248385327e-05, train CRF loss is 6.0796454941748714e-05 Training:At training steps 39400, training MLE loss is 8.695566031699215e-05, train CRF loss is 5.393619909866643e-05 Training:At training steps 39500, training MLE loss is 7.988362991510382e-05, train CRF loss is 4.3907779194825915e-05 Validation:At training steps 39500, training MLE loss is 7.988362991510382e-05, train CRF loss is 4.3907779194825915e-05, validation MLE loss is 10.498891516735679, validation ppl is 36275.27, validation CRF loss is 10.635401826155814, validation BLEU is 61.61 Training:At training steps 39600, training MLE loss is 0.0003583588991117656, train CRF loss is 0.00011591148407035678 Training:At training steps 39700, training MLE loss is 0.00027483402137992366, train CRF loss is 0.00010382642701362244 Training:At training steps 39800, training MLE loss is 0.0002330334807677098, train CRF loss is 8.725985465734739e-05 Training:At training steps 
39900, training MLE loss is 0.00017818984687585686, train CRF loss is 6.5913905189795e-05 Training:At training steps 40000, training MLE loss is 0.00017715282742954104, train CRF loss is 7.53722471128313e-05 Validation:At training steps 40000, training MLE loss is 0.00017715282742954104, train CRF loss is 7.53722471128313e-05, validation MLE loss is 10.505817902715583, validation ppl is 36527.399, validation CRF loss is 10.669674603562607, validation BLEU is 61.7

Training:At training steps 100, training MLE loss is 2.2811869828402997, train CRF loss is 11.052510795593262 Training:At training steps 200, training MLE loss is 2.255762308463454, train CRF loss is 10.410282488167287 Training:At training steps 300, training MLE loss is 2.212592060615619, train CRF loss is 9.873975217938423 Training:At training steps 400, training MLE loss is 2.188086412567645, train CRF loss is 9.363668749406934 Training:At training steps 500, training MLE loss is 2.16455927285552, train CRF loss is 8.910168413698674 Validation:At training steps 500, training MLE loss is 2.16455927285552, train CRF loss is 8.910168413698674, validation MLE loss is 2.117206424474716, validation ppl is 8.308, validation CRF loss is 6.730422013684323, validation BLEU is 10.54 Training:At training steps 600, training MLE loss is 2.140844774246216, train CRF loss is 6.574071246981621 Training:At training steps 700, training MLE loss is 2.1258326763287188, train CRF loss is 6.418661434501409 Training:At training steps 800, training MLE loss is 2.1275195812185603, train CRF loss is 6.30054344817996 Training:At training steps 900, training MLE loss is 2.13024114029482, train CRF loss is 6.203887984864414 Training:At training steps 1000, training MLE loss is 2.126773683041334, train CRF loss is 6.113428187608719 Validation:At training steps 1000, training MLE loss is 2.126773683041334, train CRF loss is 6.113428187608719, validation MLE loss is 2.1131367448129152, validation ppl is 8.274, validation CRF loss is 
5.72259164170215, validation BLEU is 17.11 Training:At training steps 1100, training MLE loss is 2.14180282831192, train CRF loss is 5.646971111148596 Training:At training steps 1200, training MLE loss is 2.137418267093599, train CRF loss is 5.588670820370316 Training:At training steps 1300, training MLE loss is 2.1248096603155138, train CRF loss is 5.526316619316737 Training:At training steps 1400, training MLE loss is 2.121144823115319, train CRF loss is 5.474404931776226 Training:At training steps 1500, training MLE loss is 2.121409834295511, train CRF loss is 5.423800172775984 Validation:At training steps 1500, training MLE loss is 2.121409834295511, train CRF loss is 5.423800172775984, validation MLE loss is 2.2897602084435915, validation ppl is 9.873, validation CRF loss is 5.685534263912, validation BLEU is 25.36 Training:At training steps 1600, training MLE loss is 2.17428931042552, train CRF loss is 5.180105352699757 Training:At training steps 1700, training MLE loss is 2.172066620290279, train CRF loss is 5.150814874470234 Training:At training steps 1800, training MLE loss is 2.177697018956145, train CRF loss is 5.100898307263851 Training:At training steps 1900, training MLE loss is 2.1857949395850302, train CRF loss is 5.0526478920131925 Training:At training steps 2000, training MLE loss is 2.1936051041334865, train CRF loss is 5.00207175296545 Validation:At training steps 2000, training MLE loss is 2.1936051041334865, train CRF loss is 5.00207175296545, validation MLE loss is 2.269936304343374, validation ppl is 9.679, validation CRF loss is 5.002905669965242, validation BLEU is 28.75 Training:At training steps 2100, training MLE loss is 2.221118821427226, train CRF loss is 4.724271337240935 Training:At training steps 2200, training MLE loss is 2.232934748530388, train CRF loss is 4.666415438428521 Training:At training steps 2300, training MLE loss is 2.2502081613987683, train CRF loss is 4.621775572101275 Training:At training steps 2400, training MLE 
loss is 2.2596865287981927, train CRF loss is 4.56377299990505 Training:At training steps 2500, training MLE loss is 2.274729610517621, train CRF loss is 4.504527596473694 Validation:At training steps 2500, training MLE loss is 2.274729610517621, train CRF loss is 4.504527596473694, validation MLE loss is 2.4099926854434766, validation ppl is 11.134, validation CRF loss is 4.484051898906105, validation BLEU is 29.6 Training:At training steps 2600, training MLE loss is 2.3340825448930262, train CRF loss is 4.220172623097897 Training:At training steps 2700, training MLE loss is 2.3205094004422424, train CRF loss is 4.164380846917629 Training:At training steps 2800, training MLE loss is 2.3437177654355765, train CRF loss is 4.116653722102443 Training:At training steps 2900, training MLE loss is 2.352616595812142, train CRF loss is 4.0727717224135995 Training:At training steps 3000, training MLE loss is 2.358433369085193, train CRF loss is 4.0269134711176156 Validation:At training steps 3000, training MLE loss is 2.358433369085193, train CRF loss is 4.0269134711176156, validation MLE loss is 2.5217071881419733, validation ppl is 12.45, validation CRF loss is 4.04869333380147, validation BLEU is 32.29 Training:At training steps 3100, training MLE loss is 2.3672131111472847, train CRF loss is 3.7194535579532384 Training:At training steps 3200, training MLE loss is 2.3671949372813104, train CRF loss is 3.6606295788288117 Training:At training steps 3300, training MLE loss is 2.373208217372497, train CRF loss is 3.6288332046071687 Training:At training steps 3400, training MLE loss is 2.374438435938209, train CRF loss is 3.5931807081773877 Training:At training steps 3500, training MLE loss is 2.372917861327529, train CRF loss is 3.539596389502287 Validation:At training steps 3500, training MLE loss is 2.372917861327529, train CRF loss is 3.539596389502287, validation MLE loss is 2.9422322856752494, validation ppl is 18.958, validation CRF loss is 3.84994029371362, validation 
BLEU is 32.76 Training:At training steps 3600, training MLE loss is 2.3928673453629017, train CRF loss is 3.2907084508985283 Training:At training steps 3700, training MLE loss is 2.362893597483635, train CRF loss is 3.2367957358807327 Training:At training steps 3800, training MLE loss is 2.35515991161267, train CRF loss is 3.1844313100725414 Training:At training steps 3900, training MLE loss is 2.345932520609349, train CRF loss is 3.1447568565793333 Training:At training steps 4000, training MLE loss is 2.340043738231063, train CRF loss is 3.0965292286723853 Validation:At training steps 4000, training MLE loss is 2.340043738231063, train CRF loss is 3.0965292286723853, validation MLE loss is 2.9864859486881055, validation ppl is 19.816, validation CRF loss is 3.4626409850622477, validation BLEU is 37.89 Training:At training steps 4100, training MLE loss is 2.290125606060028, train CRF loss is 2.838023669831455 Training:At training steps 4200, training MLE loss is 2.2514440654218197, train CRF loss is 2.795534501671791 Training:At training steps 4300, training MLE loss is 2.241100072885553, train CRF loss is 2.7590368450184664 Training:At training steps 4400, training MLE loss is 2.2177138091623783, train CRF loss is 2.724264712696895 Training:At training steps 4500, training MLE loss is 2.211236559778452, train CRF loss is 2.6797950944304465 Validation:At training steps 4500, training MLE loss is 2.211236559778452, train CRF loss is 2.6797950944304465, validation MLE loss is 2.8018128526838204, validation ppl is 16.474, validation CRF loss is 3.217300534248352, validation BLEU is 46.81 Training:At training steps 4600, training MLE loss is 2.1095707868784666, train CRF loss is 2.4829609475657346 Training:At training steps 4700, training MLE loss is 2.0821543791517616, train CRF loss is 2.431850243490189 Training:At training steps 4800, training MLE loss is 2.058922615920504, train CRF loss is 2.4104921078309416 Training:At training steps 4900, training MLE loss is 
2.049734729770571, train CRF loss is 2.3815468026697637 Training:At training steps 5000, training MLE loss is 2.024888768680394, train CRF loss is 2.354393816612661 Validation:At training steps 5000, training MLE loss is 2.024888768680394, train CRF loss is 2.354393816612661, validation MLE loss is 2.8686975099538503, validation ppl is 17.614, validation CRF loss is 3.282036889540522, validation BLEU is 49.64 Training:At training steps 5100, training MLE loss is 1.9288366330787539, train CRF loss is 2.2074592044577 Training:At training steps 5200, training MLE loss is 1.9034514378011227, train CRF loss is 2.1776027478836477 Training:At training steps 5300, training MLE loss is 1.8679992469400168, train CRF loss is 2.1470287619593242 Training:At training steps 5400, training MLE loss is 1.8556466285977513, train CRF loss is 2.114688434484415 Training:At training steps 5500, training MLE loss is 1.825534968689084, train CRF loss is 2.0867674079313874 Validation:At training steps 5500, training MLE loss is 1.825534968689084, train CRF loss is 2.0867674079313874, validation MLE loss is 2.9034431294391028, validation ppl is 18.237, validation CRF loss is 3.138205520416561, validation BLEU is 51.58 Training:At training steps 5600, training MLE loss is 1.6919519149512052, train CRF loss is 1.9416148261912167 Training:At training steps 5700, training MLE loss is 1.6736714518815279, train CRF loss is 1.8996924241725355 Training:At training steps 5800, training MLE loss is 1.6573372503059607, train CRF loss is 1.8875379468065996 Training:At training steps 5900, training MLE loss is 1.6402305752364919, train CRF loss is 1.8548941772896796 Training:At training steps 6000, training MLE loss is 1.6237104040049017, train CRF loss is 1.830622211139649 Validation:At training steps 6000, training MLE loss is 1.6237104040049017, train CRF loss is 1.830622211139649, validation MLE loss is 3.0967876817050732, validation ppl is 22.127, validation CRF loss is 3.2580809420660923, 
validation BLEU is 52.43 Training:At training steps 6100, training MLE loss is 1.5248064261488616, train CRF loss is 1.6868244295194745 Training:At training steps 6200, training MLE loss is 1.50380803136155, train CRF loss is 1.670412194659002 Training:At training steps 6300, training MLE loss is 1.5028241553654273, train CRF loss is 1.6582558121625335 Training:At training steps 6400, training MLE loss is 1.4921988954814152, train CRF loss is 1.6343346319417469 Training:At training steps 6500, training MLE loss is 1.4679176716171205, train CRF loss is 1.6079994477778674 Validation:At training steps 6500, training MLE loss is 1.4679176716171205, train CRF loss is 1.6079994477778674, validation MLE loss is 3.079996993667201, validation ppl is 21.758, validation CRF loss is 3.249937344538538, validation BLEU is 53.05 Training:At training steps 6600, training MLE loss is 1.3547022628597916, train CRF loss is 1.411781009119004 Training:At training steps 6700, training MLE loss is 1.3558754680212588, train CRF loss is 1.4068054713774472 Training:At training steps 6800, training MLE loss is 1.3432379225889841, train CRF loss is 1.3849434855238845 Training:At training steps 6900, training MLE loss is 1.3258010378177278, train CRF loss is 1.3626220441411716 Training:At training steps 7000, training MLE loss is 1.3137775630224495, train CRF loss is 1.3388814178379254 Validation:At training steps 7000, training MLE loss is 1.3137775630224495, train CRF loss is 1.3388814178379254, validation MLE loss is 3.3773860209866573, validation ppl is 29.294, validation CRF loss is 3.337721956403632, validation BLEU is 55.48 Training:At training steps 7100, training MLE loss is 1.2313520059362053, train CRF loss is 1.2047402019612492 Training:At training steps 7200, training MLE loss is 1.1882557923905552, train CRF loss is 1.1897143433103339 Training:At training steps 7300, training MLE loss is 1.1761395513949295, train CRF loss is 1.1708725441945718 Training:At training steps 7400, 
training MLE loss is 1.169544610735029, train CRF loss is 1.1563723658735399 Training:At training steps 7500, training MLE loss is 1.1539238697513938, train CRF loss is 1.1324315425599925 Validation:At training steps 7500, training MLE loss is 1.1539238697513938, train CRF loss is 1.1324315425599925, validation MLE loss is 3.629344764508699, validation ppl is 37.688, validation CRF loss is 3.4491390491786755, validation BLEU is 55.48 Training:At training steps 7600, training MLE loss is 1.0777389688789845, train CRF loss is 1.0125737712625413 Training:At training steps 7700, training MLE loss is 1.0518088536872527, train CRF loss is 0.9821781302592717 Training:At training steps 7800, training MLE loss is 1.0298181820086514, train CRF loss is 0.9590522354259156 Training:At training steps 7900, training MLE loss is 1.0172471356438473, train CRF loss is 0.946991485969047 Training:At training steps 8000, training MLE loss is 0.9936261380128563, train CRF loss is 0.930291133416351 Validation:At training steps 8000, training MLE loss is 0.9936261380128563, train CRF loss is 0.930291133416351, validation MLE loss is 3.860939405466381, validation ppl is 47.51, validation CRF loss is 3.6086476743221283, validation BLEU is 56.14 Training:At training steps 8100, training MLE loss is 0.8963493474945426, train CRF loss is 0.8058710320526734 Training:At training steps 8200, training MLE loss is 0.8715645140688867, train CRF loss is 0.8041881971030671 Training:At training steps 8300, training MLE loss is 0.8604826360568404, train CRF loss is 0.798425519313023 Training:At training steps 8400, training MLE loss is 0.8521306762821041, train CRF loss is 0.7882240468500095 Training:At training steps 8500, training MLE loss is 0.8357044133162126, train CRF loss is 0.7731249325173558 Validation:At training steps 8500, training MLE loss is 0.8357044133162126, train CRF loss is 0.7731249325173558, validation MLE loss is 3.801952616164559, validation ppl is 44.789, validation CRF loss is 
3.884310449424543, validation BLEU is 56.63 Training:At training steps 8600, training MLE loss is 0.7558777328813449, train CRF loss is 0.725309450016357 Training:At training steps 8700, training MLE loss is 0.7628006850951351, train CRF loss is 0.6964685618388466 Training:At training steps 8800, training MLE loss is 0.7449892454772877, train CRF loss is 0.6780233269153784 Training:At training steps 8900, training MLE loss is 0.7464025482564466, train CRF loss is 0.6600423384364694 Training:At training steps 9000, training MLE loss is 0.7413312384844758, train CRF loss is 0.6411626100423746 Validation:At training steps 9000, training MLE loss is 0.7413312384844758, train CRF loss is 0.6411626100423746, validation MLE loss is 4.232669502496719, validation ppl is 68.901, validation CRF loss is 3.9245673308247015, validation BLEU is 55.88 Training:At training steps 9100, training MLE loss is 0.6618283490836621, train CRF loss is 0.5535442643193528 Training:At training steps 9200, training MLE loss is 0.6597009405540303, train CRF loss is 0.5490144532773411 Training:At training steps 9300, training MLE loss is 0.6456825284147635, train CRF loss is 0.5330001536265869 Training:At training steps 9400, training MLE loss is 0.6363978875288739, train CRF loss is 0.5211823342645948 Training:At training steps 9500, training MLE loss is 0.630161542817019, train CRF loss is 0.5099258829173632 Validation:At training steps 9500, training MLE loss is 0.630161542817019, train CRF loss is 0.5099258829173632, validation MLE loss is 4.423436691886501, validation ppl is 83.382, validation CRF loss is 4.04938826121782, validation BLEU is 57.22 Training:At training steps 9600, training MLE loss is 0.6336405044328421, train CRF loss is 0.46609326694277115 Training:At training steps 9700, training MLE loss is 0.599951407344779, train CRF loss is 0.46085344582621474 Training:At training steps 9800, training MLE loss is 0.5851533394749276, train CRF loss is 0.44976795418876764 Training:At 
training steps 9900, training MLE loss is 0.567820717553841, train CRF loss is 0.4327547168326419 Training:At training steps 10000, training MLE loss is 0.5585428241614718, train CRF loss is 0.4206836387895164 Validation:At training steps 10000, training MLE loss is 0.5585428241614718, train CRF loss is 0.4206836387895164, validation MLE loss is 4.5354688261684615, validation ppl is 93.267, validation CRF loss is 4.206281133388218, validation BLEU is 58.26 Training:At training steps 10100, training MLE loss is 0.48836598204332404, train CRF loss is 0.37328967611596453 Training:At training steps 10200, training MLE loss is 0.49669513994798764, train CRF loss is 0.35814472166195627 Training:At training steps 10300, training MLE loss is 0.49046305180992933, train CRF loss is 0.3544332517061654 Training:At training steps 10400, training MLE loss is 0.4790708733079373, train CRF loss is 0.3473529315485939 Training:At training steps 10500, training MLE loss is 0.46923753186664546, train CRF loss is 0.34276967538782627 Validation:At training steps 10500, training MLE loss is 0.46923753186664546, train CRF loss is 0.34276967538782627, validation MLE loss is 4.647866409075887, validation ppl is 104.362, validation CRF loss is 4.202584978781249, validation BLEU is 58.52 Training:At training steps 10600, training MLE loss is 0.4269804530404508, train CRF loss is 0.2955153701227391 Training:At training steps 10700, training MLE loss is 0.4149965055700159, train CRF loss is 0.28982853480825727 Training:At training steps 10800, training MLE loss is 0.4045857131498633, train CRF loss is 0.2832144464737697 Training:At training steps 10900, training MLE loss is 0.40497688138493687, train CRF loss is 0.2774399580523641 Training:At training steps 11000, training MLE loss is 0.3980543533951277, train CRF loss is 0.26957699126869555 Validation:At training steps 11000, training MLE loss is 0.3980543533951277, train CRF loss is 0.26957699126869555, validation MLE loss is 
4.8565697481757715, validation ppl is 128.582, validation CRF loss is 4.602603118670614, validation BLEU is 59.86 Training:At training steps 11100, training MLE loss is 0.3602533950645011, train CRF loss is 0.2562421132122108 Training:At training steps 11200, training MLE loss is 0.3550568700573058, train CRF loss is 0.23974403999658533 Training:At training steps 11300, training MLE loss is 0.350204929857379, train CRF loss is 0.2290983031821088 Training:At training steps 11400, training MLE loss is 0.3474678182957723, train CRF loss is 0.22362135005010714 Training:At training steps 11500, training MLE loss is 0.34113693255290856, train CRF loss is 0.2200235371870367 Validation:At training steps 11500, training MLE loss is 0.34113693255290856, train CRF loss is 0.2200235371870367, validation MLE loss is 4.932647639199307, validation ppl is 138.746, validation CRF loss is 4.811819038893047, validation BLEU is 59.24 Training:At training steps 11600, training MLE loss is 0.2998198435993982, train CRF loss is 0.2008126459119376 Training:At training steps 11700, training MLE loss is 0.294501487720936, train CRF loss is 0.19990219471412274 Training:At training steps 11800, training MLE loss is 0.2917395775264231, train CRF loss is 0.1986499193487786 Training:At training steps 11900, training MLE loss is 0.2935360346685502, train CRF loss is 0.19319644363205954 Training:At training steps 12000, training MLE loss is 0.2913606093061171, train CRF loss is 0.1917793658425653 Validation:At training steps 12000, training MLE loss is 0.2913606093061171, train CRF loss is 0.1917793658425653, validation MLE loss is 4.792036432968943, validation ppl is 120.547, validation CRF loss is 4.710033018338053, validation BLEU is 58.13 Training:At training steps 12100, training MLE loss is 0.2585144041152671, train CRF loss is 0.16191432874569728 Training:At training steps 12200, training MLE loss is 0.24759942043587216, train CRF loss is 0.16558446447746974 Training:At training steps 
12300, training MLE loss is 0.250105771200809, train CRF loss is 0.164787760952507 Training:At training steps 12400, training MLE loss is 0.2509953876025975, train CRF loss is 0.16426988716862298 Training:At training steps 12500, training MLE loss is 0.2491815638476546, train CRF loss is 0.15947725073059155 Validation:At training steps 12500, training MLE loss is 0.2491815638476546, train CRF loss is 0.15947725073059155, validation MLE loss is 5.263330174119849, validation ppl is 193.124, validation CRF loss is 4.806929133440319, validation BLEU is 59.84 Training:At training steps 12600, training MLE loss is 0.24010614263359456, train CRF loss is 0.1384160237386095 Training:At training steps 12700, training MLE loss is 0.23761439642563345, train CRF loss is 0.14474008694542134 Training:At training steps 12800, training MLE loss is 0.2324783375064726, train CRF loss is 0.14610855065340123 Training:At training steps 12900, training MLE loss is 0.23359585191632504, train CRF loss is 0.14132644146000076 Training:At training steps 13000, training MLE loss is 0.2295194005923113, train CRF loss is 0.13909213019669187 Validation:At training steps 13000, training MLE loss is 0.2295194005923113, train CRF loss is 0.13909213019669187, validation MLE loss is 5.228508328136645, validation ppl is 186.514, validation CRF loss is 5.007791613277636, validation BLEU is 59.77 Training:At training steps 13100, training MLE loss is 0.18718456052723922, train CRF loss is 0.10957019461788149 Training:At training steps 13200, training MLE loss is 0.19174567929534533, train CRF loss is 0.11469703308668613 Training:At training steps 13300, training MLE loss is 0.1929784304558901, train CRF loss is 0.11523174274049476 Training:At training steps 13400, training MLE loss is 0.19327225416516738, train CRF loss is 0.11119195395875067 Training:At training steps 13500, training MLE loss is 0.1904881099286431, train CRF loss is 0.10978862972045317 Validation:At training steps 13500, training MLE 
loss is 0.1904881099286431, train CRF loss is 0.10978862972045317, validation MLE loss is 5.319792985916138, validation ppl is 204.342, validation CRF loss is 5.282060055356276, validation BLEU is 58.13 Training:At training steps 13600, training MLE loss is 0.19638578899310233, train CRF loss is 0.11198088558167 Training:At training steps 13700, training MLE loss is 0.18440076506625702, train CRF loss is 0.10627063387905764 Training:At training steps 13800, training MLE loss is 0.17836110066607944, train CRF loss is 0.10678070669229177 Training:At training steps 13900, training MLE loss is 0.17696019729643012, train CRF loss is 0.10265582479192745 Training:At training steps 14000, training MLE loss is 0.1751572461729811, train CRF loss is 0.10093850105096135 Validation:At training steps 14000, training MLE loss is 0.1751572461729811, train CRF loss is 0.10093850105096135, validation MLE loss is 5.305698441831689, validation ppl is 201.482, validation CRF loss is 5.25853944765894, validation BLEU is 59.44 Training:At training steps 14100, training MLE loss is 0.15141079688284662, train CRF loss is 0.08916159163563861 Training:At training steps 14200, training MLE loss is 0.15520621258790926, train CRF loss is 0.08921928059808124 Training:At training steps 14300, training MLE loss is 0.15797823245782638, train CRF loss is 0.09362032680255652 Training:At training steps 14400, training MLE loss is 0.15273923548782478, train CRF loss is 0.09198110856987568 Training:At training steps 14500, training MLE loss is 0.15031921767504536, train CRF loss is 0.09134074929438066 Validation:At training steps 14500, training MLE loss is 0.15031921767504536, train CRF loss is 0.09134074929438066, validation MLE loss is 5.548354459436316, validation ppl is 256.815, validation CRF loss is 5.363412565306613, validation BLEU is 60.26 Training:At training steps 14600, training MLE loss is 0.1378579749859455, train CRF loss is 0.07417594364573234 Training:At training steps 14700, training 
MLE loss is 0.13436570693197608, train CRF loss is 0.07453937615099904 Training:At training steps 14800, training MLE loss is 0.13526799299644457, train CRF loss is 0.07731199622492416 Training:At training steps 14900, training MLE loss is 0.13383088278211744, train CRF loss is 0.07790850646176864 Training:At training steps 15000, training MLE loss is 0.1316655975474314, train CRF loss is 0.0762639445425093 Validation:At training steps 15000, training MLE loss is 0.1316655975474314, train CRF loss is 0.0762639445425093, validation MLE loss is 5.658380392350648, validation ppl is 286.684, validation CRF loss is 5.401213297718449, validation BLEU is 60.56 Training:At training steps 15100, training MLE loss is 0.13526401138951769, train CRF loss is 0.0766662037395372 Training:At training steps 15200, training MLE loss is 0.1266317418520066, train CRF loss is 0.07197826116277611 Training:At training steps 15300, training MLE loss is 0.12143856592532756, train CRF loss is 0.07013800872276078 Training:At training steps 15400, training MLE loss is 0.11906006465910196, train CRF loss is 0.06997900382135412 Training:At training steps 15500, training MLE loss is 0.11646938196234259, train CRF loss is 0.06765468779336334 Validation:At training steps 15500, training MLE loss is 0.11646938196234259, train CRF loss is 0.06765468779336334, validation MLE loss is 5.77137700821224, validation ppl is 320.979, validation CRF loss is 5.542571092906751, validation BLEU is 60.31 Training:At training steps 15600, training MLE loss is 0.10330368589598947, train CRF loss is 0.0648820839858081 Training:At training steps 15700, training MLE loss is 0.10171667316042658, train CRF loss is 0.06044404475698002 Training:At training steps 15800, training MLE loss is 0.09808025431027696, train CRF loss is 0.05974487106431601 Training:At training steps 15900, training MLE loss is 0.09786521030469658, train CRF loss is 0.05954914711185665 Training:At training steps 16000, training MLE loss is 
0.09669529003268144, train CRF loss is 0.057994166622431294 Validation:At training steps 16000, training MLE loss is 0.09669529003268144, train CRF loss is 0.057994166622431294, validation MLE loss is 6.210713424180684, validation ppl is 498.056, validation CRF loss is 5.682585753892598, validation BLEU is 58.8 Training:At training steps 16100, training MLE loss is 0.09422439041414692, train CRF loss is 0.05304858101871332 Training:At training steps 16200, training MLE loss is 0.09019525674958913, train CRF loss is 0.05150930365611089 Training:At training steps 16300, training MLE loss is 0.09045077007327261, train CRF loss is 0.052972808141828406 Training:At training steps 16400, training MLE loss is 0.0915275577648731, train CRF loss is 0.05352369403773821 Training:At training steps 16500, training MLE loss is 0.09207297397612274, train CRF loss is 0.052992570685206374 Validation:At training steps 16500, training MLE loss is 0.09207297397612274, train CRF loss is 0.052992570685206374, validation MLE loss is 5.873778703965638, validation ppl is 355.59, validation CRF loss is 5.690863113654287, validation BLEU is 59.98 Training:At training steps 16600, training MLE loss is 0.08343264362680201, train CRF loss is 0.04587626010209306 Training:At training steps 16700, training MLE loss is 0.08522337932850405, train CRF loss is 0.04691305059831052 Training:At training steps 16800, training MLE loss is 0.08202326249440375, train CRF loss is 0.04566609790548948 Training:At training steps 16900, training MLE loss is 0.08059203577801555, train CRF loss is 0.04510234564525348 Training:At training steps 17000, training MLE loss is 0.07904138555564531, train CRF loss is 0.0451074054581602 Validation:At training steps 17000, training MLE loss is 0.07904138555564531, train CRF loss is 0.0451074054581602, validation MLE loss is 6.14724892691562, validation ppl is 467.43, validation CRF loss is 6.126191873299448, validation BLEU is 59.19 Training:At training steps 17100, training 
MLE loss is 0.07017753975966116, train CRF loss is 0.040789195302952524 Training:At training steps 17200, training MLE loss is 0.07507420244157856, train CRF loss is 0.04622530179462956 Training:At training steps 17300, training MLE loss is 0.07180183301516081, train CRF loss is 0.04486849035568146 Training:At training steps 17400, training MLE loss is 0.06972475770909682, train CRF loss is 0.04416452746708131 Training:At training steps 17500, training MLE loss is 0.06734671839510065, train CRF loss is 0.04239770289847331 Validation:At training steps 17500, training MLE loss is 0.06734671839510065, train CRF loss is 0.04239770289847331, validation MLE loss is 6.0744720038614775, validation ppl is 434.62, validation CRF loss is 5.87323199134124, validation BLEU is 59.7 Training:At training steps 17600, training MLE loss is 0.06532064158323238, train CRF loss is 0.036492121594619675 Training:At training steps 17700, training MLE loss is 0.05946300553010701, train CRF loss is 0.03400781054884647 Training:At training steps 17800, training MLE loss is 0.05868224840117288, train CRF loss is 0.03318418687019441 Training:At training steps 17900, training MLE loss is 0.05768524668841508, train CRF loss is 0.033130023793323034 Training:At training steps 18000, training MLE loss is 0.056846171692136065, train CRF loss is 0.0325451339367844 Validation:At training steps 18000, training MLE loss is 0.056846171692136065, train CRF loss is 0.0325451339367844, validation MLE loss is 6.328240880840703, validation ppl is 560.17, validation CRF loss is 6.176153267684736, validation BLEU is 59.78 Training:At training steps 18100, training MLE loss is 0.056898363786835944, train CRF loss is 0.034449511547484234 Training:At training steps 18200, training MLE loss is 0.05756940596865093, train CRF loss is 0.0323313803549695 Training:At training steps 18300, training MLE loss is 0.05619930692499641, train CRF loss is 0.032033675248338035 Training:At training steps 18400, training MLE loss 
is 0.05596199887797013, train CRF loss is 0.032741378328207275 Training:At training steps 18500, training MLE loss is 0.054450411442550486, train CRF loss is 0.0312743550366327 Validation:At training steps 18500, training MLE loss is 0.054450411442550486, train CRF loss is 0.0312743550366327, validation MLE loss is 6.473591258651332, validation ppl is 647.806, validation CRF loss is 6.16940595915443, validation BLEU is 59.47 Training:At training steps 18600, training MLE loss is 0.055271960822534535, train CRF loss is 0.02691880936198686 Training:At training steps 18700, training MLE loss is 0.04961792570568548, train CRF loss is 0.0274380653111794 Training:At training steps 18800, training MLE loss is 0.04766116454037425, train CRF loss is 0.027338406819301136 Training:At training steps 18900, training MLE loss is 0.046721164390964416, train CRF loss is 0.02581818064269001 Training:At training steps 19000, training MLE loss is 0.046570689210667524, train CRF loss is 0.025904495880074478 Validation:At training steps 19000, training MLE loss is 0.046570689210667524, train CRF loss is 0.025904495880074478, validation MLE loss is 6.511913098787007, validation ppl is 673.113, validation CRF loss is 6.3976614161541585, validation BLEU is 59.17 Training:At training steps 19100, training MLE loss is 0.04300578902012376, train CRF loss is 0.023087793877416517 Training:At training steps 19200, training MLE loss is 0.04242261441807415, train CRF loss is 0.023387366648487243 Training:At training steps 19300, training MLE loss is 0.04130474360132837, train CRF loss is 0.023480234236508673 Training:At training steps 19400, training MLE loss is 0.042220038957588885, train CRF loss is 0.024507686617387277 Training:At training steps 19500, training MLE loss is 0.042377946975678926, train CRF loss is 0.024197330786378927 Validation:At training steps 19500, training MLE loss is 0.042377946975678926, train CRF loss is 0.024197330786378927, validation MLE loss is 6.685835587350946, 
validation ppl is 800.98, validation CRF loss is 6.609461511436262, validation BLEU is 59.19 Training:At training steps 19600, training MLE loss is 0.038817731903933464, train CRF loss is 0.0206898441902311 Training:At training steps 19700, training MLE loss is 0.03769768099455803, train CRF loss is 0.020446177033127524 Training:At training steps 19800, training MLE loss is 0.03666693733038452, train CRF loss is 0.02015159895242789 Training:At training steps 19900, training MLE loss is 0.034997903696919436, train CRF loss is 0.018874185804411282 Training:At training steps 20000, training MLE loss is 0.03394072824058881, train CRF loss is 0.018508796327002586 Validation:At training steps 20000, training MLE loss is 0.03394072824058881, train CRF loss is 0.018508796327002586, validation MLE loss is 6.924228680761237, validation ppl is 1016.61, validation CRF loss is 6.761939102097561, validation BLEU is 58.68 Training:At training steps 20100, training MLE loss is 0.034179719742722055, train CRF loss is 0.018242859078869175 Training:At training steps 20200, training MLE loss is 0.03076136422019715, train CRF loss is 0.016617623979319093 Training:At training steps 20300, training MLE loss is 0.029725008997006815, train CRF loss is 0.015172143568301901 Training:At training steps 20400, training MLE loss is 0.028330120907487776, train CRF loss is 0.014714200771544625 Training:At training steps 20500, training MLE loss is 0.02761320958571784, train CRF loss is 0.014719666100349997 Validation:At training steps 20500, training MLE loss is 0.02761320958571784, train CRF loss is 0.014719666100349997, validation MLE loss is 7.033572008735256, validation ppl is 1134.074, validation CRF loss is 6.933215552254727, validation BLEU is 59.34 Training:At training steps 20600, training MLE loss is 0.027753565851253655, train CRF loss is 0.016199706540597703 Training:At training steps 20700, training MLE loss is 0.02720805265073366, train CRF loss is 0.015625557934019697 Training:At 
training steps 20800, training MLE loss is 0.025578110628752027, train CRF loss is 0.014675980123859253 Training:At training steps 20900, training MLE loss is 0.025236001044915232, train CRF loss is 0.014454357534294394 Training:At training steps 21000, training MLE loss is 0.025485616164200864, train CRF loss is 0.014250949477569094 Validation:At training steps 21000, training MLE loss is 0.025485616164200864, train CRF loss is 0.014250949477569094, validation MLE loss is 7.268829408444856, validation ppl is 1434.87, validation CRF loss is 7.058154987661462, validation BLEU is 59.64 Training:At training steps 21100, training MLE loss is 0.024916738513505125, train CRF loss is 0.01374920249219457 Training:At training steps 21200, training MLE loss is 0.024497987271157023, train CRF loss is 0.013124405680012625 Training:At training steps 21300, training MLE loss is 0.023410908691491105, train CRF loss is 0.013100469584839645 Training:At training steps 21400, training MLE loss is 0.02246930603672629, train CRF loss is 0.012323801847347165 Training:At training steps 21500, training MLE loss is 0.021772877473548845, train CRF loss is 0.012306333736046646 Validation:At training steps 21500, training MLE loss is 0.021772877473548845, train CRF loss is 0.012306333736046646, validation MLE loss is 7.15752133883928, validation ppl is 1283.725, validation CRF loss is 7.088707189810903, validation BLEU is 58.85 Training:At training steps 21600, training MLE loss is 0.019091614530832645, train CRF loss is 0.009966123132877697 Training:At training steps 21700, training MLE loss is 0.019054520693142684, train CRF loss is 0.010632636492302864 Training:At training steps 21800, training MLE loss is 0.019121732207420406, train CRF loss is 0.01092351941961109 Training:At training steps 21900, training MLE loss is 0.019243717009217908, train CRF loss is 0.01102716832470791 Training:At training steps 22000, training MLE loss is 0.018271890739608953, train CRF loss is 
0.010748618210364207 Validation:At training steps 22000, training MLE loss is 0.018271890739608953, train CRF loss is 0.010748618210364207, validation MLE loss is 7.249385030646073, validation ppl is 1407.239, validation CRF loss is 7.176942841002815, validation BLEU is 59.26 Training:At training steps 22100, training MLE loss is 0.014687963393810968, train CRF loss is 0.007902336400924614 Training:At training steps 22200, training MLE loss is 0.013858927675807528, train CRF loss is 0.008074244485027466 Training:At training steps 22300, training MLE loss is 0.01293655355639359, train CRF loss is 0.007249335395994026 Training:At training steps 22400, training MLE loss is 0.012610156691575804, train CRF loss is 0.007181645659162619 Training:At training steps 22500, training MLE loss is 0.01290042036865146, train CRF loss is 0.0074138508489795425 Validation:At training steps 22500, training MLE loss is 0.01290042036865146, train CRF loss is 0.0074138508489795425, validation MLE loss is 7.475815427930732, validation ppl is 1764.84, validation CRF loss is 7.307647099620418, validation BLEU is 60.18 Training:At training steps 22600, training MLE loss is 0.012220510013753447, train CRF loss is 0.0068758208873778235 Training:At training steps 22700, training MLE loss is 0.011361494749692759, train CRF loss is 0.006258202239471178 Training:At training steps 22800, training MLE loss is 0.01100386449098533, train CRF loss is 0.0060542464382633936 Training:At training steps 22900, training MLE loss is 0.010435125187198704, train CRF loss is 0.005975745639804144 Training:At training steps 23000, training MLE loss is 0.010309619750525521, train CRF loss is 0.005815412769641997 Validation:At training steps 23000, training MLE loss is 0.010309619750525521, train CRF loss is 0.005815412769641997, validation MLE loss is 7.542449436689678, validation ppl is 1886.445, validation CRF loss is 7.40793672360872, validation BLEU is 59.16 Training:At training steps 23100, training MLE loss 
is 0.009423308456468238, train CRF loss is 0.005032998957953723 Training:At training steps 23200, training MLE loss is 0.008279962627477024, train CRF loss is 0.004651888304401117 Training:At training steps 23300, training MLE loss is 0.007536709351483714, train CRF loss is 0.00422790987926692 Training:At training steps 23400, training MLE loss is 0.007503763066610998, train CRF loss is 0.004299489897667529 Training:At training steps 23500, training MLE loss is 0.00727002786951876, train CRF loss is 0.004258459913794949 Validation:At training steps 23500, training MLE loss is 0.00727002786951876, train CRF loss is 0.004258459913794949, validation MLE loss is 7.821245074272156, validation ppl is 2493.007, validation CRF loss is 7.688841957794993, validation BLEU is 60.2 Training:At training steps 23600, training MLE loss is 0.0072441359691425514, train CRF loss is 0.004754380027230467 Training:At training steps 23700, training MLE loss is 0.006804231651355197, train CRF loss is 0.003976369439843916 Training:At training steps 23800, training MLE loss is 0.006644084288928944, train CRF loss is 0.0038367311476777285 Training:At training steps 23900, training MLE loss is 0.006622902831516013, train CRF loss is 0.003760132558590165 Training:At training steps 24000, training MLE loss is 0.0062835748298318206, train CRF loss is 0.003715791649625193 Validation:At training steps 24000, training MLE loss is 0.0062835748298318206, train CRF loss is 0.003715791649625193, validation MLE loss is 7.895793274829262, validation ppl is 2685.959, validation CRF loss is 7.727125927021629, validation BLEU is 60.41 Training:At training steps 24100, training MLE loss is 0.005967418276311847, train CRF loss is 0.0031636869515859136 Training:At training steps 24200, training MLE loss is 0.005853381759711542, train CRF loss is 0.0033267623241122536 Training:At training steps 24300, training MLE loss is 0.0054662131483700065, train CRF loss is 0.0032255588317048898 Training:At training steps 
24400, training MLE loss is 0.005135640199255105, train CRF loss is 0.002962783905714712 Training:At training steps 24500, training MLE loss is 0.005204888826888731, train CRF loss is 0.003009195495609677 Validation:At training steps 24500, training MLE loss is 0.005204888826888731, train CRF loss is 0.003009195495609677, validation MLE loss is 7.949364373558446, validation ppl is 2833.773, validation CRF loss is 7.900111185876947, validation BLEU is 58.89 Training:At training steps 24600, training MLE loss is 0.004182531828997902, train CRF loss is 0.0029700393726977393 Training:At training steps 24700, training MLE loss is 0.0046545039524923315, train CRF loss is 0.0027926614458526912 Training:At training steps 24800, training MLE loss is 0.004298811033019496, train CRF loss is 0.0027715568336695315 Training:At training steps 24900, training MLE loss is 0.003917809342358749, train CRF loss is 0.0024386813674704844 Training:At training steps 25000, training MLE loss is 0.003980168276191658, train CRF loss is 0.002355744393744174 Validation:At training steps 25000, training MLE loss is 0.003980168276191658, train CRF loss is 0.002355744393744174, validation MLE loss is 8.097884033855639, validation ppl is 3287.504, validation CRF loss is 7.990614520876031, validation BLEU is 58.83 Training:At training steps 25100, training MLE loss is 0.004675058121034059, train CRF loss is 0.0029763886162993457 Training:At training steps 25200, training MLE loss is 0.0037784124582712964, train CRF loss is 0.002618874872275363 Training:At training steps 25300, training MLE loss is 0.0035689860196306297, train CRF loss is 0.0024122887793052755 Training:At training steps 25400, training MLE loss is 0.003533787969594784, train CRF loss is 0.0024646436046418586 Training:At training steps 25500, training MLE loss is 0.003405958215272187, train CRF loss is 0.002256317597909084 Validation:At training steps 25500, training MLE loss is 0.003405958215272187, train CRF loss is 
0.002256317597909084, validation MLE loss is 8.314539238026267, validation ppl is 4082.804, validation CRF loss is 8.093400760700828, validation BLEU is 59.57 Training:At training steps 25600, training MLE loss is 0.0031286512856040583, train CRF loss is 0.001964592562629486 Training:At training steps 25700, training MLE loss is 0.0033908983375960013, train CRF loss is 0.0020247611133854338 Training:At training steps 25800, training MLE loss is 0.0032873012227462445, train CRF loss is 0.0018712382102583133 Training:At training steps 25900, training MLE loss is 0.003093645832498481, train CRF loss is 0.0017085911335734194 Training:At training steps 26000, training MLE loss is 0.0030412543725846885, train CRF loss is 0.0016765684753425854 Validation:At training steps 26000, training MLE loss is 0.0030412543725846885, train CRF loss is 0.0016765684753425854, validation MLE loss is 8.262835383415222, validation ppl is 3877.072, validation CRF loss is 8.155806961812472, validation BLEU is 60.48 Training:At training steps 26100, training MLE loss is 0.0021395775896999015, train CRF loss is 0.0014979143334066115 Training:At training steps 26200, training MLE loss is 0.002262836381053034, train CRF loss is 0.001310525250652801 Training:At training steps 26300, training MLE loss is 0.0023861020890910394, train CRF loss is 0.0014086391991448308 Training:At training steps 26400, training MLE loss is 0.0023734481333168413, train CRF loss is 0.001558301159644636 Training:At training steps 26500, training MLE loss is 0.0023700270943768256, train CRF loss is 0.0015441141547740882 Validation:At training steps 26500, training MLE loss is 0.0023700270943768256, train CRF loss is 0.0015441141547740882, validation MLE loss is 8.395449964623703, validation ppl is 4426.878, validation CRF loss is 8.339540525486594, validation BLEU is 59.59 Training:At training steps 26600, training MLE loss is 0.0022114759921434946, train CRF loss is 0.001086555000487177 Training:At training steps 
26700, training MLE loss is 0.0020203783926931052, train CRF loss is 0.001185757087835786 Training:At training steps 26800, training MLE loss is 0.0020258460239793853, train CRF loss is 0.0011920148134511343 Training:At training steps 26900, training MLE loss is 0.0019596842587629205, train CRF loss is 0.0011215530123395402 Training:At training steps 27000, training MLE loss is 0.0017816255147413466, train CRF loss is 0.0010490569755412977 Validation:At training steps 27000, training MLE loss is 0.0017816255147413466, train CRF loss is 0.0010490569755412977, validation MLE loss is 8.463143731418409, validation ppl is 4736.926, validation CRF loss is 8.284451948968988, validation BLEU is 59.42 Training:At training steps 27100, training MLE loss is 0.0018995248739794679, train CRF loss is 0.0010426832346591387 Training:At training steps 27200, training MLE loss is 0.0018164464626635795, train CRF loss is 0.0009604022647927457 Training:At training steps 27300, training MLE loss is 0.0019590005962557213, train CRF loss is 0.0010996540814910342 Training:At training steps 27400, training MLE loss is 0.0018182856009477306, train CRF loss is 0.0010002197591321416 Training:At training steps 27500, training MLE loss is 0.0018007155359422609, train CRF loss is 0.0010228488134730008 Validation:At training steps 27500, training MLE loss is 0.0018007155359422609, train CRF loss is 0.0010228488134730008, validation MLE loss is 8.477328181266785, validation ppl is 4804.596, validation CRF loss is 8.380302171958121, validation BLEU is 59.23 Training:At training steps 27600, training MLE loss is 0.0013562827784154884, train CRF loss is 0.0006371036545362774 Training:At training steps 27700, training MLE loss is 0.0017814121133099942, train CRF loss is 0.0011542366278810446 Training:At training steps 27800, training MLE loss is 0.001779409699964396, train CRF loss is 0.0011622062066021558 Training:At training steps 27900, training MLE loss is 0.0016273724789891584, train CRF loss is 
0.0011243595611049916 Training:At training steps 28000, training MLE loss is 0.00168813160749068, train CRF loss is 0.0010966646002361538 Validation:At training steps 28000, training MLE loss is 0.00168813160749068, train CRF loss is 0.0010966646002361538, validation MLE loss is 8.553827605749431, validation ppl is 5186.569, validation CRF loss is 8.401891250359384, validation BLEU is 60.19 Training:At training steps 28100, training MLE loss is 0.0012945550419425512, train CRF loss is 0.000870232578971466 Training:At training steps 28200, training MLE loss is 0.0014717396849770602, train CRF loss is 0.0008784035750327268 Training:At training steps 28300, training MLE loss is 0.0013146445655037896, train CRF loss is 0.0007264187245200156 Training:At training steps 28400, training MLE loss is 0.001259447312326147, train CRF loss is 0.0007912448736534694 Training:At training steps 28500, training MLE loss is 0.0012849154021447478, train CRF loss is 0.0008437511857845727 Validation:At training steps 28500, training MLE loss is 0.0012849154021447478, train CRF loss is 0.0008437511857845727, validation MLE loss is 8.662770045431037, validation ppl is 5783.533, validation CRF loss is 8.627392706118131, validation BLEU is 59.74 Training:At training steps 28600, training MLE loss is 0.0018953581333481207, train CRF loss is 0.0012409654362767154 Training:At training steps 28700, training MLE loss is 0.0016083421545823456, train CRF loss is 0.0011548982247288552 Training:At training steps 28800, training MLE loss is 0.0014660304510581911, train CRF loss is 0.0011673966528910299 Training:At training steps 28900, training MLE loss is 0.0013992761734401126, train CRF loss is 0.0010861483229429713 Training:At training steps 29000, training MLE loss is 0.001432650916442278, train CRF loss is 0.0010510473843788688 Validation:At training steps 29000, training MLE loss is 0.001432650916442278, train CRF loss is 0.0010510473843788688, validation MLE loss is 8.840582866417733, 
validation ppl is 6909.019, validation CRF loss is 8.63013928187521, validation BLEU is 60.22 Training:At training steps 29100, training MLE loss is 0.0009220715289494186, train CRF loss is 0.0006711233031484376 Training:At training steps 29200, training MLE loss is 0.0008567686133793684, train CRF loss is 0.0005885428283515059 Training:At training steps 29300, training MLE loss is 0.0009091414823798575, train CRF loss is 0.0006562878172575074 Training:At training steps 29400, training MLE loss is 0.0009310170664010298, train CRF loss is 0.0006184217136195391 Training:At training steps 29500, training MLE loss is 0.0009046576888855071, train CRF loss is 0.000585896157975327 Validation:At training steps 29500, training MLE loss is 0.0009046576888855071, train CRF loss is 0.000585896157975327, validation MLE loss is 8.744511698421679, validation ppl is 6276.148, validation CRF loss is 8.706991201952883, validation BLEU is 60.29 Training:At training steps 29600, training MLE loss is 0.0011371050433373363, train CRF loss is 0.0007281262238895581 Training:At training steps 29700, training MLE loss is 0.001080318215091231, train CRF loss is 0.0007822654111099503 Training:At training steps 29800, training MLE loss is 0.001200096898418644, train CRF loss is 0.0007339235714778968 Training:At training steps 29900, training MLE loss is 0.0010814777367378042, train CRF loss is 0.0006636762417253994 Training:At training steps 30000, training MLE loss is 0.0011514351168421674, train CRF loss is 0.0007268207148861388 Validation:At training steps 30000, training MLE loss is 0.0011514351168421674, train CRF loss is 0.0007268207148861388, validation MLE loss is 8.854189728435717, validation ppl is 7003.671, validation CRF loss is 8.688137261491073, validation BLEU is 60.0 Training:At training steps 30100, training MLE loss is 0.001186228886303009, train CRF loss is 0.000687841667173168 Training:At training steps 30200, training MLE loss is 0.0010967150618253413, train CRF loss is 
0.0005131108949634155 Training:At training steps 30300, training MLE loss is 0.0011138486114507865, train CRF loss is 0.0006158997077118325 Training:At training steps 30400, training MLE loss is 0.0010877573708411037, train CRF loss is 0.000565669786685169 Training:At training steps 30500, training MLE loss is 0.0010912412044336796, train CRF loss is 0.0005583293228621562 Validation:At training steps 30500, training MLE loss is 0.0010912412044336796, train CRF loss is 0.0005583293228621562, validation MLE loss is 8.881011523698506, validation ppl is 7194.064, validation CRF loss is 8.762970039719029, validation BLEU is 60.29 Training:At training steps 30600, training MLE loss is 0.0004045546445094916, train CRF loss is 0.0003411416820905133 Training:At training steps 30700, training MLE loss is 0.0007534510256382516, train CRF loss is 0.0004965798127264321 Training:At training steps 30800, training MLE loss is 0.0008564700485156603, train CRF loss is 0.0005012929984587977 Training:At training steps 30900, training MLE loss is 0.0007607673446651083, train CRF loss is 0.0004398360786451383 Training:At training steps 31000, training MLE loss is 0.0008440802786934134, train CRF loss is 0.00046980109716214535 Validation:At training steps 31000, training MLE loss is 0.0008440802786934134, train CRF loss is 0.00046980109716214535, validation MLE loss is 9.028084874153137, validation ppl is 8333.884, validation CRF loss is 8.861317289502997, validation BLEU is 59.77 Training:At training steps 31100, training MLE loss is 0.0009696424411291705, train CRF loss is 0.0004984254527858622 Training:At training steps 31200, training MLE loss is 0.0008259181372910687, train CRF loss is 0.0005040280343023507 Training:At training steps 31300, training MLE loss is 0.0008018411288719487, train CRF loss is 0.0005348674799591076 Training:At training steps 31400, training MLE loss is 0.0008509908050557571, train CRF loss is 0.0005256509370260487 Training:At training steps 31500, training 
MLE loss is 0.0007660455249306176, train CRF loss is 0.00046316798809333725 Validation:At training steps 31500, training MLE loss is 0.0007660455249306176, train CRF loss is 0.00046316798809333725, validation MLE loss is 8.904196036489386, validation ppl is 7362.803, validation CRF loss is 8.781302257588035, validation BLEU is 59.97 Training:At training steps 31600, training MLE loss is 0.0007304490935485535, train CRF loss is 0.0003681496650749816 Training:At training steps 31700, training MLE loss is 0.0008649525085072958, train CRF loss is 0.0005459122479830581 Training:At training steps 31800, training MLE loss is 0.0008105383666040471, train CRF loss is 0.0005301057144415111 Training:At training steps 31900, training MLE loss is 0.0007445861966630429, train CRF loss is 0.0005226984819618763 Training:At training steps 32000, training MLE loss is 0.0007356768569538682, train CRF loss is 0.0005291805540872465 Validation:At training steps 32000, training MLE loss is 0.0007356768569538682, train CRF loss is 0.0005291805540872465, validation MLE loss is 8.990410578878302, validation ppl is 8025.751, validation CRF loss is 8.880518210561652, validation BLEU is 59.96 Training:At training steps 32100, training MLE loss is 0.0004478577683960691, train CRF loss is 0.00031324981910114236 Training:At training steps 32200, training MLE loss is 0.0004700168558946986, train CRF loss is 0.0003054351896666096 Training:At training steps 32300, training MLE loss is 0.0005296610470812226, train CRF loss is 0.00029738804010168705 Training:At training steps 32400, training MLE loss is 0.0005462243816652642, train CRF loss is 0.0003187657158803292 Training:At training steps 32500, training MLE loss is 0.0006906071732850726, train CRF loss is 0.0004621076460010531 Validation:At training steps 32500, training MLE loss is 0.0006906071732850726, train CRF loss is 0.0004621076460010531, validation MLE loss is 8.974650037916083, validation ppl is 7900.253, validation CRF loss is 
8.756417638377139, validation BLEU is 60.64 Training:At training steps 32600, training MLE loss is 0.0005890956462876857, train CRF loss is 0.0002544561433900094 Training:At training steps 32700, training MLE loss is 0.0005853552611404083, train CRF loss is 0.00047003448599192366 Training:At training steps 32800, training MLE loss is 0.0004914368139221268, train CRF loss is 0.00034492062241320287 Training:At training steps 32900, training MLE loss is 0.0005512487212963267, train CRF loss is 0.00039061786220497387 Training:At training steps 33000, training MLE loss is 0.0004912338714206295, train CRF loss is 0.00034251142991968654 Validation:At training steps 33000, training MLE loss is 0.0004912338714206295, train CRF loss is 0.00034251142991968654, validation MLE loss is 8.925241790319744, validation ppl is 7519.401, validation CRF loss is 8.749840309745387, validation BLEU is 59.85 Training:At training steps 33100, training MLE loss is 0.00039915254389900066, train CRF loss is 7.124405309991478e-05 Training:At training steps 33200, training MLE loss is 0.0004722735849560446, train CRF loss is 0.00023244764973786135 Training:At training steps 33300, training MLE loss is 0.00045749221949431686, train CRF loss is 0.00024267658222263845 Training:At training steps 33400, training MLE loss is 0.0004316265042578905, train CRF loss is 0.00021844192129607309 Training:At training steps 33500, training MLE loss is 0.0004295554419822345, train CRF loss is 0.0002567216754777375 Validation:At training steps 33500, training MLE loss is 0.0004295554419822345, train CRF loss is 0.0002567216754777375, validation MLE loss is 8.993609102148758, validation ppl is 8051.463, validation CRF loss is 8.842970728874207, validation BLEU is 59.72 Training:At training steps 33600, training MLE loss is 0.0007053993451963786, train CRF loss is 0.00035589923844975947 Training:At training steps 33700, training MLE loss is 0.0006954000099195296, train CRF loss is 0.000347201473436769 Training:At 
training steps 33800, training MLE loss is 0.0006648761534148599, train CRF loss is 0.00032175395634863454 Training:At training steps 33900, training MLE loss is 0.0005730871338449138, train CRF loss is 0.0002720436743104371 Training:At training steps 34000, training MLE loss is 0.0005114211395854316, train CRF loss is 0.00028754325364401456 Validation:At training steps 34000, training MLE loss is 0.0005114211395854316, train CRF loss is 0.00028754325364401456, validation MLE loss is 9.036632010811253, validation ppl is 8405.42, validation CRF loss is 8.825097874591226, validation BLEU is 60.6 Training:At training steps 34100, training MLE loss is 0.0002740559379729166, train CRF loss is 0.00022490088754515637 Training:At training steps 34200, training MLE loss is 0.0002162713903647159, train CRF loss is 0.00019838243274461264 Training:At training steps 34300, training MLE loss is 0.00036211834063262414, train CRF loss is 0.00023407684093176506 Training:At training steps 34400, training MLE loss is 0.0003247083450486349, train CRF loss is 0.000201793831672199 Training:At training steps 34500, training MLE loss is 0.0004004831044932674, train CRF loss is 0.000193590903386875 Validation:At training steps 34500, training MLE loss is 0.0004004831044932674, train CRF loss is 0.000193590903386875, validation MLE loss is 9.038910376398187, validation ppl is 8424.592, validation CRF loss is 8.94504143689808, validation BLEU is 60.15 Training:At training steps 34600, training MLE loss is 0.00024994857866278143, train CRF loss is 0.0002613514188641819 Training:At training steps 34700, training MLE loss is 0.0001975930253193018, train CRF loss is 0.0001934916914095708 Training:At training steps 34800, training MLE loss is 0.0002343052079637973, train CRF loss is 0.00019382544665128194 Training:At training steps 34900, training MLE loss is 0.0003966058204722301, train CRF loss is 0.000266108619576404 Training:At training steps 35000, training MLE loss is 0.0004108675170304942, 
train CRF loss is 0.0002529568079828044 Validation:At training steps 35000, training MLE loss is 0.0004108675170304942, train CRF loss is 0.0002529568079828044, validation MLE loss is 9.057928104149667, validation ppl is 8586.342, validation CRF loss is 8.920903808192202, validation BLEU is 59.31 Training:At training steps 35100, training MLE loss is 0.00019074479472717878, train CRF loss is 0.00011629682660688534 Training:At training steps 35200, training MLE loss is 0.0003425519844878845, train CRF loss is 0.00022462877604209952 Training:At training steps 35300, training MLE loss is 0.00044260266456056565, train CRF loss is 0.0003166175841771738 Training:At training steps 35400, training MLE loss is 0.00037579896951374326, train CRF loss is 0.0002647583070271964 Training:At training steps 35500, training MLE loss is 0.0003835649495078096, train CRF loss is 0.00022419798616004006 Validation:At training steps 35500, training MLE loss is 0.0003835649495078096, train CRF loss is 0.00022419798616004006, validation MLE loss is 9.056727854829086, validation ppl is 8576.043, validation CRF loss is 8.937727256825095, validation BLEU is 59.69 Training:At training steps 35600, training MLE loss is 0.00012148296304757909, train CRF loss is 0.0002315904593824669 Training:At training steps 35700, training MLE loss is 0.00018831837279820052, train CRF loss is 0.0002062504472365445 Training:At training steps 35800, training MLE loss is 0.00032653426204535857, train CRF loss is 0.00020352748863156891 Training:At training steps 35900, training MLE loss is 0.00029024602098153365, train CRF loss is 0.00016480508124656847 Training:At training steps 36000, training MLE loss is 0.00027877620701204206, train CRF loss is 0.00015773958906205364 Validation:At training steps 36000, training MLE loss is 0.00027877620701204206, train CRF loss is 0.00015773958906205364, validation MLE loss is 9.113909834309629, validation ppl is 9080.73, validation CRF loss is 8.962728067448264, validation 
BLEU is 59.96 Training:At training steps 36100, training MLE loss is 0.000280028254311479, train CRF loss is 0.00014521596272432102 Training:At training steps 36200, training MLE loss is 0.0003164712576156666, train CRF loss is 0.00022270302503920146 Training:At training steps 36300, training MLE loss is 0.0003244936847287547, train CRF loss is 0.00018434164621525953 Training:At training steps 36400, training MLE loss is 0.00029925522333568947, train CRF loss is 0.0001564537540797195 Training:At training steps 36500, training MLE loss is 0.000284612242473509, train CRF loss is 0.0001438977942867732 Validation:At training steps 36500, training MLE loss is 0.000284612242473509, train CRF loss is 0.0001438977942867732, validation MLE loss is 9.066988411702608, validation ppl is 8664.491, validation CRF loss is 8.91640273520821, validation BLEU is 59.38 Training:At training steps 36600, training MLE loss is 0.0004995992879753823, train CRF loss is 0.00042549213358943126 Training:At training steps 36700, training MLE loss is 0.0003232455883826898, train CRF loss is 0.0002375901303116912 Training:At training steps 36800, training MLE loss is 0.0003358177492396806, train CRF loss is 0.00024589916832544006 Training:At training steps 36900, training MLE loss is 0.0002820425705433798, train CRF loss is 0.00021476255214412986 Training:At training steps 37000, training MLE loss is 0.000239256517723117, train CRF loss is 0.00018135221427934044 Validation:At training steps 37000, training MLE loss is 0.000239256517723117, train CRF loss is 0.00018135221427934044, validation MLE loss is 9.219712257385254, validation ppl is 10094.159, validation CRF loss is 8.96977804836474, validation BLEU is 59.93 Training:At training steps 37100, training MLE loss is 0.00013602576058697837, train CRF loss is 8.385465302511719e-06 Training:At training steps 37200, training MLE loss is 0.00010695798574947124, train CRF loss is 5.299721342990615e-06 Training:At training steps 37300, training MLE 
loss is 0.00014124385251315126, train CRF loss is 9.258377547514278e-05 Training:At training steps 37400, training MLE loss is 0.0001544829980898468, train CRF loss is 7.610027499359795e-05 Training:At training steps 37500, training MLE loss is 0.00013431931185411343, train CRF loss is 6.488067371248008e-05 Validation:At training steps 37500, training MLE loss is 0.00013431931185411343, train CRF loss is 6.488067371248008e-05, validation MLE loss is 9.023310755428515, validation ppl is 8294.192, validation CRF loss is 8.919578715374595, validation BLEU is 60.33 Training:At training steps 37600, training MLE loss is 0.00016346494594637323, train CRF loss is 0.00013435882313134418 Training:At training steps 37700, training MLE loss is 0.00016064229294181502, train CRF loss is 7.727326025014358e-05 Training:At training steps 37800, training MLE loss is 0.0001563141271761218, train CRF loss is 7.318137769600627e-05 Training:At training steps 37900, training MLE loss is 0.0001610221009944883, train CRF loss is 8.748902269076431e-05 Training:At training steps 38000, training MLE loss is 0.00021351990030907863, train CRF loss is 0.0001581466276881418 Validation:At training steps 38000, training MLE loss is 0.00021351990030907863, train CRF loss is 0.0001581466276881418, validation MLE loss is 9.19092501464643, validation ppl is 9807.719, validation CRF loss is 9.028727142434372, validation BLEU is 60.06 Training:At training steps 38100, training MLE loss is 0.0003956249468879318, train CRF loss is 0.00042704776585571034 Training:At training steps 38200, training MLE loss is 0.0003043748875349189, train CRF loss is 0.00024947153648875676 Training:At training steps 38300, training MLE loss is 0.0002572575028410285, train CRF loss is 0.00017559248120033733 Training:At training steps 38400, training MLE loss is 0.00023829890756236593, train CRF loss is 0.000185062424429987 Training:At training steps 38500, training MLE loss is 0.00021541193719488465, train CRF loss is 
0.000149773275493553 Validation:At training steps 38500, training MLE loss is 0.00021541193719488465, train CRF loss is 0.000149773275493553, validation MLE loss is 9.165278798655459, validation ppl is 9559.386, validation CRF loss is 9.026203613532218, validation BLEU is 60.42 Training:At training steps 38600, training MLE loss is 0.0003185817070587968, train CRF loss is 3.0723898791777946e-05 Training:At training steps 38700, training MLE loss is 0.00020068882109319523, train CRF loss is 2.757465692608152e-05 Training:At training steps 38800, training MLE loss is 0.00020243533085190975, train CRF loss is 6.387168240262308e-05 Training:At training steps 38900, training MLE loss is 0.00015597127351558407, train CRF loss is 5.0616512948449264e-05 Training:At training steps 39000, training MLE loss is 0.00022218504675946635, train CRF loss is 0.00010166093046882452 Validation:At training steps 39000, training MLE loss is 0.00022218504675946635, train CRF loss is 0.00010166093046882452, validation MLE loss is 9.101078635767886, validation ppl is 8964.957, validation CRF loss is 8.972085651598478, validation BLEU is 60.43 Training:At training steps 39100, training MLE loss is 6.183317741930199e-05, train CRF loss is 0.00017478625442417074 Training:At training steps 39200, training MLE loss is 3.903952098177493e-05, train CRF loss is 9.740624669076326e-05 Training:At training steps 39300, training MLE loss is 6.778269954191888e-05, train CRF loss is 6.95189669079858e-05 Training:At training steps 39400, training MLE loss is 6.009196502892169e-05, train CRF loss is 5.956618709099137e-05 Training:At training steps 39500, training MLE loss is 5.8155107025301634e-05, train CRF loss is 4.907322188369889e-05 Validation:At training steps 39500, training MLE loss is 5.8155107025301634e-05, train CRF loss is 4.907322188369889e-05, validation MLE loss is 9.143634193821958, validation ppl is 9354.7, validation CRF loss is 8.998105871049981, validation BLEU is 60.31 Training:At 
training steps 39600, training MLE loss is 0.0001575194046859274, train CRF loss is 0.00013918273241603885 Training:At training steps 39700, training MLE loss is 0.00011999715941125581, train CRF loss is 9.730899275701698e-05 Training:At training steps 39800, training MLE loss is 8.757245746023539e-05, train CRF loss is 6.581612688648726e-05 Training:At training steps 39900, training MLE loss is 7.453407464152393e-05, train CRF loss is 5.2497292885662625e-05 Training:At training steps 40000, training MLE loss is 6.523616401395442e-05, train CRF loss is 5.9446820020470526e-05 Validation:At training steps 40000, training MLE loss is 6.523616401395442e-05, train CRF loss is 5.9446820020470526e-05, validation MLE loss is 9.149942329055385, validation ppl is 9413.897, validation CRF loss is 8.991749957988137, validation BLEU is 60.13

Training:At training steps 100, training MLE loss is 2.1717731401324274, train CRF loss is 15.980498926639557 Training:At training steps 200, training MLE loss is 2.1496638102456926, train CRF loss is 15.447494914829731 Training:At training steps 300, training MLE loss is 2.1632305027792853, train CRF loss is 14.659334386587142 Training:At training steps 400, training MLE loss is 2.1785211004130542, train CRF loss is 13.957447227984666 Training:At training steps 500, training MLE loss is 2.1735479488968847, train CRF loss is 13.350011906743049 Validation:At training steps 500, training MLE loss is 2.1735479488968847, train CRF loss is 13.350011906743049, validation MLE loss is 2.171900921746304, validation ppl is 8.775, validation CRF loss is 9.970341167951885, validation BLEU is 0.71 Training:At training steps 600, training MLE loss is 2.1185822080075742, train CRF loss is 10.219210146069527 Training:At training steps 700, training MLE loss is 2.0987272767722605, train CRF loss is 10.038689716011286 Training:At training steps 800, training MLE loss is 2.0897069253772496, train CRF loss is 9.891252293090025 Training:At training steps 900,
training MLE loss is 2.0695489360764623, train CRF loss is 9.747049093544483 Training:At training steps 1000, training MLE loss is 2.0561148837208747, train CRF loss is 9.624453610658646 Validation:At training steps 1000, training MLE loss is 2.0561148837208747, train CRF loss is 9.624453610658646, validation MLE loss is 1.9018172888379348, validation ppl is 6.698, validation CRF loss is 8.577417185432033, validation BLEU is 3.77 Training:At training steps 1100, training MLE loss is 1.9731998317688704, train CRF loss is 8.87911224067211 Training:At training steps 1200, training MLE loss is 1.9840785229578615, train CRF loss is 8.767385147362948 Training:At training steps 1300, training MLE loss is 2.0018965027232967, train CRF loss is 8.662650276521841 Training:At training steps 1400, training MLE loss is 2.0158303409069775, train CRF loss is 8.535314926728606 Training:At training steps 1500, training MLE loss is 2.0320747469067575, train CRF loss is 8.402309090077877 Validation:At training steps 1500, training MLE loss is 2.0320747469067575, train CRF loss is 8.402309090077877, validation MLE loss is 2.22620079391881, validation ppl is 9.265, validation CRF loss is 7.564110360647502, validation BLEU is 32.34 Training:At training steps 1600, training MLE loss is 2.150935985594988, train CRF loss is 7.719476763010025 Training:At training steps 1700, training MLE loss is 2.16540182325989, train CRF loss is 7.59501041829586 Training:At training steps 1800, training MLE loss is 2.1788895408560833, train CRF loss is 7.46847324659427 Training:At training steps 1900, training MLE loss is 2.1973780230619013, train CRF loss is 7.364499156028033 Training:At training steps 2000, training MLE loss is 2.214938744068146, train CRF loss is 7.2569782250523565 Validation:At training steps 2000, training MLE loss is 2.214938744068146, train CRF loss is 7.2569782250523565, validation MLE loss is 2.3509802175195595, validation ppl is 10.496, validation CRF loss is 6.739558703020999, 
validation BLEU is 33.55 Training:At training steps 2100, training MLE loss is 2.313026740178466, train CRF loss is 6.615828494727611 Training:At training steps 2200, training MLE loss is 2.2971064081415533, train CRF loss is 6.522432666271925 Training:At training steps 2300, training MLE loss is 2.2865994846324127, train CRF loss is 6.41377397403121 Training:At training steps 2400, training MLE loss is 2.2960541334934534, train CRF loss is 6.32523199211806 Training:At training steps 2500, training MLE loss is 2.3039891836196182, train CRF loss is 6.213100199609995 Validation:At training steps 2500, training MLE loss is 2.3039891836196182, train CRF loss is 6.213100199609995, validation MLE loss is 2.826370176516081, validation ppl is 16.884, validation CRF loss is 6.260369620825115, validation BLEU is 36.25 Training:At training steps 2600, training MLE loss is 2.3458506274223327, train CRF loss is 5.571729794293642 Training:At training steps 2700, training MLE loss is 2.3590128177031873, train CRF loss is 5.479554129168391 Training:At training steps 2800, training MLE loss is 2.399712371900678, train CRF loss is 5.428224938809872 Training:At training steps 2900, training MLE loss is 2.3985865114815534, train CRF loss is 5.368164353258908 Training:At training steps 3000, training MLE loss is 2.4024889016747473, train CRF loss is 5.291929104119539 Validation:At training steps 3000, training MLE loss is 2.4024889016747473, train CRF loss is 5.291929104119539, validation MLE loss is 2.639981796866969, validation ppl is 14.013, validation CRF loss is 4.81180004697097, validation BLEU is 40.72 Training:At training steps 3100, training MLE loss is 2.4268264627456664, train CRF loss is 4.786335317790508 Training:At training steps 3200, training MLE loss is 2.4432136641815303, train CRF loss is 4.745540032163262 Training:At training steps 3300, training MLE loss is 2.424235284005602, train CRF loss is 4.661576209565004 Training:At training steps 3400, training MLE loss is 
2.416814477369189, train CRF loss is 4.576295952163637 Training:At training steps 3500, training MLE loss is 2.407819376602769, train CRF loss is 4.513830145001411 Validation:At training steps 3500, training MLE loss is 2.407819376602769, train CRF loss is 4.513830145001411, validation MLE loss is 2.568184394585459, validation ppl is 13.042, validation CRF loss is 4.938692579143925, validation BLEU is 39.25 Training:At training steps 3600, training MLE loss is 2.462867563068867, train CRF loss is 4.154083133339882 Training:At training steps 3700, training MLE loss is 2.405178325623274, train CRF loss is 4.075151573829353 Training:At training steps 3800, training MLE loss is 2.406274285316467, train CRF loss is 4.0155152530719835 Training:At training steps 3900, training MLE loss is 2.379813734292984, train CRF loss is 3.9725591743551196 Training:At training steps 4000, training MLE loss is 2.3738915291428566, train CRF loss is 3.92047960755229 Validation:At training steps 4000, training MLE loss is 2.3738915291428566, train CRF loss is 3.92047960755229, validation MLE loss is 2.8043621151070846, validation ppl is 16.517, validation CRF loss is 4.042380648223977, validation BLEU is 45.58 Training:At training steps 4100, training MLE loss is 2.2997633124142887, train CRF loss is 3.6572322091460228 Training:At training steps 4200, training MLE loss is 2.269409193303436, train CRF loss is 3.6015417101606726 Training:At training steps 4300, training MLE loss is 2.2373586850240827, train CRF loss is 3.5270946010450523 Training:At training steps 4400, training MLE loss is 2.2176802155748008, train CRF loss is 3.490497032869607 Training:At training steps 4500, training MLE loss is 2.19158246307075, train CRF loss is 3.450654644191265 Validation:At training steps 4500, training MLE loss is 2.19158246307075, train CRF loss is 3.450654644191265, validation MLE loss is 2.643473444800628, validation ppl is 14.062, validation CRF loss is 3.862715821517141, validation BLEU is 
49.08 Training:At training steps 4600, training MLE loss is 2.0470344261452555, train CRF loss is 3.287897620499134 Training:At training steps 4700, training MLE loss is 2.0029062732867895, train CRF loss is 3.2161080899462102 Training:At training steps 4800, training MLE loss is 1.9674384306867918, train CRF loss is 3.1713991291075945 Training:At training steps 4900, training MLE loss is 1.941312533505261, train CRF loss is 3.1324400427844377 Training:At training steps 5000, training MLE loss is 1.9192380537465215, train CRF loss is 3.0878282637521623 Validation:At training steps 5000, training MLE loss is 1.9192380537465215, train CRF loss is 3.0878282637521623, validation MLE loss is 2.9410692265159204, validation ppl is 18.936, validation CRF loss is 3.674807824586567, validation BLEU is 52.42 Training:At training steps 5100, training MLE loss is 1.7687736926227808, train CRF loss is 2.859533615782857 Training:At training steps 5200, training MLE loss is 1.7480364665016532, train CRF loss is 2.8070631173625586 Training:At training steps 5300, training MLE loss is 1.7304582074160377, train CRF loss is 2.7696163879334925 Training:At training steps 5400, training MLE loss is 1.7033239054959268, train CRF loss is 2.728145287428051 Training:At training steps 5500, training MLE loss is 1.6838533061891794, train CRF loss is 2.696395574249327 Validation:At training steps 5500, training MLE loss is 1.6838533061891794, train CRF loss is 2.696395574249327, validation MLE loss is 2.8775617135198495, validation ppl is 17.771, validation CRF loss is 3.6494701504707336, validation BLEU is 52.93 Training:At training steps 5600, training MLE loss is 1.5936960318312048, train CRF loss is 2.480217757336795 Training:At training steps 5700, training MLE loss is 1.5558701142296194, train CRF loss is 2.458092798497528 Training:At training steps 5800, training MLE loss is 1.5420357172812025, train CRF loss is 2.427251782802244 Training:At training steps 5900, training MLE loss is 
1.5231516008358448, train CRF loss is 2.39422299942933 Training:At training steps 6000, training MLE loss is 1.513269730709493, train CRF loss is 2.362602831333876 Validation:At training steps 6000, training MLE loss is 1.513269730709493, train CRF loss is 2.362602831333876, validation MLE loss is 3.152378886938095, validation ppl is 23.392, validation CRF loss is 3.5132672567116585, validation BLEU is 56.63 Training:At training steps 6100, training MLE loss is 1.4457135154679417, train CRF loss is 2.1454391354881226 Training:At training steps 6200, training MLE loss is 1.4019024896156043, train CRF loss is 2.1048167759086938 Training:At training steps 6300, training MLE loss is 1.3853856692835689, train CRF loss is 2.0899262911702197 Training:At training steps 6400, training MLE loss is 1.366945026377216, train CRF loss is 2.063477150015533 Training:At training steps 6500, training MLE loss is 1.350044223241508, train CRF loss is 2.0294142961017787 Validation:At training steps 6500, training MLE loss is 1.350044223241508, train CRF loss is 2.0294142961017787, validation MLE loss is 3.257627628351513, validation ppl is 25.988, validation CRF loss is 3.653502487822583, validation BLEU is 55.39 Training:At training steps 6600, training MLE loss is 1.2304438047204167, train CRF loss is 1.817593237310648 Training:At training steps 6700, training MLE loss is 1.2195669837063179, train CRF loss is 1.7896968400664628 Training:At training steps 6800, training MLE loss is 1.208649230155473, train CRF loss is 1.7845405550859867 Training:At training steps 6900, training MLE loss is 1.193481498749461, train CRF loss is 1.7551243675593287 Training:At training steps 7000, training MLE loss is 1.1768012634087355, train CRF loss is 1.724524119026959 Validation:At training steps 7000, training MLE loss is 1.1768012634087355, train CRF loss is 1.724524119026959, validation MLE loss is 3.1346082640321633, validation ppl is 22.98, validation CRF loss is 3.7391348324323954, validation 
BLEU is 56.1 Training:At training steps 7100, training MLE loss is 1.0764824985340238, train CRF loss is 1.5717228436283768 Training:At training steps 7200, training MLE loss is 1.060897398237139, train CRF loss is 1.532814249889925 Training:At training steps 7300, training MLE loss is 1.0517929436266422, train CRF loss is 1.5078141110793999 Training:At training steps 7400, training MLE loss is 1.0219652676256374, train CRF loss is 1.478792689590482 Training:At training steps 7500, training MLE loss is 1.0076446787752211, train CRF loss is 1.4563104643095284 Validation:At training steps 7500, training MLE loss is 1.0076446787752211, train CRF loss is 1.4563104643095284, validation MLE loss is 3.550917785418661, validation ppl is 34.845, validation CRF loss is 3.819064686172887, validation BLEU is 57.76 Training:At training steps 7600, training MLE loss is 0.9110962071735412, train CRF loss is 1.2960485809948294 Training:At training steps 7700, training MLE loss is 0.9166999311093241, train CRF loss is 1.2785402191383763 Training:At training steps 7800, training MLE loss is 0.8989687800997247, train CRF loss is 1.255835036681965 Training:At training steps 7900, training MLE loss is 0.8836804958828725, train CRF loss is 1.2336262367549353 Training:At training steps 8000, training MLE loss is 0.8741700017936528, train CRF loss is 1.2065055419262498 Validation:At training steps 8000, training MLE loss is 0.8741700017936528, train CRF loss is 1.2065055419262498, validation MLE loss is 3.822009968130212, validation ppl is 45.696, validation CRF loss is 4.011594789592843, validation BLEU is 59.12 Training:At training steps 8100, training MLE loss is 0.7966633305791766, train CRF loss is 1.0699411880620755 Training:At training steps 8200, training MLE loss is 0.8103040926647372, train CRF loss is 1.0512338166555855 Training:At training steps 8300, training MLE loss is 0.7985100121640911, train CRF loss is 1.0378625517408364 Training:At training steps 8400, training MLE 
loss is 0.7930431564035826, train CRF loss is 1.022007031394751 Training:At training steps 8500, training MLE loss is 0.7767127457740717, train CRF loss is 1.0066559033649973 Validation:At training steps 8500, training MLE loss is 0.7767127457740717, train CRF loss is 1.0066559033649973, validation MLE loss is 3.940195952591143, validation ppl is 51.429, validation CRF loss is 4.221783355662697, validation BLEU is 59.31 Training:At training steps 8600, training MLE loss is 0.674311353941448, train CRF loss is 0.8797832804685458 Training:At training steps 8700, training MLE loss is 0.672303274308797, train CRF loss is 0.8660756448027678 Training:At training steps 8800, training MLE loss is 0.6703645603684708, train CRF loss is 0.8579855678929016 Training:At training steps 8900, training MLE loss is 0.6642413517736714, train CRF loss is 0.8395459866734746 Training:At training steps 9000, training MLE loss is 0.6622886396690737, train CRF loss is 0.8271108075884404 Validation:At training steps 9000, training MLE loss is 0.6622886396690737, train CRF loss is 0.8271108075884404, validation MLE loss is 4.223971931557906, validation ppl is 68.304, validation CRF loss is 4.224803999850624, validation BLEU is 58.22 Training:At training steps 9100, training MLE loss is 0.5812943256739527, train CRF loss is 0.730990419111331 Training:At training steps 9200, training MLE loss is 0.5904437133762985, train CRF loss is 0.7165014886789141 Training:At training steps 9300, training MLE loss is 0.581485539705803, train CRF loss is 0.6967615989479237 Training:At training steps 9400, training MLE loss is 0.5696362374455203, train CRF loss is 0.683244161948096 Training:At training steps 9500, training MLE loss is 0.5661348838917911, train CRF loss is 0.6731708326146473 Validation:At training steps 9500, training MLE loss is 0.5661348838917911, train CRF loss is 0.6731708326146473, validation MLE loss is 4.484035322540684, validation ppl is 88.591, validation CRF loss is 
4.578259355143497, validation BLEU is 57.97 Training:At training steps 9600, training MLE loss is 0.5361109394370578, train CRF loss is 0.5865520781348459 Training:At training steps 9700, training MLE loss is 0.5165989224845543, train CRF loss is 0.5775485247731558 Training:At training steps 9800, training MLE loss is 0.516597777606609, train CRF loss is 0.5620960137967874 Training:At training steps 9900, training MLE loss is 0.5125068700700649, train CRF loss is 0.5498648116210098 Training:At training steps 10000, training MLE loss is 0.5018749550140928, train CRF loss is 0.5385008114759112 Validation:At training steps 10000, training MLE loss is 0.5018749550140928, train CRF loss is 0.5385008114759112, validation MLE loss is 4.717786977165623, validation ppl is 111.92, validation CRF loss is 4.938375846335762, validation BLEU is 57.58 Training:At training steps 10100, training MLE loss is 0.44007028454681857, train CRF loss is 0.45448239034973087 Training:At training steps 10200, training MLE loss is 0.4291684170841472, train CRF loss is 0.4481701994704781 Training:At training steps 10300, training MLE loss is 0.4392102448132937, train CRF loss is 0.4432119843152274 Training:At training steps 10400, training MLE loss is 0.4402495170674956, train CRF loss is 0.43763818309282215 Training:At training steps 10500, training MLE loss is 0.43869951141480124, train CRF loss is 0.43077229355517194 Validation:At training steps 10500, training MLE loss is 0.43869951141480124, train CRF loss is 0.43077229355517194, validation MLE loss is 5.093419206769843, validation ppl is 162.946, validation CRF loss is 4.98552595941644, validation BLEU is 61.25 Training:At training steps 10600, training MLE loss is 0.39397344129509293, train CRF loss is 0.3879075446477509 Training:At training steps 10700, training MLE loss is 0.38643389533564915, train CRF loss is 0.37667016248058643 Training:At training steps 10800, training MLE loss is 0.39000898980380344, train CRF loss is 
0.3700008495719521 Training:At training steps 10900, training MLE loss is 0.3870625981394551, train CRF loss is 0.3612003298172749 Training:At training steps 11000, training MLE loss is 0.3783071803053608, train CRF loss is 0.3538934206333797 Validation:At training steps 11000, training MLE loss is 0.3783071803053608, train CRF loss is 0.3538934206333797, validation MLE loss is 5.198952270181556, validation ppl is 181.082, validation CRF loss is 5.006847089842746, validation BLEU is 60.04 Training:At training steps 11100, training MLE loss is 0.34762958151026396, train CRF loss is 0.2997526730762911 Training:At training steps 11200, training MLE loss is 0.3433303146711842, train CRF loss is 0.2927610991296569 Training:At training steps 11300, training MLE loss is 0.33277445544508133, train CRF loss is 0.2936153232991516 Training:At training steps 11400, training MLE loss is 0.33644791719620115, train CRF loss is 0.29233406664948236 Training:At training steps 11500, training MLE loss is 0.334837684751139, train CRF loss is 0.2901142888465183 Validation:At training steps 11500, training MLE loss is 0.334837684751139, train CRF loss is 0.2901142888465183, validation MLE loss is 5.068856876147421, validation ppl is 158.992, validation CRF loss is 5.290193965560512, validation BLEU is 58.23 Training:At training steps 11600, training MLE loss is 0.33569229183718563, train CRF loss is 0.28681518539990064 Training:At training steps 11700, training MLE loss is 0.32739528450591027, train CRF loss is 0.2709402292886989 Training:At training steps 11800, training MLE loss is 0.32207789571102086, train CRF loss is 0.2599463830921862 Training:At training steps 11900, training MLE loss is 0.3166964212142193, train CRF loss is 0.25192538571500334 Training:At training steps 12000, training MLE loss is 0.30912475423893193, train CRF loss is 0.24719680222710302 Validation:At training steps 12000, training MLE loss is 0.30912475423893193, train CRF loss is 0.24719680222710302, 
validation MLE loss is 5.513072980077643, validation ppl is 247.912, validation CRF loss is 5.423084048848403, validation BLEU is 59.64 Training:At training steps 12100, training MLE loss is 0.2881133070791111, train CRF loss is 0.2155154546918129 Training:At training steps 12200, training MLE loss is 0.28187688747457285, train CRF loss is 0.22102766311463712 Training:At training steps 12300, training MLE loss is 0.27897889104385587, train CRF loss is 0.216276971641455 Training:At training steps 12400, training MLE loss is 0.2775996107846913, train CRF loss is 0.21178674491449784 Training:At training steps 12500, training MLE loss is 0.271354223270886, train CRF loss is 0.20662983581778827 Validation:At training steps 12500, training MLE loss is 0.271354223270886, train CRF loss is 0.20662983581778827, validation MLE loss is 5.730179927851024, validation ppl is 308.025, validation CRF loss is 5.503149365123949, validation BLEU is 58.83 Training:At training steps 12600, training MLE loss is 0.24817069953191095, train CRF loss is 0.18212594715976593 Training:At training steps 12700, training MLE loss is 0.24500605616725807, train CRF loss is 0.18114617478374384 Training:At training steps 12800, training MLE loss is 0.2417307168825937, train CRF loss is 0.1798768823331314 Training:At training steps 12900, training MLE loss is 0.23427015779063368, train CRF loss is 0.17450832342989087 Training:At training steps 13000, training MLE loss is 0.23154433552038972, train CRF loss is 0.16934881162573037 Validation:At training steps 13000, training MLE loss is 0.23154433552038972, train CRF loss is 0.16934881162573037, validation MLE loss is 5.7649819286246045, validation ppl is 318.933, validation CRF loss is 5.704931180728109, validation BLEU is 58.54 Training:At training steps 13100, training MLE loss is 0.19839937978427769, train CRF loss is 0.14283204253046278 Training:At training steps 13200, training MLE loss is 0.20413189193322978, train CRF loss is 0.14699376671104802 
Training:At training steps 13300, training MLE loss is 0.20401014970793768, train CRF loss is 0.14240148000860245 Training:At training steps 13400, training MLE loss is 0.1997622122732446, train CRF loss is 0.14266008714544567 Training:At training steps 13500, training MLE loss is 0.1955870578808317, train CRF loss is 0.1405610753246474 Validation:At training steps 13500, training MLE loss is 0.1955870578808317, train CRF loss is 0.1405610753246474, validation MLE loss is 5.905915655587849, validation ppl is 367.203, validation CRF loss is 5.765597942628358, validation BLEU is 58.93 Training:At training steps 13600, training MLE loss is 0.20070843712066563, train CRF loss is 0.13475577647418505 Training:At training steps 13700, training MLE loss is 0.18734002353508913, train CRF loss is 0.13558580495742262 Training:At training steps 13800, training MLE loss is 0.18391285333957058, train CRF loss is 0.13162873656634777 Training:At training steps 13900, training MLE loss is 0.1828453440946214, train CRF loss is 0.1282918484848051 Training:At training steps 14000, training MLE loss is 0.17936903121689102, train CRF loss is 0.12460989471059838 Validation:At training steps 14000, training MLE loss is 0.17936903121689102, train CRF loss is 0.12460989471059838, validation MLE loss is 5.973139740918812, validation ppl is 392.737, validation CRF loss is 5.903415193683223, validation BLEU is 60.43 Training:At training steps 14100, training MLE loss is 0.175563222306082, train CRF loss is 0.10734408373561109 Training:At training steps 14200, training MLE loss is 0.16835514340553345, train CRF loss is 0.10446635611680222 Training:At training steps 14300, training MLE loss is 0.1656855420774688, train CRF loss is 0.10377215814768079 Training:At training steps 14400, training MLE loss is 0.16705063682726176, train CRF loss is 0.10452911580129978 Training:At training steps 14500, training MLE loss is 0.16305544432481292, train CRF loss is 0.1046986093237506 Validation:At training 
steps 14500, training MLE loss is 0.16305544432481292, train CRF loss is 0.1046986093237506, validation MLE loss is 6.047951968092668, validation ppl is 423.245, validation CRF loss is 5.983198793310868, validation BLEU is 60.63 Training:At training steps 14600, training MLE loss is 0.14998826491344516, train CRF loss is 0.0970267070019554 Training:At training steps 14700, training MLE loss is 0.1491893406577219, train CRF loss is 0.09931931936156388 Training:At training steps 14800, training MLE loss is 0.14921187724402443, train CRF loss is 0.0993650475714253 Training:At training steps 14900, training MLE loss is 0.14848351304428206, train CRF loss is 0.09734436790712664 Training:At training steps 15000, training MLE loss is 0.15004828367449044, train CRF loss is 0.09635930647057739 Validation:At training steps 15000, training MLE loss is 0.15004828367449044, train CRF loss is 0.09635930647057739, validation MLE loss is 6.119932246835608, validation ppl is 454.834, validation CRF loss is 6.085922184743379, validation BLEU is 60.14 Training:At training steps 15100, training MLE loss is 0.13077036108496032, train CRF loss is 0.0814539032129369 Training:At training steps 15200, training MLE loss is 0.13041718357803803, train CRF loss is 0.08004672673075675 Training:At training steps 15300, training MLE loss is 0.1266527167918654, train CRF loss is 0.08227754971871112 Training:At training steps 15400, training MLE loss is 0.12685611522728324, train CRF loss is 0.08196692770923107 Training:At training steps 15500, training MLE loss is 0.12699953495706176, train CRF loss is 0.08034730372156765 Validation:At training steps 15500, training MLE loss is 0.12699953495706176, train CRF loss is 0.08034730372156765, validation MLE loss is 6.256576745133651, validation ppl is 521.431, validation CRF loss is 6.405696407744759, validation BLEU is 60.47 Training:At training steps 15600, training MLE loss is 0.1290804371439117, train CRF loss is 0.07926899501113212 Training:At 
training steps 15700, training MLE loss is 0.12983942935418782, train CRF loss is 0.0824866900428134 Training:At training steps 15800, training MLE loss is 0.12841579761914546, train CRF loss is 0.08072292392570148 Training:At training steps 15900, training MLE loss is 0.12358014205566632, train CRF loss is 0.07724990531997804 Training:At training steps 16000, training MLE loss is 0.12198452301473844, train CRF loss is 0.07551724616489879 Validation:At training steps 16000, training MLE loss is 0.12198452301473844, train CRF loss is 0.07551724616489879, validation MLE loss is 6.136161455982609, validation ppl is 462.276, validation CRF loss is 6.205081315417039, validation BLEU is 60.31 Training:At training steps 16100, training MLE loss is 0.10067006707512974, train CRF loss is 0.06363780211802464 Training:At training steps 16200, training MLE loss is 0.09963532596250957, train CRF loss is 0.06305840944508702 Training:At training steps 16300, training MLE loss is 0.100169940381442, train CRF loss is 0.06306331721751993 Training:At training steps 16400, training MLE loss is 0.09947647481142724, train CRF loss is 0.06254639800373553 Training:At training steps 16500, training MLE loss is 0.09843576088388727, train CRF loss is 0.061627624527954825 Validation:At training steps 16500, training MLE loss is 0.09843576088388727, train CRF loss is 0.061627624527954825, validation MLE loss is 6.657123261376431, validation ppl is 778.309, validation CRF loss is 6.571410957135652, validation BLEU is 60.44 Training:At training steps 16600, training MLE loss is 0.09565873884613098, train CRF loss is 0.05980375367116153 Training:At training steps 16700, training MLE loss is 0.09496971733775297, train CRF loss is 0.059869349696551805 Training:At training steps 16800, training MLE loss is 0.09443216108735025, train CRF loss is 0.05928828882029601 Training:At training steps 16900, training MLE loss is 0.09513114719127856, train CRF loss is 0.05939953986642706 Training:At training 
steps 17000, training MLE loss is 0.09351492858881624, train CRF loss is 0.05762374127838075 Validation:At training steps 17000, training MLE loss is 0.09351492858881624, train CRF loss is 0.05762374127838075, validation MLE loss is 6.676929364078923, validation ppl is 793.878, validation CRF loss is 6.70785098954251, validation BLEU is 59.6 Training:At training steps 17100, training MLE loss is 0.0839634492730812, train CRF loss is 0.05463849070589163 Training:At training steps 17200, training MLE loss is 0.08266888214624174, train CRF loss is 0.054785945820182175 Training:At training steps 17300, training MLE loss is 0.0806015504484149, train CRF loss is 0.05346168268242726 Training:At training steps 17400, training MLE loss is 0.07997528512289463, train CRF loss is 0.0528196995916818 Training:At training steps 17500, training MLE loss is 0.07964534274214645, train CRF loss is 0.05190756082498842 Validation:At training steps 17500, training MLE loss is 0.07964534274214645, train CRF loss is 0.05190756082498842, validation MLE loss is 6.782194357169302, validation ppl is 882.002, validation CRF loss is 6.797831052228024, validation BLEU is 59.83 Training:At training steps 17600, training MLE loss is 0.08349845455010722, train CRF loss is 0.05120474166725671 Training:At training steps 17700, training MLE loss is 0.07884658023674547, train CRF loss is 0.050316845293990865 Training:At training steps 17800, training MLE loss is 0.07893113618246617, train CRF loss is 0.04941190618305759 Training:At training steps 17900, training MLE loss is 0.07600461024236864, train CRF loss is 0.047565382932629775 Training:At training steps 18000, training MLE loss is 0.07425908154555531, train CRF loss is 0.0456146661638482 Validation:At training steps 18000, training MLE loss is 0.07425908154555531, train CRF loss is 0.0456146661638482, validation MLE loss is 6.797506479840529, validation ppl is 895.611, validation CRF loss is 6.881734973505924, validation BLEU is 60.66 Training:At 
training steps 18100, training MLE loss is 0.07262344812114861, train CRF loss is 0.03925599606525694 Training:At training steps 18200, training MLE loss is 0.06673758022900814, train CRF loss is 0.03835455179508394 Training:At training steps 18300, training MLE loss is 0.06582639840358544, train CRF loss is 0.040590918110766934 Training:At training steps 18400, training MLE loss is 0.0644737996280837, train CRF loss is 0.03942186868672671 Training:At training steps 18500, training MLE loss is 0.0627950158194227, train CRF loss is 0.038207846653740316 Validation:At training steps 18500, training MLE loss is 0.0627950158194227, train CRF loss is 0.038207846653740316, validation MLE loss is 7.060185444982428, validation ppl is 1164.661, validation CRF loss is 7.000014267469707, validation BLEU is 60.21 Training:At training steps 18600, training MLE loss is 0.05516069285360913, train CRF loss is 0.029931640696230062 Training:At training steps 18700, training MLE loss is 0.056814460348339535, train CRF loss is 0.03320457410863476 Training:At training steps 18800, training MLE loss is 0.05614797988292854, train CRF loss is 0.03407946033164445 Training:At training steps 18900, training MLE loss is 0.055496614229970015, train CRF loss is 0.033851847471222135 Training:At training steps 19000, training MLE loss is 0.05601810005526272, train CRF loss is 0.03395858169800792 Validation:At training steps 19000, training MLE loss is 0.05601810005526272, train CRF loss is 0.03395858169800792, validation MLE loss is 7.059527541461744, validation ppl is 1163.895, validation CRF loss is 7.142239925108458, validation BLEU is 60.41 Training:At training steps 19100, training MLE loss is 0.046309531706672825, train CRF loss is 0.028031125452603618 Training:At training steps 19200, training MLE loss is 0.04874246702071417, train CRF loss is 0.028964564599387687 Training:At training steps 19300, training MLE loss is 0.04920159592324457, train CRF loss is 0.02895547699784416 Training:At 
training steps 19400, training MLE loss is 0.05053058615901108, train CRF loss is 0.02896469634887378 Training:At training steps 19500, training MLE loss is 0.05007314527531938, train CRF loss is 0.02969968796933938 Validation:At training steps 19500, training MLE loss is 0.05007314527531938, train CRF loss is 0.02969968796933938, validation MLE loss is 7.108850212473619, validation ppl is 1222.741, validation CRF loss is 7.137746647784584, validation BLEU is 60.83 Training:At training steps 19600, training MLE loss is 0.045273404680535805, train CRF loss is 0.02658472676064278 Training:At training steps 19700, training MLE loss is 0.04423192378329723, train CRF loss is 0.025553148792518066 Training:At training steps 19800, training MLE loss is 0.043078625478003736, train CRF loss is 0.024793328532852734 Training:At training steps 19900, training MLE loss is 0.04265422873583054, train CRF loss is 0.02417426762656066 Training:At training steps 20000, training MLE loss is 0.04223183189647075, train CRF loss is 0.023708732300586115 Validation:At training steps 20000, training MLE loss is 0.04223183189647075, train CRF loss is 0.023708732300586115, validation MLE loss is 7.4095932527592305, validation ppl is 1651.754, validation CRF loss is 7.4288085134405835, validation BLEU is 59.78 Training:At training steps 20100, training MLE loss is 0.039572962027537247, train CRF loss is 0.020859752461353152 Training:At training steps 20200, training MLE loss is 0.038057362985363025, train CRF loss is 0.021363023391541325 Training:At training steps 20300, training MLE loss is 0.03631223166554796, train CRF loss is 0.02141830156414002 Training:At training steps 20400, training MLE loss is 0.035862812753157905, train CRF loss is 0.021061514723713416 Training:At training steps 20500, training MLE loss is 0.036025846633775976, train CRF loss is 0.02130795253105657 Validation:At training steps 20500, training MLE loss is 0.036025846633775976, train CRF loss is 0.02130795253105657, 
validation MLE loss is 7.610862565668006, validation ppl is 2020.02, validation CRF loss is 7.522579108413897, validation BLEU is 60.17 Training:At training steps 20600, training MLE loss is 0.035684959553427334, train CRF loss is 0.021830544960980164 Training:At training steps 20700, training MLE loss is 0.033407961451956736, train CRF loss is 0.019008282018664877 Training:At training steps 20800, training MLE loss is 0.0322914581755488, train CRF loss is 0.018589785042764873 Training:At training steps 20900, training MLE loss is 0.03111004438747063, train CRF loss is 0.018985285425386366 Training:At training steps 21000, training MLE loss is 0.030870916181350773, train CRF loss is 0.018939824567242653 Validation:At training steps 21000, training MLE loss is 0.030870916181350773, train CRF loss is 0.018939824567242653, validation MLE loss is 7.733497751386542, validation ppl is 2283.576, validation CRF loss is 7.581159105426387, validation BLEU is 59.6 Training:At training steps 21100, training MLE loss is 0.03136454014078613, train CRF loss is 0.019089281890344312 Training:At training steps 21200, training MLE loss is 0.02741247770129588, train CRF loss is 0.016531603013587453 Training:At training steps 21300, training MLE loss is 0.02604389185905901, train CRF loss is 0.01638649908303097 Training:At training steps 21400, training MLE loss is 0.025502906905643056, train CRF loss is 0.016171279733479742 Training:At training steps 21500, training MLE loss is 0.025089662344703185, train CRF loss is 0.015737646238150625 Validation:At training steps 21500, training MLE loss is 0.025089662344703185, train CRF loss is 0.015737646238150625, validation MLE loss is 7.746689636456339, validation ppl is 2313.9, validation CRF loss is 7.711765047750975, validation BLEU is 59.95 Training:At training steps 21600, training MLE loss is 0.020473437932269432, train CRF loss is 0.012814852464178638 Training:At training steps 21700, training MLE loss is 0.01988071468652092, train CRF 
loss is 0.012602098214753958 Training:At training steps 21800, training MLE loss is 0.019289680565960265, train CRF loss is 0.012147852797887515 Training:At training steps 21900, training MLE loss is 0.020009223970256205, train CRF loss is 0.012870767126453166 Training:At training steps 22000, training MLE loss is 0.01948030669736085, train CRF loss is 0.012489868526929094 Validation:At training steps 22000, training MLE loss is 0.01948030669736085, train CRF loss is 0.012489868526929094, validation MLE loss is 8.019413116731142, validation ppl is 3039.393, validation CRF loss is 7.955217863384046, validation BLEU is 60.63 Training:At training steps 22100, training MLE loss is 0.02116915347910537, train CRF loss is 0.013076029557757449 Training:At training steps 22200, training MLE loss is 0.019739185121350634, train CRF loss is 0.012144379427563022 Training:At training steps 22300, training MLE loss is 0.01850870699027708, train CRF loss is 0.011797642291906146 Training:At training steps 22400, training MLE loss is 0.01794858765412237, train CRF loss is 0.011371721621012227 Training:At training steps 22500, training MLE loss is 0.016776499808091778, train CRF loss is 0.010570873251497996 Validation:At training steps 22500, training MLE loss is 0.016776499808091778, train CRF loss is 0.010570873251497996, validation MLE loss is 8.125297075823733, validation ppl is 3378.872, validation CRF loss is 8.096541134934677, validation BLEU is 59.82 Training:At training steps 22600, training MLE loss is 0.01223470647067932, train CRF loss is 0.00694564135903196 Training:At training steps 22700, training MLE loss is 0.011570428106290694, train CRF loss is 0.006890862259396741 Training:At training steps 22800, training MLE loss is 0.011509797823188693, train CRF loss is 0.006771619516059613 Training:At training steps 22900, training MLE loss is 0.01168060280869328, train CRF loss is 0.006889529956744642 Training:At training steps 23000, training MLE loss is 
0.011272812286167243, train CRF loss is 0.006765381383093118 Validation:At training steps 23000, training MLE loss is 0.011272812286167243, train CRF loss is 0.006765381383093118, validation MLE loss is 8.236892267277366, validation ppl is 3777.782, validation CRF loss is 8.259434420811502, validation BLEU is 60.38 Training:At training steps 23100, training MLE loss is 0.010306856813612092, train CRF loss is 0.005246984092862342 Training:At training steps 23200, training MLE loss is 0.011325509233966358, train CRF loss is 0.0065131467151991985 Training:At training steps 23300, training MLE loss is 0.010790214041382458, train CRF loss is 0.005987795363325669 Training:At training steps 23400, training MLE loss is 0.010161757937046233, train CRF loss is 0.005878848037500999 Training:At training steps 23500, training MLE loss is 0.009689154436918871, train CRF loss is 0.005565631667222091 Validation:At training steps 23500, training MLE loss is 0.009689154436918871, train CRF loss is 0.005565631667222091, validation MLE loss is 8.496901876048037, validation ppl is 4899.566, validation CRF loss is 8.441153432193556, validation BLEU is 60.9 Training:At training steps 23600, training MLE loss is 0.007066319723043422, train CRF loss is 0.0037314332754575163 Training:At training steps 23700, training MLE loss is 0.007858260082173772, train CRF loss is 0.004480224244153543 Training:At training steps 23800, training MLE loss is 0.007812076380715263, train CRF loss is 0.004670692200767981 Training:At training steps 23900, training MLE loss is 0.007844829980840287, train CRF loss is 0.004745924783393277 Training:At training steps 24000, training MLE loss is 0.007597822827986308, train CRF loss is 0.004495660908040965 Validation:At training steps 24000, training MLE loss is 0.007597822827986308, train CRF loss is 0.004495660908040965, validation MLE loss is 8.660868058079167, validation ppl is 5772.543, validation CRF loss is 8.612270587369016, validation BLEU is 60.79 
Training:At training steps 24100, training MLE loss is 0.0070173636745281265, train CRF loss is 0.004356562985352328 Training:At training steps 24200, training MLE loss is 0.006859469161618639, train CRF loss is 0.00439284950008815 Training:At training steps 24300, training MLE loss is 0.006802501208639503, train CRF loss is 0.004397574663161563 Training:At training steps 24400, training MLE loss is 0.006449612600239511, train CRF loss is 0.004122235058748638 Training:At training steps 24500, training MLE loss is 0.006137589003334291, train CRF loss is 0.004018123426108364 Validation:At training steps 24500, training MLE loss is 0.006137589003334291, train CRF loss is 0.004018123426108364, validation MLE loss is 8.692100242564553, validation ppl is 5955.677, validation CRF loss is 8.62780975668054, validation BLEU is 61.51 Training:At training steps 24600, training MLE loss is 0.0057054928535401825, train CRF loss is 0.0027478225962317814 Training:At training steps 24700, training MLE loss is 0.004846717523529005, train CRF loss is 0.00280162515957874 Training:At training steps 24800, training MLE loss is 0.004407260143128907, train CRF loss is 0.002640772281311503 Training:At training steps 24900, training MLE loss is 0.004386081729684407, train CRF loss is 0.0026950924405634404 Training:At training steps 25000, training MLE loss is 0.004257964514918999, train CRF loss is 0.0025356991918154994 Validation:At training steps 25000, training MLE loss is 0.004257964514918999, train CRF loss is 0.0025356991918154994, validation MLE loss is 8.868985477246737, validation ppl is 7108.066, validation CRF loss is 8.775565282294625, validation BLEU is 59.87 Training:At training steps 25100, training MLE loss is 0.004276390907955646, train CRF loss is 0.002057032702853032 Training:At training steps 25200, training MLE loss is 0.004680996004177851, train CRF loss is 0.002517517079940472 Training:At training steps 25300, training MLE loss is 0.004275634161467988, train CRF loss 
is 0.002417470187541942 Training:At training steps 25400, training MLE loss is 0.003988246120539593, train CRF loss is 0.0023222679181953564 Training:At training steps 25500, training MLE loss is 0.003917544523464768, train CRF loss is 0.0022772082197503336 Validation:At training steps 25500, training MLE loss is 0.003917544523464768, train CRF loss is 0.0022772082197503336, validation MLE loss is 9.046151324322349, validation ppl is 8485.816, validation CRF loss is 8.938045501708984, validation BLEU is 60.55 Training:At training steps 25600, training MLE loss is 0.003695287802101206, train CRF loss is 0.002007682339541086 Training:At training steps 25700, training MLE loss is 0.003686504942831402, train CRF loss is 0.002156487573070165 Training:At training steps 25800, training MLE loss is 0.0038509904541634693, train CRF loss is 0.0021472200794332504 Training:At training steps 25900, training MLE loss is 0.0037327702297518433, train CRF loss is 0.002120983296036967 Training:At training steps 26000, training MLE loss is 0.003445652382748268, train CRF loss is 0.0019887349954194724 Validation:At training steps 26000, training MLE loss is 0.003445652382748268, train CRF loss is 0.0019887349954194724, validation MLE loss is 8.979322307988218, validation ppl is 7937.251, validation CRF loss is 9.004903849802519, validation BLEU is 60.79 Training:At training steps 26100, training MLE loss is 0.0025502396419998293, train CRF loss is 0.0018595346565427517 Training:At training steps 26200, training MLE loss is 0.002678074081484554, train CRF loss is 0.0018551876786299236 Training:At training steps 26300, training MLE loss is 0.002820048095605587, train CRF loss is 0.001963488628054143 Training:At training steps 26400, training MLE loss is 0.0028011879187021405, train CRF loss is 0.001848061942515632 Training:At training steps 26500, training MLE loss is 0.002648124346560041, train CRF loss is 0.001723541238822154 Validation:At training steps 26500, training MLE loss is 
0.002648124346560041, train CRF loss is 0.001723541238822154, validation MLE loss is 9.158129346998114, validation ppl is 9491.286, validation CRF loss is 9.180725044325778, validation BLEU is 60.45 Training:At training steps 26600, training MLE loss is 0.003280944286687454, train CRF loss is 0.0021385078888925247 Training:At training steps 26700, training MLE loss is 0.003015490714264705, train CRF loss is 0.0018760366428638875 Training:At training steps 26800, training MLE loss is 0.0030056558494925014, train CRF loss is 0.0018494342755577353 Training:At training steps 26900, training MLE loss is 0.0027697367925134425, train CRF loss is 0.0017007148355216916 Training:At training steps 27000, training MLE loss is 0.0025424340207581566, train CRF loss is 0.0015408151578929505 Validation:At training steps 27000, training MLE loss is 0.0025424340207581566, train CRF loss is 0.0015408151578929505, validation MLE loss is 9.217054950563531, validation ppl is 10067.372, validation CRF loss is 9.194199750297948, validation BLEU is 60.75 Training:At training steps 27100, training MLE loss is 0.0024642269783236217, train CRF loss is 0.0013858834581064094 Training:At training steps 27200, training MLE loss is 0.0023224512495607095, train CRF loss is 0.0012348800447995578 Training:At training steps 27300, training MLE loss is 0.001991923827420419, train CRF loss is 0.001095751971369731 Training:At training steps 27400, training MLE loss is 0.0018699756907674214, train CRF loss is 0.0010458271624301753 Training:At training steps 27500, training MLE loss is 0.0018247642553186865, train CRF loss is 0.0010403937279013452 Validation:At training steps 27500, training MLE loss is 0.0018247642553186865, train CRF loss is 0.0010403937279013452, validation MLE loss is 9.2837056988164, validation ppl is 10761.236, validation CRF loss is 9.238595096688522, validation BLEU is 60.09 Training:At training steps 27600, training MLE loss is 0.0015212469740340556, train CRF loss is 
0.0006322161513428703 Training:At training steps 27700, training MLE loss is 0.0019949345825185797, train CRF loss is 0.0009705125635514578 Training:At training steps 27800, training MLE loss is 0.002350737611269068, train CRF loss is 0.0010980545701740807 Training:At training steps 27900, training MLE loss is 0.0021628422794889727, train CRF loss is 0.0011252578635290267 Training:At training steps 28000, training MLE loss is 0.001989857983789544, train CRF loss is 0.0011109737622708176 Validation:At training steps 28000, training MLE loss is 0.001989857983789544, train CRF loss is 0.0011109737622708176, validation MLE loss is 9.190084501316672, validation ppl is 9799.479, validation CRF loss is 9.175359032656017, validation BLEU is 61.36 Training:At training steps 28100, training MLE loss is 0.001355675436137612, train CRF loss is 0.0008138005192752384 Training:At training steps 28200, training MLE loss is 0.0013193811431166558, train CRF loss is 0.0007650522746707189 Training:At training steps 28300, training MLE loss is 0.0013668041602224541, train CRF loss is 0.0007625109991045805 Training:At training steps 28400, training MLE loss is 0.001447613463830034, train CRF loss is 0.0008651816853351502 Training:At training steps 28500, training MLE loss is 0.001460378371255245, train CRF loss is 0.0008422372894100211 Validation:At training steps 28500, training MLE loss is 0.001460378371255245, train CRF loss is 0.0008422372894100211, validation MLE loss is 9.364394947102195, validation ppl is 11665.545, validation CRF loss is 9.285821826834427, validation BLEU is 60.87 Training:At training steps 28600, training MLE loss is 0.001040381965453372, train CRF loss is 0.0006537775008401781 Training:At training steps 28700, training MLE loss is 0.001070986628361712, train CRF loss is 0.000693329298942742 Training:At training steps 28800, training MLE loss is 0.001206288527745656, train CRF loss is 0.0008154735138696371 Training:At training steps 28900, training MLE loss is 
0.0011772760155807332, train CRF loss is 0.0008059490488670407 Training:At training steps 29000, training MLE loss is 0.001183729060305046, train CRF loss is 0.000792020517777396 Validation:At training steps 29000, training MLE loss is 0.001183729060305046, train CRF loss is 0.000792020517777396, validation MLE loss is 9.413018050946688, validation ppl is 12246.777, validation CRF loss is 9.338696580184134, validation BLEU is 61.38 Training:At training steps 29100, training MLE loss is 0.0009632638271033717, train CRF loss is 0.0005157232372483644 Training:At training steps 29200, training MLE loss is 0.0009516951984891758, train CRF loss is 0.00047316838173917076 Training:At training steps 29300, training MLE loss is 0.0010882861904523327, train CRF loss is 0.000569401514582597 Training:At training steps 29400, training MLE loss is 0.0010397495494490856, train CRF loss is 0.0005568942791559061 Training:At training steps 29500, training MLE loss is 0.0010803206638199394, train CRF loss is 0.0005930825790176533 Validation:At training steps 29500, training MLE loss is 0.0010803206638199394, train CRF loss is 0.0005930825790176533, validation MLE loss is 9.535820402597126, validation ppl is 13846.952, validation CRF loss is 9.457702605347885, validation BLEU is 60.93 Training:At training steps 29600, training MLE loss is 0.0014837551363592711, train CRF loss is 0.0010817649093758108 Training:At training steps 29700, training MLE loss is 0.0013780047825735816, train CRF loss is 0.0008551681821887324 Training:At training steps 29800, training MLE loss is 0.0013078833550590194, train CRF loss is 0.0008332984453542054 Training:At training steps 29900, training MLE loss is 0.0012355600724511776, train CRF loss is 0.0007488150734724853 Training:At training steps 30000, training MLE loss is 0.0011405022574334138, train CRF loss is 0.000727129353450219 Validation:At training steps 30000, training MLE loss is 0.0011405022574334138, train CRF loss is 0.000727129353450219, 
validation MLE loss is 9.638162023142764, validation ppl is 15339.125, validation CRF loss is 9.572040677070618, validation BLEU is 61.08 Training:At training steps 30100, training MLE loss is 0.0010453180960561961, train CRF loss is 0.0006713166089435507 Training:At training steps 30200, training MLE loss is 0.0010932583183681243, train CRF loss is 0.0007488170705855679 Training:At training steps 30300, training MLE loss is 0.0009876335653621704, train CRF loss is 0.0006063619781037962 Training:At training steps 30400, training MLE loss is 0.0010151631255071332, train CRF loss is 0.0005939314444131616 Training:At training steps 30500, training MLE loss is 0.0009690209511304897, train CRF loss is 0.0005670660688312275 Validation:At training steps 30500, training MLE loss is 0.0009690209511304897, train CRF loss is 0.0005670660688312275, validation MLE loss is 9.64294675776833, validation ppl is 15412.694, validation CRF loss is 9.594075331562443, validation BLEU is 61.78 Training:At training steps 30600, training MLE loss is 0.0010422022035198536, train CRF loss is 0.0007663870685691121 Training:At training steps 30700, training MLE loss is 0.0009263238636777597, train CRF loss is 0.0006129482997008239 Training:At training steps 30800, training MLE loss is 0.0008159712588062943, train CRF loss is 0.000510191772277994 Training:At training steps 30900, training MLE loss is 0.0008200029285304683, train CRF loss is 0.0005054999277120775 Training:At training steps 31000, training MLE loss is 0.0007810352526148085, train CRF loss is 0.0004441984309751028 Validation:At training steps 31000, training MLE loss is 0.0007810352526148085, train CRF loss is 0.0004441984309751028, validation MLE loss is 9.60170328617096, validation ppl is 14789.952, validation CRF loss is 9.598769893771724, validation BLEU is 61.27 Training:At training steps 31100, training MLE loss is 0.0010024526842388436, train CRF loss is 0.00034850632782057467 Training:At training steps 31200, training MLE 
loss is 0.0008863368745682977, train CRF loss is 0.00042416467435338887 Training:At training steps 31300, training MLE loss is 0.0007970814968997844, train CRF loss is 0.00044863110597271217 Training:At training steps 31400, training MLE loss is 0.0008369685496476101, train CRF loss is 0.00044121720447424015 Training:At training steps 31500, training MLE loss is 0.0008130812914816192, train CRF loss is 0.0004989881868291874 Validation:At training steps 31500, training MLE loss is 0.0008130812914816192, train CRF loss is 0.0004989881868291874, validation MLE loss is 9.559792800953513, validation ppl is 14182.907, validation CRF loss is 9.558710637845492, validation BLEU is 60.11 Training:At training steps 31600, training MLE loss is 0.0007735083233046844, train CRF loss is 0.0006124676830949927 Training:At training steps 31700, training MLE loss is 0.0007086367730287556, train CRF loss is 0.0005227509958066512 Training:At training steps 31800, training MLE loss is 0.0005549361868524583, train CRF loss is 0.00045517499430632806 Training:At training steps 31900, training MLE loss is 0.0004953277528529275, train CRF loss is 0.00040074402805074795 Training:At training steps 32000, training MLE loss is 0.0004762014007636294, train CRF loss is 0.00038902674164893367 Validation:At training steps 32000, training MLE loss is 0.0004762014007636294, train CRF loss is 0.00038902674164893367, validation MLE loss is 9.697299166729575, validation ppl is 16273.596, validation CRF loss is 9.630972034052798, validation BLEU is 60.71 Training:At training steps 32100, training MLE loss is 0.0007169062373930257, train CRF loss is 0.00034359939823747077 Training:At training steps 32200, training MLE loss is 0.0006314806888474024, train CRF loss is 0.0003206885018435579 Training:At training steps 32300, training MLE loss is 0.0006245398279402938, train CRF loss is 0.00030445022689677676 Training:At training steps 32400, training MLE loss is 0.0006447768024629768, train CRF loss is 
0.00032632104812216014 Training:At training steps 32500, training MLE loss is 0.000599495260156036, train CRF loss is 0.0003149160504347197 Validation:At training steps 32500, training MLE loss is 0.000599495260156036, train CRF loss is 0.0003149160504347197, validation MLE loss is 9.76208435861688, validation ppl is 17362.784, validation CRF loss is 9.620432721941095, validation BLEU is 60.37 Training:At training steps 32600, training MLE loss is 0.0005047448865399674, train CRF loss is 0.0003014114464217732 Training:At training steps 32700, training MLE loss is 0.000504623145778383, train CRF loss is 0.0003451547853642989 Training:At training steps 32800, training MLE loss is 0.0004996167254043715, train CRF loss is 0.00031857095268276553 Training:At training steps 32900, training MLE loss is 0.0005115694277385758, train CRF loss is 0.000340487821089277 Training:At training steps 33000, training MLE loss is 0.0004711385574026314, train CRF loss is 0.0003265254509905944 Validation:At training steps 33000, training MLE loss is 0.0004711385574026314, train CRF loss is 0.0003265254509905944, validation MLE loss is 9.802544869874653, validation ppl is 18079.697, validation CRF loss is 9.754774984560514, validation BLEU is 60.96 Training:At training steps 33100, training MLE loss is 0.0003206946495297477, train CRF loss is 0.00026200362993352046 Training:At training steps 33200, training MLE loss is 0.0003680654768654538, train CRF loss is 0.00027268886081520226 Training:At training steps 33300, training MLE loss is 0.0003209060027795585, train CRF loss is 0.0002244330437674617 Training:At training steps 33400, training MLE loss is 0.0003419376430238053, train CRF loss is 0.0002093135209431729 Training:At training steps 33500, training MLE loss is 0.00034404167307130264, train CRF loss is 0.0002231658613797123 Validation:At training steps 33500, training MLE loss is 0.00034404167307130264, train CRF loss is 0.0002231658613797123, validation MLE loss is 
9.747199133822793, validation ppl is 17106.249, validation CRF loss is 9.707024530360574, validation BLEU is 60.61 Training:At training steps 33600, training MLE loss is 0.0005807600548155445, train CRF loss is 0.0006224867083561137 Training:At training steps 33700, training MLE loss is 0.000447318074102368, train CRF loss is 0.00041478204415379416 Training:At training steps 33800, training MLE loss is 0.00045423014069865126, train CRF loss is 0.0003354489022644458 Training:At training steps 33900, training MLE loss is 0.000484105415071455, train CRF loss is 0.00036244906341274086 Training:At training steps 34000, training MLE loss is 0.00044334265394553824, train CRF loss is 0.0003227237697323657 Validation:At training steps 34000, training MLE loss is 0.00044334265394553824, train CRF loss is 0.0003227237697323657, validation MLE loss is 9.79210923219982, validation ppl is 17892.005, validation CRF loss is 9.702482348994204, validation BLEU is 61.13 Training:At training steps 34100, training MLE loss is 0.00031183010329565266, train CRF loss is 0.00014565413327529697 Training:At training steps 34200, training MLE loss is 0.0002504202316216523, train CRF loss is 9.327365677042065e-05 Training:At training steps 34300, training MLE loss is 0.0002934207678663095, train CRF loss is 9.943798958122289e-05 Training:At training steps 34400, training MLE loss is 0.0003500906563731089, train CRF loss is 0.00016833506415514954 Training:At training steps 34500, training MLE loss is 0.00034435750003175645, train CRF loss is 0.00018208915727655216 Validation:At training steps 34500, training MLE loss is 0.00034435750003175645, train CRF loss is 0.00018208915727655216, validation MLE loss is 9.89977174683621, validation ppl is 19925.822, validation CRF loss is 9.794579091824984, validation BLEU is 60.76 Training:At training steps 34600, training MLE loss is 0.0002636633473037567, train CRF loss is 0.00024900119717506273 Training:At training steps 34700, training MLE loss is 
0.00025391262717864313, train CRF loss is 0.000182000255088679 Training:At training steps 34800, training MLE loss is 0.0002923463664419103, train CRF loss is 0.0002452532546318779 Training:At training steps 34900, training MLE loss is 0.0003487283387240494, train CRF loss is 0.0002252171974443895 Training:At training steps 35000, training MLE loss is 0.00034164932457156035, train CRF loss is 0.00024263661139618798 Validation:At training steps 35000, training MLE loss is 0.00034164932457156035, train CRF loss is 0.00024263661139618798, validation MLE loss is 9.833255987418326, validation ppl is 18643.559, validation CRF loss is 9.784394364607962, validation BLEU is 61.37 Training:At training steps 35100, training MLE loss is 0.0004638118056986688, train CRF loss is 0.0004239800738770949 Training:At training steps 35200, training MLE loss is 0.0003690349797205517, train CRF loss is 0.0003490947002754363 Training:At training steps 35300, training MLE loss is 0.00043481906550299157, train CRF loss is 0.0003841657892341906 Training:At training steps 35400, training MLE loss is 0.0003786025942658158, train CRF loss is 0.000305971493970757 Training:At training steps 35500, training MLE loss is 0.000337441289732618, train CRF loss is 0.0002705695741596408 Validation:At training steps 35500, training MLE loss is 0.000337441289732618, train CRF loss is 0.0002705695741596408, validation MLE loss is 9.810878960709823, validation ppl is 18231.004, validation CRF loss is 9.708735466003418, validation BLEU is 60.72 Training:At training steps 35600, training MLE loss is 0.00018434581248285884, train CRF loss is 0.00014923189772861 Training:At training steps 35700, training MLE loss is 0.00024261546163567146, train CRF loss is 0.0001688014976862462 Training:At training steps 35800, training MLE loss is 0.00030610100666153117, train CRF loss is 0.00019248613491228638 Training:At training steps 35900, training MLE loss is 0.00034967513868792364, train CRF loss is 
0.00030408344654581754 Training:At training steps 36000, training MLE loss is 0.0003342243400093648, train CRF loss is 0.00027033660768220445 Validation:At training steps 36000, training MLE loss is 0.0003342243400093648, train CRF loss is 0.00027033660768220445, validation MLE loss is 9.814719909115842, validation ppl is 18301.163, validation CRF loss is 9.780684558968796, validation BLEU is 61.61 Training:At training steps 36100, training MLE loss is 0.00033675461568847057, train CRF loss is 0.00022124934467201206 Training:At training steps 36200, training MLE loss is 0.0003342222363536299, train CRF loss is 0.0001902238944702761 Training:At training steps 36300, training MLE loss is 0.0002686133421439212, train CRF loss is 0.00017215398281522187 Training:At training steps 36400, training MLE loss is 0.00024442346740250663, train CRF loss is 0.000172232037970369 Training:At training steps 36500, training MLE loss is 0.0002618969559905559, train CRF loss is 0.00017067206627161902 Validation:At training steps 36500, training MLE loss is 0.0002618969559905559, train CRF loss is 0.00017067206627161902, validation MLE loss is 9.763341847218966, validation ppl is 17384.631, validation CRF loss is 9.730069925910549, validation BLEU is 61.9 Training:At training steps 36600, training MLE loss is 0.00040980200796927933, train CRF loss is 0.00030722009392947225 Training:At training steps 36700, training MLE loss is 0.00033810234711966787, train CRF loss is 0.0002790105632355688 Training:At training steps 36800, training MLE loss is 0.00028368587709805634, train CRF loss is 0.0002273717802820648 Training:At training steps 36900, training MLE loss is 0.00025570755701597976, train CRF loss is 0.0001954674042260729 Training:At training steps 37000, training MLE loss is 0.00025138348924766494, train CRF loss is 0.00018296825597635813 Validation:At training steps 37000, training MLE loss is 0.00025138348924766494, train CRF loss is 0.00018296825597635813, validation MLE loss is 
9.861475731197157, validation ppl is 19177.169, validation CRF loss is 9.794927590771726, validation BLEU is 60.98 Training:At training steps 37100, training MLE loss is 0.0003662445870105002, train CRF loss is 0.00023677196777711185 Training:At training steps 37200, training MLE loss is 0.0003242914462790827, train CRF loss is 0.00019435707976162898 Training:At training steps 37300, training MLE loss is 0.00026858285360910155, train CRF loss is 0.00018256774709154064 Training:At training steps 37400, training MLE loss is 0.0002487923845941029, train CRF loss is 0.0001823131314608306 Training:At training steps 37500, training MLE loss is 0.0002187238964552609, train CRF loss is 0.00015900399560955504 Validation:At training steps 37500, training MLE loss is 0.0002187238964552609, train CRF loss is 0.00015900399560955504, validation MLE loss is 9.826099514961243, validation ppl is 18510.613, validation CRF loss is 9.722260450061999, validation BLEU is 59.71 Training:At training steps 37600, training MLE loss is 9.415500999028208e-05, train CRF loss is 3.738037257478233e-05 Training:At training steps 37700, training MLE loss is 8.452256476053279e-05, train CRF loss is 2.9755953776970935e-05 Training:At training steps 37800, training MLE loss is 9.496620114859145e-05, train CRF loss is 3.735455193809134e-05 Training:At training steps 37900, training MLE loss is 8.450326373417912e-05, train CRF loss is 3.008256786196206e-05 Training:At training steps 38000, training MLE loss is 7.198398708339278e-05, train CRF loss is 2.8265512250309043e-05 Validation:At training steps 38000, training MLE loss is 7.198398708339278e-05, train CRF loss is 2.8265512250309043e-05, validation MLE loss is 9.803883816066541, validation ppl is 18103.921, validation CRF loss is 9.747668109442058, validation BLEU is 60.74 Training:At training steps 38100, training MLE loss is 0.00018803852216444012, train CRF loss is 0.00018965335954881368 Training:At training steps 38200, training MLE loss is 
0.00019148612144174494, train CRF loss is 0.00016804112534827054 Training:At training steps 38300, training MLE loss is 0.00016899031332653992, train CRF loss is 0.0001370057471738558 Training:At training steps 38400, training MLE loss is 0.00016323141850236144, train CRF loss is 0.00010741043791386629 Training:At training steps 38500, training MLE loss is 0.0001382098434989501, train CRF loss is 8.857802377003843e-05 Validation:At training steps 38500, training MLE loss is 0.0001382098434989501, train CRF loss is 8.857802377003843e-05, validation MLE loss is 9.8071304245999, validation ppl is 18162.793, validation CRF loss is 9.740335232333132, validation BLEU is 60.75 Training:At training steps 38600, training MLE loss is 0.0001097190903363933, train CRF loss is 8.600163392089578e-06 Training:At training steps 38700, training MLE loss is 0.000161046015741464, train CRF loss is 4.161725205847899e-05 Training:At training steps 38800, training MLE loss is 0.0001304383556190819, train CRF loss is 2.848416190206038e-05 Training:At training steps 38900, training MLE loss is 0.00011836733595619399, train CRF loss is 2.952491579929384e-05 Training:At training steps 39000, training MLE loss is 0.00010724069502994215, train CRF loss is 3.1202775502459975e-05 Validation:At training steps 39000, training MLE loss is 0.00010724069502994215, train CRF loss is 3.1202775502459975e-05, validation MLE loss is 9.848012447357178, validation ppl is 18920.711, validation CRF loss is 9.797776793178759, validation BLEU is 60.92 Training:At training steps 39100, training MLE loss is 0.0001985062393567255, train CRF loss is 8.773445740856812e-05 Training:At training steps 39200, training MLE loss is 0.00018349392219061287, train CRF loss is 7.909885964910934e-05 Training:At training steps 39300, training MLE loss is 0.00014781230619068138, train CRF loss is 8.46070118458971e-05 Training:At training steps 39400, training MLE loss is 0.0001553739638361043, train CRF loss is 
7.595514120468771e-05 Training:At training steps 39500, training MLE loss is 0.00017375867247594994, train CRF loss is 7.804126011314772e-05 Validation:At training steps 39500, training MLE loss is 0.00017375867247594994, train CRF loss is 7.804126011314772e-05, validation MLE loss is 9.840300271385594, validation ppl is 18775.353, validation CRF loss is 9.823768609448484, validation BLEU is 61.02 Training:At training steps 39600, training MLE loss is 6.894660001579219e-05, train CRF loss is 6.698218894026554e-05 Training:At training steps 39700, training MLE loss is 4.122589226051788e-05, train CRF loss is 3.760001469586216e-05 Training:At training steps 39800, training MLE loss is 0.00012044916037083075, train CRF loss is 5.708582668905245e-05 Training:At training steps 39900, training MLE loss is 0.00011010359424189809, train CRF loss is 4.7396103000303394e-05 Training:At training steps 40000, training MLE loss is 0.00010572806712051858, train CRF loss is 4.0423374469828134e-05 Validation:At training steps 40000, training MLE loss is 0.00010572806712051858, train CRF loss is 4.0423374469828134e-05, validation MLE loss is 9.885999547807794, validation ppl is 19653.28, validation CRF loss is 9.837015082961635, validation BLEU is 61.1
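
A minimal sketch for reading the validation records out of a log in the format above, assuming the field wording shown here stays fixed. The names `parse_validation`, `best_by_bleu`, and `ppl_matches_exp` are hypothetical helpers, not part of the training code that produced this log. Records can be split across wrapped lines, so whitespace is collapsed before matching. In the records above, the logged validation ppl appears to equal exp(validation MLE loss); the last helper checks that assumption rather than relying on it.

```python
import math
import re

# Matches plain decimals and scientific notation such as 9.327365677042065e-05.
NUM = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

# Field order and wording copied from the "Validation:" records in this log.
VALIDATION_RE = re.compile(
    r"Validation:At training steps (?P<step>\d+), "
    rf"training MLE loss is (?P<train_mle>{NUM}), "
    rf"train CRF loss is (?P<train_crf>{NUM}), "
    rf"validation MLE loss is (?P<val_mle>{NUM}), "
    rf"validation ppl is (?P<val_ppl>{NUM}), "
    rf"validation CRF loss is (?P<val_crf>{NUM}), "
    rf"validation BLEU is (?P<bleu>{NUM})"
)


def parse_validation(text):
    """Return one dict per validation record, in log order."""
    # Collapse line wraps and repeated spaces so a record split mid-field still matches.
    flat = " ".join(text.split())
    records = []
    for match in VALIDATION_RE.finditer(flat):
        fields = match.groupdict()
        record = {name: float(value) for name, value in fields.items()}
        record["step"] = int(fields["step"])
        records.append(record)
    return records


def best_by_bleu(records):
    """Return the record with the highest validation BLEU."""
    return max(records, key=lambda r: r["bleu"])


def ppl_matches_exp(record, rel_tol=1e-3):
    """Check the assumed relation ppl == exp(validation MLE loss)."""
    return math.isclose(math.exp(record["val_mle"]), record["val_ppl"], rel_tol=rel_tol)
```

Calling `best_by_bleu(parse_validation(open(path).read()))` on the full log would pick a checkpoint by BLEU rather than by validation loss, which may matter here because validation MLE and CRF loss keep rising while BLEU stays in roughly the 60 to 62 range.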