An Empirical Study of Memorization in NLP
Abstract
A recent study by Feldman (2020) proposed a long-tail theory to explain the memorization behavior of deep learning models. However, memorization has not been empirically verified in the context of NLP, a gap addressed by this work. In this paper, we use three different NLP tasks to check whether the long-tail theory holds. Our experiments demonstrate that top-ranked memorized training instances are likely atypical, and that removing the top-memorized training instances leads to a significantly larger drop in test accuracy than removing training instances at random. Furthermore, we develop an attribution method to better understand why a training instance is memorized. We empirically show that our memorization attribution method is faithful, and share our interesting finding that the top-memorized parts of a training instance tend to be features negatively correlated with the class label.
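For context, the Feldman-style memorization score referenced above is typically estimated by comparing a model's accuracy on a training instance when that instance is included in versus excluded from the training set, averaged over random subsamples. The sketch below illustrates this estimator under assumed interfaces: `train_model(instances) -> model` and `model.predict(x) -> label` are hypothetical placeholders for a task-specific training and inference routine, not an API from the paper.

```python
import random


def memorization_score(train_set, idx, train_model,
                       n_trials=20, subsample_frac=0.7):
    """Estimate a Feldman (2020)-style memorization score for instance `idx`:

        mem(i) = P[f(x_i) = y_i | i in training subset]
               - P[f(x_i) = y_i | i not in training subset],

    where the probabilities are over random subsamples of the training set.
    `train_model` is an assumed interface, not part of the paper.
    """
    x_i, y_i = train_set[idx]
    others = [ex for j, ex in enumerate(train_set) if j != idx]
    hits_in, hits_out = 0, 0
    for _ in range(n_trials):
        subset = random.sample(others, int(subsample_frac * len(others)))
        # Train once with the instance included and once with it excluded,
        # then check whether each model predicts the instance's own label.
        model_in = train_model(subset + [(x_i, y_i)])
        model_out = train_model(subset)
        hits_in += int(model_in.predict(x_i) == y_i)
        hits_out += int(model_out.predict(x_i) == y_i)
    return hits_in / n_trials - hits_out / n_trials
```

A high score indicates the model only fits the instance's label when it has seen that instance, which is the signal used to rank "top-memorized" training instances in the removal experiments described above.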