# LinkReview

- Here we collect information about all the works that may be useful for writing our paper
- Each contributor is responsible for their own part of the work, as specified in the tables

## [Topic] Cross-lingual Knowledge Transfer

| Title | Contributor | Year | Authors | Paper | Code | Summary |
| --- | --- | --- | --- | --- | --- | --- |
| How do Large Language Models Handle Multilingualism? | @Nikita_Okhotnikov | 2024 | Yiran Zhao, Wenxuan Zhang, Guizhen Chen, Kenji Kawaguchi, Lidong Bing | arxiv preprint | - | The authors identify "language-specific neurons" that dramatically affect performance in a single language and fine-tune them on a small training corpus, gaining a noticeable performance uplift. |
| Do Llamas Work in English? On the Latent Language of Multilingual Transformers | @Anastasia Voznyuk | 2024 | Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West | arxiv preprint | GitHub | The authors claim that the models operate in English internally and transition to the target language only after roughly the 15th layer; entropy, high at first, decreases towards the final layers. |
| Emerging Cross-lingual Structure in Pretrained Language Models | @Anastasia Voznyuk | 2020 | Alexis Conneau, Shijie Wu, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov | ACL | - | This study examines multilingual masked language modeling and explores the factors behind its effectiveness for cross-lingual transfer. It finds that transfer is possible even without a shared vocabulary or similar text domains, as long as top-layer parameters are shared. Additionally, monolingual BERT representations across languages can be aligned post hoc, suggesting universal symmetries in embedding spaces, which are discovered and aligned during joint training in multilingual models. |
| Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models | @Andrei Semenov | 2024 | Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Wayne Xin Zhao, Furu Wei, Ji-Rong Wen | arxiv | GitHub | There exist language-specific neurons responsible for generating output in a particular language. Consequently, we can affect the quality of multilingual output by activating and deactivating these neurons (see the sketch below the table). |
| Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs | @Andrei Semenov | 2024 | Weixuan Wang, Barry Haddow, Wei Peng, Alexandra Birch | arxiv | GitHub | A study of how neuron activation is shared across languages, categorizing neurons into four types: all-shared, partial-shared, specific, and non-activated. Task type affects linguistic sharing patterns, neuron behavior varies across inputs, and all-shared neurons are crucial for correct responses. Increasing the number of all-shared neurons improves accuracy on multilingual tasks. |
| Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models | @Andrei Semenov | 2024 | Xinyu Zhou, Delong Chen, Samuel Cahyawijaya, Xufeng Duan, Zhenguang G. Cai | arxiv | GitHub | By measuring activation differences across minimal pairs, this study quantifies linguistic similarity in LLMs. Experiments with 100+ LLMs and 150k minimal pairs in three languages reveal that: 1) training data influences linguistic similarity, with higher agreement in high-resource languages; 2) similarity aligns with fine-grained linguistic categories but not broader ones; 3) it is weakly correlated with semantic similarity, showing context dependency; and 4) LLMs show limited cross-lingual alignment in understanding linguistic phenomena. |
| Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners | @Andrei Semenov | 2024 | Shimao Zhang, Changjiang Gao, Wenhao Zhu, Jiajun Chen, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang | arxiv | GitHub | Most LLMs show unbalanced performance across languages, but translation-based multilingual alignment is effective. This study explores the spontaneous improvement in multilingual alignment when LLMs are instruction-tuned on question translation data (without annotated answers). This boosts alignment between English and many languages, even those not seen during tuning. |
| Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs | @Nikita_Okhotnikov | 2024 | Maxim Ifergan, Leshem Choshen, Roee Aharoni, Idan Szpektor, Omri Abend | arxiv preprint | - | LLMs' factual knowledge is inconsistent across languages. A methodology for measuring knowledge-representation sharing across languages is proposed. Script similarity is the dominant factor in representation sharing, and multilingual sharing has the potential to raise performance towards that of the best-performing language. |
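
Several of the rows above (Zhao et al.; Tang et al.) rest on the same mechanical idea: locate MLP units that fire almost exclusively for one language, then amplify or silence them. The snippet below is a minimal, hypothetical PyTorch sketch of that idea rather than the selection procedure of any specific paper: it uses a small GPT-2 checkpoint as a stand-in, an arbitrary layer, tiny toy corpora, and ad-hoc activation-rate thresholds, whereas the papers use large corpora and more careful activation statistics.

```python
# Sketch: find neurons in one MLP layer that fire mostly on one language and
# silence them at inference time. Model name, layer index, thresholds, and the
# two-sentence "corpora" are illustrative assumptions, not a paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in; the papers above study larger multilingual LLMs
LAYER = 6       # arbitrary layer to inspect

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def activation_rate(texts, layer):
    """Fraction of tokens on which each MLP hidden unit is positive."""
    acts = []
    def hook(_module, _inputs, out):
        acts.append((out > 0).float().mean(dim=(0, 1)))  # mean over batch and tokens
    handle = model.transformer.h[layer].mlp.act.register_forward_hook(hook)
    with torch.no_grad():
        for t in texts:
            model(**tok(t, return_tensors="pt"))
    handle.remove()
    return torch.stack(acts).mean(dim=0)

en = ["The cat sat on the mat.", "Knowledge transfer across languages."]
de = ["Die Katze saß auf der Matte.", "Wissenstransfer zwischen Sprachen."]

rate_en, rate_de = activation_rate(en, LAYER), activation_rate(de, LAYER)
# "German-specific" neurons: often active on German, almost silent on English.
de_specific = ((rate_de > 0.2) & (rate_en < 0.01)).nonzero().squeeze(-1)

# Deactivate them: zero their activations during any subsequent forward pass.
def silence(_module, _inputs, out):
    out[..., de_specific] = 0.0
    return out

model.transformer.h[LAYER].mlp.act.register_forward_hook(silence)
```

Re-running generation with the silencing hook attached is the kind of intervention these papers use as causal evidence: degraded output in the targeted language with little effect on the others.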

## [Subtopic]: Multilingual Representations

| Title | Contributor | Year | Authors | Paper | Code | Summary |
| --- | --- | --- | --- | --- | --- | --- |
| On the Cross-lingual Transferability of Monolingual Representations | @Alexander Terentyev | 2020 | Mikel Artetxe, Sebastian Ruder, Dani Yogatama | arxiv | - | A masked language model trained on one language is transferred to another by learning a new embedding matrix while freezing all other layers (see the sketch below the table). This approach, without a shared vocabulary or joint training, rivals multilingual BERT on cross-lingual tasks, showing that monolingual models can generalize across languages. A new cross-lingual benchmark is also introduced. |
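
The recipe in the summary above is concrete enough to sketch in a few lines: freeze the transformer body of a monolingual masked LM and train only a freshly initialized embedding matrix on target-language text. The snippet below is a minimal, hypothetical Hugging Face sketch, not the paper's exact setup: the checkpoints, learning rate, and single toy batch are assumptions, and real training would apply MLM masking and iterate over a large corpus.

```python
# Sketch of embedding-only cross-lingual transfer: freeze the body of a
# monolingual masked LM and train only a new embedding matrix sized to the
# target-language vocabulary. Checkpoints and hyperparameters are assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

src_model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")        # source-language MLM
tgt_tok   = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")  # target-language vocab

# 1. Swap in a randomly re-initialized embedding matrix sized to the target vocab.
src_model.resize_token_embeddings(len(tgt_tok))
emb = src_model.get_input_embeddings()
emb.weight.data.normal_(mean=0.0, std=0.02)

# 2. Freeze everything except the (tied) input/output embeddings.
for p in src_model.parameters():
    p.requires_grad = False
emb.weight.requires_grad = True

optimizer = torch.optim.AdamW([emb.weight], lr=5e-4)

# 3. Train with the MLM objective on target-language text (single toy step here;
#    real training would mask tokens and loop over a large corpus).
batch = tgt_tok(["Die Katze sitzt auf der Matte."], return_tensors="pt")
labels = batch["input_ids"].clone()
outputs = src_model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
```

Because only the embedding matrix receives gradients, whatever the frozen body learned about the source language is what gets reused, which is exactly the question the paper probes.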

## [Subtopic]: How do distinct neurons work in LLMs?

| Title | Contributor | Year | Authors | Paper | Code | Summary |
| --- | --- | --- | --- | --- | --- | --- |
| Neurons in Large Language Models: Dead, N-gram, Positional | @Andrei Semenov | 2024 | Elena Voita, Javier Ferrando, Christoforos Nalmpantis | arxiv | Blog | Many neurons are "dead", i.e. they never activate on a large collection of diverse data. At the same time, many of the alive neurons are reserved for discrete features and act as token and n-gram detectors (see the sketch below the table). |
| Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons | @Andrei Semenov | 2024 | Yongqi Leng, Deyi Xiong | arxiv | - | Experiments identify task-specific neurons linked to specific tasks, offering insights into generalization and catastrophic forgetting in multi-task learning. Overlapping neurons across tasks, especially in certain LLM layers, strongly correlate with generalization performance. |
| Evaluating Neuron Interpretation Methods of NLP Models | @Andrei Semenov | 2023 | Yimin Fan, Fahim Dalvi, Nadir Durrani, Hassan Sajjad | arxiv | GitHub | Neurons that are commonly discovered by different interpretation methods are more informative than others; two novel methods for detecting informative neurons are presented. |
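
The "dead neuron" diagnostic from the first row above reduces to a simple corpus-level statistic: a neuron is dead if it never activates on a large, diverse text collection. Below is a minimal sketch under toy assumptions (GPT-2 as a stand-in, a single layer, two sentences instead of a large corpus). Note that GPT-2's FFN uses GELU, so counting "activation > 0" as firing is only a proxy for the ReLU-based models analysed in the paper.

```python
# Sketch of the "dead neuron" count: a neuron in an FFN layer is counted as dead
# if its activation is never positive on the sampled text. Model, layer, and the
# tiny corpus are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL, LAYER = "gpt2", 0
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

n_neurons = model.config.n_embd * 4          # GPT-2 FFN width
fired = torch.zeros(n_neurons, dtype=torch.bool)

def hook(_module, _inputs, out):
    # out: (batch, seq, n_neurons); mark neurons that fire on any token
    fired.logical_or_((out > 0).reshape(-1, out.size(-1)).any(dim=0))

handle = model.transformer.h[LAYER].mlp.act.register_forward_hook(hook)
with torch.no_grad():
    for text in ["Large language models store knowledge.", "Neurons can be dead or alive."]:
        model(**tok(text, return_tensors="pt"))
handle.remove()

print(f"dead neurons in layer {LAYER}: {(~fired).sum().item()} / {n_neurons}")
```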

## [Surveys]

| Title | Contributor | Year | Authors | Paper |
| --- | --- | --- | --- | --- |
| Knowledge Mechanisms in Large Language Models: A Survey and Perspective | @Andrei Semenov | 2024 | Mengru Wang, Yunzhi Yao, Ziwen Xu, Shuofei Qiao, Shumin Deng, Peng Wang, Xiang Chen, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang | arxiv |
| Towards a Common Understanding of Contributing Factors for Cross-Lingual Transfer in Multilingual Language Models: A Review | @Andrei Semenov | 2023 | Fred Philippy, Siwen Guo, Shohreh Haddadan | arxiv |