Literature Translation Practice Before Starting Graduate School

news 2025/7/6 11:02:10

Literature Translation

Artificial Intelligence
- 1.《Meta-Learning with Memory-Augmented Neural Networks》
  - one-shot learning
  - Neural Turing Machines (NTMs)
- 2.《Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks》
  - Meta-learning
  - gradient steps
  - finetune
- 3.《Attention Is All You Need》
  - attention mechanism
  - encoder and a decoder
  - WMT
  - BLEU
- 《ImageNet Classification with Deep Convolutional Neural Networks》
- 《Automatic Chain of Thought Prompting in Large Language Models》

DeepSeek
- 1.《DeepSeek LLM: Scaling Open-Source Language Models with Longtermism》
- 2.《DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models》
- 3.《DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model》
- 4.《DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence》
- 5.《DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models》
- 6.《DeepSeek-V3 Technical Report》
- 7.《DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning》
- 8.《Native Sparse Attention: Enabling Efficient Long-Context Modeling for Large-Scale Language Models》

Artificial Intelligence

1.《Meta-Learning with Memory-Augmented Neural Networks》

https://proceedings.mlr.press/v48/santoro16.pdf
Literature review:
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of “one-shot learning.” Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.

Abstract
Despite recent breakthroughs（突破） in the applications of deep neural networks（深度神经网络）, one setting（一种情形） that presents（提出） a persistent（持久的；坚持不懈的；持续不断的） challenge is that of “one-shot learning”（一次性学习）. Traditional gradient-based（基于梯度的） networks require a lot of data to learn, often through extensive（广泛的） iterative（迭代的） training. When new data is encountered（遇到）, the models must inefficiently relearn their parameters to adequately incorporate（整合） the new information without（在不……的情况下） catastrophic（灾难性的） interference（干扰）. Architectures（架构） with augmented（增强的） memory capacities（能力）, such as Neural Turing Machines (NTMs)（神经图灵机）, offer the ability to quickly encode and retrieve（检索） new information, and hence（因此） can potentially（潜在地） obviate（消除；排除；使不必要） the downsides（缺点） of conventional（传统的） models. Here, we demonstrate（证明） the ability of a memory-augmented neural network to rapidly（迅速地） assimilate（吸收；同化） new data, and leverage（利用） this data to make accurate（准确的） predictions after only a few samples. We also introduce a new method for accessing（访问） an external memory that focuses on memory content, unlike previous（以前的；先前的） methods that additionally（此外） use memory location-based focusing mechanisms（机制；机理）.

Translation (Chinese):
尽管深度神经网络在应用方面近期取得了诸多突破，但 “一次性学习” 这一情形始终是一项极具挑战性的任务。传统的基于梯度的网络需要大量数据进行学习，且往往要经过大量的迭代训练。当遇到新数据时，这些模型必须低效地重新学习其参数，以在不产生灾难性干扰的情况下充分整合新信息。诸如神经图灵机（NTMs）这类具备增强记忆能力的架构，能够快速编码和检索新信息，因此有可能克服传统模型的弊端。在此，我们展示了一种记忆增强神经网络快速吸收新数据的能力，并且利用这些数据在仅获取少量样本后就能做出准确预测。我们还引入了一种访问外部记忆的新方法，该方法专注于记忆内容，而不像之前的方法那样还使用基于记忆位置的聚焦机制。
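
The abstract's last sentence describes reading an external memory by content rather than by location. A minimal sketch of what such a read could look like follows. It assumes cosine similarity between a query key and each memory row, followed by a softmax, which is a common way to do content-based addressing; the abstract itself does not spell out these details. The names `content_read`, `memory`, and `key` are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def content_read(memory, key):
    """Content-based read: weight each memory row by its similarity to the key.

    Assumption: cosine similarity + softmax; no location-based shifting is used.
    Returns (read_vector, weights).
    """
    weights = softmax([cosine_similarity(key, row) for row in memory])
    width = len(memory[0])
    read_vector = [
        sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)
    ]
    return read_vector, weights

# Toy memory with three slots; the key is closest to the first slot.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
read_vector, read_weights = content_read(memory, key=[0.9, 0.1, 0.0])
```

Because the weights depend only on what each slot contains, the slot most similar to the key dominates the read regardless of where it sits in memory.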

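one-shot learning:

One-shot learning is a machine-learning paradigm that aims to let a model learn and recognize new objects or concepts from only one or a few examples, in contrast to conventional methods that need large amounts of training data. Details follow.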

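How it works: In the data-preparation stage, unlike conventional machine-learning classifiers, one-shot learning has only a limited number of examples per class or concept, usually a single example per class. Meaningful features are then extracted from the available data; these are the distinctive characteristics or patterns of each class, and they help the model focus on the key information when data is scarce. On the architecture side, neural networks are commonly used, in particular Siamese networks or triplet networks. A Siamese network consists of two identical subnetworks that share weights and architecture; it takes two input samples, extracts their feature vectors, and computes the distance or similarity between them to decide whether the inputs belong to the same class. A triplet network takes three input samples: an anchor from the target class, a positive from the same class, and a negative from a different class. The network learns to minimize the distance between anchor and positive and to maximize the distance between anchor and negative. During training, the model adjusts its parameters according to similarity or dissimilarity in feature space so that it can tell the classes apart. After training, at inference time, when a new sample is presented the model computes its feature vector, compares it with the known examples in the training dataset, and classifies the new sample by similarity.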
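
The pieces described above (a shared embedding, a pairwise distance, a triplet objective, and classification by similarity) can be sketched in a few lines. This is a toy illustration under stated assumptions, not a trained model: the embedding is a fixed linear map standing in for the shared subnetwork, the distance is Euclidean, and the triplet loss uses a hinge with a margin of 1.0. All names (`embed`, `SHARED_WEIGHTS`, `support_set`, and so on) are hypothetical.

```python
import math

def embed(x, weights):
    """Toy shared subnetwork: one linear map applied to every input alike."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def siamese_distance(x1, x2, weights):
    """Siamese comparison: embed both inputs with the SAME weights, then measure distance."""
    return euclidean(embed(x1, weights), embed(x2, weights))

def triplet_loss(anchor, positive, negative, weights, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    d_pos = siamese_distance(anchor, positive, weights)
    d_neg = siamese_distance(anchor, negative, weights)
    return max(0.0, d_pos - d_neg + margin)

def classify_one_shot(query, support, weights):
    """Assign the label whose single support example is closest to the query."""
    return min(support, key=lambda label: siamese_distance(query, support[label], weights))

# Assumption: identity weights stand in for a trained embedding.
SHARED_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]
support_set = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}  # one example per class

prediction = classify_one_shot([0.9, 0.2], support_set, SHARED_WEIGHTS)
loss = triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0], SHARED_WEIGHTS)
```

In a real system the weights would be learned by minimizing the triplet loss (or a pairwise contrastive loss) over many sampled triplets; here they are fixed so the example stays self-contained.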

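Key characteristics: The data requirement is minimal, since a one-shot model is meant to make accurate predictions from only one or a few examples. Generalization is strong: these models are designed to generalize from limited data and to recognize new instances from the features they have learned. Learning is also efficient, often relying on techniques such as metric learning to learn a similarity function, or on transfer learning to build on pre-learned representations. Finally, the approach is inspired by human cognition and mimics the human ability to learn new concepts quickly from limited exposure.

Applications: In medical imaging, it can help diagnose rare diseases from a small number of images and support personalized treatment planning from sparse patient data. In face recognition, it can identify an individual from a single image, which strengthens security systems and improves biometric authentication. For handwriting and character recognition, it can accurately recognize handwritten characters or rare fonts from limited examples, which helps with document digitization and with language processing for low-resource languages. In robotics, it lets a robot recognize and manipulate new objects with minimal training, improving adaptability and efficiency.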

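Differences from other learning paradigms: Zero-shot learning relies on prior knowledge and reasoning so that a model can perform a task without any training data specific to that task or domain, whereas one-shot learning has at least one example per class in the training set. Few-shot learning generalizes one-shot learning: the term applies when each class has more than one example but the number is still relatively small.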
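
The distinction comes down to how many labeled examples per class the model sees. The sketch below shows one way to sample an "N-way K-shot" episode and to name the regime from K. It is illustrative only; the function names, the toy dataset, and the episode format are assumptions and do not come from any of the papers listed above.

```python
import random

def regime(k_shot):
    """Name the learning regime from the number of labeled examples per class."""
    if k_shot == 0:
        return "zero-shot"
    if k_shot == 1:
        return "one-shot"
    return "few-shot"

def sample_episode(dataset, n_way, k_shot, n_query=1, seed=None):
    """Sample an N-way K-shot episode.

    dataset maps label -> list of examples. Returns (support, query), where
    support holds k_shot examples per sampled class and query holds n_query
    further examples of the same classes for evaluation.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)
    support, query = {}, {}
    for label in classes:
        examples = rng.sample(dataset[label], k_shot + n_query)
        support[label] = examples[:k_shot]
        query[label] = examples[k_shot:]
    return support, query

# Hypothetical toy dataset: three classes with five examples each.
toy_dataset = {
    "a": ["a0", "a1", "a2", "a3", "a4"],
    "b": ["b0", "b1", "b2", "b3", "b4"],
    "c": ["c0", "c1", "c2", "c3", "c4"],
}

one_shot_support, one_shot_query = sample_episode(toy_dataset, n_way=2, k_shot=1, seed=0)
few_shot_support, _ = sample_episode(toy_dataset, n_way=3, k_shot=3, seed=0)
```

With `k_shot=0` the support set is empty, which matches the zero-shot case: the model has to rely on prior knowledge alone.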

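Neural Turing Machines (NTMs)

Neural Turing Machines (NTMs) are a class of artificial neural networks that combine the capabilities of neural networks with the algorithmic power of a Turing machine. Details follow.
Definition and origin: Neural Turing Machines, by Alex Graves, Greg Wayne, and Ivo Danihelka of DeepMind ...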
