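If the sudden arrival of LLMs has left you feeling down, try reading Choose Your Weapon: Survival Strategies for Depressed AI Academics in the root directory. The content below is continuously updated; Star to keep updated~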
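- Open-source LLMs
- Instruction-tuning and RLHF data, plus training frameworks
- Prompt and LLM papers organized by subfield
- AIGC applications
- Prompt guides and tutorials
- ChatGPT and AGI commentary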
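- Decoding Prompt Series 1. Tuning-Free Prompt: GPT2 & GPT3 & LAMA & AutoPrompt
- Decoding Prompt Series 2. Fine-tuning the LM with a Frozen Prompt: T5 & PET & LM-BFF
- Decoding Prompt Series 3. Tuning the Prompt with a Frozen LM: Prefix-tuning & Prompt-tuning & P-tuning
- Decoding Prompt Series 4. Upgraded Instruction Tuning: Flan/T0/InstructGPT/TKInstruct
- Decoding Prompt Series 5. APE+SELF = a code implementation of automated instruction-set construction
- Decoding Prompt Series 6. LoRA instruction tuning in detail: stay calm, one hour really isn't enough~
- Decoding Prompt Series 7. Preference alignment with RLHF: a comparative analysis of OpenAI, DeepMind, and Anthropic
- ChatGPT Application 1. MakeInstruction: building instruction samples with zero human labor
- ChatGPT Application 2. A simple reproduction of ChatPDF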
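- Comparative evaluation of highly starred Chinese "Alpaca-like" projects
- List of commercially usable LLMs
- CMU's open-source chatbot evaluation app: ChatGPT>Vicuna>others; training on dialogue scenarios may matter a lot
- Berkeley's LLM ranking leaderboard, which includes a quasi-Chinese leaderboard: GPT4 naturally holds first place, GPT4>Claude>GPT3.5>Vicuna>others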
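Model link | Description |
---|---|
Google Bard | Google's Bard is late but here; you can now join the waitlist |
Claude | Claude, ChatGPT's biggest competitor, is also open for applications; unlimited trial in Slack |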
LLaMA | Meta's open-source foundation LLM, ranging from 7B to 65B parameters |
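MPT | MosaicML's new open-source pretrained + instruction-tuned models; commercially usable, supporting ultra-long inputs of up to 84k tokens |
RedPajama | After open-sourcing its pretraining data, the RedPajama project released 3B and 7B pretrained + instruction-tuned models |
ChatLLaMA | LLaMA fine-tuned with RLHF |
Alpaca | Stanford's open-source model, fine-tuned from the 7B LLaMA on 52k examples |
Alpaca-lora | LLaMA fine-tuned with LoRA |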
Dromedary | IBM self-aligned model with the LLaMA base |
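Vicuna | Open-sourced by former Alpaca members and others; built on LLaMA-13B and instruction-tuned on ShareGPT data; proposed using GPT4 to evaluate model quality |
koala | Fine-tunes LLaMA on open-source instruction sets such as Alpaca and HC3 plus ChatGPT data such as ShareGPT; ranks fairly high on the leaderboard |
ColossalChat | HPC-AI Tech's open-source LLaMA + RLHF fine-tuning |
MiniGPT4 | Vicuna + BLIP2 for text-vision fusion |
StackLLama | LLaMA trained on StackExchange data with SFT + RL |
Cerebras | Cerebras open-sourced 7 models ranging from about 100M to 13B parameters, fully open from pretraining data to weights |
PaLM-E | Google's multimodal LLM, combining the 540B PaLM language model with a 22B ViT vision model to form the 562B PaLM-E; a new breakthrough in robotics applications |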
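Dolly-v2 | Commercially usable open-source instruction-tuned model; the 7B version is fine-tuned from Pythia-6.9B (the earlier Dolly v1 was fine-tuned from GPT-J-6B) |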
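OpenChatKit | Built by OpenAI researchers: fine-tuned GPT-NeoX-20B + a 6B moderation model for filtering |
MetaLM | Microsoft's open-source large-scale self-supervised pretrained model |
Amazon Titan | Amazon adds its own LLMs on AWS |
OPT-IML | Meta's replication of GPT3, up to 175B, though its quality falls short of GPT3 |
Bloom | From BigScience; the largest version has 176B parameters |
BloomZ | From BigScience; fine-tuned from Bloom |
Galactica | Similar to Bloom, but trained more specifically for the scientific research domain |
T0 | From BigScience; 3B to 11B models instruction-tuned from T5 |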
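Model link | Description |
---|---|
ChatGLM | Tsinghua's open-source bilingual (Chinese-English) dialogue language model, trained with code, instruction tuning, and RLHF. A 130B version, the same size as the GLM below, is still in development. Tried it and it exceeded expectations! |
Moss | Vindication for Fudan! All pretraining and instruction-tuning data and models are open-sourced. Commercially usable |
Wombat-7B | DAMO Academy's open-source language model aligned with RRHF, no reinforcement learning required; Alpaca base |
Chinese-LLaMA-Alpaca | Chinese instruction-tuned LLaMA from Harbin Institute of Technology (HIT) |
Luotuo | Chinese instruction-tuned LLaMA and ChatGLM |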
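ERNIE Bot (文心一言) | Got an invite code and tried it; although it is noticeably less personable, its quality is not bad at all. Go domestic models! That said, the commercial terms include quite a few one-sided clauses |
Tongyi Qianwen (通义千问) | Alibaba's LLM, open for applications |
Spark (星火) | iFLYTEK Spark; its math ability is truly impressive |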
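BiLLa | Three-stage training: LLaMA vocabulary expansion with further pretraining + SFT on a 1:1 mix of pretraining and task data + SFT on instruction samples |
Phoenix | CUHK's open-source Phoenix and Chimera LLMs; Bloom base, 40+ languages supported |
OpenBuddy | LLaMA-based multilingual dialogue fine-tuned model |
Guanaco | LLaMA 7B base, fine-tuned on the Alpaca 52K data plus 534K additional multilingual instruction examples |
ziya | IDEA Research Institute's continued pretraining + SFT + RM + PPO + HFTT + COHFT + RBRS on 7B/13B LLaMA |
Chinese Vicuna | LLaMA 7B base, trained on Belle + Guanaco data |
Linly | LLaMA 7B base, trained on 7 instruction-tuning datasets including belle, guanaco, pclue, firefly, CSL, and newscommentary |
Firefly | 2.6B Chinese model that improves Chinese writing and classical Chinese ability; the full training code is yet to be open-sourced, only the model is available for now |
Baize | LLaMA fine-tuned on 100k self-chat dialogue examples |
BELLE | Uses ChatGPT-generated data to optimize open-source models for Chinese |
Chatyuan | The earliest open-source dialogue model in China after ChatGPT came out; T5 architecture, derived from PromptCLUE below |
PromptCLUE | Multi-task prompt language model |
PLUG | LLM released by Alibaba DAMO Academy; a download link is provided after submitting an application |
CPM2.0 | CPM2.0 released by BAAI |
GLM | 130B Chinese-English bilingual pretrained model released by Tsinghua |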
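Model link | Description |
---|---|
MedPalm | Built by Google on top of Flan-PaLM via instruction prompt tuning on multiple types of medical QA data; MultiMedQA was constructed alongside it |
ChatDoctor | Instruction-tuned on 110K real doctor-patient dialogue samples + 5K ChatGPT-generated examples |
Huatuo Med-ChatGLM | Chinese medical instruction dataset built from a medical knowledge graph and ChatGPT + multi-turn QA data built from medical literature and ChatGPT |
Chinese-vicuna-med | Chinese-vicuna fine-tuned on cMedQA2 data |
OpenBioMed | Tsinghua AIR's open-source lightweight BioMedGPT; knowledge graph & multimodal pretrained models for 20+ biological research areas |
DoctorGLM | GLM fine-tuned on ChatDoctor + MedDialog + CMD multi-turn dialogue and single-turn instruction samples |
MedicalGPT-zh | QA generated by ChatGPT from a self-built medical database + scenario dialogues built with SELF across 16 scenarios |
PMC-LLaMA | LLaMA fine-tuned on medical papers |
NHS-LLM | Model fine-tuned on ChatGPT-generated medical QA and dialogues |
LawGPT-zh | 52k single-turn QA pairs obtained by cleaning the CrimeKgAssitant dataset with ChatGPT + scenario QA generated by ChatGPT from the 9k most essential legal provisions in the PRC law handbook + knowledge QA pairs built by ChatGPT from text |
FinChat.io | Trained on the latest financial data, earnings call transcripts, quarterly and annual reports, investment books, and more |
OpenGPT | Framework for domain-LLM instruction sample generation + fine-tuning |
Qianyuan (乾元) BigBang 200M financial model | Financial-domain pretraining + task fine-tuning |
Du Xiaoman hundred-billion-parameter financial LLM | Financial + Chinese pretraining and fine-tuning on top of Bloom-176B |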
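- OpenAI Cookbook: usage examples for OpenAI models ⭐
- Workaround for blocked OpenAI API access: set up a proxy on Tencent Cloud; personally tested, works very well, and easy even for the less hands-on
- PromptPerfect: fighting magic with magic; enter a raw prompt and the model optimizes it for a target. After trying it I was left a bit speechless. It can target different prompt-driven models such as Diffusion, ChatGPT, and DALL-E
- ClickPrompt: generates instructions for various prompt-powered tools including Diffusion, ChatGPT, etc.; requires an OpenAI key
- ChatGPT ShortCut: prompt examples for all kinds of scenarios; very comprehensive, and you can upvote them after use! ⭐
- Full ChatGPT Prompts + Resources: prompt examples for various scenarios, different from those above
- learning Prompt: a very comprehensive prompt engineering tutorial plus a collection of real-world applications, including many advanced scenarios of LLMs calling agents ⭐
- The art of asking chatgpt for high quality answers: a book on how to write prompts; the link is to the Chinese translation; leans toward basic usage
- Prompt-Engineer-Guide: an integrated tutorial of the same kind as learning Prompt (they even cite each other?!); its category index is better organized ⭐
- OpenAI application roundup guide: a roundup guide purely of applications
- AI Navigation (AI 导航): an application roundup site covering ChatGPT and more; updates quickly, and I discovered some new territory
- AI Alignment Forum: a discussion forum for the latest papers and views on RLHF and other alignment topics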
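- cognosys: the most popular web-based AutoGPT; how to put it, though... after trying it I nearly laughed my jaw off. No spoilers, try it and you'll see
- godmode: an AutoGPT that requires human interaction at every step
- agentgpt: a basic AutoGPT
- New Bing: requires an international network connection, otherwise you are redirected to Bing China; waitlist application required ⭐
- Perplexity.ai: also requires a VPN; a remarkable ChatGPT-powered search engine that feels better done than Bing, adding related recommendations and follow-up questions beyond what Bing offers ⭐
- BingGPT: open-source desktop client for New Bing; can export chat history
- DocsGPT: a general solution for turning ChatGPT's open-domain QA into closed-domain QA; suited to vertical-domain QA scenarios; you can try a customized chatbot ⭐
- langchain-ChatGLM: local knowledge-base QA based on ChatGLM; similar to DocsGPT above, but can be deployed locally ⭐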
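- ChatPDF: a China-based ChatPDF; after you upload a PDF it suggests the top 5 likely questions about the document, then answers and retrieves from the document conversationally; reads 30,000 characters in 10s
- ChatDoc: an upgraded ChatPDF, adding table parsing plus solid index citations with jump-to-source and highlighting of the corresponding passage; haha, I'm planning to build one myself
- ChatPaper: automatically downloads the latest papers from arXiv for the given keywords and summarizes them; can be tried on Hugging Face!
- OpenRead: aimed at paper writing and reading; helps generate literature reviews and offers a NotionAI-like smart Markdown for writing
- researchgpt: similar to ChatPDF; supports downloading arXiv papers and, once loaded, extracts key points conversationally
- BriefGPT: daily arXiv paper updates with summaries and keyword extraction to help researchers keep up with the latest developments; nice UI
- ChatGPT-academic: yet another Gradio-based package bundling paper polishing, summarization, and similar features
- feishu-chatgpt: ChatGPT for Feishu; like 365 Copilot it integrates multiple components; quite comprehensive!
- ChatMind: mind maps generated by ChatGPT; works fairly well for topics, but for a specific book it just makes things up; still, combining it with retrieval-based reading feels promising~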
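- Shell: ChatGPT-based AI English chat tool and speaking-practice assistant
- AI Topiah: AI character chat from Lingxin Intelligence (聆心智能); chatted a bit with Luffy, and my inner chuunibyou flared up a little
- chatbase: emotional character chat; haven't tried it yet
- Vana: virtual DNA; create a virtual self through chat! Very cool concept
- WriteSonic: AI writing; supports chat and targeted creation such as ad copy and product descriptions; web search support is a highlight; supports Chinese
- copy.ai: WriteSonic competitor; its highlight is that every sentence has a corresponding website link, like paper citations, and can be copied with one click into the Markdown editor on the right; super handy! ⭐
- NotionAI: smart Markdown; tried it and it's great! While writing, use commands to call the AI to polish, expand, retrieve content, and suggest ideas
- Jasper: same as above; all competitors, haha
- copy.down: Chinese marketing copy generation; targeted creation only; supports keyword-to-copy generation
- ChatExcel: control Excel calculations with instructions; of marginal value to those who know Excel, somewhat useful to those who don't
- ChatPPT: make PPT slides with ChatGPT
- BibiGPT: one-click summaries of Bilibili video content; multimodal documents
- Microsoft 365 Copilot: Microsoft Office fully integrated with GPT4, with smart PPT, Excel, and Word; no link yet. It's essentially an all-in-one bundle of the open-source ideas above
- Google Workspace: Google's full coverage of office scenarios with various AI services; no way to use it yet.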
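- Copilot: paid, mind you
- Fauxpilot: open-source, locally hosted Copilot alternative
- CodeGeeX: a China-based alternative; haven't tried it yet
- Codeium: Copilot alternative; has a free tier and supports various plugins
- Wolverine: Python script that debugs code by itself
- dreamstudio.ai: a pioneer; Stable Diffusion; trial quota available
- midjourney: a pioneer; mainly artistic styles
- Dall.E: and that completes the big three
- ControlNet: adds controllability to image generation
- GFPGAN: photo restoration
- Visual ChatGPT: Microsoft's image ChatGPT; image generation, editing, and QA through conversation ⭐
- gemo.ai: multimodal chatbot, covering text, image, and video generation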
- OpenAI ChatGPT Intro
- OpenAI InstructGPT intro
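- AllenAI on ChatGPT's abilities: How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources ⭐
- Huggingface on ChatGPT's abilities: The techniques behind ChatGPT: RLHF, IFT, CoT, Red teaming, and more
- Stephen Wolfram on ChatGPT's abilities: What Is ChatGPT Doing and Why Does It Work?
- Roundup of ChatGPT-related commentary
- MIT Technology Review interviews OpenAI engineers
- AGI: history and current state
- Zhang Junlin, The Road to AGI: Essentials of Large Language Model (LLM) Technology
- Zhihu answer: OpenAI released GPT-4; what are the technical optimizations or breakthroughs?
- The difficulties of catching up with ChatGPT, and its alternatives
- Compression is generalization, generalization is intelligence
- Transcript of Lu Qi's latest talk: My Worldview on Large Models | Issue 14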
- https://github.com/dongguanting/In-Context-Learning_PaperList
- https://github.com/thunlp/PromptPapers
- https://github.com/Timothyxxx/Chain-of-ThoughtsPapers
- A Survey of Large Language Models
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing ⭐
- Paradigm Shift in Natural Language Processing
- Pre-Trained Models: Past, Present and Future
- LARGER LANGUAGE MODELS DO IN-CONTEXT LEARNING DIFFERENTLY
- Evidence of Meaning in Language Models Trained on Programs
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
- How does in-context learning work? A framework for understanding the differences from traditional supervised learning
- Why can GPT learn in-context? Language Model Secretly Perform Gradient Descent as Meta-Optimizers
- Emergent Abilities of Large Language Models
- Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
- Can Explanations Be Useful for Calibrating Black Box Models
- GPT2: Language Models are Unsupervised Multitask Learners
- GPT3: Language Models are Few-Shot Learners ⭐
- LAMA: Language Models as Knowledge Bases?
- AutoPrompt: Eliciting Knowledge from Language Models
- T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- PET-TC(a): Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference ⭐
- PET-TC(b): PETSGLUE: It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
- GenPET: Few-Shot Text Generation with Natural Language Instructions
- LM-BFF: Making Pre-trained Language Models Better Few-shot Learners ⭐
- ADAPET: Improving and Simplifying Pattern Exploiting Training
- Prefix-tuning: Optimizing continuous prompts for generation
- Prompt-tuning: The power of scale for parameter-efficient prompt tuning ⭐
- P-tuning: GPT Understands Too ⭐
- WARP: Word-level Adversarial ReProgramming
- P-tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
- PTR: Prompt Tuning with Rules for Text Classification
- PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains
- LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS ⭐
- LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
- Parameter-Efficient Transfer Learning for NLP
- INTRINSIC DIMENSIONALITY EXPLAINS THE EFFECTIVENESS OF LANGUAGE MODEL FINE-TUNING
- Flan: FINETUNED LANGUAGE MODELS ARE ZERO-SHOT LEARNERS ⭐
- Flan-T5: Scaling Instruction-Finetuned Language Models
- Instruct-GPT: Training language models to follow instructions with human feedback ⭐
- T0: MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION
- Natural Instructions: Cross-Task Generalization via Natural Language Crowdsourcing Instructions
- Tk-INSTRUCT: SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks
- Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
- LaMDA: Language Models for Dialog Applications
- Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐
- BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
- How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
- [zero-shot-COT] Large Language Models are Zero-Shot Reasoners ⭐
- [Manual COT] Chain of Thought Prompting Elicits Reasoning in Large Language Models ⭐
- SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS
- COMPLEXITY-BASED PROMPTING FOR MULTI-STEP REASONING
- LEAST-TO-MOST PROMPTING ENABLES COMPLEX REASONING IN LARGE LANGUAGE MODELS
- Solving Quantitative Reasoning Problems with Language Models
- Specializing Smaller Language Models towards Multi-Step Reasoning
- Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
- TEXT AND PATTERNS: FOR EFFECTIVE CHAIN OF THOUGHT IT TAKES TWO TO TANGO
- Decomposed Prompting: A Modular Approach for Solving Complex Tasks
- Solving math word problems with process- and outcome-based feedback
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
- DeepMind
- Teaching language models to support answers with verified quotes
- Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐
- OpenAI
- PPO: Proximal Policy Optimization Algorithms ⭐
- Deep Reinforcement Learning from Human Preferences
- Fine-Tuning Language Models from Human Preferences
- learning to summarize from human feedback
- InstructGPT: Training language models to follow instructions with human feedback ⭐
- Scaling Laws for Reward Model Overoptimization ⭐
- Anthropic
- A General Language Assistant as a Laboratory for Alignment
- Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback ⭐
- Constitutional AI: Harmlessness from AI Feedback ⭐
- Pretraining Language Models with Human Preferences
- AllenAI, RL4LM: IS REINFORCEMENT LEARNING (NOT) FOR NATURAL LANGUAGE PROCESSING BENCHMARKS
- RRHF: Rank Responses to Align Language Models with Human Feedback without tears
- Toolformer: Language Models Can Teach Themselves to Use Tools
- MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning ⭐
- ReAct: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS ⭐
- Self-ask: MEASURING AND NARROWING THE COMPOSITIONALITY GAP IN LANGUAGE MODELS
- PAL: Program-aided Language Models
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
- OpenAGI: When LLM Meets Domain Experts
- Tool Learning with Foundation Models
- APE: LARGE LANGUAGE MODELS ARE HUMAN-LEVEL PROMPT ENGINEERS ⭐
- SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions ⭐
- iPrompt: Explaining Data Patterns in Natural Language via Interpretable Autoprompting
- Flipped Learning: Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
- Fairness-guided Few-shot Prompting for Large Language Models
- Instruction induction: From few examples to natural language task descriptions.
- Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
- Galactica: A Large Language Model for Science
- PubMed GPT: A Domain-specific large language model for biomedical text ⭐
- BloombergGPT: A Large Language Model for Finance
- ChatDoctor: Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
- Med-PaLM: Large Language Models Encode Clinical Knowledge [V1,V2] ⭐
- Augmented Large Language Models with Parametric Knowledge Guiding
- XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters
- Parallel Context Windows for Large Language Models
- Structured Prompting: Scaling In-Context Learning to 1,000 Examples
- Su Jianlin, NBCE: Extending the Context Length of LLMs with Naive Bayes ⭐
- Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
- Unlimiformer: Long-Range Transformers with Unlimited Length Input
- Scaling Transformer to 1M tokens and beyond with RMT
- RECURRENTGPT: Interactive Generation of (Arbitrarily) Long Text
- TRAIN SHORT, TEST LONG: ATTENTION WITH LINEAR BIASES ENABLES INPUT LENGTH EXTRAPOLATION ⭐
- BELLE: Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
- Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
- A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Large LM
- Exploring ChatGPT’s Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
- Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
- LIMA: Less Is More for Alignment ⭐
- Generated Knowledge Prompting for Commonsense Reasoning
- In-Context Instruction Learning
- PROMPTING GPT-3 TO BE RELIABLE
- InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
- Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
- PaLM-E: An Embodied Multimodal Language Model