11. Main References
Wenqiang Feng, Di Zhen. GenAI: Best Practices, 2024.
Wenqiang Feng. Learning Apache Spark with Python, 2017.
Michael Gunther et al. Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models, 2024.
Akari Asai et al. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, 2023.
Yunho Mo et al. Parameter-Efficient Fine-Tuning Method for Task-Oriented Dialogue Systems, 2023.
Philipp Schmid. Fine-tune Embedding models for Retrieval Augmented Generation (RAG), 2024.
Ashish Vaswani et al. Attention Is All You Need, 2017.
Maxime Labonne. Fine-Tune Your Own Llama 2 Model in a Colab Notebook, 2024.
Yang Liu et al. G-EVAL: NLG Evaluation using GPT-4 with Better Human Alignment, 2023.
Tri Dao et al. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness, 2022.
Tri Dao. FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning, 2023.
Jay Shah et al. FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision, 2024.
Andrei Ivanov et al. Data Movement Is All You Need: A Case Study on Optimizing Transformers, 2024.
Yi Dong et al. Safeguarding Large Language Models: A Survey, 2024.
Hakan Inan et al. Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations, 2023.
Luca Beurer-Kellner et al. Prompting Is Programming: A Query Language for Large Language Models, 2022.
Long Ouyang et al. Training language models to follow instructions with human feedback, 2022.
John Schulman et al. Proximal Policy Optimization Algorithms, 2017.
Rui Zheng et al. Secrets of RLHF in Large Language Models Part I: PPO, 2023.
Rafael Rafailov et al. Direct Preference Optimization: Your Language Model is Secretly a Reward Model, 2023.
DeepSeek AI. DeepSeek-V3 Technical Report, 2024.
DeepSeek AI. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model, 2024.