8. References
Meta AI. Llama 4: multimodal intelligence. https://ai.meta.com/blog/llama-4-multimodal-intelligence/, 2026.
Tao An. Cognitive Workspace: Active Memory Management for LLMs – An Empirical Study of Functional Infinite Context. 2025. Version Number: 1. doi:10.48550/ARXIV.2508.13171.
Anthropic. Claude 4.6 sonnet release notes. https://www.anthropic.com/news/claude-sonnet-4-6, 2026.
Muhammad Arslan, Hussam Ghanem, Saba Munawar, and Christophe Cruz. A Survey on RAG with LLMs. Procedia Computer Science, 246:3781–3790, January 2024. doi:10.1016/j.procs.2024.09.178.
Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. In The Twelfth International Conference on Learning Representations. 2024. URL: https://arxiv.org/abs/2310.11511 (visited on 2026-05-01).
Mary E. Bester and Kathryn Zeigler. Teaching Nursing Students Effective Artificial Intelligence Prompt Engineering: The CARE Framework. Nurse Educator, 51(1):13–17, January 2026. doi:10.1097/NNE.0000000000001969.
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020. URL: https://proceedings.neurips.cc/paper_files/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html (visited on 2026-02-11).
Banghao Chen, Zhaofeng Zhang, Nicolas Langrené, and Shengxin Zhu. Unleashing the potential of prompt engineering for large language models. Patterns, 6(6):101260, June 2025. doi:10.1016/j.patter.2025.101260.
Tonmoy Debnath, Md Nurul Absar Siddiky, Muhammad Enayetur Rahman, Prosenjit Das, Antu Kumar Guha, Muhammad Rezaur Rahman, and H. M. Dipu Kabir. A Comprehensive Survey of Prompt Engineering Techniques in Large Language Models. October 2025. doi:10.36227/techrxiv.174140719.96375390/v2.
Nathan Deen. LLMs Generate Western Bias Even When Trained with Non-Western Languages. April 2024. Published: Georgia Institute of Technology College of Computing. URL: https://www.cc.gatech.edu/news/llms-generate-western-bias-even-when-trained-non-western-languages (visited on 2026-03-03).
Google DeepMind. Gemini 3.1 pro model card. https://deepmind.google/models/model-cards/gemini-3-1-pro/, 2026.
Enzo Doyen and Amalia Todirascu. Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias. 2025. Version Number: 1. doi:10.48550/ARXIV.2502.10577.
Vrunda Gadesha. What is few shot prompting? IBM Think, 2026. URL: https://www.ibm.com/think/topics/few-shot-prompting (visited on 2026-03-03).
Vrunda Gadesha, Eda Kavlakoglu, and Vanna Winland. What is chain of thought (CoT) prompting? IBM Think, 2026. URL: https://www.ibm.com/think/topics/chain-of-thoughts (visited on 2026-03-03).
Yunfan Gao, Yun Xiong, Wenlong Wu, Bohan Li, Yijie Zhong, and Haofen Wang. U-NIAH: Unified RAG and LLM Evaluation for Long Context Needle-in-a-Haystack. ACM Transactions on Information Systems, 44(3):1–30, March 2026. doi:10.1145/3786609.
Yufei Guo, Muzhe Guo, Juntao Su, Zhou Yang, Mengqiu Zhu, Hongfei Li, Mengyang Qiu, and Shuo Shuo Liu. Bias in Large Language Models: Origin, Evaluation, and Mitigation. November 2024. arXiv:2411.10915 [cs]. doi:10.48550/arXiv.2411.10915.
Benjamin Haibe-Kains, George Alexandru Adam, Ahmed Hosny, Farnoosh Khodakarami, Massive Analysis Quality Control (MAQC) Society Board of Directors, Thakkar Shraddha, Rebecca Kusko, Susanna-Assunta Sansone, Weida Tong, Russ D. Wolfinger, Christopher E. Mason, Wendell Jones, Joaquin Dopazo, Cesare Furlanello, Levi Waldron, Bo Wang, Chris McIntosh, Anna Goldenberg, Anshul Kundaje, Casey S. Greene, Tamara Broderick, Michael M. Hoffman, Jeffrey T. Leek, Keegan Korthauer, Wolfgang Huber, Alvis Brazma, Joelle Pineau, Robert Tibshirani, Trevor Hastie, John P. A. Ioannidis, John Quackenbush, and Hugo J. W. L. Aerts. Transparency and reproducibility in artificial intelligence. Nature, 586(7829):E14–E16, October 2020. doi:10.1038/s41586-020-2766-y.
Dirk Holst, Keno Moenck, Julian Koch, Ole Schmedemann, and Thorsten Schüppstuhl. Transparent Reporting of AI in Systematic Literature Reviews: Development of the PRISMA-trAIce Checklist. JMIR AI, 4:e80247, December 2025. doi:10.2196/80247.
Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. Survey of Hallucination in Natural Language Generation. ACM Computing Surveys, 55(12):1–38, December 2023. doi:10.1145/3571730.
Stig Pedersen Korsholm. Mastering prompt engineering: a comparative guide to nine prompt engineering frameworks for tech professionals. https://www.linkedin.com/pulse/mastering-prompt-engineering-comparative-guide-nine-tech-korsholm-hnjif/, February 2024.
Oscar Lau and Su Golder. Comparison of Elicit AI and Traditional Literature Searching in Evidence Syntheses Using Four Case Studies. Cochrane Evidence Synthesis and Methods, 3(6):e70050, November 2025. doi:10.1002/cesm.70050.
Quinn Leng, Jacob Portes, Sam Havens, Matei Zaharia, and Michael Carbin. Long Context RAG Performance of Large Language Models. 2024. Version Number: 1. doi:10.48550/ARXIV.2411.03538.
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS '20, 9459–9474. Red Hook, NY, USA, December 2020. Curran Associates Inc. URL: https://dl.acm.org/doi/10.5555/3495724.3496517.
Xinze Li, Yixin Cao, Yubo Ma, and Aixin Sun. Long Context vs. RAG for LLMs: An Evaluation and Revisits. 2025. Version Number: 1. doi:10.48550/ARXIV.2501.01880.
Zijun Liu, Zhennan Wan, Peng Li, Ming Yan, Ji Zhang, Fei Huang, and Yang Liu. Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration. 2025. Version Number: 1. doi:10.48550/ARXIV.2505.21471.
Alexandra Sasha Luccioni, Sylvain Viguier, and Anne-Laure Ligozat. Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model. 2022. Version Number: 1. doi:10.48550/ARXIV.2211.02001.
Justin M. Mittelstädt, Julia Maier, Panja Goerke, Frank Zinn, and Michael Hermes. Large language models can outperform humans in social situational judgments. Scientific Reports, 14(1):27449, November 2024. doi:10.1038/s41598-024-79048-0.
OpenAI. Introducing GPT-5.5. https://openai.com/index/introducing-gpt-5-5/, 2026.
OpenAI. Prompt engineering. https://developers.openai.com/api/docs/guides/prompt-engineering, 2026.
Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. MemGPT: Towards LLMs as Operating Systems. 2023. Version Number: 2. doi:10.48550/ARXIV.2310.08560.
Boci Peng, Yun Zhu, Yongchao Liu, Xiaohe Bo, Haizhou Shi, Chuntao Hong, Yan Zhang, and Siliang Tang. Graph Retrieval-Augmented Generation: A Survey. ACM Transactions on Information Systems, 44(2):1–52, February 2026. URL: https://dl.acm.org/doi/10.1145/3777378, doi:10.1145/3777378.
Nikhil Sharma, Q. Vera Liao, and Ziang Xiao. Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking. February 2024. arXiv:2402.05880 [cs]. doi:10.48550/arXiv.2402.05880.
Fabio Vivas. ROSES Framework: Role, Objective, Scenario, Expected Solution, Steps. April 2025. Publication Title: Fabio Vivas. URL: https://fvivas.com/en/roses-framework-prompts-llm/ (visited on 2026-03-03).
Peng Xu, Wei Ping, Xianchao Wu, Chejian Xu, Zihan Liu, Mohammad Shoeybi, and Bryan Catanzaro. ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities. 2024. Version Number: 3. doi:10.48550/ARXIV.2407.14482.
Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, and Zhen-Hua Ling. Corrective Retrieval Augmented Generation. 2024. URL: https://arxiv.org/abs/2401.15884 (visited on 2026-05-01).
Muhammad Nadeem Yousaf. Practical Considerations and Ethical Implications of Using Artificial Intelligence in Writing Scientific Manuscripts. ACG Case Reports Journal, 12(2):e01629, February 2025. doi:10.14309/crj.0000000000001629.
Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, and Sercan Ö. Arik. Chain of Agents: Large Language Models Collaborating on Long-Context Tasks. 2024. Version Number: 1. doi:10.48550/ARXIV.2406.02818.
Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, and Zhangyang Wang. Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding. Advances in Neural Information Processing Systems, 37:60755–60775, December 2024. doi:10.52202/079017-1943.
Elicit Team. Elicit's limitations. 2024. URL: https://support.elicit.com/en/articles/549569 (visited on 2026-03-03).
Google Cloud. What is Model Context Protocol (MCP)? A guide. 2026. URL: https://cloud.google.com/discover/what-is-model-context-protocol?hl=en (visited on 2026-03-05).
Juuzt AI. ERA framework - Expectation-driven AI prompt engineering. 2026. URL: https://juuzt.ai/knowledge-base/prompt-frameworks/the-era-framework/ (visited on 2026-03-03).
theMITmonk. You’re not behind (yet): how to learn AI in 17 minutes. November 2023. YouTube video. URL: https://www.youtube.com/watch?v=EWFFaKxsz_s.
University of Michigan Medical School. Prompt Frameworks - UMMS Artificial Intelligence - Research Guides at University of Michigan Library. March 2026. URL: https://guides.lib.umich.edu/c.php?g=1406239&p=10420137 (visited on 2026-03-03).
xAI. Grok 4.3. xAI Docs, May 2026. Accessed: 2026-05-19. URL: https://docs.x.ai/developers/models/grok-4.3.