Authors:
(1) Mingjie Liu, NVIDIA {Equal contribution};
(2) Teodor-Dumitru Ene, NVIDIA {Equal contribution};
(3) Robert Kirby, NVIDIA {Equal contribution};
(4) Chris Cheng, NVIDIA {Equal contribution};
(5) Nathaniel Pinckney, NVIDIA {Equal contribution};
(6) Rongjian Liang, NVIDIA {Equal contribution};
(7) Jonah Alben, NVIDIA;
(8) Himyanshu Anand, NVIDIA;
(9) Sanmitra Banerjee, NVIDIA;
(10) Ismet Bayraktaroglu, NVIDIA;
(11) Bonita Bhaskaran, NVIDIA;
(12) Bryan Catanzaro, NVIDIA;
(13) Arjun Chaudhuri, NVIDIA;
(14) Sharon Clay, NVIDIA;
(15) Bill Dally, NVIDIA;
(16) Laura Dang, NVIDIA;
(17) Parikshit Deshpande, NVIDIA;
(18) Siddhanth Dhodhi, NVIDIA;
(19) Sameer Halepete, NVIDIA;
(20) Eric Hill, NVIDIA;
(21) Jiashang Hu, NVIDIA;
(22) Sumit Jain, NVIDIA;
(23) Brucek Khailany, NVIDIA;
(24) George Kokai, NVIDIA;
(25) Kishor Kunal, NVIDIA;
(26) Xiaowei Li, NVIDIA;
(27) Charley Lind, NVIDIA;
(28) Hao Liu, NVIDIA;
(29) Stuart Oberman, NVIDIA;
(30) Sujeet Omar, NVIDIA;
(31) Sreedhar Pratty, NVIDIA;
(32) Jonathan Raiman, NVIDIA;
(33) Ambar Sarkar, NVIDIA;
(34) Zhengjiang Shao, NVIDIA;
(35) Hanfei Sun, NVIDIA;
(36) Pratik P Suthar, NVIDIA;
(37) Varun Tej, NVIDIA;
(38) Walker Turner, NVIDIA;
(39) Kaizhe Xu, NVIDIA;
(40) Haoxing Ren, NVIDIA.
Table of Links
- Abstract and Intro
- Dataset
- ChipNeMo Domain Adaptation Methods
- LLM Applications
- Evaluations
- Discussion
- Related Works
- Conclusions
- Acknowledgments, Contributions and References
- Appendix
IX. ACKNOWLEDGEMENTS
The authors would like to thank: NVIDIA IT teams for their support on NVBugs integration; NVIDIA Hardware Security team for their support on security issues; NVIDIA NeMo teams for their support and guidance on training and inference of ChipNeMo models; NVIDIA Infrastructure teams for supporting the GPU training and inference resources for the project; NVIDIA Hardware design teams for their support and insight.
X. CONTRIBUTIONS
Mingjie Liu conducted DAPT and SFT model training.
Teodor-Dumitru Ene and Robert Kirby developed the inference and application evaluation infrastructure.
Chris Cheng developed the RAG framework.
Nathaniel Pinckney collected and prepared data sets for training.
Rongjian Liang developed custom tokenizers.
Walker Turner, Charley Lind, and George Kokai developed a general circuit design knowledge benchmark.
Siddhanth Dhodhi, Ismet Bayraktaroglu, Himyanshu Anand, and Eric Hill designed the engineering assistant chatbot, provided domain instruction datasets and evaluation benchmarks, and conducted evaluations.
Parikshit Deshpande, Zhengjiang Shao, Kaizhe Xu, Jiashang Hu, Laura Dang, Xiaowei Li, Hao Liu, and Ambar Sarkar developed the engineering assistant chatbot application.
Sreedhar Pratty, Kishor Kunal, Varun Tej, Sumit Jain, Sujeet Omar, Pratik P Suthar, and Hanfei Sun developed the EDA script generation application and provided domain instruction datasets and evaluation benchmarks.
Bonita Bhaskaran, Arjun Chaudhuri, and Sanmitra Banerjee developed the bug summarization and analysis application and provided domain instruction datasets and evaluation benchmarks.
Brucek Khailany, Stuart Oberman, Sharon Clay, Sameer Halepete, Jonathan Raiman, Bryan Catanzaro, Jonah Alben, and Bill Dally advised from AI research and hardware engineering perspectives.
Haoxing Ren designed and led the research.
REFERENCES
[1] B. Khailany et al., “Accelerating chip design with machine learning,” IEEE Micro, vol. 40, no. 6, pp. 23–32, 2020.
[2] H. Ren and M. Fojtik, “Invited- nvcell: Standard cell layout in advanced technology nodes with reinforcement learning,” in 2021 58th ACM/IEEE Design Automation Conference (DAC), 2021.
[3] R. Roy et al., “PrefixRL: Optimization of parallel prefix circuits using deep reinforcement learning,” in 2021 58th ACM/IEEE Design Automation Conference (DAC), 2021.
[4] W.-L. Chiang et al., “Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality,” March 2023. [Online]. Available: https://lmsys.org/blog/2023-03-30-vicuna/
[5] H. Touvron et al., “Llama 2: Open foundation and fine-tuned chat models,” 2023.
[6] S. Thakur et al., “Benchmarking large language models for automated verilog rtl code generation,” in 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2023, pp. 1–6.
[7] J. Blocklove et al., “Chip-chat: Challenges and opportunities in conversational hardware design,” 2023.
[8] Z. He et al., “Chateda: A large language model powered autonomous agent for eda,” 2023.
[9] S. Bubeck et al., “Sparks of artificial general intelligence: Early experiments with gpt-4,” 2023.
[10] S. Wu et al., “Bloomberggpt: A large language model for finance,” 2023.
[11] M. LLC. (2022) Biomedlm: a domain-specific large language model for biomedical text. [Online]. Available: https://www.mosaicml.com/blog/introducing-pubmed-gpt
[12] M. Liu et al., “VerilogEval: evaluating large language models for verilog code generation,” in 2023 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2023.
[13] E. Nijkamp et al., “Codegen: An open large language model for code with multi-turn program synthesis,” ICLR, 2023.
[14] S. Gururangan et al., “Don’t stop pretraining: Adapt language models to domains and tasks,” 2020.
[15] P. Lewis et al., “Retrieval-augmented generation for knowledge-intensive nlp tasks,” 2021.
[16] E. J. Hu et al., “Lora: Low-rank adaptation of large language models,” CoRR, vol. abs/2106.09685, 2021. [Online]. Available: https://arxiv.org/abs/2106.09685
[17] L. Gao et al., “The pile: An 800gb dataset of diverse text for language modeling.”
[18] D. Kocetkov et al., “The stack: 3 tb of permissively licensed source code,” 2022.
[19] A. Köpf et al., “Openassistant conversations – democratizing large language model alignment,” 2023.
[20] J. Wei et al., “Finetuned language models are zero-shot learners,” 2022.
[21] V. Sanh et al., “Multitask prompted training enables zero-shot task generalization,” 2022.
[22] D. Hendrycks et al., “Measuring massive multitask language understanding,” 2021.
[23] M. Chen et al., “Evaluating large language models trained on code,” 2021.
[24] F. Koto, J. H. Lau, and T. Baldwin, “IndoBERTweet: A pretrained language model for Indonesian Twitter with effective domain-specific vocabulary initialization,” in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Nov. 2021, pp. 10660–10668.
[25] O. Kuchaiev et al., “Nemo: a toolkit for building ai applications using neural modules,” 2019.
[26] M. Shoeybi et al., “Megatron-lm: Training multi-billion parameter language models using model parallelism,” arXiv preprint arXiv:1909.08053, 2019.
[27] T. Dao et al., “FlashAttention: Fast and memory-efficient exact attention with IO-awareness,” in Advances in Neural Information Processing Systems, 2022.
[28] A. Chowdhery et al., “Palm: Scaling language modeling with pathways,” 2022.
[29] Z. Ji et al., “Survey of hallucination in natural language generation,” ACM Comput. Surv., vol. 55, no. 12, Mar. 2023. [Online]. Available: https://doi.org/10.1145/3571730
[30] L. Wang et al., “Text embeddings by weakly-supervised contrastive pre-training,” arXiv preprint arXiv:2212.03533, 2022.
[31] L. Gao et al., “Tevatron: An efficient and flexible toolkit for dense retrieval,” 2022.
[32] B. Rozière et al., “Code llama: Open foundation models for code,” 2023.
[33] N. Reimers and I. Gurevych, “Sentence-bert: Sentence embeddings using siamese bert-networks,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Nov. 2019. [Online]. Available: http://arxiv.org/abs/1908.10084
[34] R. Pope et al., “Efficiently scaling transformer inference,” 2022.
[35] R. Y. Aminabadi et al., “Deepspeed inference: Enabling efficient inference of transformer models at unprecedented scale,” 2022.
[36] L. Ouyang et al., “Training language models to follow instructions with human feedback,” 2022.
[37] W. Xiong et al., “Effective long-context scaling of foundation models,” 2023.
[38] R. Taylor et al., “Galactica: A large language model for science,” 2022.
[39] A. Lewkowycz et al., “Solving quantitative reasoning problems with language models,” 2022.
[40] P. Lewis et al., “Retrieval-augmented generation for knowledge-intensive nlp tasks,” 2021.
[41] S. Borgeaud et al., “Improving language models by retrieving from trillions of tokens,” 2022.
[42] S. Robertson and H. Zaragoza, “The probabilistic relevance framework: Bm25 and beyond,” Found. Trends Inf. Retr., vol. 3, no. 4, pp. 333–389, Apr. 2009. [Online]. Available: https://doi.org/10.1561/1500000019
[43] V. Karpukhin et al., “Dense passage retrieval for open-domain question answering,” 2020.
[44] G. Izacard et al., “Unsupervised dense information retrieval with contrastive learning,” 2022.
[45] W. Shi et al., “Replug: Retrieval-augmented black-box language models,” 2023.
[46] G. Izacard et al., “Few-shot Learning with Retrieval Augmented Language Models,” 2022. [Online]. Available: http://arxiv.org/abs/2208.03299
[47] O. Ram et al., “In-context retrieval-augmented language models,” 2023.
[48] S. Zhou et al., “Docprompting: Generating code by retrieving the docs,” 2023.
[49] R. Rafailov et al., “Direct preference optimization: Your language model is secretly a reward model,” 2023.
[50] Y. Dong et al., “Steerlm: Attribute conditioned sft as an (user-steerable) alternative to rlhf,” 2023.
[51] H. Pearce, B. Tan, and R. Karri, “Dave: Deriving automatically verilog from english,” in Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD, ser. MLCAD ’20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 27–32. [Online]. Available: https://doi.org/10.1145/3380446.3430634
[52] “Beautiful Soup,” https://www.crummy.com/software/BeautifulSoup/, accessed: 10 Oct 2023.
[53] K. Sakaguchi et al., “Winogrande: An adversarial winograd schema challenge at scale,” arXiv preprint arXiv:1907.10641, 2019.
[54] R. Zellers et al., “Hellaswag: Can a machine really finish your sentence?” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019.
[55] P. Clark et al., “Think you have solved question answering? try arc, the ai2 reasoning challenge,” 2018.
[56] G. Lai et al., “Race: Large-scale reading comprehension dataset from examinations,” 2017.
This paper is available on arXiv under a CC 4.0 license.