Published in: Automated Software Engineering 1/2024

01.05.2024

Distilled GPT for source code summarization

Authors: Chia-Yi Su, Collin McMillan


Abstract

A code summary is a brief natural language description of source code. Summaries are usually only a single sentence long, yet they form the backbone of developer documentation. A short description such as “changes all visible polygons to the color blue” can give a programmer a high-level idea of what code does without the effort of reading the code itself. Recently, products based on Large Language Models such as ChatGPT have demonstrated a strong ability to write these descriptions automatically. However, to use these tools, programmers must send their code to untrusted third parties for processing (e.g., via an API call). This loss of custody is not acceptable to many organizations. In this paper, we present an alternative: we train an open source model using sample output generated by GPT-3.5 in a process related to knowledge distillation. Our model is small enough (350 million parameters) to run on a single 16 GB GPU, yet we show in our evaluation that it is large enough to mimic GPT-3.5 on this task.
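The data-collection side of the distillation process the abstract describes can be sketched as follows: the teacher model (here GPT-3.5) is queried for summaries of source code, and the resulting (code, summary) pairs become the student model's fine-tuning set. This is a minimal illustrative sketch, not the authors' pipeline; `teacher_summarize` is a hypothetical stand-in for a real API call, and the toy teacher below exists only so the code runs without network access.

```python
import json


def build_distillation_pairs(functions, teacher_summarize):
    """Pair each source function with the teacher model's summary.

    `teacher_summarize` stands in for a call to the teacher LLM
    (e.g. an HTTP request to a hosted API); here it is any
    callable mapping a code string to a summary string.
    """
    pairs = []
    for src in functions:
        summary = teacher_summarize(src)
        pairs.append({"code": src, "summary": summary})
    return pairs


def write_jsonl(pairs, path):
    """Serialize the pairs as JSON Lines, one training example per line."""
    with open(path, "w") as f:
        for p in pairs:
            f.write(json.dumps(p) + "\n")


# Toy teacher for illustration only: echoes the function's first line.
toy_teacher = lambda code: f"summary of {code.splitlines()[0]}"
pairs = build_distillation_pairs(
    ["int add(int a, int b) { return a + b; }"], toy_teacher
)
```

The student model is then fine-tuned on this corpus with an ordinary sequence-to-sequence objective, so it imitates the teacher's summaries without the code ever leaving the organization at inference time.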


Metadata
Title
Distilled GPT for source code summarization
Authors
Chia-Yi Su
Collin McMillan
Publication date
01.05.2024
Publisher
Springer US
Published in
Automated Software Engineering / Issue 1/2024
Print ISSN: 0928-8910
Electronic ISSN: 1573-7535
DOI
https://doi.org/10.1007/s10515-024-00421-4
