The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks

Authors: Venkatesh, Ashwin Prasad Shivarpatna; Sabu, Samkutty; Mir, Amir M.; Reis, Sofia; Bodden, Eric
Year of publication: 2024
Subject:
Document type: Working Paper
Description: The application of Large Language Models (LLMs) in software engineering, particularly in static analysis tasks, represents a paradigm shift in the field. In this paper, we investigate the role that current LLMs can play in improving call graph analysis and type inference for Python programs. Using the PyCG, HeaderGen, and TypeEvalPy micro-benchmarks, we evaluate 26 LLMs, including OpenAI's GPT series and open-source models such as LLaMA. Our study reveals that LLMs show promising results in type inference, demonstrating higher accuracy than traditional methods, yet they exhibit limitations in call graph analysis. This contrast emphasizes the need for specialized fine-tuning of LLMs to better suit specific static analysis tasks. Our findings provide a foundation for further research on integrating LLMs into static analysis tasks.
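To illustrate the kind of micro-benchmark case such evaluations target, here is a minimal, hypothetical Python snippet (not taken from PyCG, HeaderGen, or TypeEvalPy): an analyzer or LLM is asked to recover the call graph edge between two functions and to infer the return type of the callee.

```python
# Hypothetical micro-benchmark-style case (illustrative only):
# the task is to recover the call graph edge main -> helper
# and to infer the return type of helper as str.

def helper():
    # Expected inferred return type: str
    return "hello"

def main():
    # Expected call graph edge: main -> helper
    message = helper()  # Expected inferred type of `message`: str
    return message

if __name__ == "__main__":
    print(main())
```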
Comment: To be published in ICSE FORGE 2024 (AI Foundation Models and Software Engineering)
Database: arXiv