Codegen performs advanced static analysis to build a rich graph representation of your codebase. This pre-computation step analyzes dependencies, references, types, and control flow to enable fast and reliable code manipulation operations.
Codegen is built on top of Tree-sitter and rustworkx and has implemented most language server features from scratch.
Codegen is open source. Check out the source code to learn more!
At the heart of Codegen is a comprehensive graph representation of your code. When you initialize a Codebase, it performs static analysis to construct a rich graph structure connecting code elements:
```python
# Initialize and analyze the codebase
from codegen import Codebase

codebase = Codebase("./")

# Access pre-computed relationships
function = codebase.get_symbol("process_data")
print(f"Dependencies: {function.dependencies}")  # Instant lookup
print(f"Usages: {function.usages}")  # No parsing needed
```
Codegen’s graph construction happens in two stages:
1. AST Parsing: We use Tree-sitter as our foundation for parsing code into Abstract Syntax Trees. Tree-sitter provides fast, reliable parsing across multiple languages.
2. Multi-file Graph Construction: Custom parsing logic, implemented in rustworkx and Python, analyzes these ASTs to construct a more sophisticated graph structure. This graph captures relationships between symbols, files, imports, and more.
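To make the two stages concrete, here is a minimal sketch of the idea. It is not Codegen's implementation. It assumes Python's standard-library `ast` module as a stand-in for Tree-sitter and plain dictionaries as a stand-in for a rustworkx graph. The `FILES` mapping and the helper names `parse_files` and `build_graph` are hypothetical, and calls are resolved by bare function name only, which is a deliberate simplification.

```python
import ast
from collections import defaultdict

# Hypothetical in-memory "codebase": file name -> source text.
FILES = {
    "loader.py": "def load():\n    return [1, 2, 3]\n",
    "pipeline.py": (
        "from loader import load\n"
        "\n"
        "def process_data():\n"
        "    return sum(load())\n"
    ),
}


def parse_files(files):
    """Stage 1: parse each file into an AST (stdlib ast stands in for Tree-sitter)."""
    return {name: ast.parse(source) for name, source in files.items()}


def build_graph(trees):
    """Stage 2: link top-level functions across files by the names they call."""
    # Record which file defines each top-level function.
    defined_in = {}
    for name, tree in trees.items():
        for node in tree.body:
            if isinstance(node, ast.FunctionDef):
                defined_in[node.name] = name

    dependencies = defaultdict(set)  # function -> known functions it calls
    usages = defaultdict(set)        # function -> known functions that call it
    for tree in trees.values():
        for node in tree.body:
            if not isinstance(node, ast.FunctionDef):
                continue
            for child in ast.walk(node):
                # Simplification: only direct calls by bare name are resolved.
                if (
                    isinstance(child, ast.Call)
                    and isinstance(child.func, ast.Name)
                    and child.func.id in defined_in
                ):
                    dependencies[node.name].add(child.func.id)
                    usages[child.func.id].add(node.name)
    return defined_in, dependencies, usages


trees = parse_files(FILES)
defined_in, dependencies, usages = build_graph(trees)

# After the one-time build, each query is a dictionary lookup with no re-parsing.
print(sorted(dependencies["process_data"]))
print(sorted(usages["load"]))
print(defined_in["load"])
```

Because both edge directions are stored up front, looking up what a function depends on and where it is used costs the same, which is the property the pre-computed `dependencies` and `usages` attributes rely on.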
Learn about how Codegen handles language specifics in the Language Support guide.
We’ve started with these ecosystems but designed our architecture to be extensible. The graph-based approach provides a consistent interface across languages while handling language-specific details under the hood.
Codegen is just getting started, and we’re excited about the possibilities ahead. We enthusiastically welcome contributions from the community, whether it’s: