The codegen notebook command creates a virtual environment and opens a Jupyter notebook for quick prototyping. This is often the fastest way to get up and running.
# Launch Jupyter with a demo notebook
codegen notebook --demo
The codegen notebook --demo command comes pre-configured to load FastAPI’s codebase, so you can start exploring right away!
# Print overall stats
print("🔍 Codebase Analysis")
print("=" * 50)
print(f"📚 Total Classes: {len(codebase.classes)}")
print(f"⚡ Total Functions: {len(codebase.functions)}")
print(f"🔄 Total Imports: {len(codebase.imports)}")

# Find class with most inheritance
if codebase.classes:
    deepest_class = max(codebase.classes, key=lambda x: len(x.superclasses))
    print(f"\n🌳 Class with most inheritance: {deepest_class.name}")
    print(f"  📊 Chain Depth: {len(deepest_class.superclasses)}")
    print(f"  ⛓️ Chain: {' -> '.join(s.name for s in deepest_class.superclasses)}")

# Find first 5 recursive functions
recursive = [f for f in codebase.functions if any(call.name == f.name for call in f.function_calls)][:5]
if recursive:
    print("\n🔄 Recursive functions:")
    for func in recursive:
        print(f"  - {func.name}")
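The recursion check above is name-based: a function counts as recursive when one of its calls has the same name as the function itself. The following is a minimal stdlib sketch of that idea using Python's ast module. It is not Codegen's implementation; the SOURCE string and the find_recursive_functions helper are hypothetical, introduced only for illustration.

```python
import ast

# Hypothetical sample source, used only to exercise the helper below.
SOURCE = '''
def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

def helper(x):
    return x + 1

def walk(node):
    for child in node.children:
        walk(child)
'''


def find_recursive_functions(source: str) -> list[str]:
    """Return names of functions that call themselves by bare name."""
    tree = ast.parse(source)
    recursive = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Look for a call like `name(...)` anywhere inside the function body.
            calls_self = any(
                isinstance(inner, ast.Call)
                and isinstance(inner.func, ast.Name)
                and inner.func.id == node.name
                for inner in ast.walk(node)
            )
            if calls_self:
                recursive.append(node.name)
    return recursive


recursive_names = find_recursive_functions(SOURCE)
print(recursive_names)
```

Like the notebook snippet, this heuristic only catches direct recursion by name. Mutual recursion and calls through attributes such as self.method() are not detected.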
Let’s specifically drill into large test files, which can be cumbersome to manage.
from collections import Counter

# Filter to all test functions and classes
test_functions = [x for x in codebase.functions if x.name.startswith('test_')]
test_classes = [x for x in codebase.classes if x.name.startswith('Test')]

print("🧪 Test Analysis")
print("=" * 50)
print(f"📝 Total Test Functions: {len(test_functions)}")
print(f"🔬 Total Test Classes: {len(test_classes)}")
print(f"📊 Tests per File: {len(test_functions) / len(codebase.files):.1f}")

# Find files with the most tests
print("\n📚 Top Test Files by Class Count")
print("-" * 50)
file_test_counts = Counter([x.file for x in test_classes])
for file, num_tests in file_test_counts.most_common(5):
    print(f"🔍 {num_tests} test classes: {file.filepath}")
    # Count lines, not characters, so the value matches the "lines" label
    print(f"  📏 File Length: {len(file.source.splitlines())} lines")
    print(f"  💡 Functions: {len(file.functions)}")
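The ranking step is plain collections.Counter usage and can be tried without a codebase. In this minimal sketch the file paths are hypothetical stand-ins for the file objects that Codegen would supply as Counter keys.

```python
from collections import Counter

# One entry per test class, holding the (hypothetical) file that defines it.
test_class_files = [
    "tests/test_a.py",
    "tests/test_b.py",
    "tests/test_a.py",
    "tests/test_c.py",
    "tests/test_a.py",
    "tests/test_b.py",
]

counts = Counter(test_class_files)

# most_common(n) returns the n highest-count entries as (key, count) pairs.
top_two = counts.most_common(2)
for path, num_classes in top_two:
    print(f"{num_classes} test classes: {path}")
```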
Let’s split the largest test files into separate modules for better organization. This uses Codegen’s move_to_file(…), called on each symbol, which will:
update all imports
(optionally) move dependencies
do so very fast ⚡️
all while maintaining correctness.
filename = 'tests/test_path.py'
print(f"📦 Splitting Test File: {filename}")
print("=" * 50)

# Grab a file
file = codebase.get_file(filename)
base_name = filename.replace('.py', '')

# Group tests by subpath (the first three underscore-separated parts of the name)
test_groups = {}
for test_function in file.functions:
    if test_function.name.startswith('test_'):
        test_subpath = '_'.join(test_function.name.split('_')[:3])
        if test_subpath not in test_groups:
            test_groups[test_subpath] = []
        test_groups[test_subpath].append(test_function)

# Print and process each group
for subpath, tests in test_groups.items():
    print(f"\n{subpath}/")
    new_filename = f"{base_name}/{subpath}.py"

    # Create file if it doesn't exist, then fetch it either way
    if not codebase.has_file(new_filename):
        codebase.create_file(new_filename)
    new_file = codebase.get_file(new_filename)

    # Move each test in the group
    for test_function in tests:
        print(f"  - {test_function.name}")
        test_function.move_to_file(new_file, strategy="add_back_edge")

# Commit changes to disk
codebase.commit()
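The grouping key is the first three underscore-separated parts of each test name, so tests that share a prefix land in the same new module. A minimal sketch of that key on its own follows. The test names and the group_key helper are hypothetical and stand in for the function objects Codegen would provide.

```python
from collections import defaultdict


def group_key(test_name: str, parts: int = 3) -> str:
    """Join the first `parts` underscore-separated pieces of a test name."""
    return "_".join(test_name.split("_")[:parts])


# Hypothetical test names, for illustration only.
sample_names = [
    "test_path_str_valid",
    "test_path_str_invalid",
    "test_path_int_valid",
    "test_root",
]

groups = defaultdict(list)
for name in sample_names:
    groups[group_key(name)].append(name)

for key, names in groups.items():
    print(f"tests/test_path/{key}.py <- {names}")
```

Names with fewer than three parts, such as test_root, become their own group, so very short test names each get a separate file.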
Once you have a general sense of your codebase, you can filter down to exactly what you’re looking for. Codegen’s graph structure makes it straightforward and performant to find and traverse specific code elements:
# Grab specific content by name
my_resource = codebase.get_symbol('TestResource')

# Find classes that inherit from a specific base
resource_classes = [
    cls for cls in codebase.classes
    if cls.is_subclass_of('Resource')
]

# Find functions with specific decorators
test_functions = [
    f for f in codebase.functions
    if any('pytest' in d.source for d in f.decorators)
]

# Find files matching certain patterns
test_files = [
    f for f in codebase.files
    if f.name.startswith('test_')
]
Codegen guarantees that code transformations maintain correctness. It automatically handles updating imports, references, and dependencies. Here are some common transformations:
# Move all Enum classes to a dedicated file
# (move_to_file is given a file object here, as in the test-splitting example)
enums_file = codebase.get_file('enums.py') if codebase.has_file('enums.py') else codebase.create_file('enums.py')
for cls in codebase.classes:
    if cls.is_subclass_of('Enum'):
        # Codegen automatically:
        # - Updates all imports that reference this class
        # - Maintains the class's dependencies
        # - Preserves comments and decorators
        # - Generally performs this in a sane manner
        cls.move_to_file(enums_file)

# Rename a function and all its usages
old_function = codebase.get_function('process_data')
old_function.rename('process_resource')  # Updates all references automatically

# Change a function's signature
handler = codebase.get_function('event_handler')
handler.get_parameter('e').rename('event')  # Automatically updates all call-sites
handler.add_parameter('timeout: int = 30')  # Handles formatting and edge cases
handler.add_return_type('Response | None')

# Perform surgery on call-sites
# (Collection is Codegen's collection type and must be imported before running this)
for fcall in handler.call_sites:
    arg = fcall.get_arg_by_parameter_name('env')
    # f(..., env={ data: x }) => f(..., env={ data: x or None })
    if arg is not None and isinstance(arg.value, Collection):
        data_key = arg.value.get('data')
        data_key.value.edit(f'{data_key.value} or None')
When moving symbols, Codegen will automatically update all imports and references. See Moving Symbols to learn more.
Codegen’s graph structure makes it easy to analyze relationships between code elements across files:
# Find dead code
for func in codebase.functions:
    if len(func.usages) == 0:
        print(f'🗑️ Dead code: {func.name}')
        func.remove()

# Analyze import relationships
file = codebase.get_file('api/endpoints.py')
print("\nFiles that import endpoints.py:")
for import_stmt in file.inbound_imports:
    print(f"  {import_stmt.file.path}")

print("\nFiles that endpoints.py imports:")
for import_stmt in file.imports:
    if import_stmt.resolved_symbol:
        print(f"  {import_stmt.resolved_symbol.file.path}")

# Explore class hierarchies
base_class = codebase.get_class('BaseModel')
if base_class:
    print(f"\nClasses that inherit from {base_class.name}:")
    for subclass in base_class.subclasses:
        print(f"  {subclass.name}")
        # We can go deeper in the inheritance tree
        for sub_subclass in subclass.subclasses:
            print(f"    └─ {sub_subclass.name}")
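The hierarchy loop above stops two levels down. A small recursive helper can print a tree of any depth. The sketch below relies only on the name and subclasses attributes used above. FakeClass and format_hierarchy are hypothetical, so the snippet runs without Codegen, and it assumes that subclasses yields direct children only.

```python
from dataclasses import dataclass, field


@dataclass
class FakeClass:
    """Hypothetical stand-in for a class object; models only `name` and `subclasses`."""
    name: str
    subclasses: list["FakeClass"] = field(default_factory=list)


def format_hierarchy(cls, depth: int = 0, seen=None) -> list[str]:
    """Return one indented line per class, visiting each class at most once."""
    seen = set() if seen is None else seen
    if id(cls) in seen:
        # Already printed via another path (e.g. multiple inheritance).
        return []
    seen.add(id(cls))
    lines = [f"{'  ' * depth}{cls.name}"]
    for sub in cls.subclasses:
        lines.extend(format_hierarchy(sub, depth + 1, seen))
    return lines


# Hypothetical hierarchy: BaseModel -> User -> Admin, and BaseModel -> Item.
admin = FakeClass("Admin")
user = FakeClass("User", [admin])
item = FakeClass("Item")
base = FakeClass("BaseModel", [user, item])

hierarchy_lines = format_hierarchy(base)
print("\n".join(hierarchy_lines))
```

The seen set keeps a class from being listed twice when it is reachable through more than one parent.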
Codegen also supports a number of advanced settings that can be used to customize the behavior of the graph construction process. These flags are helpful for debugging problematic repos, optimizing Codegen’s performance, or testing unreleased or experimental (potentially backwards-breaking) features.
from codegen import Codebase
from codegen.configs import CodebaseConfig

# Initialize a Codebase with custom configuration
codebase = Codebase(
    "path/to/git/repo",
    config=CodebaseConfig(
        verify_graph=True,
        method_usages=False,
        sync_enabled=True,
        generics=False,
        import_resolution_overrides={
            "old_module": "new_module"
        },
        ts_language_engine=True,
        v8_ts_engine=True
    )
)
To learn more about available settings, see the Advanced Settings page.
These are considered experimental and unstable features that may be removed or changed in the future.