
mutating/getsources



This library lets you retrieve a function's source code at runtime. It can serve as a foundation for tools that work with ASTs. It is a thin wrapper around inspect.getsource and dill.source.getsource.


Installation

You can install getsources with pip:

pip install getsources

You can also use instld to quickly try this package and others without installing them.

Get raw source

The standard library's inspect module provides the getsource function, which returns the source code of functions and other objects. However, it does not work with functions defined in the REPL.

This library provides a function with the same name and nearly the same interface, but without this limitation:

# You can run this code snippet in the REPL.
from getsources import getsource

def function():
    ...

print(getsource(function))
#> def function():
#>     ...

This allows AST-based tools to work reliably in both scripts and the REPL. All other functions in the library are built on top of it.

⚠️ Please note that this library is intended solely for retrieving the source code of functions of any kind, including generators, async functions, regular functions, class methods, lambdas, and so on. It is not intended for classes, modules, or other objects. Other use cases may work, but they are not covered by the test suite.

Get cleaned source

The getsource function returns a function's source code in raw form. As a result, the returned snippet may include surrounding context that is not part of the function itself.

Here is an example where the standard getsource output includes extra leading whitespace:

if True:
    def function():
        ...

print(getsource(function))
#>     def function():
#>         ...

↑ Notice the extra leading spaces.

For lambda functions, it may also return the entire surrounding expression:

print(getsource(lambda x: x))
#> print(getsource(lambda x: x))
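The extra context matters as soon as the raw text is handed to a parser. The following is a minimal sketch of the indentation case using only the standard library; the raw_source string is a hand-written stand-in for what getsource returns for the nested function above, and textwrap.dedent is just one possible way to clean it, not necessarily what this library does internally.

```python
import ast
import textwrap

# Hand-written stand-in for the raw source of a function defined inside `if True:`.
raw_source = "    def function():\n        ...\n"

# The leading indentation makes the snippet invalid as a standalone module.
try:
    ast.parse(raw_source)
    parsed_raw = True
except IndentationError:
    parsed_raw = False

# Removing the common leading whitespace makes the snippet parseable.
clean_source = textwrap.dedent(raw_source)
tree = ast.parse(clean_source)
function_name = tree.body[0].name

print(parsed_raw)     # False
print(function_name)  # function
```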

To address these issues, the library provides a function called getclearsource, which returns the function's source with unnecessary context removed:

from getsources import getclearsource

class SomeClass:
    @staticmethod
    def method():
        ...

print(getclearsource(SomeClass.method))
#> @staticmethod
#> def method():
#>     ...
print(getclearsource(lambda x: x))
#> lambda x: x

To extract only the substring containing a lambda function, the library uses AST parsing behind the scenes. Unfortunately, this approach cannot distinguish between multiple lambda functions defined on a single line, so in that case you will get an exception:

lambdas = [lambda: None, lambda x: x]

getclearsource(lambdas[0])
#> ...
#> getsources.errors.UncertaintyWithLambdasError: Several lambda functions are defined in a single line of code, can't determine which one.
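To illustrate where the ambiguity comes from, here is a rough standard-library sketch of AST-based lambda extraction. The helper name extract_lambda_segments is hypothetical and this is not the library's actual implementation; it only shows that a line with one lambda yields a single candidate, while a line with two yields two candidates and no obvious way to choose between them.

```python
import ast
from typing import List


def extract_lambda_segments(line: str) -> List[str]:
    """Return the source text of every lambda expression found in `line`."""
    source = line.strip()
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, ast.Lambda)
    ]


# One lambda on the line: the candidate is unambiguous.
print(extract_lambda_segments("print(getsource(lambda x: x))"))
# ['lambda x: x']

# Two lambdas on the line: both are candidates for the same line of code.
print(extract_lambda_segments("lambdas = [lambda: None, lambda x: x]"))
# ['lambda: None', 'lambda x: x']
```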

If you absolutely must obtain at least some source code for these lambdas, use getsource:

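from getsources import getclearsource, getsource
from getsources.errors import UncertaintyWithLambdasError

try:
    source = getclearsource(function)
except UncertaintyWithLambdasError:
    # Fall back to the raw source, which may include surrounding code.
    source = getsource(function)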

However, in general, the getclearsource function is recommended for retrieving the source code of functions when working with the AST.

Generate source hashes

In some cases, you may not care about a function's exact source, but you still need to distinguish between different implementations. The getsourcehash function is useful here. It returns a short hash string derived from the function's source code:

from getsources import getsourcehash

def function():
    ...

print(getsourcehash(function))
#> 7SWJGZ

ⓘ A hash string uses only characters from the Crockford Base32 alphabet, which consists solely of uppercase English letters and digits; ambiguous characters are excluded, which makes the hash easier to read.

ⓘ The getsourcehash function is built on top of getclearsource and ignores "extra" characters in the source code.

By default, the hash string length is 6 characters, but you can choose a length from 4 to 8 characters:

print(getsourcehash(function, size=4))
#> WJGZ
print(getsourcehash(function, size=8))
#> XG7SWJGZ
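For intuition, a short hash of this shape could be produced roughly as follows. This is a sketch under stated assumptions, not the library's algorithm: the short_hash helper is hypothetical, SHA-256 is an arbitrary choice of digest, and rejecting sizes outside 4 to 8 with ValueError is assumed behavior. The sketch keeps the trailing Base32 digits so that a shorter hash is a suffix of a longer one, mirroring the example outputs above, where WJGZ is the tail of 7SWJGZ.

```python
import hashlib

# Crockford Base32 alphabet: digits and uppercase letters without I, L, O, U.
CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def short_hash(source: str, size: int = 6) -> str:
    """Hypothetical helper: the last `size` Crockford Base32 digits of a digest."""
    if not 4 <= size <= 8:
        raise ValueError("size must be between 4 and 8")  # assumed behavior
    digest = hashlib.sha256(source.encode("utf-8")).digest()  # assumed digest
    number = int.from_bytes(digest, "big")
    chars = []
    for _ in range(size):
        # Peel off the least significant Base32 digit on each iteration.
        number, index = divmod(number, 32)
        chars.append(CROCKFORD_ALPHABET[index])
    return "".join(reversed(chars))


example_source = "def function():\n    ...\n"
print(short_hash(example_source))
print(short_hash(example_source, size=4))
print(short_hash(example_source, size=8))
```

The printed values will differ from the hashes shown in this README, since the digest used here is only an assumption.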

By default, the full source code of a function is used, including its name and arguments. If you only want to compare function bodies, pass only_body=True:

def function_1():
    ...

def function_2(a=5):
    ...

print(getsourcehash(function_1, only_body=True))
#> V587A6
print(getsourcehash(function_2, only_body=True))
#> V587A6

By default, docstrings are considered part of the function body. If you want to skip them as well, pass skip_docstring=True:

def function_1():
    """some text"""
    ...

def function_2(a=5):
    ...

print(getsourcehash(function_1, only_body=True, skip_docstring=True))
#> V587A6
print(getsourcehash(function_2, only_body=True, skip_docstring=True))
#> V587A6
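The idea behind only_body and skip_docstring can be sketched with the standard ast module. The body_fingerprint helper below is hypothetical and only illustrates the comparison; how getsourcehash actually normalizes the source is not described here. It takes the source text directly so the snippet runs anywhere.

```python
import ast


def body_fingerprint(source: str, skip_docstring: bool = False) -> str:
    """Hypothetical helper: a text dump of a function's body statements only."""
    function_node = ast.parse(source).body[0]
    body = list(function_node.body)
    # A docstring, when present, is always the first statement of the body.
    if skip_docstring and ast.get_docstring(function_node) is not None:
        body = body[1:]
    return "\n".join(ast.dump(statement) for statement in body)


source_1 = 'def function_1():\n    """some text"""\n    ...\n'
source_2 = "def function_2(a=5):\n    ...\n"

# Names and arguments differ, and so do the bodies while the docstring counts.
print(body_fingerprint(source_1) == body_fingerprint(source_2))
# False

# With the docstring skipped, the remaining bodies are identical.
print(body_fingerprint(source_1, skip_docstring=True)
      == body_fingerprint(source_2, skip_docstring=True))
# True
```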
