pyplugins.apis.symbols module

Symbols Plugin (symbols.py) for Penguin

This module provides the Symbols plugin for the Penguin framework, serving as a robust, centralized service for resolving binary symbols to file offsets. It allows other plugins and scripts to locate functions and variables within guest executables and shared libraries, even in challenging scenarios like stripped binaries or non-standard architectures.

Features

  • Robust Forward Lookup: Resolves symbol names to file offsets using a tiered strategy:
      1. Pre-computed JSON cache (fastest).
      2. Native nm utility (static and dynamic tables).
      3. pyelftools (section parsing).
      4. Manual PT_DYNAMIC segment parsing (handles “sstripped” binaries with no section headers).
      5. readelf fallbacks, including MIPS GOT scraping for embedded targets.

  • Reverse Resolution: Maps file offsets back to the nearest symbol name (useful for stack trace generation).

  • Address Resolution: Maps virtual addresses to file offsets (useful for handling raw addresses provided by users).

  • Introspection: Methods to list, filter, and bulk-load symbols for specific binaries.

  • Architecture Aware: Automatically handles absolute addressing (ET_EXEC) vs relative offsets (ET_DYN) and architecture-specific symbol tables.
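The tiered strategy in the first feature can be pictured as a chain of resolvers tried in order until one produces an offset. The following is a minimal sketch of that pattern only, not the plugin's actual implementation; `tiered_lookup`, the `Resolver` type, and the three stand-in tiers are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical resolver signature: (path, symbol) -> file offset or None.
Resolver = Callable[[str, str], Optional[int]]


def tiered_lookup(path: str, symbol: str, resolvers: List[Resolver]) -> Optional[int]:
    """Return the first non-None offset produced by the resolvers, in order."""
    for resolve in resolvers:
        try:
            offset = resolve(path, symbol)
        except Exception:
            # Assumption: a failing tier (missing tool, unparsable ELF)
            # is skipped instead of aborting the whole chain.
            continue
        if offset is not None:  # offset 0 is valid, so do not test truthiness
            return offset
    return None


# Stand-in tiers for illustration only; they are not the plugin's real backends.
_cache: Dict[str, Dict[str, int]] = {"/usr/bin/httpd": {"main": 0x1000}}


def from_cache(path: str, symbol: str) -> Optional[int]:
    return _cache.get(path, {}).get(symbol)


def from_broken_tool(path: str, symbol: str) -> Optional[int]:
    raise FileNotFoundError("tool not installed")


def from_slow_parser(path: str, symbol: str) -> Optional[int]:
    return {"httpGetEnv": 0x2340}.get(symbol)


TIERS: List[Resolver] = [from_cache, from_broken_tool, from_slow_parser]
```

Ordering the tiers from cheapest to most expensive means the slower parsers run only when the faster ones have nothing to offer.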

Example Usage

from penguin import plugins

# 1. Forward Lookup: Get the binary path and file offset for a function
#    (Useful for placing hooks or uprobes)
path, offset = plugins.Symbols.lookup("/usr/bin/httpd", "httpGetEnv")
if offset is not None:
    print(f"Function located at offset {hex(offset)} in {path}")

# 2. Address Resolution: Convert a virtual address to a file offset
offset = plugins.Symbols.resolve_addr("/usr/bin/httpd", 0x400500)

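# 3. Reverse Lookup: Identify a function from a crash/instruction pointer
#    (resolve_offset may return None, so check before unpacking)
result = plugins.Symbols.resolve_offset("/usr/lib/libc.so.0", 0x12345)
if result is not None:
    name, dist = result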

Purpose

The Symbols plugin bridges the gap between high-level analysis names (functions) and low-level binary offsets required for instrumentation.

class pyplugins.apis.symbols.Symbols[source]

Bases: Plugin

Symbols Plugin

A central service for resolving symbol names to file offsets within the guest.

find_all(symbol)[source]

Search for a symbol in ALL known libraries in the database.

Parameters

symbol : str

Symbol name to look up.

Returns

List[Tuple[str, int]]

A list of (library_path, file_offset) for every occurrence of the symbol.

Parameters:

symbol (str)

Return type:

List[Tuple[str, int]]
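As a rough illustration of the return shape, the sketch below searches a small in-memory table and collects every (library_path, file_offset) pair for a symbol. The `SYMBOL_DB` dictionary and `find_all_in` helper are hypothetical stand-ins; how the plugin actually stores its database is not described here.

```python
from typing import Dict, List, Tuple

# Hypothetical database: library path -> {symbol name: file offset}.
SYMBOL_DB: Dict[str, Dict[str, int]] = {
    "/lib/libc.so.0": {"malloc": 0x1A2B0, "free": 0x1A400},
    "/usr/lib/libcompat.so": {"malloc": 0x0800},
    "/usr/bin/httpd": {"main": 0x1000},
}


def find_all_in(db: Dict[str, Dict[str, int]], symbol: str) -> List[Tuple[str, int]]:
    """Collect (path, offset) for every library that defines the symbol."""
    return [
        (path, symbols[symbol])
        for path, symbols in sorted(db.items())  # sorted for a stable order
        if symbol in symbols
    ]
```

An empty list signals that no known library defines the symbol.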

get_offset(path, symbol)[source]
Parameters:
  • path (str)

  • symbol (str)

Return type:

int | None

list_symbols(path, filter_str=None)[source]

Returns a list of all symbol names found in the binary. If filter_str is given, only names containing it as a substring are returned.

Parameters:
  • path (str)

  • filter_str (str | None)

Return type:

List[str]

load_symbols(path)[source]

Force-loads symbols for a binary into the cache. Useful for pre-warming the cache or debugging what symbols are detected.

Parameters:

path (str)

Return type:

Dict[str, int]

lookup(path, symbol)[source]

Resolve a symbol name to a specific library path and file offset.

Parameters

path : str

Path to binary (supports wildcards like “/libc.so”).

symbol : str

Symbol name to look up.

Returns

Tuple[str, int] or (None, None)

Parameters:
  • path (str)

  • symbol (str)

Return type:

Tuple[str | None, int | None]
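The wildcard behaviour of path can be approximated with shell-style globbing. This sketch assumes glob semantics (as in Python's fnmatch module) and a hypothetical in-memory table; the exact pattern syntax the plugin accepts is not specified above, so treat both the `lookup_in` helper and the `*/libc.so*` pattern as illustrative.

```python
import fnmatch
from typing import Dict, Optional, Tuple

# Hypothetical database: library path -> {symbol name: file offset}.
SYMBOL_DB: Dict[str, Dict[str, int]] = {
    "/lib/libc.so.0": {"malloc": 0x1A2B0},
    "/usr/bin/httpd": {"main": 0x1000},
}


def lookup_in(
    db: Dict[str, Dict[str, int]], pattern: str, symbol: str
) -> Tuple[Optional[str], Optional[int]]:
    """Return (path, offset) for the first matching path that defines symbol."""
    for path in sorted(db):
        # fnmatchcase keeps matching case-sensitive on every platform.
        if fnmatch.fnmatchcase(path, pattern) and symbol in db[path]:
            return path, db[path][symbol]
    return None, None  # mirrors the documented (None, None) failure value
```

Callers should test the offset with `is not None` instead of truthiness, since 0 is a legitimate file offset.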

resolve_addr(path, vaddr, base_addr=None)[source]

Resolves a virtual address to a file offset.

If base_addr is provided, the offset is calculated as vaddr - base_addr. Otherwise, it attempts to map the virtual address to a file offset using ELF segments. If that fails, it retries by assuming common base addresses (e.g., 0x400000) and checking if the adjusted address maps to a segment or valid file offset.

Returns the file offset or None if resolution fails.

Parameters:
  • path (str)

  • vaddr (int)

  • base_addr (int | None)

Return type:

int | None
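The first two resolution steps can be sketched with plain arithmetic over loadable segments. The `Segment` tuple below uses the standard ELF program header field names (p_vaddr, p_offset, p_filesz), but `vaddr_to_offset` is a hypothetical helper, and the retry with common base addresses is deliberately left out.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    """Subset of a PT_LOAD program header (field names follow the ELF spec)."""
    p_vaddr: int   # virtual address where the segment is mapped
    p_offset: int  # offset of the segment's first byte in the file
    p_filesz: int  # number of bytes the segment occupies in the file


def vaddr_to_offset(
    vaddr: int, segments: List[Segment], base_addr: Optional[int] = None
) -> Optional[int]:
    """Map a virtual address to a file offset, or return None."""
    if base_addr is not None:
        # Explicit load base supplied: plain subtraction, as documented.
        return vaddr - base_addr
    for seg in segments:
        # The address is file-backed if it falls inside [p_vaddr, p_vaddr + p_filesz).
        if seg.p_vaddr <= vaddr < seg.p_vaddr + seg.p_filesz:
            return vaddr - seg.p_vaddr + seg.p_offset
    return None


# Made-up layout resembling a small non-PIE executable.
SEGMENTS = [
    Segment(p_vaddr=0x400000, p_offset=0x0, p_filesz=0x1000),
    Segment(p_vaddr=0x601000, p_offset=0x1000, p_filesz=0x200),
]
```

With this layout, 0x400500 falls in the first segment and maps to file offset 0x500, while an address outside both segments yields None.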

resolve_offset(path, offset)[source]

Reverse lookup: given a file offset, find the nearest preceding symbol. Returns (SymbolName, DistanceFromStart), or None if no match is found.

Example: resolve_offset("/bin/httpd", 0x4005) -> ("main", 5)

Parameters:
  • path (str)

  • offset (int)

Return type:

Tuple[str, int] | None
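A nearest-preceding-symbol search is commonly done with a sorted offset table and a binary search. The sketch below shows one way to do that with the standard bisect module; `nearest_preceding_symbol` and the sample table are hypothetical, and returning None when no symbol precedes the offset is an assumption about the plugin's behaviour.

```python
import bisect
from typing import Dict, Optional, Tuple


def nearest_preceding_symbol(
    symbols: Dict[str, int], offset: int
) -> Optional[Tuple[str, int]]:
    """Return (name, distance) for the closest symbol at or before offset."""
    # Sort by offset so a binary search can find the insertion point.
    table = sorted((off, name) for name, off in symbols.items())
    offsets = [off for off, _ in table]
    # bisect_right gives the index after the last entry <= offset.
    idx = bisect.bisect_right(offsets, offset) - 1
    if idx < 0:
        return None  # offset lies before the first known symbol
    start, name = table[idx]
    return name, offset - start


# Made-up symbol table: name -> file offset.
SAMPLE = {"_start": 0x3F00, "main": 0x4000, "helper": 0x4100}
```

With main assumed to start at 0x4000, an offset of 0x4005 resolves to ("main", 5), matching the example above.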