penguin.static_analyses module

Static analysis utilities for the Penguin emulation environment.

This module provides classes and helpers for analyzing extracted filesystems.

class penguin.static_analyses.ArchId

Bases: StaticAnalysis

Identify the most common architecture in the extracted filesystem.

run(extracted_fs, prior_results)

Count architectures to identify the most common one.

If both 32-bit and 64-bit binaries of the most common architecture are present, prefer 64-bit. Raise an error if the architecture cannot be determined or is unsupported.

Parameters:
  • extracted_fs (str) – Path to extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Most common architecture string.

Raises:

ValueError – If unable to determine architecture.

Return type:

str
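
The selection rule described above (count, take the most common, prefer 64-bit on a mix) can be sketched as follows. This is a minimal illustration, not the module's implementation: `pick_architecture`, the `supported` tuple, the `(arch, bits)` input shape, and the returned naming scheme are all assumptions made for the example.

```python
from collections import Counter


# Hypothetical helper, not part of penguin: chooses an architecture from
# (family, bit_width) pairs observed for binaries in a filesystem.
def pick_architecture(observed, supported=("arm", "mips", "x86")):
    """Return the most common family, preferring 64-bit when both widths occur."""
    counts = Counter(arch for arch, _bits in observed)
    if not counts:
        raise ValueError("no binaries found; cannot determine architecture")
    best, _ = counts.most_common(1)[0]
    if best not in supported:
        raise ValueError(f"unsupported architecture: {best}")
    # Collect the bit widths seen for the winning family only.
    bit_widths = {bits for arch, bits in observed if arch == best}
    width = 64 if 64 in bit_widths else 32
    return f"{best}{width}"  # illustrative naming, not penguin's strings


sample = [("arm", 32), ("arm", 64), ("arm", 32), ("mips", 32)]
print(pick_architecture(sample))
```

Note that the 64-bit preference applies only within the winning family: a single 64-bit binary from a less common architecture does not change the result.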

class penguin.static_analyses.ClusterCollector

Bases: StaticAnalysis

Collect summary statistics for the filesystem to help identify clusters.

static compute_file_hash(file_path)

Compute the SHA-256 hash of a file.

Parameters:

file_path (str) – Path to file.

Returns:

Hex digest string or None on failure.

Return type:

str | None
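
A helper with the documented contract (hex digest on success, None on failure) might look like the sketch below. The name `sha256_of_file`, the chunked read, and the choice to treat `OSError` as "failure" are assumptions for illustration; the actual method may differ.

```python
import hashlib
import os
import tempfile


# Hypothetical stand-in for a "hash or None" helper.
def sha256_of_file(file_path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, or None if it cannot be read."""
    digest = hashlib.sha256()
    try:
        with open(file_path, "rb") as handle:
            # Read in chunks so large firmware blobs are not loaded at once.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    except OSError:
        return None
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "busybox")
    with open(path, "wb") as handle:
        handle.write(b"hello")
    present = sha256_of_file(path)
    missing = sha256_of_file(os.path.join(tmp, "does-not-exist"))
    print(present, missing)
```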

run(extract_dir, prior_results)

Collect basename and hash of every executable file.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Dict with lists of files, executables, and hashes.

Return type:

dict[str, list[str]]
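
To make the return shape concrete, here is a rough sketch of collecting basenames and executable hashes into a `dict[str, list[str]]`. The key names (`files`, `executables`, `executable_hashes`), the use of the execute permission bits to decide what counts as executable, and the function name are assumptions, not taken from the module.

```python
import hashlib
import os
import stat
import tempfile


# Hypothetical sketch; key names and the "executable" test are assumptions.
def collect_cluster_stats(extract_dir):
    files, executables, hashes = set(), set(), set()
    for dirpath, _dirs, filenames in os.walk(extract_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            files.add(name)
            mode = os.stat(path).st_mode
            if mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
                executables.add(name)
                with open(path, "rb") as handle:
                    hashes.add(hashlib.sha256(handle.read()).hexdigest())
    return {
        "files": sorted(files),
        "executables": sorted(executables),
        "executable_hashes": sorted(hashes),
    }


with tempfile.TemporaryDirectory() as root:
    for name, mode in (("init", 0o755), ("passwd", 0o644)):
        path = os.path.join(root, name)
        with open(path, "wb") as handle:
            handle.write(b"#!/bin/sh\n")
        os.chmod(path, mode)
    stats = collect_cluster_stats(root)
    print(stats)
```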

class penguin.static_analyses.EnvFinder

Bases: StaticAnalysis

Identify potential environment variables and their values in the filesystem.

BORING_VARS: list[str] = ['TERM']

run(extract_dir, prior_results)

Find environment variables and their possible values.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Dict of environment variable names to possible values.

Return type:

dict[str, list | None]

class penguin.static_analyses.FileSystemHelper

Bases: object

static find_regex(target_regex, extract_root, ignore=None)

Search the filesystem for matches to a regex pattern using ripgrep.

Parameters:
  • target_regex (Pattern) – Compiled regex pattern to match.

  • extract_root (str) – Root directory to search.

  • ignore (list | tuple | None) – Optional list/tuple of matches to ignore.

Returns:

Dict of {match: {"count": int, "files": [str]}}

Return type:

dict
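
The documented return shape can be illustrated with a pure-Python stand-in. This sketch deliberately does not call ripgrep; it walks the tree with `os.walk` and `re` instead, so it only demonstrates how matches might be aggregated into `{match: {"count": ..., "files": [...]}}`. The name `find_regex_py`, keying by the full match text, and listing each file once per match are assumptions.

```python
import os
import re
import tempfile


# Hypothetical stand-in that mimics the documented result shape without ripgrep.
def find_regex_py(target_regex, extract_root, ignore=None):
    ignored = set(ignore or ())
    results = {}
    for dirpath, _dirs, filenames in os.walk(extract_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file or broken symlink
            for match in target_regex.finditer(text):
                key = match.group(0)
                if key in ignored:
                    continue
                entry = results.setdefault(key, {"count": 0, "files": []})
                entry["count"] += 1
                if path not in entry["files"]:
                    entry["files"].append(path)
    return results


with tempfile.TemporaryDirectory() as root:
    script = os.path.join(root, "rcS")
    with open(script, "w") as handle:
        handle.write("mount /dev/mtd0\ncat /dev/mtd1\ndd if=/dev/mtd0\n")
    pattern = re.compile(r"/dev/[a-z0-9]+")
    hits = find_regex_py(pattern, root, ignore=["/dev/mtd1"])
    print(hits)
```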

class penguin.static_analyses.InitFinder

Bases: StaticAnalysis

Find potential init scripts and binaries in an extracted filesystem.

run(filesystem_root_path, prior_results)

Search the filesystem for scripts and binaries that might serve as init.

Parameters:
  • filesystem_root_path (str) – Root path of extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Sorted list of init script paths.

Return type:

list[str]

class penguin.static_analyses.InterfaceFinder

Bases: StaticAnalysis

Identify network interfaces in the filesystem.

run(extract_dir, prior_results)

Find network interfaces using sysfs and command references.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Dict of interfaces found via sysfs and commands.

Return type:

dict[str, list[str]] | None

class penguin.static_analyses.KernelVersionFinder

Bases: StaticAnalysis

Find and select the best kernel version from extracted filesystem.

static is_kernel_version(name)

Check if a string matches a kernel version pattern.

Parameters:

name (str) – Version string.

Returns:

True if name matches the kernel version pattern.

Return type:

bool
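
The exact pattern used by this method is not shown in this reference. As a rough sketch, a check of this kind could be written with a regular expression such as the one below; both `KERNEL_VERSION_RE` and `looks_like_kernel_version` are hypothetical, and the real pattern may accept or reject different strings.

```python
import re

# Assumed pattern: major.minor, an optional patch level, and an optional
# suffix such as "-rc1" or "-generic". Not taken from the module.
KERNEL_VERSION_RE = re.compile(r"^\d+\.\d+(\.\d+)?([-.+][\w.+-]*)?$")


def looks_like_kernel_version(name):
    """Return True if name resembles a Linux kernel version string."""
    return bool(KERNEL_VERSION_RE.match(name))


for candidate in ("4.14.98", "2.6.36-rc1", "3.10", "modules"):
    print(candidate, looks_like_kernel_version(candidate))
```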

run(extract_dir, prior_results)

Run kernel version analysis.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Dict with potential and selected kernel versions.

Return type:

dict[str, list[str] | str]

static select_best_kernel(kernel_versions)

Select the most recent kernel version and match it to the available kernels.

Parameters:

kernel_versions (set[str]) – Set of kernel version strings.

Returns:

Best matching kernel version string.

Return type:

str
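
Picking "the most recent" version requires numeric rather than lexical comparison, since "4.10" sorts before "4.4" as a string. The sketch below shows one way to do that and then map the result onto a list of available kernels. The `available` list, the closest-by-major-then-minor matching rule, and both function names are assumptions; how the module actually matches against its available kernels is not documented here.

```python
import re


def version_key(version):
    """Turn '4.14.98-foo' into (4, 14, 98) for numeric comparison."""
    numbers = re.match(r"\d+(?:\.\d+)*", version)
    if not numbers:
        return ()
    return tuple(int(part) for part in numbers.group(0).split("."))


# Hypothetical sketch: `available` and the distance rule are assumptions.
def pick_kernel(kernel_versions, available):
    newest = max(kernel_versions, key=version_key)
    t_major, t_minor = (version_key(newest) + (0, 0))[:2]

    def distance(candidate):
        major, minor = (version_key(candidate) + (0, 0))[:2]
        return (abs(major - t_major), abs(minor - t_minor))

    return min(available, key=distance)


found = {"2.6.36", "4.4.60", "3.10.14"}
print(max(found, key=version_key))
print(pick_kernel(found, ["4.10", "5.15", "6.7"]))
```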

class penguin.static_analyses.LibrarySymbols

Bases: StaticAnalysis

Examine libraries in the filesystem for NVRAM keys and exported symbols.

Uses pyelftools to find definitions for NVRAM_KEYS variables and tracks exported function names.

NVRAM_KEYS: list[str] = ['Nvrams', 'router_defaults']

static get_nvram_info(elf_path, archend)

Extract NVRAM key-value pairs from an ELF file.

Parameters:
  • elf_path (str) – Path to ELF file.

  • archend (str) – Architecture/endianness info.

Returns:

Dict of NVRAM key-value pairs.

Return type:

dict[str, str | None]

run(extract_dir, prior_results)

Analyze libraries for NVRAM keys and symbols.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Dict with nvram values and symbol paths.

Return type:

dict[str, dict]

class penguin.static_analyses.PseudofileFinder

Bases: StaticAnalysis

Find device and proc pseudofiles in the extracted filesystem.

IGLOO_ADDED_DEVICES: list[str] = ['autofs', 'btrfs-control', 'cfs0', 'cfs1', 'cfs2', 'cfs3', 'cfs4', 'console', 'cpu_dma_latency', 'full', 'fuse', 'input', 'kmsg', 'loop-control', 'loop0', 'loop1', 'loop2', 'loop3', 'loop4', 'loop5', 'loop6', 'loop7', 'mem', 'memory_bandwidth', 'mice', 'net', 'network_latency', 'network_throughput', 'null', 'port', 'ppp', 'psaux', 'ptmx', 'pts', 'ptyp0', 'ptyp1', 'ptyp2', 'ptyp3', 'ptyp4', 'ptyp5', 'ptyp6', 'ptyp7', 'ptyp8', 'ptyp9', 'ptypa', 'ptypb', 'ptypc', 'ptypd', 'ptype', 'ptypf', 'ram', 'ram0', 'ram1', 'ram10', 'ram11', 'ram12', 'ram13', 'ram14', 'ram15', 'ram2', 'ram3', 'ram4', 'ram5', 'ram6', 'ram7', 'ram8', 'ram9', 'random', 'root', 'tty', 'tty0', 'tty1', 'tty10', 'tty11', 'tty12', 'tty13', 'tty14', 'tty15', 'tty16', 'tty17', 'tty18', 'tty19', 'tty2', 'tty20', 'tty21', 'tty22', 'tty23', 'tty24', 'tty25', 'tty26', 'tty27', 'tty28', 'tty29', 'tty3', 'tty30', 'tty31', 'tty32', 'tty33', 'tty34', 'tty35', 'tty36', 'tty37', 'tty38', 'tty39', 'tty4', 'tty40', 'tty41', 'tty42', 'tty43', 'tty44', 'tty45', 'tty46', 'tty47', 'tty48', 'tty49', 'tty5', 'tty50', 'tty51', 'tty52', 'tty53', 'tty54', 'tty55', 'tty56', 'tty57', 'tty58', 'tty59', 'tty6', 'tty60', 'tty61', 'tty62', 'tty63', 'tty7', 'tty8', 'tty9', 'ttyS0', 'ttyS1', 'ttyS2', 'ttyS3', 'ttyp0', 'ttyp1', 'ttyp2', 'ttyp3', 'ttyp4', 'ttyp5', 'ttyp6', 'ttyp7', 'ttyp8', 'ttyp9', 'ttypa', 'ttypb', 'ttypc', 'ttypd', 'ttype', 'ttypf', 'tun', 'urandom', 'vcs', 'vcs1', 'vcsa', 'vcsa1', 'vda', 'vga_arbiter', 'vsock', 'zero', 'root', 'pts', 'ttyAMA0', 'ttyAMA1', 'stdin', 'stdout', 'stderr']
IGLOO_PROCFS: list[str] = ['buddyinfo', 'cgroups', 'cmdline', 'config.gz', 'consoles', 'cpuinfo', 'crypto', 'devices', 'diskstats', 'execdomains', 'fb', 'filesystems', 'interrupts', 'iomem', 'ioports', 'kallsyms', 'key-users', 'keys', 'kmsg', 'kpagecount', 'kpageflags', 'loadavg', 'locks', 'meminfo', 'misc', 'modules', 'mounts', 'mtd', 'net', 'pagetypeinfo', 'partitions', 'penguin_net', 'sched_debug', 'slabinfo', 'softirqs', 'stat', 'swaps', 'sysrq-trigger', 'thread-self', 'timer_list', 'uptime', 'version', 'vmallocinfo', 'vmstat', 'zoneinfo', 'bus', 'bus/pci', 'bus/pci/00', 'bus/pci/00/00.0', 'bus/pci/00/0a.0', 'bus/pci/00/0a.1 ', 'bus/pci/00/0a.2 ', 'bus/pci/00/0a.3 ', 'bus/pci/00/0b.0 ', 'bus/pci/00/12.0 ', 'bus/pci/00/13.0 ', 'bus/pci/00/14.0 ', 'bus/pci/devices ', 'bus/input', 'bus/input/devices', 'bus/input/handlers', 'cpu', 'cpu/alignment', 'driver', 'driver/rtc', 'fs', 'fs/afs', 'fs/afs/cells', 'fs/afs/rootcell', 'fs/ext4', 'fs/f2fs', 'fs/jbd2', 'fs/nfsd', 'fs/lockd', 'fs/lockd/nlm_end_grace', 'fs/nfsfs', 'fs/nfsfs/servers', 'fs/nfsfs/volumes', 'sysvipc/shm', 'sysvipc/sem', 'sysvipc/msg', 'scsi/device_info', 'scsi/scsi', 'tty/drivers', 'tty/ldisc', 'tty/driver', 'tty/driver/serial', 'tty/ldisc']
PROC_IGNORE: list[str] = ['irq', 'self', 'PID', 'device-tree', 'net', 'vmcore']

run(extract_dir, prior_results)

Run pseudofile analysis.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Returns:

Dict with lists of device and proc files.

Return type:

dict[str, list[str]]

class penguin.static_analyses.StaticAnalysis

Bases: ABC

Abstract base class for static analyses.

run(extract_dir, prior_results)

Run the static analysis.

Parameters:
  • extract_dir (str) – Directory containing extracted filesystem.

  • prior_results (dict) – Results from previous analyses.

Return type:

None
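
The `run(extract_dir, prior_results)` interface suggests that analyses are executed in sequence, with each one able to read what earlier ones produced. The sketch below illustrates that pattern with a local stand-in base class so that it runs without penguin installed. The `StaticAnalysis` defined here only mirrors the documented signature, and `ShellCounter`, `HasShell`, `run_all`, and the convention of keying results by class name are all hypothetical.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class StaticAnalysis(ABC):  # local stand-in mirroring the documented interface
    @abstractmethod
    def run(self, extract_dir, prior_results):
        """Run the analysis and return its result."""


class ShellCounter(StaticAnalysis):  # hypothetical analysis
    def run(self, extract_dir, prior_results):
        count = 0
        for _dirpath, _dirs, filenames in os.walk(extract_dir):
            count += sum(1 for name in filenames if name.endswith(".sh"))
        return {"shell_scripts": count}


class HasShell(StaticAnalysis):  # hypothetical analysis using prior results
    def run(self, extract_dir, prior_results):
        return prior_results["ShellCounter"]["shell_scripts"] > 0


def run_all(analyses, extract_dir):
    """Run analyses in order, passing each a copy of the results so far."""
    results = {}
    for analysis in analyses:
        results[type(analysis).__name__] = analysis.run(extract_dir, dict(results))
    return results


with tempfile.TemporaryDirectory() as root:
    for name in ("start.sh", "readme.txt"):
        with open(os.path.join(root, name), "w") as handle:
            handle.write("")
    outcome = run_all([ShellCounter(), HasShell()], root)
    print(outcome)
```

Ordering matters in this arrangement: an analysis that reads `prior_results` must run after the analyses it depends on.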