penguin.graphs module

class penguin.graphs.Configuration(name, info=None, exclusive=None)[source]

Bases: GraphNode

class penguin.graphs.ConfigurationGraph(base_config)[source]

Bases: object

Parameters:

base_config (Configuration)

add_dependencies(parent_config, child_config)[source]
Parameters:

  • parent_config (Configuration)

  • child_config (Configuration)

add_derived_configuration(derived_config, parent_config, mitigation)[source]

Add a new configuration derived from a specific mitigation and parent configuration.

Parameters:

  • derived_config (Configuration)

  • parent_config (Configuration)

  • mitigation (Mitigation)

add_edge(from_node, to_node, weight=1.0, unknown=False, delta=None)[source]
Parameters:
  • from_node (GraphNode)

  • to_node (GraphNode)

  • weight (float)

  • unknown (bool)

  • delta (str | None)

add_node(node)[source]
Parameters:

node (GraphNode)

calculate_expected_config_health(cc)[source]

Given a config, calculate the expected health score of that config. Note this ignores any actual health score we’ve seen for this config.

Return type:

float
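The reference does not show how the expectation is computed; as a minimal sketch (all names and the weight layout are assumptions, not the actual implementation), it could combine the parent-config edge weight and the parent failure -> mitigation edge weight, mirroring the formula given under select_best_config_orig:

```python
# Hypothetical sketch: estimate health for an un-run config by summing the
# weight on its parent-config -> config edge and its parent
# failure -> mitigation edge. The real ConfigurationGraph reads these from
# its edges; here they are passed in directly.
def expected_config_health(parent_config_weight: float,
                           parent_mitigation_weight: float) -> float:
    """Expected health score, ignoring any observed score for this config."""
    return parent_config_weight + parent_mitigation_weight

print(expected_config_health(0.5, 0.25))  # -> 0.75
```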

create_config_png(file_path)[source]
Parameters:

file_path (str)

create_html(file_path)[source]

Create an interactive HTML page of the graph.

Parameters:

file_path (str): The file path where the HTML page will be saved.

create_png(file_path)[source]

Create a PNG image of the graph with enhanced visual features.

Parameters:

file_path (str): The file path where the PNG image will be saved.

static determine_edge_type(from_node, to_node)[source]
Parameters:

  • from_node (GraphNode)

  • to_node (GraphNode)

static find_delta(derived, parent, prefix='')[source]

Given two dicts, create a string representation of the difference between them.
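A rough sketch of such a recursive dict diff (the output format and helper shown here are hypothetical, not the actual implementation) might look like:

```python
def find_delta(derived: dict, parent: dict, prefix: str = "") -> str:
    """Sketch of a recursive dict diff rendered as a string. Keys added,
    removed, or changed in `derived` relative to `parent` are reported,
    with nested keys shown via a dotted prefix."""
    parts = []
    for key in sorted(set(derived) | set(parent)):
        path = f"{prefix}{key}"
        if key not in parent:
            parts.append(f"+{path}={derived[key]!r}")
        elif key not in derived:
            parts.append(f"-{path}")
        elif isinstance(derived[key], dict) and isinstance(parent[key], dict):
            nested = find_delta(derived[key], parent[key], prefix=f"{path}.")
            if nested:
                parts.append(nested)
        elif derived[key] != parent[key]:
            parts.append(f"{path}: {parent[key]!r} -> {derived[key]!r}")
    return ", ".join(parts)

print(find_delta({"env": {"TERM": "xterm"}, "init": "/bin/sh"},
                 {"env": {"TERM": "vt100"}, "init": "/bin/sh"}))
# -> env.TERM: 'vt100' -> 'xterm'
```

The prefix argument threads the dotted path through recursive calls, which is presumably why it appears in the signature.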

find_unexplored_configurations(exclude=None, potential=None)[source]

Find all configurations that have not been run yet.

If exclude is provided, we’ll exclude those configurations from the search. If potential is set, we’ll only consider direct descendants of that parent.

Returns:

A list of configurations that have not yet been run.

Return type:

List[Configuration]

get_all_configurations()[source]

Get all configurations in our graph

Return type:

List[Configuration]

get_best_run_configuration()[source]

Find the configuration with the highest health score - not estimated, actual.

Return type:

Configuration | None

get_child_configs(config)[source]
Parameters:

config (Configuration)

Return type:

List[Configuration]

get_config_depth(config)[source]

Given a config, find its depth in the graph

Parameters:

config (Configuration)

Return type:

int

get_existing_node(new_node)[source]
Parameters:

new_node (GraphNode)

Return type:

GraphNode | None

get_existing_node_or_self(new_node)[source]

Given a new node, return the existing node in the graph if one exists; otherwise return the new node. Matching is done via the node object’s hash.

Parameters:

new_node (GraphNode)

Return type:

GraphNode

get_node(node_id)[source]

Get a node from the graph.

Parameters:

node_id (UUID): The ID of the node to retrieve.

Returns:

The node with the given ID.

Return type:

GraphNode

get_parent_config(config)[source]

Given a config, find its parent config. Returns None if it’s the root config

Parameters:

config (Configuration)

Return type:

Configuration | None

get_parent_failure(config)[source]

Given a config, find its parent failure. Returns None if it’s the root config

Parameters:

config (Configuration)

Return type:

Failure | None

get_parent_mitigation(config)[source]

Given a config, find its parent mitigation. Returns None if it’s the root config

Parameters:

config (Configuration)

Return type:

Mitigation | None

get_root_config()[source]
Return type:

Configuration | None

has_edge(from_node, to_node)[source]
Parameters:

  • from_node (GraphNode)

  • to_node (GraphNode)

has_node(node)[source]
Parameters:

node (GraphNode)

mitigations_for(failure)[source]

Given a failure, return a list of mitigations that could be applied

node_has_predecessor(node)[source]

Check if a given node has a predecessor

Parameters:

node (GraphNode)

report_config_run(config, health_score)[source]

After we’ve run a configuration we have its health score.

For all but the root config, there is a chain: parent config -> failure -> mitigation -> this config, and there is also a direct edge from parent config -> this config. We make two weight updates: at the parent config -> this config edge we set the weight directly, since this edge is only considered once. At the failure -> mitigation edge we append the score to a list of weights and update the edge weight to the average of that list, because mitigations are tested multiple times.

The goal is to tune our weights such that 1) from the parent config, we can select child configs with high health scores, and 2) from the failure, we can select mitigations with high health scores.
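The two update rules could be sketched as follows (a minimal stand-in edge store with hypothetical names; the real class stores these weights on graph edges):

```python
from statistics import mean

# Hypothetical flat edge store standing in for the graph's edges.
edge_weight: dict[tuple[str, str], float] = {}
# Observed scores per failure -> mitigation edge, kept so the edge weight
# can be re-averaged as the mitigation is tested under multiple parents.
mitigation_scores: dict[tuple[str, str], list[float]] = {}

def report_config_run(parent_cfg: str, failure: str, mitigation: str,
                      config: str, health_score: float) -> None:
    # parent config -> this config: considered once, so set directly
    edge_weight[(parent_cfg, config)] = health_score
    # failure -> mitigation: append the score and re-average
    scores = mitigation_scores.setdefault((failure, mitigation), [])
    scores.append(health_score)
    edge_weight[(failure, mitigation)] = mean(scores)

report_config_run("cfg0", "missing_dev", "add_pseudofile", "cfg1", 0.75)
report_config_run("cfg0", "missing_dev", "add_pseudofile", "cfg2", 0.25)
print(edge_weight[("missing_dev", "add_pseudofile")])  # -> 0.5
```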

Parameters:

  • config (Configuration)

  • health_score (float)

save_graph(file_path)[source]

Save the graph to a file using pickle.

Parameters:

file_path (str): The file path where the graph should be saved.

set_cc_edge_weight(from_node, to_node, new_weight)[source]

Set the weight of an edge between two configurations in the graph.

Parameters:

  • from_node (str): The starting node of the edge.

  • to_node (str): The ending node of the edge.

  • new_weight (float): The new weight to assign to the edge.

update_parent_fail_mit_weight(config, health_score)[source]
Parameters:

  • config (Configuration)

  • health_score (float)

class penguin.graphs.ConfigurationManager(base_config)[source]

Bases: object

Parameters:

base_config (Configuration)

calculate_config_depth(cc)[source]
run_configuration(config, weight, run_config_f, find_mitigations_f, find_new_configs_f, logger=None)[source]

Run a given configuration to get a list of failures and a health score. Update the graph with the new information to set weights, and add new failures and mitigations to the graph.

run_exploration_cycle(run_config_f, find_mitigations_f, find_new_configs_f, logger=None)[source]

Get the best config and run it. Hold the lock while selecting. While running, ensure the config is in self.pending_runs.

select_best_config()[source]

First, try finding an un-run, non-pending config that’s derived from a mitigation we’ve never run before. Prioritize by expected health score.

Priority: init > pseudofile > netdev > signals > env vars.

If we’ve run every mitigation, just select the best config based on expected health score.

Return type:

Tuple[Configuration | None, float]
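The selection policy could be sketched like this (the candidate records and field names are hypothetical; the real method reads this information from the graph):

```python
# Lower number = higher priority, per the documented ordering.
TYPE_PRIORITY = {"init": 0, "pseudofile": 1, "netdev": 2,
                 "signals": 3, "env": 4}

def select_best_config(candidates):
    """candidates: dicts with 'mitigation_type', 'mitigation_ran',
    and 'expected_health' keys. Returns (best_candidate, score)."""
    fresh = [c for c in candidates if not c["mitigation_ran"]]
    if fresh:
        # Prefer configs derived from never-run mitigations, ordered by
        # type priority, then by expected health score.
        fresh.sort(key=lambda c: (TYPE_PRIORITY.get(c["mitigation_type"], 99),
                                  -c["expected_health"]))
        best = fresh[0]
    else:
        # Every mitigation has run: fall back to the best expected score.
        best = max(candidates, key=lambda c: c["expected_health"])
    return best, best["expected_health"]

cands = [
    {"mitigation_type": "env", "mitigation_ran": False, "expected_health": 0.9},
    {"mitigation_type": "init", "mitigation_ran": False, "expected_health": 0.5},
    {"mitigation_type": "netdev", "mitigation_ran": True, "expected_health": 0.95},
]
print(select_best_config(cands)[0]["mitigation_type"])  # -> init
```

Note the init config wins despite its lower expected score, because the type priority is applied before the score among never-run mitigations.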

select_best_config_llm()[source]

TODO

Return type:

Tuple[Configuration | None, float]

select_best_config_orig()[source]

— DEPRECATED —

Select the best configuration to run next. Node can’t have been run before

Just return the first unexplored config for now

For each un-run node, we look at its parent config and parent mitigation. We can calculate an expected weight based on these two as: expected_weight = parent_config_weight + parent_mitigation_weight

We support biasing this calculation when we get results that are better than expected from equally-likely-to-be-good mitigations. E.g., if we add two inits and run just one, we bias the health score of the other to be comparably weighted instead of falling behind and never getting run.

We do this for inits, dynamically-discovered env vars, and ioctl models.

XXX: We assume these nodes all show up at the same time - i.e., after an exclusive node runs. If that changes later, we’ll need to track these better.

Call with self.lock held!

Return type:

Tuple[Configuration | None, float]

stringify_state()[source]

Return a string representation of the current graph.

Organize configs by run idx. Report deltas from the parent. Report score/pending/estimated for each config.

stringify_state2()[source]

Return a string representation of the current graph. Each node gets its own line; under each node, we indent and list out its edges.

class penguin.graphs.Failure(name, type, info=None, patch_name=None)[source]

Bases: GraphNode

to_dict()[source]
class penguin.graphs.GraphNode(name)[source]

Bases: object

Base class for all graph nodes

to_dict()[source]
class penguin.graphs.Mitigation(name, type, info=None, exclusive=None, patch=None, failure_name=None)[source]

Bases: GraphNode

penguin.graphs.get_global_mitigation_weight(mitigation_type)[source]

Global hyperparameter for how much we should prioritize a given mitigation type. Intuition is that we should check all inits.

With the new approach of searching across a generation, we’ll always cover the whole init generation. So now we can leave these all equal?

TODO: it seems like pseudofiles are generally more important than env variables or blocking signals. Perhaps our search should be more of a:

If any untested pseudofile failure mitigations - select
If any untested env failure mitigations - select (dynval, then apply)
If any untested signal failure mitigations - select

Then select highest estimated scores of remaining un-run nodes.

This would ensure we create devices (straightforward and often good) before we go into expensive dynval tests that infrequently work.

Parameters:

mitigation_type (str)

Return type:

float
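The search order proposed in the TODO could be sketched as follows (all names and data shapes here are hypothetical, not part of the module):

```python
# Categories checked in order; the first one with untested mitigations wins.
CATEGORY_ORDER = ["pseudofile", "env", "signal"]

def pick_next(untested_by_type: dict, remaining_by_score: list) -> str:
    """untested_by_type: mitigation-type -> list of untested mitigation names.
    remaining_by_score: (config_id, estimated_score) pairs for un-run nodes."""
    for category in CATEGORY_ORDER:
        if untested_by_type.get(category):
            return untested_by_type[category][0]
    # No untested mitigations left: take the highest estimated score.
    return max(remaining_by_score, key=lambda pair: pair[1])[0]

print(pick_next({"env": ["set_PATH"]}, [("cfg9", 0.7)]))  # -> set_PATH
```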

penguin.graphs.run_test()[source]

Use stubs to simulate running configs, identifying failures, finding mitigations, and applying them.