frequent

Frequent subgraph pattern mining via the SPMiner Docker service. Requires the [frequent] extra (pip install "fSTG-Toolkit[frequent]").

patterns

Pattern data structures.

class fstg_toolkit.frequent.patterns.FrequentPattern(*args, backend=None, **kwargs)[source]

Bases: DiGraph

A directed graph representing a frequent subgraph pattern.

A frequent pattern is a recurring subgraph structure discovered from spatio-temporal graphs. It extends NetworkX’s DiGraph to represent the pattern’s topology, node attributes (e.g., brain regions), and edge attributes (e.g., temporal transitions).

__init__(graph: DiGraph)[source]

Initialize the FrequentPattern.

Parameters:

graph (nx.DiGraph) – A directed graph to wrap as a frequent pattern.

static from_dict(graph_dict: dict[str, Any]) FrequentPattern[source]

class fstg_toolkit.frequent.patterns.FrequentPatterns(patterns: dict[str, FrequentPattern])[source]

Bases: object

A collection of frequent subgraph patterns for a single subject or group.

Stores multiple frequent patterns discovered from a spatio-temporal graph dataset, each with a unique identifier. This immutable container allows iteration over the patterns and retrieval of their count.

patterns

A mapping from pattern identifiers to FrequentPattern objects.

Type:

dict[str, FrequentPattern]

__iter__() Iterator[tuple[str, FrequentPattern]][source]

Iterate over pattern identifiers and objects.

Yields:

tuple[str, FrequentPattern] – Tuples of (pattern_id, pattern) for each pattern in the collection.

__len__() int[source]

Return the number of patterns in this collection.

Returns:

The count of distinct patterns.

Return type:

int
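The container behaviour described above (iteration over (pattern_id, pattern) pairs plus a length) can be sketched with a frozen dataclass. This is only an illustration: PatternCollection is a hypothetical stand-in, not the toolkit's class, and plain strings replace real FrequentPattern objects.

```python
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass(frozen=True)
class PatternCollection:
    """Hypothetical stand-in for FrequentPatterns (not the toolkit's class)."""

    patterns: dict[str, Any]

    def __iter__(self) -> Iterator[tuple[str, Any]]:
        # Yield (pattern_id, pattern) pairs, mirroring the documented behaviour.
        return iter(self.patterns.items())

    def __len__(self) -> int:
        # Number of distinct pattern identifiers.
        return len(self.patterns)


collection = PatternCollection({"p1": "pattern-one", "p2": "pattern-two"})
pairs = list(collection)
```

Note that frozen=True only blocks reassignment of the patterns attribute; the dictionary it refers to can still be mutated in this sketch.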

patterns: dict[str, FrequentPattern]

class fstg_toolkit.frequent.patterns.FrequentPatternsPopulationAnalysis(patterns: dict[tuple[str, ...], FrequentPatterns], ids_names: tuple[str], equivalence_strategy: Type[PatternEquivalenceStrategy])[source]

Bases: object

Analyze frequent patterns across a population using an equivalence strategy.

Identifies unique patterns in a multi-subject dataset and tracks which subjects/groups contain each unique pattern, using a specified equivalence criterion to group structurally similar patterns.

__init__(patterns: dict[tuple[str, ...], FrequentPatterns], ids_names: tuple[str], equivalence_strategy: Type[PatternEquivalenceStrategy])[source]

Initialize population analysis.

Parameters:
  • patterns (dict[tuple[str, ...], FrequentPatterns]) – Dictionary mapping subject/group ID tuples to their frequent patterns.

  • ids_names (tuple[str]) – Names of the ID dimensions (e.g., ("subject", "session")).

  • equivalence_strategy (Type[PatternEquivalenceStrategy]) – Strategy class to determine if two patterns are equivalent.

get_counts(factors: list[str]) DataFrame[source]

Count occurrences of each unique pattern, optionally grouped by factors.

Aggregates the tracking data to compute how many subjects/groups contain each unique pattern. If factors are specified, counts are computed separately for each combination of factor values.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by (e.g., ['session']). Pass an empty list to get counts across all subjects.

Returns:

A DataFrame with unique pattern indices as rows and a 'Count' column containing the number of subjects with each pattern. If factors are provided, the result is multi-indexed by the factor columns and the pattern index 'idx'.

Return type:

pd.DataFrame
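The grouping logic can be illustrated without pandas. The sketch below assumes a hypothetical tracking table with one row per (subject, session, unique pattern index); the column names and the count_patterns helper are illustrative, not the toolkit's internals.

```python
from collections import Counter

# Hypothetical tracking rows: one row per (subject, session, unique pattern index).
tracking = [
    {"subject": "s1", "session": "a", "idx": 0},
    {"subject": "s1", "session": "a", "idx": 1},
    {"subject": "s2", "session": "a", "idx": 0},
    {"subject": "s2", "session": "b", "idx": 0},
]


def count_patterns(rows, factors):
    """Count rows per (factor values..., idx); an empty factor list pools all rows."""
    counts = Counter()
    for row in rows:
        key = tuple(row[f] for f in factors) + (row["idx"],)
        counts[key] += 1
    return dict(counts)


overall = count_patterns(tracking, [])
by_session = count_patterns(tracking, ["session"])
```

With an empty factor list the key is just the pattern index; with ["session"] each key is (session, idx), which corresponds to the multi-indexed result described above.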

get_occurrence_histogram(factors: list[str]) DataFrame[source]

Build a histogram of pattern occurrence counts, optionally grouped by factors.

Computes how many patterns share the same occurrence count. For example, if 5 patterns each appear in exactly 3 subjects, the histogram will have a row with Occurrences=3, Patterns=5.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by. Pass an empty list to get a single histogram.

Returns:

A DataFrame with columns [Occurrences, Patterns, PatternIndices, *factors], where PatternIndices is a sorted comma-separated string of 1-based pattern indices that have that occurrence count.

Return type:

pd.DataFrame
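As a rough illustration of the histogram, the sketch below groups patterns by their occurrence count. The input dictionary, the occurrence_histogram helper, and the ", " separator in PatternIndices are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical input: number of subjects containing each unique pattern (0-based index).
occurrences = {0: 3, 1: 3, 2: 1, 3: 3, 4: 1}


def occurrence_histogram(counts):
    """Group patterns by occurrence count; report 1-based indices as a string."""
    by_occurrence = defaultdict(list)
    for idx, n in counts.items():
        by_occurrence[n].append(idx + 1)  # the documented indices are 1-based
    return [
        {
            "Occurrences": n,
            "Patterns": len(indices),
            "PatternIndices": ", ".join(str(i) for i in sorted(indices)),
        }
        for n, indices in sorted(by_occurrence.items())
    ]


histogram = occurrence_histogram(occurrences)
```

Here three patterns appear in exactly 3 subjects and two patterns appear in exactly 1 subject, so the histogram has two rows.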

get_pattern_co_occurrence(factors: list[str]) dict[tuple[str, ...], list[list[int]]][source]

Compute pattern co-occurrence matrices, optionally grouped by factors.

For each subject in a factor group, finds all pattern indices the subject has, then increments the co-occurrence counter for every pair of patterns.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by. Pass an empty list to get a single co-occurrence matrix.

Returns:

A dictionary mapping factor-group tuples (or ('',) if no factors) to a symmetric 2D list of size len(unique_patterns), where cell (i, j) is the number of subjects that have both pattern i and pattern j.

Return type:

dict[tuple[str, ...], list[list[int]]]
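The pairwise counting step for a single factor group might look like the following sketch. The input mapping and the helper name are hypothetical, indices are 0-based here, and leaving the diagonal at zero is an assumption, since the section does not say how a pattern is counted against itself.

```python
from itertools import combinations

# Hypothetical input: the set of unique pattern indices (0-based) each subject has.
subject_patterns = {"s1": {0, 1, 2}, "s2": {0, 2}, "s3": {1}}
n_patterns = 3


def pattern_co_occurrence(per_subject, size):
    """Count, for every pair of patterns, the subjects that have both."""
    matrix = [[0] * size for _ in range(size)]
    for indices in per_subject.values():
        for i, j in combinations(sorted(indices), 2):
            matrix[i][j] += 1
            matrix[j][i] += 1  # mirror the update to keep the matrix symmetric
    return matrix


co_matrix = pattern_co_occurrence(subject_patterns, n_patterns)
```

Patterns 0 and 2 are shared by subjects s1 and s2, so the corresponding cells hold 2.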

get_pattern_complexity(factors: list[str]) DataFrame[source]

Compute pattern complexity (node count) distribution, optionally grouped by factors.

For each unique pattern, computes its size as the number of nodes. The size is weighted by the number of subjects that have the pattern in each factor group.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by. Pass an empty list to get counts across all subjects.

Returns:

A DataFrame with columns [Size, Count, PatternIndices, *factors], where PatternIndices is a sorted comma-separated string of 1-based pattern indices that have that node count.

Return type:

pd.DataFrame
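One way to read the weighting is that each pattern contributes its subject count to the bucket for its node count. The sketch below follows that reading for a single factor group; the inputs, the helper name, and the ", " separator are assumptions.

```python
from collections import defaultdict

# Hypothetical inputs, keyed by 0-based unique pattern index.
pattern_sizes = {0: 3, 1: 4, 2: 3}     # number of nodes per pattern
pattern_subjects = {0: 2, 1: 5, 2: 1}  # number of subjects having each pattern


def complexity_distribution(sizes, subjects):
    """Sum subject counts per pattern size and list the 1-based pattern indices."""
    totals = defaultdict(int)
    members = defaultdict(list)
    for idx, size in sizes.items():
        totals[size] += subjects.get(idx, 0)
        members[size].append(idx + 1)
    return [
        {
            "Size": size,
            "Count": totals[size],
            "PatternIndices": ", ".join(str(i) for i in sorted(members[size])),
        }
        for size in sorted(totals)
    ]


distribution = complexity_distribution(pattern_sizes, pattern_subjects)
```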

get_patterns_per_region(factors: list[str]) DataFrame[source]

Count pattern occurrences per brain region, optionally grouped by factors.

For each unique pattern, extracts all regions present in its nodes. Each region occurrence is weighted by the number of subjects that have the pattern in the given factor group.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by. Pass an empty list to get counts across all subjects.

Returns:

A DataFrame with columns [Region, Count, PatternIndices, *factors], where PatternIndices is a sorted comma-separated string of 1-based pattern indices that contain at least one node in that region.

Return type:

pd.DataFrame

get_region_co_occurrence(factors: list[str]) dict[tuple[str, ...], tuple[list[str], list[list[int]]]][source]

Compute region co-occurrence matrices from spatial edges, optionally grouped by factors.

For each spatial edge (no transition attribute, connecting different regions), records the sorted region pair. Counts are weighted by the number of subjects that have the pattern in each factor group.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by. Pass an empty list to get a single co-occurrence matrix.

Returns:

A dictionary mapping factor-group tuples (or ('',) if no factors) to a tuple of (region_labels_sorted, symmetric_2d_list) where the 2D list contains co-occurrence counts between regions.

Return type:

dict[tuple[str, ...], tuple[list[str], list[list[int]]]]
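The edge filtering described above can be sketched for a single pattern and a single factor group as follows. The node and edge layout, the "eq" transition label, and the choice to build the label list from all nodes of the pattern are assumptions made for illustration.

```python
# Hypothetical pattern: each node has a region, each edge has an attribute dict.
node_regions = {0: "V1", 1: "M1", 2: "V1", 3: "PFC"}
edges = [
    (0, 1, {}),                    # spatial edge between V1 and M1
    (0, 2, {"transition": "eq"}),  # temporal edge, skipped
    (1, 3, {}),                    # spatial edge between M1 and PFC
    (2, 0, {}),                    # spatial edge within V1, skipped
]


def region_co_occurrence(regions, pattern_edges, weight):
    """Accumulate weighted counts for spatial edges linking different regions."""
    labels = sorted(set(regions.values()))
    position = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for u, v, attrs in pattern_edges:
        if "transition" in attrs:
            continue  # temporal edges do not count
        a, b = sorted((regions[u], regions[v]))
        if a == b:
            continue  # same-region edges do not count
        matrix[position[a]][position[b]] += weight
        matrix[position[b]][position[a]] += weight
    return labels, matrix


# weight stands for the number of subjects that have this pattern.
region_labels, region_matrix = region_co_occurrence(node_regions, edges, weight=2)
```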

get_temporal_dynamics(factors: list[str]) DataFrame[source]

Extract temporal edge dynamics per region, optionally grouped by factors.

For each unique pattern, extracts temporal edges (those with a transition attribute) and records the source node’s region and the transition type. Counts are weighted by the number of subjects that have the pattern in each factor group.

Parameters:

factors (list[str]) – Column names from the tracking DataFrame to group by. Pass an empty list to get counts across all subjects.

Returns:

A DataFrame with columns [Region, Transition, Count, PatternIndices, *factors], where PatternIndices is a sorted comma-separated string of 1-based pattern indices that have at least one temporal edge matching that region and transition type.

Return type:

pd.DataFrame
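A minimal sketch of the temporal-edge extraction, again for one pattern in one factor group. The transition labels "eq" and "inc" and the helper name are hypothetical; only the rule (edges with a transition attribute, keyed by the source node's region) comes from the description above.

```python
from collections import Counter

# Hypothetical pattern: each node has a region, each edge has an attribute dict.
regions = {0: "V1", 1: "V1", 2: "M1", 3: "M1"}
pattern_edges = [
    (0, 1, {"transition": "eq"}),   # temporal edge starting in V1
    (2, 3, {"transition": "inc"}),  # temporal edge starting in M1
    (0, 2, {}),                     # spatial edge, skipped
]


def temporal_dynamics(node_regions, edges, weight):
    """Count (source region, transition) pairs, weighted by subject count."""
    counts = Counter()
    for source, _target, attrs in edges:
        if "transition" in attrs:
            counts[(node_regions[source], attrs["transition"])] += weight
    return dict(counts)


# weight stands for the number of subjects that have this pattern.
dynamics = temporal_dynamics(regions, pattern_edges, weight=3)
```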

class fstg_toolkit.frequent.patterns.PatternEquivalenceStrategy[source]

Bases: ABC

Abstract base class for pattern equivalence comparison strategies.

Defines the interface for determining whether two frequent patterns are equivalent under different criteria (structure only, with transitions, etc.).

abstractmethod classmethod equivalent(p1: FrequentPattern, p2: FrequentPattern) bool[source]

Determine if two patterns are equivalent under this strategy.

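Parameters:
  • p1 (FrequentPattern) – The first pattern to compare.

  • p2 (FrequentPattern) – The second pattern to compare.

Returns: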

True if patterns are equivalent, False otherwise.

Return type:

bool

class fstg_toolkit.frequent.patterns.PatternEquivalenceStrategyRegistry[source]

Bases: object

Registry for PatternEquivalenceStrategy implementations.

Strategies self-register via the @PatternEquivalenceStrategyRegistry.register(name) decorator. Look up by the registered name string.

classmethod get(name: str) Type[PatternEquivalenceStrategy][source]

Look up a strategy class by its registered name.

Parameters:

name (str) – The registered name key.

Returns:

The registered strategy class.

Return type:

type[PatternEquivalenceStrategy]

Raises:

KeyError – If no strategy is registered under this name.

classmethod names() list[str][source]

Return the names of all registered strategies.

Returns:

Sorted list of registered strategy name keys.

Return type:

list[str]

classmethod register(name: str) Callable[[Type[PatternEquivalenceStrategy]], Type[PatternEquivalenceStrategy]][source]

Class decorator factory that registers a strategy under the given name.

Parameters:

name (str) – The name key to register the strategy under.

Returns:

Decorator that stores the class and returns it unchanged.

Return type:

Callable
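The register/get/names interface follows a common decorator-registry idiom. The sketch below shows that idiom with a hypothetical StrategyRegistry class and a placeholder strategy; it is not the toolkit's implementation, and the name "structure" is an arbitrary example key.

```python
from typing import Callable


class StrategyRegistry:
    """Hypothetical stand-in for PatternEquivalenceStrategyRegistry."""

    _strategies: dict[str, type] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[type], type]:
        def decorator(strategy: type) -> type:
            # Store the class under the given name and return it unchanged.
            cls._strategies[name] = strategy
            return strategy

        return decorator

    @classmethod
    def get(cls, name: str) -> type:
        # A plain dictionary lookup raises KeyError for unknown names.
        return cls._strategies[name]

    @classmethod
    def names(cls) -> list[str]:
        return sorted(cls._strategies)


@StrategyRegistry.register("structure")
class StructureOnly:
    """Placeholder strategy used only to demonstrate registration."""


looked_up = StrategyRegistry.get("structure")
```

Because the decorator returns the class unchanged, StructureOnly remains usable under its own name as well as through the registry.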

class fstg_toolkit.frequent.patterns.PatternStructure[source]

Bases: PatternEquivalenceStrategy

Equivalence strategy based on graph structure only.

Two patterns are equivalent if they are isomorphic as directed graphs, regardless of node or edge attributes.

classmethod equivalent(p1: FrequentPattern, p2: FrequentPattern) bool[source]

Determine if two patterns are equivalent under this strategy.

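Parameters:
  • p1 (FrequentPattern) – The first pattern to compare.

  • p2 (FrequentPattern) – The second pattern to compare.

Returns: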

True if patterns are equivalent, False otherwise.

Return type:

bool
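To make "isomorphic as directed graphs" concrete, the sketch below checks structure-only equivalence by brute force over node relabelings. It is a teaching example for very small graphs and not the toolkit's method; the section does not show the implementation, and NetworkX offers is_isomorphic for real use. Edge collections are assumed to be sets of (source, target) pairs.

```python
from itertools import permutations


def digraphs_isomorphic(nodes1, edges1, nodes2, edges2):
    """Brute-force directed-graph isomorphism; attributes are ignored."""
    if len(nodes1) != len(nodes2) or len(edges1) != len(edges2):
        return False
    target = set(edges2)
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        # The relabeling works if it maps the first edge set onto the second.
        if {(mapping[u], mapping[v]) for u, v in edges1} == target:
            return True
    return False


chain = ([0, 1, 2], {(0, 1), (1, 2)})
relabeled_chain = (["x", "y", "z"], {("y", "z"), ("z", "x")})
fork = (["x", "y", "z"], {("x", "y"), ("x", "z")})

same_structure = digraphs_isomorphic(*chain, *relabeled_chain)
different_structure = digraphs_isomorphic(*chain, *fork)
```

The two chains differ only in node names, so they count as equivalent; the fork has the same number of nodes and edges but a different shape.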

class fstg_toolkit.frequent.patterns.PatternStructureRegionsTransitions[source]

Bases: PatternEquivalenceStrategy

Equivalence strategy based on exact structure including regions and transitions.

Two patterns are equivalent only if they have identical nodes and edges with all their attributes (regions and transitions).

classmethod equivalent(p1: FrequentPattern, p2: FrequentPattern) bool[source]

Determine if two patterns are equivalent under this strategy.

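Parameters:
  • p1 (FrequentPattern) – The first pattern to compare.

  • p2 (FrequentPattern) – The second pattern to compare.

Returns: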

True if patterns are equivalent, False otherwise.

Return type:

bool

class fstg_toolkit.frequent.patterns.PatternStructureTransitions[source]

Bases: PatternEquivalenceStrategy

Equivalence strategy based on structure and edge transitions.

Two patterns are equivalent if they are isomorphic and all corresponding edges have the same transition attributes.

classmethod equivalent(p1: FrequentPattern, p2: FrequentPattern) bool[source]

Determine if two patterns are equivalent under this strategy.

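Parameters:
  • p1 (FrequentPattern) – The first pattern to compare.

  • p2 (FrequentPattern) – The second pattern to compare.

Returns: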

True if patterns are equivalent, False otherwise.

Return type:

bool

spminer

SPMiner Docker service integration.

class fstg_toolkit.frequent.spminer.SPMinerService[source]

Bases: object

Service wrapper for the SPMiner frequent subgraph pattern miner.

Manages the Docker image lifecycle (build/load on demand) and provides a simple interface to run the miner on a directory of input graphs and collect the results.

__init__()[source]

Initialise the service by connecting to Docker.

Raises:

RuntimeError – If Docker is not available on the host system.

prepare()[source]

Build or load the SPMiner Docker image if it is not already loaded.

The image is built from the spminer/ submodule located next to this package. Subsequent calls are no-ops if the image is already loaded.

run(input_dir: Path, output_dir: Path)[source]

Run the SPMiner container on a directory of graph files.

Mounts input_dir as read-only and output_dir as read-write inside the container. Progress updates are yielded as they arrive.

Parameters:
  • input_dir (Path) – Directory containing the input graph files.

  • output_dir (Path) – Directory where the miner will write its output.

Yields:

tuple[int, int] – (completed, total) progress tuples parsed from container stdout.
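Since run() yields progress tuples, callers would typically drive it with a loop. The sketch below shows one way to consume such a stream; fake_run is a hypothetical stand-in so the example works without Docker, and format_progress is an illustrative helper.

```python
from typing import Iterable, Iterator


def fake_run() -> Iterator[tuple[int, int]]:
    """Hypothetical stand-in for SPMinerService.run(); yields (completed, total)."""
    total = 4
    for completed in range(1, total + 1):
        yield completed, total


def format_progress(updates: Iterable[tuple[int, int]]) -> list[str]:
    """Turn (completed, total) tuples into human-readable progress lines."""
    lines = []
    for completed, total in updates:
        percent = 100 * completed // total if total else 0
        lines.append(f"{completed}/{total} ({percent}%)")
    return lines


progress_lines = format_progress(fake_run())
```

If run() is implemented as a generator function, as the Yields entry suggests, the mining only advances while the returned iterator is being consumed, so the loop should run to completion.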