Data Model

ZIP Archive Format

fSTG Toolkit stores datasets as ZIP archives (.zip files). Each archive is self-contained and can hold one or more spatio-temporal graphs together with their associated data.

Archive Contents

my_graphs.zip
├── areas.csv                    # Brain area/region definitions
├── matrices/
│   ├── subject1.npz             # Raw correlation matrices (optional)
│   └── subject2.npz
├── graphs/
│   ├── subject1.json            # Serialised spatio-temporal graph
│   └── subject2.json
├── metrics/
│   ├── local.csv                # Spatial metrics (per node, per time step)
│   └── global.csv               # Temporal metrics (per graph)
└── patterns/
    ├── pattern_0.json           # Frequent subgraph patterns (if mined)
    └── pattern_1.json

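Files are written only when the corresponding processing step has been run. A freshly built archive contains only areas.csv and graphs/, plus matrices/ when the raw correlation matrices were saved; metrics/ and patterns/ appear once the metrics and pattern-mining steps have been run.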
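To check which processing steps have left files in a given archive, the layout above can be inspected with the standard library alone. The following is a minimal sketch, not part of fSTG Toolkit: the helper name `summarise_archive` is hypothetical, and the tiny in-memory archive with placeholder contents exists only to make the example self-contained.

```python
import io
import zipfile

# Top-level entries taken from the layout above.
EXPECTED = ("areas.csv", "matrices/", "graphs/", "metrics/", "patterns/")


def summarise_archive(source):
    """Return {entry: [member names]} for each expected top-level entry.

    Hypothetical helper for illustration; it only reads the ZIP directory.
    """
    with zipfile.ZipFile(source) as archive:
        # Skip explicit directory entries, keep real files only.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    summary = {}
    for entry in EXPECTED:
        if entry.endswith("/"):
            summary[entry] = sorted(n for n in names if n.startswith(entry))
        else:
            summary[entry] = [entry] if entry in names else []
    return summary


# Build a small stand-in archive in memory (placeholder contents only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("areas.csv", "placeholder")
    archive.writestr("graphs/subject1.json", "{}")
    archive.writestr("graphs/subject2.json", "{}")
buffer.seek(0)

summary = summarise_archive(buffer)
for entry, members in summary.items():
    status = "present" if members else "absent"
    print(f"{entry:<12} {status} ({len(members)} file(s))")
```

Passing a path such as 'my_graphs.zip' instead of the in-memory buffer works the same way, since `zipfile.ZipFile` accepts both.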

Graph JSON Format

Each graph is serialised as a JSON file using the NetworkX node_link_data format with additional fSTG-specific metadata:

{
  "graph": {"max_time": 9, "areas": [1, 2, 3]},
  "nodes": [
    {"id": "(1, 0)", "time": 0, "area": 1, "region": "Visual"},
    ...
  ],
  "links": [
    {"source": "(1, 0)", "target": "(2, 0)", "type": "spatial", "correlation": 0.72},
    {"source": "(1, 0)", "target": "(1, 1)", "type": "temporal", "rc5": "EQ"},
    ...
  ]
}
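The structure above can be read without the toolkit, which is handy for quick inspection. The sketch below uses only the standard library and makes two assumptions that the format description does not state explicitly: that node ids are string renderings of `(area, time)` tuples, and that the elided entries follow the same shape as the ones shown. The second and third nodes, including the region name "ExampleRegion", are invented so that both links have endpoints.

```python
import ast
import json

# A complete miniature document in the format shown above.
# Only the first node and the two links come from the example;
# the other two nodes are illustrative filler.
document = json.loads("""
{
  "graph": {"max_time": 1, "areas": [1, 2]},
  "nodes": [
    {"id": "(1, 0)", "time": 0, "area": 1, "region": "Visual"},
    {"id": "(2, 0)", "time": 0, "area": 2, "region": "ExampleRegion"},
    {"id": "(1, 1)", "time": 1, "area": 1, "region": "Visual"}
  ],
  "links": [
    {"source": "(1, 0)", "target": "(2, 0)", "type": "spatial", "correlation": 0.72},
    {"source": "(1, 0)", "target": "(1, 1)", "type": "temporal", "rc5": "EQ"}
  ]
}
""")


def parse_node_id(raw):
    """Turn a string id such as "(1, 0)" into the tuple (1, 0).

    Assumes ids are tuple literals; literal_eval never executes code.
    """
    return ast.literal_eval(raw)


# Index node attributes by their tuple id.
nodes = {parse_node_id(node["id"]): node for node in document["nodes"]}

# Group edges by the "type" attribute ("spatial" or "temporal").
links_by_type = {}
for link in document["links"]:
    edge = (parse_node_id(link["source"]), parse_node_id(link["target"]))
    links_by_type.setdefault(link["type"], []).append(edge)

print("spatial: ", links_by_type["spatial"])
print("temporal:", links_by_type["temporal"])
```

In this reading, spatial links connect two areas within one time step, while temporal links connect nodes in consecutive time steps.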

DataLoader and DataSaver

Two classes manage reading from and writing to archives:

`DataLoader`

fstg_toolkit.io.DataLoader provides lazy and eager loading:

from fstg_toolkit.io import DataLoader

loader = DataLoader('my_graphs.zip')

# List available graphs without loading them
names = loader.lazy_load_graphs()

# Load the areas description
areas = loader.load_areas()

# Load a specific graph
graph = loader.load_graph(areas, 'subject1')

# Load metrics
metrics = loader.load_metrics()

# Load frequent patterns
patterns = loader.load_frequent_patterns()

`DataSaver`

fstg_toolkit.io.DataSaver accumulates data in memory and writes it to a ZIP archive atomically:

from fstg_toolkit.io import DataSaver

# areas_df, graph1, graph2 and matrices_array are assumed to have been built beforehand
saver = DataSaver()
saver.add_areas(areas_df)
saver.add_graphs({'subject1': graph1, 'subject2': graph2})
saver.add_matrices({'subject1': matrices_array})

total_files, file_descriptions = saver.save('output.zip')

Calling save() on an existing archive merges new data into it rather than overwriting the whole file.
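To illustrate what merging into an existing archive while still writing atomically can look like, here is a standard-library sketch. It is not the DataSaver implementation, whose internals are not described here: the function `merge_into_archive` is hypothetical, and the approach (copy the untouched members into a temporary file, add the new ones, then swap the file in with `os.replace`) is just one common way to obtain both properties.

```python
import os
import tempfile
import zipfile


def merge_into_archive(path, new_members):
    """Write new_members (name -> bytes) into the ZIP at path.

    Existing members with other names are kept. The result is built in a
    temporary file in the same directory and moved into place in one step,
    so readers never see a half-written archive. Hypothetical helper.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=directory)
    os.close(fd)
    try:
        with zipfile.ZipFile(tmp_path, "w") as target:
            if os.path.exists(path):
                with zipfile.ZipFile(path) as source:
                    for name in source.namelist():
                        if name not in new_members:
                            target.writestr(name, source.read(name))
            for name, payload in new_members.items():
                target.writestr(name, payload)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    archive_path = os.path.join(workdir, "output.zip")
    # First call creates the archive, second call merges into it.
    merge_into_archive(archive_path, {"areas.csv": b"a", "graphs/subject1.json": b"{}"})
    merge_into_archive(archive_path, {"metrics/local.csv": b"m"})
    with zipfile.ZipFile(archive_path) as archive:
        merged_names = sorted(archive.namelist())

print(merged_names)
```

The temporary file is created next to the target because a rename is only atomic when source and destination are on the same filesystem.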

Public API Functions

The top-level module exports two convenience wrappers for single-graph workflows:

from fstg_toolkit import save_spatio_temporal_graph, load_spatio_temporal_graph

# Save a single graph (creates a ZIP archive with one graph)
save_spatio_temporal_graph(graph, 'my_graph.zip')

# Load a single graph (raises if the archive contains more than one)
graph = load_spatio_temporal_graph('my_graph.zip')

Data Flow

areas.csv + matrices.npz
        │
        ▼
spatio_temporal_graph_from_corr_matrices()   ← factory.py
        │  SpatioTemporalGraph objects
        ▼
DataSaver.save('output.zip')                 ← io.py
        │  ZIP archive
        ▼
DataLoader('output.zip')                     ← io.py
        │
        ├─ calculate_spatial_metrics()       ← metrics.py
        ├─ calculate_temporal_metrics()      ← metrics.py
        └─ SPMinerService.run()              ← frequent/spminer.py