ageas.n_kfold_selection

ageas.n_kfold_selection(hangar: Hangar, operation_name: str = 'trail', accelerator: str = 'cpu', cuda_devices: list = None, query_dataset=None, test_dataset=None, n_dataloader_workers: int = 1, using_model_types: list = None, using_model_list: list = None, kfold_selection_list: list = None, valid_fraction: float = 0.1, stratified_kfold_test: bool = False, stratified_kfold_valid: bool = False, oversample_method: str = None, oversample_by: str = 'median', monitor_type: str = 'max', monitor_metric: str = 'test.accuracy', retention_point: float = 0.9, cutoff_point: float = 0.7, selection_ratio: float = 0.5, skip_final: bool = False, seed: int = 42, verbose: bool = None) → Deck

N-iteration k-fold selection of top-performing models.

Each iteration performs a fresh k-fold cross-validation over the squad, ranks the units by monitor_metric and keeps the top selection_ratio plus any unit above retention_point, while discarding everything below cutoff_point. After all iterations a final pass (the “last mission”) retrains the survivors on the full dataset and applies a final retention filter.

Parameters:

hangar – Source hangar from which the operating squad is generated.
operation_name – Name under which the per-unit reports are stored.
accelerator – Accelerator hint, 'cpu' or 'cuda'.
cuda_devices – Optional GPU device indices. None uses all visible devices.
query_dataset – Dataset that is split by k-fold for training and validation.
test_dataset – Optional held-out test set used in the last mission.
n_dataloader_workers – Number of dataloader workers for the deck.
using_model_types – Whitelist of model types to include from the hangar.
using_model_list – Whitelist of explicit unit IDs to include from the hangar.
kfold_selection_list – Fold counts per selection round. [5] runs a single 5-fold round; [2, 3, 4] would run three rounds. Defaults to [5].
valid_fraction – Fraction of each training fold reserved for validation.
stratified_kfold_test – If True, the test fold is class-stratified.
stratified_kfold_valid – If True, the validation split is class-stratified.
oversample_method – Oversampling method (e.g. 'repeat'). None disables oversampling.
oversample_by – Target class size when oversampling: 'mean', 'median', or 'max'.
monitor_type – 'max' for higher-is-better metrics, 'min' for lower-is-better.
monitor_metric – Dotted metric used to rank units between rounds.
retention_point – Units at or above this metric value are retained unconditionally.
cutoff_point – Hard minimum: units below this value are dropped.
selection_ratio – Fraction of the squad to keep when ranked by the metric.
skip_final – If True, skip the last mission retraining pass.
seed – Random seed forwarded to the k-fold splitter.
verbose – If True, emit per-fold and per-round progress logs.

Returns:

The deck after selection with only the surviving units.