ageas.n_kfold_selection
- ageas.n_kfold_selection(hangar: Hangar, operation_name: str = 'trail', accelerator: str = 'cpu', cuda_devices: list = None, query_dataset=None, test_dataset=None, n_dataloader_workers: int = 1, using_model_types: list = None, using_model_list: list = None, kfold_selection_list: list = None, valid_fraction: float = 0.1, stratified_kfold_test: bool = False, stratified_kfold_valid: bool = False, oversample_method: str = None, oversample_by: str = 'median', monitor_type: str = 'max', monitor_metric: str = 'test.accuracy', retention_point: float = 0.9, cutoff_point: float = 0.7, selection_ratio: float = 0.5, skip_final: bool = False, seed: int = 42, verbose: bool = None) Deck
N-iteration k-fold selection of top-performing models.
Each iteration performs a fresh k-fold cross-validation over the squad, ranks the units by
monitor_metricand keeps the topselection_ratioplus any unit aboveretention_point, while discarding everything belowcutoff_point. After all iterations a final pass (the “last mission”) retrains the survivors on the full dataset and applies a final retention filter.- Parameters:
hangar – Source hangar from which the operating squad is generated.
operation_name – Name under which the per-unit reports are stored.
accelerator – Accelerator hint,
'cpu'or'cuda'.cuda_devices – Optional GPU device indices.
Noneuses all visible devices.query_dataset – Dataset that is split by k-fold for training and validation.
test_dataset – Optional held-out test set used in the last mission.
n_dataloader_workers – Number of dataloader workers for the deck.
using_model_types – Whitelist of model types to include from the hangar.
using_model_list – Whitelist of explicit unit IDs to include from the hangar.
kfold_selection_list – Fold counts per selection round.
[5]runs a single 5-fold round;[2, 3, 4]would run three rounds. Defaults to[5].valid_fraction – Fraction of each training fold reserved for validation.
stratified_kfold_test – If
True, the test fold is class-stratified.stratified_kfold_valid – If
True, the validation split is class-stratified.oversample_method – Oversampling method (e.g.
'repeat').Nonedisables oversampling.oversample_by – Target class size when oversampling:
'mean','median', or'max'.monitor_type –
'max'for higher-is-better metrics,'min'for lower-is-better.monitor_metric – Dotted metric used to rank units between rounds.
retention_point – Units at or above this metric value are retained unconditionally.
cutoff_point – Hard minimum: units below this value are dropped.
selection_ratio – Fraction of the squad to keep when ranked by the metric.
skip_final – If
True, skip the last mission retraining pass.seed – Random seed forwarded to the k-fold splitter.
verbose – If
True, emit per-fold and per-round progress logs.
- Returns:
The deck after selection with only the surviving units.