ageas.n_kfold_selection

ageas.n_kfold_selection(hangar: Hangar, operation_name: str = 'trail', accelerator: str = 'cpu', cuda_devices: list = None, query_dataset=None, test_dataset=None, n_dataloader_workers: int = 1, using_model_types: list = None, using_model_list: list = None, kfold_selection_list: list = None, valid_fraction: float = 0.1, stratified_kfold_test: bool = False, stratified_kfold_valid: bool = False, oversample_method: str = None, oversample_by: str = 'median', monitor_type: str = 'max', monitor_metric: str = 'test.accuracy', retention_point: float = 0.9, cutoff_point: float = 0.7, selection_ratio: float = 0.5, skip_final: bool = False, seed: int = 42, verbose: bool = None) → Deck

N-iteration k-fold selection of top-performing models.

Each iteration performs a fresh k-fold cross-validation over the squad and ranks the units by monitor_metric. It keeps the top selection_ratio fraction of the ranking plus any unit at or above retention_point, and discards every unit below cutoff_point. After all iterations a final pass (the “last mission”) retrains the survivors on the full dataset and applies a final retention filter.
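The per-round selection rule described above can be sketched in plain Python. This is only an illustration for a higher-is-better metric (monitor_type='max'), not the library's implementation: the helper name select_units, the unit IDs, rounding the quota up, and applying the cutoff as the last step are all assumptions.

```python
import math


def select_units(scores, selection_ratio=0.5, retention_point=0.9, cutoff_point=0.7):
    """Illustrative selection for a higher-is-better metric.

    scores maps a (hypothetical) unit ID to its metric value.
    Rounding the quota up and applying the cutoff last are assumptions.
    """
    # Rank units from best to worst.
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep the top fraction of the ranking.
    quota = math.ceil(len(ranked) * selection_ratio)
    keep = set(ranked[:quota])
    # Units at or above the retention point are kept regardless of rank.
    keep |= {unit for unit in ranked if scores[unit] >= retention_point}
    # Units below the cutoff are dropped even if they made the quota.
    return [unit for unit in ranked if unit in keep and scores[unit] >= cutoff_point]


scores = {"unit_a": 0.95, "unit_b": 0.92, "unit_c": 0.91, "unit_d": 0.75}
survivors = select_units(scores)
print(survivors)  # ['unit_a', 'unit_b', 'unit_c']
```

With these values the quota keeps unit_a and unit_b, unit_c survives through retention_point, and unit_d is dropped because it is neither in the top half nor at or above 0.9.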

Parameters:
  • hangar – Source hangar from which the operating squad is generated.

  • operation_name – Name under which the per-unit reports are stored.

  • accelerator – Accelerator hint, 'cpu' or 'cuda'.

  • cuda_devices – Optional GPU device indices. None uses all visible devices.

  • query_dataset – Dataset that is split by k-fold for training and validation.

  • test_dataset – Optional held-out test set used in the last mission.

  • n_dataloader_workers – Number of dataloader workers for the deck.

  • using_model_types – Whitelist of model types to include from the hangar.

  • using_model_list – Whitelist of explicit unit IDs to include from the hangar.

  • kfold_selection_list – Fold counts per selection round. [5] runs a single 5-fold round; [2, 3, 4] would run three rounds. Defaults to [5].

  • valid_fraction – Fraction of each training fold reserved for validation.

  • stratified_kfold_test – If True, the test fold is class-stratified.

  • stratified_kfold_valid – If True, the validation split is class-stratified.

  • oversample_method – Oversampling method (e.g. 'repeat'). None disables oversampling.

  • oversample_by – Target class size when oversampling: 'mean', 'median', or 'max'.

  • monitor_type – 'max' for higher-is-better metrics, 'min' for lower-is-better.

  • monitor_metric – Dotted metric used to rank units between rounds.

  • retention_point – Units at or above this metric value are retained unconditionally.

  • cutoff_point – Hard minimum: units below this value are dropped.

  • selection_ratio – Fraction of the squad to keep when ranked by the metric.

  • skip_final – If True, skip the last mission retraining pass.

  • seed – Random seed forwarded to the k-fold splitter.

  • verbose – If True, emit per-fold and per-round progress logs.
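To make the interaction of kfold_selection_list and valid_fraction concrete, the following sketch estimates the train, validation, and test sizes for one fold in each round. It is a rough calculation under stated assumptions, not the splitter used by the library: the helper split_sizes, the sample count of 1200, ignoring the k-fold remainder, and rounding the validation size are all hypothetical.

```python
def split_sizes(n_samples, k, valid_fraction=0.1):
    """Approximate (train, valid, test) sizes for one fold of a k-fold split."""
    n_test = n_samples // k                   # held-out fold, remainder ignored
    n_rest = n_samples - n_test               # training part of the fold
    n_valid = round(n_rest * valid_fraction)  # validation carved out of the training part
    return n_rest - n_valid, n_valid, n_test


# One entry per selection round, as in kfold_selection_list=[2, 3, 4].
sizes_per_round = {k: split_sizes(1200, k) for k in [2, 3, 4]}
for k, sizes in sizes_per_round.items():
    print(k, sizes)
# 2 (540, 60, 600)
# 3 (720, 80, 400)
# 4 (810, 90, 300)
```

Larger fold counts leave more data for training in each fold, so a schedule such as [2, 3, 4] spends less training data per unit in the early rounds, when the squad is largest.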

Returns:

The deck after selection with only the surviving units.
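A call might look like the sketch below. The keyword names come from the signature above; the values, the selection_kwargs dictionary, and the run_selection wrapper are illustrative assumptions. Building the hangar and the datasets is outside the scope of this section, so they are taken as arguments.

```python
# Keyword arguments taken from the signature above; the values are illustrative.
selection_kwargs = dict(
    operation_name="trail",
    accelerator="cpu",
    kfold_selection_list=[5, 3],   # a 5-fold round followed by a 3-fold round
    valid_fraction=0.1,
    stratified_kfold_test=True,
    monitor_type="max",
    monitor_metric="test.accuracy",
    retention_point=0.9,
    cutoff_point=0.7,
    selection_ratio=0.5,
    seed=42,
    verbose=True,
)


def run_selection(hangar, query_dataset, test_dataset=None):
    """Hypothetical wrapper: run the selection with the settings above.

    `hangar` and the datasets must be prepared by the caller.
    """
    # Imported lazily so the settings can be inspected without ageas installed.
    import ageas

    return ageas.n_kfold_selection(
        hangar,
        query_dataset=query_dataset,
        test_dataset=test_dataset,
        **selection_kwargs,
    )
```

Keeping cutoff_point below retention_point, as here, leaves a band of units whose survival is decided by their rank under selection_ratio.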