This is the demo code from iterative-stratification
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
import numpy as np
X = np.array([[1,2], [3,4], [1,2], [3,4], [1,2], [3,4], [1,2]])
y = np.array([[0,0], [0,0], [0,1], [0,1], [1,1], [1,1], [1,0]])
mskf = MultilabelStratifiedKFold(n_splits=2, shuffle=True, random_state=0)
for train_index, test_index in mskf.split(X, y):
print("TRAIN:", train_index, "TEST:", test_index)
X_train, X_test = X[train_index], X[test_index]
y_train, y_test = y[train_index], y[test_index]
And here is the result
TRAIN: [2 3 6] TEST: [0 1 4 5]
TRAIN: [0 1 4 5] TEST: [2 3 6]