Table 1 Performance of five feature selection strategies for identifying placental cell subpopulations on four machine learning models (Independent dataset)

From: A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery

| Base classifier | Feature selection | Number of features | Accuracy (%) | Precision (%) | Recall (%) | F1 measure (%) |
|---|---|---|---|---|---|---|
| KNN | PCA | 160 | 71.67 | 79.69 | 68.57 | 71.32 |
| RFC | PCA | 3000 | 91.13 | 93.43 | 90.80 | 91.87 |
| SVM | PCA | 1200 | 90.12 | 91.93 | 89.98 | 90.87 |
| XGBoost | PCA | 2000 | 92.11 | 93.03 | 91.72 | 92.32 |
| KNN | MIC | 210 | 88.39 | 90.43 | 88.40 | 88.83 |
| RFC | MIC | 260 | 92.40 | 93.63 | 92.28 | 92.83 |
| SVM | MIC | 160 | 92.65 | 93.22 | 92.89 | 93.14 |
| XGBoost | MIC | 310 | 93.07 | 93.76 | 92.87 | 93.25 |
| KNN | TURF | 110 | 87.88 | 90.61 | 86.80 | 87.85 |
| RFC | TURF | 310 | 92.35 | 93.90 | 92.24 | 92.86 |
| SVM | TURF | 210 | 92.31 | 93.23 | 92.48 | 92.75 |
| XGBoost | TURF | 110 | 92.61 | 92.98 | 92.46 | 92.65 |
| KNN | F-score | 310 | 85.43 | 88.74 | 85.71 | 85.97 |
| RFC | F-score | 610 | 92.02 | 93.43 | 92.03 | 92.62 |
| SVM | F-score | 710 | 92.02 | 92.87 | 92.37 | 92.54 |
| XGBoost | F-score | 410 | 92.10 | 93.04 | 92.31 | 92.64 |
| KNN | ANOVA | 360 | 84.00 | 87.28 | 83.99 | 83.91 |
| RFC | ANOVA | 810 | 92.10 | 93.75 | 91.95 | 92.65 |
| SVM | ANOVA | 710 | 92.02 | 93.33 | 92.51 | 92.83 |
| XGBoost | ANOVA | 460 | 92.11 | 93.24 | 92.22 | 92.67 |