diff --git "a/src/original_data_profile.html" "b/src/original_data_profile.html" new file mode 100644--- /dev/null +++ "b/src/original_data_profile.html" @@ -0,0 +1,3053 @@ + + + + + + + + + + + + + + + +
+ + +
+ + + + + +
+
+ +
DataFrame
+
NO COMPARISON TARGET
+
+
+
150
+
ROWS
+
+
+
+
3
+
DUPLICATES
+
+
+
+
15.5 kb
+
RAM
+
+
+
+
+ 5 +
+
FEATURES
+
+
+
+
+
1
+
CATEGORICAL
+
+
+
+
4
+
NUMERICAL
+
+
+
+
0
+
TEXT
+
+
+ +
+
+
+ + + +
+ +
+ 2.1.4
+ Get updates, docs & report issues here

+ Created & maintained by Francois Bertrand
+ Graphic design by Jean-Francois Hains +
+
+
+ + + +
+
+
+
+
+ + +
+ 1 +
+
+ sepal_length + +
+
+
VALUES:
+
+ 150 +
+
+ (100%) +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+
+
+
DISTINCT:
+
+ 35 +
+
+ (23%) +
+
+
+
+
+
ZEROES:
+
+ --- +
+
+ +
+
+
+ +
+
+
MAX
+
7.90
+
+
+
95%
+
7.25
+
+
+
Q3
+
6.40
+
+
+
AVG
+
5.84
+
+
+
MEDIAN
+
5.80
+
+
+
Q1
+
5.10
+
+
+
5%
+
4.60
+
+
+
MIN
+
4.30
+
+
+
+
+
RANGE
+
3.60
+
+
+
IQR
+
1.30
+
+
+
STD
+
0.828
+
+
+
VAR
+
0.686
+
+
+
+
+
KURT.
+
-0.552
+
+
+
SKEW
+
0.315
+
+
+
SUM
+
876
+
+
+ +
+ +
+
+
+
+
+
+ + +
+ 2 +
+
+ sepal_width + +
+
+
VALUES:
+
+ 150 +
+
+ (100%) +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+
+
+
DISTINCT:
+
+ 23 +
+
+ (15%) +
+
+
+
+
+
ZEROES:
+
+ --- +
+
+ +
+
+
+ +
+
+
MAX
+
4.40
+
+
+
95%
+
3.80
+
+
+
Q3
+
3.30
+
+
+
AVG
+
3.05
+
+
+
MEDIAN
+
3.00
+
+
+
Q1
+
2.80
+
+
+
5%
+
2.34
+
+
+
MIN
+
2.00
+
+
+
+
+
RANGE
+
2.40
+
+
+
IQR
+
0.500
+
+
+
STD
+
0.434
+
+
+
VAR
+
0.188
+
+
+
+
+
KURT.
+
0.291
+
+
+
SKEW
+
0.334
+
+
+
SUM
+
458
+
+
+ +
+ +
+
+
+
+
+
+ + +
+ 3 +
+
+ petal_length + +
+
+
VALUES:
+
+ 150 +
+
+ (100%) +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+
+
+
DISTINCT:
+
+ 43 +
+
+ (29%) +
+
+
+
+
+
ZEROES:
+
+ --- +
+
+ +
+
+
+ +
+
+
MAX
+
6.90
+
+
+
95%
+
6.10
+
+
+
Q3
+
5.10
+
+
+
MEDIAN
+
4.35
+
+
+
AVG
+
3.76
+
+
+
Q1
+
1.60
+
+
+
5%
+
1.30
+
+
+
MIN
+
1.00
+
+
+
+
+
RANGE
+
5.90
+
+
+
IQR
+
3.50
+
+
+
STD
+
1.76
+
+
+
VAR
+
3.11
+
+
+
+
+
KURT.
+
-1.40
+
+
+
SKEW
+
-0.274
+
+
+
SUM
+
564
+
+
+ +
+ +
+
+
+
+
+
+ + +
+ 4 +
+
+ petal_width + +
+
+
VALUES:
+
+ 150 +
+
+ (100%) +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+
+
+
DISTINCT:
+
+ 22 +
+
+ (15%) +
+
+
+
+
+
ZEROES:
+
+ --- +
+
+ +
+
+
+ +
+
+
MAX
+
2.50
+
+
+
95%
+
2.30
+
+
+
Q3
+
1.80
+
+
+
MEDIAN
+
1.30
+
+
+
AVG
+
1.20
+
+
+
Q1
+
0.30
+
+
+
5%
+
0.20
+
+
+
MIN
+
0.10
+
+
+
+
+
RANGE
+
2.40
+
+
+
IQR
+
1.50
+
+
+
STD
+
0.763
+
+
+
VAR
+
0.582
+
+
+
+
+
KURT.
+
-1.34
+
+
+
SKEW
+
-0.105
+
+
+
SUM
+
180
+
+
+ +
+ +
+
+
+
+
+
+ + +
+ 5 +
+
+ species +
+
+
VALUES:
+
+ 150 +
+
+ (100%) +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+
+
+
DISTINCT:
+
+ 3 +
+
+ (2%) +
+
+
+ +
+ +
+
+
+ + +
+
+
+ + Associations +
+ [Only including dataset "DataFrame"]
+ ■ Squares are categorical associations (uncertainty coefficient & correlation ratio) from 0 to 1. The uncertainty coefficient is asymmetrical, + (i.e. ROW LABEL values indicate how much they PROVIDE INFORMATION to each LABEL at the TOP). +

Circles are the symmetrical numerical correlations (Pearson's) from -1 to 1. The trivial diagonal is intentionally left blank for clarity. +
+ +
+
+ +
+
+ + Associations +
+ [Only including dataset "None"]
+ ■ Squares are categorical associations (uncertainty coefficient & correlation ratio) from 0 to 1. The uncertainty coefficient is asymmetrical, + (i.e. ROW LABEL values indicate how much they PROVIDE INFORMATION to each LABEL at the TOP). +

Circles are the symmetrical numerical correlations (Pearson's) from -1 to 1. The trivial diagonal is intentionally left blank for clarity. +
+ +
+
+ + + +
+
+ +
+
+ sepal_length +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+ + +
+ + + + +
+ + + + + +
+ + > +
+
NUMERICAL ASSOCIATIONS
+
+ (PEARSON, -1 to 1) +
+ +
+
+
petal_length
+
0.87
+
+
+
petal_width
+
0.82
+
+
+
sepal_width
+
-0.11
+
+
+
CATEGORICAL ASSOCIATIONS
+
+ (CORRELATION RATIO, 0 to 1) +
+
+
+
species
+
0.79
+
+
+
+ +
+ +
+
MOST FREQUENT VALUES
+
+
+
5.0
+
+
10
+
6.7%
+
+
+
+
5.1
+
+
9
+
6.0%
+
+
+
+
6.3
+
+
9
+
6.0%
+
+
+
+
5.7
+
+
8
+
5.3%
+
+
+
+
6.7
+
+
8
+
5.3%
+
+
+
+
5.8
+
+
7
+
4.7%
+
+
+
+
5.5
+
+
7
+
4.7%
+
+
+
+
6.4
+
+
7
+
4.7%
+
+
+
+
4.9
+
+
6
+
4.0%
+
+
+
+
5.4
+
+
6
+
4.0%
+
+
+
+
6.1
+
+
6
+
4.0%
+
+
+
+
6.0
+
+
6
+
4.0%
+
+
+
+
5.6
+
+
6
+
4.0%
+
+
+
+
4.8
+
+
5
+
3.3%
+
+
+
+
6.5
+
+
5
+
3.3%
+
+
+
+
+ +
+ +
+
SMALLEST VALUES
+
+
+
4.3
+
+
1
+
0.7%
+
+
+
+
4.4
+
+
3
+
2.0%
+
+
+
+
4.5
+
+
1
+
0.7%
+
+
+
+
4.6
+
+
4
+
2.7%
+
+
+
+
4.7
+
+
2
+
1.3%
+
+
+
+
4.8
+
+
5
+
3.3%
+
+
+
+
4.9
+
+
6
+
4.0%
+
+
+
+
5.0
+
+
10
+
6.7%
+
+
+
+
5.1
+
+
9
+
6.0%
+
+
+
+
5.2
+
+
4
+
2.7%
+
+
+
+
5.3
+
+
1
+
0.7%
+
+
+
+
5.4
+
+
6
+
4.0%
+
+
+
+
5.5
+
+
7
+
4.7%
+
+
+
+
5.6
+
+
6
+
4.0%
+
+
+
+
5.7
+
+
8
+
5.3%
+
+
+
+
+ +
+ +
+
LARGEST VALUES
+
+
+
7.9
+
+
1
+
0.7%
+
+
+
+
7.7
+
+
4
+
2.7%
+
+
+
+
7.6
+
+
1
+
0.7%
+
+
+
+
7.4
+
+
1
+
0.7%
+
+
+
+
7.3
+
+
1
+
0.7%
+
+
+
+
7.2
+
+
3
+
2.0%
+
+
+
+
7.1
+
+
1
+
0.7%
+
+
+
+
7.0
+
+
1
+
0.7%
+
+
+
+
6.9
+
+
4
+
2.7%
+
+
+
+
6.8
+
+
3
+
2.0%
+
+
+
+
6.7
+
+
8
+
5.3%
+
+
+
+
6.6
+
+
2
+
1.3%
+
+
+
+
6.5
+
+
5
+
3.3%
+
+
+
+
6.4
+
+
7
+
4.7%
+
+
+
+
6.3
+
+
9
+
6.0%
+
+
+
+
+ +
+
+
+
+ +
+
+ sepal_width +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+ + +
+ + + + +
+ + + + + +
+ + > +
+
NUMERICAL ASSOCIATIONS
+
+ (PEARSON, -1 to 1) +
+ +
+
+
petal_length
+
-0.42
+
+
+
petal_width
+
-0.36
+
+
+
sepal_length
+
-0.11
+
+
+
CATEGORICAL ASSOCIATIONS
+
+ (CORRELATION RATIO, 0 to 1) +
+
+
+
species
+
0.63
+
+
+
+ +
+ +
+
MOST FREQUENT VALUES
+
+
+
3.0
+
+
26
+
17.3%
+
+
+
+
2.8
+
+
14
+
9.3%
+
+
+
+
3.2
+
+
13
+
8.7%
+
+
+
+
3.1
+
+
12
+
8.0%
+
+
+
+
3.4
+
+
12
+
8.0%
+
+
+
+
2.9
+
+
10
+
6.7%
+
+
+
+
2.7
+
+
9
+
6.0%
+
+
+
+
2.5
+
+
8
+
5.3%
+
+
+
+
3.5
+
+
6
+
4.0%
+
+
+
+
3.3
+
+
6
+
4.0%
+
+
+
+
3.8
+
+
6
+
4.0%
+
+
+
+
2.6
+
+
5
+
3.3%
+
+
+
+
2.3
+
+
4
+
2.7%
+
+
+
+
3.7
+
+
3
+
2.0%
+
+
+
+
2.4
+
+
3
+
2.0%
+
+
+
+
+ +
+ +
+
SMALLEST VALUES
+
+
+
2.0
+
+
1
+
0.7%
+
+
+
+
2.2
+
+
3
+
2.0%
+
+
+
+
2.3
+
+
4
+
2.7%
+
+
+
+
2.4
+
+
3
+
2.0%
+
+
+
+
2.5
+
+
8
+
5.3%
+
+
+
+
2.6
+
+
5
+
3.3%
+
+
+
+
2.7
+
+
9
+
6.0%
+
+
+
+
2.8
+
+
14
+
9.3%
+
+
+
+
2.9
+
+
10
+
6.7%
+
+
+
+
3.0
+
+
26
+
17.3%
+
+
+
+
3.1
+
+
12
+
8.0%
+
+
+
+
3.2
+
+
13
+
8.7%
+
+
+
+
3.3
+
+
6
+
4.0%
+
+
+
+
3.4
+
+
12
+
8.0%
+
+
+
+
3.5
+
+
6
+
4.0%
+
+
+
+
+ +
+ +
+
LARGEST VALUES
+
+
+
4.4
+
+
1
+
0.7%
+
+
+
+
4.2
+
+
1
+
0.7%
+
+
+
+
4.1
+
+
1
+
0.7%
+
+
+
+
4.0
+
+
1
+
0.7%
+
+
+
+
3.9
+
+
2
+
1.3%
+
+
+
+
3.8
+
+
6
+
4.0%
+
+
+
+
3.7
+
+
3
+
2.0%
+
+
+
+
3.6
+
+
3
+
2.0%
+
+
+
+
3.5
+
+
6
+
4.0%
+
+
+
+
3.4
+
+
12
+
8.0%
+
+
+
+
3.3
+
+
6
+
4.0%
+
+
+
+
3.2
+
+
13
+
8.7%
+
+
+
+
3.1
+
+
12
+
8.0%
+
+
+
+
3.0
+
+
26
+
17.3%
+
+
+
+
2.9
+
+
10
+
6.7%
+
+
+
+
+ +
+
+
+
+ +
+
+ petal_length +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+ + +
+ + + + +
+ + + + + +
+ + > +
+
NUMERICAL ASSOCIATIONS
+
+ (PEARSON, -1 to 1) +
+ +
+
+
petal_width
+
0.96
+
+
+
sepal_length
+
0.87
+
+
+
sepal_width
+
-0.42
+
+
+
CATEGORICAL ASSOCIATIONS
+
+ (CORRELATION RATIO, 0 to 1) +
+
+
+
species
+
0.97
+
+
+
+ +
+ +
+
MOST FREQUENT VALUES
+
+
+
1.5
+
+
14
+
9.3%
+
+
+
+
1.4
+
+
12
+
8.0%
+
+
+
+
5.1
+
+
8
+
5.3%
+
+
+
+
4.5
+
+
8
+
5.3%
+
+
+
+
1.6
+
+
7
+
4.7%
+
+
+
+
1.3
+
+
7
+
4.7%
+
+
+
+
5.6
+
+
6
+
4.0%
+
+
+
+
4.7
+
+
5
+
3.3%
+
+
+
+
4.9
+
+
5
+
3.3%
+
+
+
+
4.0
+
+
5
+
3.3%
+
+
+
+
4.2
+
+
4
+
2.7%
+
+
+
+
5.0
+
+
4
+
2.7%
+
+
+
+
4.4
+
+
4
+
2.7%
+
+
+
+
4.8
+
+
4
+
2.7%
+
+
+
+
1.7
+
+
4
+
2.7%
+
+
+
+
+ +
+ +
+
SMALLEST VALUES
+
+
+
1.0
+
+
1
+
0.7%
+
+
+
+
1.1
+
+
1
+
0.7%
+
+
+
+
1.2
+
+
2
+
1.3%
+
+
+
+
1.3
+
+
7
+
4.7%
+
+
+
+
1.4
+
+
12
+
8.0%
+
+
+
+
1.5
+
+
14
+
9.3%
+
+
+
+
1.6
+
+
7
+
4.7%
+
+
+
+
1.7
+
+
4
+
2.7%
+
+
+
+
1.9
+
+
2
+
1.3%
+
+
+
+
3.0
+
+
1
+
0.7%
+
+
+
+
3.3
+
+
2
+
1.3%
+
+
+
+
3.5
+
+
2
+
1.3%
+
+
+
+
3.6
+
+
1
+
0.7%
+
+
+
+
3.7
+
+
1
+
0.7%
+
+
+
+
3.8
+
+
1
+
0.7%
+
+
+
+
+ +
+ +
+
LARGEST VALUES
+
+
+
6.9
+
+
1
+
0.7%
+
+
+
+
6.7
+
+
2
+
1.3%
+
+
+
+
6.6
+
+
1
+
0.7%
+
+
+
+
6.4
+
+
1
+
0.7%
+
+
+
+
6.3
+
+
1
+
0.7%
+
+
+
+
6.1
+
+
3
+
2.0%
+
+
+
+
6.0
+
+
2
+
1.3%
+
+
+
+
5.9
+
+
2
+
1.3%
+
+
+
+
5.8
+
+
3
+
2.0%
+
+
+
+
5.7
+
+
3
+
2.0%
+
+
+
+
5.6
+
+
6
+
4.0%
+
+
+
+
5.5
+
+
3
+
2.0%
+
+
+
+
5.4
+
+
2
+
1.3%
+
+
+
+
5.3
+
+
2
+
1.3%
+
+
+
+
5.2
+
+
2
+
1.3%
+
+
+
+
+ +
+
+
+
+ +
+
+ petal_width +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+ + +
+ + + + +
+ + + + + +
+ + > +
+
NUMERICAL ASSOCIATIONS
+
+ (PEARSON, -1 to 1) +
+ +
+
+
petal_length
+
0.96
+
+
+
sepal_length
+
0.82
+
+
+
sepal_width
+
-0.36
+
+
+
CATEGORICAL ASSOCIATIONS
+
+ (CORRELATION RATIO, 0 to 1) +
+
+
+
species
+
0.96
+
+
+
+ +
+ +
+
MOST FREQUENT VALUES
+
+
+
0.2
+
+
28
+
18.7%
+
+
+
+
1.3
+
+
13
+
8.7%
+
+
+
+
1.8
+
+
12
+
8.0%
+
+
+
+
1.5
+
+
12
+
8.0%
+
+
+
+
1.4
+
+
8
+
5.3%
+
+
+
+
2.3
+
+
8
+
5.3%
+
+
+
+
1.0
+
+
7
+
4.7%
+
+
+
+
0.4
+
+
7
+
4.7%
+
+
+
+
0.3
+
+
7
+
4.7%
+
+
+
+
0.1
+
+
6
+
4.0%
+
+
+
+
2.1
+
+
6
+
4.0%
+
+
+
+
2.0
+
+
6
+
4.0%
+
+
+
+
1.2
+
+
5
+
3.3%
+
+
+
+
1.9
+
+
5
+
3.3%
+
+
+
+
1.6
+
+
4
+
2.7%
+
+
+
+
+ +
+ +
+
SMALLEST VALUES
+
+
+
0.1
+
+
6
+
4.0%
+
+
+
+
0.2
+
+
28
+
18.7%
+
+
+
+
0.3
+
+
7
+
4.7%
+
+
+
+
0.4
+
+
7
+
4.7%
+
+
+
+
0.5
+
+
1
+
0.7%
+
+
+
+
0.6
+
+
1
+
0.7%
+
+
+
+
1.0
+
+
7
+
4.7%
+
+
+
+
1.1
+
+
3
+
2.0%
+
+
+
+
1.2
+
+
5
+
3.3%
+
+
+
+
1.3
+
+
13
+
8.7%
+
+
+
+
1.4
+
+
8
+
5.3%
+
+
+
+
1.5
+
+
12
+
8.0%
+
+
+
+
1.6
+
+
4
+
2.7%
+
+
+
+
1.7
+
+
2
+
1.3%
+
+
+
+
1.8
+
+
12
+
8.0%
+
+
+
+
+ +
+ +
+
LARGEST VALUES
+
+
+
2.5
+
+
3
+
2.0%
+
+
+
+
2.4
+
+
3
+
2.0%
+
+
+
+
2.3
+
+
8
+
5.3%
+
+
+
+
2.2
+
+
3
+
2.0%
+
+
+
+
2.1
+
+
6
+
4.0%
+
+
+
+
2.0
+
+
6
+
4.0%
+
+
+
+
1.9
+
+
5
+
3.3%
+
+
+
+
1.8
+
+
12
+
8.0%
+
+
+
+
1.7
+
+
2
+
1.3%
+
+
+
+
1.6
+
+
4
+
2.7%
+
+
+
+
1.5
+
+
12
+
8.0%
+
+
+
+
1.4
+
+
8
+
5.3%
+
+
+
+
1.3
+
+
13
+
8.7%
+
+
+
+
1.2
+
+
5
+
3.3%
+
+
+
+
1.1
+
+
3
+
2.0%
+
+
+
+
+ +
+
+
+
+ +
+
+ species +
+
+
+
MISSING:
+
+ --- +
+
+ +
+
+
+ + + +
+ +
+
TOP CATEGORIES
+

+
+ +
+ +
+
+ + +
+ +
Iris-setosa
+ + +
+
50
+
33%
+
+ +
+
+ +
Iris-versicolor
+ + +
+
50
+
33%
+
+ +
+
+ +
Iris-virginica
+ + +
+
50
+
33%
+
+ +
+
+
+
+ +
ALL
+ + +
+
150
+
100%
+
+ +
+
+
+ + +
+ + +
+ +
+ CATEGORICAL ASSOCIATIONS
+ (UNCERTAINTY COEFFICIENT, 0 to 1) +
+
species
+ PROVIDES INFORMATION ON...
+
+
+
THESE FEATURES
GIVE INFORMATION
+ ON species:
+
+
+ +
+ NUMERICAL ASSOCIATIONS
+ (CORRELATION RATIO, 0 to 1) +
+
species
+ CORRELATION RATIO WITH...
+
+
+
petal_length
+
0.97
+
+
+
petal_width
+
0.96
+
+
+
sepal_length
+
0.79
+
+
+
sepal_width
+
0.63
+
+
+
+
+
+
+
+ + \ No newline at end of file