Overview

Dataset statistics

Number of variables: 9
Number of observations: 1030
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 11
Duplicate rows (%): 1.1%
Total size in memory: 72.5 KiB
Average record size in memory: 72.1 B

Variable types

Numeric: 9

Warnings

Duplicates: Dataset has 11 (1.1%) duplicate rows
High correlation: Water (component 4)(kg in a m^3 mixture) is highly correlated with Superplasticizer (component 5)(kg in a m^3 mixture)
High correlation: Superplasticizer (component 5)(kg in a m^3 mixture) is highly correlated with Water (component 4)(kg in a m^3 mixture)
High correlation: Age (day) is highly correlated with Concrete compressive strength(MPa. megapascals)
High correlation: Concrete compressive strength(MPa. megapascals) is highly correlated with Age (day)
High correlation: Blast Furnace Slag (component 2)(kg in a m^3 mixture) is highly correlated with Coarse Aggregate (component 6)(kg in a m^3 mixture) and 5 other fields
High correlation: Coarse Aggregate (component 6)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields
High correlation: Cement (component 1)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 6 other fields
High correlation: Fine Aggregate (component 7)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields
High correlation: Superplasticizer (component 5)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields
High correlation: Fly Ash (component 3)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields
High correlation: Water (component 4)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields
High correlation: Concrete compressive strength(MPa. megapascals) is highly correlated with Cement (component 1)(kg in a m^3 mixture)
Zeros: Blast Furnace Slag (component 2)(kg in a m^3 mixture) has 471 (45.7%) zeros
Zeros: Fly Ash (component 3)(kg in a m^3 mixture) has 566 (55.0%) zeros
Zeros: Superplasticizer (component 5)(kg in a m^3 mixture) has 379 (36.8%) zeros
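
The percentages in the duplicate and zero warnings appear to be each count divided by the 1030 observations, rounded to one decimal place. A minimal sketch of that arithmetic, with the counts copied from the report (the `share` helper and the dictionary keys are illustrative names, not part of the report):

```python
# Counts copied from the report; n_rows is the number of observations.
n_rows = 1030
counts = {
    "duplicate rows": 11,
    "Blast Furnace Slag zeros": 471,
    "Fly Ash zeros": 566,
    "Superplasticizer zeros": 379,
}


def share(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


shares = {name: share(count, n_rows) for name, count in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {counts[name]} of {n_rows} = {pct}")
```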

Reproduction

Analysis started: 2021-07-07 18:00:57.943161
Analysis finished: 2021-07-07 18:01:08.944311
Duration: 11 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Cement (component 1)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 278
Distinct (%): 27.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 281.1678641
Minimum: 102
Maximum: 540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 102
5-th percentile: 143.745
Q1: 192.375
Median: 272.9
Q3: 350
95-th percentile: 480
Maximum: 540
Range: 438
Interquartile range (IQR): 157.625

Descriptive statistics

Standard deviation: 104.5063645
Coefficient of variation (CV): 0.3716867318
Kurtosis: -0.5206522845
Mean: 281.1678641
Median Absolute Deviation (MAD): 79.4
Skewness: 0.5094811789
Sum: 289602.9
Variance: 10921.58022
Monotonicity: Not monotonic
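
Several of the statistics above are simple functions of the others, which gives a quick consistency check on the report. The sketch below recomputes them from the Cement values using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1); the variable names are illustrative.

```python
import math

# Values copied from the Cement statistics in the report.
mean, std = 281.1678641, 104.5063645
minimum, maximum = 102.0, 540.0
q1, q3 = 192.375, 350.0

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.10f} variance={variance:.5f} range={value_range} IQR={iqr}")
```

The same relations can be checked against any of the other variables in the report.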
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
362.6 | 20 | 1.9%
425 | 20 | 1.9%
251.4 | 15 | 1.5%
310 | 14 | 1.4%
446 | 14 | 1.4%
250 | 13 | 1.3%
475 | 13 | 1.3%
331 | 13 | 1.3%
349 | 12 | 1.2%
387 | 12 | 1.2%
Other values (268) | 884 | 85.8%

Value | Count | Frequency (%)
102 | 4 | 0.4%
108.3 | 4 | 0.4%
116 | 4 | 0.4%
122.6 | 4 | 0.4%
132 | 2 | 0.2%
133 | 5 | 0.5%
133.1 | 1 | 0.1%
134.7 | 1 | 0.1%
135 | 2 | 0.2%
135.7 | 2 | 0.2%

Value | Count | Frequency (%)
540 | 9 | 0.9%
531.3 | 5 | 0.5%
528 | 1 | 0.1%
525 | 7 | 0.7%
522 | 2 | 0.2%
520 | 2 | 0.2%
516 | 2 | 0.2%
505 | 1 | 0.1%
500.1 | 1 | 0.1%
500 | 10 | 1.0%

Blast Furnace Slag (component 2)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 185
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.89582524
Minimum: 0
Maximum: 359.4
Zeros: 471
Zeros (%): 45.7%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 22
Q3: 142.95
95-th percentile: 236
Maximum: 359.4
Range: 359.4
Interquartile range (IQR): 142.95

Descriptive statistics

Standard deviation: 86.27934175
Coefficient of variation (CV): 1.167580732
Kurtosis: -0.5081754789
Mean: 73.89582524
Median Absolute Deviation (MAD): 22
Skewness: 0.8007168956
Sum: 76112.7
Variance: 7444.124812
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 471 | 45.7%
189 | 30 | 2.9%
106.3 | 20 | 1.9%
24 | 14 | 1.4%
20 | 12 | 1.2%
145 | 11 | 1.1%
98.1 | 10 | 1.0%
19 | 10 | 1.0%
22 | 8 | 0.8%
26 | 8 | 0.8%
Other values (175) | 436 | 42.3%

Value | Count | Frequency (%)
0 | 471 | 45.7%
11 | 4 | 0.4%
13.6 | 5 | 0.5%
15 | 5 | 0.5%
17.2 | 1 | 0.1%
17.5 | 1 | 0.1%
17.6 | 1 | 0.1%
19 | 10 | 1.0%
20 | 12 | 1.2%
22 | 8 | 0.8%

Value | Count | Frequency (%)
359.4 | 2 | 0.2%
342.1 | 2 | 0.2%
316.1 | 2 | 0.2%
305.3 | 4 | 0.4%
290.2 | 2 | 0.2%
288 | 4 | 0.4%
282.8 | 4 | 0.4%
272.8 | 2 | 0.2%
262.2 | 5 | 0.5%
260 | 1 | 0.1%

Fly Ash (component 3)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 156
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.18834951
Minimum: 0
Maximum: 200.1
Zeros: 566
Zeros (%): 55.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 118.3
95-th percentile: 167
Maximum: 200.1
Range: 200.1
Interquartile range (IQR): 118.3

Descriptive statistics

Standard deviation: 63.99700415
Coefficient of variation (CV): 1.181010397
Kurtosis: -1.328746435
Mean: 54.18834951
Median Absolute Deviation (MAD): 0
Skewness: 0.5373539058
Sum: 55814
Variance: 4095.616541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 566 | 55.0%
118.3 | 20 | 1.9%
141 | 16 | 1.6%
24.5 | 15 | 1.5%
79 | 14 | 1.4%
94 | 13 | 1.3%
100.4 | 11 | 1.1%
174.2 | 10 | 1.0%
167 | 10 | 1.0%
95.7 | 10 | 1.0%
Other values (146) | 345 | 33.5%

Value | Count | Frequency (%)
0 | 566 | 55.0%
24.5 | 15 | 1.5%
59 | 1 | 0.1%
60 | 1 | 0.1%
71 | 1 | 0.1%
71.5 | 1 | 0.1%
75.6 | 1 | 0.1%
76 | 1 | 0.1%
77 | 2 | 0.2%
78 | 2 | 0.2%

Value | Count | Frequency (%)
200.1 | 1 | 0.1%
200 | 1 | 0.1%
195 | 3 | 0.3%
194.9 | 1 | 0.1%
194 | 1 | 0.1%
193 | 1 | 0.1%
190 | 1 | 0.1%
187 | 1 | 0.1%
185.3 | 1 | 0.1%
185 | 2 | 0.2%

Water (component 4)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 195
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 181.5672816
Minimum: 121.8
Maximum: 247
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 121.8
5-th percentile: 146.1
Q1: 164.9
Median: 185
Q3: 192
95-th percentile: 228
Maximum: 247
Range: 125.2
Interquartile range (IQR): 27.1

Descriptive statistics

Standard deviation: 21.35421857
Coefficient of variation (CV): 0.1176104989
Kurtosis: 0.1220816744
Mean: 181.5672816
Median Absolute Deviation (MAD): 13
Skewness: 0.07462838429
Sum: 187014.3
Variance: 456.0026505
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
192 | 118 | 11.5%
228 | 54 | 5.2%
185.7 | 46 | 4.5%
203.5 | 36 | 3.5%
186 | 28 | 2.7%
164.9 | 20 | 1.9%
162 | 20 | 1.9%
185 | 15 | 1.5%
153.5 | 15 | 1.5%
193 | 14 | 1.4%
Other values (185) | 664 | 64.5%

Value | Count | Frequency (%)
121.8 | 5 | 0.5%
126.6 | 5 | 0.5%
127 | 1 | 0.1%
127.3 | 1 | 0.1%
137.8 | 5 | 0.5%
140 | 1 | 0.1%
140.8 | 5 | 0.5%
141.8 | 5 | 0.5%
142 | 1 | 0.1%
143.3 | 5 | 0.5%

Value | Count | Frequency (%)
247 | 1 | 0.1%
246.9 | 1 | 0.1%
237 | 1 | 0.1%
236.7 | 1 | 0.1%
228 | 54 | 5.2%
221.4 | 1 | 0.1%
221 | 2 | 0.2%
220.1 | 1 | 0.1%
220 | 2 | 0.2%
219.7 | 1 | 0.1%

Superplasticizer (component 5)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 111
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.204660194
Minimum: 0
Maximum: 32.2
Zeros: 379
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 6.4
Q3: 10.2
95-th percentile: 16.055
Maximum: 32.2
Range: 32.2
Interquartile range (IQR): 10.2

Descriptive statistics

Standard deviation: 5.973841392
Coefficient of variation (CV): 0.9627991228
Kurtosis: 1.411268965
Mean: 6.204660194
Median Absolute Deviation (MAD): 5.3
Skewness: 0.9072025749
Sum: 6390.8
Variance: 35.68678098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 379 | 36.8%
11.6 | 37 | 3.6%
8 | 27 | 2.6%
7 | 19 | 1.8%
6 | 17 | 1.7%
7.8 | 16 | 1.6%
8.9 | 16 | 1.6%
9.9 | 16 | 1.6%
9 | 16 | 1.6%
10 | 15 | 1.5%
Other values (101) | 472 | 45.8%

Value | Count | Frequency (%)
0 | 379 | 36.8%
1.7 | 4 | 0.4%
1.9 | 1 | 0.1%
2 | 1 | 0.1%
2.2 | 1 | 0.1%
2.5 | 2 | 0.2%
3 | 6 | 0.6%
3.1 | 1 | 0.1%
3.4 | 3 | 0.3%
3.6 | 5 | 0.5%

Value | Count | Frequency (%)
32.2 | 5 | 0.5%
28.2 | 5 | 0.5%
23.4 | 5 | 0.5%
22.1 | 1 | 0.1%
22 | 6 | 0.6%
20.8 | 1 | 0.1%
20 | 1 | 0.1%
19 | 1 | 0.1%
18.8 | 1 | 0.1%
18.6 | 5 | 0.5%

Coarse Aggregate (component 6)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 284
Distinct (%): 27.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 972.918932
Minimum: 801
Maximum: 1145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 801
5-th percentile: 842
Q1: 932
Median: 968
Q3: 1029.4
95-th percentile: 1104
Maximum: 1145
Range: 344
Interquartile range (IQR): 97.4

Descriptive statistics

Standard deviation: 77.75395397
Coefficient of variation (CV): 0.07991822485
Kurtosis: -0.5990161032
Mean: 972.918932
Median Absolute Deviation (MAD): 46.3
Skewness: -0.04021974481
Sum: 1002106.5
Variance: 6045.677357
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
932 | 57 | 5.5%
852.1 | 45 | 4.4%
944.7 | 30 | 2.9%
968 | 29 | 2.8%
1125 | 24 | 2.3%
967 | 19 | 1.8%
1047 | 19 | 1.8%
942 | 12 | 1.2%
822 | 12 | 1.2%
974 | 12 | 1.2%
Other values (274) | 771 | 74.9%

Value | Count | Frequency (%)
801 | 4 | 0.4%
801.1 | 1 | 0.1%
801.4 | 1 | 0.1%
811 | 2 | 0.2%
814 | 1 | 0.1%
814.1 | 1 | 0.1%
817.9 | 1 | 0.1%
818 | 1 | 0.1%
819 | 2 | 0.2%
819.2 | 1 | 0.1%

Value | Count | Frequency (%)
1145 | 1 | 0.1%
1134.3 | 5 | 0.5%
1130 | 1 | 0.1%
1125 | 24 | 2.3%
1124.4 | 2 | 0.2%
1120 | 2 | 0.2%
1119 | 2 | 0.2%
1118.8 | 2 | 0.2%
1118 | 1 | 0.1%
1113 | 2 | 0.2%

Fine Aggregate (component 7)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 302
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 773.5804854
Minimum: 594
Maximum: 992.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 594
5-th percentile: 613
Q1: 730.95
Median: 779.5
Q3: 824
95-th percentile: 898.09
Maximum: 992.6
Range: 398.6
Interquartile range (IQR): 93.05

Descriptive statistics

Standard deviation: 80.17598014
Coefficient of variation (CV): 0.1036427129
Kurtosis: -0.1021769893
Mean: 773.5804854
Median Absolute Deviation (MAD): 45.5
Skewness: -0.2530095977
Sum: 796787.9
Variance: 6428.187792
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
755.8 | 30 | 2.9%
594 | 30 | 2.9%
670 | 23 | 2.2%
613 | 22 | 2.1%
801 | 16 | 1.6%
746.6 | 15 | 1.5%
887.1 | 15 | 1.5%
845 | 14 | 1.4%
712 | 14 | 1.4%
750 | 12 | 1.2%
Other values (292) | 839 | 81.5%

Value | Count | Frequency (%)
594 | 30 | 2.9%
605 | 5 | 0.5%
611.8 | 5 | 0.5%
612 | 1 | 0.1%
613 | 22 | 2.1%
613.2 | 2 | 0.2%
614 | 1 | 0.1%
623 | 2 | 0.2%
630 | 5 | 0.5%
631 | 4 | 0.4%

Value | Count | Frequency (%)
992.6 | 5 | 0.5%
945 | 4 | 0.4%
943.1 | 4 | 0.4%
942 | 4 | 0.4%
925.7 | 5 | 0.5%
905.9 | 5 | 0.5%
903.8 | 5 | 0.5%
903.6 | 5 | 0.5%
901.8 | 5 | 0.5%
900.9 | 5 | 0.5%

Age (day)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 14
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.66213592
Minimum: 1
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 7
Median: 28
Q3: 56
95-th percentile: 180
Maximum: 365
Range: 364
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 63.16991158
Coefficient of variation (CV): 1.383419989
Kurtosis: 12.16898898
Mean: 45.66213592
Median Absolute Deviation (MAD): 21
Skewness: 3.269177401
Sum: 47032
Variance: 3990.437729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
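
As a rough illustration of what "fixed size bins" means, the sketch below splits the span from the minimum to the maximum into equal-width intervals and counts the values in each. It assumes the usual convention (as in numpy.histogram) that the last bin includes its right edge; how the report itself draws the histogram is not shown here. The function name and the sample ages are hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Assumes the values are not all identical, so the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Toy sample of curing ages in days (hypothetical, not the report's data).
ages = [1, 3, 3, 7, 14, 28, 28, 28, 56, 90, 100, 180, 365]
edges, counts = fixed_width_histogram(ages, bins=14)
print(counts)
```

With a range of 364 days and 14 bins, each bin is 26 days wide, so all ages up to 14 days share the first bin.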
Value | Count | Frequency (%)
28 | 425 | 41.3%
3 | 134 | 13.0%
7 | 126 | 12.2%
56 | 91 | 8.8%
14 | 62 | 6.0%
90 | 54 | 5.2%
100 | 52 | 5.0%
180 | 26 | 2.5%
91 | 22 | 2.1%
365 | 14 | 1.4%
Other values (4) | 24 | 2.3%

Value | Count | Frequency (%)
1 | 2 | 0.2%
3 | 134 | 13.0%
7 | 126 | 12.2%
14 | 62 | 6.0%
28 | 425 | 41.3%
56 | 91 | 8.8%
90 | 54 | 5.2%
91 | 22 | 2.1%
100 | 52 | 5.0%
120 | 3 | 0.3%

Value | Count | Frequency (%)
365 | 14 | 1.4%
360 | 6 | 0.6%
270 | 13 | 1.3%
180 | 26 | 2.5%
120 | 3 | 0.3%
100 | 52 | 5.0%
91 | 22 | 2.1%
90 | 54 | 5.2%
56 | 91 | 8.8%
28 | 425 | 41.3%

Concrete compressive strength(MPa. megapascals)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 845
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.81796117
Minimum: 2.33
Maximum: 82.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 2.33
5-th percentile: 10.961
Q1: 23.71
Median: 34.445
Q3: 46.135
95-th percentile: 66.802
Maximum: 82.6
Range: 80.27
Interquartile range (IQR): 22.425

Descriptive statistics

Standard deviation: 16.70574196
Coefficient of variation (CV): 0.4664068366
Kurtosis: -0.3137248604
Mean: 35.81796117
Median Absolute Deviation (MAD): 10.93
Skewness: 0.4169772884
Sum: 36892.5
Variance: 279.0818145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
33.4 | 6 | 0.6%
71.3 | 4 | 0.4%
41.05 | 4 | 0.4%
31.35 | 4 | 0.4%
23.52 | 4 | 0.4%
77.3 | 4 | 0.4%
79.3 | 4 | 0.4%
35.3 | 4 | 0.4%
65.2 | 3 | 0.3%
18.13 | 3 | 0.3%
Other values (835) | 990 | 96.1%

Value | Count | Frequency (%)
2.33 | 1 | 0.1%
3.32 | 1 | 0.1%
4.57 | 1 | 0.1%
4.78 | 1 | 0.1%
4.83 | 1 | 0.1%
4.9 | 1 | 0.1%
6.27 | 1 | 0.1%
6.28 | 1 | 0.1%
6.47 | 1 | 0.1%
6.81 | 1 | 0.1%

Value | Count | Frequency (%)
82.6 | 1 | 0.1%
81.75 | 1 | 0.1%
80.2 | 1 | 0.1%
79.99 | 1 | 0.1%
79.4 | 1 | 0.1%
79.3 | 4 | 0.4%
78.8 | 1 | 0.1%
77.3 | 4 | 0.4%
76.8 | 1 | 0.1%
76.24 | 1 | 0.1%

Interactions

[Figures: 81 pairwise interaction plots, one for each ordered pair of the nine variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
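
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation using population covariance and standard deviations (the 1/n factors cancel, so the sample versions give the same r); the function name and the toy data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [10.0 * v for v in x]            # steeper line through the origin
y_shifted = [3.0 * v + 7.0 for v in x]  # different location and scale
y_down = [-v for v in x]                # decreasing line

print(pearson_r(x, y_up), pearson_r(x, y_shifted), pearson_r(x, y_down))
```

Both increasing lines give r = 1 regardless of slope or offset, while the decreasing line gives r = -1.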

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
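
A minimal sketch of that procedure: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and is assumed here; the helper names and the toy data are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]  # monotonic but nonlinear

print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ is 1 while Pearson's r stays below 1, which is the difference the paragraph above describes.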

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
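
The pair-counting definition can be sketched directly. This follows the formula as stated in the text (often called tau-a), where tied pairs count as neither concordant nor discordant; statistical libraries commonly default to a tie-corrected variant (tau-b), so results may differ when the data contain ties. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # A zero product is a tie and counts toward neither total.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau(x, y))
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.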

Phik (φk)

Phik (φk) is a practical correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

Index | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa. megapascals)
0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28.0 | 79.99
1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28.0 | 61.89
2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270.0 | 40.27
3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365.0 | 41.05
4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360.0 | 44.30
5 | 266.0 | 114.0 | 0.0 | 228.0 | 0.0 | 932.0 | 670.0 | 90.0 | 47.03
6 | 380.0 | 95.0 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365.0 | 43.70
7 | 380.0 | 95.0 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 28.0 | 36.45
8 | 266.0 | 114.0 | 0.0 | 228.0 | 0.0 | 932.0 | 670.0 | 28.0 | 45.85
9 | 475.0 | 0.0 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 28.0 | 39.29

Last rows

Index | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa. megapascals)
1020 | 288.4 | 121.0 | 0.0 | 177.4 | 7.0 | 907.9 | 829.5 | 28.0 | 42.14
1021 | 298.2 | 0.0 | 107.0 | 209.7 | 11.1 | 879.6 | 744.2 | 28.0 | 31.88
1022 | 264.5 | 111.0 | 86.5 | 195.5 | 5.9 | 832.6 | 790.4 | 28.0 | 41.54
1023 | 159.8 | 250.0 | 0.0 | 168.4 | 12.2 | 1049.3 | 688.2 | 28.0 | 39.46
1024 | 166.0 | 259.7 | 0.0 | 183.2 | 12.7 | 858.8 | 826.8 | 28.0 | 37.92
1025 | 276.4 | 116.0 | 90.3 | 179.6 | 8.9 | 870.1 | 768.3 | 28.0 | 44.28
1026 | 322.2 | 0.0 | 115.6 | 196.0 | 10.4 | 817.9 | 813.4 | 28.0 | 31.18
1027 | 148.5 | 139.4 | 108.6 | 192.7 | 6.1 | 892.4 | 780.0 | 28.0 | 23.70
1028 | 159.1 | 186.7 | 0.0 | 175.6 | 11.3 | 989.6 | 788.9 | 28.0 | 32.77
1029 | 260.9 | 100.5 | 78.3 | 200.6 | 8.6 | 864.5 | 761.5 | 28.0 | 32.40

Duplicate rows

Most frequently occurring

Index | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa. megapascals) | # duplicates
1 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 3.0 | 35.30 | 4
3 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 28.0 | 71.30 | 4
4 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 56.0 | 77.30 | 4
5 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 91.0 | 79.30 | 4
2 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 7.0 | 55.90 | 3
6 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 3.0 | 33.40 | 3
7 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 7.0 | 49.20 | 3
8 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 28.0 | 60.29 | 3
9 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 56.0 | 64.30 | 3
10 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 91.0 | 65.20 | 3
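
A table like this can be approximated by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below reads "# duplicates" as the number of occurrences of a row, which is an assumption about the report rather than something it states; the rows are hypothetical three-column stand-ins for the nine-column records.

```python
from collections import Counter

# Toy rows (hypothetical): each tuple is (cement, age_days, strength_mpa).
rows = [
    (362.6, 3.0, 35.30),
    (362.6, 3.0, 35.30),
    (425.0, 7.0, 49.20),
    (362.6, 3.0, 35.30),
    (425.0, 7.0, 49.20),
    (540.0, 28.0, 79.99),
]

occurrences = Counter(rows)
# Keep only rows that appear more than once, most frequent first.
duplicated = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicated:
    print(row, n)
```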