Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 1030 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 11 |
Duplicate rows (%) | 1.1% |
Total size in memory | 72.5 KiB |
Average record size in memory | 72.1 B |
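The percentage and per-record figures in this table are simple ratios of the other entries. A minimal sketch that recomputes them from the reported numbers; the helper name `pct` is illustrative and not part of pandas-profiling, and the memory figure is only approximate because the reported total is already rounded:

```python
# Figures copied from the "Dataset statistics" table above.
n_rows = 1030
duplicate_rows = 11
total_size_kib = 72.5


def pct(part, whole):
    """Format part/whole as a percentage with one decimal place."""
    return f"{part / whole:.1%}"


# Share of duplicate rows among all observations.
duplicate_pct = pct(duplicate_rows, n_rows)

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(duplicate_pct)
print(round(avg_record_bytes, 1))
```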
Variable types
Numeric | 9 |
---|---|
Dataset has 11 (1.1%) duplicate rows | Duplicates |
Water (component 4)(kg in a m^3 mixture) is highly correlated with Superplasticizer (component 5)(kg in a m^3 mixture) | High correlation |
Superplasticizer (component 5)(kg in a m^3 mixture) is highly correlated with Water (component 4)(kg in a m^3 mixture) | High correlation |
Age (day) is highly correlated with Concrete compressive strength(MPa. megapascals) | High correlation |
Concrete compressive strength(MPa. megapascals) is highly correlated with Age (day) | High correlation |
Blast Furnace Slag (component 2)(kg in a m^3 mixture) is highly correlated with Coarse Aggregate (component 6)(kg in a m^3 mixture) and 5 other fields | High correlation |
Coarse Aggregate (component 6)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields | High correlation |
Cement (component 1)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 6 other fields | High correlation |
Fine Aggregate (component 7)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields | High correlation |
Superplasticizer (component 5)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields | High correlation |
Fly Ash (component 3)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields | High correlation |
Water (component 4)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 5 other fields | High correlation |
Concrete compressive strength(MPa. megapascals) is highly correlated with Cement (component 1)(kg in a m^3 mixture) | High correlation |
Blast Furnace Slag (component 2)(kg in a m^3 mixture) has 471 (45.7%) zeros | Zeros |
Fly Ash (component 3)(kg in a m^3 mixture) has 566 (55.0%) zeros | Zeros |
Superplasticizer (component 5)(kg in a m^3 mixture) has 379 (36.8%) zeros | Zeros |
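The "Zeros" alerts report how many entries of a column equal zero and what share of the 1030 observations that is. A small sketch of the count, assuming a plain list of values; the function name `zero_stats` is hypothetical, and the sample list is the Superplasticizer column from the ten "First rows" of this report, not the full column:

```python
def zero_stats(values):
    """Return (number of zeros, share of zeros) for a sequence of numbers."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, zeros / len(values)


# Superplasticizer values of the first ten rows shown in this report.
first_rows_superplasticizer = [2.5, 2.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
sample_zeros, sample_share = zero_stats(first_rows_superplasticizer)

# The reported alert for Blast Furnace Slag: 471 zeros out of 1030 rows.
slag_share = f"{471 / 1030:.1%}"

print(sample_zeros, sample_share, slag_share)
```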
Reproduction
Analysis started | 2021-07-07 18:00:57.943161 |
---|---|
Analysis finished | 2021-07-07 18:01:08.944311 |
Duration | 11 seconds |
Software version | pandas-profiling v3.0.0 |
Download configuration | config.json |
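A report like this one can be regenerated with pandas-profiling's `ProfileReport` class and its `to_file` method. The sketch below is only one plausible way to do it: the function name `build_report`, the CSV input, and the report title are assumptions, since the report does not record how the data was loaded.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write the HTML report; returns the output path."""
    # Imported lazily so that defining this helper does not require the packages.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumed input format
    report = ProfileReport(df, title="Concrete compressive strength")  # assumed title
    report.to_file(html_path)
    return html_path
```

Calling `build_report("concrete.csv", "report.html")` requires pandas and pandas-profiling to be installed; both file names are placeholders.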
Cement (component 1)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
Distinct | 278 |
---|---|
Distinct (%) | 27.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 281.1678641 |
Minimum | 102 |
---|---|
Maximum | 540 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 102 |
---|---|
5-th percentile | 143.745 |
Q1 | 192.375 |
median | 272.9 |
Q3 | 350 |
95-th percentile | 480 |
Maximum | 540 |
Range | 438 |
Interquartile range (IQR) | 157.625 |
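The range and IQR rows are differences of other rows in the same table (maximum minus minimum, Q3 minus Q1). The percentiles themselves are typically obtained by linear interpolation between neighbouring sorted values. A minimal sketch, assuming that interpolation rule; the function name `quantile` and the toy sample are illustrative:

```python
import math


def quantile(values, q):
    """q-th quantile (0 <= q <= 1) by linear interpolation between sorted values."""
    data = sorted(values)
    pos = q * (len(data) - 1)      # fractional position in the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return data[lo] + frac * (data[hi] - data[lo])


sample = [1, 2, 3, 4]  # toy data, not the profiled column
q1 = quantile(sample, 0.25)
med = quantile(sample, 0.5)
q3 = quantile(sample, 0.75)
iqr = q3 - q1

# Derived rows recomputed from the values reported in the table above.
reported_range = 540 - 102
reported_iqr = 350 - 192.375

print(q1, med, q3, iqr, reported_range, reported_iqr)
```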
Descriptive statistics
Standard deviation | 104.5063645 |
---|---|
Coefficient of variation (CV) | 0.3716867318 |
Kurtosis | -0.5206522845 |
Mean | 281.1678641 |
Median Absolute Deviation (MAD) | 79.4 |
Skewness | 0.5094811789 |
Sum | 289602.9 |
Variance | 10921.58022 |
Monotonicity | Not monotonic |
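Several of the descriptive statistics are linked: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the MAD is the median of the absolute deviations from the median. A short sketch that checks the first two against the reported values and shows the MAD on toy data; the function names are illustrative:

```python
import statistics


def coefficient_of_variation(std, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


def median_abs_deviation(values):
    """Median of the absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


# Values copied from the table above.
std = 104.5063645
mean = 281.1678641

cv = coefficient_of_variation(std, mean)
variance = std ** 2

# Toy data: the single large value barely moves the MAD.
toy_mad = median_abs_deviation([1, 2, 3, 4, 100])

print(cv, variance, toy_mad)
```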
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
362.6 | 20 | 1.9% |
425 | 20 | 1.9% |
251.4 | 15 | 1.5% |
310 | 14 | 1.4% |
446 | 14 | 1.4% |
250 | 13 | 1.3% |
475 | 13 | 1.3% |
331 | 13 | 1.3% |
349 | 12 | 1.2% |
387 | 12 | 1.2% |
Other values (268) | 884 | 85.8% |

Value | Count | Frequency (%) |
---|---|---|
102 | 4 | 0.4% |
108.3 | 4 | 0.4% |
116 | 4 | 0.4% |
122.6 | 4 | 0.4% |
132 | 2 | 0.2% |
133 | 5 | 0.5% |
133.1 | 1 | 0.1% |
134.7 | 1 | 0.1% |
135 | 2 | 0.2% |
135.7 | 2 | 0.2% |

Value | Count | Frequency (%) |
---|---|---|
540 | 9 | 0.9% |
531.3 | 5 | 0.5% |
528 | 1 | 0.1% |
525 | 7 | 0.7% |
522 | 2 | 0.2% |
520 | 2 | 0.2% |
516 | 2 | 0.2% |
505 | 1 | 0.1% |
500.1 | 1 | 0.1% |
500 | 10 | 1.0% |
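The frequency column in these tables is the count divided by the 1030 observations, and the "Other values" row collects every distinct value outside the top ten. A minimal sketch of such a table using `collections.Counter`; the function name `common_values` and the toy list are illustrative:

```python
from collections import Counter


def common_values(values, top=10):
    """Rows of (label, count, share) for the most frequent values plus a remainder row."""
    counts = Counter(values)
    total = len(values)
    rows = [(str(v), c, c / total) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)
    n_other = len(counts) - len(rows)  # distinct values not listed individually
    if n_other > 0:
        rest = total - shown
        rows.append((f"Other values ({n_other})", rest, rest / total))
    return rows


# Toy data, not the profiled column.
toy = [28, 28, 28, 7, 7, 3, 90]
table = common_values(toy, top=2)

# The remainder row of the first table above: 884 of 1030 observations.
other_share = f"{884 / 1030:.1%}"

for label, count, share in table:
    print(label, count, f"{share:.1%}")
```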
Blast Furnace Slag (component 2)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct | 185 |
---|---|
Distinct (%) | 18.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 73.89582524 |
Minimum | 0 |
---|---|
Maximum | 359.4 |
Zeros | 471 |
Zeros (%) | 45.7% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 22 |
Q3 | 142.95 |
95-th percentile | 236 |
Maximum | 359.4 |
Range | 359.4 |
Interquartile range (IQR) | 142.95 |
Descriptive statistics
Standard deviation | 86.27934175 |
---|---|
Coefficient of variation (CV) | 1.167580732 |
Kurtosis | -0.5081754789 |
Mean | 73.89582524 |
Median Absolute Deviation (MAD) | 22 |
Skewness | 0.8007168956 |
Sum | 76112.7 |
Variance | 7444.124812 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
0 | 471 | 45.7% |
189 | 30 | 2.9% |
106.3 | 20 | 1.9% |
24 | 14 | 1.4% |
20 | 12 | 1.2% |
145 | 11 | 1.1% |
98.1 | 10 | 1.0% |
19 | 10 | 1.0% |
22 | 8 | 0.8% |
26 | 8 | 0.8% |
Other values (175) | 436 | 42.3% |

Value | Count | Frequency (%) |
---|---|---|
0 | 471 | 45.7% |
11 | 4 | 0.4% |
13.6 | 5 | 0.5% |
15 | 5 | 0.5% |
17.2 | 1 | 0.1% |
17.5 | 1 | 0.1% |
17.6 | 1 | 0.1% |
19 | 10 | 1.0% |
20 | 12 | 1.2% |
22 | 8 | 0.8% |

Value | Count | Frequency (%) |
---|---|---|
359.4 | 2 | 0.2% |
342.1 | 2 | 0.2% |
316.1 | 2 | 0.2% |
305.3 | 4 | 0.4% |
290.2 | 2 | 0.2% |
288 | 4 | 0.4% |
282.8 | 4 | 0.4% |
272.8 | 2 | 0.2% |
262.2 | 5 | 0.5% |
260 | 1 | 0.1% |
Fly Ash (component 3)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct | 156 |
---|---|
Distinct (%) | 15.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 54.18834951 |
Minimum | 0 |
---|---|
Maximum | 200.1 |
Zeros | 566 |
Zeros (%) | 55.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 118.3 |
95-th percentile | 167 |
Maximum | 200.1 |
Range | 200.1 |
Interquartile range (IQR) | 118.3 |
Descriptive statistics
Standard deviation | 63.99700415 |
---|---|
Coefficient of variation (CV) | 1.181010397 |
Kurtosis | -1.328746435 |
Mean | 54.18834951 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 0.5373539058 |
Sum | 55814 |
Variance | 4095.616541 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
0 | 566 | 55.0% |
118.3 | 20 | 1.9% |
141 | 16 | 1.6% |
24.5 | 15 | 1.5% |
79 | 14 | 1.4% |
94 | 13 | 1.3% |
100.4 | 11 | 1.1% |
174.2 | 10 | 1.0% |
167 | 10 | 1.0% |
95.7 | 10 | 1.0% |
Other values (146) | 345 | 33.5% |

Value | Count | Frequency (%) |
---|---|---|
0 | 566 | 55.0% |
24.5 | 15 | 1.5% |
59 | 1 | 0.1% |
60 | 1 | 0.1% |
71 | 1 | 0.1% |
71.5 | 1 | 0.1% |
75.6 | 1 | 0.1% |
76 | 1 | 0.1% |
77 | 2 | 0.2% |
78 | 2 | 0.2% |

Value | Count | Frequency (%) |
---|---|---|
200.1 | 1 | 0.1% |
200 | 1 | 0.1% |
195 | 3 | 0.3% |
194.9 | 1 | 0.1% |
194 | 1 | 0.1% |
193 | 1 | 0.1% |
190 | 1 | 0.1% |
187 | 1 | 0.1% |
185.3 | 1 | 0.1% |
185 | 2 | 0.2% |
Water (component 4)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
Distinct | 195 |
---|---|
Distinct (%) | 18.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 181.5672816 |
Minimum | 121.8 |
---|---|
Maximum | 247 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 121.8 |
---|---|
5-th percentile | 146.1 |
Q1 | 164.9 |
median | 185 |
Q3 | 192 |
95-th percentile | 228 |
Maximum | 247 |
Range | 125.2 |
Interquartile range (IQR) | 27.1 |
Descriptive statistics
Standard deviation | 21.35421857 |
---|---|
Coefficient of variation (CV) | 0.1176104989 |
Kurtosis | 0.1220816744 |
Mean | 181.5672816 |
Median Absolute Deviation (MAD) | 13 |
Skewness | 0.07462838429 |
Sum | 187014.3 |
Variance | 456.0026505 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
192 | 118 | 11.5% |
228 | 54 | 5.2% |
185.7 | 46 | 4.5% |
203.5 | 36 | 3.5% |
186 | 28 | 2.7% |
164.9 | 20 | 1.9% |
162 | 20 | 1.9% |
185 | 15 | 1.5% |
153.5 | 15 | 1.5% |
193 | 14 | 1.4% |
Other values (185) | 664 | 64.5% |

Value | Count | Frequency (%) |
---|---|---|
121.8 | 5 | 0.5% |
126.6 | 5 | 0.5% |
127 | 1 | 0.1% |
127.3 | 1 | 0.1% |
137.8 | 5 | 0.5% |
140 | 1 | 0.1% |
140.8 | 5 | 0.5% |
141.8 | 5 | 0.5% |
142 | 1 | 0.1% |
143.3 | 5 | 0.5% |

Value | Count | Frequency (%) |
---|---|---|
247 | 1 | 0.1% |
246.9 | 1 | 0.1% |
237 | 1 | 0.1% |
236.7 | 1 | 0.1% |
228 | 54 | 5.2% |
221.4 | 1 | 0.1% |
221 | 2 | 0.2% |
220.1 | 1 | 0.1% |
220 | 2 | 0.2% |
219.7 | 1 | 0.1% |
Superplasticizer (component 5)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct | 111 |
---|---|
Distinct (%) | 10.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 6.204660194 |
Minimum | 0 |
---|---|
Maximum | 32.2 |
Zeros | 379 |
Zeros (%) | 36.8% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 6.4 |
Q3 | 10.2 |
95-th percentile | 16.055 |
Maximum | 32.2 |
Range | 32.2 |
Interquartile range (IQR) | 10.2 |
Descriptive statistics
Standard deviation | 5.973841392 |
---|---|
Coefficient of variation (CV) | 0.9627991228 |
Kurtosis | 1.411268965 |
Mean | 6.204660194 |
Median Absolute Deviation (MAD) | 5.3 |
Skewness | 0.9072025749 |
Sum | 6390.8 |
Variance | 35.68678098 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
0 | 379 | 36.8% |
11.6 | 37 | 3.6% |
8 | 27 | 2.6% |
7 | 19 | 1.8% |
6 | 17 | 1.7% |
7.8 | 16 | 1.6% |
8.9 | 16 | 1.6% |
9.9 | 16 | 1.6% |
9 | 16 | 1.6% |
10 | 15 | 1.5% |
Other values (101) | 472 | 45.8% |

Value | Count | Frequency (%) |
---|---|---|
0 | 379 | 36.8% |
1.7 | 4 | 0.4% |
1.9 | 1 | 0.1% |
2 | 1 | 0.1% |
2.2 | 1 | 0.1% |
2.5 | 2 | 0.2% |
3 | 6 | 0.6% |
3.1 | 1 | 0.1% |
3.4 | 3 | 0.3% |
3.6 | 5 | 0.5% |

Value | Count | Frequency (%) |
---|---|---|
32.2 | 5 | 0.5% |
28.2 | 5 | 0.5% |
23.4 | 5 | 0.5% |
22.1 | 1 | 0.1% |
22 | 6 | 0.6% |
20.8 | 1 | 0.1% |
20 | 1 | 0.1% |
19 | 1 | 0.1% |
18.8 | 1 | 0.1% |
18.6 | 5 | 0.5% |
Coarse Aggregate (component 6)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
Distinct | 284 |
---|---|
Distinct (%) | 27.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 972.918932 |
Minimum | 801 |
---|---|
Maximum | 1145 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 801 |
---|---|
5-th percentile | 842 |
Q1 | 932 |
median | 968 |
Q3 | 1029.4 |
95-th percentile | 1104 |
Maximum | 1145 |
Range | 344 |
Interquartile range (IQR) | 97.4 |
Descriptive statistics
Standard deviation | 77.75395397 |
---|---|
Coefficient of variation (CV) | 0.07991822485 |
Kurtosis | -0.5990161032 |
Mean | 972.918932 |
Median Absolute Deviation (MAD) | 46.3 |
Skewness | -0.04021974481 |
Sum | 1002106.5 |
Variance | 6045.677357 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
932 | 57 | 5.5% |
852.1 | 45 | 4.4% |
944.7 | 30 | 2.9% |
968 | 29 | 2.8% |
1125 | 24 | 2.3% |
967 | 19 | 1.8% |
1047 | 19 | 1.8% |
942 | 12 | 1.2% |
822 | 12 | 1.2% |
974 | 12 | 1.2% |
Other values (274) | 771 | 74.9% |

Value | Count | Frequency (%) |
---|---|---|
801 | 4 | 0.4% |
801.1 | 1 | 0.1% |
801.4 | 1 | 0.1% |
811 | 2 | 0.2% |
814 | 1 | 0.1% |
814.1 | 1 | 0.1% |
817.9 | 1 | 0.1% |
818 | 1 | 0.1% |
819 | 2 | 0.2% |
819.2 | 1 | 0.1% |

Value | Count | Frequency (%) |
---|---|---|
1145 | 1 | 0.1% |
1134.3 | 5 | 0.5% |
1130 | 1 | 0.1% |
1125 | 24 | 2.3% |
1124.4 | 2 | 0.2% |
1120 | 2 | 0.2% |
1119 | 2 | 0.2% |
1118.8 | 2 | 0.2% |
1118 | 1 | 0.1% |
1113 | 2 | 0.2% |
Fine Aggregate (component 7)(kg in a m^3 mixture)
Real number (ℝ≥0)
HIGH CORRELATION
Distinct | 302 |
---|---|
Distinct (%) | 29.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 773.5804854 |
Minimum | 594 |
---|---|
Maximum | 992.6 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 594 |
---|---|
5-th percentile | 613 |
Q1 | 730.95 |
median | 779.5 |
Q3 | 824 |
95-th percentile | 898.09 |
Maximum | 992.6 |
Range | 398.6 |
Interquartile range (IQR) | 93.05 |
Descriptive statistics
Standard deviation | 80.17598014 |
---|---|
Coefficient of variation (CV) | 0.1036427129 |
Kurtosis | -0.1021769893 |
Mean | 773.5804854 |
Median Absolute Deviation (MAD) | 45.5 |
Skewness | -0.2530095977 |
Sum | 796787.9 |
Variance | 6428.187792 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
755.8 | 30 | 2.9% |
594 | 30 | 2.9% |
670 | 23 | 2.2% |
613 | 22 | 2.1% |
801 | 16 | 1.6% |
746.6 | 15 | 1.5% |
887.1 | 15 | 1.5% |
845 | 14 | 1.4% |
712 | 14 | 1.4% |
750 | 12 | 1.2% |
Other values (292) | 839 | 81.5% |

Value | Count | Frequency (%) |
---|---|---|
594 | 30 | 2.9% |
605 | 5 | 0.5% |
611.8 | 5 | 0.5% |
612 | 1 | 0.1% |
613 | 22 | 2.1% |
613.2 | 2 | 0.2% |
614 | 1 | 0.1% |
623 | 2 | 0.2% |
630 | 5 | 0.5% |
631 | 4 | 0.4% |

Value | Count | Frequency (%) |
---|---|---|
992.6 | 5 | 0.5% |
945 | 4 | 0.4% |
943.1 | 4 | 0.4% |
942 | 4 | 0.4% |
925.7 | 5 | 0.5% |
905.9 | 5 | 0.5% |
903.8 | 5 | 0.5% |
903.6 | 5 | 0.5% |
901.8 | 5 | 0.5% |
900.9 | 5 | 0.5% |
Age (day)
Real number (ℝ≥0)
HIGH CORRELATION
Distinct | 14 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 45.66213592 |
Minimum | 1 |
---|---|
Maximum | 365 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 3 |
Q1 | 7 |
median | 28 |
Q3 | 56 |
95-th percentile | 180 |
Maximum | 365 |
Range | 364 |
Interquartile range (IQR) | 49 |
Descriptive statistics
Standard deviation | 63.16991158 |
---|---|
Coefficient of variation (CV) | 1.383419989 |
Kurtosis | 12.16898898 |
Mean | 45.66213592 |
Median Absolute Deviation (MAD) | 21 |
Skewness | 3.269177401 |
Sum | 47032 |
Variance | 3990.437729 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%) |
---|---|---|
28 | 425 | 41.3% |
3 | 134 | 13.0% |
7 | 126 | 12.2% |
56 | 91 | 8.8% |
14 | 62 | 6.0% |
90 | 54 | 5.2% |
100 | 52 | 5.0% |
180 | 26 | 2.5% |
91 | 22 | 2.1% |
365 | 14 | 1.4% |
Other values (4) | 24 | 2.3% |

Value | Count | Frequency (%) |
---|---|---|
1 | 2 | 0.2% |
3 | 134 | 13.0% |
7 | 126 | 12.2% |
14 | 62 | 6.0% |
28 | 425 | 41.3% |
56 | 91 | 8.8% |
90 | 54 | 5.2% |
91 | 22 | 2.1% |
100 | 52 | 5.0% |
120 | 3 | 0.3% |

Value | Count | Frequency (%) |
---|---|---|
365 | 14 | 1.4% |
360 | 6 | 0.6% |
270 | 13 | 1.3% |
180 | 26 | 2.5% |
120 | 3 | 0.3% |
100 | 52 | 5.0% |
91 | 22 | 2.1% |
90 | 54 | 5.2% |
56 | 91 | 8.8% |
28 | 425 | 41.3% |
Concrete compressive strength(MPa. megapascals)
Real number (ℝ≥0)
HIGH CORRELATION
Distinct | 845 |
---|---|
Distinct (%) | 82.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 35.81796117 |
Minimum | 2.33 |
---|---|
Maximum | 82.6 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 8.2 KiB |
Quantile statistics
Minimum | 2.33 |
---|---|
5-th percentile | 10.961 |
Q1 | 23.71 |
median | 34.445 |
Q3 | 46.135 |
95-th percentile | 66.802 |
Maximum | 82.6 |
Range | 80.27 |
Interquartile range (IQR) | 22.425 |
Descriptive statistics
Standard deviation | 16.70574196 |
---|---|
Coefficient of variation (CV) | 0.4664068366 |
Kurtosis | -0.3137248604 |
Mean | 35.81796117 |
Median Absolute Deviation (MAD) | 10.93 |
Skewness | 0.4169772884 |
Sum | 36892.5 |
Variance | 279.0818145 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
---|---|---|
33.4 | 6 | 0.6% |
71.3 | 4 | 0.4% |
41.05 | 4 | 0.4% |
31.35 | 4 | 0.4% |
23.52 | 4 | 0.4% |
77.3 | 4 | 0.4% |
79.3 | 4 | 0.4% |
35.3 | 4 | 0.4% |
65.2 | 3 | 0.3% |
18.13 | 3 | 0.3% |
Other values (835) | 990 | 96.1% |

Value | Count | Frequency (%) |
---|---|---|
2.33 | 1 | 0.1% |
3.32 | 1 | 0.1% |
4.57 | 1 | 0.1% |
4.78 | 1 | 0.1% |
4.83 | 1 | 0.1% |
4.9 | 1 | 0.1% |
6.27 | 1 | 0.1% |
6.28 | 1 | 0.1% |
6.47 | 1 | 0.1% |
6.81 | 1 | 0.1% |

Value | Count | Frequency (%) |
---|---|---|
82.6 | 1 | 0.1% |
81.75 | 1 | 0.1% |
80.2 | 1 | 0.1% |
79.99 | 1 | 0.1% |
79.4 | 1 | 0.1% |
79.3 | 4 | 0.4% |
78.8 | 1 | 0.1% |
77.3 | 4 | 0.4% |
76.8 | 1 | 0.1% |
76.24 | 1 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
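The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation on toy data, not the code pandas-profiling runs; the function name `pearson_r` is an assumption. The 1/(n-1) factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4]                      # toy data
r_up = pearson_r(x, [2, 4, 6, 8])     # perfectly increasing line
r_down = pearson_r(x, [8, 6, 4, 2])   # perfectly decreasing line
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_shifted = pearson_r(x, [3 * v + 7 for v in x])

print(r_up, r_down, r_shifted)
```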
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
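As a sketch of that recipe, the snippet below ranks both variables (ties get the average of their positions, a common convention assumed here) and applies the Pearson formula to the ranks. Function names and the toy data are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]      # monotonic but not linear
rho = spearman_rho(x, y)
r = pearson_r(x, y)
tied_ranks = average_ranks([10, 20, 20, 30])

print(rho, r, tied_ranks)
```

On this toy input ρ is 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.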
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
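The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch on toy data follows; the function name `kendall_tau_a` is illustrative, and statistical libraries often use tie-corrected variants such as tau-b instead, so results can differ when the data contains ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 is a tie and counts as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one swapped pair out of six gives (5 - 1) / 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])

print(tau, tau_reversed)
```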
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
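Both nullity views can be approximated in text form: a per-column count of missing cells, and a row-by-row matrix that marks each cell as present or missing. The sketch below uses toy rows with `None` for missing cells (the profiled dataset itself reports 0 missing cells); the function names and the `#`/`.` markers are illustrative choices.

```python
def missing_by_column(rows):
    """Number of missing (None) cells per column."""
    columns = list(rows[0].keys())
    return {c: sum(1 for r in rows if r.get(c) is None) for c in columns}


def nullity_matrix(rows):
    """One string per row: '#' for a present cell, '.' for a missing one."""
    columns = list(rows[0].keys())
    return ["".join("." if r.get(c) is None else "#" for c in columns) for r in rows]


# Toy rows with artificial gaps, not taken from the profiled dataset.
toy_rows = [
    {"cement": 540.0, "water": 162.0, "age": 28},
    {"cement": None, "water": 228.0, "age": 270},
    {"cement": 332.5, "water": None, "age": None},
]

counts = missing_by_column(toy_rows)
matrix = nullity_matrix(toy_rows)

print(counts)
print("\n".join(matrix))
```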
First rows
Index | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa. megapascals) |
---|---|---|---|---|---|---|---|---|---|
0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28.0 | 79.99 |
1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28.0 | 61.89 |
2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270.0 | 40.27 |
3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365.0 | 41.05 |
4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360.0 | 44.30 |
5 | 266.0 | 114.0 | 0.0 | 228.0 | 0.0 | 932.0 | 670.0 | 90.0 | 47.03 |
6 | 380.0 | 95.0 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365.0 | 43.70 |
7 | 380.0 | 95.0 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 28.0 | 36.45 |
8 | 266.0 | 114.0 | 0.0 | 228.0 | 0.0 | 932.0 | 670.0 | 28.0 | 45.85 |
9 | 475.0 | 0.0 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 28.0 | 39.29 |
Last rows
Index | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa. megapascals) |
---|---|---|---|---|---|---|---|---|---|
1020 | 288.4 | 121.0 | 0.0 | 177.4 | 7.0 | 907.9 | 829.5 | 28.0 | 42.14 |
1021 | 298.2 | 0.0 | 107.0 | 209.7 | 11.1 | 879.6 | 744.2 | 28.0 | 31.88 |
1022 | 264.5 | 111.0 | 86.5 | 195.5 | 5.9 | 832.6 | 790.4 | 28.0 | 41.54 |
1023 | 159.8 | 250.0 | 0.0 | 168.4 | 12.2 | 1049.3 | 688.2 | 28.0 | 39.46 |
1024 | 166.0 | 259.7 | 0.0 | 183.2 | 12.7 | 858.8 | 826.8 | 28.0 | 37.92 |
1025 | 276.4 | 116.0 | 90.3 | 179.6 | 8.9 | 870.1 | 768.3 | 28.0 | 44.28 |
1026 | 322.2 | 0.0 | 115.6 | 196.0 | 10.4 | 817.9 | 813.4 | 28.0 | 31.18 |
1027 | 148.5 | 139.4 | 108.6 | 192.7 | 6.1 | 892.4 | 780.0 | 28.0 | 23.70 |
1028 | 159.1 | 186.7 | 0.0 | 175.6 | 11.3 | 989.6 | 788.9 | 28.0 | 32.77 |
1029 | 260.9 | 100.5 | 78.3 | 200.6 | 8.6 | 864.5 | 761.5 | 28.0 | 32.40 |
Most frequently occurring
Index | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa. megapascals) | # duplicates |
---|---|---|---|---|---|---|---|---|---|---|
1 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 3.0 | 35.30 | 4 |
3 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 28.0 | 71.30 | 4 |
4 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 56.0 | 77.30 | 4 |
5 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 91.0 | 79.30 | 4 |
2 | 362.6 | 189.0 | 0.0 | 164.9 | 11.6 | 944.7 | 755.8 | 7.0 | 55.90 | 3 |
6 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 3.0 | 33.40 | 3 |
7 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 7.0 | 49.20 | 3 |
8 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 28.0 | 60.29 | 3 |
9 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 56.0 | 64.30 | 3 |
10 | 425.0 | 106.3 | 0.0 | 153.5 | 16.5 | 852.1 | 887.1 | 91.0 | 65.20 | 3 |
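The table above lists one line per repeated row pattern together with how often it occurs. A minimal sketch of that grouping with `collections.Counter`, on toy rows; the function name `duplicate_summary` is illustrative. It returns two counts, because "number of duplicates" can mean either the number of distinct repeated patterns or the number of redundant rows. The index values in the table run up to 10, which is consistent with the reported 11 referring to distinct patterns, though the report does not say so explicitly.

```python
from collections import Counter


def duplicate_summary(rows):
    """Group identical rows; return (pattern -> count, redundant row count)."""
    counts = Counter(tuple(r) for r in rows)
    groups = {row: c for row, c in counts.items() if c > 1}
    # Rows that would disappear if only one copy of each pattern were kept.
    redundant = sum(c - 1 for c in groups.values())
    return groups, redundant


# Toy rows (age, strength), not the full nine-column records.
a = (3.0, 35.30)
b = (28.0, 71.30)
c = (7.0, 55.90)
toy_rows = [a, a, a, b, b, c]

groups, redundant = duplicate_summary(toy_rows)
n_patterns = len(groups)

# Most frequent patterns first, as in the table above.
for row, count in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, count)
print(n_patterns, redundant)
```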