Analysis of compressive strength of concrete.

The compressive strength of concrete is a highly non-linear function of multiple parameters, such as ageing time and cement concentration. I demonstrate how to use scikit-learn's Pipeline to build a simple auto-ML workflow and compare the performance of multiple regressors.
Note: a full profile of the dataset is available here and was created by using the pandas_profiling project.
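
To give a flavour of the approach, here is a minimal sketch of wrapping several regressors in a scikit-learn Pipeline and ranking them by cross-validated R². It is not the code from the analysis itself: the synthetic data from make_friedman1 stands in for the concrete dataset, and the helper name compare_regressors, the two candidate models, and their settings are my own illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic non-linear regression data (assumption: a stand-in for the
# concrete dataset, which is not loaded here).
X, y = make_friedman1(n_samples=300, noise=1.0, random_state=0)

# Candidate regressors to compare (illustrative choices only).
candidates = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}


def compare_regressors(X, y, candidates, cv=5):
    """Return the mean cross-validated R^2 for each candidate regressor."""
    scores = {}
    for name, model in candidates.items():
        # Scaling lives inside the pipeline, so it is re-fitted on each
        # training fold and never sees the held-out fold.
        pipe = Pipeline([("scale", StandardScaler()), ("model", model)])
        scores[name] = cross_val_score(pipe, X, y, cv=cv, scoring="r2").mean()
    return scores


scores = compare_regressors(X, y, candidates)
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

Adding another model is then a one-line change to the candidates dictionary, which is what makes the pipeline approach convenient for quick comparisons.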

Learn more

Effect size on age-standardized data of prevalence of diabetes.

We often rely too much on the p-value of hypothesis testing. I demonstrate a case where the p-value is misleading and what to watch out for in your data. Here is a teaser, built with Plotly, of how diabetes prevalence evolved over time.
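
One common way a p-value can mislead is with very large samples, where a negligible difference still comes out as "significant". The sketch below illustrates this on synthetic data. The numbers are made up and are not the diabetes prevalence data, and the helper cohens_d is my own illustrative function. It may or may not be the exact situation covered in the full analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two synthetic groups whose true means differ by only 0.03 standard
# deviations (hypothetical values, chosen purely for illustration).
group_a = rng.normal(loc=0.00, scale=1.0, size=200_000)
group_b = rng.normal(loc=0.03, scale=1.0, size=200_000)


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)


# Two-sample t-test: with this many observations even a tiny shift is detected.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
d = cohens_d(group_a, group_b)

print(f"p-value: {p_value:.2e}, Cohen's d: {d:.3f}")
```

The test rejects the null hypothesis comfortably, yet the effect size is far below what is conventionally called even a "small" effect (d around 0.2), so reporting the p-value alone would overstate the finding.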

Learn more

Diabetes patient record: time series analysis and forecasting.

This is a time series analysis of recorded blood glucose (BG) measurements of a diabetes patient. I develop a simple model with gradient boosted trees, in which the history of BG measurements is used to predict the next measurement. The model is then compared with the predictions of Facebook's Prophet package. Disclaimer: the analysis is not intended to give medical advice!
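
The idea of predicting the next value from its own history can be sketched with lag features and scikit-learn's GradientBoostingRegressor. This is only an illustration under stated assumptions: the series is synthetic rather than real patient data, the helper make_lag_features and the choice of six lags are hypothetical, and the model used in the full analysis may be set up differently. The Prophet comparison is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def make_lag_features(series, n_lags):
    """Turn a 1-D series into (X, y), where each row of X holds the
    n_lags previous values and y is the value that follows them."""
    series = np.asarray(series, dtype=float)
    X = np.column_stack(
        [series[i : len(series) - n_lags + i] for i in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y


# Synthetic stand-in for a BG series: a daily-like cycle plus noise
# (hypothetical values, not patient measurements).
rng = np.random.default_rng(0)
t = np.arange(500)
series = 120 + 30 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 5, size=t.size)

X, y = make_lag_features(series, n_lags=6)

# Chronological split: train on the first 80%, test on the rest, so the
# model is never evaluated on points that precede its training data.
split = int(0.8 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

mae = np.mean(np.abs(pred - y_test))
print(f"one-step-ahead MAE: {mae:.2f}")
```

Keeping the split chronological rather than shuffled matters for time series, since a random split would leak future information into training.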

Learn more