Patient Condition Analysis and Prediction using Streamlit

Understanding the dataset:

The dataset behind the Patient Condition Analysis Dashboard and Prediction app comes from Kaggle. It contains patients’ test results (the patient’s condition) and the factors that influence them. It has 10,000 records and 14 attributes essential for comprehensive patient analysis, including age, gender, blood type, medical condition, admission type, medication, BMI and date of admission.

Deciphering Medical Insights:

Analyzing test results is pivotal in healthcare as it provides critical insights into a patient’s condition, guiding diagnosis, and treatment decisions. By scrutinizing test results, healthcare professionals can identify anomalies, track disease progression, and adjust treatment plans accordingly. Moreover, it facilitates early detection of potential complications, enabling proactive intervention to improve patient outcomes and prevent adverse events.

Using Python and Streamlit to develop the App:

Harnessing the power of Python, we can unravel the intricacies of the dataset and glean actionable insights. Below are key snippets illustrating the analytical process:

Importing required libraries:

Feature Engineering:

Augmenting the dataset with additional variables to facilitate in-depth analysis and model building.

Converting the ‘date_of_admission’ column to datetime format and extracting the year from it into a new column named ‘admit_year’. This column lets us compare patterns from year to year.
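As a minimal sketch of this step, assuming the data is loaded into a pandas DataFrame, the conversion could look like the following. The three rows are made-up stand-ins for the Kaggle data; only the column names date_of_admission and admit_year come from the text above.

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are illustrative only).
df = pd.DataFrame(
    {"date_of_admission": ["2021-03-14", "2022-11-02", "2023-01-30"]}
)

# Parse the strings into datetime values; errors="coerce" turns
# unparseable entries into NaT instead of raising an exception.
df["date_of_admission"] = pd.to_datetime(df["date_of_admission"], errors="coerce")

# Extract the calendar year into the new admit_year column.
df["admit_year"] = df["date_of_admission"].dt.year

print(df)
```

With admit_year in place, yearly trends can be computed with an ordinary groupby on that column.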

Creating Interactive Visuals and KPIs:

Utilizing Streamlit components and Plotly to generate interactive visualizations for a deeper understanding of the dataset.
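One possible way to wire KPIs and a Plotly chart into a Streamlit page is sketched below. The column names age, bmi and test_results, the choice of KPIs and the chart type are assumptions made for illustration and are not taken from the app itself. The calculations sit in plain functions so they can be checked without starting Streamlit.

```python
import pandas as pd


def compute_kpis(df: pd.DataFrame) -> dict:
    """Return the headline numbers for the KPI cards (assumed KPIs)."""
    return {
        "Total patients": len(df),
        "Average age": round(float(df["age"].mean()), 1),
        "Average BMI": round(float(df["bmi"].mean()), 1),
    }


def results_per_year(df: pd.DataFrame) -> pd.DataFrame:
    """Count records per admit_year and test_results for a grouped bar chart."""
    return (
        df.groupby(["admit_year", "test_results"])
        .size()
        .reset_index(name="patients")
    )


def render_dashboard(df: pd.DataFrame) -> None:
    """Draw the KPI cards and the chart inside a Streamlit app."""
    # Imported here so the helpers above stay usable without Streamlit/Plotly.
    import plotly.express as px
    import streamlit as st

    st.title("Patient Condition Analysis")

    # One st.metric card per KPI, laid out side by side.
    kpis = compute_kpis(df)
    for column, (label, value) in zip(st.columns(len(kpis)), kpis.items()):
        column.metric(label, value)

    # Interactive grouped bar chart of test results per admission year.
    fig = px.bar(
        results_per_year(df),
        x="admit_year",
        y="patients",
        color="test_results",
        barmode="group",
    )
    st.plotly_chart(fig)


# Small made-up sample to exercise the helpers.
sample = pd.DataFrame(
    {
        "age": [30, 50, 70],
        "bmi": [22.0, 27.5, 31.0],
        "admit_year": [2022, 2022, 2023],
        "test_results": ["Normal", "Abnormal", "Normal"],
    }
)
print(compute_kpis(sample))
print(results_per_year(sample))
```

In an app script, calling render_dashboard(df) and launching the file with streamlit run would display the cards and the chart.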

Encoding the features:

Transforming categorical variables into numerical format to prepare the data for modelling.
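The text does not say which encoder the app uses, so the sketch below assumes scikit-learn's LabelEncoder applied column by column. The column names and values are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up rows; the column names are assumptions.
df = pd.DataFrame(
    {
        "gender": ["Female", "Male", "Female"],
        "blood_type": ["O+", "A-", "O+"],
        "age": [34, 61, 47],
    }
)

# Keep one fitted encoder per column so the same mapping can be reused
# later to encode new inputs or to decode values back to labels.
encoders = {}
for column in ["gender", "blood_type"]:
    encoder = LabelEncoder()
    df[column] = encoder.fit_transform(df[column])
    encoders[column] = encoder

print(df)
print(list(encoders["gender"].classes_))
```

LabelEncoder assigns integer codes in sorted label order. Tree-based models such as Random Forest and LightGBM can work with such codes, although scikit-learn's OrdinalEncoder is the variant intended for input features and would be a reasonable alternative.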

Removing Unnecessary Variables:

Streamlining the dataset by eliminating redundant or irrelevant variables.

For model building we use only 7 features – age, gender, blood type, medical condition, admission type, medication and BMI.
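Selecting the seven features could be sketched as follows. The snake_case column names, the target name test_results and the two extra columns (name, room_number) are hypothetical and only serve to show the drop.

```python
import pandas as pd

# The seven model features named above, under assumed column names.
FEATURES = [
    "age", "gender", "blood_type", "medical_condition",
    "admission_type", "medication", "bmi",
]
TARGET = "test_results"  # assumed name of the target column

df = pd.DataFrame(
    {
        "name": ["Patient A", "Patient B"],  # hypothetical extra column
        "room_number": [101, 202],           # hypothetical extra column
        "age": [34, 61],
        "gender": [0, 1],
        "blood_type": [1, 0],
        "medical_condition": [2, 4],
        "admission_type": [0, 1],
        "medication": [3, 1],
        "bmi": [23.4, 29.1],
        "test_results": [0, 2],
    }
)

# Drop every column that is neither a model feature nor the target.
unused = [c for c in df.columns if c not in FEATURES + [TARGET]]
model_df = df.drop(columns=unused)

X = model_df[FEATURES]  # feature matrix
y = model_df[TARGET]    # target vector

print(unused)
print(X.shape)
```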

Building Predictive Models – Leveraging Random Forest and LightGBM:

Harnessing the power of machine learning, we can develop predictive models to anticipate patient outcomes based on their characteristics. Here, we’ll build two models – LightGBM and Random Forest – to predict patient conditions. These models utilize historical data to forecast potential medical outcomes, enabling healthcare professionals to make informed decisions and tailor treatment plans accordingly.

Note: The models, visuals and approach shown here illustrate what the dashboard can do. Accuracy and other metrics can be improved further, and the model made more robust, by adding more data and expanding the feature set as it becomes available.

Now, let’s take a closer look at the steps involved.

Train (fit) the model on the training data using the fit() method.

After fitting the model, evaluate its performance on the test data. One common metric is accuracy, which measures the proportion of correctly predicted instances.
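The two steps above can be sketched with scikit-learn as follows. The data here is randomly generated and already encoded, so the accuracy will sit near chance level; the split ratio, the hyperparameters and the three target classes are assumptions rather than the app's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic, already-encoded stand-in for the seven model features.
rng = np.random.default_rng(42)
n = 300
X = pd.DataFrame(
    {
        "age": rng.integers(18, 90, n),
        "gender": rng.integers(0, 2, n),
        "blood_type": rng.integers(0, 8, n),
        "medical_condition": rng.integers(0, 6, n),
        "admission_type": rng.integers(0, 3, n),
        "medication": rng.integers(0, 5, n),
        "bmi": rng.uniform(16, 40, n).round(1),
    }
)
y = rng.integers(0, 3, n)  # three hypothetical test-result classes

# Hold out 20% of the rows for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)            # train on the training split
    predictions = model.predict(X_test)    # predict the held-out rows
    scores[name] = accuracy_score(y_test, predictions)
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

LightGBM's LGBMClassifier follows the same fit and predict interface, so it can be added to the models dictionary and evaluated by the same loop.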

Random Forest:

With a limited dataset, achieving high accuracy numbers can be challenging. However, by gathering more diverse data and adhering to the methods outlined in the blog, improved accuracy is attainable.


Age has the highest feature importance score, which indicates that, according to the model, it is the most influential factor in predicting the target variable. BMI (Body Mass Index) is the second most important feature. Gender has the lowest score, suggesting that it has the least influence on the model’s predictions. The same reading applies to the remaining features: the higher the score, the greater the feature’s influence on the prediction.
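For reference, a ranking like this can be read from a fitted Random Forest through its feature_importances_ attribute. The sketch below uses random stand-in data, so the ordering it prints is not the one reported above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Random, already-encoded stand-in data (column names are assumptions).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame(
    {
        "age": rng.integers(18, 90, n),
        "gender": rng.integers(0, 2, n),
        "blood_type": rng.integers(0, 8, n),
        "bmi": rng.uniform(16, 40, n).round(1),
    }
)
y = rng.integers(0, 3, n)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Pair each impurity-based importance score with its column name
# and sort from most to least influential.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

print(importances)
```

The scores are normalized to sum to 1. Impurity-based importances tend to favor features with many distinct values, which may partly explain why numeric columns such as age and BMI rank above a two-valued column such as gender.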



Prediction using the models:

Prediction with the trained models relies on ensemble learning: both Random Forest and LightGBM combine many decision trees. Users enter relevant patient information, such as age, gender and medical condition, and the models predict the likely outcome, such as the severity of the medical condition or the likelihood of a particular test result. Either model can be used to produce the prediction.
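A single-patient prediction could be sketched as below. The helper names predict_single and render_form, the feature column names and the tiny training set are all hypothetical; a real form would more likely offer select boxes with readable labels and encode them before predicting.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = [
    "age", "gender", "blood_type", "medical_condition",
    "admission_type", "medication", "bmi",
]


def predict_single(model, record: dict) -> int:
    """Predict the encoded class for one patient from already-encoded inputs."""
    row = pd.DataFrame([record])[FEATURES]  # enforce the training column order
    return int(model.predict(row)[0])


def render_form(model) -> None:
    """Collect one value per feature in a Streamlit form and show the result."""
    import streamlit as st  # imported here so predict_single works without it

    with st.form("patient_form"):
        record = {name: st.number_input(name, value=0.0) for name in FEATURES}
        submitted = st.form_submit_button("Predict")
    if submitted:
        st.write("Predicted class:", predict_single(model, record))


# Tiny made-up training set, just enough to obtain a fitted model.
train = pd.DataFrame(
    [
        [25, 0, 1, 2, 0, 1, 21.5],
        [67, 1, 3, 0, 2, 4, 30.2],
        [45, 0, 5, 1, 1, 2, 26.8],
        [80, 1, 7, 3, 2, 0, 33.1],
    ],
    columns=FEATURES,
)
labels = [0, 1, 0, 1]
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(train, labels)

patient = dict(zip(FEATURES, [70, 1, 3, 0, 2, 4, 31.0]))
print(predict_single(model, patient))
```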

The app currently displays the predicted output from the Random Forest model.

Batch Prediction:

By leveraging the predictive models, we can predict outcomes for all records in the dataset. This comprehensive analysis provides valuable insights into patient conditions and helps healthcare providers prioritize interventions and allocate resources efficiently.

1) Upload a .csv file in the format below.

2) Choose any model from the “select a model” radio button to predict the outcomes of all the records.

Choose a model to get the predicted output:
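The upload-and-predict flow described in the two steps above might be sketched like this. The function names predict_batch and render_batch_page, the expected CSV columns and the output column predicted_result are assumptions, and the CSV is expected to hold already-encoded values.

```python
import io

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = [
    "age", "gender", "blood_type", "medical_condition",
    "admission_type", "medication", "bmi",
]


def predict_batch(model, csv_file) -> pd.DataFrame:
    """Read a CSV (path or file-like object) and append a prediction column."""
    df = pd.read_csv(csv_file)
    missing = [c for c in FEATURES if c not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    result = df.copy()
    result["predicted_result"] = model.predict(df[FEATURES])
    return result


def render_batch_page(models: dict) -> None:
    """Streamlit page: upload a CSV, pick a model, show predictions."""
    import streamlit as st  # imported here so predict_batch works without it

    uploaded = st.file_uploader("Upload a .csv file", type="csv")
    choice = st.radio("Select a model", list(models))
    if uploaded is not None:
        st.dataframe(predict_batch(models[choice], uploaded))


# Tiny made-up training set and an in-memory CSV to exercise the helper.
train = pd.DataFrame(
    [
        [25, 0, 1, 2, 0, 1, 21.5],
        [67, 1, 3, 0, 2, 4, 30.2],
        [45, 0, 5, 1, 1, 2, 26.8],
        [80, 1, 7, 3, 2, 0, 33.1],
    ],
    columns=FEATURES,
)
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(train, [0, 1, 0, 1])

csv_text = (
    "age,gender,blood_type,medical_condition,admission_type,medication,bmi\n"
    "30,0,1,2,0,1,22.0\n"
    "75,1,3,0,2,4,31.5\n"
)
result = predict_batch(model, io.StringIO(csv_text))
print(result)
```

Validating the columns before predicting gives the user a readable error when the uploaded file does not match the expected format.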


The Patient Condition Analysis Dashboard and Prediction system empowers practitioners with data-driven insights and predictive capabilities. By incorporating additional features and a larger dataset, and by combining advanced analytics techniques, it can support more personalized care and improve patient outcomes. As healthcare continues to evolve, leveraging technology to extract actionable insights from vast datasets becomes imperative, ushering in a future where proactive and personalized healthcare is the norm.

Visit the Streamlit App: Patient Condition Analysis and Prediction App to explore these features firsthand and gain a deeper understanding.

You can find the app code in the mentioned URL:

Please feel free to get in touch with us regarding your Streamlit solution needs. Our Streamlit solutions encompass a range of services and tools designed to streamline your data visualization, quick prototyping and dashboard development processes.
