Newbie question: How to perform data cleaning in a SQL database at a company and then visualize with Power BI


The data is stored in a live database (e.g. MySQL) and needs to be cleaned of duplicates, missing (NULL/NaN) values, outliers, and so on before it can be used in Power BI for visualization.
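To make the three problems concrete, here is a minimal sketch that only counts them without changing any data. It uses Python's built-in `sqlite3` with an in-memory database as a stand-in for MySQL. The `sales` table, its columns, and the 0 to 1000 "valid range" are hypothetical, chosen purely for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO sales VALUES
  (1, 'alice', 10.0),
  (2, 'alice', 10.0),   -- duplicate of row 1
  (3, 'bob',   NULL),   -- missing value
  (4, 'carol', 12.0),
  (5, 'dave',  9999.0); -- suspicious outlier
""")

def profile(conn):
    """Count rows, missing amounts, duplicate rows and out-of-range amounts."""
    cur = conn.cursor()
    rows = cur.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    nulls = cur.execute(
        "SELECT COUNT(*) FROM sales WHERE amount IS NULL"
    ).fetchone()[0]
    # Extra copies beyond the first in each (customer, amount) group.
    dupes = cur.execute("""
        SELECT COALESCE(SUM(n - 1), 0) FROM (
            SELECT COUNT(*) AS n FROM sales
            GROUP BY customer, amount HAVING COUNT(*) > 1
        )
    """).fetchone()[0]
    # Hypothetical business rule: amounts should lie between 0 and 1000.
    out_of_range = cur.execute(
        "SELECT COUNT(*) FROM sales WHERE amount NOT BETWEEN 0 AND 1000"
    ).fetchone()[0]
    return {"rows": rows, "null_amount": nulls,
            "duplicate_rows": dupes, "out_of_range": out_of_range}

report = profile(conn)
print(report)
```

Read-only checks like these are safe to run against a live database, and the counts show how much cleaning is actually needed before any rows are changed.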

What are the best practices for an entry-level data analyst tasked with cleaning the data and then providing visualizations?

As far as I know, the best practice is to copy a small sample of the production data into a test set and perform all the EDA, cleaning, and Power BI visualization on that sample. Then apply the tested queries to production, and finally connect the production database to Power BI.
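A minimal sketch of that workflow, again using in-memory SQLite in place of MySQL. The names `orders_raw`, `orders_test` and `orders_clean`, the sample size, and the 0 to 1000 range rule are all hypothetical. The idea shown is to leave the raw table untouched and put the cleaning logic in a view, which is the object a reporting tool would then read from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the production server
conn.executescript("""
CREATE TABLE orders_raw (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO orders_raw VALUES
  (1, 'alice', 10.0),
  (2, 'alice', 10.0),
  (3, 'bob',   NULL),
  (4, 'carol', 12.0),
  (5, 'dave',  9999.0);

-- Step 1: copy a small sample into a separate test table.
CREATE TABLE orders_test AS SELECT * FROM orders_raw LIMIT 1000;

-- Step 2: express the cleaning rules as a view over the sample.
-- Rows with a missing or out-of-range amount are filtered out, and
-- duplicates collapse to the row with the smallest id.
CREATE VIEW orders_clean AS
SELECT MIN(id) AS id, customer, amount
FROM orders_test
WHERE amount IS NOT NULL
  AND amount BETWEEN 0 AND 1000
GROUP BY customer, amount;
""")

# Step 3: check the cleaned result and confirm the raw data is unchanged.
rows = conn.execute(
    "SELECT id, customer, amount FROM orders_clean ORDER BY id"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM orders_raw").fetchone()[0]
print(rows)
print(raw_count)
```

Once the view gives the expected results on the sample, the same definition could be recreated over the production table. Because nothing is deleted or updated, a mistake in the cleaning rules can be fixed by redefining the view rather than by restoring data.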

Any help and suggestions from experienced professionals? Thanks in advance ☺️

I would like to know how data cleaning is done on live databases, how the cleaning queries can be automated, and what the industry best practices are.
