Become a Data Analyst with ChatGPT: A Comprehensive Guide

Discover how to clean, interpret, and analyze your data within ChatGPT in about 10 minutes, using the Advanced Data Analysis feature and custom personas.

July 17, 2024


Unlock the power of ChatGPT to become a data analyst in just 10 minutes! Discover how to clean, interpret, and analyze your data sets using the advanced features of ChatGPT. This guide will show you the step-by-step process to uncover valuable insights and make data-driven decisions, without the need for expensive software or extensive training.

Activate the Advanced Data Analysis Feature in ChatGPT

To activate the Advanced Data Analysis feature in ChatGPT, follow these steps:

  1. Open the ChatGPT sidebar by clicking on the menu icon in the top-left corner.
  2. Scroll down and click "Settings".
  3. Navigate to the "Beta Features" tab.
  4. Locate the "Advanced Data Analysis" feature and toggle it on.
  5. Close the sidebar. The "Advanced Data Analysis" option is now available in the ChatGPT interface.

With this feature enabled, you can now attach files to ChatGPT and perform advanced data analysis tasks, such as data cleaning, exploratory data analysis, and feature engineering.

Activate Custom Instructions for Better Responses

To activate custom instructions in ChatGPT, follow these steps:

  1. Open the sidebar in ChatGPT and navigate to the "Settings" section.
  2. Click on the "Custom Instructions" tab.
  3. In the first box, provide information about yourself or the task you want ChatGPT to assist with. This could include your role, expertise, or the specific problem you're trying to solve.
  4. In the second box, specify how you would like ChatGPT to respond, such as the tone, level of detail, or any particular formatting you prefer.
  5. Click "Save" to apply the custom instructions.

With these custom instructions in place, ChatGPT will tailor its responses to your preferences, providing more relevant and helpful information to assist you with your data analysis tasks.

Upload and Clean the Data Set

To begin, make sure the Advanced Data Analysis feature is active in ChatGPT with GPT-4. Open the sidebar, navigate to Settings, click on the "Beta Features" tab, and enable the "Plugins" and "Advanced Data Analysis" features.

Next, we'll want to activate custom instructions to provide ChatGPT with more context about our data analysis goals. You can create and save custom personas, such as a "Data Scientist" profile, to ensure ChatGPT responds accordingly.

Now, we can upload our data set to ChatGPT. ChatGPT supports a wide range of file formats, including text files, spreadsheets, PDFs, and more. Once the file is uploaded, we can ask ChatGPT to review the data and provide recommendations for cleaning and formatting.
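If you want to inspect the same file locally before or after uploading it, a small loader like the one below can help. This is only a sketch: the `READERS` mapping and the `load_table` helper are hypothetical names introduced for illustration, and the set of extensions covers only common tabular formats, not everything ChatGPT accepts.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to a pandas reader; extend as needed.
READERS = {
    ".csv": pd.read_csv,
    ".tsv": lambda p: pd.read_csv(p, sep="\t"),
    ".json": pd.read_json,
    ".xlsx": pd.read_excel,  # requires the optional openpyxl package
}


def load_table(path):
    """Load a tabular file into a DataFrame, choosing the reader by extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return READERS[suffix](path)
```

Calling `load_table("my_data.csv")` (a placeholder file name) returns a DataFrame you can compare against what ChatGPT reports about the upload.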

ChatGPT will analyze the data, identify any issues (e.g., missing values, data types, outliers), and suggest steps to address them. You can then instruct ChatGPT to proceed with the data cleaning process, and it will provide a downloadable, cleaned version of the data set for you to use in the next steps of your analysis.
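To make the cleaning step less of a black box, here is a minimal sketch of the kind of pandas code that could sit behind it. The column names (`Age`, `EstimatedSalary`, `Gender`, `Purchased`), the toy rows, and the choices of median filling and the 1.5 × IQR outlier rule are all assumptions made for illustration; the code ChatGPT actually generates for your file may differ.

```python
import pandas as pd

# Toy rows for illustration only; column names are assumptions.
raw = pd.DataFrame({
    "Age": ["25", "40", None, "33", "51"],
    "EstimatedSalary": [45000, 52000, 48000, None, 1_000_000],
    "Gender": ["Male", "female ", "Female", "male", "Female"],
    "Purchased": [0, 1, 0, 0, 1],
})


def clean(df):
    out = df.copy()
    numeric_cols = ["Age", "EstimatedSalary"]
    # Data types: coerce to numbers, turning unparseable values into NaN.
    for col in numeric_cols:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Missing values: fill each numeric column with its median.
    for col in numeric_cols:
        out[col] = out[col].fillna(out[col].median())
    # Inconsistent labels: trim whitespace and normalize capitalization.
    out["Gender"] = out["Gender"].str.strip().str.title()
    # Outliers: flag (rather than drop) salaries outside 1.5 * IQR.
    q1, q3 = out["EstimatedSalary"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out["EstimatedSalary"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    out["SalaryOutlier"] = ~in_range
    return out


cleaned = clean(raw)
print(cleaned)
```

Flagging outliers instead of deleting them keeps the decision in your hands, which is worth asking ChatGPT to do as well.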

The key here is to let ChatGPT handle the data cleaning and formatting tasks, so you can focus on the higher-level analysis and problem-solving aspects of your work.

Explore the Data Using Exploratory Data Analysis (EDA)

Now that the data has been cleaned and formatted, we can proceed with exploratory data analysis (EDA) to gain insights and identify key trends within the data.

First, let's examine the distribution of the numerical features, such as age and estimated salary. The data visualization shows a relatively even distribution of age, with the majority of users falling between 25 and 55 years old. The estimated salary distribution, on the other hand, appears to be right-skewed, indicating a higher concentration of users with lower salaries.
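The same observations can be checked numerically. The sketch below uses made-up values and assumed column names (`Age`, `EstimatedSalary`), not the tutorial's data set; it shows how skewness and a simple range share can back up what the histograms suggest.

```python
import pandas as pd

# Toy values for illustration; not the tutorial's data set.
df = pd.DataFrame({
    "Age": [22, 27, 31, 35, 38, 42, 46, 50, 54, 59],
    "EstimatedSalary": [20000, 22000, 25000, 27000, 30000,
                        33000, 40000, 55000, 90000, 150000],
})

# Per-column summary with skewness added (positive skew = long right tail).
summary = df.describe().T[["mean", "50%", "min", "max"]]
summary["skew"] = df.skew()
print(summary)

# Share of users aged 25 to 55, inclusive.
share_25_55 = df["Age"].between(25, 55).mean()
print(f"Share aged 25-55: {share_25_55:.0%}")
```

A mean well above the median, together with a positive skew value, is the numeric signature of the right-skewed salary distribution described above.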

Next, we'll look at the categorical features: gender and the binary purchase variable. The data shows that the majority of users did not make a purchase, with only a small fraction converting. The gender distribution, by contrast, appears to be fairly balanced.
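For categorical columns, the equivalent check is a frequency count. This is a sketch on toy rows with assumed column names (`Gender`, `Purchased`); the numbers it prints describe the toy rows only.

```python
import pandas as pd

# Toy rows for illustration; column names are assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male",
               "Female", "Male", "Female", "Male"],
    "Purchased": [0, 0, 1, 0, 0, 0, 1, 0],
})

# Proportion of each gender label.
gender_share = df["Gender"].value_counts(normalize=True)

# For a 0/1 column, the mean is the fraction of users who purchased.
conversion_rate = df["Purchased"].mean()

print(gender_share)
print(f"Conversion rate: {conversion_rate:.1%}")
```

A low conversion rate like this also signals class imbalance, which matters if you later ask ChatGPT to build a predictive model.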

To further explore the relationships between the variables, we'll generate a correlation matrix and pair plots. The correlation matrix reveals a moderate positive correlation between age and estimated salary, as one might expect. The pair plots provide a visual representation of these relationships, allowing us to identify any potential nonlinear patterns or outliers.
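A correlation matrix takes one line in pandas. The sketch below again uses toy rows and assumed column names; it restricts the calculation to numeric columns, since a text column such as gender cannot be correlated directly.

```python
import pandas as pd

# Toy rows for illustration; column names are assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "Age": [22, 30, 35, 41, 48, 55],
    "EstimatedSalary": [25000, 40000, 38000, 60000, 52000, 80000],
    "Purchased": [0, 0, 0, 1, 1, 1],
})

# Pearson correlation between every pair of numeric columns.
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))

# For pair plots, pandas.plotting.scatter_matrix(df) or seaborn.pairplot(df)
# can draw the pairwise scatter plots (both need a plotting backend installed).
```

Correlation only captures linear relationships, which is why the pair plots are worth a look for curved patterns the matrix would miss.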

Overall, this exploratory data analysis has provided a solid foundation for understanding the key characteristics and trends within the data. We can now use these insights to inform the next steps in our analysis, such as feature engineering and predictive modeling.


Conclusion

In this tutorial, we have explored how to leverage the power of ChatGPT to become a data analyst, even without extensive training or expensive degrees. By activating the Advanced Data Analysis feature and utilizing custom instructions, we were able to seamlessly clean, format, and analyze a dataset within the ChatGPT interface.

The key takeaways are:

  1. Activate the Advanced Data Analysis feature in ChatGPT to unlock the ability to upload and work with various data formats.
  2. Customize ChatGPT's instructions to tailor its responses to your specific needs, such as adopting the persona of a data scientist.
  3. Upload your dataset and let ChatGPT guide you through the data cleaning process, ensuring your data is ready for analysis.
  4. Leverage ChatGPT's exploratory data analysis (EDA) capabilities to uncover insights and trends within your data, without the need for advanced statistical knowledge.
  5. Ask ChatGPT targeted questions to gain a deeper understanding of your data and identify influential factors, such as the effect of gender, age, or income on purchasing behavior.
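A targeted question such as "does purchase rate differ by gender or age group?" comes down to a grouped average. The following is a minimal sketch with toy rows; the column names, the `AgeBand` column, and the age cut-offs are assumptions chosen for illustration.

```python
import pandas as pd

# Toy rows for illustration; column names and age bands are assumptions.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female",
               "Male", "Female", "Male", "Female"],
    "Age": [22, 29, 34, 38, 44, 47, 52, 58],
    "Purchased": [0, 0, 0, 0, 1, 0, 1, 1],
})

# Bucket ages into bands: (0, 35], (35, 50], (50, 120].
df["AgeBand"] = pd.cut(df["Age"], bins=[0, 35, 50, 120],
                       labels=["<=35", "36-50", ">50"])

# Mean of a 0/1 column per group = purchase rate for that group.
rate_by_gender = df.groupby("Gender")["Purchased"].mean()
rate_by_age = df.groupby("AgeBand", observed=True)["Purchased"].mean()

print(rate_by_gender)
print(rate_by_age)
```

Differences between groups in a table like this are descriptive only; with small groups they can easily arise by chance, so treat them as prompts for further questions rather than conclusions.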

By embracing the capabilities of ChatGPT, you can become a proficient data analyst in a matter of minutes, without the traditional barriers of time and cost. This powerful tool empowers you to extract valuable insights from your data and make informed decisions, all within a user-friendly conversational interface.