Interactive global CO2 emissions data visuals in 4 easy steps.
ChatGPT is revolutionizing how we create complex data visualizations, letting us easily leverage awesome Python libraries without getting tangled in syntax.
But there is a method to making generative AI work in our favour: modular prompt engineering.
Designing just the right instruction for ChatGPT is a bit of an art form, but when you get it right, you have working code in minutes rather than hours.
But don’t just take my word for it — let’s put this theory to the test with a real-world example that uses a real-world data set.
We will use GPT-3.5 to generate Python code for a Plotly Dash visualization, working modularly, one step at a time, to keep each prompt small and easy to check.
Step 1. Setup and Data Preprocessing
First and foremost, we must verify that all the essential modules are installed.
We will be using pandas, Plotly, Dash, and Python's built-in json module in this tutorial. The necessary imports are:
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import json
import dash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output
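Before running the imports above, it can help to confirm that each third-party package is actually importable. The following is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical and not part of the original tutorial.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Third-party packages used in this tutorial (json ships with Python itself).
required = ["pandas", "plotly", "dash"]
missing = missing_modules(required)

if missing:
    print("Install these first: pip install " + " ".join(missing))
else:
    print("All required modules are available.")
```

If anything is reported as missing, install it with pip and rerun the check before moving on.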
Our global emissions dataset contains the fields ‘Country’, ‘Year’, and ‘Per Capita’. We will instruct ChatGPT to preprocess this data.
My request:
I have a CSV file named ‘global_emissions.csv’ with fields ‘Country’, ‘Year’, and ‘Per Capita’. I need to load this file using pandas, then in the ‘Country’ field, replace ‘USA’ with ‘United States of America’. Finally, I want to filter the data to only include records where ‘Year’ is between 1970 and 2020 inclusive. Can you help me with this?
ChatGPT’s Response:
import pandas as pd
# Step 1: Load the CSV file
emissions_data = pd.read_csv('global_emissions.csv')
# Step 2: Replace 'USA' with 'United States of America' in the 'Country' field
emissions_data['Country'] = emissions_data['Country'].replace('USA', 'United States of America')
# Step 3: Filter data where 'Year' is between 1970 and 2020 inclusive
emissions_data = emissions_data[(emissions_data['Year'] >= 1970) & (emissions_data['Year'] <= 2020)]
# Step 4: Display the resulting DataFrame
print(emissions_data)
Initially, this code loads the CSV data into a pandas DataFrame. Next, it replaces occurrences of ‘USA’ in the ‘Country’ field with ‘United States of America’. Finally, the data is filtered to include only records with a ‘Year’ value between 1970 and 2020, both inclusive.
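To see the replacement and the inclusive year filter in action without the CSV file, here is a small sketch on made-up rows. The sample values are hypothetical, and `Series.between` is used here as an equivalent shorthand for the two comparisons in ChatGPT's response, not as the method the response itself uses.

```python
import pandas as pd

# Hypothetical sample rows standing in for global_emissions.csv.
sample = pd.DataFrame({
    "Country": ["USA", "France", "USA", "India"],
    "Year": [1965, 1970, 2020, 2021],
    "Per Capita": [15.0, 8.5, 14.2, 1.9],
})

# Same replacement as in the response above.
sample["Country"] = sample["Country"].replace("USA", "United States of America")

# Series.between includes both endpoints by default.
filtered = sample[sample["Year"].between(1970, 2020)]

print(filtered)
```

Only the 1970 and 2020 rows survive the filter, which confirms that both boundary years are kept.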
Step 2. Building the Dash App Layout
When constructing the application layout, we aim for a specific design. The app should contain a header, a dropdown for year selection, and two graphs: a bar chart and a choropleth map. Let’s provide these details to ChatGPT.
My request:
I need to create a Dash application layout with a header that says ‘Global Per Capita CO2 Emissions By Year’. Below the header, I want a dropdown for selecting a year from the ‘Year’ field of my DataFrame. Then, I need two graphs positioned side by side: first, a horizontal bar chart that takes up 40% of the row, and then a choropleth map that takes up the rest. The dropdown should have the id ‘year-dropdown’. Can you assist me with this?
ChatGPT’s Response:
app = dash.Dash(__name__)
available_years = sorted(emissions_data["Year"].unique(), reverse=True)
app.layout = html.Div([
    html.H1("Global Per Capita CO2 Emissions By Year"),
    html.Div([
        html.Label("Select Year:", style={'fontWeight': 'bold'}),
        dcc.Dropdown(
            id='year-dropdown',
            options=[{'label': i, 'value': i} for i in available_years],
            value=available_years[0]
        ),
        dcc.Graph(id='bar-chart', style={"width": "40%", "display": "inline-block"}),
        dcc.Graph(id='map', style={"width": "60%", "display": "inline-block"})
    ])
])
This code sets up the layout of the Dash application. It creates a header, a dropdown with available years from the DataFrame, and placeholders for the two graphs (a bar chart and a map) side by side.
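The dropdown options are simply a list of label/value dictionaries, so that part of the layout can be checked on its own. Here is a minimal sketch in plain Python; the helper `build_year_options` is hypothetical, and an ordinary list stands in for the DataFrame's ‘Year’ column.

```python
def build_year_options(years):
    """Return (options, default) for a year dropdown, newest year first."""
    ordered = sorted(set(years), reverse=True)
    options = [{"label": str(year), "value": year} for year in ordered]
    # Default to the most recent year, or None when there is no data.
    default = ordered[0] if ordered else None
    return options, default


options, default = build_year_options([1970, 2020, 1995, 2020])
print(options[0], default)
```

Sorting in descending order means the most recent year is both the first entry and the default selection, matching the behaviour of the layout code above.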
Step 3. Updating the Figures
We now need to specify how our charts should be updated based on the selected year. Let’s provide these details to ChatGPT. (Update: this approach also works very well with Bard.)
# This callback function is linked to the dropdown and the two plots.
@app.callback(
    [Output('bar-chart', 'figure'),
     Output('map', 'figure')],
    [Input('year-dropdown', 'value')]
)
def update_figure(selected_year):
    filtered_year_data = emissions_data[emissions_data["Year"] == selected_year]
    # Group the filtered data by Country, and calculate the mean Per Capita emissions
    average_emissions = filtered_year_data.groupby("Country").agg({"Per Capita": "mean"})
    average_emissions = average_emissions[average_emissions["Per Capita"] > 0]
    bottom_10_per_capita = average_emissions.nsmallest(10, "Per Capita").reset_index()
    # Create a horizontal bar plot of 10 lowest countries
    fig_bar = px.bar(bottom_10_per_capita, x="Per Capita", y="Country", orientation='h', color='Per Capita', color_continuous_scale='Greens_r')
    fig_bar.update_layout(showlegend=False, title="10 Countries With Lowest Emissions in " + str(selected_year), xaxis_title="Per Capita Emissions", yaxis_title="Country")
    # Reverse the order of the countries in the y-axis
    fig_bar.update_yaxes(autorange="reversed")
    fig_bar.update_coloraxes(showscale=False)
    # Group the filtered data by Country, and calculate the mean Per Capita
    average_emissions_map_data = filtered_year_data.groupby("Country")["Per Capita"].mean().reset_index()
    # Create a choropleth map of the mean Per Capita emissions
    fig_map = px.choropleth(
        average_emissions_map_data,
        locations='Country',
        locationmode='country names',
        color='Per Capita',
        color_continuous_scale="Greens_r",
        title="Global Per Capita Emissions in " + str(selected_year)
    )
    fig_map.update_geos(projection_type="natural earth")
    fig_map.update_layout(coloraxis_colorbar=dict(title='Per Capita Emissions'))
    return fig_bar, fig_map
This callback function runs whenever a year is selected in the dropdown menu. It filters the emissions data for that year and calculates the mean Per Capita emissions by country.
Plotly interactivity via callback function — dropdown menu to select global CO2 emissions by Year
It then creates a bar chart of the bottom 10 countries with positive Per Capita emissions and a choropleth map showing the average Per Capita emissions for each country.
The bar chart:
The choropleth map:
The callback returns both figures, and Dash updates them in the layout automatically. In the examples above, darker green indicates lower per capita CO2 emissions.
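The aggregation inside the callback can be tried in isolation, without Dash or Plotly. Below is a minimal sketch on made-up data: the country names and values are hypothetical, and it keeps the 2 lowest countries rather than 10 so the result is easy to read.

```python
import pandas as pd

# Hypothetical rows for a single year; values are made up for illustration.
year_data = pd.DataFrame({
    "Country": ["A", "A", "B", "C", "D"],
    "Per Capita": [2.0, 4.0, 0.0, 1.5, 9.0],
})

# Mean per country, drop non-positive values, then keep the n smallest.
mean_by_country = year_data.groupby("Country")["Per Capita"].mean()
positive = mean_by_country[mean_by_country > 0]
lowest = positive.nsmallest(2).reset_index()

print(lowest)
```

Country B is dropped because its mean is zero, which mirrors the `> 0` filter in the callback and keeps countries with zero or missing-as-zero values out of the bar chart.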
Step 4. Running the Server
Finally, we need to run our server to interact with the application. Let’s ask ChatGPT to help with this.
My request:
I need to run a Dash server for my application in debug mode. How do I do this?
ChatGPT’s Response:
if __name__ == "__main__":
    app.run_server(debug=True)
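This code checks whether the script is run directly (not imported) and, if so, starts the Dash server in debug mode. Running in debug mode provides more detailed error messages, which is useful during development. Newer Dash releases favour app.run(debug=True) over app.run_server, so adjust the call if your installed version rejects it.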
The final result:
Fully Interactive Python plotly web application created using prompt engineering
The user can select the year, and both the bar chart and the choropleth map will update with the data for that year. More charts can be added with further modular prompts.
Note: We have the flexibility to alter the color; I have opted for green to create contrast with the CO2 emissions.
In Summary…
Modular prompt engineering frees programmers from grappling with lower-level syntax problems, allowing them to concentrate on higher-level concepts and ideas.
The art of crafting the ideal prompt is indeed a skill, but once mastered, coding becomes seamless. Practice makes perfect in this regard.
With this approach, data exploration, analysis, and storytelling become more accessible and insightful. Consequently, GPT becomes an invaluable tool in your data science and data visualization toolkit.