JupyTuesday: TidyTuesday Example in Python
Quick and dirty try of TidyTuesday data in a Jupyter Notebook
Important: This post is hard to read. I'm experimenting with generating a post straight from a notebook without a lot of extra work, and I might try recording the creation of one sometime. For now it is hard to parse as a reader, although it does have an Altair graph you can poke and prod at the end.
- Read in a TidyTuesday dataset and get to a visualization with Seaborn and Altair
- Quite rough; recording the process of making the notebook might be a better format
import pandas as pd
transit_cost = pd.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-01-05/transit_cost.csv')
transit_cost.isnull().sum()
# keep only rows that have an ID in column `e`
transit_cost = transit_cost[~transit_cost.e.isnull()]
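The filter above keeps only the rows where `e` is present. As a minimal sketch of the same idea, here is the boolean mask next to `dropna(subset=...)`, which should give the same result. The `toy` frame and its values are made up for illustration and are not from the dataset.

```python
import pandas as pd

# Toy stand-in for transit_cost; the values are invented for illustration.
toy = pd.DataFrame({
    'e': [1.0, 2.0, None],
    'country': ['CA', 'US', None],
    'cost_km_millions': [250.0, 900.0, None],
})

# Boolean-mask version, as in the notebook
masked = toy[~toy['e'].isnull()]

# Alternative spelling: drop rows whose 'e' is missing
dropped = toy.dropna(subset=['e'])

# Check that both approaches keep the same rows
print(masked.equals(dropped))
```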
transit_cost.tunnel_per.value_counts().head()
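- the most common value is 100.00%; as far as I can tell from the TidyTuesday data dictionary, tunnel_per is the share of a line's length that runs in tunnels, so these are lines that are entirely tunneled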
completed = transit_cost[
    (transit_cost.tunnel_per == '100.00%') &
    (~transit_cost.cost_km_millions.isnull())
]
completed.isnull().sum()
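The filter above matches the literal string '100.00%'. If a numeric comparison is ever needed (say, lines that are at least 90% tunnel), one option is to parse the column into floats first. This is a sketch on made-up values in the same string format; the names `tunnel_pct` and `fully_tunneled` are hypothetical.

```python
import pandas as pd

# Made-up values in the same 'NN.NN%' string format as tunnel_per
tunnel_per = pd.Series(['100.00%', '62.50%', None, '0.00%'])

# Strip the trailing '%' and parse as float; missing values become NaN
tunnel_pct = pd.to_numeric(tunnel_per.str.rstrip('%'), errors='coerce')

# Numeric comparison instead of matching the exact string '100.00%'
fully_tunneled = tunnel_pct == 100
print(fully_tunneled.tolist())
```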
# mean and counts for cost_km_millions by country
completed.groupby('country')['cost_km_millions'].agg(['mean', 'count']).sort_values('mean', ascending=False).head(10)
compare = completed.groupby('country').agg(
    {
        'cost_km_millions': ['mean', 'count'],
        'length': ['mean']
    }
)
# flatten the MultiIndex columns into single names, e.g. cost_km_millions__mean
compare.columns = ['__'.join(col).strip() for col in compare.columns.values]
compare.reset_index(inplace=True)
compare.head()
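The join over the MultiIndex works, but pandas named aggregation can produce flat column names directly and skip the renaming step. A minimal sketch on a made-up frame (the `toy` data and the `compare_named` name are for illustration only), using the same `__` naming as above:

```python
import pandas as pd

# Toy stand-in for the filtered data; the values are invented.
toy = pd.DataFrame({
    'country': ['CA', 'CA', 'US'],
    'cost_km_millions': [200.0, 300.0, 900.0],
    'length': [5.0, 7.0, 3.0],
})

# Named aggregation: output_name=(input_column, function)
# as_index=False keeps 'country' as a regular column, so no reset_index is needed
compare_named = toy.groupby('country', as_index=False).agg(
    cost_km_millions__mean=('cost_km_millions', 'mean'),
    cost_km_millions__count=('cost_km_millions', 'count'),
    length__mean=('length', 'mean'),
)
print(compare_named)
```

The resulting columns match the ones the plots further down expect, so this could be swapped in without touching the chart code.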
import seaborn as sns
sns.relplot(
    data=compare,
    x='cost_km_millions__mean',
    y='length__mean',
    size='cost_km_millions__count'
)
import altair as alt
alt.Chart(compare).mark_circle(size=50).encode(
    x=alt.X('cost_km_millions__mean', axis=alt.Axis(title='Average Cost per Kilometer ($Million/km)')),
    y=alt.Y('length__mean', axis=alt.Axis(title='Average Length (km)')),
    tooltip=['country', 'cost_km_millions__mean', 'length__mean'],
).properties(
    title='Dig Dug: Building Tunnels'
).configure_mark(
    opacity=0.7
).configure_title(
    fontSize=24
).interactive()
Image Credit
- Honeycomb by Zach Bogart from the Noun Project
- Three women cricket spectators picnicking from "State Library of New South Wales" on Flickr Commons