In data analytics, you often need to summarize data — like total sales per region or average order value per customer. Pandas makes this easy with .groupby() and aggregation functions.
To group data by a specific column and calculate the sum:
    df.groupby("region")["sales"].sum()
This groups the rows by region and then sums the sales within each group.
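As a minimal sketch of the call above, here it is run end to end on a small made-up DataFrame. The region names and sales figures are illustrative assumptions, not real data:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South"],
    "sales": [100, 150, 200, 50, 250],
})

# Group rows by region, then total the sales within each group
totals = df.groupby("region")["sales"].sum()
print(totals)
```

The result is a Series whose index holds the region names, sorted alphabetically by default, with one total per region.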
Common aggregation functions:
.sum(): total value
.mean(): average
.count(): number of non-null entries
.max() / .min(): highest or lowest value
You can apply several aggregations at once with .agg():
    df.groupby("region")["sales"].agg(["sum", "mean", "count"])
The result is indexed by the group keys (here, region). To turn that index back into an ordinary column and get a flat DataFrame:
    grouped = df.groupby("region")["sales"].sum().reset_index()
Grouping and aggregating help you uncover patterns, compare performance, and summarize large datasets, all essential skills for a data analyst.
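To tie the steps together, the sketch below applies several aggregations at once and then flattens the result with reset_index(). The sample data is again a made-up assumption, used only to make the example runnable:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South"],
    "sales": [100, 150, 200, 50, 250],
})

# Several aggregations at once: the result has one column per function
# and is indexed by region
summary = df.groupby("region")["sales"].agg(["sum", "mean", "count"])

# reset_index() moves the group keys out of the index into a regular column
flat = summary.reset_index()
print(flat)
```

After reset_index(), region is an ordinary column alongside sum, mean, and count, which is usually more convenient for merging, plotting, or exporting.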