patito.Relation.aggregate
- Relation.aggregate(*aggregations, group_by, **named_aggregations)
Return relation formed by
GROUP BY
SQL aggregation(s).- Parameters
aggregations (
str
) – Zero or more aggregation expressions such as “sum(column_name)” and “count(distinct column_name)”.named_aggregations (
str
) – Zero or more aggregated expressions where the keyword is used to name the given aggregation. For example,my_column="sum(column_name)"
is inserted as"sum(column_name) as my_column"
in the executed SQL query.group_by (
Union
[str
,Iterable
[str
]]) – A single column name or iterable collection of column names to group by.
Examples
>>> import patito as pt >>> df = pt.DataFrame({"a": [1, 2, 3], "b": ["X", "Y", "X"]}) >>> relation = pt.Relation(df) >>> relation.aggregate( ... "b", ... "sum(a)", ... "greatest(b)", ... max_a="max(a)", ... group_by="b", ... ).to_df() shape: (2, 4) ┌─────┬────────┬─────────────┬───────┐ │ b ┆ sum(a) ┆ greatest(b) ┆ max_a │ │ --- ┆ --- ┆ --- ┆ --- │ │ str ┆ f64 ┆ str ┆ i64 │ ╞═════╪════════╪═════════════╪═══════╡ │ X ┆ 4.0 ┆ X ┆ 3 │ ├╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤ │ Y ┆ 2.0 ┆ Y ┆ 2 │ └─────┴────────┴─────────────┴───────┘
- Return type
Relation