Closed
Labels
bug: incorrect result, high priority, pandas-like, polars, pyarrow
Description
Noticed while looking at the pyarrow version for (#3224) that the index from with_row_index(order_by=...) seems to be wrong when the order_by column isn't already sorted.
Possibly a regression from #3239? (cc @MarcoGorelli)
Just guessing as that's the most recent change
Repro
pandas, pyarrow and polars all produce the same result, which I think is wrong?
import narwhals as nw
data = {"a": ["A", "B", "A"], "b": [1, 2, 3], "c": [9, 2, 4]}
df = nw.from_dict(data, backend="polars")
df.with_row_index(order_by="c").sort("index")
┌───────────────────────────┐
| Narwhals DataFrame |
|---------------------------|
|shape: (3, 4) |
|┌───────┬─────┬─────┬─────┐|
|│ index ┆ a ┆ b ┆ c │|
|│ --- ┆ --- ┆ --- ┆ --- │|
|│ i64 ┆ str ┆ i64 ┆ i64 │|
|╞═══════╪═════╪═════╪═════╡|
|│ 0 ┆ A ┆ 3 ┆ 4 │|
|│ 1 ┆ A ┆ 1 ┆ 9 │|
|│ 2 ┆ B ┆ 2 ┆ 2 │|
|└───────┴─────┴─────┴─────┘|
└───────────────────────────┘
Expected
duckdb, ibis, sqlframe, dask all produce this - which is an index ordered by "c"
df.lazy("duckdb").with_row_index(order_by="c").sort("index").collect("polars")
┌───────────────────────────┐
| Narwhals DataFrame |
|---------------------------|
|shape: (3, 4) |
|┌───────┬─────┬─────┬─────┐|
|│ index ┆ a ┆ b ┆ c │|
|│ --- ┆ --- ┆ --- ┆ --- │|
|│ i64 ┆ str ┆ i64 ┆ i64 │|
|╞═══════╪═════╪═════╪═════╡|
|│ 0 ┆ B ┆ 2 ┆ 2 │|
|│ 1 ┆ A ┆ 3 ┆ 4 │|
|│ 2 ┆ A ┆ 1 ┆ 9 │|
|└───────┴─────┴─────┴─────┘|
└───────────────────────────┘
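To make the difference concrete, here is a backend-free sketch in plain Python of the two results above. The helper names `expected_row_index` and `argsort_as_index` are hypothetical and not part of narwhals, and breaking ties by original row order is an assumption. The expected index is each row's rank when sorted by "c". The index in the repro output happens to coincide with the argsort of "c" itself rather than its inverse; whether that is the actual cause is only a guess.

```python
def argsort(values):
    """Row positions in the order that sorts `values` (stable for ties)."""
    return sorted(range(len(values)), key=lambda i: values[i])


def expected_row_index(values):
    """Rank of each row when sorted by `values` (inverse of the argsort)."""
    index = [0] * len(values)
    for rank, pos in enumerate(argsort(values)):
        index[pos] = rank
    return index


def argsort_as_index(values):
    """Uses the argsort directly as the index; matches the repro output."""
    return argsort(values)


data = {"a": ["A", "B", "A"], "b": [1, 2, 3], "c": [9, 2, 4]}

expected = expected_row_index(data["c"])  # [2, 0, 1]
observed = argsort_as_index(data["c"])    # [1, 2, 0]


def rows_sorted_by_index(index, data):
    """Attach `index` to the rows and sort by it, like .sort("index")."""
    rows = zip(index, data["a"], data["b"], data["c"])
    return sorted(rows, key=lambda row: row[0])


expected_rows = rows_sorted_by_index(expected, data)
observed_rows = rows_sorted_by_index(observed, data)
```

Here `expected_rows` matches the duckdb table and `observed_rows` matches the polars, pandas and pyarrow table from the repro.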