> The pandas API is awful

I hate to be the "you're holding it wrong" guy but 90% of "Pandas bad!" posts I find are either outright misinformed or mischaracterizing one person's particular opinion as some kind of common truth. This one is both!

> That comes from the assumption that there is almost always a meaningful index (timestamps)

The index can be literally any unique row label or ID. It's idiosyncratic among "data frames" (SQL has no equivalent concept, and the R community has disowned theirs), but it's really not such a crazy thing to have row labels built into your data table. Excel supports this in several different ways (frozen columns, VLOOKUP) and users expect it in just about any table-oriented GUI tool.
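
To make that concrete, here's a rough sketch of an index that has nothing to do with timestamps. The table, the `employee_id` labels, and the `bonus` values are all made up for illustration:

```python
import pandas as pd

# Hypothetical table keyed by a string ID instead of a timestamp.
df = pd.DataFrame(
    {"name": ["Ada", "Grace", "Linus"], "team": ["core", "core", "infra"]},
    index=pd.Index(["e17", "e42", "e99"], name="employee_id"),
)

# Label-based lookup: fetch a cell by row label and column name.
grace_team = df.loc["e42", "team"]

# Assignment aligns on the labels, not on position, so the order of
# `bonus` doesn't matter and the missing label ("e42") becomes NaN.
bonus = pd.Series({"e99": 500, "e17": 250})
df["bonus"] = bonus
```

The alignment step is where the row labels earn their keep: you don't have to sort or merge anything to line the two objects up.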

> having to write index=False every single time you write to disk

If you're actually using the index as it's meant to be used, you'd see why this isn't the default setting.
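
A small sketch of what I mean, assuming a made-up price table keyed by a `sku` label. With the default settings the labels survive a round trip through CSV; with `index=False` they're simply gone:

```python
import io

import pandas as pd

# Hypothetical table where the index carries real information (the SKU).
df = pd.DataFrame(
    {"price": [1.5, 2.0]},
    index=pd.Index(["sku-a", "sku-b"], name="sku"),
)

# Default (index=True): the labels are written as the first column.
csv_text = df.to_csv()

# Reading back with index_col restores the same labels.
restored = pd.read_csv(io.StringIO(csv_text), index_col="sku")

# index=False drops the labels entirely, which loses data in this case.
without_index = df.to_csv(index=False)
```

If your index is just the default 0..n-1 range, `index=False` is the right call, but that's a sign the index isn't doing anything for you in the first place.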

> functions seemingly randomly returning dataframes with column data as the index

I assume you're talking about the behavior of .groupby() and .rolling()? It's never been random. Under-documented, yes, and group_keys= and related options are hard to reason about. But not random.
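
Here's a minimal sketch of the behavior I think is being described, using a toy `team`/`score` table I made up. It only covers the aggregation and rolling cases, not `group_keys=` with `.apply()`:

```python
import pandas as pd

# Toy data, purely illustrative.
df = pd.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 3]})

# Aggregating after groupby puts the group keys in the index by default.
by_index = df.groupby("team")["score"].sum()

# as_index=False keeps the keys as an ordinary column instead.
as_column = df.groupby("team", as_index=False)["score"].sum()

# Equivalent: aggregate first, then move the index back into a column.
via_reset = df.groupby("team")["score"].sum().reset_index()

# groupby + rolling prepends the group key as an extra index level,
# giving a two-level index of (team, original row label).
rolled = df.groupby("team")["score"].rolling(2).sum()
```

In each case the group key ends up in the index because it's the label for the resulting row, which is consistent even if it's surprising the first time.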

> appending the index to the Series numpy data leading to incredibly confusing bugs

I've been using Pandas professionally almost daily since 2015 and I have no idea what this means.

I think the commenter you are replying to might well understand these nuances. The point is not that Pandas is inscrutable, but that it's annoying to use in many common use cases.