The Python type system is pretty bad, but it's still 100x better than not using types. We are heavy users of the (Rust) type system at Svix, and it's been a godsend. I wrote about it here https://www.svix.com/blog/strong-typing-hill-to-die-on/
We also use Python in some places, shitty type system and all (plus some cool hackery to make SQLAlchemy feel very typed and work nicely with Pydantic).
> Writing software without types lets you go at full speed. Full speed towards the cliff.
Isn't it strange that back when Python (or Ruby) didn't even have type hints (not type checkers, type hints!), it would easily outperform pretty much every heavily typed language in developer productivity?
Somehow when types weren't an option we weren't going towards the cliff, but now that they are, not using them means jumping off a cliff? Something doesn't add up.
It's because the nature of typing in well-known languages has changed drastically over the last decade or so, going from C++/Java's `FancyObject *fancyObject = new FancyObject()` (which was definitely annoying to type, and was seen as a way to "tell the compiler how to arrange memory" rather than "how do we ensure constraints hold?") to modern TypeScript, where large well-typed programs can be written with barely a type annotation in sight.
There's also a larger understanding that as programs get larger and larger, they get harder to maintain and more importantly refactor, and good types help with this much more than brittle unit tests do. (You can also eliminate a lot of busywork tests with types.)
No it hasn't? The C++ type system has hardly changed (until concepts), and it's one of the most powerful available.
A certain generation of devs thought types were academic nonsense and then relearned the existence of those features in other languages. Now they are zealots about using them.
I think the point is that in newer languages like typescript, the price paid for static typing is lower because type inference does so much of the leg work. You get all the benefits of static typing, and the cost is usually tiny - you just need to define your types (a valuable exercise regardless) and add them to function signatures.
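For instance, in Python with a checker like mypy or pyright, annotating the signature is usually all that's needed; everything inside is inferred (a minimal sketch, the function is made up):

    def total_price(price: float, tax_rate: float) -> float:
        tax = price * tax_rate  # no annotation needed: tax is inferred as float
        return price + tax

    result = total_price(9.99, 0.08)  # result is inferred as float too
    total_price("9.99", 0.08)  # a checker flags this; Python itself wouldn't complain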
We’ve come a long way from the C++ or Java I wrote when I was young, where types were named and renamed constantly. As I understand it, even C++ has the auto keyword now.
Large programs are harder to maintain because people don't have the balls to break them into smaller ones with proper boundaries. They prefer incremental bandaids like type hints or unit tests that make it easier to deal with the big ball of mud, instead of not building the ball in the first place.
Every single typed system I have ever worked on, no matter how poorly designed, has been easier to alter than the vast majority of the Ruby, Python, Perl, PHP, and Elixir codebases I've worked on.
I have the opposite experience:
Inserting a library that wraps an existing one to add new features has been a nightmare in every statically typed language I’ve used — including times it’s virtually impossible because you’d need the underlying library to understand the wrapper type in its methods.
In Python (with duck typing), that’s a complete non-issue.
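A minimal sketch of the duck-typed version (the client and wrapper here are hypothetical):

    class RetryingClient:
        """Wraps any object with a .get() method and adds retries."""

        def __init__(self, inner):
            self.inner = inner

        def get(self, url):
            last_error = None
            for _ in range(3):
                try:
                    return self.inner.get(url)
                except IOError as e:
                    last_error = e
            raise last_error

    # Any function written against the original client accepts the wrapper
    # unchanged, because all that matters is that it responds to .get().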
Can you give an example? I think part of the problem is that mixins and such are so hard to do in most statically typed languages that programmers just don’t code things that way.
I see your point - I certainly find myself reaching for clever high level patterns less in typescript than I do in JavaScript because complex typing can get in the way. But also, programs that make heavy use of metaprogramming are often, also, harder to read and debug. There’s something very nice and straightforward about explicit, concrete types.
I'm not the person you asked the question to but I had an unpleasant experience with Typescript recently.
I used an HTTP requests library in a Nuxt.js app (probably Nuxt's native library) and I spent too much of my time conjuring the request and response types that would please the type checker. It was extremely frustrating because the code would work in JavaScript, but the compiler wouldn't accept it because of the typing.
I can't give you the details because I'm not at my computer now but the type was a mix of HTTP verbs and the structure of the JSON response. I gave up after a while and rewrote the code using fetch and no types. If they stand between me and the final result they can go down the drain.
> back when Python (or Ruby) didn't even have type hints (not type checkers, type hints!), it would easily outperform pretty much every heavily typed language?
No it didn't. It outperformed Java 1.2, and people thought that Java 1.2 was what a typed language looked like. Python always sucked compared to OCaml (let alone OCaml with a decent IDE), but OCaml had a weird syntax and the documentation was in French, so no one cared. Now that we finally have a copy of OCaml with curly braces and a critical mass of obnoxious fanboy hype, more people have noticed.
A lot of the old robust code tended to have guard statements like “if not isinstance(…): raise ValueError”, which does a great job of surfacing mistakes before they can compound too much. We all wrote scads of production Python over the decades before typing caught on. I think it’s much easier to do a good job of it now. Having your IDE yell at you before you’ve even finished saving the file sure beats running it and hoping for the best.
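In sketch form, the old way and the new (a hypothetical function):

    # Pre-type-hints defensive style: surface the mistake at runtime, early.
    def send_invoice(amount, recipient):
        if not isinstance(amount, int):
            raise ValueError(f"amount must be int, got {type(amount)!r}")
        if not isinstance(recipient, str):
            raise ValueError(f"recipient must be str, got {type(recipient)!r}")
        ...

    # Today the annotation does the same job in the IDE, before the code ever runs:
    def send_invoice_typed(amount: int, recipient: str) -> None:
        ...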
> when types weren't an option we weren't going towards the cliff
Erm yes we were. Untyped Python wasn't magically tolerable just because type hints hadn't been implemented yet.
How come all those unicorns were built with intolerable Python/Ruby, not Java/C#/Go?
https://charliereese.ca/y-combinator-top-50-software-startup...
They are likely leveraging Django/Rails which treads the beaten path for Startups.
Startups are also more likely to do monoliths.
For Enterprise & microservices, you will start to see more Java/Go/C#.
I would expect dynamic type crowd to embrace microservices first, given how everybody says that dynamic codebases are a huge mess.
Regardless, to me enterprise represents legacy, bureaucracy, incidental complexity, heavy typing, stagnation.
I understand that some people would like to think that heavy type-reliance is a way for enterprise to address some of its inherent problems.
But I personally believe that it's just another symptom of enterprise mindset. Long-ass upfront design documents and "designing the layout of the program in types first" are clearly of the same nature.
It's no surprise that Typescript was born at Microsoft.
You want your company to stagnate sooner? Hyperfixate on types. Now your startup can feel the "joys" of enterprise even at the seed stage.
Eh. The amount of work it takes to specify your types in a typescript program is tiny. Type inference does almost all of the work. And the benefit of that work is largely felt in maintenance & onboarding, since the code is easier to read when you’re new and come back to later. Refactoring large JavaScript programs is a nightmare.
The real enterprise death doesn't come from types. It comes from tasteless overuse of classes, especially once you have a complex web of long-lived objects that all reference each other. Significant portions of code in these codebases end up dedicated to useless tasks like lifecycle management instead of the actual work of your application. It's kind of the code version of corporate bureaucracy: classes everywhere devoted to doing BS jobs.
It’s not complicated people. Just write the code that tells the computer what you want it to do. No more. Unnecessary encapsulation and premature abstraction will kill your velocity dead.
This distinction makes no sense. Can you explain why types would be more relevant?
Actually I don't think types are relevant here. People are choosing based on other weighted factors like toolchain, ecosystem, products, and culture.
Looking at that blog post, I find it illustrative of how people who like strong types and people who dislike them are addressing different kinds of bugs. If the main class of issue comes from bugs like `1 + "2" == "12"`, then strong typing is a big help. It also enables the many developers who spend the majority of their time in a programming editor to quickly get automatic help with such bugs.
The other side is those people who do not find those kinds of bugs annoying, or who simply don't get hit by them at a rate high enough to warrant a strong type system. Developers who spend their time prototyping in ipython also get less out of strong types. The bugs those developers are concerned about are design bugs: say, finding out why a bunch of small async programs reading from a message bus stall once every second Friday, where the culprit is a dependency of a dependency of a dependency that does not use a socket timeout. Types are similarly not going to help those who spend the vast majority of their time on bugs where someone finally says "this design could never have worked".
Take care to differentiate strong/weak typing from dynamic/static typing. Many dynamically typed languages (especially older ones) are also weakly typed, but some dynamic languages, like Python, are strongly typed. `1 + "2" == "12"` is weak typing, and Python has strong typing. Type declarations are static typing, in contrast to traditional Python, which had (and still has) dynamic typing.
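In Python terms:

    >>> 1 + "2"  # strong typing: no implicit coercion (JavaScript gives "12")
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>> x = 1      # dynamic typing: no declaration needed...
    >>> x = "one"  # ...and a name can be rebound to a different type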
What's even worse, when typing is treated as an indisputable virtue (and not a tradeoff), pretty much every team starts sacrificing readability for the sake of typing.
And lo and behold, they end up with _more_ design bugs. And the sad part is that they will never even recognize that too much typing is to blame.
Nonsense. You might consider it a tradeoff, but it's a very heavily skewed one. Minor downsides on one side, huge upsides on the other.
Also I would say type hints sacrifice aesthetics, not readability. Most code with type hints is easier to read, in the same way that graphs with labelled axes and units are easier to read. They might have more "stuff" there which people might think is ugly, but they convey critical information which allows you to understand the code.
> Most code with type hints is easier to read
That has not been my experience in the past few years.
I've always been a fan of type hints in Python: the intention behind them was to contribute to readability, and when a developer had that intention in mind, they worked really well.
However, with the release of mypy and TypeScript, engineering culture largely shifted towards a "typing is a virtue" mindset. Type hints are no longer a documentation tool; they are a constraint-enforcing tool, and that tool is often at odds with readability.
Readability is subjective and ephemeral; type constraints (and IntelliSense) are very tangible. Naturally, developers are failing to find a balance between the two.
I write a lot of typescript and rust. In those languages, when I want to understand some code I haven’t seen before, I always start by reading the types. Understanding what and how the data moves through a system is usually key to understanding everything. And usually I lean heavily on my editor for this - in typescript there’s a lot of value in the simple act of hovering over values to see what type they are.
I’m working with a medium size python program at the moment. It’s mostly written by someone smart but early career, and they’ve made a rabbit warren of classes and mixins that get combined in complex ways. I’ve been encouraging him to add types - and wherever those types exist, the code becomes 100% more legible to my code editor - and ultimately to me.
I don’t think I’d bother with types in Python for small programs. But my experience is that good type hints lay out a welcome mat to anyone who comes along later to figure the code out. And honestly, a lot of the time that person is the original author, just months or years after the code was written.
> pretty much every team starts sacrificing readability
People are sacrificing this when they start using python in the first place
It's not about the bugs, it's about designing the layout of the program in types first (ie, laying out all of the data structures required) such that the actual coding of the functionality is fairly trivial. This is known as type driven development: https://blog.ploeh.dk/2015/08/10/type-driven-development/
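The linked post is F#-flavoured, but a rough Python sketch of the idea (the Order domain here is made up): lay out the data first, and the functions become almost mechanical.

    from dataclasses import dataclass
    from enum import Enum

    class Status(Enum):
        PENDING = "pending"
        SHIPPED = "shipped"

    @dataclass(frozen=True)
    class Order:
        id: int
        status: Status

    def ship(order: Order) -> Order:
        if order.status is not Status.PENDING:
            raise ValueError(f"cannot ship an order in state {order.status}")
        return Order(id=order.id, status=Status.SHIPPED)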
At work, I find type hints useful as basically enforced documentation and as a weak sort of test, but few type systems offer decent basic support for the sort of things you would need to do type driven programming in scientific/numerical work. Things like making sure matrices have compatible dimensions, handling units, and constraining the range of a numerical variable would be a solid minimum.
I've read that F# has units, Ada and Pascal have ranges as types (my understanding is these are runtime enforced mostly), Rust will land const generics that might be useful for matrix type stuff some time soon. Does any language support all 3 of these things well together? Do you basically need fully dependent types for this?
Obviously, with discipline you can work to enforce all these things at runtime, but I'd like it if there was a language that made all 3 of these things straightforward.
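For what it's worth, here's roughly what the runtime-enforced version of ranges looks like in Python today using Annotated metadata. The check_ranges decorator and the Probability alias are made up for illustration, and nothing checks any of this statically:

    import typing as t

    Probability = t.Annotated[float, ("range", 0.0, 1.0)]

    def check_ranges(func):
        hints = t.get_type_hints(func, include_extras=True)

        def wrapper(**kwargs):
            for name, value in kwargs.items():
                hint = hints.get(name)
                if hint is None or t.get_origin(hint) is not t.Annotated:
                    continue
                for meta in t.get_args(hint)[1:]:
                    if isinstance(meta, tuple) and meta[0] == "range":
                        _, lo, hi = meta
                        if not lo <= value <= hi:
                            raise ValueError(f"{name}={value} outside [{lo}, {hi}]")
            return func(**kwargs)

        return wrapper

    @check_ranges
    def mix(alpha: Probability) -> float:
        return alpha

    mix(alpha=0.3)  # fine
    mix(alpha=1.5)  # ValueError, but only at runtime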
I suspect C++ still comes the closest to what you’re asking for today, at least among mainstream programming languages.
Matrix dimensions are certainly doable, for example, because templates representing mathematical types like matrices and vectors can be parametrised by integers defining their dimension(s) as well as the type of an individual element.
You can also use template wizardry to write libraries like mp-units¹ or units² that provide explicit representations for numerical values with units. You can even get fancy with user-defined literals so you can write things like 0.5_m and have a suitably-typed value created (though that particular trick does get less useful once you need arbitrary compound units like kg·m·s⁻²).
Both of those are fairly well-defined problems, and the available solutions do provide a good degree of static checking at compile time.
IMHO, the range question is the trickiest one of your three examples, because in real mathematical code there are so many different things you might want to constrain. You could define a parametrised type representing open or closed ranges of integers between X and Y easily enough, but how far down the rabbit hole do you go? Fractional values with attached precision/error metadata? The 572 specific varieties of matrix that get defined in a linear algebra textbook, and which variety you get back when you compute a product of any two of them?
I'd be happy if just ranges on floats were quick and easy to specify, even if the checking is at runtime (which it seems like it almost has to be). I can imagine how to attach precision/error metadata when I need it with custom types, as long as operator overloading is supported. I think similarly for specialized matrices: normal user-defined types and operator overloading get tolerably far. Although I can understand how different languages may be better or worse at it; multiple dispatch might be more convenient than single dispatch, operator overloading is way more convenient than not having it, etc.
A lot of my frustration is that the ergonomics of these things tend to be not great even when they are available. Or the different pieces (units, shape checking, ranges) don't compose together easily because they end up as three separate libraries or something.
Crystal certainly supports that kind of typing, and being able to restrict bounds based on dynamic elements recently landed in GCC, making it simple in plain C as well.
If x is of type T, what type do you want (x - x) to be?
That's a hard one, because it depends on what sort of details you let into types, and maybe even on the specific type T. Not saying what I'm asking for is easy! Units and shape would be preserved in all cases I can think of. But with subranges, (x - x) may have a super-type of x... or if the type system is very clever, the type of (x - x) may even be narrowed to a single value :p
And then there's a subtlety where units might be preserved, but x may be "absolute" where as (x - x) is relative and you can do operations with relative units you can't with absolute units and vice versa. Like the difference between x being a position on a map and delta_x being movement from a position. You can subtract two positions on a map in a standard mathematical sense but not add them.
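That position/offset distinction is expressible with plain classes, something like this sketch (the checker, not the runtime, does the enforcement here):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Delta:  # a movement: relative
        km: float

    @dataclass(frozen=True)
    class Position:  # a point on the map: absolute
        km: float

        def __sub__(self, other: "Position") -> Delta:
            return Delta(self.km - other.km)

        def __add__(self, other: Delta) -> "Position":
            return Position(self.km + other.km)

    a, b = Position(10.0), Position(4.0)
    a - b        # fine: Delta(km=6.0)
    a + (a - b)  # fine: Position(km=16.0)
    a + b        # a type checker rejects this: a Position is not a Delta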
I think you're missing the point of the blog a bit; the `1 + "2" == "12"` type of issue wasn't it. That kind of bug definitely also sucks, and it's much more common than you make it sound (especially when refactoring), but it's not what the post is about.
Anyhow, no need to rehash the same arguments, there was a long thread here on HN about the post, you can read some of it here: https://news.ycombinator.com/item?id=37764326
I think there is another overlooked factor: some languages’ type systems suck and your opinion of types depends more on your first experience rather than a true comparison.
> The other side is those people who do not find those kind of bugs annoying
Anecdotally, I find these are the same people who work less effectively and efficiently. At my company, I know people who mainly use Notepad++ for editing code when VSCode (or another IDE) is readily available, who use print over debuggers, who don't get frustrated by runtime errors that could be caught in IDEs, and who opt out of using coding assistants. I happen to know as a matter of fact that the person who codes in Notepad++ frequently has trivial errors, and generally these people don't push code out as fast as they could.
And they don't care to change the way they work even after seeing the alternatives and knowing they are objectively more efficient.
I am not their managers, so I say to myself "this is none of my business" and move on. I do feel pity for them.
Well, using print over debuggers is fairly common in Rust and other languages with strong type systems. Because of the extreme lengths the compiler goes to in order to detect bugs even before the program runs, most remaining bugs are just a lack of information about the value of an expression at a single point in the program flow, which is exactly where dbg! shines. I agree with all the other points though.
Anecdotally, I was just writing a generic BPE implementation and spent a few hours tracking down a bug. I used debug statements to look at the values of expressions and noticed that something was off. Only later did I figure out that I had modified a value but used the old copy, a simple logic error that #[must_use] could have prevented. `cargo clippy -- -W clippy::pedantic` is annoying, but this taught me I'd better listen to what it has to say.
> these people don't push code out as fast as they could.
Well, one of my coworkers pushes code quite fast, and he's also the one who gets rejected most often because he keeps adding .tmp, .pyc and even .env files to his commits. I guess `git add *` is faster, and thus more efficient, than adding files deliberately or taking the time to edit .gitignore.
Not so long ago I read a story here on HN about a guy who first coded in his head, then wrote everything down on paper, and finally typed it into a computer. It compiled without errors. Slow pusher? Inefficient?
> Not so long ago I read a story here on HN about a guy who first coded in his head, then wrote everything down on paper, and finally typed it into a computer. It compiled without errors. Slow pusher? Inefficient?
I've read and heard stories about these folks too, apparently this was more common decades ago.
To be clear, I don't think I could pull it off with any language. It's quite impressive and admirable to get things right on the first try.
Having said that, the thing is, languages were a lot simpler back then too. I'm not convinced this is realistically even possible with today's languages unless you constrain yourself to some overly restrictive subset. Try it with C++, and I would be shocked if you could write nontrivial programs without getting compiler errors. To give a trivial example: every time I write my own iterator class for a container, I miss something when I hit compile, like a comparison operator, or subtraction, or conversion to const iterator, or post-decrement, or subscript, or some member typedef. Or try it with Python, and I bet you'll call .get() on something and then forget to check for None somewhere.
I would love to be proven wrong though. If anyone knows of someone who does this with a modern language, please share.
They invented .gitignore to prevent those files from getting checked into the repository.
Head, paper, keyboard is what we did in the 80s, when compilers were too slow to afford throwing code at them and fixing the errors later. Was the code in that HN story a substantial piece of software or some 100-line program? Our programs used to be small.
For the Python people, it seems a matter of habit and culture. When a person has gone down a certain direction for so long, it can be really hard to change. That's why I think it's a good idea to be exposed to other languages early on, so the person will have seen other type systems and other ways of doing things. There wouldn't be so much trauma and drama when confronted with types or differences.
I had a play with Dart a while back. It felt like Python with types designed in from the outset. Would quite like to use it more seriously.
It's in that funny position though where it is in danger of becoming synonymous with Flutter. Like Ruby and Rails.
Isn't the rust type system fairly off-topic here? Python is a dynamic language, Rust is on the other end of the scale.
The Rust Evangelism Strike Force used to be more subtle! (joke)
> (and some cool hackery to make SQLAlchemy feel very typed and work nicely with Pydantic).
Sounds interesting. Can you elaborate on the cool hackery? We introduced SQLModel recently but struggle in a few cases (e.g. multi-level joins). Do you know of reference projects for SQLAlchemy and Pydantic?
My info is maybe a bit dated, as it's been a while since we wrote this hackery. We also adopted SQLModel at some point but we had to patch it to work well (I think some of my contributions are now in upstream). As for some of the hacks:
    import typing as t

    import sqlalchemy as sa
    import sqlmodel as sqlm
    from sqlalchemy.exc import NoResultFound
    from sqlalchemy.orm import relationship
    from sqlalchemy.sql import Select

    # Assumed context (not shown in the original): BaseT is a TypeVar over our
    # SQLModel base class, T_ is a TypeVar over any model, and
    # BaseReadOnlySqlSession is our async session wrapper.

    def c(prop: t.Any) -> sa.Column:  # type: ignore
        return prop

To make it possible to access sqlmodel properties as columns for doing things like `in_` but still maintaining type safety.

Added types ourselves to the base model like this:

    __table__: t.ClassVar[sa.Table]

Added functions that help with typing like this:

    @classmethod
    async def _fetch_one(
        cls: t.Type[BaseT], db: BaseReadOnlySqlSession, query: Select
    ) -> t.Optional[BaseT]:
        try:
            return (await db.execute(query)).scalar_one()
        except NoResultFound:
            return None

and stuff like this for relationships:

    def ezrelationship(
        model: t.Type[T_],
        id_our: t.Union[str, sa.Column],  # type: ignore
        id_other: t.Optional[t.Union[t.Any, sa.Column]] = None,  # type: ignore
    ) -> T_:
        if id_other is None:
            id_other = model.id
        return sqlm.Relationship(
            sa_relationship=relationship(model, primaryjoin=f"foreign({id_our}) == {id_other}")
        )

    def ezrelationship_back(
        id_our: t.Union[str, sa.Column],  # type: ignore
        id_other: t.Union[str, sa.Column],  # type: ignore
    ) -> t.Any:
        model, only_id2 = id_other.split(".")
        return sqlm.Relationship(
            sa_relationship=relationship(
                model,
                primaryjoin=f"foreign({id_our}) == {id_other}_id",
                back_populates=only_id2,
            )
        )

I hope this helps. I don't have time to find all the stuff, but we also hacked on SQLAlchemy a bit, and in other places.

> The Python type system is pretty bad
Coming from the perspective of a religious Python hater, their type hints are better than you give them credit for: they support generics, nominative and structural typing, unions, a bottom type, and literals.
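A quick tour of those features (Never needs 3.11+; NoReturn works before that, and plain classes give you the nominative part for free):

    import typing as t

    T = t.TypeVar("T")

    class Stack(t.Generic[T]):  # generics
        def push(self, item: T) -> None: ...

    class SupportsClose(t.Protocol):  # structural: anything with .close() matches
        def close(self) -> None: ...

    Mode = t.Literal["r", "w"]  # literal types

    def open_file(path: t.Union[str, bytes], mode: Mode) -> None: ...  # unions

    def fail(msg: str) -> t.Never:  # bottom type
        raise RuntimeError(msg)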
What is missing is mainstream adoption in libraries which is a matter of time.
Optional typing is always a castle built on sand. I don't see Python typing ever becoming reliable, because there's no way you can retrofit the entire ecosystem that thoroughly.
> What is missing is mainstream adoption in libraries which is a matter of time.
I don't think that's a big problem anymore. Between typeshed and typing's overall momentum, most libraries have at least decent typing and those that don't often have typed alternatives.
> I don't think that's a big problem anymore.
ORMs have entered the chat…
These sometimes use a lot of dynamic modification, such as adding implicit ID fields or adding properties to navigate a relationship with another type that is defined in code only from the other side.
It can also be awkward to deal with “not null” database fields if the way the ORM model classes are defined means fields are nullable as far as the Python type hints are concerned, yet the results of an actual database query should never have a null value there. Guarding against None every time you refer to one of them is tedious.
I’m not exactly the world’s loudest advocate for ORMs anyway, but on projects that also try to take type safety seriously, they do seem to be a bit of a dark corner within the Python ecosystem.
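A sketch of that tedium, with a hypothetical User model whose email column is NOT NULL in the schema but Optional in the hints:

    def display_name(user: "User") -> str:
        # The database guarantees email is never NULL, but the ORM's hints
        # don't know that, so every access needs a guard like this.
        if user.email is None:
            raise RuntimeError("unreachable: email is NOT NULL")
        return user.email.split("@")[0]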
Can you give some examples of how the Python type system is disappointing you?
As a heavy user of Python’s type annotations, I’m very happy with them, but I would like for them to be first class at runtime, so I can do useful and interesting things with them. The status quo is that a type annotation can be a class, a string, or a “typing special form.” I would like for a type annotation to be an object that could exist independently and be treated as a value, and this is only sometimes true.
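For example, the same function can carry all three kinds of annotation at once:

    import typing

    def f(a: int, b: "str", c: typing.Optional[int]) -> None: ...

    print(f.__annotations__)
    # {'a': <class 'int'>, 'b': 'str', 'c': typing.Optional[int], 'return': None}
    # ...a class, a bare string, and a special form: three different kinds of value.

    print(typing.get_type_hints(f))  # resolves the string to <class 'str'>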
Mainly, there seems to be no way, in a dynamic language, to dynamically check whether functions get the right types.
To me, this means I don't really understand the python type hinting at all, as adding hints to just one or two functions provides no value to me at all.
I assume I must be not using them usefully, as I've tried adding type hints to some projects and they just seemed to do nothing useful.
You want runtime typechecking.
See either beartype [1] or typeguard [2]. And if you're doing any kind of array-based programming (JAX or not), then jaxtyping [3].
[1] https://github.com/beartype/beartype/

[2] https://github.com/agronholm/typeguard

[3] https://github.com/patrick-kidger/jaxtyping
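The shape of it with beartype's decorator, roughly:

    from beartype import beartype

    @beartype
    def greet(name: str) -> str:
        return f"hello {name}"

    greet("world")  # fine
    greet(42)  # raises a beartype violation at call time, not a silent pass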
Thanks for posting this. I had seen beartype several years ago but I don't believe it had the whole-module registration feature yet. I'm looking forward to trying both of the libraries since the ergonomics are better than decorating every function individually.
Type hints alone don't do this, but you can use Pydantic to accomplish what you want. In Python type hints aren't enforced anywhere at runtime. They're for a type-checker to validate your source.
https://docs.pydantic.dev/latest/concepts/validation_decorat...
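Roughly, with Pydantic v2's validate_call (see the linked docs for the exact behaviour):

    from pydantic import validate_call

    @validate_call
    def set_volume(level: int) -> int:
        return level

    set_volume(7)  # fine
    set_volume("7")  # coerced to 7 by default (lax mode)
    set_volume("loud")  # raises a ValidationError at runtime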
Default values! Since type hints are *hints*, it is difficult to set default values for complicated types. For instance, if you have lists, dicts, or sets in the type signature, defaults are difficult and non-standard without a library like Pydantic. This becomes even more problematic when you start doing more complicated data structures. The configuration in this library starts to show the problems: https://koxudaxi.github.io/datamodel-code-generator/custom_t...
The issue very much is the lack of a standard across the entire language, rather than it not being possible.
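The classic form of the problem, and the standard-library workaround, as a sketch:

    from dataclasses import dataclass, field

    def broken(items: list[str] = []) -> list[str]:
        items.append("x")  # the single shared default list grows on every call
        return items

    @dataclass
    class Config:
        # A bare `tags: list[str] = []` is rejected by dataclasses outright;
        # the workaround is a factory per field.
        tags: list[str] = field(default_factory=list)
        limits: dict[str, int] = field(default_factory=dict)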
I might be dense, but I don't understand what that has to do with type hints...
To my eyes, the problem of choosing useful defaults for complicated types/datastructures is independent of whether I add type hints for them.
I think I am missing something...
If it's 100x better than no types, then it's probably 10x better than the C++ type system. It takes some time to unlearn using dicts everywhere, but then namedtuples become your best friend and noticeably improve maintainability. Probably the only place where the Python type system feels inadequate is describing JSON-like data near the point of its (de)serialization.
Pretty much anywhere you're tempted to use a namedtuple, you should be using a dataclass[0] instead.
And typing JSON-like data is possible with TypedDict[1].
[0] https://docs.python.org/3/library/dataclasses.html
[1] https://docs.python.org/3/library/typing.html#typing.TypedDi...
Why? I thought one should prefer immutability. As for typed dicts... yes, I'm mostly stuck on old Python versions; nice reminder.
You can use TypedDict from `typing_extensions` if your version doesn't have it. You can use a lot of the newer stuff from there, too, especially if you enable `__future__.annotations`.
How old is your Python, though? TypedDict is from 3.8. That was 5 years ago.
You can use

    @dataclass(frozen=True)

to create an immutable data class. There's TypedDict, which is decent for a JSON-like data structure if the types are simple. It doesn't have the bells and whistles of Pydantic, but it gets the job done for passing predictable dicts around and ensuring consistency while developing.
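Both together, roughly:

    from dataclasses import dataclass
    from typing import TypedDict

    @dataclass(frozen=True)
    class Point:
        x: float
        y: float

    class UserJson(TypedDict):  # the shape of a JSON payload, checked statically
        id: int
        name: str

    p = Point(1.0, 2.0)
    p.x = 3.0  # raises FrozenInstanceError: frozen instances are immutable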