The thing that the author says they would prefer is already in Python, it's called NewType (https://docs.python.org/3/library/typing.html#typing.NewType)
They say "...so I can't create a bunch of different names for eg typing.Any and then expect type checkers to complain if I mix them."
`MyType = NewType('MyType', Any)`
is how you do this.
At the end, they suggest a workflow: "I think my ideal type hint situation would be if I could create distinct but otherwise unconstrained types for things like function arguments and function returns, have mypy or other typing tools complain when I mixed them, and then later go back to fill in the concrete implementation details of each type hint"
That's just doing the above, but then changing the `NewType('MyType', Any)` to something like `NewType('MyType', list[dict[str, int]])` later when you want to fill in the concrete implementation.
This is great, thank you for this. I've always wanted something that would complain if I passed Meters to something expecting Feet, but aliases didn't consider this an error.
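A minimal sketch of the Meters/Feet case, assuming `float` as the underlying type. `Meters`, `Feet`, and `climb` are made-up names for illustration. As far as I know, mypy wants the second argument of `NewType` to be a subclassable class, so the sketch uses `float` rather than `Any`.

```python
from typing import NewType

# Two distinct types for the type checker; both are plain floats at runtime.
Meters = NewType("Meters", float)
Feet = NewType("Feet", float)


def climb(height: Feet) -> Feet:
    """Hypothetical function that only accepts Feet."""
    return Feet(height + 10.0)


altitude = Meters(100.0)
ok = climb(Feet(328.0))

# A type checker such as mypy should reject the call below, because Meters
# is not Feet. NewType adds no runtime checks, so it would still execute.
# climb(altitude)
```

A plain alias such as `Feet = float` would not produce that complaint, since an alias is interchangeable with its target.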
Why is it that many of the examples for the "typing.Protocol" class right below this involve meth??? Python WTF?
> After the code has stabilized I can probably go back to write type hints [...] but I'm not sure that this would provide very much value.
I think most developers who revisit their projects 6+ months later would disagree with the second part of this statement.
My typical flow for "quick scripts" is:
- On the first pass I'll add basic type hints (typing ":str" after a func param takes 0.2 seconds).
- For more complex data structures (think a JSON response from an API), dict (or typing.Dict) works fine.
If you want a Python project to be maintainable, type hints are a requirement imho.
And when I inherit a code base with no type hints, asking an LLM to have a go at adding them also takes no time at all.
That’s basically my approach and attitude, too. I was skeptical at first but I’d never go back to undecorated code. Paired with a decent language server, it’s soooo much easier writing correct code now.
I agree that writing type hints can be painful, especially if you are starting with a large code base that is mostly untyped. You might consider using RightTyper (https://github.com/RightTyper/RightTyper) - basically run your Python 3.12+ program with it, and it will add type hints to your code. It’s fast, basically automatic, and by design RightTyper avoids overfitting to your types, letting a type checker like mypy surface edge cases. In effect, the type checker becomes an anomaly detector. (Full disclosure, I am one of the authors of RightTyper.)
From the GitHub page:
RightTyper is a Python tool that generates types for your function arguments and return values. RightTyper lets your code run at nearly full speed with almost no memory overhead. As a result, you won't experience slow downs in your code or large memory consumption while using it, allowing you to integrate it with your standard tests and development process. By virtue of its design, and in a significant departure from previous approaches, RightTyper only captures the most commonly used types, letting a type checker like mypy detect possibly incorrect type mismatches in your code.
Dealing with this currently in a giant old legacy Python 2.7 codebase that was migrated to 3.10 in the past year. I do see the 3.12 requirement; is there a specific reason for it, something that isn't available in 3.10?
It relies on some features introduced in Python 3.12, specifically the `sys.monitoring` API.
Why is your GitHub organization’s logo, the AWS logo?
In addition to being a faculty member at UMass Amherst, I am an Amazon Scholar, working at Amazon Web Services, where this work was conducted.
how come it isn't hosted under the official aws organization? https://github.com/aws
Can’t honestly remember the details! But it’s very much an AWS-owned org.
I was part of a skunkworks project at eBay a decade ago. eBay's OSPO was pretty loose at the time. We made our own eBay org to release our stuff under, with their blessing. They linked to our projects from their site.
We've all since left, and eBay's OSPO has churned too. Some years ago, I ended up with a giant undismissible banner on every GitHub page that the org we made has been "flagged." Apparently someone at modern eBay saw that we used the eBay logo and threw a trademark fit about it.
GitHub's support is perhaps the most useless I've ever encountered. They take months to reply to an issue, and there's no notification system. They expect you to constantly check your issue to see if they've replied, and they'll close it if you don't reply to their arbitrarily timed messages promptly.
GitHub Support basically told me to fuck myself. They didn't care that it was sanctioned by eBay at the time we made it. They didn't care that I showed them the obsolete eBay OSPO repo that linked to our org. They gave me no avenue to talk to anyone to get it resolved, nor did they give me any way to dismiss the banner.
Unless I want to write a Chrome extension to dismiss it, my GH session will forever have a black mark that I took a contract at vintage eBay, that modern eBay forgot about and sicced itself on.
Similar to the author, I infrequently write Python code (though I have a long history with it), but I feel quite the opposite about type hints. A few specific comments:
- The LLMs can really help with typing tricky situations. If your editor can't already tell you what to use, asking an LLM usually can give me the answer.
- Type annotations for code that might change are a lifesaver, because when I change it later on I get a bunch of conflicts flagged wherever I still use it the old way.
- Feel free to add annotations where it makes sense and is easy, and if something doesn't make sense or it is too hard to figure out the right type, you can skip it and still gain the benefits of using it elsewhere.
- Annotations don't "force you to think about types", you already are thinking about types. I would argue they let you think a bit less about types, because the types are documented in function calls and returns. "Can I read() from input_file, or do I need to open().read()?" "input_file:Path" makes it better documented, without encoding the object type in the name.
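The `input_file: Path` point from the last bullet can be sketched as follows. `count_lines` is a hypothetical function; the annotation tells both the reader and the editor that `Path` methods such as `read_text()` are available.

```python
import tempfile
from pathlib import Path


def count_lines(input_file: Path) -> int:
    """The annotation documents that a Path is expected, not an open file."""
    return len(input_file.read_text().splitlines())


# Usage: write a small temporary file and count its lines.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("one\ntwo\nthree\n")
    line_count = count_lines(sample)
```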
I'm coming up on 30 years of using Python, and I never really missed typing, but honestly I write basically all of my new code with annotations because of the IDE benefits I get from it. I love that the Python implementation allows me to get the benefits without forcing it on me. In my distant past I very much loved coding in C, but was quite happy with Python's lack of strict typing. This feels like a good middle-ground.
The Python type system is pretty bad, but it's still 100x better than not using types. We are heavy users of the (Rust) type system at Svix, and it's been a godsend. I wrote about it here https://www.svix.com/blog/strong-typing-hill-to-die-on/
We also use Python in some places, including the shitty Python type-system (and some cool hackery to make SQLAlchemy feel very typed and work nicely with Pydantic).
Looking at that blog post, I find it illustrative of how people who like strong types and people who dislike strong types are addressing different forms of bugs. If the main issues come from bugs like 1 + "2" == "12", then strong types are a big help. They also enable many developers who spend the majority of their time in a programming editor to quickly get automatic help with such bugs.
The other side is those people who do not find those kind of bugs annoying, or who simply don't get hit by such bugs at a rate high enough to warrant using a strong type system. Developers who spend their time prototyping in ipython also get less out of strong types. The bugs those developers are concerned about are design bugs, like finding out why a bunch of small async programs reading from a message bus may stall once every second Friday, where the bug may be in a dependency of a dependency of a dependency that does not use a socket timeout. Types are similarly not going to help those who spend the vast majority of their time on bugs where someone finally says "This design could never have worked".
Take care to differentiate strong/weak typing from dynamic/static typing. Many dynamically typed languages (especially older ones) are also weakly typed, but some dynamic languages, like Python, are strongly typed. 1 + "2" == 12 is weak typing, and Python has strong typing. Type declarations are static typing, in contrast to traditional Python, which had (and still has) dynamic typing.
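A quick illustration of the strong-typing point: plain Python, with no type hints at all, refuses to add an int and a str at runtime. `try_add` is just a helper for the demonstration.

```python
def try_add(a, b):
    """Return a + b, or a description of the TypeError if Python refuses."""
    try:
        return a + b
    except TypeError as exc:
        return f"TypeError: {exc}"


result = try_add(1, "2")   # Python does not coerce either operand
explicit = str(1) + "2"    # string concatenation has to be asked for
numeric = 1 + int("2")     # and so does numeric addition
```

The conversion has to be spelled out either way, which is what separates this from the weakly typed `1 + "2"` behaviour.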
It's not about the bugs, it's about designing the layout of the program in types first (ie, laying out all of the data structures required) such that the actual coding of the functionality is fairly trivial. This is known as type driven development: https://blog.ploeh.dk/2015/08/10/type-driven-development/
At work, I find type hints useful as basically enforced documentation and as a weak sort of test, but few type systems offer decent basic support for the sort of things you would need to do type driven programming in scientific/numerical work. Things like making sure matrices have compatible dimensions, handling units, and constraining the range of a numerical variable would be a solid minimum.
I've read that F# has units, Ada and Pascal have ranges as types (my understanding is these are runtime enforced mostly), Rust will land const generics that might be useful for matrix type stuff some time soon. Does any language support all 3 of these things well together? Do you basically need fully dependent types for this?
Obviously, with discipline you can work to enforce all these things at runtime, but I'd like it if there was a language that made all 3 of these things straightforward.
I suspect C++ still comes the closest to what you’re asking for today, at least among mainstream programming languages.
Matrix dimensions are certainly doable, for example, because templates representing mathematical types like matrices and vectors can be parametrised by integers defining their dimension(s) as well as the type of an individual element.
You can also use template wizardry to write libraries like mp-units¹ or units² that provide explicit representations for numerical values with units. You can even get fancy with user-defined literals so you can write things like 0.5_m and have a suitably-typed value created (though that particular trick does get less useful once you need arbitrary compound units like kg·m·s⁻²).
Both of those are fairly well-defined problems, and the available solutions do provide a good degree of static checking at compile time.
IMHO, the range question is the trickiest one of your three examples, because in real mathematical code there are so many different things you might want to constrain. You could define a parametrised type representing open or closed ranges of integers between X and Y easily enough, but how far down the rabbit hole do you go? Fractional values with attached precision/error metadata? The 572 specific varieties of matrix that get defined in a linear algebra textbook, and which variety you get back when you compute a product of any two of them?
I'd be happy for just ranges on floats being quick and easy to specify even if the checking is at runtime (which it seems like it almost will have to be). I can imagine how to attach precision error/metadata when I need it with custom types as long as operator overloading is supported. I think similarly for specialized matrices, normal user defined types and operator overloading gets tolerably far. Although I can understand how different languages may be better or worse at it. Multiple dispatch might be more convenient than single dispatch, operator overloading is way more convenient than not having operator overloading, etc.
A lot of my frustration is that the ergonomics of these things tend to be not great even when they are available. Or the different pieces (units, shape checking, ranges) don't necessarily compose together easily because they end up as 3 separate libraries or something.
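A runtime-checked range on a float, as discussed above, could look roughly like this in Python. `Probability` is a hypothetical type; the check happens at construction time rather than in the type checker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probability:
    """Hypothetical float restricted to the closed range [0.0, 1.0]."""

    value: float

    def __post_init__(self) -> None:
        # Runs on every construction, including results of arithmetic.
        if not 0.0 <= self.value <= 1.0:
            raise ValueError(f"{self.value} is outside [0.0, 1.0]")

    def __mul__(self, other: "Probability") -> "Probability":
        # Operator overloading keeps results inside the checked type.
        return Probability(self.value * other.value)


both = Probability(0.5) * Probability(0.25)

try:
    Probability(1.5)
    rejected = False
except ValueError:
    rejected = True
```

This only covers ranges; composing it with units or shape checks is exactly the part that tends to need separate libraries.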
Crystal certainly supports that kind of typing, and being able to restrict bounds based on dynamic elements recently landed in GCC making it simple in plain C as well.
If x is of type T, what type do you want (x - x) to be?
That's a hard one because it depends on what sort of details you let into types and maybe even on the specific type T. Not saying what I'm asking for is easy! Units and shape would be preserved in all cases I can think of. But with subranges (x - x) may have a super-type of x... or if the type system is very clever the type of (x - x) may be narrowed to a value :p
And then there's a subtlety where units might be preserved, but x may be "absolute" whereas (x - x) is relative, and you can do operations with relative units you can't with absolute units and vice versa. Like the difference between x being a position on a map and delta_x being movement from a position. You can subtract two positions on a map in a standard mathematical sense but not add them.
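The absolute/relative distinction can be sketched in one dimension with two small classes. `Position` and `Displacement` are hypothetical names: subtracting two positions yields a displacement, adding a displacement to a position yields a position, and adding two positions is not defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Displacement:
    """Relative quantity: movement from one position to another."""

    dx: float

    def __add__(self, other: "Displacement") -> "Displacement":
        if not isinstance(other, Displacement):
            return NotImplemented
        return Displacement(self.dx + other.dx)


@dataclass(frozen=True)
class Position:
    """Absolute quantity: a point on the map."""

    x: float

    def __sub__(self, other: "Position") -> Displacement:
        if not isinstance(other, Position):
            return NotImplemented
        return Displacement(self.x - other.x)

    def __add__(self, other: Displacement) -> "Position":
        # Only a Displacement may be added; Position + Position is rejected.
        if not isinstance(other, Displacement):
            return NotImplemented
        return Position(self.x + other.dx)


a, b = Position(10.0), Position(4.0)
delta = a - b        # (x - x) has a different type than x
moved = b + delta    # position + displacement is a position again

try:
    a + b
    add_failed = False
except TypeError:
    add_failed = True
```

With the annotations in place, a static checker should also flag `a + b` before the code runs.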
What's even worse, when typing is treated as an indisputable virtue (and not a tradeoff), pretty much every team starts sacrificing readability for the sake of typing.
And lo and behold, they end up with _more_ design bugs. And the sad part is that they will never even recognize that too much typing is to blame.
Nonsense. You might consider it a tradeoff, but it's a very heavily skewed one. Minor downsides on one side, huge upsides on the other.
Also I would say type hints sacrifice aesthetics, not readability. Most code with type hints is easier to read, in the same way that graphs with labelled axes and units are easier to read. They might have more "stuff" there which people might think is ugly, but they convey critical information which allows you to understand the code.
> Most code with type hints is easier to read
That has not been my experience in the past few years.
I've always been a fan of type hints in Python: the intention behind them was to contribute to readability, and when developers had that intention in mind, they worked really well.
However, with the release of mypy and TypeScript, engineering culture largely shifted towards a "typing is a virtue" mindset. Type hints are no longer a documentation tool, they are a constraint enforcing tool. And that tool is often at odds with readability.
Readability is subjective and ephemeral, type constraints (and intellisense) are very tangible. Naturally, developers are failing to find balance between the two.
I write a lot of typescript and rust. In those languages, when I want to understand some code I haven’t seen before, I always start by reading the types. Understanding what and how the data moves through a system is usually key to understanding everything. And usually I lean heavily on my editor for this - in typescript there’s a lot of value in the simple act of hovering over values to see what type they are.
I’m working with a medium size python program at the moment. It’s mostly written by someone smart but early career, and they’ve made a rabbit warren of classes and mixins that get combined in complex ways. I’ve been encouraging him to add types - and wherever those types exist, the code becomes 100% more legible to my code editor - and ultimately to me.
I don’t think I’d bother with types in Python for small programs. But my experience is that good type hints lay out a welcome mat to anyone who comes along later to figure the code out. And honestly, a lot of the time that person is the original author, just months or years after the code was written.
> pretty much every team starts sacrificing readability
People are sacrificing this when they start using python in the first place
> The other side is those people who do not find those kind of bugs annoying
Anecdotally, I find these are the same people who work less effectively and efficiently. At my company, I know people who mainly use Notepad++ for editing code when VSCode (or another IDE) is readily available, who use print over debuggers, who don't get frustrated by runtime errors that could be caught in IDEs, and who opt out of using coding assistants. I happen to know as a matter of fact that the person who codes in Notepad++ frequently has trivial errors, and generally these people don't push code out as fast as they could.
And they don't care to change the way they work even after seeing the alternatives and knowing they are objectively more efficient.
I am not their managers, so I say to myself "this is none of my business" and move on. I do feel pity for them.
Well, using print over debuggers is fairly common in Rust and other languages with strong type systems. Because the compiler goes to extreme lengths to detect bugs before the program even runs, most remaining bugs come down to not knowing the value of an expression at a single point in the program flow, which is where dbg! shines. I agree with all the other points though.
Anecdotally, I was just writing a generic BPE implementation, and spent a few hours tracking down a bug. I used debug statements to look at the values of expressions, and noticed that something was off. Only later did I figure out that I modified a value, but used the old copy: a simple logic error that #[must_use] could have prevented. cargo clippy -W pedantic is annoying, but this taught me I'd better listen to what it has to say.
>these people don't push code out as fast as they could.
Well, one of my coworkers pushes code quite fast, and he is also the one who gets rejected most often because he keeps adding .tmp, .pyc and even .env files to his commits. I guess "git add asterisk" is faster, and thus more efficient, than adding files slowly or taking time to edit gitignore.
Not so long ago I read a story here on HN about a guy who first coded in his head, then wrote everything on paper, and finally coded it on a computer. It compiled without errors. Slow pusher? Inefficient?
> Not so long ago I read a story here on HN about a guy who first coded in his head, then wrote everything on paper, and finally coded it on a computer. It compiled without errors. Slow pusher? Inefficient?
I've read and heard stories about these folks too, apparently this was more common decades ago.
To be clear, I don't think I could pull it off with any language. It's quite impressive and admirable to get things right on the first try.
Having said that, the thing is, languages were a lot simpler back then too. I'm not convinced this is realistically even possible with today's languages unless you constrain yourself to some overly restrictive subset. Like try this with C++, and I would be shocked if you can write nontrivial programs without getting compiler errors. Like to give a trivial example, every time I write my own iterator class for a container, I miss something when I hit compile: like either a comparison operator, or subtraction, or conversion to const iterator, or post-decrement, or subscript, or some member typedef. Or try it with python, and I bet you'll call .get() on something and then forget to check for null somewhere.
I would love to be proven wrong though. If anyone knows of someone who does this with a modern language, please share.
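The `.get()` pitfall mentioned above is one of the cases where hints help: `dict.get` is typed as possibly returning `None`, so a checker such as mypy should complain if the result is returned as an `int` without a check. `PORTS` and `port_for` are made-up names for the sketch.

```python
from typing import Optional

PORTS: dict[str, int] = {"http": 80, "https": 443}


def port_for(scheme: str) -> int:
    port: Optional[int] = PORTS.get(scheme)  # may be None for unknown keys
    # Dropping this check leaves an Optional[int] being returned as int,
    # which a type checker is expected to report.
    if port is None:
        raise KeyError(f"unknown scheme: {scheme}")
    return port
```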
I think there is another overlooked factor: some languages’ type systems suck and your opinion of types depends more on your first experience rather than a true comparison.
I think you're missing the point of the blog a bit, as the `1 + "2" == "12"` type of issue wasn't it. That also sucks, and it is much more common than you make it sound (especially when refactoring), but it's definitely not the point.
Anyhow, no need to rehash the same arguments, there was a long thread here on HN about the post, you can read some of it here: https://news.ycombinator.com/item?id=37764326
> Writing software without types lets you go at full speed. Full speed towards the cliff.
Isn't it strange that back when Python (or Ruby) didn't even have type hints (not type checkers, type hints!), it would easily outperform pretty much every heavily typed language?
Somehow when types weren't an option we weren't going towards the cliff, but now that they are, not using them means jumping off a cliff? Something doesn't add up.
It's because the nature of typing has changed drastically over the last decade or so, in well known languages, going from C++/Java's `FancyObject *fancyObject = new FancyObject()` (which was definitely annoying to type, and was seen as a way to "tell the compiler how to arrange memory" as opposed to "how do we ensure constraints hold?") to modern TypeScript, where large well-typed programs can be written with barely a type annotation in sight.
There's also a larger understanding that as programs get larger and larger, they get harder to maintain and more importantly refactor, and good types help with this much more than brittle unit tests do. (You can also eliminate a lot of busywork tests with types.)
No it hasn’t? C++ type system has hardly changed (until concepts) and is one of the most powerful available.
A certain generation of devs thought types were academic nonsense and then relearned the existence of those features in other languages. Now they are zealots about using them.
I think the point is that in newer languages like typescript, the price paid for static typing is lower because type inference does so much of the leg work. You get all the benefits of static typing, and the cost is usually tiny - you just need to define your types (a valuable exercise regardless) and add them to function signatures.
We’ve come a long way from the C++ or Java I wrote when I was young, where types were named and renamed constantly. As I understand it, even C++ has the auto keyword now.
Large programs are harder to maintain because people don't have the balls to break them into smaller ones with proper boundaries. They prefer incremental bandaids like type hints or unit tests that make it easier to deal with the big ball of mud, instead of not building the ball in the first place.
Every single typed system I have ever worked on, no matter how poorly designed, has been easier to alter than the vast majority of ruby, python, perl, php, and elixir that I've worked on
I have the opposite experience:
Inserting a library that wraps an existing one to add new features has been a nightmare in every statically typed language I’ve used — including times it’s virtually impossible because you’d need the underlying library to understand the wrapper type in its methods.
In Python (with duck typing), that’s a complete non-issue.
Can you give an example? I think part of the problem is that mixins and such are so hard to do in most statically typed languages that programmers just don’t code things that way.
I see your point - I certainly find myself reaching for clever high level patterns less in typescript than I do in JavaScript because complex typing can get in the way. But also, programs that make heavy use of metaprogramming are often, also, harder to read and debug. There’s something very nice and straightforward about explicit, concrete types.
> back when Python (or Ruby) didn't even have type hints (not type checkers, type hints!), it would easily outperform pretty much every heavily typed language?
No it didn't. It outperformed Java 1.2, and people thought that Java 1.2 was what a typed language looked like. Python always sucked compared to OCaml (let alone OCaml with a decent IDE), but OCaml had a weird syntax and the documentation was in French, so no-one cared. Now that we finally have a copy of OCaml with curly braces and a critical mass of obnoxious fanboy hype, more people have noticed.
> when types weren't an option we weren't going towards the cliff
Erm yes we were. Untyped Python wasn't magically tolerable just because type hints hadn't been implemented yet.
How come all those unicorns were built with intolerable Python/Ruby, not Java/C#/Go?
https://charliereese.ca/y-combinator-top-50-software-startup...
They are likely leveraging Django/Rails which treads the beaten path for Startups.
Startups are also more likely to do monoliths.
For Enterprise & microservices, you will start to see more Java/Go/C#.
I would expect dynamic type crowd to embrace microservices first, given how everybody says that dynamic codebases are a huge mess.
Regardless, to me enterprise represents legacy, bureaucracy, incidental complexity, heavy typing, stagnation.
I understand that some people would like to think that heavy type-reliance is a way for enterprise to address some of its inherent problems.
But I personally believe that it's just another symptom of enterprise mindset. Long-ass upfront design documents and "designing the layout of the program in types first" are clearly of the same nature.
It's no surprise that Typescript was born at Microsoft.
You want your company to stagnate sooner? Hyperfixate on types. Now your startup can feel the "joys" of enterprise even at the seed stage.
Eh. The amount of work it takes to specify your types in a typescript program is tiny. Type inference does almost all of the work. And the benefit of that work is largely felt in maintenance & onboarding, since the code is easier to read when you’re new and come back to later. Refactoring large JavaScript programs is a nightmare.
The real enterprise death doesn’t come from types. It comes from tasteless overuse of classes, especially once you have a complex web of long-lived objects that all reference each other. Significant portions of the code in these codebases end up dedicated to useless tasks like lifecycle management instead of the actual work of your application. It’s kind of the code version of corporate bureaucracy: classes everywhere devoted to doing BS jobs.
It’s not complicated people. Just write the code that tells the computer what you want it to do. No more. Unnecessary encapsulation and premature abstraction will kill your velocity dead.
This distinction makes no sense. Can you explain why types would be more relevant?
Actually I don't think types are relevant here. People are choosing based on other weighted factors like toolchain, ecosystem, products, and culture.
For the Python people, it seems a matter of habit and culture. When a person has gone down a certain direction for so long, it can be really hard to change. Think that's why it is a good idea to be exposed to other languages earlier on, where the person would have seen other type systems and other ways of doing things. There wouldn't be so much trauma and drama, when confronted with types or differences.
Isn't the rust type system fairly off-topic here? Python is a dynamic language, Rust is on the other end of the scale.
The Rust Evangelism Strike Force used to be more subtle! (joke)
I had a play with Dart a while back. It felt like Python with types designed in from the outset. Would quite like to use it more seriously.
It's in that funny position though where it is in danger of becoming synonymous with Flutter. Like Ruby and Rails.
> (and some cool hackery to make SQLAlchemy feel very typed and work nicely with Pydantic).
Sounds interesting. Can you elaborate on the cool hackery? We introduced SQLModel recently but struggle in a few cases (e.g. multi-level joins). Do you know reference projects for SQLAlchemy and pydantic?
My info is maybe a bit dated, as it's been a while since we wrote this hackery. We also adopted SQLModel at some point but we had to patch it to work well (I think some of my contributions are now in upstream). As for some of the hacks:

    def c(prop: t.Any) -> sa.Column:  # type: ignore
        return prop

To make it possible to access sqlmodel properties as columns for doing things like `in_` but still maintaining type safety.

Added types ourselves to the base model like this:

    __table__: t.ClassVar[sa.Table]

Added functions that help with typing like this:

    @classmethod
    async def _fetch_one(cls: t.Type[BaseT], db: BaseReadOnlySqlSession, query: Select) -> t.Optional[BaseT]:
        try:
            return (await db.execute(query)).scalar_one()
        except NoResultFound:
            return None

and stuff like this for relationships:

    def ezrelationship(
        model: t.Type[T_],
        id_our: t.Union[str, sa.Column],  # type: ignore
        id_other: t.Optional[t.Union[t.Any, sa.Column]] = None,  # type: ignore
    ) -> T_:
        if id_other is None:
            id_other = model.id
        return sqlm.Relationship(sa_relationship=relationship(model, primaryjoin=f"foreign({id_our}) == {id_other}"))

    def ezrelationship_back(
        id_our: t.Union[str, sa.Column],  # type: ignore
        id_other: t.Union[str, sa.Column],  # type: ignore
    ) -> t.Any:
        model, only_id2 = id_other.split(".")
        return sqlm.Relationship(
            sa_relationship=relationship(
                model,
                primaryjoin=f"foreign({id_our}) == {id_other}_id",
                back_populates=only_id2,
            )
        )

I hope this helps, I don't have time to find all the stuff, but we also hacked on SQLAlchemy a bit, and in other places.
> The Python type system is pretty bad
Coming from the perspective of a religious python hater, their type hints are better than what you give credit for: Supports generics, nominative, structural, unions, bottom type, and literals.
What is missing is mainstream adoption in libraries which is a matter of time.
Optional typing is always a castle built on sand. I don't see Python typing ever becoming reliable, because there's no way you can retrofit the entire ecosystem that thoroughly.
> What is missing is mainstream adoption in libraries which is a matter of time.
I don't think that's a big problem anymore. Between typeshed and typing's overall momentum, most libraries have at least decent typing and those that don't often have typed alternatives.
> I don't think that's a big problem anymore.
ORMs have entered the chat…
These sometimes use a lot of dynamic modification, such as adding implicit ID fields or adding properties to navigate a relationship with another type that is defined in code only from the other side.
It can also be awkward to deal with “not null” database fields if the way the ORM model classes are defined means fields are nullable as far as the Python type hints are concerned, yet the results of an actual database query should never have a null value there. Guarding against None every time you refer to one of them is tedious.
I’m not exactly the world’s loudest advocate for ORMs anyway, but on projects that also try to take type safety seriously, they do seem to be a bit of a dark corner within the Python ecosystem.
Can you give some examples of how the Python type system is disappointing you?
As a heavy user of Python’s type annotations, I’m very happy with them, but I would like for them to be first class at runtime, so I can do useful and interesting things with them. The status quo is that a type annotation can be a class, a string, or a “typing special form.” I would like for a type annotation to be an object that could exist independently and be treated as a value, and this is only sometimes true.
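To make the "class, string, or special form" point concrete, here is a small sketch using only the standard library. `greet` is a hypothetical function whose three parameters are annotated in the three different ways.

```python
from typing import Optional, get_type_hints


def greet(name: "str", times: int = 1, suffix: Optional[str] = None) -> str:
    return (f"hello {name}" + (suffix or "")) * times


raw = greet.__annotations__        # what was literally written
resolved = get_type_hints(greet)   # string annotations evaluated

name_is_string = isinstance(raw["name"], str)                # still the text "str"
name_is_class = resolved["name"] is str                      # now the real class
suffix_is_class = isinstance(resolved["suffix"], type)       # special form, not a class
```

So even after resolution, code that inspects annotations has to handle classes and special forms differently.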
Mainly, there seems to be no way, in a dynamic language, to dynamically check if functions get the right types.
To me, this means I don't really understand the python type hinting at all, as adding hints to just one or two functions provides no value to me at all.
I assume I must be not using them usefully, as I've tried adding type hints to some projects and they just seemed to do nothing useful.
You want runtime typechecking.
See either beartype [1] or typeguard [2]. And if you're doing any kind of array-based programming (JAX or not), then jaxtyping [3].
[1] https://github.com/beartype/beartype/
Thanks for posting this. I had seen beartype several years ago but I don't believe it had the whole-module registration feature yet. I'm looking forward to trying both of the libraries since the ergonomics are better than decorating every function individually.
Type hints alone don't do this, but you can use Pydantic to accomplish what you want. In Python type hints aren't enforced anywhere at runtime. They're for a type-checker to validate your source.
https://docs.pydantic.dev/latest/concepts/validation_decorat...
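To make the "not enforced at runtime" point concrete, here is a minimal standard-library sketch. `check_types` is a hypothetical decorator written for illustration; it is not how Pydantic, beartype, or typeguard work internally, and it only handles plain classes, not generics.

```python
import functools
import inspect
import typing


def check_types(func):
    """Hypothetical decorator: check plain-class annotations at call time."""
    hints = typing.get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; generics like list[int] are skipped.
            is_plain_class = (
                typing.get_origin(expected) is None and isinstance(expected, type)
            )
            if is_plain_class and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


def unchecked_double(x: int) -> int:
    return x * 2  # the hint is ignored at runtime


@check_types
def checked_double(x: int) -> int:
    return x * 2
```

Calling `unchecked_double("ab")` happily returns a string, while `checked_double("ab")` raises `TypeError`. The real libraries cover far more of the type system than this sketch does.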
default values! Since type hints are *hints*, it is difficult to set default values for complicated types. For instance, if you have lists, dicts, sets in the type signature, without a library like pydantic, it is difficult and non-standard. This becomes even more problematic when you start doing more complicated data structures. The configuration in this library starts to show the problems. https://koxudaxi.github.io/datamodel-code-generator/custom_t...
The issue very much is a lack of a standard for the entire language; rather than it not being possible.
I might be dense, but I don't understand what that has to do with type hints...
To my eyes, the problem of choosing useful defaults for complicated types/datastructures is independent of whether I add type hints for them.
I think I am missing something...
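For what it's worth, the standard library does have one conventional answer for mutable defaults: `dataclasses.field(default_factory=...)`. A minimal sketch, assuming Python 3.9+ for the `list[str]` spelling; `JobConfig` and its fields are made-up names.

```python
from dataclasses import dataclass, field


@dataclass
class JobConfig:  # hypothetical example class
    name: str
    retries: int = 3
    # Each instance gets its own fresh list and dict.
    tags: list[str] = field(default_factory=list)
    limits: dict[str, int] = field(default_factory=lambda: {"cpu": 1})


a = JobConfig("a")
b = JobConfig("b")
a.tags.append("nightly")  # does not leak into b.tags
```

This covers lists, dicts, and sets; whether it counts as "standard enough" for more deeply nested structures is a fair question.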
If it’s 100x better than no types, then probably 10x better than C++ type system. It takes some time to unlearn using dicts everywhere, but then namedtuples become your best friend and noticeably improve maintainability. Probably the only place where python type system feels inadequate is describing json-like data near the point of its (de)serialization.
Pretty much anywhere you're tempted to use a namedtuple, you should be using a dataclass[0] instead.
And typing JSON-like data is possible with TypedDict[1].
[0] https://docs.python.org/3/library/dataclasses.html
[1] https://docs.python.org/3/library/typing.html#typing.TypedDi...
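A short sketch of the TypedDict suggestion for JSON-like data. `UserPayload` and `parse_user` are hypothetical names, and this assumes Python 3.9+ for `list[str]`. Note that TypedDict only informs the type checker; it does no validation at runtime, so the conversion below is done by hand.

```python
import json
from typing import TypedDict


class UserPayload(TypedDict):  # hypothetical shape of an API response
    id: int
    name: str
    tags: list[str]


def parse_user(raw: str) -> UserPayload:
    data = json.loads(raw)
    # Build the dict explicitly so the declared shape is actually true.
    return {
        "id": int(data["id"]),
        "name": str(data["name"]),
        "tags": list(data.get("tags", [])),
    }


user = parse_user('{"id": 7, "name": "ada"}')
```

At runtime `user` is an ordinary dict, so it serializes back to JSON without any extra step.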
Why? I thought one should prefer immutability. As for typed dicts.. yes, I’m mostly stuck on old python versions, nice reminder.
You can use TypedDict from `typing_extensions` if your version doesn't have it. You can use a lot of the newer stuff from there, too, especially if you enable `__future__.annotations`.
How old is your Python, though? TypedDict is from 3.8. That was 5 years ago.
You can use:
> @dataclass(frozen=True)
to create an immutable data class. There’s also TypedDict, which is decent for a JSON-like data structure if the types are simple. It doesn’t have the bells and whistles of Pydantic, but it gets the job done for passing predictable dicts around and ensuring consistency while developing.
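A minimal sketch of the frozen dataclass mentioned above (`Point` is just an example name):

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float


origin = Point(0.0, 0.0)
# Frozen instances cannot be mutated; replace() builds a modified copy.
moved = replace(origin, x=1.5)

try:
    origin.x = 9.0  # rejected at runtime
except FrozenInstanceError:
    mutation_blocked = True
```

Frozen dataclasses are also hashable by default, so they can be used as dict keys much like namedtuples.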
Something I didn't see mentioned much here is refactoring. Refactoring without types is like walking in the dark. You have to check everywhere to see how your changes impact other code, and you'll certainly miss some and only find out in production. With typing, when you change your type signature, you can just run the checker and get a list of places you need to change.
Yeah to me this is the biggest difference between static/dynamic types. I mean there are a LOT of differences DX-wise, but refactoring is so scary without static types.
If we need to make changes to the DB at work, I’ll just update the prisma schema and run ‘npx prisma generate’ followed by ‘tsc --noEmit’ to instantly see all the affected areas. I feel like there are a lot of similar little superpowers you get by having a nice static type system.
The logic of type hints is not bad, but sadly I think that type hints are making Python source code messy and unreadable.
I miss simple functions with explicit argument names and docstrings in which argument types and descriptions are clearly but discreetly documented.
It was one of Python's big strengths to have such simple, clean code without too much boilerplate.
Also, I have the feeling that static typing extremists are trying to push the idea that type hinting ensures you don't mix types, as if mixing them would be bad. But from my point of view, polymorphism and type mixing are a major strength of Python.
Like having dictionaries that are able to hold whatever you want is so incredible when you compare to trying to do the equivalent in Java for example.
One part where I find type hint to be wonderful still is for things like pydantic and dataclasses!
> Like having dictionaries that are able to hold whatever you want is so incredible when you compare to trying to do the equivalent in Java for example.
Can't you just make a dictionary of objects, same as in C#? Except that in C#, if you really want to, you can also use `dynamic` to get python-like behavior.
Otherwise, generally speaking, in a strongly typed language you want to figure out what those objects have in common and put that inside an interface. If you can't modify those objects just slap an adapter pattern on top.
The result is a dictionary of objects that adhere to a specific interface, which defines all the properties and procedures that are relevant to the domain and the problem.
This makes thinking about the problem much easier from a type theoretical perspective because it lets you abstract away the concrete details of each object while preserving the fundamental aspects that you care about.
I guess that it takes two different mindsets to in order to appreciate the pros and cons of dynamic vs static programming. There are certainly many pros for dynamic programming, but I'm more comfortable thinking about problems in generic terms where every relation and constraint is laid bare in front of me, one level removed from the actual implementation.
> I think that type hints are making Python source code messy and unreadable.
I hear this sentiment a lot from people who rarely use strict(er) typed languages: Rust, C++, Java, C#, Go, etc. Can you imagine a developer in any of those languages complaining that "oh, now the code is messy and unreadable because we added explicit types"? It seems bizarre to think about it. Sure, Java and C# are a bit repetitive, but at least you always know the type. There is an ongoing debate in C++, Java, and C# about whether the newish "auto"/"var" keyword is a good idea for hiding explicit local variable types. The real issue: the person who wrote the code already knows the implicit types. People reading the code, however, have a harder time working them out.
> Can you imagine a developer in any of those languages complaining that "oh, now the code is messy and unreadable because we added explicit types"?
Python used to be described as "executable pseudocode". None of the languages you've listed have ever been considered that easy to read.
Making Python look more like them is therefore a step backwards in terms of cleanliness and readability.
> The logic of type hints is not bad, but sadly I think that type hints are making Python source code messy and unreadable.
Compared to legacy Python, yes.
Compared to a verbose language like Java, no. Python typing is as verbose as Java or less so (unless you use "var" in Java).
Technically, Python typing is more verbose than Java because it uses more tokens. Compare these:
Python: def foo(x: int, y: int) -> int: return x + y
Java: int foo(int x, int y) { return x + y; }
Python uses colons and arrows while Java uses positions to encode where the type should go. Python has union types, and you can type something as a container type with no type parameters.
You can, but it defeats the purpose of typing. It makes the code a little more complicated to write and more verbose for almost no benefit. That is my point.
Hear, hear. I often spend five times as long peddling about with the type annotations. Most of the “bugs” I find with the type checker are type annotation bugs, not actual software bugs.
What type annotations do however deliver is useful completion via LSP.
Type hints are nice, until you have to interact with a library that isn't type-hinted, and then it very quickly becomes a mess.
I don't know how other IDEs behave, but VScode + the Python extensions try to infer the missing hints and you end up with beauties such as `str | None | Any | Unknown`, which of course are completely meaningless.
Even worse, the IDE marks as an error some code that is perfectly correct, because it somehow doesn't match those nonsensical hints. And so it gives you the worst of both worlds: a lot of false positives that you quickly learn to ignore, dooming the few actual type errors to irrelevance, because you'll ignore them anyways until they blow up at runtime, just as it'd happen without typehints.
> Type hints are nice, until you have to interact with a library that isn't type-hinted, and then it very quickly becomes a mess.
Whenever I find myself in that situation, I usually write a typing stub for the parts that I use from that library (example: [0]) and then let `mypy_path` point to that directory [1].
VS Code will then pick up the hints from those stubs.
[0]: https://github.com/claui/itchcraft/blob/5ca04e070f56bf794c38...
[1]: https://github.com/claui/itchcraft/blob/5ca04e070f56bf794c38...
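For anyone who hasn't written one, a stub is just signatures with `...` bodies. The sketch below shows what a stub for an untyped package might contain; `somelib`, `Connection`, and `connect` are made-up names, and the file would live in whatever directory `mypy_path` points at (for example `typings/somelib/__init__.pyi`, a path assumed here for illustration).

```python
# Contents of a hypothetical stub file for an untyped package "somelib".
# Only the parts of the library that are actually used need to be declared.


class Connection:
    def send(self, payload: bytes) -> int: ...
    def close(self) -> None: ...


# "= ..." marks a parameter as having a default without stating its value.
def connect(host: str, port: int = ..., *, timeout: float = ...) -> Connection: ...
```

The stub carries no behavior; it only tells the checker and the editor what the real library accepts and returns.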
>I don't know how other IDEs behave, but VScode + the Python extensions try to infer the missing hints and you end up with beauties such as `str | None | Any | Unknown`, which of course are completely meaningless.
Are they correct? If they're correct (even though they are a superset of the actual intended type) then what's the problem?
At worst, it's like not having type checks for that particular package.
They are verbose but correct. I've caught some errors this way.
I usually don't think of None as a potential return value (= voids in C) but the LSP code analysis usually picks up on code paths that don't return a value.
I don't find Python's typing valuable for Jupyter type explorations, but they're immensely valuable for catching little issues in production code.
For example, mypy by default does not check procedures that have no argument type annotations and no return type annotation. That gets rid of your whole untyped-library problem, if you have a wrapper procedure.
If VSCode still highlights it, then it is time to configure VSCode properly.
I believe VSCode by default uses pyright which is fast but shitty in that it gives a lot of false positives. If you want the most correct typing experience, use mypy. Even then you may need a config.
I get what you need, yet I find these cases aren't all that often, and when it happens it doesn't bother me as I quickly recognize where the type system is somewhat failing and either ignore it or add a type hint.
But maybe if you have a codebase that leans on a lot of magic from certain libraries, your experience is different. I also don't really depend on the typing or treat it the same as C# or Java.
I believe there's a mode for VS Code type checking which ignores untyped code - have you tried that?
Worst of both worlds is right. I came back to a Python project with a couple of critical but untyped dependencies recently after writing mostly Rust, and to clear up a large number of these (particularly “type is partially unknown”) I had the choice between lots of purely type-checking ceremony (`typing.cast`) or going without.
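For readers who haven't hit this, the `typing.cast` ceremony looks roughly like the sketch below. `load_settings` is a hypothetical stand-in for a call into an untyped dependency.

```python
from typing import Any, cast


def load_settings() -> Any:  # hypothetical stand-in for an untyped library call
    return {"timeout": 30}


# cast() does nothing at runtime; it only tells the checker what to assume.
settings = cast("dict[str, int]", load_settings())
timeout: int = settings["timeout"]
```

Because `cast` performs no check, a wrong cast silently moves the error to wherever the value is used, which is part of why it feels like pure ceremony.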
The third option here is writing type stubs for the library, which you can sometimes find community versions of as well. They’re not too time consuming to write and generally work well enough to bridge the gap
Yeah, I think this may be a good option when actively working on a project. Sadly at the moment, it's mostly a case of "I just need to make a couple of bug fixes in this old project, why is my editor shouting at me?"
What did you end up choosing & why?
It's only a personal side project and I have a good handle on the untyped modules in question, so in the end I suppressed most of the errors with `# type:ignore` and friends.
I'd reconsider that if I was doing more than the odd bug fix on the project. I still like Python, and started using type hints early, but there's enough added friction to make me question using them in the future.
I imagine on big projects the benefit is clearer.
Thanks for sharing!
Asking because I was really, really annoyed by the non-helpfulness of the type hints in practice, contrary to the theory.
I think TypeScript provides a lot more of the freedom the author is looking for. For instance, you can say, "the type of this argument is whatever is returned by that function."
Personally I find myself more comfortable and productive using types. Stating your types has a similar benefit to journaling, in my view. It's a forcing function for clarifying your ideas about the problem domain. Some perceive this as overhead, I perceive this as front loading. If my ideas are murky, I will run into trouble sooner or later. The later it is, the more painful it will be.
I think it largely comes down to different habits of working and thinking. I don't think one way is superior to another inherently (though types are important for collaboration), but that different people work in different ways.
The same applies to tests, maybe docs too.
For x in tests, type annotations, and documentation*:
If you write your x first then you have to decide what your API is. This is great if you want to think about your API. Sometimes though you just want to get down to it and play around with a new idea. Either way is fine.
As soon as you start sharing code or patching production code or patching someone else’s production code, one must insist on seeing some kind of x. Having x around the outside of a system — rather than requiring x be added throughout the entire system — is often good enough.
*The useful, architecture kind.
> In writing this it occurs to me that I do often know that I have distinct types (for example, for what functions return) and I shouldn't mix them, but I don't want to specify their concrete shape as dicts, tuples, or whatever. [...] Type aliases are explicitly equivalent to their underlying thing, so I can't create a bunch of different names for eg typing.Any and then expect type checkers to complain if I mix them.
It sounds to me like you're describing the NewType pattern which is just slightly farther down the page you linked in the article.
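A minimal sketch of the NewType pattern, using made-up unit types. The two names are distinct to a type checker even though both are plain floats at runtime.

```python
from typing import NewType

Meters = NewType("Meters", float)
Feet = NewType("Feet", float)


def feet_to_meters(length: Feet) -> Meters:
    return Meters(length * 0.3048)


height = feet_to_meters(Feet(10.0))

# A checker such as mypy should reject the next line, because Meters is not
# Feet. It is commented out since it would run without complaint at runtime.
# feet_to_meters(Meters(3.0))
```

The underlying type can be swapped for something more concrete later without touching the call sites.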
This has all happened before. This will all happen again. Everyone felt the same way about TypeScript. Types feel annoying at first, if you lived without them. Using untyped libraries is annoying until they all have types, which happens over time. Types prove their worth every time.
The problem (in my opinion) is that Python gives you the tools (and perhaps even encourages you) to write code that would benefit from typing.
It's perfectly feasible to write maintainable, well-designed code in a dynamic language. I've worked with some extremely robust and ergonomic Clojure codebases before, for example. However, in Clojure, the language pushes you into its own "pit of success".
Personally, I never feel that with Python.
That’s good to hear about clojure!
I just started learning the language and it’s been a ton of fun (especially babashka omg) but I’m so typescript-minded that it’s been really tough being back in dynamic land.
Have you looked into core.spec at all? It's been a while since I've even thought about it but I believe it's an interesting middle ground between Python's type hints and TypeScript static typing. It functions as sort of verifiable documentation (at runtime, if you wish) and can also be used to dynamically generate data (e.g. for testing).
I miss Clojure.
EDIT: Here's a great talk Rich Hickey gave about it at LispNYC, which I was lucky enough to attend.
The biggest issue with Python type hints for me isn't the hints themselves, it's that they encourage people to write overly complex, verbose code just to satisfy the type checker.
Code like this [0] could simply be 3 functions. Instead it's 3 classes, plus a base class `AstNode`, just so the author can appease the type checker by writing `body: List[AstNode]` instead of the dynamically-typed `body = []`.
[0] https://gist.github.com/sportsracer/16a1e294966cfba83ba61e6a...
That code looks like a proper object-oriented design to me, nothing to do with type-hints actually.
Type hints encourage this sort of object-oriented design though, in my experience. The resulting code is extremely verbose compared to Pythonic "executable pseudocode".
For example, see Jack Diederich's talk "Stop Writing Classes": https://www.youtube.com/watch?v=o9pEzgHorH0
That talk had a big impact on my coding style. But citing a 99 line script written as an example for a blog post doesn't really support your argument. 99 lines is short, and verbosity is expected in such example code.
Consider FastAPI. It uses functions as endpoints, like flask. Very compatible with "Stop Writing Classes." It also leverages type hinting to eliminate boilerplate and create more concise code. You don't have to put validation or dependency injection logic at the top of every endpoint, it's handled for you so you can dedicate screen space to the problems you're solving.
Consider also the pythonism, "explicit is better than implicit." If memory serves, "Stop Writing Classes" wasn't so much about not writing containers for data but not writing containers for behavior when it wasn't associated with data. Behavior can live as a freestanding function just as well as inside of an object. But it's difficult to understand the semantics of freestanding nontrivial data, like dictionaries or long tuples.
Dataclasses and pydantic models require a minimum of boilerplate and couple the data with its semantic meaning, so that it's preserved across boundaries. I for one am never going back to the Python before these tools.
Type hinting in python is a bit of a sticky plaster.
We have pyre enforcement at work. The problem is that it has been gradually turned on over time, so some stuff is pyre compliant (or just strategically ignored) and some stuff isn't, so when you open some old code to do something, you have a million errors to deal with.
That would be fine if types were enforceable. At runtime, type hinting does shit all.
I would dearly love a "strict" mode where duck typing is turned off and variables are statically typed. However I suspect that will never happen, even though it'd speed up a load of stuff if done correctly (type inference happens a lot).
I suspect to use type hints properly, I'd need to think a bit more C-like and create dataclasses as types to make things more readable, rather than using Dict[str,int] or whatever.
There are some other ways of expressing names for types, once you start using typing. There are typevars, enums and using the "|" to separate options, there are TypedDict, NamedTuple, Union, Literal, Optional, and probably more. Not everything needs to be a dataclass.
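A compact sketch of a few of those alternatives side by side. All names here (`Color`, `Mode`, `Span`, `Key`, `describe`) are made up for illustration.

```python
from enum import Enum
from typing import Literal, NamedTuple, Union


class Color(Enum):
    RED = "red"
    GREEN = "green"


Mode = Literal["r", "w"]  # only these two strings are accepted by a checker


class Span(NamedTuple):  # a tuple with named, typed fields
    start: int
    end: int


Key = Union[int, str]  # can be spelled int | str on Python 3.10+


def describe(span: Span, mode: Mode, color: Color, key: Key) -> str:
    return f"{key}:{span.start}-{span.end}:{mode}:{color.value}"


summary = describe(Span(1, 4), "r", Color.RED, "k")
```

Each of these gives a type a name without requiring a full class with behavior.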
IMHO, type hints should only serve two purposes:
1. design a memory layout for faster execution
2. press dot and get suggestions in IDE
other usage of types brings more problem than it solves.
I typehint the stuff that is easy. My observations about typehinting in Python track the 80:20 rule or even a 90:10 rule. You get about 80% benefit for typehinting the easy 20%.
The problems with python types are:
A) The type system is fairly bad, as type systems go (forgivable).
B) The type checkers are, for large codebases, excruciatingly slow, to the point that the tests are faster!!
The second is not forgivable.
If you could snap your fingers and have your type hints update to match your code, it wouldn’t get in the way of your work.
Hyperbolically: You have to be able to edit code at the speed of thought - whatever it takes - or else programming languages cease to be a more useful tool than just thinking.
If you type slower than you think, or can’t do the type-hint-based textual translation as quickly as you think, then… yeah - it’s not good for you.
The advice I’d wanna hear for myself is: just get better. But the advice I’d give to my coworkers is: have explicit domains where you’re able to do whatever is most efficient and effective, and then when you hand off data to the next subsystem - obey a contract. A schema. Be that type hints or a .proto file or a database schema or an API. Doesn’t matter.
It seems like the author is looking for the ability to specify types as `typeof <function>:arguments` and `typeof <function>:return`. I can see how this could make prototyping easier. It is also helpful for cases (not uncommon in Python) where you're just proxying another function.
TypeScript has the equivalent of what you're describing via the `Parameters` and `ReturnType` utility types [1][2], and I've found these types indispensable. So you can do the following:
type R = ReturnType<typeof someFunction>
type P = Parameters<typeof someFunction>
[1] https://www.typescriptlang.org/docs/handbook/utility-types.h...
[2] https://www.typescriptlang.org/docs/handbook/utility-types.h...
The note about creating your own data types is interesting. I used to be heavily dependent on tuples. Admittedly dicts would've saved me here but I liked the efficiency of numeric indexing. Anyway, any time I changed ordering or anything I'd have countless headaches. I started using dataclasses and never looked back. I love turning the type checker on in vscode and seeing it correctly catch future issues. Only problem is when libraries are indirectly hinted, as others have pointed out
> PPS: I think my ideal type hint situation would be if I could create distinct but otherwise unconstrained types for things like function arguments and function returns, have mypy or other typing tools complain when I mixed them
Cleaner/safer function args and return types are a common motivation for dataclasses. They have benefits over many or complex args besides typing, too.
This is what we ended up using with mypy so we could add type checks to our CI without having to fix every single typing error:
> yet another Python thing I'd have to try to keep in my mind despite it being months since I used them last.
Typing hints are also a moving target: they have changed, sometimes significantly, on every minor Python release since they came to be.
The `Optional` type came and went (being replaced by the new Union syntax.) Type classes are usually born in the `typing` module, then some of them get moved to `collections`, `abc`, or `collections.abc`. Some types have even been moved several times. `TypeAlias` came and went (replaced by `type`). `List` became `list`. `Tuple` became `tuple`. Forward declarations require a workaround by using string literals. But in Python 3.14, they no longer do, and the workaround will become deprecated.
I'm an evangelist for static type checking, and I never write a Python function signature without complete typing annotations. I also think that typing annotations in Python are evolving in a good way. However, for long-lived Python scripts, I highly recommend that developers are aware of, and factor in, the effort necessary to keep up with the evolution of static typing across Python versions. That effort spent on migrating is going to come on top of the mental load that typing hints already add to your plate.
The old stuff never stopped working, though. You can still use Optional, Union, TypeAlias, and import collection protocols from typing. You don't have to migrate if you don't want to. They're not even deprecated.
I encourage you to open the `typing` documentation [0] and search for the word `deprecated`.
Spoiler alert: the search result will be three-figure.
Some of the results are already scheduled for removal.
Ruff can automatically upgrade all of the issues you mentioned to match your target minimum python version.
None of those things came and went, they came but did not go. They're all still here, even in the 3.14 beta.
Program the way you like it and it should be fun.
If you are at work doing professional programming then typing helps avoid bugs and makes programming more of a reliable and robust process.
But doing your own thing, doing little utilities, banging out quick stuff, do exactly what makes you happy.
Programming should be fun not a chore and if types make it a chore for you then drop them.
It’s kind of cool that type hints can be reflected on to do things (see Pydantic). Other than that I find it pretty cumbersome to use in practice, coming from TypeScript. Semi-relatedly I also dislike Python’s different ways to access objects/dicts, it feels arbitrary and cumbersome.
This is not just a Python problem but a problem in many dynamic languages.
Switching to the typed variants, whether TypeScript, Python type hints, mypy, etc., will force you to do the dance to make the compiler happy instead of working on code.
Which is why, for me, JSDoc is really good: it uses types for documentation and ends there.
Sometimes I feel like we need an analog to javascript/typescript. Ptypethon if you will.
Absolutely. The main problem with python typing is that checking types is optional. A dialect with mandatory types (with inference) and runtime/load-time checking would be great.
Use pyright in strict mode, then. If you really want runtime checking, you can use Pydantic's validation decorator, typeguard or beartype. Current typed Python is much better than people give it credit for. You just have to use it properly.
Checking types is optional with Typescript too. We don't need another type annotation syntax for Python. The existing one is fine.
Python continues to show it will adopt any feature with online interest without guiding principle.
It takes mere seconds to write type hints, and with modern auto-complete tooling or Copilot, it's basically automatic. It goes a long way to use tools like MyPy to verify your code and also provide documentation for users of your code. If you don't want to write type hints, don't use a dynamically typed language.
Is that a double negative in your title ? Or is it an inside joke I didn't get ?
> Types prevent me from writing code that I don't understand.
Yes, that's the point.
My objection to strong types in python is philosophical. Python mimics natural language, and natural language rejects strong types for context resolved ambiguity.
In the way we resolve these issues in natural language, we can resolve bugs in Python, that is, “do you mean integer ’3’ or string ’3’” instead of insisting we define everything always forever.
To me, people who use type hinting are just letting me know they have written code that doesn’t check in line.
> people who use type hinting are just letting me know they have written code that doesn’t check in line.
Yes, that's the point. We use typing so the type checker can find the mistakes for us instead of adding `isinstance` everywhere.
Strong/weak types and static/dynamic typing are orthogonal things.
A strong type system rejects things like 1+”1”.
A static type system requires type declarations (“int i”).
Python always had strong dynamic types.
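A quick sketch of the strong-but-dynamic point: no declarations are needed, yet mixing an int and a str is rejected when the code actually runs.

```python
def add(a, b):  # no annotations: the types are only known at runtime
    return a + b


total = add(1, 1)

try:
    add(1, "1")  # strong typing: no implicit int/str coercion
except TypeError as exc:
    message = str(exc)
```

Type hints add an optional static layer on top of this; they do not change the runtime behavior shown here.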
Python has always been strongly typed, since the very beginning.
The article and the feature have nothing to do with strong types.