There is also a repository containing some laws in Markdown format on GitHub. They even used PRs for actual changes proposed by the parties in the parliament. Also, the commits have their proper date, so you can `git blame` on the laws and even see which president signed off on that change.
Sadly, it is unmaintained.
I maintain a repository with all German legal acts which is up to date:
https://github.com/jandinter/gesetze-im-internet
I scrape the official website (https://www.gesetze-im-internet.de) once a week. The repository contains the "official" XML files with a formatting that is more focussed on presentation than on the logical structure of the legal acts, unfortunately (https://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd).
Some time ago, someone from the digital service of Germany reached out and asked about my use case. Maybe there will be an official version of a "Git law" repo someday...
Very cool! I came across your project last year while building https://digebu.de .
I wanted to build an "IDE-inspired" law reader. It has selection highlighting and you can open references within the same window. It scrapes gesetze-im-internet.de daily, processes the XML to JSON and builds static HTML pages, hosted on GitHub Pages. The entire build process for the 6000+ pages takes 5-10 minutes. It uses less than 20% of the Actions minutes that come with GitHub Pro.
It was a really fun rabbit hole to go down.
What I found most fascinating is that there doesn't seem to be an official version of the German law. The state just publishes official announcements like "Law X will be changed as follows", "Law X will be removed" or "Law X will be added". So the official version of the German law really is something akin to a git tree. AFAIK, all consolidated versions are created by private entities.
I did a test by picking a law at random, finding the first time it was published and then applying all the changes from subsequent years. Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes. I found that quite fascinating (and a little scary).
It's also funny how often laws are referenced that don't even exist anymore. The collection of laws really is as tidy as you would imagine an 80-year-old system, where the maintainers change every 5 years, to be.
Has git ever made the necessary updates so that you can have proper datestamps on the 80 yr old laws? Last I had checked, nothing prior to unix epoch can be put into git.
> Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes.
Can you say more about what these small mistakes were? Would they affect the interpretation of the law?
In the example I checked, the mistakes wouldn't have changed the interpretation. They were mistakes like additional or missing commas, missing spaces or missing articles.
buzer.de actually has a list of things that differ in their consolidation compared to gesetze-im-internet.de: https://www.buzer.de/quality.htm
In that list you can actually find mistakes that would alter the interpretation. But I think this also sounds worse than it is. It's just a funny thought that whatever source you are using, you are essentially trusting one party to not have made any mistakes, consolidating 1000s of pages of pdfs :)
So then what is the official way to get the latest version? I mean… how does the state itself handle those laws or are you telling me that every German court and government agency buys those books?
I'm not sure if they still buy the books, but I know from someone who worked as a judge in Germany, that they personally stopped buying the books only ~5-10 years ago, because they saw that the online availability was good enough now.
But my point is that, as far as I know, there is no official version of the final text. The official publications are made in the Bundesgesetzblatt (which had been privatized in the past, but that's another story). The publications might look like this:
1949: We hereby make the following text a law called Grundgesetz "Artikel I: Human dignity is inviolable"
2026: We hereby change the law called Grundgesetz by changing the first article to say "Human or Alien" instead of "Human".
Now there are a lot of entities that will consolidate these changes into a final text. But this consolidation isn't done officially. So, while in this example it's easy to see that in 2026 the law would read "Human or Alien dignity is inviolable", it becomes less clear when these changes are spread over 80 years and are only available as PDFs.
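To make the consolidation step concrete, here is a minimal sketch in Python. It assumes every announcement can be reduced to "replace this wording with that wording in this article"; the `Amendment` class and the `consolidate` function are made-up names for illustration, not anything an official or private consolidator is known to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amendment:
    """One announced change: replace `old` with `new` in a given article."""
    year: int
    article: str
    old: str
    new: str


def consolidate(base: dict[str, str], amendments: list[Amendment]) -> dict[str, str]:
    """Apply amendments in chronological order and return the consolidated text."""
    law = dict(base)  # copy, so the originally published text stays untouched
    for a in sorted(amendments, key=lambda a: a.year):
        text = law[a.article]
        if text.count(a.old) != 1:
            # A missing or ambiguous target is where a manual consolidation
            # would have to guess, so fail loudly instead.
            raise ValueError(f"{a.year}: cannot locate {a.old!r} in {a.article}")
        law[a.article] = text.replace(a.old, a.new)
    return law


# The example from the comment above.
original = {"Artikel I": "Human dignity is inviolable"}
changes = [Amendment(2026, "Artikel I", "Human", "Human or Alien")]
consolidated = consolidate(original, changes)
print(consolidated["Artikel I"])
```

Real announcements are far less regular than this, which is presumably where the small mistakes creep in.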
Laws are distributed through the Bundesgesetzblatt, the official announcement publication for laws of the German Bundestag. Their online presence is here: https://www.recht.bund.de/de/bundesgesetzblatt/bundesgesetzb...
[EDIT: fixed link]
What did you tell him about your use case?
I'm asking as I don't agree on the underlying assumption a use case was needed. I consider the value of transparency and public information for a democratic society as evident.
The question might not have been about the transparency, but more about the choice of having it as a git repository, or whether there are actual tools based on the git repository. Arguably, the git repository is unusable for the majority of people, so it cannot be an answer to transparency in itself; some user-friendly tools based on it might be.
I'm also interested in the response btw :-)
I just want to archive the "official" XML files since the "official" website does not provide an archive. For that reason, I also don't change the XML files: The spec is available and everyone can build their own transform (to JSON, XML, whatever) based on their particular needs.
They are working on an official API to replace Gesetze im Internet. It should be out in the next few weeks according to its developer.
Nice work. Maybe you could do some preprocessing of the XML data, so that you actually have a diff of the content and not the whole XML block.
I thought about it, but decided against pre-processing: The repo is meant to be an archive, and the XML spec can be looked up. If I were to introduce a new structure by pre-processing the files, I think that might be a plus for reading, but not for archiving. Whoever has a concrete use case (the "Digebu" website above looks great!), can write their own pre-processor for that use case.
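For anyone who wants to write such a pre-processor, a generic XML-to-JSON transform is only a few lines of Python. This is a rough sketch under stated assumptions: the sample document and its element names (`law`, `section`, `title`, `p`) are invented for illustration and are not the element names defined in gii-norm.dtd.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample; the real files follow gii-norm.dtd and use other names.
SAMPLE = """<law short="ExG">
  <section id="1"><title>Scope</title><p>This act applies to examples.</p></section>
  <section id="2"><title>Entry into force</title><p>Immediately.</p></section>
</law>"""


def element_to_dict(elem):
    """Recursively convert an element into a JSON-serialisable dict.

    Text that follows a child element (mixed content) is ignored here,
    which a real transform for legal texts would need to handle.
    """
    node = {"tag": elem.tag}
    if elem.attrib:
        node["attrs"] = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    children = [element_to_dict(child) for child in elem]
    if children:
        node["children"] = children
    return node


law = element_to_dict(ET.fromstring(SAMPLE))
print(json.dumps(law, indent=2, ensure_ascii=False))
```

Because the archive keeps the XML untouched, a transform like this can be changed or thrown away later without losing anything.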
German laws might work differently, but I'm not really under the impression that software version control is really that compatible with law making.
In source code we replace or modify the parts that don't work in place. Many laws do not work like that; they are a labyrinth of add-ons. A new law is introduced with wording like: This replaces the words "small businesses" with "nuclear rockets" in the law on "Workplace safety of fishing vessels of 1992", §12, section 3, line 5.
No amount of version control will ever find these changes.
These wordings are basically a manual description of a diff. They wouldn't be necessary if version control were used for laws.
I actually think version control is an absolute necessity for laws.
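As a small illustration of the "manual description of a diff" point, Python's standard `difflib` can produce the machine-readable equivalent of such a wording. The section text and file names below are invented; only the "small businesses" to "nuclear rockets" replacement comes from the example above.

```python
import difflib

# Invented wording for the amended section.
before = [
    "(3) The owner keeps a safety log on board.\n",
    "The reporting duty does not apply to small businesses.\n",
]
# What the amending act describes in words.
after = [line.replace("small businesses", "nuclear rockets") for line in before]

patch = "".join(
    difflib.unified_diff(before, after, fromfile="law-1992.txt", tofile="law-amended.txt")
)
print(patch)
```

The printed patch carries the same information as the amending sentence, just in a form a tool can apply.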
I can imagine these "relative text patches" being committed as written, together with corresponding metadata and an array of referenced locations in some encoding that lands in the same commit. That would unlock a visualization tool that could render a strikeout over the earlier legal text, or whatever else fits the way the modification applies.
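A rough sketch of what such a patch record and strikeout renderer could look like, in Python. Everything here is hypothetical: the `TextPatch` fields, the character-offset addressing, and the Markdown-style `~~strikeout~~` output are just one possible encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPatch:
    """A replacement plus the metadata needed to locate and attribute it."""
    law: str          # title of the amended law
    section: str      # human-readable location, e.g. "§12 (3)"
    start: int        # character offset of the replaced span in the section
    end: int          # offset one past the end of the span
    replacement: str
    enacted_by: str   # the amending act, kept for attribution


def render_markup(section_text: str, patch: TextPatch) -> str:
    """Show the old wording struck out, followed by the new wording in bold."""
    old = section_text[patch.start:patch.end]
    return (
        section_text[:patch.start]
        + f"~~{old}~~ **{patch.replacement}**"
        + section_text[patch.end:]
    )


text = "The duty does not apply to small businesses."  # invented section text
start = text.index("small businesses")
patch = TextPatch(
    law="Workplace safety of fishing vessels of 1992",
    section="§12 (3)",
    start=start,
    end=start + len("small businesses"),
    replacement="nuclear rockets",
    enacted_by="Hypothetical Amendment Act",
)
rendered = render_markup(text, patch)
print(rendered)
```

Character offsets are brittle once several patches touch the same section, so a real encoding would probably need something sturdier.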
> They wouldn't be necessary if version control were used for laws.
So a country only needs to rewrite all the laws to adopt versioning, cool.
In reality both can be used: commits to see what was changed by whom, and wording that says what changed.
> So a country only needs to rewrite all the laws to adopt versioning, cool.
No, they only need to start using versioning in order to adopt versioning. Think of an "initial git commit".
Might work in some systems but England has Case Law stacked on top of statutes. That's tricky to turn back into code.
Case law is basically monkey-patching (or wrapping / decorating). It's part of the running law but does not modify the law itself in source.
I had thought the goal here was something more than just 'track changes', which legislators could do in Word
That's exactly how German law works. And as far as I know, it's how all legal systems that aren't common/case law work.
Changes will either add, delete or change an existing law.
There is actually a website where someone has all changes dating back to 2006 and you can display diffs (called Synopsis in Germany) - for example: https://www.buzer.de/gesetz/5041/v322454-2025-03-25.htm
Most laws in most civil law countries (there are exceptions, but it's the standard for the main laws) are a fixed law on a topic which then gets updated. So the 2025 version of the Family Code in France is everything included, and if you do a diff with the 2024 version, you'll see which parts were removed/updated/replaced by which new parts (it could be a clarifying word, new contents, changed rules, etc.). Reading it end to end (it's a hefty book, but still) gives you a complete representation of all laws regarding families (marriage, divorce, kids, etc etc).
It's mostly the obsolete system of common law where, to have an understanding of what is legal and what isn't, you need to have a spiderweb of random acts (random as in, they don't have to be thematic, so the Chicken Tax Act of 2005 can have provisions on solar panels that replace the previous solar panel legislation from the STOP KILLING OUR COMMUNITIES Act of 1785) that build on or replace one another, sometimes going very far back, with associated precedents that clarify them.
Most but not all laws in France are like that. In fact, some of the laws applied in France are written... in German and aren't compiled in the codes. There is a whole institute (albeit small) dedicated to studying and make those laws[1].
This happened to me when I searched for the original text that said a worker has to be compensated in full for "short" sick leave, and what I found was a very short text in German. Hopefully the company I worked for complied with the law after consulting its accountant.
This is a very special exception for the unique situation of Alsace and parts of Lorraine (Moselle) that spent a few decades in Germany, and as a result have a mixed legal system, and some other fun ones like extra public holidays.
I think your understanding is that the Act must be the content. But the Act isn't the content, the Criminal Code, Civil Code etc are the content. The Act itself is more like a patch file, with some surrounding metadata (and its own meta version history as proposed legislation gets marked up throughout the process). It could add a new file, but it could also be an edit. But in neither case is it the effective content, it's a description of changes to said content. (The line does get blurred on the initial commit though, because typically the name of the resulting law is the name of the Act that established it).
So you've called out precisely why version control systems present such a useful analog.
I once learned that the cluster-f of exceptions surprisingly quickly turns into something you [almost] can't replicate in software.
Of course one could also argue that it isn't a problem with poorly designed laws but that our programming languages are ill equipped for it.
Then again, the funniest thing I've seen in law: Where an engineer would make a nice drawing with the sizes of things neatly organized into the available space, a law maker will spray all the numbers and descriptions randomly all over the place as if to prevent anyone from ever building what is described.
>German laws might work differently, but I'm not really under the impression that software version control is really that compatible with law making.
Hard disagree. It allows you to attach a name to particular portions of the code (and a date), it shows you when the code moves from one status to another (branches), and you could even easily do things like show who voted/signed for any given piece.
What's not really compatible is law making as it is now, where repealing a law doesn't remove the offending code but adds more code that says "now you can ignore that previous one". Those additions don't even make it into the official text until codification occurs (this is periodic, not continuous).
>In source code we replace or modify the parts that don't work in place. Many laws do not work like that; they are a labyrinth of add-ons. A new law is introduced with wording like: This replaces the words "small businesses" with "nuclear rockets" in the law on "Workplace safety of fishing vessels of 1992", §12, section 3, line 5.
Exactly. They've been doing it wrong (artifact of doing everything on paper, I think).
I usually convert local laws from PDF to Markdown when I need them as a reference for a functional specification. That way it's easier to pinpoint the exact section of the thing.
It's not easy to convert general law to Markdown; it involves online converters and manual fixes. I'm currently experimenting with marker [1] on local LLM hardware and so far it is the best out there.
I maintain a web site where I re-render to HTML a scraped copy of the consolidated version of the Official Belgian Journal[0].
One of the nice things about having an underlying structured representation of those texts is that I can also render them to e.g. Markdown[1].
I've experimented with generating the Markdown files corresponding to multiple versions (archives) of a given text and committing them to the same Git repository to be able to see diffs or blames[2].
I would like to assign the proper dates to each commit, but given there are texts from e.g. 1791, it's not possible.
0: https://refli.be/fr/lex
1: https://github.com/hypered/iterata-md
2: https://github.com/hypered/iterata-archive
> They even used PRs for actual changes proposed by the parties in the parliament. Also, the commits have their proper date, so you can `git blame` on the laws and even see which president signed off on that change.
This is something I'm very interested in for a different use case: model legislatures. The infrastructure and tooling for model congresses and parliaments is very limited: largely relegated to wikis and Google Docs. And that's fine, but it becomes a problem long term with tracking and archival.
We had a situation where our model parliament did not own the Google Doc for a particular treaty with another model legislature. It was changed out from under us, which is not ideal. But that brings into question ownership of Google Docs, and what happens if that person withdraws from the game.
Another issue is respecting and maintaining the creativity of those who play. People put a lot of effort into their bills with the fonts, formatting, layout, and imagery they use. It would be a shame to erase all that effort by converting it to a bland wall of text a la Markdown.
Markdown also has its issues: if the legislature removes an entry of an ordered list, how do you prevent markdown from renumbering the list? And the ways around this involve extending markdown, or using plaintext (eg: https://www.apache.org/licenses/LICENSE-2.0.txt)
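One workaround that stays within plain Markdown is to escape the period after each number, so the line is treated as ordinary text instead of an ordered-list item and never gets renumbered. A minimal Python sketch, with a made-up `render_provisions` helper and invented provisions:

```python
def render_provisions(provisions: dict[int, str]) -> str:
    """Render numbered provisions as Markdown paragraphs with fixed numbers.

    Escaping the period ("3\\.") stops Markdown from treating the line as an
    ordered-list item, so a repealed number leaves a visible gap.
    """
    return "\n\n".join(
        f"{number}\\. {text}" for number, text in sorted(provisions.items())
    )


provisions = {1: "Definitions.", 2: "Scope.", 3: "Penalties."}
del provisions[2]  # the legislature repeals entry 2
print(render_provisions(provisions))
```

The cost is that the provisions are no longer a real list in the rendered HTML, so list styling is lost.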
Another solution could be QuillJS (https://quilljs.com/), which serialises into a JSON array of Deltas. However, this would make any kind of git-diff difficult to read. You'd need a custom differ, which is not impossible, but that may be a lot of work and may not be supported on git sites like GitHub.
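A custom differ might not need to be much more than a converter that flattens the Delta into plain text before diffing. Here is a rough Python sketch under stated assumptions: the document is stored as JSON with an `ops` list of `insert` operations, formatting attributes are simply dropped, and the sample bill and the `delta_to_text` name are invented.

```python
import json


def delta_to_text(delta_json: str) -> str:
    """Flatten a serialised Quill Delta into plain text for diffing."""
    doc = json.loads(delta_json)
    ops = doc["ops"] if isinstance(doc, dict) else doc  # accept a bare op list too
    parts = []
    for op in ops:
        insert = op.get("insert")
        if isinstance(insert, str):
            parts.append(insert)  # formatting attributes are ignored
        elif isinstance(insert, dict):
            # Embeds such as images are objects; keep a short placeholder.
            parts.append(f"[{next(iter(insert))}]")
    return "".join(parts)


# Invented sample document.
ops = [
    {"insert": "Section 1"},
    {"insert": "\n", "attributes": {"header": 2}},
    {"insert": {"image": "https://example.invalid/seal.png"}},
    {"insert": "No member shall "},
    {"insert": "filibuster", "attributes": {"bold": True}},
    {"insert": ".\n"},
]
serialised = json.dumps({"ops": ops})
print(delta_to_text(serialised))
```

Locally, a script like this could be hooked in through git's `textconv` diff driver setting; hosted sites would presumably still show the raw JSON.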
Another issue is that, if you're using commits-as-enactments, then that probably means using the commit message (or notes) for the enactment's text. To what extent is that supported? As in, how long can commit messages be before they start wreaking havoc on git clients? Will my GitHub tab or GitKraken client crash if I view the commit history? Could the commit message itself contain a serialised QuillJS document? What if that document contained a base64-encoded image?
This kind of digitization is great.
I don't know if they are doing it, but I always thought that it should be easy for regular citizens to see the historical reasons why a law or regulation exists. Because there is sometimes a good reason why a regulation exists, but nobody knows it.
Laws are usually passed with longer texts that explain why the law was passed. These are consulted by courts if there are issues in interpreting a law.
This is more important than readers may think as laws especially in civil law jurisdictions are meant to be applied based on intent, and not by the letter.
I don't know, something tells me that lawmaking-by-git is somehow less accessible to the 'regular citizen'.
See also the US Constitution on GitHub: https://github.com/JesseKPhillips/USA-Constitution
Nice. I kinda wish they went all the way and modified the commit details to have the actual authoring dates for each amendment/etc. Anyone know how well git plays with pre-epoch timestamps?
I just poked around a bit.
So git was first created with u32 time in mind only. However because of the looming year-2038 problem, they are working on expanding that.
Apparently git internals are almost ready to support more interesting timestamps. However, much of the git tooling and UI (like command line parsing and output) refuses to deal with pre-epoch timestamps.
I briefly tried with git 'porcelain' and also via libgit2, but it's all a bit annoying.
In summary, I think you'd need to hack up at least some of git's tooling to make everything work, but it wouldn't be heart surgery, because the internals are already nearly ready for this kind of change.
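To make the range problem concrete, here is a small Python sketch. It assumes timestamps are stored as unsigned 32-bit seconds since the epoch, as described above; the helper names are made up, the date is an arbitrary day in 1791, and nothing here calls git.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
U32_MAX = 2**32 - 1
I64_MIN, I64_MAX = -(2**63), 2**63 - 1


def unix_seconds(moment: datetime) -> int:
    """Seconds since the epoch, computed by subtraction so that
    pre-1970 dates work regardless of platform."""
    return int((moment - EPOCH).total_seconds())


def fits_in_u32(seconds: int) -> bool:
    return 0 <= seconds <= U32_MAX


def fits_in_i64(seconds: int) -> bool:
    return I64_MIN <= seconds <= I64_MAX


old_law = datetime(1791, 1, 1, tzinfo=timezone.utc)  # arbitrary pre-epoch date
ts = unix_seconds(old_law)
print(ts, fits_in_u32(ts), fits_in_i64(ts))
```

The timestamp comes out negative, so it cannot be represented in an unsigned field, while a signed 64-bit field has room to spare.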
Things get ugly if you go back far enough that you need to account for jurisdictions which no longer exist switching between calendars at different times from one another. I don't know how well Unix timestamps will fare for dates prior to approximately the 1600s.
At least you won't need to worry about figuring out historical leap seconds.
> Things get ugly if you go back far enough that you need to account for jurisdictions which no longer exist switching between calendars at different times from one another.
I think that would be a 'timezone' conversion you do at display time. Internally, it's still stored as a unix timestamp.
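A minimal Python sketch of that idea: keep one canonical timestamp and pick the calendar only when formatting. It uses a standard Julian Day Number conversion for the Julian calendar; the function names are made up, and which jurisdiction used which calendar at what time is left as an input (`use_julian`) that something else would have to supply.

```python
from datetime import date, timedelta


def gregorian_to_julian(day: date) -> tuple[int, int, int]:
    """Convert a proleptic Gregorian date to a Julian calendar (year, month, day)."""
    jdn = day.toordinal() + 1721425  # Julian Day Number of this date
    f = jdn + 1401
    e = 4 * f + 3
    g = (e % 1461) // 4
    h = 5 * g + 2
    d = (h % 153) // 5 + 1
    m = (h // 153 + 2) % 12 + 1
    y = e // 1461 - 4716 + (14 - m) // 12
    return y, m, d


def display(timestamp: int, use_julian: bool) -> str:
    """Format a Unix timestamp (UTC) in the calendar chosen at display time."""
    day = date(1970, 1, 1) + timedelta(days=timestamp // 86400)
    if use_julian:
        y, m, d = gregorian_to_julian(day)
        return f"{y:04d}-{m:02d}-{d:02d} (Julian)"
    return f"{day.isoformat()} (Gregorian)"


# One stored value, two renderings.
stored = (date(1752, 9, 14) - date(1970, 1, 1)).days * 86400
print(display(stored, use_julian=False))
print(display(stored, use_julian=True))
```

The stored value never changes; only the lookup of which calendar applies where and when is the hard part.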
> how well git plays with pre-epoch timestamps?
Tangentially, most RSS readers don't play nicely. A lot of web tooling doesn't cope with, e.g., old poetry carrying its actual dates: https://alexalejandre.com/poetry/ I did get a few readers, e.g. newsboat, to update their handling though.
I do think you're putting the cart before the horse here. It's just info, which is public. What are you worried about?
Edit: I see your focus is on another thing completely. Share it, it could be a great topic too.