I maintain an up-to-date repository with all German legal acts:
https://github.com/jandinter/gesetze-im-internet
I scrape the official website (https://www.gesetze-im-internet.de) once a week. The repository contains the "official" XML files, whose formatting is unfortunately focused more on presentation than on the logical structure of the legal acts (https://www.gesetze-im-internet.de/dtd/1.01/gii-norm.dtd).
Some time ago, someone from the digital service of Germany reached out and asked about my use case. Maybe there will be an official version of a "Git law" repo someday...
Very cool! I came across your project last year while building https://digebu.de.
I wanted to build an "IDE-inspired" law reader. It has selection highlighting, and you can open references within the same window. It scrapes gesetze-im-internet.de daily, processes the XML into JSON, and builds static HTML pages hosted on GitHub Pages. The entire build process for the 6000+ pages takes 5-10 minutes and uses less than 20% of the Actions minutes that come with GitHub Pro.
It was a really fun rabbit hole to go down.
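In case anyone is curious what the XML-to-JSON step can look like, here's a minimal sketch in Python. The element names (dokumente, norm, metadaten, enbez, titel, textdaten) are my reading of the gii-norm DTD linked above, and the file names are made up, so treat all of this as assumptions rather than a description of how Digebu actually does it:

```python
# Minimal sketch of the XML-to-JSON step for a single law file.
# Element names (dokumente/norm/metadaten/enbez/titel/textdaten) are my
# reading of the gii-norm DTD -- treat them as assumptions.
import json
import xml.etree.ElementTree as ET

def norm_to_dict(norm: ET.Element) -> dict:
    """Flatten one <norm> element (roughly one section of a law)."""
    meta = norm.find("metadaten")
    text = norm.find("textdaten/text")
    return {
        "doknr": norm.get("doknr"),
        "enbez": meta.findtext("enbez") if meta is not None else None,  # e.g. "§ 1"
        "titel": meta.findtext("titel") if meta is not None else None,
        "text": "".join(text.itertext()).strip() if text is not None else "",
    }

def law_to_json(xml_path: str, json_path: str) -> None:
    root = ET.parse(xml_path).getroot()  # <dokumente> wrapper
    norms = [norm_to_dict(n) for n in root.iter("norm")]
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(norms, f, ensure_ascii=False, indent=2)

# law_to_json("BJNR000010949.xml", "gg.json")  # hypothetical file names
```

From there, the static HTML pages are just a matter of rendering each JSON file through a template.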
What I found most fascinating is that there doesn't seem to be an official consolidated version of German law. The state just publishes official announcements like "Law X will be changed as follows", "Law X will be repealed", or "Law X will be enacted". So the official version of German law really is something akin to a git tree. AFAIK, all consolidated versions are created by private entities.
I did a test by picking a law at random, finding the first time it was published and then applying all the changes from subsequent years. Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes. I found that quite fascinating (and a little scary).
It's also funny how often laws reference other laws that don't even exist anymore. The collection of laws really is as tidy as you would imagine an 80-year-old system whose maintainers change every 5 years to be.
Has git ever made the necessary updates so that you can have proper datestamps on the 80-year-old laws? Last I checked, nothing prior to the Unix epoch could be put into git.
> Turns out all available versions (gesetze-im-internet, dejure.org, buzer.de) had at least a couple of small mistakes.
Can you say more about what these small mistakes were? Would they affect the interpretation of the law?
In the example I checked, the mistakes wouldn't have changed the interpretation. They were things like additional or missing commas, missing spaces, or missing articles.
buzer.de actually has a list of things that differ in their consolidation compared to gesetze-im-internet.de: https://www.buzer.de/quality.htm
In that list you can actually find mistakes that would alter the interpretation. But I think this sounds worse than it is. It's just a funny thought that whatever source you are using, you are essentially trusting one party not to have made any mistakes while consolidating thousands of pages of PDFs :)
So then what is the official way to get the latest version? I mean… how does the state itself handle those laws, or are you telling me that every German court and government agency buys those books?
I'm not sure if they still buy the books, but I know from someone who worked as a judge in Germany that they personally stopped buying the books only ~5-10 years ago, because they saw that the online availability had become good enough.
But my point is that, as far as I know, there is no official version of the final text. The official publications are made in the Bundesgesetzblatt (which had been privatized in the past, but that's another story). The publications might look like this:
1949: We hereby make the following text a law called Grundgesetz: "Artikel 1: Human dignity is inviolable"
2026: We hereby change the law called Grundgesetz by changing the first article to say "Human or Alien" instead of "Human".
Now there are a lot of entities that will consolidate these changes into a final text. But this consolidation isn't done officially. So while in this example it's easy to see that in 2026 the law would read "Human or Alien dignity is inviolable", it becomes much less clear when the changes are spread over 80 years and are only available as PDFs.
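As a toy illustration of that consolidation step (not how any real consolidator works): if you squint, a law is a base text plus an ordered list of amendment instructions, and the consolidated version is what you get by applying them in order. In reality the amendments are prose in PDFs and have to be interpreted by a human before they become string operations.

```python
# Toy model of consolidation: a base text plus ordered amendment instructions.
# Purely illustrative -- real amendments are prose in the Bundesgesetzblatt
# and have to be interpreted by a human before they become string operations.
from dataclasses import dataclass

@dataclass
class Amendment:
    year: int
    old: str  # text to be replaced
    new: str  # replacement text

base_1949 = "Artikel 1: Human dignity is inviolable."

amendments = [
    Amendment(2026, "Human", "Human or Alien"),  # the hypothetical change above
]

def consolidate(text: str, changes: list[Amendment]) -> str:
    for change in sorted(changes, key=lambda a: a.year):
        if change.old not in text:
            raise ValueError(f"{change.year}: cannot find {change.old!r} to amend")
        text = text.replace(change.old, change.new, 1)
    return text

print(consolidate(base_1949, amendments))
# -> Artikel 1: Human or Alien dignity is inviolable.
```

The hard part is that a real change instruction is not always unambiguous, which is exactly how the small consolidation mistakes mentioned above creep in.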
Laws are distributed through the Bundesgesetzblatt, the official gazette in which laws passed by the German Bundestag are announced. Their online presence is here: https://www.recht.bund.de/de/bundesgesetzblatt/bundesgesetzb...
[EDIT: fixed link]
What did you tell him about your use case?
I'm asking because I don't agree with the underlying assumption that a use case is needed. I consider the value of transparency and public information for a democratic society to be self-evident.
The question might not have been about the transparency, but more about the choice of having it as a git repository, or whether there are actual tools based on the git repository. Arguably, the git repository is unusable for the majority of people, so it cannot be an answer to transparency in itself; user-friendly tools built on top of it might be.
I'm also interested in the response btw :-)
I just want to archive the "official" XML files since the "official" website does not provide an archive. For that reason, I also don't change the XML files: The spec is available and everyone can build their own transform (to JSON, XML, whatever) based on their particular needs.
They are working on an official API to replace Gesetze im Internet. It should be out in the next few weeks according to its developer.
Nice work. Maybe you could do some preprocessing of the XML data, so that you actually get a diff of the content and not of the whole XML block.
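Something like this minimal sketch, maybe (Python; the tag names are just my guess from the DTD linked upthread, and the file names are hypothetical): flatten each norm to plain text, one block per section, so that git diffs show changed wording instead of XML and layout churn.

```python
# Rough idea: flatten each <norm> to plain text, one block per section, so
# "git diff" shows changed wording rather than XML/layout churn.
# Tag names are my guess from the gii-norm DTD; file names are hypothetical.
import xml.etree.ElementTree as ET

def xml_to_plaintext(xml_path: str) -> str:
    root = ET.parse(xml_path).getroot()
    blocks = []
    for norm in root.iter("norm"):
        heading = norm.findtext("metadaten/enbez") or ""
        body = norm.find("textdaten/text")
        body_text = " ".join("".join(body.itertext()).split()) if body is not None else ""
        blocks.append(f"{heading}\n{body_text}".strip())
    return "\n\n".join(blocks) + "\n"

# with open("gg.txt", "w", encoding="utf-8") as f:
#     f.write(xml_to_plaintext("BJNR000010949.xml"))
```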
I thought about it, but decided against pre-processing: the repo is meant to be an archive, and the XML spec can be looked up. If I were to introduce a new structure by pre-processing the files, I think that might be a plus for reading, but not for archiving. Whoever has a concrete use case (the "Digebu" website above looks great!) can write their own pre-processor for that use case.