
Recently there seems to be a surge in SQLite-related projects. TeaTime is riding that wave...

A couple of years ago I was intrigued by phiresky's post[0] about querying SQLite over HTTP. It made me think that if anyone can publish a database using GitHub Pages, I could probably build a frontend in which users can decide which database to query. TeaTime is like that - when you first visit it, you'll need to choose your database. Everyone can create additional databases[1]. TeaTime then queries it, and fetches files using an IPFS gateway (I'm looking into using Helia so that users are also contributing nodes in the network). Files are then rendered in the website itself. Everything is done in the browser - no users, no cookies, no tracking. LocalStorage and IndexedDB are used for saving your last readings, and your position in each file.
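As a rough illustration of the "fetches files using an IPFS gateway" step, here is a minimal sketch of turning a CID into a path-style gateway URL. The helper name `gatewayUrl`, the gateway host, and the CID shown are all placeholders chosen for illustration; this is not necessarily how TeaTime builds its URLs.

```typescript
// Hypothetical helper (not taken from the TeaTime source): builds a
// path-style IPFS gateway URL of the form <gateway>/ipfs/<cid>.
function gatewayUrl(gateway: string, cid: string): string {
  // Drop trailing slashes so the result never contains "//ipfs".
  const base = gateway.replace(/\/+$/, "");
  return `${base}/ipfs/${encodeURIComponent(cid)}`;
}

// Placeholder gateway host and CID, for illustration only.
const exampleUrl = gatewayUrl("https://gateway.example/", "bafyexamplecid");
console.log(exampleUrl);
```

The browser can then fetch that URL like any other static file, which is what lets the whole thing stay a plain static site.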

Since TeaTime is a static site, it's super easy (and free) to deploy. GitHub repo tags are used for maintaining a list of public instances[2].

Note that a GitHub repository isn't mandatory for storing the SQLite files or the front end - it's only for the configuration file (config.json) of each database, and for listing instances. Both the instances themselves and the database files can be hosted on Netlify, Cloudflare Pages, your Raspberry Pi, or any other server that can host static files.

I'm curious to see what other kinds of databases people can create, and what other types of files TeaTime could be used for.

[0] https://news.ycombinator.com/item?id=27016630

[1] https://github.com/bjesus/teatime-json-database/

[2] https://github.com/bjesus/teatime/wiki/Creating-a-TeaTime-in...

pizza 28 minutes ago

This is awesome. Would it be possible to integrate Pocket bookmarks into this somehow? I would love to be able to keep an offline cache of all the things I wanted to read (especially if I can prevent link rot), and to be able to query them.

aeyes 2 hours ago

The Pear P2P framework might be worth a look if you want to get off of GitHub and turn it into a truly distributed system. If the index must be on GitHub then what good is it to have the files on IPFS?

https://docs.pears.com/

mdaniel 7 hours ago

> (I'm looking into using Helia so that users are also contributing nodes in the network)

I had to look that term up <https://github.com/ipfs/helia#readme> but while sniffing around in their <https://github.com/ipfs/helia/wiki/Projects-using-Helia> I was reminded of https://github.com/orbitdb/orbitdb#readme which seems like it may involve much less rolling your own parts

yoavm 7 hours ago

Thanks, I haven't heard of it before. I wonder if it's possible to use OrbitDB to download a file from IPFS though? Based on its CID, I mean, because that was my intention with Helia. I thought that one of the nice things about reading books is that it takes time, and if users could be seeding their files from IndexedDB to the network, they could automatically and effortlessly become "contributing citizens".

Another interesting use case would be to see if this can replace (or be an addition to) SQLite as the database in which the queries are run.

sram1337 30 minutes ago

I don't know what's in the database, so I don't know what to search for.

How about a browse feature?

mushufasa 6 hours ago

> The databases used in TeaTime are GitHub repositories tagged with the teatime-database topic, which are published on GitHub Pages.

Couldn't this be a security issue, if bad actors use this tag?

yoavm 5 hours ago

That's a fair point - I guess it could be abused. Databases are sorted by their number of GitHub stars, so I was hoping that with the power of the crowds it will be possible to minimize the bad effect such an actor might have, by simply not voting them up.
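To make the ranking idea concrete, here is a minimal sketch of ordering databases by star count. The `DatabaseRepo` shape, the `sortByStars` helper, and the sample repositories are made up for illustration and are not TeaTime's actual code or data.

```typescript
// Hypothetical shape for a repository carrying the database topic.
interface DatabaseRepo {
  name: string;
  stars: number;
}

// Returns a new array ordered by star count, most-starred first.
// The input array is left untouched.
function sortByStars(repos: readonly DatabaseRepo[]): DatabaseRepo[] {
  return [...repos].sort((a, b) => b.stars - a.stars);
}

// Made-up sample data.
const ranked = sortByStars([
  { name: "example/small-db", stars: 2 },
  { name: "example/popular-db", stars: 40 },
  { name: "example/new-db", stars: 0 },
]);
console.log(ranked.map((repo) => repo.name));
```

Note that a ranking like this only pushes unvetted databases down the list; it doesn't remove them.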

mushufasa 5 hours ago

There have been several attacks recently where a bad actor takes over a repo whose original maintainer wants to take a step back, then launches a supply chain attack. In recent cases, the attack came from obfuscated binary files in the repo rather than code. Given that we are dealing with documents here (books), it would be easy to hide malicious code in a file. PDFs have all sorts of execution vulnerabilities, for example.

yoavm 5 hours ago

Interesting - I'm kinda counting on PDF.js, which is used for PDF rendering, to do it safely, but of course that doesn't always have to be the case. Do you have any thoughts on how to make this safer?

mushufasa 3 hours ago

Some other method of collection where you can have known trust in the files contributed, or some method of 'registering' a submission to create trust.

cicko 5 hours ago

PDFs are not executables.

crest 2 hours ago

May I recommend the old 27C3 talk "OMG WTF PDF"?

taneq 2 hours ago

You’d be surprised what’s executable with the right attitude.

slagfart 7 hours ago

Sorry if I missed this, but is there an example instance we can play with?

viborcip 14 hours ago

"Everything is done in the browser - no users, no cookies, no tracking. LocalStorage and IndexedDB are used for saving your last readings, your position in each file."

I love this! Thanks for making this!

clueless 5 hours ago

Is this like an open source distributed libgen?

yoavm 5 hours ago

Libgen is one of the databases that are currently available (and was probably a good fit because they already had their files on IPFS), but I think that this architecture, in which the UI is decoupled from the database, the database doesn't hold any copyrighted materials, and the files are downloaded from IPFS, is quite resilient and could be used for serving all sorts of content.

westurner 6 hours ago

Do static sites built with sphinx-build or jupyter-book or hugo or other jamstack static site generators work with TeaTime?

sphinx-build: https://www.sphinx-doc.org/en/master/man/sphinx-build.html

There may need to be a different Builder or an extension of sphinxcontrib.serializinghtml.JSONHTMLBuilder which serializes a doctree (basically a DOM document object model) to the output representation: https://www.sphinx-doc.org/en/master/usage/builders/#sphinxc...

datasette and datasette-lite can load CSV, JSON, Parquet, and SQLite databases; supports Full-Text Search; and supports search Faceting. datasette-lite is a WASM build of datasette with the pyodide python distribution.

datasette-lite > Loading SQLite databases: https://github.com/simonw/datasette-lite#loading-sqlite-data...

jupyter-lite is a WASM build of jupyter which also supports sqlite in notebooks in the browser with `import sqlite3` with the python kernel and also with a sqlite kernel: https://jupyter.org/try-jupyter/lab/

jupyterlite/xeus-sqlite-kernel: https://github.com/jupyterlite/xeus-sqlite-kernel

(edit)

xeus-sqlite-kernel > "Loading SQLite databases from a remote URL" https://github.com/jupyterlite/xeus-sqlite-kernel/issues/6#i...

%FETCH <url> <filename> https://github.com/jupyter-xeus/xeus-sqlite/blob/ce5a598bdab...

xlite.cpp > void fetch(const std::string url, const std::string filename) https://github.com/jupyter-xeus/xeus-sqlite/blob/main/src/xl...

yoavm 5 hours ago

> Do static sites built with sphinx-build or jupyter-book or hugo or other jamstack static site generators work with TeaTime?

I guess it depends on what you mean by "work with TeaTime". TeaTime itself is a static site, generated using Nuxt. Everything it does can be achieved with another stack - in the end it's just HTML, CSS and JS. I haven't tried sphinx-build or jupyter-book, but there isn't a technical reason why Hugo wouldn't be able to build a TeaTime-like website, using the same databases.

> datasette-lite > Loading SQLite databases: https://github.com/simonw/datasette-lite#loading-sqlite-data...

I haven't seen datasette before. What are the biggest benefits you think it has over sql.js-httpvfs (which I'm using now)? Is it about the ability to also use other formats, in addition to SQLite? I got the impression that sql.js-httpvfs was a bit more of a POC, and later some possibly better solutions came out, but I haven't really gone down that rabbit hole to figure out which one would be best.

Edit: looking a little more into datasette-lite, it seems like one of the nice benefits of sql.js-httpvfs is that it doesn't download the whole SQLite database in order to query it. This makes it possible to have a 2GB database but still read it in chunks, skipping around efficiently until you find your data.
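For anyone curious what "reading it in chunks" can look like, here is a sketch of the general idea: SQLite files are made of fixed-size pages, so a single page can be requested with an HTTP `Range` header. The `pageRangeHeader` helper is hypothetical, and the 4096-byte page size is just an assumed value; this is not sql.js-httpvfs's actual code.

```typescript
// Hypothetical helper: computes the HTTP Range header value that covers
// one page of a SQLite file. SQLite numbers pages starting at 1, and
// HTTP byte ranges are inclusive on both ends.
function pageRangeHeader(pageNumber: number, pageSize: number): string {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  const start = (pageNumber - 1) * pageSize;
  const end = start + pageSize - 1;
  return `bytes=${start}-${end}`;
}

// Assumed page size of 4096 bytes, for illustration only.
console.log(pageRangeHeader(1, 4096));
console.log(pageRangeHeader(3, 4096));
```

The value would then be sent as the `Range` header of a normal `fetch` call, which is why this only works on hosts that honour range requests for static files.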

noamikotamir 14 hours ago

super cool

evantbyrne 5 hours ago

Any particular reason for choosing IPFS instead of bittorrent or other p2p protocols? It feels like every time I try an IPFS tool it just crashes, whereas I rarely have issues with torrents.

yoavm 5 hours ago

Yeah - the desire to have TeaTime run as a normal website. BitTorrent doesn't run over HTTP, unless you use WebTorrent, which most BitTorrent clients aren't 100% compatible with. This means you can basically only download from other WebTorrent nodes - and there aren't many.

I do think this might change with the introduction of things like the Direct Sockets API, but for now they are too restricted and not widely supported.

It's my first time working with IPFS and I agree it hasn't always been 100% reliable, but I do hope that if I manage to get TeaTime users to also be contributing nodes, this might actually improve the reliability of the whole network. Once it's possible to use BitTorrent in the browser, I do think it would be a great addition (https://github.com/bjesus/teatime/issues/3).