What does HackerNews think of magnetico?
Autonomous (self-hosted) BitTorrent DHT search engine suite.
> The DHT implementation was largely borrowed from Magnetico (https://github.com/boramalper/magnetico), a popular and widely used app which is likewise unable to service these requests.
You also want to know which torrent looks legit, so you may want to query trackers ... Here you go: https://sr.ht/~rakoo/magneticos/
The problem then will be: how do you make sure your content is legit? There's no magic way here; the best thing you can do is compare seeder counts and aim for the highest. If a torrent is fake, people will delete it and it won't be seeded. I have a thingy for that: https://sr.ht/~rakoo/magneticos/
The problem then becomes that the number of seeders naturally selects for popular content; it doesn't ensure the viability of content. But I don't think there's a technical answer to that.
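The "aim for the highest seeder count" heuristic above can be sketched in a few lines. The candidate data here is made up for illustration; in practice you would populate it by scraping trackers (e.g. the BEP 15 UDP scrape protocol) or sampling the DHT.

```python
# Toy sketch: prefer the candidate torrent with the most seeders, on the
# assumption that fake torrents get deleted and end up unseeded.
# The infohashes and counts below are fabricated example data.

def pick_most_seeded(candidates):
    """Return the candidate dict with the highest seeder count."""
    return max(candidates, key=lambda c: c["seeders"])

candidates = [
    {"infohash": "fake-hash-a", "seeders": 3},
    {"infohash": "fake-hash-b", "seeders": 812},
    {"infohash": "fake-hash-c", "seeders": 41},
]

print(pick_most_seeded(candidates)["infohash"])  # fake-hash-b
```

As the follow-up comment notes, this heuristic favors popular content by construction; it says nothing about content that is legitimate but rarely seeded.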
Someone else did this a while back, universe continues to exist.
"Autonomous (self-hosted) BitTorrent DHT search engine suite."
For anyone wanting to pursue this, I can share my experience: I recently used https://github.com/boramalper/magnetico, and people share database dumps regularly. I found ~4 DB dumps, and merging scripts are all you need to get up and running.
It needs quite a bit of bandwidth and some storage space, but it has worked well for me so far. I've been running it for a couple of years and it has indexed around 12.5M torrents (36GB of uncompressed database).
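A minimal sketch of what such a "merging script" could look like, using SQLite's ATTACH. The schema assumed here (a `torrents` table with a UNIQUE `info_hash` and a `name` column) is an illustration; check the actual schema of your dumps before adapting it.

```python
import sqlite3

def merge_dump(dst_path, src_path):
    """Copy rows from src into dst, skipping info_hashes dst already has.

    Assumes both files contain a table
        torrents (info_hash UNIQUE, name)
    which is a guess at the dump layout, not a documented magnetico schema.
    """
    con = sqlite3.connect(dst_path)
    con.execute("ATTACH DATABASE ? AS src", (src_path,))
    # INSERT OR IGNORE relies on the UNIQUE constraint to drop duplicates.
    con.execute(
        "INSERT OR IGNORE INTO torrents (info_hash, name) "
        "SELECT info_hash, name FROM src.torrents"
    )
    con.commit()
    con.close()
```

Repeating `merge_dump` over each downloaded dump leaves one deduplicated database to serve searches from.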
SQLite is most interesting not when the database is small, but when there are very few writes and all you do is read. You can also look at https://datasette.io/ to see how well SQLite works for representing and querying a lot of datasets.
If I had needed to install a full database server or a search engine, I would never have used it. SQLite is more than enough for what I'm using it for.
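The read-mostly pattern described above is simple to use from Python: open the index file read-only (SQLite's `mode=ro` URI flag) and run plain SELECTs against it. The `torrents`/`name` schema here is an assumption for illustration.

```python
import sqlite3

def search_index(db_path, term, limit=10):
    """Query a read-only SQLite index for names containing term.

    mode=ro guarantees this reader can never write, so any number of
    such readers can safely share the same index file.
    """
    con = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return con.execute(
            "SELECT name FROM torrents WHERE name LIKE ? LIMIT ?",
            (f"%{term}%", limit),
        ).fetchall()
    finally:
        con.close()
```

For a large index, a real deployment would add a full-text index (SQLite's FTS5) rather than relying on LIKE scans, but the access pattern is the same.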
When it comes to search engines, two things shake the centralization assumption: (1) you can crawl DHTs, as https://github.com/boramalper/magnetico does for BitTorrent, and this can be done self-hosted; (2) too often we assume that searches must be global, but in many cases you are implicitly constraining your search space to regional boundaries or interest boundaries. E.g. as a westerner you are probably not interested in content from Vietnam or Pakistan. Also, when searching the Web, people rarely browse through multiple Google result pages. As a consequence, things like `awesome-*` lists ( https://github.com/bayandin/awesome-awesomeness ) are a good starting point for a decentralized curated catalogue that could cover that use case. So you could imagine a curated catalogue of programming websites compiled into a file of a few hundred megabytes; it could pretty much cover your interest boundaries, and as a file it can be shared over Dat today very easily. Want to search another 'interest boundary'? Run the same search software on a different catalogue file. These things are possible.
In other words, centralization is overestimated. Maybe there are a couple of genuine centralized-middleman use cases, but most use cases can be covered very well by decentralization.
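The catalogue-file idea above can be sketched concretely: a curated catalogue distributed as a plain file, searched locally. Here the catalogue is a JSON list of entries (a format chosen for illustration); a real catalogue shared over Dat could just as well be an SQLite file.

```python
import json

def search_catalogue(path, term):
    """Search a local catalogue file for entries whose title matches term.

    The catalogue format assumed here is a JSON array of objects with at
    least a "title" key -- an illustrative choice, not a real Dat format.
    """
    with open(path) as f:
        entries = json.load(f)
    term = term.lower()
    return [e for e in entries if term in e["title"].lower()]
```

Swapping "interest boundaries" then really is just pointing the same function at a different file.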
A search DHT makes much more sense: look up keywords based on their hash. Think of the Kademlia network from the ED2K days. Maybe I'm being too academic by even suggesting a DHT be used; you could do it in simpler ways, like flood search in the style of Gnutella. Really, any distributed search mechanism would work.
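The keyword-lookup idea above boils down to hashing each keyword into the DHT's key space and storing that keyword's entries on the nodes XOR-closest to the hash. A minimal sketch of the key mapping and the Kademlia distance metric, assuming 160-bit SHA-1 keys (the width BitTorrent DHT node IDs use):

```python
import hashlib

def keyword_key(word: str) -> bytes:
    """Map a keyword to a 160-bit DHT key (SHA-1 of the lowercased word)."""
    return hashlib.sha1(word.lower().encode("utf-8")).digest()

def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia's XOR metric between two equal-length IDs."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

# A querying node would iteratively walk toward the node IDs with the
# smallest xor_distance to keyword_key("ubuntu") and ask them for the
# torrent entries published under that keyword.
```

This is only the addressing half; spam resistance for the stored entries is the hard part, as the rest of this comment notes.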
I believe there are a few attempts at this in the BitTorrent world, like Tribler, which is probably the most practical implementation to date. There are a few others too, none of which look particularly mature yet:
https://sourceforge.net/projects/aresgalaxy/editorial/?sourc...
https://github.com/lmatteis/torrent-net
https://github.com/boramalper/magnetico
The problem with distributed search is that perfecting it is hard. BitTorrent has won out because websites could be used to prevent the spamming of malware, track reputation, host discussion of torrents and individuals, etc. Tribler has some proposed alternatives here, and I seem to remember Kad doing a decent job of preventing this from becoming a significant problem, but it never had the popularity.
If someone manages to knock out all the big torrent and usenet indexes overnight, these systems will become a lot more necessary and probably get a lot more popular.