Understanding Indie Search Engines

Indie Search Engines

What is Teclis?

Marginalia

Kagi is a privacy-focused, user-centric search engine. Great search experience starts with Kagi!

Wiby is a search engine for older style pages, lightweight and based on a subject of interest. Building a web more reminiscent of the early internet.

Find a web page made by an IndieWeb community member.

At Mojeek we like to do things differently, that's why we're building a search engine that respects your privacy whilst providing unique and unbiased results.

Ecosia uses the ad revenue from your searches to plant trees where they are needed the most. By searching with Ecosia, you’re not only reforesting our planet, but you’re also empowering the communities around our planting projects to build a better future for themselves. Give it a try!

ecosia, green, search, engine

spot ecloud global, powered by searx

spot, ecloud, searx, search, search engine, metasearch, meta search

Presearch is a decentralized search engine that provides search choice, quality results, privacy and rewards to those who want to end the search monopoly and take back the web.

Decentralized, Search Engine, Search, PRE

Tools

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments) - GitHub - adbar/trafilatura: Web scraping library and command-line tool for text dis...

Headless Chrome Node.js API. Contribute to puppeteer/puppeteer development by creating an account on GitHub.

A standalone version of the readability lib. Contribute to mozilla/readability development by creating an account on GitHub.

Lightning-fast, open source search engine for everyone

typesense, search engine, fuzzy search, typo tolerance, faceting, filtering, app search, site search, search bar, algolia, elasticsearch

You can install it using pip:

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Google Research. Contribute to google-research/google-research development by creating an account on GitHub.

A motivating factor is the search engine has sort of grown to a scale where it's becoming increasingly difficult to productively work on as a personal solo project. It needs more structure. What's kept me from open sourcing it so far has also been the need for more structure. The needs of the marginalia project, and the needs of an open source project have effectively aligned.

YaCy P2P - Decentralized Search Engine

YaCy Suchmaschine search engine spider harvester indexer p2p peer network open free download software development

Parse And Create Web ARChive (WARC) files with node.js - GitHub - N0taN3rd/node-warc: Parse And Create Web ARChive (WARC) files with node.js

Enterprise Tools

Amazon Kendra offers an intelligent enterprise search solution that increases employee productivity and improves customer satisfaction.

Enterprises and developers use Algolia’s AI search infrastructure to understand users and show them what they’re looking for.

Search for Static Sites

Elasticlunr.js is a lightweight full-text search engine in JavaScript for browser search and offline search. Elasticlunr.js is based on Lunr.js, but is more flexible than Lunr.js. Elasticlunr.js provides query-time boosting and field search. It is a bit like Solr, but much smaller and not as bright, while still providing flexible configuration and query-time boosting.

elasticlunr, full-text search, information retrieval, offline search
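
To make "query-time boosting" and "field search" concrete, here is a minimal Python sketch of the general idea. It is not Elasticlunr.js code and does not mirror its API: the `TinyIndex` class, its method names, and the plain term-frequency scoring are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory index with one posting list per field."""

    def __init__(self, fields):
        self.fields = list(fields)
        # postings[field][term][doc_id] = term frequency in that field
        self.postings = {f: defaultdict(dict) for f in self.fields}

    def add(self, doc_id, doc):
        """Index every configured field of a document (a plain dict)."""
        for field in self.fields:
            for term in tokenize(doc.get(field, "")):
                counts = self.postings[field][term]
                counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query, boosts=None):
        """Score documents at query time.

        `boosts` maps field name -> weight. Fields that are not listed
        default to 1.0, and a weight of 0 excludes the field entirely,
        which gives a crude form of field search.
        """
        boosts = boosts or {}
        scores = defaultdict(float)
        for term in tokenize(query):
            for field in self.fields:
                weight = boosts.get(field, 1.0)
                if weight == 0:
                    continue
                for doc_id, tf in self.postings[field].get(term, {}).items():
                    scores[doc_id] += weight * tf
        # Highest score first; ties broken by document id for stable output.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = TinyIndex(["title", "body"])
index.add("a", {"title": "Static search", "body": "notes on crawling"})
index.add("b", {"title": "Crawling notes", "body": "search search tips"})

print(index.search("search"))                  # body matches dominate
print(index.search("search", {"title": 3.0}))  # title matches boosted
print(index.search("search", {"body": 0}))     # title-only field search
```

Because the weights are applied when the query runs rather than when the index is built, the same index can serve differently tuned searches without reindexing, which is the property the description highlights.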

Lunr: Search made simple

Pagefind is a fully static search library that aims to perform well on large sites, while using as little of your users’ bandwidth as possible, and without hosting any infrastructure.

Impossibly fast web search, built for static sites.

Specific Search & Recommendation Platforms

Blog Surf is the internet's only search engine for blogs. Explore the best writing on the internet.

An open index of well-known resources.

TinyGem is a bookmarking service that automatically uses the links you save to surface other related content from manually curated sources. If you are intellectually curious, have a selective news diet and enjoy reading places like Hacker News, TinyGem might be for you.

Corpuses

Us

The HTTP Archive tracks how the web is built by periodically crawling the top sites on the web and recording detailed information about fetched resources, used web platform APIs and features, and execution traces of each page.

Crawl Techniques

Stealth mode: Applies various techniques to make detection of headless puppeteer harder. Latest version: 2.11.1, last published: 3 months ago. Start using puppeteer-extra-plugin-stealth in your project by running `npm i puppeteer-extra-plugin-stealth`. There are 334 other projects in the npm registry using puppeteer-extra-plugin-stealth.

puppeteer, puppeteer-extra, puppeteer-extra-plugin, stealth, stealth-mode, detection-evasion, crawler, chrome, headless, pupeteer

I want to share lists of links, but make them readable and archived

posts, projects, 11ty, Node, WiP, fetch, Context Pages

Other languages:

In my blog post brainstorming a new indie web search engine, I noted that running a web search engine is hard. With that in mind, I started to think that I haven't written too much about what I learned about web crawling when running IndieWeb Search, a search engine for the indie web. IndieWeb Search crawled a whitelist of websites, searching for pages, and indexed them for use in the search engine.
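
The crawl described above (start from a whitelist of sites, follow links, keep only pages on approved hosts) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the IndieWeb Search implementation: the `crawl` function, the `LinkParser` helper, and the injected `fetch` callable are all hypothetical names, and the example runs against an in-memory fake site rather than the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, allowed_hosts, fetch, max_pages=100):
    """Breadth-first crawl restricted to an allowlist of hostnames.

    `fetch` is a hypothetical callable: it takes a URL and returns the
    page HTML as a string, or None if the page could not be retrieved.
    Returns a dict mapping each crawled URL to its HTML.
    """
    queue = deque(seeds)
    seen = set(seeds)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if urlparse(url).hostname not in allowed_hosts:
            continue  # off-list hosts are never fetched
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments so the same
            # page is not queued twice under different URLs.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen and urlparse(link).scheme in ("http", "https"):
                seen.add(link)
                queue.append(link)
    return pages


# A fake three-page web, so the sketch runs without network access.
fake_site = {
    "https://a.example/": '<a href="/post">post</a> <a href="https://evil.example/">x</a>',
    "https://a.example/post": '<a href="/#top">home</a>',
    "https://evil.example/": "<p>not on the list</p>",
}

pages = crawl(["https://a.example/"], {"a.example"}, fake_site.get)
print(list(pages))
```

A real crawler would also need to honor robots.txt, rate-limit its requests, and handle redirects and non-HTML responses; those concerns are left out to keep the allowlist logic visible.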

Search Techniques

A cursory review of all the non-metasearch, indexing search engines I have been able to find.

Thanks to the multi billion dollar advertisement industry, searching for something on the internet …

indieweb, search engines, How To Search The Internet, post

Code

GitLab Enterprise Edition

Search without being tracked.

The source code and instructions to create your own version of Wiby.

community search engine. Contribute to cblgh/lieu development by creating an account on GitHub.

Search as a service with YaCy Searchlab: Web Crawling and Data Science Apps for Web Content

YaCy Suchmaschine search engine spider harvester indexer p2p peer network open free download software development

https://pagefind.app/

Why Should We Care?

With a landmark antitrust trial under way, a giant of the modern web is buckling under its own weight.

Google Search, Google Search feels, SEO expert, endless libraries of online information, physical world, last year, part encyclopedia, predictive engine, antitrust laws, company command, product placement, user data, Yahoo CEO Marissa Mayer, modern web, charitable explanation, midlife crisis, Google, opening days, company, U.S. search-engine market, search engine, online-search business, open secret of Google SearchThe company, longtime CEO of Waze, Noam Bardin, search quality, importance of data, Marie Haynes, efficient former self, Internal Google emails, Silicon Valley, own success, start-ups, Last fall, most helpful results, recent years, Justice Department, early years, good use, ever-evolving internet, generic mailers.It’s fitting, Google’s mission statement, default browser, simple question, real lesson of Google, scale, unfathomable amounts of information, tacit admission, heart of the case, past searches, technology, Technology

A look at the new Tiptoe encrypted search system

Testimony during Google’s antitrust case revealed that the company may be altering billions of queries a day to generate search results that will get you to buy more stuff.

ideas, search, google, antitrust, algorithms, advertising, textaboveleftgridwidth, web, tags


A guide for how to discover cool things on the internet.

Hello. I was going to write a post about how to surf the web only I remembered it had already been written, in a far more comprehensive format, by another person. So I'm just going to link to it and…

Meilisearch is neat, together with the tokenizer lib they use. More practically, DocSearch is a great plug-and-use solution. Tantivy, Quickwit & Edgesearch are interesting too.

This article is a stub. You can help the IndieWeb wiki by expanding it.

Hey nerds: I recently stumbled across “Marginalia Search”. It’s a search engine with a fascinating design — rather than give you exactly what you’re looking for, it tries to surprise you.

Indie Map is a complete crawl of 2300 of the most active IndieWeb sites as of June 2017, sliced and diced and rolled up in a few useful ways:

🍵️

The way to improve search is not to mimic Google, but instead to build boutique search engines that index, curate, and organize things in new ways.

bookmark

Kyle Chayka writes about the evolution of Google Search, which has become the runaway favorite Internet search engine despite many users’ misgivings about how the company monetizes the data it collects and how its algorithms determine the search results that a user is shown.

infinite scroll, new yorker favorites, google, search engines, internet, digital technology, algorithmic bias, textbelowcenterfullbleednocontributor, web, tags

I would like for there to be more tiny search engines that are focused on a particular topic. It would be cool if I could type in
