How Siftree Works

Siftree

Aug 4, 2025

Siftree Explained

A simple, visual representation of Siftree's architecture

Ingestion

Siftree groups social media networks into two distinct categories.

  1. Incumbent Platforms: Reddit, Facebook, TikTok, etc.

  2. Open Protocols: Mastodon, Primal, Bluesky, and others that are built on top of open frameworks (ActivityPub, Nostr, ATproto, etc.)


Within each category, each application differs in API access, lexicons, authentication processes, audiences, media formats, and so on.

This means that for billions of people across the world, "the internet" is a very different place. Echo chambers and interest-based algorithms create feedback loops that lock people into niche pockets of inherently niche applications.

Siftree ingests data from all of these disparate platforms to create an aggregated view. We've built a component called an "adapter" that translates each platform's data into a shared, unified representation.

The end result is a "Canonical Social Event" (CSE): simply put, something that happened on the social web, represented in a schema that can be understood across platforms and protocols.
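
To make the adapter idea concrete, here is a minimal sketch of how two sources might be mapped into one shared shape. The CanonicalSocialEvent fields and both adapter functions are hypothetical, chosen for illustration rather than taken from Siftree's actual schema. The Nostr input follows the standard event fields (id, pubkey, content, created_at); the Bluesky input is an assumed, flattened shape rather than the full ATproto record.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CanonicalSocialEvent:
    """Hypothetical CSE shape; the real Siftree schema may differ."""
    source: str           # platform or protocol the event came from
    event_id: str         # identifier as assigned by the source
    author: str           # author handle or public key
    text: str             # whitespace-trimmed text content
    created_at: datetime  # timezone-aware UTC timestamp


def adapt_nostr(event: dict) -> CanonicalSocialEvent:
    # Nostr events carry a hex id, a pubkey, content, and a Unix timestamp.
    return CanonicalSocialEvent(
        source="nostr",
        event_id=event["id"],
        author=event["pubkey"],
        text=event["content"].strip(),
        created_at=datetime.fromtimestamp(event["created_at"], tz=timezone.utc),
    )


def adapt_bluesky(record: dict) -> CanonicalSocialEvent:
    # Assumed flattened input: uri, author, text, and an ISO 8601 createdAt.
    # A trailing "Z" is rewritten so older Python versions can parse it.
    created = datetime.fromisoformat(record["createdAt"].replace("Z", "+00:00"))
    return CanonicalSocialEvent(
        source="bluesky",
        event_id=record["uri"],
        author=record["author"],
        text=record["text"].strip(),
        created_at=created.astimezone(timezone.utc),
    )
```

Everything downstream of the adapters then only has to understand one event type, regardless of where the event originated.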


Processing

Siftree ingests a large volume of CSEs. We use machine learning techniques such as named entity recognition (NER), clustering, and sentiment analysis to understand and categorize the events coming into our platform.

This process creates a "taxonomy of events", allowing us to store vector representations of social activity across the world.
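
As a rough illustration of what "vector representations" and clustering mean here, the sketch below turns event text into a toy bag-of-words vector and groups similar events with a greedy, single-pass clustering. This is a simplified stand-in, not Siftree's pipeline: a production system would more likely use a learned embedding model and a proper clustering algorithm. The function names and the 0.5 threshold are assumptions.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: each text joins the first cluster whose seed
    vector is similar enough, otherwise it starts a new cluster."""
    clusters: list[tuple[Counter, list[str]]] = []
    for text in texts:
        vec = embed(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((vec, [text]))
    return [members for _, members in clusters]
```

For example, "bitcoin price rises today" and "bitcoin price falls today" share three of four terms and land in the same cluster, while "new cat video" starts its own.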


Your experience

We use additional machine learning methods to show you relevant topics and enable analytics. These include reranking results and using LLMs to read thousands of events and synthesize annotations.
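
The reranking step can be sketched as a second pass over candidates that a first-stage retrieval has already scored. The version below blends that first-stage score with a simple term overlap against a user's interests. It is a sketch under stated assumptions: the rerank function, its weight parameter, and the overlap heuristic are hypothetical, and real rerankers typically use a trained model rather than term overlap. The LLM annotation step is not shown.

```python
def rerank(
    candidates: list[tuple[str, float]],
    interests: set[str],
    weight: float = 0.5,
) -> list[str]:
    """Re-order (text, first_stage_score) pairs by a blended score.

    weight=0 keeps the first-stage order; weight=1 ranks purely by
    how much of each candidate's vocabulary matches the interests.
    """
    def blended(item: tuple[str, float]) -> float:
        text, base_score = item
        tokens = set(text.lower().split())
        overlap = len(tokens & interests) / len(tokens) if tokens else 0.0
        return (1 - weight) * base_score + weight * overlap

    ranked = sorted(candidates, key=blended, reverse=True)
    return [text for text, _ in ranked]
```

With interests {"open", "protocol"}, a candidate "open protocol news" scored 0.6 by the first stage overtakes "celebrity gossip roundup" scored 0.9, because two of its three terms match.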

Overall, our goal is to "sift through the noise" and find what you're interested in.

Siftree Diagram

