
RSS filter

An RSS feed recommendation system based on the articles a user has read. It replaces the feed URLs with the backend URL, which filters out unwanted items and tracks which articles the user reads. It uses LLM embeddings and machine learning to recommend similar articles.

This is a simple RSS filter that filters out unwanted items from an RSS feed. It is written in Python and uses the feedparser library to parse the feed.
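For example, parsing a feed with feedparser looks roughly like this (an illustrative snippet, not code from the repository):

import feedparser

# Parse a feed from its URL (feedparser also accepts raw XML strings).
feed = feedparser.parse("https://news.ycombinator.com/rss")

for entry in feed.entries:
    print(entry.title, entry.link)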

It works by tracking the user's read articles, computing their embeddings, clustering them, and then recommending similar articles from the feed. It also includes some random articles from the feed to allow discovery of new topics. Recommendations start only after the user has read a few articles (10 by default).

Embedding models enable a new kind of recommendation system that does not require a large user base: recommendations are based on the content of the articles rather than on other users' behavior.
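As a rough illustration of this content-based approach, the sketch below embeds read articles, clusters them, and scores candidate items by similarity to the nearest cluster. It uses sentence-transformers and scikit-learn as stand-ins; the actual models, clustering method, and thresholds used by the project may differ.

from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

# Embed the titles of articles the user has already read.
read_titles = ["Rust 1.80 released", "Postgres performance tips", "Zig vs. C"]
read_embeddings = model.encode(read_titles)

# Cluster the read articles to capture the user's interests.
clusters = KMeans(n_clusters=2).fit(read_embeddings)

# Score new feed items by similarity to the closest interest cluster.
candidate_titles = ["A deep dive into SQLite", "Celebrity gossip roundup"]
candidate_embeddings = model.encode(candidate_titles)
scores = cosine_similarity(candidate_embeddings, clusters.cluster_centers_).max(axis=1)

# Keep the most similar items; a few random items can be mixed in for discovery.
recommended = [t for t, s in zip(candidate_titles, scores) if s > 0.3]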

Self-hosting

You can self-host this project by running the following command:

cp .env.example .env
docker-compose -f docker-compose.yaml up

If you don't have a GPU or don't want to use it, first run:

sed -i 's/^.*devices:.*$/#&/' docker-compose.yaml

Test it with:

curl -X 'GET' \
  'http://localhost/api/v1/feed/1/https%3A%2F%2Fnews.ycombinator.com%2Frss' \
  -H 'accept: application/json'

To use the self-hosted frontend, you should change apiBaseUrl in frontend/static/app.js to match the backend URL.
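The feed path in the example above encodes a user ID followed by the URL-encoded address of the original feed. A minimal sketch of building such a filtered-feed URL (the helper function is hypothetical, shown only to illustrate the URL format):

from urllib.parse import quote

def filtered_feed_url(base_url: str, user_id: str, feed_url: str) -> str:
    # URL-encode the original feed address so it fits in a single path segment.
    return f"{base_url}/api/v1/feed/{user_id}/{quote(feed_url, safe='')}"

print(filtered_feed_url("http://localhost", "1", "https://news.ycombinator.com/rss"))
# http://localhost/api/v1/feed/1/https%3A%2F%2Fnews.ycombinator.com%2Frss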

Security

SSRF Protection

RSS Filter includes built-in protection against Server-Side Request Forgery (SSRF) attacks. When fetching feeds, the application automatically blocks requests to:

  • Private IP ranges (192.168.x.x, 10.x.x.x, 172.16.x.x-172.31.x.x)
  • Localhost (127.x.x.x, ::1)
  • Link-local addresses (169.254.x.x)
  • AWS metadata service (169.254.169.254)

This protection works automatically for basic setups; no configuration is needed. The application validates DNS resolution for all requests, including redirects.
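As an illustration, a check like this can be written with Python's standard ipaddress and socket modules (a minimal sketch, not the project's actual validation code):

import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_feed_url(url: str) -> bool:
    host = urlparse(url).hostname
    if host is None:
        return False
    # Resolve the hostname and reject private, loopback, and link-local addresses.
    for info in socket.getaddrinfo(host, None):
        address = ipaddress.ip_address(info[4][0])
        if address.is_private or address.is_loopback or address.is_link_local:
            return False
    return True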

Enhanced Security with Proxy (Optional)

For additional security in production environments, you can route all feed requests through a proxy that only allows external hosts:

# docker-compose.yaml
services:
  backend:
    environment:
      FEED_PROXY: http://gluetun:8888  # Your proxy service

Using a proxy provides defense-in-depth by enforcing network-level isolation.
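On the client side, such a setting is typically consumed by passing the proxy to the HTTP client; the sketch below uses requests as an illustrative stand-in for whatever client the backend actually uses.

import os
import requests

# When FEED_PROXY is set, route all outgoing feed requests through it.
proxy = os.environ.get("FEED_PROXY")
proxies = {"http": proxy, "https": proxy} if proxy else None

response = requests.get("https://news.ycombinator.com/rss", proxies=proxies, timeout=10)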

Architecture

The application consists of several services:

  • backend: FastAPI application serving the API
  • frontend: Static web frontend
  • redis: Message queue for background jobs
  • rq-worker: Background workers for feed fetching and embeddings
  • rq-worker-gpu: GPU-enabled worker for computing embeddings
  • scheduler: Handles all periodic tasks (replaces external cron jobs)
  • proxy: Traefik reverse proxy
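As an illustration of how the backend, redis, and the rq workers fit together, a job can be enqueued with the rq library roughly like this (the task path app.tasks.fetch_feed is hypothetical):

from redis import Redis
from rq import Queue

queue = Queue("feeds", connection=Redis(host="redis"))

# The backend enqueues a job; an rq-worker process picks it up and runs it.
queue.enqueue("app.tasks.fetch_feed", "https://news.ycombinator.com/rss")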

Scheduled Tasks

The scheduler service handles all periodic tasks automatically:

Task                  Schedule                Description
fetch_all_feeds       Hourly                  Fetches all feeds for active users
run_full_maintenance  Daily, 4am UTC          Clean up old articles, vacuum the database
retry_disabled_feeds  Weekly, Sunday 3am UTC  Retry feeds that were disabled due to errors

No external cron jobs are required.
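For reference, the equivalent periodic scheduling can be sketched with APScheduler (an illustrative stand-in; the project's scheduler service may be implemented differently, and the job functions below are stubs):

from apscheduler.schedulers.blocking import BlockingScheduler

# Stubs standing in for the real jobs run by the scheduler service.
def fetch_all_feeds(): ...
def run_full_maintenance(): ...
def retry_disabled_feeds(): ...

scheduler = BlockingScheduler(timezone="UTC")
scheduler.add_job(fetch_all_feeds, "interval", hours=1)                     # hourly
scheduler.add_job(run_full_maintenance, "cron", hour=4)                     # daily at 4am UTC
scheduler.add_job(retry_disabled_feeds, "cron", day_of_week="sun", hour=3)  # Sundays at 3am UTC
scheduler.start()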

CLI Commands

The backend includes a CLI for manual operations:

# Inside the backend container
python -m app.cli --help

# Available commands:
python -m app.cli fetch-feeds      # Manually trigger feed fetching
python -m app.cli retry-feeds      # Re-enable and retry disabled feeds
python -m app.cli maintenance      # Run full maintenance cycle
python -m app.cli stats            # Show database statistics
python -m app.cli freeze-users     # Freeze dormant users
python -m app.cli unfreeze USER_ID # Unfreeze a specific user
python -m app.cli clean-articles   # Delete old unread articles
python -m app.cli clean-embeddings # Remove old embeddings
python -m app.cli vacuum           # Vacuum and analyze database

Development

Backend

Dependencies

To install the required libraries, run the following command in the backend or frontend directory:

pip install -r requirements.txt

Running the backend

cd backend
python -m uvicorn app.main:app --reload --log-level debug --port 8000

Contributing

There are some hooks in .pre-commit-config.yaml to ensure:

  • the pip-compile output is up to date with any added dependencies
  • the code is well formatted and linted with ruff and black.

You can install these hooks with pre-commit install and run them on demand with pre-commit run --all-files.

Contact

If you have any questions, feel free to contact me at m0wer at autistici dot org.
