This repository documents the automated web crawling bots used by our team for security and privacy research.
Our bots focus on passive crawls and on analyzing the security and privacy characteristics of web requests and pages. Through large-scale empirical studies, we aim to better understand and improve the security and privacy landscape of the modern web.
- Passive Analysis: Our bots perform passive crawls, observing and analyzing web pages without attempting to exploit vulnerabilities or interact with services in adversarial ways
- Security Research: We analyze security-related aspects such as HTTPS adoption, security headers, cookie practices, and other security mechanisms (see the sketch after this list)
- Privacy Research: We examine privacy-related features including tracking mechanisms, third-party resources, and data collection practices
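To make the kind of passive observation described above concrete, here is a minimal, illustrative sketch in Python. It is not our production crawler; the use of the `requests` library, the particular header list, and the output fields are assumptions made for the example.

```python
# Illustrative sketch of passive observation: fetch a page once and record
# security/privacy signals without any further interaction. Not our real crawler.
import requests

# Assumed, non-exhaustive list of headers of interest for the example.
SECURITY_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
]

def observe(url: str) -> dict:
    """Fetch a single page and record observable security/privacy signals."""
    response = requests.get(url, timeout=30, allow_redirects=True)
    return {
        "url": url,
        "final_url": response.url,
        "uses_https": response.url.startswith("https://"),
        "status": response.status_code,
        "security_headers": {
            name: response.headers.get(name) for name in SECURITY_HEADERS
        },
        # Raw Set-Cookie value lets us study cookie attributes (Secure, SameSite, ...)
        "set_cookie": response.headers.get("Set-Cookie"),
    }

if __name__ == "__main__":
    print(observe("https://example.com"))
```

Running the module prints the recorded signals for a single page; in a real crawl such records would be stored for later offline analysis.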
We are committed to responsible and ethical web crawling practices:
- Non-Adversarial: Our bots do not attempt to compromise, exploit, or otherwise jeopardize the services they visit. We strictly observe and analyze without causing harm
- Rate Limiting: We implement appropriate delays between navigation requests to avoid overwhelming servers
- Domain Throttling: We carefully space out requests to the same domain to minimize the impact on individual websites (a sketch of this throttling logic follows the list)
- Respectful Behavior: We follow industry best practices for web crawling and respect website operators' resources
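The following sketch shows one simple way the rate-limiting and domain-throttling policy above could be enforced. It is illustrative only: the delay values, the `wait_before_fetching` helper, and the module-level bookkeeping are assumptions for the example, not our actual configuration.

```python
# Illustrative throttling sketch: enforce a minimum pause between any two
# requests, and a longer minimum pause between requests to the same domain.
import time
from urllib.parse import urlparse

GLOBAL_DELAY_S = 1.0       # assumed minimum pause between any two requests
PER_DOMAIN_DELAY_S = 30.0  # assumed minimum pause between hits to one domain

_last_request_at = 0.0
_last_seen_per_domain: dict[str, float] = {}

def wait_before_fetching(url: str) -> None:
    """Sleep long enough to respect both the global and per-domain delays."""
    global _last_request_at
    domain = urlparse(url).netloc
    now = time.monotonic()

    wait = max(
        GLOBAL_DELAY_S - (now - _last_request_at),
        PER_DOMAIN_DELAY_S - (now - _last_seen_per_domain.get(domain, float("-inf"))),
        0.0,
    )
    if wait > 0:
        time.sleep(wait)

    stamp = time.monotonic()
    _last_request_at = stamp
    _last_seen_per_domain[domain] = stamp
```

A crawler following this pattern would call `wait_before_fetching(url)` immediately before each navigation request.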
Our research focuses on large-scale web crawls to enable comprehensive analysis of security and privacy trends across the internet. This scale allows us to:
- Identify broad patterns and trends in security and privacy practices
- Track the evolution of web security and privacy over time
- Provide data-driven insights to the research community
For questions about our research or bots, please contact Saiid El Hajj Chehade at saiid.elhajjchehade@epfl.ch.