Audience: South Texas College – Networking & Cybersecurity Students
Motivation: Gain a competitive edge in a YouTube analytics competition by building a private, opt‑in data collection and analysis system.
This project demonstrates how to ethically collect and analyze your own YouTube search traffic using a local DNS server, a local HTTP proxy, and a SQLite database.
The goal is to simulate a real‑world analytics pipeline without intercepting or modifying any third‑party traffic. All clients in this design voluntarily opt in by pointing their DNS to the analytics node.
Below is the network topology used in the lab.
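Clients intentionally set the analytics node as their DNS server, so lookups for youtube.com resolve to the local analytics server.
This setup allows the analytics node to receive each client's YouTube search requests, log the search terms, and forward the requests to the real YouTube server.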
No traffic is intercepted without consent. This is a controlled, ethical, educational environment.
Each client manually sets DNS to:
Primary DNS: 192.168.1.50
Secondary DNS: 1.1.1.1 (fallback)
The analytics node runs a lightweight DNS server (e.g., dnslib in Python) that resolves:
youtube.com → 192.168.1.50
All other domains are forwarded upstream to Spectrum’s DNS or Cloudflare.
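The resolution rule above can be sketched as a small pure function, kept separate from any DNS library so it is easy to test. This is a minimal sketch, not the project's actual resolver: the function name resolve_locally is hypothetical, and matching subdomains such as www.youtube.com is an assumption (the rule above only names youtube.com). A dnslib-based server would call something like this for each query and forward upstream whenever it returns None.

```python
from typing import Optional

ANALYTICS_NODE_IP = "192.168.1.50"  # analytics node address from the lab setup


def resolve_locally(qname: str) -> Optional[str]:
    """Return the analytics node IP for YouTube names, or None to forward upstream.

    Assumption: subdomains (e.g. www.youtube.com) are treated like youtube.com.
    """
    # DNS names are case-insensitive and may carry a trailing root dot.
    name = qname.rstrip(".").lower()
    if name == "youtube.com" or name.endswith(".youtube.com"):
        return ANALYTICS_NODE_IP
    return None  # caller forwards the query to the upstream resolver


if __name__ == "__main__":
    for q in ("youtube.com.", "www.youtube.com", "example.org"):
        print(q, "->", resolve_locally(q) or "forward upstream")
```

Keeping the decision in one function means the "forward everything else" path stays the default, so a bug in the matching logic fails toward normal DNS behavior.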
The analytics node performs three tasks: it captures requests to youtube.com/results?search_query=..., extracts the search term, and logs it to a SQLite database.
Database location:
C:\Temp\projectYouTubeAnalytics\DB\searches.db
CREATE TABLE IF NOT EXISTS youtube_searches (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    client_ip TEXT,
    query TEXT,
    timestamp TEXT
);
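As an illustration of how this table might be used, the sketch below creates it with Python's built-in sqlite3 module, logs one search, and counts the most frequent queries. The helper names (open_db, log_search, top_queries) and the sample client address are hypothetical. Storing the timestamp as an ISO 8601 UTC string is an assumption, since the schema only says TEXT. The example uses an in-memory database; in the lab, the path above would be passed instead.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS youtube_searches (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    client_ip TEXT,
    query TEXT,
    timestamp TEXT
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def log_search(conn: sqlite3.Connection, client_ip: str, query: str) -> int:
    """Insert one search record and return its row id."""
    ts = datetime.now(timezone.utc).isoformat()  # assumed format: ISO 8601, UTC
    cur = conn.execute(
        "INSERT INTO youtube_searches (client_ip, query, timestamp) VALUES (?, ?, ?)",
        (client_ip, query, ts),  # parameters are bound, never string-formatted
    )
    conn.commit()
    return cur.lastrowid


def top_queries(conn: sqlite3.Connection, limit: int = 10):
    """Return (query, count) pairs, most frequent first."""
    return conn.execute(
        "SELECT query, COUNT(*) AS n FROM youtube_searches "
        "GROUP BY query ORDER BY n DESC, query ASC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    db = open_db()
    log_search(db, "192.168.1.23", "python tutorial")  # hypothetical client
    print(top_queries(db))
```

Bound parameters matter here because the query text comes straight from a URL and must never be spliced into SQL.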
This is the high‑level structure of the analytics script:
1. Start a local DNS server
   - Resolve youtube.com → analytics node
   - Forward all other domains
2. Start a local HTTP proxy
   - Listen for GET requests to /results?search_query=
     (YouTube is served over HTTPS, so the proxy must terminate TLS with a certificate the opt-in clients trust)
   - Extract the search term
   - Log to SQLite
3. Forward the request to the real YouTube server
   - Return the response to the client
4. Store analytics
   - client_ip
   - search term
   - timestamp
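The "extract the search term" step in the outline above can be sketched with the standard library's urllib.parse. This is only an illustration of the parsing step, not the full proxy. The function name extract_search_term is hypothetical, and it assumes the proxy hands it the request target (path plus query string) from the GET line.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_search_term(request_target: str) -> Optional[str]:
    """Return the decoded search term from a /results request, else None."""
    parts = urlsplit(request_target)
    if parts.path != "/results":
        return None  # not a search results request
    # parse_qs decodes percent-escapes and '+' and drops blank values by default.
    values = parse_qs(parts.query).get("search_query")
    return values[0] if values else None


if __name__ == "__main__":
    print(extract_search_term("/results?search_query=python+dns+server"))
    print(extract_search_term("/watch?v=abc"))
```

Returning None for anything that is not a search request lets the proxy forward other traffic without logging it.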
This creates a complete, ethical analytics pipeline suitable for competition use, all without violating privacy, laws, or ISP policies.