While working on the 3D room, I realised the next logical step was to collect analytics on how visitors actually behave. Knowing what gets clicked most and what gets ignored would let me iteratively improve the features that are used, remove those that are not, and free up space for new ones.
In designing this, I took a modular approach and built a separate analytics pipeline, completely isolated from the portfolio site itself. The portfolio site already has one job: serve the room and the classic pages. With the components isolated, the portfolio should still work if the analytics service breaks, and each component can be deployed and updated independently.
How the site and analytics work together
Portfolio Site -> /track -> QStash -> /consume -> Redis
                                                     |
Dashboard <- /stats <--------------------------------+
The analytics service has three public endpoints. POST /track is what the browser calls whenever a visitor does something: entering the 3D room, clicking an object, and so on. The endpoint validates the event (only known types and object IDs are accepted), rate limits by IP, and hands the event off to QStash, a hosted message queue.
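The validation and rate-limiting steps in /track can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the event type names, the object IDs, and the helpers `validateEvent` and `createRateLimiter` are all hypothetical, and the limiter keeps its state in memory purely so the example runs on its own.

```javascript
// Hypothetical allow-lists; the real event types and object IDs are not given in the post.
const EVENT_TYPES = new Set(["room_enter", "object_click", "page_view"]);
const OBJECT_IDS = new Set(["desk", "bookshelf", "monitor"]);

// Returns { ok: true } or { ok: false, error } for an incoming /track payload.
function validateEvent(payload) {
  if (payload === null || typeof payload !== "object") {
    return { ok: false, error: "payload must be an object" };
  }
  if (!EVENT_TYPES.has(payload.type)) {
    return { ok: false, error: "unknown event type" };
  }
  // Only click events carry an object ID in this sketch.
  if (payload.type === "object_click" && !OBJECT_IDS.has(payload.objectId)) {
    return { ok: false, error: "unknown object id" };
  }
  return { ok: true };
}

// Fixed-window rate limiter keyed by IP (in-memory, for illustration only).
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // ip -> { start, count }
  return function allow(ip, now = Date.now()) {
    const entry = windows.get(ip);
    if (!entry || now - entry.start >= windowMs) {
      // First request from this IP, or the previous window has expired.
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

In a serverless deployment an in-memory map does not survive between invocations, so a real limiter would more likely keep its counters in a shared store such as Redis.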
POST /consume is called by QStash automatically whenever it delivers a new event. The endpoint verifies the signature on the request to confirm it genuinely came from QStash, then increments the right counter in Redis. Room entries, object clicks, and page views all have their own key.
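The counter step in /consume might look something like the sketch below. The key names, the `objectId` and `page` fields, and the helpers `counterKey`, `consumeEvent`, and `createMemoryStore` are assumptions made for illustration; signature verification is left out, and an in-memory store stands in for Redis so the example runs without a network.

```javascript
// Hypothetical key scheme; the post only says each kind of event has its own key.
function counterKey(event) {
  switch (event.type) {
    case "room_enter":
      return "analytics:room_entries";
    case "object_click":
      return `analytics:object_clicks:${event.objectId}`;
    case "page_view":
      return `analytics:page_views:${event.page}`;
    default:
      return null; // unknown events are ignored
  }
}

// `store` is any client exposing an async incr(key), as Redis clients typically do.
async function consumeEvent(store, event) {
  const key = counterKey(event);
  if (key === null) return null;
  return store.incr(key); // resolves to the new counter value
}

// In-memory stand-in for Redis, used only to make the sketch runnable.
function createMemoryStore() {
  const counts = new Map();
  return {
    async incr(key) {
      const next = (counts.get(key) || 0) + 1;
      counts.set(key, next);
      return next;
    },
  };
}
```

Because the increment is a single atomic operation in Redis, two deliveries arriving at the same moment would not overwrite each other's counts.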
GET /stats is what the dashboard calls every 30 seconds to get the latest numbers from Redis.
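A polling loop of this kind could be sketched as below. `startPolling` and its parameters are hypothetical names; the fetch and render functions are passed in so the loop itself stays independent of the dashboard.

```javascript
// Calls fetchStats immediately, then every intervalMs; returns a function that stops polling.
function startPolling(fetchStats, onStats, intervalMs = 30000) {
  let stopped = false;

  async function tick() {
    try {
      const stats = await fetchStats();
      if (!stopped) onStats(stats);
    } catch (err) {
      // A failed poll is logged and skipped; the next tick tries again.
      console.error("stats poll failed:", err.message);
    }
  }

  tick();
  const timer = setInterval(tick, intervalMs);

  return function stop() {
    stopped = true;
    clearInterval(timer);
  };
}
```

In the browser this might be wired up as `startPolling(() => fetch("/stats").then((r) => r.json()), render)`, where `render` is whatever function redraws the charts.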
Why QStash
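QStash is Upstash's HTTP-based message queue, designed specifically for serverless environments. It fills the role Kafka would play in a larger system, but Kafka clients hold persistent connections to a broker, which suits long-running services better than short-lived serverless functions. With QStash the pattern is plain HTTP: publish a message, specify a delivery URL, and QStash handles delivery and retries.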
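To make the publish pattern concrete, here is a rough sketch of the HTTP request involved. It assumes QStash's v2 publish endpoint on the default `qstash.upstash.io` host, with the delivery URL appended to the path and a bearer token for authentication; `buildPublishRequest` is a hypothetical helper that only builds the request and does not send it. The official SDK wraps the same call.

```javascript
// Builds (but does not send) the HTTP request that publishes one event to QStash.
// token: the QStash API token; destinationUrl: where QStash should deliver the message.
function buildPublishRequest(token, destinationUrl, event) {
  return {
    url: `https://qstash.upstash.io/v2/publish/${destinationUrl}`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(event),
    },
  };
}
```

Sending it would then be a single call such as `const { url, options } = buildPublishRequest(token, consumeUrl, event); await fetch(url, options);`, where `consumeUrl` is the public address of /consume.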
Why Node.js and not Python with FastAPI
The analytics service is written in Node.js rather than in Python with FastAPI, partly because the Upstash SDKs for Redis and QStash are JavaScript-first, which made integration straightforward. FastAPI would have been a reasonable choice if the analytics involved heavier computation such as ML-based workflows, but for three endpoints that validate events and increment Redis counters, Node.js was the simpler fit.
Why Chart.js for the dashboard and not Grafana
Grafana is a separate service to run, and it expects a data source such as Prometheus to pull metrics from. That means running additional infrastructure just to display a handful of counters. I wanted to keep things lightweight for my portfolio site, and Chart.js can render charts in the browser straight from the JSON returned by the /stats endpoint, without any additional backend.
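As an illustration of how little glue this needs, the sketch below turns a map of click counts into a Chart.js bar chart configuration. The shape of the input (object ID to count) and the helper name `toBarChartConfig` are assumptions, since the post does not show the /stats response format.

```javascript
// Converts a hypothetical { objectId: clickCount } map into a Chart.js bar chart config.
function toBarChartConfig(objectClicks) {
  // Sort descending so the most-clicked objects appear first.
  const entries = Object.entries(objectClicks).sort((a, b) => b[1] - a[1]);
  return {
    type: "bar",
    data: {
      labels: entries.map(([id]) => id),
      datasets: [
        { label: "Clicks per object", data: entries.map(([, count]) => count) },
      ],
    },
    options: { scales: { y: { beginAtZero: true } } },
  };
}
```

On the dashboard page the result would be passed to `new Chart(canvas, toBarChartConfig(clicks))`, where `canvas` is the target canvas element and `clicks` comes from the /stats response.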
Current limitations
This works well for a moderate volume of events, but every event costs a separate HTTP request to QStash rather than a write over a persistent connection, so per-message overhead adds up and throughput may become a bottleneck as load grows. For high volume, Kafka is still the better fit.
What's next
With the event pipeline in place, the next step is to feed those signals back into improving the site. Knowing what visitors consistently click on, I'm hoping to improve how Baymax answers questions by prioritising the information visitors engage with most. Longer term, a pipeline that detects patterns and surfaces recommendations automatically would make the portfolio adaptive as well.