Architecture Diagram:
Tech Stack:
- Frontend:
  - Hosting: Cloudflare
  - Tech Stack: React, TypeScript, Redux, RTK Query, Tailwind
  - Tooling: Yarn, tsx, Sentry, Auth0, Vite
- Backend:
  - Hosting: AWS EC2
  - Tech Stack: Node.js, Auth0, TypeScript, WebSockets, TypeORM
  - Tooling: Yarn, tsx, Sentry, Docker, Docker Compose
- Database:
  - Service: AWS RDS
  - Database: Postgres
How it works:
Synchronization
- The user signs in with their Google account and accepts the application's permissions, which include read
access to their Google Calendar.
- Once they click the synchronize button in the dashboard, I subscribe to their Google Calendar using an event
webhook.
- For security reasons, this webhook only notifies us that a change has taken place on the calendar; it doesn't
tell us what changed.
- Thankfully, Google provides an efficient solution for this: synchronization tokens.
- You send a request with your last synchronization token to Google's API, and it returns the differences along
with a fresh synchronization token (see the sketch after this list).
- We now have an efficient way to keep our data source in sync with the Google Calendar.
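
To make this concrete, here's a minimal sketch of one incremental-sync round trip, assuming the googleapis Node client; the function name and token handling are illustrative, not the app's actual code:

```typescript
import { google } from "googleapis";
import type { OAuth2Client } from "google-auth-library";

// One incremental-sync round trip: send the stored token, get back only
// the events that changed, plus a fresh token to store for next time.
async function fetchDiff(auth: OAuth2Client, lastSyncToken?: string) {
  const calendar = google.calendar({ version: "v3", auth });
  const res = await calendar.events.list({
    calendarId: "primary",
    // Omitting syncToken on the very first sync returns the full event list.
    syncToken: lastSyncToken,
  });
  return {
    changed: res.data.items ?? [], // the differences since the last sync
    nextSyncToken: res.data.nextSyncToken, // persist for the next webhook ping
  };
}
```

One caveat worth knowing: if Google responds with HTTP 410, the sync token has expired and a full resync is required.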
Scalable Architecture
- Next, we need something that can scale with the incoming events: when you create a daily recurring event on
Google Calendar, it defaults to creating instances for the next 730 days.
- That's a lot of events, and it's only from one user with a single daily event.
- It should be noted that Google gives you the ability to bundle all of these instances into a single recurring
event; however, that is much harder to work with, as you then have to derive the individual occurrences yourself,
and some calculations just aren't possible.
- It's much easier to build on an atomic unit than on a single event representing 730 others.
- Enter SQS. SQS automatically scales to meet demand, providing high throughput at low latency. AWS also offers
a generous free tier for SQS, giving you a million requests per month.
- To get the messages into SQS, I created an API Gateway endpoint with a direct service integration to SQS.
- For consuming the queue, I went with long polling over a more event-driven architecture with AWS Lambda.
- The reasoning was that long polling was simpler and faster, as the Lambda cold start times were ~275ms (see the
consumer sketch after this list).
- I have no doubt that I could have gotten those cold start times down to a reasonable level, but since the
application wasn't fully built yet, I didn't want to optimize prematurely.
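
As a rough sketch of that consumer loop, assuming the AWS SDK v3 SQS client (the queue URL and the handler body are placeholders):

```typescript
import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
} from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});
const QUEUE_URL = process.env.QUEUE_URL!; // placeholder for the real queue URL

async function pollForever(): Promise<void> {
  while (true) {
    // WaitTimeSeconds enables long polling: the call blocks server-side for
    // up to 20s until messages arrive, instead of hammering the API.
    const { Messages } = await sqs.send(
      new ReceiveMessageCommand({
        QueueUrl: QUEUE_URL,
        MaxNumberOfMessages: 10,
        WaitTimeSeconds: 20,
      })
    );

    for (const message of Messages ?? []) {
      // Placeholder handler: kick off the sync-token diff described above.
      console.log("calendar changed:", message.Body);

      // Deleting the message acknowledges it so SQS won't redeliver it.
      await sqs.send(
        new DeleteMessageCommand({
          QueueUrl: QUEUE_URL,
          ReceiptHandle: message.ReceiptHandle,
        })
      );
    }
  }
}
```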
Processing Events
- A brief look at the structure of a Google Calendar event will help you follow along in this section.
- The events have now reached the application. What to do?
- Since the notifications don't actually tell us what changed, we have to make requests to the Google Calendar
API with our sync tokens to see what actually changed.
- As there can be a significant number of events, we must be careful not to block the event loop or use excessive memory while processing them.
- For this, an async generator works perfectly, allowing a small memory footprint and concurrent operations (see the sketch after this list).
- Having obtained the event differences, we pass them to a function that sorts the raw events into a list of unique
event names, active events, and cancelled events (also sketched below).
- Handling the cancelled events is fairly easy, as we take the event IDs and delete the events in batches.
- Moving on to active events, I first check whether the events already exist and then separate them into new and existing events.
- This becomes a fairly complex process with recurring events: if someone renames 1 instance out of 730, its analytics can no longer be counted towards that event, and it instead becomes its own event.
- Or what if the user adjusts the time of one of the recurring instances, or renames a non-recurring event to match a recurring event's name?
- This application also supports user-defined categories: when a user drags and drops an event name into a category, every event with that name is added to that category.
- So you need an event name table for grouping (a schema sketch follows this list).
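
Here's a minimal sketch of that async generator, again assuming the googleapis client; it streams the diff one page at a time instead of materializing every event at once:

```typescript
import { google, calendar_v3 } from "googleapis";
import type { OAuth2Client } from "google-auth-library";

// Streams changed events page by page: only one page is held in memory at a
// time, and each await hands the event loop back to other work in between.
async function* streamChanges(
  auth: OAuth2Client,
  syncToken: string
): AsyncGenerator<calendar_v3.Schema$Event> {
  const calendar = google.calendar({ version: "v3", auth });
  let pageToken: string | undefined;
  do {
    const res = await calendar.events.list({
      calendarId: "primary",
      syncToken, // the same syncToken is sent with every page request
      pageToken,
    });
    for (const event of res.data.items ?? []) yield event;
    pageToken = res.data.nextPageToken ?? undefined;
  } while (pageToken);
}
```

Consumers then just write `for await (const event of streamChanges(auth, token)) { ... }` and stay memory-bounded no matter how many pages come back.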
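
The sorting function could look something like this; it leans on the fact that incremental sync marks deleted events with status "cancelled" (the shape of the return value is illustrative):

```typescript
import { calendar_v3 } from "googleapis";

interface SortedEvents {
  names: Set<string>; // unique event names
  active: calendar_v3.Schema$Event[];
  cancelled: calendar_v3.Schema$Event[];
}

// One pass over the raw diff: deletions arrive as status "cancelled",
// everything else is an active (new or updated) event.
function sortEvents(raw: calendar_v3.Schema$Event[]): SortedEvents {
  const sorted: SortedEvents = { names: new Set(), active: [], cancelled: [] };
  for (const event of raw) {
    if (event.status === "cancelled") {
      sorted.cancelled.push(event);
    } else {
      sorted.active.push(event);
      if (event.summary) sorted.names.add(event.summary);
    }
  }
  return sorted;
}
```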
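
And finally, a hypothetical TypeORM sketch of that event name table; the entity and column names are my own assumptions, but it shows why grouping by name makes drag-and-drop categorization a single foreign-key update:

```typescript
import {
  Entity,
  PrimaryGeneratedColumn,
  Column,
  ManyToOne,
  OneToMany,
} from "typeorm";

@Entity()
export class Category {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  title!: string;

  @OneToMany(() => EventName, (name) => name.category)
  eventNames!: EventName[];
}

// Every calendar event row would reference one EventName row, so dropping
// a name into a category updates a single relation here and implicitly
// categorizes every event sharing that name.
@Entity()
export class EventName {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column({ unique: true })
  name!: string;

  @ManyToOne(() => Category, (category) => category.eventNames, {
    nullable: true,
  })
  category?: Category;
}
```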
To be continued...