Case Study · 12 May 2025
Building a SaaS Platform from Scratch: Lessons Learned
Building a SaaS platform is one of the most complex and rewarding challenges in software development. Over the past year, our team at OxelLab built a social media automation platform that manages content scheduling, AI-powered post generation, and analytics across multiple platforms. The journey from initial concept to a production system handling thousands of users taught us lessons that no tutorial or course could provide.
This article is a candid look at the decisions we made, the mistakes we corrected along the way, and the technical and business insights that emerged from building a real SaaS product from the ground up. Whether you are planning your own SaaS venture or considering hiring a team to build one, these lessons will save you time, money, and frustration.
The Planning Phase: Getting It Right Before Writing Code
The biggest mistake we see founders make is jumping straight into code without a clear plan. Before writing a single line, we spent three weeks on planning. This included defining the core value proposition, mapping out user flows, identifying the minimum viable feature set, and designing the database schema. Those three weeks saved us at least two months of rework later.
We started with user stories — simple descriptions of what each type of user needs to accomplish. For this project, that meant understanding content creators who need to schedule posts across platforms, marketing managers who need analytics and team collaboration, and agencies who manage multiple client accounts. Each user type had different needs, and our architecture needed to accommodate all of them from day one.
The most valuable exercise was defining what we would not build for the initial launch. Feature creep is the silent killer of SaaS projects. We had a list of 47 potential features and ruthlessly cut it down to 12 for the MVP. Everything else went into a backlog for future iterations. This discipline kept the project on schedule and within budget.
Choosing the Tech Stack
Technology decisions made at the start of a SaaS project echo for years. Changing your database or framework six months in is extraordinarily expensive. We evaluated multiple options and settled on a stack that balanced developer productivity, performance, and long-term maintainability. If you are facing the same decision, our guide on how to pick a tech stack without overthinking it breaks down the process.
For the frontend, we chose React with Next.js. React's component model and massive ecosystem made it the pragmatic choice for building a complex, interactive dashboard. Next.js added server-side rendering for SEO-critical pages, API routes that simplified our backend architecture, and an excellent developer experience with hot module replacement and TypeScript support.
The backend runs on Node.js with Express, a core part of our custom web development stack, chosen for its performance with I/O-heavy operations: exactly what a social media platform needs when communicating with dozens of external APIs simultaneously. For the database, PostgreSQL was the clear winner. Its support for JSON columns gave us the flexibility of a document database when we needed it, while maintaining the reliability and query power of a relational database.
- Frontend: React + Next.js + TypeScript + Tailwind CSS
- Backend: Node.js + Express + TypeScript
- Database: PostgreSQL with Prisma ORM
- Cache: Redis for session management and job queues
- Infrastructure: Docker + AWS (ECS, RDS, S3, CloudFront)
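The JSON-column flexibility mentioned above maps directly onto Prisma's `Json` type. As a sketch (the model and field names here are illustrative, not our actual production schema), a scheduled post might look like this in `schema.prisma`:

```prisma
// Hypothetical excerpt; names are illustrative, not the production schema.
model Post {
  id           String   @id @default(cuid())
  content      String
  scheduledAt  DateTime
  // Stored as JSONB: per-platform settings vary too much for fixed columns
  platformMeta Json
  createdAt    DateTime @default(now())

  @@index([scheduledAt])
}
```

The `platformMeta` column is where the document-database flexibility pays off: each social network's posting options can evolve without a schema migration.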
Authentication and Authorization
Getting authentication right is critical for any SaaS platform. Users trust you with their data and their connected social media accounts. A security breach does not just cause technical damage — it destroys trust and can kill a young product overnight.
We implemented a multi-layered authentication system. Users can sign up with email and password or through OAuth providers like Google and GitHub. All passwords are hashed with bcrypt using a cost factor of 12. Sessions are managed through JWTs stored in HTTP-only cookies with short expiration times and automatic refresh token rotation.
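The salt-and-hash pattern behind password storage can be sketched in a few lines. The production system uses bcrypt with a cost factor of 12; the dependency-free version below uses Node's built-in scrypt purely to illustrate the same idea, and the function names are ours, not from any library:

```typescript
// Sketch only: production uses bcrypt (cost 12); this uses Node's built-in
// scrypt to show the salt-and-hash pattern without external dependencies.
import { scryptSync, randomBytes, timingSafeEqual } from "node:crypto";

export function hashPassword(password: string): string {
  const salt = randomBytes(16).toString("hex");
  const hash = scryptSync(password, salt, 64).toString("hex");
  return `${salt}:${hash}`; // store the salt alongside the hash
}

export function verifyPassword(password: string, stored: string): boolean {
  const [salt, hash] = stored.split(":");
  const candidate = scryptSync(password, salt, 64);
  // Constant-time comparison to avoid leaking information via timing
  return timingSafeEqual(candidate, Buffer.from(hash, "hex"));
}
```

The key property in either implementation is that verification never compares plaintext and never uses a variable-time string comparison.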
Authorization was more complex. The platform supports multiple roles — owners, admins, editors, and viewers — across multiple organizations. We implemented role-based access control (RBAC) with a middleware layer that checks permissions on every API request. Each route is decorated with the required permissions, and the middleware validates the user's role within the context of their current organization before allowing the request to proceed.
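The RBAC check itself is simple once the role-to-permission mapping exists. The sketch below uses the four roles named above, but the permission strings and the middleware shape are illustrative assumptions, not our actual route definitions:

```typescript
// Simplified RBAC sketch; permission strings are illustrative.
type Role = "owner" | "admin" | "editor" | "viewer";

const rolePermissions: Record<Role, Set<string>> = {
  owner:  new Set(["posts:read", "posts:write", "members:manage", "billing:manage"]),
  admin:  new Set(["posts:read", "posts:write", "members:manage"]),
  editor: new Set(["posts:read", "posts:write"]),
  viewer: new Set(["posts:read"]),
};

export function can(role: Role, permission: string): boolean {
  return rolePermissions[role].has(permission);
}

// Express-style middleware factory: the route declares what it requires,
// and the user's role is resolved within their current organization.
export function requirePermission(permission: string) {
  return (
    req: { membership: { role: Role } },
    res: { status: (code: number) => { end: () => void } },
    next: () => void,
  ): void => {
    if (can(req.membership.role, permission)) next();
    else res.status(403).end();
  };
}
```

Decorating every route through a factory like this keeps the permission check in one place, so adding a role means editing one table rather than auditing every handler.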
Payment Integration with Stripe
We chose Stripe for payment processing, and despite its excellent documentation, integrating subscriptions with usage-based pricing was the most time-consuming feature to build. The complexity comes not from the initial integration but from handling all the edge cases: failed payments, card expirations, subscription upgrades and downgrades, prorated charges, free trial periods, and usage overages.
Our approach was to build the Stripe integration as a separate service with its own database tables for tracking subscription state. Stripe webhooks notify our system of payment events — successful charges, failed payments, subscription cancellations — and our service updates the local state accordingly. This event-driven architecture means our application never needs to poll Stripe for status updates and can handle payment processing asynchronously.
One lesson we learned the hard way: always implement Stripe's webhook signature verification from day one. During development, we skipped this step for convenience. When we deployed to staging, we discovered that without signature verification, anyone could send fake webhook events to our endpoint and manipulate subscription states. We fixed this immediately, but it could have been a serious vulnerability if it had reached production.
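Stripe signs each webhook with an HMAC-SHA256 over `"<timestamp>.<payload>"`, delivered in the `Stripe-Signature` header. In practice the Stripe SDK's `stripe.webhooks.constructEvent` performs this check for you; the manual sketch below (our own function, written to show what the verification actually does) makes the mechanics visible:

```typescript
// Manual sketch of Stripe's webhook signature check. In production, prefer
// stripe.webhooks.constructEvent, which does this and parses the event.
import { createHmac, timingSafeEqual } from "node:crypto";

export function verifyStripeSignature(
  payload: string,   // raw request body, before any JSON parsing
  sigHeader: string, // Stripe-Signature header, e.g. "t=...,v1=..."
  secret: string,    // the endpoint's webhook signing secret
): boolean {
  const parts = Object.fromEntries(
    sigHeader.split(",").map((kv) => kv.split("=") as [string, string]),
  );
  const expected = createHmac("sha256", secret)
    .update(`${parts.t}.${payload}`)
    .digest("hex");
  const given = parts.v1 ?? "";
  return (
    given.length === expected.length &&
    timingSafeEqual(Buffer.from(given), Buffer.from(expected))
  );
}
```

Note that verification must run against the raw request body: if a JSON body parser runs first and the payload is re-serialized, the signature will no longer match.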
Deployment with Docker and CI/CD
Containerization with Docker was a non-negotiable decision from the start. Docker ensures that the application runs identically in development, staging, and production environments. The classic "it works on my machine" problem simply does not exist when every environment runs the same container images.
Our deployment pipeline uses GitHub Actions for continuous integration and continuous deployment. Every pull request triggers automated tests — unit tests, integration tests, and end-to-end tests with Playwright. If all tests pass and the code review is approved, merging to the main branch automatically builds new Docker images, pushes them to Amazon ECR, and triggers a rolling deployment on ECS. The entire process from merge to production takes about 8 minutes.
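Structurally, the pipeline is two dependent jobs: test, then deploy. The fragment below is an illustrative sketch; the step names, secrets, and ECS cluster/service names are placeholders, not our actual configuration:

```yaml
# Illustrative GitHub Actions sketch; names and secrets are placeholders.
name: deploy
on:
  push:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test            # unit + integration tests
      - run: npx playwright test # end-to-end tests
  deploy:
    needs: test                  # only runs if every test passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t "$ECR_REPO:$GITHUB_SHA" .
      - run: docker push "$ECR_REPO:$GITHUB_SHA"
      - run: aws ecs update-service --cluster prod --service api --force-new-deployment
```

The `needs: test` dependency is what enforces the gate described above: a red test suite stops the deploy job from ever starting.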
We also implemented blue-green deployments so that if a new release introduces a critical bug, we can roll back to the previous version in under 30 seconds. This safety net gave us the confidence to deploy frequently — sometimes multiple times per day — without fear of breaking the production environment.
Scaling and Monitoring
The transition from a handful of beta users to thousands of active users exposed every performance bottleneck we had overlooked. Database queries that took 50 milliseconds with 100 records took 5 seconds with 100,000 records. API endpoints that handled 10 concurrent requests gracefully crashed under 500 concurrent requests. Background jobs that processed one at a time fell hours behind when the queue grew.
We addressed these issues systematically. Database indexes were added based on actual query patterns revealed by PostgreSQL's query analyzer. The API layer was horizontally scaled behind a load balancer, with ECS automatically adding more containers when CPU utilization exceeded 70%. Background job processing was moved to a dedicated worker fleet with Redis-backed job queues that could process hundreds of jobs concurrently.
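The concurrent worker pattern is worth making concrete. Our production workers pull from Redis-backed queues via a queue library; the sketch below strips away the Redis layer and shows only the concurrency-limiting core, keeping up to `limit` jobs in flight and starting a new one as each finishes:

```typescript
// Concurrency-limited job processing sketch (the Redis queue layer is
// omitted). Starts up to `limit` jobs at once; each worker loop pulls the
// next job as soon as its current one completes.
export async function processQueue<T>(
  jobs: T[],
  limit: number,
  handler: (job: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  const workers = Array.from(
    { length: Math.min(limit, jobs.length) },
    async () => {
      while (next < jobs.length) {
        const job = jobs[next++]; // index bump is synchronous: no double-pull
        await handler(job);
      }
    },
  );
  await Promise.all(workers);
}
```

With a handler that calls an external API, raising `limit` from 1 to even a modest number collapses hours of backlog, since throughput is bounded by network latency rather than CPU.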
For monitoring, we use a combination of Datadog for infrastructure metrics, Sentry for error tracking, and custom dashboards built with Grafana for business metrics. Alerts are configured for critical thresholds — if error rates exceed 1%, if response times exceed 500 milliseconds, or if the job queue depth grows beyond 1,000 items, our team is notified immediately through Slack and PagerDuty.
Key Lessons for Anyone Building a SaaS
If we could go back and give ourselves advice before starting this project, these would be the top takeaways. First, invest heavily in automated testing from day one. We wrote tests retroactively for some features, and it was significantly harder and less thorough than writing tests alongside the code. Second, design for multi-tenancy from the start, even if you think you will only have single-user accounts. Migrating to a multi-tenant architecture later is painful and error-prone.
Third, do not over-engineer. We spent two weeks building a sophisticated caching layer before we had enough users to need it. That time would have been better spent on features that attracted more users. Optimize when the metrics tell you to, not when your intuition suggests you might need to. Fourth, talk to users constantly. The features we were most excited about were rarely the ones users valued most. Real feedback from real users is worth more than any amount of internal brainstorming.
Building a SaaS platform is a marathon, not a sprint. The technical challenges are significant, but the organizational and business challenges are even greater. If you are considering building a SaaS product, we would love to share our experience and help you avoid the pitfalls we encountered. You can also explore our past projects and e-commerce solutions to see what we have built for other clients.