How we built Pleo’s tech stack (and what we learned along the way)
They say Rome wasn’t built in a day. Neither was our tech stack. It’s been a work in progress since Pleo was born back in 2015. We’re pretty happy with our current set-up, so we’ve pulled together the highs and lows of our experience to help tech teams everywhere get a head start. We sat down with Fredrik Rimmius and Radek Dors, two of our engineers, to get into the nitty-gritty of it (hint: it’s not as simple as picking the tools we liked the look of and deploying them).
1. What’s our tech stack made up of?
The foundation of our tech stack is Amazon Web Services (AWS), a trusty cloud provider. Maybe you’ve heard of it? It’s the base from which we run all Pleo systems. On top of that, we use Terraform to provision our infrastructure in AWS, from databases to networking and more.
All of our applications are deployed to Kubernetes clusters – sets of node machines for running containerised applications – using GitOps (FluxCD). Together, these technologies provide a robust and scalable environment, allowing us to focus on the work that matters.
And we don’t stop there when it comes to our love for Cloud services. For observability, we use DataDog, a platform that helps to aggregate logs, traces and metrics provided by our applications. We also use a bit of Serverless: AWS OpenSearch, a search engine technology.
Our applications are primarily written in Kotlin and communicate asynchronously by events transmitted through Kafka, and synchronously through REST APIs. These APIs build up the foundation of our product, and it’s on this very platform that we create such a smooth experience for our customers.
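To make the difference between the two communication styles concrete, here is a minimal sketch. It is not Pleo’s code: it uses Python for brevity rather than Kotlin, and an in-memory queue stands in for Kafka. The names `InMemoryEventBus`, `ExpenseService`, `NotificationService` and the `expense.created` topic are all hypothetical. The synchronous call returns its result straight to the caller, while the event is only handled later, when the bus delivers it.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class InMemoryEventBus:
    """Hypothetical stand-in for a broker: publishers enqueue, consumers run later."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Returns immediately; no subscriber has seen the event yet.
        self._queue.append((topic, event))

    def deliver_pending(self) -> int:
        """Hand every queued event to its subscribers; return the number of events."""
        delivered = 0
        while self._queue:
            topic, event = self._queue.popleft()
            for handler in self._handlers[topic]:
                handler(event)
            delivered += 1
        return delivered


class ExpenseService:
    """Owns expenses; answers callers synchronously and announces changes as events."""

    def __init__(self, bus: InMemoryEventBus) -> None:
        self._bus = bus
        self._expenses: Dict[str, dict] = {}

    def create_expense(self, expense_id: str, amount_cents: int) -> dict:
        # Synchronous path (think of a REST handler): the caller waits for this result.
        expense = {"id": expense_id, "amount_cents": amount_cents}
        self._expenses[expense_id] = expense
        # Asynchronous path: other teams' services react whenever they consume the event.
        self._bus.publish("expense.created", expense)
        return expense


class NotificationService:
    """A separate service that only knows about the event, not about ExpenseService."""

    def __init__(self, bus: InMemoryEventBus) -> None:
        self.sent: List[str] = []
        bus.subscribe("expense.created", self._on_expense_created)

    def _on_expense_created(self, event: dict) -> None:
        self.sent.append(f"Expense {event['id']} recorded")
```

In this sketch the notification service never calls the expense service directly, so either one could be deployed or paused without the other noticing, which is the kind of independence between teams that an event-driven design aims for.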
2. Why is it so important to build the right tech stack?
Ultimately, it’s all about scalability. If your technology can’t grow with you, it’s working against you and will likely cause more problems than it’s worth down the line.
Firstly, each of our teams is based on solving a set of problems, and the idea is that they should be able to solve these independently from other teams. Let’s say one team wants to move fast one day and do a big release, while another team might have an offsite and need to postpone their workload until tomorrow. Ideally, the first team’s technology should run as autonomously as possible from the other teams, so that nothing holds them back.
Secondly, it’s important to consider scalability in order to handle traffic peaks. At Pleo, we operate solely in European markets (for now, anyway!), so our traffic peaks generally happen during the same working hours. This means we need to be able to scale our services to handle this traffic without crashing, which would disrupt the user experience. Other financial services platforms might instead need to perform heavy calculations at the end of the day or the end of the month, usually for billing and reconciliation.
Last but not least, our tech stack needs to scale when it comes to serving our future business needs. While we might have started as a solution to managing employee expenses, we need to be able to ideate and launch new ideas quickly, so our APIs must be as extensible and as reusable as possible.
3. How do we make sure our tech stack is cost effective?
One of the ways we do this is by using consistent tagging and labelling of our resources. This gives us a holistic breakdown of how much money we’re spending on each application. We can group this information by the team that is stewarding the applications or technology, so we can work out how much money we spend on databases, for example. Plus, we’ve set up alerts that warn us if we overspend on certain cloud services.
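As a rough illustration of why consistent tags matter, the sketch below groups billing line items by a tag and flags groups that exceed a budget. It is an assumption-laden example, not Pleo’s tooling: the line-item shape, the `team` and `app` tag keys, the budget figures and the function names `cost_by_tag` and `over_budget` are all hypothetical, and real billing data would come from the cloud provider’s cost reports.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def cost_by_tag(line_items: Iterable[dict], tag_key: str) -> Dict[str, float]:
    """Sum cost per value of one tag; resources missing the tag land in 'untagged'."""
    totals: Dict[str, float] = defaultdict(float)
    for item in line_items:
        group = item["tags"].get(tag_key, "untagged")
        totals[group] += item["cost"]
    return dict(totals)


def over_budget(totals: Dict[str, float], budgets: Dict[str, float]) -> List[str]:
    """Return the groups whose spend exceeds their budget, sorted by name."""
    return sorted(
        name
        for name, spent in totals.items()
        if name in budgets and spent > budgets[name]
    )


# Hypothetical monthly line items, each carrying the tags set on the resource.
line_items = [
    {"service": "rds", "cost": 120.0, "tags": {"team": "payments", "app": "ledger"}},
    {"service": "ec2", "cost": 80.0, "tags": {"team": "cards", "app": "issuing"}},
    {"service": "rds", "cost": 45.5, "tags": {"team": "cards", "app": "issuing"}},
    {"service": "s3", "cost": 10.0, "tags": {}},
]

by_team = cost_by_tag(line_items, "team")
alerts = over_budget(by_team, {"payments": 150.0, "cards": 100.0})
```

The `untagged` bucket is the useful part of the design: any spend that shows up there cannot be attributed to a team or application, which is a direct signal that a resource is missing its tags.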
One of the benefits of using DataDog is that it gives us insights on how effectively we’re provisioning our resources. Whether we want to know how many resources we have or what we’re actually using, the platform can tell us. This helps us to cut costs and increase utilisation.
4. Has our tech stack changed over time?
In short, yes. Over the last couple of years, we’ve slowly been making the move from a monolith to a distributed architecture. This is primarily because we want our teams to be autonomous and enjoy the ease of scalability, so we’ve moved to an API-first approach and an event-driven architecture.
Want to know more? Our CTO, Meri Williams, talks in-depth about Pleo’s API-centricity in this interview with The Stack.
5. Have we ever had any issues with our tech stack?
About two years ago, we realised that we had outgrown the architecture we had at the time. It became clear that our business was growing faster than our technology! So it was then that we began to take our tools to the next level.
A few of our team members worked at Klarna before they joined Pleo, where they faced the same challenges and change of direction that we’re facing now. They saw first-hand how hard it was to scale the business with the tools they had, so we’re fortunate that they can bring the ideas, knowledge and experience they gained during that period to our business.
Hopefully you now have a better understanding of Pleo’s tech stack and how much we’ve progressed since the beginning. Fancy joining our world-class engineering and IT team? Check out our open roles across Europe, and if you have any questions, feel free to reach out!