The Grouparoo Blog
Don't Do Background Jobs on Google Cloud Run
Tagged in Engineering Notes
By Evan Tahler on 2021-04-13
Grouparoo is a self-hosted product, so we are always looking for the simplest ways to help our customers run the application. A new member of the Google Cloud Platform (GCP) family is Google Cloud Run, which is the closest Google has come yet to a Heroku-like "Git-Ops" way to deploy your applications. It handles load balancing, scaling, and more for you, and is a really compelling product. Combined with Google Cloud Build, you can wire up your service to (re)deploy automatically when your git repository changes.
Grouparoo was easy to run on Google Cloud Run, with a few caveats:
- You'll need a VPC connector to bridge the Cloud Run networks and any other services you might be running (like a Postgres database or Redis service).
- When configuring hosted Redis for Grouparoo (via Google Memorystore), be sure to enable an authentication string, but not encryption in transit.
- You'll be using "Google Cloud Build" to manage deployments. The Cloud Build service will also need access to the "Serverless VPC Access User" and "Compute Network Admin" IAM roles.
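As a rough illustration of the Redis caveat above, here is a minimal sketch of building a connection URL for a Memorystore instance that has an authentication string but no encryption in transit. The helper name `buildRedisUrl`, the `RedisSettings` shape, the example IP address, and the idea of passing the result through a `REDIS_URL`-style environment variable are all assumptions made for this example, not something confirmed by the setup described here.

```typescript
// Hypothetical helper (not part of Grouparoo): builds a Redis connection URL
// for an instance with an AUTH string enabled and in-transit encryption disabled.
interface RedisSettings {
  host: string; // private IP of the Redis instance, reached through the VPC connector
  port: number; // Redis listens on 6379 by default
  authString: string; // the authentication string configured on the instance
  db?: number; // Redis database index, defaults to 0
}

function buildRedisUrl({ host, port, authString, db = 0 }: RedisSettings): string {
  // "redis://" rather than "rediss://" because TLS (encryption in transit) is off.
  // The password sits in the userinfo part of the URL with an empty username,
  // and is URL-encoded in case it contains reserved characters.
  return `redis://:${encodeURIComponent(authString)}@${host}:${port}/${db}`;
}

// Example values are made up for illustration.
const redisUrl = buildRedisUrl({ host: "10.0.0.3", port: 6379, authString: "s3cret/auth" });
console.log(redisUrl); // redis://:s3cret%2Fauth@10.0.0.3:6379/0
```

The resulting string could then be supplied to the service as an environment variable, assuming the application reads its Redis connection from a URL.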
After those steps, and setting our environment variables, we had Grouparoo running on Google Cloud Run!
But... it didn't last long. Every few hours, we would notice that our job throughput would grind to 0. We would then visit the site to look for failures, but everything appeared to be OK, and would start working again. However, only a few hours later, things would slow down again. What was going on?
After some digging, we learned that Google Cloud Run throttles CPU based on HTTP requests, and only HTTP requests. From the Cloud Run documentation:
"When an application running on Cloud Run finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Therefore, you should not start background threads or routines that run outside the scope of the request handlers."
This makes Google Cloud Run a poor platform choice for an application like Grouparoo, which manages its own background jobs and expects at least one instance to always be available for scheduling. It also explains what we saw: visiting the site sent an HTTP request, which gave the container CPU again, so jobs started working. Once the requests stopped, the CPU was throttled and the jobs stalled again.
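To make the stall pattern concrete, here is a toy model of the behavior described above. It is only a sketch under a simplifying assumption: the container gets CPU on a given "tick" only while an HTTP request is in flight, and a background worker can process one job per tick when it has CPU. The function name `jobsProcessedPerTick` and the tick-based timeline are invented for this example; this is not how Cloud Run's scheduler is implemented.

```typescript
// Toy model, not Cloud Run's real scheduler. One "tick" is an arbitrary unit of time.
// Assumption: the container has CPU only on ticks where an HTTP request is in flight.
function jobsProcessedPerTick(
  totalTicks: number,
  requestStartTicks: number[],
  requestDurationTicks: number
): number[] {
  const processed: number[] = [];
  for (let tick = 0; tick < totalTicks; tick++) {
    // A request is in flight if this tick falls inside any request's window.
    const requestInFlight = requestStartTicks.some(
      (start) => tick >= start && tick < start + requestDurationTicks
    );
    // The background worker can only pick up a job when it has CPU.
    processed.push(requestInFlight ? 1 : 0);
  }
  return processed;
}

// Someone visits the site at tick 2 and again at tick 7; each request lasts 2 ticks.
const timeline = jobsProcessedPerTick(10, [2, 7], 2);
console.log(timeline.join(" ")); // 0 0 1 1 0 0 0 1 1 0
```

In this model, job throughput is nonzero only around the moments someone loads the site, which matches the "it works again when we look at it" symptom.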
If you are looking to run Grouparoo on GCP, check out our Google Cloud example project, which uses Node.js natively and connects to hosted Redis and Postgres databases.
Evan is the CTO and co-founder of Grouparoo, an open source data framework that easily connects your data to business tools. Evan is an open-source innovator, and frequent speaker at software development conferences focusing on Product Management, Node.JS, Rails, and databases.
Learn more about Evan @ https://www.evantahler.com