{"id":33180,"date":"2023-04-25T14:12:39","date_gmt":"2023-04-25T14:12:39","guid":{"rendered":"https:\/\/www.ntsplhosting.com\/blog\/?p=33180"},"modified":"2023-04-25T14:26:09","modified_gmt":"2023-04-25T14:26:09","slug":"best-practices-and-a-tutorial-for-using-google-cloud-functions-with-mongodb-atlas","status":"publish","type":"post","link":"https:\/\/www.ntsplhosting.com\/blog\/best-practices-and-a-tutorial-for-using-google-cloud-functions-with-mongodb-atlas\/","title":{"rendered":"Best practices and a tutorial for using Google Cloud Functions with MongoDB Atlas"},"content":{"rendered":"<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Serverless applications are becoming increasingly popular among developers. They provide a cost-effective and efficient way to handle application logic and data storage. Two of the most popular technologies that can be used together to build serverless applications are Google Cloud Functions and MongoDB Atlas.<\/p>\n<p>Google Cloud Functions allows developers to run their code in response to events, such as changes in data or HTTP requests, without having to manage the underlying infrastructure. This makes it easy to build scalable and performant applications. MongoDB Atlas, on the other hand, provides a fully-managed, globally-distributed, and highly-available data platform. This makes it easy for developers to store and manage their data in a reliable and secure way.<\/p>\n<p>In this article, we&#8217;ll discuss three best practices for working with databases in Google Cloud Functions. First, we&#8217;ll explore the benefits of opening database connections in the global scope. Then, we&#8217;ll cover how to make your database operations idempotent to ensure data consistency in event-driven functions. Finally, we&#8217;ll discuss how to set up a secure network connection to protect your data from unauthorized access. 
By following these best practices, you can build more reliable and secure event-driven functions that work seamlessly with your databases.<\/p>\n<h2>Prerequisites<\/h2>\n<p>The minimal requirements for following this tutorial are:<\/p>\n<ul>\n<li>A <a href=\"https:\/\/www.mongodb.com\/docs\/atlas\/getting-started\/\" target=\"_blank\" rel=\"noopener noreferrer\">MongoDB Atlas database<\/a> with a database user and appropriate network configuration.<\/li>\n<li>A Google Cloud account with <a href=\"https:\/\/cloud.google.com\/billing\/docs\/how-to\/verify-billing-enabled\">billing enabled<\/a>.<\/li>\n<li>Cloud Functions, Cloud Build, Artifact Registry, Cloud Run, Logging, and Pub\/Sub APIs enabled. Follow this link to <a href=\"https:\/\/console.cloud.google.com\/apis\/enableflow?apiid=cloudfunctions,cloudbuild.googleapis.com,artifactregistry.googleapis.com,run.googleapis.com,logging.googleapis.com,pubsub.googleapis.com\">enable the required APIs<\/a>.<\/li>\n<\/ul>\n<p>You can try the experiments shown in this article yourself. Both MongoDB Atlas and Cloud Functions offer free tiers that are sufficient for the first two examples. The final example, setting up a VPC network or Private Service Connect, requires a paid, dedicated Atlas database and paid Google Cloud features.<\/p>\n<h2>Open database connections in the global scope<\/h2>\n<p>Let\u2019s say that we\u2019re building a traditional, self-hosted application that connects to MongoDB. We could open a new connection every time we need to communicate with the database and then immediately close that connection. But opening and closing connections adds overhead both to the database server and to our app. It\u2019s far more efficient to reuse the same connection every time we send a request to the database. 
Normally, we\u2019d connect to the database using a <a href=\"https:\/\/www.mongodb.com\/docs\/drivers\/\" target=\"_blank\" rel=\"noopener noreferrer\">MongoDB driver<\/a> when we start the app, save the connection to a globally accessible variable, and use it to send requests. As long as the app is running, the connection will remain open.<\/p>\n<p>To be more precise, when we connect, the MongoDB driver creates a connection pool. This allows for concurrent requests to communicate with the database. The driver will automatically manage the connections in the pool, creating new ones when needed and closing them when they\u2019re idle. The pooling also limits the number of connections that can come from a single application instance (<a href=\"https:\/\/www.mongodb.com\/docs\/manual\/administration\/connection-pool-overview\/#settings\" target=\"_blank\" rel=\"noopener noreferrer\">100 connections is the default<\/a>).<\/p>\n<\/div>\n<\/div>\n<div class=\"block-image_full_width\">\n<div class=\"article-module h-c-page\">\n<div class=\"h-c-grid\">\n<figure class=\"article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 \"><img src=\"https:\/\/storage.googleapis.com\/gweb-cloudblog-publish\/images\/Connection_pooling.max-1000x1000.png\" alt=\"Connection pooling\" \/><figcaption class=\"article-image__caption \">\n<div class=\"rich-text\">Connection pooling<\/div>\n<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>On the other hand, Cloud Functions are serverless. They\u2019re very efficient at automatically scaling up when multiple concurrent requests come in, and down when the demand decreases.<\/p>\n<p>By default, each function instance can handle only one request at a time. 
However, with <a href=\"https:\/\/cloud.google.com\/functions\/docs\/concepts\/version-comparison\">Cloud Functions 2nd gen<\/a>, you can configure your functions to handle <a href=\"https:\/\/cloud.google.com\/functions\/docs\/configuring\/concurrency\">concurrent requests<\/a>. For example, if you set the concurrency parameter to 10, a single function instance will be able to work on a max of 10 requests at the same time. If we\u2019re careful about how we connect to the database, the requests will take advantage of the connection pool created by the MongoDB driver. In this section, we\u2019ll explore specific strategies for reusing connections.<\/p>\n<p>By default, Cloud Functions can spin up to 1,000 new instances. However, each function instance runs in its own isolated execution context. This means that instances can\u2019t share a database connection pool. That\u2019s why we need to pay attention to the way we open database connections. If we have our concurrency parameter set to 1 and we open a new connection with each request, we will cause unnecessary overhead to the database or even hit the maximum connections limit.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-image_full_width\">\n<div class=\"article-module h-c-page\">\n<div class=\"h-c-grid\">\n<figure class=\"article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 \"><img src=\"https:\/\/storage.googleapis.com\/gweb-cloudblog-publish\/images\/Cloud_Functions_connections.max-1000x1000.png\" alt=\"Cloud functions connections\" \/><figcaption class=\"article-image__caption \">\n<div class=\"rich-text\">Cloud functions connections<\/div>\n<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>That looks very inefficient! Thankfully, there\u2019s a better way to do it. 
We can take advantage of the way Cloud Functions reuses already-started instances.<\/p>\n<p>We mentioned earlier that Cloud Functions scale by spinning up new instances to handle incoming requests. Creating a brand new instance is called a \u201ccold start\u201d and involves the following steps:<\/p>\n<ol>\n<li>Loading the runtime environment.<\/li>\n<li>Executing the global (instance-wide) scope of the function.<\/li>\n<li>Executing the body of the function defined as an \u201centry point.\u201d<\/li>\n<\/ol>\n<p>When the instance handles the request, it\u2019s not closed down immediately. If we get another request in the next few minutes, chances are high it will be routed to the same, already \u201cwarmed\u201d instance. But this time, only the \u201centry point\u201d function will be invoked. And what\u2019s more important is that the function will be invoked in the same execution environment. Practically, this means that everything we defined in the global scope can be reused \u2014 including a database connection! This will reduce the overhead of opening a new connection with every function invocation.<\/p>\n<p>While we can take advantage of the global scope for storing a reusable connection, there is no guarantee that a reusable connection will be used.<\/p>\n<p>Let\u2019s test this theory! We\u2019ll do the following experiment:<\/p>\n<ol>\n<li>We\u2019ll create two Cloud Functions that insert a document into a MongoDB Atlas database. We\u2019ll also attach an event listener that logs a message every time a new database connection is created.\n<ol>\n<li>The first function will connect to Atlas in the function scope.<\/li>\n<li>The second function will connect to Atlas in the global scope.<\/li>\n<\/ol>\n<\/li>\n<li>We\u2019ll send 50 concurrent requests to each function and wait for them to complete. 
In theory, after spinning up a few instances, Cloud Functions will reuse them to handle some of the requests.<\/li>\n<li>Finally, we\u2019ll inspect the logs to see how many database connections were created in each case.<\/li>\n<\/ol>\n<p>Before starting, go back to your Atlas deployment and locate your <a href=\"https:\/\/www.mongodb.com\/docs\/guides\/atlas\/connection-string\/\" target=\"_blank\" rel=\"noopener noreferrer\">connection string<\/a>. For this experiment, also make sure you\u2019ve allowed access from anywhere in the Atlas network settings. This is convenient for testing only; for anything beyond that, we strongly recommend establishing a secure network connection instead, as covered later in this article.<\/p>\n<h3>Creating the Cloud Function with function-scoped database connection<\/h3>\n<p>We\u2019ll use the Google Cloud console to conduct our experiment. Navigate to the <a href=\"https:\/\/console.cloud.google.com\/functions\/\">Cloud Functions<\/a> page and make sure you\u2019ve logged in, selected a project, and enabled all required APIs. Then, click on <b>Create function<\/b> and enter the following configuration:<\/p>\n<ul>\n<li>Environment: 2nd gen<\/li>\n<li>Function name: create-document-function-scope<\/li>\n<li>Region: us-central1<\/li>\n<li>Authentication: Allow unauthenticated invocations<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<div class=\"block-image_full_width\">\n<div class=\"article-module h-c-page\">\n<div class=\"h-c-grid\">\n<figure class=\"article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 \"><img src=\"https:\/\/storage.googleapis.com\/gweb-cloudblog-publish\/images\/Screenshot_of_the_configuration_for_the_fi.max-1000x1000.png\" alt=\"Configuration of cloud function\" \/><figcaption class=\"article-image__caption \">\n<div class=\"rich-text\">Configuration of cloud function<\/div>\n<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Expand the <b>Runtime, build, connections and security settings<\/b> section and under Runtime environment 
variables, add a new variable <b>ATLAS_URI<\/b> with your MongoDB Atlas connection string. Don\u2019t forget to replace the username and password placeholders with the credentials for your database user.<\/p>\n<p>Instead of adding your credentials as environment variables in clear text, you can easily store them as secrets in <a href=\"https:\/\/cloud.google.com\/secret-manager\">Secret Manager<\/a>. Once you do that, you\u2019ll be able to access them from your Cloud Functions.<\/p>\n<p>Click <b>Next<\/b>. It\u2019s time to add the implementation of the function. Open the `package.json` file from the left pane and replace its contents with the following:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>{\n  \"dependencies\": {\n    \"@google-cloud\/functions-framework\": \"^3.0.0\",\n    \"mongodb\": \"latest\"\n  }\n}<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>We\u2019ve added the `mongodb` package as a dependency. 
The package is used to distribute the MongoDB Node.js driver that we\u2019ll use to connect to the database.<\/p>\n<p>Now, switch to the `<b>index.js<\/b>` file and replace the default code with the following:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>\/\/ Global (instance-wide) scope\n\/\/ This code runs once (at instance cold-start)\nconst { http } = require('@google-cloud\/functions-framework');\nconst { MongoClient } = require('mongodb');\n\nhttp('createDocument', async (req, res) =&gt; {\n  \/\/ Function scope\n  \/\/ This code runs every time this function is invoked\n  const client = new MongoClient(process.env.ATLAS_URI);\n  client.on('connectionCreated', () =&gt; {\n    console.log('New connection created!');\n  });\n\n  \/\/ Connect to the database in the function scope\n  try {\n    await client.connect();\n\n    const collection = client.db('test').collection('documents');\n\n    const result = await collection.insertOne({ source: 'Cloud Functions' });\n\n    if (result) {\n      console.log(`Document ${result.insertedId} created!`);\n      return res.status(201).send(`Successfully created a new document with id ${result.insertedId}`);\n    } else {\n      return res.status(500).send('Creating a new document failed!');\n    }\n  } catch (error) {\n    res.status(500).send(error.message);\n  }\n});<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Make sure the selected runtime is <b>Node.js 16<\/b> and for entry point, replace <b>helloHttp<\/b> with <b>createDocument<\/b>.<\/p>\n<p>Finally, hit <b>Deploy<\/b>.<\/p>\n<h3>Creating the Cloud Function with globally-scoped database connection<\/h3>\n<p>Go back to the 
list of functions and click <b>Create function<\/b> again. Name the function <b>create-document-global-scope<\/b>. The rest of the configuration should be exactly the same as in the previous function. Don\u2019t forget to add an environment variable called <b>ATLAS_URI<\/b> for your connection string. Click <b>Next<\/b> and replace the `<b>package.json<\/b>` contents with the same code we used in the previous section. Then, open `<b>index.js<\/b>` and add the following implementation:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>\/\/ Global (instance-wide) scope\n\/\/ This code runs once (at instance cold-start)\nconst { http } = require('@google-cloud\/functions-framework');\nconst { MongoClient } = require('mongodb');\n\n\/\/ Use lazy initialization to instantiate the MongoDB client and connect to the database\nlet client;\nasync function getConnection() {\n  if (!client) {\n    client = new MongoClient(process.env.ATLAS_URI);\n    client.on('connectionCreated', () =&gt; {\n      console.log('New connection created!');\n    });\n\n    \/\/ Connect to the database in the global scope\n    await client.connect();\n  }\n\n  return client;\n}\n\nhttp('createDocument', async (req, res) =&gt; {\n  \/\/ Function scope\n  \/\/ This code runs every time this function is invoked\n  const connection = await getConnection();\n  const collection = connection.db('test').collection('documents');\n\n  try {\n    const result = await collection.insertOne({ source: 'Cloud Functions' });\n\n    if (result) {\n      console.log(`Document ${result.insertedId} created!`);\n      return res.status(201).send(`Successfully created a new document with id ${result.insertedId}`);\n    } else {\n      return res.status(500).send('Creating a new document failed!');\n    }\n  } catch (error) {\n    res.status(500).send(error.message);\n  }\n});<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Change the entry point to <b>createDocument<\/b> and deploy the function.<\/p>\n<p>As you can see, the only difference between the two implementations is where we connect to the database. To reiterate:<\/p>\n<ul>\n<li>The function that connects in the function scope will create a new connection on every invocation.<\/li>\n<li>The function that connects in the global scope will create new connections only on \u201ccold starts,\u201d allowing for some connections to be reused.<\/li>\n<\/ul>\n<p>Let\u2019s run our functions and see what happens! Click <b>Activate Cloud Shell<\/b> at the top of the Google Cloud console. Execute the following command to send 50 requests to the <b>create-document-function-scope<\/b> function:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>seq 50 | xargs -Iz -n 1 -P 50 \\\n  gcloud functions call \\\n    create-document-function-scope \\\n    --region us-central1 \\\n    --gen2<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>You\u2019ll be prompted to authorize Cloud Shell to use your credentials when executing commands. Click <b>Authorize<\/b>. After a few seconds, you should start seeing logs in the terminal window about documents being created. 
Wait until the command stops running, which means all requests were sent.<\/p>\n<p>Then, execute the following command to get the logs from the function:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>gcloud functions logs read \\\n  create-document-function-scope \\\n  --region us-central1 \\\n  --gen2 \\\n  --limit 500 \\\n  | grep \"New connection created\"<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>We\u2019re using `grep` to filter only the messages that are logged whenever a new connection is created. You should see that a whole bunch of new connections were created!<\/p>\n<\/div>\n<\/div>\n<div class=\"block-image_full_width\">\n<div class=\"article-module h-c-page\">\n<div class=\"h-c-grid\">\n<figure class=\"article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 \"><img src=\"https:\/\/storage.googleapis.com\/gweb-cloudblog-publish\/images\/Screenshot_of_the_Cloud_Shell_terminal_sho.max-1000x1000.png\" alt=\"New connection created\" \/><figcaption class=\"article-image__caption \">\n<div class=\"rich-text\">New connection created<\/div>\n<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>We can count them with the `wc -l` command:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>gcloud functions logs read \\\n  create-document-function-scope \\\n  --region us-central1 \\\n  --gen2 \\\n  --limit 500 \\\n  | grep \"New connection created\" \\\n  | wc -l<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>You should see the number 50 printed in the terminal window. This confirms our theory that a connection is created for each request.<\/p>\n<p>Let\u2019s repeat the process for the <b>create-document-global-scope<\/b> function.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>seq 50 | xargs -Iz -n 1 -P 50 \\\n  gcloud functions call \\\n    create-document-global-scope \\\n    --region us-central1 \\\n    --gen2<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>You should see log messages about created documents again. When the command\u2019s finished, run:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>gcloud functions logs read \\\n  create-document-global-scope \\\n  --region us-central1 \\\n  --gen2 \\\n  --limit 500 \\\n  | grep \"New connection created\"<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>This time, you should see significantly fewer new connections. You can count them again with `wc -l`. We have our proof that establishing a database connection in the global scope is more efficient than doing it in the function scope.<\/p>\n<p>We noted earlier that increasing the number of concurrent requests for a Cloud Function can help alleviate the database connections issue. 
Let\u2019s expand a bit more on this.<\/p>\n<h3>Concurrency with Cloud Functions 2nd gen and Cloud Run<\/h3>\n<p>By default, Cloud Functions can only process one request at a time. However, Cloud Functions 2nd gen are executed in a <a href=\"https:\/\/cloud.google.com\/run\/docs\/overview\/what-is-cloud-run\">Cloud Run container<\/a>. Among other benefits, this allows us to configure our functions to handle multiple concurrent requests. <a href=\"https:\/\/cloud.google.com\/functions\/docs\/configuring\/concurrency#console\">Increasing the concurrency capacity<\/a> brings Cloud Functions closer to the way traditional server applications communicate with a database.<\/p>\n<p>If your function instance supports concurrent requests, you can also take advantage of connection pooling. As a reminder, the MongoDB driver you\u2019re using will automatically create and maintain a pool of connections that concurrent requests will use.<\/p>\n<p>Depending on the use case and the amount of work your functions are expected to do, you can adjust:<\/p>\n<ul>\n<li>The concurrency settings of your functions.<\/li>\n<li>The maximum number of function instances that can be created.<\/li>\n<li>The maximum number of connections in the pool maintained by the MongoDB driver.<\/li>\n<\/ul>\n<p>And as we proved, you should always declare your database connection in the global scope to persist it between invocations.<\/p>\n<h2>Make your database operations idempotent in event-driven functions<\/h2>\n<p>You can enable retrying for your event-driven functions. If you do that, Cloud Functions will try executing your function again and again until it completes successfully or the retry period ends.<\/p>\n<p>This functionality can be useful in many cases, namely when dealing with intermittent failures. 
However, if your function contains a database operation, executing it more than once can create duplicate documents or other undesired results.<\/p>\n<p>Let\u2019s consider the following example: The function <b>store-message-and-notify<\/b> is executed whenever a message is published to a specified Pub\/Sub topic. The function saves the received message as a document in MongoDB Atlas and then uses a third-party service to send an SMS. However, the SMS service provider frequently fails and the function throws an error. We have enabled retries, so Cloud Functions tries executing our function again. If we weren\u2019t careful with the implementation, we could duplicate the message in our database.<\/p>\n<p>How do we handle such scenarios? How do we make our functions safe to retry? We have to ensure that the function is idempotent. <a href=\"https:\/\/cloud.google.com\/functions\/docs\/bestpractices\/tips#write_idempotent_functions\">Idempotent functions<\/a> produce exactly the same result regardless of whether they were executed once or multiple times. If we insert a database document without a uniqueness check, we make the function non-idempotent.<\/p>\n<p>Let\u2019s give this scenario a try.<\/p>\n<h3>Creating the event-driven non-idempotent Cloud Function<\/h3>\n<p>Go to Cloud Functions and start configuring a new function:<\/p>\n<ul>\n<li>Environment: <b>2nd gen<\/b><\/li>\n<li>Function name: <b>store-message-and-notify<\/b><\/li>\n<li>Region: <b>us-central1<\/b><\/li>\n<li>Authentication: <b>Require authentication<\/b><\/li>\n<\/ul>\n<p>Then, click on <b>Add Eventarc Trigger<\/b> and select the following in the opened dialog:<\/p>\n<ul>\n<li>Event provider: <b>Cloud Pub\/Sub<\/b><\/li>\n<li>Event: <b>google.cloud.pubsub.topic.v1.messagePublished<\/b><\/li>\n<\/ul>\n<p>Expand <b>Select a Cloud Pub\/Sub topic<\/b> and then click <b>Create a topic<\/b>. 
Enter <b>test-topic<\/b> for the topic ID, and then click <b>Create topic<\/b>.<\/p>\n<p>Finally, enable <b>Retry on failure<\/b> and click <b>Save trigger<\/b>. Note that the function will <a href=\"https:\/\/cloud.google.com\/functions\/docs\/bestpractices\/retries#why_event-driven_functions_fail_to_complete\">always retry on failure<\/a> even if the failure is caused by a bug in the implementation.<\/p>\n<p>Add a new environment variable called <b>ATLAS_URI<\/b> with your connection string and click <b>Next<\/b>.<\/p>\n<p>Replace the `<b>package.json<\/b>` with the one we used earlier and then replace the `<b>index.js<\/b>` file with the following implementation:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>const { cloudEvent } = require('@google-cloud\/functions-framework');\nconst { MongoClient } = require('mongodb');\n\n\/\/ Use lazy initialization to instantiate the MongoDB client and connect to the database\nlet client;\nasync function getConnection() {\n  if (!client) {\n    client = new MongoClient(process.env.ATLAS_URI);\n    await client.connect();\n  }\n\n  return client;\n}\n\ncloudEvent('processMessage', async (cloudEvent) =&gt; {\n  let message;\n  try {\n    const base64message = cloudEvent?.data?.message?.data;\n    message = Buffer.from(base64message, 'base64').toString();\n  } catch (error) {\n    console.error('Invalid message', cloudEvent.data);\n    return Promise.resolve();\n  }\n\n  try {\n    await store(message);\n  } catch (error) {\n    console.error(error.message);\n    throw new Error('Storing message in the database failed.');\n  }\n\n  if (!notify()) {\n    throw new Error('Notification service failed.');\n  }\n});\n\nasync function store(message) {\n  const connection = await getConnection();\n  const collection = connection.db('test').collection('messages');\n  await collection.insertOne({\n    text: message\n  });\n}\n\n\/\/ Simulate a third-party service with a 50% fail rate\nfunction notify() {\n  return Math.floor(Math.random() * 2);\n}<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Then, navigate to the <a href=\"https:\/\/console.cloud.google.com\/cloudpubsub\/topic\/detail\/test-topic\">Pub\/Sub topic<\/a> we just created and go to the <b>Messages<\/b> tab. Publish a few messages with different message bodies.<\/p>\n<p>Navigate back to your <a href=\"http:\/\/cloud.mongodb.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Atlas deployments<\/a>. You can inspect the messages stored in the database by clicking <b>Browse Collections<\/b> in your cluster tile and then selecting the <b>test<\/b> database and the <b>messages<\/b> collection. You\u2019ll notice that some of the messages you just published are duplicated. This is because when the function is retried, we store the same message again.<\/p>\n<p>One obvious way to try to fix the idempotency of the function is to switch the two operations. We could execute the `notify()` function first and then, if it succeeds, store the message in the database. But what happens if the database operation fails? If that was a real implementation, we wouldn\u2019t be able to unsend an SMS notification. So, the function is still non-idempotent. Let\u2019s look for another solution.<\/p>\n<h3>Using the event ID and unique index to make the Cloud Function idempotent<\/h3>\n<p>Every time the function is invoked, the associated event is passed as an argument together with a unique ID. The event ID remains the same even when the function is retried. We can store the event ID as a field in the MongoDB document. Then, we can create a unique index on that field. 
That way, storing a message with a duplicate event ID will fail.<\/p>\n<p>Connect to your database from the <a href=\"https:\/\/www.mongodb.com\/docs\/atlas\/data-federation\/query\/sql\/shell\/connect\/\" target=\"_blank\" rel=\"noopener noreferrer\">MongoDB Shell<\/a> and execute the following command to create a unique index:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>db.messages.createIndex({ \"event_id\": 1 }, { unique: true })<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Then, click on <b>Edit<\/b> in your Cloud Function and replace the implementation with the following:<\/p>\n<\/div>\n<\/div>\n<div class=\"block-code\">\n<pre><code>const { cloudEvent } = require('@google-cloud\/functions-framework');\nconst { MongoClient } = require('mongodb');\n\n\/\/ Use lazy initialization to instantiate the MongoDB client and connect to the database\nlet client;\nasync function getConnection() {\n  if (!client) {\n    client = new MongoClient(process.env.ATLAS_URI);\n    await client.connect();\n  }\n\n  return client;\n}\n\ncloudEvent('processMessage', async (cloudEvent) =&gt; {\n  let message;\n  try {\n    const base64message = cloudEvent?.data?.message?.data;\n    message = Buffer.from(base64message, 'base64').toString();\n  } catch (error) {\n    console.error('Invalid message', cloudEvent.data);\n    return Promise.resolve();\n  }\n\n  try {\n    await store(cloudEvent.id, message);\n  } catch (error) {\n    \/\/ The error E11000: duplicate key error for the 'event_id' field is expected when retrying\n    if (error.message.includes('E11000') &amp;&amp; error.message.includes('event_id')) {\n      console.log('Skipping retrying because the error is expected...');\n      return Promise.resolve();\n    }\n\n    console.error(error.message);\n    throw new Error('Storing message in the database failed.');\n  }\n\n  if (!notify()) {\n    throw new Error('Notification service failed.');\n  }\n});\n\nasync function store(id, message) {\n  const connection = await getConnection();\n  const collection = connection.db('test').collection('messages');\n  await collection.insertOne({\n    event_id: id,\n    text: message\n  });\n}\n\n\/\/ Simulate a third-party service with a 50% fail rate\nfunction notify() {\n  return Math.floor(Math.random() * 2);\n}<\/code><\/pre>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Go back to the Pub\/Sub topic and publish a few more messages. Then, inspect your data in Atlas, and you\u2019ll see the new messages are not getting duplicated anymore.<\/p>\n<p>There isn\u2019t a one-size-fits-all solution to idempotency. For example, if you\u2019re using update operations instead of insert, you might want to check out the <a href=\"https:\/\/www.mongodb.com\/docs\/manual\/reference\/method\/db.collection.update\/#insert-a-new-document-if-no-match-exists--upsert-\" target=\"_blank\" rel=\"noopener noreferrer\">`upsert` option<\/a> and the <a href=\"https:\/\/www.mongodb.com\/docs\/manual\/reference\/operator\/update\/setOnInsert\/#-setoninsert\" target=\"_blank\" rel=\"noopener noreferrer\">`$setOnInsert` operator<\/a>.<\/p>\n<h2>Set up a secure network connection<\/h2>\n<p>To ensure maximum security for your Atlas cluster and Google Cloud Functions, establishing a secure connection is imperative. 
Fortunately, Atlas offers several options for configuring private networking.<\/p>\n<p>One such option is to set up <a href=\"https:\/\/cloud.google.com\/community\/tutorials\/serverless-vpc-access-private-mongodb-atlas\">Network Peering<\/a> between the MongoDB Atlas database and Google Cloud. Alternatively, you can create a private endpoint using <a href=\"https:\/\/www.mongodb.com\/docs\/atlas\/security-cluster-private-endpoint\/\" target=\"_blank\" rel=\"noopener noreferrer\">Private Service Connect<\/a>. Both of these methods let your functions reach the database over private networking rather than the public internet.<\/p>\n<p>Note, however, that these features are not available on the free Atlas M0 cluster. To use them, you will need to upgrade to a dedicated cluster at the M10 tier or higher.<\/p>\n<h2>Wrap-up<\/h2>\n<p>Cloud Functions and MongoDB Atlas are a powerful combination for building efficient, scalable, and cost-effective applications. By following the best practices outlined in this article, you can make your application more robust, performant, and secure. From reusing database connections and making your operations idempotent to securing your network, these tips will help you make the most of both tools. Start applying these practices today to get more out of your cloud development. If you haven\u2019t already, you can subscribe to MongoDB Atlas and create your first free cluster right from the <a href=\"https:\/\/console.cloud.google.com\/marketplace\/product\/mongodb\/mdb-atlas-self-service?utm_source=mongodb_devrel&amp;utm_medium=static_site&amp;utm_campaign=devrel&amp;utm_id=mongodb\">Google Cloud marketplace<\/a>.<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Serverless applications are becoming increasingly popular among developers. 
They provide a cost-effective and efficient way to handle application logic and data storage. Two of the most popular technologies that can be used together to build serverless applications are Google Cloud Functions and MongoDB Atlas. Google Cloud Functions allows developers to run their code in response [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":33256,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[445,479],"tags":[408,492,509,510],"_links":{"self":[{"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/posts\/33180"}],"collection":[{"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/comments?post=33180"}],"version-history":[{"count":3,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/posts\/33180\/revisions"}],"predecessor-version":[{"id":33249,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/posts\/33180\/revisions\/33249"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/media\/33256"}],"wp:attachment":[{"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/media?parent=33180"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/categories?post=33180"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ntsplhosting.com\/blog\/wp-json\/wp\/v2\/tags?post=33180"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}