Asset Storage for MongoDB Realm

Richard Krueger
Published in Geek Culture
Feb 23, 2021

With the emergence of the Covid pandemic in 2020, remote work has become a de facto reality for a large portion of the world’s population. It is now clear that commuting to shared offices may never return to previous levels. This in turn has led to an explosion in the development of collaboration software to address the needs of this new professional diaspora. MongoDB Realm, which is the world’s leading real-time mobile/cloud synchronized database, is at the very heart of this digital transformation. This product solves the extremely important problem of how to synchronize data produced and edited by client devices (mobile, desktop, and web) with one another through a shared cloud infrastructure in real time and at scale. It also handles the synchronization of client devices whose connectivity is intermittent, through an offline-first architecture that is unique in this product category.

The Realm company was originally founded in Denmark in 2011 with the goal of building a lightweight object database for mobile devices (iOS and Android). Their solution was quickly adopted by mobile developers and was deployed on billions of devices worldwide. Realm was a huge success partly because it was free, but also because neither Apple nor Google offered a competitive alternative. In 2017, Realm introduced a new paid service, Realm Cloud, that permitted a developer to synchronize local Realm databases to a master cloud database. This product competed directly with other backend-as-a-service (BaaS) offerings like Google Firebase and Parse. The advantage of Realm Cloud over its competitors was that it cached a local copy of the database on the client device and only synchronized the deltas with the cloud version. This design simplified application code, as the application treated all data as local, and it provided built-in offline support, as synchronization took place in the background once the device reconnected.

The Realm company was acquired by MongoDB in April 2019. For about a year, the engineering teams from the two companies rearchitected the product to replace the backend of Realm Cloud with MongoDB Atlas, MongoDB’s managed cloud document database. The new product, called MongoDB Realm, was released in Beta in June 2020 at the virtual MongoDB Live conference. This new product supports iOS, Android, macOS, and Windows, as well as web interfaces through GraphQL. From a data perspective, MongoDB Realm is the operating system platform for collaborative computing.

MongoDB Realm is now perhaps the leading offline-first real-time database for collaborative programming in the world today. It is the only truly scalable solution for client-side data synchronization across a wide variety of platforms, including iOS, macOS, Android, Windows, and Web, that runs on top of a unified serverless backend.

For collaborative programming, however, data synchronization is usually only half of the challenge. Typically, an application includes both data and assets that need to be shared between users. Assets are defined as large immutable files such as images, music, videos, and/or documents. Currently, MongoDB Realm has a 4MB limit on the maximum object size supported, which means that it does not really provide a solution for asset management (even if the assets are Base64-encoded). This is not to say that assets could not be broken up into multiple 4MB objects, but doing so would require additional management code within the client application. Lastly, because MongoDB Realm provides no video or audio streaming capabilities, an asset chunking strategy would probably not work.

Because of these constraints, most application developers resort to using an asset storage service like Amazon S3 in addition to MongoDB Realm to solve the storage problem. Again, this requires additional client-side code to adequately implement a cross-platform solution. More importantly, it often requires storing sensitive storage service credentials within the client code, which is an additional security risk.

Presently, the MongoDB Realm engineering team has tried to address this issue through a third-party service called AWS S3 Snippets. This solution, however, requires the application to upload the asset data to the MongoDB Realm application server, which in turn marshals it on to the Amazon S3 service. But again, there is a 4MB limit on the asset size, and the Base64 encoding reduces the usable size even further! Also, the very concept of having to upload an asset first to a MongoDB Realm server before it gets uploaded to a storage service is somewhat inefficient, as there is no inherent reason not to upload it directly from the client device to S3.
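The effect of Base64 on that limit is easy to quantify, because Base64 encodes every 3 bytes of input as 4 output characters. The following is a minimal sketch of the arithmetic, assuming that "4MB" means 4 × 1024 × 1024 bytes and that the limit applies to the encoded payload. The function names are illustrative and are not part of any MongoDB Realm or Cosync API.

```typescript
// Illustrative helpers; not part of MongoDB Realm or Cosync.

// Base64 turns each 3-byte group into 4 characters, padding the last group.
function base64EncodedLength(rawBytes: number): number {
  return Math.ceil(rawBytes / 3) * 4;
}

// Largest raw payload whose Base64 form still fits under a byte limit.
function maxRawBytesForLimit(limitBytes: number): number {
  return Math.floor(limitBytes / 4) * 3;
}

const LIMIT_BYTES = 4 * 1024 * 1024; // assumption: "4MB" means 4 MiB

const maxRaw = maxRawBytesForLimit(LIMIT_BYTES);
console.log(maxRaw); // 3145728 bytes, i.e. 3 MiB of raw asset data
console.log(base64EncodedLength(maxRaw)); // 4194304, exactly at the limit
```

Under these assumptions, only about three quarters of the nominal limit is available for the asset itself.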

From an application developer’s point of view, an asset needs an HTTPS URL (uniform resource locator) to be properly handled within a client program. This is because image caching libraries (like Kingfisher) usually require URLs to operate on. In addition, for image data, it is often necessary to handle different size cuts of the image (small, medium, large, and original) for different performance scenarios. Support for image cuts of varying sizes should be included in any asset management strategy. Lastly, video and audio assets require that the storage service be able to stream the data within an application. This can only be done through URLs to a storage service that supports CDN-type functionality, like Amazon S3.
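To make the idea of image cuts concrete, here is a small sketch of how cut dimensions could be derived from an original image while preserving its aspect ratio. The pixel sizes for the small, medium, and large cuts are assumptions chosen for illustration, as are all of the names; the article does not state which dimensions Cosync actually uses.

```typescript
// Hypothetical cut sizes (longest edge, in pixels); the article does not
// specify the dimensions Cosync uses.
type CutName = "small" | "medium" | "large";

const CUT_MAX_EDGE: Record<CutName, number> = {
  small: 300,
  medium: 600,
  large: 900,
};

interface Size {
  width: number;
  height: number;
}

// Scale an image so its longest edge fits the cut, preserving aspect ratio.
// Images already smaller than the cut are left at their original size.
function cutSize(original: Size, cut: CutName): Size {
  const longestEdge = Math.max(original.width, original.height);
  const scale = Math.min(1, CUT_MAX_EDGE[cut] / longestEdge);
  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}

const photo: Size = { width: 4000, height: 3000 };
console.log(cutSize(photo, "small")); // { width: 300, height: 225 }
```

Capping the scale at 1 avoids upscaling small images, which would increase file size without adding detail.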

To address this challenge, our company Cosync has built a Storage service on top of MongoDB Realm to bridge the gap with Amazon S3. This service is called the Cosync Storage module. As far as MongoDB Realm is concerned, asset handling requires data structures to keep track of the asset within the storage service, as well as functionality to compute write URLs for uploading the asset data. It should also provide support for image asset cuts and video previews. Our Storage service provides all of this functionality to a developer in one easy package that seamlessly integrates with any MongoDB Realm application.

The Cosync Storage solution for asset management within a MongoDB Realm application has three components:

  • A MongoDB Realm Application configured with the Amazon S3 credentials along with a set of triggers and functions for computing asset URLs
  • A set of data models and functions for modeling assets, asset uploads, and expiring asset URLs
  • Client-side code for handling asset initialization, image asset cuts, video previews, upload progress, and HTTPS uploading

The data models used to model assets are the following:

  • CosyncAssetUpload
  • CosyncAsset

Schemas for these data models are discussed in the next section. The Cosync Storage sample application code provides versions of these data models for Swift, Kotlin, and React Native. All of these data models assume a partition key called _partition. As a recommended best practice, we suggest that CosyncAssetUpload objects exist within the user’s private partition, to maximize scalability.

The CosyncAssetUpload object is used by a client application to initiate the upload of an asset to the Amazon S3 Storage service. The CosyncAsset object is used to record an uploaded asset in MongoDB Realm. This object is automatically created by the backend code running on the MongoDB Realm Application server.

Expiring Assets

For security purposes, the Cosync Storage module allows a developer to control whether an asset’s URLs expire or not. Expiring URLs are used to protect sensitive assets from being shared over the Internet. Asset URLs can expire in a few hours (the default is 24) or in a few minutes depending on how sensitive the information is.

Asset expiration is controlled through the expirationHours property in the CosyncAssetUpload object at the time of the upload process. If an asset is set to expire, the expirationHours property on the CosyncAsset object will be set to a value greater than zero, and the expiration property will specify the date in UTC when the asset URLs expire. If an asset has expired, the client-side code can call the function CosyncRefreshAsset(assetId) to force the backend to refresh the URLs for the asset. The newly computed URLs will expire expirationHours hours after the time the function was called.
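The expiration rules described above can be sketched in a few lines. This is only an illustration under stated assumptions: the interface models just the two properties named in the text, the helper names isExpired and refreshedExpiration are hypothetical, and the real refresh is performed on the backend by CosyncRefreshAsset rather than by client code.

```typescript
// Only the two properties named in the article are modeled here.
interface ExpiringAssetFields {
  expirationHours: number; // 0 means the asset never expires
  expiration?: Date;       // UTC instant at which the URLs expire
}

const MS_PER_HOUR = 60 * 60 * 1000;

// True when the asset's URLs can no longer be used at time `now`.
function isExpired(asset: ExpiringAssetFields, now: Date): boolean {
  if (asset.expirationHours <= 0 || asset.expiration === undefined) {
    return false; // non-expiring asset
  }
  return now.getTime() >= asset.expiration.getTime();
}

// Expiration that a refresh would produce: expirationHours after the call.
function refreshedExpiration(asset: ExpiringAssetFields, calledAt: Date): Date {
  return new Date(calledAt.getTime() + asset.expirationHours * MS_PER_HOUR);
}

const sampleAsset: ExpiringAssetFields = {
  expirationHours: 24,
  expiration: new Date(Date.UTC(2021, 1, 23, 12, 0, 0)), // Feb 23, 12:00 UTC
};
const checkTime = new Date(Date.UTC(2021, 1, 24, 9, 0, 0)); // Feb 24, 09:00 UTC

console.log(isExpired(sampleAsset, checkTime)); // true
console.log(refreshedExpiration(sampleAsset, checkTime).toISOString());
// 2021-02-25T09:00:00.000Z
```

A client would typically run a check like isExpired before handing a URL to an image cache or video player, and request a refresh only when it returns true.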

The Cosync Storage module also supports non-expiring assets. However, the developer should remember that URLs to non-expiring assets present a security risk should they be leaked to the wider Internet. On the other hand, there is one less management step for dealing with non-expiring assets.

Asset upload process

A client application initiates an asset upload by first creating a CosyncAssetUpload object within a Realm partition. Our recommendation is to set the _partition key value to the private user realm for optimal security, so that no one else can eavesdrop on what is being uploaded.

To proceed with an asset upload, the client application code must fill in the uid field with the id of the user who is uploading the asset, set a unique sessionId that identifies the device from which the asset is being uploaded, and set a filePath in the Amazon S3 bucket where the asset will be placed. The API also provides two optional properties, extra and assetPartition, which store extra information about the asset and the partition in which the final CosyncAsset object should be created. The expirationHours property specifies after how many hours (a floating-point value) the asset’s URLs will expire. If this property is set to zero, the asset never expires. When a CosyncAssetUpload object is created, its status property is set to pending.
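As a rough illustration of the fields just described, the upload object could be modeled as follows. The field names come from the text above, but the TypeScript types, the UploadRequest shape, the newPendingUpload helper, and the example partition value are all assumptions. The official data models ship with the Cosync sample code for Swift, Kotlin, and React Native.

```typescript
// Field names follow the article; the TypeScript types and the helper
// below are assumptions made for illustration.
type UploadStatus = "pending" | "initialized" | "uploaded";

interface CosyncAssetUpload {
  _partition: string;      // Realm partition key, ideally the user's private partition
  uid: string;             // id of the uploading user
  sessionId: string;       // identifies the uploading device
  filePath: string;        // destination path in the Amazon S3 bucket
  extra?: string;          // optional extra information about the asset
  assetPartition?: string; // partition for the resulting CosyncAsset
  expirationHours: number; // 0 means the asset never expires
  status: UploadStatus;
}

// Hypothetical input shape for the helper.
interface UploadRequest {
  privatePartition: string;
  uid: string;
  sessionId: string;
  filePath: string;
  extra?: string;
  assetPartition?: string;
  expirationHours?: number;
}

// The article quotes 24 hours as the default; applying it here on the
// client side is an assumption.
const DEFAULT_EXPIRATION_HOURS = 24;

function newPendingUpload(req: UploadRequest): CosyncAssetUpload {
  const hours =
    req.expirationHours === undefined ? DEFAULT_EXPIRATION_HOURS : req.expirationHours;
  if (!isFinite(hours) || hours < 0) {
    throw new Error("expirationHours must be zero or a positive number");
  }
  return {
    _partition: req.privatePartition,
    uid: req.uid,
    sessionId: req.sessionId,
    filePath: req.filePath,
    extra: req.extra,
    assetPartition: req.assetPartition,
    expirationHours: hours,
    status: "pending", // every new upload starts out pending
  };
}

const pendingUpload = newPendingUpload({
  privatePartition: "user_id=abc123", // hypothetical partition value
  uid: "abc123",
  sessionId: "device-42",
  filePath: "images/sunset.jpg",
  assetPartition: "public",
  expirationHours: 0.5, // URLs expire after 30 minutes
});
console.log(pendingUpload.status); // pending
```

In a real application the returned object would be written to the synced Realm, which is what causes the backend to pick it up.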

Once the client application has created the CosyncAssetUpload object in Realm, the backend trigger CosyncAssetUploadTrigger that is attached to the MongoDB Realm application on the server will fire. This trigger computes the read and write URLs in the Amazon S3 storage service that the client can use to upload the asset. Once this computation has taken place, the trigger function sets the status property of the CosyncAssetUpload object to initialized. The application client code, in turn, listens for any changes to the CosyncAssetUpload object. Upon receiving an initialized upload object, the client proceeds to upload the asset to the write URLs that were set by the backend in the CosyncAssetUpload object. When the upload process is complete, the application client sets the status property to uploaded. Note that the client only listens for CosyncAssetUpload objects whose sessionId corresponds to the device on which the client application is running. The write URLs that are returned by the Cosync Storage service must always be used with an HTTPS PUT request. The service does not presently implement URLs that support a multipart HTTPS POST request.
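The client side of this handshake amounts to a small state machine. The sketch below shows one way to decide what the client should do when it observes a change to an upload object. It is not the Cosync client code: the UploadSnapshot shape, the single writeUrl property, and the action names are hypothetical, and the network call itself is left to the caller.

```typescript
type UploadStatus = "pending" | "initialized" | "uploaded";

// Hypothetical, simplified view of a CosyncAssetUpload change notification.
interface UploadSnapshot {
  sessionId: string;
  status: UploadStatus;
  writeUrl?: string; // assumed property name for a backend-computed write URL
}

type ClientAction =
  | { kind: "ignore" }            // upload belongs to another device
  | { kind: "wait" }              // backend has not computed the URLs yet
  | { kind: "put"; url: string }  // send the file bytes with an HTTPS PUT
  | { kind: "done" };             // nothing left for the client to do

function nextClientAction(upload: UploadSnapshot, localSessionId: string): ClientAction {
  // Only react to uploads started on this device.
  if (upload.sessionId !== localSessionId) {
    return { kind: "ignore" };
  }
  if (upload.status === "pending") {
    return { kind: "wait" };
  }
  if (upload.status === "initialized") {
    if (!upload.writeUrl) {
      throw new Error("initialized upload is missing its write URL");
    }
    return { kind: "put", url: upload.writeUrl };
  }
  return { kind: "done" };
}

const action = nextClientAction(
  { sessionId: "device-42", status: "initialized", writeUrl: "https://example.com/put-here" },
  "device-42"
);
console.log(action.kind); // put
```

After a "put" action succeeds, the client would set the status of the upload object to uploaded, which hands control back to the backend.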

The following diagram presents a visual of the asset uploading process:

Figure: Cosync Storage MongoDB Realm Interface

Note: The Cosync Storage module uses the MongoDB Realm third-party AWS Service to implement this functionality. It simply uses this service to compute the pre-signed URLs needed to upload and read asset information. The third-party service simply acts as a broker between MongoDB Realm and Amazon S3, while the Cosync Storage module packages all of this into an understandable bundle that can be seamlessly integrated into a working client application.

Expiring Asset URLs

The CosyncAsset object is created after the upload process associated with the CosyncAssetUpload object is completed. It is created automatically by the Cosync backend code once the status property of the CosyncAssetUpload is set to uploaded. The CosyncAsset object is simply there to provide a ledger record in MongoDB Realm of the uploaded asset; the actual asset itself resides in the Amazon S3 storage system. The client application code needs a CosyncAsset object to be able to retrieve the URLs associated with an asset.

A CosyncAsset object does not need to be placed within the same Realm partition as the CosyncAssetUpload. In fact, the best practice is to always place the CosyncAssetUpload object within a private user partition. The CosyncAssetUpload object has a property called assetPartition that specifies the partition in which the CosyncAsset object will be created. Remember: it is the CosyncAsset object that enables client-side application code to find the asset that is stored in Amazon S3. This object is created by the Cosync backend server-side code attached to the MongoDB Realm Application once the client has signaled an upload complete by setting the status property of the CosyncAssetUpload object to uploaded.

As far as Cosync is concerned, assets are immutable objects, i.e., once uploaded they do not change. Any change to an asset requires a second upload, and a second asset. The CosyncAsset object will record a number of readUrl(s) that permit an application to access the asset on Amazon S3. A non-expiring asset will have an expirationHours property set to zero. Expiring assets will have a property called expiration, which records the date in UTC when the asset’s readUrl(s) expire.
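The relationship between the two objects can be sketched as a function that derives the ledger record from a completed upload. This is an assumption-laden illustration and not the Cosync backend code: the record shapes, the single readUrl property, the assetFromUpload name, and the fallback to the upload's own partition when assetPartition is not set are all hypothetical.

```typescript
// Hypothetical, simplified shapes for the two objects.
interface UploadedRecord {
  _partition: string;
  assetPartition?: string;
  uid: string;
  filePath: string;
  expirationHours: number;
  status: "pending" | "initialized" | "uploaded";
}

interface AssetRecord {
  _partition: string;
  uid: string;
  filePath: string;
  expirationHours: number; // 0 for a non-expiring asset
  expiration?: Date;       // only present on expiring assets
  readUrl: string;         // assumed name; the article mentions several readUrl(s)
}

function assetFromUpload(upload: UploadedRecord, readUrl: string, createdAt: Date): AssetRecord {
  if (upload.status !== "uploaded") {
    throw new Error("asset records are only created for completed uploads");
  }
  const record: AssetRecord = {
    // Assumption: fall back to the upload's partition if none was requested.
    _partition: upload.assetPartition !== undefined ? upload.assetPartition : upload._partition,
    uid: upload.uid,
    filePath: upload.filePath,
    expirationHours: upload.expirationHours,
    readUrl: readUrl,
  };
  if (upload.expirationHours > 0) {
    record.expiration = new Date(createdAt.getTime() + upload.expirationHours * 60 * 60 * 1000);
  }
  return record;
}

const ledgerEntry = assetFromUpload(
  {
    _partition: "user_id=abc123", // hypothetical private partition
    assetPartition: "public",
    uid: "abc123",
    filePath: "images/sunset.jpg",
    expirationHours: 0,
    status: "uploaded",
  },
  "https://example.com/read-here",
  new Date(Date.UTC(2021, 1, 23, 12, 0, 0))
);
console.log(ledgerEntry._partition); // public
```

Because assets are immutable, a function like this never updates an existing record; a changed file goes through a new upload and produces a new record.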

The function CosyncRefreshAsset(assetId) is used to update the readUrl(s) on an asset that has expired. The client application should call this function and pass the id of the expired asset to bring it up to date.

Figure: Expiring Assets

Sample Application

We provide a number of sample applications for both the Cosync Storage module and the CosyncJWT service. All of the Cosync sample application code is open source and is released under the Apache 2.0 license.

The first step is to go to our public GitHub repository and download the code examples to your machine from the link CosyncSamples.

This sample app provides a simple iOS example of how to use the Cosync Storage module within a MongoDB Realm application. In order to get this example running, the developer will first have to configure a MongoDB Realm application with the Cosync Storage module, as explained in the Cosync Storage/Configure Application section of this documentation. For the purposes of running the sample application, the developer should also configure a simple Email/Password authentication provider on the MongoDB Realm instance.

The CosyncStorageSample application is relatively simple. After the user signs up and logs in, they are presented with a scrollable view of images that have been uploaded to the public partition.

Figure: Cosync Storage Sample, Asset View

As an image is uploaded, the sample application will show the upload progress in the top part of the user interface.

Figure: Cosync Storage Sample, Upload Progress

The CosyncStorageSample application supports both image and video asset types. Videos can be played directly within the scrollable asset view.

Figure: Cosync Storage Sample, Video Assets

The Cosync Storage module is available to our developers at a cost of $6/month. To sign up for the service, please go to our website at www.cosync.io. All of the client-side code that is bundled with a developer’s application is open source under the Apache 2.0 license.

Happy Realming

Richard Krueger
I have been a tech raconteur and software programmer for the past 25 years and an iOS enthusiast for the last eight years. I am the founder of Cosync, Inc.