Lots of exciting news this week in the world of Google Cloud, so let’s jump in!
Cloud Firestore gets Datastore compatibility
Let’s start with databases. Cloud Firestore is a new database offering that will eventually upgrade and replace Cloud Datastore. In case you aren’t aware, Cloud Datastore is Google’s current serverless, high-performance NoSQL database, ideal for mobile applications and other NoSQL workloads. Datastore was originally introduced with Google App Engine about ten years ago, so it might be due for a bit of an upgrade.
So why are we talking about this now? As of last week, Firestore is available to more users and offers a new mode of operation. Previously, only a ‘native’ mode was available, which uses Firestore-specific features such as real-time updates and expanded web client libraries.
The newest change is the addition of a ‘datastore’ mode, which supports your existing Datastore data with no changes necessary. Cloud Firestore in Datastore mode loses the native mode features, but it still improves on Datastore in that the data is always strongly consistent. In Datastore, queries that did not use an ancestor filter were only eventually consistent. This alone is a big improvement.
Now be aware that at some point, Cloud Firestore is going to leave beta and become the de facto replacement for Datastore. Don’t panic, though, because when that transition takes place, your existing Datastore database will be automatically upgraded to Firestore in Datastore mode with no code changes or downtime required. It should be a completely seamless and, dare I say, “invisible” upgrade.
Hortonworks partnership for hybrid Hadoop/Spark environments
Let’s move on to data analytics and new partnerships. You may be aware that the Cloud Dataproc service on Google Cloud lets you run Hadoop and Spark jobs in a managed environment. However, if you still run Hadoop ecosystem jobs both on-premises and on Google Cloud, managing both environments can be a bit of a challenge as you essentially are working with two different systems, each with its own management interface.
Enter Hortonworks, a provider of Hadoop-based data platforms. Last week, Google and Hortonworks announced a new partnership in which the Hortonworks Data Platform (HDP) is now fully supported on Google Cloud. This gives customers a wide range of large-scale data analytics solutions across hybrid and multi-cloud deployments. In simple terms, it allows you to ‘mix and match’ Hadoop distributions across multiple platforms, including Google Cloud.
Backup/replay Pub/Sub events to Cloud Storage
Finally, wouldn’t it be nice to have a handy backup of your Cloud Pub/Sub streaming data, just in case something in your pipeline isn’t working right? Well, now you can! Cloud Dataflow now has a simple template called ‘Cloud PubSub to GCS Text’, available from the web console, that streams Pub/Sub events into text files in Cloud Storage.
Now, why would you want this? Let’s say you’ve built a streaming pipeline on top of Pub/Sub and Dataflow that runs in production. You later find out that some of your streaming events are not formatted correctly, and you need to investigate without disrupting your pipeline. This capability lets you create a separate stream of your events into a Cloud Storage bucket for further analysis. You can even replay Pub/Sub events using the same feature.
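To make that investigation step a little more concrete, here is a rough Python sketch of what you might do with the archived text once you have downloaded it from your bucket. It is only an illustration under a few assumptions that are not part of the announcement: that each line in the text files holds one JSON event, that a "good" event has the hypothetical fields user_id and action, and that publish is whatever function you use to send bytes back to a topic (here it is just a list append so the example runs on its own).

```python
import json
from typing import Callable, Iterable, List, Tuple


def split_events(
    lines: Iterable[str], required_keys: Tuple[str, ...]
) -> Tuple[List[dict], List[str]]:
    """Separate well-formed events from malformed ones.

    Assumes one JSON object per line; required_keys is a hypothetical
    schema used only for this illustration.
    """
    good: List[dict] = []
    bad: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines in the archive
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            bad.append(line)  # not valid JSON at all
            continue
        if isinstance(event, dict) and all(k in event for k in required_keys):
            good.append(event)
        else:
            bad.append(line)  # valid JSON, but missing expected fields
    return good, bad


def replay(events: Iterable[dict], publish: Callable[[bytes], None]) -> int:
    """Re-send events through a caller-supplied publish function."""
    count = 0
    for event in events:
        publish(json.dumps(event, sort_keys=True).encode("utf-8"))
        count += 1
    return count


# Sample lines standing in for the contents of an archived text file.
archived_lines = [
    '{"user_id": 1, "action": "click"}',
    '{"user_id": 2}',
    "not json at all",
    "",
    '{"user_id": 3, "action": "view"}',
]

good, bad = split_events(archived_lines, required_keys=("user_id", "action"))

sent: List[bytes] = []  # stand-in for a real Pub/Sub publisher
replayed = replay(good, sent.append)

print(len(good), len(bad), replayed)  # prints: 2 2 2
```

In a real setup you would swap the sample list for the lines of the files in your bucket and pass your own publisher in place of sent.append. The malformed lines end up in bad, so you can inspect them without touching the production pipeline.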
Thank you again for checking out this week’s edition of Google Cloud Weekly. Stay tuned for the latest in Google Cloud news every week!
Check out the previous episode of Google Cloud Weekly (8/10/2018), and here’s the next episode (8/22/2018)