How to Deploy Percona Server for MongoDB for High Availability

percona

high availability

↧

An Overview of the Percona MongoDB Kubernetes Operator

November 12, 2020, 3:57 am

≫ Next: An Overview of Percona Backup for MongoDB

≪ Previous: How to Deploy Percona Server for MongoDB for High Availability

MongoDB and Kubernetes is a great combination, especially in regards to complexity. Yet, Percona's MongoDB (PSMDB) offers more flexibility for the NoSQL database, and it comes also with tools that are efficient for today's productivity; not only on-prem but also available for cloud natives.

Ther adoption rate of Kubernetes is steadily increasing. It's reasonable that a technology must have an operator to do the following: creation, modification, and deletion of items Percona Server for MongoDB environment. The Percona MongoDB Kubernetes Operator contains the necessary k8s settings to maintain a consistent Percona Server for a MongoDB instance. As an alternative option, you might compare this to https://github.com/kubedb/mongodb but the KubeDB for MongoDB does offer very limited options to offer especially on production grade systems.

The Percona Kubernetes Operators boasting its configuration are based and are following the best practices for configuration of PSMDB replica set. What matters most, the operator for MongoDB itself provides many benefits but saving time, a consistent environment are the most important. In this blog, we'll take an overview of how this is beneficial especially in a containerized environment.

What Does This Operator Can Offer?

This operator is useful for PSMDB using a replica set. This means that, your database design architecture has to conform with the following diagram below

Image borrowed from Percona's Documentation Design Overview

Currently, the supported platforms available for this operator are:

OpenShift 3.11
OpenShift 4.5
Google Kubernetes Engine (GKE) 1.15 - 1.17
Amazon Elastic Container Service for Kubernetes (EKS) 1.15
Minikube 1.10
VMWare Tanzu

Other Kubernetes platforms may also work but have not been tested.

Resource Limits

A cluster running an officially supported platform contains at least three Nodes, with the following resources:

2GB of RAM
2 CPU threads per Node for Pods provisioning
at least 60GB of available storage for Private Volumes provisioning

Security And/Or Restrictions Constraints

The Kubernetes works as when creating Pods, each Pod has an IP address in the internal virtual network of the cluster. Creating or destroying Pods are both dynamic processes, so it's not recommendable to bind your pods to specific IP addresses assigned for communication between Pods. This can cause problems as things change over time as a result of the cluster scaling, inadvertent mistakes, dc outage or disasters, or periodic maintenance, etc. In that case, the operator strictly recommends you to connect to Percona Server for MongoDB via Kubernetes internal DNS names in URI (e.g. mongodb+srv://userAdmin:userAdmin123456@<cluster-name>-rs0.<namespace>.svc.cluster.local/admin?replicaSet=rs0&ssl=false).

This PSMDB Kubernetes Operator also uses affinity/anti-affinity which provides constraints to which your pods can be scheduled to run or initiated on a specific node. Affinity defines eligible pods that can be scheduled on the node which already has pods with specific labels. Anti-affinity defines pods that are not eligible. This approach reduces costs by ensuring several pods with intensive data exchange occupy the same availability zone or even the same node or, on the contrary, to spread the pods on different nodes or even different availability zones for high availability and balancing purposes. Though the operator encourages you to set affinity/anti-affinity, yet this has limitations when using Minikube.

When using Minikube, it has the following platform-specific limitations. Minikube doesn’t support multi-node cluster configurations because of its local nature, which is in collision with the default affinity requirements of the Operator. To arrange this, the Install Percona Server for MongoDB on Minikube instruction includes an additional step which turns off the requirement of having not less than three Nodes.

In the following section of this blog, we will set up PMSDB Kubernetes Operator using Minikube and we'll follow the anti-affinity set off to make it work. How does this differ from using anti-affinity, if you change AntiAffinity you’re increasing risks for cluster availability. Let's say, if your main purpose of deploying your PSMDB to a containerized environment is to spread and have higher-availability yet scalability, then this might defeat the purpose. Yet using Minikube especially on-prem and for testing your PSMDB setup is doable but for production workloads you surely want to run nodes on separate hosts, or in an environment setup such that in a way concurrent failure of multiple pods is unlikely to happen.

Data In Transit/Data At Rest

For data security with PSMDB, the operator offers TLS/SSL for in transit, then also offers encryption when data is at rest. For in transit, options you can choose is to use cert-manager, or generate your own certificate manually. Of course, you can optionally use PSMDB without TLS for this operator. Checkout their documentation with regards to using TLS.

For data at rest, it requires changes within their PSMDB Kubernetes Operator after you downloaded the github branch, then apply changes on the deploy/cr.yaml file. To enable this, do the following as suggested by the documentation:

The security.enableEncryption key should be set to true (the default value).
Thesecurity.encryptionCipherMode key should specify proper cipher mode for decryption. The value can be one of the following two variants:
- AES256-CBC (the default one for the Operator and Percona Server for MongoDB)
- AES256-GCM

security.encryptionKeySecret should specify a secret object with the encryption key:

mongod:

  ...

  security:

    ...

    encryptionKeySecret: my-cluster-name-mongodb-encryption-key

Encryption key secret will be created automatically if it doesn’t exist. If you would like to create it yourself, take into account that the key must be a 32 character string encoded in base64.

Storing Sensitive Information

The PSMDB Kubernetes Operator uses Kubernetes Secrets to store and manage sensitive information. The Kubernetes Secrets let you store and manage sensitive information, such as passwords, OAuth tokens, and ssh keys. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a pod definition or in a container image.

For this operator, the user and passwords generated for your pods are stored and can be obtained using kubectl get secrets <your-secret-object-name> -o yaml set in your deploy/cr.yaml.

For this blog, my example setup achieves the following with the decoded base64 result.

 kubectl get secrets mongodb-cluster-s9s-secrets -o yaml | egrep '^\s+MONGODB.*'|cut -d ':' -f2 | xargs -I% sh -c "echo % | base64 -d; echo "

WrDry6bexkCPOY5iQ

backup

gAWBKkmIQsovnImuKyl

clusterAdmin

qHskMMseNqU8DGbo4We

clusterMonitor

TQBEV7rtE15quFl5

userAdmin

Each entry for backup, cluster user, cluster monitor user, and the user for administrative usage shows based on the above result.

Another thing also is that, PSMDB Kubernetes Operator stores also the AWS S3 access and secret keys through Kubernetes Secrets.

Backups

This operator supports backups which is a very nifty feature. It supports on-demand (manual) backup and scheduled backup and uses the backup tool Percona Backup for MongoDB. Take note that backups are only stored on AWS S3 or any S3-compatible storage.

Backups scheduled can be defined through the deploy/cr.yaml file, whereas taking a manual backup can be done anytime whenever the demand is necessary. For S3 access and secret keys, it shall be defined in the file deploy/backup-s3.yaml file and uses Kubernetes Secrets to store the following information as we mentioned earlier.

All actions supported for this PSMDB Kubernetes Operator are the following:

Making scheduled backups
Making on-demand backup
Restore the cluster from a previously saved backup
Delete the unneeded backup

Using PSMDB Kubernetes Operator With Minikube

In this section, we'll keep a simple setup using Kubernetes with Minikube, which you can use on-prem without the need for a cloud provider. For cloud-native setup especially for a more enterprise and production grade environment, you can go checkout their documentation.

Before we proceed with the steps, keep in mind that as mentioned above, there has been a known limitation with Minikube as it doesn't support multi-node cluster configuration which is in collision with the default affinity requirements of the operator. We'll mention this on how to handle it in the following steps below.

For this blog, the host OS where our Minikube will be installed is on Ubuntu 18.04 (Bionic Beaver).

Let's Install Minikube

$ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube_latest_amd64.deb

$ sudo dpkg -i minikube_latest_amd64.deb

Optionally, you can follow the steps here if you are on different Linux systems.

Let's add the required key to authenticate our Kubernetes packages and setup the repository

$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

$ cat <<eof > /etc/apt/sources.list.d/kubernetes.list

deb https://apt.kubernetes.io/ kubernetes-xenial main

deb https://apt.kubernetes.io/ kubernetes-yakkety main

eof

Now let's install the required packages

$ sudo apt-get update

$ sudo apt-get install kubelet kubeadm kubectl

Start the Minikube defining the memory, the number of CPU's and the CIDR for which my nodes shall be assigned,

$ minikube start --memory=4096 --cpus=3 --extra-config=kubeadm.pod-network-cidr=192.168.0.0/16

The example results shows like,

minikube v1.14.2 on Ubuntu 18.04

Automatically selected the docker driver

docker is currently using the aufs storage driver, consider switching to overlay2 for better performance

Starting control plane node minikube in cluster minikube

Creating docker container (CPUs=3, Memory=4096MB) ...

Preparing Kubernetes v1.19.2 on Docker 19.03.8 ...

kubeadm.pod-network-cidr=192.168.0.0/16

 > kubeadm.sha256: 65 B / 65 B [--------------------------] 100.00% ? p/s 0s

 > kubectl.sha256: 65 B / 65 B [--------------------------] 100.00% ? p/s 0s

 > kubelet.sha256: 65 B / 65 B [--------------------------] 100.00% ? p/s 0s

 > kubeadm: 37.30 MiB / 37.30 MiB [---------------] 100.00% 1.46 MiB p/s 26s

 > kubectl: 41.01 MiB / 41.01 MiB [---------------] 100.00% 1.37 MiB p/s 30s

 > kubelet: 104.88 MiB / 104.88 MiB [------------] 100.00% 1.53 MiB p/s 1m9s

Verifying Kubernetes components...

Enabled addons: default-storageclass, storage-provisioner

Done! kubectl is now configured to use "minikube" by default

As you noticed, it does as well install the utility tools to manage and administer your nodes or pods.

Now, let's check the nodes and pods by running the following commands,

$ kubectl get pods -A

NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE

kube-system   coredns-f9fd979d6-gwngd            1/1     Running   0          45s

kube-system   etcd-minikube                      0/1     Running   0          53s

kube-system   kube-apiserver-minikube            1/1     Running   0          53s

kube-system   kube-controller-manager-minikube   0/1     Running   0          53s

kube-system   kube-proxy-m25hm                   1/1     Running   0          45s

kube-system   kube-scheduler-minikube            0/1     Running   0          53s

kube-system   storage-provisioner                1/1     Running   1          57s

$ kubectl get nodes -owide

NAME       STATUS   ROLES    AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION      CONTAINER-RUNTIME

minikube   Ready    master   2d4h   v1.19.2   192.168.49.2   <none>        Ubuntu 20.04 LTS   4.15.0-20-generic   docker://19.3.8

Now, download the PSMDB Kubernetes Operator,

$ git clone -b v1.5.0 https://github.com/percona/percona-server-mongodb-operator

$ cd percona-server-mongodb-operator

We're now ready to deploy the operator,

$ kubectl apply -f deploy/bundle.yaml

As mentioned earlier, Minikube's limitations require adjustments to make things run as expected. Let's do the following:

Depending on your current hardware capacity, you might change the following as suggested by the documentation. Since minikube runs locally, the default deploy/cr.yaml file should be edited to adapt the Operator for the local installation with limited resources. Change the following keys in the replsets section:
- comment resources.requests.memory and resources.requests.cpu keys (this will fit the Operator in minikube default limitations)
- set affinity.antiAffinityTopologyKey key to "none" (the Operator will be unable to spread the cluster on several nodes)
Also, switch allowUnsafeConfigurations key to true (this option turns off the Operator’s control over the cluster configuration, making it possible to deploy Percona Server for MongoDB as a one-node cluster).

Now, we're ready to apply the changes made to the deploy/cr.yaml file.

$ kubectl apply -f deploy/cr.yaml

At this point, you might be able to check the status of the pods and you'll notice the following progress just like below,

$ kubectl get pods

NAME                                              READY   STATUS              RESTARTS   AGE

percona-server-mongodb-operator-588db759d-qjv29   0/1     ContainerCreating   0          15s



$ kubectl get pods

NAME                                              READY   STATUS     RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         0/2     Init:0/1   0          4s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running    0          34s



$ kubectl get pods

NAME                                              READY   STATUS            RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         0/2     PodInitializing   0          119s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running           0          2m29s



kubectl get pods

NAME                                              READY   STATUS            RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         0/2     PodInitializing   0          2m1s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running           0          2m31s



kubectl get pods

NAME                                              READY   STATUS    RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         2/2     Running   0          33m

mongodb-cluster-s9s-rs0-1                         2/2     Running   1          31m

mongodb-cluster-s9s-rs0-2                         2/2     Running   0          30m

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running   0          33m

Now that we're almost there. We'll get the generated secrets by the operator so that we can connect to the created PSMDB pods. To do that, you need to list the secret objects first, then get the value of the yaml so you can get the user/password combination. On the other hand, you can use the combined command below with the username:password format. See the example below,

$ kubectl get secrets

NAME                                          TYPE                                  DATA   AGE

default-token-c8frr                           kubernetes.io/service-account-token   3      2d4h

internal-mongodb-cluster-s9s-users            Opaque                                8      2d4h

mongodb-cluster-s9s-mongodb-encryption-key    Opaque                                1      2d4h

mongodb-cluster-s9s-mongodb-keyfile           Opaque                                1      2d4h

mongodb-cluster-s9s-secrets                   Opaque                                8      2d4h

percona-server-mongodb-operator-token-rbzbc   kubernetes.io/service-account-token   3      2d4h



$ kubectl get secrets mongodb-cluster-s9s-secrets -o yaml | egrep '^\s+MONGODB.*'|cut -d ':' -f2 | xargs -I% sh -c "echo % | base64 -d; echo" |sed 'N; s/\(.*\)\n\(.*\)/

\2:\1/'

backup:WrDry6bexkCPOY5iQ

clusterAdmin:gAWBKkmIQsovnImuKyl

clusterMonitor:qHskMMseNqU8DGbo4We

userAdmin:TQBEV7rtE15quFl5

Now, you can base the username:password format and save it somewhere securely.

Since we cannot directly connect to the Percona Server for MongoDB nodes, we need to create a new pod which has the mongodb client,

$ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:4.2.8-8 --restart=Never -- bash -il

Lastly, we're now ready to connect to our PSMDB nodes now,

bash-4.2$ mongo "mongodb+srv://userAdmin:tujYD1mJTs4QWIBh@mongodb-cluster-s9s-rs0.default.svc.cluster.local/admin?replicaSet=rs0&ssl=false"

Alternatively, you can connect to the individual nodes and check it's health. For example,

bash-4.2$ mongo --host "mongodb://clusterAdmin:gAWBKkmIQsovnImuKyl@mongodb-cluster-s9s-rs0-2.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017/?authSource=admin&ssl=false"

Percona Server for MongoDB shell version v4.2.8-8

connecting to: mongodb://mongodb-cluster-s9s-rs0-2.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017/?authSource=admin&compressors=disabled&gssapiServiceName=mongodb&ssl=false

Implicit session: session { "id" : UUID("9b29b9b3-4f82-438d-9857-eff145be0ee6") }

Percona Server for MongoDB server version: v4.2.8-8

Welcome to the Percona Server for MongoDB shell.

For interactive help, type "help".

For more comprehensive documentation, see

        https://www.percona.com/doc/percona-server-for-mongodb

Questions? Try the support group

        https://www.percona.com/forums/questions-discussions/percona-server-for-mongodb

2020-11-09T07:41:59.172+0000 I  STORAGE  [main] In File::open(), ::open for '/home/mongodb/.mongorc.js' failed with No such file or directory

Server has startup warnings:

2020-11-09T06:41:16.838+0000 I  CONTROL  [initandlisten] ** WARNING: While invalid X509 certificates may be used to

2020-11-09T06:41:16.838+0000 I  CONTROL  [initandlisten] **          connect to this server, they will not be considered

2020-11-09T06:41:16.838+0000 I  CONTROL  [initandlisten] **          permissible for authentication.

2020-11-09T06:41:16.838+0000 I  CONTROL  [initandlisten]

rs0:SECONDARY> rs.status()

{

        "set" : "rs0",

        "date" : ISODate("2020-11-09T07:42:04.984Z"),

        "myState" : 2,

        "term" : NumberLong(5),

        "syncingTo" : "mongodb-cluster-s9s-rs0-0.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

        "syncSourceHost" : "mongodb-cluster-s9s-rs0-0.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

        "syncSourceId" : 0,

        "heartbeatIntervalMillis" : NumberLong(2000),

        "majorityVoteCount" : 2,

        "writeMajorityCount" : 2,

        "optimes" : {

                "lastCommittedOpTime" : {

                        "ts" : Timestamp(1604907723, 4),

                        "t" : NumberLong(5)

                },

                "lastCommittedWallTime" : ISODate("2020-11-09T07:42:03.395Z"),

                "readConcernMajorityOpTime" : {

                        "ts" : Timestamp(1604907723, 4),

                        "t" : NumberLong(5)

                },

                "readConcernMajorityWallTime" : ISODate("2020-11-09T07:42:03.395Z"),

                "appliedOpTime" : {

                        "ts" : Timestamp(1604907723, 4),

                        "t" : NumberLong(5)

                },

                "durableOpTime" : {

                        "ts" : Timestamp(1604907723, 4),

                        "t" : NumberLong(5)

                },

                "lastAppliedWallTime" : ISODate("2020-11-09T07:42:03.395Z"),

                "lastDurableWallTime" : ISODate("2020-11-09T07:42:03.395Z")

        },

        "lastStableRecoveryTimestamp" : Timestamp(1604907678, 3),

        "lastStableCheckpointTimestamp" : Timestamp(1604907678, 3),

        "members" : [

                {

                        "_id" : 0,

                        "name" : "mongodb-cluster-s9s-rs0-0.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "health" : 1,

                        "state" : 1,

                        "stateStr" : "PRIMARY",

                        "uptime" : 3632,

                        "optime" : {

                                "ts" : Timestamp(1604907723, 4),

                                "t" : NumberLong(5)

                        },

                        "optimeDurable" : {

                                "ts" : Timestamp(1604907723, 4),

                                "t" : NumberLong(5)

                        },

                       "optimeDate" : ISODate("2020-11-09T07:42:03Z"),

                        "optimeDurableDate" : ISODate("2020-11-09T07:42:03Z"),

                        "lastHeartbeat" : ISODate("2020-11-09T07:42:04.246Z"),

                        "lastHeartbeatRecv" : ISODate("2020-11-09T07:42:03.162Z"),

                        "pingMs" : NumberLong(0),

                        "lastHeartbeatMessage" : "",

                        "syncingTo" : "",

                        "syncSourceHost" : "",

                        "syncSourceId" : -1,

                        "infoMessage" : "",

                        "electionTime" : Timestamp(1604904092, 1),

                        "electionDate" : ISODate("2020-11-09T06:41:32Z"),

                        "configVersion" : 3

                },

                {

                        "_id" : 1,

                        "name" : "mongodb-cluster-s9s-rs0-1.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "health" : 1,

                        "state" : 2,

                        "stateStr" : "SECONDARY",

                        "uptime" : 3632,

                        "optime" : {

                                "ts" : Timestamp(1604907723, 4),

                                "t" : NumberLong(5)

                        },

                        "optimeDurable" : {

                                "ts" : Timestamp(1604907723, 4),

                                "t" : NumberLong(5)

                        },

                        "optimeDate" : ISODate("2020-11-09T07:42:03Z"),

                        "optimeDurableDate" : ISODate("2020-11-09T07:42:03Z"),

                        "lastHeartbeat" : ISODate("2020-11-09T07:42:04.244Z"),

                        "lastHeartbeatRecv" : ISODate("2020-11-09T07:42:04.752Z"),

                        "pingMs" : NumberLong(0),

                        "lastHeartbeatMessage" : "",

                        "syncingTo" : "mongodb-cluster-s9s-rs0-2.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "syncSourceHost" : "mongodb-cluster-s9s-rs0-2.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "syncSourceId" : 2,

                        "infoMessage" : "",

                        "configVersion" : 3

                },

                {

                        "_id" : 2,

                        "name" : "mongodb-cluster-s9s-rs0-2.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "health" : 1,

                        "state" : 2,

                        "stateStr" : "SECONDARY",

                        "uptime" : 3651,

                        "optime" : {

                                "ts" : Timestamp(1604907723, 4),

                                "t" : NumberLong(5)

                        },

                        "optimeDate" : ISODate("2020-11-09T07:42:03Z"),

                        "syncingTo" : "mongodb-cluster-s9s-rs0-0.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "syncSourceHost" : "mongodb-cluster-s9s-rs0-0.mongodb-cluster-s9s-rs0.default.svc.cluster.local:27017",

                        "syncSourceId" : 0,

                        "infoMessage" : "",

                        "configVersion" : 3,

                        "self" : true,

                        "lastHeartbeatMessage" : ""

                }

        ],

        "ok" : 1,

        "$clusterTime" : {

                "clusterTime" : Timestamp(1604907723, 4),

                "signature" : {

                        "hash" : BinData(0,"HYC0i49c+kYdC9M8KMHgBdQW1ac="),

                        "keyId" : NumberLong("6892206918371115011")

                }

        },

        "operationTime" : Timestamp(1604907723, 4)

}

Since the operator manages the consistency of the cluster, whenever a failure or let say a pod has been deleted. The operator will automatically initiate a new one. For example,

$ kubectl get po

NAME                                              READY   STATUS    RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         2/2     Running   0          2d5h

mongodb-cluster-s9s-rs0-1                         2/2     Running   0          2d5h

mongodb-cluster-s9s-rs0-2                         2/2     Running   0          2d5h

percona-client                                    1/1     Running   0          3m7s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running   0          2d5h

$ kubectl delete po mongodb-cluster-s9s-rs0-1

pod "mongodb-cluster-s9s-rs0-1" deleted

$ kubectl get po

NAME                                              READY   STATUS     RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         2/2     Running    0          2d5h

mongodb-cluster-s9s-rs0-1                         0/2     Init:0/1   0          3s

mongodb-cluster-s9s-rs0-2                         2/2     Running    0          2d5h

percona-client                                    1/1     Running    0          3m29s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running    0          2d5h

$ kubectl get po

NAME                                              READY   STATUS            RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         2/2     Running           0          2d5h

mongodb-cluster-s9s-rs0-1                         0/2     PodInitializing   0          10s

mongodb-cluster-s9s-rs0-2                         2/2     Running           0          2d5h

percona-client                                    1/1     Running           0          3m36s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running           0          2d5h

$ kubectl get po

NAME                                              READY   STATUS    RESTARTS   AGE

mongodb-cluster-s9s-rs0-0                         2/2     Running   0          2d5h

mongodb-cluster-s9s-rs0-1                         2/2     Running   0          26s

mongodb-cluster-s9s-rs0-2                         2/2     Running   0          2d5h

percona-client                                    1/1     Running   0          3m52s

percona-server-mongodb-operator-588db759d-qjv29   1/1     Running   0          2d5h

Now that we're all set. Of course, you might need to expose the port so you might have to deal with adjustments in deploy/cr.yaml. You can refer here to deal with it.

Conclusion

The Percona Kubernetes Operator for PSMDB can be your complete solution especially for containerized environments for your Percona Server for MongoDB setup. It's almost a complete solution as it has redundancy built-in for your replica set yet the operator supports backup, scalability, high-availability, and security.

Tags:

deployment

↧

An Overview of Percona Backup for MongoDB

November 19, 2020, 12:11 pm

≫ Next: Monitoring Percona Server for MongoDB - Key Metrics

≪ Previous: An Overview of the Percona MongoDB Kubernetes Operator

The known and popular backup method for MongoDB is mongodump. It is a logical backup method, similar to mysqldump in MySQL or pg_dump in PostgreSQL database. There is another backup tool called Percona Backup for MongoDB. It has support for replica sets and shard clusters, as well as more advanced features like point in time recovery.

It is important to note that it performs a consistent backup for your mongodb sharded cluster, and also supports S3 compatible object storage to store the backups. In this blog, we will discuss architecture, installation and usage of Percona Backup for MongoDB.

Architecture

Percona Backup for MongoDB consists of two components, the first one is a process utility that needs to be installed on each MongoDB node, called pbm-agent. The pbm-agent acts to coordinate between the database nodes, running the backup and restore process. It also checks if the node is the correct node to take the backup. The pbm-agent requires a specific user with some role privileges; eg: readWrite, backup, clusterMonitor, and restore. It also needs to create a new role for pbm with action type anyAction and resource type anyResource. The user must exist on each node in the replica set and also in the config server if you use sharded cluster architecture. Percona Backup for MongoDB uses a MongoDB URI connection string method to connect to the database, which is why it requires credential access at the first time.

The other component is the command line interface called pbm. pbm utility triggers the backup-related actions, e.g., execute backup, restore, list backup, delete, and so on. Before you work with pbm, you would need to configure backup options, restore options, and point in time recovery options.

The config file itself is stored in a YAML file, and pbm config command is used to load the configuration file. Some of commands for the pbm utility are shown below:

pbm config, the command used to configure the backup option before executed.
pbm backup, used to take a backup of MongoDB. It supports some compression methods such as gzip, pgzip, lz4, snappy.
pbm restore, the command used for restoring a backup to a node.
pbm list, listing of the current backup files.
pbm cancel-backup, used to cancel the running backup process.
pbm delete-backup, used to delete backup files. There are two options; you can specify the file name of backup to delete or delete backup files that are older than a certain age.

Installation of Percona Backup for MongoDB

There are two ways you can install Percona Backup for MongoDB, you can use the package manager from the operating system and use the official Percona Repository for installing the software, or you can build from source code.

As prerequisites before installing the pbm through the yum/apt install, you need to configure the Percona Repository, and then after that, enable pbm repository:

[root@n8 ~]# percona-release enable pbm release

* Enabling the Percona Backup MongoDB repository

<*> All done!

Then, install Percona Backup for MongoDB. In this case, I am using the CentOS based operating system so we’ll do yum install:

[root@n8 ~]# yum install percona-backup-mongodb

Loaded plugins: fastestmirror

Repository percona-release-noarch is listed more than once in the configuration

Repository percona-release is listed more than once in the configuration

Repository percona-release-noarch is listed more than once in the configuration

Repository percona-release-source is listed more than once in the configuration

(1/10): extras/7/x86_64/primary_db                                                                                                         | 222 kB  00:00:00

(2/10): pbm-release-x86_64/7/primary_db                                                                                                    | 4.2 kB  00:00:02

(3/10): percona-tools-release/7/x86_64/primary_db                                                                                          |  84 kB  00:00:00

(4/10): tools-release-x86_64/7/primary_db                                                                                                  |  84 kB  00:00:00

(5/10): percona-release-x86_64/7/primary_db                                                                                                | 1.1 MB  00:00:06

(6/10): percona-release/7/x86_64/primary_db                                                                                                | 1.1 MB  00:00:08

(7/10): base/7/x86_64/primary_db                                                                                                           | 6.1 MB  00:00:11

(8/10): updates/7/x86_64/primary_db                                                                                                        | 2.5 MB  00:00:08

(9/10): epel/x86_64/updateinfo                                                                                                             | 1.0 MB  00:00:13

(10/10): epel/x86_64/primary_db                                                                                                            | 6.9 MB  00:00:07

Loading mirror speeds from cached hostfile

 * base: mirror.telkomuniversity.ac.id

 * epel: ftp.jaist.ac.jp

 * extras: mirror.telkomuniversity.ac.id

 * updates: mirror.telkomuniversity.ac.id

Resolving Dependencies

--> Running transaction check

---> Package percona-backup-mongodb.x86_64 0:1.3.3-1.el7 will be installed

--> Finished Dependency Resolution



Dependencies Resolved



==================================================================================================================================================================

 Package                                       Arch                          Version                              Repository                                 Size

==================================================================================================================================================================

Installing:

 percona-backup-mongodb                        x86_64                        1.3.3-1.el7                          pbm-release-x86_64                         16 M



Transaction Summary

==================================================================================================================================================================

Install  1 Package



Total download size: 16 M

Installed size: 61 M

Is this ok [y/d/N]: y

Downloading packages:

percona-backup-mongodb-1.3.3-1.el7.x86_64.rpm                                                                                              |  16 MB  00:00:55

Running transaction check

Running transaction test

Transaction test succeeded

Running transaction

  Installing : percona-backup-mongodb-1.3.3-1.el7.x86_64                                                                                                      1/1

  Verifying  : percona-backup-mongodb-1.3.3-1.el7.x86_64                                                                                                      1/1



Installed:

  percona-backup-mongodb.x86_64 0:1.3.3-1.el7



Complete!

After it is complete, you can configure the pbm-agent as a background process and play around with the pbm command line interface:

[root@n7 ~]# pbm

usage: pbm [<flags>] <command> [<args> ...]



Percona Backup for MongoDB



Flags:

  --help                     Show context-sensitive help (also try

                             --help-long and --help-man).

  --mongodb-uri=MONGODB-URI  MongoDB connection string (Default =

                             PBM_MONGODB_URI environment variable)

  --compression=s2           Compression type

                             <none>/<gzip>/<snappy>/<lz4>/<s2>/<pgzip>



Commands:

  help [<command>...]

    Show help.



  config [<flags>] [<key>]

    Set, change or list the config



  backup

    Make backup



  restore [<flags>] [<backup_name>]

    Restore backup



  cancel-backup

    Restore backup



  list [<flags>]

    Backup list



  delete-backup [<flags>] [<name>]

    Delete a backup



  version [<flags>]

    PBM version info

Backup in Action

Before taking a backup of MongoDB, ensure the pbm-agent is running on each node and the backup config has been set as shown below, set the path of the backup:

[root@n8 ~]# pbm config --file=/root/config.yaml --mongodb-uri "mongodb://pbmuser:***@localhost:27017/"

[Config set]

------

pitr:

  enabled: false

storage:

  type: filesystem

  filesystem:

    path: /data/backups

And test run the backup on one of the secondary nodes:

[root@n8 ~]# pbm backup --mongodb-uri "mongodb://pbmuser:*****@localhost:27017/" --compression=pgzip

Starting backup '2020-11-13T15:28:49Z'...................

Backup '2020-11-13T15:28:49Z' to remote store '/data/backups' has started

That’s all for now. Coming soon, ClusterControl 1.8.1 will allow you to schedule and manage your MongoDB clusters using Percona Backup for MongoDB.

Tags:

percona

backup management

↧

Monitoring Percona Server for MongoDB - Key Metrics

November 26, 2020, 2:45 am

≫ Next: MongoDB Backup Management Tips for Sharded Clusters

≪ Previous: An Overview of Percona Backup for MongoDB

When running critical database services in production, we need to know and monitor the database. You need to understand the key metrics in the database that you are using. For example, when you run MongoDB with a WiredTiger storage engine, you need to know connections, authentication, operations, replication lag, page faults, locking, etc.

In this blog, we will explain some key metrics that are used to monitor Percona Server for MongoDB.

Database Connections

Database connections is one of important key metrics in any database. It monitors your current connections/threads from applications to the database. You can check current connections through the below command:

> db.serverStatus().connections

It provides an idea of how much your applications are accessing the database. A sudden spike in the number of connections can cause problems to your database servers problem. Is it expected or not?.

ClusterControl provides information related to the connections as shown below:

Command Operations

The command operations metrics monitor your current operations, either insert, update, delete, or select. You can monitor current command operations by executing the below command:

>db.serverStatus().opcounters

From the command operations, you really can see your application workload, is it heavy write or read. From this standpoint, you can make some decisions, e.g., if you have heavy read traffic, you may want to scale the secondary nodes to distribute the read queries.

Monitoring command operations in ClusterControl is straightforward, you just need to enable Agent Based Monitoring to see the OpsCounter metrics in the MongoDB Server dashboard as shown below:

ReplicaSet Lags

When you run ReplicaSet or ShardCluster architecture, one important key metric is the replication lag. Replication lag occurs when secondary nodes can not catch up with the data being written to the primary node. The reasons for this may vary from network latency, disk throughput, slow queries, etc.

You can check current replication lag information by running the below command on the primary node:

> rs.printSlaveReplicationInfo()

The lag information metrics is in seconds, so on heavy concurrent connections, it may lag by a few seconds on secondary nodes since the replication in mongodb is asynchronous.

In ClusterControl, the metrics Max Replication Lag can be found in the MongoDB Replicaset monitoring dashboard.

Page Faults

Page Faults mostly occur in high concurrent and high load applications. Page Faults happen when the mongodb process wants to get the data but it is not available in the memory, then the mongodb server reads the data from disk.

To monitor the current state of page faults, you can use the below command:

>db.serverStatus().extra_info.page_faults

It gives you the number of page faults. The value might increase during heavy load and the server might experience poor performance. You might want to check on the slow query log too.

Locking

Locking is also an important metric in MongoDB, it usually happens in high load applications with multiple transactions on the same dataset. Locking can cause serious performance issues.

You can check current locking operations in the database using below command:

>db.currentOp()

When we run db.currentOp() command, there is some information related to locking. ClusterControl monitors for Global Lock in its MongoDB dashboard as shown below:

Conclusion

These are some of the important key metrics to monitor in Percona Server for MongoDB. They provide a real time view of what is going on in the server and can uncover any anomalies that you can take action upon. ClusterControl provides some dashboards which give you visibility of your MongoDB databases.

Tags:

database monitoring

↧

MongoDB Backup Management Tips for Sharded Clusters

December 3, 2020, 12:15 pm

≫ Next: Introducing Lookup Charts in MongoDB

≪ Previous: Monitoring Percona Server for MongoDB - Key Metrics

Making proper backups of the database is a critical task. Besides setting the high availability architecture of your MongoDB for database services, you also need to have backups of your databases to ensure the availability of data in case of disaster. For example, if you accidentally delete some data from a production database, the only way to recover the data from the database point of view is to restore from backup.

Recently, ClusterControl started to support a new backup method, called Percona Backup for MongoDB, developed by Percona. It can run consistent backups for MongoDB Replica Sets and Sharded Clusters.

In this blog, we will have a look at backup management for MongoDB Replica Sets and Sharded Clusters.

MongoDB Backup in Highly Available Architecture

ClusterControl supports 3 backup methods, which are mongodump, mongodb consistent and Percona Backup for Mongodb. The mongodb consistent backup is using mongodump utility as the backup method, and the backup can be restored using mongorestore.

The latest supported backup method is Percona Backup for Mongodb for consistent and point in time backups of Replica Set and Sharded Clusters, it requires an agent to run on every node or replica set or shard nodes and management nodes for shard clusters as described in here.

Configuring and scheduling consistent backup using Percona Backup for Mongodb in ClusterControl is very easy. Go to the Backup page, and then configure the Percona Backup for Mongodb. The prerequisite is to have Percona Backup for MongoDB running on each node, which can also be installed from ClusterControl.

We need to install the Percona Backup for MongoDB agent first before being able to Schedule Backup as below:

And then configure the backup directory. Please take a note that the backup directory has to be a shared disk that has been mounted on all nodes with exactly the same mounted path as below:

If you do not have any kind of shared disk ready in the system, you can use NFS to accomplish this. For configuring the NFS server, we need a dedicated server / virtual machine with enough free space to store the backup. Install the nfs-utils and nfs-utils-lib library in the server as below (assuming we are using the CentOS based):

[root@nfs-server ~]# yum install nfs-utils nfs-utils-lib

[root@nfs-server ~]# yum install portmap

And start the portmap and nfs services.

[root@nfsserver ~]# /etc/init.d/portmap start

[root@nfsserver ~]# /etc/init.d/nfs start

After that, add new entries in /etc/exports as shown below:

[root@nfsserver ~]# vi /etc/exports

/backup 10.10.10.11(rw,sync,no_root_squash)

On the database node, we just need to mount the storage disk as shared storage.

Last thing, just click the install button and it will trigger a new job to configure the agent on each node.

After all PBM ggent is installed, we can configure the backup method for the cluster as below:

Physical vs Logical Backup

MongoDB backup supports logical backup and physical backup. The method for logical backup by using the mongodump utility is included when you install the mongodb package. Mongodump needs an access to your mongodb database, thus it requires credential access for mongodump with backup roles privileges and must have grant find action to backup the database.

It works for BSON data dump formats.The mongodump will connect to your database with credentials provided, read the entire data in your database and dump the data into files. Since it is a single threaded process, it will take longer for the backup especially with a large size of database. Mongodump does not maintain the atomicity of transactions across the shards, that is why it can not be used as a backup strategy for mongodb version 4.2 and above in a sharded cluster. Percona Backup for MongoDB is a logical backup but it supports consistent backups of clusters.

Physical backup in MongoDB works through the snapshot of the mongodb file systems, it is copying the underlying mongodb files to another location as base backup of your mongodb database. The file system snapshot are operating system if you use LVM (Logical Volume Manager) as software for managing your disk layout and device, or software appliance eg. Veritas, or NetApp Backup. You must enable journaling, the changes activity log in mongodb before running the file system snapshot to make the backup consistent.

Besides the filesystem snapshot, you can also use the cp or rsync command to copy MongoDB data files, but you need to stop the write process to mongodb because the process of copying datafiles is not an atomic operation. The backup can not be used for Point in Time Recovery in Replica Sets or Sharded Cluster architectures.

Percona Backup for MongoDB consists of two components, the pbm-agent that needs to be installed on each node and the pbm as a command line interface to interact and run the backups.The pbm-agent coordinate between the database nodes and running the backup and restoration process. The pbm-agent will decide the best node for taking the backup.

PITR Backup

In many database systems, it is common to use a checkpoint to flush the data into the disk. MongoDB uses WiredTiger storage engine as a default storage engine and also uses checkpoints to provide a consistent view of data. Not only that, the checkpoint in MongoDB can be used to recover from the last checkpoint. The journaling works between each checkpoint, journaling is required to recover from unexpected outages that happen at any time between the checkpoints. Journaling guarantees the write operations are logged to disk, MongoDB will create a journal entry for every change, including the bytes that changed and the disk location.

Mongodump and mongorestore can be used for point in time recovery backup, there is an option to leverage the oplog. The oplog is a capped collection in MongoDB which tracks all the changes in collections for every write transactions (eg. insert, update, delete). So, if you want to do point in time recovery, you need to restore from the last full backup and also use the oplog file to apply the changes to the exact time you want to recover. Another tool that can be used is the Percona Backup for MongoDB, the process is similar like mongodump, we need to restore from the backup and then apply the oplog.

Conclusion

Taking a consistent backup is important, especially in clustered MongoDB setups (replica set or sharded cluster). ClusterControl provides an easy way to configure the Percona Backup for MongoDB in your cluster and schedule your backups.

Tags:

↧

Introducing Lookup Charts in MongoDB

December 24, 2020, 2:45 am

≫ Next: Managing Journaling in MongoDB

≪ Previous: MongoDB Backup Management Tips for Sharded Clusters

If you aren’t familiar with MongoDB, it’s a document-oriented NoSQL data model, which uses documents rather than tables and rows like you would find with relational tables.

As such, due to the unique way it’s built, MongoDB is one of the best data models for high-performance databasing with great scalability. Of course, that doesn’t mean that it doesn’t have competition, and MongoDB is often compared to Firebase or Cassandra.

Of course, the problem then becomes that any query request on a datastore that large can be problematic, and requires a certain level of being a query guru.

Thankfully, MongoDB has introduced a whole new feature that not only does away with code querying, it also makes it as simple as a few clicks. That means that you don’t have to expend a lot of time and hassle to do the same type of querying and joining as you would normally.

Traditional Querying in MongoDB

While document-oriented databases are already incredibly flexible, there are probably still situations where you might need live data in multiple collections. For example, one collection could contain user data and another could contain user activity. This could even be expanded to have several collections of data for different applications, websites, and so on.

That is why MongoDB Query Language (MQL) was born, and it provided a way for programmers to create complex queries. In fact, MongoDB has a whole page for query documents and how to run them. If you aren’t familiar with it, here’s a quick step-by-step process of how it works, so you can compare to the new LookUp Charts later:

First, you need to connect your MongoDB instance by passing the URI to the Mongo shell and then using --password

mongo.exe mongodb://$[hostlist]/$[database]?authSource=$[authSource] --username $[username]

Second, switch to the Database, in this case, we’ll use a hypothetical ‘test’ database

use test

At this point, you would load more data into MongoDB if needed. You can do that with the insertMany() method:

db.inventory.insertMany( [

   { "item": "journal", "qty": 25, "size": { "h": 14, "w": 21, "uom": "cm" }, "status": "A" },

    { "item": "notebook", "qty": 50, "size": { "h": 8.5, "w": 11, "uom": "in" }, "status": "A" },

    { "item": "paper", "qty": 100, "size": { "h": 8.5, "w": 11, "uom": "in" }, "status": "D" },

    { "item": "planner", "qty": 75, "size": { "h": 22.85, "w": 30, "uom": "cm" }, "status": "D" },

    { "item": "postcard", "qty": 45, "size": { "h": 10, "w": 15.25, "uom": "cm" }, "status": "A" }

]);

Then comes the actual query through retrieving documents in a specific collection:

myCursor = db.inventory.find( { status: "D" } )

Usually, this will show 20 documents and return a cursor, but if you want. If your result set is larger though, you’ll want to iterate over the results:

while (myCursor.hasNext()) {

print(tojson(myCursor.next()));

}

Finally, you’d check the results to make sure everything is correct. Keep in mind in the example below, your ObjectID values will differ:

{

 item: "paper",

 qty: 100,

 size: {

   h: 8.5,

   w: 11,

   uom: "in"

   },

 status: "D"

},

{

 item: "planner",

 qty: 75,

 size: {

   h: 22.85,

   w: 30,

   uom: "cm"

   },

 status: "D"

}

Benefits of LookUp Charts

As you can see, the process is quite complicated with many steps, and so it makes sense that MongoDB wanted to make the process a bit more streamlined. Of course, it goes a bit beyond just making things easier and there are a lot of benefits to LookUp Charts.

For example, you can glean better insights through the single-view format by joining several collections. More importantly, having a visual and easy to parse chart updated live according to your specifications is invaluable. This often allows you to derive information nearly immediately just from visual inspection, especially if you divide information into further categories.

Finally, the biggest benefit is not needing to learn and master MQL for only one database, which reduces the barrier of entry for a lot of programmers.

How to Use LookUp Charts

Alright, so we’ve looked at how querying usually works on MongoDB, and we have a good idea of how LookUp Charts can help us get more salient information quicker, but how does it actually work?

Well, the steps are relatively simple:

First, you need to choose the data source by picking it on the drop-down menu in the top left.
Then, click on the ‘ . . .’ of the field between your collections and click on ‘Lookup field’
When the new window pops up, select the ‘Remote Data Source’ that’s where you’ll pull the data from.
Then, you need to select ‘Remote Field’ and that would be the field that is common between your two data sources.
Finally, you can save a specific name for the result field, and if not, just click on ‘Save’

And that’s pretty much it! You can now drag and drop from the new fields into the chart builder. Don’t forget to choose an Array Reduction method too, or else you might not see any chart showing up for you.

Familiarizing Yourself with MongoDB Charts

Of course, it’s pertinent at this point to mention that the new LookUp feature is a part of MongoDB Charts, and MongoDB themselves have a few interesting articles to help you get your bearings with the software:

Conclusion

As you can see, the new LookUp Charts is an incredibly powerful tool that massively cuts down on the technical knowledge of MongoDB querying. With just a few simple steps you can look at a chart of information joined from several collections, and understand new information almost immediately.

Compare that to the old method of doing it that required several steps of coding, as well as understanding that code, and you start to see how brilliant this new release is.

Tags:

↧

Managing Journaling in MongoDB

December 25, 2020, 2:45 am

≫ Next: Tips For Upgrading to the Latest MongoDB Version

≪ Previous: Introducing Lookup Charts in MongoDB

MongoDB just like any other database may fail when executing a write operation. In that case we need a strategy that will keep the operation somewhere so that the database can resume when it is restored back to operation.

In MongoDB we use journaling whereby there is a write ahead logging to on-disk journal files to keep the data available in an event of failure. The WiredTiger storage engine can use checkpoints to provide a consistent view of data on disk and allow MongoDB to recover from the last checkpoint but only if it did not exit unexpectedly. Otherwise, for the information that occurred during the last checkpoint, journaling must have been enabled to recover such data.

The procedure for the recovery process is that: the database will look into the data files to find the identifier of the last checkpoint, use this identifier to search in the journal files for the record that matches it and then apply the operations in the journal files since the last checkpoint.

How Journaling Works in the WiredTiger Storage Engine

For every client who initiates a write operation, the WiredTiger creates a journal record that is composed of internal write operations that were triggered by the initial write. Consider a document in a collection that is to be updated and we expect its index to be modified too. The WiredTiger will create a single journal record that will incorporate the update operation and corresponding index modifications.

This record will be stored in an in-memory buffer whose maximum capacity is 128kB. The storage engine then syncs this buffered journal records to disk when either of the following is met:

A write operation includes/implies a write concern of j: true.
WiredTiger creates a new journal file which is after every 100MB of data.
After every 100 milliseconds depending on the storage.journal.commitIntervalMs.
In case of replica set members:
- Instance of operations waiting for oplog entries i.e read operations performed as part of causally consistent sessions and forward scanning queries against the oplog.
- After every batch application of the oplog entries in case of the secondary members.

In case of a hard shutdown of mongod, if write operations were in process, updates can be lost even if the journal records remain in the WiredTiger buffers.

Journal Data Compression

Default setting in MongoDB directs the WiredTiger to use snappy compression for the journal data. This can be changed depending on which compression algorithm you may want using the storage.wiredTiger.engineConfig.journalCompressor setting. These log records are only compressed if their size is greater than 128 bytes, which is the minimum log record size of the WiredTiger.

Limiting the Size of a Journal File

The maximum size of a journal file is 100 MB and therefore if the file exceeds this limit, a new one will be created.

After the journal file has been used in recovery or rather there are files older than the one that can be used to recover from the last checkpoint, the WiredTiger automatically removes them.

Pre-Allocation

Journal files can be pre-allocated with the WiredTiger storage engine if the mongod process determines that it is more efficient to preallocate journal files than create new ones.

How Journaling works in the In-Memory Storage Engine

The In-memory storage Engine was stated as part of the General availability (GA) starting with the MongoDB Enterprise version 3.2.6. With this storage engine, data is kept in memory hence no separate journaling technique. If there are any write operations with a write concern (j: true) they will be immediately acknowledged.

For a replica set with a voting member using the in-memory storage engine, one must set the writeConcernMajorityJournalDefault to false. Otherwise if this is set to true, the replica set will log a startup warning.

When this option is set to false, the database will not wait for w: “majority” write to be written to the on-disk journal before acknowledging the writes. The disadvantage of this approach is that with majority write operations may roll back in the event of a transient loss (such as restart or crash) of a majority of nodes in a given replica set.

If using the MMapv1 storage engine, journal pre-allocation can be disabled using --nopreallocation option when starting the mongod.

With the WiredTiger storage engine, from MongoDB version 4.0 upwards, it is not possible to specify --nojournal option or even the storage.journal.enabled: false for replica set members using the WiredTiger storage engine.

Managing Journaling

Disabling Journaling

Journaling can only be disbled for standalone deployments and it is not recommended for production systems. For MongoDB version 4.0 upwards, one cannot specify neither the --nojournal option nor storage.journal.enabled: false when replica set members that use WiredTiger storage engine are involved.

To disable journaling start mongod with the --nojournal command line option.

Monitor the Journal Status

To get the statistics on the journal use the command db.serverStatus() which returns wiredTiger.log.

Get Commit Acknowledgement

We use the write concern with j option to get commit acknowledgement. {j: true}. Journaling must be enabled in this case otherwise the mongod instance may produce an error.

If journaling is enabled, w: “majority” this may imply j: true.

For a replica set, whenj: true, the setup requires only the primary to write to the journal, regardless of the w: <value> write concern.

However, even if thej: true is configured for a replica set, rollbacks may occur due to replica set primary failover.

Unexpected Shutdown Data Recovery

All journal files in the journal directory get replayed whenever MongoDB restarts from a crash before the server is detected. Since this operation will be recorded in the log output, there will be no need to run --repair.

Changing the WiredTiger Journal Compressor

Snappy compressor is the default algorithm of compression for the journal. However one can change this depending on the mongod instance setup.

For a standalone mongod instance:

Set the storage.wiredTiger.engineConfig.journalCompressor to a new value to update it. The most appropriate way to do this is through the config file but if using the command-line options, you must update the --wiredTigerJournalCompressor command-line option during restart.
Shutdown the mongod instance by connecting to a mongo shell of the instance and issue the command: db.shutdownServer() or db.getSiblingDB(‘admin).shutdownServer()
Restart the mongod instance:
1. If using the configuration file, use: mongod -f <path to file.conf>
2. If using command-line options, update the wiredTigerJournalCompressor:
```
Mongod --wiredTigerJournalCompressor <differentCompressor|none>
```

For a Replica Set Member:

Shutdown the mongod instance: db.shutdownServer() or db.getSiblingDB(‘admin).shutdownServer()
Make the following changes to the configuration file:
1. Set storage.journal.enabled to false.
2. Comment the replication settings
3. Set parameter disableLogicalSessionCacheRefresh to true.

i.e

storage:

   journal:

      enabled: false

#replication:

#   replSetName: replA

setParameter:

   disableLogicalSessionCacheRefresh: true

Restart the mongod instance:
1. If using the configuration file, use: mongod -f <path to file.conf>
2. If using the command-line options: include the --nojournal option, remove any replication command-line options i.e --replSet and set parameter disableLogicalSessionCacheRefresh to true
```
mongod --nojournal --setParameter disableLogicalSessionCacheRefresh=true
```

Shutdown the mongod instance:

db.shutdownServer() or db.getSiblingDB(‘admin).shutdownServer()

Update the configuration file to prepare for a restart of the replica set member with the new journal compressor: Remove the storage.journal.enabled, uncomment the replication settings for the deployment, remove disableLogicalSessionCacheRefresh option and lastly remove storage.wiredTiger.engineConfig.journalCompressor.

storage:

   wiredTiger:

      engineConfig:

         journalCompressor: <newValue>

replication:

   replSetName: replA

Restart the mongod instance as a replica set member

If using the configuration file, use: mongod -f <path to file.conf>
If using the command-line options: remove --nojournal and --wiredTigerJournalCompressor options. Include the replication command-line options and remove the disableLogicalSessionCacheRefresh parameter.

mongod --wiredTigerJournalCompressor <differentCompressor|none> --replSet ...

Conclusion

In order for MongoDB to guarantee a write operation durability, journaling is used whereby data is written to on-disk through ahead logging. As much as the WiredTiger storage engine (which is the most preferred) can recover data through the last checkpoints, if MongoDB exits unexpectedly and journaling was not enabled, recovering such data becomes impossible. Otherwise, if journaling is enabled, MongoDB can re-apply the write operations on restart and maintain a consistent state.

Tags:

↧

Tips For Upgrading to the Latest MongoDB Version

January 1, 2021, 2:45 am

≫ Next: Transaction Considerations for MongoDB in Production

≪ Previous: Managing Journaling in MongoDB

The latest versions of MongoDB are built to integrate new or improved features from the predecessor versions. For this reason, it is recommended to run the latest version for maximum performance and additional features. Besides, the latest versions may be as a result of fixed bugs depending on the MongoDB versioning.

MongoDB Versioning

MongoDB versions are of the form X.Y.Z.

When Y is even, i.e 4.0 or 4.2, this refers to a release series which is stable hence recommended for production. In this case new features are integrated which may result in backwards incompatibility.
If Y is odd, i.e 4.1 or 4.3, this refers to development series which is not stable hence recommended for testing only.
Z refers to a revision/patch number. Involves bug fixing and backward compatible changes.

Also considering the MongoDB driver versions is important towards ensuring a well performing database.

Considerations Before Upgrading

Backup: Midway the upgrade some crash may happen and in the end compromise the integrity of data the database was holding. Therefore, it is recommended to always make a backup of the data before upgrading to a certain version.
Maintenance Window: There may be some complexity that may arise when upgrading to some version if replica sets are involved. One needs to schedule enough time for this process so that you don’t encounter a high downtime.
Version Compatibility: Ensure to go through the release notes and check if your system setup will be compatible with the version you want to upgrade to. Also check the drivers compatibility documentation at Driver Compatibility page if they can go in hand with the MongoDB version you want to upgrade to. For instance from MongoDB 4.2 upwards there is no support for Ubuntu 16.04 PPCLE system.
Change Streams: Change streams are designed for applications to access real-time data changes without necessarily tailing the oplog. For MongoDB versions before 4.0.7, change stream uses a version 0 v0 resume token whereas this version and the successors use a version 1 v1 resume tokens. It is recommended for clients to wait for the upgrade to complete before resuming change streams when upgrading to version 4.0.7.
Staging Environment Check: Ensure that all configurations are well set up before upgrading the production environment and they will be compatible with the new version you want to upgrade to.
Primary-Secondary-Arbiter (PSA) Architecture: MongoDB version 3.6 and above enable support for“majority” read concern by default. However, this configuration may result in storage cache pressure and the only way to prevent this is by disabling this parameter. Nevertheless, disabling this parameter may raise more concerns i.e:
1. Support for transactions on sharded cluster will be affected in that:
  1. if a shard has read concern “majority” disabled, it will throw an error for a transaction that writes to multiple shards.
  2. Read concern “snapshot” cannot be used for a transaction involving a shard with read concern “majority” disabled
2. collMod commands that are responsible for modifying an index from rolling back will not work. This dictates that if an operation needs to roll back, one must use the primary node to resync the affected nodes.
3. Support for Change Streams for MongoDB 4.0 and earlier versions will also be disabled.
4. Replica set transactions are not affected with the disabling of this parameter.

Procedures for Upgrading

Make a backup of your data.
Upgrade the mongod/mongos binary separately using the system package management tool alongside the official MongoDB packages. You can also upgrade the mongos by replacing the existing binaries with new binaries using these procedure:
1. Download MongoDB binaries for the revision you want to upgrade to and store the downloaded compressed file in a temporary location.
2. Shut down the instance.
3. Use the downloaded binaries to replace the existing MongoDB binaries.
4. Restart the instance.
If upgrading a replica set, upgrade each member separately by starting with the secondaries and the primary last. To upgrade the secondaries:
1. Upgrade the mongod binary
2. Wait for the secondary to recover into the SECONDARY state and after it finishes, upgrade the next instance.rs.status() is used to check for the member’s state in a mongo shell. RECOVERING and STARTUP states may show up but you will need to wait until it recovers to SECONDARY.
When upgrading the primary:
1. In a mongo shell use rs.stepDown()to step down the primary as a way of initiating a normal failover. Since no writes will be accepted during the period, it is advisable to do the upgrade within the shortest time possible.
2. Until you see another member has been elected to be the primary, then upgrade the binaries of the shutdown primary.
3. Restart the primary after the upgrade is complete but if you check its status,rs.status(), it might be labeled a secondary.
To upgrade for a MongoDB 4.4 Sharded Cluster:
1. Ensure the balancer has been disabled.
2. Upgrade the config servers just as the same way you upgraded the replica set.
3. Upgrade the shard using the corresponding procedure i.e, replica set or a standalone.
4. Upgrade each mongos instance in order.
5. Re-enable the balancer.

Conclusion

With time data becomes more complex hence requiring advanced database features that can meet database administrators’ specifications. MongoDB does not fall back on this since it always releases database versions with either fixed bugs or newly integrated features. It is recommended to always upgrade to the latest MongoDB version for maximum performance. Nevertheless, before making an upgrade one needs to check the release notes for the version you want to upgrade to if it is compatible with your system. Upgrading the corresponding MongoDB drivers is also advisable.

Tags:

↧

Transaction Considerations for MongoDB in Production

January 23, 2021, 2:45 am

≫ Next: How to Backup Your Open edX MongoDB Database

≪ Previous: Tips For Upgrading to the Latest MongoDB Version

A MongoDB transaction is a group of throughput operations (reads and writes ) that require all these operations to succeed or fail together. For instance consider transferring money from account A to account B. If the write operation of updating account A fails, then consequently there will be no update for account B. Like ways if you manage to update account A but the transfer fails to be credited to account B, then the operation on account A will be reverted and the whole transaction will fail.

Transactions are generally designed this way to handle data inconsistency problems by keeping track of all the updates it makes along the way and if any of them fails, the connection breaks and the database rolls back by undoing all the changes that might have happened as a result of the transaction.

Production Considerations for Transactions

Transactions are powerful operations in MongoDB which go an extra mile in keeping track of the activities happening during the transaction. The downside of transaction operations is increased latency which lags other users from accessing the database until the transaction is complete by locking the involved resources. As this may be an appropriate way of coordinating updates for linked data, in the end it leads to unfriendly user experience.

In the production environment we have highlighted some of the considerations one should undertake when involving transactions and they include:

Feature Availability and Compatibility
Transaction Runtime Limit
Limit Size of the Oplog
WiredTiger Cache
Security before the Transactions
Shard Configuration Restriction
Sharded Cluster and Arbiters
Primary Secondary Arbiter (PSA) Configuration
Locks acquisition
Pending DDL Operations and Transactions
In-Progress Transactions and Write Conflicts
In-Progress Transactions and Stale Reads
In-Progress Transactions and Chunk Migration
Outside Reads During Transaction Commit
Drivers

These considerations are applicable regardless if running a replica set or a sharded cluster.

Feature Availability and Compatibility

MongoDB supports multi-document transactions on replica sets from version 4.0. Multi-document transactions are multi-statements that can be executed sequentially without affecting each other. For instance, you can create a user and update another user in a single transaction.

Distributed transactions were introduced in MongoDB version 4.2 to maintain an identical syntax to the transactions introduced in version 4.0. Full consistency and atomicity is maintained in that if a commit fails on one shard, all transactions in participant shards will be aborted.

Your drivers should be updated for the MongoDB 4.2 in order to use transactions in MongoDB 4.2 deployments.

The minimum feature compatibility version for using transactions with a replica set is MongoDB version 4.0 whereas for a sharded cluster is 4.2.

For checking the featureCompatibilityVersion use the command

db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } ) on the connected member.

Transaction Runtime Limit

This is a defined time in seconds within which a transaction should have come to completion. If a transaction exceeds this time it will be considered to have expired hence aborted by a periodic cleanup process. By default the value is set to be less than one minute but can be modified using the transactionLifetimeLimitSeconds parameter. The minimum value for this parameter is 1 second. To change the value for example to 50 seconds

db.adminCommand( { setParameter: 1, transactionLifetimeLimitSeconds: 50 } )

Or during the startup of the mongod

mongod --setParameter "transactionLifetimeLimitSeconds=50"

When setting the parameter for a sharded cluster, the parameter must be modified for all shards of a replica set member.

Limit size of the Oplog

MongoDB 4.0 creates a single operational log entry at commit time when a transaction contains any write operation. This is to say, the oplog will contain all write operations within the transaction and not that each write operation will have its own oplog. In that case, the oplog entry must be within the 16MB BSON document size.

In MongoDB version 4.2 multiple oplogs are created to accommodate all write operations within a transaction hence removing the total size limit for a transaction. However, each oplog entry must be within the 16MB BSON document size limit.

WiredTiger Cache

As mentioned before, transactions affect the performance of the database by locking resources involved. Among the resources is the WiredTiger cache that receives more pressure from the transactions. In order to avoid some of the storage cache pressure:

All abandoned transactions should be aborted so that they can free some space in the cache.
If an error occurs during an individual operation in the transaction, the transaction will remain in the cache until some directive operation is applied. The best thing will be to abandon the transaction and retry it.
You can set the transactionLifetimeLimitSeconds parameter such that expired transactions will be aborted automatically.

Security before the Transactions

Operations in aborted transactions are still audited if running with auditing. However you need to consider that there will be no event to indicate that the transaction aborted.

When running with access control, one must have privileges for the operations in the transaction that is read or readWrite.

Shard Configuration Restriction

If the writeConcerMajorityJournalDefault is set to false or a shard with a voting member using the in-memory storage engine, transaction cannot be run on such a cluster.

Sharded Cluster and Arbiters

If any transaction operation reads from or writes to a shard containing an arbiter, the transaction will error and abort even after spanning through multiple shards.

Primary Secondary Arbiter (PSA) Configuration

In this case we consider if one wanted to reduce cache pressure by disabling the read concern majority.

If there is a shard with disabled read concern majority, read concern snapshot cannot be used for a transaction otherwise the transaction will throw an error and abort thereafter.

This is because readConcern level snapshot cannot be supported in a sharded cluster with enableMajorityReadConcern set to false.

If any transaction involves a write operation to a shard with read concern majority disabled, the transaction will error and abort even if the transaction spanned through multiple shards.

For a replica set, there is a possibility to specify the read concern to local, majority or even snapshot and transactions can still work. However, if there is a plan to transition this to a shard then you will need to avoid using the snapshot option.

If you may want to check the status of the read concern majority, run db.serverStatus() on the mongod instances and then check the storageEngine.supportsCommittedReads field. If the value is false, the the read concern majority is disabled

Locks Acquisition

When a transaction begins it will request for locks required for the involved operations. If the request waits beyond some set maxTransactionLockRequestTimeoutMillis ( by default is 5 milliseconds ) parameter, the transaction aborts. When you create or drop a collection and later issue a transaction whose operation will involve this collection, it is important to issue a create or drop operation with write concern majority to ensure that the transaction can acquire the required locks.

If you increase this value, this can help remove transaction aborts on momentary concurrent lock acquisitions such as fast-running metadata operations. Nonetheless, with this value being high, aborts may be delayed even for transaction operations that are deadlocked.

To set this value use the command

db.adminCommand( { setParameter: 1, maxTransactionLockRequestTimeoutMillis: 20 } )

You can also enable or disable creation of a collection or an index inside a transaction using the shouldMultiDocTxnCreateCollectionAndIndexes.

To set the parameter during startup, either specify the parameter in the configuration file or run the command:

mongod --setParameter shouldMultiDocTxnCreateCollectionAndIndexes=false

If you wish to modify the parameter during runtime:

db.adminCommand( { setParameter: 1, shouldMultiDocTxnCreateCollectionAndIndexes: true } )

Consider setting the parameter for all shards if using a sharded cluster.

Pending DDL Operations and Transactions

If a transaction is in progress, DDL operation issued before it will wait since they will not obtain the locks when they are accessing the same database or collection.

An example of a DDL operation is createIndex(). If a transaction is in progress, this operation will have to wait until its completion. However, this operation does not affect transactions on other collections in the database.

The same case applies for a database that involves some database lock for example the collMod has to wait for the transaction to complete.

In-Progress Transactions and Write Conflicts

If a write operation is issued outside a transaction in progress but happens to conflict with the transaction by modifying an involved document, then the transaction will be aborted. This can only be resolved by the transaction obtaining a lock to modify the document and therefore the write operation outside the transaction will have to wait for the transaction to complete.

In-Progress Transactions and Stale Reads

Stale reads may happen because transaction operations use snapshots before the writes. For instance, a transaction can be in progress and then a write operation outside deletes the document the transaction modifies, the document will still be available within the transaction. A workaround for this is to use db.collection.findOneAndUpdate().

In-Progress Transactions and Chunk Migration

An error will occur if a chunk migration interleaves with a transaction and the migration comes into completion before the transaction takes a lock on the collection. The transaction will also abort in this case and one has to retry. Some of the errors that will be encountered in this case are:

an error from cluster data placement change ... migration commit in progress for <namespace>
Cannot find shardId the chunk belonged to at cluster time ...

Outside Reads During Transaction Commit

In this case we consider a read operation on the same document that is to be modified by the transaction. If the transaction is to write against multiple shards:

Outside reads using read concerns other than snapshot or linearizable will not wait for the writes of the transaction to be applied but read the before-transaction version of the documents that are in existence.
If the reads involve snapshot or linearizable read concerns, the operations have to wait for the transaction to be complete and the document changes visible.

Drivers

Use updated drivers for MongoDB version 4.2 if you would like to use transactions on MongoDB 4.2. If some shard members have drivers for MongoDB 4.0 instead of 4.3, the transaction will fail and log some errors such as:

251 cannot continue txnId -1 for session ... with txnId 1
50940 cannot commit with no participants

Conclusion

Transactions in MongoDB are important when data integrity and consistency need to be held. However, transactions come with some reduced database performance degradation more especially increasing latency of throughput operation. If not well managed/configured, transactions may affect the overall user experience of applications.

Tags:

↧

How to Backup Your Open edX MongoDB Database

January 27, 2021, 9:41 am

≫ Next: How to Deploy the Open edX MongoDB Database for High Availability

≪ Previous: Transaction Considerations for MongoDB in Production

Open edX is an open source project for online learning developed by the MIT and Harvard team. It is a web based application with a lot of components such as student facing, course authoring, course delivery and content management.

The Open edX is built in Python and uses Django as a web framework. It uses MongoDB as a database backend. When building and setting up an Open edX environment, one needs to think of the uptime of the service because the platform is widely used by the student and learner as an open platform.

High availability is a must for MongoDB databases, beside the application server too. For disaster recovery, a sound backup strategy is key so you know you can restore the data if something goes really wrong.

In this blog, we will review how to backup your Open edX MongoDB database.

Preparing the Backup Storage

The first thing we need to do is to prepare the storage for the MongoDB backup. You can stage the backups on the same infrastructure as the Open edX services, and then archive them offsite. You can use Storage Area Network (SAN) or Network Attached Storage where it is mounted to one of the MongoDB servers. AWS provides a simple storage service called S3 to archive your backups, while Google Cloud Platform has Cloud Storage.

It is on demand service and the pricing model is based on per GiB size of your backup. For safety, at least you can put your Open edX database backup on 2 different areas; which is on your premise, and on the cloud.

Manual Backup for MongoDB

Typically backup for MongoDB databases is using mongodump utility which is bundled when you install MongoDB server. You can take a backup in one of the MongoDB servers, just run the mongodump as shown below:

$ mongodump --db edxapp --out /backups/open-edx/`date +"%m-%d-%y"`

2021-01-11T11:23:42.541-0500    writing edxapp.module to /backups/open-edx/01-11-21/edxapp/module.bson

2021-01-11T11:23:42.878-0500    writing edxapp.module metadata to /backups/open-edx/01-11-21/newdb/module.metadata.json

2021-01-11T11:23:42.923-0500    done dumping edxapp.module (25359 documents)

2021-01-11T11:23:42.945-0500    writing newdb.system.indexes to /backups/open-edx/01-11-21/edxapp/system.indexes.bson

……

It will create a backup on the MongoDB host, you can have a script to move the backup files to some other storage.

Backup MongoDB for Open edX Using ClusterControl

ClusterControl supports MongoDB backup for your Open edX platform. It supports mongodump and we just added support for a new backup method called PBM (Percona Backup for MongoDB), which would be more appropriate for sharded MongoDB Clusters. Taking backup using mongodump in ClusterControl is very easy using a GUI-based wizard. Choose the Backup Tab and then Create Backup. There are two options you can choose, you can immediately create a backup or you can schedule the backup.

And then just click Continue:

Choose mongodump as a backup method, and then write down the location directory where you want to put the backup. In this step, you can use a Storage Area Network or Network Attached Storage that is mounted to your MongoDB server.

ClusterControl also supports backup to the cloud, currently we support Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure.

You can also enable encryption for your backup, this is especially important if you are archiving in the cloud. Next, just press Create Backup, it will trigger a new job for the backup as shown below:

You can also use Percona Backup for MongoDB for consistent backup of your MongoDB Replicaset and Sharded Cluster.. Just select the percona-backup-mongodb as backup method, it requires you to install an agent on each node and shared storage to be mounted on every MongoDB node.

Tags:

↧

How to Deploy the Open edX MongoDB Database for High Availability

January 29, 2021, 10:57 am

≫ Next: Replicating MongoDB Across a Hybrid Cloud Environment

≪ Previous: How to Backup Your Open edX MongoDB Database

Open edX is a platform that provides the massively scalable learning software technology behind edX. The Open edX project is a web-based platform for creating, delivering, and analyzing online courses. It is the software that powers edx.org and many other online education sites.

We have blogged previously about deploying a High Availability for MySQL database on the Open edX platform. As said previously, it is a complex platform as it covers multiple components, and part of this huge platform it is covered by multiple services:

eCommerce: https://github.com/edx/ecommerce
Catalog: https://github.com/edx/course-discovery
LMS / Studio: https://github.com/edx/edx-platform
Credentials: https://github.com/edx/credentials
Insights: https://github.com/edx/edx-analytics-dashboard
Analytics API: https://github.com/edx/edx-analytics-data-api

Essentially, the Open Edx is perfect for online courses amidst pandemic and online training as what you might have tried and taken already especially if you are acquiring a product certification.

Brief Architectural Overview

The centerpiece of the Open edX architecture is edx-platform, which contains the learning management and course authoring applications (LMS and Studio, respectively). Besides its edx-platform, the technical services comprising the whole platform comprises various technologies involved which covers up a whole complex level of this software. See the diagram below taken from edX Team presentation last December.

You have Snowflake, Amazon RDS, MongoDB, Amazon S3, Elasticsearch, Memcached, and Redis as the technologies embodying this rich platform. Yet it's even hard to install and setup Open edX but I managed to put up a simple development environment to understand a bit of this platform.

Whilst, let's focus on MongoDB which is used to store contents for Forums, Course Structure, and Course Assets. Per-learner data is stored in MySQL, so if you want to know and have high availability for your MySQL with Open edX, read it here.

Storing Content For MongoDB

MongoDB is the database of choice by Open edX for storing large files which are text files, PDFs, audio/video clips, tarballs, etc. If you are familiar with Open edX and have used it especially as an author for the LMS or Studio, data is stored if you upload assets to your Open edX setup. These uploads are so called "contentstore" is basically a MongoDB-backed GridFS instance. Open edX uses MongoDB GridFS in order to store file data in chunks within a MongoDB instance and are able to store files greater than 16 MB in size. It can also serve portions of large files instead of the whole file.

An asset can be uploaded as "locked" or "unlocked". A locked asset is only available to students taking a particular course - the edx-platform checks the user's role before serving the file. Unlocked assets are served to any user upon request. When a student in a course requests an asset, the entire asset is served from GridFS.

Setting up a High Availability For your Open edX MongoDB Database

Let's admit that installing or setting up your Open edX platform is a great challenge. It's tough especially that you are new to this platform or software but it has a very great architectural design. However, it is possible that your setup with your MongoDB is a one-node Replica Set stand as a primary. On the other hand, it is best that your Replica Set must have at least a secondary or multiple secondary nodes aside from the Primary. This serves your high availability setup in case your Primary goes kaput, your Secondary replica node will take over the primary role.

Setup a Replica Set with Secondary Replicas

Doing this, you just have to add and setup at least two secondary replicas. The ideal is that, at least, in a replica set, you have 3 nodes for which one is your Primary, then the other two nodes are your secondaries replicating to the primary. This allows the MongoDB Replica set to proceed an election in case primary loses connectivity with its secondaries. This gives you reliability, redundancy, and of course high availability. It is a simple setup that you can have to achieve a high available environment with MongoDB.

Why does this provide high availability? A Replica Set in MongoDB is a group of mongod processes that maintain the same data set. MongoDB Replica sets use elections to determine which set member will become primary to the event that primary goes down or terminated abnormally or some configuration changes. Replica sets can trigger an election in response to a variety of events, such as:

Adding a new node to the replica set,
initiating a replica set,
performing replica set maintenance using methods such as rs.stepDown() or rs.reconfig(), and
the secondary members losing connectivity to the primary for more than the configured timeout (10 seconds by default).

Take this example diagram which visualizes how the election works.

Image courtesy of MongoDB documentation

Additionally, you can use the other secondary replicas as your read preference but this depends on the setup based on your client's connection. You can learn more by reading the read preference options for connection or check the Read Preference here.

Now, this looks great but dealing with your application client endpoint such as changing the hostname or IP address requires a manual change. It's not ideal if you have a load balancer on top of your Replica Set just like HaProxy since MongoDB Replica Set performs the election internally of MongoDB.

Setup A Sharded Cluster

Sharded cluster is ideal if you are dealing with a large size of data sets. Although it doesn't mean that you have to design a sharded cluster, it has to be dealing with large data sets. MongoDB offers mongos, which is a utility that shall act as a routing service for MongoDB shard configurations that processes queries from the application layer then determines the location of this data in the sharded cluster identified through its shard key in order to complete its transactions or database operations. Basically, just think that mongos instances behave identically to any other MongoDB instance.

So why having a mongos in front of your application? In times that your Replica set Primary hostname or IP changes after the election, from the application perspective, that means you also need to change the endpoint. With mongos, just point your application client to one of our mongos instances. Your application client only interfaces with the mongos instance and that's all it matters. The mongos will be the one to handle your query requests or transactions utilizing its purpose and function for your MongoDB Shard setup. That means, in your Open edx configuration files, there's no changes to be done. You don't need to restart your application servers in order to catch up with the changes from your MongoDB Replica Sets.

How to Setup High Availability

For example, using ClusterControl. Using ClusterControl can be achieved simply and efficiently as this can be done over the UI avoiding those manual configurations and installations for a very complex setup.

Let's consider you have an existing MongoDB instance with Open edX database existing,

rs0:PRIMARY> show dbs;

admin                0.000GB

cs_comments_service  0.000GB

edxapp               0.087GB

local                0.118GB



rs0:PRIMARY> rs.status()

{

        "set" : "rs0",

        "date" : ISODate("2021-01-22T14:46:51.398Z"),

        "myState" : 1,

        "term" : NumberLong(17),

        "heartbeatIntervalMillis" : NumberLong(2000),

        "members" : [

                {

                        "_id" : 0,

                        "name" : "192.168.40.10:27017",

                        "health" : 1,

                        "state" : 1,

                        "stateStr" : "PRIMARY",

                        "uptime" : 133,

                        "optime" : {

                                "ts" : Timestamp(1611326680, 1),

                                "t" : NumberLong(17)

                        },

                        "optimeDate" : ISODate("2021-01-22T14:44:40Z"),

                        "electionTime" : Timestamp(1611326679, 1),

                        "electionDate" : ISODate("2021-01-22T14:44:39Z"),

                        "configVersion" : 2,

                        "self" : true

                }

        ],

        "ok" : 1

}

You can simply import this as an existing database to ClusterControl and take a backup using ClusterControl's backup feature. Alternatively, you can use mongodump or try using the Percona Backup for MongoDB.

Now, in ClusterControl, create a MongoDB Shard as a new deployment. This can be done by the following steps:

Deploy a new MongoDB Shard in the deployment wizard dialog.

Setup the SSH Settings and its Configuration Servers and Routers. This is where your mongos instances shall be aside from your configuration servers.

Define your Shards. These are your Replica Set shard(s). Depending on your need. For example, in this deployment I deployed two shards but you can just use one shard to begin with especially for small deployments.

Define your database settings

At this point, hit the deploy button and just wait as the job is processed by ClusterControl.

Once finished, you can now restore the backup you have taken from mongodump. For example, I took a backup using ClusterControl and then used this as my source backup. When using mongorestore command, make sure that your destination host is one of your mongos instances. For this example deployment, I have 192.168.40.233 host.

$ mongorestore --host 192.168.40.233 --port 27017 --username <username> --password <password> --gzip  --archive=BACKUP-2/rs0.gz --authenticationDatabase=admin

2021-01-22T11:17:06.335+0000    preparing collections to restore from

2021-01-22T11:17:06.336+0000    don't know what to do with subdirectory "cs_comments_service", skipping...

2021-01-22T11:17:06.336+0000    don't know what to do with subdirectory "edxapp", skipping...

2021-01-22T11:17:06.337+0000    don't know what to do with subdirectory "admin", skipping...

2021-01-22T11:17:06.337+0000    don't know what to do with subdirectory "", skipping...

2021-01-22T11:17:06.372+0000    restoring to existing collection edxapp.modulestore.definitions without dropping

2021-01-22T11:17:06.372+0000    reading metadata for edxapp.modulestore.definitions from archive 'BACKUP-2/rs0.gz'

2021-01-22T11:17:06.373+0000    restoring edxapp.modulestore.definitions from archive 'BACKUP-2/rs0.gz'

2021-01-22T11:17:06.387+0000    restoring to existing collection edxapp.fs.chunks without dropping

2021-01-22T11:17:06.387+0000    reading metadata for edxapp.fs.chunks from archive 'BACKUP-2/rs0.gz'…

……

Now, you're ready and then make some changes to your Open edX configuration files. In my installation setup, you can update the/edx/etc/studio.yml and /edx/etc/lms.yml. You might have to change as well the files in /edx/app/edxapp/lms.auth.json and /edx/app/edxapp/cms.auth.json files and replace them with the correct hostname of your mongos instance.
Verify in your mongos and check if the databases are loaded and can be accessible,

root@mongos.33:~# mongo --host "mongodb://edxapp:CMwo1sGZ9mtpA0ZSdPf1rJ5FZ8pDCK9aMCJ@192.168.40.233:27017/?authSource=admin"

MongoDB shell version v4.2.11

connecting to: mongodb://192.168.40.233:27017/?authSource=admin&compressors=disabled&gssapiServiceName=mongodb

Implicit session: session { "id" : UUID("00a3a395-3531-4381-972e-502478af38d1") }

MongoDB server version: 4.2.11

mongos> show dbs

admin                0.000GB

config               0.002GB

cs_comments_service  0.000GB

edxapp               0.104GB

Now you're set!!!

In the web view also of ClusterControl, once the ClusterControl finishes the deployment, you'll have a topology that shall look like this,

Once done, you're all good to manage your Open edX and manage your courses!

Tags:

↧

Replicating MongoDB Across a Hybrid Cloud Environment

February 17, 2021, 9:41 pm

≫ Next: Architecting for Security : A Guide for MongoDB

≪ Previous: How to Deploy the Open edX MongoDB Database for High Availability

Relying only on on-premises infrastructure may limit how quickly organizations can develop and launch applications. Extending the infrastructure to utilize public cloud in a hybrid setup is a great way to achieve infrastructure agility. It is worth nothing that application performance is dependent not only on the database hardware, but also on your network connection to the database hosts. By making sure data is replicated and available between both on-prem and public cloud, it ensures applications can access data locally, at low latency.

Hybrid Cloud Database Architecture

A hybrid cloud consists of both public and private clouds functioning as a single unit. This allows organizations to take advantage of the strengths of both environments. When deploying MongoDB in a hybrid cloud environment, ClusterControl can be used as the orchestration tool to deploy and manage the MongoDB nodes.

The private cloud gives you full control over the compute resources, network, storage, as well as security. You manage all the infrastructure, and you can configure everything based on your requirements.

Hybrid Cloud Architecture

While the public cloud offers scalability and agility of infrastructure. For example, you can conveniently spin up VM instances in a few minutes and in a couple of clicks.

Many organizations mix private and public clouds, taking advantage of both environments. If the business is growing rapidly, you need to have fast scalability for your infrastructure. On the other hand, you also need to control and share the resources between them.

Replicating MongoDB in Hybrid Cloud

Preparing the environment

Servers/virtual machines on both sites need to be provisioned, and also connectivity between the nodes. Security in a hybrid cloud deployment is a major priority, both environments can be zoned off in a security group and you need to restrict communication to only specific ports.

Latency is one of the challenges in hybrid cloud architectures, you need to ensure that latency is similar across the nodes. This is to ensure that when the MongoDB ReplicaSet is up and running, there is no replication lag caused by the network. AWS has Direct Connect that provides dedicated connectivity between cloud and other data centers.

Setting up MongoDB

Deploying MongoDB nodes in a hybrid setup can be automated using ClusterControl. ClusterControl will take care of installing all the required packages, configuring the software and making sure the entire cluster comes up. You can go to deployment page:

Choose the MongoDB tab (in this case, we will deploy MongoDB ReplicaSet). Specify the SSH user, password, and give a name to the Cluster as shown below:

After that, choose the vendor database. Currently, the MongoDB database package is supported by MongoDB and Percona. We will use Percona Server for MongoDB with version 4.2.

Fill the admin user and password, you can change the server data directory and port for custom setting, or leave it as default. Add the target MongoDB node, we will configure 3 nodes in private cloud ( ip address 10.10.10.11, 10.10.10.12, 10.10.10.13) and 2 nodes in public cloud (ip address 10.11.10.111 and 10.11.10.112) in a MongoDB ReplicaSet architecture as shown below:

Just click Deploy, it will trigger a new job for deployment in ClusterControl as shown below:

At the end of deployment, you will have hybrid topology for MongoDB ReplicaSet as shown below:

The MongoDB nodes are spread across the private and public environments, which make the cluster highly available

Tags:

↧

Architecting for Security : A Guide for MongoDB

March 24, 2021, 10:48 pm

≫ Next: Audit Logging for MongoDB

≪ Previous: Replicating MongoDB Across a Hybrid Cloud Environment

The prevalence of data breaches is no longer surprising. Based on the just released FBI cybercrime report, victims of cybercrime have cumulatively lost a whopping $4.2 billion in 2020 which is $700 million more than the reported losses in 2019. In particular, insecure MongoDB databases have been part of the problem leading to significant data breaches. In February of 2019, an email verifications service company had their MongoDB database breached and it exposed 763 million records including email addresses, phone numbers, IP addresses and dates of birth. The reason being, a public facing MongoDB instance with no password.

Lack of authentication, no restricting of ports on the firewall, or failure to secure data in transmit may lead to a data breach. In this blog, we will discuss how to prepare and secure your MongoDB database in a production environment.

Authentication and Authorization

Authentication and authorization are two different things, but they are correlated. Authentication means the user has access to the MongoDB database while authorization permits the user to access the resource inside the database.

The default installation for authentication in MongoDB is disabled. MongoDB supports multiple authentications eg: SCRAM, x.509 Certificate Authentication. The default one in MongoDB is SCRAM (Salted Challenge Response Authentication Mechanism), which verifies the supplied credentials access with the username, password, and authentication database.

Before enabling authentication, please create a superuser in MongoDB with the role userAdminAnyDatabase.After it finishes, we just need to open the /etc/mongod.conf file and find the section on security. The default is disabled, we just need to enable.

security:
    authorization: "disabled"

Restart the MongoDB service to apply the configuration changes. We can also configure RBAC (Role Based Access Control) inside the MongoDB database for better security related to the user. So we segregate the access to the database based on the user and privileges.

Network Segmentation

Network segmentation is an important aspect when we design database architecture, it applies to all databases, not only for MongoDB. It is a best practice that we segregate the network for the database. We set up the database server in a private network, where it can not be reached from the internet.

The communication to the database happens on the private network, and when the user wants to access the database, they can use VPN or a jumphost. Besides network segmentation, restricting the port also plays a key role, we open the database port specific to the segmented network to control the inbound and outbound network traffic. So, we know that the incoming traffic is from the trusted source address.

Data Encryption

Another area we need to take a look at is data encryption. Data encryption is a method where the information is encoded to another form during the transmission and stored in the database.

Data encryption covers:

Data in transit: data in transmission state
Data at rest: data stored on disk. There are various types of data encryption at rest, we can use encryption on the database level, or we can use encryption in the storage layer.

Enabling SSL/TLS from the clients and MongoDB server and between the MongoDB nodes (in replicaset and sharded cluster architecture), will secure the data in transit. The data transfer will not be in plain text.

There are various encryption tools and features for data at rest encryption, for example; AWS provides EBS disk encryption combined with the KMS (Key Management Service) on the storage, while on the database layer, the Enterprise edition of MongoDB provides the database encryption at rest.

Database Auditing

Implementing an audit database for MongoDB gives visibility of what is running inside the database; eg: from which user, and what command that executed, and the source of ip address. We can combine those logs and create rules based on the authorization access. We can detect if there is any unintended user running some script in MongoDB. We can see the section auditLog.

auditLog:
   destination: syslog

We can send the MongoDB audit log into the syslog files and push the logs into Log Management. Want to have more tips on securing MongoDB? Watch this video to get a better understanding on the best practices to secure your MongoDB database

Conclusion

Implementing security standards for MongoDB is a must, especially for a production environment. We can not accept every loss and breach of the data stored in the database.

Tags:

↧

Audit Logging for MongoDB

March 31, 2021, 11:49 pm

≫ Next: How to use Encryption to Protect MongoDB data

≪ Previous: Architecting for Security : A Guide for MongoDB

One of the security aspects of managing a database is to understand who accessed the database, when, and what did they do. Although we have already secured the MongoDB service, we still want to know who is doing what, and detect if there is something weird. In a data breach investigation, an audit log allows us to analyze historical activity, understand from which endpoint the attacker came from, and what operations they did once they were inside the database.

In this blog, we will review audit logging for MongoDB and implementation.

Enabling Audit Logging in MongoDB

To enable audit logging in MongoDB, we need to go to the mongod.conf configuration file, section auditLog:

auditLog:
   destination: file
   format: BSON
   path: /var/lib/mongodb/audit_mongodb.bson

There are 3 types of log destinations, which are: file, syslog, and console. Ideally, we can send the audit log to a file, in JSON or BSON supported format. We can also enable the audit log during startup of the MongoDB service as shown below:

mongod --dbpath /var/lib/mongodb --auditDestination file --auditFormat BSON --auditPath /var/lib/mongodb/audit_mongodb.bson

Audit Filter in MongoDB

Still in the auditLog section, there is a parameter called filter. We can filter the action pattern that we want to log. For example, if we want to log authentication to a specific database, we can use the below command:

auditLog:
   destination: file
   format: BSON
   path: /var/lib/mongodb/audit_mongodb.bson
   filter: '{ atype: "authenticate", "param.db": "user_profile" }'

It will track every authentication to the user_profile database. Another example: we want to track the actions; drop index, rename collection, and drop collection in user_profile database. The command would be :

auditLog:
   destination: file
   format: BSON
   path: /var/lib/mongodb/audit_mongodb.bson
   filter: { atype: { $in: [ "dropIndex", "renameCollection", "dropCollection" ] }, "param.ns": /^user_profile\\./ } }

We can also monitor the audit process for the specific roles, we would need to define the roles and database in the filter:

auditLog:
   destination: file
   format: BSON
   path: /var/lib/mongodb/audit_mongodb.bson
   filter: { roles: { role: "readWrite", db: "user_profile" } }

It will log every action related to the user which has the readWrite roles in the user_profile database.

For audit logging of write and read operations, we need to enable the auditAuthorizationSuccess in MongoDB first. We can run below command :

db.adminCommand( { setParameter: 1, auditAuthorizationSuccess: true } )

Or another option is to change the following in the mongod.conf as below:

auditLog:
   destination: file
   format: BSON
   path: /var/lib/mongodb/audit_mongodb.bson
   filter: { roles: { role: "readWrite", db: "user_profile" } }
setParameter: { auditAuthorizationSuccess: true }

Percona Server for MongoDB gives the audit logging features for free, while in MongoDB it is only available in Enterprise Edition. Please take note that enabling the parameter will impact the database performance of your MongoDB, especially in the production environment.

What’s next ?

We can send the MongoDB audit log to a Logging Management System, example : ELK (Elasticsearch, Logstash, and Kibana) stack or we can use the Log Management System from the provider for analysis purposes.

The simplest way is to use jq tools utility in the Linux environment to read the log in JSON or BSON format.

Tags:

↧

How to use Encryption to Protect MongoDB data

April 16, 2021, 1:46 am

≫ Next: MongoDB Database Deployment Automation

≪ Previous: Audit Logging for MongoDB

With many kinds of data stored in the database, sometimes we may be dealing with confidential data, which might include credit card data, financial record, personal information. Those PII (Personal Identifiable Information) data is subject to regulation eg: PCI DSS, HIPAA, or GDPR which we need to protect and ensure confidentiality, integrity and availability.

Data Encryption is part of architecting MongoDB for security implementation in production environments. The aim of data encryption is adding more safeguards for security of the data, especially from insider threats. We lock down the service and ports of the database, maintain an access control list of who can access and perform which operations into the database, and enable encryption to protect from sniffing during network transmission, or when the data is stored. In this blog, we will discuss how to use encryption in MongoDB.

Data Encryption in Transit

Data Encryption in transit ensures MongoDB data is secured between the clients (i.e., application server) and the database server and in between the database servers in the MongoDB ReplicaSet or ShardedCluster architecture. MongoDB uses SSL/TLS certificates, either generated as self signed certificates or certificates which are issued by the Certificate Authority.

The best way is to use the certificate from a certificate authority, because it will allow MongoDB drivers to check the host with the Certificate Authority which means there will be a validation of server identity to avoid man-in-the-middle attack. You can still use the self signed certificate in a trusted network.

MongoDB SSL/TLS encryption must use the TLS/SSL ciphers with a minimum of 128-bit key. Starting from MongoDB version 4.2 and above, there is a new parameter called net.tls. It provides the same functionality as net.ssl. The configuration in mongod.conf file as shown below:

net:
   tls:
      mode: requireTLS
      certificateKeyFile: /etc/ssl/mongodb.pem

While if we want to add Client Certificate Validation, we just need to add parameter CAFile as following:

net:
         tls:
            mode: requireTLS
            certificateKeyFile: /etc/ssl/mongodb.pem
            CAFile: /etc/ssl/caClientCert.pem

With the above configuration, MongoDB SSL/TLS connections require valid certificates from the clients and the client must specify SSL/TLS connection and present the certificate key files.

In the above configuration, we use net.tls which exists on MongoDB 4.2. For the above version, we can use net.ssl configuration as shown below:

net:
   ssl:
      mode: requireSSL
      PEMKeyFile: /etc/ssl/mongodb.pem

And adding Client Certificate Validation is similar with the net.tls configuration. Just add the parameter CAFile as shown below:

net:
   ssl:
      mode: requireSSL
      PEMKeyFile: /etc/ssl/mongodb.pem
      CAFile: /etc/ssl/caClientCert.pem

Data Encryption at Rest

Talking about data encryption at rest, there are several methods of MongoDB data encryption which are:

Database Storage Engine encryption

MongoDB provides native encryption on the WiredTiger storage engine. The data rest encryption requires two keys protection for the data, which are master key used for encrypting the data and master key used to encrypt the database keys. The encryption uses AES256-CBC Advanced Encryption Standard. It uses asymmetric keys which is the same key for encrypting and decrypting the data. It is available only in Enterprise Edition starting from version 3.2 and above.

Percona Server for MongoDB has data encryption at rest which comes as part of the open source server, introduced from version 3.6. The current release does not include the Key Management Interoperability Protocol (KMIP) or Amazon KMS. We can use a local keyfile or a third party key management server such as Hashicorp Vault.

The parameter in Percona Server for MongoDB related to encryption is encryptionCipherMode which we can configure by choosing one of the following cipher modes:

AES256-CBC
AES256-GCM

The default cipher is AES256-CBC if you did not explicitly apply one of the above. We can enable data at rest encryption on the new Percona Server for MongoDB installation, but it does not support existing MongoDB services.

Disk/Storage Encryption

The storage encryption is the encryption of the storage media. We can use Linux based disk encryption such as LUKS to encrypt the data volume of the disk, or if we use a cloud environment, there might be an encryption option. For instance, in AWS it is possible to have encrypted block storage as well as S3 storage.

API based Encryption

The API based encryption uses third party encryption software or the application provides an algorithm to encrypt the data before it is stored in the MongoDB database. The entire process is handled by the application layer.

Tags:

↧

MongoDB Database Deployment Automation

May 3, 2021, 5:34 am

≫ Next: MongoDB Replication Best Practices

≪ Previous: How to use Encryption to Protect MongoDB data

Organizations are making use of infrastructure in the cloud because it offers speed, flexibility, and scalability. You can imagine if we can spin up a new database instance with just a click, and it takes a couple of minutes until it is ready, we also can deploy the application faster than when compared to on-prem environment.

Unless you are using MongoDB’s own cloud service, the major cloud providers do not offer a managed MongoDB service so it is not really a one-click operation to deploy a single instance or cluster. The common way is to spin up VMs and then deploy them on these. The deployment needs to be taken care of from A to Z - we need to prepare the instance, install the database software, tune some configurations and secure the instance. These tasks are essential, although they are not always properly followed - with potentially disastrous consequences.

Automation plays an important role in making sure all tasks starting from installation, configuration, hardening and until the database service is ready. In this blog, we will discuss deployment automation for MongoDB.

Software Orchestrator

There is a lot of new software tooling to help engineers to deploy and manage their infrastructure. Configuration management helps engineers deploy faster and effectively, reducing deployment time for new services. Popular options include Ansible, Saltstack, Chef, and Puppet. Every product has advantages and disadvantages, but they all work very well and are hugely popular. Deploying a stateful service like a MongoDB ReplicaSet or Sharded Cluster can be a bit more challenging since these are multi-server setups and the tools have poor support for incremental and cross node coordination. Deployment procedures usually call for orchestration across nodes, with tasks carried out in a specific order.

MongoDB Deployment Tasks to Automate

Deployment of a MongoDB server involves a number of things; add MongoDB repository into local, install MongoDB package, configure port, username, and start the service.

Task: install MongoDB

- name: install mongoDB
  apt: 
    name: mongodb
    state: present
    update_cache: yes

Task: copy the mongod.conf from configuration file.

- name: copy config file
  copy:
    src: mongodb.conf
    dest: /etc/mongodb.conf
    owner: root
    group: root
    mode: 0644
  notify:
    - restart mongodb

Task: create MongoDB limit configuration:

- name: create /etc/security/limits.d/mongodb.conf
  copy:
    src: security-mongodb.conf
    dest: /etc/security/limits.d/mongodb.conf
    owner: root
    group: root
    mode: 0644
  notify:
    - restart mongodb

Task: configuring swappiness

- name: config vm.swappiness
  sysctl:
    name: vm.swappiness
    value: '10'
    state: present

Task: configure TCP Keepalive time

- name: config net.ipv4.tcp_keepalive_time
  sysctl:
    name: net.ipv4.tcp_keepalive_time
    value: '120'
    state: present

Task: ensure MongoDB will automatically start

- name: Ensure mongodb is running and and start automatically on reboots
  systemd:
    name: mongodb
    enabled: yes
    state: started

We can combine all of these tasks into a single playbook and run the playbook to automate the deployment. If we run an Ansible playbook from the console:

$ ansible-playbook -b mongoInstall.yml

We will see the progress of deployment from our Ansible script, the output should be something like below:

PLAY [ansible-mongo] **********************************************************

GATHERING FACTS ***************************************************************
ok: [10.10.10.11]

TASK: [install mongoDB] *******************************************************
ok: [10.10.10.11]

TASK: [copy config file] ******************************************************
ok: [10.10.10.11]

TASK: [create /etc/security/limits.d/mongodb.conf]*****************************
ok: [10.10.10.11]


TASK: [config vm.swappiness] **************************************************
ok: [10.10.10.11]

TASK: [config net.ipv4.tcp_keepalive_time]*************************************
ok: [10.10.10.11]

TASK: [config vm.swappiness] **********************************************
ok: [10.10.10.11]

PLAY RECAP ********************************************************************
[10.10.10.11]          : ok=6    changed=1    unreachable=0    failed=0

After the deployment, we can check the MongoDB service on target server.

Deployment Automation of MongoDB using ClusterControl GUI

There are two ways to deploy MongoDB using ClusterControl. We can use it from the dashboard of ClusterControl, it is GUI-based and just needs 2 dialoges until it triggers a new job for new deployment of MongoDB.

First we need to fill the SSH User and password, fill the Cluster Name as shown below:

And then, choose the vendor and version of MongoDB, define the user and password, and the last is fill the target IP Address

Deployment Automation of MongoDB using s9s CLI

From the command line interface, one can use the s9s tools. The deployment of MongoDB using s9s is just a one line command as below:

$ s9s cluster --create --cluster-type=mongodb --nodes="10.10.10.15"  --vendor=percona --provider-version=4.2 --db-admin-passwd="12qwaszx" --os-user=vagrant --cluster-name="MongoDB" --wait
Create Mongo Cluster
/ Job 183 FINISHED   [██████████] 100% Job finished.

So deploying MongoDB, whether it is a ReplicaSet or a Sharded Cluster, is very easy, and is completely automated by ClusterControl.

Tags:

deployment

database automation

↧

MongoDB Replication Best Practices

June 8, 2021, 7:13 am

≫ Next: How to configure SELinux for MongoDB Replica Sets

≪ Previous: MongoDB Database Deployment Automation

Production database systems often require high availability architectures. This is to ensure the availability of the data through redundancy across multiple instances. However, it is important to understand how to deploy and configure the database cluster in order to achieve high availability.

MongoDB provides database replication in two different architectures - Replica Set, and Sharded Cluster.

MongoDB uses asynchronous replication to distribute the data to secondary nodes, using the oplog (operation logs), the transaction log for write operations in the database. After the data is written into the disk in the Primary node, it writes the oplog to a special capped collection that keeps rolling records for a write operation (insert, update, delete). In this blog, we will review some options and parameters for MongoDB Replication.

Transaction Log Parameters

MongoDB replication requires at least three nodes for the automatic election process in case the primary node goes down. The replication of MongoDB relies on the oplog, for the data synchronization from Primary to Secondary nodes. The default oplog size for the WiredTiger storage engine is 5% of the free disk space.

We can tune the oplog size by specifying the parameter oplogSizeMB based on the predicted workload of the system. Setting the minimum retention period of the oplog to limit the old entries of the transaction log to be truncated, we can set the parameter storage.oplogMinRetentionHours in mongod configuration files.

Read and Write Configuration

MongoDB comes with Write and Read configuration settings, which you can configure depending on the needs from the application side. We can tune parameters for the written acknowledgment from primary to secondary. We can also set the Read option to distribute read traffic across all nodes of a Replica Set.

The write concern in MongoDB gives the option on how the data is acknowledged for write operations, both in Replica Set or Sharded Cluster architectures.

The typical configuration of write concern is shown below:

{ w: <value>, j: <boolean>, wtimeout: <number> }

For example, if we want to have the data written to the majority of nodes before the timeout, we can configure as below:

{ writeConcern: { w: majority, wtimeout: 2000 } }

It will synchronize the data before the timeout happens, in this case, the timeout is 2 seconds.

Another value of the writeConcern parameter besides the majority, are:

w: 1, this value acknowledges the write operation in a standalone MongoDB node or in the Primary node of Replica Set / Sharded Cluster.

w: 0, there is no acknowledgment of the write operation in this value.

Write concern above the value of 1 requires acknowledgement from the Primary and as many as the secondaries required to meet the write concern. For example, if we set the write concern of 4 in a MongoDB ReplicaSet with 7 nodes (1 Primary and 6 Secondary), it requires acknowledgement from at least 4 nodes (1 Primary and 3 Secondary) before it is returned as a successful write operation to the application.

Another option is the j option, which is the acknowledgment that the data that has been written in the disk journal. In a Replica Set configuration, there are 3 options that explain how to write the data in the journal as below:

j: true the configuration requires writing the data into the journal disk

j: false the configuration requires writing of the data in memory.

For example, the configuration with the j option as below:

{ writeConcern: { w: majority, j: true, wtimeout: 2000 } }

The last option is if the j parameter is not specified in the parameter, it depends on the parameter of writeConcernMajorityJournalDefault. If the parameter is true, then it will require writing on the disk journal, while if it is false then only requires writing on the memory.

MongoDB sends the read request to the Primary node by default. There are some options on how we can configure read requests in a Replica Set architecture. MongoDB provides a parameter called readPreference which have many options as below:

Primary: The default value of the read preference, all the read traffic will be sent to the primary.
primaryPreferred: The read traffic will be sent to the primary node most of the time, unless the primary node is not available it will reroute to the secondary node.
Secondary: The read traffic will be sent to the secondary node.
secondaryPreferred: The read traffic will be sent to the secondary node, unless the secondary node is not available it will reroute to the primary node.
Nearest: The read request will be sent to the nearest reachable node based on the latency threshold.

Secondary Nodes

The secondary node participates in a primary election process, so eventually, the secondary nodes will be promoted if the primary goes down. There are various types of secondary nodes in a MongoDB Replica Set architecture.

Secondary nodes

The “normal” secondary nodes, which are participating in the election process of the primary nodes and can accept the traffic from the application.

Arbiter nodes

The arbiter node in a Replica Set acts as a voter who participates in the election of the Primary. The node can not be elected as a Primary because it does not store any data. The arbiter node is used if you only run one Primary and one Secondary node, because when there are just two nodes, MongoDB will not elect the Secondary unless it has the majority of votes ( > 50%). The configuration of the MongoDB arbiter is similar to the other nodes in terms of installation, specify the Replica Set name, the difference is that we need to add rs.addArb command when joining the arbiter node to the Primary node.

Hidden nodes

We can set the secondary nodes as a hidden member node in MongoDB Replica Set. The hidden nodes will not be visible in the MongoDB driver, so there will not be traffic sent to the hidden nodes. The priority of the hidden nodes themselves is 0, so it will not participate in the election process of a Primary. The parameter that needs to be configured for the hidden nodes is: "hidden": true

We can use the hidden node as a special purpose node for data ingestion, analytics or backup node.

Delayed nodes

The delayed node must be a hidden node. The data in the hidden node reflects the earlier data from the Primary nodes based on the delayed time we configured. For example, if we configure the delayed time to be 30 minutes, the data will reflect the state of the Primary node as it was 30 minutes before. We can configure the delayed node for 30 minutes as shown below:

cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
cfg.members[0].slaveDelay = 1800
rs.reconfig(cfg)

The purpose of the delayed node is for fast recovery of the data if we delete or update some incorrect data accidentally from the Primary, we can immediately recover the data from the delayed nodes. It will reduce the recovery time compared to restoring the data from backup.

Node with Priority 0

The Secondary nodes with Priority 0 are the “normal” nodes but the difference is that they will not participate in the election process. The configuration is really straightforward, we just need to set the priority as shown below:

cfg = rs.conf()
cfg.members[0].priority = 0
rs.reconfig(cfg)

The use case of the Node with Priority 0 is when we want to have DR (Disaster Recovery) site on the other datacenter, we can replicate the data and set the priority to become 0.

Tags:

replication

high availability

↧

How to configure SELinux for MongoDB Replica Sets

June 17, 2021, 6:55 am

≫ Next: How to Configure AppArmor for MongoDB Replica Sets?

≪ Previous: MongoDB Replication Best Practices

By 2025, the world will store approximately 200 zettabytes of data. That data will be stored either in public, private, on-premises or cloud storage, PCs, laptops, smartphones and also Internet-of-Things (IoT) devices. On projection, the number of internet-connected devices is also expected to increase to almost 75 billion in 2025. For some of us or people with less IT background, those numbers are nothing. But for security enthusiasts, this is worrying because more and more data is at risk.

In the world of open-source technology and databases, security is one of the important topics. Every now and then there will be a lot of new inventions and developments associated with security. One of them is Security-Enhanced Linux or (SELinux) for short, which was developed nearly 21 years ago by the United States National Security Agency (NSA). Even though this has been introduced so many years ago, it has evolved rapidly and extensively used as one of the security measures for the Linux system. While it’s fairly not easy to find information on how to configure it with a database, MongoDB has taken advantage of this. In this blog post, we will go through SELinux and how to configure it in MongoDB replica sets.

For this purpose, we are going to use 3 CentOS 8 VMs for our test environment and use MongoDB 4.4. Before we get going, let’s dive a little deeper into SELinux.

Enforcing, Permissive and Disabled Mode

These are the three modes that SELinux can run at any given time. Of course, all of them have their own function and purpose in terms of the security policy. We will go through it one by one.

When in enforcing mode, any configured policy will be enforced on the system and every single unauthorized access attempt both by users or processes are denied by SELinux. Not only that, those access denied actions also will be recorded in the related log files. While this is the most recommended mode, most of the Linux systems nowadays do not have this mode enabled by the system administrator due to various reasons like the complexity of the SELinux itself.

For permissive mode, we can safely say that SELinux is in a semi-enabled state. In this mode, no policy will be applied by SELinux, at the same time no access will be denied. Despite that, any policy violation is still recorded and logged in the audit logs. Typically, this mode is used to test SELinux before finalizing and proceeding to enforce it.

For the last mode which is disabled, no enhanced security is running on the system. Do you know what SELinux mode your system runs now? Simply run the following command to see:

$ sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      33

For our test systems, SELinux has been enabled and configured as enforcing so that we could proceed with the rest of the guide. In case SELinux is disabled or permissive in your system, you may follow the steps below to enable it and change to enforcing.

Edit /etc/selinux/config file to change the directive to enforcing

vi /etc/sysconfig/selinux
...
SELINUX=enforcing
…

You need to make sure the above directive is set to enforcing.

Reboot the system to apply the setting

$ reboot

Once the system is online, we need to confirm that SELinux has been configured correctly and the change has taken place. Run the following command to check, this is another way to check it apart from the first one that I mentioned earlier (sestatus).

$ getenforce

Enforcing

Once we see the word “Enforcing”, we can now confirm this is good to go. Since we are going to use a replica set, we need to make sure the SELinux has been configured on all MongoDB nodes. I believe this is the most important part that we should cover before we proceed with configuring SELinux for MongoDB.

Recommended “ulimit” Settings

In this example, we assume that MongoDB 4.4 has been installed on 3 nodes. The installation is very simple and easy, to save our time we are not going to show you the steps but here is the link to the documentation.

In some cases, “ulimit” of the system will cause a few issues if the limits have a low default value. In order to make sure MongoDB runs correctly, we highly recommend setting the “ulimit” as per MongoDB recommendation here. While every deployment may have its unique requirements or settings, it’s best to follow the following “ulimit” settings:

-f (file size): unlimited
-t (cpu time): unlimited
-v (virtual memory): unlimited
-l (locked-in-memory size): unlimited
-n (open files): 64000
-m (memory size): unlimited
-u (processes/threads): 64000

To change the “ulimit” value, simply issue the following command, for example, changing the value for “-n” (open files):

$ ulimit -n 64000

After all the limits are changed, the mongod instance has to be restarted to ensure the new limit changes take place:

$ sudo systemctl restart mongod

Configuring SELinux

According to the MongoDB documentation, the current SELinux Policy does not allow the MongoDB process to access /sys/fs/cgroup, which is required to determine the available memory on your system. So for our case, in which SELinux is in enforcing mode, the following adjustment has to be made.

Permit Access to cgroup

The first step is to ensure that our system has the “checkpolicy” package installed:

$ sudo yum install checkpolicy

yum install checkpolicy

Last metadata expiration check: 2:13:40 ago on Fri 11 Jun 2021 05:32:10 AM UTC.

Package checkpolicy-2.9-1.el8.x86_64 is already installed.

Dependencies resolved.

Nothing to do.

Complete!

Next, we need to create a custom policy file for “mongodb_cgroup_memory.te”:

cat > mongodb_cgroup_memory.te <<EOF
module mongodb_cgroup_memory 1.0;
require {
      type cgroup_t;
      type mongod_t;
      class dir search;
      class file { getattr open read };
}

#============= mongod_t ==============
allow mongod_t cgroup_t:dir search;
allow mongod_t cgroup_t:file { getattr open read };
EOF

After the policy file is created, the last steps are to compile and load the custom policy module by running these three commands:

$ checkmodule -M -m -o mongodb_cgroup_memory.mod mongodb_cgroup_memory.te
$ semodule_package -o mongodb_cgroup_memory.pp -m mongodb_cgroup_memory.mod
$ sudo semodule -i mongodb_cgroup_memory.pp

The last command should take a while and once it’s done, the MongoDB process should be able to access the correct files with SELinux enforcing mode.

Permit Access to netstat for FTDC

/proc/net/netstat is required for Full Time Diagnostic Data Capture (FTDC). FTDC in short is a mechanism to facilitate analysis of the MongoDB server. The data files in FTDC are compressed, not human-readable, and inherit the same file access permission as the MongoDB data files. Due to this, only users with access to the FTDC data files can transmit the data.

The steps to configure it are almost identical to the previous one. It’s just that the custom policy is different.

$ sudo yum install checkpolicy
Create a custom policy file “mongodb_proc_net.te”:
cat > mongodb_proc_net.te <<EOF
module mongodb_proc_net 1.0;
require {
    type proc_net_t;
    type mongod_t;
    class file { open read };
}

#============= mongod_t ==============
allow mongod_t proc_net_t:file { open read };
EOF

The last steps are to compile and load the custom policy:

$ checkmodule -M -m -o mongodb_proc_net.mod mongodb_proc_net.te
$ semodule_package -o mongodb_proc_net.pp -m mongodb_proc_net.mod
$ sudo semodule -i mongodb_proc_net.pp

Custom MongoDB Directory Path

One important thing to note is that if you installed MongoDB in the custom directory, you will need to customize the SELinux policy as well. The steps are slightly different from the previous one but it’s not too complex.

First, we need to update the SELinux policy to allow the mongod service to use the new directory, it’s worth to note that we need to make sure to include the .* at the end of the directory :

$ sudo semanage fcontext -a -t <type> </some/MongoDB/directory.*>

mongod_var_lib_t for data directory
mongod_log_t for log file directory
mongod_var_run_t for pid file directory

Then, update the SELinux user policy for the new directory:

$ sudo chcon -Rv -u system_u -t <type> </some/MongoDB/directory>

mongod_var_lib_t for data directory
mongod_log_t for log directory
mongod_var_run_t for pid file directory

The last step is to apply the updated SELinux policies to the directory:

restorecon -R -v </some/MongoDB/directory>

Since MongoDB is using the default path for both data and log files, we can take a look at the following examples on how to apply it:

For non-default MongoDB data path of /mongodb/data:

$ sudo semanage fcontext -a -t mongod_var_lib_t '/mongodb/data.*'
$ sudo chcon -Rv -u system_u -t mongod_var_lib_t '/mongodb/data'
$ restorecon -R -v '/mongodb/data'
For non-default MongoDB log directory of /mongodb/log (e.g. if the log file path is /mongodb/log/mongod.log):

$ sudo semanage fcontext -a -t mongod_log_t '/mongodb/log.*'
$ sudo chcon -Rv -u system_u -t mongod_log_t '/mongodb/log'
$ restorecon -R -v '/mongodb/log'

Custom MongoDB Port

For some situations, some of the MongoDB installations are using a different port number other than the default one which is 27017. In this particular case, we need to configure SELinux as well and the command is pretty simple:

$ sudo semanage port -a -t mongod_port_t -p tcp <portnumber>
For example, we are using port 37017:
$ sudo semanage port -a -t mongod_port_t -p tcp 37017

Deploying MongoDB SELinux Enabled With ClusterControl

With ClusterControl, you have the option to enable SELinux during the deployment of your MongoDB replica set. However, you still need to change the mode to enforcing since ClusterControl only sets it to permissive. To enable it during deployment, you may untick “Disable AppArmor/SELinux” as per the screenshot below.

After that, you may continue and add the nodes for your MongoDB replica set and start the deployment. In ClusterControl, we are using version 4.2 for MongoDB.

Once the cluster is ready, we need to change the SELinux to enforcing for all nodes and proceed with configuring it by referring to the steps that we went through just now.

Conclusion

There are 3 modes of SELinux available for any Linux system. For SELinux enforcing mode, there are a few steps that need to be followed in order to make sure MongoDB runs without any issues. It’s also worth noting that some of the “ulimit” settings need to be changed as well to suit the system requirements as well as specs.

With ClusterControl, SELinux can be enabled during the deployment, however you still need to change to enforcing mode and configure the policy after the replica set is ready.

We hope that this blog post will help you setting up SELinux for your MongoDB servers

Tags: