Many tasks go into keeping your OpenShift container environment running optimally throughout the life of your operation. Some of the most important are listed below, but make sure to check out this comprehensive guide for all your maintenance items.
- Certificate management
- Environment health checks
- Environment backup/restore
- Adding hosts to the cluster
- Backing up projects
- Backing up persistent volumes
- Pruning images and containers
While you can run typical Day 2 operations directly from inside OpenShift, you’ll find that the built-in environment health checks lack some monitoring and security capabilities that you might like to add as an extra layer of insurance for your containers. We recommend trying Sysdig to secure your containers and monitor them for changes.
So, let’s backtrack for a moment. Why do we want to add these capabilities through an additional service? Isn’t the point of containers to simplify and accelerate our development capabilities?
Sysdig Monitor: The elements of monitoring containers
On one hand, containers provide flexibility and portability, but on the other hand, they complicate monitoring and troubleshooting. Why is that?
As isolated “black boxes,” containers make it difficult for traditional tools to penetrate their shells to observe processes and performance metrics. If we can use something to make our containers more transparent, especially if we’re orchestrating containers on an enterprise level, we can preempt issues before they arise and gain better insight into potential problems for simplified debugging. By using Sysdig, we continue to move toward our goal of simplification and acceleration.
Prometheus and Sysdig
OpenShift provides Prometheus metrics out of the box by attaching monitoring containers, called exporters, to your containers. To get a complete picture from the exporters, you must use an additional analytics platform, such as Grafana. While this provides satisfactory monitoring capabilities, it can roughly double the number of containers in your production environment once you account for each of the exporters.
Sysdig uses a system call monitoring approach which utilizes an agent for each host that collects metrics by observing all system calls traversing the OS kernel. This method drastically reduces the monitoring agent’s resource consumption and eliminates the need for per-container instrumentation. It allows you to see what’s happening inside a container from the outside. This approach also enables you to collect in-depth data for containers, short-lived processes, orchestration tools, and underlying infrastructure with little overhead.
Sysdig Monitor builds on OpenShift’s Prometheus monitoring: it ingests metrics from instrumented apps and supports advanced queries using the Prometheus query language (PromQL). Sysdig takes it a step further and tags every metric with all the available metadata to support exploring, aggregating, segmenting, and drilling down.
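As a rough illustration of what a PromQL query looks like in practice, the sketch below builds a per-pod CPU query and the URL for the standard Prometheus instant-query endpoint (`/api/v1/query`). The host name, the namespace, and the helper names are hypothetical, and the metric `container_cpu_usage_seconds_total` is assumed to be exposed by your cluster; this is not Sysdig-specific code.

```python
from urllib.parse import urlencode


def cpu_by_pod(namespace: str, window: str = "5m") -> str:
    """Build a PromQL expression for per-pod CPU usage in one namespace.

    Assumes the cluster exposes container_cpu_usage_seconds_total with
    `namespace` and `pod` labels.
    """
    return (
        "sum by (pod) (rate(container_cpu_usage_seconds_total"
        f'{{namespace="{namespace}"}}[{window}]))'
    )


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a Prometheus-compatible HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical endpoint and namespace, for illustration only.
query = cpu_by_pod("shop")
url = build_query_url("http://prometheus.example:9090/", query)
print(query)
print(url)
```

Keeping the query builder separate from the URL builder makes it easy to reuse the same expression in a dashboard or an alert rule.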
A key challenge with troubleshooting containers is that they may no longer exist after a problem occurs. Comprehensive container monitoring solutions should therefore be able to automatically record all of the activity that takes place on a system around an event. Capturing information such as commands, process details, network activity, and file system activity allows after-the-fact investigation outside the production environment. Sysdig provides this forensics support, even for containers that are already gone.
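The idea of recording activity around an event can be sketched with a small ring buffer: keep the last few events in memory, and when a trigger fires, save them together with the events that follow. This is only a conceptual sketch, not how Sysdig implements captures; the class name `EventRecorder` and its parameters are hypothetical.

```python
from collections import deque


class EventRecorder:
    """Keep a rolling window of events and save a capture around each trigger."""

    def __init__(self, before: int, after: int):
        self._recent = deque(maxlen=before)  # events seen just before a trigger
        self._after = after                  # events to keep after a trigger
        self._current = None                 # capture in progress, if any
        self._remaining = 0
        self.captures = []                   # finished captures

    def record(self, event, is_trigger: bool = False) -> None:
        if self._current is not None:
            # A capture is open: keep appending until enough events follow.
            self._current.append(event)
            self._remaining -= 1
        elif is_trigger:
            # Start a capture with the buffered history plus the trigger itself.
            self._current = list(self._recent) + [event]
            self._remaining = self._after
        if self._current is not None and self._remaining <= 0:
            self.captures.append(self._current)
            self._current = None
        self._recent.append(event)


recorder = EventRecorder(before=2, after=1)
for name in ["a", "b", "c"]:
    recorder.record(name)
recorder.record("X", is_trigger=True)
recorder.record("d")
recorder.record("e")
print(recorder.captures)
```

Because the history lives outside the container being observed, the capture survives even if the container that produced the events has already exited.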
Sysdig Secure: The elements of securing containers
Once a container is deployed, you need security measures in place that detect violations of expected activity and prevent potentially harmful system calls, processes, or network connections.
Container orchestration and container security
Container orchestration platforms such as Kubernetes and OpenShift come with the following security measures in place:
- Role-based access control (RBAC): RBAC provides the authorization and access control rules that define which actions are allowed on Kubernetes entities.
- Pod security policy: Using security policies, you can restrict the pods that will be allowed to run on your cluster. For example, you can configure resources, privileges, and sensitive configuration items.
- Network policy: A network policy is a specification of how groups of pods are allowed to communicate with each other and other network endpoints.
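To make the network policy item more concrete, here is a minimal sketch that builds a Kubernetes `NetworkPolicy` manifest (API group `networking.k8s.io/v1`) as a Python dictionary and prints it as JSON, which `oc apply -f` and `kubectl apply -f` both accept. The policy name, namespace, labels, and port are hypothetical examples.

```python
import json


def allow_from_pods(name, namespace, target_labels, source_labels, port):
    """Build a NetworkPolicy that lets only `source_labels` pods reach
    `target_labels` pods on one TCP port; other ingress to the targets is denied."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Pods allowed to connect.
                    "from": [{"podSelector": {"matchLabels": source_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical example: only frontend pods may reach the database on 5432.
policy = allow_from_pods(
    name="db-allow-frontend",
    namespace="shop",
    target_labels={"app": "db"},
    source_labels={"app": "frontend"},
    port=5432,
)
print(json.dumps(policy, indent=2))
```

Generating the manifest from code keeps the labels in one place, which helps when the same rule has to be repeated across several namespaces.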
Your security approach should address each of the following:
- Container resource consumption
- Container age and image consistency
- Credentials management
- Outside image credentials
- Runtime security monitoring
These topics, and others, are covered in the Running Containers in Production for Dummies e-book, available as a free download from Sysdig.
Sysdig and container resource consumption
Errors in creating containers or deliberate attacks can cause containers to consume more resources than intended, resulting in denial of service. Because an enterprise-level deployment can run a very large number of containers, these issues can escalate quickly. Sysdig helps you set resource consumption limits and monitor consumption against these limits over time to prevent such problems from occurring. You can also create consumption alerts in Prometheus, but Sysdig helps turn this continuous stream of information into a holistic view.
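Checking consumption against limits boils down to a simple comparison, sketched below. The example parses Kubernetes-style memory quantities (only a subset of suffixes is handled) and flags containers whose usage is at or above a chosen fraction of their limit. The container names, usage figures, and the 90% threshold are made-up assumptions, and the helper names are hypothetical.

```python
# Subset of Kubernetes quantity suffixes; others (Pi, Ei, m, ...) are not handled.
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '1G' to bytes."""
    # Try two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _UNITS[suffix])
    return int(quantity)  # plain number of bytes


def near_limit(usage_bytes, limits, threshold=0.9):
    """Return names of containers using at least `threshold` of their limit."""
    return sorted(
        name
        for name, used in usage_bytes.items()
        if name in limits and used >= threshold * parse_memory(limits[name])
    )


# Made-up sample data for illustration.
usage = {"api": 500 * 2**20, "db": 100 * 2**20}
limits = {"api": "512Mi", "db": "1Gi"}
print(near_limit(usage, limits))
```

In a real cluster the usage figures would come from your metrics source and the limits from each container's resource specification.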
Sysdig and images/containers monitoring
The longer you run software without updating it, the more likely it is to eventually be exploited. This means you could easily have older, vulnerable software running in production. You should continually monitor how long containers have been running in production and make sure they stay current. Make sure you use a vulnerability scanner to stay up to date on any potential security issues in the software you use. You can use Sysdig to help you do this. Sysdig integrates with CI/CD tools to introduce security elements into the pipeline early. This helps prevent vulnerable images from ever being deployed to production.
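Monitoring container age can be as simple as comparing each container's start time with a maximum allowed age. The sketch below assumes you already have the start times (for example, from your orchestrator's API); the container names, dates, and the 30-day policy are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_containers(started_at, now, max_age):
    """Return names of containers that have been running longer than `max_age`."""
    return sorted(
        name for name, started in started_at.items() if now - started > max_age
    )


# Made-up start times for illustration.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
started = {
    "web": datetime(2024, 5, 25, tzinfo=timezone.utc),    # 7 days old
    "batch": datetime(2024, 3, 1, tzinfo=timezone.utc),   # about 3 months old
}
print(stale_containers(started, now, max_age=timedelta(days=30)))
```

A container flagged here is a candidate for rebuilding from an updated, freshly scanned image rather than proof of a vulnerability.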
Sysdig and runtime security
Sometimes, in spite of best efforts, a corrupted or otherwise suspicious image can infiltrate your containers. For this situation, you will want a system in place that prevents such images from doing extensive damage across your containers. Sysdig provides powerful runtime security and forensics for your containers and microservices. You also get vulnerability management and image scanning that act like antivirus software for your containers and images.
The same agent that supports performance and health monitoring also supports Sysdig’s security capabilities. Because it collects data through system calls, it gives you significantly more signals about container, host and orchestrator activity in your environment. Sysdig Secure adds a continuous security element to your CI/CD pipeline so that your security features work and scale in conjunction with your containers.
Again, a characteristic of containers is that by the time you realize that there has been an issue, the container has already disappeared. Therefore, Sysdig Secure works with Sysdig Inspect to provide forensic analysis that allows you to analyze what happened before, during, and after the issue, even after the container is already gone.
Monitoring and Security as Day 2 Operations on OpenShift with Sysdig
While OpenShift comes with a lot of capabilities for monitoring and security, you will want an extra layer of both to make the job of monitoring and securing an enterprise-level container implementation simpler. Sysdig Monitor and Sysdig Secure work in conjunction with each other and with OpenShift to provide expanded Prometheus metrics, resource-efficient monitoring via system calls, and expanded runtime security, image scanning, and consumption monitoring to guard against possible attacks.
Sysdig helps give your OpenShift implementation sustainability beyond deployment. It helps measure and correlate incident responses, monitors and presents the entire container environment from both the outside and the inside, and incorporates continuous security into your CI/CD pipeline.