“Cloud security is simple, absolutely simple. Stop over complicating it.”
This is how I kicked off a presentation I gave at the CyberRisk Alliance, Cloud Security Summit on Apr 17 of this year. And I truly believe that cloud security is simple, but that does not mean easy. You need the right strategy.
As I am often asked about strategies for the cloud, and the complexities that come with it, I decided to share my recent talk with you all. Depending on your preference, you can either watch the video below or read the transcript of my talk that’s posted just below the video. I hope you find it useful and will enjoy it. And, as always, I’d love to hear from you, find me @marknca.
[embedded content]
For those of you who prefer to read rather than watch a video, here’s the transcript of my talk:
Cloud security is simple, absolutely simple. Stop over complicating it.
Now, I know you’re probably thinking, “Wait a minute, what is this guy talking about? He is just off his rocker.”
Remember, simple doesn’t mean easy. I think we make things way more complicated than they need to be when it comes to securing the cloud, and this makes our lives a lot harder than they need to be. There’s some massive advantages when it comes to security in the cloud. Primarily, I think we can simplify our security approach because of three major reasons.
The first is integrated identity and access management. All three major cloud providers, AWS, Google and Microsoft offer fantastic identity, and access management systems. These are things that security, and [inaudible 00:00:48] professionals have been clamouring for, for decades.
We finally have this ability, we need to take advantage of it.
The second main area is the shared responsibility model. We’ll cover that more in a minute, but it’s an absolutely wonderful tool to understand your mental model, to realize where you need to focus your security efforts, and the third area that simplifies security for us is the universal application of APIs or application programming interfaces.
These give us as security professionals the ability to orchestrate. and automate a huge amount of the grunt work away. These three things add up to, uh, the ability for us to execute a very sophisticated, uh, or very difficult to pull off, uh, security practice, but one that ultimately is actually pretty simple in its approach.
It’s just all the details are hard and we’re going to use these three advantages to make those details simpler. So, let’s take a step back for a second and look at what our goal is.
What is the goal of cybersecurity? That’s not something you hear quite often as a question.
A lot of the time you’ll hear the definition of cybersecurity is, uh, about, uh, securing the confidentiality, integrity, and availability of information or data. The CIA triad, different CIA, but I like to phrase this in a different way. I think the goal is much clearer, and the goal’s much simpler.
It is to make sure that whatever you’re building works as intended and only as intended. Now, you’ll realize you can’t accomplish this goal just as a security team. You need to work with your, uh, developers, you need to work with operations, you need to work with the business units, with the end users of your application as well.
This is a wonderful way of phrasing our goal, and realizing that we’re all in this together to make sure whatever you’re building works as intended, and only as intended.
Now, if we move forward, and we look at who are we up against, who’s preventing our stuff from working, uh, well?
You look at normally, you think of, uh, who’s attacking our systems? Who are the risks? Is it nation states? Is it maybe insider threats? While these are valid threats, they’re really overblown. You’re… don’t have to worry about nation state attacks.
If you’re a nation state, worry about it. If you’re not a nation state, you don’t have to worry about it because frankly, there’s nothing you can do to stop them. You can slow them down a little bit, but by definition, they’re going to get through your resources.
As far as insider attacks, this is an HR problem. Treat your people well. Um, check in with them, and have a strong information management policy in place, and you’re going to reduce this threat naturally. If you go hunting for people, you’re going to create the very threats that you’re looking at.
So, it brings us to the next set. What about cyber criminals? You know, we do have to worry about cyber criminals.
Cyber criminals are targeting systems simply because these systems are online, these are profit motivated criminals who are organized, and have a good set of tools, so we absolutely need to worry about them, but there’s a more insidious or more commonplace, maybe a simpler threat that we need to worry about, and that’s one of mistakes.
The vast majority of issues that happen around data breaches around security vulnerabilities in the cloud are mistake driven. In fact, to the point where I would not even worry about cyber criminals simply because all the work we’re going to do to focus on, uh, preventing mistakes.
And catching, and rectifying the stakes really, really quickly is going to uh, you a cover all the stuff that we would have done to block out cyber criminals as well, so mistakes are very common because people are using a lot more services in the cloud.
You have a lot more, um, parts and moving, uh, complexity in your deployment, um, and you’re going to make a mistake, which is why you need to put automated systems in place to make sure that those mistakes don’t happen, or if they do happen that they’re caught very, very quickly.
This applies to standard DevOps, the philosophies for building. It also applies to security very, very wonderfully, so this is the main thing we’re going to focus on.
So, if we look at that sum up together, we have our goal of making sure whatever we’re building works as intended, and only as intended, and our major issue here, the biggest risk to this is simple mistakes and misconfigurations.
Okay, so we’re not starting from ground zero here. We can learn from others, and the first place we’re going to learn is the shared responsibility model. The shared responsibility applies to all cloud service providers.
If you look on the left hand side of the slide here, you’ll see the traditional on premise model. We roughly have six areas where something has to be done roughly daily, whether it’s patching, maintenance, uh, just operational visibility, monitoring, that kind of thing, and in a traditional on premise environment, you’re responsible for all of it, whether it’s your team, or a team underneath your organization.
Somewhere within your tree, people are on the hook for doing stuff daily. Here when we move into an infrastructure, so getting a virtual machine from a cloud provider right off the bat, half of the responsibilities are pushed away.
That’s a huge, huge win.
And, as we move further and further to the right to more managed service, or staff level services, we have less and less daily responsibilities.
Now, of course, you always still have to verify that the cloud service provider’s doing what they, uh, say they’re doing, which is why certifications and compliance frameworks come into play, uh, but the bottom line is you’re doing less work, so you can focus on fewer areas.
Um, that is, or I should say not less work, but you’re doing, uh, less broad of a work.
So you can have that deeper focus, and of course, you always have to worry about service configuration. You are given knobs and dials to turn to lock things down. You should use them like things like encrypting, uh, all your data at rest.
Most of the time it’s an easy check box, but it’s up to you to check it ‘cause it’s your responsibility.
We also have the idea of an adoption framework, and this applies for Azure, for AWS and for Google, uh, and what they do is they help you map out your business processes.
This is important to security, because it gives you the understanding of where your data is, what’s important to the business, where does it lie, who needs to touch it, and access it and process it.
That also gives us the idea, uh, or the ability to identify the stakeholders, so that we know, uh, you know, who’s concerned about this data, who is, has an investment in this data, and finally it helps to, to deliver an action plan.
The output of all of these frameworks is to deliver an action plan to help you migrate into the cloud and help you to continuously evolve. Well, it’s also a phenomenal map for your security efforts.
You want to prioritize security, this is how you do it. You get it through the adoption framework, understanding what’s important to the business, and that lets you identify critical systems and areas for your security.
Again, we want to keep things simple, right? And, the third, uh, the o- other things we want to look at is the CIS foundations. They have them for AWS, Azure and GCP, um, and these provide a prescriptive guidance.
They’re really, um, a strong baseline, and a checklist of tasks that you can accomplish, um, or take on, on your, uh, take on, on your own, excuse me, uh, in order to, um, you know, basically cover off the really basics is encryption at rest on, um, you know, do I make sure that I don’t have, uh, things needlessly exposed to the internet, that type of thing.
Really fantastic reference point and a starting point for your security practice.
Again, with this idea of keeping things as simple as possible, so when it comes to looking at our security policy, we’ve used the frameworks, um, and the baseline to kind of set up a strong, uh, start to understand, uh, where the business is concerned, and to prioritize.
And, the first question we need to ask ourselves as security practitioners, what happened? If we, if something happens, and we ask what happened?
Do we have the ability to answer this question? So, that starts us off with logging and auditing. This needs to be in place before something happened. Let me just say that again, before something happened, you need [laughs] to be able to have this information in place.
Now, uh, this is really, uh, to ask these key questions of what happened in my account, and who, or what made that thing happen?
So, this starts in the cloud with some basic services. Uh, for AWS it’s cloud trail, for Azure, it’s monitor, and for Google Cloud it used to be called Stackdriver, it is now the Google Cloud operations suite, so these need to be enabled on at full volume.
Don’t worry, you can use some lifecycle rules on the data source to keep your costs low.
But, this gives you that layer, that basic auditing and logging layer, so that you can answer that question of what happened?
So, the next question you want to ask yourself or have the ability to answer is who’s there, right? Who’s doing what in my account? And, that comes down to identity.
We’ve already mentioned this is one of the key pillars of keeping security simple, and getting that highly effective security in your cloud.
[00:09:00] So here you’re answering the questions of who are you, and what are you allowed to do? This is where we get a very simple privilege, uh, or principle in security, which is the principle of least privilege.
You want to give an identity, so whether that’s a user, or a role, or a service, uh, only the privileges they, uh, require that are essential to perform the task that, uh, they are intended to do.
Okay?
So, basically if I need to write a file into a storage, um, folder or a bucket, I should only have the ability to write that file. I don’t need to read it, I don’t need to delete it, I just need to write to it, so only give me that ability.
Remember, that comes back to the other pillar of simple security here of, of key cloud security, is integrated identity.
This is where it really takes off, is that we start to assign very granular access permissions, and don’t worry, we’re going to use the APIs to automate all this stuff, so that it’s not a management headache, but the principle of these privilege is absolutely critical here.
The services you’re going to be using, amazingly, all three cloud providers got in line, and named them the same thing. It’s IAM, identity access management, whether that’s AWS, Azure or Google Cloud.
Now, the next question we’re going to a- ask ourselves are the areas where we’re going to be looking at is really where should I be focusing security controls? Where should I be putting stuff in place?
Because up until now we’ve really talked about leveraging what’s available from the cloud service providers, and you absolutely should available, uh, maximize your usage of their, um, native and primitive, uh, structures primitive as far as base concepts, not as, um, refined.
They’re very advanced controls and, but there are times where you’re going to need to put in your own controls, and these are the areas you’re going to focus on, so you’re going to start with networking, right?
So, in your networking, you’re going to maximize the native structures that are available in the cloud that you’re in, so whether that’s a project structure in Google Cloud, whether that’s a service like transit gateway in AWS, um, and all of them have this idea of a VPC or virtual private cloud or virtual network that is a very strong boundary for you to use.
Remember, most of the time you’re not charged for the creation of those. You have limits in your accounts, but accounts are free, and you can keep adding more, uh, virtual networks. You may be saying, wait a minute, I’m trying to simplify things.
Actually, having multiple virtual networks or virtual private clouds ends up being far simpler because each of them has a task. You go, this application runs in this virtual private cloud, not a big shared one in this specific VPC, and that gives you this wonderfully strong security boundaries, and a very simple way of looking at one VPC, one action, very much the Unix philosophy in play.
Key here though is understanding that while all of the security controls in place for your service provider, um, give you, so, you know, whether it’s VPCs, routing tables, um, uh, access control lists, security groups, all the SDN features that they’ve got in place.
These really help you figure out whether service A or system A is allowed to talk to B, but they don’t tell you what they’re saying.
And, that’s where additional controls called an IPS, or intrusion prevention system come into play, and you may want to look at getting a third party control in to do that, because none of the th- big three cloud providers offer an IPS at this point.
[00:12:00] But that gives you the ability to not just say, “Hey, you’re allowed to talk to each other.” But, to monitor that conversation, to ensure that there’s not malicious code being passed back and forth between systems that nobody’s trying a denial of service attack.
A whole bunch of extra things on there have, so that’s where IPS comes into play in your network defense. Now, we look at compute, right?
We can have compute in various forms, whether that’s in serverless functions, whether that’s in containers, manage containers, whether that’s in traditional virtual machines, but all the principles are the same.
You want to understand where the shared responsibility line is, how much is on your plate, how much is on the CSPs?
You want to understand that you need to harden the EOS, or the service, or both in some cases, make sure that, that’s locked down, so have administrator passwords. Very, very complicated.
Don’t log into these systems, uh, you know, because you want to be fixing things upstream. You want to be fixing things in the build pipeline, not logging into these systems directly, and that’s a huge thing for, uh, systems people to get over, but it’s absolutely essential for security, and you know what?
It’s going to take a while, but there’s some tricks there you can follow with me. You can see, uh, on the slides, uh, at Mark, that is my social everywhere, uh, happy to walk you through the next steps.
This idea of this presentation’s really just the simple basics to start with, to give you that overview of where to focus your time, and, dispel that myth that cloud security is complicating things.
It is a huge path is simplicity, which is a massive lens, or for security.
So, the last area you want to focus here is in data and storage. Whether this is databases, whether this is big blob storage, or, uh, buckets in AWS, it doesn’t really matter the principles, again, all the same.
You want to encrypt your data at rest using the native cloud provided, uh, cloud service provider, uh, features functionality, because most of the time it’s just give it a key address, and give it a checkbox, and you’re good to go.
It’s never been easier to encrypt things, and there is no excuse for it and none of the providers charge extra for, uh, encryption, which is amazing, and you absolutely want to be taking advantage of that, and you want to be as granular as possible with your IAM, uh, and as reasonable, okay?
So, there’s a line here, and a lot of the data stores that are native to the cloud service providers, you can go right down to the data cell level and say, Mark has access, or Mark doesn’t have access to this cell.
That can be highly effective, and maybe right for your use case. It might be too much as well.
But, the nice thing is that you have that option. It’s integrated, it’s pretty straightforward to implement, and then, uh, when we look here, uh, sorry. and then, finally you want to be looking at lifecycle strategies to keep your costs under control.
Um, data really spins out of control when you don’t have to worry about capacity. All of the cloud service providers have some fantastic automations in place.
Basically, just giving you, uh, very simple rules to say, “Okay, after 90 days, move this over to cheaper storage. After 180 days, you know, get rid of it completely, or put it in cold storage.”
Take advantage of those or your bill’s going to spiral out of control, and, and that relates to availability ‘cause uh, uh, and reliability, ‘cause the more you’re spending on that kind of stuff, the less you have to spend on other areas like security and operational efficiency.
So, that brings us to our next big security question. Is this working?
[00:15:00] How do you know if any of this stuff is working? Well, you want to talk about the concept of traceability. Traceability is a, you know, somewhat formal definition, but for me it really comes down to where did this come from, who can access it, and when did they access it?
That ties very closely with the concept of observability. Basically, the ability to look at, uh, closed systems and to infer what’s going on inside based on what’s coming into that system, and what’s leaving that system, really what’s going on.
There’s some great tools here from the service providers. Again, you want to look at, uh, Amazon CloudWatch, uh, Azure Monitor and the Google Cloud operations, uh, suite. Um, and here this leads us to the key, okay?
This is the key to simplifying everything, and I know we’ve covered a ton in this presentation, but I really want you to take a good look at this slide, and again, hit me up, uh, @marknca, happy to answer any questions with, questions afterwards as well here, um, that this will really, really make this simple, and this will really take your security practice to the next level.
If the idea of something happened in your, cloud system, right? In your deployment, there’s a trigger, and then, it either is generating an event or a log.
If you go the bottom row here, you’ve got a log, which you can then react to in a function to deliver some sort of result. That’s the slow-lane on the bottom.
We’re talking minutes here. You also have the top lane where your trigger fires off an event, and then, you react to that with a function, and then, you get a result in the fast lane.
These things happen in seconds, sub-second time. You start to build out your security practice based on this model.
You start automating more and more in these functions, whether it’s, uh, Lambda, whether it’s Cloud Functions, whether it’s Azure Functions, it doesn’t matter.
The CSPs all offer the same core functionality here. This is the critical, critical success metric, is that when you start reacting in the fast lane automatically to things, so if you see that a security event is triggered from like your malware, uh, on your, uh, virtual machine, you can lock that off, and have a new one spin up automatically.
Um, if you’re looking for compliance stuff, the slow lane is the place to go, because it takes minutes.
Reactions happen up top, more, um, stately or more sedate things, so somebody logging into a system is both up top and down low, so up top, if you logged into a VPC or into, um, an instance, or a virtual machine, you’d have a trigger fire off and maybe ask me immediately, “Mark, did you log into the system? Uh, ‘cause you’re, you know, you’re not supposed to be.”
But then I’d respond and say, “Yeah, I, I did log in.” So, immediately you don’t have to respond. It’s not an incident response scenario, but on the bottom track, maybe you’re tracking how many times I’ve logged in.
And after the three or fourth time maybe someone comes by, and has a chat with me, and says, “Hey, do you keep logging into these systems? Can’t you fix it upstream in the deployment, uh, and build a pipeline ‘cause that’s where we need to be moving?”
So, you’ll find this balance, and this concept, I just wanted to get into your heads right now of automating your security practice. If you have a checklist, it should be sitting in a model like this, because it’ll help you, uh, reduce your workload, right?
The idea is to get as much automated possible, and keep things in very clear, and simple boundaries, and what’s more simple than having every security action listed as an automated function, uh, sitting in a code repository somewhere?
[00:18:00] Fantastic approach to modern security practice in the cloud. Very simple, very clear. Yes, difficult to implement. It can be, but it’s an awesome, simple mental model to keep in your head that everything gets automated as a function based on a trigger somewhere.
So, what are the keys to success? What are the keys to keeping this cloud security thing simple? And, hopefully you’ve realized the difference between a simple mental model, and the challenges, uh, in, uh, implementation.
It can be difficult. It’s not easy to implement, but the mental model needs to be kept simple, right? Keep things in their own VPCs, and their own accounts, automate everything. Very, very simple approach. Everything fits into this s- into this structure, so the keys here are remembering the goal.
Make sure that cybersecurity, uh, is making sure that whatever you build works as intended and only as intended. It’s understanding the shared responsibility model, and it’s really looking at, uh, having a plan through cloud adoption frameworks, how to build well, which is a, uh, a concept called the Well-Architected Framework.
It’s specific to AWS, but it’s generic, um, its principles, it can be applied everywhere. We didn’t cover it here, but I’ll put the links, um, in the materials for you, uh, as well as remembering systems over people, right?
Adding the right controls at the right time, uh, and then, finally observing and react. Be vigilant, practice. You’re not going to get this right out of the gates, uh, perfect.
You’re going to have to refine, iterate, and then it’s extremely cloud friendly. That is the cloud model is, get it out there, iterate quickly, but putting the structures in place, you’re not going to make sure that you’re not doing that in an insecure manner.
Thank you very much, uh, here’s a couple of links that’ll help you out before we take some Q&A here, um, trendmicro.com/cloud will get you to the products to learn more. We’re also doing this really cool streaming.
Uh, I host a show called Let’s Talk Cloud. Um, we uh, interview experts, uh, and have a great conversation around, um, what they’re talking about, uh, in the cloud, what they’re working on, and not just around security, but just in building in general.
You can hit that up at trendtalks.fyi. Um, and again, hit me up on social @marknca.
So, we have a couple of questions to kick this off, and you can put more questions in the webinar here, and they will send them along, or answer them in kind if they can.
Um, and that’s really what these are about, is the interaction is getting that, um, to and from. So, the first question that I wanted to tackle is an interesting one, and it’s really that systems over people.
Um, you heard me mention it in the, uh, in the end and the question is really what does that mean systems over people? Isn’t security really about people’s expertise?
And, yes and no, so if you are a SOC analyst, if you are working in a security, uh, role right now, I am really confident saying that 80%, 90% of what you do right now could be delegated out to a system.
So, if you were looking at log lines, and stuff that should be done by systems and bubble up, just the goal for you to investigate to do what people are good at in systems are bad at, so systems mean, uh, you know, putting in, uh, to build pipeline, putting in container scanning in the build pipeline, so that you have to manually scan stuff, right to get rid of the basics. Is that a pen test? 100% no.
Um, but it gets rid of that, hey, you didn’t upgrade to, um, you know, this version of this library.
[00:21:00] That’s all automated, and those, the more systems you get in place, the more you as a security professional, or your security team will be able to focus on where they can really deliver value and frankly, where it’s more interesting work, so that’s what systems over people mean, is basically automate as much as you can to get people doing what people are really good at, and to make sure that the systems catch what we make as mistakes all the time.
If you accidentally try to push an old build out, you know that systems should stop that, if you push a build that hasn’t been checked by that container scanning or by, um, you know, it doesn’t have the appropriate security policy in place.
Systems should catch all that humans shouldn’t have to worry about it at all. That’s systems over processing. You saw that on the, uh, keys to success slide here. I’ll just pull it up. Um, you know, is that, that’s absolutely key.
Another question that we had, uh, was what we didn’t get into here, which was around the Well-Architected Framework. Now, this is a document that was published by AWS, uh, a number of years back, and they’ve kept it going.
They’ve evolved it and essentially it has five pillars. Um, performance, efficiency, uh, op- reliability, security, cost optimization, and operational excellence. Hey, I’ve got all five.
Um, and really [laughs] what that is, is it’s about how to take advantage of these cloud tools.
Now, AWS publishes it, but honestly it applies to Azure, it applies to Google Cloud as well. It’s not service specific. It teaches you how to build in the cloud, and obviously security is one of those big pillars, but it’s… so talking about teaching you how to make those trade offs, how to build an innovation flywheel, so that you have an idea, test it, uh, get the feedback from it, and move forward.
Um, and that’s really, really key. Again, now you should be reading that even if you are an Azure, or GCP customer or, uh, that’s where you’re putting your most of your stuff, because it’s really about the principles, and everything we do, and encourage people to build well, it means that there’s less security issues, right?
Especially we know that the number one problem is mistakes.
That leads to the last question we have here, which is about that, how can I say that cyber criminals, you don’t need to worry about them.
You need to worry about mistakes? That’s a good question. It’s valid, and, um, Trend Micro does a huge amount of research around cyber criminals. I do a whole huge amount of research around cyber criminals.
Uh, my training, by training, and by professional experience. I’m a forensic investigator. This is what I do is take down cyber crimes. Um, but I think mistakes are the number one thing that we deal with in the cloud simply because of the underlying complexity.
I know it’s ironic, and to talk about simplicity, to talk about complexity, but the idea is, um, is that you look at all the major breaches, especially around s3 buckets, those are all m- based on mistake.
There’ve been billions, and billions, and billions of records, and, uh, millions of dollars of damage exposed because of simple mistakes, and that is far more common, uh, than cyber criminals.
And yes, cyber crimes you have [inaudible 00:23:32] worry. You have to worry about them, but everything you’re going to do to fix mistakes, and to put systems in place to stop those mistakes from happening is also going to be for your pr- uh, protection up against cyber criminals, and honestly, if you’re the guy who runs around your organization’s screaming about cyber criminals all the time, you’re far less credible than if you’re saying, “Hey, I want to make sure that we build really, really well, and don’t make mistakes.”
Thank you for taking the time. My name’s Mark Nunnikhoven. I’m the vice president of cloud research at Trend Micro. I’m also an AWS community hero, and I love this stuff. Hit me up on social @marknca. Happy to chat more.