A discussion with Traceable AI CEO and co-founder Jyoti Bansal on the latest innovations for making APIs more understood, trusted, and robust.
Transcript
Dana Gardner: Hi, this is Dana Gardner, Principal Analyst at Interarbor Solutions, and you’re listening to BriefingsDirect.
The burgeoning use of application programming interfaces (APIs) across cloud-native computing and digital business ecosystems has accelerated rapidly due to the COVID-19 pandemic. Enterprises have had to scramble to develop and procure across new digital supply chains and via unproven business-to-business processes.
Companies have also extended their business perimeters to include home workers as well as to reach more purely online end-users and customers. In doing so, they may have given short shrift to protecting against the cybersecurity vulnerabilities inherent in the expanding use of APIs.
Stay with us now for Part 2 in our series that explores how APIs, microservices, and cloud-native computing require new levels of defense and resiliency.
To learn more about the latest innovations for making APIs more understood, trusted, and robust, please welcome our special guest, Jyoti Bansal, Chief Executive Officer and Co-Founder at Traceable.ai. Welcome back, Jyoti.
Jyoti Bansal: Good to be here again, Dana.
Gardner: In our last discussion, we learned how the exploding use of cloud-native apps and APIs has opened a new threat vector. As a serial start-up founder in Silicon Valley, as well as a tech visionary, what are your insights and experience telling you about the need for identifying and mitigating API risks? How is protecting APIs different from past cybersecurity threats?
Bansal: Protecting APIs is different in one fundamental way – it’s all about software and developers. APIs are created so that you can innovate faster. You want to empower your developers to move fast using DevOps and CI/CD, as well as microservices and serverless.
You want developers to break the code into smaller parts, and then connect those smaller pieces to APIs – internally, externally, or via third parties. That’s the future of how software innovation will be done.
Now, the way you secure these APIs is not by slowing down the developers. That’s the whole point of APIs. You want to unleash the next level of developer innovation and velocity. Securing them must be done differently. You must do it without hurting developers and by involving them in the API security process.
Gardner: How has the pandemic affected the software development process? Is the shift left happening through a distributed workforce? How has the development function adjusted in the past year or so?
Software engineers at home
Bansal: The software development function in the past year has become almost completely work-from-home (WFH) and distributed. The world of software engineering was already on that path, but software engineering teams have become even more distributed and global. The pandemic has forced that to become the de facto way to do things.
Now, everything that software engineers and developers do has to be done completely from home, across all their processes. Most times they don't even use VPNs anymore. Everything is in the cloud. You have your source code, build systems, and CI/CD processes all in the cloud. The infrastructure you are deploying to is also in a cloud. You don't really go through VPNs or use the traditional ways of doing things anymore. It's become a very open, connect-from-everywhere software development process.
Gardner: Given these new realities, Jyoti, what can software engineers and solutions architects do to make APIs safer? How are we going to bring developers more of the insights and information they need to think about security in new ways?
Bansal: The most important thing is to have the insights. The fundamental problem is that people don’t even know what APIs are being used and which APIs have a potential security risk, or which APIs could be used by attackers in bad ways.
And so, you want to create transparency around this. I call it turning on the lights. In many ways, developers are operating in the dark – and yet they’re building all these APIs.
Normally, these days you have a software development team of maybe five to 10 engineers per microservice. If you are building an application out of many microservices and APIs, across all the teams you might end up with 200 or 500 engineers. They're all working on their own pieces, which are normally one or two microservices, and they're all exposing them through APIs.
It’s very hard for them to understand what’s going on. Not only with their own stuff, but the bigger picture across all the engineering teams in the company and all the APIs and microservices that they’re building and using. They really have no idea.
For me, the first thing you must do is turn on the lights so that everyone knows what's going on — so they're not operating in the dark. They can then know which APIs are theirs, which APIs talk to other APIs, what the different microservices are, what has changed, and how the data flows between them. They can have a real-time view of all of this. That is the number one thing to begin with.
We like to call it a Google Maps kind of view, where you can see how all the traffic is flowing, where the red lights are, and how everything connects. It shows the different highways of data going from one place to another. You need to start with that. It then becomes the foundation for developers to be much more aware and conscious about how to design the APIs in a more secure way.
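To make that map concrete for readers, here is a minimal sketch of how such a service map can be derived from observed traffic. It is illustrative only, not Traceable.ai's implementation; the service names and the hard-coded call log are hypothetical stand-ins for live traces.

```python
# Build a simple service map from observed caller -> callee pairs.
# A tiny hard-coded call log stands in for real-time traffic.
from collections import defaultdict

observed_calls = [
    ("web-frontend", "orders-api"),
    ("orders-api", "payments-api"),
    ("orders-api", "inventory-api"),
    ("web-frontend", "orders-api"),
]

edges = defaultdict(int)  # (caller, callee) -> observed call count
for caller, callee in observed_calls:
    edges[(caller, callee)] += 1

# Print the "map": every highway of data and how busy it is.
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee}: {count} call(s)")
```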
Gardner: If developers benefit from such essential information, why don’t the older solutions like web application firewalls (WAFs) or legacy security approaches fit the bill? Why do developers need something different?
Bansal: They need something that’s designed to understand and secure APIs. If you look at a WAF, it was designed to protect systems against attacks on legacy web apps, like a SQL injection.
Normally a WAF will just look at whether you have a form field on your website where someone can type in a SQL query and use it to steal data. WAFs will do that, but that's not how attackers steal data from APIs. Those are completely different kinds of attacks.
Most WAFs protect against those legacy attacks, but even there they have struggled with how to scale and how to stay easy and simple to use. And when it comes to APIs, WAFs really don't have any kind of answer for securing them.
Gardner: In our last discussion, Jyoti, you mentioned how the burden for API security falls typically on the application security folks. They are probably most often looking at point solutions or patches and updates.
But it sounds to me like the insights Traceable.ai provides are more of a horizontal or one-size-fits-all solution approach. How does that approach work? And how is it better than spot application security measures?
End-to-end app security
Bansal: At Traceable.ai we take a platform approach to application security. We think application security starts with securing two parts of your app.
One is the APIs your apps are exposing, and those APIs could be internal, external, and third-party APIs.
The second part is the clients that you yourselves build using those APIs. They could be web application clients or mobile clients that you're building. You must secure those as well because they are fundamentally built on top of the same APIs that you're exposing elsewhere for other kinds of clients.
When we look at securing all of that, we think of it in a classic way. Security is still about understanding and taking inventory of everything: what are all the things that are there? Then, once you have an inventory, you look at protecting those things. Thirdly, you look to be more proactive. Instead of just protecting the apps and services, can you go in and fix things where and when the problem was created?
Our solution is designed as an end-to-end, comprehensive platform for application security that can do all three of these things — and all three must be done in very different ways. Compared to the legacy web application firewalls or legacy Runtime Application Self-Protection (RASP) solutions that security teams use, we take a very different approach. RASPs also have weaknesses that can introduce their own vulnerabilities.
Our fundamental approach builds a layer of tracing and instrumentation, and we make those tracing and instrumentation capabilities extremely easy to use thanks to the lightweight agents we provide. We have agents that run in different programming environments, such as Java, .NET, PHP, Node.js, and Python. These agents can also be put into application proxies or Kubernetes clusters. You can install them in just a few minutes without having to do any other work.
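Traceable.ai's agents are proprietary, but the general pattern resembles open-source distributed tracing. As a hedged sketch, here is how a single endpoint might be instrumented with the OpenTelemetry Python SDK; the endpoint and handler are hypothetical, and a real agent would instrument the runtime automatically rather than require hand-written spans.

```python
# A minimal, illustrative tracing setup using OpenTelemetry.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Configure a tracer that prints finished spans to the console.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("api-demo")

def handle_get_user(user_id: str) -> dict:
    # Each API call is wrapped in a span recording the endpoint and
    # the caller-supplied parameters, so traffic can later be mapped.
    with tracer.start_as_current_span("GET /users/{id}") as span:
        span.set_attribute("http.method", "GET")
        span.set_attribute("app.user.id", user_id)
        return {"id": user_id, "name": "example"}

handle_get_user("42")
```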
We then begin instrumenting your runtime application code automatically and assessing everything that is happening. First, in just a minute or two, based on your real-time traffic, we draw a picture of everything: the APIs in your system, all the external APIs, your internal microservices, and all the internal API endpoints on each of the microservices.
That is how we assess the data flows from one microservice to a second and to a third. We help you answer questions such as: What third-party APIs are you invoking? What third-party systems are you invoking? And we draw all of that in a Google Maps-style traffic picture in a matter of minutes. It shows you how everything flows in your system.
That ability to discover and understand everything is the first part of the Traceable.ai solution, and it is very different from any kind of legacy RASP or app security approach out there.
Once we understand that, the second part begins: our system creates a behavioral learning model around the actual use of your APIs and applications. It helps you answer questions such as: Which users are accessing which APIs? What data are they passing in? What is the normal sequence of API calls, or of clicks in the web application? Which internal microservices are invoked by every API? What pieces of data are being transferred, and in what volume?
All of that comes together into a very powerful machine learning (ML) model. Once that model is built, we learn the n-dimensional behavior around everything that is happening. There is often so much traffic that it doesn't take us long to build a pretty accurate model.
Every single call that happens after that, we compare against the normal behavior model we built. For example, when people call an API they normally ask for data for one user. If suddenly a call to the same API asks for data for 100,000 users, we will flag it — there is something anomalous about that behavior.
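As a toy illustration of that baseline-and-compare step — Traceable.ai's actual models are far richer and, as Bansal notes, n-dimensional — here is a sketch that learns the typical number of records each endpoint returns and flags wildly larger requests. The class name, window size, and threshold are invented for the example.

```python
# A toy behavioral baseline: learn normal record counts per endpoint,
# then flag calls that fall far outside the learned distribution.
from collections import defaultdict
from statistics import mean, stdev

class ApiBaseline:
    def __init__(self, threshold_sigmas: float = 6.0):
        self.history = defaultdict(list)  # endpoint -> record counts seen
        self.threshold = threshold_sigmas

    def observe(self, endpoint: str, records_returned: int) -> None:
        self.history[endpoint].append(records_returned)

    def is_anomalous(self, endpoint: str, records_requested: int) -> bool:
        seen = self.history[endpoint]
        if len(seen) < 30:  # not enough traffic yet to judge
            return False
        mu, sigma = mean(seen), stdev(seen)
        # Flag calls far beyond the baseline, e.g. asking for 100,000
        # users on an endpoint that normally returns one.
        return records_requested > mu + self.threshold * max(sigma, 1.0)

baseline = ApiBaseline()
for _ in range(100):
    baseline.observe("GET /users", 1)  # normal: one user per call
print(baseline.is_anomalous("GET /users", 100_000))  # True -> raise an alert
```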
Next, we develop a scoring mechanism whereby we can figure out what kind of attack someone might be attempting. Are they trying to steal data? We can then create a remediation mechanism, such as blocking that specific user or blocking that particular way of invoking the API. Or maybe we alert your engineering team to fix the bug that allows this in the first place.
That’s a very different approach than most of the traditional app security approaches, which are very rules-based. Using them you would have to pre-define the rules sets and use them with regular expression matching. We don’t need to do that. For us, it’s all about learning the behavioral model through our ML engine and understanding whenever something is different in a bad way.
Gardner: It sounds like a very powerful platform — with a lot of potential applications.
Jyoti, as a serial startup founder you have been involved with AppDynamics and Harness. We talked about that in our first podcast. But one of the things I’ve heard you talk about as a business person, is the need to think big. You’ve said, “We want to protect every line of code in the world,” and that’s certainly thinking big.
How do we take what you just described as your solution platform, and extrapolate that to protecting every line of code in the world? Why is your model powerful enough to do that?
Think big, save the world’s code
Bansal: It’s a great question. When we began Traceable.ai, that was the mission we started with. We have to think big because this is a big problem.
If I fast-forward 10 years from now, the whole world will be running on software. Everything we do will be through interconnected software systems. We have to make sure that every line of code is secure, and the way to ensure that is by doing a few fundamental things — things that are hard to do but simple in concept.
Can we watch every line of code when it runs in a runtime environment? If an engineer wrote a thousand lines of code, and it’s out there and running, can we watch the code as it is running? That’s where the instrumentation and tracing part comes in. We can find where that code is running and watch how it is run. That’s the first part.
The second part is, can we learn the normal behavior of how that code was supposed to run? What did the developer intend when they wrote the code? Learning that intent is the second part.
And the third component is, if you see anything abnormal, you flag it or block it, or do something about it. Even if the world has trillions and trillions of lines of code, that’s how we operate.
Every single line of code in the world should have a safety net built around it. Someone should be watching how the code is used and learn what is the normal developer intent of that code. And if some attacker, hacker, or a malicious person is trying to use the code in an unintended way, you just stop it.
That to me is a no-brainer — if we can make it possible and feasible from a technology perspective. That's the mission we are on at Traceable.ai: to make it possible and feasible.
Gardner: Jyoti, one of the things that’s implied in what we’ve been talking about that we haven’t necessarily addressed is the volume and speed of the data. It also requires being able to analyze it fast to stop a breach or a vulnerability before it does much damage.
You can’t do this with spreadsheets and sticky notes on a whiteboard. Are we so far into artificial intelligence (AI) and ML that we can take it for granted that this is going to be feasible? Isn’t a high level of automation also central to having the capability to manage and secure software in this fashion?
Let machines do what they do
Bansal: I’m with you 100 percent. In some ways, we have to let machines protect against these threats. The amount of data and the volume of things is very high. You can’t have a human, like a security operations center (SOC) analyst, sitting at a screen trying to figure out what is wrong.
That’s where the challenge is. The legacy security approaches don’t use the right kind of ML and AI — it’s still all about the rules. That generates numerous false positives. Every legacy application security, bot security, and RASP approach defines rule sets to decide whether certain variables are bad, and that approach creates so many false positives and junk alerts that they drown the humans monitoring them — it’s just not possible for people to go through it all. You must build a very powerful layer of learning and intelligence to figure it out.
The great thing is that it is possible now. ML and AI are at a point where you can build the right algorithms to learn the behavior of how applications and APIs are used and how data flows through them. You can use that to figure out the normal usage behaviors and stop them if they veer off – that’s the approach we are bringing to the market.
Gardner: Let’s think about the human side of this. If humans can’t necessarily get into the weeds and deal with the complexity and scale, what is the role for people? How do you oversee such a platform and the horizontal capabilities that you’re describing?
Do we need a new class of security data scientist, or does this fit into a more traditional security management persona?
Bansal: I don’t think you need data scientists for APIs. That’s the job of products like Traceable.ai. We do the data science and convert it into actionable things. The technology behind Traceable.ai itself could be the data scientist inside.
But what is needed on the people side is the right model for organizing your teams. You hear about DevSecOps, and I do think that kind of model is really needed. The core of DevSecOps is that your traditional SecOps teams become much more developer-, code-, and API-aware, and your developer teams become more security-aware than they have been in the past.
Both sides have to come together and bridge the gap. Unfortunately, what we’ve had in the past are developers who don’t care about security, and security people who don’t care about code and APIs. They care about networks, infrastructures, and servers, because that’s where they spend most of their time trying to secure things. From an organization and people perspective, we need to bridge that from both sides.
We can help, however, by creating a high level of transparency and visibility by understanding what code and APIs are there, which ones have security challenges, and which ones do not. You then give that data to developers to go and fix. And you give that data to your operations and security teams to manage risk and compliance. That helps bridge the gap as well.
Gardner: We’ve traditionally had cultural silos. A developer silo and a security silo. They haven’t always spoken the same language, never mind work hand-in-hand. How does the data and analytics generated from Traceable.ai help bind these cultures together?
Bridge the SecOps divide
Bansal: I will give you an example. There’s a new pattern of exposing data through GraphQL, an API technology. It’s very powerful because you expose your data through GraphQL and different consumers can write queries directly against it.
Many developers who write these GraphQL APIs don’t understand the security implications. They write the API without realizing that, if they don’t put in the right kind of checks, someone can attack them. The challenge is that most SecOps people don’t understand how GraphQL APIs work — or even that they exist.
So now we have a fundamental gap on both sides, right? A product like Traceable.ai helps bridge that gap by identifying your APIs — including that there are GraphQL APIs with security vulnerabilities through which sensitive data can potentially be stolen.
And we will also tell you if an attack is happening — that someone is trying to steal data. Once developers see that data, they become much more security-conscious, because a dashboard shows them that the GraphQL APIs they built have 10 security vulnerabilities and alerts them that two attacks are underway.
And the SecOps team sees the same dashboard. They know which APIs were created and, from the attack patterns, which attackers and hackers are trying to exploit them. Having that common, shared data in a shared dashboard between the developers and the SecOps team creates the visibility and the shared language on both sides, for sure.
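To make the GraphQL gap concrete, here is a hedged, schematic sketch of the missing check Bansal describes. Framework details vary widely; these resolver functions and the in-memory db are hypothetical, not Traceable.ai code or any specific GraphQL library’s API.

```python
# Schematic GraphQL-style resolvers: one vulnerable, one checked.
class AuthorizationError(Exception):
    pass

def resolve_user_unsafe(db: dict, user_id: str) -> dict:
    # Vulnerable: any caller can read any user's record simply by
    # varying user_id in the query (broken object-level authorization).
    return db[user_id]

def resolve_user_safe(db: dict, user_id: str, current_user: str) -> dict:
    # Safe: verify the caller is entitled to this object first.
    if user_id != current_user:
        raise AuthorizationError("caller may not read this user")
    return db[user_id]

db = {"alice": {"ssn": "***-**-1234"}, "bob": {"ssn": "***-**-5678"}}
print(resolve_user_unsafe(db, "alice"))                   # leaks anyone's data
print(resolve_user_safe(db, "bob", current_user="bob"))   # allowed
```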
Gardner: I’d like to address the timing of the Traceable.ai solution and entry into the market.
It seems to me we have a level of trust when it comes to the use of APIs. But with the vulnerabilities you’ve described, that trust could be eroded, which would be very serious. Is there a race to put in place the solutions that keep APIs trustworthy before that trust erodes?
A devoted API security solution
Bansal: We are in the middle of the API explosion. Unfortunately, when people adopt a new technology, they think about its operational elements first, and about security, performance, and scalability after that. Only once they start running into those problems do they start addressing them.
We are at a point of time where people are seeing the challenges that come with API security and the threat vectors that are being opened. I think the timing is right. People, the market, and the security teams understand the need and feel the pain.
We have already had some very high-profile attacks in the industry where attackers have stolen data through improperly secured APIs. So, it’s a good time to bring a solution to market that can address these challenges. I also think that CI/CD and DevOps are being adopted at such a rapid pace that API security — securing cloud-native microservices architectures — is becoming a major bottleneck.
In our last discussion, we talked about Harness, another company that I founded, which provides the leading CI/CD platform for developers. When we talk to our customers at Harness, we ask, “What is the blocker in your adoption of CI/CD? What is the blocker in your adoption of public cloud, of moving to more microservices, or of serverless architectures?”
They say that they are hesitant due to their concerns around application security, securing these cloud-native applications, and securing the APIs that they’re exposing. That’s a big part of the blocker.
Yet this resistance to change and modernization has a big business impact. It reduces their ability to move fast. It impacts the very velocity they seek, right? So, it’s kind of strange. They should want to secure the APIs — secure everything — so that they can mitigate risk, protect their data, and prevent all the things that can burn their users.
But there is another timing aspect to it. If they can’t soon figure out the security, the businesses really don’t have any option other than to slow down their velocity and slow down adoption of cloud-native architectures, DevOps, and microservices, all of which will have a huge business and financial impact.
So, you really must solve this problem. There’s no other solution or way out.
Gardner: I’d like to revisit the concept of Traceable.ai as a horizontal platform capability.
Once you’ve established the ML-driven models and you’re using all that data, constantly refining the analytics, what are the best early use cases for Traceable.ai? Then, where do you see these horizontal analytics of code generation and apps production going next?
Inventory, protection, proactivity
Bansal: There’s a logical progression to it. The low-hanging fruit is finding the risky APIs you may have — APIs with improper authentication that can expose personally identifiable information (PII) and data, or that lack the right authorization controls inside them, for example. Once you put Traceable.ai in your environment, we look at the traffic, and the learning models tell you very quickly whether you have these problems. We make it very simple for a developer to fix them. So that’s the first level.
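A rough sketch of that first discovery step follows: scanning observed API responses for PII patterns so risky endpoints can be flagged. The regular expressions are illustrative only and would miss plenty in practice; a production classifier is far more careful.

```python
# Flag endpoints whose responses appear to contain PII.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify_response(endpoint: str, body: str) -> list:
    # Return the PII categories detected in one observed response.
    findings = [name for name, rx in PII_PATTERNS.items() if rx.search(body)]
    if findings:
        print(f"{endpoint} exposes possible PII: {', '.join(findings)}")
    return findings

classify_response("GET /api/v1/users", '{"email": "jane@example.com"}')
```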
The second level is, once you protect against those issues, you next want to look for things you may not be able to fix in code — very sophisticated business-logic abuses that a hacker is attempting. Once our models are built and we can compare how people are using the services, we also create a very simple model for flagging any bad behavior and attributing it to a specific user.
This is what we call a threat actor. It could be a bot, a particular authenticated user, or a non-authenticated user trying to do something that is not normal behavior. We see the patterns of such abuses around data theft or something that is happening around the data. We can alert you and we can block the threat actor. So that becomes the second part of the value progression.
The third part then becomes, “How do we become even more proactive?” Let’s say you have something in your API that someone is trying to abuse through a sophisticated business logic approach. It could be fraud, for example. Someone could create a fraudulent transaction because the business logic in the APIs allows for that. This is a very sophisticated hacker.
Once we figure that abuse out, we can block it, but the long-term solution is for the developers to go and fix the code logic. That becomes the more proactive approach. Because Traceable.ai brings in that level of learning — knowing that a particular API has been abused — we can identify the underlying root cause and show it to a developer so they can fix it. That is the more progressive element of our solution.
Eventually you want to put this into a continuous loop. As part of your CI/CD process, you’re finding things, and then in production, you are also finding things when you detect an attack or something abnormal. We can give it all back to the developers to fix, and then it goes through the CI/CD process again. And that’s how we see the progression of how our platform can be used.
Gardner: As the next decade unfolds, and organizations are even more digital in more ways, it strikes me that you’re not just out to protect every line of code. You’re out there to protect every process of the business.
Where do the use cases progress to when it comes to things like business processes and even performance optimization? Is the platform something that moves from a code benefit to a business benefit?
Understanding your APIs
Bansal: Yes, definitely. We think the underlying model we are building will understand every line of code and how it is being used. We will understand every single interaction between the different pieces of code and the APIs, and we will understand the developer intent behind them: how did the developers intend for these APIs and that piece of code to work? Then we can figure out anything that is abnormal about it.
So, yes, we are using the platform to secure the APIs and pieces of code. But we can also use that knowledge to figure out whether these APIs are not performing in the right kind of way. Are there bottlenecks around performance and scalability? We can help you with that.
What if the APIs are not achieving the business outcomes they are supposed to achieve? For example, you may build different pieces of code and have them interact through different APIs. In the end, you want a business process — such as someone applying for a credit card. If that business process is not producing the right outcome, you want to know why not. Maybe it’s not accurate enough, not fast enough, or not achieving the right business result. We can understand that, too, and help you diagnose and figure out the root cause.
So, definitely, we think eventually, in the long-term, that Traceable.ai is a platform that understands every single line of code in your application. It understands the intent and normal behaviors of every single line of code, and it understands every time there is something anomalous, wrong, or different about it. You then use that knowledge to give you a full understanding around these different use cases over time.
Gardner: The lesson here, of course, is to know yourself by letting the machines do what they do best. It sounds like the horizontal capability of analyzing and creating models is something you should be doing sooner rather than later.
It’s the gift that keeps giving. There are ever-more opportunities to use those insights, for even larger levels of value. It certainly seems to lead to a virtuous adoption cycle for digital business.
Bansal: Definitely, I agree. It removes the fear of moving fast by giving developers the freedom to break things into smaller microservice components and expose them through APIs. If you have that security safety net — plus insights that go beyond security to performance and business insights — it reduces the fear because you now understand what will happen.
We see the benefits again and again when people move from five monolithic services to 200 microservices. But then they just don’t understand what’s going on in those 200 microservices because there is so much velocity. Every developer team is moving independently, and they are moving 10 times faster than they are used to. No one understands what is going on, because there are 200 moving parts now — and that’s just the microservices.
When people start thinking of serverless, functions, or similar technologies, the idea is that you take those 200 microservices and break them into 2,000 micro-functions. Those functions all interact with each other, you can deploy them independently, and every function is just a few hundred lines of code at most.
So how do you start to understand 2,000 moving parts? There is a massive advantage in velocity and reusability, but you will be challenged to manage it all. If you have a layer that understands it and reduces that fear, it unlocks so much innovation. It creates a huge advantage for any software engineering organization.
Gardner: I’m afraid we’ll have to leave it there. You have been listening to a sponsored BriefingsDirect discussion on how today’s businesses may have given short shrift to the cybersecurity vulnerabilities inherent in the expanding use of APIs.
And we’ve learned about the latest innovations for making those APIs more trusted, understood, and robust. And so a big thank you to our guest, Jyoti Bansal, Chief Executive Officer and Co-Founder at Traceable.ai. Thank you so much, Jyoti.
Bansal: Thank you, Dana.
Gardner: And a big thank you as well to our audience for joining this BriefingsDirect API resiliency discussion. I’m Dana Gardner, Principal Analyst at Interarbor Solutions, your host throughout this series of Traceable.ai-sponsored interviews.
Stay tuned for the next podcast in this series, a deep-dive discussion with Sanjay Nagaraj, Co-Founder and CTO at Traceable.ai, as we examine ways that APIs can be made far safer and better managed.
Thanks again for listening. Please pass this along to your business community, and do come back for our next chapter.