What Is Data Egress? Ingress vs. Egress

Kevin Bogusch | Oracle Senior Competitive Intelligence Analyst | January 22, 2024

The deceptively simple definition of data egress is just “data leaving a network.” Of course, monitoring and controlling data egress has never been a simple matter. And in our modern world of ecommerce, cloud-hosted IT infrastructure, and the growing threat of cyberattacks, IT professionals and business managers alike need a nuanced understanding of data egress and its related costs and security risks.

Costs, for example, are a concern for companies with IT infrastructure in the cloud because cloud vendors typically charge for data egress, and those charges can add up. Data egress security concerns, meanwhile, center on valuable or sensitive information that may be accidentally transmitted out of the network or deliberately stolen by a threat actor who seeks to embarrass an organization or hold the data for ransom.

Reliance on the internet and mobile apps means data egress and its risks are just part of doing business. Monitoring these data flows is essential to limit financial and security threats.

What Is Data Egress?

Data egress refers to the information that flows out from a network, whether via email, interactions with websites, or file transfers, to cloud storage containers or other destinations. This is how modern organizations communicate with one another and with customers. As businesses migrate to cloud infrastructures and adopt software-as-a-service (SaaS) applications, they consume these services via data egress and ingress as well. In fact, unless an organization operates a military-grade air-gapped network that has absolutely no connections beyond its own boundaries, information is constantly flowing in and out.

Before the arrival of the public internet in the early 1990s and, later, cloud computing, corporate networks were generally closed or linked only to networks that were consciously chosen by an organization. Such links were made via dedicated private network lines purchased from telecommunications carriers. At the time, the risks posed by data egress were entirely tied to security, that is, the potential for sensitive information to be leaked or stolen.

Now, with most corporate networks exposed to the internet, those security risks have increased dramatically. In addition, a new cost risk has emerged because cloud service providers charge for data egress, sometimes in counterintuitive and surprising ways.

Follow these 7 steps to create a visual representation of the data flow, security controls, and procedures involved in the egress process.

Data Egress vs. Data Ingress

The traditional concept of data egress is strictly related to data leaving a corporate network, while data ingress is commonly understood as unsolicited data coming into a network. When information is sent into the network in response to an internal request, firewalls typically let it through unhindered. To protect the organization, firewalls generally stop unsolicited data unless specific rules have been established to the contrary.

The economics of cloud computing complicate this simple model. Cloud service providers charge per-gigabyte fees for data egress but generally allow data ingress at no cost. Additionally, cloud services have introduced new concepts for data egress that, in practice, establish more types of network boundaries than the traditional corporate network perimeter. For example, with Amazon Web Services (AWS), traffic on the same virtual network often is metered and charged when moving between availability zones. Availability zones refer to cloud data centers that may be in the same geographic region but have, for instance, different network carriers and power providers that make it highly unlikely that they will fail at the same time. By distributing resources across multiple availability zones, cloud providers can minimize the impact of hardware failures, natural disasters, and network outages on their services. But though availability zones are ultimately a positive, the related egress fees can present a significant, unanticipated cost burden, especially when a business first migrates to the cloud.
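
The per-gigabyte model described above can be sketched as a small estimator. This is a minimal illustration, not any provider's actual billing logic: the boundary names, the `Flow` type, and every rate in `ASSUMED_RATES_PER_GB` are hypothetical placeholders that should be replaced with figures from your provider's price list.

```python
from dataclasses import dataclass

# Hypothetical per-gigabyte rates in US dollars. Real rates vary by provider,
# service, and region; these numbers are placeholders for illustration only.
ASSUMED_RATES_PER_GB = {
    "ingress": 0.00,       # inbound data is generally free
    "same_zone": 0.00,     # assumed free within one availability zone
    "cross_zone": 0.01,    # assumed charge between availability zones
    "cross_region": 0.02,  # assumed charge between regions
    "internet": 0.09,      # assumed charge for traffic leaving the cloud
}


@dataclass(frozen=True)
class Flow:
    """One monthly traffic flow, classified by the boundary it crosses."""
    name: str
    boundary: str
    gigabytes: float


def estimate_monthly_cost(flows, rates=ASSUMED_RATES_PER_GB):
    """Return (per-flow costs, total cost) for a list of Flow objects."""
    per_flow = {}
    for flow in flows:
        if flow.boundary not in rates:
            raise ValueError(f"unknown boundary: {flow.boundary}")
        per_flow[flow.name] = round(flow.gigabytes * rates[flow.boundary], 2)
    total = round(sum(per_flow.values()), 2)
    return per_flow, total


if __name__ == "__main__":
    flows = [
        Flow("customer-uploads", "ingress", 500),
        Flow("database-replica", "cross_zone", 2000),
        Flow("website-assets", "internet", 1000),
    ]
    per_flow, total = estimate_monthly_cost(flows)
    for name, cost in per_flow.items():
        print(f"{name}: ${cost:.2f}")
    print(f"total: ${total:.2f}")
```

Classifying each flow by the boundary it crosses makes the less obvious charges, such as replication between availability zones, visible alongside internet-bound traffic.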

Regarding monitoring and security, it’s important to profile both data ingress and egress. While unknown ingress traffic is typically blocked by firewalls, profiling that traffic can provide useful threat information for security teams. Due to the nature and prevalence of firewalls, monitoring ingress is common. However, far fewer organizations monitor data egress as carefully. Firewalling and limiting egress traffic to known targets can limit the impact of attacks and provide protection from malware.
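
Limiting egress to known targets amounts to a default-deny check against an allowlist. The sketch below shows the decision logic only, assuming a hypothetical list of approved destination networks and ports; a real deployment would express the same rules in firewall or cloud security-group configuration. The addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical allowlist of (destination network, destination port) pairs.
# 203.0.113.0/24 and 198.51.100.0/24 are reserved documentation ranges.
ASSUMED_ALLOWED_DESTINATIONS = [
    (ipaddress.ip_network("203.0.113.0/24"), 443),     # e.g. a partner API over HTTPS
    (ipaddress.ip_network("198.51.100.10/32"), 5432),  # e.g. one managed database host
]


def is_egress_allowed(dest_ip, dest_port, allowlist=ASSUMED_ALLOWED_DESTINATIONS):
    """Return True only if the destination matches an allowlist entry (default deny)."""
    address = ipaddress.ip_address(dest_ip)
    return any(
        address in network and dest_port == port
        for network, port in allowlist
    )


if __name__ == "__main__":
    for ip, port in [("203.0.113.25", 443), ("203.0.113.25", 25), ("192.0.2.1", 443)]:
        verdict = "allow" if is_egress_allowed(ip, port) else "block"
        print(f"{ip}:{port} -> {verdict}")
```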

Key Takeaways

  • Data egress represents a cost risk for cloud customers and a security risk for all organizations.
  • Leaks of sensitive data can pose significant financial and organizational risks.
  • Carefully monitoring data egress can help manage and optimize cloud spending while detecting malicious attacks early on.
  • Firewalls can be configured to limit inbound and outbound traffic to known, trusted locations.
  • Data loss prevention (DLP) tools help identify and classify sensitive data and apply additional controls to prevent unauthorized data egress.

Data Egress Explained

Data egress is a constant that must be carefully managed in terms of security and cost. For example, if a business shares its product catalog on a customer-facing website, the data has to leave the internal network where the catalog is maintained and traverse the internet to reach the browser where the customer is viewing the site. Whether a business is sharing data with subsidiaries or partners or interacting with customers via the internet, there will always be some volume of data leaving the company network.

For enterprises that have moved part or all of their IT infrastructure into the cloud, any data movement may incur cloud data egress costs, depending on their provider and the design of their applications.

Beyond the expense, data egress also poses the risk of exposing sensitive data to unauthorized or unintended recipients. Organizations must monitor for malicious activities from external threat actors while watching out for internal attacks such as data exfiltration by insiders. Protecting an organization against these attacks requires a comprehensive approach that includes a solid network design, continuous monitoring, and properly configured cloud application architectures. Typically, organizations will limit data egress using firewalls, monitoring outgoing traffic for anomalies or malicious activity. IT security groups may also take steps to restrict high-volume data transfers and block specific outbound destinations.

Effective monitoring requires an in-depth understanding of normal traffic patterns and how they differ during an attack or data exfiltration incident. It can also present a real challenge for IT organizations. The most common way to monitor data egress traffic is to review and analyze log files from network devices at the edge of cloud or on-premises networks. The sheer volume of traffic from those devices, however, makes this an arduous task for administrators. Many firms use security information and event management (SIEM) tools to better understand threats. SIEM tools typically include intelligence around known threat patterns, regulatory compliance, and automated updates to adapt to new threats. While implementing SIEM systems isn’t a simple process, doing so can improve an organization’s understanding of data egress patterns, enabling security teams to identify attacks much earlier.

For example, a sudden surge in data egress could indicate a data exfiltration attack, where a threat actor exports large amounts of data to an external host or service. Likewise, careful monitoring and control of data egress patterns can help identify malware that’s already present within a corporate network as it tries to seek further instructions from its command-and-control network. Many modern ransomware attacks attempt to exfiltrate large volumes of data to extort funds from an organization before encrypting that data. Tools including DLP, network traffic analysis systems such as packet sniffers, and user behavior analytics to detect abnormal patterns can help IT detect exfiltration. Egress filtering, where IT monitors outbound traffic and blocks traffic deemed to be malicious, helps mitigate these risks too.
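
One simple way to flag a sudden surge is to compare each new measurement with a baseline of normal traffic. The sketch below is a simplified statistical check, not how any particular SIEM or analytics product works; the function name, the hourly sampling, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_egress_surges(baseline_gb, observed_gb, threshold_sigmas=3.0):
    """Return indexes of observed samples that exceed the baseline limit.

    The limit is the baseline mean plus `threshold_sigmas` sample standard
    deviations. Both inputs are sequences of egress volumes in gigabytes,
    for example one value per hour.
    """
    if len(baseline_gb) < 2:
        raise ValueError("need at least two baseline samples")
    limit = mean(baseline_gb) + threshold_sigmas * stdev(baseline_gb)
    return [i for i, volume in enumerate(observed_gb) if volume > limit]


if __name__ == "__main__":
    baseline = [10, 12, 11, 9, 10, 11, 10]  # hypothetical normal hourly egress
    observed = [11, 12, 95, 10]             # hypothetical new measurements
    print(find_egress_surges(baseline, observed))
```

A fixed statistical threshold will miss slow, low-volume exfiltration and may misfire on legitimate spikes such as backups, which is one reason the tools named above combine several signals.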

Beyond firewalls, organizations also use DLP software to protect against data exfiltration. These tools employ techniques such as cataloging and tagging data with sensitivity labels, encryption, and auditing to keep sensitive data from leaving the network.
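
The labeling idea can be illustrated with a toy classifier. Commercial DLP products are far more sophisticated; this sketch assumes just two hypothetical regular-expression patterns and two labels to show how a sensitivity label can gate an outbound transfer.

```python
import re

# Hypothetical detection patterns; real DLP tools use many more signals.
ASSUMED_SENSITIVE_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def classify(text):
    """Return the sorted names of the sensitive patterns found in the text."""
    return sorted(
        name
        for name, pattern in ASSUMED_SENSITIVE_PATTERNS.items()
        if pattern.search(text)
    )


def label_for(text):
    """Label text 'confidential' if any pattern matches, otherwise 'public'."""
    return "confidential" if classify(text) else "public"


def may_leave_network(text, destination_is_approved):
    """Public content may go anywhere; confidential content only to approved destinations."""
    return label_for(text) == "public" or destination_is_approved


if __name__ == "__main__":
    message = "SSN 123-45-6789, contact jane@example.com"
    print(classify(message))
    print(may_leave_network(message, destination_is_approved=False))
```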

Cloud Data Egress Threats

As well as driving up cloud costs, substantial data egress can indicate several types of threats, including a data exfiltration attack by a threat actor or malware moving laterally within a corporate network via subnet communications.

  • Unbounded data egress fees: Cloud vendors may charge for data egress on a per-gigabyte basis. Charges vary based on the type of cloud service and the target location’s distance from the originating network or network segment. Excess data egress fees can originate from a few different sources. The most common are application misconfigurations that place resources with heavy network traffic in geographically distant regions and hybrid public-private systems in which cloud services constantly send large volumes of data to on-premises computers.
  • Cloud storage service egress fees: Cloud storage services are commonly used to host website assets, such as images or documents, and fees can add up surprisingly quickly. These services apply two layers of egress charges: one set for reads and writes to and from the storage account and another if those read operations traverse regions or go out to the internet.
  • Poor application performance: Applications configured to send cloud network traffic between regions will have high end-to-end latency. While not a security or budget issue on its surface, this does make for a suboptimal user experience, which could ultimately impact revenue.
  • Insider data exfiltration attacks: Any insider attempting to export large volumes of corporate data from the network should be investigated. A typical example would be a disgruntled salesperson exporting a customer list from a company database into a personal spreadsheet.
  • Data exfiltration attacks from external threat actors: A tactic frequently used by external threat actors is to minimize the possibility of early detection by infiltrating a network with bare-bones malware. Once inside, the malware may connect to an external command-and-control site to download software and expand its attack or exfiltrate corporate data.
  • Unencrypted data transfer: When sensitive information is transferred without encryption, it can potentially be intercepted and exploited by malicious actors. This can lead to significant financial or reputational damage for an organization.
  • Data residency and compliance concerns: Depending on an organization’s country or industry and the sensitivity of its data, data egress to other regions could pose legal and compliance risks. These can include data residency issues as some countries legally require that certain types of data stay within specific geographical boundaries.

7 Security Best Practices for Cloud Data Egress Management

Organizations can mitigate the security risks around data egress in a number of ways, such as realigning cloud services to limit outbound traffic. The following seven best practices are used by many organizations to better control and manage data egress security risks:

  1. Use a firewall to control outbound traffic: Most organizations use firewalls to limit inbound traffic. Far fewer use their firewalls to tightly control outbound traffic, even for server networks where the most-sensitive data resides. Network administrators should tightly control outbound traffic as well, thereby strengthening both monitoring and security controls.
  2. Create a data egress policy: Establishing a policy that limits user access to preapproved services, especially for networks in which sensitive data is stored, limits the possibility of data exfiltration attacks.
  3. Use SIEM to monitor network traffic: It’s impossible for a network administrator to review all traffic from all managed devices on a large network. With automation driven by admin-established rules, machine learning, and AI, a SIEM tool can help identify attacks earlier and provide additional levels of protection.
  4. Use DLP to categorize, label, and protect sensitive data assets: Like SIEM, DLP also uses machine learning to inspect data, understand its context, match it against established data egress policies, and block data transfers that might violate those policies.
  5. Control access to sensitive data: Once the locations of the most-sensitive data assets have been identified and cataloged via DLP, network administrators can further refine access controls for those data sets.
  6. Encrypt sensitive data: Encryption can provide a last line of defense against data exfiltration attacks. If information is encrypted in transit and at rest, the data will remain unreadable if it’s exfiltrated without the proper encryption key.
  7. Implement an incident response plan: In the event of an attack or data breach, having a clearly defined response plan can accelerate an organization’s response time and boost its overall effectiveness. Like a disaster recovery plan, an incident response plan should be signed off on by an executive and regularly tested via tabletop exercises so everyone understands their role.

Note that these practices aren’t separate one-off solutions; rather, they depend on each other. For example, the data categorization element of DLP and the creation of an egress policy would both inform firewall configurations and access-control settings.

How to Reduce Cloud Data Egress Charges

Data egress charges can lead to some costly surprises early in an organization’s cloud migration process, so it’s important to monitor cloud data egress costs daily to ensure they’re within budget—and to investigate if they aren’t. All public cloud providers support alerts tied to spending, so data egress costs can be monitored just like a virtual machine’s CPU utilization. However, monitoring is only the first step in reducing the cost of cloud data egress. Here are a few tips for lowering egress costs in cloud applications.
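
A daily budget check of the kind described above can be sketched in a few lines. In practice the figures would come from a provider's billing export or spending alerts; here the daily costs, the 30-day month, and the function name are assumptions made for illustration.

```python
def egress_budget_status(daily_costs, monthly_budget, days_in_month=30):
    """Compare month-to-date egress spend with a prorated share of the budget.

    `daily_costs` holds one egress cost per elapsed day of the month.
    """
    if not daily_costs:
        raise ValueError("need at least one day of cost data")
    days_elapsed = len(daily_costs)
    spent = sum(daily_costs)
    prorated_budget = monthly_budget * days_elapsed / days_in_month
    projected_month_end = spent / days_elapsed * days_in_month
    return {
        "spent": spent,
        "prorated_budget": prorated_budget,
        "projected_month_end": projected_month_end,
        "alert": spent > prorated_budget,
    }


if __name__ == "__main__":
    # Hypothetical figures: three days of spend against a 300-dollar monthly budget.
    print(egress_budget_status([10, 10, 40], monthly_budget=300))
```

Prorating the budget by elapsed days surfaces an overrun early in the month instead of when the invoice arrives.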

  • Keep cloud resources in the same region: While this may seem like common sense, certain services may be available in some cloud regions and not others, leading to costly cross-region deployments.
  • Reduce cash outlay with caching: Storing data in in-memory caches close to the application can eliminate round trips to databases and storage services.
  • Buy dedicated private lines: Direct private network connections offer lower data transfer pricing, sometimes even a flat rate per month for unlimited data egress, depending on the cloud provider.
  • Use content delivery networks (CDNs): Similarly, apps can use CDNs to cache web assets, such as images, documents, and videos, closer to users. Besides reducing data egress costs, employing CDNs generally results in a better browsing experience for users.
  • Compress network traffic when possible: Employing data compression whenever network traffic travels between regions or availability zones can also keep costs down. For example, when replicating a busy database to another region to support disaster recovery, the CPU costs of compressing and decompressing that data can be far lower than the potential data egress costs.
  • Deploy deduplication: Particularly for backup processes, using deduplication alongside compression can further reduce the volume of data being transferred, thus reducing costs.
  • Re-architect applications: Revising existing apps to make them cloud native can reduce egress costs by improving how efficiently they use data.

While these changes may require a significant one-time investment to implement, they can ultimately lower recurring cloud bills, resulting in a strong return on the initial cost and better cloud cost management. If data egress fees make up a large portion of your organization’s cloud costs, prioritizing these changes over other engineering projects could be a net win.
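
To gauge whether compression is worth it for a given workload, it helps to measure a representative payload. The sketch below uses Python's standard `zlib` module; the sample payload, the monthly volume, and the per-gigabyte rate are hypothetical, and real savings depend heavily on how compressible the data is.

```python
import zlib


def compression_savings(payload, level=6):
    """Compress a sample payload and report sizes and the fraction of bytes saved."""
    compressed = zlib.compress(payload, level)
    # Round-trip check: decompression must restore the original bytes.
    if zlib.decompress(compressed) != payload:
        raise RuntimeError("round-trip mismatch")
    original_size = len(payload)
    compressed_size = len(compressed)
    saved_fraction = 1 - compressed_size / original_size if original_size else 0.0
    return original_size, compressed_size, saved_fraction


def estimated_monthly_savings(monthly_gb, rate_per_gb, saved_fraction):
    """Estimate egress cost avoided if all traffic compressed like the sample."""
    return monthly_gb * rate_per_gb * saved_fraction


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample resembling replicated records.
    sample = b'{"sku": 1001, "name": "widget", "price": 9.99}\n' * 10_000
    original, compressed, fraction = compression_savings(sample)
    print(f"{original} bytes -> {compressed} bytes ({fraction:.1%} saved)")
    # Assumed 5,000 GB per month at an assumed rate of 0.02 dollars per gigabyte.
    print(f"estimated savings: ${estimated_monthly_savings(5000, 0.02, fraction):.2f}")
```

Already-compressed content such as video or JPEG images will show little or no gain, so the sample should match the traffic that actually crosses the billed boundary.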

Reduce Data Egress Costs with Oracle

OCI data egress pricing is a big differentiator for organizations building cloud services that require large amounts of bandwidth. Large-scale apps that take advantage of these rates include live video streaming, video conferencing, and gaming.

To assess what your organization’s data egress and other cloud costs would be as an Oracle Cloud customer, use the OCI cost estimator.

Unfettered data egress can pose both a security and a financial risk to organizations. Deploying a non–cloud native or poorly designed application in the cloud can lead to unchecked data egress costs and inadequate security that puts organizations at risk of data exfiltration and ransomware attacks.

As such, it’s important to restrict, fortify, and monitor outbound traffic from a company network. In short, organizations need to control where their data can travel and look out for any anomalous patterns. Best practices for network security organizations include implementing a good incident response plan and employing SIEM and DLP technologies. In addition, choosing the right cloud provider for their requirements and designing or re-architecting applications with data egress costs in mind can contribute significantly to an organization’s cloud ROI.

AI can help CIOs analyze data to optimize cloud spend and suggest code changes that minimize data egress. Learn how to harness the power of artificial intelligence now to address talent, security, and other challenges.

Data Egress FAQs

What is data egress cost?

In addition to the cost of compute and storage resources, cloud providers also meter and bill for data egress. While these costs can vary by cloud provider, they are typically charged on a per-gigabyte basis for data that travels between cloud regions, availability zones, or to the internet or on-premises networks. Data egress fees can also differ based on the target location and cloud provider. They can be reduced by compressing data, leveraging content delivery networks, and colocating data to limit cross-region traffic.

What is egress in cloud computing?

Egress is defined as data that goes from one network to another, but the term takes on further complexity in cloud computing. With virtual machines and networks, standard network traffic between cloud regions or availability zones is considered data egress. Additionally, data that travels from the cloud back to on-premises networks or the internet is also metered as data egress.