Disclaimer: AWS keeps changing limits and design options from time to time. For the most accurate and up-to-date information, please consult the AWS documentation and the links provided in this article.

Direct Connect (DX)
* DX is a region-specific offering. It allows on-prem physical locations to connect to a specific AWS region/location.
* DX supports a maximum of 50 VIFs (including Private and Public) per physical connection.
* DX does not support a Transit VIF for AWS-TGW connectivity.

DXGW
What is DXGW? According to AWS, a "Direct Connect gateway is a grouping of virtual private gateways (VGWs)".
* Only supports Private and Transit VIFs. DXGW is mainly used to access private resources in VPCs.
* Does not support a Public VIF, so DXGW does not provide any benefit for public internet connectivity.
* A VGW associated with a DXGW must be "attached" to a VPC.
* Does not support transitive routing or transit connectivity:
  * A VPC in Region-1 cannot directly communicate with a VPC in Region-2.
  * DX Location-1 cannot directly communicate with DX Location-2.
* Up to 30 DX physical connections can connect to one single DXGW for physical link redundancy purposes. In other words, 30 DX locations/regions.
* DX supports a maximum of 50 VIFs (for DXGW, only Private and Transit VIFs are applicable). This means one can have a maximum of 50 DXGWs per physical DX link. But one DXGW can connect to a maximum of 10 VPCs, which means a maximum of 500 VPCs (50 x 10) per physical DX link across accounts and regions.

DXGW Is a Must for AWS-TGW
A Transit VIF is required when terminating a Direct Connect (DX) circuit on AWS-TGW, but a Transit VIF can only be attached to a DXGW. That means AWS-TGW mandates deploying a DXGW.

Max of 3 AWS-TGW Behind a Direct Connect Circuit
A maximum of 3 AWS-TGWs can be attached to one DXGW behind one Transit VIF, and only one Transit VIF is possible per Direct Connect circuit. Aviatrix Transit does not have this limitation because it uses a Private VIF.

Transit VIF and Private VIF Are Not Allowed on the Same DXGW
A single DXGW cannot attach to both a Private and a Transit VIF. One cannot attach a DXGW to an AWS-TGW when the DXGW is already associated with an AWS VGW or is attached to a Private VIF. I did a simple test in my lab, and I got an error when I tried to connect a Private VIF to a DXGW that already had a Transit VIF attached to it. This is also confirmed in the following AWS doc: https://docs.aws.amazon.com/directconnect/latest/UserGuide/direct-connect-transit-gateways.html

DXGW with and without AWS-TGW Comparison

| DXGW without AWS-TGW | DXGW with AWS-TGW |
| --- | --- |
| 10 VPCs per DXGW | 3 TGWs per DXGW |
| 50 DXGWs max (because of 50 Private VIFs) | With a Transit VIF, only one DXGW is possible |
| 500 VPCs total | 5,000 VPCs per TGW; 15,000 VPCs per DX physical link |
| Private VIF supported on all Direct Connect connection types | Transit VIF supported only on dedicated or hosted connections of speed 1 Gbps and above |
| No additional charges | Additional charge for TGW data processing |

DXGW with AWS-TGW Routing Limitations
* Only 20 routes from AWS to on-prem per AWS-TGW
* Only 100 routes from on-prem to AWS

Reference:
* Transit Gateway Reference Architectures for Many VPCs NET406-R1 PDF
* Transit Gateway Reference Architectures for Many VPCs NET406-R1 VoD

Intra-Region AWS-TGW Peering Is Not Allowed
When multiple AWS Transit Gateways are required in the same region (prod/dev air gap, separate NGFWs, or other reasons), TGW peering cannot be used to route traffic between the VPCs attached to those Transit Gateways, because two AWS Transit Gateways can only be peered when they are in different regions.
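For readers who want to audit how close an existing Direct Connect link is to the VIF and DXGW association limits discussed above, here is a minimal boto3 sketch (not part of the original limits discussion). The describe calls are standard Direct Connect API operations; the DXGW ID is a placeholder.

```python
import boto3

dx = boto3.client("directconnect", region_name="us-east-1")

# Count VIFs per physical DX connection (limit discussed above: 50 per connection).
vifs = dx.describe_virtual_interfaces()["virtualInterfaces"]
per_connection = {}
for vif in vifs:
    per_connection.setdefault(vif["connectionId"], []).append(vif["virtualInterfaceType"])
for conn_id, vif_types in per_connection.items():
    print(f"{conn_id}: {len(vif_types)} VIFs ({', '.join(sorted(set(vif_types)))})")

# Count VGW/VPC associations on a DXGW (limit discussed above: 10 per DXGW).
assocs = dx.describe_direct_connect_gateway_associations(
    directConnectGatewayId="dxgw-0123456789abcdef"  # placeholder ID
)["directConnectGatewayAssociations"]
print(f"DXGW associations: {len(assocs)} of 10")
```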
Aviatrix Transit Solution
* Works with the Private VIF and does not need a Public or Transit VIF.
* Does not need any DXGW.
* One can deploy as many Aviatrix Transit GWs as the business needs; AWS-TGW is limited to 5 per region (see the sketch below for a quick way to check this quota).
* AWS-TGW intra-region peering is not allowed.
* Aviatrix Transit can extend to AWS-TGW if needed, as shown in the following diagram.

Summary
* A Transit VIF can only be attached to a DXGW.
* Only one Transit VIF is allowed on any AWS Direct Connect 1/2/5/10 Gbps connection; connections below 1 Gbps do not support a Transit VIF.
* A maximum of 3 AWS-TGWs can connect to one DXGW behind one Transit VIF.
* A single DXGW cannot attach to both a Private and a Transit VIF. This could be a serious limitation for some customers. I think the underlying assumption is that if a customer is already using AWS-TGW, why would they want a Private VIF attached to the same DXGW?
* The Aviatrix Transit Solution is not bound to these limits.

AWS References
* https://docs.aws.amazon.com/vpc/latest/tgw/transit-gateway-limits.html
* https://docs.aws.amazon.com/directconnect/latest/UserGuide/limits.html
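As a quick way to check the per-region AWS-TGW limit mentioned in the summary above (5 by default, adjustable via Service Quotas), the following boto3 sketch counts Transit Gateways in a region. It is illustrative only and not part of the original article.

```python
import boto3

# Count Transit Gateways in a region against the default quota of 5 mentioned above.
ec2 = boto3.client("ec2", region_name="us-east-1")
tgws = ec2.describe_transit_gateways()["TransitGateways"]
active = [t for t in tgws if t["State"] != "deleted"]
print(f"Transit Gateways in us-east-1: {len(active)} (default quota: 5, adjustable)")
```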
There are two important routing entities to understand in GCP:
* VPC native routing service
* GCP Cloud Router

VPC Native Routing Service
When a VPC is created, GCP automatically deploys a routing service inside the VPC. This routing service or daemon is similar to what you see in any other public cloud such as AWS and Azure. It is a hidden service that performs the L3 routing between different subnets inside the same VPC.

GCP Cloud Router (GCR)
GCP Cloud Router is another service that is instantiated when GCP needs to connect to on-premises data centers or branches. GCP Cloud Router (CR) runs as a managed service. A CR is similar to a traditional router, but it only provides control-plane functionality. It learns routes from on-prem and supports eBGP. (Notice that OSPF is pretty much dead in the cloud, so no cloud provider supports it.) The actual data plane is inside the GCP VM: every VM does host routing, and the route table lives inside the VM itself. The CR is a Google-managed process and behaves like a distributed router; if it fails, GCP will re-spin the Cloud Router and bring it back up.

CR Routing Modes
Two routing modes are available:
* Regional Routing: only learns routes in the specific region. For example, a CR in the West region only learns routes from the West region.
* Global Routing: allows picking up all subnets in all regions, for example subnets in both the west and the east. You deploy the CR in a specific region and it then learns routes globally.

Global vs Regional CR
In GCP, the load balancer is a regional construct. So if you are using an LB, you would use a regional CR instead of a global CR.

Route Priority
Route priority is controlled via the BGP attribute called MED. The standard MED is 1000, so the local CR will have a MED of 1000. Routes from other regions have a metric based on RTT added to the default MED value, as illustrated in the sketch below.
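The following is an illustrative Python sketch of the route-priority arithmetic described above: a base MED of 1000 for local-region routes, plus an RTT-derived penalty for routes learned from other regions when Global routing is enabled. The penalty values are made-up placeholders, not published GCP numbers.

```python
BASE_MED = 1000  # default priority a Cloud Router advertises for local-region routes

# Hypothetical RTT-derived penalties added to routes learned from other regions
# under Global routing mode -- placeholder values for illustration only.
rtt_penalty = {
    "us-west1": 0,      # local region: no penalty
    "us-east1": 65,     # example cross-country value
    "europe-west1": 105,
}

def effective_med(region: str) -> int:
    """Effective MED for a prefix originated in `region`, as seen by a CR in us-west1."""
    return BASE_MED + rtt_penalty[region]

for region in rtt_penalty:
    print(f"routes from {region}: MED {effective_med(region)}")
# Lower MED wins, so the local-region path (MED 1000) is preferred over remote regions.
```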
The Google VPC implementation is very different from the traditional cloud VPC (AWS/Azure) implementation. First, take a look at the traditional VPC (AWS/Azure) concepts.

Traditional VPC
In a traditional VPC (AWS/Azure/etc.), the VPC boundary is based on a physical location/region. For instance, when an AWS VPC is created, a region must be specified, such as a VPC in US-West or a VPC in US-East. In the topology below, notice that there are two VPCs present: one in US West and the other in US East. Each VPC has its own subnet. In order for these VPCs to communicate with each other, there are mainly two options available:
* Configure VPC peering (which has its own limitations)
* Use VPN/IPsec technology to connect over the public internet (this has its own limitations as well, for example 1.25 Gbps throughput per VPN tunnel)

In a traditional VPC, both the subnet and the VPC are regional.

GCP VPC
In the following GCP VPC topology, notice that the VPC is globally available but subnets are regional entities. So instead of building a VPC in US-West and another VPC in US-East, GCP builds the VPC across the globe and then the customer builds subnets in specific regions. Here there is no inter-VPC VPN requirement; all the routing within the VPC is handled by GCP under the hood. There are ways to control the traffic using firewall rules, etc.

In a GCP VPC, the subnet is regional but the VPC is global.
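To make the "global VPC, regional subnets" model concrete, here is a hedged sketch using the google-cloud-compute Python client: one custom-mode VPC created once, then subnets added per region inside it. The project, names, and CIDRs are placeholders, and the snippet assumes the client library and credentials are already set up.

```python
from google.cloud import compute_v1

PROJECT = "my-project"   # placeholder
VPC_NAME = "global-vpc"

# One VPC, defined globally (no region on the network resource itself).
networks = compute_v1.NetworksClient()
network = compute_v1.Network(
    name=VPC_NAME,
    auto_create_subnetworks=False,  # custom mode: we add regional subnets ourselves
    routing_config=compute_v1.NetworkRoutingConfig(routing_mode="GLOBAL"),
)
networks.insert(project=PROJECT, network_resource=network).result()

# Subnets are regional: one in us-west1, one in us-east1, both in the same VPC.
subnets = compute_v1.SubnetworksClient()
for region, cidr in [("us-west1", "10.10.0.0/24"), ("us-east1", "10.20.0.0/24")]:
    subnet = compute_v1.Subnetwork(
        name=f"{VPC_NAME}-{region}",
        ip_cidr_range=cidr,
        network=f"projects/{PROJECT}/global/networks/{VPC_NAME}",
    )
    subnets.insert(project=PROJECT, region=region, subnetwork_resource=subnet).result()
```

Note that the VPC's dynamic routing mode (REGIONAL or GLOBAL) set here is what drives the Cloud Router routing modes discussed in the previous article.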
Building a multi-cloud network and troubleshooting it - Aviatrix Demo - Day 1 and Day 2 operations
Aviatrix customers have many options for enabling secure egress from their cloud environments:
* Aviatrix's own secure egress with FQDN filtering (link)
* Egress through a firewall (Palo Alto, Check Point, Fortinet) (link)
* Any other 3rd-party tool

The instructions below show you how to configure the third scenario, with Zscaler as the external tool.

The IPsec tunnel to Zscaler needs to be established from the Transit VPC GW: Transit Network > 3. Connect to VGW / External Device / CloudN, option "External Device" > Static Remote Subnet is 0.0.0.0/0. You can select the pre-shared key, but it's a good idea to let the controller define the Local and Remote Tunnel IP.

When building the tunnel from the remote end (Zscaler) to the Transit VPC GW, you will need to input the following:
1. Remote subnets - the CIDRs of the VPCs connected to the AWS TGW (in the domains that are connecting to Zscaler) and the Transit VPC CIDR
2. Pre-shared key
3. Local and Remote Tunnel IP

(If the controller generated item 2 or 3, you need to download the configuration file from the Site2Cloud menu: select the created tunnel, select Generic Vendor, and download the configuration file, which will have these details. Make sure not to mix up what is remote and what is local: the downloaded file names IPs from the perspective of Zscaler, so those will be the "local" attributes.)

In Site2Cloud, in the details of the tunnel, update "Local subnets" to include the CIDRs of all the VPCs that should be connecting to Zscaler.
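The same tunnel can also be created programmatically through the Aviatrix Controller REST API instead of the UI. The login flow below (POST to /v1/api with action=login, which returns a CID session token) follows the documented controller API pattern; the action name and parameters for the external-device connection are assumptions for illustration only, so check your controller's API reference for the exact action and fields.

```python
import requests

CONTROLLER = "https://<controller-ip>/v1/api"  # placeholder controller address
VERIFY_TLS = False  # lab only; use proper certificates in production

# 1. Log in and grab the CID (session token) -- standard controller API pattern.
login = requests.post(
    CONTROLLER,
    data={"action": "login", "username": "admin", "password": "<password>"},
    verify=VERIFY_TLS,
).json()
cid = login["CID"]

# 2. Create the external-device (Zscaler) connection from the Transit GW.
#    NOTE: the action name and field names below are hypothetical placeholders,
#    not a confirmed API schema.
payload = {
    "action": "connect_transit_gw_to_external_device",  # assumption
    "CID": cid,
    "gateway_name": "transit-gw",
    "connection_name": "to-zscaler",
    "remote_gateway_ip": "<zscaler-vpn-endpoint-ip>",
    "remote_subnet": "0.0.0.0/0",   # static remote subnet, per the steps above
    "pre_shared_key": "<psk>",      # or leave blank and let the controller generate it
}
print(requests.post(CONTROLLER, data=payload, verify=VERIFY_TLS).json())
```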
Problem Statement
All the major cloud service providers (AWS, Azure, GCP) natively provide layer-4 firewall functionality. These firewalls are applied to the virtual NIC of each instance/virtual machine, and the rules are usually also limited to a single VPC/VNet. This means that when you need to apply a firewall rule, you have to create it for each instance and manage it individually. This distributed nature has its challenges, especially at scale, when you need to troubleshoot and apply compliance and governance. Let's take a few minutes to understand these in each cloud provider.

AWS Native L4 Firewall Implementation
AWS offers three native ways to implement security policies across VPCs:
* Network ACLs
* Security Groups
* AWS Network Firewall
All of these have limitations that can get in the way of cloud adoption.

AWS Native L4 Firewall Limitations
We will not cover these limitations in detail here, but at a high level:
* AWS NACLs (Network ACLs) are not stateful, which makes them extremely challenging in mid-to-large-scale environments. This means you need 2 rules (inbound and outbound) for each traffic flow. There is also a limit of 20 rules per NACL.
* AWS Security Groups are tied to individual instances and have a limit of 50 rules.
* AWS Network Firewall offers Distributed and Centralized models.

1. Distributed Model
* No load balancing across Availability Zones (the GWLB endpoint only load balances within an AZ)
* Very expensive (~$0.40 per hour per AZ, plus data, plus Firewall Manager)
* For internet egress traffic, you must NAT traffic BEFORE it hits the firewall, so there is no source-IP visibility in logs, nor can you apply rules based on source IP
* There is no visibility: at the time of this update (Nov 25, 2020), the logging feature is not supported
* Each firewall is account-based, so if you have several accounts, each account must manage its own firewall rules
* 5-tuple FW rules are not tied to domain rules. Meaning, if you configure HTTP at L4 (5-tuple rules) and then want to do FQDN filtering, that traffic will be allowed and not checked against FQDN rules
* There is no discovery of URLs (or an observation mode); you need to know exactly all the FQDNs to allow
* The customer owns planning subnets and managing and configuring route tables. At a minimum you need to manage the workload subnet route table, NAT subnet route table, firewall subnet route table, and IGW edge route table. This is uniquely needed per VPC (see the sketch below)
* No intra-VPC inspection through AWS Network Firewall

The following picture shows some of the per-VPC, customer-owned route table management.
Ref: https://aws.amazon.com/blogs/networking-and-content-delivery/deployment-models-for-aws-network-firewall/
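To give a sense of the per-VPC route-table plumbing the distributed model pushes onto the customer, here is a hedged boto3 sketch of the kind of routes that have to be maintained by hand in every VPC. The route table, endpoint, and gateway IDs are placeholders; create_route with VpcEndpointId/GatewayId is a standard EC2 operation.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Placeholders -- in the distributed model each VPC carries its own set of these.
WORKLOAD_RTB = "rtb-workload0001"
FIREWALL_RTB = "rtb-firewall0001"
IGW_EDGE_RTB = "rtb-igwedge0001"
FW_ENDPOINT = "vpce-nfw00000001"   # AWS Network Firewall (GWLB) endpoint in this AZ
IGW_ID = "igw-000000000001"

# 1. Workload subnet: send internet-bound traffic to the firewall endpoint.
ec2.create_route(RouteTableId=WORKLOAD_RTB,
                 DestinationCidrBlock="0.0.0.0/0",
                 VpcEndpointId=FW_ENDPOINT)

# 2. Firewall subnet: after inspection, send traffic on to the IGW (or NAT gateway).
ec2.create_route(RouteTableId=FIREWALL_RTB,
                 DestinationCidrBlock="0.0.0.0/0",
                 GatewayId=IGW_ID)

# 3. IGW edge route table: return traffic for the workload subnet must come back
#    through the firewall endpoint (ingress routing).
ec2.create_route(RouteTableId=IGW_EDGE_RTB,
                 DestinationCidrBlock="10.0.1.0/24",   # workload subnet CIDR
                 VpcEndpointId=FW_ENDPOINT)
```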
2. Centralized Model
AWS Network Firewall also offers a centralized model. Without going into the details of the firewall functionality itself, the orchestration and plumbing required are extremely complicated. As with TGW, there is no visibility at all into where the traffic is going, which route table it is hitting, or which attachment it is going through. At a minimum, the customer needs to manage the following pieces manually, and with no visibility it gets very complicated, very quickly:

* Spoke VPC
  * Route table
  * TGW attachment
  * TGW route propagation
  * TGW route table and association with the right attachments
* Inspection VPC (where the firewall is deployed)
  * Requires 2 subnets per NFW (one for TGW ingress and another for the firewall)
  * TGW ingress subnets require unique configuration
  * Firewall subnet route table
  * Inspection VPC attachment
  * Inspection VPC TGW route table association
  * CLI configuration on the TGW to avoid asymmetric traffic to Network Firewall
* Egress VPC
  * Multiple subnets in the Egress VPC (one for TGW ingress and another for NAT)
  * NAT Gateway and its subnet
  * Unique route tables for the ingress subnet and the NAT GW subnet
  * Manual route entries in the TGW route table to support egress routes
* Ingress VPC
  * NLB
  * NLB subnet and its route table
  * Ingress subnet TGW attachment, association, propagation and route table

The following diagram does a good job of summarizing all the things customers have to manage:
Ref: https://aws.amazon.com/blogs/networking-and-content-delivery/deployment-models-for-aws-network-firewall/

Because of these limitations, AWS customers commonly use security groups to firewall within a VPC. As NFW is very new, I will update this space as we learn more. Azure NSGs and GCP firewall rules are similar in nature to AWS Security Groups.

Additional Details
Cross-VPC/VNet firewalling has become an important consideration in recent years as the number of VPCs has grown into the hundreds for a typical cloud customer. Some customers have implemented commercially available firewalls in each VPC. This will serve the purpose, but at a much higher cost. If you are trying to implement an L4 stateful firewall for cross-VPC traffic and log the traffic (stats/allows/denies), there is an easier and more cost-effective solution to consider: use Aviatrix gateways.

Aviatrix Brings a Holistic L4 Firewall to Cloud and Multi-Cloud
Aviatrix gateways have a Layer-4 stateful firewall built into the software. This firewall feature is automatically used with features such as transit peering or FQDN filtering, but can also be used by itself.

Aviatrix L4 Firewall Solution in AWS
The following diagram shows how you can implement this solution in AWS:

Aviatrix L4 Firewall Solution in Azure
The following diagram shows how you can implement this solution in Azure:

Aviatrix L4 Firewall Solution in GCP
The following diagram shows how you can implement this solution in GCP:

Solution Implementation Details
To enable this feature in Aviatrix, first follow the Transit Workflow. Once deployed, the Stateful Firewall feature can be enabled using the following steps:
1. Log into the Aviatrix Controller to create firewall policies.
2. Go to the security tab: Security -> Stateful Firewall -> Policy tab.
3. Select each gateway from the list -> click Edit at the top of the list.
4. Use the Security Policy UI to Allow or Deny traffic based on Source, Destination, Protocol and Port Range.

Additional Features
* You can also log traffic to log analytics tools like Splunk, DataDog, SumoLogic, etc. for audit and compliance purposes. Enable Logging (under Settings) to your log tool of choice, then select "Enable Packet Logging" in the Security Policy UI (screenshot above).
* You can also group IP addresses and tag them with human-readable names. This makes it easier to manage and apply firewall policies. Tagging can be accomplished under the Tag Management tab in the Security view.
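Purely as an illustration of what an L4 stateful policy set looks like, the snippet below models the Source/Destination/Protocol/Port Range/Action fields from the Security Policy UI above as plain Python data with a toy first-match evaluator. The field names and logic are descriptive only; they are not an Aviatrix API or schema.

```python
import ipaddress

# Illustrative rule set mirroring the fields entered in the Security Policy UI.
policies = [
    {"src": "10.1.0.0/16", "dst": "10.2.0.0/16", "proto": "tcp", "ports": "443",
     "action": "allow", "log": True},
    {"src": "10.1.0.0/16", "dst": "0.0.0.0/0",  "proto": "tcp", "ports": "1024-65535",
     "action": "deny",  "log": True},
]

def match(policy, src, dst, proto, port):
    lo, _, hi = policy["ports"].partition("-")
    return (policy["proto"] == proto
            and int(lo) <= port <= int(hi or lo)
            and ipaddress.ip_address(src) in ipaddress.ip_network(policy["src"])
            and ipaddress.ip_address(dst) in ipaddress.ip_network(policy["dst"]))

def evaluate(src, dst, proto, port, base_policy="deny"):
    """Toy first-match evaluation; base_policy applies when no rule matches."""
    for p in policies:
        if match(p, src, dst, proto, port):
            return p["action"]
    return base_policy

print(evaluate("10.1.4.7", "10.2.9.9", "tcp", 443))     # -> allow
print(evaluate("10.1.4.7", "172.16.0.5", "tcp", 8080))  # -> deny
```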