How to leverage NATed T1 Gateway for overlapping networks over a Transit Connect

In my last post, I have showed you how to access a SDDC overlapped segment from a VPC behind a TGW attach to the SDDC through a Route Based VPN.

In this blog post, I am going to cover how to leverage the NATed T1 Gateway for connection to an overlapping segment from a VPC behind a TGW connected to my SDDC over a Transit Connect (vTGW).

There are slight differences between both configuration. The main one is that static routing in the vTGW peering attachment is used instead of dynamic routing. The second one is we will have to use route aggregation on the SDDC.

Route Summarization (eg. Aggregation)

A question, I have heard for a long time from my customers is when are we going to support route summarization. Route summarization — also known as route aggregation — is a method to minimize the number of entries in routing tables for an IP network. It consolidates selected multiple routes into a single route advertisement.

This is now possible since M18 with the concept of Route Aggregation!

Route Aggregation will summarize multiple individual CIDRs into a smaller number of advertisements. This is possible for Transit Connect, Direct Connect endpoints and the Connected VPC .

In addition, since the launch of multi CGW, it’s mandatory for CIDRs sitting behind a non default CGW to be able to advertised them.

This is going to be mandatory in this use case as I am using a NATed T1 Compute Gateway with an overlapping segments that I want to access over a DNAT rule.

Implementing the topology

SDDC Group and Transit Connect

In this evolution of the lab, I have created an SDDC Group called “Chris-SDDC-Group” and attach my SDDC to it.

If you remember well from my previous post, SDDC Groups are a way to group SDDCs together for ease of management.

Once I created the SDDC group, it has deployed a VMware Managed Transit Gateway (Transit Connect) that I have then peered to my native TGW.

I have entered the information needed including the VPC CIDR (172.20.5.0/24) that stands behind the peered TGW.

Keep in mind that the process of establishing the peering connection may take up to 10′ to complete. For more detailed information on how to setup this peering check my previous blog post here.

Creating the Route Aggregation

In order to create the route aggregation, I have had first to open the NSX UI as the setup is done over the Global configuration menu from the Networking tab which is only accessible through the UI.

I created a new aggregation prefix list called DNATSUBNET by clicking on the ADD AGGREGATION PREFIX LIST …

I have created the DNATSUBNET with the prefix 192.168.3.0/24 to advertised it over Transit Connect

with the following prefix (CIDR):

To finish, I have then created the route aggregation configuration. For that, I have first given it a name, selected the INTRANET as the endpoint, and selected the prefix list created earlier.

Checking the Advertised route over the vTGW

In order to make sure the route aggregation works well, I have verified it in both the VMC UI at the Transit Connect level on the Advertised Routes tab:

and in the SDDC group UI from the Routing tab that displays the Transit Connect (vTGW) route Table:

Adding the right Compute Gateway Firewall rule

In this case, as I am using a vTGW peering attachment, there is no need to create an additional Group with the VPC CIDR as there is an already created group called “Transit Connect External TGW Prefixes” that I can used.

I have utilized the same group with the CIDR used to hide the overlapped segment with the DNAT rule.

I then have created the Compute Gateway Firewall rule called ‘ToDNAT‘ with the group “Transit Connect External TGW Prefixes” as the source and the group “NatedCIDRs” as the destination with ‘SSH’ and ‘ICMP ALL’:

Testing the topology

Checking the routing

The routing in the VPC didn’t change. We only need to add a static route back to the SDDC NATed segment CIDR: 192.168.3.0/24

Next is to check the TGW routing table to make sure there is also a route to the SDDC Nated CIDR through the peering connection.

We have to add a static route with a route back to the NATed CIDR:

I confirmed there was a route in the Default Route table of the Transit Gateway:

Pinging the VM in the SDDC from the VPC

To test my lab connectivity, I have connected to the instance created in the AWS native VPC (172.20.5.0/24) and try to ping the 192.168.3.100 Ip address and it worked again!

In this blog post, I have demonstrated how to connect to an overlapped SDDC segment by creating an additional NATed T1 Compute Gateway. In this lab topology, I have tested connectivity from a native VPC behind a TGW connected to my SDDC over a Transit Connect (vTGW).

I hope you enjoyed it!

In my next blog post, I will show you how to establish a communication between a subnet in a VPC and an SDDC segment that are overlapping over a Transit Connect, stay tune!

How to leverage NATed T1 Gateway for overlapping networks over a Route Based VPN to a TGW

I have seen a lot of customers having overlapping IP subnets among their applications and who wanted to avoid renumbering their network segments when they migrate them to the cloud.

In the recent 1.18 release of VMware Cloud on AWS, we have added the ability for customers to create additional T1 Compute Gateways . Additional T1s can be used for a number of use cases including environment segmentation, multi-tenancy and overlapping IP addresses.

In this blog post, I am going to cover the specific design case where a native VPC needs to connect with a segment in an SDDC that has an overlapping subnet with another segment. The SDDC itself is using a Route Based VPN to connect to a native AWS Transit Gateway where the native VPC is peered.

Lab topology

First of all, I have deployed my SDDC in the Northern Virginia region. Straightaway I have attached it to a native Transit GW over a Route Based VPN (I wanted to leverage BGP for dynamic routes exchange).

I then have attached the native VPC to the native TGW through a normal peering connectivity.

N.B.: I took the assumption that I would need a route aggregation for route advertisement as it’s a requirement for the Multiple Compute Gateway case. A key point in that case is that it’s not using Transit Connect, Direct Connect, or Connected VPC, so I don’t need a route aggregation.

Additionally in this SDDC, I have created two Compute segments that are using overlapping IPs (172.20.2.0/24).

On the AWS native side, there is 1 EC2 instances (172.20.5.153) in a native VPC (172.20.5.0/24) and on the SDDC I have deployed a Debian10 Virtual Machine named Deb10-app001 running with IP 172.20.2.100.

Inside the SDDC, I have attached the VM to the Overlapped segment as you can see it here:

Implementing the lab topology

Creating the New CGW

There are three different types of Tier-1 Compute Gateways that can be added with M16: Routed, Isolated and NATed. In this example, I have chosen the NATed type. This CGW type allows for communication between segments but avoid that any segments to be learned by the Tier-0 router. This also avoid having their CIDRs show up in the routing table.

To create the new NATed CGW, I went to the Tier1-Gateways menu and click on the “ADD TIER-1 GATEWAY” button. In addition, I have made sure I pick NATed as the type.

I know have a new Tier-1 Compute Gateway.

Creating a new overlapping segment

Equally, I have created the overlapped segment and have attached it to the new NATed CGW Compute Gateways. I have picked the ‘T1 NATed’ CGW in the list.

Creating a DNAT Rule

In order to ensure connectivity to the SDDC’s overlay segments configured behind them we need to configure a NAT Rule on the NATed CGWs

So the last step was to create a DNAT Rule and attach it to the T1 NATed Compute Gateway.

For that I went to the NAT Menu on the left and have selected the Tier-1 Gateway tab and pick the T1 NATed GW.

I have just clicked on the ADD NAT RULE button. I have enter the NATed subnet in the destination IP/Port field (192.168.3.0/24 in this example) and the overlapped subnet as the destination.

This means that any IP I want to reach inside 172.20.2.0/24 like 172.20.2.100 will be accessible over the Nated IP 192.168.3.100.

Next step is to click on the Set to select the right T1 Gateway to apply the rule to and picked the T1 Nated Gateway:

After a couple of seconds, I have confirmed the rule gets activated by checking the rule status:

Adding the right Compute Gateway Firewall rule

Next thing I did to finish the setup is to add the right firewall rule on the new T1 Gateway. Remember each T1 has its own FW rules.

In this case, I have created a new group called ‘My VPC Prefix’.

I have, however, created another group with the CIDRs I used to map the ‘Overlapping Segment’:

The segment used in this example is 192.168.3.0/24. The group reference two CIDRs however

I then have created the Compute Gateway Firewall rule called ‘ToDNAT‘ with the group “My VPC Prefix” as the source and the IP address of my NATed segment as the destination with ‘SSH’ and ‘ICMP ALL’:

Compute Gateway FW rule have to use the NATed CIDR as Destination IP and not the segment ip.

Testing the lab topology

Checking the routing

First let’s check the routing inside the VPC. We need to add a static route back to the SDDC NATed segment CIDR: 192.168.3.0/24. Thanks to that, the VPC will send all traffic destinated to the NATed CIDR over the TGW peering.

The route back to the SDDC in the route table of the VPN/subnet

As there is a Route Based VPN between the SDDC and the TGW, the TGW route table is automatically advertising the SDDC NATed CIDR (192.168.3.0/24) that I will use to connect to the overlapped segment (172.20.2.0/24).

This shows the attached VPC CIDRs and the segment CIDRs.

Pinging from the VPC

To test my lab connectivity, I have connected to the instance created in the AWS native VPC (172.20.5.0/24) and try to ping the 192.168.3.100 Ip address and it worked!

In this blog post I have demonstrated how to connect to an overlapped SDDC segment by creating an additional NATed CGW. In this example, I have connected from a VPC attach to a TGW connected to the SDDC over a VPN .

I hope you enjoyed it!

In my next blog post, I will show you how to do the same over a Transit Connect, stay tune!

VMware Transit Connect to native Transit Gateway intra-region peering in VMware Cloud on AWS

VMware Transit ConnectTM has been around now for more than a year and I must admit that it has been widely adopted by our customers and considered as a key feature.

Over the time we have added multiple new capabilities to this feature like support for connectivity to a Transit VPC, custom real-time Metering to get more visibility on usage and billing, connectivity to an external TGW (inter region), and more recently Intra-Region Peering with AWS TGW which has been announced at AWS re:Invent 2021, also referred to as intra-region peering.

Let’s focus in this post on the specific use case of allowing connectivity from workloads running on an SDDC to a VPC sitting behind a native Transit Gateway – TGW.

Intra-Region VTGW to TGW Peering lab Topology

In this lab, I have deployed an SDDC and attached it to an SDDC Group with a vTGW (Transit Connect) that I have peered to a native TGW in the same region.

On the AWS native side, I have deployed 2 EC2 instances (172.20.2.148 and 172.20.2.185) in a native VPC (172.20.2.0/24) and on the SDDC I have deployed a Debian10 Virtual Machine named Deb10-App001 running with IP 172.18.12.100.

Building the lab Topology

Let’s see how to put all things together and build the Lab!

Attaching the native TGW to the SDDC Group

To start configuring this lab it’s very easy!

I have first created an SDDC Group called “Multi T1” and attach my SDDC to it. SDDC Groups are a way to group SDDCs together for ease of management.

Transit Connect offers high bandwidth, resilient connectivity for SDDCs into an SDDC Group.

Then I have edited my SDDC Group and selected the External TGW tab.

I have Clicked on ADD TGW. The information needed are the AWS Account ID (ID of the AWS Account where the TGW resides), and the TGW ID that I have grabbed from my AWS Account.

The TGW Location is the region where the native TGW being peered with resides. The VMC on AWS Region stands for Region where the vTGW resides.

I have entered the information needed including the VPC CIDR (172.20.2.0/24) that stands behind the peered TGW and confirmed both regions are identical.

The process of establishing the peering connection starts. Keep in mind that the whole process may take up to 10′ to complete.

After a couple of seconds the status changes to PENDING ACCEPTANCE.

Accepting the peering attachment in AWS Account

Now it’s time to switch to the target AWS Account and Accept the peering from the AWS console. This is possible by going to the Transit Gateway attachments option in the VPC Menu.

I have selected Attach in the Actions drop down Menu.

And then to Click Accept.

After a few minutes the attachment is established.

Observe the change on vTGW console

In order to validate the connection is established, I have checked the route table of the vTGW and we can see that the new destination prefix of the native VPC have been added.

We can also see from the Transit Connect menu of the SDDC that the CIDR block of the native VPC has been added to the Transit Connect routing tables in the Learned Routes.

Adding a Route in the AWS native TGW route Table

To make sure the routing will work I have checked if a routing table is attached to the peering attachment and that there are routes to the SDDC subnets.

The TGW would need the following route table:

As seen in the following screen, I can confirm there is a route table attached to the peering attachment.

If I look at the routing table content, I can find both the VPC and Peering attachement in the Associations tab.

Here are the peering in the native AWS TGW. We can see some VPNs as well as the VPC peering and vTGW peering

If I look at the Route Table of the native Transit Gateway, I can see there are only one route entry:

This is the VPC subnet propagated automatically from the peering to the native VPC (172.20.2.0/24)

As the Transit Connect does not propagate its routes to the native TGW, we need to add the route back to the CIDRs of the SDDC.

Be careful because if you forget this step it won’t be able to route the packets back to the SDDC!

After a couple of seconds, the route table is updated.

All subnets should be added if connectivity is needed to them.

Add a route back to SDDC CIDRs into the native VPC route table

By default the native VPC will not have a route back to the SDDC CIDRs, so it needs to be added manually in order to make communication between both ends possible.

Add one or all SDDC CIDRs, depending on whether you want to make all or some of the segments in the SDDC accessible from the native VPC.

I have only added one segment called App01.

NB: Virtual Machines on Layer-2 extended networks (including those with HCX MON enabled) are not able to use this path to talk with the native VPC.

Keep in mind that as you add prefixes to either the VMware Cloud on AWS or the AWS TGW topologies the various routing tables on both sides will have to be updated.

Configure Security and Test Connectivity.

Now that network connectivity has been established let’s have a look at the additional steps that need to be completed before workloads can communicate across the expanded network.

Configure Compute Gateway FW rules in SDDC

First, the Compute Gateway (CGW) Firewall in the SDDC must be configured to allow traffic between the two destinations.

You have the choice to define IP prefix ranges in the CGW to be very granular in your security policy. Another option is to use the system defined CGW Groups called ‘Transit Connect External TGW Prefixes‘ as sources and destinations in the rules.

The system defined Group is created and updated automatically and it includes all CIDRs added and removed from the VTGW. Either choice works equally well depending on your requirements.

I have created a Group will all prefixes from my SDDC called SDDC Subnets where I could find the IP address of my test VM.

I decided to configure a Compute Gateway Firewall rule (2130) to allow traffic from the App01 compute segment to the native VPC and have enforced it at the INTRANET LEVEL. The second rule (2131) will allow for connectivity from the native VPC to the SDDC subnets.

Configure Security Groups in VPC

For the access to the EC2 instance running in the native VPC, make sure you have updated the security group to make sure the right protocol is permitted.

My EC2 instance is deployed has an IP Address of 172.20.2.148

and has a security group attached to it with the following rules:

Test connectivity

It’s time to test a ping from App001 VM in my SDDC which has ip address 172.18.12.100 to the IP Address of the EC2 instance.

We can also validate that a connectivity is possible from the EC2 instance to the VM running in the SDDC.

Conclusion

Win this post blog post we have established an intra-region VTGW to TGW peering, updated route tables on both the TGW and the native VPC, prepared all security policies appropriately in both the SDDC and on the AWS side, and verified connectivity end to end. 

This intra-region peering is opening a lot of new use case and connectivity capabilities in addition to the existing Inter-region peering.

In my next post, I will show you how to peer vTGW with multiple TGW, stay tune!

NSX Traceflow in VMC on AWS for self-service traffic Troubleshooting

In a recent post I have talked about the NSX Manager Standalone UI access which was released in VMware Cloud on AWS in version 1.16.

This capability is now permitting customer to access a very useful feature called Traceflow that many NSX customers are familiar with and which allows them to troubleshoot connectivity issues in their SDDC.

What is Traceflow and how does it work?

VMware Cloud on AWS customers can leverage Traceflow to inspect the path of a packet from any source to any destination Virtual Machines running into the SDDC. In addition, Traceflow provides visibility for external communication over VMware Transit Connect or the Internet.

Traceflow allows you to inject a packet into the network and monitor its flow across the network. This flow allows you to monitor your network path and identify issues such as bottlenecks or disruptions.

Traceflow observes the marked packet as it traverses the overlay network, and each packet is monitored as it crosses the overlay network until it reaches a destination guest VM or an Edge uplink. Note that the injected marked packet is never actually delivered to the destination guest VM.

Let’s see what it can do to help gaining visibility and troubleshooting networking connectivity in a VMC on AWS SDDC.

Troubleshooting Connectivity between an SDDC and a native VPC over a vTGW/TGW.

First let’s have a look at the diagram of this lab.

In my lab, I have deployed two SDDCs: SDDC1 and SDDC2 in two different regions and have attached them together within an SDDC Group. As they are in two different region two Virtual Transit Connect are required. I have two VMs deployed in the SDDC1, Deb10-App01 (172.18.12.100) and Deb10-Web001 (172.18.11.100).

I have also deployed a native VPC (IP: 172.20.2.0/24) attached to the SDDCs group through a TGW peered to the vTGW. I then have deployed two VMs in the attached VPC with IPs 172.20.2.148 and .185.

In this example, the trafic I want to gain visibility on will flow over the vTGW (VMware Managed transit Gateway) and the native Transit GatewayTGW which is peered to it.

Peering a vTGW to a native AWS Transit Gateway is a new capability we recently introduced. We can peer them in different region as well as in same region. If you want to know more how to setup this architecture, have a look at my post where I describe the all process.

Once all connectivity is established, I have tested ping connectivity between the VM Deb10-App01 (172.18.12.100) running on a Compute segment in SDDC1 to the EC2 instance (172.20.2.148) running in the native VPC (172.20.2.0/24).

Let’s launch the Traceflow from the NSX Manager UI. After connecting to the interface, the tool is accessible under the Plan & Troubleshoot menu.

Select Source machine in SDDC

In order to gain visibility under the traffic between both VMs, I have first selected the VM in the SDDC in the left Menu which is where you can define the Source machine.

Select Destination Machine in the native VPC

In order to select the destination EC2 instance running in the native VPC, I have had to select IP – Mac/ Layer 3 instead of Virtual Machine.

Displaying and Analyzing the Results

The traceflow is ready to be started!

The Analysis start immediately after clicking on the Trace button.

After a few seconds, the results are displayed. The NSX interface graphically displays the trace route based on the parameters I set (IP address type, traffic type, source, and destination). This display page also enables you to edit the parameters, retrace the traceflow, or create a new one.

Traffic flow diagram with the hops

The screen is split into two sections.

First section, on the top, is showing the diagram with the multiple hops that was crossed by the traffic. Here we can see that the packets has first flowed over the CGW, then it has reached the Intranet Uplink of the EDGE, it hit the vTGW (Transit Connect) and it has finally crossed the native TGW.

We can see that the MAC address of the destination has been collected on the top near the Traceflow ‘title’.

The second section detailed each and every steps followed by the packets with the associated timestamps. The first column shows the number of physical Hops.

The final step show the packet has been correctly delivered to the TGW.

We can confirm that the Distributed Firewall (DFW) have been enforced and were correctly configured :

To confirm which Distributed FW rule have been enforced, you can check on the console the corresponding rule by searching it by the rule ID:

Same thing applies for the Edge Firewall for North South Trafic.

Again I have checked the Compute Gateway Firewall rule to confirm it picked the right one and that it was well configured.

Let’s now do a test with a Route Based VPN to see the difference.

Troubleshooting Connectivity between an SDDC and a TGW via a RB VPN

Now instead of using the vTGW to TGW peering, I have established a Route Based VPN directly to the native Transit GW in order to avoid flowing over the vTGW.

I have enable the RB VPN from the Console:

I have enable only the first one.

After a few minutes the BGP session is established.

The VPN Tunnel in AWS shows 8 routes have been learned in the BGP Session.

I just need to remove the static route in the native TGW route table to avoid asymmetric traffic.

The 172.18.12.0/24 where my Virtual Machine runs is now learned from the BGP session.

Let’s start the analysis agains by clicking the Retrace button.

Click on Retrace button to relaunch the analysis on the same Source and Destination

Click Proceed to start the new request.

The new traceflow request result displays.

This time the packet used the Internet Uplink and the Internet Gateway of the Shadow VPC managed by VMware where the SDDC is deployed. The observations show that packets were successfully delivered to both the NSX-Edge-0 through IPSEC and to the Internet Gateway (igw).

Troubleshooting Firewall rules

Last thing you can test with Traceflow is how to troubleshoot connectivity when a Firewall rule is blocking a packet.

For this scenario, I have changed the Compute Gateway Firewall rule to drop the packets.

I have started the request again and the result is now showing a red exclamation mark.

The reason of the Packet Dropped is a Firewall Rule

The details confirmed it was dropped by the Firewall rule and it displayed the ID of the rule.

That concludes this blog post on how to easily troubleshoot your network connectivity by leveraging the Traceflow tool from the NSX Manager UI in VMware Cloud on AWS.

Thanks for visiting my blog! If you have any questions, please leave a comment below.

Transit Gateway to RB VPN BGP Route filtering on VMC on AWS

Today I wanted to cover a topic that was recently raised by one of my customer about how to filter routes coming from a native TGW attached to an SDDC with a Route Based VPN.

There are currently no way to do it over the UI but it is possible through an API call to configure route filtering in and out with a Route Based VPN.

Let’s see how it is possible.

Route Based VPN attachment to a native Transit Gateway

First thing I have created a route based VPN from my SDDC to a native Transit Gateway running on AWS.

The Transit Gateway is itself attach to a native VPC with 172.16.0.0/16 as subnet.

Let’s have a look at the VPN configuration in the SDDC and in the native AWS side.

AWS Transit Gateway VPN configuration

There is a site to site VPN configured with one tunnel (I didn’t configure two tunnels in that example).

The TGW is currently learning the SDDC subnets (see the 5 BGP ROUTES) including the management segment.

In order to see all the learned CIDRs, I need to display the Transit Gateway route table.

SDDC VPN configuration

If we look at the VPN configuration on the SDDC side, here is the result.

And if I click the View routes here

I can see the learned routes …

The SDDC is currently learning the VPC CIDR from the TGW.

and the Advertised routes.

I can confirm that by default everything is learned and advertised.

Let’s see how to limit the propagation of the routes from the TGW or the SDDC through an API call.

Installing and Configuring Postman

Download Postman

First I have downloaded Postman from here and installed it on my Mac laptop.

First thing you need to do when you have installed Postman is to create a new free account.

Click Create Free Account and follow the steps until finishing your free registration.

This will bring you to the following page with a default Workspace.

Import VMC on AWS Collections

Next thing you need to do is to import the VMC on AWS Collections and Environments variables that are directly available from VMware in the vSphere Automation SDK for REST. The VMware vSphere Automation SDK for REST provides a client SDK that contains samples that demonstrate how to use the vSphere Automation REST API and sample code for VMC on AWS and others.

Click on the download button of the Downloads section.

This will download a zip file that you need to extract.

This is going to redirect you to a Github repo. Just click on the green button called “Code” and pick Download.

Once you have downloaded, select the two following files: VMC Environment.postman_collection.jsonandVMware Cloud on AWS APIs.postman_collection.json and Import both into your Postman workspace.

This will add the collection with different sub folders.

Configuring VMC on AWS environments in Postman

If you click on the Environments section on the Left, you can setup multiple environment variables here including the SDDC ID, ORG ID and a refresh token.

Start by generating an API token, grab the information of SDDC from CSP Console and copy them in the CURRENT VALUE column.

Add the following variables with the following values:

Configuring the VMC APIs Authentication

Once you have downloaded the VMC on AWS APIs collection, we need to configure a few parameters here.

Select the Authorization tab, and change the Type from No Auth to API Key.

Change the Value to {{access_token}}, “Add to” has to be kept as Header.

Limiting routes through API calls with Postman

Create a new Collection for NSX API Calls

Here we are going to import an existing collection that has been created by my colleague Patrick Kremer here on  Github. By the way Patrick has also an excellent post here and even if it’s not covering the exact same use case it was a lot inspirational to me. I would like also to mention another excellent content from Gilles Chekroun.

Follow the same steps as before for the VMC collections this will add the following:

The two first are useful to check the configuration and the two others are used to implement things

Authenticating to VMC API

Now we can Login to the VMC on AWS API in order to execute the relevant command to create the Prefix Lists and do the Route filtering.

In order to do so, Select Login in the Authentication folder and Click the Send button on the right.

The body of the request shows a new access token which is valid for 1799 seconds.

Creating a Prefix List

Now we need to create a prefix list in order to limit SDDC subnets to be advertised to the Transit Gateway through the BGP session of the route based VPN. Let’s say we want to limit the management subnet 10.73.118.0/23 to be accessible from the VPC. We also want to avoid that we can access the VPC (172.16.0.0/16) from the SDDC.

In order to achieve that we need to create two prefix lists, one to filter in and one to filter out.

From the Postman select Create Prefix List, give the prefix list ID a value.

I have chosen filter_mngt_subnet for the first Prefix List ID.

Next is to the body of the request.

{ "description": "This will filter the Management subnet from SDDC", "display_name": "{Filter out Management subnet}",
"id": "{{prefix-list-id}}",
"prefixes": [ { "action": "DENY", "network": "10.73.118.0/23" },
{ "action": "PERMIT", "network": "ANY" }
]}

Each prefix list need a DENY and a PERMIT ANY command to avoid blocking all traffic

Just click Send to add the Prefix List.

The result of the creation is a 200 OK.

I have created a second Prefix List in order to limit the VPC subnet from being advertised to the SDDC.

Display the Prefix Lists

Next step is to check the Prefix Lists has been created successfully by leveraging the Show BGP Prefix List command.

You should see the Prefix lists with all the already created ones.

Attaching the Prefix List to the route Filter

Now we have to attach the Prefix Lists to the BGP Neighbors configuration.

First of all grab the existing configuration by using the Show VMC T0 BGP Neighbors GET command.

The result is displayed as follow.

Copy this text and remove the following lines: _create_time, _create_user, _last_modified_time, _last_modified_user, _system_owned, _protection, _revision.

Now we are going to append the prefix Lists to the configuration by using the latest command: Attach Route Filter.

Grab the Neighbor ID from the result and paste it to the VALUE.

Copy and paste the previous result into the Body of the command and the prefix list command in an out.

Click the Send button.

Checking routes Filtering

If I check on the SDDC side I can see that the management subnet is now filtered out.

I confirmed it by checking on the AWS side in the Transit Gateway route table.

The management subnet is not displayed here

To conclude, I can also confirm that the VPC subnet is not advertised in the SDDC as I don’t see it as a learned route.

That concludes my post, enjoy filter out the routes!

NSX Manager Standalone UI for VMC on AWS

Today I want to focus on the new feature from M16 release that enable customer to have a direct access to NSX Manager UI.

This is for me an interesting capability especially because it gives access to a more familiar interface (at least for customers that already utilise NSX-T) and it also reduces the latency involved with the CSP Portal reverse proxy.

In addition, it enables the access to NSX-T TraceFlow which will be very helpful to investigate connectivity issues.

Let’s have a look at this new Standalone UI mode.

Accessing the standalone UI

There are two ways to access the NSX Manager Standalone UI in VMC on AWS:

  • Via Internet through the reverse proxy IP address of NSX Manager. No particular rule is needed on the MGW.
  • Via the private IP of NSX Manager. It’s the option you will take if you have configured a VPN or a Direct Connect. A MGW firewall rule is needed in that case.

In order to choose between the type of access that fits our needs, we need to select it in the Settings tab of the VMC on AWS CSP console.

There are two ways to authenticate to the UI when leveraging the Private IP:

  • Log in through VMware Cloud Services: log in to NSX manager using your VMware Cloud on AWS credentials
  • Log in through NSX Manager credentials: log in using the credentials of the NSX Manager Admin User Account (to perform all tasks related to deployment and administration of NSX) or the NSX Manager Audit User Account (to view NSX service settings and events)

Both accounts have already been created in the backend and their user name and password are accessible below the URLs.

I have chosen the Private IP as I have established a VPN to my test SDDC.

So prior to accessing the NSX Manager, I have had to create a Management Gateway Firewall rule to allow source networks in my lab to access NSX Manager on HTTPS (the predefined group NSX Manager is used as a target).

Navigating the standalone UI

I started by clicking on the first URL here:

After a few seconds, I am presented with the NSX Manager UI:

Networking tab

Menu from Networking Tab

This tab will give you access to configuring the Connectivity options, Network Services, Cloud Services, IP Management, or Settings.

Basically the settings can be accessed in read only or read/write mode.

Keep in mind you will not have more rights or permissions to modify settings than if you were editing it from the CSP Console.

VPN and NAT options are accessible with same capabilities as from CSP console.

The Load Balancing options is there and is usable only if you have Tanzu activated in your cluster.

For example, for the Direct Connect you can change the ASN number or enable VPN as a backup.

For Transit Connect, you can have a look at the list of Routes Learned or Advertised.

Public IPs allow for requesting new IP addresses for using them with HCX or a NAT rule.

Let see what’s possible to do from the Segments menu.

From here you can see is a list of all your segments. You can also create a new segment, modify an existing segments or delete your segments.

I was able to edit the settings of one of my segment DHCP configuration.

I was also able to edit my Policy Based VPN settings.

All the other options are reflecting what we can already do in the CSP Console.

Security tab

This Menu is divided into two parts:

  • East-West Security that gives access to the Distributed Firewall rules and Distributed IDS/IPS configuration,
  • North-South Security covers internal traffic protection and the Gateway Firewall rules settings.

Nothing really interesting here, it’s pretty much the same as from the CSP Console as you can see here:

On the Distributed IDS/IPS, I can review the results of my previous penetration testing that I did in my previous post.

Inventory tab

This tab is covering:

  • Services: this where you’ll configure new protocol and services you want to leverage in the FW rules
  • Groups: group of Virtual Machines for Management FW rules and Compute Gateway rules
  • Context Profiles: you can basically add new FQDNs useful for the DFW FQDN filtering feature, AppIDs for Context Aware Firewall rule, and set up Context Profiles.
  • Virtual Machines: list all the VMs running an attached to segments with their status (Stopped, Running, …)
  • Containers: will show Namespaces and Tanzu Clusters.

Plan and Troubleshoot tab

The tab is covering:

  • IPFIX: this where you’ll configure new protocol and services you want to leverage in the FW rules
  • Port Mirroring: this permits to setup a target collector VM and then replicate and redirect all trafic from a logical port switch to it for analysis purpose
  • Traceflow: very nice feature to monitor and trouble shoot a trafic flow between two VMs and to analyze the path of the trafic flow.

The last one is a feature not existing on the current VMC on AWS CSP Console and which is to my opinion worth having a look at.

Traceflow option

Let’s have a look more deeply into what this brings onto the table in my next post.

Stay tune!

NSX Advanced Firewall Add On for VMware Cloud on AWS (Part 3)

In my previous post, I talked about the FQDN filtering feature which is one of the new Add-Ons of the Advanced firewall.

In this Part 3 of this multi part blog series, let’s focus on the latest feature, the Distributed IDS/IPS which is part of the newly announced NSX Advanced Firewall for VMware Cloud on AWS.

Introduction to Distributed IPS/IDS

With NSX Distributed IDS/ IPS, customers gain protection against attempts to exploit vulnerabilities in workloads running on VMware Cloud on AWS.

Distributed IDS/ IPS is an application-aware deep packet inspection engine that can examine and protect traffic inside the SDDC. Customers can detect and prevent lateral threat movement within the SDDC using the intrinsic security capabilities of Distributed IDS/IPS. 

Like DFW, Distributed IDS/IPS is built into the hypervisor and inspection can be performed for all traffic coming into or leaving the VM. Since the inspection is performed on all the hypervisor hosts in a distributed manner, there is no single inspection bottleneck that chokes the traffic flow.

Enabling Distributed IDS/IPS

First thing we will do is to activate and configure the Distributed IDS/IPS feature in VMC on AWS SDDC.

If you don’t have already activated the NSX Advanced Firewall add-on, please do so otherwise you will get this message:

Remember in my first Post of this series, I already have shown you how to activate the NSX Advanced Firewall Add On for VMware Cloud on AWS.

Once you have activated the add-on feature, in the browser, Click the Networking and Security tab. Click Distributed IDS/IPS, located in the Security Section.

The IDS/IPS is disabled by default so you have to enable it for the cluster. Here I have only one cluster.

Just move the slider to enable the feature and confirm that you want to enable the cluster and you are ready to test it!

Once it’s enabled you can choose to regularly update the Signatures by selecting the Auto Update new versions button.

NSX Distributed IDS/IPS utilizes the latest threat signature sets and anomaly detection algorithms to identify attempts at exploiting vulnerabilities in applications. It is integrated with the NSX Threat Intelligence Cloud Service to always remain up to date on the latest threats identified on the Internet.

You can check the other versions that have been presents in the environment by clicking the View and change versions link.

This is launching a new window with historical details. Here we can see that the first Default Signature was installed Jun 17th, 2021 and additional signatures has been pushed Oct 20th and Nov 12nd.

By clicking on the New signatures, I can dive deep into the details of each of them and access really good information on what signatures have been disabled, updated, …

We are gonna go ahead and be using the latest versions.

If you don’t have access to Internet from your NSX Manager, you also download the IDS/IPS signatures from the Network Threat Intelligence services page and be able to upload them manually.

Now it’s time to finish configuring the feature and launch some real test attacks by leveraging both Kali Linux and the infection Monkey tooling to simulate some attacks!

Configuring Distributed IDS/IPS profile & rule

Create a Profile for IDS/IPS

In this section, I will create a default profile to use with an IDS/IPS rule.

NB: We can configure up to 25 profiles.

Under the Profiles tab under Distributed IDS/IPS within the Security section, I have clicked ADD PROFILE and create the ChrisIDSProfile profile:

I have accepted the default settings but you can customise the profile to meet your requirements. You can for instance only select the Intrusion attack with a level of severity to Critical or High and Critical only.

One thing you can do is to tweak it by selecting specific CVSS or Attack types.

I clicked save to finish creating the Profile.

We can see that the profile has been successfully created.

After a few seconds it appears in green:

Create a Policy with rules for IDS/IPS

Now let’s create a Policy.

For that, I need to go to the Rules tab and add a specific IDS/IPS Policy called ChrisIDS-Policy.

I have selected the check box next to the name of the policy, then click Add Rule.

To finish the configuration I have to select the profile previously created.

I have also changed the source from Any to my SDDC subnets.

Please note that I leave the Sources and Services columns to Any and the Applied to field set to DFW.

I have also left the Mode to Detect Only. In Production it’s better to change this setting and switch to Detect & Prevent.

Now that I am done with the setup, I just need to click Publish.

Now it’s time to go for some tests of attacks and exploits.

Testing the IDS/IPS

In order to test the IDS/IPS feature, I have used my best security scanning tools to generate some attacks and try to exploit some vulnerabilities in one special server.

Basically I will launch the exploits on a OWASP Web Application server which is a test server with vulnerabilities that I have deployed in my SDDC. In a nutshell OWASP stands for The Open Web Application Security Project® and it is a nonprofit foundation that works to improve the security of software. It’s a very good way to test the level of security of your environment.

This OWASP server is going to be the target for all the vulnerability scanning coming from my two different tools.

Scanning tools

First one is the Kali Linux distribution in a Virtual Machine which have a multitude of security tools preinstalled in it. I love it!

The second one is the Infection Monkey virtual appliance from Guardicore which is a platform with a graphical interface that you can leverage to launch the exploits.

Infection Monkey is an open source breach and attack simulation (BAS) platform that allows organisations to discover security gaps and fix them. You can Simply infect a random machine with the Infection Monkey and automatically discover your security risks. Test for different scenarios – credential theft, compromised machines and other security flaws.

Deploying Kali Linux

It’s a simple process as you can install it from a ISO CD or download a virtual image directly from here.

I have choose to install it with the ISO CD as it gives more flexibility to tweak your VM settings.

Once the VM is deployed there is nothing more to do.

Deploying Monkey Island VM

First I have deployed the Monkey Island VM from the OVA downloaded from the Infection Monkey website. This is an Ubuntu Linux VM with a small footprint of only 2 vCPU and 2GB of RAM.

Once it’s been installed, I have just started the VM.

My VM is up and running very quickly and I can connect to it from the web console on port 5000:

Once I am logged in with the default username: monkeyuser and password, I can setup the system.

I start by clicking on Configure Monkey.

I need to click the Network tab, and Change the Target list IP address with the IP address of the OWASP VM running in the App segment (172.11.11.115).

Then I clicked on the Run Monkey on the left and Select From Island.

At that moment the tool launches the exploits automatically.

Launching the attacks and exploits

With Kali Linux tools

In my environment, the Kali Linux server address is 172.11.11.107.

And the OWASP Broken Web Application has the following address: 172.11.11.115.

In this first stage, I started to use Kali Linux with nmap to scan the OWASP Web server.

As you can see, there are 9 opened ports on the machine. The nmap command is able to output the name and version of the services that use the ports

Is this next step, I have leveraged the nikto command to scan for vulnerabilities on the server.

Multiple vulnerabilities have been displayed. Mainly affecting the Apache server and also the version of Python which is outdated

The result of the exploit is visible now on the CSP Console as you can see on the screen below. At the top, you can see there is representation of the attempt to compromise the server and they are spread over a time range with a slider that can be changed as needed.

The attacks have triggered a lot of Emerging Threats (ET Scan) alerts with Medium, High and Critical severity levels.

Medium alerts inform that the http protocol on the Web-Server is exploitable with vulnerabilities. The response here is just “Detect”. You can see the CVE number and CVSS Classification of the vulnerabilities on the right.

When I click on the VMs Affected, a list of the VM that have been affected by the vulnerabilities displays:

In addition, clicking the purple bar allow for displaying a detail window:

With Monkey Island tools

As I said before the scanner starts automatically after finishing the setup. Once it has finished its scanning operations, Monkey Island shows a map with all the results accessible through a web page.

It also displays a map of the devices that have been scanned by the tool.

On the right of the page, there is a tab called ATT&CK report that helps understand the exploits that have successfully been used or tried.

On the VMC on AWS Console, the results are displayed the same way as before with the Kali Linux tool:

The Alert displayed here is an apache Struts remote code execution attempt.

Conclusion

This new Advanced Firewall Add-on IDS/IPS feature is really interesting as today it’s the only way to prevent attacker from exploiting vulnerabilities from inside the SDDC.

That’s conclude the post, I hope this has given you a better understanding on how this feature is powerful.

NSX Advanced Firewall Add On for VMware Cloud on AWS (Part 2)

In my previous post, I have introduced you to the new Advanced Firewall Add-on in VMWare Cloud on AWS.

I also covered the Context Aware Firewall feature to filter connection based on the App id and not only the protocol number.

In this post, I am going to cover Distributed FW FQDN filtering to allow applications that communicate outside the SDDC gain layer 7 protection.

Introducing the FQDN Filtering feature

This feature can allow users to only access specific domains by whitelisting and/or blacklisting FQDNs. In many high-security environments, outgoing traffic is filtered using the Distributed firewall. When you want to access an external service, you usually create IP-based firewall rules. In some cases, you don’t know which IP addresses hide behind a domain. This is where domain filters come in handy.

Because NSX-T Data Center uses DNS Snooping to obtain a mapping between the IP address and the FQDN, you must set up a DNS rule first, and then the FQDN allowlist or denylist rule below it.

SpoofGuard should be enabled across the switch on all logical ports to protect against the risk of DNS spoofing attacks. A DNS spoofing attack is when a malicious VM can inject spoofed DNS responses to redirect traffic to malicious endpoints or bypass the firewall

You can define specific FQDNs that are allowed and apply them to DFW policies. Conversely, you can define specific FQDNs that are denied access to applications in the SDDC. The DFW maintains the context of VMs when they migrate. You can then increasingly rely on application profiling and FQDN filtering to reduce the attack surface of their applications to designated protocols and destinations.

Configuring DFW with FQDN filtering

In this section, I will show you how to setup a FQDN Context Profile, and a Firewall policy to limit access to specific URLs from VMs.

Creating a FQDN Context Profile.

First thing first ! Let’s create the context Profile.

Under Networking and Security, in the Inventory section, click Context Profile.

Click FQDNs Tab

Click ACTIONS –> Add FQDN

Enter the Domain: *.yahoo.com, and then Click SAVE.

Create a second FQDN with *.google.com.

Click the Context Profile Tab, and Click ADD CONTEXT PROFILE

Give it a Name: Allowed FQDNs, Click Set

Click ADD ATTRIBUTE –> Domain(FQDN) Name

Select the following domains: *.yahoo.com, *.office.com, *.google.com and Click ADD.

Click APPLY, Click SAVE. We now have a Context Profile setup.

Creating a Firewall rule and a Policy

I have created a Group called MyDesktops which includes a segment with my Windows VMs.

Now I am going to setup a Firewall Policy including this Context Profile. I will limit my VM in the MyDesktops group to access to the Allowed FQDNs. Also I limit access from this Group of VMs to specific DNS servers (8.8.8.8, 8.8.4.4).

I also add a Drop rule at the end to limit access to only the FQDNs that were whitelisted.

Now I am allowed to access google.com and Yahoo.com but I can’t connect anymore to the vmware.com site.

This concludes the post on FQDN Filtering. In my final post, I will cover the Distributed IDS/IPS feature.

NSX Advanced Firewall Add On for VMware Cloud on AWS (Part 1)

VMware Cloud on AWS already offers a robust sets of networking and security capabilities that enable customers to run production applications securely in the cloud.

The release of the M16 version is introducing new Advanced Firewall Features as an Add-on.

This includes the following new security capabilities:

  • L7 Distributed (Context Aware) Firewall with application ID – With L7 (Context-aware) firewall you can go beyond simple IP/ port level layer 4 security to complete stateful layer 7 controls and filtering.
  • L7 Distributed Firewall with FQDN Filtering – Applications that communicate outside the SDDC also gain layer 7 protection using Distributed Firewall FQDN filtering capability. Customers can define specific FQDNs you can define specific FQDNs that are denied access to applications in the SDDC. The DFW maintains the context of VMs when they migrate. Customers increasingly rely on application profiling and FQDN filtering to reduce the attack surface of their applications to designated protocols and destinations.
  • User Identity Firewall – You can create groups based on User ID and define DFW rules to control access to virtual desktops and applications in the SDDC. Per user/ user session access control limits the amount of time and exposure users have to desktops or applications. Integration with Active Directory / LDAP enables the DFW to continuously curate user access to applications. User ID based rules are enforced by the DFW at the source, delivering pervasive, intrinsic security throughout the SDDC.
  • Distributed IDS/IPS – With NSX Distributed IDS/ IPS, customers gain protection against attempts to exploit vulnerabilities in workloads on VMware Cloud on AWS. Distributed IDS/ IPS is an application-aware deep packet inspection engine that can examine and protect traffic inside the SDDC.

Let’s try them to see how it works!

Enabling the NSX Advanced Firewall Add-On

The NSX Advanced Firewall Add-on adds Layer-7 Firewall protection, Identity Firewalling, Distributed IDS/IPS and FQDN Filtering to the VMC on AWS SDDC. This Feature is an Add-on featured and priced in addition to the Standard VMC on AWS subscription.

Before any of these features can be used, you must first enable the add-on onto your SDDC. In the following section, I am going to walk you through the steps of enabling the NSX Advanced Firewall functionality onto your SDDC.

  1. On your SDDC tile, click View Details
  2. Click the Add-Ons tab
  3. In the NSX Advanced Firewall Tile, click Activate

Click Activate

Click OPEN NSX ADVANCED FIREWALL (This will take you to the Networking & Security Tab)

At this step, the NSX Advanced Firewall Addon has been enabled. To make use of the functionality it provides, you must configure them individually.

In the upcoming sections, we will configure and test each of these features.

Configuring L7 Distributed Context Aware Firewall

With L7 (Context-aware) firewall, it’s possible to go beyond simple IP/ port level layer 4 security to complete stateful layer 7 controls and filtering. This will avoid for instance someone from changing Port number to bypass a firewall rule.

Extremely powerful !

Deep packet inspection (DPI) built into the Distributed Firewall enables you to allow only the intended application / protocols to run, while denying all other traffic at the source. This enables you to isolate sensitive applications by creating virtual zones within the SDDC.

Distributed Firewall (DFW) layer 7 policies are enforced at the hypervisor (vNIC) level and can migrate with the VM when they move from host to host in the SDDC, ensuring there are no gaps in enforcement.

Let’s see how to configure and use the feature.

Configuring a standard L4 FW rule

In my example, I have two VMs (webserver01, webserver02) running in my SDDC which are part of a group called Web Tier.

Here are the IPs of the VMS:

They can communicate together over any protocol as this is the default settings in the Distributed Firewall as we can see here:

First let’s create a traditional L4 firewall rule to block SSH traffic between the two VMS.

Now if I want to ssh from webserver01 to webserver02 it’s blocked:

What if SSH was listening on another port, however? What if some nefarious person (knowing SSH on port 22 is being blocked) changed the port the server listens on and attempts to SSH to the server against this new port, what happens then? 

To do that I have edited the sshd_config on the webserver02 VM and changed the port to 2222:

I have then restarted the ssh service on the VM:

We can see the ssh server is now running on port 2222:

Let see what happens when we apply context awareness to the firewall rule.

now if I try to connect back but on port 2222, it works!

Unfortunately, the L4 DFW doesn’t block it. As mentioned earlier the firewall is looking for SSH on port 22, not port 2222, so I was able to bypass the firewall policy.

Configuring Context Aware Firewall rule

NSX Context-Aware Firewall  Rule (L7) enhances visibility at the application level and helps to override the problem of application permeability. Visibility at the application layer helps you to monitor the workloads better from a resource, compliance, and security point of view.

In order to switch to the Context Aware firewall, I just have to remove the SSH in the Service field from the DFW rule and need to add SSH in the Context Profile field:

The rule is now changed:

Let’s try to connect again to port 2222:

Now the attempt to connect to the modified port is block. That’s much better! This is because the DFW now assesses the packet at layer 7 and identifies the heuristics of the packet to be ssh and does not allow the traffic through.

With Context-Aware Firewalling you can enable enforcement of security protocol versions/ciphers reduce attacks by only allowing traffic matching APP Fingerprint, and enforce port-independent rules.

In the next post I will introduce you to the L7 Distributed Firewall with FQDN Filtering. Stay tune!

Introducing Multi-Edge SDDC on VMC on AWS

The latest M12 release of SDDC (version 1.12) came with a lot of interesting storage features including vSAN compression for i3en, TRIM/UNMap (I will cover it in a future post) as well as new networking features like SDDC Groups, VMWare Transit Connect, Time-based Scheduling of DFW rules and many more.

One that typically stands out for me is the Multi-Edge SDDC capabilities.

Multi-Edge SDDC (or Edge Scaleout)

By default, any SDDC is deployed with a single default Edge (actually this a pair of VMs) which size is based on the SDDC sizing (Medium size by default). This edge can be resized to Large when needed.

Each Edge has three logical connections to the outside world: Internet (IGW), Intranet (TGW or DX Private VIF), Provider (Connected VPC). These connections share the same host Elastic Network Adapter ENA and it’s limits.

In the latest M12 version of VMC on AWS, VMC is adding Multi-Edge capability to the SDDC. This gives the customer the ability to add additional capacity for North-South network trafic by simply adding additional Edges.

The goal of this feature is to allow multiple Edge appliances to be deployed, therefore removing some of the scale limitations by:

  • Using multiple host ENAs to spread network load for traffic in/out of the SDDC,
  • Using multiple Edge VMs to spread the CPU/Memory load.
The Edge Scale out feature consists in the creation of a pair of additional EDGEs where specific traffic type can be steered to

In order to be able to enable the feature, additional network interfaces (ENA) are going to be provisioned in the AWS network and additional compute capacity are created.

It’s important to mention that you do need additional hosts in the management clusters of the SDDC to be able to support it. So this feature is coming with an additional cost.

Multi-Edge SDDC – Use Cases

The deployment of additional Edges allow for an higher network bandwidth for the following use cases:

  • SDDC to SDDC connectivity
  • SDDC to natives VPCs
  • SDDC to on-premises via Direct Connect
  • SDDC to the Connected VPC

Keep in mind that for the first three, VMWare Transit Connect is mandatory to allow the increased network capacity by deploying those multiple Edges. As a reminder, Transit Connect is a high-bandwidth, low latency and resilient connectivity option for SDDC to SDDC communication in an SDDC groups. It also enables an high bandwidth connectivity to SDDCs from natives VPCs. If you need more information on it, my colleague Gilles Chekroun has an excellent blog post here.

Multiple Edges permits to steer certain traffic sets by leveraging Traffic Groups.

Traffic Groups

Traffic Groups is a new concept which is similar in a way to the source Based Routing. Source based routing allow to select which route (next hop) to follow based on the source IP addresses. This can be an individual IP or complete subnet.

With this new capability customer can now choose to steer certain traffic sets to a specific Edge.

At the time you will create a traffic group, an additional active edge (with a standby edge) is going to be deployed on a separate host. All Edge appliances are deployed with an anti-affinity rule to ensure only one Edge per host. So there need to be 2N+2 hosts in the cluster (where N=number of traffic groups).

Each additional Edge will then handles traffic for its associated network prefixes. All remaining traffic is going to be handled by the Default Edge.

Source base routing is configured with prefixes defined in prefix lists than can be setup directly in the VMC on AWS Console.

To ensure proper ingress routing from AWS VPC to the right Edge, the shadow VPC route tables are also updated with the prefixes.

Multi-Edge SDDC requirements

The following requirements must be met in order to leverage the feature:

  • SDDC M12 version is required
  • Transit Connect for SDDC to SDDC or VPC or SDDC to on-prem
  • SDDC resized to Large
  • Enough capacity in the management cluster

A Large SDDC means that management appliances and Edge are scaled out from Medium to Large. This is now a customer driven option that doesn’t involve technical Support anymore as it’s possible to upsize an SDDC directly from the Cloud Console.

Large SDDC means an higher number of vCPUs and memory for management components (vCenter, NSX Manager and Edges) and there is a minimal 1 hour downtime for the upscaling operations to finish, so it has to be planned during a Maintenance Window.

Enabling a Multi-Edge SDDC

This follow a three step process.

First of all, we must define a Traffic Group that is going to create the new Edges (in pair). Each Traffic group creates an additional active/standby edge. Remember also the “Traffic Group” Edges are always Large form-factor.

Immediately you will see that 2 additional edge Nodes are going to be deployed. The New Edges have a suffix name with tg in it.

Next step you have to define a prefix list with specific prefixes and associate a Prefix List. It will contain the source IP adresses of SDDC virtual machines that will use the newly deployed Edge.

After some minutes, you can confirm that the Traffic groups is ready:

NB: NSX-T configures source based Routing with the prefix you define in the prefix list on the CGW as well as the Edge routers to ensure symmetric routing within the SDDC.

You just need to click on Set to enter the prefix list. Enter the CIDR range, it could a /32 if you just want to use a single VM as a source IP.

NB: Up to 64 prefixes can be created in a Prefix List.

When you done entering the subnets in the prefix list, Click Apply and Save the Prefix List to create it.

Last step is to associate the Prefix List to the Traffic Group with Association Map. To do so click on Edit.

Basically we now need to tell what prefix list to use for the traffic group. Click on ADD IP PREFIX ASSOCIATION MAP:

Then we need to enter the Prefix List and give a name to the Association Map.

Going forward any traffic that matches that prefix list will be utilising the newly deployed Edge.

Monitoring a Multi-Edge SDDC

Edge nodes cannot be monitored on VMC Console but you can always visualise the Network Rate and consumption through the vCenter web console.

When we look at the vCenter list of Edges, the default Edge has no “-tg” in its name. So basically the NSX-Edge-0 is the default. As long as we add the new traffic group, the traffic is going to manage the additional traffic and liberate the load on this default Edge.

The NSX-Edge-0-tg-xxx is the new one and we can see an increase on this new Edge in the traffic consumption on it now because of the new traffic going to flow over it:

What happened also after the new scale edge is deployed, is that the prefix list is using the new Edge as its next-hop going forward. This is also propagated to the internal route tables from Default Edge as well as CGW route table.

All of these feature is exposed over the API Explorer, for the traffic Groups definition in the NSX AWS VMC integration API. Over the NSX VMC Policy API for the prefix list definition perspective.

In conclusion, remember that Multi Edge SDDC doesn’t increase Internet capacity nor VPN or NAT capacity. Also that there is a cost associated to it because of the additional hardware requirements.