This section describes how Open Connect Appliances are typically configured in a network. If you are an Open Connect ISP partner, Netflix works closely with you to determine the optimal configuration for your particular needs.
For more information, see the FAQs.
- Router interface configuration
- Reconfiguring the IP address of an OCA
- Routing and content steering via BGP advertisements
- Clustering architectures
- Flash-based appliances
OCAs are directed cache appliances, meaning that the manner in which traffic is directed to the appliance is determined explicitly by you and by Netflix, not by the appliance itself.
An OCA only serves clients at IP addresses that you advertise to the OCA via a BGP session. In other words, traffic is only delivered from your embedded OCAs to the customer prefixes that you explicitly announce to them, as described in the following sections. Therefore, you as the ISP partner have full control over the networks that the appliances will serve. BGP sessions are established between appliance(s) and the closest connected router.
If content is requested that is not contained on an embedded OCA, the client request is directed to the closest Netflix content site via peering (if present) or via transit.
Reconfiguring the IP address of an OCA
Each appliance comes fully configured based on the IP address details that you provided to Netflix in your site survey before it was shipped.
For step-by-step instructions on how to change the IP address of an OCA, see the following article: Updating the IP address of an OCA.
Router interface configuration
When you are connecting the appliances to your router, follow the guidelines in this section. See also Example router configurations.
- Each OCA must be assigned one publicly routable IPv4 address, and it is highly recommended to also assign one IPv6 address.
- You can assign an address to the appliance from an IPv4 subnet of /31 or larger, or an IPv6 subnet of /127 or larger (that is, a prefix length of 31 or shorter for IPv4, or 127 or shorter for IPv6).
- It is acceptable to assign an address to the appliance from a larger subnet (for example, a /24). However, because only one IPv4 address is required per appliance, a smaller subnet is typically used.
- The router interfaces must be configured for Link Aggregation Group (LAG) with LACP. Each server must be configured in its own LAG bundle for the active interfaces on each server, as illustrated in the diagram below. In the two-server example configuration, there are two separate LAG bundles, one for each server.
- A standard maximum transmission unit (MTU) must be configured on each router interface. Jumbo frames are not supported.
- If there are multiple routers available that can provide redundancy in a site, it is recommended to stagger appliances between routers. Appliances on the same router should be in the same subnet to optimize filling. Appliances on separate routers should be in separate subnets. Appliances are not designed to be connected to two separate routers.
- All ports on an OCA should be connected to the same router or switch. Using multi-chassis LAG or switch stacking should be avoided.
- Each OCA is hardened against network attack and is designed to be directly connected to the internet. Filtering inbound or outbound traffic can cause operational issues, so we strongly recommend that you allow all traffic on all ports, do not use ACLs, and ensure that your router has a default route or full routing table. If you absolutely must filter, the current list of inbound and outbound usage follows. Please note that these can change at any time without prior notification.
- Traffic from OCA: Allow all destination addresses and ports.
- Traffic to OCA: Allow TCP 22, 53, 80, 179, 443, UDP 53 and 123 (source and destination), ICMP types 0, 3, 8, 11, and all ICMPv6 from any public IP/port. Allow all return traffic from any appliance-initiated connection (TCP established).
- Note: You can confirm the status of required inbound/outbound OCA connectivity in the Partner Portal.
- Each network interface must be receiving between 0 dBm and -10 dBm of light to ensure good data throughput. The LCD panel on the front of the appliance displays the current light levels for each interface. If your appliance does not have an LCD panel, access the console with a keyboard and mouse.
- For more information see the following related articles:
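The addressing and filtering guidelines above can be sketched as a small pre-flight check. This is only an illustration: the function names are hypothetical, the port and ICMP lists are copied from the text above and can change without notice, and the sketch does not model return traffic for appliance-initiated connections or the source/destination distinction for UDP.

```python
import ipaddress
from typing import Optional

# Documented subnet sizes: IPv4 /31 or larger, IPv6 /127 or larger.
# A "larger" subnet has a shorter (numerically smaller) prefix length.
MAX_PREFIX_LEN = {4: 31, 6: 127}

# Inbound allow-list copied from the guidelines above (subject to change).
ALLOWED_TCP_PORTS = {22, 53, 80, 179, 443}
ALLOWED_UDP_PORTS = {53, 123}
ALLOWED_ICMP_TYPES = {0, 3, 8, 11}


def subnet_is_acceptable(interface: str) -> bool:
    """Return True if an address such as '192.0.2.0/31' sits in an allowed subnet size."""
    iface = ipaddress.ip_interface(interface)
    return iface.network.prefixlen <= MAX_PREFIX_LEN[iface.version]


def inbound_is_allowed(protocol: str, number: Optional[int] = None) -> bool:
    """Check a protocol plus port (TCP/UDP) or type (ICMP) against the allow-list."""
    proto = protocol.lower()
    if proto == "icmpv6":
        return True  # all ICMPv6 is listed as allowed
    if proto == "tcp":
        return number in ALLOWED_TCP_PORTS
    if proto == "udp":
        return number in ALLOWED_UDP_PORTS
    if proto == "icmp":
        return number in ALLOWED_ICMP_TYPES
    return False


def light_level_ok(dbm: float) -> bool:
    """Return True if received light is between 0 dBm and -10 dBm."""
    return -10.0 <= dbm <= 0.0


# Documentation-range addresses are used here purely as placeholders.
print(subnet_is_acceptable("192.0.2.0/31"))
print(inbound_is_allowed("tcp", 179))
print(light_level_ok(-3.2))
```

A check like this is useful only as a sanity test before cabling and turn-up. The recommended configuration remains to allow all traffic and to avoid ACLs entirely.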
Routing and content steering via BGP advertisements
We steer clients to our OCAs based on an ISP’s BGP advertisements, coupled with the routing and steering algorithms in the Open Connect control plane. ISP partners can control some aspects of content steering via the BGP routes that are announced via peering or to each embedded OCA.
The control plane steers clients to the best available OCAs using a modified version of BGP best path selection. Assuming that the appliance has the requested title and available serving capacity, the control plane provides clients with a ranked list of appliances (typically 3 or more reliable sources) to stream from.
Appliance selection criteria
The following appliance selection criteria are considered, in order, by the Open Connect control plane services. If there is a tie for a given criterion, then the next criterion is considered. If there is a tie on all criteria, traffic is balanced between appliances.
- The appliance that receives the most-specific route to the client’s prefix.
- The appliance that receives the route to the client’s netblock with the shortest AS path. (See the notes on peering below.)
- The appliance that receives the route to the client’s netblock with the lowest multi-exit discriminator (MED). (See the notes on MEDs below.)
- The geographically closest appliance. We geolocate based on client IPs, whose location is then compared to the latitude and longitude of nearby OCAs to determine the closest available system.
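The ordered criteria above behave like a lexicographic sort, which can be sketched as follows. This is a simplified model, not the control plane's actual implementation (which is described only as a modified version of BGP best path selection). The `Candidate` type, its field names, and the sample values are hypothetical, and each candidate's prefix is assumed to already cover the client's address.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical appliance label
    prefix: str         # route covering the client that this appliance received
    as_path_len: int    # AS path length of that route
    med: int            # MED of that route
    distance_km: float  # distance from the geolocated client


def selection_key(c: Candidate) -> Tuple[int, int, int, float]:
    """Lower tuples win: longest prefix, then AS path, then MED, then distance."""
    prefix_len = ipaddress.ip_network(c.prefix).prefixlen
    return (-prefix_len, c.as_path_len, c.med, c.distance_km)


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from most to least preferred."""
    return sorted(candidates, key=selection_key)


def best_group(candidates: List[Candidate]) -> List[Candidate]:
    """Return every candidate tied on all criteria; traffic is balanced across them."""
    ranked = rank(candidates)
    top = selection_key(ranked[0])
    return [c for c in ranked if selection_key(c) == top]


candidates = [
    Candidate("oca-a", "198.51.100.0/22", 1, 0, 10.0),
    Candidate("oca-b", "198.51.100.0/24", 1, 0, 250.0),
    Candidate("peer-x", "198.51.100.0/24", 2, 50, 40.0),
]
print([c.name for c in rank(candidates)])
```

In this sample, the appliance that received the /24 ranks first even though it is farther away, because prefix specificity is evaluated before distance.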
Additional notes on MEDs
- We honor the MED values that we receive. However, we increase the value as follows depending on where we learn the prefix:
- +0 for an embedded OCA (Netflix Cache server)
- +50 for a direct peering connection (PNI)
- +100 for peering at an IX (public peering)
- There is no cap on the maximum MED value.
- A missing MED is treated the same as a MED of 0, and indicates that the appliance should receive all servable traffic for the associated prefixes (also often referred to as MED-missing-as-best). Remember, if multiple appliances receive the same prefix with the same metric, traffic is load-balanced across those appliances. Because a missing MED will be equivalent to 0, it is preferred over any >0 MED on other appliances.
- Important: Marking MEDs on already installed and working Open Connect Appliances can be hazardous, because it must be done on all BGP sessions for all appliances at the same time.
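The MED handling described above can be sketched in a few lines. The function name and the source labels (`embedded`, `pni`, `ix`) are hypothetical; only the offsets and the missing-as-zero rule come from the notes above.

```python
from typing import Optional

# Offsets added to the received MED, depending on where the prefix is learned.
MED_OFFSET = {
    "embedded": 0,   # embedded OCA
    "pni": 50,       # direct peering connection
    "ix": 100,       # public peering at an IX
}


def effective_med(received_med: Optional[int], learned_from: str) -> int:
    """Treat a missing MED as 0, then add the offset for the learning source."""
    base = 0 if received_med is None else received_med
    return base + MED_OFFSET[learned_from]


# Hypothetical case: MEDs were marked on one appliance but not on the other.
marked = effective_med(10, "embedded")
unmarked = effective_med(None, "embedded")
print(marked, unmarked)
```

The sample illustrates why partial changes are hazardous: the appliance that still has no MED compares as 0 and is preferred over the appliance marked with 10, so it would attract all servable traffic for the shared prefixes until every session is updated.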
Finally, a reminder that if you are peering with Netflix and do IRR filtering, our prefix set is RS-NETFLIX and our as-set is AS-NFLX. Please be sure to accept advertisements that originate from ASNs 2906, 40027, and 55095.
See also: Peering Locations.
- Route announcements for Open Connect embedded appliances:
- IPv4 prefixes between /8 and /31 (inclusive) are accepted.
- IPv6 prefixes between /19 and /64 (inclusive) are accepted.
- Route announcements for Open Connect peering sessions:
- IPv4 prefixes between /8 and /24 (inclusive) are accepted.
- IPv6 prefixes between /19 and /48 (inclusive) are accepted.
- As an implicit requirement, all appliances must have a BGP session configured in order to correctly participate in Netflix content steering and delivery.
- To localize traffic, the best practice is to advertise the most specific routes to the appliance. For example, if you are announcing a /22 to the OCA, but a /24 is received from the same block over settlement-free interconnection (SFI) peering or transit, the /24 will be preferred, delivering content traffic from the remote source instead of the OCA.
- If you are deploying only one OCA in your network, you should advertise the most specific (longest) prefix for that OCA over the peering session that you want the OCA to use for nightly filling purposes.
- If you are deploying multiple OCAs across more than one site in your network:
- To enable efficient nightly fill: ensure that the appliances within one site can hear the subnets for the appliances in the other site via the BGP connection that is established with your router. See the Fill and updates information for more details.
- See the additional information about clustering architectures.
- Netflix does not use any BGP community information that is advertised by partners to OCAs or via Open Connect peering.
- Advertised routes that are received by an OCA are synchronized with Open Connect control plane services approximately every five minutes.
- If you are planning to serve client traffic from your embedded OCAs to prefixes within an ASN that is outside of your network, ensure that you are Documenting Network Relationships with an AS-SET.
- Netflix uses RPKI-based route filtering. For more information, see: RPKI-based route filtering.
- Netflix has joined the MANRS initiative. For more information, see: Mutually Agreed Norms for Routing Security (MANRS)
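The accepted prefix lengths and the most-specific-route behavior described in the list above can be sketched as follows. The function names, session labels, and sample prefixes (taken from documentation address ranges) are hypothetical, and the second function models only prefix specificity, not the remaining selection criteria.

```python
import ipaddress
from typing import Dict, Optional

# Accepted prefix-length ranges (inclusive), keyed by session type and IP version.
ACCEPTED_LENGTHS = {
    ("embedded", 4): (8, 31),
    ("embedded", 6): (19, 64),
    ("peering", 4): (8, 24),
    ("peering", 6): (19, 48),
}


def is_accepted(prefix: str, session: str) -> bool:
    """Return True if the prefix length is within the range for this session type."""
    net = ipaddress.ip_network(prefix)
    low, high = ACCEPTED_LENGTHS[(session, net.version)]
    return low <= net.prefixlen <= high


def most_specific(client_ip: str, announcements: Dict[str, str]) -> Optional[str]:
    """Return the source whose announced prefix covers the client most specifically."""
    addr = ipaddress.ip_address(client_ip)
    covering = {}
    for source, prefix in announcements.items():
        net = ipaddress.ip_network(prefix)
        if addr in net:
            covering[source] = net.prefixlen
    if not covering:
        return None
    return max(covering, key=lambda source: covering[source])


# A /22 announced to the OCA, with a /24 from the same block heard over peering.
announcements = {"oca": "198.51.100.0/22", "peering": "198.51.101.0/24"}
print(most_specific("198.51.101.7", announcements))
print(most_specific("198.51.100.7", announcements))
```

Clients inside the /24 are served from the peering source because that route is more specific, while clients elsewhere in the /22 stay on the OCA. Announcing the /24 to the OCA as well would remove that difference in specificity.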
Troubleshooting BGP Advertisements
There are a few tools in the Partner Portal that you can use to explore and troubleshoot BGP announcements:
- Use the Route Explorer to monitor the state of the BGP sessions and announcements that you have configured between your routers and the Netflix Open Connect Appliances that are embedded in your network.
- Use the Route Optimizer to run reports on your BGP route announcements. The Route Optimizer reports include all announcements that Netflix hears from your ASN(s), including at peering and embedded sites.
- View the Route Performance Report to discover prefixes in your network that are potentially experiencing relatively poor video streaming quality.
Embedded OCAs combined with peering sessions
The ideal Open Connect implementation is a mixture of both SFI peering and deployed embedded OCAs. Netflix uses two separate autonomous systems for peering:
- AS2906 is the AS number that Netflix uses for peering at its PoPs
- AS40027 is the AS number that embedded OCAs use to peer with ISP networks
See BGP requirements for prefix announcements that are accepted on peering sessions.
When OCAs and Open Connect SFI peering are combined, peering is used primarily for backup, for filling, and for serving long-tail titles.
Assuming that the AS PATH LENGTH to an embedded OCA and peering are equal, if you announce the same prefix both to a private or public peering session (using AS2906) and to an OCA (using AS40027), the OCA will be preferred over peering. This is because the Open Connect control plane will have two BGP entries for that prefix:
- one with an AS PATH LENGTH of 1 (<AS_NUMBER>) from the appliance itself
- one with an AS PATH LENGTH of 2 (2906 <AS_NUMBER>) from the peering location
Keep in mind the appliance selection criteria above and remember that the general best practice is to announce more specific routes to embedded appliances so that they are preferred for serving traffic.
Clustering architectures
Whenever more than one OCA is deployed in a site, they are configured by the Open Connect operations team as a single manifest cluster. OCAs in a manifest cluster share content storage and function together as one logical server/storage unit.
Although partners do not need to configure manifest clusters, it is important to understand some basic clustering concepts. In particular, there are implications to consider when OCAs in a cluster are taken down for maintenance or moved to a different site.
Clustering has the following potential benefits:
- Greater offload for unique content
In a typical two-OCA cluster, both appliances will use approximately 40% of their storage for the same popular content. This popular content typically represents roughly 60% of the OCA’s total offload. The remaining 60% of storage space on each OCA is used to store a unique set of less-frequently-accessed content. Because we do not store the same exact set of content on each single OCA in a cluster, a cluster of OCAs provides greater total offload than an unclustered group of OCAs. This strategy helps the OCAs in a site function more efficiently.
For more information about storage strategies, see this tech blog post.
- Better resiliency
Redundancy is generally acceptable in a two-OCA cluster. In the event of a single OCA failure, the healthy appliance will take over the majority of the traffic that the failed unit was serving. See the failover scenarios in the sample architectures.
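The storage split described under "Greater offload for unique content" can be worked through as simple arithmetic. This is a rough sketch: the function name is hypothetical, the 40% figure is the approximate value quoted above for a typical two-OCA cluster, and applying the same split to other cluster sizes is an assumption made only for illustration.

```python
def distinct_capacity_pct(n_ocas: int, shared_pct: int = 40) -> int:
    """Distinct content held by a cluster, as a percentage of ONE appliance's storage.

    Every appliance holds the same popular set (shared_pct of its storage),
    and the remainder of each appliance holds content unique to it.
    """
    unique_pct = 100 - shared_pct
    return shared_pct + n_ocas * unique_pct


clustered = distinct_capacity_pct(2)  # two OCAs sharing a popular 40%
unclustered = 100                     # two OCAs holding identical content
print(clustered, unclustered)
```

Under these assumptions, a two-OCA cluster holds 40 + 2 × 60 = 160% of one appliance's storage in distinct content, compared with 100% if both appliances stored the same set. That difference is the source of the greater total offload.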
Notes for partners:
- After a set of OCAs has been installed in a site and grouped together as a cluster by the Open Connect team, they should be thought of as one big server. Therefore, any change you make to a single OCA in a site has the potential to negatively impact the serving efficiency and behavior of the group.
- If you need to make changes to the OCAs in an established site (for example, if you intend to relocate an OCA from one site to another or disable one or more OCAs for a significant period of time), it is important to notify the Open Connect team so that they can make the necessary changes to the cluster configuration. Failing to do so can cause undesired consequences. For example, you may see traffic being steered to the wrong site, fill patterns may become suboptimal, and hot spots might develop.
- To enable optimal and balanced traffic patterns, OCAs in a site must receive the exact same BGP route advertisements. Therefore, if you relocate an OCA you must revisit your BGP route announcements to ensure that traffic continues to be steered appropriately.
Flash-based appliances
If you are an ISP with very large amounts of Netflix traffic, we will likely include flash-based appliances in your OCA deployment architecture. Flash-based appliances are flash memory-based servers that are deployed when you reach a threshold number of OCAs, to augment the delivery capability of the main (storage) appliances.