We continually monitor all OCAs that are live and serving Netflix traffic with a series of checks. If one of our checks fails, the failure is flagged as an operational issue alert.

Notifications and visibility of active issues

Operational issues are visible on the Partner Portal homepage and, scoped accordingly, on the details page for each site and OCA. For more information, see: Viewing open operational issues in the Partner Portal

As alerts are triggered, NOC contacts in your organization also receive email notifications. The notification differs slightly depending on the priority of the issue.

Non-Urgent issues

Non-Urgent issues will be summarized in a weekly email report. You will only receive the email report if there are active issues at the time the report is sent. The weekly email report includes a brief summary of the active issues in each site and links to the Partner Portal for more details.

The sender for the weekly email report is: info@partner.netflix.com. To make sure that you receive the report, allow and monitor correspondence from this email address.

Important: We do not monitor the above address or the reply-to address. If you need help with an issue, go to the issue in the Partner Portal and open a ticket.

Urgent (service-impacting) issues

These issues, marked "Urgent" in the check headings below, generate an individual ticket by default, so you are notified immediately when they are flagged. Active urgent issues are also included in the weekly email report. You can request assistance for urgent issues via the ticket thread.

The sender for our ticketing system is: support@openconnect.zendesk.com. To make sure that you receive ticket notifications, allow and monitor correspondence from this email address.

Important: If you need help with an open ticket, respond to the email notification thread or add a comment via the Zendesk user interface, which is linked from the Partner Portal Tickets page.

Live Checks

The remainder of this article provides detailed information about each Live check:

OCA Operational Checks

Several checks verify that the OCA is operating properly, that the system temperature is within range, and that the network interfaces and power supply units (PSUs) that were included with the OCA are properly installed, are functional, and are not flapping or reporting errors.

For general instructions and troubleshooting information for network interface errors, see the following related articles:

Troubleshooting notes for network interface errors

  • OCAs are shipped to you with the expected number of supported optics connectors and cables. You should only use the parts that were included in the shipment, and ensure that you install and cable all included optics.
  • Only single-mode or multi-mode fiber connections are supported. Direct Attach cabling (also known as DAC, TwinAx, CR, Cu, or copper) is incompatible.

  • If you have replaced the optical modules due to incorrect part types being requested or delivered, please open a ticket to request that new optics be shipped to you.

Lost Interface

This check will trigger an alert if a network interface that was previously connected is detected to be disconnected. This situation can negatively impact the overall capacity in your embedded site and cause traffic to be served outside of your embedded site.

Input Errors

This check verifies that there are no input errors on the network interfaces.

Troubleshooting Notes:

Network Flaps

This check verifies that the network interfaces are not flapping.

Troubleshooting Notes:

Unsupported SFP

This check verifies that only supported optics are being used. Unsupported copper or AOC transceivers (SFPs) prevent our monitoring infrastructure from working correctly and can cause streaming problems.

You should only install the optic modules that were included with your OCA, or an equivalent replacement.

If you need replacement optics, you can open a ticket to request them.

Unbalanced Network Interfaces

This check verifies that outbound traffic is being balanced across all active interfaces. Normally, an OCA will automatically balance outbound traffic across all active interfaces, especially during peak traffic.

Most often, this problem is caused by dirty or faulty optics which cause problems with traffic throughput. To troubleshoot this issue, start by cleaning the fibers and verifying the path between the OCA(s) and your router.

If the problem persists, open a ticket and provide a network diagram and/or configuration of how the OCA(s) are connected to your network, including vendor, model, and configuration of network gear in use. If you have more than 1 OCA, and only 1 OCA is experiencing this problem, please note in the ticket anything that may be different or unique in the configuration or connectivity for the impacted OCA that might help determine the source of the problem.
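Before opening a ticket, it can help to quantify the imbalance. The following is a minimal Python sketch, not part of any Netflix tooling. It assumes you can read per-interface outbound rates yourself (for example, from the router ports facing the OCA). The interface names and the 25% tolerance are illustrative assumptions, not the thresholds this check uses.

```python
def unbalanced_interfaces(tx_bps, tolerance=0.25):
    """Return the interfaces whose share of outbound traffic deviates from an
    even split by more than `tolerance` (relative to the even share).

    tx_bps maps an interface name to its outbound rate in bits per second.
    The default tolerance is an assumption chosen for illustration only.
    """
    total = sum(tx_bps.values())
    if total == 0:
        return []  # no traffic (or no interfaces), nothing to compare
    even_share = 1.0 / len(tx_bps)
    flagged = []
    for name, rate in tx_bps.items():
        share = rate / total
        if abs(share - even_share) > tolerance * even_share:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical sample: four active interfaces, one carrying far less traffic.
sample = {"ix0": 9.0e9, "ix1": 9.2e9, "ix2": 3.1e9, "ix3": 8.9e9}
print(unbalanced_interfaces(sample))
```

With these sample rates, only `ix2` is reported: it carries roughly 10% of the traffic where an even split would be 25%. An interface that stands out this way is a reasonable place to start cleaning fibers and checking optics.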

Power Supply Problem

This check verifies that the installed power supply units (PSUs) that were included with the OCA are functioning and reporting their status appropriately to our monitoring systems.

Troubleshooting Notes:

System Temperature

This check will trigger an alert if the OCA consistently reports its system temperature above acceptable thresholds.

Troubleshooting Notes:

 


Connectivity and Reachability Checks

The following series of checks verifies that the OCA has the required connectivity to serve Netflix traffic and communicate with our control plane services in AWS, and that it can be reached for monitoring and administrative tasks. When you are troubleshooting connectivity issues:

  • Ensure that all required ports are open as described in the Network configuration section of the Deployment Guide.
  • Ensure there are no firewalls, proxies, anti-DDoS protection, or third party caching infrastructure installed that may be interfering with connectivity.

  • For additional troubleshooting information, see Viewing connectivity metrics for an OCA

Urgent - Appliance Unavailable

This check triggers an alert if the OCA unexpectedly becomes disconnected, becomes unavailable to serve traffic, or is suddenly powered down.

Troubleshooting Notes:

AWS-to-OCA Connectivity (Unreachable or blocked on necessary TCP ports)

This check verifies connectivity from Netflix control plane services in AWS to the OCA on one or more of the following protocols: SSH, HTTP, SSL, or ICMP. In the Partner Portal, you can see which ports are blocked.

We rely on this access for certain administrative tasks and for monitoring. Please verify that no network ACLs or issues are preventing access to the OCA.

You can see the current status of this check on the Metrics > Connectivity > Inbound chart for the OCA in the Partner Portal.

For more information, see this article: Viewing connectivity metrics for an OCA

Note: You can also see the current status of outbound connectivity on the Metrics > Connectivity > Outbound chart for the OCA in the Partner Portal.

IPMI Unresponsive

This check verifies that the OCA is able to communicate with the Netflix endpoint that performs IPMI bundle updates.

Troubleshooting Notes:

  • Verify that there are no firewalls, filters or ACLs which are preventing access to https://hw-oca.oc.netflix.com.

  • Follow the general connectivity troubleshooting information above.
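To rule out a filter on the path, you can test whether a TCP connection to the endpoint on port 443 succeeds from a host on the same network segment as the OCA. The sketch below is a minimal Python example under that assumption. It only shows what the test host can reach, not what the OCA itself sees, and the `connector` parameter is a hypothetical hook added here so the function can be exercised without network access.

```python
import socket


def can_connect(host, port=443, timeout=5.0, connector=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    DNS failures, refusals, and timeouts all raise OSError subclasses,
    so they are reported as False.
    """
    try:
        conn = connector((host, port), timeout)
    except OSError:
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Hostname taken from the troubleshooting note above.
    print(can_connect("hw-oca.oc.netflix.com"))
```

A result of `False` from a host that shares the OCA's upstream path suggests that a firewall, filter, or ACL is worth checking. A result of `True` does not prove that the OCA's own address is permitted.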

NTP Synchronization (Cannot Reach NTP Servers)

This check verifies that the OCA is able to properly synchronize its clock using Network Time Protocol (NTP).

Troubleshooting Notes:

  • Ensure that UDP port 123 (and all other required ports as described above) is open.
  • Follow the general connectivity troubleshooting information above.
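One way to confirm that UDP port 123 is open in both directions is to send a single SNTP request from a host on the OCA's network segment and see whether a reply comes back. The following Python sketch does that under stated assumptions: the server name is a placeholder you supply (this article does not list the NTP servers the OCA uses), and a reply to the test host does not guarantee that the OCA's own traffic is allowed.

```python
import socket
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_sntp_request():
    """Build a 48-byte SNTP client request (LI=0, version=3, mode=3)."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet):
    """Return the server's transmit timestamp as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("NTP packet is shorter than 48 bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_OFFSET + fraction / 2**32


def query_ntp(server, timeout=5.0):
    """Send one request to UDP port 123; return server time, or None if no reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_sntp_request(), (server, 123))
            packet, _ = sock.recvfrom(512)
        except OSError:
            return None  # timeout, DNS failure, or ICMP unreachable
    return parse_transmit_time(packet)
```

Calling `query_ntp("ntp.example.net")` (a hypothetical server name) returns a timestamp when the request and the reply both pass, and `None` when either direction is blocked or the server does not answer.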

 


BGP and Traffic Checks

The following checks validate that the OCA has an established BGP session and is properly learning and reporting routes. BGP must be configured properly so that the OCA can serve Netflix traffic to client devices within your network and receive fill traffic.

For general information about BGP configurations and advertisements, see the following information:

Urgent - (IPv4 or IPv6) BGP Route limit exceeded

This check alerts when an OCA automatically drops its IPv4 or IPv6 BGP session due to receiving too many prefix announcements. In this state, the OCA cannot serve traffic.

Troubleshooting Notes:

  • You can see the route limits for each OCA on the BGP Session Configuration tab. For more information, see Managing BGP sessions.
  • If you need the route limit increased, please respond to the ticket associated with this issue and let us know what the new route limits should be.
  • If excessive routes were advertised unintentionally, correct the error.

Urgent - (IPv4 or IPv6) BGP Session Down or Not Receiving Prefixes

These checks validate that IPv4 / IPv6 BGP sessions between the OCA and its BGP peer are properly configured and established, and verify that the OCA is learning advertised routes via its BGP session(s). Unlike the more specific issue above, this issue is not caused by an exceeded route limit.

Troubleshooting Notes:

  • You can view network and BGP configurations in the Open Connect Partner Portal.
  • You can view the routes that are being heard by each OCA using the Route Explorer tool.
  • If you need to make changes to the network configuration or BGP session configuration for an OCA after delivery and installation, follow these steps.

Urgent - Unregistered BGP Routes (fka BGP Route Leaks)

This check looks for BGP routes that appear to be advertised beyond their intended scope. These routes will be filtered from our steering algorithms.

Troubleshooting Notes:

For more information, see Documenting Network Relationships with an AS-SET

Urgent - Unroutable BGP Prefixes

This check looks for BGP prefixes that are advertised to your embedded OCAs that are unroutable outside of your network. These prefixes are only advertised to your embedded OCAs and there is no known alternate source.

We always aim to stream as much content as possible from your embedded OCAs. However, there will be cases where clients request a less popular title that is not available on your embedded OCAs. Any content that your OCAs are unable to serve must be accessible from outside of your network. If there is no known alternative route to the Netflix content that cannot be served from your embedded OCAs, customers will experience streaming failures and errors.

Troubleshooting Notes:

  • Run the Discover Embedded Routes with no Alternate Source report in the Partner Portal Route Optimizer to see a current list of unroutable prefixes.
  • To address the issue, ensure that you are advertising all of the BGP prefix(es) listed in the report to an alternative source, either via peering with Netflix or via your transit provider.
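If you keep your own lists of advertised prefixes, the comparison behind this issue can be approximated offline. The Python sketch below is an illustration only, not the logic Netflix uses. It assumes two hypothetical inputs, the prefixes you advertise to your embedded OCAs and the prefixes you advertise via peering or transit, and it treats a prefix as covered when an identical or less specific prefix appears in the second list.

```python
import ipaddress


def prefixes_without_alternate(embedded, alternate):
    """Return embedded-only prefixes that no alternate advertisement covers.

    embedded:  prefixes advertised to embedded OCAs (strings).
    alternate: prefixes advertised via peering or transit (strings).
    """
    alt_nets = [ipaddress.ip_network(p) for p in alternate]
    missing = []
    for prefix in embedded:
        net = ipaddress.ip_network(prefix)
        # subnet_of() requires matching IP versions, so compare versions first.
        covered = any(
            net.version == alt.version and net.subnet_of(alt) for alt in alt_nets
        )
        if not covered:
            missing.append(str(net))
    return missing


# Hypothetical example using documentation address ranges.
embedded = ["198.51.100.0/24", "203.0.113.0/25", "2001:db8:1::/48"]
alternate = ["198.51.100.0/24", "203.0.113.0/24"]
print(prefixes_without_alternate(embedded, alternate))
```

In this example only `2001:db8:1::/48` is reported, because the two IPv4 prefixes are covered by the alternate advertisements. The report in the Partner Portal remains the authoritative list.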

Inconsistent BGP Advertisements

This check verifies that all OCAs within the site are learning the same number of routes. All OCAs in the same site must serve the same customers and receive identical BGP prefix advertisements. If the advertisements are not consistent within a site, traffic may shed to another less optimal site.

Troubleshooting Notes:

  • You can view the routes that are being heard by each OCA and look for route inconsistencies within a site using the Route Explorer tool.
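If you export the routes heard by each OCA, for example by copying them out of the Route Explorer tool, a short script can list which prefixes are missing from which OCA. This is a minimal Python sketch. The OCA names and prefixes are hypothetical, and it compares the prefix sets themselves, which is stricter than comparing route counts.

```python
def advertisement_gaps(routes_by_oca):
    """Map each OCA to the prefixes that other OCAs in the site hear but it does not.

    routes_by_oca maps an OCA name to an iterable of prefix strings.
    OCAs that hear every prefix are left out of the result.
    """
    all_prefixes = set().union(*routes_by_oca.values())
    gaps = {}
    for oca, prefixes in routes_by_oca.items():
        missing = all_prefixes - set(prefixes)
        if missing:
            gaps[oca] = sorted(missing)
    return gaps


# Hypothetical site with two OCAs.
site = {
    "oca-1": ["198.51.100.0/24", "203.0.113.0/24"],
    "oca-2": ["198.51.100.0/24"],
}
print(advertisement_gaps(site))
```

Here `oca-2` is reported as missing `203.0.113.0/24`, which points to the BGP session or router policy to review for that OCA.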

Potential Proxy Server in Front of OCA

This check verifies that the OCA is communicating with the control plane from its assigned IP address. Each OCA must be assigned a publicly routable IPv4 address.

Troubleshooting Notes:

  • Ensure that the OCA is able to reach the internet without its traffic passing through any intermediate devices such as proxy servers, NAT or third party caching solutions.
  • For more information, see the Router Interface Configuration section of the Network configuration article.

Tier Fill Issues

This check runs if you have more than one site containing embedded OCAs in your network. The check verifies that the IP addresses for the OCAs in each embedded site are advertised to the OCAs in other sites so that they can fill from each other. This ensures that your OCAs are not forced to obtain some of their fill content from sources outside of your network (via PNI or transit) unnecessarily.

In general, it is more efficient to enable tier fill between embedded sites. However, depending on your particular network you may prefer to fill from peering sites. If you would like to opt out of the tier fill check, you can open a ticket to request a configuration change that will mute this alert.

Troubleshooting Notes:

  • You can run the Tier Fill report in the Route Explorer tool to generate a current list of tier fill issues.
  • For more information on fill patterns, see this article: Fill patterns
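The condition this check describes can be sketched as a coverage test: every OCA address in one site should fall inside a prefix heard by the OCAs in each other site. The Python example below is an illustration under assumptions, not the implementation of the Tier Fill report. The site names, addresses, and input structure are hypothetical.

```python
import ipaddress


def tier_fill_gaps(oca_ips_by_site, prefixes_heard_by_site):
    """Return (source_site, oca_ip, other_site) tuples where other_site's OCAs
    hear no prefix that contains the OCA address from source_site.

    oca_ips_by_site:        site name -> list of OCA IP address strings.
    prefixes_heard_by_site: site name -> list of prefixes heard by that site's OCAs.
    """
    gaps = []
    for source_site, ips in oca_ips_by_site.items():
        for other_site, prefixes in prefixes_heard_by_site.items():
            if other_site == source_site:
                continue
            nets = [ipaddress.ip_network(p) for p in prefixes]
            for ip in ips:
                addr = ipaddress.ip_address(ip)
                if not any(addr in net for net in nets):
                    gaps.append((source_site, ip, other_site))
    return gaps


# Hypothetical two-site network using documentation address ranges.
oca_ips = {"site-a": ["198.51.100.10"], "site-b": ["203.0.113.10"]}
heard = {"site-a": ["203.0.113.0/24"], "site-b": ["192.0.2.0/24"]}
print(tier_fill_gaps(oca_ips, heard))
```

In this example, `site-b` does not hear a prefix covering the `site-a` OCA at `198.51.100.10`, so `site-b` cannot fill from `site-a`. The reverse direction is fine because `site-a` hears `203.0.113.0/24`.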

 

 
