ISP Design – Building production MPLS networks with IP Infusion’s OcNOS.

Moving away from incumbent network vendors


1466540435IpInfusion interivew questions


One of the challenges service providers have faced in the last decade is lowering the cost per port or per MB while maintaining the same level of availability and service level.

And then add to that the constant pressure from subscribers to increase capacity and meet the rising demand for realtime content.

This can be an especially daunting task when routers with the feature sets ISPs need cost an absolute fortune – especially as new port speeds are released.

Whitebox, also called disaggregated networking, has started changing the rules of the game. ISPs are working to figure out how to integrate and move to production on disaggregated models to lower the cost of investing in higher speeds and feeds.

Whitebox often faces the perception problem of being more difficult to implement than traditional vendors – which is exactly why I wanted to highlight some of the work we’ve been doing at integrating whitebox into production ISP networks using IP Infusion’s OcNOS.

Things are really starting to heat up in the disaggregagted network space after the announcement by Amazon a few days ago that it intends to build and sell whitebox switches.

As I write this, I’m headed to Networking Field Day 18 where IP Infusion will be presenting and I expect whitebox will again be a hot topic.

This will be the second time IPI has presented at Networking Field Day but the first time that I’ve had a chance to see them present firsthand.

It’s especially exciting for me as I work on implementing IPI on a regular basis and integrating OcNOS into client networks.


What is OcNOS?


IP Infusion has been making network operating systems (NOS) for more than 20 years under the banner of its whitelabel NOS – ZebOS.

As disaggregated networking started to become popular, IPI created OcNOS which is an ONIE compatible NOS using elements and experience from 20 years of software development with ZebOS.

There is a great overview of OcNOS from Networking Field Day 15 here:


What does a production OcNOS based MPLS network look like?

Here is an overview of the EVE-NG lab we built based on an actual implementation.



Use case – Building an MPLS core to deliver L2 overlay services

Although certainly not a new use case or implementation, MPLS and VPLS are very expensive to deploy using major vendors and are still a fundamental requirement for most ISPs.

This is where IPI really shines as they have feature sets like MPLS FRR, TE and the newer Segment Routing for OSPF and IS-IS that can be used in a platform that is significantly cheaper than incumbent network vendors.

The cost difference is so large that often ISPs are able to buy switches with a higher overall port speeds than they could from a major vendor. This in turn creates a significant competitive advantage as ISPs can take the same budget (or less) and roll out 100 gig instead of 10 gig – as an example

Unlike enterprise networks, cost is more consistently a significant driver when selecting network equipment for ISPs. This is especially true for startup ISPs that may be limited in the amount of capital that can be spent in a service area to keep ROI numbers relatively sane for investors.

Lab Overview

In the lab (and production) network we have above, OcNOS is deployed as the MPLS core at each data center and MikroTik routers are being used as MPLS PE routers.

VPLS is being run from one DC to the other and delivered via the PE routers to the end hosts.

Because the port density on whitebox switches is so high compared to a traditional aggregation router, we could even use LACP channels if dark fiber was available to increase the transport bandwidth between the DCs without a significant monetary impact on the cost of the deployment.

The type of switches that you’d use in production depend greatly on the speeds and feeds required, but for startup ISPs, we’ve had lots of success with Dell 4048s and Edge-Core 5812.

How hard is it to configure and deploy?

It’s not hard at all!

If you know how to use the up and down arrow keys in the bootloader and TFTP/FTP to load an image onto a piece of network hardware, you’re halfway there!

Here is a screenshot of the GRUB bootloader for an ONIE switch (this is a Dell) where you select which OS to boot the switch into


The configuration is relatively straightforward as well if you’re familiar with industry standard Command Line Interfaces (CLI).

While this lab was configured in a more traditional way using a terminal session to paste commands in, OcNOS can easily be orchestrated and automated using tools like Ansible (also presenting at Networking Field Day 18) or protocols like NETCONF as well as a REST API.

Lab configs

I’ve included the configs from the lab in order to give engineers a better idea of what OcNOS actually looks like for a production deployment.










MikroTik PE-1



 MikroTik PE-2





WISP Design – OSPF “Leapfrog” path for traffic engineering


One challenge that every WISP owner or operator has faced is how to leverage unused bandwidth on a backup path to generate more revenue.

For networks that have migrated to MPLS and BGP, this is an easier problem to solve as there are tools that can be used in those protocols like communities or MPLS TE to help manage traffic and set policy.

However, many WISPs rely solely on OSPF and cost adjustment to attempt to influence traffic. Alternatively, trying to use policy routing can lead to a design that doesn’t failover or scale well.

Creating a bottleneck on a single path

WISPs that are OSPF routed will often have a primary path back to the Internet at one or more points in the network typically from a tower that aggregates multiple backhauls.

As more towers are added that rely on this path, it can create a bottleneck while other paths are unused.


Creating an alternate best path by leapfrogging another router

One way to solve this problem is to use VLANs to create another subnet for OSPF to form an adjacency.

By tagging the VLAN from Tower 6 through Tower 3 and into Tower 4, a new path for OSPF is created that will cause Tower 6 and Tower 7 to take an alternate path.

This allows a WISP operator to tap into unused bandwidth while still retaining a straightforward failover mechanism.


Tower 6 and 7 can now use the alternative path

Now that a lower cost path exists at Layer 3 for Tower 6 and 7, the original best path become less congested and bandwidth that was previously unused can now be used to load balance traffic.


How to build the new path?

I often recommend “switch-centric” architecture also known as “router-on-a-stick” when building a WISP – there are a number of benefits to doing this. In this case, since the switches handle all connections, it’s fairly easy to tag a VLAN from Tower 6 to Tower 4 and create the “leapfrog” path.


A few closing thoughts

This works well when the towers you need to move onto an alternate path are part of a stub network – that is, they only have one way into the aggregation point before the traffic is split.

If you were trying to do this with rings that connect to other rings and had redundant paths, there could be unintended results so you need to plan the costs and paths carefully.

Labbing the topology to see what will happen is always a good idea.

Hope this is helpful for someone looking to get just a little more bandwidth out of an OSPF based WISP network.