There is a wealth of untapped information in your network. Learn to query and extract it!
Even before packets start flowing, enterprise networks are complex, data-intensive repositories of topology, configuration and state information. This information is often required to solve operational issues—like finding sources of unwanted traffic drops or protocol configuration errors—or, to find problems before they become issues.
Yet, this valuable information typically goes untapped, because getting at it requires too much work. The data is scattered across devices and stored in different formats, with disparate methods to access it.
Forward Networks' newest capability, Network Query Engine (NQE), removes the burden of collecting, parsing, and querying network information. NQE takes configuration files and state information from across all your network devices and exposes it in a well-defined schema that can be queried like a database—to enable a new range of network management capabilities and insights. Read on to see how it works and how you can get started with it, today.
Simple questions are surprisingly hard to answer
Today, answering even basic questions about a network can be challenging and time-consuming. Consider this simple task: find all device interfaces in your network whose operational status does not match their configured status. For example, if an interface is configured to be UP, but is operationally DOWN, we want to know about this!
To implement this simple check, we need to 1) log in to all of our devices, 2) use the available method to retrieve the data 3) extract out information we need from the data, and 4) put this information into some common data structure, so that the check can easily be written. In some cases, we may be able to use proper APIs, like SNMP or NetConf, to retrieve this data. In other cases—and in almost every significant real world case we've seen—some of the data on some of the devices has to be retrieved from the CLI interface of a device and parsed from human-oriented, ill-specified, vendor-specific textual output.
To illustrate, consider getting the operational and administrative statuses from all interfaces on a Cisco NX-OS device and on an A10 ACOS device. On Cisco NX-OS, we have to run two commands, "show running-config" and "show interfaces brief" to get at all the data. On A10 ACOS, we have to run "show interfaces". The outputs from these three commands, shown in Figures 2, 3, 4 are all completely different and we have to write 3 ad-hoc parsing functions to grab the data we want.
Developing programs to collect the needed data on one or two devices or platforms for one or two data points is not that hard. But doing this across all of your vendors, platforms, and OSes, for the thousands of data points (some rather obscure) that you may need, is an enormous effort.
When you can efficiently get answers, many operational tasks can be accelerated
Beyond the simple question about interfaces above, operators have expressed interest in answering a long list of questions:
- Do all distribution-layer access links have redundant paths?
- Are MTU parameters consistent across all device connections?
- Are all IP addresses unique across the network?
- Are all BGP sessions currently established with configured peers?
- Is a particular static route available on specific devices?
- What are the nearest neighbors of a down device?
- Do all network interfaces have written descriptions?
- Are my BGP advertisements imported correctly into my upstream BGP routers as expected, as opposed to being filtered and dropped?
Every network engineer we talked to has questions like these, which all require network-wide information to answer completely. In some cases, engineers would query manually and get a partial answer. In other cases, where they had tools teams or scripting-savvy network teams, they would invest the time and effort. One network operator who went down this path reported that 80% of the effort was spent in basic collecting and parsing of device and vendor-specific formats and details.
Unsurprisingly, nobody reported that this was work they'd want to do, because it's tedious to get collection, parsing, and querying right across vendors and all device types (switches, routers, firewalls, and load balancers), because it requires ongoing maintenance (as the devices evolve and new firmware versions come out), and because it adds risk (that the star coder who made this leaves). There are libraries to help with network-device collection, access, and parsing, but none get you all the way to the finish line. Existing tools/techniques like SNMP, HTTP APIs, and YANG outputs often only work on a subset of devices, require software upgrades that you are not ready to apply, or don't cover all the data you need.
Given the level of effort required, many questions are simply not answered, leaving operators blind to potential problems in their network or working long weekends hunting down the needles in their haystacks. Fortunately, now, help is on the way...
Leveraging the Forward platform
As part of the underlying engine of its network assurance and intent-based verification platform, Forward Networks already does the hard work to create a centralized, vendor-agnostic internal representation of all this network configuration and state information across all layer 2-4 device types. This information powers applications like search, verification, behavioral diffs, and more, all of which can be accessed via our GUI or API.
But this raw information could do other things—if exposed the right way—to answer questions like those above, as well as enable live queries, custom verification checks, compliance documentation, and new dashboards.
To this end, we are introducing Network Query Engine (NQE) in the Forward Networks platform. NQE provides access to normalized, structured data about the network, enabling network teams to focus directly on the higher-level aspects of their use cases, which is actually making their networks more resilient, agile and robust. Data queries for specific needs can now efficiently be written in a small fraction of the time, usually in a few lines of code or in a graphical query editor. The diagram below shows how we are exposing our core platform data model alongside our Forward Enterprise applications and intent-based verification analytical model.
Our key design goals for enabling this level of access to our structured, normalized network data model were to ensure the availability of data was:
- Easy - flexible, consumable, intuitive access to potentially complex schema
- Maintainable - supportable, easy to update as data model evolves
- Scalable - to the scale of the world's largest networks
- Multi-Vendor - for all devices supported by the Forward Networks platform, now and in the future.
In particular, we aimed to have a normalized data model, in the sense that device information (interface details for example) is represented with the same information structure, regardless of vendor, platform and OS. This normalization allows you to easily write queries and scripts that work across your entire fleet. We also wanted the data to be structured, in the sense that data is fully-parsed—you won't need to parse further into some string to extract out the embedded structure. Again, this simplifies programming because it eliminates a bunch of annoying details, like dealing with differences in textual representation of MAC addresses that arises across platforms.
We believe we've achieved these goals and invite the community to try it out. The data model is far from complete (more on that later), but feedback from operators convinced us to accelerate this feature and make it available as soon as possible.
GraphQL: A simplified and efficient query language to the Forward schema
NQE is based on GraphQL, and this is a crucial ingredient in meeting our design goals. GraphQL, which was developed by Facebook in 2012, and released as an open source project in 2015, is: "a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools." [graphql.org]
GraphQL lies somewhere in between a REST API and SQL: Like a REST API, GraphQL defines a format for requesting and receiving general data over a web connection. Like SQL, GraphQL defines a query language for requesting and receiving filtered data from a graph-structured data source that has a well-defined schema. GraphQL has gained the backing of hundreds of organizations like Intuit, Netflix, GitHub, and PayPal, and is getting rave reviews. Just to quote one: "at PayPal, GraphQL has been a complete game changer in the way we think about data, fetch data and build applications."
GraphQL provides some key ingredients for NQE. First, GraphQL defines a schema language that NQE uses to provide a clear, simple, and precise description of our network data model in the form of a GraphQL schema. This schema allows developers to know precisely what information is in our data model, how it will be structured, and what it means. For example, here is a fragment of the schema that defines the structure of the object that defines Ethernet attributes of an interface:
This declaration clearly communicates the structure with minimal fuss: it defines a record with a few fields, and each field has a well-defined type in our schema.
Second, GraphQL provides a simple way to write queries against the schema. In fact, the query language is so simple, it almost doesn't feel like a language: it looks like JSON, without the values. For example, here is a query for the macAddress and negotiatedPortSpeed values of the Ethernet datatype:
Beyond its simplicity, the query API is crucial to ease-of-use: as a developer, you don't have to cope with a deluge of data, which can complicate your task. Instead, you ask for—and get—just the small amount of data you need for your task. Compared to REST API, this can make for more resource-efficient and responsive applications.
Finally, the returned data is easy to consume. The returned data is a JSON object that directly follows the structure of the query; it is essentially, the same as the query, but with values attached to the requested fields. For example, the above query might return this JSON:
Answering a simple question is simple with Network Query Engine
Using NQE, we can now easily answer our original question about interfaces with mismatched operational and configured states. In particular, you can get all the data you need, across all devices in your network, with this short (and sweet) query:
And the output is immediately consumable:
That's it! Notice what you did not have to deal with: no collection and storage of data, no reading manuals to find out which command to run, no dirty regular expressions to parse the data, and no vendor-specific hacks. This single query works for all devices supported by Forward Networks!
Moreover, we've made it easy to take a query like the one above, and embed it in a script that integrates this sanity check into larger workflows. Specifically, we've open-sourced a simple Python client library that you can use to query the GraphQL API in any Forward Networks instance. For example, we can drop the above query into the following Python script that uses our client library to print out all interfaces that have different operational and configured states:
Running this program prints out something like this:
Not the prettiest thing in the world, but not too shabby either. In some 30 odd lines of simple Python, you've implemented a script that works on all devices, platforms, and OSes supported by Forward Networks (without needing to update any device firmware or install any agents) and checks a property that may be important for keeping your network sane.
NQE takes inspiration from OpenConfig, an informal working group on network management and operations standards, which includes a vendor-neutral network data model. NQE tries to align with that data model where possible. For more details on specifics of the alignment with OpenConfig, refer to our github README file.
Head over to our github repository to get started with Forward NQE. The repository shows you how to run queries and install the client library, provides a set of examples that you can use to bootstrap, and covers a variety of details about the API. An easy way to get started is to query your network in Forward Enterprise (account required). If you do not already have an account there, you can request an account here.
In particular, our repository guides you to our Network Query Explorer, which is a lightweight, interactive query editor and schema explorer in the Forward platform. The Network Query Explorer provides a simple interface for testing GraphQL queries; you can interactively query the network and get immediate results, as well as prototype queries prior to embedding them in a custom application.
The data that could be exposed about networks is vast. We've started with a small subset of this data, covering the following areas:
- Basic device info including device name, vendor, platform, and OS.
- Interface data.
- IPv4 and IPv6 RIBs across all VRFs.
- Subset of BGP RIBs (adj-rib-in and adj-rib-out).
- Topology data via fields that link interfaces to other connected interfaces.
Please check out the repository, which has the NQE GraphQL schema in its documentation for full details. Or, check out this interactive, visualization of the NQE GraphQL schema:
We welcome your feedback on what data to include going forward. Please send us issues on our github repository to let us know about your use cases and the data you need.
We also invite you to use the github repository to share scripts and queries and collaborate with us on Network Query Engine. Feel free to fork the repository and send us pull requests.
Have fun querying and analyzing your network with our NQE! We're excited to see what you do with it!