If you are already a Network Query Engine (NQE) ninja, feel free to jump to the “NQE with Postman” section; otherwise, keep reading.
If you are familiar with the Forward Platform, you might already know that the platform discovers switches, routers, load balancers, and firewalls from all the most common vendors and collects their configuration and state. It then parses and normalizes all the data before running a mathematical model to reason about the network behavior in terms of routing and security policies.
Data parsing and normalization are key to an efficient and highly scalable platform, but before NQE all this data was available for internal use only.
With NQE we have essentially opened up the Forward Platform to provide parsed and normalized data to customers and partners for use cases like custom dashboards and custom network checks.
The beauty is that we do it for every supported platform, vendor and software version, even for very old legacy platforms that have been around for 30 years!
The exposed data structures are aligned with OpenConfig, the de facto standard for vendor-neutral network device configuration and state data models (written in YANG), and they are available through a GraphQL API.
GraphQL is a flexible data query language for APIs, developed by Facebook in 2012 and released as an open-source project in 2015. It is an alternative to REST (Representational State Transfer) and offers several benefits over it (see REST vs GraphQL article): users specify exactly what data they get back in a response, nothing more and nothing less, and they can query multiple fields in a single request.
Hundreds of organizations, like Forward Networks, are already leveraging GraphQL!
Postman recently announced built-in support for GraphQL, bringing the most popular testing and development tool for HTTP APIs to all GraphQL users.
Unfortunately, it doesn't support GraphQL introspection [yet?], a key feature that populates the schema inspector, provides autocomplete, and makes it easy to build the schema documentation.
The GraphQL schema can be imported manually instead, providing autocomplete (no documentation, sorry). GraphQL Schema Definition Language (SDL) is the only format supported at the moment.
The NQE schema in SDL format can be exported from the Forward platform and manually imported into Postman. See the NQE GitHub repository for how-to instructions.
After the NQE schema is manually imported, you can easily build Postman requests by selecting GraphQL, selecting the imported NQE schema, writing the query, sending it, and… voilà!
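For readers who want to see what such a request looks like outside Postman, here is a minimal Python sketch of a conventional GraphQL-over-HTTP POST body. The endpoint URL and the query fields are assumptions made for illustration only; check the NQE schema and your own Forward instance for the real values.

```python
import json

# Hypothetical endpoint: the real URL depends on your Forward instance.
NQE_URL = "https://forward.example.com/api/nqe"

# Hypothetical query: field names are illustrative, not taken from the NQE schema.
QUERY = """
{
  devices {
    name
  }
}
"""

def build_graphql_request(query, variables=None):
    """Return (headers, body) for a conventional GraphQL-over-HTTP POST."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    headers = {"Content-Type": "application/json"}
    return headers, json.dumps(body)

headers, body = build_graphql_request(QUERY)
print(headers)
print(body)
```

The resulting headers and body can be sent with any HTTP client; Postman builds an equivalent request when you select GraphQL as the body type.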
Figure 1: NQE query with Postman
For more information on NQE with Postman, check this video below.
For more info on NQE, check the following articles and the NQE repo on GitHub:
Enjoy your NQE queries with Postman!!
Every time I explain the Forward platform and what a Snapshot is, I know the next question I will be asked. I start with something like “A Snapshot is a collection of the network devices’ running configuration and state at a specific point in time. The Forward Platform uses all the data gathered to run its mathematical model and calculate every possible traffic path in the network.” And here comes the same question, every single time: “How long does it take?” I smile and say, “Well, it depends :)”
It’s not that I want to hide bad news or that I’m afraid the answer could end the conversation. It really does depend on several factors, and most of them relate not to the Forward platform but to the managed network: the number of devices, the size of the forwarding tables, and the ability of authentication servers to keep up with the parallel collections Forward instantiates, just to name a few. Some of these factors can impact one or more of the stages that go into building the mathematical model of the network. At the end of the day, what customers care about is the total time between when a new collection starts and when all the data has been processed and made available in the Forward User Interface (or through the Forward APIs). In large network environments, where thousands of devices are contained in a single view, this time can be significant.
To simplify the concept, the total time to build the mathematical model of the network can be divided into two main components:
We spend a lot of time and effort constantly improving the Forward Platform to support bigger networks and, at the same time, to reduce the time needed to build the model. Many of these improvements are driven by our customers’ use cases. For instance, some customers use the Forward platform in change management windows, where it’s critical to verify that the network is behaving as expected as quickly as possible after a change has been made. In these windows, often only a small subset of devices is affected by the change, the collection time is usually much longer than the processing time, and, last but not least, the biggest bottleneck is the authentication server's scalability and response time.
Today I’m happy to announce the release of Partial Collection, a new feature designed with the change management windows use case in mind.
Partial Collection allows users to dramatically reduce the collection time by triggering a new collection restricted to a subset of devices and then merging the new data with the most recent Snapshot.
The merged data is then fully processed again by the Forward platform.
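Conceptually, the merge step can be pictured as overlaying the freshly collected devices on top of the previous Snapshot. The Python sketch below is only an illustration of that idea, not Forward's implementation: the Snapshot is modeled as a plain dictionary from device name to collected data, and the device names and values are made up.

```python
def merge_partial_collection(latest_snapshot, partial_collection):
    """Overlay freshly collected devices on top of the most recent snapshot."""
    merged = dict(latest_snapshot)     # copy, so the old snapshot is left untouched
    merged.update(partial_collection)  # recollected devices replace their stale data
    return merged

# Hypothetical data: three devices in the last snapshot, one recollected after a change.
latest = {"core-1": "config v1", "edge-1": "config v1", "edge-2": "config v1"}
partial = {"edge-1": "config v2"}

merged = merge_partial_collection(latest, partial)
print(merged)
```

Only `edge-1` is collected again, which is where the time savings come from; the other devices keep the data from the previous Snapshot.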
Some customers have already reported huge improvements by leveraging Partial Collection. For instance, a customer with 5,000 devices, very large routing tables and, more importantly, a TACACS server that can support only 16 parallel connections has been able to reduce the collection time tenfold, allowing them to verify the network in very short change management windows!
We have also added a useful new piece of information to the snapshot menu: the collection Span. It is the period of time between the oldest and newest device collection times in the Snapshot. This lets customers see how old the oldest collection is, as well as how far apart the device collections are spread for the given network.
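The Span definition above boils down to a simple subtraction. Here is a minimal Python sketch, assuming per-device collection timestamps are available as a dictionary; the device names and times are invented for the example.

```python
from datetime import datetime

def collection_span(collection_times):
    """Return (oldest, newest, span) for a mapping of device -> collection time."""
    oldest = min(collection_times.values())
    newest = max(collection_times.values())
    return oldest, newest, newest - oldest

# Hypothetical per-device collection times within one Snapshot.
times = {
    "core-1": datetime(2020, 1, 1, 8, 0),
    "edge-1": datetime(2020, 1, 1, 12, 30),
    "edge-2": datetime(2020, 1, 1, 9, 15),
}

oldest, newest, span = collection_span(times)
print(span)  # 4:30:00
```

After a Partial Collection, the recollected devices carry newer timestamps than the rest, so the Span grows accordingly.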
For more information on Partial Collection, check the Snapshot documentation page in the Getting Started section.
Stay tuned for more enhancements on Collection and Processing Time!
There is a wealth of untapped information in your network. Learn to query and extract it!
Even before packets start flowing, enterprise networks are complex, data-intensive repositories of topology, configuration and state information. This information is often required to solve operational issues—like finding sources of unwanted traffic drops or protocol configuration errors—or, to find problems before they become issues.
Yet, this valuable information typically goes untapped, because getting at it requires too much work. The data is scattered across devices and stored in different formats, with disparate methods to access it.
Forward Networks' newest capability, Network Query Engine (NQE), removes the burden of collecting, parsing, and querying network information. NQE takes configuration files and state information from across all your network devices and exposes them in a well-defined schema that can be queried like a database, enabling a new range of network management capabilities and insights. Read on to see how it works and how you can get started with it today.
Simple questions are surprisingly hard to answer
Today, answering even basic questions about a network can be challenging and time-consuming. Consider this simple task: find all device interfaces in your network whose operational status does not match their configured status. For example, if an interface is configured to be UP, but is operationally DOWN, we want to know about this!
To implement this simple check, we need to 1) log in to all of our devices, 2) use the available method to retrieve the data, 3) extract the information we need from the data, and 4) put this information into some common data structure, so that the check can easily be written. In some cases, we may be able to use proper APIs, like SNMP or NETCONF, to retrieve this data. In other cases, and in almost every significant real-world case we've seen, some of the data on some of the devices has to be retrieved from the device's CLI and parsed from human-oriented, ill-specified, vendor-specific textual output.
To illustrate, consider getting the operational and administrative statuses from all interfaces on a Cisco NX-OS device and on an A10 ACOS device. On Cisco NX-OS, we have to run two commands, "show running-config" and "show interfaces brief", to get at all the data. On A10 ACOS, we have to run "show interfaces". The outputs from these three commands, shown in Figures 2, 3, and 4, are all completely different, and we have to write three ad hoc parsing functions to grab the data we want.
Developing programs to collect the needed data on one or two devices or platforms for one or two data points is not that hard. But doing this across all of your vendors, platforms, and OSes, for the thousands of data points (some rather obscure) that you may need, is an enormous effort.
When you can efficiently get answers, many operational tasks can be accelerated
Beyond the simple question about interfaces above, operators have expressed interest in answering a long list of questions:
Every network engineer we talked to has questions like these, which all require network-wide information to answer completely. In some cases, engineers would query manually and get a partial answer. In other cases, where they had tools teams or scripting-savvy network teams, they would invest the time and effort. One network operator who went down this path reported that 80% of the effort was spent in basic collecting and parsing of device and vendor-specific formats and details.
Unsurprisingly, nobody reported that this was work they'd want to do, because it's tedious to get collection, parsing, and querying right across vendors and all device types (switches, routers, firewalls, and load balancers), because it requires ongoing maintenance (as the devices evolve and new firmware versions come out), and because it adds risk (that the star coder who made this leaves). There are libraries to help with network-device collection, access, and parsing, but none get you all the way to the finish line. Existing tools/techniques like SNMP, HTTP APIs, and YANG outputs often only work on a subset of devices, require software upgrades that you are not ready to apply, or don't cover all the data you need.
Given the level of effort required, many questions are simply not answered, leaving operators blind to potential problems in their network or working long weekends hunting down the needles in their haystacks. Fortunately, now, help is on the way...
Leveraging the Forward platform
As part of the underlying engine of its network assurance and intent-based verification platform, Forward Networks already does the hard work to create a centralized, vendor-agnostic internal representation of all this network configuration and state information across all layer 2-4 device types. This information powers applications like search, verification, behavioral diffs, and more, all of which can be accessed via our GUI or API.
But this raw information could do other things—if exposed the right way—to answer questions like those above, as well as enable live queries, custom verification checks, compliance documentation, and new dashboards.
To this end, we are introducing Network Query Engine (NQE) in the Forward Networks platform. NQE provides access to normalized, structured data about the network, enabling network teams to focus directly on the higher-level aspects of their use cases, which is actually making their networks more resilient, agile and robust. Data queries for specific needs can now efficiently be written in a small fraction of the time, usually in a few lines of code or in a graphical query editor. The diagram below shows how we are exposing our core platform data model alongside our Forward Enterprise applications and intent-based verification analytical model.
Our key design goals for enabling this level of access to our structured, normalized network data model were to ensure the availability of data was:
In particular, we aimed to have a normalized data model, in the sense that device information (interface details, for example) is represented with the same information structure, regardless of vendor, platform and OS. This normalization allows you to easily write queries and scripts that work across your entire fleet. We also wanted the data to be structured, in the sense that data is fully parsed: you won't need to parse further into some string to extract the embedded structure. Again, this simplifies programming because it eliminates a bunch of annoying details, like dealing with the differences in textual representation of MAC addresses that arise across platforms.
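To make the MAC address example concrete, here is a small Python sketch of the kind of normalization you would otherwise have to write yourself. It handles three commonly seen notations (colon-separated, dot-separated groups of four, and hyphen-separated); the choice of lowercase colon-separated output is an assumption made for this illustration, not a statement about how NQE represents MAC addresses.

```python
import re

def normalize_mac(text):
    """Return a MAC address as lowercase, colon-separated octets."""
    # Drop everything that is not a hex digit (':', '-' and '.' separators).
    digits = re.sub(r"[^0-9a-fA-F]", "", text)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {text!r}")
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))

for raw in ("00:1A:2B:3C:4D:5E", "001a.2b3c.4d5e", "00-1a-2b-3c-4d-5e"):
    print(normalize_mac(raw))
```

All three inputs come out as the same string, which is what lets a single comparison or lookup work across platforms.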
We believe we've achieved these goals and invite the community to try it out. The data model is far from complete (more on that later), but feedback from operators convinced us to accelerate this feature and make it available as soon as possible.
GraphQL: A simplified and efficient query language to the Forward schema
NQE is based on GraphQL, and this is a crucial ingredient in meeting our design goals. GraphQL, which was developed by Facebook in 2012, and released as an open source project in 2015, is: "a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools." [graphql.org]
GraphQL lies somewhere in between a REST API and SQL: Like a REST API, GraphQL defines a format for requesting and receiving general data over a web connection. Like SQL, GraphQL defines a query language for requesting and receiving filtered data from a graph-structured data source that has a well-defined schema. GraphQL has gained the backing of hundreds of organizations like Intuit, Netflix, GitHub, and PayPal, and is getting rave reviews. Just to quote one: "at PayPal, GraphQL has been a complete game changer in the way we think about data, fetch data and build applications."
GraphQL provides some key ingredients for NQE. First, GraphQL defines a schema language that NQE uses to provide a clear, simple, and precise description of our network data model in the form of a GraphQL schema. This schema allows developers to know precisely what information is in our data model, how it will be structured, and what it means. For example, here is a fragment of the schema that defines the structure of the object that defines Ethernet attributes of an interface:
This declaration clearly communicates the structure with minimal fuss: it defines a record with a few fields, and each field has a well-defined type in our schema.
Second, GraphQL provides a simple way to write queries against the schema. In fact, the query language is so simple, it almost doesn't feel like a language: it looks like JSON, without the values. For example, here is a query for the macAddress and negotiatedPortSpeed values of the Ethernet datatype:
Beyond its simplicity, the query API is crucial to ease of use: as a developer, you don't have to cope with a deluge of data, which can complicate your task. Instead, you ask for, and get, just the small amount of data you need for your task. Compared to a REST API, this can make for more resource-efficient and responsive applications.
Finally, the returned data is easy to consume. The returned data is a JSON object that directly follows the structure of the query; it is essentially the same as the query, but with values attached to the requested fields. For example, the above query might return this JSON:
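The "response has the shape of the query" property can be mimicked with a toy Python function that keeps only the requested fields of a record. This is a simplified model for illustration, not a GraphQL implementation: the field names macAddress and negotiatedPortSpeed come from the text above, while the values and the extra field are invented.

```python
def select(data, selection):
    """Keep only the fields named in `selection`, recursing into nested objects."""
    result = {}
    for field, sub in selection.items():
        value = data[field]
        if sub is None:                    # leaf field: copy the value as-is
            result[field] = value
        elif isinstance(value, list):      # list of objects: project each item
            result[field] = [select(item, sub) for item in value]
        else:                              # nested object: recurse
            result[field] = select(value, sub)
    return result

# Hypothetical record: the values and `otherField` are made up for this example.
ethernet = {
    "macAddress": "00:1a:2b:3c:4d:5e",
    "negotiatedPortSpeed": "SPEED_10GB",
    "otherField": "not requested",
}

# The selection plays the role of the query: field names without values.
selection = {"macAddress": None, "negotiatedPortSpeed": None}
print(select(ethernet, selection))
```

The result contains exactly the two requested fields, in the same nesting as the selection, which is why GraphQL responses are so easy to consume.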
Answering a simple question is simple with Network Query Engine
Using NQE, we can now easily answer our original question about interfaces with mismatched operational and configured states. In particular, you can get all the data you need, across all devices in your network, with this short (and sweet) query:
And the output is immediately consumable:
That's it! Notice what you did not have to deal with: no collection and storage of data, no reading manuals to find out which command to run, no dirty regular expressions to parse the data, and no vendor-specific hacks. This single query works for all devices supported by Forward Networks!
Moreover, we've made it easy to take a query like the one above, and embed it in a script that integrates this sanity check into larger workflows. Specifically, we've open-sourced a simple Python client library that you can use to query the GraphQL API in any Forward Networks instance. For example, we can drop the above query into the following Python script that uses our client library to print out all interfaces that have different operational and configured states:
Running this program prints out something like this:
Not the prettiest thing in the world, but not too shabby either. In some 30 odd lines of simple Python, you've implemented a script that works on all devices, platforms, and OSes supported by Forward Networks (without needing to update any device firmware or install any agents) and checks a property that may be important for keeping your network sane.
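As a rough illustration of the kind of check described above, here is a self-contained Python sketch that walks a GraphQL-style response and reports interfaces whose two statuses differ. It does not use the Forward client library, and the response shape is an assumption: the top-level "data" key is standard GraphQL, but the field names (devices, interfaces, adminStatus, operStatus) and the sample values are hypothetical, so consult the NQE schema for the real ones.

```python
def find_status_mismatches(response):
    """Yield (device, interface, admin, oper) where the two statuses differ."""
    for device in response["data"]["devices"]:
        for iface in device["interfaces"]:
            if iface["adminStatus"] != iface["operStatus"]:
                yield (device["name"], iface["name"],
                       iface["adminStatus"], iface["operStatus"])

# Hypothetical response: field names and values are assumptions for this sketch.
response = {
    "data": {
        "devices": [
            {"name": "core-1", "interfaces": [
                {"name": "eth1", "adminStatus": "UP", "operStatus": "UP"},
                {"name": "eth2", "adminStatus": "UP", "operStatus": "DOWN"},
            ]},
            {"name": "edge-1", "interfaces": [
                {"name": "eth1", "adminStatus": "DOWN", "operStatus": "DOWN"},
            ]},
        ]
    }
}

for device, iface, admin, oper in find_status_mismatches(response):
    print(f"{device} {iface}: admin={admin} oper={oper}")
```

Because the data is already normalized, the check itself is a single comparison; none of the code is vendor-specific.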
NQE takes inspiration from OpenConfig, an informal working group on network management and operations standards, which includes a vendor-neutral network data model. NQE tries to align with that data model where possible. For more details on the specifics of the alignment with OpenConfig, refer to our GitHub README file.
Getting started
Head over to our GitHub repository to get started with Forward NQE. The repository shows you how to run queries and install the client library, provides a set of examples that you can use to bootstrap, and covers a variety of details about the API. An easy way to get started is to query your network in Forward Enterprise (account required). If you do not already have an account there, you can request an account here.
In particular, our repository guides you to our Network Query Explorer, which is a lightweight, interactive query editor and schema explorer in the Forward platform. The Network Query Explorer provides a simple interface for testing GraphQL queries; you can interactively query the network and get immediate results, as well as prototype queries prior to embedding them in a custom application.
The data that could be exposed about networks is vast. We've started with a small subset of this data, covering the following areas:
Please check out the repository, which has the NQE GraphQL schema in its documentation, for full details. Or, check out this interactive visualization of the NQE GraphQL schema:
We welcome your feedback on what data to include going forward. Please open issues on our GitHub repository to let us know about your use cases and the data you need.
We also invite you to use the GitHub repository to share scripts and queries and collaborate with us on Network Query Engine. Feel free to fork the repository and send us pull requests.
Have fun querying and analyzing your network with our NQE! We're excited to see what you do with it!