Seemingly every day a new story pops up about the Internet of Things, as new devices and wearables are launched into the market, and large enterprises contemplate the possibilities of a connected world. I’ve spent quite a bit of time discussing the requirements for taking advantage of these capabilities with organizations ranging from automobile manufacturers, to consumer electronics manufacturers, to industrial manufacturers, to city governments. What I’ve seen is a recurring pattern that acts as a guide to what’s needed to capitalize on the Internet of Things, so I thought I would share some of those thoughts.
Registration and Device Management – The first thing needed to support the Internet of Things is an easy way to register a device onto a network, whether that is a simple one-to-one connection between a device and a mobile phone, a home network router, or a cloud service. Self-registration is often ideal for personal devices, but curated registration through APIs or a UI is often better for more security-conscious applications. The registration process should capture the API of the device (both data and control) and, where they are not already known in advance, define the policies and data structures that will be used to talk to the device. Once a device is registered, its firmware sometimes needs to be remotely updateable, or even remotely wipeable where the security risk is higher.
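To make the registration idea concrete, here is a minimal sketch of what a registry might record. Everything in it (the `DeviceRecord` fields, the `DeviceRegistry` class, the `approved_ids` switch between self-registration and curated registration) is a hypothetical illustration, not the API of any particular product.

```python
from dataclasses import dataclass


@dataclass
class DeviceRecord:
    """Hypothetical record captured when a device registers."""
    device_id: str
    data_fields: dict      # data API: field name -> type or unit
    commands: list         # control API: commands the device accepts
    firmware_version: str = "unknown"
    wiped: bool = False


class DeviceRegistry:
    def __init__(self, approved_ids=None):
        # approved_ids=None allows self-registration; otherwise only
        # curated (pre-approved) device ids may register.
        self._approved = set(approved_ids) if approved_ids is not None else None
        self._devices = {}

    def register(self, record):
        if self._approved is not None and record.device_id not in self._approved:
            raise PermissionError(f"{record.device_id} is not approved")
        self._devices[record.device_id] = record
        return record

    def get(self, device_id):
        return self._devices[device_id]

    def update_firmware(self, device_id, version):
        # A real system would push the image to the device; this only
        # records the version the registry believes is installed.
        self._devices[device_id].firmware_version = version

    def wipe(self, device_id):
        # Marks the device as wiped; the remote wipe itself is out of scope.
        self._devices[device_id].wiped = True
```

A curated registry would be created with something like `DeviceRegistry(approved_ids={"meter-1"})`, while `DeviceRegistry()` accepts any device that announces itself.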
Connectivity – Connectivity requirements vary quite a bit based on application. Some devices cannot feasibly maintain a constant connection, often due to power or network constraints, and sometimes periodic batched connections are all that is needed. Increasingly, however, applications require constant real-time connectivity, where information is streamed to and from the device.
Many Internet of Things architectures have two tiers of communications – one level that handles communication from devices to a collector (which may be a SCADA device, a home router, or even a mobile phone), and another that collects and manages information across collectors. However, there is an increase in the number of direct device connections to the Cloud to take advantage of its inherent portability and extensibility benefits (for example, allowing information from an activity tracking wearable to be shared easily across users or devices, or plugged into external analytics that exist only in the Cloud).
An emerging trend for connectivity is to use publish/subscribe protocols like MQTT, which optimize traffic to and from the device (or collector), and have the added benefit of being inherently event-oriented. Because every device can both publish and subscribe to anything else, each device effectively gets its own API, which offers an easier way to layer capabilities onto Internet of Things scenarios. Unlike point-to-point protocols, however, the ability to address devices through topics reduces the overhead of managing so many individual connections, allowing devices to be addressed in logical groupings. MQTT is also less taxing on the network and on device batteries than polling-based mechanisms, since it pushes data from devices only when needed and requires little in the way of headers. It can also be harder to spoof, since subscriptions are managed by the broker above the IP layer, although that still depends on the broker enforcing authentication and authorization.
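Addressing devices in logical groupings comes down to MQTT topic filters, where `+` matches exactly one topic level and `#` matches all remaining levels. The sketch below shows that matching rule with a toy in-memory broker. It is a simplification for illustration only: `TinyBroker` and the topic names are made up, and special cases such as topics beginning with `$` are ignored.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT-style filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last filter level,
            # and it matches the parent level plus everything beneath it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class TinyBroker:
    """Toy in-memory broker: no network, no QoS, no retained messages."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, topic_filter, callback):
        self._subscriptions.append((topic_filter, callback))

    def publish(self, topic, payload):
        delivered = 0
        for topic_filter, callback in self._subscriptions:
            if topic_matches(topic_filter, topic):
                callback(topic, payload)
                delivered += 1
        return delivered
```

With this, a single subscription to `plant/+/temperature` receives readings from every device that publishes one level below `plant`, without the subscriber holding a connection to each device.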
Security & Privacy – Security has quickly risen to the top of most requirements lists for the Internet of Things, simply because software-enabled physical infrastructure has some pretty severe implications if compromised. Whether the concern involves controls on industrial or city infrastructure, or simply data privacy and security, there is no way around the issue. Devices need to take some responsibility here by limiting the tamper risk, but ultimately most of the security enforcement needs to be done by the things the devices connect into. Transport layer security is critical, but authentication, authorization, and access control also need to be enforced on both sides of the connection. Ideally, security should be enforced all the way up to the application layer, filtering the content of messages to avoid things like injection attacks from compromised devices. In cases where data is cached on the device, or the device has privileged access, remote wiping capability is also a good idea.
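Filtering message content at the application layer can be as simple as refusing anything that does not fit a strict, expected shape. The following is a rough sketch under assumed conditions: the message format (a JSON object with `device_id` and `temperature_c`), the id pattern, and the plausible temperature range are all invented for illustration.

```python
import json
import re

# Assumed message contract; a real deployment would define its own.
EXPECTED_FIELDS = {"device_id", "temperature_c"}
DEVICE_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")
TEMPERATURE_RANGE_C = (-50.0, 150.0)


def validate_message(raw: str) -> dict:
    """Parse a device message and reject anything outside the contract."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("payload is not valid JSON") from exc

    # Allowlist the fields rather than trying to blocklist bad input.
    if not isinstance(msg, dict) or set(msg) != EXPECTED_FIELDS:
        raise ValueError("unexpected or missing fields")

    device_id = msg["device_id"]
    if not isinstance(device_id, str) or not DEVICE_ID_PATTERN.fullmatch(device_id):
        raise ValueError("device_id has an unexpected format")

    temperature = msg["temperature_c"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        raise ValueError("temperature_c must be a number")

    low, high = TEMPERATURE_RANGE_C
    if not low <= temperature <= high:
        raise ValueError("temperature_c is outside the plausible range")

    return {"device_id": device_id, "temperature_c": float(temperature)}
```

Validation like this complements, and does not replace, transport security and authentication: it only limits what a compromised but authenticated device can push downstream.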
In the case of consumer devices, privacy is often the bigger issue. Today, most devices don’t offer much choice about how, when, and where they share information, but in the future the control needs to shift to the consumer, allowing them to opt in to sharing data. It is probable that home gateways will provide this level of control, allowing users to select exactly which cloud services they would like to share specific device data with. In any case, data privacy needs to be designed into Internet of Things networks from the start.
Big Data Analytics – Much of the promise of the Internet of Things is contained in the ability to detect and respond to important events within a sea of emitted data. Even in cases where data is not collected in real-time, immediate response may be desired when something important is detected. Therefore, the ability to analyze the data stream in real-time, and find the needles in the haystack, is critical in many scenarios. Many scenarios also require predictive analytics to optimize operation or reduce risk (for example, predicting when something is likely to fail and proactively taking it offline). Ideally, analytic models can be developed offline using standard analytics tools and then fed into the stream for execution.
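As a small illustration of finding needles in a stream, the sketch below flags a reading when it sits far from the recent rolling average. A rolling z-score is just one simple technique chosen for the example, not a recommendation; the class name and the default window, threshold, and warm-up values are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags values that deviate strongly from a sliding window of history."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self._values = deque(maxlen=window)   # most recent readings only
        self._threshold = threshold           # z-score needed to raise a flag
        self._min_samples = min_samples       # warm-up before flagging starts

    def observe(self, value: float) -> bool:
        is_anomaly = False
        if len(self._values) >= self._min_samples:
            mu = mean(self._values)
            sigma = pstdev(self._values)
            # A perfectly flat window has sigma == 0, so nothing is flagged.
            if sigma > 0 and abs(value - mu) / sigma > self._threshold:
                is_anomaly = True
        # The value joins the window either way, anomalous or not.
        self._values.append(value)
        return is_anomaly
```

In the spirit of developing models offline and feeding them into the stream, the threshold and window here stand in for parameters that would normally be tuned against historical data first.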
In addition to real-time event analytics, offline data analytics are important in Internet of Things scenarios. This type of analytic processing is typically run in batch against much larger static data sets in order to uncover trends or anomalies in data that might help provide new insights. Hadoop-based technologies are capable of working against lower cost storage, and can pull in data of any format, allowing patterns to be detected across even unrelated data sets. For example, sensor data around a failure could be analyzed to try to recognize a pattern, but external sources such as space weather data could also be pulled into the analysis to see if there were external conditions that led to the failure. When patterns are detected using big data analytics, the pattern can be applied to real-time event analytics to detect or predict the same conditions in real-time.
Mediation and Orchestration – In addition to analytics, there needs to be some level of mediation and orchestration capability in order to recognize complex events across related devices, coordinate responses, and mediate differences across data structures and protocols. While many newer devices connect simply over standard protocols like HTTP and MQTT, older devices often rely on proprietary protocols and data formats. As the variety of sensors increases, and as multiple generations of sensors are deployed on the same networks, mediation capabilities allow data to be normalized into a more standard set of elements, so that readings from similar types of sensors that produce different data formats can be easily aligned and compared.
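Here is a minimal sketch of that normalization step, assuming two made-up wire formats: a newer device that sends JSON in Celsius, and a legacy device that sends `KEY=VALUE` pairs with a unit suffix. Both formats and all field names are hypothetical.

```python
import json


def normalize(raw: str) -> dict:
    """Map either assumed wire format onto one common reading structure."""
    raw = raw.strip()
    if raw.startswith("{"):
        # Newer format (assumed): {"id": "...", "temp_c": 21.5}
        msg = json.loads(raw)
        return {"device_id": msg["id"], "temperature_c": float(msg["temp_c"])}

    # Legacy format (assumed): "ID=pump7;TEMP=212F"
    fields = dict(part.split("=", 1) for part in raw.split(";"))
    reading = fields["TEMP"]
    number, unit = float(reading[:-1]), reading[-1].upper()
    if unit == "F":
        number = (number - 32.0) * 5.0 / 9.0   # Fahrenheit to Celsius
    elif unit != "C":
        raise ValueError(f"unknown temperature unit: {unit}")
    return {"device_id": fields["ID"], "temperature_c": round(number, 2)}
```

Once both generations of sensor produce the same structure, downstream analytics can compare their readings without knowing which format they arrived in.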
Orchestration is also important in allowing events across sensors or devices to be intelligently correlated. Since the vast majority of data created by the Internet of Things will be uninteresting, organizations need a way to recognize interesting things when they happen, even when those things only become interesting once several related things occur. For example, a slight temperature rise in one sensor might not be a huge issue, until you consider that a coolant pump is also experiencing a belt slippage. Orchestration allows seemingly disconnected events to be connected together into a more complex event. It also provides a mechanism to generate an appropriate, and sometimes complex, response.
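The coolant pump example can be sketched as a tiny correlator that raises a complex event only when all of the required simple events have been seen for the same asset within a time window. The event kinds, the `Correlator` class, and the assumption that events arrive in timestamp order are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float   # seconds
    asset_id: str
    kind: str          # e.g. "temp_rise" or "belt_slip" (made-up names)


class Correlator:
    """Reports when every required event kind occurs within one window."""

    def __init__(self, required_kinds, window_seconds):
        self._required = frozenset(required_kinds)
        self._window = window_seconds
        self._last_seen = {}   # asset_id -> {kind: latest timestamp}

    def observe(self, event: Event) -> bool:
        if event.kind not in self._required:
            return False       # uninteresting for this rule
        seen = self._last_seen.setdefault(event.asset_id, {})
        seen[event.kind] = event.timestamp
        # Assumes events arrive in time order, so "now" is this event's time.
        cutoff = event.timestamp - self._window
        fresh = {kind for kind, ts in seen.items() if ts >= cutoff}
        return fresh == self._required
```

A `True` result would be the trigger for the orchestrated response, for instance raising a work order or slowing the line; that part is deliberately left out here.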
Data Management – As data flows from connected devices, the data must be managed in a way that allows it to be easily understood and analyzed by business users. Many connected devices provide data in incremental updates, like progressive meter reads. This type of data is best managed in a time series, in what is often called a historian database, so that its change and deviation over time can be easily understood. For example, energy load profiles, temperature traces, and other sensor readings are best understood when analyzed over a period of time. Time series database techniques allow very high volumes of writes to occur in the database layer without disruption, enabling the database to keep up with the types of volumes inherent in Internet of Things scenarios. Time series query capabilities allow businesses to understand trends and outliers very quickly within a data stream, without having to write complex queries to manipulate stubborn relational structures.
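To show why progressive reads suit a time series view, the sketch below turns cumulative meter reads into per-interval usage and then averages values into fixed time buckets. It uses plain Python lists rather than a real historian database, and the function names and integer-second timestamps are assumptions.

```python
from collections import defaultdict


def interval_usage(reads):
    """Convert sorted (timestamp, cumulative_value) reads into deltas.

    Each result is (timestamp_of_later_read, usage_since_previous_read).
    """
    usage = []
    for (_, prev_value), (ts, value) in zip(reads, reads[1:]):
        usage.append((ts, value - prev_value))
    return usage


def bucket_mean(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns {bucket_start_timestamp: mean_value}, ordered by time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}
```

For example, feeding `interval_usage` a day of quarter-hour meter reads and passing the result to `bucket_mean` with `bucket_seconds=3600` gives an hourly load profile of the kind described above.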
Another important dimension to Internet of Things data is geospatial metadata. Since many connected things can be mobile, tracking location is often important alongside the time dimension. Geospatial analytics provide great value within many Internet of Things scenarios, including connected vehicles and equipment tracking use cases. Including native geospatial capabilities at the data layer, and allowing for easy four-dimensional analysis combining time series and geospatial data, opens more possibilities for extracting value from the Internet of Things.
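A combined time and location query can be sketched with the haversine great-circle formula and a simple filter. This is a brute-force illustration over an in-memory list; a data layer with native geospatial support would use spatial indexes instead. The tuple layout `(timestamp, lat, lon, device_id)` is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(points, lat, lon, radius_km, t_start, t_end):
    """Return points inside both the time window and the search radius.

    Each point is assumed to be (timestamp, lat, lon, device_id).
    """
    return [
        p for p in points
        if t_start <= p[0] <= t_end
        and haversine_km(p[1], p[2], lat, lon) <= radius_km
    ]
```

A connected-vehicle query such as "which vehicles were within 5 km of this depot in the last hour" maps directly onto a call to `nearby`.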
With the volume of data in many Internet of Things settings, the cost of retaining everything can quickly get out of control, so it needs to be governed by retention policies. The utility of data degrades fairly quickly in most scenarios, so immediate access becomes less important as it ages. Data retention policies allow the business to determine how long to retain information in the database layer. Often this is based on a specified amount of time, but it can also be based on a specific number of sensor readings or other factors. Complex policies could also define conditions under which default policies should be overridden, in cases where something interesting was sensed, for example.
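A retention policy of this kind might be expressed as follows. This is only a sketch: the `Reading` type, the `flagged` marker standing in for "something interesting was sensed", and the choice to exempt flagged readings from both limits are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    timestamp: float
    value: float
    flagged: bool = False   # set when something interesting was sensed


def apply_retention(readings, now, max_age_seconds=None, max_count=None):
    """Return the readings to keep, oldest first.

    Unflagged readings are limited by age and/or by count (newest kept);
    flagged readings override the default policy and are always kept.
    """
    ordered = sorted(readings, key=lambda r: r.timestamp)
    candidates = [r for r in ordered if not r.flagged]

    if max_age_seconds is not None:
        candidates = [r for r in candidates if now - r.timestamp <= max_age_seconds]
    if max_count is not None:
        # Guard against max_count == 0, where [-0:] would keep everything.
        candidates = candidates[-max_count:] if max_count > 0 else []

    kept = candidates + [r for r in ordered if r.flagged]
    return sorted(kept, key=lambda r: r.timestamp)
```

In practice, data that falls out of the database layer would more likely be moved to cheaper storage than discarded outright, which this sketch does not model.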
Asset Management – When the lifecycle of connected things needs to be managed, asset management becomes a key capability. This is particularly important when dealing with high value assets, or instances where downtime equates to substantial lost revenue opportunity. Asset management solutions provide a single point of control over all types of assets (production, infrastructure, facilities, transportation, and communications), enabling the tracking of individual assets, along with their deployment, location, service history, and resource and parts supply chain.
Asset management tracks details on failure conditions and the specific service instructions prescribed for those conditions. It helps manage both planned and unplanned work activities, from initial request through completion and recording of actuals. It also establishes service level agreements, enables proactive monitoring of service level delivery, and supports the implementation of escalation procedures. By connecting real-time awareness with asset management, maintenance requirements can be more effectively predicted, repair cycles optimized, and assets more effectively tracked and managed.
Dashboards & Visualization – When dealing with large volumes of data, one of the best ways of understanding what is happening is to use visualizations and dashboards. These technologies allow information to be easily summarized into live graphical views that quickly show where problems and outliers may be hidden. Users can drill into potential problem areas and get more detail to be able to diagnose problems and propose resolution. Dashboards provide context to information and provide users with specific controls to address common issues. They provide a way to visually alert users to important data elements in real time, and then act on that information directly.
Integration – Integration into on-premises or Cloud-based back office systems of record is critical for many Internet of Things scenarios. Back office systems provide customer, inventory, sales, and supply chain data, and also provide access to key functions like MRP, purchasing, customer support, and sales automation. By integrating with these key systems, insights and events gained from the Internet of Things can be converted into actions. For example, parts could be ordered automatically when a detected condition predicts a pending failure, or a partner could be alerted to an opportunity to replenish an accessory when a low supply is detected.
Client SDK – The processing power of connected devices is continuously increasing. In addition to providing a connection to the Internet, many of these devices are capable of additional processing functions. For example, in some cases where connections are sporadic, on-device caching is desirable. In other cases, it makes sense to even run some filtering or analytics directly on higher-powered devices to pre-filter or manipulate data.
Providing a client SDK that enables these functions helps organizations who build these devices to innovate more quickly. It also enables third-party developers to drive their own innovations into these products. At a minimum, the ability to manage the client side of a publish-subscribe interaction is required. By enabling these capabilities in the SDK, chip and device manufacturers can optimize their opportunity and increase the utility of their offerings.
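On-device caching for sporadic connections is often implemented as a store-and-forward buffer in front of the publish call. The sketch below is a hypothetical illustration of that idea, not part of any real SDK: `send` is an assumed callable supplied by the caller that returns `True` when a message was delivered, and the oldest messages are dropped once the buffer is full.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Queues outgoing messages while the link is down, in order."""

    def __init__(self, send, capacity=100):
        self._send = send                        # callable(topic, payload) -> bool
        self._pending = deque(maxlen=capacity)   # oldest entry dropped when full

    @property
    def pending(self) -> int:
        return len(self._pending)

    def publish(self, topic, payload) -> int:
        """Queue a message, then try to drain the queue. Returns count sent."""
        self._pending.append((topic, payload))
        return self.flush()

    def flush(self) -> int:
        sent = 0
        while self._pending:
            topic, payload = self._pending[0]
            if not self._send(topic, payload):
                break                            # link still down; retry later
            self._pending.popleft()
            sent += 1
        return sent
```

Dropping the oldest data when the buffer fills is only one possible policy; a device that pre-filters or aggregates readings on board could instead compact the queue before sending.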
Want more detail? Check us out at ThingMonk December 2-3 in Shoreditch: THE conference to go to for Internet of Things!