
Meet Your Makers


We are in the midst of a new era of innovation, and an entire generation of makers is emerging. These makers are enabled by direct access to a range of capabilities and building blocks that were previously only available to multi-million dollar corporations. They have unprecedented control over both the digital and physical world, access to unlimited computing capacity, and an entire Internet of data to exploit. These makers are reshaping not only the technology landscape, but also the practices and opportunities of traditional businesses. If you haven’t done so already, it is time to meet your makers.

Makers can instantly download free developer tools and advanced runtime environments to build new applications. They can spin up Cloud computing infrastructure in minutes to run these applications, accessing tens or hundreds of thousands of dollars' worth of computing infrastructure without any upfront cost. They can choose from thousands of open APIs to add key capabilities to their applications, incorporating the best data and the best functionality available in the market without outlaying a penny of capital expense. Perhaps most amazingly, makers don’t need to be particularly sophisticated to take advantage of all of this – this is a mass movement, not an exclusive one.

The maker generation has been empowered by the removal of three key barriers that have traditionally kept this type of innovation in the hands of large corporations:

  1. Economics
  2. Closed Systems
  3. Technological Complexity

Economic Barriers
The removal of economic barriers through the availability of Cloud computing has been a huge factor in the rise of the maker. Using Cloud services, developers have access to unlimited processing power, storage, and network infrastructure. They can also easily deploy applications across geographic boundaries, lowering the barriers to entering new markets. Pay-as-you-go models are standard and elasticity is built in, allowing makers to experiment at very low cost yet easily scale to meet bursting demand when ideas catch on.

But the lowering of economic barriers has not been limited to Cloud. Universal mobile and Wi-Fi connectivity, with commoditizing cost structures, has made it possible to connect anything, anywhere. As they dream up their designs, makers can assume connectivity with a relatively high degree of reliability.

And perhaps the biggest and most current disruption is in the economics of microelectronics. Computers that would have powered businesses thirty years ago can now be shrunk down to the size of a postage stamp. Battery technologies have evolved to remarkable lifespans, and the energy to charge batteries can be collected from a variety of sources, including body heat and movement. And yet, with all this advancement, makers can buy an LTE-capable microprocessor on the open market for under $10.

Closed System Barriers
While many of the early computing companies built their businesses on closed systems, computer systems have gradually evolved toward openness, inspired by Internet technologies like TCP/IP and HTTP, and communications technologies like Wi-Fi and GSM. Open programming frameworks like Java, and data formats like XML and JSON, have lowered the barriers to interoperability, enabling makers to build new systems capable of interacting with the old. Open lightweight protocols like Bluetooth LE and MQTT have provided ways to easily bridge between the digital and physical worlds.

The most recent wave of technology innovation over the past ten years has produced advancements in open software technology like Hadoop, columnar databases, and document stores, all of which provide the tools for makers to manage and analyze huge volumes of data. And even commercial software companies now routinely offer their products through free download for development use, providing makers with limitless options without having to settle for second-class capabilities.

Technological Complexity Barriers
In my view, the biggest barrier to fall has been the one that has kept information technology in the control of a relatively small population of elite experts. The consumerization of technology, and the resulting simplification of its design, has created a huge accelerator for innovation, and vastly expanded the population of potential makers. Even 10 years ago, programming was mostly limited to technological whiz kids with advanced EE degrees or natural propensities toward mathematics and science. The barriers on the hardware side were even steeper, often requiring deep understanding of hardware architectures and embedded systems.

Today, technology can be used and controlled with a much more basic set of skills. In the Cloud, Platform as a Service technologies simplify traditionally complex tasks like configuring high availability and synchronizing data across data centers. JavaScript has emerged as a low-barrier programming language that simplifies the transition from client to server to database, while naturally extending to mobile devices. Even hardware has joined this wave, with technologies like the $25 Raspberry Pi offering affordable and extensible hardware foundations for makers to build upon. And with 3D printers, even physical objects and prototypes can be created at a fraction of the cost and complexity of the past.

Perhaps most importantly, the drive toward simple Web APIs has inspired a whole new wave of Internet accessible capabilities with easy HTTP-based interfaces that can be learned in minutes. The result of this is a plethora of tools at the maker’s fingertips. Makers can combine data and functions from thousands of developers across thousands of companies, wiring together new applications in hours to achieve what would have taken weeks or months only a decade ago.
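
To make that concrete, here is a minimal sketch of the kind of "wiring together" described above, using Python's requests library. The endpoint URLs and response fields are hypothetical placeholders, not real services; the point is only how little code a simple HTTP-based API mashup requires.

```python
# A minimal sketch of combining two hypothetical Web APIs; the URLs and
# response fields are illustrative assumptions, not real services.
import requests

def get_forecast(city):
    # Hypothetical weather API returning JSON like {"city": ..., "temp_c": ...}
    resp = requests.get("https://api.example.com/weather", params={"city": city})
    resp.raise_for_status()
    return resp.json()

def get_events(city):
    # Hypothetical events API returning JSON like {"events": [{"name": ...}, ...]}
    resp = requests.get("https://api.example.com/events", params={"city": city})
    resp.raise_for_status()
    return resp.json()["events"]

def outdoor_event_suggestions(city):
    # "Mash up" the two APIs: only suggest events when the forecast looks good.
    forecast = get_forecast(city)
    if forecast["temp_c"] < 15:
        return []
    return [e["name"] for e in get_events(city)]

if __name__ == "__main__":
    print(outdoor_event_suggestions("Austin"))
```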

This matters because these makers are driving much of the innovation happening in the technology marketplace today. These makers are changing business models, cross-pollinating capabilities and data into new markets, and opening up new channels. These makers are a potential innovation engine for your own data and capabilities. They think in new ways, find new uses for existing assets, and find ways to monetize things that were never thought of as valuable. Your competitors are likely dipping their toes into this innovation pool already, rather than relying only on their traditional IT teams to discover and drive innovation.

So where are these makers? With these barriers removed, they are emerging everywhere. Many of them likely exist in your own organization. They are out there thinking of an idea, perhaps searching for data, expertise, or capabilities that your organization could offer them. These makers are the people who will disrupt your market or lead your industry’s next great opportunity. If they aren’t empowered by your point of view, they will find other means to achieve their goals, many of which may directly compete with your own.

I suggest you make an effort to reach out and meet your makers and empower them before the opportunity passes you by.


A recipe for the Internet of Things

Seemingly every day a new story pops up about the Internet of Things, as new devices and wearables are launched into the market, and large enterprises contemplate the possibilities of a connected world. I’ve spent quite a bit of time discussing the requirements for taking advantage of these capabilities with organizations ranging from automobile manufacturers, to consumer electronics manufacturers, to industrial manufacturers, to city governments. What I’ve seen is a recurring pattern that acts as a guide to what’s needed to capitalize on the Internet of Things, so I thought I would share some of those thoughts.

Registration and Device Management – The first thing that is needed to support the Internet of Things is a way to easily register a device onto a network, whether that is a simple one-to-one connection between a device and a mobile phone, a home network router, or a cloud service. Self-registration is often ideal for personal devices, but curated registration through APIs or a UI is often better for more security conscious applications. The registration process should capture the API of the device (both data and control) and define the policies and data structures that will be used to talk to the device if those things are not already known in advance. Once registered, the firmware on the device sometimes needs to be remotely updateable, or even remotely wiped in cases with higher security risk.
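
As an illustration of what curated registration through an API might look like, here is a minimal Python sketch. The registry endpoint, credential, and payload fields are all hypothetical; the point is that registration captures the device's data and control "API" plus the policies used to talk to it.

```python
# Hypothetical sketch of registering a device with a curated registration API.
# The endpoint, auth token, and payload fields are illustrative only.
import requests

REGISTRY_URL = "https://iot.example.com/api/devices"   # hypothetical registry endpoint
ADMIN_TOKEN = "replace-with-real-credential"

def register_device(device_id, device_type, firmware_version):
    payload = {
        "deviceId": device_id,
        "type": device_type,
        "firmware": firmware_version,
        # Describe the device's "API": what it reports and what it accepts.
        "telemetry": ["temperature_c", "battery_pct"],
        "commands": ["reboot", "set_reporting_interval"],
        # Policies captured at registration time.
        "policies": {"reportingIntervalSec": 60, "remoteWipe": True},
    }
    resp = requests.post(
        REGISTRY_URL,
        json=payload,
        headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()   # e.g. the credentials the device will use to connect

if __name__ == "__main__":
    print(register_device("pump-0042", "coolant-pump", "1.4.2"))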

Connectivity – Connectivity requirements vary quite a bit based on application. Some devices cannot feasibly maintain a constant connection, often due to power or network constraints, and sometimes periodic batched connections are all that is needed. However, increasingly applications require constant real-time connectivity, where information is streamed to and from the device.

Many Internet of Things architectures have two tiers of communications – one level that handles communication from devices to a collector (which may be a SCADA device, a home router, or even a mobile phone), and another that collects and manages information across collectors. However, there is an increase in the number of direct device connections to the Cloud to take advantage of its inherent portability and extensibility benefits (for example, allowing information from an activity tracking wearable to be shared easily across users or devices, or plugged into external analytics that exist only in the Cloud).

An emerging trend for connectivity is to use publish/subscribe protocols like MQTT, which optimize traffic to and from the device (or collector), and have the added benefit of being inherently event-oriented. They also allow anything to subscribe to anything else, effectively giving each device its own API and offering an easier way to layer capabilities onto Internet of Things scenarios. Unlike point-to-point protocols, the ability to address devices through topics reduces the overhead of managing so many individual connections, allowing devices to be addressed in logical groupings. MQTT also has the benefit of being less taxing on network and battery infrastructure than polling-based mechanisms, since it pushes data from devices only when needed and requires little in the way of headers. It is also harder to spoof, since subscriptions are managed above the IP layer.
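
Here is a small publish/subscribe sketch using the Eclipse Paho MQTT client for Python (pip install paho-mqtt, 1.x-style callbacks). The broker address, topic naming scheme, and payload shape are assumptions for illustration; the wildcard subscription shows how topics let you address devices in logical groupings.

```python
# Minimal MQTT publish/subscribe sketch with paho-mqtt (1.x-style API).
# Broker, topics, and payload fields are illustrative assumptions.
import json
import paho.mqtt.client as mqtt

BROKER = "broker.example.com"                    # hypothetical MQTT broker
TELEMETRY_TOPIC = "site1/pumps/+/temperature"    # '+' matches any pump id

def on_connect(client, userdata, flags, rc):
    # Subscribe to a logical group of devices via a topic wildcard.
    client.subscribe(TELEMETRY_TOPIC)
    # A device (or collector) publishes only when it has something to say.
    client.publish("site1/pumps/pump-0042/temperature",
                   json.dumps({"value": 71.5, "ts": "2013-11-01T12:00:00Z"}))

def on_message(client, userdata, msg):
    reading = json.loads(msg.payload)
    print(f"{msg.topic}: {reading['value']} C")

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect(BROKER, 1883, keepalive=60)
client.loop_forever()
```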

Security & Privacy – Security has quickly risen to the top of most requirements for Internet of Things, simply because software-enabled physical infrastructure has some pretty severe implications if compromised. Whether the concern involves controls on industrial or city infrastructure, or simply data privacy and security, there is no way around the issue. Devices need to take some responsibility here by limiting the tamper risk, but ultimately most of the security enforcement needs to be done by the things the devices connect into. Transport layer security is critical, but authentication, authorization, and access control also need to be enforced on both sides of the connection. Ideally, security should be enforced all the way up to the application layer, filtering the content of messages to avoid things like injection attacks from compromised devices. In cases where data is cached on the device, or the device has privileged access, remote wiping capability is also a good idea.
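
As a hedged sketch of what "both sides of the connection" can look like in practice, the snippet below hardens the same Paho MQTT client: transport security against a trusted CA, per-device credentials, and a simple application-layer check on the receiving side. The CA file, credentials, and allowed payload shape are placeholders.

```python
# Transport security, authentication, and basic payload filtering for an MQTT
# connection (paho-mqtt 1.x-style API). File names and credentials are placeholders.
import ssl
import paho.mqtt.client as mqtt

client = mqtt.Client(client_id="pump-0042")

# Transport layer security: verify the broker against a trusted CA certificate.
client.tls_set(ca_certs="ca.pem", tls_version=ssl.PROTOCOL_TLSv1_2)

# Authentication: per-device credentials issued at registration time.
client.username_pw_set("pump-0042", "device-secret-from-registration")

client.connect("broker.example.com", 8883, keepalive=60)

# Application-layer filtering on the receiving side: reject payloads that do
# not match the expected shape before they reach downstream systems.
ALLOWED_KEYS = {"value", "ts"}

def is_well_formed(reading: dict) -> bool:
    return set(reading) <= ALLOWED_KEYS and isinstance(reading.get("value"), (int, float))
```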

In the case of consumer devices, privacy is often the bigger issue. Today, most devices don’t offer much choice about how, when, and where they share information, but in the future the control needs to shift to the consumer, allowing them to opt in to sharing data. It is probable that this level of control will become something that home gateways provide – allowing users to select exactly which cloud services they would like to share specific device data with. In any case, data privacy needs to be designed into Internet of Things networks from the start.

Big Data Analytics – Much of the promise of the Internet of Things is contained in the ability to detect and respond to important events within a sea of emitted data. Even in cases where data is not collected in real-time, immediate response may be desired when something important is detected. Therefore, the ability to analyze the data stream in real-time, and find the needles in the haystack, is critical in many scenarios. Many scenarios also require predictive analytics to optimize operation or reduce risk (for example, predicting when something is likely to fail and proactively taking it offline). Ideally, analytic models can be developed offline using standard analytics tools and then fed into the stream for execution.
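
To illustrate the "needle in the haystack" idea, here is a deliberately simple streaming check in Python: each device keeps a rolling window of its own recent readings, and a value that deviates sharply from that baseline is flagged. The window size and threshold are illustrative assumptions, standing in for a real analytic model fed into the stream.

```python
# A minimal rolling-window anomaly check over a reading stream.
# Window size and sigma threshold are illustrative assumptions.
from collections import defaultdict, deque
from statistics import mean, pstdev

WINDOW = 50          # readings kept per device
SIGMA_LIMIT = 3.0    # flag readings more than 3 standard deviations out

history = defaultdict(lambda: deque(maxlen=WINDOW))

def score_reading(device_id, value):
    window = history[device_id]
    anomaly = False
    if len(window) >= 10:                      # need some history first
        mu, sigma = mean(window), pstdev(window)
        if sigma > 0 and abs(value - mu) > SIGMA_LIMIT * sigma:
            anomaly = True
    window.append(value)
    return anomaly

# Example: a sudden spike stands out against the device's own baseline.
for v in [70.1, 70.4, 69.9] * 10 + [95.0]:
    if score_reading("pump-0042", v):
        print("anomaly detected:", v)
```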

In addition to real-time event analytics, offline data analytics are important in Internet of Things scenarios. This type of analytic processing is typically run in batch against much larger static data sets in order to uncover trends or anomalies in data that might help provide new insights. Hadoop-based technologies are capable of working against lower cost storage, and can pull in data of any format, allowing patterns to be detected across even unrelated data sets. For example, sensor data around a failure could be analyzed to try to recognize a pattern, but external sources such as space weather data could also be pulled into the analysis to see if there were external conditions that led to the failure. When patterns are detected using big data analytics, the pattern can be applied to real-time event analytics to detect or predict the same conditions in real-time.
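
The sketch below shows one way the space weather example could play out offline, using pandas to join a historical failure log with an external daily data set and look for shared conditions. The file names, column names, and "geomagnetic index" field are assumptions, not a real schema; a production analysis would more likely run over Hadoop-scale storage.

```python
# Illustrative offline analysis: join sensor failures with an external data
# set to look for shared conditions. Files and columns are assumptions.
import pandas as pd

# Sensor failure log: one row per failure with a timestamp and sensor id.
failures = pd.read_csv("failures.csv", parse_dates=["failed_at"])

# External conditions keyed by date (e.g., a daily geomagnetic index).
space_weather = pd.read_csv("space_weather.csv", parse_dates=["date"])

failures["date"] = failures["failed_at"].dt.normalize()
joined = failures.merge(space_weather, on="date", how="left")

# Crude pattern check: do failures cluster on days with unusual conditions?
buckets = pd.cut(joined["geomagnetic_index"], bins=5)
print(joined.groupby(buckets).size().sort_values(ascending=False))
```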

Mediation and Orchestration – In addition to analytics, there needs to be some level of mediation and orchestration capability in order to recognize complex events across related devices, coordinate responses, and mediate differences across data structures and protocols. While many newer devices connect simply over standard protocols like HTTP and MQTT, older devices often rely on proprietary protocols and data formats. As the variety of sensors increases, and as multiple generations of sensors are deployed on the same networks, mediation capabilities allow data to be normalized into a more standard set of elements, so that readings from similar types of sensors that produce different data formats can be easily aligned and compared.
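
A minimal example of that normalization step is shown below: two invented sensor formats, one older (flat fields, Fahrenheit) and one newer (nested, Celsius), are mediated into a single common record so downstream analytics can compare them directly.

```python
# A small mediation sketch: map readings from different device generations
# into one normalized record. The incoming formats are invented for illustration.
def normalize(raw: dict) -> dict:
    """Map readings from different device generations to one common shape."""
    if "tempF" in raw:                         # older generation: flat fields, Fahrenheit
        return {
            "device_id": raw["id"],
            "temperature_c": (raw["tempF"] - 32) * 5.0 / 9.0,
            "ts": raw["time"],
        }
    if "measurements" in raw:                  # newer generation: nested, Celsius
        return {
            "device_id": raw["deviceId"],
            "temperature_c": raw["measurements"]["temperature"],
            "ts": raw["timestamp"],
        }
    raise ValueError("unrecognized device format")

old_style = {"id": "s-17", "tempF": 160.2, "time": "2013-11-01T12:00:00Z"}
new_style = {"deviceId": "s-209", "timestamp": "2013-11-01T12:00:05Z",
             "measurements": {"temperature": 71.3}}

print(normalize(old_style))
print(normalize(new_style))
```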

Orchestration is also important in allowing events across sensors or devices to be intelligently correlated. Since the vast majority of data created by the Internet of Things will be uninteresting, organizations need a way to recognize interesting things when they happen, even when those things only become interesting once several related things occur. For example, a slight temperature rise in one sensor might not be a huge issue, until you consider that a coolant pump is also experiencing a belt slippage. Orchestration allows seemingly disconnected events to be connected together into a more complex event. It also provides a mechanism to generate an appropriate, and sometimes complex, response.
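
Here is a hedged sketch of that coolant pump example: two individually minor events are correlated into one complex event when they occur on the same asset within a time window, and a coordinated response is triggered. The event names, window length, and response are illustrative assumptions.

```python
# Correlating two minor events into one complex event within a time window.
# Event names, the window, and the response are illustrative assumptions.
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)
recent_events = []   # (timestamp, pump_id, event_type)

def respond(pump_id):
    # The coordinated response could open a work order, throttle the pump, etc.
    print(f"complex event: likely coolant pump problem on {pump_id}")

def on_event(ts, pump_id, event_type):
    recent_events.append((ts, pump_id, event_type))
    # Keep only events inside the correlation window.
    cutoff = ts - WINDOW
    recent_events[:] = [e for e in recent_events if e[0] >= cutoff]

    types_for_pump = {etype for (t, pid, etype) in recent_events if pid == pump_id}
    if {"temperature_rise", "belt_slippage"} <= types_for_pump:
        respond(pump_id)

on_event(datetime(2013, 11, 1, 12, 0), "pump-0042", "temperature_rise")
on_event(datetime(2013, 11, 1, 12, 4), "pump-0042", "belt_slippage")
```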

Data Management – As data flows from connected devices, the data must be managed in a way that allows it to be easily understood and analyzed by business users. Many connected devices provide data in incremental updates, like progressive meter reads. This type of data is best managed in a time series, in what is often called a historian database, so that its change and deviation over time can be easily understood. For example, energy load profiles, temperature traces, and other sensor readings are best understood when analyzed over a period of time. Time series database techniques allow very high volumes of writes to occur in the database layer without disruption, enabling the database to keep up with the types of volumes inherent in Internet of Things scenarios. Time series query capabilities allow businesses to understand trends and outliers very quickly within a data stream, without having to write complex queries to manipulate stubborn relational structures.
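
As a small illustration of the time-series idea, the sketch below appends a day of simulated one-minute meter reads and then asks the two questions a historian-style store makes easy: what is the trend over time, and which readings are outliers? The generated data and thresholds are assumptions; a real deployment would use a purpose-built time series database rather than in-memory pandas.

```python
# Illustrative time-series queries over appended sensor readings using pandas.
# The simulated data and outlier threshold are assumptions for the example.
import numpy as np
import pandas as pd

# Simulate a day of one-minute meter reads (append-only writes).
index = pd.date_range("2013-11-01", periods=24 * 60, freq="min")
values = np.random.normal(loc=50.0, scale=2.0, size=len(index))
readings = pd.Series(values, index=index, name="load_kw")

# Trend: hourly averages make the load profile easy to read.
hourly = readings.resample("h").mean()

# Outliers: deviation from the rolling hourly mean.
deviation = readings - readings.rolling("1h").mean()
outliers = readings[deviation.abs() > 3 * readings.std()]

print(hourly.head())
print(f"{len(outliers)} outlier readings")
```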

Another important dimension to Internet of Things data is geospatial metadata. Since many connected things can be mobile, tracking location is often important alongside the time dimension. Geospatial analytics provide great value within many Internet of Things scenarios, including connected vehicles and equipment tracking use cases. Including native geospatial capabilities at the data layer, and allowing for easy four-dimensional analysis combining time series and geospatial data, opens more possibilities for extracting value from the Internet of Things.

With the volume of data in many Internet of Things settings, the cost of retaining everything can quickly get out of control, so it needs to be governed by retention policies. The utility of data degrades fairly quickly in most scenarios, so immediate access becomes less important as it ages. Data retention policies allow the business to determine how long to retain information in the database layer. Often this is based on a specified amount of time, but it can also be based on a specific number of sensor readings or other factors. Complex policies could also define conditions under which default policies should be overridden, in cases where something interesting was sensed, for example.
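
A simple retention policy of that shape might look like the sketch below: a default time-based limit, with an override that keeps "interesting" (here, anomalous) readings longer. The retention periods and the anomaly flag are illustrative assumptions.

```python
# A time-based retention policy with an override for flagged readings.
# Retention periods and the "anomaly" flag are illustrative assumptions.
from datetime import datetime, timedelta

DEFAULT_RETENTION = timedelta(days=30)
FLAGGED_RETENTION = timedelta(days=365)   # keep anomalous readings longer

def should_retain(reading: dict, now: datetime) -> bool:
    age = now - reading["ts"]
    limit = FLAGGED_RETENTION if reading.get("anomaly") else DEFAULT_RETENTION
    return age <= limit

now = datetime(2013, 12, 1)
readings = [
    {"ts": datetime(2013, 10, 1), "value": 70.2, "anomaly": False},   # expired
    {"ts": datetime(2013, 10, 1), "value": 95.0, "anomaly": True},    # kept
    {"ts": datetime(2013, 11, 25), "value": 70.4, "anomaly": False},  # kept
]
retained = [r for r in readings if should_retain(r, now)]
print(len(retained), "of", len(readings), "readings retained")
```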

Asset Management – When the lifecycle of connected things needs to be managed, asset management becomes a key capability. This is particularly important when dealing with high value assets, or instances where downtime equates to substantial lost revenue opportunity. Asset management solutions provide a single point of control over all types of assets — production, infrastructure, facilities, transportation and communications — enabling the tracking of individual assets, along with their deployment, location, service history, and resource and parts supply chain.

Asset management manages details on failure conditions and specific prescribed service instructions related to those conditions. It helps manage both planned and unplanned work activities, from initial request through completion and recording of actuals. It also establishes service level agreements, and enables proactive monitoring of service level delivery, and implementation of escalation procedures. By connecting real-time awareness with asset management, maintenance requirements can be more effectively predicted, repair cycles optimized, and assets more effectively tracked and managed.

Dashboards & Visualization – When dealing with large volumes of data, one of the best ways of understanding what is happening is to use visualizations and dashboards. These technologies allow information to be easily summarized into live graphical views that quickly show where problems and outliers may be hidden. Users can drill into potential problem areas and get more detail to be able to diagnose problems and propose resolution. Dashboards provide context to information and provide users with specific controls to address common issues. They provide a way to visually alert users to important data elements in real time, and then act on that information directly.

Integration – Integration into on-premise or Cloud-based back office systems of record is critical for many Internet of Things scenarios. Back office systems provide customer, inventory, sales, and supply chain data, and also provide access to key functions like MRP, purchasing, customer support, and sales automation. By integrating with these key systems, insights and events gained from the Internet of Things can be converted into actions. For example, ordering of parts could be automated when a failure condition is detected that predicts a pending failure, or a partner could be alerted to an opportunity to replenish an accessory when a low supply is detected.
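
The parts-ordering example could be wired up roughly as sketched below: when a predicted-failure event arrives from the analytics layer, a purchase request is raised against a back office system over a REST interface. The ERP endpoint, credential, fields, and confidence threshold are all hypothetical.

```python
# Converting an insight into a back-office action via a hypothetical ERP REST
# endpoint. URL, credential, fields, and threshold are invented for illustration.
import requests

ERP_URL = "https://erp.example.com/api/purchase-requests"   # hypothetical
API_KEY = "replace-with-real-credential"

def on_predicted_failure(event: dict):
    """event example: {"assetId": "pump-0042", "part": "belt-kit-7", "confidence": 0.91}"""
    if event["confidence"] < 0.8:
        return   # only act on high-confidence predictions
    order = {
        "assetId": event["assetId"],
        "partNumber": event["part"],
        "quantity": 1,
        "reason": "predicted failure from IoT analytics",
    }
    resp = requests.post(ERP_URL, json=order,
                         headers={"Authorization": f"Bearer {API_KEY}"},
                         timeout=10)
    resp.raise_for_status()
    print("purchase request created:", resp.json().get("requestId"))
```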

Client SDK – The processing power of connected devices is continuously increasing. In addition to providing a connection to the Internet, many of these devices are capable of additional processing functions. For example, in some cases where connections are sporadic, on-device caching is desirable. In other cases, it even makes sense to run some filtering or analytics directly on higher-powered devices to pre-filter or manipulate data.

Providing a client SDK that enables these functions helps organizations who build these devices to innovate more quickly. It also enables third-party developers to drive their own innovations into these products. At a minimum, the ability to manage the client side of a publish-subscribe interaction is required. By enabling these capabilities in the SDK, chip and device manufacturers can optimize their opportunity and increase the utility of their offerings.
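
The class below is a simplified illustration of what such an SDK might wrap for a device: local caching while offline, basic on-device pre-filtering, and publishing through the client side of a publish-subscribe connection once connected. It is not any vendor's actual SDK; the filtering threshold and cache size are assumptions.

```python
# A simplified illustration of a device-side SDK: offline caching, simple
# pre-filtering, and publish on reconnect. Thresholds and sizes are assumptions.
import json
from collections import deque

class DeviceClient:
    def __init__(self, mqtt_client, topic, max_cache=1000):
        self._mqtt = mqtt_client               # e.g. a connected paho-mqtt client
        self._topic = topic
        self._cache = deque(maxlen=max_cache)  # oldest readings drop off if full
        self._last = None                      # last reported value, for filtering
        self.connected = False

    def report(self, reading: dict):
        # Pre-filter on the device: suppress readings that carry no new signal.
        if self._last is not None and abs(reading["value"] - self._last) < 0.1:
            return
        self._last = reading["value"]
        self._cache.append(reading)
        if self.connected:
            self.flush()

    def flush(self):
        # Drain cached readings to the broker once a connection is available.
        while self._cache:
            self._mqtt.publish(self._topic, json.dumps(self._cache.popleft()))

    def on_connect(self):
        self.connected = True
        self.flush()

    def on_disconnect(self):
        self.connected = False
```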

Want more detail? Check us out at ThingMonk December 2-3 in Shoreditch: THE conference to go to for Internet of Things!


It’s a matter of convergence

I’ve been hearing a lot more lately about Social, Mobile, Cloud, and Big Data as a combined theme. But it often feels like the combination is used more for convenience, or to discuss them as coincident trends. The real story is that the combination of these emerging capabilities is the actual change agent. Any of them alone is an exciting opportunity, but converged together, they enable things only imagined previously in science fiction. In a lot of ways, it starts with mobile, but not just delivering mobile apps to customers. By mobile, I mean distributed and mobile presence delivered through mobile devices, sensors, and connected “things”. The universe of these connected things is expanding at an alarming pace, providing an unprecedented fabric of sensory awareness that can produce more insights than anything we’ve seen before. But we’ve had sensors for years, so what has changed is the second trend – big data. Big data gives us the ability to glean insights out of this deluge of information, so that we can actually do things based upon it – make decisions, improve interactions, reduce inefficiencies, react to business stimuli. The third trend, social, provides new channels for interaction, but also adds more context to what we know about customers and employees and products, and helps to connect this big data together in new ways. These insights can be transformed into actions at the point of impact by feeding back to those mobile devices. The fourth trend, Cloud, provides affordable computing power on demand to enable all of this. When everybody has a supercomputer’s worth of computing power at their fingertips, all of this becomes possible.

That’s what we’re talking about at IBM Impact in Las Vegas this week. Join in the conversation by following #ibmimpact on Twitter. See you there.


The ancient art of API Management

I had a good discussion today with a company that has been talking to one of our competitors about API Management. A member of their team asked me what advantages IBM’s offering has over competitors, particularly since the API Management offering was announced only a few weeks ago.

I think it is a great question, so I want to address it here for everyone’s benefit. I think it really comes down to four key points:

  1. In actuality, IBM has been helping organizations publish services and APIs securely using DataPower for years. In fact, organizations like Pitney Bowes and Royal Caribbean just spoke at the Impact conference in Las Vegas about the success they are having in this area. What is new within the portfolio is the developer portal, API assembly, and business API Insight. The harder parts of API management (security, policy management, traffic management, etc.) have been available in DataPower for a long time.
  2. That said, there are some strong technological advantages in the IBM offering. First of all, the DataPower appliance at the core of the offering is by far the market leading security gateway. DataPower has roughly ten times the customer base of the nearest competitor, and is growing faster. The most security conscious organizations in the world use DataPower to protect the services and APIs that they publish externally. Within the API Management solution, IBM also has a unique ability to easily assemble and publish new APIs through a simple configuration interface. This allows organizations to take internal resources and publish them as secure APIs in a few clicks. Also, the solution employs big data analytics to provide a higher level of insight into API consumption. And perhaps most importantly, IBM brings a level of scale to API Management that none of the other vendors can match. By tapping into IBM developerWorks, IBM can offer access to millions of developers around the world.
  3. Beyond this, IBM’s vision in this space is much broader. Publishing business APIs requires the same level of infrastructure and rigor as other applications, and shares much of its technology base with initiatives like mobile computing. IBM’s portfolio is particularly well suited to the requirements this creates. Technologies where IBM already has a market leadership position, like service registry for lifecycle management, in-memory caching, mobile application support, and even network traffic shaping, all fit into this vision in the long term, and can quickly provide capabilities well beyond what smaller vendors are able to offer with limited development teams and budgets.
  4. IBM’s development team is several times larger than other API Management competitors, and the network of field experts within IBM and its partners is also many times larger. IBM’s reach around the globe is far beyond what a small startup can offer. And layered on top of that IBM has extensive expertise across industries and technology platforms that exceeds the capacity of smaller companies. If your organization views API Management as a critical strategy, as IBM does, then risk mitigation and scale should be top concerns.

None of this is to say that the other API Management vendors lack good technology. I actually like several of the players in this space, and in fact I have good friends who work at a couple of them. I just think that API Management is at an early phase of its lifecycle, so choosing a vendor that understands the challenges and will also be there for the long haul is extremely important.


New API Management Offering from IBM – Give it a try!

To steal the words of Alex Williams, we “quietly launched” a new API Management service today. I’m very excited about this new offering. It is visually elegant, and provides some fantastic capabilities covering all aspects of API management. It offers the ability to easily socialize APIs through a developer portal and manage API consumption, including some really nice use of advanced big data analytics. Unlike other API Management offerings in the market, it also allows customers to easily create new APIs, using Cast Iron’s intuitive integration capabilities under the covers.

This offering allows our customers to combine on-premise DataPower appliances for security and policy enforcement, including two- and three-legged OAuth and traffic management, with cloud-hosted developer outreach and support, API analytics, operational monitoring and control, and life cycle management.

I’m convinced it is the most complete API Management offering in the market: it combines the market-leading security gateway functionality of DataPower with these new cloud-based visual analytics, developer outreach, and life cycle management capabilities, along with the embedded API creation and integration capabilities of Cast Iron. It is truly everything an organization needs to get started in publishing enterprise-class APIs.

And best of all, you can start using it for free! See for yourself:
http://webapi.castiron.com/


The API Explosion

For years, I have been tracking the trend of the “information explosion” (as evidenced by the title of this blog). The basic premise of this is that the amount and availability of information is growing at an unprecedented rate as we digitize, collect, and create information from more and more sources. The “explosion” represents both an opportunity and a threat to organizations, as they must learn to understand, control, and ultimately make use of it to drive new opportunities. This trend has been well-documented and has driven a large amount of investment by organizations of all sizes.

However, I think a more interesting emerging trend may be the API Explosion. Data, without a way of organizing it and accessing it, does not have much value. However, when information can be made available in context, it suddenly multiplies in value. This is where APIs come in. APIs provide a self-describing way to get to information – offering a programmatic gateway to specific slices of information that abstracts the complexity of where it might actually reside. APIs also provide the advantage of acting as a controlled gatekeeper for the information (presuming they are deployed this way) that can limit who can see what, as well as acting as a quality of service governor that can ensure that the things accessing the data are able to rely on a predictable response.
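
As an illustration of the gatekeeper and quality-of-service roles described above, here is a small Flask sketch that fronts a data source with an API: unknown callers are rejected, each tier has a request quota, and the response is shaped by who is asking. The keys, limits, and data are invented for the example.

```python
# An illustrative API acting as gatekeeper and QoS governor in front of a data
# source, using Flask. Keys, limits, and data are invented for the example.
from flask import Flask, request, jsonify, abort

app = Flask(__name__)

API_KEYS = {"key-gold": "gold", "key-basic": "basic"}   # who can see what
RATE_LIMITS = {"gold": 1000, "basic": 100}              # requests per day
usage = {}

DATA = {"public": {"headline": "42 widgets sold"},
        "sensitive": {"margin_pct": 17.5}}

@app.route("/report")
def report():
    tier = API_KEYS.get(request.headers.get("X-API-Key", ""))
    if tier is None:
        abort(401)                         # gatekeeper: unknown caller
    usage[tier] = usage.get(tier, 0) + 1
    if usage[tier] > RATE_LIMITS[tier]:
        abort(429)                         # QoS: protect the backend
    body = dict(DATA["public"])
    if tier == "gold":
        body.update(DATA["sensitive"])     # limit who can see what
    return jsonify(body)

if __name__ == "__main__":
    app.run(port=8080)
```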

When you think about it, APIs are largely the drivers of some of the upstart darlings of the high tech world. What are Google, Facebook, and LinkedIn but a set of APIs and UIs sitting on a complex foundation of distributed information? In addition, much of the investment that has been attributed to the information explosion – including much of the “Big Data” phenomenon – has actually been focused on providing APIs that sit on top of non-traditional data sources, so that organizations can actually make use of what is valuable inside that data.

[Figure: APIs Published per Day on ProgrammableWeb]

To illustrate my point, I created this quick Many Eyes graph of the number of APIs published on ProgrammableWeb per day over the past 6 years (I would have preferred to publish this as a mashup, since ProgrammableWeb has an API, but frustratingly Many Eyes doesn’t). As you can see, the recent growth is extraordinary. This is only one data point, and it is clearly biased toward technology companies rather than traditional businesses, but it is hard to argue against this trend.

So yes, there is an information explosion, but it is the proliferation of APIs making that information accessible that is driving the change in how organizations operate.
