In the grand scale of Internet of Things (IoT) and Machine-to-Machine (M2M) communication, it is tempting to use HTTP and specifically HTTPS as a communication protocol between devices and servers. The most prominent reasons for choosing to use the HTTPS protocol are:
A device can use a small HTTP client library together with a small SSL stack, which makes it convenient for device programmers to design an HTTPS based IoT protocol that can communicate with any type of backend application/web server. Although a web server can handle HTTPS requests, it typically cannot handle the business logic required for managing the data sent to and from the connected devices, so in most cases the backend infrastructure requires an application server.
You may think that sending a simple HTTPS message from the device and receiving a simple HTTPS response from the server does not impose much protocol overhead. However, what appears to be a single request and a single response is actually a long sequence of messages exchanged between the client and the server. Let's take a look at what goes on behind the scenes in a typical HTTPS client request and server response:
As you can see from Figure 1, what appears from a higher level perspective to be a single HTTPS request/response is actually a set of several messages sent on the wire. This poses a problem in HTTPS based systems that need to send data from the server to the device client, since the client must then poll the server for updates. The more frequent the polls, the greater the load on the server. As an example, assume we design a remote control system for opening and closing garage doors.
A simple web based phone app, connected to the online server, lets subscribers remotely manage their own garage door. Say you forget to close your garage door. You open your phone app (a client) and click the close door button. The message is immediately sent to the online server. The server uses the phone app's user credentials to find the corresponding garage door IoT device. The server cannot immediately send the close door control message to the garage door IoT device client. Instead, it saves the control command it received from the phone and waits for the next poll request from the garage door IoT device client. The close door command is then sent as the response to that poll request.
How long are you prepared to wait for the garage door to close after you press the close button? Probably not long, and certainly no longer than one minute. This means the device must poll the server at least once every minute.
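The polling flow described above can be sketched in a few lines. This is a minimal illustration, not code from the IoT demos: the class name `PollingMailbox`, its methods, and the helper `delivery_delay` are hypothetical names, and the sketch assumes the device polls at exact multiples of the poll interval.

```python
import math
from collections import defaultdict, deque


class PollingMailbox:
    """Holds commands on the server until the target device polls for them."""

    def __init__(self):
        # One FIFO queue of pending commands per device id.
        self._pending = defaultdict(deque)

    def post_command(self, device_id, command):
        """Called when the phone app sends a command; it can only be queued."""
        self._pending[device_id].append(command)

    def poll(self, device_id):
        """Called when the device polls; returns the oldest command or None."""
        queue = self._pending[device_id]
        return queue.popleft() if queue else None


def delivery_delay(command_time, poll_interval):
    """Seconds a command waits, assuming polls at 0, interval, 2*interval, ..."""
    next_poll = math.ceil(command_time / poll_interval) * poll_interval
    return next_poll - command_time


mailbox = PollingMailbox()
mailbox.post_command("door-1", "close")   # phone app presses the button
print(mailbox.poll("door-1"))             # delivered only on the next poll
print(delivery_delay(1, 60))              # command sent 1 s after a poll
```

A command that arrives just after a poll waits almost the full interval, which is why the acceptable delay dictates the poll frequency.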
Now, say that the online server manages 250,000 garage doors, each polling the server for data every minute. The server must then handle 250,000 HTTPS request/response pairs every minute, or roughly 4 request/response pairs every millisecond. This obviously creates a huge network load, but the big question is whether the server can handle that many connection requests. Recall that each HTTPS request/response pair involves, at a bare minimum, 9 messages sent on the wire. This means the server must be able to handle roughly 36 low level messages every millisecond. In addition, computationally expensive asymmetric key cryptography places a heavy load on the server, because each HTTPS handshake requires the server to perform an asymmetric key operation for that particular connection.
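The arithmetic behind these figures is easy to reproduce. The function name `polling_load` below is a hypothetical helper; the inputs are the numbers used in the text.

```python
def polling_load(devices, poll_interval_seconds, messages_per_exchange):
    """Return (exchanges per millisecond, wire messages per millisecond)."""
    interval_ms = poll_interval_seconds * 1000
    exchanges_per_ms = devices / interval_ms
    return exchanges_per_ms, exchanges_per_ms * messages_per_exchange


# 250,000 devices, one poll per minute, at least 9 wire messages per exchange.
exchanges, messages = polling_load(250_000, 60, 9)
print(f"{exchanges:.2f} request/response pairs per ms")
print(f"{messages:.1f} wire messages per ms")
```

The unrounded results are slightly higher than the rounded figures of 4 and 36 used in the text, and halving the poll interval doubles both.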
The obvious solution to the problem is to use persistent connections. A high end TCP/IP stack easily handles 250K, 500K, or even more persistent socket connections. The question is whether your web/application server solution can handle as many connections, and how much memory the server uses when it does. For this reason, selecting a good web/application server solution is very important. We will discuss this in greater detail later.
A standard TCP/IP socket connection is by definition persistent, and it can carry messages in both directions at the same time. However, creating a custom listen server may not be supported by your web/application server infrastructure/API, and a custom non-HTTPS protocol can prevent the client from getting through firewalls and proxies. What we need is a protocol that starts out as HTTPS and then morphs into a secure persistent socket connection, keeping all the benefits of HTTPS.
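To illustrate why a persistent, bi-directional connection removes the need for polling, here is a minimal sketch. It is unencrypted and uses `socket.socketpair()` as a stand-in for an accepted TCP connection; the `DeviceRegistry` class and the newline-terminated message format are assumptions made for this example, not part of any product API.

```python
import socket


class DeviceRegistry:
    """Maps device ids to open sockets so the server can push at any time."""

    def __init__(self):
        self._connections = {}

    def register(self, device_id, conn):
        self._connections[device_id] = conn

    def push(self, device_id, command):
        """Send a command at once; returns False if the device is offline."""
        conn = self._connections.get(device_id)
        if conn is None:
            return False
        conn.sendall(command.encode("utf-8") + b"\n")
        return True


# A connected socket pair stands in for the server and device endpoints.
server_side, device_side = socket.socketpair()
registry = DeviceRegistry()
registry.register("door-1", server_side)

registry.push("door-1", "close")          # server -> device, no poll needed
device_side.sendall(b"status:closing\n")  # device -> server, same connection

print(device_side.recv(1024))
print(server_side.recv(1024))
```

The server only needs to keep one open socket per device; the cost shifts from repeated handshakes to the memory held per connection.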
The WebSocket protocol defined in RFC 6455 specifies how a standard HTTP request/response pair can be upgraded to a persistent full duplex connection. When using SSL, it lets you morph an HTTPS connection into a secure persistent socket connection. WebSocket based applications enable real-time communication just like a regular socket connection. What makes WebSocket unique is that it inherits all of the benefits of HTTPS, since it starts out as an HTTPS request/response pair. This means that you can pass through firewalls/proxies, communicate over SSL, and easily provide authentication.
Our IoT demos use a simplified version of the WebSocket protocol. To understand how this works, we need to look at how the HTTPS upgrade sequence works for both protocols. Both the WebSocket protocol and the simplified WebSocket protocol upgrade an HTTPS connection to a secure, persistent, asynchronous, and bi-directional connection. Figure 3 below shows the HTTP request headers and HTTP response headers used by WebSocket protocol and the simplified WebSocket protocol used in the IoT demos.
| |WebSocket Protocol (RFC 6455)|Protocol Used in IoT Demos|
|---|---|---|
|Client HTTPS Request|GET /my-server-service HTTP/1.1<br>Host: server.example.com<br>Upgrade: websocket<br>Connection: Upgrade<br>Sec-WebSocket-Key: key-data<br>Sec-WebSocket-Version: 13<br>Origin: http://example.com|GET /my-server-service HTTP/1.1<br>Host: server.example.com|
|Server HTTPS Response|HTTP/1.1 101 Switching Protocols<br>Upgrade: websocket<br>Connection: Upgrade<br>Sec-WebSocket-Accept: hash of key-data|The server side logic skips this part and goes directly to the next step, thus an HTTPS client library is not required by the device.|
|HTTPS switches to socket connection|Secure Bi-Directional Connection|Secure Bi-Directional Connection|
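For the standard WebSocket column, the key exchange can be sketched as follows. RFC 6455 defines the `Sec-WebSocket-Accept` value as the base64-encoded SHA-1 hash of the client key concatenated with a fixed GUID, and the request line on the wire reads `HTTP/1.1` even when the exchange is carried over SSL. The function names are hypothetical, and the sketch only builds and checks the header values; it does not open a connection.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_client_key():
    """Return a fresh Sec-WebSocket-Key: 16 random bytes, base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def expected_accept(client_key):
    """Return the Sec-WebSocket-Accept value the server must answer with."""
    data = (client_key + WEBSOCKET_GUID).encode("ascii")
    return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii")


def build_upgrade_request(host, path, client_key):
    """Build the client half of the opening handshake as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {client_key}",
        "Sec-WebSocket-Version: 13",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


key = make_client_key()
print(build_upgrade_request("server.example.com", "/my-server-service", key).decode())
print("Server must reply with Sec-WebSocket-Accept:", expected_accept(key))
```

A client compares the server's `Sec-WebSocket-Accept` header against `expected_accept(key)` and aborts the connection if they differ. The simplified protocol in the right-hand column skips this exchange entirely.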
After upgrading, the WebSocket protocol switches to a frame based protocol. The simplified WebSocket IoT protocol functions like a standard socket connection, except that it is secure, i.e. it uses SSL. A distinctive feature of the IoT protocol is that it also behaves as a frame based protocol, but without the frame header that the standard WebSocket protocol adds. This works because the most common symmetric encryption algorithms, such as AES, are block based, so data is read from the socket stream in chunks of the same size as those sent by the peer. SharkSSL manages the block/packet reading. The benefit of a frame based protocol is that it simplifies reading control messages from the wire. For this reason, both the WebSocket protocol and the simplified WebSocket protocol used in the IoT demos behave identically when reading data from the socket stream with block based symmetric encryption.
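To make concrete what the standard WebSocket frame header contains, the sketch below encodes a single masked text frame as RFC 6455 requires of clients. The function name `encode_text_frame` is hypothetical. The mask key is fixed here only so the output matches the example in the RFC; a real client must pick a new random key for every frame.

```python
import struct


def encode_text_frame(text, mask_key):
    """Encode one final, masked text frame as sent by a WebSocket client."""
    if len(mask_key) != 4:
        raise ValueError("mask_key must be exactly 4 bytes")
    payload = text.encode("utf-8")
    header = bytearray([0x80 | 0x1])          # FIN bit set, opcode 0x1 (text)
    length = len(payload)
    if length < 126:
        header.append(0x80 | length)          # MASK bit + 7-bit length
    elif length < 65536:
        header.append(0x80 | 126)             # 126 signals a 16-bit length
        header += struct.pack("!H", length)
    else:
        header.append(0x80 | 127)             # 127 signals a 64-bit length
        header += struct.pack("!Q", length)
    header += mask_key
    # Each payload byte is XORed with the mask key, cycling through its 4 bytes.
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return bytes(header) + masked


frame = encode_text_frame("Hello", bytes([0x37, 0xFA, 0x21, 0x3D]))
print(frame.hex())
```

For a short message, the header and mask key add 6 bytes plus a masking pass over the payload. That is the per-message work the simplified IoT protocol avoids by relying on the secure layer to deliver data in the chunks in which it was sent.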
As we mentioned above, SharkSSL includes a WebSocket client library. You can use this library as a foundation for designing WebSocket based IoT solutions, or you can use the simplified WebSocket library found in our IoT demos. The decision is entirely yours. The benefit of the simplified WebSocket library is that it requires less code and less processing in the device. The source code for the IoT demos is also included in the SharkSSL delivery.
So far, we have covered how to upgrade an HTTPS request/response pair to a secure, persistent, asynchronous, and bi-directional connection on the device side, i.e. the client side. The device clients require that you have a backend server infrastructure that can handle all the connected clients. When using a high number of persistent connections, it is important to select an application server backend infrastructure that can handle a high number of concurrent connections while using little memory and processing overhead per connected client.
Our Barracuda Application Server, when compiled for Linux, handles a virtually unlimited number of persistent connections and is thus an ideal back-end application server for device solutions based on the IoT protocol. The Mako Server, a derivative of the Barracuda Application Server, is a standalone application server product. The high end version of the Mako Server allows for a virtually unlimited number of connected device clients. See the Mako Server for details.
The following video shows how to use the Mako Server for setting up a secure IoT solution.