Motion jpeg streaming on iOS

30 Mar

The reason I have been slacking on blog posts recently is that I received a Raspberry Pi 2 for Christmas, and it triggered something within me. I have been so absorbed by a project based on it that I haven’t found time for anything else.

Anyway, long story short (the long version will actually come soon in a dedicated post), one of the features I’m developing for my Pi project is the ability to live stream what is captured by the connected camera. Since my outgoing internet connection is quite slow, I decided to go for a motion jpeg (mjpeg) streaming solution over HTTP.

Implementing the server side of the solution was no big deal… I had some “problems” with the client side though (an iOS app).

All I wanted was to be able to play the stream inside a UIView of an iOS app, but apparently there is no easy, standard way to do so. Pretty much all the answers I found while googling my issue suggested using a WebView to load and play the mjpeg source.

It just felt wrong… although simple at first sight, to handle all the visualization aspects (like margins, size, page filling, etc.) I had to write HTML and CSS code to lay it out properly… something like:
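A minimal sketch of the WebView approach (the URL and styling are illustrative, not the ones I actually used):

```swift
import UIKit

// The WebView approach: wrap the mjpeg source in an HTML page just to
// control margins and scaling (the URL here is a placeholder).
let html = "<html><head><style>" +
    "body { margin: 0; padding: 0; background-color: black; }" +
    "img { width: 100%; height: 100%; }" +
    "</style></head><body>" +
    "<img src=\"http://example.com/video.mjpeg\"/>" +
    "</body></html>"

let webView = UIWebView(frame: CGRectMake(0, 0, 320, 240))
webView.loadHTMLString(html, baseURL: nil)
```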

I didn’t like the idea from the very beginning… a WebView just to play a video was overkill in my opinion. I needed a “cleaner” way to view my mjpeg stream… and that is why MjpegStreamingKit was born (check it out on GitHub)!

The idea behind it is quite simple… a UIImageView is in charge of displaying the video’s frames, whereas a view controller (the beating heart of MjpegStreamingKit) retrieves the frames and updates the UIImageView with them.

MjpegStreamingController is the view controller in charge of establishing the connection with the server (using an NSURLSessionDataTask), retrieving the frames, and updating its UIImageView with new UIImages created from the frames’ data.

To understand what comes next, it helps to know how mjpeg over HTTP works… the Wikipedia page is pretty exhaustive:

In response to a GET request for a MJPEG file or stream, the server streams the sequence of JPEG frames over HTTP. A special mime-type content type multipart/x-mixed-replace;boundary=<boundary-name> informs the client to expect several parts (frames) as an answer delimited by <boundary-name>. This boundary name is expressly disclosed within the MIME-type declaration itself. The TCP connection is not closed as long as the client wants to receive new frames and the server wants to provide new frames.

MjpegStreamingController starts a data task to retrieve the data from the server and, being the session’s delegate, conforms to the NSURLSessionDataDelegate protocol to handle “events”… in particular it implements the methods:

  • func URLSession(session: NSURLSession, dataTask: NSURLSessionDataTask, didReceiveResponse response: NSURLResponse, completionHandler: (NSURLSessionResponseDisposition) -> Void)
  • func URLSession(session: NSURLSession, dataTask: NSURLSessionDataTask, didReceiveData data: NSData)

At the beginning I thought that, to recognise the individual frames within the data stream, I had to scan the received data for the “boundaries” and then create a UIImage from the data in between (excluding the header data of course)… I was utterly wrong. It is actually way easier than that… in fact, the iOS documentation for the “didReceiveResponse” method (the first of the two listed above) says:

This method is optional unless you need to support the (relatively obscure) multipart/x-mixed-replace content type. With that content type, the server sends a series of parts, each of which is intended to replace the previous part. The session calls this method at the beginning of each part, and you should then display, discard, or otherwise process the previous part, as appropriate.

Basically the framework is doing all the work, and the only things left to do are gathering the data as it is received and creating a UIImage from it when “didReceiveResponse” is called!

Every time data is received, we append it to the property “receivedData”:
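A sketch of that delegate method, assuming “receivedData” is an optional NSMutableData property of the controller:

```swift
// Each chunk of a frame arrives here; we just accumulate it.
// receivedData is assumed to be an NSMutableData? property that is
// reset every time a new part of the multipart response begins.
func URLSession(session: NSURLSession, dataTask: NSURLSessionDataTask, didReceiveData data: NSData) {
    receivedData?.appendData(data)
}
```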

When “didReceiveResponse” is called, if the property “receivedData” is not empty it means that a frame is ready to be displayed… hence we create a UIImage from the data and assign it to the imageView (dispatching the task to the main queue, since it is the only queue that should interact with the UI). Once the UIImage has been created, we can reinitialize “receivedData” as a new empty NSMutableData, ready to receive the next frame.
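The step above could look like this (a sketch under the same assumption that “receivedData” is an NSMutableData? property and “imageView” is the controller’s UIImageView):

```swift
func URLSession(session: NSURLSession, dataTask: NSURLSessionDataTask, didReceiveResponse response: NSURLResponse, completionHandler: (NSURLSessionResponseDisposition) -> Void) {
    // A new part is starting: whatever we accumulated so far is a complete frame.
    if let data = receivedData, image = UIImage(data: data) where data.length > 0 {
        // Only the main queue should touch the UI.
        dispatch_async(dispatch_get_main_queue()) {
            self.imageView.image = image
        }
    }
    // Get ready for the next frame and let the session continue.
    receivedData = NSMutableData()
    completionHandler(.Allow)
}
```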

That’s it… nothing else is actually necessary to play an mjpeg stream. One last thing to keep in mind: if the data source is a camera and not a recorded video, the stream itself will never end… hence, to stop the transfer, the data task has to be explicitly cancelled:
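Something along these lines, assuming the controller keeps a reference to its task in a “dataTask” property:

```swift
// Stopping the stream: since the server would keep sending frames
// forever, the only way out is cancelling the task ourselves.
func stop() {
    dataTask?.cancel()
}
```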


Of course just the basic functionality (playing a video) is rarely enough… for a better user experience I provided MjpegStreamingController with two properties that can be set to perform actions at the beginning and at the end of the loading time:

  • var didStartLoading: (()->Void)?: this closure is called right after the play() method has been invoked and should be used to set up whatever should happen while the stream is loading (like displaying an activity indicator and starting its animation)
  • var didFinishLoading: (()->Void)?: this closure is called right before displaying the first frame of the video stream and should be used to undo what was done by didStartLoading (like stopping the animation of the activity indicator and hiding it)

These two properties are completely optional, hence you are free to ignore them.
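A possible usage, with an activity indicator shown while the stream loads (how the stream URL is passed to the controller — here via a hypothetical “contentURL” property — may differ from the actual MjpegStreamingKit API):

```swift
// streamingController wraps a UIImageView; activityIndicator is a
// UIActivityIndicatorView laid out on top of it.
streamingController.didStartLoading = {
    self.activityIndicator.hidden = false
    self.activityIndicator.startAnimating()
}
streamingController.didFinishLoading = {
    self.activityIndicator.stopAnimating()
    self.activityIndicator.hidden = true
}
streamingController.contentURL = NSURL(string: "http://example.com/video.mjpeg")
streamingController.play()
```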

In the example above, an activity indicator is displayed when the connection starts and disappears the moment the first frame is shown (to give proper feedback to the user).

Another aspect MjpegStreamingController can help with is authentication with the server… if an authentication challenge is received when attempting to connect, MjpegStreamingController will try to handle it in different ways, depending on the authentication type and on whether the authenticationHandler is set:

  • authentication type is NSURLAuthenticationMethodServerTrust: this usually happens when the URL uses https instead of http; this case is handled automatically and no action is required
  • any other authentication type: in this case it checks whether the closure authenticationHandler is set… if it is, it calls it, passing in the authentication challenge, and it is then up to the closure to provide the NSURLSessionAuthChallengeDisposition and NSURLCredential needed to continue the authentication process; if it is not set, MjpegStreamingController falls back on the default behaviour of an NSURLSession in case of authentication challenges

Here is an example of how to implement a custom authentication handler:
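A sketch for HTTP Basic Auth; I’m assuming here that the handler returns the disposition/credential pair described above, and the credentials are placeholders:

```swift
// Inspect the challenge and answer with a disposition plus credential.
streamingController.authenticationHandler = { challenge in
    if challenge.protectionSpace.authenticationMethod == NSURLAuthenticationMethodHTTPBasic {
        let credential = NSURLCredential(user: "username", password: "password", persistence: .ForSession)
        return (.UseCredential, credential)
    }
    // Anything else: let NSURLSession apply its default behaviour.
    return (.PerformDefaultHandling, nil)
}
```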

A simple way to deal with HTTP Basic Auth without having to provide an authentication handler is to put the credentials directly inside the URL, as follows:
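For example (sample credentials and host):

```swift
// HTTP Basic Auth credentials embedded in the URL itself.
let url = NSURL(string: "http://username:password@example.com/video.mjpeg")
```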

There is nothing else to say; I hope this post was helpful and that MjpegStreamingKit will help you build great apps… remember to check it out on GitHub!