QuickChat and WebRTC
November 12, 2013
I’ve been working on a personal project called QuickChat for the past several weeks. A large part of the project uses WebRTC. It’s been a lot of fun and I’ve learned a lot during this process.
I built QuickChat to scratch my own itch. Part of my responsibility as a lead programmer is to screen prospective programmers before inviting them for an on-site interview. Usually, this screening process involves a conversation using one of the following tools: Skype, Google Hangouts, or a plain old telephone.
Over time, these tools became increasingly frustrating to use for the following reasons:
- Lock-in: Both Skype and Hangouts are in the business of acquiring users and locking them into their respective walled gardens. They require all parties to have an account with their service.
- Significant requirements and coordination overhead: Skype requires download and installation of a native client. Once logged in to Skype, I still have to add the person I want to talk to as a "contact" so I can see their online presence on the network, and only then can I start the call. Hangouts requires opting in to Google+ and installing a custom browser plugin.
- Poor call quality: Skype is frequently the worst offender here.
Skype and Hangouts also have no built-in scheduling feature and require a separate tool to set up a future call.
What I really wanted was a tool that lets me send someone a URL so that, when they click it, they are taken to a web page where I can start talking to them right away. After reading a bit about WebRTC, it seemed like the perfect technology for building the tool I had in mind.
WebRTC is a current W3C draft allowing web browsers to create peer-to-peer connections. With WebRTC’s high-level APIs, developers have access to:
- Media streams using bandwidth adaptive and variable bitrate video/audio codecs
- Reliable and unreliable data channels (e.g., sharing files between browsers without intermediate servers)
- Secure transports (using existing SRTP and DTLS standards)
- A protocol agnostic signaling method for setting up peer-to-peer connections
These APIs allow developers to create low-friction, secure and high-quality communication tools. If you are building tools for the web, WebRTC is worth your attention.
The QuickChat Stack
To jumpstart the building of QuickChat, I started with static HTML chat pages and the WebRTC media APIs. This enabled me to focus on:
- Writing the client code to start/stop audio and video streams
- Designing video chat and permissions flows (NOTE: browsers ask for microphone and camera permissions differently)
- Designing video and chat window layout
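For readers who have not used the media APIs, here is a minimal sketch of starting and stopping local audio and video. It is not QuickChat's code. It assumes the promise-based `navigator.mediaDevices.getUserMedia` API found in current browsers, and the names `startLocalMedia`, `stopLocalMedia`, and the `...Like` interfaces are hypothetical helpers introduced only so the example also compiles outside a browser.

```typescript
// Minimal structural types so the sketch compiles outside a browser.
// In a browser, navigator.mediaDevices and MediaStream have these shapes.
interface TrackLike {
  kind: string;
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<StreamLike>;
}

// Ask for microphone and camera access. The browser shows its own permission
// prompt, and the promise rejects if the user declines.
async function startLocalMedia(devices: MediaDevicesLike): Promise<StreamLike> {
  return devices.getUserMedia({ audio: true, video: true });
}

// Stop every track, which releases the camera and microphone.
// Returns the number of tracks that were stopped.
function stopLocalMedia(stream: StreamLike): number {
  const tracks = stream.getTracks();
  tracks.forEach((track) => track.stop());
  return tracks.length;
}
```

In a browser, the call would look like `startLocalMedia(navigator.mediaDevices)`, with the resulting stream assigned to a video element's `srcObject`.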
After a couple of days, I had a simple and reasonably functional video chat interface. Next, I worked on building a web app to provide:
- User management and chat management with email invites and a basic scheduling feature
- WebSocket-based text chat (also used for WebRTC signaling)
- WebRTC-based direct file transfer (using WebRTC data channels)
- Subscription and payments handling (via Stripe)
One interesting bit came up while building the direct file transfer feature. Until Chrome v31, only unreliable data channels were available, so you could lose data when sending files. To work around this, I added transfer reliability at the application level. This took some effort to get working, and the overhead resulted in a significant performance hit. I removed this code once Chrome v31 shipped. Currently, WebRTC data channel connections between Firefox and Chrome do not work, but I expect this to change in the near future.
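To give a sense of what application-level reliability over an unreliable channel involves, here is a minimal sketch. It is not the code that QuickChat used (that code was removed), just one common approach: number each chunk, acknowledge each chunk on receipt, and resend whatever has not been acknowledged. All names (`FileChunk`, `ChunkSender`, `ChunkReceiver`, `splitIntoChunks`) are hypothetical, and the channel is modeled as plain callbacks rather than a real data channel.

```typescript
// One numbered piece of a file. Payloads are strings here for brevity;
// a real transfer would more likely carry binary slices.
interface FileChunk {
  seq: number;   // position of this chunk in the file
  total: number; // total number of chunks in the file
  payload: string;
}

function splitIntoChunks(data: string, chunkSize: number): FileChunk[] {
  const total = Math.ceil(data.length / chunkSize);
  const chunks: FileChunk[] = [];
  for (let seq = 0; seq < total; seq++) {
    chunks.push({ seq, total, payload: data.slice(seq * chunkSize, (seq + 1) * chunkSize) });
  }
  return chunks;
}

// Sender side: keep every chunk until the receiver acknowledges it.
class ChunkSender {
  private unacked = new Map<number, FileChunk>();
  private sendChunk: (chunk: FileChunk) => void;

  constructor(chunks: FileChunk[], sendChunk: (chunk: FileChunk) => void) {
    this.sendChunk = sendChunk;
    for (const chunk of chunks) this.unacked.set(chunk.seq, chunk);
  }

  // Send (or resend) everything not yet acknowledged. Call once to start,
  // then again on a timer until done() returns true.
  flush(): void {
    this.unacked.forEach((chunk) => this.sendChunk(chunk));
  }

  handleAck(seq: number): void {
    this.unacked.delete(seq);
  }

  done(): boolean {
    return this.unacked.size === 0;
  }
}

// Receiver side: store chunks by sequence number and acknowledge each one.
class ChunkReceiver {
  private received = new Map<number, string>();
  private total = -1;
  private sendAck: (seq: number) => void;

  constructor(sendAck: (seq: number) => void) {
    this.sendAck = sendAck;
  }

  handleChunk(chunk: FileChunk): void {
    this.total = chunk.total;
    this.received.set(chunk.seq, chunk.payload); // duplicates simply overwrite
    this.sendAck(chunk.seq); // re-ack duplicates in case the first ack was lost
  }

  // Returns the reassembled data, or null while chunks are still missing.
  assemble(): string | null {
    if (this.total < 0) return null;
    let out = "";
    for (let seq = 0; seq < this.total; seq++) {
      const part = this.received.get(seq);
      if (part === undefined) return null;
      out += part;
    }
    return out;
  }
}
```

The repeated resends and per-chunk acknowledgements hint at where the overhead comes from: every lost chunk or lost ack costs at least one extra round trip.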
WebRTC signaling defines how peers establish a direct connection to each other. WebRTC does not define the signaling transport itself, so anything can be used to pass session descriptions (e.g., email). For web apps, though, WebSockets are the fastest and most convenient way to do this. So, the QuickChat chat servers act as a "signaling server", allowing users to identify each other and pass opaque bits of data, called session descriptions, back and forth. Session descriptions contain media capabilities (microphone presence, video camera resolution, supported codecs, etc.) and information about how to connect directly to a remote peer.
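As an illustration of the relay role described above, here is a minimal sketch of a signaling room that forwards messages between participants without inspecting them. The message envelope and the `SignalingRoom` class are assumptions made for this example, not QuickChat's actual protocol, and the WebSocket transport is abstracted into a `deliver` callback so the logic stands on its own.

```typescript
// Assumed envelope for signaling traffic. The payload is opaque to the
// server: it forwards session descriptions and candidates untouched.
interface SignalEnvelope {
  kind: "offer" | "answer" | "candidate";
  from: string;
  payload: string; // e.g. a JSON-serialized session description
}

type DeliverFn = (message: SignalEnvelope) => void;

// One chat room: relays each message to every other participant.
class SignalingRoom {
  private participants = new Map<string, DeliverFn>();

  // `deliver` would typically wrap socket.send(JSON.stringify(message)).
  join(peerId: string, deliver: DeliverFn): void {
    this.participants.set(peerId, deliver);
  }

  leave(peerId: string): void {
    this.participants.delete(peerId);
  }

  // Forward a message to everyone except its sender.
  // Returns how many participants received it.
  relay(message: SignalEnvelope): number {
    let delivered = 0;
    this.participants.forEach((deliver, peerId) => {
      if (peerId !== message.from) {
        deliver(message);
        delivered += 1;
      }
    });
    return delivered;
  }
}
```

Because the server never parses the payload, the same relay works for offers, answers, and connection candidates alike.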
In some cases, because of network topology, peers may be unable to establish a direct network connection to each other. When that happens, a relay server (called a TURN server) must be used. This is not as fast as a direct connection, but all WebRTC capabilities are still supported.
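In practice, the relay is declared in the configuration passed to the peer connection. The sketch below builds such a configuration under a few assumptions: the field names follow the standard `RTCPeerConnection` configuration dictionary (`iceServers`, `iceTransportPolicy`), a STUN server is listed alongside the TURN server (STUN is not discussed above, but it is what usually makes the direct path possible), and `buildPeerConfig`, `TurnAccount`, and any `example.org` hosts are placeholders.

```typescript
// Shapes modeled on the RTCPeerConnection configuration dictionary.
interface IceServerEntry {
  urls: string;
  username?: string;
  credential?: string;
}
interface PeerConfigSketch {
  iceServers: IceServerEntry[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical holder for TURN credentials.
interface TurnAccount {
  url: string; // e.g. "turn:turn.example.org:3478" (placeholder host)
  username: string;
  credential: string;
}

// List a STUN server (used to discover a public address for direct
// connections) and a TURN server (the relay fallback). With forceRelay set,
// only relayed candidates are used, which is handy for testing the TURN path.
function buildPeerConfig(
  stunUrl: string,
  turn: TurnAccount,
  forceRelay: boolean = false
): PeerConfigSketch {
  return {
    iceServers: [
      { urls: stunUrl },
      { urls: turn.url, username: turn.username, credential: turn.credential },
    ],
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}
```

In a browser, the result would be passed straight to the constructor, as in `new RTCPeerConnection(buildPeerConfig(...))`.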
Signaling was the most complicated part of getting a WebRTC connection working. It requires strict coordination of both the local and remote clients, and the tooling to trace signaling state during this process is immature. I created a simple wrapper similar to AndYet’s SimpleWebRTC library.
Here’s a simplified view of the WebRTC signaling workflow:
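One way to read that workflow in code is the offer/answer exchange sketched below. It is a simplified illustration rather than QuickChat's wrapper: the `NegotiatingPeer` interface is modeled on the `RTCPeerConnection` methods of the same names, the `send` callback stands in for the signaling channel, and the helper names (`makeOffer`, `answerOffer`, `acceptAnswer`) are hypothetical. The exchange of ICE candidates, which happens alongside these steps, is left out.

```typescript
interface SessionDesc {
  type: "offer" | "answer";
  sdp: string;
}

// The subset of peer connection methods used during negotiation.
interface NegotiatingPeer {
  createOffer(): Promise<SessionDesc>;
  createAnswer(): Promise<SessionDesc>;
  setLocalDescription(desc: SessionDesc): Promise<void>;
  setRemoteDescription(desc: SessionDesc): Promise<void>;
}

type SendFn = (desc: SessionDesc) => void;

// Step 1 (caller): create an offer, apply it locally, send it to the callee.
async function makeOffer(caller: NegotiatingPeer, send: SendFn): Promise<void> {
  const offer = await caller.createOffer();
  await caller.setLocalDescription(offer);
  send(offer);
}

// Step 2 (callee): apply the remote offer, create an answer,
// apply it locally, and send it back to the caller.
async function answerOffer(
  callee: NegotiatingPeer,
  offer: SessionDesc,
  send: SendFn
): Promise<void> {
  await callee.setRemoteDescription(offer);
  const answer = await callee.createAnswer();
  await callee.setLocalDescription(answer);
  send(answer);
}

// Step 3 (caller): apply the remote answer. Both sides now hold a local
// and a remote description.
async function acceptAnswer(caller: NegotiatingPeer, answer: SessionDesc): Promise<void> {
  await caller.setRemoteDescription(answer);
}
```

The ordering matters: calling these methods out of sequence is a common source of signaling errors, which is part of why the coordination between the two clients has to be strict.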
Final Thoughts and Useful Resources
Overall, I am pretty happy with how QuickChat turned out. I hope to continue hacking on it and add other useful features. If you try it out and have any feedback, let me know here or via the website.
I am also happy with how well WebRTC worked. It’s not perfect, though. Once in a while a video chat will freeze, and there’s not enough information provided to the app developer to figure out what has happened. Reconnecting the peers and restarting the video usually clears it up.
WebRTC is one of the most exciting things to happen to the web in a long time. I am really looking forward to seeing what creative developers build with it.
Here are some useful resources if you are planning on building something on WebRTC:
- Good WebRTC documentation is limited, but Mozilla’s WebRTC Documentation is one of the better and more up-to-date resources.
- Not all web browsers support WebRTC; you can check support on CanIUse’s WebRTC page.
- HTML5Rocks has a number of WebRTC articles including ones about WebRTC basics and data channels.
- WebRTC implementations are under active development; you’ll want to pay attention to Chrome Canary’s and Firefox Nightly’s release notes for WebRTC features and notable bugfixes.
- The Chromium WebRTC Test page is useful for debugging WebRTC features.
- CubeSlam is a pong/breakout type game demonstrating WebRTC, WebGL and other rich web technologies.