Building a cutting-edge WebRTC Video conferencing app in one hour!

Building a WebRTC-powered video conferencing app is really cool, and one can use it for multiple purposes, including showing off to friends! The real challenge lies in understanding the complexities involved in WebRTC and then dealing with them successfully to build an app. It is not necessary to understand all of those complexities from the beginning, though; one can learn the concepts gradually while working with WebRTC. Having worked with WebRTC for a couple of years now, we thought of coming up with something to help others build cutting-edge video conferencing apps from scratch within an hour. A couple of days ago we finally built an npm module for that purpose. This blog post provides a practical walkthrough on how to use that npm module to build a cutting-edge video conferencing app from scratch, or to integrate it into an existing app, within an hour.

The npm module is a pre-baked React UI component consisting of 2 pages, one for the pre-call audio / video check and another for the video conferencing room itself. As it is a React component, one can easily integrate it into any existing React project to give the app cutting-edge video conferencing capabilities. We are going to create a simple React app from scratch using create-react-app for the sake of simplicity and ease of implementation, but feel free to use the same code in a more complex, existing React project. In order to follow this post, one needs a basic understanding of JavaScript and React. Let’s begin!

If you are somebody who likes to do a git clone first and then go through this post, then here is the github link for you to download the source code and run it in your local system.

Before beginning, do keep in mind that you DON’T need a Credit card OR an API key/secret combo OR even a Signup at Centedge to run this app. You simply need to follow this tutorial and build your app. No strings attached !!

Before starting to build our app, we need to have node and npm installed on our computer to follow this post. If you don’t have them installed yet, please visit this link to download and install nodejs and npm for your OS. If you are on a Linux flavour like Ubuntu, then visit this link.

Once you have node and npm installed on your system, you are ready to create the app. Use the commands below to create a new React app.

mkdir react-demo
cd react-demo
npx create-react-app <name-of-your-app>   # e.g. react-videoconf-demo

The above commands will create a folder called react-demo in the current directory and create a sample React app named react-videoconf-demo (if you have chosen the example name) inside that folder. You can run the default app and view it in the browser by following the steps in the output of the above command or the ones below.

cd <name-of-your-app>
npm start 

After this you should see the below output in your browser, if everything is fine until now. If you are not getting this output or you are facing some issue starting the sample application, you may not have installed node and npm properly! Please reinstall them and start from scratch again by following the above mentioned steps. If you got this output, please proceed ahead.

[Screenshot: the default create-react-app page in the browser]

The sample React app project structure should look like the one shown below. You may not have the yarn.lock file if you are using npm; you will have package-lock.json instead, and this is okay.

[Screenshot: project folder structure of the sample React app]

Now we are ready to make changes to our sample app and start integrating the necessary components to build our cutting-edge video conferencing app.

The default App.js file looks as shown below; we are going to change it soon.

[Screenshot: the default App.js file]

Now replace everything in this file with the code mentioned below.

import React from 'react';
import { BrowserRouter, Route, Switch } from "react-router-dom";
import AdminComponent from './components/AdminComponent';
import UserComponent from './components/UserComponent';
import LandingComponent from './components/LandingComponent';
import './App.css';

function App() {
  return (
    <BrowserRouter>
      <div className="App">
        <Switch>
          <Route path="/" component={LandingComponent} exact />
          <Route path="/admin" component={AdminComponent} exact />
          <Route path="/user" component={UserComponent} exact />
        </Switch>
      </div>
    </BrowserRouter>
  );
}

export default App;

This is what we are trying to achieve with the above code block.

  • Using react-router to load 3 separate components named Landing, admin and user using 3 separate URLs
  • All the components are defined in a folder named components inside src from which we are importing them to use in the App component.
  • We are importing the default css file to load some styles.
  • The JSX inside App maps those 3 components to 3 separate URLs, including 1 default landing URL.
  • You also need to install the React Router npm package, react-router-dom, to run the app. The code above uses Switch and useHistory, which exist in version 5 of the package but were removed in version 6, so install version 5 explicitly using the one-liner below.
npm install react-router-dom@5

Below are the individual components imported in the App.js file.

LandingComponent.jsx

import React from 'react';
import { useHistory } from "react-router-dom";

const LandingComponent = () => {
  const history = useHistory();
  return (
    <div>
      <h1>Hello, My App!</h1>
      <button onClick={() => { history.push('/admin'); }}>Join as Admin</button>
      <button onClick={() => { history.push('/user'); }}>Join as User</button>
    </div>
  );
};

export default LandingComponent;

AdminComponent.jsx

import React from 'react';

const AdminComponent = () => {
  return (
    <div>
      <h1> hello, Admin !</h1>
    </div>
  );
};

export default AdminComponent;

UserComponent.jsx

import React from 'react';

const UserComponent = () => {
  return (
    <div>
      <h1> hello, user !</h1>
    </div>
  );
};

export default UserComponent;

Now, the folder structure should look like the one below.

[Screenshot: folder structure with the components folder added]

If you have followed the tutorial correctly until now, you should get the below output in the landing page when you run the app.

The URL is http://localhost:3000/

[Screenshot: landing page]

This is what you will get when you click the “Join as Admin” button.

The URL is now changed to http://localhost:3000/admin

[Screenshot: admin page]

This is what you will get when you click the “Join as User” button.

The URL is now changed to http://localhost:3000/user

[Screenshot: user page]

If you haven’t got the above mentioned output in your app, please recheck the previous steps. You may have missed something important! If you got the desired output, you are ready to proceed.

Now that the skeleton is ready for building our video conferencing app, we need to install the necessary packages to make it happen. The most important package to make it happen is named cvr-rui. You need to download this package using the below mentioned command.

npm install cvr-rui

Once it is installed, replace the content of AdminComponent with the below code.

import React from 'react';
import ParticipantPage from 'cvr-rui';

const AdminComponent = () => {
  const callEnded = () => {
    console.log('hey, the call has been ended!!');
  };
  const callStarted = () => {
    console.log('hey, the call has been started!!');
  };
  const webVideoConfig = {
    theme: 'light',
    joinButtonColor: 'secondary',
    roomName: 'my-1st-video-room',
    userName: 'admin-user',
    participantType: 'moderator',
    adminApprovalRequiredToJoinRoom: false,
    screenSharing: true,
    chatOption: true,
    callStartFunction: callStarted,
    callEndFunction: callEnded,
  };

  return (
    <div>
      <ParticipantPage config={webVideoConfig} />
    </div>
  );
};

export default AdminComponent;

Also replace the UserComponent code with the below code.

import React from 'react';
import ParticipantPage from 'cvr-rui';

const UserComponent = () => {
  const callEnded = () => {
    console.log('hey, the call has been ended!!');
  };
  const callStarted = () => {
    console.log('hey, the call has been started!!');
  };
  const webVideoConfig = {
    theme: 'dark',
    joinButtonColor: 'primary',
    roomName: 'my-1st-video-room',
    userName: 'regular-user',
    participantType: '',
    screenSharing: true,
    chatOption: true,
    callStartFunction: callStarted,
    callEndFunction: callEnded,
  };

  return (
    <div>
      <ParticipantPage config={webVideoConfig} />
    </div>
  );
};

export default UserComponent;
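
The two components above differ only in a handful of config fields, so the shared part can be factored out. The snippet below is a minimal sketch of that idea, not something the cvr-rui package provides: baseVideoConfig and buildVideoConfig are hypothetical names, and the sketch assumes the package only reads the config keys already shown in this post.

```javascript
// Shared defaults, using only config keys that appear in the components above.
const baseVideoConfig = {
  roomName: 'my-1st-video-room',
  screenSharing: true,
  chatOption: true,
};

// Hypothetical helper: merges the shared defaults with per-role settings.
// Later keys win, so a role can still override a default if needed.
function buildVideoConfig(overrides) {
  return { ...baseVideoConfig, ...overrides };
}

const adminConfig = buildVideoConfig({
  theme: 'light',
  joinButtonColor: 'secondary',
  userName: 'admin-user',
  participantType: 'moderator',
  adminApprovalRequiredToJoinRoom: false,
});

const userConfig = buildVideoConfig({
  theme: 'dark',
  joinButtonColor: 'primary',
  userName: 'regular-user',
  participantType: '',
});
```

Each component would then add its own callStartFunction and callEndFunction as further overrides and hand the result to ParticipantPage through the config prop, exactly as in the components above. Keeping roomName in one place also makes sure both roles land in the same room.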

If you have followed the post properly until now, you will get the result below when you load the AdminComponent by clicking the Join as Admin button on your landing page

[Screenshot: AdminComponent page with the Join Room button]

and the UserComponent by clicking the Join as User button on your landing page.

[Screenshot: UserComponent page with the Join Room button]

Now is the time to click the join room button in both the pages. Once you click the Join Room buttons, the magic happens as shown below.

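When only the admin user has joined the room

[Screenshot: room with only the admin user]

When the regular user has also joined the room

[Screenshot: room after the regular user joined]

Admin view after the regular user joined the room

[Screenshot: admin view with both participants]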

Now we have successfully created a cutting edge video conferencing app with active speaker detection and real time bandwidth monitoring and display enabled by default.

You can also easily integrate it into any existing react project to enable it with cutting edge video conferencing capabilities. Please feel free to drop us an email at hello@centedge.io in case of any issue while trying to follow this post or in case of a bug in the npm package.

The link to the npm package is here.

The link to the Centedge app running a very similar UI is here.

Note

This React UI is running with the help of a small scale video back-end infrastructure sponsored by Centedge along with some generosity from AWS. That’s why you don’t need a Credit card OR an API key / secret combo OR even a Signup at Centedge to run this app! As this is a small scale setup, we don’t advise anybody to build production grade video applications using this npm package, which internally depends on infrastructure from Centedge. The back-end setup can currently handle 6 participants in one room including the moderator, and 5 such rooms simultaneously! Please feel free to play and tinker around with this package. If you want to build something serious using this and need more rooms or more participants in each room, feel free to drop us an email at hello@centedge.io and we will be happy to create a production grade infrastructure setup for you.

Becoming a WebRTC developer

“How to become a WebRTC developer?” is a question many developers ask these days when they want to learn this niche technology. The situation was different just 2 years ago, when WebRTC was the technology of hobbyists and a very limited set of professionals working in video conferencing enterprises. Then COVID happened and the world started to run on video, as strict social distancing measures along with lockdowns were enforced throughout the world. Everything from shopping, learning and health checkups to even marriages and funerals started to happen in an online video conferencing mode. WebRTC is the technology that made it possible to create video applications for all of these unique use cases. The demand for WebRTC developers has skyrocketed since then, whereas there are only a handful of skilled WebRTC developers out there even today. Many enterprises / startups are going through a lot of pain these days to find good WebRTC developers.

As an early adopter and pioneer in the field of WebRTC, we face a very similar situation while trying to hire good WebRTC developers. In order to solve this problem, we will be organising weekly / monthly events for students / experienced professionals, where they will be briefed about the benefits of WebRTC as a technology stack and how it can help them start / restart their careers if they understand and master it.

If you are a student / experienced professional with < 5 years of experience, then this online event is for you. The details are as below.

Topic: How to become a good WebRTC developer

Timing: Saturday evenings

Participants: 50 (first come, first served basis)

Slack: Centedge

Please join the above slack link and then join the #webrtc-developers-den channel to get notified about the exact timing and meeting link to join the session. Here we also will post the learning materials, open source projects, developer/tester requirements from time to time.

PS: At Centedge we are working on a cutting edge virtual event platform which is currently in the testing environment. If you are interested in helping us test the platform thoroughly, then don’t forget to join the #testing-captains slack channel as well.

We are looking forward to hosting you this coming Saturday to help you become a good WebRTC developer. Feel free to ask us any question / doubt in Slack once you join the #webrtc-developers-den channel.

Building a scalable production grade WebRTC video app

They say the customer is always right. But this is not always true in case of building a production grade WebRTC video calling app which also can scale.

Why? Because the customer many times wants to build a world class video app like zoom / google meet within a time period of 3 months with a budget of $$$$ / 1$$$$.

How do they come to such conclusions about time and money?

They came to such conclusions because they read on the internet that WebRTC is free as well as open source, and that one can build an app like Zoom / Google Meet using WebRTC by downloading a random open source package with WebRTC as a keyword from GitHub. They concluded afterwards that all the things needed for an app like gmeet are available, either in WebRTC or in the open source package. They just need to build a new UI layer and a dashboard around WebRTC to challenge Zoom / gmeet!

A notion that WebRTC is free and open source, and that things like gmeet can easily be built using it without much effort, slowly takes shape in the mind of the customer. This notion creates a lot of confusion between companies like us and prospective customers. It takes us time to help prospective customers understand the reality of WebRTC and the effort it takes to build production grade video applications using it.

Once the initial understanding is built that building a live video application takes more than simple WebRTC, another issue arises. This time the issue is about building video rooms large enough to cater to maybe a million users. A million users in one room!

Again we need to put in effort to understand the customer’s thought process by asking the right set of questions. It turns out that the customer currently has a teaching-learning application where one teacher teaches one student at a time. They started by using gmeet for free to conduct such sessions, but later they realized that they need more control along with deep integration of the calling feature into their dashboard. That’s why they are currently looking for an alternative which can provide deep integration with their dashboard while being cost effective. But they anticipate that their product will have explosive growth and will reach a million students within a couple of years. That’s why they want to build a video conferencing application large enough to scale to a million users when that happens, in a couple of years.

Here again we need to put in effort to help our customers understand how a WebRTC application really works. We need to discuss and explain to them the various kinds of WebRTC architectures like p2p, full mesh multiparty, conferencing, live streaming, webinar etc. Though we prefer not to use much WebRTC jargon in the discussion, sometimes it becomes unavoidable when we need to explain things like SFU, MCU, ICE/STUN/TURN, media server, recording server, FFMPEG, GStreamer etc. After one or two rounds of discussions, they themselves realize that their current need can be very well satisfied by a p2p call, occasionally with a TURN server. After all these discussions, they understand that building a production grade scalable p2p video call takes much less time and resources than building a scalable production grade video conferencing application. It is also a good starting point to test a market and the product before investing more resources in building a scalable video conferencing application. Whether they choose a P2P app or a conferencing app is immaterial; within a couple of rounds of discussions, they equip themselves with all the necessary knowledge to understand the reality of WebRTC. From here onwards, it becomes a rewarding experience to help the customer achieve his / her business objective.

After going through the above mentioned situation a couple of times, we decided to build a scalable production grade WebRTC p2p video calling application with a loosely coupled UI. This way the UI can be modified according to individual needs, whereas the architecture, the front end functionality and the back end stay the same. Building a simple p2p app seems easy, but building a scalable, production grade p2p video calling app with a certain level of bad network tolerance is not so straightforward. Why? Because one needs to take care of the things mentioned below in a production grade app, which are generally not present in a simple p2p app available on GitHub.

Audio / Video management: This feature includes all possible things a user may need while using the app like muting / un-muting the mic, switching on / off the camera, changing of existing mic / camera to a new mic / camera for rest of the call, allowing moderator controls for remote media input change( so that a teacher can mute / un-mute the mic or switch on /off the camera of student in case of a need ) etc.
Capturing images / statistics: This feature helps in capturing an image from the real time video stream for a purpose like vKYC (video Know Your Customer). With this feature, a bank agent can capture the image while the bank’s customer shows his / her photo identity card during the call for the bank’s verification purposes. Collecting real time call and network statistics is also important for quality control and monitoring purposes. Real time network monitoring can also raise timely alerts to users when their network quality degrades.
Auto re-connection: This is primarily important for maintaining the call even when the network fails. A network typically fails when one’s device changes the network connection while a call is going on. It happens when a user’s device switches between WiFi and mobile network modes like 3G / 4G / LTE etc. The network temporarily fails when the switch happens and comes back once the switch is over. In order to auto reconnect, an application needs to detect the network failure, wait until the network reconnects and restart the media communication as quickly as possible after the successful re-connection. In WebRTC terminology, restoring the media communication this way is called an ICE restart.
Integrations: Other application integrations like whiteboards for collaboration, text chat option, file sharing etc also play an important role for some users. Either these features should be there or a provision should be there for the integration of these features in case of a need.
Recording: Recording is another important feature in any WebRTC application. Though it may not be needed in all kinds of WebRTC applications, it becomes necessary for applications like video call centers, video health etc. Recording can happen either on the client side or on the server side. Ideally a server side recording is preferred, as it allows post processing of the raw recording for multiple purposes. As an example, a video recorded in WebM format, the default recording format for WebRTC, consumes 800 MB – 1000 MB of space to store an hour long video recording, which is a lot. On the server side, one can use a tool called FFMPEG to compress it, watermark it and convert it to MP4 format, which can reduce the size to < 100 MB for the same video. As an alternative strategy, one can use client side recording and send the file to the server once the recording is done, if server side recording is not possible (as in a P2P call).
In call Media Manipulation: There may be a need for masking some portion of the video stream while in a call, for either security or convenience purposes. A widely used feature these days for such a need is called background removal, where the background of a user sharing a camera is either blurred or replaced with another image of a coffee shop / office desk / meeting room etc. There may be other such use cases as well.
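
To make the auto re-connection item above a little more concrete, here is a minimal sketch of triggering an ICE restart from the browser. It is a simplified illustration, not the implementation used in our product: enableIceRestart and sendOffer are hypothetical names, sendOffer stands for whatever signaling channel the app uses, and a production app would also wait for the network to come back and limit retries.

```javascript
// Watches an RTCPeerConnection and starts an ICE restart when ICE fails.
// `sendOffer` is a hypothetical callback that delivers the new offer to the
// remote peer over the app's signaling channel.
function enableIceRestart(pc, sendOffer) {
  pc.oniceconnectionstatechange = async () => {
    // 'disconnected' can recover on its own; 'failed' needs a restart.
    if (pc.iceConnectionState !== 'failed') return;

    // The iceRestart option asks the browser to gather fresh ICE candidates.
    const offer = await pc.createOffer({ iceRestart: true });
    await pc.setLocalDescription(offer);
    sendOffer(pc.localDescription);
  };
}
```

The remote peer handles this offer like any other offer and replies with an answer, after which media can flow over the newly selected network path.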

All of the above mentioned features along with a robust architecture ready for scale make a production grade application. It takes a lot of effort and time to build such an application, along with an excellent team that has a deep understanding of WebRTC and related technologies, frontend, backend, and many other such things.

If you are a customer looking forward to building a production grade scalable video calling app, then by now you know that you need to have a rock star team with a solid understanding of WebRTC and related technologies, along with sufficient time and resources at your disposal, to venture on such an adventure. If you don’t have a rock star team or time at your disposal, then we are here to help in any of the ways mentioned below.

CP2P is our scalable production grade P2P video calling app, ready for production deployment as a managed service. It comes with a very minimal UI ready for retrofitting. Either you can share your UI design and we build it for you, or you can share the fully designed UI for integration. We can integrate it, deploy it on your servers / our servers and manage it for you in a cost effective manner. The link to view details and check it in action is at the bottom of this page.

CVR is our scalable production grade video conferencing / live streaming / webinar application, ready for production deployment as a managed service. It uses CWLB, our WebRTC load balancer built in house from scratch, to distribute and balance load in real time with a utilization efficiency of 75%. It also uses CR, an advanced recording engine developed in house from scratch, to record meetings. It uses Mediasoup as its core media server. The link to know more about it and schedule a demo is at the bottom of this page.

If you think that you don’t want to use any of these products but want to develop your own from scratch, we as a consulting company can help you build your own product from scratch. In this case, you need more resources and time than with the previously mentioned options. If you have more resources and time at your disposal, then this can be a path to tread. The only thing to make sure of in this case is that you have a dependable rock star team who can work with us on building the product.

In case you don’t have a rock star team, then there is a reason to worry. But why worry when we are here? We have an instructor led online / offline, full time training program where we can convert any fullstack / frontend / backend developer with sound JavaScript knowledge into a rock star WebRTC developer. The timelines for the training program are as below.

  • 5 – 7 days (for the WebRTC fundamentals program)
  • 10 – 14 days (for the WebRTC fundamentals and advanced WebRTC with Mediasoup program)

I hope I was able to share enough information about building a scalable production grade video conferencing app. If you still have doubts or questions, you can reach out to me either on sp@centedge.io or on hello@centedge.io.

The link to details on CP2P is here.

The link to details on CVR is here.

The link to schedule a free 30 mins discussion with one of our experts to resolve all your WebRTC related queries is here.

If you are a student / developer looking forward to learning more about the basics of WebRTC by yourself through working examples, here is a GitHub repo with working examples on successful MediaStream acquisition, building a basic signalling server, and building a working P2P call app.

Demystifying a WebRTC video calling application for the beginner

So you are a developer(frontend/backend/full-stack) curious about developing applications using WebRTC. You are searching over the internet for the last couple of days or even months to learn the basics and to build a basic WebRTC video calling app along with a basic understanding. Though there are a few Github repositories available with the code for building a very basic p2p video calling app, none of them have the details about the inner workings of the code. The code in those repositories just runs in your localhost with a command or two where you can connect 2 tabs of your browser with a video call. When you try to read through those codes, you find a bunch of API calls that are very unfamiliar and sometimes even illogical.

In order to demystify the inner workings of a basic video calling app using WebRTC, we need to follow a 3 step beginner-friendly approach which also is commonsensical. Let’s start.

The very first step of building a video calling app is to understand how to acquire the camera and/or mic of the device you want to use for calling using any of these browsers, chrome/firefox/edge/safari. It can be a desktop/laptop / mobile device as long as any one of these browsers is present. Without the camera and mic, the video call has no meaning at all. There can be a use case where you are going to use WebRTC in the p2p mode with data channels for file sharing only but we are not going to discuss this in this blog post today. The way to acquire the camera and mic from the browser is to use an API called getUserMedia. The below line of code will acquire the camera and mic from the browser.

const stream = await navigator.mediaDevices.getUserMedia({audio:true,video:true});

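With the above line of code, we will be able to acquire the camera and mic with some preconditions. Browsers expose getUserMedia only in a secure context, which means the page must be served over HTTPS or from localhost. If you try to use the above line of code on a plain HTTP page from any other host, it will fail. The user also has to grant permission to use the camera and mic when the browser asks for it.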
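
Since the call above can fail for several reasons, it helps to wrap it in a small function with error handling. The snippet below is a minimal sketch, assuming nothing beyond the standard getUserMedia API; describeMediaError and acquireCameraAndMic are hypothetical helper names, and the messages are only examples.

```javascript
// Maps common DOMException names raised by getUserMedia to readable hints.
function describeMediaError(error) {
  switch (error.name) {
    case 'NotAllowedError':
      return 'Permission to use the camera or mic was denied.';
    case 'NotFoundError':
      return 'No camera or mic was found on this device.';
    case 'NotReadableError':
      return 'The camera or mic is already in use or failed to start.';
    default:
      return `Could not acquire media: ${error.message}`;
  }
}

// `mediaDevices` is passed in so the function can be exercised outside a
// browser; in a web page you would pass navigator.mediaDevices.
async function acquireCameraAndMic(mediaDevices) {
  // navigator.mediaDevices is undefined when the page is not in a secure context.
  if (!mediaDevices || !mediaDevices.getUserMedia) {
    throw new Error('getUserMedia is unavailable; is the page served over HTTPS or from localhost?');
  }
  try {
    return await mediaDevices.getUserMedia({ audio: true, video: true });
  } catch (error) {
    throw new Error(describeMediaError(error));
  }
}
```

In a page, the returned stream can be assigned to the srcObject of a video element to show the local preview before the call starts.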

If you have successfully acquired the camera and mic, you are ready for step 2 of building a WebRTC video calling app. In this step, you need to build a simple signaling server so that some messages can be exchanged between the caller and the callee. This step is all about building a server-side application that will connect to both the caller and the callee, and let them share some secret messages with each other whenever needed. Node.js is the server-side runtime that is going to be used for the signaling server in this example, with WebSocket as the connectivity mechanism to connect both the users.

Here is the sample code.

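const fs = require('fs');
const https = require('https');
const WebSocket = require('ws');

// Assumed values: pick your own port and TLS certificate paths.
const HTTPS_PORT = 8443;
const serverConfig = {
  key: fs.readFileSync('key.pem'),
  cert: fs.readFileSync('cert.pem'),
};
// Answers plain HTTPS requests; the signaling itself happens over WebSocket.
const handleRequest = (request, response) => response.end('Signaling server is running');

const httpsServer = https.createServer(serverConfig, handleRequest);
httpsServer.listen(HTTPS_PORT, '0.0.0.0');
const wss = new WebSocket.Server({ server: httpsServer });

wss.on('connection', function (ws) {
  ws.on('message', function (message) {
    // Handle or forward the signaling message here.
  });
});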

With the above lines of code, one has a basic signaling server ready to listen to messages from the caller/callee.
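
The message handler in the server code is left empty. For a two-party call, the simplest thing it can do is forward every message to the other connected client. The snippet below is a minimal sketch of that relay; relayToOthers is a hypothetical helper name, and it assumes every connected client belongs to the same call (there are no rooms).

```javascript
const OPEN = 1; // WebSocket readyState value for an open connection

// Forwards `message` from `sender` to every other open client and
// returns how many clients received it.
// `clients` is any iterable of sockets, e.g. `wss.clients` from the ws package.
function relayToOthers(clients, sender, message) {
  let delivered = 0;
  for (const client of clients) {
    if (client !== sender && client.readyState === OPEN) {
      client.send(message);
      delivered += 1;
    }
  }
  return delivered;
}
```

Inside the message handler this could be called as relayToOthers(wss.clients, ws, message.toString()). The toString() call matters with recent versions of the ws package, which deliver incoming messages as Buffers; without it the other peer would receive a binary message instead of text.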

Now you are ready for building the real video calling application using the work we did in the last 2 steps. Here are the steps that are needed to establish the call.

  • Peer A connects to the signaling server and waits for peer B.
  • Peer B connects to the signaling server and informs peer A that he / she is available for a call.
  • Peer B clicks the call button and boom! The call is connected, and peer A and peer B can see and hear each other. Because peer B starts the call, peer B is the one that creates the offer in the steps below.

Here are the real steps that happen behind the scenes to establish the call.

  • Peer B first creates a new PeerConnection object while passing the available ICE servers as a parameter, which helps in sending and receiving the media streams.
const pc = new RTCPeerConnection({iceServers});
  • Then it acquires the local camera and mic and adds those camera and mic tracks to the PeerConnection. This makes the PeerConnection ready to send the media feeds as soon as the connection is established (i.e. when both users agree to use a common network configuration acceptable to both).
stream.getTracks()
      .forEach((track) =>
        pc.addTrack(track, stream)
      );
  • Then it creates an offer to generate an offer SDP (Session Description Protocol), which contains a large amount of information (approx. 80 – 100 lines) in plain text format. It contains information like network settings, available media streams (audio / video / screen share / anything else), codecs currently available to encode and decode media data packets, and many other things.
const offer = await pc
      .createOffer()
      .catch(function (error) {
        // alert() accepts a single argument, so log the error object instead
        console.error("Error when creating an offer", error);
      });
  • Once SDP is generated, the local description of PeerConnection is set using the offer. In simple terms, it asks the browser for the final confirmation of the validity of all the options available in SDP. Once the local description is set, the SDP aka settings can’t be changed anymore and the SDP is then sent to remote peer A to let its browser do all the things that peer B’s browser just did.
 await pc.setLocalDescription(offer);
 //send the offer to peer A using the signalling channel
  • As soon as the local description is set, it starts generating ICE candidates (in simple terms, the current network configurations of peer B) and sends them to peer A to check if the network parameters are acceptable to his / her device to receive media streams.
pc.onicecandidate = function (event) {
      if (event.candidate) {
       //send the ice candidate to the other peer using the   
       //signalling channel 
      }
    };
  • Once the offer SDP sent via the signaling server is received by peer A’s browser, peer A first creates a PeerConnection object, passing the ICE servers as a parameter for the same purpose. As soon as the PeerConnection is created, it uses the offer SDP provided by peer B to set its remote description. This lets the browser know the other peer’s details so that it can create an answer SDP in response to the offer at a later stage.
const pc = new RTCPeerConnection({iceServers});
await pc.setRemoteDescription(new RTCSessionDescription(offer));
  • It is a repeat of step 2 for peer A, where it acquires his / her own media streams and adds them to the PeerConnection to be ready to send once the connection is established.
stream.getTracks()
      .forEach((track) =>
        pc.addTrack(track, stream)
      );
  • Then it creates an answer by calling the create answer API on the PeerConnection object and generates the answer SDP. Once the answer SDP is generated, the local description is set on peer A’s side to ask the browser for one final confirmation. Once confirmed, the answer is then sent to peer B via the signaling channel for peer B’s browser’s acceptance of this answer.
const answer = await pc
      .createAnswer()
      .catch(function (error) {
        console.error("Error when creating an answer", error);
      });
await pc.setLocalDescription(answer);
//send the answer to peer B using the signalling channel
  • Once the answer SDP is received on peer B’s side, it calls the set remote description API to ask the browser for acceptance of the other peer’s SDP. Once the browser confirms, the connection for media transport is now established.
pc.setRemoteDescription(
      new RTCSessionDescription(answer)
    );
  • Step 5 is repeated by peer A’s browser for the exact same purpose. Both the browsers have knowledge of each other’s network configuration by now. After this, both of the peers agree to use one network configuration among all the possible network configuration options given by both of their browsers. The mutually selected network configuration aka ice candidate is then used for the actual media transport between both of the users.
pc.onicecandidate = function (event) {
      if (event.candidate) {
       //send the ice candidate to the other peer using the   
       //signalling channel 
      }
    };
  • Once the connection is established, each of the PeerConnection objects starts sending its media stream to the remote user. As soon as the media reaches the other side, the track event is fired on the PeerConnection object (handled via ontrack) to let the browser know that the other peer’s media has arrived and is ready to be consumed. The local browser then extracts the media from its PeerConnection object and displays it in a video element.
pc.ontrack = (event) => { 
    if(event.streams && event.streams[0]){
    //The remote stream is now available at event.streams[0]. It     
    //can be attached to the srcObject of a video element to 
    //display the remote stream to the peer.
    } 
}
  • Now the call is successfully established where both peer A and peer B can communicate with each other in real-time with their respective camera and mic.
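The snippets above show each step in isolation. As a rough sketch of how they might fit together, the function below routes incoming signaling messages to the matching PeerConnection call, including handing received ICE candidates to the browser via addIceCandidate. The message shape ({type, sdp, candidate}), the function name handleSignalingMessage and the sendToPeer callback are assumptions made for illustration; only the methods called on pc are standard RTCPeerConnection APIs.

```javascript
// Hypothetical dispatcher: `pc` is an RTCPeerConnection, `message` is whatever
// your signaling channel delivers, `sendToPeer` sends a message back.
async function handleSignalingMessage(pc, message, sendToPeer) {
  switch (message.type) {
    case "offer": {
      // Answering side: store the remote offer, then reply with an answer.
      await pc.setRemoteDescription({ type: "offer", sdp: message.sdp });
      const answer = await pc.createAnswer();
      await pc.setLocalDescription(answer);
      sendToPeer({ type: "answer", sdp: answer.sdp });
      return "answered";
    }
    case "answer":
      // Offering side: the remote answer completes the SDP exchange.
      await pc.setRemoteDescription({ type: "answer", sdp: message.sdp });
      return "remote-answer-set";
    case "candidate":
      // Either side: hand the remote ICE candidate to the browser.
      await pc.addIceCandidate(message.candidate);
      return "candidate-added";
    default:
      return "ignored";
  }
}
```

In a real app the local tracks would be added to pc before the answer is created, as described in the steps above.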

Once all the above-mentioned steps are done correctly, a WebRTC video call can be established successfully. Here is the link to the GitHub repo where all the above steps are kept in separate folders along with working code for your reference.

Do keep in mind that this is for learning and understanding the inner workings of how a simple WebRTC p2p call works. If you want to build a production grade p2p call which you can deploy to a cloud and use it for a commercial venture, you need to check this out.

If you want to build a production grade video calling app yourself as an extension to this project, check this post to learn more about the features needed in a production grade app.

Also keep in mind that you need a robust architecture to build a production grade app. The code in the GitHub repository for this example has been created for learning purposes only and is not fit for production usage. If you are interested in a discussion with a principal consultant at Centedge to get the architecture right for you, you can schedule a free 30 mins consultation call using this link

How to test Co-Turn TURN server configurations?

A TURN server is needed by any WebRTC application whose users may be behind a strict firewall. Co-Turn is a very popular open-source TURN server that can be used with any WebRTC application for bypassing firewalls. It can be installed on a Linux server (on-premise / cloud) and needs to be properly configured to work with your WebRTC application. In this post, we focus only on how to test your TURN server to find out whether it has been configured properly, as the configuration of a TURN server deserves a separate post.

The most popular tool available today to test a TURN server is known as the WebRTC trickle ICE testing tool. The link to the site is at the bottom of this post. Here is what it looks like.

[Image: the WebRTC trickle ICE testing tool]

As shown, we can add our STUN/TURN server credentials using the add server button. Once the server is added, we can select it and gather all possible ICE candidates using the gather candidates button. Here is what the gathered candidates look like for a working TURN server.

[Image: candidates gathered for a working TURN server]

As shown, there are 3 kinds of candidates being generated from one candidate gathering request. They are

  • host candidates
  • srflx candidates
  • relay candidates

host candidates: Host candidates are those candidates which can be used to connect a WebRTC call within the same network, i.e. both the peers joining the call are connected to one network router / switch.

srflx candidates: srflx (server reflexive) candidates are those candidates which can be used to connect a WebRTC call when the users are not in the same network but on different networks, maybe even on 2 different continents, connected by the Internet.

relay candidates: relay candidates are those candidates which can be used to connect a WebRTC call when the users are not in the same network and one or both of them are behind a firewall. A user behind a firewall is not directly reachable from the Internet. That is the purpose of a firewall: it isolates a network from the Internet so that a network administrator can control interaction with the Internet in the desired manner, primarily to keep evil eyes out of the network. But this arrangement poses a problem for a WebRTC app, as it can’t reach the user. To solve this, a TURN server is put in place as a relay: both users relay their media through the server rather than connecting to each other directly, which bypasses the firewall. If this sounds a bit confusing, you can read more about firewalls and trickle ICE to gain a better understanding.
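Each candidate printed by the trickle ICE tool carries its type after the keyword typ. As a small sketch, the helpers below pull that type out of a candidate line and count how many candidates of each kind were gathered. The function names and the sample lines (which use documentation-only IP addresses) are made up for illustration.

```javascript
// Extract the candidate type ("host", "srflx", "relay", ...) from a
// candidate line such as the ones shown by the trickle ICE tool.
function candidateType(candidateLine) {
  const match = /\btyp\s+(\w+)/.exec(candidateLine);
  return match ? match[1] : null;
}

// Count a list of candidate lines by their type.
function countByType(candidateLines) {
  const counts = {};
  for (const line of candidateLines) {
    const type = candidateType(line) ?? "unknown";
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

// Sample lines, invented for this sketch.
const sample = [
  "candidate:1 1 udp 2122260223 192.0.2.10 54321 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.0.2.10 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.20 61000 typ relay raddr 203.0.113.7 rport 54321",
];
console.log(countByType(sample)); // { host: 1, srflx: 1, relay: 1 }
```

If no relay entry shows up in the counts for your own server, the TURN part of the configuration is the first thing to check.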

There can be 3 kinds of relay candidates. Any one of these candidate types can be chosen to establish the connection.

  • relay-udp
  • relay-tcp
  • relay-tls

relay-udp: The most common kind of relay candidate and the easiest one to connect with. This type is chosen when the firewall is not very strict and allows UDP connections.

relay-tcp: This is the next best option when a relay-udp candidate is not possible because UDP ports are blocked by the firewall.

relay-tls: This is the next best option when the firewall has blocked not only the UDP ports but also unsecured connections over TCP. Only a secured connection with a valid authentication mechanism like SSL certificates can pass through such a firewall. In this case, the relay-tls candidate type is used, which requires SSL certificate based authentication to be enabled at the TURN server end.
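Which of the three relay types a TURN entry can produce is largely decided by its URL. The sketch below maps a TURN URL to the relay type one would expect from it, assuming the usual conventions: turns: means TLS to the TURN server, turn: defaults to UDP, and ?transport=tcp switches a turn: URL to TCP. The function name and the example host names are hypothetical.

```javascript
// Guess which relay candidate type a TURN URL is expected to produce.
function expectedRelayType(turnUrl) {
  const [schemeAndHost, query = ""] = turnUrl.split("?");
  const scheme = schemeAndHost.split(":")[0].toLowerCase();
  const transport = new URLSearchParams(query).get("transport");
  if (scheme === "turns") return "relay-tls"; // TLS to the TURN server
  if (scheme !== "turn") return null;         // e.g. a stun: URL, no relay
  return transport === "tcp" ? "relay-tcp" : "relay-udp"; // UDP is the default
}

console.log(expectedRelayType("turn:turn.example.org:3478"));             // relay-udp
console.log(expectedRelayType("turn:turn.example.org:80?transport=tcp")); // relay-tcp
console.log(expectedRelayType("turns:turn.example.org:443"));             // relay-tls
```

This only describes what a URL asks for; whether the candidate is actually generated still depends on the TURN server and the firewall in between.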

To test which kind of relay candidate a TURN server is using, we can use the Firefox browser with a special setting flag switched on. Here is what it looks like.

[Image: the relay-only flag in Firefox, set to false by default]

As shown, there is a flag in Firefox (media.peerconnection.ice.relay_only on the about:config page) that tells the browser to use only relay type connections for any WebRTC application used in the browser. By default, it is set to false, which means relay-only connections are not enforced for WebRTC calls. The flag can be switched on to enforce relay-only candidate generation by clicking the button at the rightmost side of the flag. This is what it looks like after switching it on.

[Image: the relay-only flag in Firefox after being switched on]

Any WebRTC application used in Firefox after this flag is switched on will use relay type candidates only. Here is what the candidate logs of Firefox look like while a WebRTC call is running with this flag switched on.

[Image: Firefox candidate logs during a relay-only WebRTC call]

As can be seen in the above image, Firefox has chosen the relay-tls candidate type in this case, as the only TURN server entry provided for this example, turns:turn.centedge.io:443, is enabled with SSL certificate based authentication. More than one TURN server entry can also be provided so that the browser can choose the correct relay candidate type based on the user’s firewall configuration. An ideal iceServers parameter for establishing a WebRTC connection should have a STUN server and multiple TURN server entries for different purposes. It should look something like this.

[Image: an example iceServers configuration with 4 entries]

As can be seen, there are 4 entries in this case. Each entry is explained below.

Entry 1: For generating srflx candidates using a STUN server when neither user is behind a firewall. This is the most common way to connect a WebRTC call, so a STUN server is a must. Also, a STUN server incurs no data transfer costs; the only cost is that of the server itself.

Entry 2: For generating relay candidates when there is a firewall in place for one or both users of the WebRTC application. Here the TURN server is reachable on port 80 because most firewalls allow access on port 80, which is also used by HTTP.

Entry 3: For generating relay candidates when there is a firewall in place that blocks all access except authenticated access. SSL certificate based authentication is used to pass through this restriction. This firewall is stricter than the previous one, hence port 443 is used, which is also used by HTTPS.

Entry 4: For generating relay candidates of type tcp when the strictest possible firewall is in place, one that not only requires authenticated access but has also blocked UDP ports. This firewall is even stricter than the previous one. This type of candidate should be able to pass through all kinds of firewalls, as almost all firewalls allow port 443 for normal access to websites, which won’t work unless port 443 is open!
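A configuration matching the four entries described above might look roughly like the following sketch. The host names, the STUN port, the usernames and the credentials are placeholders and not real servers; substitute the values of your own TURN server. The iceTransportPolicy option shown at the end is a standard RTCPeerConnection setting that asks the browser to use relay candidates only, which can be handy while testing.

```javascript
// All host names and credentials below are placeholders (assumptions).
const iceServers = [
  // Entry 1: STUN server, used to discover srflx candidates
  { urls: "stun:stun.example.org:3478" },
  // Entry 2: TURN on port 80
  { urls: "turn:turn.example.org:80", username: "demo-user", credential: "demo-pass" },
  // Entry 3: TURN over TLS on port 443
  { urls: "turns:turn.example.org:443", username: "demo-user", credential: "demo-pass" },
  // Entry 4: TURN over TLS on port 443 with TCP requested explicitly
  { urls: "turns:turn.example.org:443?transport=tcp", username: "demo-user", credential: "demo-pass" },
];

// Optional while testing: allow relay candidates only.
const rtcConfig = { iceServers, iceTransportPolicy: "relay" };

// In a browser the object would be passed to the constructor:
// const pc = new RTCPeerConnection(rtcConfig);
console.log(rtcConfig.iceServers.map((server) => server.urls));
```

Leaving out iceTransportPolicy (or setting it to "all") restores the normal behaviour where host and srflx candidates are tried as well.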

While working with one of our customers, we realized that testing the TURN server settings one by one for all possible candidate types is critical for any application in order to reduce call failures due to strict firewall policies. Except for trickle ICE, there are no reliable tools available today to test which kinds of relay candidates (udp/tcp/tls) a TURN server supports. Therefore, you can use our open-source p2p application to test it in your local network, or deploy the application to test it over the Internet. We will try to have a hosted version of the same p2p application in some time.

How does it work?

Step 1: Download this repo to your local machine and follow the instructions in the readme to set it up. Don’t forget to provide the credentials of the TURN server you wish to test, as mentioned in the readme.

Step 2: Once the setup is successful, open Firefox and switch on the relay-only flag as described above.

Step 3: Copy and paste the generated link either in a different browser on the same device or on a different device in the same network. You can also share the link with a friend to hang out with them while testing, if you choose to host the application with a cloud provider like DigitalOcean.

Step 4: Once the other user joins the link, press the call button to start the video chat.

Step 5: Open about:webrtc in a different tab in Firefox and clear the history using the Clear History button so that the statistics are shown for the current call only. Then click on the show details link to view the detailed call statistics, including the ICE candidate details. Here you should be able to see the relay candidate type (udp/tcp/tls) along with all other relevant details.
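The same information can usually be read from inside the application as well, through the statistics returned by getStats() on the PeerConnection. The sketch below looks up the selected candidate pair in such a report and returns the type of its local candidate. The function name is hypothetical, and which fields are filled in (for example relayProtocol or the transport entry) varies between browsers, so treat it as a starting point rather than a guaranteed check.

```javascript
// `report` is the Map-like object resolved by pc.getStats().
function describeSelectedCandidate(report) {
  const stats = [...report.values()];
  // Preferred: the transport entry points at the selected candidate pair.
  const transport = stats.find(
    (s) => s.type === "transport" && s.selectedCandidatePairId
  );
  // Fallback: a pair flagged as selected, or nominated and succeeded.
  const pair = transport
    ? report.get(transport.selectedCandidatePairId)
    : stats.find(
        (s) =>
          s.type === "candidate-pair" &&
          (s.selected || (s.nominated && s.state === "succeeded"))
      );
  if (!pair) return null;
  const local = report.get(pair.localCandidateId);
  if (!local) return null;
  return {
    candidateType: local.candidateType,         // "host", "srflx", "prflx" or "relay"
    relayProtocol: local.relayProtocol ?? null, // "udp", "tcp" or "tls" for relay candidates
  };
}

// Browser usage (not run here):
// console.log(describeSelectedCandidate(await pc.getStats()));
```

With the relay-only flag switched on, candidateType is expected to be "relay", and relayProtocol then tells which of the three relay types was used.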

Hope this helps you in testing your TURN server for misconfigurations. Feel free to drop us an email at hello@centedge.io for any questions / doubts / concerns related to the above post or to TURN servers / WebRTC in general.

Integrating WebRTC video calling into any website / application

WebRTC is a very popular technology these days for integrating video calling into any kind of application. It is especially helpful for websites / applications that provide online teaching / learning services or doctor / patient video consultation services. It is also very helpful in video customer support applications for real time conversations between a support agent and a customer, where the agent can see things at the customer’s end in real time and take quick decisions. A very good example use case would be motor insurance claims, where the agent can see the damage to a vehicle in real time and estimate its extent to speed up the claim settlement process. There can be many other use cases as well. In this post, we are going to explore various strategies to build a scalable peer to peer (p2p) video calling system which can be integrated into any website / application with ease. Building scalable multi party video conferencing / live streaming applications is out of the scope of this post.

Concurrency & statefulness

Understanding concurrency and statefulness is important before moving on, because these 2 buzzwords make developing a WebRTC signaling server more complex.

  • Concurrency: The number of users using your app simultaneously, not one after another. Example: If there are 1000 p2p video calls currently going on in your website, then there are 2000 concurrent users connected to the signalling server used by your website / application.
  • Statefulness: Statefulness is about maintaining a socket connection through the lifetime of a connection (from the time a user connects to the signalling server for the 1st time until the user disconnects from it). From the above example, if 1000 p2p video calls of 1 hour each are currently going on in your website, then the signalling server has to maintain each connection for the whole hour without disconnecting even for 1 second! Maintaining a consistent connection for 1 hour may seem easy, but it is not, considering that a network can break at any time for any random reason. As soon as the network breaks, strategies like timeout based re-connection attempts are used to restore the connection without letting the user know. There are other issues with maintaining a socket connection for such a long time as well. For these reasons, statefulness adds a lot of complexity to any application.
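To give an idea of what timeout based re-connection can look like, here is a minimal sketch using exponential backoff, which is one common variant and an assumption on our part (the delays, the attempt limit and all function names are illustrative). The connect argument stands for whatever function opens the socket to the signalling server and returns a promise.

```javascript
// Delay before reconnect attempt number `attempt` (0-based):
// doubles each time, capped at maxMs.
function reconnectDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Keep retrying `connect` until it succeeds or `maxAttempts` is reached.
// `wait` is injectable so the sketch can be exercised without real timers.
async function connectWithRetry(
  connect,
  maxAttempts = 5,
  wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (error) {
      if (attempt === maxAttempts - 1) throw error; // give up after the last try
      await wait(reconnectDelay(attempt));
    }
  }
}

console.log([0, 1, 2, 3].map((n) => reconnectDelay(n))); // [ 1000, 2000, 4000, 8000 ]
```

Growing delays keep a server that has just restarted from being hit by thousands of clients reconnecting at the same instant.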

Building a simple WebRTC video calling application involves multiple steps and requires skills in more than one technology. Building a scalable WebRTC application is even more complex and requires a more diverse skill set. A certain amount of expertise in stateful application development (socket programming) with WebRTC is necessary to build a scalable WebRTC video calling app. Let’s first understand the building blocks of a scalable WebRTC app before proceeding to build it.

The building blocks are

  • A signalling server, Nodejs
  • A signalling protocol, WebSockets
  • An in-memory data store, JS hash maps
  • A client side application capable of running inside a browser, HTML,CSS & JS

With the above ingredients, we can build a scalable WebRTC video calling app which can be used by 100s / 1000s of simultaneous / concurrent users. The first 3 are needed to handle the server side of our WebRTC application. Combined, they are known as a signalling server. It is not mandatory to build our own signalling server in order to build a WebRTC app from scratch. There are a couple of capable (open source!) signalling servers out there which can be used to build a WebRTC video calling application. If, after careful consideration, none of them satisfies your requirements, you can choose to build one from scratch. But keep in mind that building a WebRTC signalling server from scratch is not easy, and you need to know exactly what you intend to build and how before starting off on that journey.
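To make the in-memory data store more concrete, here is a minimal sketch of how a signalling server might keep track of rooms and peers with plain JS Maps. All names are hypothetical and this is not how any of the servers discussed in this post is implemented; the connection argument stands in for a WebSocket, of which only the send() method is used.

```javascript
// In-memory store: room id -> Map of peer id -> connection object.
function createRoomRegistry() {
  const rooms = new Map();
  return {
    join(roomId, peerId, connection) {
      if (!rooms.has(roomId)) rooms.set(roomId, new Map());
      const room = rooms.get(roomId);
      // A p2p call has two peers, so a third distinct peer is refused.
      if (room.size >= 2 && !room.has(peerId)) return false;
      room.set(peerId, connection);
      return true;
    },
    leave(roomId, peerId) {
      const room = rooms.get(roomId);
      if (!room) return;
      room.delete(peerId);
      if (room.size === 0) rooms.delete(roomId); // free memory for empty rooms
    },
    // Forward a signalling message to everyone else in the room.
    relay(roomId, fromPeerId, message) {
      const room = rooms.get(roomId);
      if (!room) return 0;
      let delivered = 0;
      for (const [peerId, connection] of room) {
        if (peerId !== fromPeerId) {
          connection.send(JSON.stringify({ from: fromPeerId, ...message }));
          delivered++;
        }
      }
      return delivered;
    },
    roomCount: () => rooms.size,
  };
}
```

Because the state lives in the memory of a single Node.js process, it is lost on restart and is not shared between processes, which is one of the reasons statefulness makes scaling harder.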

Here are 3 open source WebRTC signalling servers currently available which you can use to integrate video calling into your application / website. They are

  • simple peer
  • peerjs
  • cignal

Let’s explore the strengths and weaknesses of each of them.

simple peer: It is a stable signalling server used by many applications today. It was started in the early days of WebRTC and has matured with time. As it was started in the early days, it still has some old style implementations for WebRTC which were needed back then but may not be needed today. Those extra things don’t create any trouble though.

strength:

  • A mature signalling server with a developer ecosystem
  • A good user base
  • Covers more use cases (both in browser / nodejs)

weakness:

  • There is no official option for paid support
  • No official alternative to pay for custom development
  • Needs some level of WebRTC understanding to work with it (getUserMedia)

peerjs: It is as old as WebRTC itself, as it has also been around since the early days of WebRTC. All its other characteristics are very similar to simple-peer, with some additional points to consider. It has changed maintainers a couple of times since its inception and was not maintained very well until 1 – 2 years ago.

strength:

  • A mature signalling server
  • A good user base

weakness:

  • There is no official option for paid support
  • No official alternative to pay for custom development
  • Needs some level of WebRTC understanding to work with it (getUserMedia)

cignal: It is the latest signalling server, made open source in September 2022 as a community edition of a fully featured commercial WebRTC signalling server built for enterprise usage. It has the necessary features for scaling and has been designed for developers with zero knowledge of WebRTC who intend to integrate it with other applications / websites. The core philosophy of cignal is to abstract away all things related to WebRTC and expose very easy to use APIs for integrating video calling capabilities into any other application / website.

strength:

  • Latest signalling server with inbuilt scalability (up to 1000s of concurrent users!)
  • Community edition of fully featured commercial WebRTC signalling server
  • Official option for paid support
  • Official option for custom development

weakness:

  • Very new to the open source scene (1st release in September 2022)
  • The open source version is yet to be used by many

Now we are going to explore how to build a video calling app using cignal, as it is the latest one, comes with scalability and is easy to integrate into any application / website. You should be able to find examples for the other 2 by searching Google or GitHub for example projects.

Prerequisites: Basic web development experience (HTML, CSS & JS). If you don’t know what these are, then you shouldn’t be doing this by yourself. It is better to hire a web developer who can do the job for you.

Steps:

  • Go to the cignal github repo and clone it in your local machine, using the below command.
git clone https://github.com/sauravkp/cignal.git
  • Follow the steps as mentioned in the readme file to run the app in your localhost. If you followed all the steps correctly, you can now have a video call between any two devices in the same network where the application is running.
  • Follow the steps mentioned in the hosting section of the readme file to host your application with a cloud service provider like DigitalOcean. You can use the same steps to host it with another cloud service provider as well. Keep in mind that you need a domain name and an SSL certificate, like any other regular web application, to host the cignal signaling server.
  • As cignal comes with a default minimalistic UI, you can choose to enhance it or replace it with your own UI. The cignal client triggers all the necessary events to manage pre-call, in-call, and post-call changes. All other necessary data points are mentioned in the readme document for extending the existing capabilities for other functionalities like providing a chat functionality.

For any issues/bugs in cignal, you can send a mail to support@centedge.io for support.

In case you want paid support / custom development using cignal, feel free to reach out to us at hello@centedge.io.

This is the community version of Cp2p, a fully featured production grade commercial video calling application developed by Centedge for its customers with edu-tech, health-tech and customer support tech related use cases. This community version has been made open source so that development teams get a solid foundation for building their video calling system without having to deal with deep WebRTC related things.

If you have a requirement for a multiparty video conferencing solution / live streaming solution, do take a look at CWLB. It is a general purpose WebRTC load balancer with a scalable signalling server, scalable media server, and scalable recording server integrated into it. You can easily host gmeet like video conferences (up to 50 participants) and large scale private live streaming events (up to 500 participants) with detailed call analytics.

Feel free to schedule a 30 mins discussion with a video solution design expert to get a free assessment of your use case using the Talk to expert button available here