
WebRTC for Absolute Beginners: A Complete Overview

by admin | Nov 7, 2025 | WebRTC

With WebRTC, you can add real-time communication capabilities to your application on top of an open standard. It lets peers exchange video, voice, and generic data, allowing developers to build powerful voice and video communication solutions. The technology is implemented as an open web standard and is available as regular JavaScript APIs in all major browsers. For native clients, such as Android and iOS applications, a library is available that provides the same functionality. WebRTC is an open source project supported by Apple, Google, Microsoft and Mozilla, amongst others.

What can WebRTC really do?

There are many different use cases for WebRTC, from basic web apps that use the camera or microphone to more advanced video-calling applications and screen sharing. It can be used to build anything from a weekend side project, such as a simple one-to-one video chat app, to a complex enterprise-grade video conferencing application with security and other necessary features.

What exactly is WebRTC?

WebRTC is a collection of APIs, and an application built on them usually goes through a common flow. To keep it simple, this flow can be understood in 4 steps: accessing the media devices, opening peer connections, discovering peers, and starting to stream. Creating a new application based on WebRTC can be overwhelming if one is unfamiliar with these APIs.

WebRTC APIs

The WebRTC standard covers, on a high level, two different technologies: media capture devices and peer-to-peer connectivity.

Media capture devices include video cameras and microphones, but also screen capturing “devices”. For cameras and microphones, we use navigator.mediaDevices.getUserMedia() to capture MediaStreams. For screen recording, we use navigator.mediaDevices.getDisplayMedia() instead.
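As a minimal sketch of the capture step, the helper below requests a camera and microphone stream. The function name openCameraAndMic and the injected mediaDevices parameter are assumptions made for illustration (in a browser you would pass navigator.mediaDevices); only getUserMedia() comes from the API described above.

```javascript
// Hypothetical helper: request camera + microphone from a MediaDevices-like object.
// In a browser, call openCameraAndMic(navigator.mediaDevices).
async function openCameraAndMic(mediaDevices) {
  try {
    // Resolves with a MediaStream once the user grants permission.
    return await mediaDevices.getUserMedia({ video: true, audio: true });
  } catch (error) {
    // Rejected when permission is denied or no matching device exists.
    console.error('Could not open media devices:', error.name);
    return null;
  }
}
```

Returning null on failure is just one possible design; an application may prefer to let the error propagate and show its own message.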

The peer-to-peer connectivity is handled by the RTCPeerConnection interface. This is the central point for establishing and controlling the connection between two peers in WebRTC.

In the upcoming posts, these APIs will be explained with examples that are easy to understand and follow. Stay tuned to learn WebRTC from scratch.

Share Camera and Screen Content with Browser APIs

by admin | Nov 7, 2025 | WebRTC

For the last 5 months, the demand for video conferencing has skyrocketed. A majority of the human population on our planet has been locked up at home, and much of the work is getting done through video conferencing. The primary requirement for a video conference is to share the camera and microphone, with occasional screen sharing, so that each participant can be seen, heard and understood properly. The majority of these video conferences nowadays run directly in a browser, without the need to install any external software or even a browser extension. Browsers these days have some magical powers to do everything related to cameras, microphones and screen sharing. In this post, we will explore these magical powers and the open secret behind them.

The open secret

The much-awaited open secret is the browser API named navigator.mediaDevices. This API provides getUserMedia() to acquire cameras and microphones on request, enumerateDevices() to list all available devices, and getDisplayMedia() to capture a screen, an application window, or a browser tab. These are the most commonly used APIs in a typical video conferencing application.

Video conferencing applications can retrieve the current list of connected devices and also listen for changes, since many cameras and microphones connect through USB and can be connected and disconnected during the lifecycle of the application. Since the state of a media device can change at any time, it is recommended that applications listen for the devicechange event on navigator.mediaDevices in order to handle changes properly.
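The device listing and change handling described above could be sketched as follows. The helper names listCameras and watchCameras are hypothetical, and the mediaDevices argument stands in for navigator.mediaDevices so the sketch does not depend on a browser being present.

```javascript
// Hypothetical helper: list the cameras known to a MediaDevices-like object.
async function listCameras(mediaDevices) {
  const devices = await mediaDevices.enumerateDevices();
  // Each entry is a MediaDeviceInfo with kind, label and deviceId.
  return devices.filter(device => device.kind === 'videoinput');
}

// Hypothetical helper: call onChange with a fresh camera list
// whenever a device is plugged in or removed.
function watchCameras(mediaDevices, onChange) {
  mediaDevices.addEventListener('devicechange', async () => {
    onChange(await listCameras(mediaDevices));
  });
}
```

In a browser this would be used as watchCameras(navigator.mediaDevices, cameras => updateCameraMenu(cameras)), where updateCameraMenu is whatever UI code the application has. Device labels are typically empty until the user has granted media permission.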

Media constraints

The next thing that needs discussion is media constraints, which define how the camera, microphone, or screen share is accessed by passing specific instructions to the browser.

Capture camera using getUserMedia

For example, if there are 3 cameras available to a browser, then a specific instruction can be given to the browser as a constraint to access a specific camera out of the available 3 cameras for the video call.
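A minimal sketch of such an instruction is shown here, assuming the camera's deviceId was obtained earlier from enumerateDevices(). The helper name constraintsForCamera is hypothetical; deviceId and exact are standard constraint keys.

```javascript
// Hypothetical helper: build getUserMedia constraints that pin one camera.
// deviceId comes from a MediaDeviceInfo returned by enumerateDevices().
function constraintsForCamera(deviceId) {
  return {
    audio: true,
    // "exact" makes the request fail rather than fall back to another camera.
    video: { deviceId: { exact: deviceId } }
  };
}
```

The result would be passed straight to navigator.mediaDevices.getUserMedia(). Using ideal instead of exact would let the browser fall back to another camera if the chosen one has been unplugged.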

The specific constraints are defined in a MediaTrackConstraints object, one for audio and one for video. The attributes in this object are of type ConstrainULong, ConstrainBoolean, ConstrainDouble or ConstrainDOMString. These can either be a specific value (e.g., a number, boolean or string), a range (ULongRange or DoubleRange with a minimum and maximum value) or an object with either an ideal or exact definition. For a specific value, the browser will attempt to pick something as close as possible. For a range, the best value in that range will be used. When exact is specified, only media streams that exactly match that constraint will be returned.

// Camera with a resolution as close to 640x480 as possible
{
    "video": {
        "width": 640,
        "height": 480
    }
}

// Camera with a resolution in the range 640x480 to 1024x768
{
    "video": {
        "width": {
            "min": 640,
            "max": 1024
        },
        "height": {
            "min": 480,
            "max": 768
        }
    }
}

// Camera with the exact resolution of 1024x768
{
    "video": {
        "width": {
            "exact": 1024
        },
        "height": {
            "exact": 768
        }
    }
}

To determine the actual configuration that a certain track of a media stream has, we can call MediaStreamTrack.getSettings(), which returns the MediaTrackSettings currently applied.

It is also possible to update the constraints of a track from a media device we have opened, by calling applyConstraints() on the track. This lets an application re-configure a media device without first having to close the existing stream.

Capture screen using getDisplayMedia

An application that wants to be able to perform screen capturing and recording must use the Display Media API. The function getDisplayMedia() (which is part of navigator.mediaDevices) is similar to getUserMedia() and is used to open the content of the display (or a portion of it, such as a window). The returned MediaStream works the same as when using getUserMedia().

The constraints for getDisplayMedia() differ from the ones used for regular video or audio input.

{
    video: {
        cursor: 'always' | 'motion' | 'never',
        displaySurface: 'application' | 'browser' | 'monitor' | 'window'
    }
}

The code snippet above shows the special constraints for screen recording. Note that these might not be supported by all browsers that have display media support.
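A screen share built on getDisplayMedia() might look like the sketch below. The helper name startScreenShare and the onStopped callback are assumptions for illustration, and mediaDevices again stands in for navigator.mediaDevices.

```javascript
// Hypothetical helper: start a screen share and report when it stops.
// In a browser, call startScreenShare(navigator.mediaDevices, onStopped).
async function startScreenShare(mediaDevices, onStopped) {
  // The browser shows its own picker for screen, window or tab.
  const stream = await mediaDevices.getDisplayMedia({ video: true });
  const [videoTrack] = stream.getVideoTracks();
  // 'ended' fires when the user stops sharing from the browser's own UI.
  videoTrack.addEventListener('ended', onStopped);
  return stream;
}
```

Listening for ended matters in practice because users often stop sharing through the browser's built-in control rather than through the application's own button.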

Tips and tricks

A MediaStream represents a stream of media content, which consists of tracks (MediaStreamTrack) of audio and video. You can retrieve all the tracks from MediaStream by calling MediaStream.getTracks(), which returns an array of MediaStreamTrack objects.

A MediaStreamTrack has a kind property that is either audio or video, indicating the kind of media it represents. Each track can be muted by toggling its enabled property. In implementations that support it, a track also has a Boolean property remote that indicates whether it is sourced by an RTCPeerConnection and comes from a remote peer.
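Muting through the enabled property could be sketched like this. The helper name setAudioMuted is hypothetical; getTracks(), kind and enabled are the members described above.

```javascript
// Hypothetical helper: mute or unmute every audio track of a stream.
function setAudioMuted(stream, muted) {
  stream.getTracks()
    .filter(track => track.kind === 'audio')
    // A disabled audio track keeps flowing but carries silence.
    .forEach(track => { track.enabled = !muted; });
}
```

A mute button would call setAudioMuted(localStream, true), where localStream is the stream the application obtained from getUserMedia(). The same pattern with kind === 'video' turns the camera picture off without releasing the device.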

WebRTC RTCPeerConnection: The secret behind connecting peers in the new video call app

by admin | Nov 7, 2025 | WebRTC

RTCPeerConnection is the WebRTC API that connects two applications on different computers so they can communicate using a peer-to-peer protocol. The communication between peers can be video, audio or arbitrary binary data (for clients supporting the RTCDataChannel API). In order to discover how two peers can connect, both clients need to connect to a common signalling server and also provide an ICE server configuration. An ICE server can be either a STUN or a TURN server, and its role is to help each client gather ICE candidates, which are then transferred to the remote peer. This transfer of ICE candidates is part of what is commonly called signalling. All this new terminology may sound alien at first, but it is the secret behind successfully connecting a video call between 2 computers using only browsers.

Signalling is needed in order for two peers to share how they should connect. Usually this is solved through a regular HTTP-based Web API (e.g., a REST service) or another mechanism such as a WebSocket connection, through which web applications relay the necessary information before the peer connection is initiated. Signalling can be implemented in many different ways, and the WebRTC specification doesn’t prefer any specific solution.
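Since the specification leaves signalling open, the following is only one possible sketch: a thin wrapper that sends and receives JSON messages over a WebSocket-like object. The class name WebSocketSignalingChannel is hypothetical and is not part of WebRTC; the socket is passed in so any object with send() and a message event will do.

```javascript
// Hypothetical signalling wrapper around a WebSocket-like object.
// Messages are plain objects, serialized as JSON on the wire.
class WebSocketSignalingChannel {
  constructor(socket) {
    this.socket = socket;
    this.listeners = [];
    socket.addEventListener('message', event => {
      let message;
      try {
        message = JSON.parse(event.data);
      } catch {
        return; // Ignore anything that is not valid JSON.
      }
      this.listeners.forEach(listener => listener(message));
    });
  }

  // Only the 'message' event is supported in this sketch.
  addEventListener(type, listener) {
    if (type === 'message') this.listeners.push(listener);
  }

  send(message) {
    this.socket.send(JSON.stringify(message));
  }
}
```

In a browser it could be created with new WebSocketSignalingChannel(new WebSocket(url)), where url points at a relay server you operate. The server side, which forwards each message to the other peer, is not shown.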

Peer connection initiation

A peer connection is created by instantiating an RTCPeerConnection object, as shown in the code snippet further below. The constructor for this class takes a single RTCConfiguration object as its parameter. This object defines how the peer connection is set up and should contain information about the ICE servers to use.

Once the RTCPeerConnection is created, we need to create an SDP offer or answer, depending on whether we are the calling peer or the receiving peer. Once the SDP offer or answer is created, it must be sent to the remote peer through a different channel. Passing SDP objects to remote peers is called signalling and is not covered by the WebRTC specification.

To initiate the peer connection setup from the calling side, we create a RTCPeerConnection object and then call createOffer() to create a RTCSessionDescription object. This session description is set as the local description using setLocalDescription() and is then sent over our signalling channel to the receiving side. We also set up a listener to our signalling channel for when an answer to our offered session description is received from the receiving side.

Simple signalling server

// Set up an asynchronous communication channel that will be
// used during the peer connection setup
const signalingChannel = new SignalingChannel(remoteClientId);
signalingChannel.addEventListener('message', message => {
    // New message from remote client received
});

// Send an asynchronous message to the remote client
signalingChannel.send('Hello!');

Initiating the call from browser A

async function makeCall() {
    const configuration = {'iceServers': [{'urls': 'stun:stun.l.google.com:19302'}]};
    const peerConnection = new RTCPeerConnection(configuration);
    signalingChannel.addEventListener('message', async message => {
        if (message.answer) {
            const remoteDesc = new RTCSessionDescription(message.answer);
            await peerConnection.setRemoteDescription(remoteDesc);
        }
    });
    const offer = await peerConnection.createOffer();
    await peerConnection.setLocalDescription(offer);
    signalingChannel.send({'offer': offer});
}

On the receiving side, we wait for an incoming offer before we create our RTCPeerConnection instance. Once that is done we set the received offer using setRemoteDescription(). Next, we call createAnswer() to create an answer to the received offer. This answer is set as the local description using setLocalDescription() and then sent to the calling side over our signalling server.

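// Same ICE server configuration as on the calling side
const configuration = {'iceServers': [{'urls': 'stun:stun.l.google.com:19302'}]};
const peerConnection = new RTCPeerConnection(configuration);
signalingChannel.addEventListener('message', async message => {
    if (message.offer) {
        // Wait for the remote description to be applied before answering
        await peerConnection.setRemoteDescription(new RTCSessionDescription(message.offer));
        const answer = await peerConnection.createAnswer();
        await peerConnection.setLocalDescription(answer);
        signalingChannel.send({'answer': answer});
    }
});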

Once the two peers have set both the local and remote session descriptions, they know the capabilities of their respective remote peer. This doesn’t mean that the connection between the peers has already been established. For this to work, we need to collect the ICE candidates at each peer and transfer them (over the signalling channel) to the other peer in order to establish the connection between them.

ICE Candidates

ICE stands for Interactive Connectivity Establishment. Before two peers can communicate using WebRTC, they need to exchange connectivity information. Since network conditions can vary depending on a number of factors, an external service is usually used for discovering the possible candidates for connecting to a peer. This discovery process is called ICE, and it uses either a STUN or a TURN server. STUN stands for Session Traversal Utilities for NAT and is used indirectly in most WebRTC applications.

TURN (Traversal Using Relays around NAT) is the more advanced solution that incorporates the STUN protocol, and most commercial WebRTC-based services use a TURN server for establishing connections between peers. The WebRTC API supports both STUN and TURN directly, gathered under the more complete term Interactive Connectivity Establishment. When creating a WebRTC connection, we usually provide one or several ICE servers in the configuration for the RTCPeerConnection object.
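A configuration listing both kinds of server might be built as in this sketch. The helper name buildRtcConfiguration is hypothetical, and the TURN URL turn:turn.example.org:3478 is a placeholder rather than a real service; TURN servers generally require a username and credential, which are passed in here.

```javascript
// Hypothetical helper: build an RTCConfiguration with one STUN and one TURN server.
// The TURN URL and credentials are placeholders, not a real service.
function buildRtcConfiguration(turnUsername, turnCredential) {
  return {
    iceServers: [
      { urls: 'stun:stun.l.google.com:19302' },
      {
        urls: 'turn:turn.example.org:3478',
        username: turnUsername,
        credential: turnCredential
      }
    ]
  };
}
```

The returned object would be passed to new RTCPeerConnection(...). In a real deployment, TURN credentials are usually short-lived and fetched from your own backend rather than hard-coded in the page.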

Trickle ICE

Trickle ICE is a technique used to reduce the call setup time between the 2 peers. Once an RTCPeerConnection object is created, the underlying framework uses the provided ICE servers to gather candidates for establishing connectivity. The icegatheringstatechange event on RTCPeerConnection signals which state the ICE gathering is in (new, gathering or complete).

While it is possible for a peer to wait until ICE gathering is complete, it is usually much more efficient to use trickle ICE and transmit each ICE candidate to the remote peer as it gets discovered. This significantly reduces the setup time for peer connectivity and allows a video call to get started with less delay.
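For contrast, the slower non-trickle approach of waiting for gathering to finish could be sketched as below. The helper name waitForIceGatheringComplete is hypothetical; it relies only on the iceGatheringState property and the icegatheringstatechange event.

```javascript
// Hypothetical helper: resolve once ICE gathering has finished (non-trickle approach).
function waitForIceGatheringComplete(peerConnection) {
  return new Promise(resolve => {
    if (peerConnection.iceGatheringState === 'complete') {
      resolve();
      return;
    }
    const onChange = () => {
      if (peerConnection.iceGatheringState === 'complete') {
        // Stop listening once the final state has been reached.
        peerConnection.removeEventListener('icegatheringstatechange', onChange);
        resolve();
      }
    };
    peerConnection.addEventListener('icegatheringstatechange', onChange);
  });
}
```

An application using this would send its session description only after the promise resolves, which is exactly the delay that trickle ICE avoids.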

To gather ICE candidates, simply add a listener for the icecandidate event. The RTCPeerConnectionIceEvent emitted on that listener contains a candidate property that represents a new candidate, which should be sent to the remote peer using the signalling mechanism mentioned above.

// Listen for local ICE candidates on the local RTCPeerConnection
peerConnection.addEventListener('icecandidate', event => {
    if (event.candidate) {
        // Use the same key that the receiving listener checks (message.iceCandidate)
        signalingChannel.send({'iceCandidate': event.candidate});
    }
});

// Listen for remote ICE candidates and add them to the local RTCPeerConnection
signalingChannel.addEventListener('message', async message => {
    if (message.iceCandidate) {
        try {
            await peerConnection.addIceCandidate(message.iceCandidate);
        } catch (e) {
            console.error('Error adding received ice candidate', e);
        }
    }
});

Once ICE candidates are being exchanged, we should expect the state of our peer connection to eventually change to connected. To detect this, we add a listener to our RTCPeerConnection for connectionstatechange events.

// Listen for connectionstatechange on the local RTCPeerConnection
peerConnection.addEventListener('connectionstatechange', event => {
    if (peerConnection.connectionState === 'connected') {
        // Peers connected!
    }
});

Choosing Video Infrastructure: Avoid Critical Pitfalls

by admin | Nov 7, 2025 | WebRTC

Are you planning to build a WebRTC-based audio-video calling infrastructure for your upcoming use case? If yes, then be careful: there can be danger ahead if you don’t select the correct media server for your use case today. We are here to help you select the correct media server or infrastructure based on your use case as well as the phase of the project you are currently in.

The possible alternatives for your use case include Agora, TokBox, Twilio, Jitsi, Janus, mediasoup, Kurento, OpenVidu and many others. Here we have considered the most popular options out there to help you build your video infrastructure.

We divide the options mentioned above into 2 categories. The first is the vendors who provide video APIs so that other applications can integrate them with little effort. The second is the open source media servers, on top of which one can also build video applications, but in this case the implementer (i.e. you) needs to take care of both the implementation and the infrastructure. Let’s analyse both categories in more detail and provide specific suggestions for each option.

Agora / Twilio / TokBox :- These are good, dependable and scalable API services. Agora is my top pick among them. Use these if you are building an MVP for a demo, or if video calling is not core to your business and your product’s customer satisfaction does not depend directly on the solution you have built on top of them. The reason is that if they change their APIs overnight, it shouldn’t break the core of the product. It is better to have fewer dependencies at the core of your product.

Jitsi:- Jitsi is a really nice end-to-end solution for in-house video conferencing when you don’t want to depend on Zoom or MS Teams for your team’s meetings. You can deploy the whole solution quickly, run it on premises or in a managed cloud, and use it without any issue. It is a good open source Zoom alternative, maybe the best one, but it is not a good fit if you want to integrate it into your product and build your solution on top of it, as it is less agile, less modular and not customisation friendly. Its architecture and its opinionated way of getting things done make it the best open source Zoom alternative but not a great media server for your next WebRTC project.

Janus / mediasoup:- These are the 2 best open source media servers available today, ready for integration in any use case. They are stable and scale nicely without many issues. Some amount of tweaking will be needed to align them with your use case. The only point to mention about mediasoup is that it is currently available only as an npm module, which means it can only be integrated with a Node.js backend server. Node.js is a great backend platform in my opinion, and I use it for many use cases. I can assure you that Node.js and mediasoup work very well with each other and are a winning combination. Both can handle 100+ participants in one conference call if implemented properly on a scalable infrastructure. Use these if your use case is focussed on large group calls.

Kurento / OpenVidu:- Kurento is one of the most versatile open source media servers out there, able to act as an MCU or an SFU depending on configuration. It also has options to apply different types of filters, such as face detection and chroma filters, to the live media stream, and it can record streams to a file or an HTTP endpoint. Use this if you have a unique use case that doesn’t need large group calls and if you have in-house Kurento expertise. OpenVidu is a signalling and application server combination built on top of Kurento that can be used as a Jitsi alternative.

BigBlueButton:- BBB (BigBlueButton) is a good option for online classroom and learning use cases. If you are a school or college, or a business catering to schools and colleges, and you are looking for a self-hosted or cloud-based online classroom solution on your own domain name with full control, then this is a good solution for you. It has all the options a teacher needs to control the class, including muting all students.

Finally, you may be wondering whether you should own the video infrastructure or use it as a service. Our take is this: if the WebRTC infrastructure is core to your business and your monetisation strategy depends directly on its performance, it is advisable to own the whole infrastructure; otherwise, go for a managed service or even outsource it.

If you are looking to develop your next video project or adopt an open source one for your own use case, we will be delighted to support you in achieving a great result. Don’t hesitate to drop us a mail at hello@centedge.io for a free 1st round consultation. #Happytohelp.

Production getUserMedia: WebRTC Video App Best Practices

by admin | Nov 7, 2025 | WebRTC

This is going to be an advanced post on using getUserMedia effectively in real-world use cases while creating production-grade applications. If you are a beginner, please check the introductory post first before going through this one.

While developing video conferencing applications, the getUserMedia browser API provides the capabilities to capture the audio and video streams and send these streams to the receiving parties with the help of RTCPeerConnection. In this article, we will discuss the twin concepts of capabilities and constraints to understand the browser’s capabilities of capturing the media streams with the applied constraints.

Here is how the process works.

Call MediaDevices.getSupportedConstraints() (if needed) to get the list of supported constraints, which tells you what constrainable properties the browser knows about. This isn’t always necessary, since any that aren’t known will simply be ignored when you specify them—but if you have any that you can’t get by without, you can start by checking to be sure they’re on the list.

Once the script knows whether the property or properties it wishes to use are supported, it can then check the capabilities of the API and its implementation by examining the object returned by the track’s getCapabilities() method; this object lists each supported constraint and the values or range of values which are supported.

Finally, the track’s applyConstraints() method is called to configure the API as desired by specifying the values or ranges of values it wishes to use for any of the constrainable properties about which it has a preference.

The track’s getConstraints() method returns the set of constraints passed into the most recent call to applyConstraints(). This may not represent the actual current state of the track, due to properties whose requested values had to be adjusted and because platform default values aren’t represented. For a complete representation of the track’s current configuration, use getSettings().
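The steps above can be sketched in one small function. The helper name applyWidthWithinCapabilities is hypothetical, and clamping the requested width to the reported range is just one way to use the capabilities; getCapabilities(), applyConstraints() and getSettings() are the track methods described above.

```javascript
// Hypothetical helper: request a width, clamped to what the track reports it can do.
async function applyWidthWithinCapabilities(track, desiredWidth) {
  // getCapabilities() may be missing in some browsers, so guard the call.
  const capabilities = track.getCapabilities ? track.getCapabilities() : {};
  let width = desiredWidth;
  if (capabilities.width) {
    const { min = 0, max = Infinity } = capabilities.width;
    width = Math.min(Math.max(desiredWidth, min), max);
  }
  await track.applyConstraints({ width: { ideal: width } });
  // getSettings() reports what the track actually ended up with.
  return track.getSettings().width;
}
```

Note that applyConstraints() replaces the constraints previously applied to the track, so a real application would normally pass its full constraint set rather than the width alone.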

Defining Constraints

A single constraint is an object whose name matches the constrainable property whose desired value or range of values is being specified. Constraints are grouped into a constraint set, an object that contains zero or more individual constraints as well as an optional member named advanced, which lists further constraint sets that the user agent must satisfy if at all possible. The user agent attempts to satisfy these advanced constraint sets in the order specified.
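As an illustration only, a constraint set with an advanced list might look like the object below. The specific resolutions are arbitrary example values, and the variable name videoConstraints is hypothetical.

```javascript
// Illustrative constraint set: basic constraints plus an "advanced" list.
const videoConstraints = {
  width: { min: 640, ideal: 1280 },
  height: { min: 480, ideal: 720 },
  // Each entry is tried in order; entries that cannot be satisfied are skipped.
  advanced: [
    { width: 1920, height: 1080 },
    { aspectRatio: 16 / 9 }
  ]
};
```

Here the browser would first try for 1920x1080, then for a 16:9 aspect ratio, while the min values in the basic set remain hard requirements.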

We also need to check first whether the constraints we are going to apply are supported by the user agent (the browser). The code below first checks that the constraints are supported and then applies them.

let supports = navigator.mediaDevices.getSupportedConstraints();

if (!supports["width"] || !supports["height"] || !supports["frameRate"] || !supports["facingMode"]) {
  // We're missing needed properties, so handle that error.
} else {
  let constraints = {
    width: { min: 640, ideal: 1920, max: 1920 },
    height: { min: 400, ideal: 1080 },
    aspectRatio: 1.777777778,
    frameRate: { max: 30 },
    facingMode: { exact: "user" }
  };

  myTrack.applyConstraints(constraints).then(() => {
    /* do stuff if constraints applied successfully */
  }).catch(function(reason) {
    /* failed to apply constraints; reason is why */
  });
}

Here, after ensuring that the constrainable properties for which matches must be found are supported (width, height, frameRate, and facingMode), we set up constraints which request a width no smaller than 640 and no larger than 1920 (but preferably 1920), a height no smaller than 400 (but ideally 1080), an aspect ratio of 16:9 (1.777777778), and a frame rate no greater than 30 frames per second. In addition, the only acceptable input device is a camera facing the user (a “selfie cam”). If the width, height, frameRate, or facingMode constraints can’t be met, the promise returned by applyConstraints() will be rejected.

MediaStreamTrack.getCapabilities() is used to get a list of all of the supported capabilities and the values or ranges of values which each one accepts on the current platform and user agent. This function returns a MediaTrackCapabilities object which lists each constrainable property supported by the browser and a value or range of values which are supported for each one of those properties.

Example

The most common way of using constraints is while calling getUserMedia() to capture the streams.

navigator.mediaDevices.getUserMedia({
  video: {
    width: { min: 640, ideal: 1920 },
    height: { min: 400, ideal: 1080 },
    aspectRatio: { ideal: 1.7777777778 }
  },
  audio: {
    sampleSize: 16,
    channelCount: 2
  }
}).then(stream => {
  videoElement.srcObject = stream;
}).catch(handleError);

In this example, constraints are applied at getUserMedia() time, asking for an ideal set of options with fallbacks for the video.

The constraints of an existing MediaStreamTrack can also be changed on the fly, by calling the track’s applyConstraints() method, passing into it an object representing the constraints you wish to apply to the track.

videoTrack.applyConstraints({
  width: 1920,
  height: 1080
});

Retrieving current constraints and settings

It’s important to remember the difference between constraints and settings. Constraints are a way to specify what values you need, want, and are willing to accept for the various constrainable properties, while settings are the actual values of each constrainable property at the current time.

If at any time we need to fetch the set of constraints that are currently applied to the media, we can get that information by calling MediaStreamTrack.getConstraints(), as shown in the example below.

function switchCameras(track, camera) {
  let constraints = track.getConstraints();
  constraints.facingMode = camera;
  track.applyConstraints(constraints);
}

This function accepts a MediaStreamTrack and a string indicating the camera facing mode to use, fetches the current constraints, sets the value of the MediaTrackConstraints.facingMode to the specified value, then applies the updated constraint set.

Unless we only use exact constraints (which is pretty restrictive, so be sure we mean it!), there’s no guarantee exactly what we are going to actually get after the constraints are applied. The values of the constrainable properties as they actually are in the resulting media are referred to as the settings. If we need to know the true format and other properties of the media, we can obtain those settings by calling MediaStreamTrack.getSettings(). This returns an object based on the dictionary MediaTrackSettings. For example:

function whichCamera(track) {
  return track.getSettings().facingMode;
}

This function uses getSettings() to obtain the track’s currently in-use values for the constrainable properties and returns the value of facingMode.
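To make the difference between constraints and settings concrete, the sketch below compares a set of requested bare values with what the track reports. The helper name findUnmetRequests is hypothetical, and it only handles bare values such as width: 1920, not min, max, ideal or exact objects.

```javascript
// Hypothetical helper: list requested bare values that differ from the actual settings.
function findUnmetRequests(track, requested) {
  const settings = track.getSettings();
  return Object.keys(requested)
    .filter(name => settings[name] !== requested[name]);
}
```

Calling findUnmetRequests(videoTrack, { width: 1920, height: 1080 }) after applyConstraints() would return the names of the properties where the browser settled on a different value, which can be useful for logging camera quality issues.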

In case you are looking for any specific help with your production video conferencing application related to camera quality issues, do let us know at hello@centedge.io. We will be delighted to help.
