The dilemma of build vs. buy from CPaaS in the world of Video Conferencing solutions

In today’s digital age, communication has become the lifeblood of businesses and individuals alike. As organizations strive to connect with their remote teams, engage with customers, and collaborate with partners worldwide, video communication has emerged as a powerful tool. When it comes to building a video communication app, there are two main options: building from scratch or integrating a Communications Platform as a Service (CPaaS) video API provider. While the latter may seem like an attractive choice due to its convenience, there are several compelling reasons why building your own video communication app is a better long-term investment.

First and foremost, building a video communication app gives you full control over the user experience. By developing your own app, you have the freedom to customize every aspect of the platform to align with your brand identity and specific requirements. From the user interface to the features and functionalities, you can tailor the app to create a seamless and intuitive experience for your users. This level of control is crucial for building strong brand recognition and fostering user loyalty.

Secondly, building your own video communication app allows you to prioritize data privacy and security. With increasing concerns about data breaches and privacy issues, having control over the infrastructure and data handling processes becomes paramount. By building your own app, you can implement robust security measures, encryption protocols, and data storage practices to safeguard sensitive information. This not only protects your users but also builds trust and credibility in your brand, setting you apart from competitors who rely on third-party providers.

Moreover, building a video communication app provides scalability and flexibility. As your business grows and evolves, you have the freedom to add or modify features, scale up the infrastructure, and adapt the app to changing market demands. This agility is crucial for staying ahead in a rapidly evolving digital landscape. On the other hand, integrating a CPaaS video API provider might limit your ability to customize or scale the app according to your unique requirements, potentially hindering your growth potential.

Building your own video communication app also offers cost-effectiveness in the long run. While integrating CPaaS video API providers may seem like a quick and cost-efficient solution initially, the subscription fees and usage charges can add up significantly over time. By building your own app, you have the opportunity to make a one-time investment in development and infrastructure setup, reducing ongoing expenses in the form of API usage fees. This allows you to have better control over your budget and allocate resources more efficiently.

Last but not least, building a video communication app provides a competitive edge. In a market saturated with generic communication tools, having a unique and tailored app sets you apart from competitors. It allows you to differentiate your brand and offer a distinctive user experience that aligns with your specific value proposition. By investing in building your own app, you position yourself as an innovative and forward-thinking organization, attracting users and potential partners who value a premium communication experience.

Let’s take the example of a virtual events company to understand the numbers behind a build vs. buy strategy.

A virtual events company hosts 100 events a month, with an average of 500 participants attending each event. Let’s assume that each event is 6-8 hours long, of which 4 hours of audio/video is used by all the participants, and that the virtual events company is using a video CPaaS provider to provide audio/video sharing capabilities to its participants.

The maths for 1 event would look like this:

4 * 60 = 240 minutes * 200 participants * $0.004 per video minute = $192 (considering 200 participants sharing their video and audio)

4 * 60 = 240 minutes * 300 participants * $0.0009 per audio minute = $65 (approx., considering 300 participants sharing their audio only)

The total, approximately $257, is what the CPaaS provider charges for 1 event.

If 100 such events happen, then the total CPaaS bill would be $25,700 for the month! If the virtual events company has been using the CPaaS service for the last 2 years while hosting similar events each month, the total CPaaS bill would have been $616,800.
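The same arithmetic can be sketched in a few lines of Python, which makes it easy to plug in your own numbers. This is only an illustrative sketch: the per-minute rates are the ones assumed in this post, the function and variable names are made up for the example, and real CPaaS pricing differs from provider to provider.

```python
# Per-minute rates assumed in this post (USD); actual CPaaS pricing varies by provider.
VIDEO_RATE = 0.004   # participant sharing video + audio
AUDIO_RATE = 0.0009  # participant sharing audio only


def event_cost(hours: float, video_users: int, audio_users: int) -> float:
    """CPaaS charge for one event under pay-per-participant-minute billing."""
    minutes = hours * 60
    return minutes * (video_users * VIDEO_RATE + audio_users * AUDIO_RATE)


per_event = event_cost(hours=4, video_users=200, audio_users=300)
per_month = per_event * 100   # 100 events a month
two_years = per_month * 24    # 24 months of similar usage

print(f"per event: ${per_event:,.2f}")
print(f"per month: ${per_month:,.2f}")
print(f"two years: ${two_years:,.2f}")
```

Without intermediate rounding this comes to roughly $256.80 per event, $25,680 per month and $616,320 over two years. The figures quoted in the post are slightly higher because the per-event cost is rounded up to $257 before multiplying.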

Now let’s do some maths to find out what the amount would have been if the virtual event service provider had decided to build it from day 1.

Building a really scalable video back-end which can replace their CPaaS offering should take 8-12 months at a cost of approximately $150,000. The front-end and all other costs would stay the same.

The next important cost is the server and data transfer cost.

Let’s calculate the data transfer for 1 event with 500 people and 4 hours of audio/video usage.

1 participant consuming video and audio at 540 Kb/s for 4 hours:

540 * 60 * 60 * 4 = 7,776,000 Kb = 7.416 Gb of data consumption for 4 hours

If 200 participants are using video, then the total data consumption is 200 * 7.416 = 1483 Gb (approx.)

1 participant consuming audio at 40 Kb/s for 4 hours:

40 * 60 * 60 * 4 = 576,000 Kb = 0.549 Gb of data consumption for 4 hours

If 300 participants are using audio, then the total data consumption is 300 * 0.549 = 165 Gb (approx.)

Therefore the total data transfer for 1 event comes to about 1648 Gb.

In the case of AWS, the data transfer cost becomes 1648 * $0.08/Gb = $132 (approx.)

In the case of DigitalOcean, the data transfer cost becomes 1648 * $0.01/Gb = $16.48

Considering that the virtual event provider is running all its services on DigitalOcean as it is cheaper, the total data transfer cost per month for all 100 events would be $1,648. The server cost can be considered as included in this figure, as we haven’t deducted the free data transfer that DigitalOcean bundles with its servers.
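The data transfer estimate can be reproduced with a short Python sketch. It follows this post's own conventions as assumptions: a constant per-participant rate, kilo-to-giga conversion by dividing by 1024 twice, the result billed directly at the per-unit prices quoted above, and one stream's worth of transfer per participant. The names are hypothetical. Note that if the 540 and 40 figures are read as kilobits per second, the billable volume in gigabytes would be about one eighth of what is computed here, so the estimate errs on the high side.

```python
K_PER_G = 1024 * 1024  # the post converts kilo-units to giga-units by dividing by 1024 twice


def transfer_g(rate_k_per_s: float, hours: float, users: int) -> float:
    """Total data volume for `users` participants streaming at a constant rate."""
    seconds = hours * 60 * 60
    return rate_k_per_s * seconds * users / K_PER_G


video = transfer_g(540, hours=4, users=200)  # video + audio participants
audio = transfer_g(40, hours=4, users=300)   # audio-only participants
per_event = video + audio

# Per-unit data transfer prices as quoted in this post; check current provider pricing.
PRICES = {"AWS": 0.08, "DigitalOcean": 0.01}

for provider, price in PRICES.items():
    event_cost = per_event * price
    print(f"{provider}: ${event_cost:,.2f} per event, "
          f"${event_cost * 100:,.2f} per 100 events")
```

Changing the rates, participant counts or prices in one place shows how sensitive the monthly infrastructure bill is to each assumption.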

A CPaaS cost of $25,700 vs. a self-managed infrastructure cost of $1,648 gives a difference of $24,052 a month.

According to this calculation, the virtual event company can recover the $150,000 development cost of the self-managed video back-end in a little over 6 months from the infrastructure savings alone ($150,000 / $24,052 = 6.2 months approx.). Even after budgeting $9,052 a month for the people needed to manage and enhance the video back-end, it would keep saving about $15,000 a month, which pays back the development cost in about 10 months.
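The payback period can be worked through with the figures used in this post. The following is a minimal Python sketch under simplifying assumptions: savings are counted only from the month the self-managed back-end goes live, usage stays flat, and the names are invented for illustration.

```python
CPAAS_MONTHLY = 25_700   # CPaaS bill for 100 events a month
INFRA_MONTHLY = 1_648    # self-managed data transfer cost (DigitalOcean)
TEAM_MONTHLY = 9_052     # people to manage and enhance the video back-end
BUILD_COST = 150_000     # one-time development estimate


def payback_months(build_cost: float, monthly_saving: float) -> float:
    """Months of saving needed to recover a one-time build cost."""
    if monthly_saving <= 0:
        raise ValueError("no monthly saving, so the build cost is never recovered")
    return build_cost / monthly_saving


gross_saving = CPAAS_MONTHLY - INFRA_MONTHLY   # infrastructure saving only
net_saving = gross_saving - TEAM_MONTHLY       # after paying the maintenance team

print(f"gross saving: ${gross_saving:,}/month, "
      f"payback in {payback_months(BUILD_COST, gross_saving):.1f} months")
print(f"net saving:   ${net_saving:,}/month, "
      f"payback in {payback_months(BUILD_COST, net_saving):.1f} months")
```

With these inputs the gross saving is $24,052 a month (payback in about 6.2 months) and the net saving is $15,000 a month (payback in 10 months).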

A similar analysis can be done for any segment based on the above example. While integrating CPaaS video API providers might offer convenience in the short term, building your own video communication app presents numerous advantages in terms of user experience, data security, scalability, cost-effectiveness, and competitive differentiation. It empowers you to create a platform that truly reflects your brand and caters to the unique needs of your users. In a world where effective communication is paramount, investing in building your own video communication app is a strategic decision that sets you on a path of long-term success and growth.

We are here to help you build a cutting-edge video conferencing back-end / application for your unique use case. Feel free to drop us an email at hello@centedge.io or use this link to have an instant video meeting with us.

Twilio Video Alternatives for A Conscious Decision Maker

Video Communication & Modern Businesses

In an era defined by rapid technological advancement and shifting business landscapes, video communication has emerged as a pivotal tool for modern enterprises. Platforms like Twilio, Zoom, Google Meet, and Jitsi have revolutionized the way businesses operate, enabling seamless collaboration and communication regardless of physical distance. The COVID-19 pandemic underscored the critical importance of these tools, as companies around the world were forced to quickly adapt to remote work environments.

During the height of the pandemic, these video communication platforms played a crucial role in maintaining business continuity. Zoom, for instance, saw a massive surge in users as companies turned to its platform for virtual meetings, webinars, and conferences. Similarly, Twilio’s cloud communication solutions enabled businesses to quickly implement and scale their communication strategies, ensuring that employees remained connected and productive. Google Meet and Jitsi provided secure and reliable video conferencing capabilities, allowing teams to collaborate effectively in a virtual setting. These platforms not only kept businesses running but also paved the way for a new era of remote work and digital collaboration.

Current video communication providers fall into 2 broad categories. The first provides ready-to-use video meeting solutions, like Google Meet and Zoom. The second provides the tools and technologies for building custom video meeting solutions, like Twilio and Jitsi. This second category can be further divided into 2 subcategories. One is Jitsi, which provides a ready-to-use, self-hosted video meeting solution including the frontend, backend, media server, horizontal scaling, etc. The other is the CPaaS / Video SDK providers like Twilio, which provide a scalable backend service along with frontend APIs to build video applications.

All the above-mentioned categories and subcategories have their strengths and weaknesses for a specific business use case. In general, a requirement can be mapped to one of these categories or subcategories, and an option can then be chosen from those available in it. To keep this post clear and crisp, we are going to focus on the second subcategory of the second category, i.e. Twilio as a Video SDK provider, and the impact of its EOL (End Of Life) on its existing customers as well as on the category as a whole.

Twilio Video Discontinuation & Changing Landscapes

The recent announcement by Twilio regarding the discontinuation of its Twilio Video service marks a significant shift in the landscape of video communication platforms. This decision comes amidst a rapidly evolving market, with new players entering the field and existing ones constantly innovating to meet the growing demands of users. While this change may initially cause concern for businesses relying on Twilio Video, it also presents an opportunity to reassess and adapt to the evolving needs of modern communication.

As businesses navigate this transition, it’s important to recognize that the discontinuation of Twilio Video does not signify the end of innovative video communication solutions. Instead, it signals a dynamic market where companies must remain agile and open to exploring new technologies. This shift also underscores the importance of choosing a video communication platform that aligns with the long-term goals and requirements of the organization. By carefully evaluating the available options and selecting a platform that offers both stability and scalability, businesses can continue to leverage the power of video communication to drive growth and success in an ever-changing landscape.

Possible reasons that could have led to Twilio Video Sunset

People who understand how video technologies work, and how complex it is to build a stable enterprise-ready video application, can’t ignore the importance of making correct decisions early on about the core components of the video application. If one goes wrong at the early stages, there is a good chance of negative impacts later on stability, scaling, future feature enhancements, etc. I assume something like this, i.e. some incorrect decisions in the early stages of Twilio Video, happened with them as well. I wish to share a short story from my own experience before moving on.

Those of us who have followed WebRTC from its early days, i.e. from 2012-13, can’t forget the promise made to the developer community by an open-source WebRTC media server named Kurento. It was the leading media server of that time, with a promise of media forwarding (SFU), merging (MCU), and real-time post-processing (with OpenCV) all combined into one package. It was theoretically the perfect media server that could change the world of media scaling and real-time processing forever. I had used Kurento for some of my personal/commercial projects in 2014-15 and I was also super excited about its prospects and possibilities. It was developed by some students and professors at a Spanish university, who also maintained this open-source package along with other contributors.

But in 2016-17, after using this media server for 1-2 years in some commercial projects, I realized that it is hard to manage and scale. It is resource-guzzling as well as slow, and less capable of handling concurrent meeting rooms in larger numbers. Some core features like ICE restarts were missing as well, which are a necessity if you wish to deal with bad networks. Therefore, a realization came to us in 2019 that we needed to move our commercial project to a better media server if we wished to scale it to larger numbers.

The interesting part here is that Twilio bought Kurento for its promises in the year 2017 and built Twilio Video on top of it. They bought it for $40 million (I read it somewhere back in 2017!) and spent a good amount of money afterward to make it Twilio Video. I think they realized in 2023 something similar to what we realized in 2019, and thought of moving on.

I am not sure if this was the primary reason or not, as I can only assume what could have happened. But I think the decision to buy Kurento in 2017 and to build Twilio Video on top of it may have had some role to play in the decision to sunset Twilio Video.

Key learning from this incident

Anybody planning to build a video application should learn from this. The key lesson is to avoid getting the fundamentals wrong early on, which may have catastrophic impacts later. It is therefore critical to have good video application professionals (like us!) on your side early on, with sufficient experience in your kind of use case, to make the right decisions that are beneficial for both you and your application in the short as well as the long term.

The Beginning of Search for a Twilio Video Alternative

In the realm of real-time communication, Twilio Video has established itself as a go-to solution for many businesses. However, as the service is being sunset, we need to consider various factors when exploring alternatives. The importance of reliability, scalability, security, and ease of integration cannot be overstated in this context.

Reliability is paramount in any communication system. Look for alternatives with a proven track record of uptime and minimal disruptions. The best way to check reliability is to build a POC and test the provider’s performance on some edge cases, such as how the audio/video connection performs when the network degrades, or whether the video freezes for a user when switching between 5G and Wi-Fi. These scenarios reveal the depth of the provider’s experience in building video applications.

Scalability is not a deal breaker if your use case doesn’t require 1000s of users in 1 room, or 1000s of rooms running concurrently with 10s / 100s of users in each room. But if you have a use case that needs either 1000s of users in 1 room or 1000s of concurrent rooms, then do check the scalability aspects well. A good practice is to ask for load test reports produced by the provider. The best option is to run the load tests yourself with a POC application to check the scalability first-hand.

Security is non-negotiable, especially when dealing with sensitive information. Look for alternatives that offer end-to-end encryption and comply with relevant regulations like GDPR and HIPAA. Do they provide E2E encryption if your application needs it? Can they store recordings in an encrypted format if required? These are some of the questions to ask the provider before deciding on the security aspects.

Ease of integration is another important aspect that can make or break your video application. If it takes too many API calls to achieve a small thing or there are important APIs that are not available for your use case then it is not worth considering the solution even if it offers the above 3.

Based on the above parameters, we can consider 2 types of solutions that can become a Twilio video alternative. One is an open-source media server-based solution and another is a proprietary solution provided by a commercial vendor. In this post, we are going to explore both alternatives in a detailed manner.

Open Source Twilio Video Alternatives:

Open-source video communication solutions are software platforms that enable users to communicate through video conferencing, messaging, and collaboration tools. Unlike proprietary solutions, open-source solutions provide the underlying code for free, allowing users to modify, customize, and distribute the software according to their needs. This openness fosters innovation, as developers worldwide can contribute to improving the software, leading to rapid advancements and feature enhancements.

One of the key benefits of open-source video communication solutions is their flexibility and scalability. Organizations can tailor the software to meet their specific requirements, integrating it into their existing infrastructure seamlessly. Additionally, open-source solutions are often more cost-effective than proprietary alternatives, as they eliminate licensing fees and allow users to leverage a global community of developers for support and development.

In recent years, open-source video communication solutions have gained popularity due to their reliability, security, and privacy features. These solutions prioritize user data protection, offering end-to-end encryption and other security measures to safeguard sensitive information. As businesses and individuals increasingly rely on video communication for work, education, and social interactions, open-source solutions provide a reliable and accessible platform for connecting people around the world.

Open source solutions can primarily be divided into 2 categories. One is a ready solution with both frontend (UI) and backend ready to be deployed on a server and used. The downside is that the frontend (UI) may not be suitable for your use case. A great example in this category is Jitsi, a ready-to-use open source video conferencing solution with a pre-built UI and a scalable backend. The only issue is that we can use the default UI as an IFrame inside our solution but can’t easily modify it according to our use case. Also, it may not be a great Twilio Video alternative, as Twilio Video is used for its APIs to build a custom video solution according to our exact needs.

Therefore let’s explore the second alternative, i.e. a media server-based solution that provides a set of APIs to build custom video conferencing / interactive live streaming solutions like Twilio Video currently provides. Though there are a good number of open-source Twilio alternatives, we are going to mention the top 2 that can be worthy of replacing the Twilio Video APIs. One great example is the mediasoup-demo project, which can be considered a real Twilio Video alternative. It has all the ingredients for building a production-grade video conferencing / interactive live streaming system. The only downside of this project is that it is raw tech built to demonstrate the capabilities of the mediasoup media server, so it lacks the polish of a production-grade video application. It can be considered a great base/starting point on top of which a production-grade video conferencing / interactive live streaming application can be built. Please note that the APIs provided by the mediasoup-demo project are not the same as Twilio’s, so they need a separate integration effort from the existing Twilio implementation. Another very good project is LiveKit, built using Pion, a Go implementation of WebRTC. If you have a Go backend, then this project can be worth considering.

Open source Twilio Video Alternatives, Pros and Cons

Pros:

  1. Cost-Effective: Open-source WebRTC projects eliminate licensing fees, reducing overall development costs.
  2. Customization: Developers can tailor the solution to meet specific enterprise needs, ensuring flexibility and scalability.
  3. Community Support: Access to a large community of developers can provide assistance, bug fixes, and ongoing updates.
  4. Security: Regular updates and scrutiny by the community can enhance security, ensuring vulnerabilities are quickly identified and patched.
  5. Interoperability: WebRTC’s standardization enables interoperability with various platforms and devices, ensuring seamless communication.

Cons:

  1. Complexity: Integrating and managing an open-source WebRTC project can be complex, requiring specialized knowledge and resources.
  2. Maintenance: While community support can be beneficial, it also requires active management to ensure compatibility and security updates are implemented.
  3. Scalability: While WebRTC itself is scalable, managing the infrastructure to support enterprise-grade usage can be challenging without dedicated resources.
  4. Quality Control: Open source projects may lack the same level of rigorous testing and quality control as commercial solutions, potentially leading to reliability issues.
  5. Legal Considerations: Open source licenses may have implications for proprietary use, requiring careful consideration of licensing terms and compliance.

Open Source Considerations, Building from Scratch vs. Using a Vendor

Building from Scratch:

Building from scratch can be considered an option if you have

  1. At least 3-5 highly skilled people in the core audio/video technologies like WebRTC, SIP, FFmpeg, GStreamer, WebSockets
  2. Peripheral skill sets like JavaScript, HTML, Android, iOS, etc. for building web and mobile clients
  3. At least 12 months available and enough financial resources to maintain a team of 8-10. It may go to 18-24 months if your requirement is too complex and needs a very specialized approach.
  4. The willingness to deal with the above-mentioned cons on your own, in return for getting the above-mentioned pros.

Using a Vendor:

Using a vendor is beneficial as well as recommended if

  1. You don’t have a team with the above-mentioned skill sets who can understand and modify an open-source package according to your use case.
  2. You don’t have at least 8 – 12 months available for implementing your use case using an open-source package.
  3. You would get the pros of open source as mentioned above without having to deal with many of the cons as mentioned above. The vendor would provide you with much-needed support, scalability, and quality control capabilities.
  4. The financial resources needed would be similar to building it from scratch with the difference that you would get the results faster with better predictability and control of the outcome.

Commercial Twilio Video Alternative:

Nothing much needs to be written about commercial alternatives to Twilio Video. As the header suggests, this category has commercial vendors who provide audio/video SDK software similar to Twilio. The SDKs may not be exactly the same, but they do the job effectively, i.e. providing audio/video capabilities in your application without you having to go through the pain of building the audio/video SDK yourself.

These are some of the providers of audio/Video SDKs that offer similar audio/video capabilities as Twilio.

There are many others as well who provide video/audio SDKs. Please google and you will find others.

Pros

  1. Faster time to market: useful if you are not sure about the prospects of your product or service and wish to test it quickly
  2. No need to maintain a team of audio/video experts
  3. Easy to start at a lower cost when utilization is low

Cons

  1. Cost becomes prohibitive as usage grows with time
  2. Monthly costs are complex to forecast with pay-per-minute/user billing
  3. Chance of getting stuck with a vendor because of a huge switching cost (vendor lock-in)
  4. Less control over the actual media streams of your users
  5. The fear that if Twilio can decide to shut down its video service, your current vendor may also decide to shut down theirs next year!
  6. Sometimes your video SDK provider may not be flexible enough to customize according to your requirements.

The simple formula to choose between using a commercial vendor vs not using a commercial vendor is below.

If the audio/video capabilities are mission-critical to your business for some reason and are deeply coupled to the core workflows of your business, then it’s best not to choose a commercial vendor. In this case, it is better to look for alternatives that would provide you with better control over the entire stack of your audio/video capability requirement.

If you wish to know more about a possible Twilio video alternative that can be tailored to your specific use case, please feel free to drop an email at hello@centedge.io to kickstart a conversation with us. If you wish to schedule a discussion with one of our principal engineers to discuss your use case in detail, feel free to use the Meetnow button available at the top of this page to schedule it.