When to Use a Screenshot API Instead of Puppeteer, Playwright, or Selenium

Whenever you need to capture a screenshot of a webpage, you have two options:

Use a browser automation library like Puppeteer, Playwright, or Selenium directly, or
Use a screenshot API

In the years before you could code literally anything with the help of LLMs, most developers would choose a screenshot API over using Puppeteer, Playwright, or Selenium directly.

You didn’t have to try each library to figure out which one works best for you. You didn’t have to dig through the chosen library’s documentation to learn how to achieve this or that with your screenshots. Also, most screenshot APIs have a free plan, so the choice was obvious, especially when you needed simple screenshots and only needed them occasionally. Sign up with a screenshot API service, get your API key, then call the API whenever you need a screenshot. That’s it.

Fast forward to today: you could have basic screenshot functionality implemented in your codebase in under 5 minutes with a single prompt. And you could even replicate an entire screenshot API service in just 4-6 weeks if needed.

These figures are not an exaggeration. It took me only 6 weeks to build the entire screenshot engine part of Screenshot Scout, our screenshot API service. Without the heavy use of AI, that would have been impossible.

Considering this, you might be wondering: do I still need a screenshot API? Or should I go with Puppeteer/Playwright/Selenium directly, given they can be implemented so quickly and easily these days?

Given my experience building a screenshot API, which uses one of the mentioned browser automation libraries under the hood, I wanted to give an honest answer both to myself and to anyone making their choice between the two options.

There are 4 clear benefits to going with a screenshot API today instead of using Puppeteer, Playwright, or Selenium directly.

1. You want your screenshots to be reliably stripped of cookie banners, ads, and chat widgets 99-100% of the time

Please pay attention to the words “reliably” and “99-100% of the time”.

Puppeteer/Playwright/Selenium don’t have any built-in capabilities to strip any of these at the moment, so you would need to implement these yourself.

This can be achieved in a few ways, depending on which browser automation library you use:

Use one of the several community-driven packages created specifically for removing different annoyances. For Puppeteer, you can go with puppeteer-extra (through its adblocker plugin) or ghostery/adblocker. For Playwright, ghostery/adblocker provides a Playwright package as well.
Run Chromium with a Chrome extension loaded, responsible for stripping cookie banners, ads, and chat widgets. The most well-known and tested is uBlock Origin Lite, which can strip ads, cookie banners, and occasionally chat widgets. There are multiple other options as well.
Use duckduckgo/autoconsent to auto-consent to cookies. This will hide cookie banners from most well-known cookie consent plugins, but it will not be able to hide DIY cookie consent popups. It will not hide ads either.
Implement your own cookie consent popup clicker. Its main purpose would be to find all cookie consent banners on the page and click on Accept/Deny/Close buttons.
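
To make the last option a bit more concrete, here is a minimal Python sketch of the decision step in a custom clicker: given the visible text of the buttons found on a page, choose which one to click. The patterns and the function name `pick_consent_button` are my own illustrative assumptions, not what Screenshot Scout or any library actually uses. Collecting the labels and clicking the chosen button through your automation library is left out.

```python
import re
from typing import Optional, Sequence

# Hypothetical label patterns, listed in priority order.
# Real banners vary widely, so treat this list as illustrative only.
CONSENT_PATTERNS = [
    re.compile(r"^\s*(accept|allow|agree)(\s+(all|cookies|all cookies))?\s*$", re.I),
    re.compile(r"^\s*(reject|deny|decline)(\s+(all|cookies|all cookies))?\s*$", re.I),
    re.compile(r"^\s*(close|got it|ok(ay)?)\s*$", re.I),
]


def pick_consent_button(labels: Sequence[str]) -> Optional[int]:
    """Return the index of the button to click, or None if nothing matches.

    Patterns are tried in priority order, so an "Accept" button wins over
    a "Reject" button, which wins over a plain "Close" button.
    """
    for pattern in CONSENT_PATTERNS:
        for index, label in enumerate(labels):
            if pattern.match(label):
                return index
    return None
```

For example, `pick_consent_button(["Learn more", "Accept all"])` returns `1`. A pattern list like this is exactly the part that needs upkeep: every banner with unusual wording, another language, or a button hidden inside an iframe is a new edge case.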

For the best outcome, you will have to use a combination of the above. But even when you do, it will take months and require ongoing oversight before you can reliably strip 99-100% of all cookie consent popups, ads, and chat widgets.

You will constantly see new edge cases where neither the packages/plugins you use nor your own cookie consent banner clicker worked. Each time, you will have to go in and update the packages, your custom clicker, or both to keep your success rate at stripping cookie banners, ads, and widgets high.

At Screenshot Scout, as at any other screenshot API you might consider, this maintenance is part of the business, and it is what we do daily or weekly.

If you use website screenshots in production and you need to reliably remove all cookie banners, ads, and chat widgets, using a screenshot API gives you a clear win over a DIY solution.

2. You want screenshot jobs to never compete with your app/website for CPU, memory, or timeouts

Whenever you read about the resource requirements of running a headless Chromium instance, your first impression is that RAM will be your main bottleneck. And while it’s true that a single running Chromium instance can require anywhere between 500MB and 1GB of RAM, it’s not RAM that you will be struggling with. It’s the CPU.

In Screenshot Scout, we do screenshot rendering on a fleet of small VPSs. Our smallest VPS instance has 2 vCPUs and 8 GB of RAM. That is 4 GB of RAM per vCPU, which is the standard allocation across most VPS providers, including Hetzner and netcup, although with netcup you get 2 GB of RAM per vCPU on smaller plans.

When someone requests a screenshot render from us, each render takes 3 seconds on average. During this time, RAM usage is only around 1 GB, while the load on both vCPU cores is 90%.

Here’s the load on a larger VPS with 4 vCPUs.

[Image: CPU load with no active screenshot requests]

[Image: CPU load during screenshot rendering]

As you can see, the load is spread out across all 4 cores and is 51% on average for this particular screenshot request. This is much better than the 90% load on the 2-core VPS, but it is still significant.

This means that if you use Puppeteer/Playwright/Selenium directly instead of a screenshot API, and you do the rendering on the same server as other production resources, such as your website, those resources will experience significant performance degradation during those 3 seconds.
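
To put those 3 seconds into perspective, here is a rough back-of-the-envelope calculation in Python. It assumes renders run one at a time and that every render takes the 3-second average mentioned above. Both function names are mine, and your own numbers will differ depending on the pages you capture and the hardware you run on.

```python
def busy_seconds_per_month(screenshots_per_month: int,
                           seconds_per_render: float = 3.0) -> float:
    """Total time per month the CPU spends under heavy rendering load."""
    return screenshots_per_month * seconds_per_render


def max_renders_per_hour(seconds_per_render: float = 3.0) -> int:
    """Upper bound on renders per hour when they run strictly one at a time."""
    return int(3600 // seconds_per_render)
```

Under these assumptions, 10,000 screenshots a month add up to 30,000 seconds, or a little over 8 hours, during which everything else on the same server competes with Chromium for CPU. The ceiling is about 1,200 back-to-back renders per hour on a single instance.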

Of course, depending on your specific implementation of screenshot rendering, the vCPU demand might be smaller. The rendering duration might be shorter if you choose to optimize screenshot rendering performance at the expense of screenshot quality. But still, the performance strain on your other resources on the same server will be noticeable.

You may choose to do the rendering part in the cloud, or on a separate VPS, just like we do and most other screenshot API providers do, but then you will have to compare the price and trouble of managing your own VPS or cloud instance to simply using a third-party screenshot API.

If you need to generate hundreds or thousands of screenshots every month for production use, and you don’t want screenshot rendering to affect the performance of your website or any other primary resources, using a screenshot API over Puppeteer/Playwright/Selenium directly has clear benefits.

3. You want to never experience CAPTCHAs or any other protective measures employed by sites

Occasionally, when you use Puppeteer/Playwright/Selenium directly and try to capture a screenshot of a webpage, instead of the intended webpage captured in the image, you may see a CAPTCHA.

Sites may show you a CAPTCHA for numerous reasons:

You have attempted to capture multiple screenshots of the same page/website in a short period of time.
You misconfigured the Chromium instance, so it’s obvious to the target website that you are accessing it with headless Chromium, and most sites don’t like that.
You haven’t updated your Puppeteer/Playwright/Selenium package for a long time, and therefore are still using an old bundled version of Chromium, which may result in more CAPTCHAs being shown over time.
Some sites are simply more strict than others, and may show CAPTCHAs more often, even to real users.

If you’re using Puppeteer/Playwright/Selenium directly, there are multiple ways to combat this:

You can use residential proxies if you need to capture the same website over and over.
You can use stealth plugins for Puppeteer/Playwright/Selenium to mask that you are using headless Chromium, and you can keep your Puppeteer/Playwright/Selenium packages updated to make sure you are always using the latest bundled Chromium version.
If some sites show you CAPTCHAs anyway, you can choose to use a CAPTCHA solver like the one from Bright Data.

If you don’t want to constantly keep track of Puppeteer/Playwright/Selenium updates, or if you don’t want to use any third-party services like residential proxies or especially CAPTCHA solvers, be that for compliance or any other reason, using a screenshot API is a good alternative.

Note that some screenshot API services may be better at CAPTCHA prevention than others. If you’re already experiencing CAPTCHAs with a page/website, sign up for several screenshot API services, try to capture the same page across all services, and then compare the results.

4. You need additional functionality beyond what Puppeteer/Playwright/Selenium expose directly

Puppeteer/Playwright/Selenium are browser automation libraries. They are not screenshot libraries.

When you capture screenshots, you may occasionally need some functionality that Puppeteer/Playwright/Selenium don’t have, but most screenshot APIs do.

Below, I will list some of that functionality. Note that most of this functionality can easily be implemented with the help of AI, which is why this advantage of using screenshot APIs over Puppeteer/Playwright/Selenium directly is not as significant as the other 3 listed before.

Caching

You may find that you do not always need the latest version of how a webpage/site looks.

When you use caching, the screenshot API will try to return the image directly from cache. If there’s no image there, or if it has expired, the screenshot will be regenerated.

You can also specify how long the image stays in cache before it expires.

Cached requests are not billed, meaning you don't have to worry about the cost of requesting these screenshots.
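
If you go the DIY route, the caching behavior described above could look roughly like this in Python. This is a minimal in-memory sketch, not how any particular screenshot API implements it: the class name `ScreenshotCache`, the `render` callback, and the injectable `clock` are all my own assumptions. A production version would also need eviction and persistent storage.

```python
import time
from typing import Callable, Dict, Tuple


class ScreenshotCache:
    """In-memory cache that re-renders a screenshot once its entry expires."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get_or_render(self, url: str, render: Callable[[str], bytes]) -> bytes:
        now = self._clock()
        entry = self._entries.get(url)
        # Serve from cache only while the entry is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Missing or expired: render again and store the fresh image.
        image = render(url)
        self._entries[url] = (now, image)
        return image
```

Here `render` stands in for whatever function actually captures the page, and `ttl_seconds` plays the role of the cache lifetime you would otherwise pass to the screenshot API.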

S3-compatible storage integration

If you need to store the screenshots you make, most screenshot APIs offer an integration with your S3-compatible storage.

You capture a screenshot, save it to your own bucket, then use the saved screenshots for any purpose you might have, for as long as you need them.

The integration is usually very simple, only requiring you to provide your bucket name and storage credentials.

Geographical routing

Sometimes, webpages show different content depending on what country the visitor is from.

Most screenshot APIs allow you to specify a country, sometimes even a city, from which the webpage will be accessed before the screenshot is made. This ensures that the webpage in the screenshot looks exactly as a real visitor from the chosen location would see it.

To achieve the same when using Puppeteer/Playwright/Selenium directly, you would have to use residential proxies.

Full-page capturing

Even though Puppeteer, Playwright, and Selenium all have the built-in ability to capture full-page screenshots, you will find that the resulting screenshots often lack a lot of lazy-loaded content.

To fix that, you will need to manually scroll the page top to bottom, capture every viewport separately, then stitch all the captures together.
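
The planning part of that scroll-and-stitch approach can be sketched in a few lines of Python. Given the page height and the viewport height in pixels, the hypothetical `capture_plan` function below returns where to scroll for each capture and how many pixels to crop from the top of that capture before stitching. It assumes the page height is already known and stays fixed, which is often not true for pages that keep growing as lazy-loaded content arrives.

```python
from typing import List, Tuple


def capture_plan(page_height: int, viewport_height: int) -> List[Tuple[int, int]]:
    """Return (scroll_y, crop_top) pairs covering the page top to bottom.

    scroll_y is the vertical scroll position for the capture. crop_top is
    the number of pixels at the top of that capture that overlap with the
    previous one and should be dropped when stitching.
    """
    if page_height <= 0 or viewport_height <= 0:
        raise ValueError("heights must be positive")
    # A browser cannot scroll past the point where the viewport hits the bottom.
    max_scroll = max(page_height - viewport_height, 0)
    plan = []
    for target in range(0, page_height, viewport_height):
        scroll_y = min(target, max_scroll)
        plan.append((scroll_y, target - scroll_y))
    return plan
```

For a 2,500 px page and a 1,000 px viewport, this yields captures at 0, 1,000, and 1,500 px, with the top 500 px of the last capture cropped. In practice you would also wait after each scroll so lazy-loaded assets have time to appear, and re-measure the page height if it changes.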

When you use a screenshot API, you don’t have to worry about that. All the full-page screenshots will come out as expected, with all lazy-loaded assets triggered and present in the screenshot.

I don't need any of the above. Does it still make sense for me to use a screenshot API?

Honestly, probably not.

Don’t get me wrong: depending on what you need from your screenshots, implementing everything yourself may take a lot of time, even with the help of AI. At the time of writing this blog post, Screenshot Scout has 70 screenshot options, covering every possible use case. And it might take 4-6 weeks to replicate that with AI.

But I don’t think you should add another dependency to your project in the form of a screenshot API unless at least one of the following applies:

You need clean, production-ready screenshots, with cookie popups, ads, and chat widgets reliably removed.
You need screenshots at scale, you don’t want that to affect the performance of your current production deployment, and you don’t want to manage a dedicated VPS/cloud instance for screenshot rendering.
You’re using screenshots in production, and you don’t want to ever experience CAPTCHAs and don’t want to deal with them yourself for compliance or similar reasons.
You need a lot of functionality that Puppeteer/Playwright/Selenium don’t expose directly, but most screenshot APIs have: caching, S3 integration, geographical routing of requests, and full-page capturing.

If any of the above is true for you, using a screenshot API is a better choice than using Puppeteer/Playwright/Selenium directly.

I hope this was helpful.

Happy screenshotting!