Screenshot and Image Capture Guide
Generated: 2025-01-27
Last Updated: 2025-01-27
Purpose: Comprehensive guide for capturing clean screenshots and images for documentation
Tools: Puppeteer, @knowcode/imgfetch
Overview
This guide provides detailed instructions for capturing screenshots and images for product documentation, competitor analysis, and visual assets. It covers two primary tools and includes strategies for handling cookie consent banners that often obstruct important content.
Tool Comparison
| Feature | Puppeteer | @knowcode/imgfetch |
|---|---|---|
| Type | Full browser automation | Simple image downloader |
| Best For | Dynamic sites, interactions | Static images, existing assets |
| Cookie Handling | Full control | Limited |
| Setup Complexity | Medium | Low |
| Resource Usage | High | Low |
Using @knowcode/imgfetch
Basic Usage
# Install globally
npm install -g @knowcode/imgfetch
# Download a single image
imgfetch https://example.com/logo.png -o ./images/
# Download with custom name
imgfetch https://example.com/logo.png -o ./images/competitor-logo.png
# Batch download from URL list
imgfetch -f urls.txt -o ./images/
Creating a Batch Download Script
// download-images.js
const { exec } = require('child_process');
const { promisify } = require('util');
const fs = require('fs').promises;

// Promise-returning exec so each download can be awaited in turn
const execAsync = promisify(exec);

const imageUrls = [
  { url: 'https://activecampaign.com/images/logo.svg', name: 'activecampaign-logo' },
  { url: 'https://mailchimp.com/assets/images/logo.png', name: 'mailchimp-logo' },
  { url: 'https://convertkit.com/images/brand/logo.png', name: 'convertkit-logo' }
];

async function downloadImages() {
  // Make sure the output directory exists before downloading
  await fs.mkdir('./images', { recursive: true });
  for (const img of imageUrls) {
    // Quote the URL and path so the shell does not interpret special characters
    const command = `imgfetch "${img.url}" -o "./images/${img.name}.png"`;
    try {
      await execAsync(command);
      console.log(`✅ Downloaded ${img.name}`);
    } catch (error) {
      console.error(`Error downloading ${img.name}:`, error.message);
    }
  }
}

downloadImages();
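The script above names every download `.png`, although the first URL in the list points at an SVG. One way to avoid the mismatch is to take the extension from the URL itself. The sketch below is only an illustration: `outputFileName` is a hypothetical helper, not part of imgfetch, and it assumes the extension in the URL path reflects the real file type.

```javascript
// Hypothetical helper: keep the source file's extension instead of forcing ".png".
function outputFileName(url, name, fallbackExt = '.png') {
  // URL is a global in Node.js; pathname excludes the query string and fragment.
  const { pathname } = new URL(url);
  const match = pathname.match(/\.[a-z0-9]+$/i);
  const ext = match ? match[0].toLowerCase() : fallbackExt;
  return `${name}${ext}`;
}

console.log(outputFileName('https://activecampaign.com/images/logo.svg', 'activecampaign-logo'));
// prints "activecampaign-logo.svg"
```

The result can be dropped into the command string in place of `${img.name}.png`.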
Using Puppeteer for Screenshots
Installation and Setup
# Install Puppeteer
npm install puppeteer
# Or use puppeteer-core, which does not download a browser (point it at an installed Chrome)
npm install puppeteer-core
Basic Screenshot Capture
const puppeteer = require('puppeteer');
async function captureScreenshot(url, outputPath) {
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });
  try {
    const page = await browser.newPage();
    // Set viewport for consistent screenshots
    await page.setViewport({
      width: 1920,
      height: 1080,
      deviceScaleFactor: 1
    });
    await page.goto(url, {
      waitUntil: 'networkidle2',
      timeout: 30000
    });
    await page.screenshot({
      path: outputPath,
      fullPage: false
    });
  } finally {
    // Always close the browser, even if navigation or capture fails
    await browser.close();
  }
}
// Usage (create ./screenshots beforehand if it does not exist)
captureScreenshot('https://example.com', './screenshots/example-homepage.png')
  .catch((error) => console.error(error.message));
🍪 Handling Cookie Consent Banners
Important Learnings from Production Use
Based on extensive screenshot capture experience, here are key insights:
- Puppeteer Version Compatibility: Use `headless: 'new'` instead of `headless: true` on Puppeteer versions that still default to the old headless mode (roughly v19 to v21); from v22 onward `headless: true` already selects the new mode
- Method Changes: `page.waitForTimeout()` is deprecated, and removed in recent releases - use `await new Promise(r => setTimeout(r, ms))` instead
- Navigation Strategies: Use `waitUntil: 'domcontentloaded'` for faster captures when full network idle isn't needed
- Timeout Management: Many sites take longer than expected - use shorter timeouts (10-20s) and handle failures gracefully
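The method and navigation points above can be wrapped in two small helpers. This is a minimal sketch: `sleep` is the replacement the list describes, while `navigationOptions` and its `fast` and `timeoutMs` parameters are hypothetical names chosen for illustration.

```javascript
// Replacement for the deprecated page.waitForTimeout(ms).
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: choose page.goto() options for a fast or a thorough capture.
function navigationOptions({ fast = true, timeoutMs = 15000 } = {}) {
  return {
    waitUntil: fast ? 'domcontentloaded' : 'networkidle2',
    timeout: timeoutMs
  };
}

// Usage with a Puppeteer page (not executed here):
//   await page.goto(url, navigationOptions({ fast: true }));
//   await sleep(2000);
console.log(navigationOptions({ fast: false, timeoutMs: 20000 }));
```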
Strategy 1: Click Accept Buttons
async function dismissCookieConsent(page) {
// Common cookie consent button selectors
const cookieSelectors = [
// Aria labels
'[aria-label*="accept" i]',
'[aria-label*="agree" i]',
'[aria-label*="consent" i]',
'[aria-label*="cookie" i]',
// IDs and classes
'button[id*="accept" i]',
'button[class*="accept" i]',
'button[class*="consent" i]',
'button[class*="agree" i]',
'a[id*="accept" i]',
// Text content: ':contains()' is jQuery syntax, not valid CSS, so page.$() rejects it.
// Match button text inside page.evaluate() instead of listing it here.
// Common CSS classes
'.cookie-consent-accept',
'.accept-cookies',
'#cookie-accept',
'.gdpr-accept',
// Data attributes
'[data-testid*="accept" i]',
'[data-action*="accept" i]'
];
// Try each selector
for (const selector of cookieSelectors) {
try {
// Check if element exists
const element = await page.$(selector);
if (element) {
// Check if visible
const isVisible = await element.isIntersectingViewport();
if (isVisible) {
await element.click();
console.log(`✅ Clicked cookie consent: ${selector}`);
await new Promise(r => setTimeout(r, 1000)); // Wait for animation
return true;
}
}
} catch (e) {
// Continue to next selector
}
}
return false;
}
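Matching a button by its visible text cannot be done with a plain CSS selector, so it has to happen in JavaScript. The sketch below shows one possible approach: a pure helper that normalises a label, and a function that applies the same comparison inside the page. `ACCEPT_LABELS`, `isAcceptLabel`, and `clickAcceptByText` are hypothetical names, and the label list is illustrative rather than exhaustive.

```javascript
// Labels to look for; an illustrative list, not an exhaustive one.
const ACCEPT_LABELS = ['accept', 'accept all', 'i agree', 'got it', 'ok', 'allow', 'allow all'];

// Pure helper: collapse whitespace, lowercase, and compare against the list.
function isAcceptLabel(text, labels = ACCEPT_LABELS) {
  const normalised = String(text || '').replace(/\s+/g, ' ').trim().toLowerCase();
  return labels.includes(normalised);
}

// Runs the same comparison in the browser context and clicks the first match.
// `page` is a Puppeteer Page; resolves to true if something was clicked.
async function clickAcceptByText(page, labels = ACCEPT_LABELS) {
  return page.evaluate((wanted) => {
    const candidates = document.querySelectorAll('button, a, [role="button"]');
    for (const el of candidates) {
      const label = (el.textContent || '').replace(/\s+/g, ' ').trim().toLowerCase();
      if (wanted.includes(label)) {
        el.click();
        return true;
      }
    }
    return false;
  }, labels);
}

console.log(isAcceptLabel('  Accept   All '));
// prints true
```

An exact match on the whole label is deliberately strict, so a button such as "Manage preferences" is never clicked by accident.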
Strategy 2: Remove Banner Elements
async function removeCookieBanners(page) {
await page.evaluate(() => {
// Common cookie banner selectors
const bannerSelectors = [
'[class*="cookie-banner"]',
'[class*="cookie-consent"]',
'[class*="cookie-notice"]',
'[class*="gdpr"]',
'[class*="privacy-banner"]',
'[id*="cookie-banner"]',
'[id*="cookie-consent"]',
'[id*="gdpr"]',
'.cc-banner',
'.cc-window',
'#cookieConsent',
'.cookieConsent',
'.cookie-popup',
'.privacy-popup'
];
bannerSelectors.forEach(selector => {
const elements = document.querySelectorAll(selector);
elements.forEach(el => {
// Only remove if it looks like a cookie banner
const text = el.textContent.toLowerCase();
if (text.includes('cookie') ||
text.includes('privacy') ||
text.includes('consent') ||
text.includes('gdpr')) {
el.style.display = 'none';
el.remove();
}
});
});
// Remove fixed/sticky elements that might be cookie banners
document.querySelectorAll('*').forEach(el => {
const style = window.getComputedStyle(el);
if ((style.position === 'fixed' || style.position === 'sticky') &&
el.textContent.toLowerCase().includes('cookie')) {
el.remove();
}
});
});
}
Strategy 3: Block Cookie Consent Services
async function blockCookieServices(page) {
await page.setRequestInterception(true);
const blockedDomains = [
'cookiebot.com',
'cookieconsent.com',
'cookieyes.com',
'onetrust.com',
'trustarc.com',
'quantcast.com',
'consensu.org',
'privacy-center.org',
'cookiepro.com',
'termly.io',
'iubenda.com'
];
page.on('request', (request) => {
const url = request.url();
if (blockedDomains.some(domain => url.includes(domain))) {
console.log(`🚫 Blocked cookie service: ${url}`);
request.abort();
} else {
request.continue();
}
});
}
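The `url.includes(domain)` check above also matches URLs that merely mention a blocked domain, for example in a query string. A stricter variant compares hostnames only. The following is a sketch under that assumption; `isBlockedHost` and the shortened `BLOCKED_DOMAINS` list are hypothetical names for illustration.

```javascript
const BLOCKED_DOMAINS = ['cookiebot.com', 'onetrust.com', 'trustarc.com'];

// True when the request's hostname is a blocked domain or one of its subdomains.
function isBlockedHost(requestUrl, domains = BLOCKED_DOMAINS) {
  let hostname;
  try {
    hostname = new URL(requestUrl).hostname.toLowerCase();
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  return domains.some((domain) => hostname === domain || hostname.endsWith(`.${domain}`));
}

console.log(isBlockedHost('https://cdn.cookiebot.com/uc.js'));
// prints true
console.log(isBlockedHost('https://example.com/?ref=onetrust.com'));
// prints false
```

Inside the request handler, `isBlockedHost(request.url())` would replace the `blockedDomains.some(...)` expression.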
Complete Screenshot Solution
Full Implementation with All Strategies
const puppeteer = require('puppeteer');
const fs = require('fs').promises;
const path = require('path');
class ScreenshotCapture {
constructor(options = {}) {
this.options = {
viewport: { width: 1920, height: 1080, deviceScaleFactor: 1 },
timeout: 30000,
...options
};
}
async init() {
this.browser = await puppeteer.launch({
headless: 'new',
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-web-security',
'--disable-features=IsolateOrigins,site-per-process'
]
});
}
async captureScreenshot(url, outputPath, options = {}) {
const page = await this.browser.newPage();
try {
// Set viewport
await page.setViewport(this.options.viewport);
// Block cookie services
await this.blockCookieServices(page);
// Navigate to page
await page.goto(url, {
waitUntil: 'networkidle2',
timeout: this.options.timeout
});
// Wait a bit for dynamic content
await new Promise(r => setTimeout(r, 2000));
// Try to dismiss cookie consent
const dismissed = await this.dismissCookieConsent(page);
// If not dismissed, try removing banners
if (!dismissed) {
await this.removeCookieBanners(page);
}
// Take screenshot
const screenshotOptions = {
path: outputPath,
fullPage: options.fullPage || false,
...options
};
await page.screenshot(screenshotOptions);
console.log(`✅ Screenshot saved: ${outputPath}`);
} catch (error) {
console.error(`❌ Error capturing ${url}:`, error.message);
} finally {
await page.close();
}
}
async dismissCookieConsent(page) {
const selectors = [
// Most common patterns
'button[onclick*="accept"]',
'button[class*="accept-all"]',
'button[id="onetrust-accept-btn-handler"]',
'button[class="cc-btn cc-dismiss"]',
'button[data-action="accept"]',
'a[class*="cc-btn cc-dismiss"]',
// Generic patterns (':contains()' is not valid CSS, so text matching is left out here)
'[aria-label*="accept" i]'
];
for (const selector of selectors) {
try {
await page.waitForSelector(selector, { timeout: 3000 });
await page.click(selector);
await new Promise(r => setTimeout(r, 1000));
return true;
} catch (e) {
// Try next selector
}
}
return false;
}
async removeCookieBanners(page) {
await page.evaluate(() => {
const selectors = [
'.cookie-banner',
'.cookie-consent',
'.gdpr-banner',
'#cookie-notice',
'[class*="cookieconsent"]',
'[id*="cookieconsent"]'
];
selectors.forEach(selector => {
document.querySelectorAll(selector).forEach(el => el.remove());
});
});
}
async blockCookieServices(page) {
await page.setRequestInterception(true);
const blockedPatterns = [
/cookiebot/,
/onetrust/,
/cookieconsent/,
/trustarc/,
/quantcast/
];
page.on('request', (request) => {
if (blockedPatterns.some(pattern => pattern.test(request.url()))) {
request.abort();
} else {
request.continue();
}
});
}
async close() {
if (this.browser) {
await this.browser.close();
}
}
}
// Usage example
async function captureCompetitorScreenshots() {
const capture = new ScreenshotCapture();
await capture.init();
const competitors = [
{ url: 'https://www.activecampaign.com', name: 'activecampaign' },
{ url: 'https://mailchimp.com', name: 'mailchimp' },
{ url: 'https://www.convertkit.com', name: 'convertkit' },
{ url: 'https://www.getresponse.com', name: 'getresponse' },
{ url: 'https://www.constantcontact.com', name: 'constantcontact' }
];
// Ensure output directory exists
await fs.mkdir('./screenshots/competitors', { recursive: true });
// Capture each competitor
for (const competitor of competitors) {
// Homepage
await capture.captureScreenshot(
competitor.url,
`./screenshots/competitors/${competitor.name}-homepage.png`
);
// Pricing page
await capture.captureScreenshot(
`${competitor.url}/pricing`,
`./screenshots/competitors/${competitor.name}-pricing.png`
);
// Features page
await capture.captureScreenshot(
`${competitor.url}/features`,
`./screenshots/competitors/${competitor.name}-features.png`
);
}
await capture.close();
}
// Run the capture
captureCompetitorScreenshots();
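The three near-identical `captureScreenshot` calls in the loop could also be generated from data. The sketch below is one way to do that, not the author's code: `buildCaptureJobs`, the `pages` list, and the `outDir` parameter are hypothetical names.

```javascript
// Hypothetical helper: expand each competitor into one capture job per page path.
function buildCaptureJobs(competitors, pages, outDir = './screenshots/competitors') {
  const jobs = [];
  for (const { url, name } of competitors) {
    for (const { path, label } of pages) {
      jobs.push({
        // new URL() resolves the path against the base and avoids doubled slashes
        url: new URL(path, url).href,
        output: `${outDir}/${name}-${label}.png`
      });
    }
  }
  return jobs;
}

const exampleJobs = buildCaptureJobs(
  [{ url: 'https://mailchimp.com', name: 'mailchimp' }],
  [{ path: '/', label: 'homepage' }, { path: '/pricing', label: 'pricing' }]
);
console.log(exampleJobs);
```

Each job can then be passed to `capture.captureScreenshot(job.url, job.output)` in a single loop.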
Batch Processing Script
Complete Workflow for Documentation
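// capture-all-assets.js
const { exec: execCallback } = require('child_process');
const { promisify } = require('util');
const fs = require('fs').promises;
// child_process has no `.promises` export, so wrap exec with promisify
const exec = promisify(execCallback);
// Assumption: the ScreenshotCapture class from the previous section is saved as
// ./screenshot-capture.js (without its usage example) and exported with
// `module.exports = { ScreenshotCapture }`
const { ScreenshotCapture } = require('./screenshot-capture');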
class DocumentationAssetCapture {
constructor() {
this.screenshotCapture = new ScreenshotCapture();
this.assets = {
logos: [
{ url: 'https://activecampaign.com/images/logo.svg', name: 'activecampaign-logo' },
{ url: 'https://mailchimp.com/release/plums/cxp/images/logo_freddie_black.svg', name: 'mailchimp-logo' }
],
screenshots: [
{ url: 'https://activecampaign.com', name: 'activecampaign-home', fullPage: false },
{ url: 'https://activecampaign.com/pricing', name: 'activecampaign-pricing', fullPage: true }
],
features: [
{ url: 'https://activecampaign.com/features/email-designer', name: 'ac-email-designer' },
{ url: 'https://mailchimp.com/features/email-templates/', name: 'mc-email-templates' }
]
};
}
async captureAll() {
console.log('🚀 Starting asset capture...\n');
// Create directories
await this.createDirectories();
// Download logos with imgfetch
console.log('📥 Downloading logos...');
await this.downloadLogos();
// Capture screenshots with Puppeteer
console.log('\n📸 Capturing screenshots...');
await this.captureScreenshots();
// Generate summary
await this.generateSummary();
console.log('\n✅ Asset capture complete!');
}
async createDirectories() {
const dirs = [
'./images/logos',
'./images/screenshots',
'./images/features',
'./images/ui-elements'
];
for (const dir of dirs) {
await fs.mkdir(dir, { recursive: true });
}
}
async downloadLogos() {
for (const logo of this.assets.logos) {
try {
// Note: the output name is forced to .png even when the source URL is an SVG
await exec(`imgfetch "${logo.url}" -o "./images/logos/${logo.name}.png"`);
console.log(`✅ Downloaded ${logo.name}`);
} catch (error) {
console.error(`❌ Failed to download ${logo.name}:`, error.message);
}
}
}
async captureScreenshots() {
await this.screenshotCapture.init();
for (const screenshot of this.assets.screenshots) {
await this.screenshotCapture.captureScreenshot(
screenshot.url,
`./images/screenshots/${screenshot.name}.png`,
{ fullPage: screenshot.fullPage }
);
}
await this.screenshotCapture.close();
}
async generateSummary() {
const summary = {
capturedAt: new Date().toISOString(),
assets: {
logos: this.assets.logos.length,
screenshots: this.assets.screenshots.length,
features: this.assets.features.length
},
total: this.assets.logos.length + this.assets.screenshots.length + this.assets.features.length
};
await fs.writeFile(
'./images/capture-summary.json',
JSON.stringify(summary, null, 2)
);
}
}
// Run the capture
const capture = new DocumentationAssetCapture();
capture.captureAll();
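Note that `generateSummary()` counts the configured assets, including the `features` entries that `captureAll()` never captures, rather than what was actually saved. A summary built from per-asset results may be more useful. The sketch below assumes each step records an object shaped like `{ name, ok }`; that shape and the `summariseResults` name are hypothetical, and the capture methods above would need to return a success flag to feed it.

```javascript
// Hypothetical helper: build a summary from per-asset results ({ name, ok }).
function summariseResults(results, now = new Date()) {
  const summary = {
    capturedAt: now.toISOString(),
    succeeded: 0,
    failed: 0,
    failedNames: []
  };
  for (const { name, ok } of results) {
    if (ok) {
      summary.succeeded += 1;
    } else {
      summary.failed += 1;
      summary.failedNames.push(name);
    }
  }
  return summary;
}

console.log(summariseResults([
  { name: 'activecampaign-logo', ok: true },
  { name: 'activecampaign-pricing', ok: false }
]));
```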
Production-Ready Simplified Approach
Based on real-world experience capturing 60+ screenshots, here's a streamlined approach that works reliably:
const puppeteer = require('puppeteer');
async function simplifiedCapture(url, outputPath) {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });
  try {
    const page = await browser.newPage();
    await page.setViewport({ width: 1920, height: 1080 });
    // Faster navigation - don't wait for all resources
    await page.goto(url, {
      waitUntil: 'domcontentloaded',
      timeout: 20000
    });
    // Brief wait for content
    await new Promise(r => setTimeout(r, 3000));
    // Try the common accept button first, while the banner is still visible
    // (page.click() fails on elements hidden by the CSS injected below)
    try {
      await page.click('#onetrust-accept-btn-handler');
      await new Promise(r => setTimeout(r, 1000));
    } catch (e) {
      // Continue if button doesn't exist
    }
    // Simple cookie banner removal via CSS injection
    await page.addStyleTag({
      content: `
        [class*="cookie"], [id*="cookie"],
        [class*="consent"], [id*="consent"],
        [class*="gdpr"], [id*="gdpr"],
        .onetrust-pc-dark-filter, #onetrust-banner-sdk {
          display: none !important;
        }
      `
    });
    await page.screenshot({ path: outputPath });
    console.log(`✅ Captured: ${outputPath}`);
  } catch (error) {
    console.error(`❌ Failed: ${error.message}`);
  } finally {
    // Closing the browser also closes its pages
    await browser.close();
  }
}
Key Simplifications:
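- Use `domcontentloaded` instead of `networkidle2` for faster captures
- CSS injection is more reliable than JavaScript removal for cookie banners, because an injected rule also hides banners that are added after it runs
- A separate browser instance per capture avoids memory build-up across long runs, at the cost of a fresh launch each time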
- Shorter timeouts with proper error handling
- Minimal selector attempts - just try the most common accept button
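The hiding rule used for CSS injection can be generated from a keyword list, which makes it easier to adjust per site. This is a minimal sketch; `buildHideCss` and its parameters are hypothetical names, and broad keywords such as `cookie` may also hide unrelated elements whose class or id happens to contain them.

```javascript
// Hypothetical helper: turn attribute keywords and extra selectors into one hiding rule.
function buildHideCss(keywords, extraSelectors = []) {
  const attributeSelectors = keywords.flatMap((word) => [
    `[class*="${word}"]`,
    `[id*="${word}"]`
  ]);
  const selectors = [...attributeSelectors, ...extraSelectors];
  if (selectors.length === 0) return '';
  return `${selectors.join(', ')} { display: none !important; }`;
}

const css = buildHideCss(['cookie', 'consent', 'gdpr'], ['#onetrust-banner-sdk']);
console.log(css);
// Usage with a Puppeteer page (not executed here):
//   await page.addStyleTag({ content: css });
```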
Batch Processing with Rate Limiting:
async function batchCapture(urls, delay = 2000) {
for (const { url, output } of urls) {
await simplifiedCapture(url, output);
await new Promise(r => setTimeout(r, delay));
}
}
Best Practices
1. Image Organization
images/
├── logos/
│   ├── competitor-logos/
│   └── integration-logos/
├── screenshots/
│   ├── competitors/
│   ├── features/
│   └── ui-elements/
├── diagrams/
└── icons/
2. Naming Conventions
- Use lowercase with hyphens: `mailchimp-pricing-page.png`
- Include context: `activecampaign-email-builder-screenshot.png`
- Add dimensions for variants: `logo-horizontal-200x50.png`
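These conventions are easy to enforce with a small naming function. The sketch below is one possible implementation; `assetFileName` and its options are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: build lowercase, hyphenated file names with optional dimensions.
function assetFileName(parts, { width, height, ext = 'png' } = {}) {
  const slug = parts
    .map((part) => String(part).toLowerCase().trim()
      .replace(/[^a-z0-9]+/g, '-')   // anything that is not a letter or digit becomes a hyphen
      .replace(/^-+|-+$/g, ''))      // trim leading and trailing hyphens
    .filter(Boolean)
    .join('-');
  const size = width && height ? `-${width}x${height}` : '';
  return `${slug}${size}.${ext}`;
}

console.log(assetFileName(['Mailchimp', 'Pricing Page']));
// prints "mailchimp-pricing-page.png"
console.log(assetFileName(['logo', 'horizontal'], { width: 200, height: 50 }));
// prints "logo-horizontal-200x50.png"
```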
3. Screenshot Checklist
- Clear cookie banners
- Hide personal data
- Consistent viewport size
- Good contrast/visibility
- No loading spinners
- Meaningful content visible
4. Performance Tips
- Reuse browser instance for multiple screenshots
- Process in batches to avoid memory issues
- Compress images after capture
- Use WebP format for smaller file sizes
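For the WebP tip, Puppeteer's `page.screenshot()` accepts a `type` of `'png'`, `'jpeg'`, or `'webp'`, plus a `quality` value for the lossy formats. The sketch below picks those options from the output file extension; `screenshotOptions` is a hypothetical helper and the default quality of 80 is an arbitrary starting point, not a recommendation from the source.

```javascript
// Hypothetical helper: pick page.screenshot() options from the output file extension.
function screenshotOptions(outputPath, { fullPage = false, quality = 80 } = {}) {
  const ext = outputPath.split('.').pop().toLowerCase();
  let type = 'png';
  if (ext === 'webp') type = 'webp';
  if (ext === 'jpg' || ext === 'jpeg') type = 'jpeg';
  const options = { path: outputPath, type, fullPage };
  // quality does not apply to PNG, so only set it for lossy formats
  if (type !== 'png') options.quality = quality;
  return options;
}

console.log(screenshotOptions('./images/screenshots/home.webp'));
// Usage with a Puppeteer page (not executed here):
//   await page.screenshot(screenshotOptions('./images/screenshots/home.webp'));
```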
5. Error Handling
async function safeCapture(url, output, retries = 3) {
for (let i = 0; i < retries; i++) {
try {
await captureScreenshot(url, output);
return;
} catch (error) {
console.log(`Retry ${i + 1}/${retries} for ${url}`);
if (i === retries - 1) throw error;
}
}
}
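`safeCapture` retries immediately, and it only helps when the wrapped function actually throws; the `ScreenshotCapture` class method catches its own errors, so it would never trigger a retry. A generic wrapper with a growing delay between attempts is sketched below. `backoffDelays` and `withRetries` are hypothetical names, and the doubling schedule is an assumption rather than something the source prescribes.

```javascript
// Delay before each retry: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelays(retries, baseMs = 1000) {
  return Array.from({ length: Math.max(retries - 1, 0) }, (_, i) => baseMs * 2 ** i);
}

// Generic retry wrapper; `task` is any async function that throws on failure,
// e.g. () => captureScreenshot(url, output).
async function withRetries(task, retries = 3, baseMs = 1000) {
  const delays = backoffDelays(retries, baseMs);
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt === retries - 1) throw error; // out of attempts
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

console.log(backoffDelays(4, 500));
// prints [ 500, 1000, 2000 ]
```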
Troubleshooting Common Issues
1. Navigation Timeouts
- Issue: "Navigation timeout of 30000 ms exceeded"
- Solution: Use `domcontentloaded` instead of `networkidle2` so navigation resolves earlier, and keep timeouts short so a stuck page fails fast
- Code: `await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 })`
2. Cookie Banner Persistence
- Issue: Cookie banners still visible after removal attempts
- Solution: Combine multiple strategies - CSS injection + element removal + button clicking
- Note: Some sites load banners dynamically after initial page load
3. Memory Issues with Large Batches
- Issue: "Cannot allocate memory" errors after many screenshots
- Solution: Close browser after each capture or every 10-20 captures
- Alternative: Use `page.close()` after each screenshot while reusing browser
4. Deprecated Method Warnings
- Issue: "page.waitForTimeout is not a function"
- Solution: Replace with `await new Promise(r => setTimeout(r, ms))`
5. Headless Detection
- Issue: Sites blocking headless browsers
- Solution: Add stealth plugins or use `headless: false` for problematic sites
Performance Optimization Tips
- Parallel Processing (with caution):
// Process in batches of 5 to avoid overwhelming system
const batchSize = 5;
for (let i = 0; i < urls.length; i += batchSize) {
const batch = urls.slice(i, i + batchSize);
await Promise.all(batch.map(item => simplifiedCapture(item.url, item.output)));
await new Promise(r => setTimeout(r, 2000)); // Delay between batches
}
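- Resource Blocking for faster loads (block only requests that do not change how the page looks; aborting images, stylesheets, or fonts produces unstyled screenshots):
await page.setRequestInterception(true);
page.on('request', (req) => {
// 'media' covers audio and video; add other types only if they are not visible in the capture
if (['media'].includes(req.resourceType())) {
req.abort();
} else {
req.continue();
}
});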
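The batch loop under Parallel Processing can be factored into a reusable pair of functions. The sketch below is one way to do it: `chunk` and `runInBatches` are hypothetical names, and `Promise.allSettled` is swapped in for `Promise.all` so that one rejected capture does not abort the rest of its batch.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical runner: `worker` is any async function, such as
// (item) => simplifiedCapture(item.url, item.output).
async function runInBatches(items, size, worker, pauseMs = 2000) {
  const results = [];
  for (const batch of chunk(items, size)) {
    // allSettled keeps going when one capture in the batch rejects
    const settled = await Promise.allSettled(batch.map((item) => worker(item)));
    results.push(...settled);
    await new Promise((resolve) => setTimeout(resolve, pauseMs)); // delay between batches
  }
  return results;
}

console.log(chunk([1, 2, 3, 4, 5], 2));
// prints [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```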
Additional Resources
- Puppeteer Documentation
- Common Cookie Consent Solutions
- Web Screenshot Best Practices
- Puppeteer Stealth Plugin
Use this guide to efficiently capture all visual assets needed for Phases 8-14 of your product documentation. Updated with real-world learnings from capturing 60+ screenshots across major email marketing platforms.