Agent Skill · Apify

apify-sdk-integration

Integrate Apify into an existing JavaScript/TypeScript or Python application using the apify-client package. Use when adding web scraping, automation, or data extraction capabilities to an existing app via the Apify API.

Provider: Apify Path in repo: skills/apify-sdk-integration/SKILL.md

Skill body

Apify SDK Integration

Add Apify Actor execution to an existing application. This skill covers the apify-client package for JS/TS and Python, plus the REST API for other languages.

When to Use This Skill

Critical: Package Naming

apify-client is the API client for calling Actors from your app. apify is the SDK for building Actors (wrong package for this use case).

Always install apify-client. Never install apify for integration work.

Prerequisites

The user needs an APIFY_TOKEN. Direct them to Console > Settings > Integrations at https://console.apify.com/settings/integrations to create one. If they don’t have an account: https://console.apify.com/sign-up (free, no credit card).

Store the token securely — environment variable or secrets manager, never hardcoded.

Finding the Right Actor

Before writing integration code, find the Actor that fits the user’s needs. Use the MCP tools if available:

Alternatively, browse https://apify.com/store. Append .md to any Actor’s Store URL to get its docs in markdown.

JavaScript / TypeScript

Install

npm install apify-client

Synchronous Execution (wait for results)

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('apify/web-scraper').call({
    startUrls: [{ url: 'https://example.com' }],
    maxPagesPerCrawl: 10,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();

.call() blocks until the Actor finishes. Use for short-running Actors (under a few minutes).

Asynchronous Execution (start and poll/retrieve later)

const run = await client.actor('apify/web-scraper').start({
    startUrls: [{ url: 'https://example.com' }],
});

// Poll for completion
const finishedRun = await client.run(run.id).waitForFinish();

// Retrieve results
const { items } = await client.dataset(finishedRun.defaultDatasetId).listItems();

Use .start() + .waitForFinish() for long-running Actors or when you need the run ID immediately.

Retrieving Results

// Dataset items (structured data from pushData)
const { items } = await client.dataset(run.defaultDatasetId).listItems({
    limit: 100,
    offset: 0,
});

// Key-value store (files, screenshots, etc.)
const record = await client.keyValueStore(run.defaultKeyValueStoreId).getRecord('OUTPUT');

Error Handling

try {
    const run = await client.actor('apify/web-scraper').call(input);

    if (run.status !== 'SUCCEEDED') {
        const log = await client.log(run.id).get();
        throw new Error(`Actor failed with status ${run.status}: ${log}`);
    }

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
} catch (error) {
    if (error.message?.includes('not found')) {
        // Actor ID is wrong or Actor was deleted
    } else if (error.statusCode === 401) {
        // Invalid or missing APIFY_TOKEN
    }
    throw error;
}

Python

Install

pip install apify-client

Synchronous Execution

from apify_client import ApifyClient
import os

client = ApifyClient(token=os.environ['APIFY_TOKEN'])

run = client.actor('apify/web-scraper').call(run_input={
    'startUrls': [{'url': 'https://example.com'}],
    'maxPagesPerCrawl': 10,
})

items = client.dataset(run['defaultDatasetId']).list_items().items

Asynchronous Execution

run = client.actor('apify/web-scraper').start(run_input={
    'startUrls': [{'url': 'https://example.com'}],
})

# Poll for completion
finished_run = client.run(run['id']).wait_for_finish()

items = client.dataset(finished_run['defaultDatasetId']).list_items().items

Async Client (asyncio)

from apify_client import ApifyClientAsync

client = ApifyClientAsync(token=os.environ['APIFY_TOKEN'])

run = await client.actor('apify/web-scraper').call(run_input={
    'startUrls': [{'url': 'https://example.com'}],
})

items = (await client.dataset(run['defaultDatasetId']).list_items()).items

REST API (Any Language)

For languages without an official client, use the REST API directly.

Start a Run

POST https://api.apify.com/v2/acts/{actorId}/runs
Authorization: Bearer <APIFY_TOKEN>
Content-Type: application/json

{ "startUrls": [{ "url": "https://example.com" }] }

Get Run Status

GET https://api.apify.com/v2/acts/{actorId}/runs/{runId}
Authorization: Bearer <APIFY_TOKEN>

Get Dataset Items

GET https://api.apify.com/v2/datasets/{datasetId}/items?format=json
Authorization: Bearer <APIFY_TOKEN>

Full API reference: https://docs.apify.com/api/v2

Best Practices

Documentation

If the Apify MCP server is available, use search-apify-docs and fetch-apify-docs tools for contextual documentation lookups during development.