by oxylabs
AI Scraper is a powerful scraping tool and scrape agent built to automate data extraction with unmatched precision. Ideal for scalable AI scraping tasks across diverse web sources, this tool simplifies complex scraping operations into efficient, intelligent workflows.
# Add to your Claude Code skills
git clone https://github.com/oxylabs/ai-scraper-py
The AI-Scraper is an experimental scraping tool by Oxylabs AI Studio that extracts data from a single webpage using AI. It identifies and parses relevant information based on a natural language prompt, then delivers results in either structured JSON (for automation and APIs) or Markdown format (best for readable outputs and AI workflows).
This AI scraper removes the need for CSS/XPath selectors or custom parsers, so it integrates seamlessly with various automation pipelines. Automatic schema generation and flexible output formats provide users with an easy way to extract clean, structured data without ever needing to maintain parsing logic.
To scrape a webpage with AI-Scraper, follow these steps:
To begin, make sure you have an AI Studio API key (or get a free trial with 1,000 credits) and Python 3.10 or later installed. You can install the oxylabs-ai-studio package using pip:
pip install oxylabs-ai-studio
The following examples show how to use AiScraper to extract data from a sample page.
from oxylabs_ai_studio.apps.ai_scraper import AiScraper
import json
# Initialize the AI Scraper with your API key
scraper = AiScraper(api_key="YOUR_API_KEY")
# Generate a schema automatically from natural language
schema = scraper.generate_schema(prompt="want to parse developer, platform, type, price, game title, and genre (array)")
print(f"Generated schema: {schema}")
# Scrape a webpage and extract structured data
url = "https://sandbox.oxylabs.io/products/3"
result = scraper.scrape(
    url=url,
    output_format="json",
    schema=schema,
    render_javascript=False,
    geo_location="US",
)
# Print the scrape output as JSON
print("Results:")
print(json.dumps(result.data, indent=2))
Learn more about AI-Scraper and Oxylabs AI Studio Python SDK in our PyPI repository. You can also check out our AI Studio JavaScript SDK guide for JS users.
| Parameter | Description | Default Value |
|---------------------|----------------------------------------------------------------|---------------|
| url* | Target URL to scrape | – |
| output_format | Output format (json, markdown) | markdown |
| schema | OpenAPI schema for structured extraction (mandatory for JSON) | – |
| render_javascript | Enable JavaScript rendering | False |
| geo_location | Proxy location in ISO2 format | – |
* – mandatory parameters
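The rules in the table can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: `build_scrape_kwargs` is a hypothetical helper that applies the defaults listed above and rejects a JSON request without a schema.

```python
from typing import Any, Optional

# Output formats listed in the parameter table.
ALLOWED_FORMATS = {"json", "markdown"}


def build_scrape_kwargs(
    url: str,
    output_format: str = "markdown",
    schema: Optional[dict] = None,
    render_javascript: bool = False,
    geo_location: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate arguments against the table above."""
    if not url:
        raise ValueError("url is mandatory")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(ALLOWED_FORMATS)}")
    if output_format == "json" and schema is None:
        raise ValueError("schema is required when output_format is 'json'")

    kwargs: dict[str, Any] = {
        "url": url,
        "output_format": output_format,
        "render_javascript": render_javascript,
    }
    # Optional parameters are only included when they are set.
    if schema is not None:
        kwargs["schema"] = schema
    if geo_location is not None:
        kwargs["geo_location"] = geo_location
    return kwargs
```

The returned dictionary can then be unpacked into the call shown earlier, for example `scraper.scrape(**build_scrape_kwargs(url, output_format="json", schema=schema))`.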
The AI-Scraper can return parsed, ready-to-use output that is easy to integrate into your applications.
Below is an example of the structured JSON output:
Results:
{
"games": [
{
"developer": "Nintendo EAD Tokyo",
"platform": "wii",
"type": "singleplayer",
"price": 91.99,
"title": "Super Mario Galaxy 2",
"genre": [
"Action",
"Platformer"
]
},
{
"developer": "Eidos Interactive",
"platform": "wii",
"type": null,
"price": 80.99,
"title": "Death Jr.: Root of Evil",
"genre": [
"Action",
"Platformer",
"3D"
]
}...
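Because the output is plain JSON, it can be processed with ordinary Python. The sketch below assumes `result.data` has the shape shown in the example (a `games` list of records); the two sample records are copied from that output, and the helper names `titles_by_genre` and `average_price` are illustrative only.

```python
from statistics import mean

# Sample data copied from the first two records of the example output.
data = {
    "games": [
        {
            "developer": "Nintendo EAD Tokyo",
            "platform": "wii",
            "type": "singleplayer",
            "price": 91.99,
            "title": "Super Mario Galaxy 2",
            "genre": ["Action", "Platformer"],
        },
        {
            "developer": "Eidos Interactive",
            "platform": "wii",
            "type": None,
            "price": 80.99,
            "title": "Death Jr.: Root of Evil",
            "genre": ["Action", "Platformer", "3D"],
        },
    ]
}


def titles_by_genre(games: list[dict], genre: str) -> list[str]:
    """Return the titles of all games tagged with the given genre."""
    return [g["title"] for g in games if genre in (g.get("genre") or [])]


def average_price(games: list[dict]) -> float:
    """Return the mean price of all games that have a price."""
    prices = [g["price"] for g in games if g.get("price") is not None]
    return mean(prices)


print(titles_by_genre(data["games"], "3D"))
print(round(average_price(data["games"]), 2))
```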
Alternatively, you can use output_format="markdown" to receive Markdown results instead of parsed JSON.
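A Markdown request might look like the sketch below. It assumes that `result.data` holds the Markdown text in the same way it holds the parsed JSON in the earlier example; `scrape_markdown` is a hypothetical wrapper that accepts any object with a compatible `scrape` method, so it can be exercised without an API key.

```python
from typing import Any, Protocol


class SupportsScrape(Protocol):
    """Anything with a scrape(**kwargs) method, such as an AiScraper instance."""

    def scrape(self, **kwargs: Any) -> Any: ...


def scrape_markdown(scraper: SupportsScrape, url: str) -> Any:
    """Request Markdown output for a single page (no schema is passed)."""
    result = scraper.scrape(url=url, output_format="markdown")
    # Assumption: the Markdown content is exposed on result.data,
    # mirroring the JSON example earlier in this guide.
    return result.data
```

With the real SDK, the call would be along the lines of `scrape_markdown(AiScraper(api_key="YOUR_API_KEY"), "https://sandbox.oxylabs.io/products/3")`.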
Oxylabs AI-Scraper can be applied to a wide variety of data collection tasks:
AI-Scraper doesn’t rely on CSS/XPath selectors or custom parsing logic. Instead, it uses natural language prompts and AI-powered extraction, making it more adaptable to layout changes and much easier to set up.
Yes, you can scrape any publicly accessible webpage. AI-Scraper also supports JavaScript rendering for dynamic pages. Private or login-protected content isn’t supported out of the box.
No, a schema is not mandatory in general, but it is required for structured JSON output. If you don’t have one, AI-Scraper can generate a schema automatically from a natural language prompt.
Unlike traditional scrapers, AI-Scraper is more resilient to layout changes because it interprets content with AI. However, major changes may require you to adjust either your prompt or the schema.
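One way to notice that a prompt or schema may need adjusting is to check extracted records for missing or null fields. This is a minimal sketch under that assumption, not an SDK feature: `find_incomplete_records` is a hypothetical helper, and a null value can also simply mean the page does not list that detail.

```python
def find_incomplete_records(
    records: list[dict], required_fields: list[str]
) -> list[tuple[int, list[str]]]:
    """Return (index, missing_fields) for each record lacking a required value."""
    problems = []
    for index, record in enumerate(records):
        # A field counts as missing if the key is absent or its value is None.
        missing = [f for f in required_fields if record.get(f) is None]
        if missing:
            problems.append((index, missing))
    return problems


# Two trimmed records modelled on the example output in this guide.
sample = [
    {"title": "Super Mario Galaxy 2", "type": "singleplayer", "price": 91.99},
    {"title": "Death Jr.: Root of Evil", "type": None, "price": 80.99},
]
print(find_incomplete_records(sample, ["title", "type", "price"]))
```

A sudden rise in incomplete records after a site redesign would be a hint to revisit the prompt or regenerate the schema.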
You can try Oxylabs AI Studio AI-Scraper for free by signing up for a trial that includes 1,000 credits. After the trial, monthly plans start at $12/month with 3,000 credits and 1 request/s; higher plans offer more credits and higher request rates.
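When scraping several pages on a plan limited to 1 request/s, requests can be spaced out on the client side. The generator below is a simple sketch of that idea and is not part of the SDK; `throttled` and its `clock` and `sleep` parameters are hypothetical names, with the latter two injectable so the pacing can be tested without waiting.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttled(
    items: Iterable[T],
    max_per_second: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than max_per_second."""
    min_interval = 1.0 / max_per_second
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever remains of the minimum interval.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield item
```

A loop such as `for url in throttled(urls): scraper.scrape(url=url)` would then start at most one request per second.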
For a deeper dive into available parameters, advanced integrations, and additional examples, check out the AI Studio documentation.
If you have questions or need support, reach out to us at support@oxylabs.io, use the live chat available in the Oxylabs Dashboard, or join our Discord community. For enterprise-related inquiries, contact your dedicated account manager.