The Despair When Requests Meet "Loading..."

In Chapter 3, we taught how to write a simple web scraper using requests + BeautifulSoup.
This worked perfectly for websites from 10 years ago. But if you try scraping modern sites like PChome, Shopee, or real-time stock quote platforms today, you'll encounter a crushing reality:

The HTML you scrape contains no data—just a single line <div id="loading">Loading...</div>.

This happens because modern websites use React or Vue for "client-side rendering (CSR)." Data is secretly fetched from background APIs only after the browser executes JavaScript. Worse, many sites now add Cloudflare's "bot verification" (that checkbox asking you to prove you're human).

At this point, traditional scrapers fail. We need to bring out the big guns: Playwright.


🎭 What is Playwright?

Playwright is Microsoft's open-source ultimate "browser automation testing tool."
Imagine it as a "ghost engineer." When you run a Playwright Python script, it secretly opens a real Chrome browser in the background, simulating human actions—clicking buttons, scrolling, waiting for spinners to disappear—before finally extracting the visible text.

Since it's a real browser, all dynamically rendered data appears perfectly, and it can bypass many basic anti-scraping measures.


🛠️ Vibe Coding in Action: Scraping Dynamic Stock Data with Playwright

Learning Playwright's syntax is notoriously hard, as it involves complex asynchronous (async/await) operations and DOM element targeting (XPath/CSS Selectors).
But with Cursor, we don't need to memorize syntax.

【Playwright Dynamic Scraping Vibe Prompt】
I'm developing a web scraper in Python.
Target URL: https://example-stock-site.com/ (a hypothetical stock site where tables load dynamically after 3 seconds).

Please use the playwright package to:

  1. Use async/await architecture.
  2. Launch Chromium (set headless=False so I can see the process).
  3. Navigate to the URL, then wait for the #stock-table element to appear (ensuring dynamic data loads).
  4. Extract all .stock-price elements, convert prices to floats, and store them in a Python List.
  5. Save the List as prices.csv.
  6. Include complete Chinese comments and error handling.

AI-generated god-tier script:

import asyncio
import csv
from playwright.async_api import async_playwright

async def scrape_dynamic_stocks():
    # Launch Playwright
    async with async_playwright() as p:
        # Open browser (headless=False shows the actual window)
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()
        
        try:
            print("🌐 Navigating to stock site...")
            await page.goto('https://example-stock-site.com/')
            
            # Critical step: Wait for dynamic table to load! (avoid empty data)
            print("⏳ Waiting for data to load...")
            await page.wait_for_selector('#stock-table', timeout=10000)
            
            # Extract all elements with class 'stock-price'
            price_elements = await page.query_selector_all('.stock-price')
            
            prices = []
            for element in price_elements:
                text = await element.inner_text()
                # Remove currency symbols/commas, convert to float
                clean_price = float(text.replace('$', '').replace(',', ''))
                prices.append([clean_price])
                
            # Save to CSV
            with open('prices.csv', 'w', newline='', encoding='utf-8') as f:
                writer = csv.writer(f)
                writer.writerow(['Price'])
                writer.writerows(prices)
                
            print(f"✅ Successfully scraped {len(prices)} stock records!")
            
        except Exception as e:
            print(f"🚨 Scraping error: {e}")
        finally:
            await browser.close()

# Run async function
asyncio.run(scrape_dynamic_stocks())

👁️ AI Vision Targeting: The Ultimate Evolution of Scraping

Historically, scraper developers' biggest pain was website redesigns.
If a site engineer changed class="stock-price" to class="price-text-v2", your scraper would instantly crash.

But with Vibe Coding and AI integration, this pain point is disappearing.
If you connect OpenAI's gpt-4o (with vision capabilities) to Playwright, the workflow becomes:

  1. Playwright opens the webpage.
  2. Playwright takes a full-page screenshot.
  3. Your Python script sends the screenshot to OpenAI with the query:
    "What is TSMC's stock price in this image?"
  4. OpenAI analyzes the image and returns 1050.

This is "selector-less scraping."
You completely ignore HTML structure or class name changes. If a human can see the number, AI can extract it. This is cutting-edge black magic in data science—and your ultimate weapon for future freelance projects!

Unlock Full Tutorial

This chapter is paid content. Join the project to unlock over 5000 words of deep analysis, including 10+ god-tier Prompts and real Source Code examples!