Tuesday, July 24, 2007

Site Scraping for DreamWeaver

Now there's a DreamWeaver plug-in that makes scraping easy for dummies.

If you have no web skills, just use Site Import and rip off an entire site at once.

Why learn how to design a site, create your own content, or any of that nonsense when you can just download someone else's site instead?

This is cute:

No limit retrieval

With Site Import 2.0 you can import as many pages from a site as you'd like – no more limits!
That's a nice theory until a bot blocker shuts your import down in mid-scrape.

And my personal fave:

Learn from the pros

Learning by example is a time-honored tradition on the Web
I'm not sure that stealing is a time-honored tradition even if imitation is the sincerest form of flattery.

And last but not least:

Dynamic and database-driven sites, too!

Site Import works its magic with all kinds of Web sites – including those developed with ASP, ColdFusion, PHP or even .NET.
Grab your ankles and bend over while it extracts hundreds of thousands of pages from your database-driven site and pushes you over your monthly bandwidth allotment.

I don't know what user agent they use for this process, but I'm pretty sure my sites (not this blog) are safe from this shit, except for the first few pages scraped while my code determines it's not a human at the controls.


Libertate said...

Heh. Although they won't be blocked on my site, they will definitely get trapped. My spoiler hopefully poisons the downloads...

GaryK said...

Bill, you asked what the UA is for this product. I have two UAs in my database with a pattern of Dreamweaver-*. The most recent one is from mid-2006.

BTW, it's nice to be able to post in your comments again. The most recent release of Firefox seems to have cured whatever the problem was.
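
For anyone who wants to act on a pattern like the Dreamweaver-* one GaryK mentions, here is a minimal Python sketch of wildcard matching against a user agent string. The function name, the blocklist variable, and the sample UA strings are all made up for illustration; only the Dreamweaver-* pattern comes from the comment above, and nothing here is confirmed to be what Site Import actually sends.

```python
from fnmatch import fnmatchcase

# Hypothetical blocklist. "Dreamweaver-*" is the pattern reported in the
# comment above; add other shell-style wildcard patterns as needed.
BLOCKED_UA_PATTERNS = ["Dreamweaver-*"]


def is_blocked_user_agent(user_agent: str) -> bool:
    """Return True if the user agent matches any blocked wildcard pattern.

    Both sides are lowercased so the comparison is case-insensitive.
    """
    ua = user_agent.strip().lower()
    return any(fnmatchcase(ua, pattern.lower()) for pattern in BLOCKED_UA_PATTERNS)


if __name__ == "__main__":
    # Sample strings are invented for the demo, not captured from real traffic.
    for sample in ["Dreamweaver-Example/1.0", "Mozilla/5.0 (Windows NT 5.1)"]:
        print(sample, "->", "block" if is_blocked_user_agent(sample) else "allow")
```

A user agent is trivial to fake, so a check like this only catches tools that announce themselves honestly.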

Anonymous said...

$99!! Or you could use HTTrack for free :)

IncrediBILL said...

I tested HTTrack and my code locked it out in about 5 hits to the site.

I was quite pleased.
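
To be clear, what follows is not Bill's actual code, which isn't shown anywhere in this post. It's just a rough Python sketch of one common way to lock out a downloader after a handful of rapid hits: count requests per client in a sliding time window and block once the count goes over a threshold. The class name, the 5-hit limit, and the 2-second window are all assumptions picked for illustration.

```python
import time
from collections import defaultdict, deque

MAX_HITS = 5          # assumed threshold, echoing the "about 5 hits" above
WINDOW_SECONDS = 2.0  # assumed window size; not taken from the post


class HitLimiter:
    """Lock out a client that exceeds max_hits requests within the window."""

    def __init__(self, max_hits=MAX_HITS, window=WINDOW_SECONDS):
        self.max_hits = max_hits
        self.window = window
        self.hits = defaultdict(deque)  # client -> timestamps of recent hits
        self.locked = set()             # clients already locked out

    def allow(self, client_ip, now=None):
        """Record a hit and return True if the request should be served."""
        now = time.monotonic() if now is None else now
        if client_ip in self.locked:
            return False
        recent = self.hits[client_ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_hits:
            self.locked.add(client_ip)
            return False
        return True
```

With these defaults, a client firing requests every 100 ms is served five pages and locked out on the sixth, while a visitor clicking every second or two never trips the limit. A real blocker would also need to expire old lockouts and whitelist the search engine crawlers you actually want.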

IncrediBILL said...

Welcome back Gary!

Glad you can post, we missed ya!

GaryK said...

Thanks Bill. I missed being able to post here for so long.