YQL and Scraperwiki sitting in a tree…
Scraperwiki is brilliant. YQL is brilliant. Now, they can get together and make lots of datababies.
Using the simple webservice I’ve written, it’s a bit easier to use scraperwiki data in YQL queries and to mash up scraperwiki data with other YQL sources.
YQL
YQL presents a uniform query interface, modelled on SQL, for various web APIs. You can run queries like select * from flickr.photos.recent to get a list of recent Flickr photos, in either JSON or XML.
Data can be mashed together from multiple tables/urls, like so:
select * from search.web where query in
(select title from rss
where url="http://rss.news.yahoo.com/rss/topstories"
| truncate(count=1))
limit 1
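To run a query like this from a script, it usually gets sent to YQL as a URL-encoded parameter. Here is a minimal Python sketch that only builds the request URL. The endpoint address and the q and format parameter names are my assumptions about YQL's public REST interface, not something spelled out in this post, and yql_url is just an illustrative helper name.

```python
from urllib.parse import urlencode

# Assumption: the public YQL REST endpoint. It is not named in this post.
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def yql_url(query, fmt="json"):
    """Build the GET URL for a YQL query (hypothetical helper).

    `fmt` selects the response format and must be "json" or "xml".
    """
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    # urlencode takes care of escaping spaces, quotes, pipes and so on.
    return YQL_ENDPOINT + "?" + urlencode({"q": query, "format": fmt})


# The mash-up query from above, joined into a single string.
query = (
    "select * from search.web where query in "
    "(select title from rss "
    'where url="http://rss.news.yahoo.com/rss/topstories" '
    "| truncate(count=1)) "
    "limit 1"
)

print(yql_url(query))
```

Fetching that URL with any HTTP client should then return the results in the requested format, assuming the endpoint is reachable.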
Non-Yahoo! services can be queried through YQL too: someone just has to publish a chunk of XML and JavaScript that acts as an adaptor.
There are already adaptors published for many web APIs. YQL calls these things ‘datatables’. And you can use them like this:
USE "http://myserver.com/mytables.xml" AS mytable;
SELECT * FROM mytable WHERE...
With me so far?
Scraperwiki
Scraperwiki is a new, still-in-beta service for building web-scrapers and sharing the data they scrape. More scrapers are appearing daily and the site provides a useful API for querying the data created.
Using scraperwiki in YQL
I’ve created a service that automatically generates YQL datatable definitions from scraperwiki scrapers. You can find the definition for any scraperwiki scraper at a url that looks like this: http://swikiyql.heroku.com/SCRAPERWIKI_SHORT_NAME.xml
You’ll need a scraperwiki API key, which you can sign up for on their site. Once you have one, you should be able to run YQL queries like this:
use 'http://swikiyql.heroku.com/wikipedia-2010-uk-election-candidates.xml'
AS candidates;
SELECT * FROM candidates
WHERE party='UKIP'
AND sw_api_key='YOUR_SW_API_KEY';
This queries the data from the wikipedia-2010-uk-election-candidates scraper. In the same way, you should be able to mash together different scraperwiki scrapes, or mash scrapes with any other YQL tables.
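The definition URL pattern and the query above are regular enough to assemble in code. Here is a rough Python sketch: the base URL and the sw_api_key column come from this post, while the function names (datatable_url, build_query) are hypothetical. Because I'm not certain how YQL escapes quotes inside string literals, the sketch simply refuses values that contain a single quote.

```python
# Base URL of the swikiyql service, as given in the post.
SWIKIYQL_BASE = "http://swikiyql.heroku.com"


def datatable_url(short_name):
    """Return the datatable definition URL for a scraperwiki short name."""
    return f"{SWIKIYQL_BASE}/{short_name}.xml"


def build_query(short_name, alias, filters, api_key):
    """Build a USE + SELECT statement for one scraper (hypothetical helper).

    `filters` maps column names to the string values they must equal.
    The API key is added as one more condition named sw_api_key.
    """
    conditions = dict(filters, sw_api_key=api_key)
    for value in conditions.values():
        # Assumption: rejecting quotes is safer than guessing YQL's escaping.
        if "'" in value:
            raise ValueError("values must not contain single quotes")
    where = " AND ".join(f"{col}='{val}'" for col, val in conditions.items())
    return (
        f"USE '{datatable_url(short_name)}' AS {alias}; "
        f"SELECT * FROM {alias} WHERE {where};"
    )


print(build_query(
    "wikipedia-2010-uk-election-candidates",
    "candidates",
    {"party": "UKIP"},
    "YOUR_SW_API_KEY",
))
```

The printed statement should match the candidates query shown earlier, apart from line breaks and keyword case.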
That’s it, really. The source code for swikiyql (pronounced swikiyql) is on github, and there are probably lots of things wrong with it, but it works for me.
Now, go, mash things up!
I'm Ben Griffiths:
(Scraperwiki co-creator here)
This is great!
Do you mind if (in a couple of months) we install this on scraperwiki itself?
Let’s talk on email
Do you have an example of how to use it?