I’ve been trying to find an easy way to pull my TwitPic images back off TwitPic so they can be re-syndicated on a communal family website I’m building. I found a great site with some Rails code that scraped the image, so I took the code and modified it to work in pure Ruby (to be honest, it didn’t need much modifying at all).
Here’s my modified snippet of code that I’ve been using to grab the image from TwitPic with the Hpricot gem:
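require 'rubygems'
require 'net/http'
require 'hpricot'

# Returns the image URL embedded in a TwitPic page. Returns nil if the URL
# has no trailing code, or the original URL if fetching or parsing fails.
def rip_twitpic(url)
  begin
    # Only proceed if the URL ends in a TwitPic short code.
    code = url.match(/\w+$/).to_s
    unless code.empty?
      uri = URI.parse(url)
      resp = Net::HTTP.get_response(uri)
      html = Hpricot(resp.body)
      # The photo is the element with the id "photo-display".
      html.at("#photo-display")['src']
    end
  rescue StandardError => e
    puts "Error extracting twitpic: #{e}"
    url
  end
end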
Just as it appears, this method will return the URL of the image embedded on the page that the TwitPic URL points to. If something goes wrong along the way, it prints the error and hands back the original URL instead.
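The first thing the method does is check that the URL ends in a short code before it bothers fetching anything. Here’s a tiny sketch of just that check, pulled out on its own. The helper name twitpic_code and the example URLs are made up for illustration, and I’m assuming the pattern is meant to be \w+$ (one or more word characters at the end of the string):

```ruby
# Hypothetical helper, not part of the snippet above: pull the trailing
# short code off a TwitPic-style URL.
def twitpic_code(url)
  # \w+$ matches the run of letters, digits and underscores at the very end.
  # match returns nil when nothing matches, and nil.to_s is "".
  url.match(/\w+$/).to_s
end

# Made-up example URLs, just to show both outcomes.
puts twitpic_code("http://twitpic.com/abc123")   # prints "abc123"
puts twitpic_code("http://twitpic.com/").empty?  # prints "true"
```

A URL with nothing after the last slash gives an empty string, which is why rip_twitpic skips the fetch in that case.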
This will form the core part of a little project which will allow you to scrape TwitPic images and send them to a server of your choosing. I’ll try to release this ASAP but in the meantime, I thought a few of you might find this useful. I know I do.