Social Network Aggregate API Factory Design Pattern in Ruby

I worked on a project that allowed users to authenticate with several well-known social media platforms using OAuth. Once users had linked all their social media presences, we wanted to import each user's posts from each platform. This is a (redacted) sample of how I accomplished it:

First, I have a neat little module that encapsulates a list of handler objects and a notification method that triggers the correct handler based on how each handler has subscribed itself in the factory. This sounds a little confusing, but basically each provider class initialises itself with the factory object, subscribing itself to the particular type of social network it is able to process (handle). I abstracted this code into a module because I thought it might be very handy for other projects which follow a similar pattern:

module EventDispatcher
  def setup_listeners
    @event_dispatcher_listeners = {}
  end

  def subscribe(event, &callback)
    (@event_dispatcher_listeners[event] ||= []) << callback
  end

  def notify(event, *args)
    if @event_dispatcher_listeners[event]
      @event_dispatcher_listeners[event].each do |m|
        m.call(*args) if m.respond_to? :call
      end
    end
  end
end
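To make the subscribe/notify flow concrete, here is a minimal, self-contained usage sketch (the `Dispatcher` class and the event names are illustrative, and the module is repeated so the snippet runs on its own):

```ruby
module EventDispatcher
  def setup_listeners
    @event_dispatcher_listeners = {}
  end

  def subscribe(event, &callback)
    (@event_dispatcher_listeners[event] ||= []) << callback
  end

  def notify(event, *args)
    if @event_dispatcher_listeners[event]
      @event_dispatcher_listeners[event].each do |m|
        m.call(*args) if m.respond_to? :call
      end
    end
  end
end

# Illustrative host class: anything that includes the module can dispatch.
class Dispatcher
  include EventDispatcher
end

dispatcher = Dispatcher.new
dispatcher.setup_listeners

received = []
dispatcher.subscribe(:weibo) { |auth| received << auth }

dispatcher.notify(:weibo, 'user-123')   # handled by the :weibo subscriber
dispatcher.notify(:twitter, 'user-456') # no subscriber registered, silently ignored

received # => ["user-123"]
```

Events with no subscribers are simply dropped, which is exactly the behaviour the factory relies on later to ignore authorization records with no registered handler.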

Next, we need to develop the factory object that encapsulates our handler objects. It holds all the configuration attributes of our social network platforms (API keys, secrets, etc.). It is constructed with a hash, and a static method .load reads a specified file (by default a file in /config/social_network_configuration.json) and returns an instance of the factory with the contents of the configuration file passed into the constructor:

require File.expand_path('../../event_dispatcher', __FILE__)

module SocialNetworking
  class SocialNetworkFactory
    include EventDispatcher
    attr_reader :configs

    def initialize(data)
      setup_listeners
      @configs = {}
      data.each { |n, o| @configs[n.downcase.to_param.to_sym] = o }
    end

    def process(network, user)
      notify(network, user)
    end

    # Reads client configuration from a file and returns an instance of the factory
    # @param [String] filename
    #   Path to file to load
    # @return [SocialNetworking::SocialNetworkFactory]
    #   Social network factory with API configuration
    def self.load(filename = nil)
      if filename && File.directory?(filename)
        search_path = File.expand_path(filename)
        filename = nil
      end

      if filename.nil?
        search_path ||= File.expand_path("#{Rails.root}/config")
        loop do
          if File.exist?(File.join(search_path, 'social_network_configuration.json'))
            filename = File.join(search_path, 'social_network_configuration.json')
            break
          elsif search_path == '/' || search_path =~ /[a-zA-Z]:[\/\\]/
            raise ArgumentError,
                  'No ../config/social_network_configuration.json filename supplied ' +
                  'and/or could not be found in search path.'
          end
          search_path = File.expand_path(File.join(search_path, '..'))
        end
      end

      data = File.open(filename, 'r') { |file| MultiJson.load(file.read) }
      new(data)
    end
  end
end

The configuration file (/config/social_network_configuration.json) looks something like:

{
  "Facebook": {
    "oauth_access_token": "...",
    "expires": ""
  },
  "SoundCloud": {
    "client_id": "..."
  },
  "Twitter": {
    "access_token": "...",
    "access_token_secret": "...",
    "consumer_key": "...",
    "consumer_secret": "..."
  },
  "YouKu": {
    "client_id": "..."
  },
  "YouTube": {
    "dev_key": "..."
  },
  "Weibo": {
    "app_id": "..."
  }
}
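To show how the factory keys this file, here is a small sketch that parses a fragment of the configuration the same way the constructor does (using the stdlib JSON module in place of MultiJson, and a plain `downcase.to_sym` in place of the Rails-flavoured `to_param` call):

```ruby
require 'json'

# A fragment of the configuration above, parsed and keyed like the factory does.
raw = <<~JSON
  {
    "Twitter": { "consumer_key": "..." },
    "Weibo":   { "app_id": "12345" }
  }
JSON

configs = {}
JSON.parse(raw).each { |n, o| configs[n.downcase.to_sym] = o }

configs[:weibo]['app_id'] # => "12345"
```

The provider classes can then look up their own section via a symbol key, as in `factory.configs[:weibo]`.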

The last part is to create a different handler object for each social network (as each social network has its own specific API for interfacing with the platform). It's pretty basic:

module SocialNetworking
  module Providers
    class NetworkNameProvider

      def initialize(factory)
        # you can access the configurations through the factory
        @app_id = factory.configs[:network_name]['app_id']

        # instruct the factory that this provider handles the
        # 'network_name' social network oauth. The factory will
        # publish the user's authorization object to this handler.
        factory.subscribe(:network_name) do |auth|
          # Do stuff ...
        end
      end

    end
  end
end

So an example of a Weibo Provider class might look something like this:

require File.expand_path('../../../../lib/net_utilities', __FILE__)
require 'base62'
require 'httpi'

module SocialNetworking
  module Providers
    class WeiboProvider
      include NetUtilities

      def initialize(factory)
        @token = get_token
        @app_id = factory.configs[:weibo]['app_id']

        factory.subscribe(:weibo) do |auth|
          Rails.logger.info " Checking Weibo user '#{auth.api_id}'"
          begin
            @token = auth.token unless auth.token.nil? # || auth.token_expires < Time.now
            # Endpoint reconstructed from the query parameters; verify against the Weibo API docs.
            request = HTTPI::Request.new('https://api.weibo.com/2/statuses/user_timeline.json')
            request.query = {source: @app_id, access_token: @token, screen_name: auth.api_id}
            response = HTTPI.get request
            if response.code == 200
              result = MultiJson.load(response.body)
              weibos = result['statuses']

              weibos.each { |post|
                # ... do something with the post
              }

              auth.checked_at = Time.now
              auth.save!
            end
          rescue Exception => e
            Rails.logger.warn " Exception caught: #{e.message}"
            @token = get_token
          end
        end
      end

      private

      def get_token
        auth = Authorization.where(provider: 'weibo').where('token_expires < ?', Time.now).first
        auth = Authorization.where(provider: 'weibo').order(:token_expires).reverse_order.first if auth.nil?
        raise 'Cannot locate viable Weibo authorization token' if auth.nil?
        auth.token
      end

      # Weibo short URLs encode the post id in three base62 chunks.
      def short_id(id)
        id.to_s[0..-15].to_i.base62_encode.swapcase +
          id.to_s[-14..-8].to_i.base62_encode.swapcase +
          id.to_s[-7..-1].to_i.base62_encode.swapcase
      end
    end
  end
end


Of course there are a lot of opportunities to refactor and make the providers better. For example, a serious argument could be made that the API handshake should be abstracted into a separate class consumed by the provider, rather than the provider doing all the API lifting itself (which violates the single-responsibility principle) – but I have kept it inline to give a better idea of how this factory works without getting too abstract.
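As a rough illustration of that refactor, the handshake could live in its own client class while the provider only orchestrates. All names and methods below are illustrative, and the HTTP call is stubbed out so the sketch is self-contained:

```ruby
# Hypothetical API client: owns credentials and the wire protocol.
class WeiboApiClient
  def initialize(app_id, token)
    @app_id = app_id
    @token  = token
  end

  # In the real code this method would issue the HTTPI request to the Weibo API;
  # here it is stubbed to keep the sketch runnable.
  def user_timeline(screen_name)
    [{ 'text' => "latest weibo for #{screen_name}" }]
  end
end

# The provider now only coordinates: fetch posts, transform, hand onward.
class SlimWeiboProvider
  def initialize(client)
    @client = client
  end

  def import(screen_name)
    @client.user_timeline(screen_name).map { |post| post['text'] }
  end
end

provider = SlimWeiboProvider.new(WeiboApiClient.new('app-id', 'token'))
provider.import('user-1') # => ["latest weibo for user-1"]
```

Splitting things this way also makes the provider trivially testable, since a stub client can be injected instead of a live API.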

The last piece of the puzzle is putting it all together. There are a lot of different ways you could consume this factory, but in this example I am going to do it as a rake task that can be scheduled regularly via cron.

Dir["#{File.dirname(__FILE__)}/../social_networking/**/*.rb"].each { |f| load(f) }

namespace :social_media do
  desc 'Perform a complete import of social media posts of all users'
  task import: :environment do
    factory = SocialNetworking::SocialNetworkFactory.load

    # Instantiate each of your providers here with the factory object.
    SocialNetworking::Providers::WeiboProvider.new factory

    # Execute the OAuth authorizations in a random order.
    Authorization.where(muted: false).shuffle.each do |auth|
      factory.process(auth.provider.to_sym, auth)
    end
  end
end
I wouldn't do this in production though, as you may encounter problems if the task gets triggered while the previous iteration is still running. Additionally, I would recommend leveraging ActiveJob to run each handler, which would give massive benefits in execution concurrency and in feedback on job successes and failures.
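One simple way to guard against overlapping cron runs is a non-blocking exclusive lock on a lock file. This is only a sketch; the lock-file path and method name are illustrative:

```ruby
require 'tmpdir'

# Illustrative lock-file path; a real deployment would pick a stable location.
LOCK_PATH = File.join(Dir.tmpdir, 'social_media_import.lock')

def with_exclusive_lock(path = LOCK_PATH)
  File.open(path, File::RDWR | File::CREAT) do |lock|
    # LOCK_NB makes flock return false instead of blocking when the lock is held,
    # so a second invocation bails out immediately.
    return :already_running unless lock.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    :completed
  end
end

status = with_exclusive_lock { puts 'importing social media posts...' }
status # => :completed
```

The rake task body would simply be wrapped in `with_exclusive_lock { ... }`, so a run that starts while the previous one is still importing exits without doing any work.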

Also, you could get really clever and loop over each file in the /providers directory, including and instantiating them all at once, but I have chosen to declare them explicitly in this example.
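A sketch of that auto-registration idea follows. The module and class names are stand-ins, and the real code would first `require` each file under /providers before walking the namespace:

```ruby
# Stand-in provider namespace; in the real code these classes would come from
# requiring every file in the /providers directory.
module Providers
  class AlphaProvider
    def initialize(factory); factory << :alpha; end
  end

  class BetaProvider
    def initialize(factory); factory << :beta; end
  end
end

factory = [] # stand-in for the real SocialNetworkFactory instance

# Walk the namespace and instantiate every class that looks like a provider.
Providers.constants
         .map { |name| Providers.const_get(name) }
         .select { |const| const.is_a?(Class) && const.name.end_with?('Provider') }
         .each { |provider_class| provider_class.new(factory) }

factory.sort # => [:alpha, :beta]
```

The trade-off is discoverability: explicit declarations make it obvious (and greppable) which providers are live, which matters when you want to comment one out quickly.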

As you can see, this is a nice little pattern which uses some pseudo event subscription and processing to let you import from multiple APIs while maintaining separation of responsibilities. As we loop over each authorization record, the pattern automatically hands the record to the correct handler. You can also chop and change which providers are registered: any authorization record that doesn't have a registered handler for its type will simply be ignored. This means that if the Weibo API changes and we need to fix our handler, it is trivial to remove that handler from production by commenting it out, and all our other handlers will continue to function like nothing ever happened.

This code was written many years ago and should work on Ruby versions even as old as 1.8. There are probably many opportunities to refactor and enhance it substantially using a more recent Ruby SDK. Possible enhancements include allowing the providers to subscribe to the factory using a &block instead of a symbol, and allowing the factory to pass a block into the #process method, giving access for additional processing to be executed in the context of the provider but abstracted from it.
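The second enhancement might look something like this. `BlockFactory` is a simplified stand-in for the real factory, not the original implementation:

```ruby
# Simplified factory whose #process forwards an optional block to each handler.
class BlockFactory
  def initialize
    @listeners = {}
  end

  def subscribe(event, &callback)
    (@listeners[event] ||= []) << callback
  end

  # The caller's block (if any) is handed to every subscriber, so extra
  # processing can run in the provider's context but stay abstracted from it.
  def process(event, *args, &extra)
    (@listeners[event] || []).each { |callback| callback.call(*args, extra) }
  end
end

factory = BlockFactory.new
factory.subscribe(:weibo) do |auth, extra|
  imported = "imported posts for #{auth}"
  extra.call(imported) if extra
end

captured = nil
factory.process(:weibo, 'user-1') { |result| captured = result }
captured # => "imported posts for user-1"
```

This keeps the provider in charge of its own API work while letting the caller bolt on per-run behaviour (metrics, logging, post-processing) without subclassing.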

Nevertheless, I hope this pattern proves useful to anyone who needs a handler automatically selected to process some work without complicated selection logic.

How to Export an Entire Newsfeed From Google Reader

Google Reader is an amazing web-based feed reading application.  One of its most outstanding features is the ability to archive feeds and show you posts from a feed regardless of when you subscribed.  You see, RSS and ATOM feeds normally contain only the 10-30 most recently posted items, but since Google Reader stores all the posts from subscribed feeds, they're (usually) all available if you keep scrolling down in the interface.

This makes it an awesome feed caching and archival tool.  However, not many people are aware that you can actually extract the data back out, as one mega, standard ATOM file.

Just enter this URL in the address bar:

…where FEED_URL is the address of the feed and NUMBER_OF_ITEMS the number of posts to extract.

For example, requesting the Official Google Blog's feed with NUMBER_OF_ITEMS set to 100 should return its latest 100 posts as an ATOM/XML file.

10 Absolute *must have* WordPress Plugins


Akismet

Akismet checks your comments against the Akismet web service to see if they look like spam or not, and lets you review the spam it catches under your blog's "Comments" admin screen.


Google Sitemap Generator

This plugin will generate a special XML sitemap which will help search engines like Google, Bing and Yahoo to better index your blog. With such a sitemap, it's much easier for the crawlers to see the complete structure of your site and retrieve it more efficiently. The plugin supports all kinds of WordPress-generated pages as well as custom URLs. Additionally, it notifies all major search engines about the new content every time you create a post.



Echo

Echo is the next-generation commenting system. It's the way to share your content and watch the live reaction. You can quickly embed Echo on WordPress, Blogger, or any website and turn your static pages into a real-time stream of diggs, tweets, comments and more.  It's not free, but it *is* cheap and worth every dollar.


WordPress Super Cache

This plugin generates static HTML files from your dynamic WordPress blog. Once an HTML file has been generated, your web server will serve that file instead of processing the comparatively heavier and more expensive WordPress PHP scripts.


Easy AdSense

Easy AdSense provides a very easy way to generate revenue from your blog using Google AdSense. With its full set of features, Easy AdSense is perhaps the first plugin to give you a complete solution for everything AdSense-related.


Google Analytics for WordPress

The Google Analytics for WordPress plugin automatically tracks and segments all outbound links from within posts, comment author links, links within comments, blogroll links and downloads. It also allows you to track AdSense clicks, add extra search engines, track image search queries and it will even work together with Urchin.


Add to Any: Share/Bookmark/Email Button

Help readers share, save, bookmark, and email your posts and pages using any service, such as Facebook, Twitter, Digg, Delicious, and over 100 more social bookmarking and sharing sites. The button comes with AddToAny’s customizable Smart Menu, which places the services visitors use at the top of the menu, based on each visitor’s browsing history.


Twitter Tools

Twitter Tools is a plugin that creates a complete integration between your WordPress blog and your Twitter account.


Broken Link Checker

This plugin will monitor your blog looking for broken links and let you know if any are found.


FD Feedburner Plugin

The Feedburner Plugin redirects the main feed and optionally the comments feed to Feedburner. It does this seamlessly, without the need to modify templates, set up new hidden feeds, modify .htaccess files, or ask users to migrate to a new feed. All existing feeds simply become Feedburner feeds, seamlessly and transparently for all users. Just tell the plugin what your Feedburner feed URL is and you're done.

If you used Blogger, you no doubt integrated your feed with Feedburner (especially since Google bought and integrated it), as it adds some wonderful functionality to your feed: the 'Add to del.icio.us… Digg… Email… Technorati' links, plus automatic pinging of different news services and sites to tell them you have new content. It also splices in Flickr, del.icio.us or Furl content, adds Google AdSense to your posts, and much more. You will more than likely want to keep all of this, so the plugin lets you redirect your WordPress feed to your existing Feedburner feed, keeping the automated updates while subscribers are none the wiser that it is coming from a different blogging platform.

You just need to go into the Feedburner configuration under Appearance to set the feeds and redirects (it took me a while to find it; I probably should have read the installation instructions ;o).

Stefana Broadbent’s TEDTalk about Social Intimacy Through Social Media

Stefana Broadbent's TEDTalk about how social media is enhancing personal intimacy, and about personal spheres penetrating the workplace.

“research [that] shows how communication tech is capable of cultivating deeper relationships, bringing love across barriers like distance and workplace rules”

It’s Just Too Noisy In Here

Until last night, I was following over 1100 people on Twitter. I love Twitter, but lately I feel increasingly like every time I open Twirl, I am being screamed at by 100 people at once.  It was becoming well-nigh impossible to follow what the people I really care about (@MrsAngell, @ChrisSaad, @michaelmcneill, @DallasClark and @StephenKelly, to name a few) were talking about.

Basically, as much as I’ve loved talking to you all, I just can’t keep it up.  In a society which is approaching economies of scarcity, I have reached a saturation point where my Twitter value is dropping because I can’t hold meaningful, deep, 140 character conversations with people anymore.  My Social Graph has become so wide that it barely holds definition anymore.  Too many trivial relationships, talking too loudly.

So, starting last night, I began a mass excommunication of my followers.   I have already culled over 300 people (thanks to Twitter's recently improved interface tweaks).

[Screenshot: Twitter – People AshleyAngell is following]

I am trying to be selective.  If you have *something* in common with me professionally, personally, intellectually or geographically, then you're probably safe.  I will also try to ensure that I follow people I converse with (so any @ replies count).

So, I am sorry if you're one of the exiled. It's not personal – It's Just Too Noisy In Here!



Dissolution of Social Networks

Cross-posted from the Particls blog:

My lovely wife (who, coincidentally, is an Economics and Business teacher) sent me a podcast today which really blew me away. It's an interview with Andreas Kluth (San Francisco correspondent for The Economist) talking about real and virtual campfires, and predicting the dissolution of standalone social networks as we know them.

Anyone interested in the next generation of internet technology really needs to listen to this podcast. It's clear, concise and really gets at the heart of many social graph issues and human behaviour.

From Russia with Love

Cross-posted from the Particls blog.

With some hilarity, I present: Russian computer program fakes chat room flirting.

Internet chat room romantics beware: your next chat may be with a clinical computer trying to win your personal data and not your heart, an online security firm says.

I find this hilarious on a number of levels, but it illustrates so perfectly how valuable our attention data is (as users of the interwebs). It's so valuable that some smart people have written what can really only be described as a Trojan horse for attention data!

PC Tools senior malware analyst Sergei Shevchenko says the program has a "terrifyingly well-organised" interaction that could fool users into giving up personal details, and could easily be converted to work in other languages. "As a tool that can be used by hackers to conduct identity fraud, CyberLover demonstrates an unprecedented level of social engineering," he said in a statement. "It employs highly intelligent and customised dialogue to target users of social networking systems."

This is not some script kiddie, or some backyard JavaScript peddler. This is a serious, hardcore natural language processing prodigy who has the temerity and the wits to make a quick buck by collecting social and personal attention metrics. I can't condone his actions, as I find them highly immoral (and unethical), but I can definitely see why someone would do such a thing.

This also highlights the need for the general public to be more conscious and aware of their attention data: how to obtain it, how to control it and how to move it. It clearly demonstrates the value of the data we allow companies and products to collect about us with little or no hesitation. We allow these companies to collect whatever they like, without even letting us have a glimpse of what's inside their walled gardens.

It’s long past due that we all stood up and asked them to open the doors.

It’s time we all started demanding Data Portability.

Happy Birthday Wikipedia

Cross-posted from the Particls blog.

In case you've been living your life under a rather large rock, there is this site called Wikipedia, and it is one of my personal inspirations. You see, I consider myself somewhat of a sociologist, and I am fascinated by any group of people collaborating and focussing on a single goal. To me, Wikipedia is the ultimate embodiment of on-line collaboration and cooperation.

Touchstone just informed me that Wikipedia has turned six, which makes it more or less one of the few remaining "old school" start-ups still around from the *ahem*, the 'you know what'!

I wanted to take the opportunity to extend a warm “Happy Birthday” to the crew over at Wikipedia, and wish them the best of ongoing luck with what I believe, to be one of the best examples of world cooperation ever conceived. Thank-you for showing us a glimmer of what we can become if we work together.

Do We Owe YouTube Our Precious Bandwidth?

Cross-posted from the Particls blog.

I just read an interesting blog from Scott Karp at Publishing 2.0, about the recent YouTube polls, partially regarding potential cost to users should YouTube introduce ads at the start of each video.

Needless to say, it doesn't bode well for YouTube if they do this. I wouldn't be too happy about it either, and I'm a self-confessed media junkie. But is this just the cost of doing business? Can we not come up with more creative ways of monetizing video?

What about this: incremental random ads, where the volume of advertising is tied to the popularity of a piece of content. The less popular videos would have no ads, with ads then shown randomly with increasing frequency as the views/popularity increase.

This is on the assumption that people might be more inclined to accept ads for the more popular (and thus theoretically more interesting) videos.

Just my thoughts. I wonder if there is any other compromise?

Web 2.0 in an Occasionally Connected World

Cross-posted from the Particls blog.

Chris and I often converse with each other about a fantasy world where we would happily live. In this world, internet connections are high bandwidth, always on, and free for everyone to use, wherever you are. Sadly, “reality” just hasn’t caught up with us yet. Must be because it has a strong Liberal bias. 😉

Because we don't have this free-for-use subspace ultra-wideband connection, Web 2.0 only works so far. It's something that we feel sets Touchstone apart, but it also means we have a constant balancing act between the value we add into the cloud and the value we add onto the client.

It's something which, I am afraid to admit, has always perplexed me about the Web 2.0 world: the plethora of diverse and spectacular applications available is staggering and inspiring, so long as I am connected when I want to use them. It's one of the reasons I seem to be slightly more pessimistic about web-based apps than Chris.

We feel that it’s very important for Touchstone to bridge this gap by persisting information for Web 2.0 apps when the user is intermittently connected.

Filter the noise, whether you’re online or off!