by Mike Shea on 27 January 2013
News, blogs, and social networks demand more and more of our time as we sift an ever-growing torrent of noise for the stuff we really care about. Tweet Threshold is a Python script that helps you filter noise from Twitter so you can read only the items worthy of your attention and get back to doing things instead of processing and consuming. This script and its related files are all hosted on GitHub and released under a Creative Commons license.
"The most essential service of the next decade will be the one that keeps you the best informed in the least amount of time. There's more to life than staring at screens all day."
- Mike Davidson, VP of design, Twitter, and founder, Newsvine
In the popular self-help book The 7 Habits of Highly Effective People, Stephen Covey describes how we might move our time, effort, and attention to activities of long-term importance instead of the urgent distractions we face throughout much of our day. Urgency isn't importance, regardless of how most of us act. Covey puts this together in a four-section grid:
Ever since 9/11, mainstream news has moved from importance to urgency. We've grown used to the idea that we have to know what's going on right now. This attitude has filtered down and propagated to every news site and blog on every topic in which we might be interested. Regardless of the topic, everyone seems out to be the first to break a story, even if the results of that story have little to no direct impact on us.
We've become news junkies.
Now it's filtered down to Twitter, a social network built on urgency. We can learn about anything the minute it happens from the very people it happened to. Do we really need to know it? Must we spend four hours a day watching Twitter? Must we have a constantly open chat log with the whole planet invited in? Is this really important or is it simply an urgent distraction?
I say we have better things to do and better places to put our attention. Tweet Threshold is the coded manifestation of the philosophy that urgency isn't importance.
Thousands to millions of people might monitor the very same Twitter streams you do. When they like something, they retweet it to share it with those who follow them. Their own brain power processes incoming information and determines what is important.
Instead of working in parallel with all of that brain power, why not work in series? Why not let these other brains work for you? Imagine if you could run your own MapReduce cluster across human brains instead of the CPUs of commodity computer systems.
One of the HTML pages included in this script gives us a different way of looking at the news. Rather than showing you all of the news up to the hour, it only shows you news from yesterday, ranked by score, and then all of the news of the previous six days before yesterday, also ranked by score. It won't show you anything from today. You'll see today's news tomorrow. Check out my own personal news page as an example.
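The date logic behind a page like that can be sketched in a few lines of Python. This is only an illustration, not the code from the script: the function names news_windows and bucket and the older_days parameter are made up for this example, and it assumes each tweet has already been reduced to the calendar day it was posted.

```python
from datetime import date, timedelta


def news_windows(today, older_days=6):
    """Return (yesterday, older_start, older_end) as calendar dates.

    older_days is a hypothetical parameter: how many days before
    yesterday the second group of news reaches back.
    """
    yesterday = today - timedelta(days=1)
    older_end = yesterday - timedelta(days=1)
    older_start = yesterday - timedelta(days=older_days)
    return yesterday, older_start, older_end


def bucket(tweet_day, today, older_days=6):
    """Say which group a tweet belongs to, or None if it is not shown."""
    yesterday, older_start, older_end = news_windows(today, older_days)
    if tweet_day == yesterday:
        return "yesterday"
    if older_start <= tweet_day <= older_end:
        return "previous days"
    # Today's tweets (and anything too old) stay off the page.
    return None
```

With today set to 27 January 2013, a tweet from the 26th lands in the first group, tweets from the 20th through the 25th land in the second, and anything from the 27th is held back until tomorrow.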
This may seem counterintuitive. Since it only shows you news from yesterday, you're only going to check it once a day. There's no reason to check it more than that. Now you can catch up on all the news you care about, pre-filtered by other people, once a day in just a couple of minutes. This trades the urgent distraction of up-to-the-minute Twitter surfing for pre-filtered news of importance, read only once a day. Let's look at this on the Covey chart:
This script reaches out to Twitter and pulls down the latest 100 tweets in JSON from one or more authenticated timelines. It saves any tweets containing a URL to a local SQLite3 database. From this database, the script outputs an HTML page filtered by an algorithm based on retweets / followers with a minimum threshold so you just get the ones a lot of people thought were retweet-worthy.
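To make the storage step concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the script's actual code: the table layout, the function name save_tweets_with_urls, and the plain-dictionary shape of each tweet are assumptions for illustration, and the real script gets its tweets through Tweepy rather than from hand-built dictionaries.

```python
import sqlite3


def save_tweets_with_urls(conn, tweets):
    """Store only the tweets that carry at least one URL.

    Each tweet is assumed to be a dict with the keys used below
    (a hypothetical, simplified shape). Returns the number saved.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tweets (
               id INTEGER PRIMARY KEY,
               author TEXT,
               text TEXT,
               url TEXT,
               retweets INTEGER,
               followers INTEGER,
               created TEXT
           )"""
    )
    saved = 0
    for t in tweets:
        if not t.get("urls"):
            continue  # no link, nothing to read later
        # OR REPLACE lets an hourly re-fetch refresh the retweet count.
        conn.execute(
            "INSERT OR REPLACE INTO tweets "
            "(id, author, text, url, retweets, followers, created) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (t["id"], t["author"], t["text"], t["urls"][0],
             t["retweets"], t["followers"], t["created"]),
        )
        saved += 1
    conn.commit()
    return saved
```

Keying the table on the tweet id means a tweet fetched again an hour later overwrites its earlier row instead of piling up duplicates, which suits a job that runs repeatedly.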
The score is determined by running an algorithm on retweets / followers for Twitter and normalizing the results. You can change the threshold of the visible score in the parameters, edit the black list to eliminate any tweets containing certain words, and edit the white list to show tweets from users you always want to see regardless of score.
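Here is one way a score like that could be computed. Treat it as a sketch under stated assumptions rather than the script's real algorithm: the text above only says the score comes from retweets / followers and is normalized, so dividing by the highest ratio in the batch is just one plausible normalization, assumed here. The function names, the dictionary keys, and the choice to let the black list win over the white list are also hypothetical.

```python
def score_tweets(tweets):
    """Return one score per tweet, scaled so the best tweet scores 1.0.

    Assumed normalization: retweets / followers divided by the
    highest such ratio in the batch.
    """
    raws = [t["retweets"] / max(t["followers"], 1) for t in tweets]
    top = max(raws, default=0) or 1  # avoid dividing by zero
    return [r / top for r in raws]


def visible(tweets, threshold, whitelist=(), blacklist=()):
    """Return (score, tweet) pairs worth showing, best first."""
    shown = []
    for t, s in zip(tweets, score_tweets(tweets)):
        text = t["text"].lower()
        # Assumption: a black-listed word hides the tweet no matter what.
        if any(word.lower() in text for word in blacklist):
            continue
        # White-listed authors are shown regardless of score.
        if t["author"] in whitelist or s >= threshold:
            shown.append((s, t))
    shown.sort(key=lambda pair: pair[0], reverse=True)
    return shown
```

Dividing by followers is what keeps a huge account from drowning out everyone else: 50 retweets from an account with 1,000 followers outranks 50 retweets from an account with a million.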
This program is intended to run every hour as a scheduled event or cron job.
You'll want some experience working with Python to run this script. This script uses Tweepy to handle the interactions with Twitter and Jinja2 to handle the HTML templating.
To run this script, you'll need to register it as an application with Twitter. See https://dev.twitter.com/apps for more information.
Change the parameters in the "fetch_tweets.py" script to your own Twitter authentication codes and the local directories where you want the results. Modify the html_template.txt template to suit your needs. Set up a scheduled event like a cron job to run "fetch_tweets.py" once every hour.
The index.html file only shows tweets from yesterday and the previous seven days. This is on purpose. Who really needs up-to-the-minute news these days? Relax and spend some time in a park for God's sake.
The whole intent of this script is to save you time and help you redirect your attention to doing things instead of just surfing news all the time. As the stream of information grows ever wider, it's important that we have some way to sift through, find the stuff we care about, and ignore the rest. This script helps you use the experience of other people, through retweets and upvotes, to bring the right things to your attention.
This script is released under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license so you can distribute it, modify it, and share it as long as you release it under a similar license and attribute the original program to me.
We're going to need better agents like this if we want to take back our attention from the hordes of content producers demanding their space in our brains. I'm hoping to continue working on this problem or finding better solutions. We can't trust the big companies like Twitter or Google to do this for us. It isn't in their interests to help us avoid using their sites. We can only trust our own tools and ourselves to do it for us.
If you're interested in this topic, know of some other tools like this, or have used the script and found it useful, please send an email to email@example.com to let me know.
Send comments to firstname.lastname@example.org or follow @mshea on Twitter.