A Guide to Backing Up Pinboard

I’m a huge fan of Pinboard, a web-based bookmarking service. I never understood web-based bookmarking when it was big or why everyone used it, but as my reading, writing, and speaking have increased, I’ve realized the value of having an everything bucket to toss anything interesting into.

Now much of my workflow depends on Pinboard. When I start a presentation, I go to Pinboard and open all the tags relevant to that topic. When I write on Behind Companies, I’m often reminded of things I previously read - Pinboard makes it easy to find those articles.

Even better, for $25 per year, Pinboard offers an archival account, which stores a copy of the pages you bookmark to combat link rot, where pages disappear over time.

Recently, I’ve also integrated other services into Pinboard using the fantastic ifttt. I save favorited YouTube videos, starred Google Reader items, and liked Instagram photos to Pinboard. It’s become my brain.

With so much depending on Pinboard, I knew I needed a backup. If Pinboard disappeared tomorrow, I’d need a way to access everything I’ve saved. Maciej Ceglowski, developer of Pinboard, gives you the option at any time to export all your items as HTML, XML, or JSON. Data portability is good. However, there’s no automated option to do this.

So I came up with a way to automate everything. This is an idiot’s guide to do it, since I’m by no means a pro at this stuff - I just poke around until I can make it work.

Requirements

  • Web Server where you can run CRON jobs (most basic web servers can do this)
  • ifttt Account (Free)
  • Dropbox

Step 1 - Set up a CRON job on your web server

This step will set up an automated command to download all of your current Pinboard data to an XML or JSON file on your web server.

A CRON job is an automated way to run commands on a web server. The Pinboard API lets you send a command that will return all of your data.
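If your host gives you a raw crontab instead of a control-panel form, a nightly schedule might look like the sketch below. The helper name, the script path, and the 3:00 a.m. run time are placeholders I made up for illustration; nothing here is required by Pinboard or by any particular host.

```shell
#!/bin/sh
# Sketch: build a crontab line that runs a command once a day.
# Crontab fields, left to right:
#   minute  hour  day-of-month  month  day-of-week  command

# cron_line HOUR MINUTE COMMAND  (hypothetical helper, for illustration only)
cron_line() {
    printf '%s %s * * * %s\n' "$2" "$1" "$3"
}

# Example: run a (hypothetical) backup script every night at 03:00.
cron_line 3 0 "/home/user/bin/pinboard_backup.sh"
```

The line it prints is the kind of entry you would paste into your crontab. The three asterisks mean every day of the month, every month, and every day of the week.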

Log in to your web host’s control panel, and you should be able to find somewhere to enter CRON jobs. You can choose how often to run it; I run mine nightly. Use this command, and fill in the variables in brackets (be sure to remove the brackets).

curl "https://[pinboardusername]:[pinboardpassword]@api.pinboard.in/v1/posts/all?format=json" -o "[/path/to/backup/directory/on/Webserver/][filename].json"

I chose to use JSON per Maciej Ceglowski’s recommendation that JSON is probably the easiest to parse if Pinboard were ever to disappear. If you want to use XML, just remove ?format=json and change the filename to .xml.
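If your host lets you run a shell script from CRON, the same download can be wrapped so that a failed request never overwrites last night's good copy. This is only a sketch under my own assumptions: the function name pinboard_backup, its argument order, and the .tmp file naming are made up for illustration, while the curl flags are standard ones.

```shell
#!/bin/sh
# Sketch: download the Pinboard export to a temp file, then move it into
# place only if curl succeeded and the downloaded file is not empty.

# pinboard_backup USER PASSWORD DEST_FILE  (hypothetical helper)
pinboard_backup() {
    user=$1
    pass=$2
    dest=$3
    tmp="$dest.tmp"

    # --fail makes curl exit non-zero on HTTP errors (e.g. a bad password).
    # The URL is quoted so the shell does not treat "?" as a wildcard.
    if curl --fail --silent --show-error \
            --user "$user:$pass" \
            --output "$tmp" \
            "https://api.pinboard.in/v1/posts/all?format=json" \
        && [ -s "$tmp" ]; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        echo "Pinboard backup failed; kept the previous file" >&2
        return 1
    fi
}

# Usage (use an absolute path; names in capitals are placeholders):
# pinboard_backup "USERNAME" "PASSWORD" "/home/user/public_html/Pinboard/pinboard_backup.json"
```

Writing to a temp file first is a design choice, not a requirement: it means a network hiccup leaves you with yesterday's backup instead of an empty or half-written file.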

Parsing just means that in the future something could take the input and split all the fields into something readable. Think of it as an Excel spreadsheet that can split each field into a column.
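To make that concrete, here is a rough sketch that turns the backup into tab-separated columns using the jq command-line tool. It assumes jq is installed, that the export is a JSON array of bookmark objects, and that each object has href, description, and tags fields; check those names against your own file before relying on them. The function name is only for illustration.

```shell
#!/bin/sh
# Sketch: print one bookmark per line as URL <tab> title <tab> tags.
# Assumes jq is installed and the field names below match your export.

# pinboard_to_tsv BACKUP_FILE  (hypothetical helper)
pinboard_to_tsv() {
    jq -r '.[] | [.href, .description, .tags] | @tsv' "$1"
}

# Usage:
# pinboard_to_tsv /path/to/pinboard_backup.json > bookmarks.tsv
```

The resulting .tsv file should open in a spreadsheet program with one column per field, which is the Excel picture described above.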

Step 2 - Make ifttt copy your backup to Dropbox nightly

At this point, you’ll have a .json file sitting on your web server at a URL such as http://www.example.com/Pinboard/pinboard_backup.json. For most people this might be enough, but I want an extra layer of backup protection on Dropbox, in case my web server goes down as well.

Log in to ifttt, and go to Tasks > Create Task. Your trigger will be Date & Time, where you’ll set what time you want the task to run. For the action, choose Dropbox and its option to grab a file from URL. Just enter your URL and the Dropbox directory where you want the file saved.

And you’re done.

Now I can live with the security that my saved articles are backed up. I’m secure knowing I’m a paying user of Pinboard, but nothing is forever, and some day the service will not exist. When that day comes, I’ll have an up-to-date backup sitting locally and in the cloud on Dropbox.

Update Dec 29 4:55pm: The question was asked whether you can put your API endpoint URL (https://[pinboardusername]:[pinboardpassword]@api.pinboard.in/v1/posts/all?format=json) straight into ifttt. Maciej confirmed that it should work. The problem is that you’re essentially storing your login credentials in a 3rd party service, and you don’t know if they’re storing and transmitting them securely.

So the point is: it’s an OK workaround if you don’t have access to a web server; just make sure you use a unique password.

  • Anonymous

    Thank you very much for writing this up! I will now go look up how to run a CRON job on my webserver. Backups are love.

  • http://peterstuifzand.nl Peter Stuifzand

    Why don’t you download the file directly from the Pinboard url? Security, perhaps?

  • http://behindcompanies.com/ Marcelo Somers

    Because there is no way to do it automatically. I’d have to log in each day and download the HTML/XML/JSON file.

    I can’t automate grabbing it directly because Pinboard doesn’t give you a unique URL to the file. Maciej recommended using the API: https://twitter.com/pinboard/status/152106556891217921

  • http://peterstuifzand.nl Peter Stuifzand
  • http://behindcompanies.com/ Marcelo Somers

    Hm, interesting. I thought it was, but when I tried it yesterday it didn’t work. It works now. Looks like you could just drop that URL into ifttt. I’ll confirm with Pinboard.

  • http://behindcompanies.com/ Marcelo Somers

    Maciej said it should work. The downside is that you’d be plugging your password into a 3rd party service. You don’t know that they store and transmit it securely.

    See: https://twitter.com/pinboard/status/152523193796657155

  • Drew Schuster

    Great post!  Looking at how you did this, I decided to extend your technique a little.  I wrote a post ( http://nuncamind.com/blog/2011/12/31/automatic-pinboard-backup/ ) that shows how I load the Pinboard backup as a static html file and then publish it on my server.  This way even if Pinboard and Dropbox are down I can still access them from anywhere!

  • http://twitter.com/richardmhowell Richard Howell

    Just as a side note, I had trouble getting my script to work. I set it all up as per the instructions above and the CRON ran, but ifttt had nothing to copy. I found out in the end that you have to point the script to the absolute path of the JSON file rather than the relative path: /home/user/public_html/backup_folder/, not /public_html/backup_folder

    This may have been an oversight on my part, not knowing 100% how things worked. However, you can find the absolute path by creating an empty PHP file (e.g. path.php) and copying the text found here into it:

    http://joejoomla.com/news-mainmenu-46/1-latest/83-how-to-find-the-absolute-path.html

    Upload it to the directory you are trying to use as the backup directory, then browse to that directory in your web browser (e.g. http://www.yoursite.com/pinboard), adding path.php to the end of the URL, and the path to this folder will be displayed.

  • http://www.facebook.com/chadchristian Chad Nielsen

    Great stuff, thanks for sharing!  I have a Pinboard account but don’t find myself using it anywhere near as much as Instapaper.  Why is Pinboard your bookmark service of choice over, say, Instapaper?

  • http://behindcompanies.com/ Marcelo Somers

    Great point, Richard, thanks for clarifying. I will update this post when I get the chance to make it clearer for the less-technical.

  • http://behindcompanies.com/ Marcelo Somers

    Good question - I actually love Instapaper, but the two probably have different uses. I save stuff I want to read in Instapaper. If it’s worth archiving for later, then I’ll save it to Pinboard. The two services actually connect: you can automatically import your favorited articles to Pinboard.

    Pinboard is sort of my archive of stuff that was good, plus the archiving feature to combat link rot is fantastic.

  • http://www.facebook.com/chadchristian Chad Nielsen

    That’s good to know.  Hehe, I’ve found myself using Instapaper for what you use both it and Pinboard for: the quick read-later stuff as well as long-term link storage.  Perhaps I’ll give your “workflow” a try to see if there are some Pinboard features I find useful.

    I will say the link rot feature may be the one thing worth using Pinboard for. I’ll have to see if it’s worth the $25/year price.