August 6, 2011

The Data Portability Fact Sheet

Introduction

Parallego has been announced on TechCrunch after a stealth period as the latest social network that will challenge Facebook and Google Plus. Their investors include big names like Sequoia Capital, Andreessen Horowitz and Union Square Ventures, and they have top angels like Ron Conway. They really love developers, so they offer an API to show their commitment to openness.

Parallego doesn’t really exist, but announcements like this are part of startup breaking news about the web and entrepreneurship. These companies emphasize their love for developers and claim to be open because they provide APIs. The truth is that when you test their APIs you usually find a number of problems:

  1. You can read the information but cannot write or modify it.
  2. You have access to certain information but other information is unavailable.
  3. The rate of API calls is low, so you can only make a few calls and must wait a certain period of time to continue.
  4. You cannot make parallel requests in a multiprocess or multithreaded application.
  5. There is no way to quickly pay for the service and access a better service. Google API Console is a step in that direction but a lot of important Google NoAPIs are unavailable.
  6. Some OAuth2 protocol implementation does not work with the existing development libraries.
  7. The service says it welcomes new applications, but this is not the case for new UIs and mobile clients. See Twitter to Devs: Don’t Make Twitter Clients… Or Else [mashable.com]
  8. You cannot even export your own information. The time you have spent adding content to this service is lost once you leave it.
  9. There is no love for developers: the forums are filled with questions and there are no official answers. See Rate limit with billing enabled [google.com] and Graph API rate limit? [facebook.com]
  10. The company often changes its policies. The web mashup that you did seven months ago that attracted thousands of users is useless because the new API revision does not give you the data that you need for some specific features. See Should facebook pay compensation for deprecated API calls and changes [facebook.com]
  11. Old content is removed without warning.

After a while, you begin to doubt, close your eyes and rethink again about the word “Open”. It seems somewhat meaningless. If you are older you may remember that Microsoft was accused of being closed, but you may also remember that in the worst case you could reverse engineer and access all the internals yourself. You need advanced knowledge of tools like IDA Pro, OllyDbg, and WinDbg of course, but it was possible. You can’t reverse engineer the cloud, however you can scrape the information, but this is time consuming both in terms of development and running time.

And while “Open” is repeated in every announcement from high profile web companies, your brain does not register the word anymore just like you do not see any of the ads on Google because your brain made has made its own AdBlock extension.

Data Portability Classification

For all of the above reasons we think the best initiative towards transparency is adding a fact sheet to every service so we can compare them and know how “open” they really are. WikiMatrix is a good example of how comparisons could be made.

Marco Paol from DBB has been informally collecting information about some web services and has put it in a public spreadsheet on Data Portability Comparison

Please feel free to send us clarifications, suggestions, and fixes.

Resources

  1. Open Data and Linked Data [wikipedia.org]
  2. DataPortability project [wikipedia.org]
  3. Small data [smalldata.org]
  4. The open data manual [opendatamanual.org]
  5. Is It Open Data?
  6. Open Data mailing lists [okfn.org]
  7. Synaptic/Web
  8. Open Knowledge Foundation Blog
  9. The Friend of a Friend (FOAF) project
  10. theinfo.org: Community for Getting, Processing, and Visualizing Large Data Sets
  11. Plagiarism Today
  12. PeopleBrowsr’s case against Twitter heads back to state court after federal court ruling
  13. Archive Team archivists