Thanks for the nice write-up. Maybe we can offer some friendly collective pressure so that the NYT loosens up on the API TOS? I'll sign such a petition.
Thanks indeed for the kind words and thoughtful criticism. Andrei and Derek have worked long hours to get this in shape for the 111th, and there's a lot more to come.
Just to address a couple points you raised (and David, perhaps this will address your concerns too):
1) The 5,000 limit is relatively low, I agree. I think since we're new to this, we wanted to err on the low side to begin with and ramp up as demand increased. This was largely a technical consideration, since the APIs are on a new platform for us. If usage demands it, we'll re-examine the limits and bump them upwards. We can make exceptions on a case-by-case basis, so if this is a show-stopping limitation for you, drop me a note. We'll see what we can do.
2) I'm not a lawyer, but my understanding of the noncompete clause is that it refers specifically to financial competition. In other words, we wouldn't want someone taking our API, turning it into a product or service and then selling it. (At least not without talking to us first.) It has nothing at all to do with who can and cannot use the API, just how.
3) The "archiving data" clause is fairly boilerplate, and, frankly, pretty reasonable I think. My understanding is this restriction is intended to prevent users from systematically downloading all of the data, and storing it in perpetuity -- even after you stop using the API and/or your account is terminated. That isn't the intended use of the API, and from a data integrity standpoint makes perfect sense.
We also aren't alone in having a limitation like this in place. CRP's terms, for example, have roughly the same restriction.
4) Finally, there's the question of local media organizations using the API. The legalese here is a little dense, and perhaps we should make this clearer -- but we absolutely allow other news organizations to use our API. In fact, we specifically lifted the commercial use restriction with that in mind.
There are two reasons we did this: First, any lawyer will tell you there's no hard and fast rule about what constitutes commercial use. Technically, a blogger with Google AdWords on his or her site could be considered a commercial entity. Second, The Times considers this API to be part of its journalistic mission, and wanted it to be as open as possible.
It seemed odd to us to release something that users like the DNC and RNC -- which are both noncommercial -- could take advantage of, but for-profits like TPMMuckraker could not. We felt the best solution was to drop that restriction entirely.
Anyway, I hope that addresses some of the issues you raised, and thanks again for the mention.
Yuck. Sorry about the comment blob above. That had line breaks when I posted it... I did forget to include my email, which is aron [at] nytimes.com.
Aron,
A couple questions:
"3) The "archiving data" clause is fairly boilerplate, and, frankly, pretty reasonable I think. My understanding is this restriction is intended to prevent users from systematically downloading all of the data, and storing it in perpetuity"
Is there any data in this new API that you can really claim as your own? Isn't all of the data here coming from public sources? Why prevent people from caching it locally?
The API terms of service say "use the NYT APIs for any commercial purpose or in any product or service that competes with products or services offered by NYT."
The "or in any" actually nullifies the commercial categorization earlier in the sentence. This means that because Sunlight Labs has an API which could be construed to compete with the New York Times API, we can't use it to begin with!
Certainly, I know you welcome us to use it, but that isn't the point. The point is, the API licensing agreement you've got needs to be revisited and adjusted so that it is truly "open."
We're happy to help in any way that we can.
"We also aren't alone in having a limitation like this in place. CRP's terms, for example, have roughly the same restriction. "
Great, they set a precedent. ::bangs his head::
But, it's okay for the NYTimes API, since it's not like you couldn't find all that on GovTrack anyway... ::cough cough::
"XML is ugly" - What's so bad about XML? Depending on the usage, JSON may be more convenient; in most cases, however, either one will require parsing by the application.
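To make the parsing point concrete, here is a minimal sketch in Python using only the standard library. The two payloads and their field names (`member`, `name`, `party`) are made up for illustration and are not taken from any real API response; the helpers `parse_xml` and `parse_json` are likewise hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads: the field names are invented for this example
# and do not come from any real API response.
XML_DOC = "<member><name>Jane Doe</name><party>I</party></member>"
JSON_DOC = '{"name": "Jane Doe", "party": "I"}'


def parse_xml(text):
    """Parse the XML string and flatten its child elements into a dict."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def parse_json(text):
    """Parse the JSON string; the result is already a dict."""
    return json.loads(text)


xml_record = parse_xml(XML_DOC)
json_record = parse_json(JSON_DOC)

# Both formats end up as the same in-memory structure after one parsing step.
print(xml_record == json_record)
```

Either way the application has to run a parser before it can use the data. With JSON the result maps directly onto native types, while with XML the application has to decide how elements become fields, which is roughly where the convenience difference comes from.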