Message-ID: <CABdHhvFzm05hWYaZ3g=ptae563J6wciMeoLO2hqRuB4LYPnAiQ@mail.gmail.com>
Date: 2012-07-26T15:10:22Z
From: Hadley Wickham
Subject: Get XML or JSON data from api into data frame
In-Reply-To: <CAKvJ-w7LCH7pF-qk0J3AQaMO8pN7wM-_9jFH7d5sVZM8m9dGyg@mail.gmail.com>

On Thu, Jul 26, 2012 at 4:18 AM, Richard Ohrvall
<richard.ohrvall at gmail.com> wrote:
> Dear all,
>
> I am new to R in general and ways to retrieve XML or JSON data in
> particular. I have tried to get information through the XML package
> and various websites without being able to do exactly what I want. I
> hope someone of you can give me some help.
>
> I want to retrieve information about movies from IMDB, or rather from
> the unofficial API, www.imdbapi.com. I have a vector with a lot of
> movie IDs in the IMDB standard format. To give just a few:
>
> ids <-c("tt0110074", "tt0096184", "tt0081568", "tt0448134", "tt0079367")
>
> Now, I want to create a data frame where each movie corresponds to
> one row and the other information is retrieved from the API. This can
> be retrieved either as XML data or JSON data, e.g.
>
> JSON:
> http://www.imdbapi.com/?i=tt0110074&tomatoes=TRUE
>
> XML:
> http://www.imdbapi.com/?i=tt0110074&r=XML&tomatoes=TRUE
>
> where i refers to the movie ID, i.e. the information I have in my
> vector. They are all in the format ttXXXXXXX.

library(httr)
library(rjson)

fromJSON(text_content(GET("http://www.imdbapi.com/?i=tt0110074&tomatoes=TRUE")))

This will be a bit easier in the next version of httr:

content(GET("http://www.imdbapi.com/?i=tt0110074&tomatoes=TRUE"),
  type = "application/json")
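
To get from a single request to the data frame asked for above, one option is to fetch each ID in turn and bind the parsed records into rows. The following is only a rough sketch under some assumptions: fetch_movie() and records_to_df() are hypothetical helper names (not part of httr or rjson), the API is assumed to return a flat JSON object per movie, and content(resp, as = "text") is assumed to be available in the installed httr in place of text_content().

```r
# Hypothetical helper: fetch one movie record from the API as a named list.
# Assumes httr provides content(); rjson parses the JSON text.
fetch_movie <- function(id, base_url = "http://www.imdbapi.com/") {
  resp <- httr::GET(base_url, query = list(i = id, tomatoes = "TRUE"))
  httr::stop_for_status(resp)
  rjson::fromJSON(httr::content(resp, as = "text"))
}

# Hypothetical helper: turn a list of named lists into a data frame with
# one row per record. Fields missing from a record (or not length 1)
# become NA; all columns are kept as character.
records_to_df <- function(records) {
  fields <- unique(unlist(lapply(records, names)))
  rows <- lapply(records, function(rec) {
    vals <- lapply(fields, function(f) {
      v <- rec[[f]]
      if (is.null(v) || length(v) != 1) NA_character_ else as.character(v)
    })
    names(vals) <- fields
    as.data.frame(vals, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Offline illustration with two made-up records (not real API output):
example_records <- list(
  list(Title = "Movie A", Year = "1994"),
  list(Title = "Movie B", imdbRating = "7.1")
)
example_df <- records_to_df(example_records)

# With network access, the real call would look like:
# ids <- c("tt0110074", "tt0096184", "tt0081568", "tt0448134", "tt0079367")
# movies <- records_to_df(lapply(ids, fetch_movie))
```

Keeping the fetching and the reshaping in separate functions means the reshaping step can be checked without hitting the API, and numeric columns can be converted afterwards with as.numeric() as needed.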

See also https://github.com/hadley/data-movies, which I suspect is a
faster approach than using an API.

Hadley



-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/