converting zipcodes to latitude/longitude

4 messages · Jim Lemon, Nicola Ruggiero

#
Hello everyone,

I've downloaded Jeffrey Breen's R package "zipcode," which has the
latitude and longitude for all of the US zip codes. So, this is a
data.frame with 43,191 observations. That's one data frame in my
environment.

Then, I have another data.frame with over 100,000 observations that
look like this:

waltham, Massachusetts 02451
Columbia, SC 29209

Wheat Ridge , Colorado 80033
Charlottesville, Virginia 22902
Fairbanks, AK 99709
Montpelier, VT 05602
Dobbs Ferry, New York 10522

Henderson , Kentucky 42420

The blank lines represent missing values in the column. Regardless,
I need to figure out how to write code that would, presumably, match
the zipcodes and add another column to the data frame with the
latitude and longitude. So, for example, the code would recognize
02451 above and, in the column next to it, write
42.3765° N, 71.2356° W, since that's the
latitude and longitude for Waltham, Massachusetts.

Any idea of how to begin writing code that would perform such an operation?

Again, I have a data.frame with the zipcodes linked to the
latitudes and longitudes, on the one hand, and another data.frame with
only zipcodes (and some holes). I need to produce the corresponding
latitude/longitudes in the latter data.frame.

Nicola
#
Hi Nicola,
Getting the blank rows will be a bit more difficult and I don't see
why they should be in the final data frame, so:

townzip<-read.table(text="waltham, Massachusetts 02451
Columbia, SC 29209

Wheat Ridge , Colorado 80033
Charlottesville, Virginia 22902
Fairbanks, AK 99709
Montpelier, VT 05602
Dobbs Ferry, New York 10522

Henderson , Kentucky 42420",
sep="\t",stringsAsFactors=FALSE)
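zip_split<-function(x) {
 # split "town, state zip" at the comma
 commasplit<-unlist(strsplit(x,","))
 # drop the digits to keep the state, drop the letters to keep the zip
 state<-trimws(gsub("[[:digit:]]","",commasplit[2]))
 zip<-trimws(gsub("[[:alpha:]]","",commasplit[2]))
 return(c(trimws(commasplit[1]),state,zip))
}
# keep the columns as character so the zip codes retain their leading zeros
townzipsplit<-as.data.frame(t(sapply(townzip$V1,zip_split)),
 stringsAsFactors=FALSE)
rownames(townzipsplit)<-NULL
names(townzipsplit)<-c("town","state","zip")
townzipsplit$latlon<-NA
# "zipcodedf" stands for the "zipcode" data frame; I don't know the name of
# its zipcode column, so change by.y to match
newzipdf<-merge(townzipsplit,zipcodedf,by.x="zip",by.y="zip")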
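
Since the lookup table's column names are not known here, the merge step can be tried on a toy example first. This is only a sketch: zip_lookup, addresses, and the column names zip, latitude, and longitude are assumptions, and the single lookup row uses the Waltham coordinates quoted earlier in the thread.

```r
# Toy stand-in for the zip code lookup table. The column names are
# assumptions; check them against the real data frame with names().
zip_lookup <- data.frame(
  zip = "02451",
  latitude = 42.3765,
  longitude = -71.2356,   # west longitudes are negative
  stringsAsFactors = FALSE
)

# Addresses already split into columns. The zip column is character,
# so the leading zero in "02451" is not lost.
addresses <- data.frame(
  town = c("waltham", "Columbia", NA),
  zip = c("02451", "29209", NA),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every address row; rows whose zip is missing or
# not found in the lookup table get NA for latitude and longitude.
merged <- merge(addresses, zip_lookup, by = "zip", all.x = TRUE)
print(merged)
```

Without all.x = TRUE, merge() drops the rows that have no match, which is what the code above does with the blank rows.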

Jim

On Tue, May 14, 2019 at 5:57 AM Nicola Ruggiero
<nicola.ruggiero.unt at gmail.com> wrote:
1 day later
#
Hi Jim,

I ended up collaborating with someone, and, on the basis of looking at
your code (we did take it into consideration and talk about it), we
came up with this:

library(stringr)
numextract <- function(string){
     str_extract(string, "\\-*\\d+\\,*\\d*")
}
myDataSet$zip<-numextract(myDataSet$state)
combineddata<-merge(zipcode, myDataSet, by.x="zip", by.y="zip")

So, as I understand it, we built a function whose purpose was
to extract the numerical value from a string, stored the result in
a new column, then merged the two data frames together. It worked!
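
One caveat: the pattern above returns the first run of digits in the string, so an entry that also contains, say, a suite number would give the wrong value. A stricter variant, sketched here in base R, accepts only a standalone five-digit number. The function name extract_zip is hypothetical, and the sketch assumes plain five-digit zip codes.

```r
# Return the first standalone five-digit number in each string,
# or NA when there is none (blank or missing entries included).
extract_zip <- function(x) {
  # the lookarounds reject digits that are part of a longer number
  m <- regexpr("(?<![0-9])[0-9]{5}(?![0-9])", x, perl = TRUE)
  hit <- !is.na(m) & m > 0
  out <- rep(NA_character_, length(x))
  # regmatches() returns only the matched pieces, in order
  out[hit] <- regmatches(x, m)
  out
}

zips <- extract_zip(c("waltham, Massachusetts 02451",
                      "",
                      NA,
                      "Suite 4, Columbia, SC 29209"))
print(zips)
```

The result stays character, so zip codes such as 02451 keep their leading zero for the merge.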

Now I just need to figure out this thing called shape data...basically
I need to figure out how to interpose a shape of the United States
underneath my data points so that I can see them over the location to
which they correspond.

Nicola
On Mon, May 13, 2019 at 9:09 PM Jim Lemon <drjimlemon at gmail.com> wrote:
#
Hi Nicola,
Good to learn that you solved the problem. Shape files are usually a
set of polygons for the named areas.  I did some work with the "rgdal"
package a while ago and it wasn't very difficult. There might be
better methods now, so posting to R-SIG-geo is a good idea.
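
For a quick look that needs no shape files, one option is the "maps" package, which ships state outlines. This is only a sketch, assuming that package is installed; pts is a hypothetical data frame holding the single Waltham coordinate from earlier in the thread.

```r
# Hypothetical points to plot. Longitude goes on the x axis and
# west longitudes must be negative.
pts <- data.frame(
  latitude = 42.3765,
  longitude = -71.2356
)

# Draw the state outlines, then overlay the points, only if the
# "maps" package is available.
if (requireNamespace("maps", quietly = TRUE)) {
  maps::map("state")
  points(pts$longitude, pts$latitude, pch = 19, col = "red")
}
```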

Jim

On Thu, May 16, 2019 at 6:30 AM Nicola Ruggiero
<nicola.ruggiero.unt at gmail.com> wrote: