I'm having trouble building a dataset. I'm working with Census data from Brazil, and the particular set I'm trying to get into right now is a microdata sample which has 4 data files that are saved as .txt files full of numbers. The folder also has a lot of Excel sheets and other text files describing the data and (I'm assuming) helping to organize everything. Unfortunately there isn't much help in the description about how to construct the dataset and avoid messing things up (since it's Census data, I need to make sure I avoid associating data with the wrong city/state, etc.).

I basically just need to be able to put the data in a readable format, because there's literally 1 variable in the set that I can't find anywhere else and need to get at for some analysis. However, when I've tried to get the data straight into R (copying from Notepad), it overloads R, and R stops responding.

Any suggestions? Or, if there isn't enough information about the set to be helpful, what else do you need to know? Or if you'd like to take a look at the data, let me know and I can attach it. Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/help-building-dataset-tp4637491.html
Sent from the R help mailing list archive at Nabble.com.
help building dataset
2 messages · walcotteric, R. Michael Weylandt
Weren't you told to take a look at read.table() (both the function help and the manual describing it)?

If the rows correspond in each data file, something like

do.call(cbind, lapply(dir(), read.table))

will possibly align the results of read.table()-ing each file in your directory.

To parse that further: dir() gives a character vector with the names of all files in the directory. lapply(x, FUN) takes a set of values (x) and applies FUN to each of them. Here it would call read.table() on each file name and return a list of data frames. do.call(cbind, x) will call the cbind() function on the results of lapply(). It's sort of like doing cbind(x[[1]], x[[2]], x[[3]], ...) but doesn't require as much typing, or require you to know how many pieces are going in.

Michael
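For what it's worth, here is a minimal, self-contained sketch of the dir() / lapply() / do.call(cbind, ...) idea described above. Everything about the files is an assumption made for illustration: the folder, the file names (part1.txt, part2.txt, layout.txt) and their contents are made up, and the real census files may well need different read.table() arguments. It also restricts dir() with a pattern, since the folder in question reportedly holds description files alongside the data.

```r
# Hypothetical setup: a temporary folder with two small whitespace-delimited
# data files plus one description file that should NOT be read as data.
data_dir <- file.path(tempdir(), "census_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("1 10", "2 20", "3 30"), file.path(data_dir, "part1.txt"))
writeLines(c("100 7", "200 8", "300 9"), file.path(data_dir, "part2.txt"))
writeLines("Layout notes, not data", file.path(data_dir, "layout.txt"))

# Pick up only the data files; full.names = TRUE returns paths that
# read.table() can open regardless of the working directory.
files <- dir(data_dir, pattern = "^part.*\\.txt$", full.names = TRUE)

# Read each file into a data frame (a list with one element per file).
pieces <- lapply(files, read.table)

# Binding side by side only makes sense if row i is the same record in
# every file, so at least check that the row counts agree.
stopifnot(length(unique(sapply(pieces, nrow))) == 1)

# Same as cbind(pieces[[1]], pieces[[2]]) written out by hand.
combined <- do.call(cbind, pieces)

# Default column names (V1, V2, ...) repeat across files, so renumber them.
names(combined) <- paste0("V", seq_along(combined))
print(dim(combined))
```

Reading the files from disk this way, instead of pasting from Notepad, is the point of the suggestion. If the real files turn out to be fixed-width rather than delimited (an assumption worth checking against the layout documentation in the folder), read.fwf() with the documented column widths would take the place of read.table() in the lapply() call.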
On Mon, Jul 23, 2012 at 1:32 PM, walcotteric <walcott3 at msu.edu> wrote:
> I'm having trouble building a dataset. [...]