I have a number of datasets that are multipunch column-binary format. Does anyone have any advice on how to read this into R? Thanks. David
column-binary data
4 messages · David Barron, jim holtman, (Ted Harding)
On 16-Sep-05 David Barron wrote:
I have a number of datasets that are multipunch column-binary format. Does anyone have any advice on how to read this into R? Thanks. David
Do you mean something like the old HOLLERITH PUNCHED CARD BINARY FORMAT?
1111111110111111101111011111101111110
0000000001000000010000100000010000001
0000010100110000000010000001100010011
1111001010001010000000001100100101001
0111100100011001100001000100001101011
0100010000001100001010010101001110001
0100101000010101001100001010100101101
(here "1" = hole in card, binary representation of 7-bit ASCII encoding, high-order bit on top).
If so, or if you precisely describe the binary format you have, then the above or similar should be easy to get into R.
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Sep-05 Time: 19:56:01
------------------------------ XFMail ------------------------------
On 16-Sep-05 jim holtman wrote:
Each card column had 12 rows, so as binary it comes in as 12 bits. The question is does this come as a 16 bit integer, or a string of 12 bits that I have to extract from. Either case is not that difficult to do.
Indeed ... as an example of how one could proceed, I "deconstruct" my example below (see at end).
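For the first case raised above, where each card column arrives as a 16-bit integer, a minimal sketch might look like the following. The layout is purely an assumption for illustration (one big-endian 16-bit word per column, with the 12 punch rows in the low 12 bits, top row in the highest of those bits); a real column-binary file may well be laid out differently. The name decode_columns is hypothetical.

```r
# Assumed layout (not taken from any specification): each card column is one
# big-endian 16-bit word whose low 12 bits hold the 12 punch rows.
decode_columns <- function(bytes) {
  n <- length(bytes) %/% 2
  # readBin() accepts a raw vector as well as a connection.
  words <- readBin(bytes, what = "integer", n = n, size = 2,
                   signed = FALSE, endian = "big")
  # Result: 12 rows (top punch row first) by n columns of 0/1 values.
  outer(2^(11:0), words, function(p, w) (w %/% p) %% 2)
}

# Two made-up columns: 0x0801 (top and bottom rows punched)
# and 0x0003 (the two bottom rows punched).
example <- as.raw(c(0x08, 0x01, 0x00, 0x03))
punches <- decode_columns(example)
punches
```

To read from an actual file, the raw vector could be replaced by a connection opened with file(path, "rb"), again assuming the file really does hold one word per column.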
#First, construct a vector ASCII consisting of the printable
#characters:
ASCII<-c(" ","!","\"","#","$","%","&","'","(",")",
"*","+",",","-",".","/","0","1","2","3",
"4","5","6","7","8","9",":",";","<","=",
">","?","@","A","B","C","D","E","F","G",
"H","I","J","K","L","M","N","O","P","Q",
"R","S","T","U","V","W","X","Y","Z","[",
"\\","]","^","_","`","a","b","c","d","e",
"f","g","h","i","j","k","l","m","n","o",
"p","q","r","s","t","u","v","w","x","y",
"z","{","|","}","~")
#Next, a vector of powers of 2:
rad<-2^(6:0)
#Read in the data from stdin():
M<-t(matrix(as.integer(unlist((strsplit(scan(stdin(),
what="character"),split="")))),ncol=7))
#(read 7 lines from stdin by copy&paste:
#1: 1111111110111111101111011111101111110
#2: 0000000001000000010000100000010000001
#3: 0000010100110000000010000001100010011
#4: 1111001010001010000000001100100101001
#5: 0111100100011001100001000100001101011
#6: 0100010000001100001010010101001110001
#7: 0100101000010101001100001010100101101
#8:
#Read 7 items
#and convert the columns to ASCII codes:
R<-rad%*%M
#and see what you've got:
paste(ASCII[R-31],collapse="")
#[1] "HOLLERITH PUNCHED CARD BINARY FORMAT?"
The above can be adapted to whatever your binary data represent
and to how they are laid out in the input.
Others may find a slicker way of doing this.
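One possible variant, offered only as a sketch: the same decoding without the interactive paste into stdin(), so that it can be run as a script. The variable names rows, codes and decoded are illustrative, and rawToChar() is used for the final conversion on the assumption that a current version of base R is available.

```r
# The seven punch rows from the example, high-order bit first.
rows <- c("1111111110111111101111011111101111110",
          "0000000001000000010000100000010000001",
          "0000010100110000000010000001100010011",
          "1111001010001010000000001100100101001",
          "0111100100011001100001000100001101011",
          "0100010000001100001010010101001110001",
          "0100101000010101001100001010100101101")

# One matrix row per punch row, one column per character.
M <- do.call(rbind, lapply(strsplit(rows, ""), as.integer))

# Weight the rows by 64, 32, ..., 1 and sum down each column.
codes <- as.vector(2^(6:0) %*% M)

# Convert the byte values to a single string.
decoded <- rawToChar(as.raw(codes))
decoded
```

This should reproduce the string obtained from the original example.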
The only fly in the above ointment is that I haven't located
in R a character-vector constant which consists of the printable
ASCII characters, or a function to convert numerical ASCII code
to characters, so I made my own.
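For readers with a current version of R: base R does provide such conversions, so the hand-built vector can be avoided. A short sketch, assuming intToUtf8(), utf8ToInt() and rawToChar() are available in the installed version (the variable names are illustrative):

```r
# The 95 printable ASCII characters (codes 32 to 126) as a character vector.
ASCII <- intToUtf8(32:126, multiple = TRUE)

# A single code to its character, and back again.
ch   <- intToUtf8(72)
code <- utf8ToInt("H")

# A whole vector of codes to one string.
word <- rawToChar(as.raw(c(72, 79, 76, 76, 69, 82, 73, 84, 72)))
word
```

With ASCII built this way, the indexing ASCII[R-31] used above works unchanged.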
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Sep-05 Time: 22:26:16
------------------------------ XFMail ------------------------------