column-binary data

4 messages · David Barron, jim holtman, (Ted Harding)

#
I have a number of datasets that are in multipunch column-binary format.  Does anyone have any advice on how to read these into R?  Thanks.

David
#
On 16-Sep-05 David Barron wrote:
Do you mean something like the old

HOLLERITH PUNCHED CARD BINARY FORMAT?
1111111110111111101111011111101111110
0000000001000000010000100000010000001
0000010100110000000010000001100010011
1111001010001010000000001100100101001
0111100100011001100001000100001101011
0100010000001100001010010101001110001
0100101000010101001100001010100101101

(here "1" = hole in card; each column is the binary representation
of a 7-bit ASCII code, high-order bit on top).
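
As a quick check of this reading, the first column of the card above can be decoded by hand in R. This is only a sketch; the names `col1` and `code` are illustrative and do not come from the thread.

```r
# Bits of the first column of the example card, read top to bottom
# (high-order bit first).
col1 <- c(1, 0, 0, 1, 0, 0, 0)

# Weight each bit by its power of two: 64, 32, 16, 8, 4, 2, 1.
code <- sum(col1 * 2^(6:0))
code   # 72, the ASCII code for "H"
```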

If so, or if you precisely describe the binary format you have,
then the above or similar should be easy to get into R.

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Sep-05                                       Time: 19:56:01
------------------------------ XFMail ------------------------------
#
On 16-Sep-05 jim holtman wrote:
Indeed ... as an example of how one could proceed, I "deconstruct"
my example below (see at end).
#First, construct a vector ASCII consisting of the printable
#characters:

ASCII<-c(" ","!","\"","#","$","%","&","'","(",")",
         "*","+",",","-",".","/","0","1","2","3",
         "4","5","6","7","8","9",":",";","<","=",
         ">","?","@","A","B","C","D","E","F","G",
         "H","I","J","K","L","M","N","O","P","Q",
         "R","S","T","U","V","W","X","Y","Z","[",
         "\\","]","^","_","`","a","b","c","d","e",
         "f","g","h","i","j","k","l","m","n","o",
         "p","q","r","s","t","u","v","w","x","y",
         "z","{","|","}","~")


#Next, a vector of powers of 2:

rad<-2^(6:0)


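#Read in the data from stdin() (one string of 0s and 1s per card row)
#and arrange the bits as a 7 x 37 matrix, one column per character:

M<-t(matrix(as.integer(unlist(strsplit(scan(stdin(),
     what="character"),split=""))),ncol=7))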

#(read 7 lines from stdin by copy&paste:
#1: 1111111110111111101111011111101111110
#2: 0000000001000000010000100000010000001
#3: 0000010100110000000010000001100010011
#4: 1111001010001010000000001100100101001
#5: 0111100100011001100001000100001101011
#6: 0100010000001100001010010101001110001
#7: 0100101000010101001100001010100101101
#8: 
#Read 7 items

#and convert the columns to ASCII codes:

R<-rad%*%M

#and see what you've got (ASCII[1] is the space, code 32, hence
#the offset of 31):

paste(ASCII[R-31],collapse="")

#[1] "HOLLERITH PUNCHED CARD BINARY FORMAT?"
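
The same steps can also be run without pasting into stdin(), which may be handier in a script. The following is a sketch under that assumption; the names `rows`, `bits` and `codes` are illustrative, and the seven strings are simply the card rows from the example.

```r
# The seven card rows from the example, top row (high-order bit) first.
rows <- c("1111111110111111101111011111101111110",
          "0000000001000000010000100000010000001",
          "0000010100110000000010000001100010011",
          "1111001010001010000000001100100101001",
          "0111100100011001100001000100001101011",
          "0100010000001100001010010101001110001",
          "0100101000010101001100001010100101101")

# One matrix row per card row, one column per character: 7 x 37.
bits <- do.call(rbind, lapply(strsplit(rows, split = ""), as.integer))

# Same weighting as rad %*% M, flattened to a plain vector of codes.
codes <- as.vector(2^(6:0) %*% bits)
codes[1:9]   # 72 79 76 76 69 82 73 84 72, i.e. "HOLLERITH"
```

Building the matrix row by row with rbind avoids the transpose needed when the bits are first flattened with unlist().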

The above can be adapted to whatever your binary data represent
and to how they are laid out in the input.
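
If the data arrive as bytes in a binary file rather than as text rows of 0s and 1s, one possible adaptation is to expand each byte into a column of bits first and then apply the same weighting. The sketch below assumes one byte per column with the high-order bit on top. The thread does not say how the multipunch files are actually laid out, so that layout, the helper name `bytes_to_bits` and the file name in the comment are all assumptions made for illustration.

```r
# Hypothetical helper (not from the thread): expand each byte into a
# column of bits, high-order bit on top. The layout is an assumption.
bytes_to_bits <- function(bytes, nbits = 8L) {
  vals <- as.integer(bytes)
  shifts <- (nbits - 1L):0L
  # Entry [i, j] is the bit of vals[j] that carries weight 2^shifts[i].
  outer(shifts, vals, function(s, v) (v %/% 2^s) %% 2)
}

# In practice the bytes might come from readBin(); the file name is made up:
# bytes <- readBin("cards.bin", what = "raw", n = file.size("cards.bin"))
bytes <- as.raw(c(72, 79, 76))   # stand-in data: "H", "O", "L"

B <- bytes_to_bits(bytes, nbits = 7L)
B[, 1]                           # 1 0 0 1 0 0 0
as.vector(2^(6:0) %*% B)         # 72 79 76
```

A real column-binary file would need its own mapping from bytes to punch rows, which would replace the one-byte-per-column assumption here.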

Others may find a slicker way of doing this.

The only fly in the above ointment is that I haven't located
in R a character-vector constant which consists of the printable
ASCII characters, or a function to convert numerical ASCII code
to characters, so I made my own.
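
For what it is worth, base R does seem to cover this: `rawToChar(as.raw(x))` and `intToUtf8(x)` both turn numeric codes into characters, so the hand-built table can probably be avoided. A minimal sketch, in which `codes` and `ASCII2` are illustrative names:

```r
# Stand-in codes for the first word of the example message.
codes <- c(72, 79, 76, 76, 69, 82, 73, 84, 72)

rawToChar(as.raw(codes))   # "HOLLERITH"
intToUtf8(codes)           # "HOLLERITH"

# A table of the printable ASCII characters, codes 32 to 126,
# one character per element:
ASCII2 <- intToUtf8(32:126, multiple = TRUE)
length(ASCII2)             # 95
```

With either function the final step could be written directly on the codes, without the offset of 31 into a lookup vector.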

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Sep-05                                       Time: 22:26:16
------------------------------ XFMail ------------------------------