Binary Conversion of a File
Hi Guys,
The entire issue is on File Handling.
There is one file say input.dat and another file say output.dat
There are two classes Head and Tail which are Serializable. I am creating objects of those two classes.
I am inserting the first object of Head, then first few bytes from input.dat and then the first object of Tail.
This goes on until it reaches the end of the file. Up to here everything is fine.
Now I want to convert the generated output.dat into binary format (e.g. 10110).
How can I go about it? After converting it to binary form, how can I retrieve the data?
Please help me out
Regards
Unmesh
Re: Binary Conversion of a File
Author: jummoMa1
In Reply To: Binary Conversion of a File May 16, 2002 10:19 AM
Reply 1 of 1
What do you mean "converting it in binary form"? Would you get output.dat and convert it in a long string of "0" and "1" chars? I'm asking because output.dat, as described by you, is already a binary file...
Thanx jummoMa1 for the reply.
What I explained earlier is that I am extracting a few bytes of data from input.dat, inserting an object of Head.class at the beginning of those bytes and appending an object of Tail.class at the end of them. This process goes on until the end of input.dat is reached. The output file contains the following, in this sequence:
1) Serialized object Head
2) Few bytes data from input.dat
3) Serialized object tail
4) Serialized object Head
5) Next Few bytes data from input.dat
6) Serialized object tail
and so on
So my output.dat has following
1) Serialized object of Head - in JUNK format which can be deserialized by readObject()
2) few bytes which were copied from input.dat which can be read in Byte Array
3) Serialized object of TAIL - in JUNK format which can be deserialized by readObject()
and so on
This is not in the format of ones and zeros (1011001).
I want to convert it into this form (10110101), which can then be converted back into the following form:
1) Serialized object Head
2) Few bytes data from input.dat
3) Serialized object tail
4) Serialized object Head
5) Next Few bytes data from input.dat
6) Serialized object tail
I am badly stuck in the conversion of the data of output.dat file into 110101 format
Waiting for your reply
Regards
Unmesh
Assuming that what you want is the binary ASCII representation of your file, then, this code should do the trick:
FileInputStream in = new FileInputStream(...);
byte[] buffer = new byte[4096];
int nread;
StringBuffer result = new StringBuffer();
while ((nread = in.read(buffer)) > 0)
{
    for (int i = 0; i < nread; ++i)
    {
        byte value = buffer[i];
        // walk the mask from the most significant bit (128) down to the least (1)
        for (int mask = 128; mask > 0; mask /= 2)
            result.append(((value & mask) == mask) ? '1' : '0');
    }
}
in.close();
I'm sorry, but I still can't figure out what your form (1100110) is. Can you give me an example of this kind of file?
The generally accepted meaning of "binary file" is a sequence of bytes, which output.dat already is. If the Head and Tail classes are data structures that can easily be represented as sequences of bytes, I could guess you want to do some sort of custom serialization. But again, you are the one who can answer...
Waiting 4 your reply.
Thanx a lot CPOLIZZI and JUMMOMA1 for helping me out.
I want the binary ASCII representation. Meaning all the serialized objects and other data should be converted in ONEs and ZEROs. That means char(10).
I think the code given by CPOLIZZI will work. I haven't tried it yet. cpolizzi, can you please explain the following things in the code you wrote?
1) the size of the byte array buffer is 4096. Why is it so ?
2) i couldn't understand this line of the code
for (int mask = 128; mask > 0; mask /= 2)
From where are you getting these numbers 128 and 2 ? why are you dividing mask by 2 ?
There is one more question........
Once I convert the file into its binary ASCII representation using the code given by CPOLIZZI, how can I get the original file back? That code converts the entire contents of the file into the chars '1' and '0', so how do I get the data of the file back in its original form, so that I can deserialize it using readObject() whenever required?
Thanx guys
Unmesh
Now that it is clear what you wanted, take a look at this:
import java.io.*;

class BinFile {
    static int BUFFER_LEN=8192; //or whatever you think is better
                                // depending on the size of your file

    static void toBinary(String srcPath,String targetPath ) throws IOException {
        byte[] buffer=new byte[BUFFER_LEN];
        FileInputStream in=new FileInputStream(srcPath);
        BufferedWriter out=new BufferedWriter(new FileWriter(targetPath));
        for(int nbytes; (nbytes=in.read(buffer))!=-1;) {
            for(int j=0; j<nbytes; j++) {
                byte thisByte=buffer[j];
                for(int mask=128; mask!=0; mask>>>=1) { //mask goes from %10000000 to %00000001
                    out.write((thisByte & mask)==mask?'1':'0');
                }
            }
        }
        out.flush();
        out.close();
        in.close();
    }

    static void fromBinary(String srcPath,String targetPath ) throws IOException {
        char[] chars8=new char[8];
        BufferedReader in=new BufferedReader(new FileReader(srcPath));
        BufferedOutputStream out=new BufferedOutputStream(new FileOutputStream(targetPath));
        for(int nchars=0; (nchars=in.read(chars8))==8;) {
            byte thisByte=0;
            for(int j=8; j-->0; ) {
                if(chars8[j]=='1') thisByte|=128>>j; //128>>j goes from %10000000 to %00000001
            }
            out.write(thisByte);
        }
        out.flush();
        out.close();
        in.close();
    }

    public static void main(String args[]) throws Exception {
        if("-b".equals(args[0])) {
            toBinary(args[1],args[2]);
        } else fromBinary(args[0],args[1]);
    }
}
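To check the idea without touching any files, the same encode/decode logic can be sketched in memory. This is only an illustration: the class BitRoundTrip and its methods toBits and fromBits are hypothetical names, not part of the BinFile class above, and the sketch assumes the whole input fits in memory.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of BinFile.
class BitRoundTrip {
    // Encode each byte as eight '0'/'1' characters, most significant bit first.
    static String toBits(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 8);
        for (byte b : data) {
            for (int mask = 128; mask != 0; mask >>>= 1) {
                sb.append((b & mask) != 0 ? '1' : '0');
            }
        }
        return sb.toString();
    }

    // Decode a string of '0'/'1' characters back into the original bytes.
    static byte[] fromBits(String bits) {
        if (bits.length() % 8 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 8");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream(bits.length() / 8);
        for (int i = 0; i < bits.length(); i += 8) {
            int value = 0;
            for (int j = 0; j < 8; j++) {
                // shift the bits collected so far left and append the next one
                value = (value << 1) | (bits.charAt(i + j) == '1' ? 1 : 0);
            }
            out.write(value); // writes the low 8 bits
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = {(byte) 0xB5, 0x00, (byte) 0xFF, 0x41};
        String bits = toBits(original);
        System.out.println(bits);
        System.out.println(Arrays.equals(original, fromBits(bits))); // prints true
    }
}
```

Since the decoded bytes are identical to the original ones, a file restored this way should be readable with readObject() exactly as before the conversion.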
> 1) the size of the byte array buffer is 4096. Why is it so ?
It is an arbitrary value, plucked almost out of thin air. It can really be any value that you want, but since computers are binary machines, a power of two is very desirable (in this case, 2^12). Secondly, as the buffer size increases you eventually run into diminishing returns (the larger the buffer, the more memory and management it costs, for a smaller and smaller gain in time). To best see this, take a large file, start out with a buffer size of 1 byte and time how long it takes to process the entire file. Do this again but double the buffer size, now 2. It should take roughly half as long to process the entire file. Repeat this process until you see the processing time come close to hitting a plateau. Most implementations that use buffers tend to pick 4K or 8K simply because that seems to give the best payoff for the smallest price.
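The experiment described above could be sketched roughly like this. The class name BufferTiming, the method drain and the upper limit of 65536 are my own choices for illustration, the file to read is assumed to be passed as the first command-line argument, and the timings will vary a lot with the operating system's file cache, so treat the numbers as a rough guide only.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical timing harness for illustration.
class BufferTiming {
    // Read the whole stream using a buffer of the given size; return the byte count.
    static long drain(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Double the buffer size each round: 1, 2, 4, ... 65536 bytes.
        for (int size = 1; size <= 65536; size *= 2) {
            try (InputStream in = new FileInputStream(args[0])) {
                long start = System.nanoTime();
                long total = drain(in, size);
                long micros = (System.nanoTime() - start) / 1000;
                System.out.println(size + " byte buffer: " + total
                        + " bytes read in " + micros + " us");
            }
        }
    }
}
```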
> 2) i couldn't understand this line of the code
> for (int mask = 128; mask > 0; mask /= 2)
> From where are you getting these numbers 128 and 2 ? why are you dividing mask by 2 ?
A byte consists of 8 bits (to be more precise, let us call this an octet). Each bit has a value - basic number system thing. So, for an octet, the bit values are (most significant bit first):
  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0   bit
128 | 64 | 32 | 16 |  8 |  4 |  2 |  1   decimal bit value
 80 | 40 | 20 | 10 | 08 | 04 | 02 | 01   hex bit value
To liken this to the decimal (base 10) system: there, column 0 is the 1's column, column 1 is the 10's column, column 2 is the 100's column and so on. The binary system (base 2) works the same way mathematically. For any number system, the value of a given "column" generalizes like this:
value := b^i
where:
b is the base number system in question
i is the column in question
So, to find the complete value of any number in any number system:
value := d_i*b^i + d_(i-1)*b^(i-1) + d_(i-2)*b^(i-2) + ... + d_1*b^1 + d_0*b^0
where:
b is the base number system in question
i is the leftmost (most significant) column of the number
d_k is the value of the digit in column k
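As a worked example of the formula, here is a small sketch that sums digit times place value for any base. The class PlaceValue and the method valueOf are names made up for this illustration; digits are assumed to be given most significant first.

```java
// Hypothetical helper for illustration.
class PlaceValue {
    // digits[0] is the most significant digit; base is b in the formula.
    static long valueOf(int[] digits, int base) {
        long value = 0;
        long weight = 1; // b^0, the place value of the rightmost column
        for (int i = digits.length - 1; i >= 0; i--) {
            value += digits[i] * weight; // d * b^column
            weight *= base;              // move one column to the left
        }
        return value;
    }

    public static void main(String[] args) {
        // The same number written in base 2, base 10 and base 16.
        System.out.println(valueOf(new int[] {1, 0, 1, 1, 0, 1, 0, 1}, 2)); // prints 181
        System.out.println(valueOf(new int[] {1, 8, 1}, 10));               // prints 181
        System.out.println(valueOf(new int[] {0xB, 0x5}, 16));              // prints 181
    }
}
```

For the binary case this is 128 + 32 + 16 + 4 + 1 = 181, which is exactly the set of mask values whose bits are set in the byte 10110101.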
I should also note that jummoMa1's approach of using bit shifting is more efficient.
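For what it's worth, the two loops walk through exactly the same mask values, so they are interchangeable as far as the result goes. The sketch below (MaskWalk is a made-up class name) just collects both sequences for comparison; whether the shift is measurably faster will depend on the JVM, so I would not count on a big difference.

```java
import java.util.Arrays;

// Hypothetical comparison for illustration.
class MaskWalk {
    // Masks produced by: for (int mask = 128; mask > 0; mask /= 2)
    static int[] byDivision() {
        int[] masks = new int[8];
        int k = 0;
        for (int mask = 128; mask > 0; mask /= 2) {
            masks[k++] = mask;
        }
        return masks;
    }

    // Masks produced by: for (int mask = 128; mask != 0; mask >>>= 1)
    static int[] byShift() {
        int[] masks = new int[8];
        int k = 0;
        for (int mask = 128; mask != 0; mask >>>= 1) {
            masks[k++] = mask;
        }
        return masks;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(byDivision())); // [128, 64, 32, 16, 8, 4, 2, 1]
        System.out.println(Arrays.toString(byShift()));    // [128, 64, 32, 16, 8, 4, 2, 1]
    }
}
```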
Thanx a lot cpolizzi and jummoMa1... My problem is solved. Thank you once again.
Regards
Unmesh