FileReader

Hi,

I am using a FileReader to read a line from a file. This works properly, but the problem arises when I read a line that contains diacritics.

When I get the line and display it (in a JSP), the diacritic characters change to junk values.

Kindly let me know how to set the encoding (windows-1250 or UTF-16) in the FileReader.

If I hard-code the diacritics in my JSP, they are displayed properly, so I guess the problem occurs while reading the file through FileReader.

I appreciate your help.

Thanks & regards,

Ashok

[611 byte] By [mashokforum] at [2007-9-22]
# 1

You can't set the encoding with FileReader; you need to use a FileInputStream and an InputStreamReader. Like this:

InputStream inStream = new FileInputStream("filename");
// the second parameter is the name of the encoding
Reader inReader = new InputStreamReader(inStream, "CP1250");
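For completeness, here is a slightly fuller sketch of the same approach that reads every line of a file with an explicit charset. The class name EncodedFileRead and the method readLines are made up for illustration, and it assumes the file really is saved as windows-1250. It uses try-with-resources, so it needs Java 7 or later. (Since Java 11, FileReader also has constructors that accept a Charset, but the approach below works on older versions too.)

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class EncodedFileRead {

    // Hypothetical helper: reads all lines of a file, decoding its bytes with
    // the given charset rather than leaving the choice of charset implicit.
    public static List<String> readLines(String path, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EncodedFileRead <file>");
            return;
        }
        // Assumption: the file named by args[0] is encoded as windows-1250.
        Charset cp1250 = Charset.forName("windows-1250");
        for (String line : readLines(args[0], cp1250)) {
            System.out.println(line);
        }
    }
}
```

Note that printing to System.out still goes through the console's own encoding, so the console output alone does not prove whether the file was decoded correctly.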

jsalonen at 2007-7-2 > top of java,Archived Forums,Java Programming [Archive]...
# 2

Yes, I did this too, but some characters are still being changed to ? (question marks),

especially characters with a hook (caron) over them, such as r, s, e, etc.

My code is:

InputStream inStream = new FileInputStream(Filename);

InputStreamReader inreader = new InputStreamReader(inStream, "Cp1250");

BufferedReader reader = new BufferedReader(inreader);

I tried it with both Cp1250 and windows-1250.

Could this have anything to do with BufferedReader's readLine method?

mashokforum at 2007-7-2
# 3

Where do you observe characters changed to question marks? There are three common sources of problems:

1) The file is read using the wrong encoding

2) There are literal strings in the source code file and the code is compiled with the wrong encoding

3) The output is written or displayed using an encoding that doesn't support the characters that you are supposed to output.

Apparently 1) and 2) are not an issue anymore but 3) is still possible. If you are writing to System.out, remember that it will use the platform's default encoding. The MSDOS console on Windows has quirks of its own (in particular it uses a different character encoding than the rest of Windows) so I wouldn't rely on it being able to show your characters correctly.

The readLine method is not the problem; it only operates on Unicode characters that have already been (hopefully correctly) decoded from bytes.
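One way to tell cause 1) from cause 3) is to look at the decoded string without relying on any display encoding. The sketch below is only an illustration: the class EncodingDiagnosis and the helper toEscapes are invented names, and the three sample bytes stand in for whatever is in your file. It prints each character as an ASCII escape, then shows how encoding the same string with a charset that lacks those characters produces question marks.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDiagnosis {

    // Hypothetical helper: renders every char as a backslash-u hex escape, so
    // the result is pure ASCII and survives any console or page encoding.
    public static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset cp1250 = Charset.forName("windows-1250");
        // Sample data: in windows-1250 these bytes are r, s and e with a caron.
        byte[] fileBytes = {(byte) 0xF8, (byte) 0x9A, (byte) 0xEC};

        // Step 1, decoding: a correct decode yields U+0159, U+0161, U+011B.
        String decoded = new String(fileBytes, cp1250);
        System.out.println(toEscapes(decoded));

        // Step 2, output: ISO-8859-1 contains none of these characters, so
        // the encoder substitutes '?' (byte 0x3F) for each one.
        byte[] latin1 = decoded.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(toEscapes(new String(latin1, StandardCharsets.ISO_8859_1)));

        // Encoding with windows-1250 instead reproduces the original bytes.
        System.out.println(Arrays.equals(fileBytes, decoded.getBytes(cp1250)));
    }
}
```

If the escapes printed in step 1 are the expected code points, the file was read correctly and the question marks are being introduced later, when the text is written out.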

jsalonen at 2007-7-2
# 4

Basically, I am not displaying it in the system console; I am using a JSP to display the characters that were read.

The JSP also has its charset set to windows-1250.

Moreover, if I hard-code the characters in the JSP and then display it, it works properly. The characters are changed to question marks only when I read them from a file.

So I guess this is due only to InputStreamReader.

Are there any suggestions, or any other method to get rid of this?

I appreciate your help.

mashokforum at 2007-7-2
# 5

I just tested: InputStreamReader reads and decodes CP1250 characters correctly in a standalone Java application on my system.

import java.io.*;

class CP1250Test {
    public static void main(String[] args) throws IOException {
        // make an array holding the byte values 128-255
        byte[] data = new byte[128];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i + 128);
        }

        // String s = new String(data, "cp1250"); would give the same result as below
        ByteArrayInputStream bis = new ByteArrayInputStream(data);
        InputStreamReader isr = new InputStreamReader(bis, "cp1250");
        BufferedReader br = new BufferedReader(isr);
        String s = br.readLine();

        // System.out.println(s); works only if your terminal supports these characters
        javax.swing.JOptionPane.showMessageDialog(null, s);
        System.exit(0);
    }
}

The fact that the bytes are read from an array rather than a file should not make a difference. Does this display the proper characters or only question marks?
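If you want to rule out the file path as well, a variation along these lines writes the same bytes to a temporary file and reads them back through FileInputStream and InputStreamReader. This is only a sketch: CP1250FileTest, highBytes and readBackFromFile are names I made up, and it needs Java 7 or later for java.nio.file and try-with-resources.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class CP1250FileTest {

    // The byte values 128-255, the same data as in the in-memory test above.
    public static byte[] highBytes() {
        byte[] data = new byte[128];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i + 128);
        }
        return data;
    }

    // Hypothetical helper: writes the bytes to a temporary file, reads them
    // back through FileInputStream + InputStreamReader, and returns the
    // first decoded line. The temporary file is deleted afterwards.
    public static String readBackFromFile(byte[] data, Charset charset) throws IOException {
        Path tmp = Files.createTempFile("cp1250test", ".txt");
        try {
            Files.write(tmp, data);
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(new FileInputStream(tmp.toFile()), charset))) {
                return br.readLine();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Charset cp1250 = Charset.forName("windows-1250");
        byte[] data = highBytes();
        String fromArray = new String(data, cp1250);
        String fromFile = readBackFromFile(data, cp1250);
        // Compare the strings directly instead of trusting how they are displayed.
        System.out.println("same result: " + fromArray.equals(fromFile));
    }
}
```

Comparing the two strings in code avoids depending on the console or a dialog to judge whether the characters came through intact.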

jsalonen at 2007-7-2