Linux vs Windows Character Set Encoding Question

Friday, March 7, 2014

This is something of a technical question that most readers will read and go, "Huh?" That's okay.

There are multiple character set encodings available for transferring information. In this case, the database is Informix, and we are using the IBM JDBC driver to exchange information between the Java middleware and Informix. The middleware runs on Tomcat 7.

The problem is that Microsoft Word uses the extended ASCII character set (byte values above 127, as defined by Windows code page CP1252) to represent the em dash, the en dash, "smart quotes," and "smart apostrophes." With CP1252 encoding and Tomcat 7 running on Windows, these values are stored correctly in BLOBs on Informix. With the same encoding on Linux, they seem to be turned into something unrecognizable: what used to be a single character comes out as several fairly random characters in the extended character set.
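One way to reproduce the "one character becomes several" symptom outside Tomcat is the small Java sketch below. It rests on an assumption that is not confirmed for this system: that somewhere on the Linux side the text is encoded as UTF-8 and then read back as CP1252. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the middleware described in this post.
public class MojibakeDemo {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Decode raw bytes the way a CP1252-aware reader would.
    static String decodeCp1252(byte[] raw) {
        return new String(raw, CP1252);
    }

    // Simulate the suspected mismatch: text is written as UTF-8 bytes,
    // then those bytes are interpreted as CP1252 (an assumption, see above).
    static String utf8ReadAsCp1252(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, CP1252);
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0x97 };          // the em dash byte in CP1252
        String dash = decodeCp1252(raw);       // becomes U+2014 inside Java
        String garbled = utf8ReadAsCp1252(dash);

        // The JVM default differs between platforms and is worth checking first.
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("CP1252 bytes for em dash: " + dash.getBytes(CP1252).length);
        System.out.println("UTF-8 bytes for em dash:  " + dash.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Characters after the mismatch: " + garbled.length());
    }
}
```

If the garbled BLOB contents show three characters where each em dash used to be, that would be consistent with this kind of mismatch, though it does not prove where in the chain it happens.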

Questions:

1. Is this difference between Linux and Windows behavior because CP1252 encoding is not properly supported by Linux?

2. Is there an encoding character set that will allow the extended ASCII character set to be stored correctly in the Informix database?
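For the second question, one thing worth trying is to stop relying on the platform default and name the charset explicitly wherever text is turned into bytes for the BLOB and back. The sketch below assumes the middleware does that conversion itself; `BlobText` and its methods are hypothetical helpers, and the JDBC calls that actually write the BLOB are left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the real middleware code is not shown in this post.
public class BlobText {
    // Encode text with an explicitly named charset, never the JVM default.
    static byte[] toBlobBytes(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Decode with the same charset that was used when writing.
    static String fromBlobBytes(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Smart quotes, an en dash, and an em dash, written as Unicode escapes.
        String text = "\u201CSmart quotes\u201D \u2013 and \u2014 dashes";
        Charset[] candidates = { Charset.forName("windows-1252"), StandardCharsets.UTF_8 };

        for (Charset cs : candidates) {
            byte[] stored = toBlobBytes(text, cs);
            String readBack = fromBlobBytes(stored, cs);
            System.out.println(cs.name() + ": " + stored.length
                    + " bytes, round trip intact = " + text.equals(readBack));
        }
    }
}
```

Either charset round-trips these characters as long as the writer and the reader agree on it. The byte counts differ, so existing BLOBs written as CP1252 would still need to be read as CP1252.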
