TDS 7.0 for Nonwestern Languages

TDS 7.0 uses 2-byte Unicode (UCS-2) to transfer all textual data between servers and clients. By default, FreeTDS converts this data to a single-byte (8-bit) representation by stripping the high-order byte of each character. This is generally sufficient for Western languages such as English, but produces garbage for other languages.
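The effect of stripping the high-order byte can be sketched in a few lines of Python. This is an illustration only, not FreeTDS code: the helper name strip_high_bytes is hypothetical, and it assumes little-endian UCS-2, which Python's utf-16-le codec matches for characters in the Basic Multilingual Plane.

```python
def strip_high_bytes(text: str) -> bytes:
    """Encode text as little-endian UCS-2 and keep only the low byte of each unit.

    Hypothetical helper for illustration; not part of FreeTDS.
    """
    ucs2 = text.encode("utf-16-le")  # identical to UCS-2 for BMP characters
    return ucs2[0::2]                # little-endian: the low byte comes first


english = strip_high_bytes("Hello")
greek = strip_high_bytes("\u03b1\u03b2\u03b3")  # Greek alpha, beta, gamma

print(english)  # the ASCII letters survive unchanged
print(greek)    # the 0x03 high byte is lost, leaving unrelated 8-bit values
```

The English string comes back intact because its high-order bytes are all zero. The Greek letters U+03B1 to U+03B3 collapse to the bytes 0xB1 to 0xB3, which no longer identify the original characters.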

Because most Unix tools and languages do not support UCS-2, FreeTDS allows the client to convert to other character sets using the iconv standard. Background information on Unicode and how it affects FreeTDS can be found in the appendix.

To list all supported iconv character sets under Linux, use the iconv command.

iconv --list

For other systems, consult your documentation (most likely man iconv will give you some hints).

In this example a server named mssql will return data encoded in the GREEK character set.

Example 4-2. Configuring for GREEK freetds.conf setting

[mssql]
	host = ntbox.mydomain.com
	port = 1433
	tds version = 7.0
	client charset = GREEK

If iconv runs into a character it cannot convert, it will replace that character with a '?' character. Always ensure that the data contained in the database is representable in the chosen character set.
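The replacement behavior can be approximated with Python's own codecs, as a rough sketch. It does not call iconv or FreeTDS. It assumes that GREEK corresponds to ISO-8859-7 (Python codec name iso8859_7), and the helper name to_client_charset is hypothetical.

```python
def to_client_charset(text: str, charset: str) -> bytes:
    """Convert text to a single-byte charset, substituting '?' where needed.

    Hypothetical helper; mimics the '?' substitution described above
    using Python's errors="replace" handler rather than iconv itself.
    """
    return text.encode(charset, errors="replace")


# Greek "alpha beta gamma", a space, then two Japanese characters
mixed = "\u03b1\u03b2\u03b3 \u65e5\u672c"
converted = to_client_charset(mixed, "iso8859_7")

# The Greek letters survive; each Japanese character becomes '?'
print(converted.decode("iso8859_7"))
```

Once a character has been replaced with '?', the original value cannot be recovered on the client, which is why the data should be checked against the chosen character set beforehand.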

If you have a mix of character data that cannot be contained in a single-byte character set, you may wish to use UTF-8. UTF-8 is a variable-length Unicode encoding that is compatible with ASCII in the range 0 to 127. With UTF-8, you are guaranteed to never have an unconvertible character.
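Both properties of UTF-8 mentioned above can be checked with a short Python sketch. The sample strings are arbitrary illustrations, and the conversion is done by Python's codecs rather than by iconv.

```python
# ASCII text encodes to the same bytes in UTF-8 as in ASCII
ascii_text = "tds version = 7.0"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Mixed English, Greek, and Japanese text in one string
mixed = "English, \u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac, \u65e5\u672c\u8a9e"
encoded = mixed.encode("utf-8")        # strict mode: would raise on an unconvertible character
round_trip = encoded.decode("utf-8")

print(same_bytes)           # ASCII compatibility
print(round_trip == mixed)  # nothing was lost or replaced
```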

Important: FreeTDS is not fully compatible with multi-byte character sets such as UTF-8 and UCS-2. Take extreme care when testing applications that use these encodings. Specifically, many applications do not expect the returned data to occupy more bytes than the declared column size. On the other hand, support of UTF-8 and UCS-2 is a high priority for the developers. Patches and bug reports in this area are especially welcome.
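The column-size problem is easy to reproduce. The following Python sketch assumes a hypothetical column declared to hold 5 characters; the helper name utf8_overflow is also hypothetical and is not part of any FreeTDS API.

```python
def utf8_overflow(value: str, column_size: int) -> int:
    """Return how many bytes the UTF-8 form of value exceeds column_size by.

    Returns 0 when the encoded value fits. Hypothetical helper for illustration.
    """
    return max(0, len(value.encode("utf-8")) - column_size)


greek = "\u03b1\u03b2\u03b3\u03b4\u03b5"  # five Greek letters

print(len(greek))                  # number of characters
print(len(greek.encode("utf-8")))  # number of bytes after conversion
print(utf8_overflow(greek, 5))     # bytes beyond a 5-byte buffer
```

Each of these Greek letters takes two bytes in UTF-8, so a value that fills a 5-character column needs 10 bytes on the client. An application that sizes its buffers from the column definition alone would be 5 bytes short.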

In the following example, a server named mssql will return data encoded in the UTF-8 character set.

Example 4-3. Configuring for UTF-8 freetds.conf setting

[mssql]
	host = ntbox.mydomain.com
	port = 1433
	tds version = 7.0
	client charset = UTF-8

It is also worth clarifying that TDS 7.0 and above do not accept any specified character set during login, as TDS 4.2 does. TDS 7.0 transmits all data as UCS-2. A character set specified with the charset option in freetds.conf, or by calling DBSETLCHARSET or its equivalents, is ignored under TDS 7.0.