
Tamil Script Code for Information Interchange

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by ArmbrustBot (talk | contribs) at 18:28, 5 May 2015 (→‎External links: re-categorisation per CFDS, replaced: Category:Tamil Character Encoding Standards → Category:Tamil character-encoding standards using AWB). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Tamil Script Code for Information Interchange (TSCII) is a coding scheme for representing the Tamil script. The lower 128 codepoints are plain ASCII; the upper 128 codepoints are TSCII-specific. After years of use on the Internet by private agreement only, it was registered with the IANA in 2007.[1]

TSCII encodes the characters in visual (written) order, paralleling the use of the Tamil Typewriter.

Unicode, following ISCII, uses logical-order encoding for Tamil. Thai is the contrasting case: there Unicode adopted the visual-order encoding grandfathered in from TIS-620.
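The difference between visual and logical order can be illustrated with a minimal Python sketch. It assumes that the vowel signs at A6, A7 and A8 (which are written to the left of their consonant) are stored before the consonant byte in TSCII, and it moves them after the consonant to obtain Unicode logical order. The names TSCII_SUBSET, PREFIX_SIGNS and visual_to_logical are hypothetical, and the map holds only a few rows of the code table, so this is not a complete converter.

```python
# Partial byte-to-Unicode map: a few rows taken from the TSCII code table.
# Bytes outside this subset raise KeyError in this sketch.
TSCII_SUBSET = {
    0xA6: "\u0bc6",        # vowel sign E, written to the left of its consonant
    0xA7: "\u0bc7",        # vowel sign EE, written to the left
    0xA8: "\u0bc8",        # vowel sign AI, written to the left
    0xB8: "\u0b95",        # letter KA
    0xC1: "\u0bae",        # letter MA
    0xF5: "\u0bae\u0bcd",  # MA + pulli
}

# Assumption: these signs precede their consonant in the TSCII byte stream.
PREFIX_SIGNS = {0xA6, 0xA7, 0xA8}


def visual_to_logical(data: bytes) -> str:
    """Decode TSCII bytes, moving left-side vowel signs after the consonant."""
    out = []
    pending = ""  # a left-side vowel sign waiting for the next consonant
    for b in data:
        if b in PREFIX_SIGNS:
            out.append(pending)  # flush a stray sign unchanged
            pending = TSCII_SUBSET[b]
        elif b < 0x80:
            out.append(pending + chr(b))  # lower half is plain ASCII
            pending = ""
        else:
            out.append(TSCII_SUBSET[b] + pending)  # consonant first, then sign
            pending = ""
    out.append(pending)
    return "".join(out)


if __name__ == "__main__":
    text = visual_to_logical(bytes([0xA6, 0xB8]))
    print(" ".join(f"U+{ord(ch):04X}" for ch in text))  # U+0B95 U+0BC6
```

The sketch does not treat two-part vowel signs specially; a left-side sign is simply placed after whatever the next upper-half byte decodes to.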

The government of Tamil Nadu endorses its own TAB/TAM standards for 8-bit encoding, and other, older encoding schemes can still be found on the Web.

The free etext collection at Project Madurai uses the TSCII encoding, but has already started to provide Unicode versions.

History

The need for a common encoding for Tamil was felt by members of various mailing-list-based forums in the mid-1990s, as multiple custom-coded fonts were prevalent in those forums. While some of the commercial encodings were more popular than others, they were not accepted by the wider community because of conflicting commercial interests. Although Unicode was accepted by most as the future standard, most desktop systems at that time could not yet handle Unicode for the Tamil language, and an interim 8-bit encoding was required.

A separate mailing list for discussion of such encodings (webmasters@tamil.net) was created in 1997, starting with an email written by Dr. K. Kalyanasundaram to the popular Tamil author Sujatha, who headed the committee for standardization of the Tamil keyboard.[2] This forum quickly attracted enthusiastic participants from across the globe, including several prominent Tamil scholars. Archives of these discussions are maintained by INFITT.[3]

After TSCII was published, most of the members of the webmasters@tamil.net mailing list became part of INFITT, a wider initiative to bring standardization and continued development to various areas of Tamil computing.

Codepage layout

TSCII (codes 80 to FF)

Hex  Dec  Unicode              Character
80   128  0BE6                 ௦
81   129  0BE7                 ௧
82   130  0BB8 0BCD 0BB0 0BC0  ஸ்ரீ
83   131  0B9C                 ஜ
84   132  0BB7                 ஷ
85   133  0BB8                 ஸ
86   134  0BB9                 ஹ
87   135  0B95 0BCD 0BB7       க்ஷ
88   136  0B9C 0BCD            ஜ்
89   137  0BB7 0BCD            ஷ்
8A   138  0BB8 0BCD            ஸ்
8B   139  0BB9 0BCD            ஹ்
8C   140  0B95 0BCD 0BB7 0BCD  க்ஷ்
8D   141  0BE8                 ௨
8E   142  0BE9                 ௩
8F   143  0BEA                 ௪
90   144  0BEB                 ௫
91   145  2018                 ‘
92   146  2019                 ’
93   147  201C                 “
94   148  201D                 ”
95   149  0BEC                 ௬
96   150  0BED                 ௭
97   151  0BEE                 ௮
98   152  0BEF                 ௯
99   153  0B99 0BC1            ஙு
9A   154  0B9E 0BC1            ஞு
9B   155  0B99 0BC2            ஙூ
9C   156  0B9E 0BC2            ஞூ
9D   157  0BF0                 ௰
9E   158  0BF1                 ௱
9F   159  0BF2                 ௲
A0   160  00A0                 NBSP
A1   161  0BBE                 ா
A2   162  0BBF                 ி
A3   163  0BC0                 ீ
A4   164  0BC1                 ு
A5   165  0BC2                 ூ
A6   166  0BC6                 ெ
A7   167  0BC7                 ே
A8   168  0BC8                 ை
A9   169  00A9                 ©
AA   170  0BD7                 ௗ
AB   171  0B85                 அ
AC   172  0B86                 ஆ
AD   173  (unassigned)
AE   174  0B88                 ஈ
AF   175  0B89                 உ
B0   176  0B8A                 ஊ
B1   177  0B8E                 எ
B2   178  0B8F                 ஏ
B3   179  0B90                 ஐ
B4   180  0B92                 ஒ
B5   181  0B93                 ஓ
B6   182  0B94                 ஔ
B7   183  0B83                 ஃ
B8   184  0B95                 க
B9   185  0B99                 ங
BA   186  0B9A                 ச
BB   187  0B9E                 ஞ
BC   188  0B9F                 ட
BD   189  0BA3                 ண
BE   190  0BA4                 த
BF   191  0BA8                 ந
C0   192  0BAA                 ப
C1   193  0BAE                 ம
C2   194  0BAF                 ய
C3   195  0BB0                 ர
C4   196  0BB2                 ல
C5   197  0BB5                 வ
C6   198  0BB4                 ழ
C7   199  0BB3                 ள
C8   200  0BB1                 ற
C9   201  0BA9                 ன
CA   202  0B9F 0BBF            டி
CB   203  0B9F 0BC0            டீ
CC   204  0B95 0BC1            கு
CD   205  0B9A 0BC1            சு
CE   206  0B9F 0BC1            டு
CF   207  0BA3 0BC1            ணு
D0   208  0BA4 0BC1            து
D1   209  0BA8 0BC1            நு
D2   210  0BAA 0BC1            பு
D3   211  0BAE 0BC1            மு
D4   212  0BAF 0BC1            யு
D5   213  0BB0 0BC1            ரு
D6   214  0BB2 0BC1            லு
D7   215  0BB5 0BC1            வு
D8   216  0BB4 0BC1            ழு
D9   217  0BB3 0BC1            ளு
DA   218  0BB1 0BC1            று
DB   219  0BA9 0BC1            னு
DC   220  0B95 0BC2            கூ
DD   221  0B9A 0BC2            சூ
DE   222  0B9F 0BC2            டூ
DF   223  0BA3 0BC2            ணூ
E0   224  0BA4 0BC2            தூ
E1   225  0BA8 0BC2            நூ
E2   226  0BAA 0BC2            பூ
E3   227  0BAE 0BC2            மூ
E4   228  0BAF 0BC2            யூ
E5   229  0BB0 0BC2            ரூ
E6   230  0BB2 0BC2            லூ
E7   231  0BB5 0BC2            வூ
E8   232  0BB4 0BC2            ழூ
E9   233  0BB3 0BC2            ளூ
EA   234  0BB1 0BC2            றூ
EB   235  0BA9 0BC2            னூ
EC   236  0B95 0BCD            க்
ED   237  0B99 0BCD            ங்
EE   238  0B9A 0BCD            ச்
EF   239  0B9E 0BCD            ஞ்
F0   240  0B9F 0BCD            ட்
F1   241  0BA3 0BCD            ண்
F2   242  0BA4 0BCD            த்
F3   243  0BA8 0BCD            ந்
F4   244  0BAA 0BCD            ப்
F5   245  0BAE 0BCD            ம்
F6   246  0BAF 0BCD            ய்
F7   247  0BB0 0BCD            ர்
F8   248  0BB2 0BCD            ல்
F9   249  0BB5 0BCD            வ்
FA   250  0BB4 0BCD            ழ்
FB   251  0BB3 0BCD            ள்
FC   252  0BB1 0BCD            ற்
FD   253  0BA9 0BCD            ன்
FE   254  0B87                 இ
FF   255  (unassigned)

In the table above, code 80 is U+0BE6 TAMIL DIGIT ZERO, which was accepted into Unicode in version 4.1. Code A0 is NO-BREAK SPACE. Codes AD and FF are unassigned.
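As the table shows, a single TSCII byte can correspond to a sequence of several Unicode code points, and two codes have no mapping at all. The following Python sketch illustrates a lookup that handles both cases. The names TSCII_TO_UNICODE, UNASSIGNED, decode_byte and code_points are hypothetical, and only a handful of rows from the table are copied in, so it should be read as an illustration and not as a full decoder.

```python
# Hypothetical partial map: a handful of rows copied from the code table.
# Each value is the Unicode sequence listed for that byte.
TSCII_TO_UNICODE = {
    0x82: "\u0bb8\u0bcd\u0bb0\u0bc0",  # one byte, four code points
    0x87: "\u0b95\u0bcd\u0bb7",        # one byte, three code points
    0xA0: "\u00a0",                    # NO-BREAK SPACE
    0xAB: "\u0b85",                    # letter A
    0xCC: "\u0b95\u0bc1",              # KA + vowel sign U
    0xEC: "\u0b95\u0bcd",              # KA + pulli
}

# The two codes the article lists as unassigned.
UNASSIGNED = {0xAD, 0xFF}


def decode_byte(b: int) -> str:
    """Return the Unicode string for one TSCII byte value."""
    if not 0 <= b <= 0xFF:
        raise ValueError(f"not a byte value: {b}")
    if b < 0x80:
        return chr(b)  # lower half is plain ASCII
    if b in UNASSIGNED:
        raise ValueError(f"byte 0x{b:02X} is unassigned in TSCII")
    return TSCII_TO_UNICODE[b]  # KeyError: row not copied into this sketch


def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    print(code_points(decode_byte(0x82)))  # U+0BB8 U+0BCD U+0BB0 U+0BC0
```

Storing each value as a string, not as a single code point, keeps the one-to-many rows and the one-to-one rows in the same structure.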

Conversion Tools

UTF-8 encoded documents can be converted to TSCII using the GNU iconv tool as follows:

$ iconv -f utf-8 -t tscii hello.utf8 > hello.tscii

Conversion in the other direction, from TSCII to UTF-8, is done by interchanging the encoding names given to the -f and -t flags.
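The same iconv invocation can be driven from a script. The Python sketch below builds the argument list for either direction and, optionally, runs it. The helper names iconv_argv and convert are hypothetical, and the sketch assumes an iconv binary with TSCII support is available on the PATH.

```python
import subprocess


def iconv_argv(path: str, from_enc: str, to_enc: str) -> list:
    """Build the iconv command line: -f is the source encoding, -t the target."""
    return ["iconv", "-f", from_enc, "-t", to_enc, path]


def convert(path: str, from_enc: str, to_enc: str) -> bytes:
    """Run iconv and return the converted bytes (assumes iconv is installed)."""
    result = subprocess.run(
        iconv_argv(path, from_enc, to_enc),
        check=True,            # raise CalledProcessError if iconv fails
        stdout=subprocess.PIPE,
    )
    return result.stdout


if __name__ == "__main__":
    # UTF-8 to TSCII, then the reverse direction with the encodings swapped.
    print(iconv_argv("hello.utf8", "utf-8", "tscii"))
    print(iconv_argv("hello.tscii", "tscii", "utf-8"))
```

Only the order of the two encoding names changes between the two directions; the rest of the command stays the same.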

References