Encoding Schemes
The mechanism of converting data into an equivalent cipher using a specific code is called
encoding. The code value assigned to a key is the same for all keyboards. This is possible
because of standard encoding schemes, in which each letter, numeral and symbol is assigned
a unique code. E.g., when the key ‘A’ is pressed, it is internally mapped to the decimal
value 65 (its code value), which is then converted to its equivalent binary value for the
computer to understand. Similarly, when we press the letter ‘अ’ on a Hindi keyboard, it is
internally mapped to the hexadecimal value 0905, whose binary equivalent is 0000100100000101.
Some of the well-known encoding schemes are described in the following sections.
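The mapping from a character to its code value and binary form can be checked with a short Python sketch. The helper name describe is hypothetical and used only for illustration; ord and format are Python built-ins.

```python
def describe(ch: str) -> dict:
    """Return the code value of one character in decimal, hex and binary."""
    code = ord(ch)  # the character's code value
    return {
        "char": ch,
        "decimal": code,
        "hex": format(code, "04X"),      # four hexadecimal digits
        "binary": format(code, "016b"),  # sixteen binary digits
    }

for ch in ("A", "अ"):
    print(describe(ch))
```

For ‘A’ the sketch reports the decimal value 65, and for ‘अ’ it reports the hexadecimal value 0905 and the binary value 0000100100000101, matching the values quoted above.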
2.1 American Standard Code for Information Interchange (ASCII) encoding scheme
ASCII was developed for standardising the character representation. ASCII is still the
most commonly used coding scheme. Initially, ASCII used 7 bits to represent characters.
Therefore, the total number of different characters that can be encoded by the 7-bit ASCII
code is 2^7 = 128. The table below shows some printable characters for ASCII code. However,
ASCII is able to encode the character set of the English language only.

Character   Decimal Value     Character   Decimal Value     Character   Decimal Value
Space       32                @           64                `           96
!           33                A           65                a           97
"           34                B           66                b           98
#           35                C           67                c           99
$           36                D           68                d           100
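The 7-bit arithmetic and a few of the ASCII code values can be reproduced with a short Python sketch. The names ASCII_SIZE, ascii_rows and is_ascii_encodable are hypothetical and chosen only for this illustration; chr and str.encode are standard Python.

```python
# Number of distinct codes available with 7 bits.
ASCII_SIZE = 2 ** 7

def ascii_rows(start: int, count: int) -> list:
    """Return (decimal value, character) pairs for consecutive ASCII codes."""
    return [(code, chr(code)) for code in range(start, start + count)]

def is_ascii_encodable(text: str) -> bool:
    """Return True if every character of text has a 7-bit ASCII code."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

print(ASCII_SIZE)                    # number of 7-bit codes
print(ascii_rows(64, 5))             # codes 64 to 68: @, A, B, C, D
print(is_ascii_encodable("Hello"))   # English text
print(is_ascii_encodable("नमस्ते"))    # Devanagari text
```

The last call returns False because Devanagari characters have no ASCII code, which illustrates why ASCII alone cannot represent other languages.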
2.2 Indian Script Code for Information Interchange (ISCII)
In order to facilitate the use of Indian languages on computers, a common standard for
coding Indian scripts called ISCII was developed in India during the mid-1980s. It is an
8-bit code representation for Indian languages, which means it can represent 2^8 = 256
characters. It retains all 128 ASCII codes and uses the remaining 128 codes for the
additional Indian language character set. Additional codes have been assigned in the upper
region (160–255) for the ‘aksharas’ of the language.
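The way an 8-bit code space is divided can be sketched in Python. This is only an illustration of the ranges stated above: it does not look up actual ISCII characters, and the names ISCII_SIZE, ASCII_RANGE, UPPER_REGION and region are hypothetical.

```python
ISCII_SIZE = 2 ** 8             # 256 codes in an 8-bit scheme
ASCII_RANGE = range(0, 128)     # codes retained from ASCII
UPPER_REGION = range(160, 256)  # region the text assigns to the aksharas

def region(code: int) -> str:
    """Classify an 8-bit code value by the range it falls in."""
    if code not in range(ISCII_SIZE):
        raise ValueError("an 8-bit code must lie between 0 and 255")
    if code in ASCII_RANGE:
        return "ASCII"
    if code in UPPER_REGION:
        return "upper region"
    return "other extended code"

print(ISCII_SIZE - len(ASCII_RANGE))  # codes left over after ASCII
print(region(65))                     # the code of 'A' stays in the ASCII range
print(region(200))                    # falls in the upper region
```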
2.3 Unicode
There were many encoding schemes for the character sets of different languages. But they
were not able to communicate with each other, as each of them represented characters in
its own way. Hence, text created using one encoding scheme was not recognised by another
machine using a different encoding scheme. Therefore, a standard called UNICODE has been
developed to incorporate all the characters of every written language of the world.
UNICODE provides a unique number for every character, irrespective of device (server,
desktop, mobile), operating system (Linux, Windows, iOS) or software application
(different browsers, text editors, etc.). Commonly used UNICODE encodings are UTF-8,
UTF-16 and UTF-32. UNICODE is a superset of ASCII, and the values 0–127 represent the same
characters as in ASCII. Unicode characters for the Devanagari script are shown in the table
below. Each cell of the table contains a character along with its equivalent hexadecimal
value.

अ     आ     इ     ई     उ     ऊ     ऋ     ऌ     ऍ     ऎ     ए
0905  0906  0907  0908  0909  090A  090B  090C  090D  090E  090F

क     ख     ग     घ     ङ     च     छ     ज     झ     ञ     ट
0915  0916  0917  0918  0919  091A  091B  091C  091D  091E  091F

थ     द     ध     न     ऩ     प     फ     ब     भ     म     य
0925  0926  0927  0928  0929  092A  092B  092C  092D  092E  092F

व     श     ष     स     ह     ◌ऺ     ◌ऻ     ◌़     ऽ     ◌ा     ◌ि
0935  0936  0937  0938  0939  093A  093B  093C  093D  093E  093F

◌ॅ     ◌ॆ     ◌े     ◌ै     ◌ॉ     ◌ॊ     ◌ो     ◌ौ     ◌्     ◌ॎ     ◌ॏ
0945  0946  0947  0948  0949  094A  094B  094C  094D  094E  094F

◌ॕ     ◌ॖ     ◌ॗ     क़     ख़     ग़     ज़     ड़     ढ़     फ़     य़
0955  0956  0957  0958  0959  095A  095B  095C  095D  095E  095F

॥     ०     १     २     ३     ४     ५     ६     ७     ८     ९
0965  0966  0967  0968  0969  096A  096B  096C  096D  096E  096F

ॵ     ॶ     ॷ     ॸ     ॹ     ॺ     ॻ     ॼ     ॽ     ॾ     ॿ
0975  0976  0977  0978  0979  097A  097B  097C  097D  097E  097F
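How one Unicode number is stored under UTF-8, UTF-16 and UTF-32 can be seen in a short Python sketch. The helper name encoded_sizes is hypothetical. The big-endian codec names "utf-16-be" and "utf-32-be" are used here as an assumption of convenience, so that no byte-order mark is added to the byte counts.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the code point of ch and its size in bytes in each UTF encoding."""
    return {
        "code point": format(ord(ch), "04X"),
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-be")),
        "UTF-32": len(ch.encode("utf-32-be")),
    }

print(encoded_sizes("A"))   # an ASCII character
print(encoded_sizes("अ"))   # a Devanagari character

# For ASCII characters, the UTF-8 bytes are identical to the ASCII bytes.
print("A".encode("utf-8") == "A".encode("ascii"))

# A hexadecimal value from the Devanagari table maps back to its character.
print(chr(0x0915))
```

In this sketch ‘A’ takes 1, 2 and 4 bytes in UTF-8, UTF-16 and UTF-32 respectively, while ‘अ’ takes 3, 2 and 4 bytes. The code point itself is the same in all three encodings.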