Unicode: Difference between revisions

From Wiki Seb35
(additional note)
m
 
Line 8: Line 8:
 
If 2 code units, first one is in 0xD800-0xDBFF, second one is in 0xDC00-0xDFFF.
 
  
- * First one: 6 fixed bits (0b110110) then 4 bits encoding the Unicode plane (minus one: 1-16 become 0-15) then 6 strong bits inside the plane - note that the plane number is split between the last two bits of the first byte and the first two bits of the second byte
+ * First one: 6 fixed bits (0b110110) then 4 bits encoding the Unicode plane (minus one: 1-16 become 0-15) then 6 first bits from inside the plane - note that the plane number is split between the last two bits of the first byte and the first two bits of the second byte
- * Second one: 6 fixed bits (0b110111) then 10 bits inside the plane
+ * Second one: 6 fixed bits (0b110111) then 10 last bits from inside the plane
  
 
Non-private astral planes 0x10000-0xEFFFF are encoded in UTF-16: [\uD800-\uDAFF\uDB00-\uDB7F][\uDC00-\uDFFF]
 

Current revision as of 12 September 2019 at 00:46


UTF-16

Each code point is encoded as 1 or 2 code units of 16 bits (2 bytes) each.

If 2 code units, first one is in 0xD800-0xDBFF, second one is in 0xDC00-0xDFFF.

  • First one: 6 fixed bits (0b110110), then 4 bits encoding the Unicode plane (minus one: planes 1-16 become 0-15), then the first 6 bits of the 16-bit offset inside the plane. Note that the plane number is split between the last two bits of the first byte and the first two bits of the second byte.
  • Second one: 6 fixed bits (0b110111), then the last 10 bits of the offset inside the plane.
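
The bit layout described in the two items above can be checked with a minimal Python sketch. The function names `to_surrogates` and `from_surrogates` are hypothetical, chosen only for illustration; the bit arithmetic follows the description, assuming the code point lies in planes 1-16.

```python
def to_surrogates(code_point):
    """Split an astral code point (0x10000-0x10FFFF) into two UTF-16 code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not an astral code point")
    plane_minus_one = (code_point >> 16) - 1   # planes 1-16 become 0-15 (4 bits)
    offset = code_point & 0xFFFF               # 16-bit offset inside the plane
    # First unit: 0b110110, 4 plane bits, first 6 bits of the offset
    first = (0b110110 << 10) | (plane_minus_one << 6) | (offset >> 10)
    # Second unit: 0b110111, last 10 bits of the offset
    second = (0b110111 << 10) | (offset & 0x3FF)
    return first, second


def from_surrogates(first, second):
    """Rebuild the code point from a pair of UTF-16 code units."""
    if not (0xD800 <= first <= 0xDBFF and 0xDC00 <= second <= 0xDFFF):
        raise ValueError("not a surrogate pair")
    plane = ((first >> 6) & 0xF) + 1           # undo the "minus one"
    offset = ((first & 0x3F) << 10) | (second & 0x3FF)
    return (plane << 16) | offset
```

For example, `to_surrogates(0x1F600)` yields `(0xD83D, 0xDE00)`, which matches the bytes produced by `chr(0x1F600).encode("utf-16-be")`.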

Non-private astral planes 0x10000-0xEFFFF are encoded in UTF-16: [\uD800-\uDAFF\uDB00-\uDB7F][\uDC00-\uDFFF]

Astral planes are encoded in UTF-16: [\uD800-\uDBFF][\uDC00-\uDFFF]
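
The boundary between the two ranges above can be verified with a short Python sketch. The helper names `first_unit` and `classify_first_unit` are hypothetical; the sketch assumes, as is the case in Unicode, that planes 15 and 16 are the private-use planes, and it relies only on the standard `utf-16-be` codec.

```python
def first_unit(code_point):
    """Return the first UTF-16 code unit of a code point, using the standard codec."""
    encoded = chr(code_point).encode("utf-16-be")
    return int.from_bytes(encoded[:2], "big")


def classify_first_unit(unit):
    """Classify a 16-bit code unit according to the ranges listed above."""
    if 0xD800 <= unit <= 0xDB7F:
        return "non-private astral"    # planes 1-14, 0x10000-0xEFFFF
    if 0xDB80 <= unit <= 0xDBFF:
        return "private astral"        # planes 15-16, 0xF0000-0x10FFFF
    return "not a first surrogate"
```

The last non-private code point, 0xEFFFF, starts with 0xDB7F, and the first private one, 0xF0000, starts with 0xDB80, so the class `[\uD800-\uDAFF\uDB00-\uDB7F]` covers the same units as `[\uD800-\uDB7F]`.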


