@@ -920,14 +920,16 @@ A verbatim string literal consists of an `@` character followed by a double-quo
920920
921921In a verbatim string literal, the characters between the delimiters are interpreted verbatim, with the only exception being a *Quote_Escape_Sequence*, which represents one double-quote character. In particular, simple escape sequences, and hexadecimal and Unicode escape sequences are not processed in verbatim string literals. A verbatim string literal may span multiple lines.
922922
923+ All string literal forms may optionally have a trailing *Utf8_Suffix*. The representation of each form is discussed below.
924+
923925```ANTLR
924926String_Literal
925927 : Regular_String_Literal
926928 | Verbatim_String_Literal
927929 ;
928930
929931fragment Regular_String_Literal
930- : '"' Regular_String_Literal_Character * '"'
932+ : '"' Regular_String_Literal_Character * '"' Utf8_Suffix ?
931933 ;
932934
933935fragment Regular_String_Literal_Character
@@ -943,7 +945,7 @@ fragment Single_Regular_String_Literal_Character
943945 ;
944946
945947fragment Verbatim_String_Literal
946- : '@"' Verbatim_String_Literal_Character * '"'
948+ : '@"' Verbatim_String_Literal_Character * '"' Utf8_Suffix ?
947949 ;
948950
949951fragment Verbatim_String_Literal_Character
@@ -958,6 +960,10 @@ fragment Single_Verbatim_String_Literal_Character
958960fragment Quote_Escape_Sequence
959961 : '""'
960962 ;
963+
964+ fragment Utf8_Suffix
965+ : 'u8' | 'U8'
966+ ;
961967```
962968
963969> *Example*: The example
@@ -990,7 +996,24 @@ fragment Quote_Escape_Sequence
990996<!-- markdownlint-enable MD028 -->
991997> *Note*: Since a hexadecimal escape sequence can have a variable number of hex digits, the string literal `"\x123"` contains a single character with hex value `123`. To create a string containing the character with hex value `12` followed by the character `3`, one could write `"\x00123"` or `"\x12"` + `"3"` instead. *end note*
992998
993- The type of a * String_Literal * is `string `.
999+ A *String_Literal* that does not contain a *Utf8_Suffix* is a ***UTF-16 string literal***, whose type is `string`.
1000+
1001+ A *String_Literal* that contains a *Utf8_Suffix* is a ***UTF-8 string literal***, whose type is `System.ReadOnlySpan<byte>` (an indexable collection type), and whose value contains a UTF-8 byte representation of the string. A null terminator (a byte with value zero) is placed beyond the last byte in memory (and outside the length of the `ReadOnlySpan<byte>`) in order to support scenarios that expect null-terminated byte strings. A UTF-8 string literal is not a constant. A UTF-8 string literal without its *Utf8_Suffix* shall be valid UTF-16. (For example, `"\uDC00\uDD00"u8` is ill-formed as one low surrogate cannot be followed by another.)
1002+
1003+ > *Note*: While every UTF-8 string literal is a `ReadOnlySpan<byte>`, not every `ReadOnlySpan<byte>` represents a UTF-8 string literal. See the description of UTF-8 string concatenation in [§12.13.5](expressions.md#12135-addition-operator). *end note*
1004+ <!-- markdownlint-disable MD028 -->
1005+
1006+ <!-- markdownlint-enable MD028 -->
1007+ > *Note*: As `ReadOnlySpan<byte>` is a ref struct type, a UTF-8 string literal cannot be converted to `object` or used as a type parameter ([§16.2.3](structs.md#1623-ref-modifier)). *end note*
1008+ <!-- markdownlint-disable MD028 -->
1009+
1010+ <!-- markdownlint-enable MD028 -->
1011+ > *Example*: Here are examples of each form of string literal:
1012+ > | **Encoding** | **Type** | **Regular String Literal** | **Verbatim String Literal** | **Raw String Literal** |
1013+ > | ------------ | -------------------- | ---------- | ----------- | ------------- |
1014+ > | UTF-16 | `string` | `"Hello"` | `@"Hello"` | `"""Hello"""` |
1015+ > | UTF-8 | `ReadOnlySpan<byte>` | `"Hello"u8` | `@"Hello"u8` | `"""Hello"""u8` |
1016+ > *end example*
9941017
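The difference between the two literal kinds can be sketched as follows. This is an illustrative example only, not part of the proposed specification text; it assumes a compiler that accepts the `u8` suffix, and the class and variable names are hypothetical.

```csharp
using System;

class Utf8LiteralDemo
{
    static void Main()
    {
        // UTF-16 string literal: the type is string.
        string utf16 = "Hello";

        // UTF-8 string literal: the type is ReadOnlySpan<byte>.
        ReadOnlySpan<byte> utf8 = "Hello"u8;

        Console.WriteLine(utf16.Length); // count of UTF-16 code units
        Console.WriteLine(utf8.Length);  // count of UTF-8 bytes; the null terminator is not included

        // Escape sequences are still processed in a regular literal with a suffix.
        // U+00E9 is one UTF-16 code unit but two UTF-8 bytes (0xC3 0xA9).
        ReadOnlySpan<byte> accented = "\u00E9"u8;
        Console.WriteLine(accented.Length);
        Console.WriteLine(accented[0] == 0xC3 && accented[1] == 0xA9);
    }
}
```

Because the value is a `ReadOnlySpan<byte>`, it can be indexed like any other span, but it cannot be stored in a field of a non-ref type or declared as a `const`.
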
9951018Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator ([§12.15.8](expressions.md#12158-string-equality-operators)), appear in the same assembly, these string literals refer to the same string instance.
9961019
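The sharing of instances between equivalent literals can be observed with a reference comparison. The following is a minimal sketch with hypothetical names, assuming both literals are compiled into the same assembly:

```csharp
using System;

class LiteralInstanceDemo
{
    static void Main()
    {
        string a = "hello";
        string b = "hello";

        // Equivalent literals in the same assembly refer to the same instance.
        Console.WriteLine(object.ReferenceEquals(a, b));

        // A string constructed at run time has equal contents
        // but is a distinct instance.
        string c = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        Console.WriteLine(a == c);
        Console.WriteLine(object.ReferenceEquals(a, c));
    }
}
```

The first comparison relies on the rule stated above; the last two show that value equality and reference identity are separate questions for strings that do not originate from literals.
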