@@ -906,14 +906,16 @@ A verbatim string literal consists of an `@` character followed by a double-quo
 
 In a verbatim string literal, the characters between the delimiters are interpreted verbatim, with the only exception being a *Quote_Escape_Sequence*, which represents one double-quote character. In particular, simple escape sequences, and hexadecimal and Unicode escape sequences are not processed in verbatim string literals. A verbatim string literal may span multiple lines.
 
+All string literal forms may optionally have a trailing *Utf8_Suffix*. The representation of each form is discussed below.
+
 ```ANTLR
 String_Literal
     : Regular_String_Literal
     | Verbatim_String_Literal
     ;
 
 fragment Regular_String_Literal
-    : '"' Regular_String_Literal_Character* '"'
+    : '"' Regular_String_Literal_Character* '"' Utf8_Suffix?
     ;
 
 fragment Regular_String_Literal_Character
@@ -929,7 +931,7 @@ fragment Single_Regular_String_Literal_Character
     ;
 
 fragment Verbatim_String_Literal
-    : '@"' Verbatim_String_Literal_Character* '"'
+    : '@"' Verbatim_String_Literal_Character* '"' Utf8_Suffix?
     ;
 
 fragment Verbatim_String_Literal_Character
@@ -944,6 +946,10 @@ fragment Single_Verbatim_String_Literal_Character
 fragment Quote_Escape_Sequence
     : '""'
     ;
+
+fragment Utf8_Suffix
+    : 'u8' | 'U8'
+    ;
 ```
 
 > *Example*: The example
@@ -976,7 +982,24 @@ fragment Quote_Escape_Sequence
 <!-- markdownlint-enable MD028 -->
 > *Note*: Since a hexadecimal escape sequence can have a variable number of hex digits, the string literal `"\x123"` contains a single character with hex value `123`. To create a string containing the character with hex value `12` followed by the character `3`, one could write `"\x00123"` or `"\x12"` + `"3"` instead. *end note*
 
-The type of a *String_Literal* is `string`.
+A *String_Literal* that does not contain a *Utf8_Suffix* is a ***UTF-16 string literal***, whose type is `string`.
+
+A *String_Literal* that contains a *Utf8_Suffix* is a ***UTF-8 string literal***, whose type is `System.ReadOnlySpan<byte>` (an indexable collection type), and whose value contains a UTF-8 byte representation of the string. A null terminator (a byte with value zero) is placed beyond the last byte in memory (and outside the length of the `ReadOnlySpan<byte>`) in order to support scenarios that expect null-terminated byte strings. A UTF-8 string literal is not a constant. A UTF-8 string literal without its *Utf8_Suffix* shall be valid UTF-16. (For example, `"\uDC00\uDD00"u8` is ill-formed as one low surrogate cannot be followed by another.)
+
+> *Note*: While every UTF-8 string literal is a `ReadOnlySpan<byte>`, not every `ReadOnlySpan<byte>` represents a UTF-8 string literal. See the description of UTF-8 string concatenation in [§12.13.5](expressions.md#12135-addition-operator). *end note*
+<!-- markdownlint-disable MD028 -->
+
+<!-- markdownlint-enable MD028 -->
+> *Note*: As `ReadOnlySpan<byte>` is a ref struct type, a UTF-8 string literal cannot be converted to `object` or used as a type parameter ([§16.2.3](structs.md#1623-ref-modifier)). *end note*
+<!-- markdownlint-disable MD028 -->
+
+<!-- markdownlint-enable MD028 -->
+> *Example*: Here are examples of each form of string literal:
+> | **Encoding** | **Type** | **Regular String Literal** | **Verbatim String Literal** | **Raw String Literal** |
+> | ------------ | -------------------- | ------------------- | ------------------- | ------------------- |
+> | UTF-16 | `string` | `"Hello"` | `@"Hello"` | `"""Hello"""` |
+> | UTF-8 | `ReadOnlySpan<byte>` | `"Hello"u8` | `@"Hello"u8` | `"""Hello"""u8` |
+> *end example*
 
 Each string literal does not necessarily result in a new string instance. When two or more string literals that are equivalent according to the string equality operator ([§12.15.8](expressions.md#12158-string-equality-operators)), appear in the same assembly, these string literals refer to the same string instance.
 
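
The behaviour described in the changed text can be sketched in C# roughly as follows. This is a minimal illustration, assuming a compiler that accepts the `u8` suffix (C# 11 or later); the class and method names are hypothetical and do not appear in the specification.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class StringLiteralSketch
{
    // A hexadecimal escape takes as many hex digits as it can (up to four),
    // so "\x123" is the single character U+0123.
    public static int HexEscapeLength() => "\x123".Length;

    // Splitting the literal yields U+0012 followed by the character '3'.
    public static string HexThenDigit() => "\x12" + "3";

    // A UTF-8 string literal has type ReadOnlySpan<byte>. Because that is a
    // ref struct, the bytes are copied to an array before being returned.
    public static byte[] Utf8Bytes()
    {
        ReadOnlySpan<byte> utf8 = "H\u00E9llo"u8; // U+00E9 encodes as C3 A9
        return utf8.ToArray();
    }

    // Equivalent UTF-16 string literals in one assembly refer to the same instance.
    public static bool LiteralsShareInstance()
    {
        string a = "Hello";
        string b = "Hello";
        return object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(HexEscapeLength());                   // 1
        Console.WriteLine(HexThenDigit().Length);               // 2
        Console.WriteLine(BitConverter.ToString(Utf8Bytes()));  // 48-C3-A9-6C-6C-6F
        Console.WriteLine(LiteralsShareInstance());             // True
    }
}
```

The null terminator mentioned in the text lies outside the span's length, so this sketch does not try to observe it.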