String Positions and Substrings
Positions in strings are between characters, numbered starting at 1 before the first character, and there is a position after the last character, as shown by this example:

     m   a   n   t   r   a
   ^   ^   ^   ^   ^   ^   ^
   1   2   3   4   5   6   7
Substrings are specified by bounding positions. For example, the substring between positions 2 and 5 in "mantra" is "ant".
Subscripting can be used to produce substrings. For example, if
word := "mantra"
then word[2:5] produces the substring of word between positions 2 and 5 ("ant" in this case).
The expression s[i] is shorthand for specifying the ith character of s (the character after position i). For example, word[1] is "m".
There also are nonpositive position specifications starting at 0 after the last character and decreasing toward the left:

     m   a   n   t   r   a
   ^   ^   ^   ^   ^   ^   ^
  -6  -5  -4  -3  -2  -1   0

For example, the positions 5 and -2 in "mantra" are equivalent. Positive and nonpositive position specifications can be intermixed and given in any order. Thus, word[2:5] and word[-2:2] are equivalent.
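The position rules above can be checked with a short program. This is a minimal sketch that reuses the word example from the text; the last subscript, word[0:-3], is an extra illustration that is not in the original.

```icon
procedure main()
   word := "mantra"
   write(word[2:5])     # ant: the substring between positions 2 and 5
   write(word[-2:2])    # ant: position -2 is the same place as position 5
   write(word[5:2])     # ant: the bounds may be given in either order
   write(word[1])       # m: the character after position 1
   write(word[0:-3])    # tra: from position -3 to the end of the string
end
```

Each subscript names a pair of positions, so the order of the bounds and the choice between positive and nonpositive forms do not change which characters are selected.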
A string can be assigned to a subscripting expression to replace the specified substring. For example, if
word := "thesis"
then
word[1] := "T"
changes word to "Thesis".
The replacement need not be the same length as the substring it replaces. For example,
word[4:0] := ""
replaces the substring "sis" of "Thesis" by the empty string to change the value of word to "The".
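As a rough illustration of substring assignment, the sketch below runs the two replacements described above and then adds one more step that is not in the original: assigning to an empty substring (word[4:4]), which inserts text without removing anything.

```icon
procedure main()
   word := "thesis"
   word[1] := "T"        # replace the first character
   write(word)           # Thesis
   word[4:0] := ""       # replace "sis" with the empty string
   write(word)           # The
   word[4:4] := "ory"    # the substring between 4 and 4 is empty, so this inserts
   write(word)           # Theory
end
```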
The operation !s generates the one-character substrings of s, from beginning to end. For example,
every write(!"Hello")
writes "H", "e", "l", "l", and "o" on separate lines.
String Comparison
As mentioned earlier, Icon uses the 256-character extended ASCII character set. The ASCII codes for characters impose an ordering on the characters.
Strings can be compared on the basis of the codes for their characters. The character c1 is less than c2 if the internal code for c1 is less than the code for c2. For example, the (ASCII) code for D is 68, and the code for Q is 81, so D is less than Q. The codes for letters have the same order as ordinary alphabetical order, but the lowercase and uppercase letters have different codes.
The codes for the digits are smaller than the codes for letters, and the uppercase letters have smaller codes than the lowercase letters. Other characters, such as punctuation, have various codes.
For strings, order is determined by the order of their characters, from left to right. Therefore, in ASCII, "DQ" is less than "dQ" and "dQ" is less than "dq". If one string is an initial substring of another, the shorter string is less than the longer. For example, "DQa" is lexically less than "DQaaa". The empty string is less than any other string. Two strings are equal if and only if they have the same length and are the same, character by character.
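There are six comparison operations for strings, which succeed and return the right operand if the comparison is successful but fail otherwise:
s1 << s2     lexically less than
s1 <<= s2    lexically less than or equal
s1 == s2     lexically equal
s1 ~== s2    lexically not equal
s1 >>= s2    lexically greater than or equal
s1 >> s2     lexically greater than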
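The ordering rules and the success-or-failure behavior of the comparison operations can be tried out with a small program. This is only a sketch; the strings are taken from the examples above, and the messages are illustrative.

```icon
procedure main()
   if "DQ" << "dQ" then write("DQ precedes dQ")
   if "DQa" << "DQaaa" then write("an initial substring precedes the longer string")
   if "" << "a" then write("the empty string precedes any other string")

   # A successful comparison returns its right operand, so comparisons chain.
   if "a" << "b" << "c" then write("a, b, and c are in order")
   write("D" << "Q")                            # writes Q

   # A failed comparison produces no value; alternation supplies a fallback.
   write(("Q" << "D") | "comparison failed")    # writes comparison failed
end
```

Because a comparison either produces a value or fails, it can be used directly as the condition of an if expression, with no separate Boolean type involved.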
String Analysis
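Icon has several functions for analyzing strings. They all return positions in strings. The string-analysis functions include
find(s1, s2)     positions at which s1 occurs as a substring of s2
match(s1, s2)    position after s1 if s1 is an initial substring of s2
upto(c, s)       positions in s at which a character in the cset c occurs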
These functions fail if there is no match. The functions upto() and find() are generators that produce positions of successive matches.
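The behavior of these functions as generators, and their failure when nothing matches, might be seen in a sketch like the following. The subject string "mississippi" is chosen here only for illustration.

```icon
procedure main()
   s := "mississippi"
   every write(find("ss", s))      # writes 3 and 6
   every write(upto('sp', s))      # writes 3, 4, 6, 7, 9, and 10
   write(match("miss", s))         # writes 5, the position after the initial "miss"
   if not find("xyz", s) then write("no match")
end
```

The every expression asks find() and upto() for all of their results; used where only one result is needed, each would simply produce its first position.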
An example of using string-analysis functions is
while instruction := read() do {
   j := upto(' ', instruction) | next      # skip bad lines
   command := instruction[1:j]
   if match(command, comment) then next    # skip comments
   else # process command
   }