Struct utf8_ranges::Utf8Sequences

pub struct Utf8Sequences { /* fields omitted */ }

An iterator over ranges of matching UTF-8 byte sequences.

Taken together, the sequences yielded by this iterator form an alternation that matches precisely the UTF-8 encodings of the scalar values in the given range.

A byte sequence corresponds to one of the scalar values in the range given if and only if it completely matches exactly one of the sequences of byte ranges produced by this iterator.

Each sequence of byte ranges matches a unique set of bytes. That is, no two sequences will match the same bytes.
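
To make the matching rule concrete, here is a minimal, self-contained sketch that does not use this crate. The `ByteRange` type, the `seq_matches` and `match_count` helpers, and the hand-written table in `bmp_seqs` are all illustrative assumptions. The table lists the byte-range sequences that well-formed UTF-8 permits for `\u{0}` through `\u{FFFF}`; the exact grouping this iterator yields for that range may differ.

```rust
/// Hypothetical stand-in for an inclusive range of bytes (not the crate's type).
#[derive(Clone, Copy, Debug)]
struct ByteRange {
    start: u8,
    end: u8,
}

fn r(start: u8, end: u8) -> ByteRange {
    ByteRange { start, end }
}

/// Hand-written sequences covering the UTF-8 encodings of U+0000..=U+FFFF,
/// surrogates excluded. This table is an assumption made for illustration.
fn bmp_seqs() -> Vec<Vec<ByteRange>> {
    vec![
        vec![r(0x00, 0x7F)],
        vec![r(0xC2, 0xDF), r(0x80, 0xBF)],
        vec![r(0xE0, 0xE0), r(0xA0, 0xBF), r(0x80, 0xBF)],
        vec![r(0xE1, 0xEC), r(0x80, 0xBF), r(0x80, 0xBF)],
        vec![r(0xED, 0xED), r(0x80, 0x9F), r(0x80, 0xBF)],
        vec![r(0xEE, 0xEF), r(0x80, 0xBF), r(0x80, 0xBF)],
    ]
}

/// A byte string matches a sequence only if the lengths agree and every
/// byte falls inside the range at the same position ("completely matches").
fn seq_matches(seq: &[ByteRange], bytes: &[u8]) -> bool {
    seq.len() == bytes.len()
        && seq
            .iter()
            .zip(bytes)
            .all(|(rg, &b)| rg.start <= b && b <= rg.end)
}

/// Counts how many sequences in the table match `bytes`. For a disjoint
/// table this is 1 for every encoded scalar value in range and 0 otherwise.
fn match_count(seqs: &[Vec<ByteRange>], bytes: &[u8]) -> usize {
    seqs.iter()
        .filter(|seq| seq_matches(seq.as_slice(), bytes))
        .count()
}
```

With this table, `match_count` never exceeds 1 for any input, which is the "no two sequences will match the same bytes" property stated above.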

Example

This shows how to match an arbitrary byte sequence against a range of scalar values.

use utf8_ranges::{Utf8Sequences, Utf8Sequence};

fn matches(seqs: &[Utf8Sequence], bytes: &[u8]) -> bool {
    for range in seqs {
        if range.matches(bytes) {
            return true;
        }
    }
    false
}

// Test the basic multilingual plane.
let seqs: Vec<_> = Utf8Sequences::new('\u{0}', '\u{FFFF}').collect();

// UTF-8 encoding of 'a'.
assert!(matches(&seqs, &[0x61]));
// UTF-8 encoding of '☃' (`\u{2603}`).
assert!(matches(&seqs, &[0xE2, 0x98, 0x83]));
// UTF-8 encoding of `\u{10348}` (outside the BMP).
assert!(!matches(&seqs, &[0xF0, 0x90, 0x8D, 0x88]));
// Tries to match against a UTF-8 encoding of a surrogate codepoint,
// which is invalid UTF-8, and therefore fails, despite the fact that
// the corresponding codepoint (0xD800) falls in the range given.
assert!(!matches(&seqs, &[0xED, 0xA0, 0x80]));
// And fails against plain old invalid UTF-8.
assert!(!matches(&seqs, &[0xFF, 0xFF]));

If this example seems circuitous, that's because it is! It's meant to be illustrative. In practice, you could just try to decode your byte sequence and compare it with the scalar value range directly. However, this is not always possible (for example, in a byte-based automaton).
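
For comparison, the direct "decode and compare" approach mentioned above can be sketched with the standard library alone. The helper name `decodes_into_range` is a hypothetical one chosen for this illustration; only `std::str::from_utf8` and `str::chars` are used.

```rust
/// Decodes `bytes` as UTF-8 and reports whether they encode exactly one
/// scalar value lying in the inclusive range `start..=end`.
fn decodes_into_range(bytes: &[u8], start: char, end: char) -> bool {
    match std::str::from_utf8(bytes) {
        Ok(s) => {
            let mut chars = s.chars();
            // Accept only inputs that decode to a single scalar value.
            match (chars.next(), chars.next()) {
                (Some(c), None) => start <= c && c <= end,
                _ => false,
            }
        }
        // Invalid UTF-8 (including encoded surrogates) never matches.
        Err(_) => false,
    }
}
```

This needs the whole byte sequence up front, which is why it does not fit an automaton that consumes one byte at a time.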

Methods

impl Utf8Sequences

Create a new iterator over UTF-8 byte ranges for the scalar value range given.

Trait Implementations

impl Iterator for Utf8Sequences

The type of the elements being iterated over.

Advances the iterator and returns the next value.

Since 1.0.0: Returns the bounds on the remaining length of the iterator.

Since 1.0.0: Consumes the iterator, counting the number of iterations and returning it.

Since 1.0.0: Consumes the iterator, returning the last element.

Since 1.0.0: Returns the nth element of the iterator.

🔬 This is a nightly-only experimental API. (iterator_step_by)

unstable replacement of Range::step_by

Creates an iterator starting at the same point, but stepping by the given amount at each iteration.

Since 1.0.0: Takes two iterators and creates a new iterator over both in sequence.

Since 1.0.0: 'Zips up' two iterators into a single iterator of pairs.

Since 1.0.0: Takes a closure and creates an iterator which calls that closure on each element.

Since 1.21.0: Calls a closure on each element of an iterator.

Since 1.0.0: Creates an iterator which uses a closure to determine if an element should be yielded.

Since 1.0.0: Creates an iterator that both filters and maps.

Since 1.0.0: Creates an iterator which gives the current iteration count as well as the next value.

Since 1.0.0: Creates an iterator which can use peek to look at the next element of the iterator without consuming it.

Since 1.0.0: Creates an iterator that skips elements based on a predicate.

Since 1.0.0: Creates an iterator that yields elements based on a predicate.

Since 1.0.0: Creates an iterator that skips the first n elements.

Since 1.0.0: Creates an iterator that yields its first n elements.

Since 1.0.0: An iterator adaptor similar to fold that holds internal state and produces a new iterator.

Since 1.0.0: Creates an iterator that works like map, but flattens nested structure.

Since 1.0.0: Creates an iterator which ends after the first None.

Since 1.0.0: Do something with each element of an iterator, passing the value on.

Since 1.0.0: Borrows an iterator, rather than consuming it.

Since 1.0.0: Transforms an iterator into a collection.

Since 1.0.0: Consumes an iterator, creating two collections from it.

Since 1.0.0: An iterator adaptor that applies a function, producing a single, final value.

Since 1.0.0: Tests if every element of the iterator matches a predicate.

Since 1.0.0: Tests if any element of the iterator matches a predicate.
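
The description just above matches the standard library's `Iterator::any`. As a sketch, the hand-written loop in the `matches` helper from the Example section could be expressed with it. The `Seq` type below is a hypothetical stand-in used so the snippet compiles without this crate; it is not the crate's `Utf8Sequence`, and `matches_any` is likewise a name invented for this illustration.

```rust
/// Hypothetical stand-in for a sequence of inclusive byte ranges,
/// stored as (low, high) pairs.
struct Seq(Vec<(u8, u8)>);

impl Seq {
    /// True if `bytes` has the same length as the sequence and each byte
    /// lies inside the range at the same position.
    fn matches(&self, bytes: &[u8]) -> bool {
        self.0.len() == bytes.len()
            && self
                .0
                .iter()
                .zip(bytes)
                .all(|(&(lo, hi), &b)| lo <= b && b <= hi)
    }
}

/// Same behavior as an explicit `for` loop with an early `return true`:
/// `any` stops at the first sequence that matches.
fn matches_any(seqs: &[Seq], bytes: &[u8]) -> bool {
    seqs.iter().any(|seq| seq.matches(bytes))
}
```

Because `any` short-circuits, it does no more work than the explicit loop does.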

Since 1.0.0: Searches for an element of an iterator that satisfies a predicate.

Since 1.0.0: Searches for an element in an iterator, returning its index.

Since 1.0.0: Searches for an element in an iterator from the right, returning its index.

Since 1.0.0: Returns the maximum element of an iterator.

Since 1.0.0: Returns the minimum element of an iterator.

Since 1.6.0: Returns the element that gives the maximum value from the specified function.

Since 1.15.0: Returns the element that gives the maximum value with respect to the specified comparison function.

Since 1.6.0: Returns the element that gives the minimum value from the specified function.

Since 1.15.0: Returns the element that gives the minimum value with respect to the specified comparison function.

Since 1.0.0: Reverses an iterator's direction.

Since 1.0.0: Converts an iterator of pairs into a pair of containers.

Since 1.0.0: Creates an iterator which clones all of its elements.

Since 1.0.0: Repeats an iterator endlessly.

Since 1.11.0: Sums the elements of an iterator.

Since 1.11.0: Iterates over the entire iterator, multiplying all the elements.

Since 1.5.0: Lexicographically compares the elements of this Iterator with those of another.

Since 1.5.0: Lexicographically compares the elements of this Iterator with those of another.

Since 1.5.0: Determines if the elements of this Iterator are equal to those of another.

Since 1.5.0: Determines if the elements of this Iterator are unequal to those of another.

Since 1.5.0: Determines if the elements of this Iterator are lexicographically less than those of another.

Since 1.5.0: Determines if the elements of this Iterator are lexicographically less or equal to those of another.

Since 1.5.0: Determines if the elements of this Iterator are lexicographically greater than those of another.

Since 1.5.0: Determines if the elements of this Iterator are lexicographically greater than or equal to those of another.