sun.text.normalizer.NormalizerBase Java Examples

The following examples show how to use sun.text.normalizer.NormalizerBase.
Example #1
Source File: CollationElementIterator.java    From Java8CN with Apache License 2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
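
The effect of reset() can be observed through the public java.text API without touching the internal NormalizerBase class. The following is a minimal sketch, assuming a small illustrative rule set ("< a < b < c"); the class and helper names are hypothetical and not part of the JDK.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class ResetSketch {
    // Hypothetical helper: builds a collator from a tiny illustrative rule set.
    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator("< a < b < c");
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the first collation element seen before and after reset().
    static int[] firstBeforeAndAfterReset(String source) {
        CollationElementIterator it = collator().getCollationElementIterator(source);
        int first = it.next();
        it.next();   // advance further so reset() has something to undo
        it.reset();  // cursor goes back to the beginning of the string
        return new int[] { first, it.next() };
    }

    public static void main(String[] args) {
        int[] r = firstBeforeAndAfterReset("abc");
        System.out.println(r[0] == r[1]);
    }
}
```

After reset(), the next call to next() should return the same element as the very first call did.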
 
Example #2
Source File: CollationElementIterator.java    From jdk8u_jdk with GNU General Public License v2.0
/**
 * CollationElementIterator constructor.  This takes the source string and
 * the collation object.  The cursor will walk thru the source string based
 * on the predefined collation rules.  If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 */
CollationElementIterator(String sourceText, RuleBasedCollator owner) {
    this.owner = owner;
    ordering = owner.getTables();
    if ( sourceText.length() != 0 ) {
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text = new NormalizerBase(sourceText, mode);
    }
}
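
The constructor above is package-private, so application code normally obtains an iterator from RuleBasedCollator.getCollationElementIterator(String). The sketch below illustrates the documented empty-string behavior (NULLORDER on the first call to next()); the rule set and the class name are assumptions made for illustration.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class EmptySourceSketch {
    // Returns the first collation element for the given source string.
    static int firstElement(String source) {
        try {
            // Illustrative rule set; any valid rules would do here.
            RuleBasedCollator collator = new RuleBasedCollator("< a < b");
            CollationElementIterator it = collator.getCollationElementIterator(source);
            return it.next();
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // An empty source string yields NULLORDER immediately.
        System.out.println(firstElement("") == CollationElementIterator.NULLORDER);
    }
}
```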
 
Example #3
Source File: CollationElementIterator.java    From JDKSourceCode1.8 with MIT License
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #4
Source File: CollatorUtilities.java    From jdk8u_jdk with GNU General Public License v2.0
public static NormalizerBase.Mode toNormalizerMode(int mode) {
    NormalizerBase.Mode normalizerMode;

    try {
        normalizerMode = legacyModeMap[mode];
    }
    catch(ArrayIndexOutOfBoundsException e) {
        normalizerMode = NormalizerBase.NONE;
    }
    return normalizerMode;

}
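
The same "table lookup with a safe fallback" idea can be sketched against the public java.text.Normalizer API. The contents of legacyModeMap are not shown in the example above, so the table below is an assumption: it maps the Collator decomposition constants to Normalizer.Form values, with null standing in for "no normalization". The class and method names are hypothetical.

```java
import java.text.Collator;
import java.text.Normalizer;

final class ModeLookupSketch {
    // Assumed table, indexed by Collator.NO_DECOMPOSITION (0),
    // Collator.CANONICAL_DECOMPOSITION (1) and Collator.FULL_DECOMPOSITION (2).
    private static final Normalizer.Form[] FORMS = {
        null, Normalizer.Form.NFD, Normalizer.Form.NFKD
    };

    // Bounds check instead of catching ArrayIndexOutOfBoundsException;
    // unknown modes fall back to "no normalization" (null).
    static Normalizer.Form toForm(int mode) {
        return (mode >= 0 && mode < FORMS.length) ? FORMS[mode] : null;
    }

    // Applies the form selected by the legacy mode, or returns the input unchanged.
    static String decompose(String s, int mode) {
        Normalizer.Form form = toForm(mode);
        return form == null ? s : Normalizer.normalize(s, form);
    }

    public static void main(String[] args) {
        String decomposed = decompose("\u00e9", Collator.CANONICAL_DECOMPOSITION);
        System.out.println(decomposed.length());
    }
}
```

An explicit bounds check is used here rather than the try/catch of the original; the observable behavior for out-of-range modes is the same fallback.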
 
Example #5
Source File: CollationElementIterator.java    From jdk8u-dev-jdk with GNU General Public License v2.0
/**
 * CollationElementIterator constructor.  This takes the source string and
 * the collation object.  The cursor will walk thru the source string based
 * on the predefined collation rules.  If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 */
CollationElementIterator(String sourceText, RuleBasedCollator owner) {
    this.owner = owner;
    ordering = owner.getTables();
    if ( sourceText.length() != 0 ) {
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text = new NormalizerBase(sourceText, mode);
    }
}
 
Example #6
Source File: CollationElementIterator.java    From openjdk-jdk8u-backup with GNU General Public License v2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #7
Source File: CollatorUtilities.java    From jdk8u-jdk with GNU General Public License v2.0
public static int toLegacyMode(NormalizerBase.Mode mode) {
    // find the index of the legacy mode in the table;
    // if it's not there, default to Collator.NO_DECOMPOSITION (0)
    int legacyMode = legacyModeMap.length;
    while (legacyMode > 0) {
        --legacyMode;
        if (legacyModeMap[legacyMode] == mode) {
            break;
        }
    }
    return legacyMode;
}
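
The reverse lookup above scans the table from the last index down and stops at index 0 whether or not it found a match, which is how an unknown mode ends up as 0. A self-contained sketch of that loop follows; the Mode enum and the table contents are hypothetical stand-ins, since the real legacyModeMap is not shown here.

```java
final class LegacyModeSketch {
    // Hypothetical stand-in for NormalizerBase.Mode.
    enum Mode { NONE, NFD, NFKD, NFC }

    // Assumed table: index = legacy Collator decomposition constant.
    private static final Mode[] LEGACY_MODE_MAP = { Mode.NONE, Mode.NFD, Mode.NFKD };

    // Same loop shape as toLegacyMode: walk down from the end and
    // fall through to 0 when no entry matches.
    static int toLegacyMode(Mode mode) {
        int legacyMode = LEGACY_MODE_MAP.length;
        while (legacyMode > 0) {
            --legacyMode;
            if (LEGACY_MODE_MAP[legacyMode] == mode) {
                break;
            }
        }
        return legacyMode;
    }

    public static void main(String[] args) {
        System.out.println(toLegacyMode(Mode.NFKD)); // index of NFKD in the table
        System.out.println(toLegacyMode(Mode.NFC));  // not in the table, so 0
    }
}
```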
 
Example #8
Source File: CollatorUtilities.java    From Bytecoder with Apache License 2.0
public static NormalizerBase.Mode toNormalizerMode(int mode) {
    NormalizerBase.Mode normalizerMode;

    try {
        normalizerMode = legacyModeMap[mode];
    }
    catch(ArrayIndexOutOfBoundsException e) {
        normalizerMode = NormalizerBase.NONE;
    }
    return normalizerMode;

}
 
Example #9
Source File: CollationElementIterator.java    From dragonwell8_jdk with GNU General Public License v2.0
/**
 * CollationElementIterator constructor.  This takes the source string and
 * the collation object.  The cursor will walk thru the source string based
 * on the predefined collation rules.  If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 */
CollationElementIterator(String sourceText, RuleBasedCollator owner) {
    this.owner = owner;
    ordering = owner.getTables();
    if ( sourceText.length() != 0 ) {
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text = new NormalizerBase(sourceText, mode);
    }
}
 
Example #10
Source File: CollationElementIterator.java    From jdk-1.7-annotated with Apache License 2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #11
Source File: CollatorUtilities.java    From jdk8u60 with GNU General Public License v2.0
public static int toLegacyMode(NormalizerBase.Mode mode) {
    // find the index of the legacy mode in the table;
    // if it's not there, default to Collator.NO_DECOMPOSITION (0)
    int legacyMode = legacyModeMap.length;
    while (legacyMode > 0) {
        --legacyMode;
        if (legacyModeMap[legacyMode] == mode) {
            break;
        }
    }
    return legacyMode;
}
 
Example #12
Source File: CollationElementIterator.java    From Java8CN with Apache License 2.0
/**
 * CollationElementIterator constructor.  This takes the source string and
 * the collation object.  The cursor will walk thru the source string based
 * on the predefined collation rules.  If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 */
CollationElementIterator(String sourceText, RuleBasedCollator owner) {
    this.owner = owner;
    ordering = owner.getTables();
    if ( sourceText.length() != 0 ) {
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text = new NormalizerBase(sourceText, mode);
    }
}
 
Example #13
Source File: CollationElementIterator.java    From openjdk-8 with GNU General Public License v2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #14
Source File: CollatorUtilities.java    From TencentKona-8 with GNU General Public License v2.0
public static int toLegacyMode(NormalizerBase.Mode mode) {
    // find the index of the legacy mode in the table;
    // if it's not there, default to Collator.NO_DECOMPOSITION (0)
    int legacyMode = legacyModeMap.length;
    while (legacyMode > 0) {
        --legacyMode;
        if (legacyModeMap[legacyMode] == mode) {
            break;
        }
    }
    return legacyMode;
}
 
Example #15
Source File: CollationElementIterator.java    From openjdk-8-source with GNU General Public License v2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #16
Source File: CollationElementIterator.java    From jdk8u60 with GNU General Public License v2.0
/**
 * CollationElementIterator constructor.  This takes the source string and
 * the collation object.  The cursor will walk thru the source string based
 * on the predefined collation rules.  If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 */
CollationElementIterator(String sourceText, RuleBasedCollator owner) {
    this.owner = owner;
    ordering = owner.getTables();
    if ( sourceText.length() != 0 ) {
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text = new NormalizerBase(sourceText, mode);
    }
}
 
Example #17
Source File: CollatorUtilities.java    From openjdk-jdk8u with GNU General Public License v2.0
public static int toLegacyMode(NormalizerBase.Mode mode) {
    // find the index of the legacy mode in the table;
    // if it's not there, default to Collator.NO_DECOMPOSITION (0)
    int legacyMode = legacyModeMap.length;
    while (legacyMode > 0) {
        --legacyMode;
        if (legacyModeMap[legacyMode] == mode) {
            break;
        }
    }
    return legacyMode;
}
 
Example #18
Source File: CollationElementIterator.java    From jdk8u-jdk with GNU General Public License v2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #19
Source File: CollationElementIterator.java    From Bytecoder with Apache License 2.0
/**
 * Resets the cursor to the beginning of the string.  The next call
 * to next() will return the first collation element in the string.
 */
public void reset()
{
    if (text != null) {
        text.reset();
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text.setMode(mode);
    }
    buffer = null;
    expIndex = 0;
    swapOrder = 0;
}
 
Example #20
Source File: CollationElementIterator.java    From openjdk-8 with GNU General Public License v2.0
/**
 * CollationElementIterator constructor.  This takes the source string and
 * the collation object.  The cursor will walk thru the source string based
 * on the predefined collation rules.  If the source string is empty,
 * NULLORDER will be returned on the calls to next().
 * @param sourceText the source string.
 * @param owner the collation object.
 */
CollationElementIterator(String sourceText, RuleBasedCollator owner) {
    this.owner = owner;
    ordering = owner.getTables();
    if ( sourceText.length() != 0 ) {
        NormalizerBase.Mode mode =
            CollatorUtilities.toNormalizerMode(owner.getDecomposition());
        text = new NormalizerBase(sourceText, mode);
    }
}
 
Example #21
Source File: CollatorUtilities.java    From dragonwell8_jdk with GNU General Public License v2.0
public static int toLegacyMode(NormalizerBase.Mode mode) {
    // find the index of the legacy mode in the table;
    // if it's not there, default to Collator.NO_DECOMPOSITION (0)
    int legacyMode = legacyModeMap.length;
    while (legacyMode > 0) {
        --legacyMode;
        if (legacyModeMap[legacyMode] == mode) {
            break;
        }
    }
    return legacyMode;
}
 
Example #22
Source File: CollationElementIterator.java    From openjdk-jdk8u with GNU General Public License v2.0
/**
 * Get the next collation element in the string.  <p>This iterator iterates
 * over a sequence of collation elements that were built from the string.
 * Because there isn't necessarily a one-to-one mapping from characters to
 * collation elements, this doesn't mean the same thing as "return the
 * collation element [or ordering priority] of the next character in the
 * string".</p>
 * <p>This function returns the collation element that the iterator is currently
 * pointing to and then updates the internal pointer to point to the next element.
 * previous() updates the pointer first and then returns the element.  This
 * means that when you change direction while iterating (i.e., call next() and
 * then call previous(), or call previous() and then call next()), you'll get
 * back the same element twice.</p>
 *
 * @return the next collation element
 */
public int next()
{
    if (text == null) {
        return NULLORDER;
    }
    NormalizerBase.Mode textMode = text.getMode();
    // convert the owner's mode to something the Normalizer understands
    NormalizerBase.Mode ownerMode =
        CollatorUtilities.toNormalizerMode(owner.getDecomposition());
    if (textMode != ownerMode) {
        text.setMode(ownerMode);
    }

    // if buffer contains any decomposed char values
    // return their strength orders before continuing in
    // the Normalizer's CharacterIterator.
    if (buffer != null) {
        if (expIndex < buffer.length) {
            return strengthOrder(buffer[expIndex++]);
        } else {
            buffer = null;
            expIndex = 0;
        }
    } else if (swapOrder != 0) {
        if (Character.isSupplementaryCodePoint(swapOrder)) {
            char[] chars = Character.toChars(swapOrder);
            swapOrder = chars[1];
            return chars[0] << 16;
        }
        int order = swapOrder << 16;
        swapOrder = 0;
        return order;
    }
    int ch  = text.next();

    // are we at the end of Normalizer's text?
    if (ch == NormalizerBase.DONE) {
        return NULLORDER;
    }

    int value = ordering.getUnicodeOrder(ch);
    if (value == RuleBasedCollator.UNMAPPED) {
        swapOrder = ch;
        return UNMAPPEDCHARVALUE;
    }
    else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
        value = nextContractChar(ch);
    }
    if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
        buffer = ordering.getExpandValueList(value);
        expIndex = 0;
        value = buffer[expIndex++];
    }

    if (ordering.isSEAsianSwapping()) {
        int consonant;
        if (isThaiPreVowel(ch)) {
            consonant = text.next();
            if (isThaiBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
        if (isLaoPreVowel(ch)) {
            consonant = text.next();
            if (isLaoBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
    }

    return strengthOrder(value);
}
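
From the caller's side, next() is typically driven in a loop until it returns NULLORDER. The sketch below collects the primary order of each element for a short string, assuming an illustrative rule set ("< a < b < c") in which each character maps to exactly one collation element; the class and method names are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

final class NextLoopSketch {
    // Walks the string forward and records the primary order of each element.
    static List<Integer> primaryOrders(String source) {
        try {
            // Illustrative rules: a, b and c differ at the primary level.
            RuleBasedCollator collator = new RuleBasedCollator("< a < b < c");
            CollationElementIterator it = collator.getCollationElementIterator(source);
            List<Integer> orders = new ArrayList<>();
            int element;
            while ((element = it.next()) != CollationElementIterator.NULLORDER) {
                orders.add(CollationElementIterator.primaryOrder(element));
            }
            return orders;
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(primaryOrders("abc"));
    }
}
```

Because the rules order a before b before c at the primary level, the collected primary orders for "abc" should be strictly increasing.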
 
Example #23
Source File: CollationElementIterator.java    From jdk8u-jdk with GNU General Public License v2.0
/**
 * Get the ordering priority of the next contracting character in the
 * string.
 * @param ch the starting character of a contracting character token
 * @return the next contracting character's ordering.  Returns NULLORDER
 * if the end of string is reached.
 */
private int nextContractChar(int ch)
{
    // First get the ordering of this single character,
    // which is always the first element in the list
    Vector<EntryPair> list = ordering.getContractValues(ch);
    EntryPair pair = list.firstElement();
    int order = pair.value;

    // find out the length of the longest contracting character sequence in the list.
    // There's logic in the builder code to make sure the longest sequence is always
    // the last.
    pair = list.lastElement();
    int maxLength = pair.entryName.length();

    // (the Normalizer is cloned here so that the seeking we do in the next loop
    // won't affect our real position in the text)
    NormalizerBase tempText = (NormalizerBase)text.clone();

    // extract the next maxLength characters in the string (we have to do this using the
    // Normalizer to ensure that our offsets correspond to those the rest of the
    // iterator is using) and store it in "fragment".
    tempText.previous();
    key.setLength(0);
    int c = tempText.next();
    while (maxLength > 0 && c != NormalizerBase.DONE) {
        if (Character.isSupplementaryCodePoint(c)) {
            key.append(Character.toChars(c));
            maxLength -= 2;
        } else {
            key.append((char)c);
            --maxLength;
        }
        c = tempText.next();
    }
    String fragment = key.toString();
    // now that we have that fragment, iterate through this list looking for the
    // longest sequence that matches the characters in the actual text.  (maxLength
    // is used here to keep track of the length of the longest sequence)
    // Upon exit from this loop, maxLength will contain the length of the matching
    // sequence and order will contain the collation-element value corresponding
    // to this sequence
    maxLength = 1;
    for (int i = list.size() - 1; i > 0; i--) {
        pair = list.elementAt(i);
        if (!pair.fwd)
            continue;

        if (fragment.startsWith(pair.entryName) && pair.entryName.length()
                > maxLength) {
            maxLength = pair.entryName.length();
            order = pair.value;
        }
    }

    // seek our current iteration position to the end of the matching sequence
    // and return the appropriate collation-element value (if there was no matching
    // sequence, we're already seeked to the right position and order already contains
    // the correct collation-element value for the single character)
    while (maxLength > 1) {
        c = text.next();
        maxLength -= Character.charCount(c);
    }
    return order;
}
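
The visible consequence of contraction handling is that a multi-character sequence can yield a single collation element. The following sketch counts elements under an illustrative rule set that declares "ch" as a contraction; the rules and the class name are assumptions chosen for the demonstration.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class ContractionSketch {
    // Counts the collation elements produced for the given string.
    static int countElements(String source) {
        try {
            // Illustrative rules: "ch" is a contraction sorting between c and d.
            RuleBasedCollator collator = new RuleBasedCollator("< a < b < c < ch < d");
            CollationElementIterator it = collator.getCollationElementIterator(source);
            int count = 0;
            while (it.next() != CollationElementIterator.NULLORDER) {
                count++;
            }
            return count;
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("ch")); // contraction: one element
        System.out.println(countElements("cd")); // no contraction: two elements
    }
}
```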
 
Example #24
Source File: CollationElementIterator.java    From JDKSourceCode1.8 with MIT License
/**
 * Get the next collation element in the string.  <p>This iterator iterates
 * over a sequence of collation elements that were built from the string.
 * Because there isn't necessarily a one-to-one mapping from characters to
 * collation elements, this doesn't mean the same thing as "return the
 * collation element [or ordering priority] of the next character in the
 * string".</p>
 * <p>This function returns the collation element that the iterator is currently
 * pointing to and then updates the internal pointer to point to the next element.
 * previous() updates the pointer first and then returns the element.  This
 * means that when you change direction while iterating (i.e., call next() and
 * then call previous(), or call previous() and then call next()), you'll get
 * back the same element twice.</p>
 *
 * @return the next collation element
 */
public int next()
{
    if (text == null) {
        return NULLORDER;
    }
    NormalizerBase.Mode textMode = text.getMode();
    // convert the owner's mode to something the Normalizer understands
    NormalizerBase.Mode ownerMode =
        CollatorUtilities.toNormalizerMode(owner.getDecomposition());
    if (textMode != ownerMode) {
        text.setMode(ownerMode);
    }

    // if buffer contains any decomposed char values
    // return their strength orders before continuing in
    // the Normalizer's CharacterIterator.
    if (buffer != null) {
        if (expIndex < buffer.length) {
            return strengthOrder(buffer[expIndex++]);
        } else {
            buffer = null;
            expIndex = 0;
        }
    } else if (swapOrder != 0) {
        if (Character.isSupplementaryCodePoint(swapOrder)) {
            char[] chars = Character.toChars(swapOrder);
            swapOrder = chars[1];
            return chars[0] << 16;
        }
        int order = swapOrder << 16;
        swapOrder = 0;
        return order;
    }
    int ch  = text.next();

    // are we at the end of Normalizer's text?
    if (ch == NormalizerBase.DONE) {
        return NULLORDER;
    }

    int value = ordering.getUnicodeOrder(ch);
    if (value == RuleBasedCollator.UNMAPPED) {
        swapOrder = ch;
        return UNMAPPEDCHARVALUE;
    }
    else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
        value = nextContractChar(ch);
    }
    if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
        buffer = ordering.getExpandValueList(value);
        expIndex = 0;
        value = buffer[expIndex++];
    }

    if (ordering.isSEAsianSwapping()) {
        int consonant;
        if (isThaiPreVowel(ch)) {
            consonant = text.next();
            if (isThaiBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
        if (isLaoPreVowel(ch)) {
            consonant = text.next();
            if (isLaoBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
    }

    return strengthOrder(value);
}
 
Example #25
Source File: CollationElementIterator.java    From JDKSourceCode1.8 with MIT License
/**
 * Get the previous collation element in the string.  <p>This iterator iterates
 * over a sequence of collation elements that were built from the string.
 * Because there isn't necessarily a one-to-one mapping from characters to
 * collation elements, this doesn't mean the same thing as "return the
 * collation element [or ordering priority] of the previous character in the
 * string".</p>
 * <p>This function updates the iterator's internal pointer to point to the
 * collation element preceding the one it's currently pointing to and then
 * returns that element, while next() returns the current element and then
 * updates the pointer.  This means that when you change direction while
 * iterating (i.e., call next() and then call previous(), or call previous()
 * and then call next()), you'll get back the same element twice.</p>
 *
 * @return the previous collation element
 * @since 1.2
 */
public int previous()
{
    if (text == null) {
        return NULLORDER;
    }
    NormalizerBase.Mode textMode = text.getMode();
    // convert the owner's mode to something the Normalizer understands
    NormalizerBase.Mode ownerMode =
        CollatorUtilities.toNormalizerMode(owner.getDecomposition());
    if (textMode != ownerMode) {
        text.setMode(ownerMode);
    }
    if (buffer != null) {
        if (expIndex > 0) {
            return strengthOrder(buffer[--expIndex]);
        } else {
            buffer = null;
            expIndex = 0;
        }
    } else if (swapOrder != 0) {
        if (Character.isSupplementaryCodePoint(swapOrder)) {
            char[] chars = Character.toChars(swapOrder);
            swapOrder = chars[1];
            return chars[0] << 16;
        }
        int order = swapOrder << 16;
        swapOrder = 0;
        return order;
    }
    int ch = text.previous();
    if (ch == NormalizerBase.DONE) {
        return NULLORDER;
    }

    int value = ordering.getUnicodeOrder(ch);

    if (value == RuleBasedCollator.UNMAPPED) {
        swapOrder = UNMAPPEDCHARVALUE;
        return ch;
    } else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
        value = prevContractChar(ch);
    }
    if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
        buffer = ordering.getExpandValueList(value);
        expIndex = buffer.length;
        value = buffer[--expIndex];
    }

    if (ordering.isSEAsianSwapping()) {
        int vowel;
        if (isThaiBaseConsonant(ch)) {
            vowel = text.previous();
            if (isThaiPreVowel(vowel)) {
                buffer = makeReorderedBuffer(vowel, value, buffer, false);
                expIndex = buffer.length - 1;
                value = buffer[expIndex];
            } else {
                text.next();
            }
        }
        if (isLaoBaseConsonant(ch)) {
            vowel = text.previous();
            if (isLaoPreVowel(vowel)) {
                buffer = makeReorderedBuffer(vowel, value, buffer, false);
                expIndex = buffer.length - 1;
                value = buffer[expIndex];
            } else {
                text.next();
            }
        }
    }

    return strengthOrder(value);
}
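
The Javadoc above notes that changing direction returns the same element twice. A minimal sketch of that behavior through the public API follows, assuming a simple rule set with no contractions or expansions; the class and method names are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class DirectionChangeSketch {
    // Calls next() once and then previous() once, returning both results.
    static int[] nextThenPrevious(String source) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator("< a < b < c");
            CollationElementIterator it = collator.getCollationElementIterator(source);
            int forward = it.next();      // returns the current element, then advances
            int backward = it.previous(); // steps back first, then returns that element
            return new int[] { forward, backward };
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = nextThenPrevious("abc");
        System.out.println(r[0] == r[1]);
    }
}
```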
 
Example #26
Source File: CollationElementIterator.java    From openjdk-jdk9 with GNU General Public License v2.0
/**
 * Get the ordering priority of the previous contracting character in the
 * string.
 * @param ch the starting character of a contracting character token
 * @return the previous contracting character's ordering.  Returns NULLORDER
 * if the end of string is reached.
 */
private int prevContractChar(int ch)
{
    // This function is identical to nextContractChar(), except that we've
    // switched things so that the next() and previous() calls on the Normalizer
    // are switched and so that we skip entry pairs with the fwd flag turned on
    // rather than off.  Notice that we still use append() and startsWith() when
    // working on the fragment.  This is because the entry pairs that are used
    // in reverse iteration have their names reversed already.
    Vector<EntryPair> list = ordering.getContractValues(ch);
    EntryPair pair = list.firstElement();
    int order = pair.value;

    pair = list.lastElement();
    int maxLength = pair.entryName.length();

    NormalizerBase tempText = (NormalizerBase)text.clone();

    tempText.next();
    key.setLength(0);
    int c = tempText.previous();
    while (maxLength > 0 && c != NormalizerBase.DONE) {
        if (Character.isSupplementaryCodePoint(c)) {
            key.append(Character.toChars(c));
            maxLength -= 2;
        } else {
            key.append((char)c);
            --maxLength;
        }
        c = tempText.previous();
    }
    String fragment = key.toString();

    maxLength = 1;
    for (int i = list.size() - 1; i > 0; i--) {
        pair = list.elementAt(i);
        if (pair.fwd)
            continue;

        if (fragment.startsWith(pair.entryName) && pair.entryName.length()
                > maxLength) {
            maxLength = pair.entryName.length();
            order = pair.value;
        }
    }

    while (maxLength > 1) {
        c = text.previous();
        maxLength -= Character.charCount(c);
    }
    return order;
}
 
Example #27
Source File: CollationElementIterator.java    From Java8CN with Apache License 2.0
/**
 * Get the next collation element in the string.  <p>This iterator iterates
 * over a sequence of collation elements that were built from the string.
 * Because there isn't necessarily a one-to-one mapping from characters to
 * collation elements, this doesn't mean the same thing as "return the
 * collation element [or ordering priority] of the next character in the
 * string".</p>
 * <p>This function returns the collation element that the iterator is currently
 * pointing to and then updates the internal pointer to point to the next element.
 * previous() updates the pointer first and then returns the element.  This
 * means that when you change direction while iterating (i.e., call next() and
 * then call previous(), or call previous() and then call next()), you'll get
 * back the same element twice.</p>
 *
 * @return the next collation element
 */
public int next()
{
    if (text == null) {
        return NULLORDER;
    }
    NormalizerBase.Mode textMode = text.getMode();
    // convert the owner's mode to something the Normalizer understands
    NormalizerBase.Mode ownerMode =
        CollatorUtilities.toNormalizerMode(owner.getDecomposition());
    if (textMode != ownerMode) {
        text.setMode(ownerMode);
    }

    // if buffer contains any decomposed char values
    // return their strength orders before continuing in
    // the Normalizer's CharacterIterator.
    if (buffer != null) {
        if (expIndex < buffer.length) {
            return strengthOrder(buffer[expIndex++]);
        } else {
            buffer = null;
            expIndex = 0;
        }
    } else if (swapOrder != 0) {
        if (Character.isSupplementaryCodePoint(swapOrder)) {
            char[] chars = Character.toChars(swapOrder);
            swapOrder = chars[1];
            return chars[0] << 16;
        }
        int order = swapOrder << 16;
        swapOrder = 0;
        return order;
    }
    int ch  = text.next();

    // are we at the end of Normalizer's text?
    if (ch == NormalizerBase.DONE) {
        return NULLORDER;
    }

    int value = ordering.getUnicodeOrder(ch);
    if (value == RuleBasedCollator.UNMAPPED) {
        swapOrder = ch;
        return UNMAPPEDCHARVALUE;
    }
    else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
        value = nextContractChar(ch);
    }
    if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
        buffer = ordering.getExpandValueList(value);
        expIndex = 0;
        value = buffer[expIndex++];
    }

    if (ordering.isSEAsianSwapping()) {
        int consonant;
        if (isThaiPreVowel(ch)) {
            consonant = text.next();
            if (isThaiBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
        if (isLaoPreVowel(ch)) {
            consonant = text.next();
            if (isLaoBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
    }

    return strengthOrder(value);
}
 
Example #28
Source File: CollationElementIterator.java    From jdk8u60 with GNU General Public License v2.0
/**
 * Get the previous collation element in the string.  <p>This iterator iterates
 * over a sequence of collation elements that were built from the string.
 * Because there isn't necessarily a one-to-one mapping from characters to
 * collation elements, this doesn't mean the same thing as "return the
 * collation element [or ordering priority] of the previous character in the
 * string".</p>
 * <p>This function updates the iterator's internal pointer to point to the
 * collation element preceding the one it's currently pointing to and then
 * returns that element, while next() returns the current element and then
 * updates the pointer.  This means that when you change direction while
 * iterating (i.e., call next() and then call previous(), or call previous()
 * and then call next()), you'll get back the same element twice.</p>
 *
 * @return the previous collation element
 * @since 1.2
 */
public int previous()
{
    if (text == null) {
        return NULLORDER;
    }
    NormalizerBase.Mode textMode = text.getMode();
    // convert the owner's mode to something the Normalizer understands
    NormalizerBase.Mode ownerMode =
        CollatorUtilities.toNormalizerMode(owner.getDecomposition());
    if (textMode != ownerMode) {
        text.setMode(ownerMode);
    }
    if (buffer != null) {
        if (expIndex > 0) {
            return strengthOrder(buffer[--expIndex]);
        } else {
            buffer = null;
            expIndex = 0;
        }
    } else if (swapOrder != 0) {
        if (Character.isSupplementaryCodePoint(swapOrder)) {
            char[] chars = Character.toChars(swapOrder);
            swapOrder = chars[1];
            return chars[0] << 16;
        }
        int order = swapOrder << 16;
        swapOrder = 0;
        return order;
    }
    int ch = text.previous();
    if (ch == NormalizerBase.DONE) {
        return NULLORDER;
    }

    int value = ordering.getUnicodeOrder(ch);

    if (value == RuleBasedCollator.UNMAPPED) {
        swapOrder = UNMAPPEDCHARVALUE;
        return ch;
    } else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
        value = prevContractChar(ch);
    }
    if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
        buffer = ordering.getExpandValueList(value);
        expIndex = buffer.length;
        value = buffer[--expIndex];
    }

    if (ordering.isSEAsianSwapping()) {
        int vowel;
        if (isThaiBaseConsonant(ch)) {
            vowel = text.previous();
            if (isThaiPreVowel(vowel)) {
                buffer = makeReorderedBuffer(vowel, value, buffer, false);
                expIndex = buffer.length - 1;
                value = buffer[expIndex];
            } else {
                text.next();
            }
        }
        if (isLaoBaseConsonant(ch)) {
            vowel = text.previous();
            if (isLaoPreVowel(vowel)) {
                buffer = makeReorderedBuffer(vowel, value, buffer, false);
                expIndex = buffer.length - 1;
                value = buffer[expIndex];
            } else {
                text.next();
            }
        }
    }

    return strengthOrder(value);
}
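
The Javadoc above says that reversing direction returns the same element twice. The following is a minimal sketch of that behaviour using only the public java.text API rather than the internal NormalizerBase class. The class name DirectionChangeDemo, the helper repeatsOnTurn, and the rule string "< a < b < c" are illustrative assumptions, not part of the JDK source shown here.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class DirectionChangeDemo {
    // Hypothetical helper: reports whether next() followed by previous()
    // hands back the same collation element.
    static boolean repeatsOnTurn(String rules, String text) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            CollationElementIterator it = collator.getCollationElementIterator(text);
            int forward = it.next();      // returns the current element, then advances
            int backward = it.previous(); // steps back first, then returns that element
            return forward != CollationElementIterator.NULLORDER && forward == backward;
        } catch (ParseException e) {
            // The rule string is an assumption of this sketch; surface bad rules loudly.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(repeatsOnTurn("< a < b < c", "abc"));
    }
}
```

The sketch builds its own RuleBasedCollator from a tiny rule string so that every character in the text is mapped; unmapped characters take the swapOrder path in the method above and produce more than one element.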
 
Example #29
Source File: CollationElementIterator.java    From jdk8u60 with GNU General Public License v2.0
/**
 * Get the next collation element in the string.  <p>This iterator iterates
 * over a sequence of collation elements that were built from the string.
 * Because there isn't necessarily a one-to-one mapping from characters to
 * collation elements, this doesn't mean the same thing as "return the
 * collation element [or ordering priority] of the next character in the
 * string".</p>
 * <p>This function returns the collation element that the iterator is currently
 * pointing to and then updates the internal pointer to point to the next element.
 * previous() updates the pointer first and then returns the element.  This
 * means that when you change direction while iterating (i.e., call next() and
 * then call previous(), or call previous() and then call next()), you'll get
 * back the same element twice.</p>
 *
 * @return the next collation element
 */
public int next()
{
    if (text == null) {
        return NULLORDER;
    }
    NormalizerBase.Mode textMode = text.getMode();
    // convert the owner's mode to something the Normalizer understands
    NormalizerBase.Mode ownerMode =
        CollatorUtilities.toNormalizerMode(owner.getDecomposition());
    if (textMode != ownerMode) {
        text.setMode(ownerMode);
    }

    // if buffer contains any decomposed char values
    // return their strength orders before continuing in
    // the Normalizer's CharacterIterator.
    if (buffer != null) {
        if (expIndex < buffer.length) {
            return strengthOrder(buffer[expIndex++]);
        } else {
            buffer = null;
            expIndex = 0;
        }
    } else if (swapOrder != 0) {
        if (Character.isSupplementaryCodePoint(swapOrder)) {
            char[] chars = Character.toChars(swapOrder);
            swapOrder = chars[1];
            return chars[0] << 16;
        }
        int order = swapOrder << 16;
        swapOrder = 0;
        return order;
    }
    int ch = text.next();

    // are we at the end of Normalizer's text?
    if (ch == NormalizerBase.DONE) {
        return NULLORDER;
    }

    int value = ordering.getUnicodeOrder(ch);
    if (value == RuleBasedCollator.UNMAPPED) {
        swapOrder = ch;
        return UNMAPPEDCHARVALUE;
    }
    else if (value >= RuleBasedCollator.CONTRACTCHARINDEX) {
        value = nextContractChar(ch);
    }
    if (value >= RuleBasedCollator.EXPANDCHARINDEX) {
        buffer = ordering.getExpandValueList(value);
        expIndex = 0;
        value = buffer[expIndex++];
    }

    if (ordering.isSEAsianSwapping()) {
        int consonant;
        if (isThaiPreVowel(ch)) {
            consonant = text.next();
            if (isThaiBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
        if (isLaoPreVowel(ch)) {
            consonant = text.next();
            if (isLaoBaseConsonant(consonant)) {
                buffer = makeReorderedBuffer(consonant, value, buffer, true);
                value = buffer[0];
                expIndex = 1;
            } else if (consonant != NormalizerBase.DONE) {
                text.previous();
            }
        }
    }

    return strengthOrder(value);
}
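
To show how a caller typically drives next(), here is a small sketch that walks a string until NULLORDER is returned and collects the primary order of each element. It goes through the public java.text API only; the class name ElementWalkDemo, the helper primaryOrders, and the rule string are assumptions made for illustration.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;

class ElementWalkDemo {
    // Hypothetical helper: collects the primary order of every collation
    // element that next() produces for the given text.
    static List<Integer> primaryOrders(String rules, String text) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            CollationElementIterator it = collator.getCollationElementIterator(text);
            List<Integer> primaries = new ArrayList<>();
            int element;
            // NULLORDER marks the end of the text (or an empty source string).
            while ((element = it.next()) != CollationElementIterator.NULLORDER) {
                primaries.add(CollationElementIterator.primaryOrder(element));
            }
            return primaries;
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Rules list a, b, c in ascending order, so the primaries should ascend too.
        System.out.println(primaryOrders("< a < b < c", "abc"));
    }
}
```

The rule string deliberately covers every character in the sample text. A collator built this way does not know characters outside its rules, and those would go through the UNMAPPED branch shown above.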
 
Example #30
Source File: CollationElementIterator.java    From openjdk-jdk9 with GNU General Public License v2.0
/**
 * Get the ordering priority of the next contracting character in the
 * string.
 * @param ch the starting character of a contracting character token
 * @return the next contracting character's ordering.  Returns NULLORDER
 * if the end of string is reached.
 */
private int nextContractChar(int ch)
{
    // First get the ordering of this single character,
    // which is always the first element in the list
    Vector<EntryPair> list = ordering.getContractValues(ch);
    EntryPair pair = list.firstElement();
    int order = pair.value;

    // find out the length of the longest contracting character sequence in the list.
    // There's logic in the builder code to make sure the longest sequence is always
    // the last.
    pair = list.lastElement();
    int maxLength = pair.entryName.length();

    // (the Normalizer is cloned here so that the seeking we do in the next loop
    // won't affect our real position in the text)
    NormalizerBase tempText = (NormalizerBase)text.clone();

    // extract the next maxLength characters in the string (we have to do this using the
    // Normalizer to ensure that our offsets correspond to those the rest of the
    // iterator is using) and store it in "fragment".
    tempText.previous();
    key.setLength(0);
    int c = tempText.next();
    while (maxLength > 0 && c != NormalizerBase.DONE) {
        if (Character.isSupplementaryCodePoint(c)) {
            key.append(Character.toChars(c));
            maxLength -= 2;
        } else {
            key.append((char)c);
            --maxLength;
        }
        c = tempText.next();
    }
    String fragment = key.toString();
    // now that we have that fragment, iterate through this list looking for the
    // longest sequence that matches the characters in the actual text.  (maxLength
    // is used here to keep track of the length of the longest sequence)
    // Upon exit from this loop, maxLength will contain the length of the matching
    // sequence and order will contain the collation-element value corresponding
    // to this sequence
    maxLength = 1;
    for (int i = list.size() - 1; i > 0; i--) {
        pair = list.elementAt(i);
        if (!pair.fwd)
            continue;

        if (fragment.startsWith(pair.entryName) && pair.entryName.length()
                > maxLength) {
            maxLength = pair.entryName.length();
            order = pair.value;
        }
    }

    // advance our current iteration position to the end of the matching sequence
    // and return the appropriate collation-element value (if there was no matching
    // sequence, we are already at the right position and order already contains
    // the correct collation-element value for the single character)
    while (maxLength > 1) {
        c = text.next();
        maxLength -= Character.charCount(c);
    }
    return order;
}
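
The longest-match lookup in nextContractChar is what lets a multi-character sequence collate as a single element. The sketch below exercises that path from the outside through the public java.text API. The class name ContractionDemo, the helper methods, and the rule string (which treats "ch" as a contraction sorting between c and d) are illustrative assumptions.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class ContractionDemo {
    // Assumed rule set for this sketch: "ch" is one contracting sequence.
    static final String RULES = "< a < b < c < ch < d < h < i";

    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: counts the collation elements produced for the text.
    static int countElements(String text) {
        CollationElementIterator it = collator().getCollationElementIterator(text);
        int count = 0;
        while (it.next() != CollationElementIterator.NULLORDER) {
            count++;
        }
        return count;
    }

    // Hypothetical helper: compares two strings under the sketch's rules.
    static int compare(String left, String right) {
        return collator().compare(left, right);
    }

    public static void main(String[] args) {
        System.out.println(countElements("ch")); // contraction: one element
        System.out.println(countElements("ca")); // no contraction: two elements
        System.out.println(compare("ch", "ci") > 0);
    }
}
```

Because "ch" is consumed as one element that sorts after plain c, "ch" compares greater than "ci" under these rules even though h precedes i.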