Case insensitive ‘Contains(string)’

3271

Is there a way to make the following return true?

string title = "ASTRINGTOTEST";
title.Contains("string");

There doesn’t seem to be an overload that allows me to set the case sensitivity. Currently I UPPERCASE them both, but that’s just silly (by which I am referring to the i18n issues that come with up- and down-casing).

UPDATE

This question is ancient, and since asking it I have realized that I asked for a simple answer to a really vast and difficult topic, if you care to investigate it fully.

For most cases, in monolingual English code bases, this answer will suffice. I suspect it is the most popular answer because most people coming here fall into this category.

This answer, however, brings up the inherent problem that we can’t compare text case-insensitively until we know that both texts are in the same culture and what that culture is. It is perhaps a less popular answer, but I think it is more correct, and that’s why I marked it as such.

1

  • 1

    try this one: Yourculture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0

    – گلی

    Jul 18, 2021 at 7:34

1548

To test if the string paragraph contains the string word (thanks @QuarterMeister)

culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0

Where culture is the instance of CultureInfo describing the language that the text is written in.
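As a minimal sketch of how the expression above could be wrapped for reuse: the class name CultureContains and the method name ContainsIgnoreCase are hypothetical, chosen for illustration only. The only framework call involved is CompareInfo.IndexOf with CompareOptions.IgnoreCase.

```csharp
using System;
using System.Globalization;

public static class CultureContains
{
    // Hypothetical helper (not part of .NET): returns true when 'paragraph'
    // contains 'word' under the case-insensitive rules of 'culture'.
    public static bool ContainsIgnoreCase(string paragraph, string word, CultureInfo culture)
    {
        if (paragraph == null) throw new ArgumentNullException(nameof(paragraph));
        if (word == null) throw new ArgumentNullException(nameof(word));
        if (culture == null) throw new ArgumentNullException(nameof(culture));

        // IndexOf returns -1 when there is no match; 0 is a valid match position.
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }
}
```

For the strings in the question, CultureContains.ContainsIgnoreCase("ASTRINGTOTEST", "string", CultureInfo.InvariantCulture) should return true.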

This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29 letter-long alphabet. The Turkish upper case version of ‘i’ is the unfamiliar character ‘İ’.

Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand it, one means ‘spirit’ and the other is an onomatopoeic word. (Turks, please correct me if I’m wrong, or suggest a better example.)
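The tin/TIN difference can be checked with a small sketch like the one below. The names CultureCaseDemo, SameIgnoringCase and PrintTinComparison are hypothetical. The outcome depends on the culture data available to the runtime; with full globalization data, the English comparison would be expected to treat the two strings as equal and the Turkish one as different, but no output is shown here because that can vary by platform and configuration.

```csharp
using System;
using System.Globalization;

public static class CultureCaseDemo
{
    // Hypothetical helper: compares two strings, ignoring case, under the
    // rules of the named culture. An empty name selects the invariant culture.
    public static bool SameIgnoringCase(string a, string b, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return string.Compare(a, b, culture, CompareOptions.IgnoreCase) == 0;
    }

    // Prints the result of comparing "tin" and "TIN" for English and Turkish.
    // Assumes the "en-US" and "tr-TR" cultures are available on this machine.
    public static void PrintTinComparison()
    {
        foreach (string name in new[] { "en-US", "tr-TR" })
        {
            Console.WriteLine($"{name}: {SameIgnoringCase("tin", "TIN", name)}");
        }
    }
}
```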

To summarise, you can only answer the question ‘are these two strings the same but in different cases’ if you know what language the text is in. If you don’t know, you’ll have to take a punt. Given English’s hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.

17

  • 78

    Why not culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0? That uses the right culture and is case-insensitive, it doesn’t allocate temporary lowercase strings, and it avoids the question of whether converting to lowercase and comparing is always the same as a case-insensitive comparison.

    Mar 18, 2013 at 15:32

  • 11

    This solution also needlessly pollutes the heap by allocating memory for what should be a searching function

    – JaredPar

    Mar 18, 2013 at 16:09

  • 19

    Comparing with ToLower() will give different results from a case-insensitive IndexOf when two different letters have the same lowercase letter. For example, calling ToLower() on either U+0398 “Greek Capital Letter Theta” or U+03F4 “Greek Capital Letter Theta Symbol” results in U+03B8, “Greek Small Letter Theta”, but the capital letters are considered different. Both solutions consider lowercase letters with the same capital letter different, such as U+0073 “Latin Small Letter S” and U+017F “Latin Small Letter Long S”, so the IndexOf solution seems more consistent.

    Mar 18, 2013 at 17:47

  • 3

    @Quartermeister – and BTW, I believe .NET 2 and .NET 4 behave differently on this, as .NET 4 always uses NORM_LINGUISTIC_CASING while .NET 2 did not (this flag appeared with Windows Vista).

    Mar 23, 2013 at 8:13


  • 12

    Why didn’t you write "ddddfg".IndexOf("Df", StringComparison.OrdinalIgnoreCase) ?

    – Chen

    Aug 23, 2015 at 13:41


256

You can use IndexOf() like this:

string title = "STRING";

if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
    // The string exists in the original
}

Since 0 (zero) can be an index, you check against -1.

MSDN:

"The zero-based index position of value if that string is found, or -1 if it is not. If value is String.Empty, the return value is 0."
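The IndexOf check can also be packaged as an extension method, sketched below. The names StringContainsExtensions and ContainsText are hypothetical and only for illustration; the caller chooses the StringComparison, for example OrdinalIgnoreCase or CurrentCultureIgnoreCase.

```csharp
using System;

public static class StringContainsExtensions
{
    // Hypothetical extension method: wraps IndexOf so callers get a bool
    // and choose the comparison rules explicitly.
    public static bool ContainsText(this string source, string value, StringComparison comparison)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (value == null) throw new ArgumentNullException(nameof(value));

        // -1 means "not found"; 0 is a valid index, so test with >= 0.
        return source.IndexOf(value, comparison) >= 0;
    }
}
```

With this in scope, "ASTRINGTOTEST".ContainsText("string", StringComparison.OrdinalIgnoreCase) should return true. Note that an empty value always matches, because IndexOf returns 0 for String.Empty. Newer runtimes (.NET Core 2.1 and later) also provide a built-in string.Contains(string, StringComparison) overload, which may make a helper like this unnecessary.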
