String comparison in DotNet

There are several ways to compare two strings. Depending on the situation, you may need to take case sensitivity or even culture into account. For example, string comparisons in PostgreSQL are case sensitive by default, whereas SQL Server with its default collation is not.

Here is a common pattern I have seen in many projects for comparing strings without regard to case: the classic ‘convert both strings to lower case and check whether they are the same’.
There is nothing wrong with the code below... or is there?

var first = "This is interesting";
var second = "This Is Interesting";
var result = first.ToLower() == second.ToLower();
Console.WriteLine($"First is equal to second: {result}");
//Output is : First is equal to second: True

We will come to the part where this can go wrong in a later example.
However, there is a better way to do a string comparison that is not case sensitive.

var first = "This is interesting";
var second = "This Is Interesting";
var result = first.Equals(second, StringComparison.OrdinalIgnoreCase);
Console.WriteLine($"First is equal to second: {result}");
//Output is : First is equal to second: True

I find this solution more elegant because the Equals overload scales well, and StringComparison.OrdinalIgnoreCase compares the strings in place instead of first building lower-cased copies the way ToLower() can. Plus, Microsoft recommends using the overloads that take an explicit StringComparison (and since they created the language they must know what they are talking about). Note that Equals(string, StringComparison) is an ordinary instance method overload on string, not an extension method.

Here are some of the reasons why the Equals overload is the better bet and scales well compared to the “ToLower()” variant.

The Turkish ‘İ’ conundrum

Let’s add a line to the first code snippet above.

using System.Globalization;
using System.Threading;

// tr-TR is the culture for Turkish (Turkey); ToLower() uses the current culture
Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

var first = "This is interesting";
var second = "This Is Interesting";

var result = first.ToLower() == second.ToLower();
Console.WriteLine($"First is equal to second: {result}");

What do you think would be the result of this code?
First is equal to second: False

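Wait, what? Why?
This is because Turkish has two distinct letters: a dotted one (lower case ‘i’, upper case ‘İ’) and a dotless one (lower case ‘ı’, upper case ‘I’). Under the tr-TR culture, ToLower() therefore turns the capital ‘I’ in “Is” and “Interesting” into the dotless ‘ı’, not into the English ‘i’, so the two lower-cased strings no longer match.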
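To see the mapping in isolation, here is a minimal sketch that passes the culture explicitly instead of changing the thread's culture. It assumes the runtime has culture data available (that is, it is not running in invariant globalization mode); the variable names are only for illustration.

```csharp
using System;
using System.Globalization;

// Pass the culture explicitly so the result does not depend on machine settings.
var turkish = new CultureInfo("tr-TR");

// Turkish casing rules map 'I' to the dotless 'ı' (U+0131).
string lowerTurkish = "I".ToLower(turkish);

// Invariant casing rules map 'I' to the familiar 'i' (U+0069).
string lowerInvariant = "I".ToLowerInvariant();

// Print the code points so the difference is visible even in fonts
// where the two letters look alike.
Console.WriteLine($"tr-TR:     U+{(int)lowerTurkish[0]:X4}");
Console.WriteLine($"invariant: U+{(int)lowerInvariant[0]:X4}");
```

The same input character ends up as two different characters depending on the culture, which is why a ToLower()-based comparison can change its answer from one machine to another.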

So how do we handle this?

Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

var first = "This is interesting";
var second = "This Is Interesting";

var result = first.Equals(second, StringComparison.OrdinalIgnoreCase);
Console.WriteLine($"First is equal to second: {result}");

Voilà! The Equals overload to the rescue: an ordinal comparison ignores the current culture, so the result is True again.

The German ‘ß’, which looks like an English ‘B’ but is equivalent to ‘ss’

The following example uses the German word for “street”, written with “ss” in one string and with ‘ß’ in the other. Linguistically (in Windows), “ss” is equal to the German Eszett character ‘ß’ in both the “en-US” and “de-DE” cultures. Culture-sensitive results like this come from the platform's globalization data, so they can vary between operating systems and .NET versions.

So when a user provides either “ss” or “ß”, both should be treated as the same. Luckily, there is a StringComparison option for the Equals overload that handles this too. See where I was going when I said that the Equals overload scales well?

var first = "Straße";
var second = "Strasse";

var result = first.ToLower() == second.ToLower();
Console.WriteLine($"First is equal to second with ToLower conversion: {result}");

var firstStreet = "Straße";
var secondStreet = "Strasse";

var resultStreet = firstStreet.Equals(secondStreet, StringComparison.CurrentCultureIgnoreCase);
Console.WriteLine($"First is equal to second with Equals extension: {resultStreet}");

// Result
// First is equal to second with ToLower conversion: False
// First is equal to second with Equals extension: True  

The important thing to note is that we are using the StringComparison.CurrentCultureIgnoreCase variant here and not StringComparison.OrdinalIgnoreCase. An ordinal comparison never treats ‘ß’ and “ss” as equal.
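The difference between the two options can be sketched side by side. This is only an illustration: the culture is passed explicitly through string.Compare so that the snippet does not depend on the machine's current culture, and the culture-aware result is printed rather than assumed, because it comes from the platform's globalization data.

```csharp
using System;
using System.Globalization;

var first = "Straße";
var second = "Strasse";

// Ordinal comparison looks at the individual UTF-16 code units
// (ignoring only case), so 'ß' is never expanded to "ss".
bool ordinal = string.Equals(first, second, StringComparison.OrdinalIgnoreCase);

// Culture-aware comparison with an explicitly chosen culture.
var german = CultureInfo.GetCultureInfo("de-DE");
bool linguistic = string.Compare(first, second, german, CompareOptions.IgnoreCase) == 0;

Console.WriteLine($"OrdinalIgnoreCase: {ordinal}");      // False
Console.WriteLine($"de-DE, IgnoreCase: {linguistic}");   // depends on the platform's culture data
```

If the linguistic result matters for your application, it is worth checking it on the operating system and .NET version you actually deploy to.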

Null??!!


// Variant 1: ToLower()
var first = "Straße";
string second = null;

var result = first.ToLower() == second.ToLower();
Console.WriteLine($"First is equal to second with ToLower conversion: {result}");

// Variant 2: Equals overload (run separately from variant 1)
var firstStreet = "Straße";
string secondStreet = null;

var resultStreet = firstStreet.Equals(secondStreet, StringComparison.OrdinalIgnoreCase);
Console.WriteLine($"First is equal to second with Equals overload: {resultStreet}");

The null is obvious in the code above, but that is not always the case, and sometimes a null check gets missed. So what would be the result here?
The ToLower() variant throws a System.NullReferenceException because we cannot call ToLower() on a null value. The Equals overload, on the other hand, handles a null argument gracefully and simply returns false.
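One caveat: the instance call only protects you when the argument is null. If the string you call Equals on is itself null, you still get a NullReferenceException. A minimal sketch of the static string.Equals overload, which accepts null on either side:

```csharp
using System;

string first = null;
string second = "Strasse";

// The static overload never dereferences its arguments,
// so a null on either side is safe.
bool oneNull = string.Equals(first, second, StringComparison.OrdinalIgnoreCase);

// Two nulls are considered equal.
bool bothNull = string.Equals(first, null, StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"null vs \"Strasse\": {oneNull}");  // False
Console.WriteLine($"null vs null: {bothNull}");        // True
```

When either value may be null, for example with user input or optional fields, the static form saves a separate null check.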

In conclusion, prefer the Equals overload with an explicit StringComparison when comparing strings for ‘non case sensitive’ cases. Most of the time you will get away with the StringComparison.OrdinalIgnoreCase variant, unless you want a culture-specific check; in that case use StringComparison.CurrentCultureIgnoreCase. If you want a case-sensitive comparison, use the CurrentCulture or Ordinal variant (without IgnoreCase).
