
An explanation of implicit and explicit conversion in C#

Given a method signature that expects two nullable DateTime parameters…

protected int CalculateActualDuration(DateTime? startDate, DateTime? endDate)

… SamIAm asked on Stack Overflow about how the compiler understands nullable types:

I am able to call the method by passing in both a DateTime? and a DateTime. So how does the compiler understand the difference?

My interpretation was that the OP was assuming some hidden logic, obfuscated by the compiler, was responsible for allowing both a type and a nullable type. While everything does eventually have to make its way down to the compiler (so yes, it is ultimately handling both types of variables, along with everything else you’ve written), the reason for this particular behavior lies in the Nullable implicit conversion. (I think I was correct in my interpretation, given that he accepted the answer without further comment.)

What is implicit and explicit conversion?

If you’ve programmed in C#, you’ve most likely done both, without even realizing it.

Take the following code, for example. A Decimal is capable of storing any Int32 value, without losing any information about the integer. So we can define an integer, then store it in a decimal without the compiler yelling at us. This conversion from integer to decimal is implicit – we simply assigned one to the other, easy-peasy.

int quantity = 5;
decimal quantity2 = quantity; // no problemo

What if we go the opposite direction?

decimal quantity = 5;
int quantity2 = quantity; // compiler complains, "Cannot convert source type 'decimal' to target type 'int'"

This is disallowed, because we run the risk of losing information about the decimal (most likely the fractional portion, but Decimal can also store a much larger number than Int32). The compiler is saving us from ourselves. We may not realize we’re potentially losing data.

decimal quantity = 5;
int quantity2 = (int)quantity; // whatever, cast away!

While we can’t implicitly convert a Decimal to an Int32, we can still explicitly convert the values (that’s what the “(int)” is doing in the above code), in essence telling the compiler, “Yes, I know the risks, and I’m willing to convert anyway.”
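To make the "risks" concrete, here is a minimal sketch of what the explicit cast actually does with a Decimal. The class and method names (DecimalCastDemo, Truncate) are made up for illustration and are not part of the original question.

```csharp
using System;

// Hypothetical helper, only here to show what (int) does to a decimal.
public static class DecimalCastDemo
{
    public static int Truncate(decimal value)
    {
        // The explicit cast discards the fractional part (rounds toward zero).
        // If the value doesn't fit in an Int32, an OverflowException is thrown.
        return (int)value;
    }
}
```

Calling DecimalCastDemo.Truncate(5.99m) gives 5, and passing decimal.MaxValue throws, which is exactly the kind of surprise the compiler wants you to opt into.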

Why would you do this? One reason might be some old code that you can’t change, which always stores a 5-digit number in an Int64. You know there’s no way storing that number in an Int32 will ever be an issue, but you aren’t about to refactor some code that touches everything. So, you just explicitly cast and move on.
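If you want a safety net for that "it will always fit" assumption, one option is to wrap the cast in checked. This is just a sketch; NarrowingDemo and legacyId are hypothetical names standing in for the old code described above.

```csharp
using System;

// Hypothetical wrapper around a legacy Int64 value that should always fit in an Int32.
public static class NarrowingDemo
{
    public static int ToInt32(long legacyId)
    {
        // 'checked' makes the cast throw an OverflowException instead of
        // silently wrapping around if the assumption ever turns out to be wrong.
        return checked((int)legacyId);
    }
}
```

NarrowingDemo.ToInt32(12345L) returns 12345 as expected, while a value like long.MaxValue fails loudly rather than producing garbage.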

Not having to write decimal quantity = new decimal(5) improves readability. We have no reason to think a Decimal can’t store an Int32, so the conversion just happens. On the other hand, converting from a Decimal to an Int32 may have an unexpected result (loss of data), so we’re forced to explicitly cast in that direction.

What is the implicit conversion operator?

There’s an operator (two closely related ones actually) that we can implement in our own types in order to allow this same behavior, and it’s the implicit (and explicit) operator.

Here’s how MSDN defines the implicit keyword (it sounds a lot like the implicit conversion of Int32 to Decimal up above):

The implicit keyword is used to declare an implicit user-defined type conversion operator. Use it to enable implicit conversions between a user-defined type and another type, if the conversion is guaranteed not to cause a loss of data.

In other words, you use the implicit keyword to define how to convert between your type and some other type, without the consumer of your class having to explicitly do the conversion themselves. They don’t have to worry about the details of how another type is converted to your type. (And since they’re blissfully ignorant and all, don’t screw them over by losing their data, like allowing a decimal to be converted to an integer implicitly, and then just silently dropping the fractional portion.)

A couple ridiculous examples to illustrate

Imagine a class like this, which accepts a name in the constructor:

public class Person
{
    private string _name;
 
    public Person(string name)
    {
        _name = name;
    }
}

You would instantiate the class like this:

Person person = new Person("Bob");

But, you could add an implicit conversion, in this case one that defines how a string (name) is converted to an instance of your Person class.

public class Person
{
    private string _name;
 
    public Person(string name)
    {
        _name = name;
    }
 
    public static implicit operator Person(string name)
    {
        return new Person(name);
    }
}

Now we can do something silly like this:

Person person = "Mary";

Let me just pause a moment here to say that these examples are just for demonstration. For instance, you’d be much better off here with an empty ctor and an AutoProperty.

public class Person
{
    public string Name { get; set; }
}
 
...
...
 
Person p = new Person { Name = "Mary" };

Anywho! Next, let’s assume a Birthday class that:

  • Can take a birth date, and implicitly convert to a Birthday, losing no data
  • Can convert back to a DateTime implicitly, again losing no data
  • Can also convert back to an Int32 to represent the current age in years, but only explicitly since data about the Birthday will be lost.

public class Birthday
{
    private readonly DateTime _birthday;
 
    public Birthday() { }
 
    private Birthday(DateTime birthday)
    {
        _birthday = birthday;
    }
 
    public static implicit operator Birthday(DateTime birthday)
    {
        return new Birthday(birthday);
    }
 
    public static implicit operator DateTime(Birthday birthday)
    {
        return birthday._birthday;
    }
 
    public static explicit operator int(Birthday birthday)
    {
        TimeSpan exactAge = DateTime.Now - birthday._birthday;
 
        // Approximate age in whole years; dividing by 365 ignores leap days
        return (int)(exactAge.TotalDays / 365);
    }
}

The consumer of the above class can convert it to and from a DateTime without thinking about it. Converting to an Int32 loses a portion of the information that makes up a “Birthday”, so I’ve made a judgement call and decided the consumer must explicitly cast it, to show they acknowledge the potential loss and accept it.

Birthday birthday1 = new Birthday();            // standard instantiation, nothing special here!
 
Birthday birthday2 = new DateTime(1970, 6, 2);  // implicit conversion of DateTime to Birthday
DateTime birthdate = birthday2;                 // implicit conversion of Birthday to DateTime
 
int age = (int)birthday2;                       // explicit conversion of Birthday to Int32

Here’s one more example, using a new (awesome!) number we’ve created. It can’t represent anything larger or more precise than an Int32.

public struct AwesomeNumber
{
    private readonly int _number;
 
    private AwesomeNumber(int number)
    {
        _number = number;
    }
 
    public static implicit operator AwesomeNumber(int number)
    {
        return new AwesomeNumber(number);
    }
 
    public static implicit operator AwesomeNumber(short number)
    {
        return new AwesomeNumber(number);
    }
 
    public static explicit operator AwesomeNumber(double number)
    {
        return new AwesomeNumber((int)number);
    }
 
    public static explicit operator AwesomeNumber(decimal number)
    {
        return new AwesomeNumber((int)number);
    }
 
    public static implicit operator int(AwesomeNumber number)
    {
        return number._number;
    }
}

Since it can’t represent anything larger or more precise than an Int32, it only allows integers and shorts to be implicitly converted, while doubles and decimals must be explicitly cast before conversion (to acknowledge that data will be lost).

short s = 5;
AwesomeNumber number1 = s;                    // implicit short
AwesomeNumber number2 = 5;                    // implicit integer
 
AwesomeNumber number3 = (AwesomeNumber)2.0;   // explicit double
AwesomeNumber number4 = (AwesomeNumber)5.0m;  // explicit decimal

Microsoft recommends only using the implicit keyword in instances where data will not be lost:

By eliminating unnecessary casts, implicit conversions can improve source code readability. However, because implicit conversions do not require programmers to explicitly cast from one type to the other, care must be taken to prevent unexpected results. In general, implicit conversion operators should never throw exceptions and never lose information so that they can be used safely without the programmer’s awareness. If a conversion operator cannot meet those criteria, it should be marked explicit. For more information, see Using Conversion Operators.

You wouldn’t want to have unwelcome side-effects like this, where a consumer is led to think that two types are equivalent, when in reality some data loss is occurring.

public struct NotSoAwesomeNumber
{
    private readonly int _number;
 
    private NotSoAwesomeNumber(int number)
    {
        _number = number;
    }
 
    public static implicit operator NotSoAwesomeNumber(double number)
    {
        // Fail: Consumer thought they stored a double,
        //       but internally we silently dropped the fractional portion
        return new NotSoAwesomeNumber((int)number);
    }
 
    public static implicit operator double(NotSoAwesomeNumber number)
    {
        return number._number;
    }
}

At runtime, unbeknownst to the caller, their value is converted to an Int32, and is returned with the fractional part lost forever.

NotSoAwesomeNumber notSoAwesomeNumber = 5.3;
 
double myOrigValue = notSoAwesomeNumber;  // Uh-oh, only get 5.0 back

One other caveat: there’s no IntelliSense when using a conversion operator, so any “summary” comment you add above the operator will go unnoticed and unread by the consumer. So if you use one, it had better be really obvious what values are allowed in there.

What is the Nullable<T> Implementation of Implicit?

Going back to the original SO question again, SamIAm posted this snippet of code:

protected int CalculateActualDuration(DateTime? startDate, DateTime? endDate)
{
    if (startDate.HasValue && endDate.HasValue)
    {
        return Math.Abs((int)(endDate.Value.Subtract(startDate.Value).TotalMinutes));
    }
    else
    {
        return 0;
    }
}

And then asked:

I am able to call the method by passing in both a DateTime? and a DateTime. So how does the compiler understand the difference?

The explanation can be found in the source code for the implicit operator in the Nullable<T> struct (only the parts relevant to this discussion are shown):

public Nullable(T value)
{
    this.value = value;
    this.hasValue = true;
}
 
public static implicit operator Nullable<T>(T value)
{
    return new Nullable<T>(value);
}
 
public static explicit operator T(Nullable<T> value)
{
    return value.Value;
}

So taking from what we now know about these operators, we see two conversions available:

  • Implicitly convert a regular, non-null instance of DateTime (or any other value type) into a new Nullable<T>. (This stores the value in “value” and sets the “hasValue” flag to “true”.)

  • Explicitly convert an instance of DateTime? (or any other nullable type) into a regular type T. (This throws an InvalidOperationException if HasValue is false, i.e. the nullable holds no value, which is why the person using it is forced to explicitly cast.)
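The two conversions above can be sketched in a few lines. NullableDemo, Wrap and Unwrap are names I made up for the example; only the conversions themselves come from Nullable<T>.

```csharp
using System;

// Hypothetical helpers that exercise both Nullable<T> conversions.
public static class NullableDemo
{
    public static DateTime? Wrap(DateTime value)
    {
        // Implicit T -> Nullable<T>: no cast needed, nothing can go wrong.
        return value;
    }

    public static DateTime Unwrap(DateTime? value)
    {
        // Explicit Nullable<T> -> T: throws InvalidOperationException
        // when the nullable has no value.
        return (DateTime)value;
    }
}
```

Wrapping always succeeds and yields a nullable with HasValue set to true. Unwrapping a null DateTime? throws, which is the risk the explicit cast makes you acknowledge.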

The Nullable Implicit Conversion (T to Nullable) page on MSDN reinforces this:

If the value parameter is not null, the Value property of the new Nullable value is initialized to the value parameter and the HasValue property is initialized to true.

All of this means that one can call the CalculateActualDuration method with either a DateTime? or a DateTime because the DateTime will be implicitly converted to a DateTime?, and the rest of the method operates on that Nullable instance.
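Here is a small sketch of that call in action. The method body is the one from the question, but I have made it public and static so it can be called without a containing instance; the DurationDemo class, the Run method and the sample dates are assumptions for the example.

```csharp
using System;

public static class DurationDemo
{
    // Same logic as the method in the question, made public static for this sketch.
    public static int CalculateActualDuration(DateTime? startDate, DateTime? endDate)
    {
        if (startDate.HasValue && endDate.HasValue)
        {
            return Math.Abs((int)(endDate.Value.Subtract(startDate.Value).TotalMinutes));
        }

        return 0;
    }

    public static int Run()
    {
        DateTime start = new DateTime(2014, 1, 1, 9, 0, 0);     // plain DateTime
        DateTime? end = new DateTime(2014, 1, 1, 10, 30, 0);    // nullable DateTime

        // 'start' is implicitly converted to DateTime? at the call site.
        return CalculateActualDuration(start, end);
    }
}
```

Both arguments arrive as DateTime? inside the method, so Run() returns the 90 minutes between the two sample times, and passing null for either argument returns 0.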

But… he asked about the compiler!

I enjoy looking at .NET source code and writing about what it’s doing. I know nothing about compilers except… you know… they compile things. Fortunately, the community has a resident expert in compilers!

Eric Lippert, compiler guru and formerly of Microsoft, made the following comment (I won’t even try to elaborate, but just cite it for the sake of completeness):

The answer to your final question is “yes”. Think of a nullable DateTime as a normal DateTime with a Boolean glued to it that indicates its nullity. Now is it clear how the compiler could deal with converting a normal DateTime to a nullable DateTime?


Grant Winney
