Yet another attempt to refactor libstd's Duration


#1

At the moment the Duration type looks like

pub struct Duration {
    secs: i64,
    nanos: i32
}

According to this, one would expect the maximum possible duration to be at least i64::MAX seconds. In fact, the full range that i64 seconds gives us is restricted to i64::MAX milliseconds (see #16626 for more details).
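
To make the restriction concrete, here is a minimal sketch of why a full i64 of seconds cannot be reported as an i64 of milliseconds. The helper name secs_to_millis is made up for illustration and is not part of libstd.

```rust
/// Hypothetical helper (not a libstd function): convert whole seconds to
/// milliseconds, returning None when the result does not fit in an i64.
fn secs_to_millis(secs: i64) -> Option<i64> {
    secs.checked_mul(1_000)
}

fn main() {
    // The full i64 range of seconds overflows once expressed in milliseconds.
    assert_eq!(secs_to_millis(i64::MAX), None);

    // The largest seconds value that still converts is i64::MAX / 1000.
    let max_secs = i64::MAX / 1_000;
    assert!(secs_to_millis(max_secs).is_some());
    assert_eq!(secs_to_millis(max_secs + 1), None);

    println!("largest safe seconds value: {}", max_secs);
}
```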

I propose to change the internal representation of Duration to

pub struct Duration {
    ticks: i64, // A single tick represents 100 nanoseconds
    nanos: i8,  // to hold values from -99 to 99
}

Some pros:

  1. Internals of struct Duration are not exposed, so we are free to change them.
  2. New representation is 3 bytes smaller.
  3. Methods like num_milliseconds etc will return correct i64 values, see #16626.
  4. TimeSpan in .NET and Duration in Joda-Time types do the same.

Some cons:

  1. The overall range of values for Duration will be shorter:
  • i64::MAX seconds gives us 106751991167300 days. We don’t need it.
  • i64::MAX milliseconds gives us 106751991167 days. This is our current choice.
  • i64::MAX ticks gives us 10675199 days. The range is still long enough, who cares?
  2. ?
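
The day counts in the list above can be reproduced with a few lines of integer arithmetic. This is only a sketch; the constant and function names are mine.

```rust
const SECS_PER_DAY: i64 = 86_400;

/// Whole days covered by i64::MAX units, given how many units make up
/// one second (1 for seconds, 1_000 for milliseconds, 10_000_000 for
/// 100 ns ticks).
fn max_days(units_per_sec: i64) -> i64 {
    i64::MAX / units_per_sec / SECS_PER_DAY
}

fn main() {
    println!("seconds:      {} days", max_days(1));
    println!("milliseconds: {} days", max_days(1_000));
    println!("100 ns ticks: {} days", max_days(10_000_000));
}
```

For scale, 10675199 days is a bit over 29,000 years in each direction.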

#2

If JodaTime does it, we can probably do it safely too.


#3

Then I’m going to do it and make a pull request. Right? Or should I wait for a while? I am a newbie here and don’t know if this is the right way to contribute to Rust.


#4

I think internal representation should reflect source data as much as possible so as to not introduce unnecessary transformations. The second property of a good internal representation is low bit waste. It may seem trivial in an isolated case but imagine large vectors of these and the possibly wasted cache space.

These two goals often collide. If so, go for fewer transformations. You can then transform to whatever representation suits your application’s needs.

Also, don’t mix API and implementation. It’s better to provide interface functions that transform the data to your preferred form than to have leaky abstractions.


#5

You are absolutely right!

At the moment the first property (secs) overflows i64 in methods like num_milliseconds. Therefore we restrict it to hold only i64::MAX milliseconds:

pub const MAX: Duration = Duration {
  secs: i64::MAX / MILLIS_PER_SEC,
  ...

That’s a waste of 10 bits. I don’t like it.

I also don’t like the second property (nanos) because I don’t think that we need nanosecond-precision durations. Without nanos, the duration type behaves like a plain i64, which means it compiles to code that runs faster, etc.

TimeSpan (.NET) and Duration (Joda-Time) both don’t need nanos. But according to what I see, we do!


#6

I don’t have a strong opinion about this, but I will say that being 3 bytes smaller isn’t much of a win due to the alignment rules, which will wind up rounding the size to the same thing either way.


#7

3 bytes more or 3 bytes less … Remember that we are dealing with a dead simple type like duration. It’s not very useful by itself, but it will be the base for more complex types. The more efficient it is, the better.

And the best choice we could make, IMHO, is to define the duration type like this:

pub struct Duration {
    ticks: i64, // A single tick represents 100 nanoseconds
}

It’ll fit the alignment rules and also give us bare-metal performance. But there won’t be nanos. Others don’t need them; I don’t understand why we do!
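
A rough sketch of what the ticks-only type could look like. The method names mirror the existing num_* accessors, but the checked constructors returning Option are my assumption, not a worked-out API.

```rust
const TICKS_PER_MILLI: i64 = 10_000; // 1 ms = 10_000 ticks of 100 ns
const TICKS_PER_SEC: i64 = 10_000_000;

/// Sketch of a ticks-only duration; one tick is 100 nanoseconds.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Duration {
    ticks: i64,
}

impl Duration {
    /// Assumed constructor: None if the value does not fit in i64 ticks.
    pub fn seconds(secs: i64) -> Option<Duration> {
        secs.checked_mul(TICKS_PER_SEC).map(|ticks| Duration { ticks })
    }

    /// Assumed constructor: None if the value does not fit in i64 ticks.
    pub fn milliseconds(millis: i64) -> Option<Duration> {
        millis.checked_mul(TICKS_PER_MILLI).map(|ticks| Duration { ticks })
    }

    /// Dividing by a positive constant cannot overflow, so no Option here.
    pub fn num_milliseconds(&self) -> i64 {
        self.ticks / TICKS_PER_MILLI
    }

    /// Same reasoning as num_milliseconds: the result always fits.
    pub fn num_seconds(&self) -> i64 {
        self.ticks / TICKS_PER_SEC
    }
}

fn main() {
    let d = Duration::milliseconds(1_500).unwrap();
    println!("{} s, {} ms", d.num_seconds(), d.num_milliseconds());
    println!("size: {} bytes", std::mem::size_of::<Duration>());
}
```

The cost moves to construction: a value that does not fit is rejected up front instead of overflowing later in an accessor.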


#8

I hope the new version addresses this issue:


#9

If Rust retrieves its date using a C-like Timespec, then that is one thing that sets it apart from the Java/C# libs.

Though I think the Timespec itself is extracted from some other primitive. I could be wrong.


#10

Nanosecond precision might be most useful for reasonably short durations. We could encode very long durations (at least 2^62 ns) with lower precision, in microseconds or milliseconds.


#11

On Windows, FILETIME uses 100-nanosecond intervals. http://msdn.microsoft.com/en-us/library/windows/desktop/ms724284.aspx


#12

In this case, it makes literally no difference. Both will have exactly the same size in memory:

struct Old {
    secs: i64,
    nanos: i32,
}
struct New {
    ticks: i64,
    nanos: i8,
}
fn main() {
    println!("old: {}, new: {}",
             std::mem::size_of::<Old>(),
             std::mem::size_of::<New>());
}

prints old: 16, new: 16 for me (I imagine it will be 12 for both on a 32-bit computer).


#13

I don’t care much about the three wasted bytes as long as I can use other functions to acquire just the milliseconds or nanos when the full range isn’t important. Usually one needs only one of the two.

For completeness, I think it’s awesome to have a standard API with a data type that can represent this big range in precision.


#14

I found one more reason why ticks: i64, nanos: i8 (or even ticks: i64) is less appropriate: the num_microseconds method would always return a valid i64 value. I mean we would have to change the method signature from:

pub fn num_microseconds(&self) -> Option<i64>

to

pub fn num_microseconds(&self) -> i64

But that might be a compatibility issue. Or we leave it as is, and then it’ll be inconsistent.
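
To illustrate the difference, here is a sketch of the two computations as free functions. The function and constant names are invented for the comparison and do not match the actual libstd code.

```rust
const MICROS_PER_SEC: i64 = 1_000_000;
const NANOS_PER_MICRO: i32 = 1_000;
const TICKS_PER_MICRO: i64 = 10; // 1 microsecond = 10 ticks of 100 ns

/// secs + nanos layout: the multiplication can overflow, hence Option.
fn num_microseconds_from_secs(secs: i64, nanos: i32) -> Option<i64> {
    let whole = secs.checked_mul(MICROS_PER_SEC)?;
    whole.checked_add((nanos / NANOS_PER_MICRO) as i64)
}

/// ticks layout: only a division by a positive constant, so it always fits.
fn num_microseconds_from_ticks(ticks: i64) -> i64 {
    ticks / TICKS_PER_MICRO
}

fn main() {
    // Even the current MAX (i64::MAX / 1000 seconds) is too large in microseconds.
    println!("{:?}", num_microseconds_from_secs(i64::MAX / 1_000, 0));
    // The largest tick count still converts without any check.
    println!("{}", num_microseconds_from_ticks(i64::MAX));
}
```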


#15

Incompatibility & breaking code is absolutely not a problem before 1.0.


#16

I see!

Then I’d say that we need someone who knows: Do we really need nanoseconds?

If the answer is yes, then the most appropriate representation, IMHO, is:

pub struct Duration {
    millis: i64, // Milliseconds
    nanos:  i32, // Nanoseconds, |nanos| < NANOS_PER_MILLI
}

If the answer is no, then I vote for:

pub struct Duration {
    pub ticks: i64, // A single tick represents 100 nanoseconds
}
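
For the first of the two layouts above (millis + nanos), here is a small sketch of how the fields could be kept consistent. It assumes both fields always carry the same sign, which the comment on nanos does not spell out, and the constructor name is my own.

```rust
const NANOS_PER_MILLI: i64 = 1_000_000;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Duration {
    millis: i64, // Milliseconds
    nanos: i32,  // Nanoseconds, |nanos| < NANOS_PER_MILLI
}

impl Duration {
    /// Assumed constructor: split a nanosecond count into the two fields.
    /// Rust's / and % truncate toward zero, so both parts share a sign.
    pub fn nanoseconds(total: i64) -> Duration {
        Duration {
            millis: total / NANOS_PER_MILLI,
            nanos: (total % NANOS_PER_MILLI) as i32,
        }
    }

    /// Always valid: the milliseconds are stored directly.
    pub fn num_milliseconds(&self) -> i64 {
        self.millis
    }

    /// Still needs Option: most millisecond values overflow as nanoseconds.
    pub fn num_nanoseconds(&self) -> Option<i64> {
        self.millis
            .checked_mul(NANOS_PER_MILLI)?
            .checked_add(self.nanos as i64)
    }
}

fn main() {
    let d = Duration::nanoseconds(-1_500_000);
    println!("{:?}", d);
    println!("{} ms, {:?} ns", d.num_milliseconds(), d.num_nanoseconds());
}
```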

Rust 1.0 is not far away, so let’s make a choice!


#17

And, when there is no answer - the first representation (millis: i64, nanos: i32) wins.


#18

Just an idea. Is there any reason we wouldn’t want a variable-precision Duration class, like C++'s std::chrono::duration? The equivalent of std::ratio can be done with a trait and static methods:

trait Ratio {
    fn num() -> i64;
    fn denom() -> i64;
}

Each ratio would be defined by a (zero-sized) type implementing Ratio:

struct Nano;
impl Ratio for Nano {
    fn num() -> i64 { 1 }
    fn denom() -> i64 { 1000000000 }
}

Duration could then be written like this:

struct Duration<T: Signed+Bounded, P: Ratio> {
    ticks: T
}

where T is the type the user wants to use and P the period that spans each tick. T would probably be i64 in most cases, but smaller-sized integers and floats have valid use cases, I think.

The tricky part would be implementing with_period<N: Ratio>(&self) -> Duration<T, N>, a cast function akin to C++'s duration_cast for converting between duration types, which would have to deal with the type conversion and precision issues.

It’s a big change, but I think I should propose this before 1.0.

Quick edit: Now that I think of it, wouldn’t calling P::num() require UFCS to be implemented first? We could work around it by using a zero-sized field with an instance of the type and having Ratio take a &self parameter, but I don’t like having to hack around the language, even less for the std…
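
For what it’s worth, here is a sketch of how this idea can be written in present-day Rust, where P::num() can be called directly on a type parameter. Several things are my simplifications rather than part of the proposal: T is fixed to i64, a PhantomData field carries the otherwise unused period parameter, Milli is an extra example ratio, and with_period returns an Option and truncates toward zero.

```rust
use std::marker::PhantomData;

trait Ratio {
    fn num() -> i64;
    fn denom() -> i64;
}

struct Milli;
impl Ratio for Milli {
    fn num() -> i64 { 1 }
    fn denom() -> i64 { 1_000 }
}

struct Nano;
impl Ratio for Nano {
    fn num() -> i64 { 1 }
    fn denom() -> i64 { 1_000_000_000 }
}

/// One tick lasts P::num() / P::denom() seconds.
struct Duration<P: Ratio> {
    ticks: i64,
    _period: PhantomData<P>, // zero-sized marker for the period type
}

impl<P: Ratio> Duration<P> {
    fn new(ticks: i64) -> Self {
        Duration { ticks, _period: PhantomData }
    }

    /// Convert to period N, truncating toward zero. Returns None when the
    /// result does not fit in i64 or a ratio is degenerate (zero divisor).
    fn with_period<N: Ratio>(&self) -> Option<Duration<N>> {
        // new_ticks = ticks * (P::num / P::denom) / (N::num / N::denom)
        let numer = (self.ticks as i128)
            .checked_mul(P::num() as i128)?
            .checked_mul(N::denom() as i128)?;
        let denom = (P::denom() as i128).checked_mul(N::num() as i128)?;
        let ticks = numer.checked_div(denom)?;
        if ticks < i64::MIN as i128 || ticks > i64::MAX as i128 {
            return None;
        }
        Some(Duration::<N>::new(ticks as i64))
    }
}

fn main() {
    let d = Duration::<Milli>::new(1_500);
    let n = d.with_period::<Nano>().expect("fits in i64");
    println!("{} ms = {} ns", d.ticks, n.ticks);
}
```

Doing the intermediate arithmetic in i128 is one way to handle the precision issue; a generic T would need a different strategy.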


#19

Whoops, there goes libtime