
Oct 12 2012

How to manage the overflow of millis()

millis() is a native function of the Arduino core that returns the number of milliseconds (hence the name) elapsed since the start of the sketch. This function is based on a 32-bit register (a variable) that is constantly updated by timer 0. Because the register is a 32-bit variable, it can contain a value up to 2^32-1, that is 4,294,967,295. This number looks very big, and it is usually big enough for sketches that only run for a while. But if your board stays powered all the time, this register will reach the maximum value that it can contain. And then? It will overflow: in programming, this means that its value has become bigger than the biggest value that the variable can contain. This happens after 49.7 days (4,294,967,295 ms are 1,193.05 hours, or 49.7 days). The register then rolls over and starts again from 0, with all the problems that this can cause if your sketch measures elapsed time using millis().

Luckily there’s a way to manage the overflow of millis() so that your sketches can continue to run without issues.
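The 49.7-day figure can be checked with a few lines of arithmetic. The sketch below is only an illustration: the names MAX_MILLIS, millisToHours, millisToDays and checkRolloverPeriod are made up for this example, and uint32_t stands in for the 32-bit unsigned long of the AVR boards.

```cpp
#include <assert.h>
#include <stdint.h>

// Largest value a 32-bit unsigned counter can hold: 2^32 - 1.
// (MAX_MILLIS is a name invented for this example.)
const uint32_t MAX_MILLIS = 4294967295UL;

// Convert a millisecond count to hours and to days.
double millisToHours(uint32_t ms) { return ms / 1000.0 / 3600.0; }
double millisToDays(uint32_t ms)  { return millisToHours(ms) / 24.0; }

// Restates the figures from the text as checks.
void checkRolloverPeriod() {
    assert(millisToHours(MAX_MILLIS) > 1193.04 && millisToHours(MAX_MILLIS) < 1193.05);
    assert(millisToDays(MAX_MILLIS) > 49.7 && millisToDays(MAX_MILLIS) < 49.8);
}
```

Calling checkRolloverPeriod() from main() on a PC is enough to confirm the numbers.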

To manage the overflow we first need to understand how numbers are stored in memory. Variables are stored in pieces of memory of fixed size. On the 8-bit AVR boards, a char or byte type occupies 1 byte of memory (8 bits), an int type occupies 2 bytes (16 bits), and a long type occupies 4 bytes (32 bits). A 32-bit variable can contain an unsigned value from 0 to 2^32-1. And the negative numbers? They are stored in the same number of bits, but the most significant bit carries the sign; whether that bit is read as a sign or as part of the value depends on the type the variable was declared with, signed or unsigned. A long is a signed type that can store a value from -2,147,483,648 to +2,147,483,647, while an unsigned long can store a value from 0 to +4,294,967,295. Because a signed long gives up bit #32 for the sign, only 31 bits are left for the magnitude: 2^31 is in fact 2,147,483,648.

Signed values are stored in memory with a method known as Two’s Complement: to get a negative number, all the bits of the corresponding positive number are inverted (the result of a NOT operation), and then 1 is added to the result. For example, let’s take a signed char, that is a signed 8-bit value, and store the value 1 in it:

00000001

The value -1, instead, is stored like this:

NOT of 00000001 -> 11111110, then 11111110 + 00000001 = 11111111

Now, if we have a variable that contains 0 and we subtract 1, the result is -1. Using the binary representation and a signed char to store the result, we get the following:

00000000 - 00000001 = 11111111

Let’s do another operation: let’s take a variable containing -1 and add 1. The result will be 0. This is the binary representation:

11111111 + 00000001 = 100000000 -> 00000000

We can see that the result is a 9-bit number, but we are using an 8-bit variable, so the 9th bit is dropped and the final result is given by the 8 least significant bits, that is the value 0. This suggests a way to manage the overflow of millis().
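The two operations above can be reproduced in code. This is a minimal sketch, assuming the fixed-width types from stdint.h; negate8, add8 and checkTwosComplement are names invented for this example. The arithmetic is done on unsigned bytes, where dropping the 9th bit is well defined, and the result is then read back as a signed byte.

```cpp
#include <assert.h>
#include <stdint.h>

// Two's complement negation: invert every bit, then add 1.
// Only the 8 least significant bits are kept.
uint8_t negate8(uint8_t value) {
    return (uint8_t)(~value + 1);
}

// Add two 8-bit values and keep only the 8 least significant bits,
// which is what storing the sum in an 8-bit variable does.
uint8_t add8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

// Restates the binary examples from the text as checks.
void checkTwosComplement() {
    assert(negate8(0x01) == 0xFF);           // 00000001 -> 11111111
    assert((int8_t)negate8(0x01) == -1);     // same bits read as signed
    assert(add8(0xFF, 0x01) == 0x00);        // 9th bit is dropped
}
```

The same bit pattern, 11111111, is 255 when read as unsigned and -1 when read as signed; only the declared type changes its meaning.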

We have said before that millis() returns the value of a 32-bit register holding an unsigned number that can go up to 4,294,967,295. Usually we schedule time-based operations with millis() by adding a value to the number returned by millis(), e.g.:

unsigned long previousInterval;

void setup() {
    previousInterval = millis() + 1000;
}
void loop() {
    if (millis() > previousInterval) {
        //code to execute
        previousInterval += 1000;
    }
}

Let’s examine the case in which the execution has entered the if block, with millis() returning 4,294,967,001 and previousInterval equal to 4,294,967,000: the test is true and the code block is executed. At the end of the block, we add 1,000 to previousInterval (another unsigned long): this should result in a value of 4,294,968,000, but this is a 33-bit number and it cannot be stored inside a 32-bit variable. So it is truncated and 704 is stored (4,294,968,000 - 4,294,967,296). The next time we run the test, we check this situation:

4,294,967,001 > 704

The result will be true. So the code block will be (wrongly) executed another time, and again on every following pass, until millis() rolls over too and the value it returns is no longer greater than previousInterval. There is another case that can happen. Let’s look at the following code:
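The runaway behaviour of the first example can be reproduced on a PC by replacing millis() with a counter. This is only a simulation under stated assumptions: countRunsAdditive and checkAdditiveDeadline are hypothetical names, the counter advances one millisecond per pass, and uint32_t stands in for the 32-bit unsigned long.

```cpp
#include <assert.h>
#include <stdint.h>

// Simulates the loop() of the first example for `steps` consecutive
// millisecond ticks, starting with millis() == startMillis.
// Returns how many times the "code to execute" block would run.
uint32_t countRunsAdditive(uint32_t startMillis, uint32_t deadline, uint32_t steps) {
    uint32_t runs = 0;
    uint32_t now = startMillis;          // stand-in for millis()
    for (uint32_t i = 0; i < steps; ++i, ++now) {
        if (now > deadline) {            // the overflow-prone test
            ++runs;
            deadline += 1000;            // wraps when it passes 2^32 - 1
        }
    }
    return runs;
}

// Restates the case from the text as checks.
void checkAdditiveDeadline() {
    // Far from the rollover the block runs once in 10 ms, as intended.
    assert(countRunsAdditive(5001, 5000, 10) == 1);
    // Near the rollover the deadline wraps to 704 and the block
    // runs on every single pass.
    assert(countRunsAdditive(4294967001UL, 4294967000UL, 10) == 10);
}
```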

unsigned long previousInterval;

void setup() {
    previousInterval = millis();
}
void loop() {
    if ((millis() + 1000) > previousInterval) {
        //code to execute
        previousInterval = millis();
    }
}

Let’s say that at a certain moment millis() returns 4,294,966,001 and that previousInterval is 4,294,967,000: the test is true because (4,294,966,001 + 1,000) > 4,294,967,000. At the end of the block, we assign to previousInterval the value returned by millis(): let’s say that it is now equal to 4,294,967,000 because our code has executed several heavy instructions and has run for a while. The next time we execute the test we find the following situation:

(4,294,967,001 + 1,000) > 4,294,967,000

But 4,294,967,001 + 1,000 should give 4,294,968,001, which is bigger than the maximum value that we can store in a 32-bit variable. So it is truncated and only the 32 least significant bits are stored, giving 705. Now the test becomes:

705 > 4,294,967,000

Of course, it will be false. And it will stay false for 49.7 days, until millis() reaches the value of 4,294,966,001 again, so that after adding 1,000 the test becomes 4,294,967,001 > 4,294,967,000 and is true again.
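The second case can be checked the same way by isolating the test in a function. The name additiveTest is invented for this sketch, the 1000 ms offset is hard-coded as in the example, and uint32_t is assumed to behave like the AVR unsigned long.

```cpp
#include <assert.h>
#include <stdint.h>

// The test from the second example: the sum is truncated to 32 bits
// before the comparison, exactly as it is on the board.
bool additiveTest(uint32_t now, uint32_t previous) {
    return (uint32_t)(now + 1000) > previous;
}

// Restates the numbers from the text as checks.
void checkAdditiveTest() {
    assert(additiveTest(4294966001UL, 4294967000UL));   // 4,294,967,001 > 4,294,967,000
    assert(!additiveTest(4294967001UL, 4294967000UL));  // sum wraps to 705
    assert(!additiveTest(0, 4294967000UL));             // still false after the rollover
}
```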

Solution #1

Now that we have understood how the overflow of millis() can alter the execution of our programs, and learned how negative numbers are stored in memory, we can use them in our conditional tests. To do that we force the compiler to convert a value from one type into another (this is called “casting”). We’ve seen that -1 in an 8-bit variable is represented as 11111111. And what about a big number like the unsigned long 4,294,967,000? It is represented as:

11111111 11111111 11111110 11011000

This representation is the same if we use a signed long variable: it’s only its meaning for the compiler that changes between the two types. In fact, read as a signed long, this number stands for -296.
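A short check of this reinterpretation, assuming the usual two's complement behaviour of the conversion (what avr-gcc and the common desktop compilers do); asSigned and checkReinterpretation are names made up for this example.

```cpp
#include <assert.h>
#include <stdint.h>

// Reads a 32-bit unsigned value as a signed one without changing its bits.
int32_t asSigned(uint32_t value) {
    return (int32_t)value;
}

// Restates the example from the text as checks.
void checkReinterpretation() {
    // 11111111 11111111 11111110 11011000 in hexadecimal is 0xFFFFFED8.
    assert(4294967000UL == 0xFFFFFED8UL);
    assert(asSigned(4294967000UL) == -296);
    assert(asSigned(4294967295UL) == -1);
}
```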

Now we’ve found the trick! We just have to change the conditional test so that it checks whether the difference between the current value of millis() and interval, cast to a signed long, is greater than or equal to 0. While the difference is negative, millis() has not reached interval yet; this holds even when interval has already rolled over and restarted from a small value while millis() is still close to its maximum. Here is an example that isn’t affected by the overflow issue:

unsigned long interval;

void setup() {
    interval = millis() + 1000;
}
void loop() {
    if ((long)(millis() - interval) >= 0) {
        //code to execute
        interval += 1000;
    }
}

The result of millis() - interval (converted into a signed long) is negative whenever interval is ahead of millis(), and this is still the case right after interval overflows and rolls over: the difference stays negative until millis() overflows as well and its value passes that of interval. At that moment the difference goes back to 0 or becomes positive, so we know that millis() has passed interval again.
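The signed-difference test can be tried on a PC by simulating the counter across the rollover. This is a sketch under assumptions: deadlineReached, countRunsSigned and checkSignedDifference are hypothetical names, the period is fixed at 1000 ms, and the trick only works while millis() and the deadline are less than 2^31 ms (about 24.8 days) apart.

```cpp
#include <assert.h>
#include <stdint.h>

// True once `now` has reached or passed `deadline`, provided the two
// values are less than 2^31 ms apart.
bool deadlineReached(uint32_t now, uint32_t deadline) {
    return (int32_t)(now - deadline) >= 0;
}

// Simulates loop() for `steps` millisecond ticks starting at startMillis
// and returns how often the "code to execute" block would run.
uint32_t countRunsSigned(uint32_t startMillis, uint32_t steps) {
    uint32_t runs = 0;
    uint32_t now = startMillis;               // stand-in for millis()
    uint32_t deadline = startMillis + 1000;   // may already wrap
    for (uint32_t i = 0; i < steps; ++i, ++now) {
        if (deadlineReached(now, deadline)) {
            ++runs;
            deadline += 1000;
        }
    }
    return runs;
}

void checkSignedDifference() {
    // The case that broke the first example: deadline wrapped to 704.
    assert(!deadlineReached(4294967001UL, 704));
    assert(deadlineReached(704, 704));
    // Start 1000 ms before the rollover and run for 3001 ms:
    // the block fires at 0, 1000 and 2000 on the wrapped counter.
    assert(countRunsSigned(4294966296UL, 3001) == 3);
}
```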

To be sure that the code runs properly, we can wait for 49 days or, better, we can manipulate the value of the register that is read by millis(). So let’s upload the sketch below and open the serial monitor. After roughly 17 seconds we will see that the printed number, which was very big, suddenly becomes very small: this happens when millis() overflows, but this situation is now handled by the code through the cast of the data types.

To change the register that contains the number of milliseconds we have to modify the Arduino core variable timer0_millis, and to do that we have to declare it with the keyword extern, which tells the compiler that this variable is defined somewhere else. It’s better to stop the interrupts before changing it and to re-enable them once the change has been made: this avoids modifying the content of the register while an interrupt is writing to it.

//the following line tells the compiler that
//timer0_millis is declared somewhere else
//(the AVR core defines it as volatile)
extern volatile unsigned long timer0_millis;
static unsigned long myTime;

void setup() {
    Serial.begin(19200);
    delay(2000);
    cli(); //halt the interrupts
    timer0_millis = 4294950000UL; //change the value of the register
    sei(); //re-enable the interrupts
    myTime = millis() + 1000;
}

void loop() {
    if ((long)(millis() - myTime) >= 0) {
        Serial.println(millis(), DEC);
        myTime += 1000;
    }
}

Solution #2

Another solution to the problem, suggested to me by the user lesto on the arduino.cc forum, is to invert the check made on the interval. Usually users write comparisons like this:

MILLIS + INTERVAL > PREV_TIME

As we have already seen, this kind of check is exposed to the overflow of millis().

Instead, we can use a comparison like this:

MILLIS - PREV_TIME > INTERVAL

Using this form, the difference between the value returned by millis() and the previous value stored in prev_time is always the time elapsed since prev_time, a number between 0 and interval until the test fires. Let’s see an example:

if (millis() - prev_time > interval) {
    prev_time = millis();
    ....
}

Let’s say that prev_time is 4,294,967,000 and that interval is 1,000. At a certain moment, millis() rolls over to zero. The comparison becomes:

0 - 4294967000 > 1000

We could think that it should be interpreted as below:

-4294967000 > 1000

but we have to keep in mind that with unsigned variables the subtraction returns 296. This is because an unsigned variable cannot hold negative numbers, so the result wraps around: it is given by the number of values that an unsigned long can represent, 2^32 or 4,294,967,296, minus 4,294,967,000, so 296. Now the comparison has become:

296 > 1000

that obviously is false. The test becomes true only when millis() is greater than 704, because:

705 - 4294967000 = -4294966295
-4294966295 + 4294967296 = 1001
1001 > 1000 = TRUE
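The arithmetic above can be verified with a small function. The names elapsedSince, intervalElapsed and checkUnsignedDifference are invented for this sketch, and uint32_t is assumed to stand in for the 32-bit unsigned long of the board.

```cpp
#include <assert.h>
#include <stdint.h>

// Time elapsed since `previous`, computed modulo 2^32 so that it
// stays correct across a rollover of the counter.
uint32_t elapsedSince(uint32_t now, uint32_t previous) {
    return (uint32_t)(now - previous);
}

// The comparison of the second solution.
bool intervalElapsed(uint32_t now, uint32_t previous, uint32_t interval) {
    return elapsedSince(now, previous) > interval;
}

// Restates the numbers from the text as checks.
void checkUnsignedDifference() {
    assert(elapsedSince(0, 4294967000UL) == 296);
    assert(!intervalElapsed(0, 4294967000UL, 1000));     // 296 > 1000 is false
    assert(!intervalElapsed(704, 4294967000UL, 1000));   // 1000 > 1000 is false
    assert(intervalElapsed(705, 4294967000UL, 1000));    // 1001 > 1000 is true
}
```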

2 comments

  1. dconvert

    Hi Leonardo,

    I read your article on millis() very carefully and saw that your example works well; in practice we should no longer have timing problems on circuits that have to run 24 hours a day, as in my case: I built a GPS tracker that sends a POST with the GPS coordinates to my server about every 30 seconds.
    Since I used the GSMSHIELD library, I went to check how its programmer handled waiting for responses with millis(), and I was pleased to see that an approach similar to yours was used there too:

    if ((unsigned long)(millis() - prev_time) >= start_reception_tmout) {
    // timeout elapsed => GSM module didn’t start with response
    // so communication is takes as finished

    prev_time is an unsigned long

    My doubt is this: as you can see, an unsigned long cast was used instead of the plain long of your example, so I tried putting the unsigned in your example as well, and the result is that I see very fast scrolling instead of one line per second as programmed.
    Could you explain this behaviour?
    Thanks.
    Regards

  2. Leonardo Miliani

    millis() and prev_time, both being unsigned long, will be converted to unsigned long automatically. The author of that code, however, wants to avoid hidden automatic casts and explicitly asks for an unsigned long. By doing so he makes sure that the difference is always a positive number between 0 and the trigger interval.
    But if you put an unsigned long in my code, you turn the result into a positive integer there too. The problem is that in my example I tested for a number greater than or equal to zero, to notice when the difference changed from negative to positive. With an unsigned long the difference will always be positive, because unsigned means exactly that, without sign, so it does not handle negative numbers. My example is therefore executed on every pass, because the test is always true.

    That said, note that the solution used by the author of that library does not fall under the first solution to the overflow problem, but under the second 😉
    if ((unsigned long)(millis() - prev_time) >= start_reception_tmout) {
    can be simplified (for the reasons I gave at the start of my reply) to:
    if ((millis() - prev_time) >= start_reception_tmout) {
    which, as you can see, is the same as the second solution proposed
