View Issue Details

IDProjectCategoryView StatusLast Update
0001165bareos-coredirectorpublic2020-01-07 15:09
Reporterfranku Assigned Tofranku  
PrioritynormalSeveritymajorReproducibilityrandom
Status resolvedResolutionfixed 
Summary0001165: Same job name for jobs submitted at the same time
DescriptionRestore jobs submitted at the same time end up with the same job name, which results in one of the jobs being rejected by the storage daemon and failing.
Steps To ReproduceSee attached files:

why.c: test-source file with the code stripped down to the essence
output: log-output of the test that shows up the problem
Additional InformationThe seq variable is incremented inside of the mutex, which should be safe, but then its value is read into the JobControlRecord outside of the mutex, which is a race condition if other threads are manipulating the value at the same time.
TagsNo tags attached.

Activities

franku

franku

2020-01-03 09:57

administrator  

why.c (1,939 bytes)   
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>


#define THREADCOUNT 20

/* compile with "gcc -o whyc why.c -lpthread" */

void *maybeunique(void* arg) {


        static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
        static time_t last_start_time = 0;
        static int seq = 0;
        int lseq;
        time_t now = time(NULL);
        struct tm *info;
        char dt[50];
        time_t rawtime;

        pthread_mutex_lock(&mutex);
        seq++;
        if (seq > 59) {
                seq = 0;
                while (now == last_start_time) {
                        usleep(500000);
                        now = time(NULL);
                }
        }
        last_start_time = now;
        lseq = seq; //assign static value to thread-local variable
        pthread_mutex_unlock(&mutex);

        time(&rawtime);
        info = localtime(&rawtime);
        strftime(dt, sizeof(dt), "%Y-%m-%d_%H.%M.%S", info);

        fprintf(stdout, "jobname.%s_%02d (seq = %02d, lseq = %02d)\n", dt, lseq, seq, lseq);
        fflush(stdout);
        return (void*) NULL;
}

int main(int argc, char** argv) {

        int r;
        pthread_t threads[THREADCOUNT];

        fprintf(stdout, "starting...\n");
        fflush(stdout);
        for (int i = 0; i < THREADCOUNT; i++) {
                r = pthread_create(&threads[i], NULL, &maybeunique, NULL);
                if (r) {
                        fprintf(stderr, "failed to create thread %d, errno is %d\n", i, r);
                }
        }
        fprintf(stdout, "threads dispatched. waiting...\n");
        fflush(stdout);
        for (int i = 0; i < 5; i++) {
                r = pthread_join(threads[i], NULL);
                if (r) {
                        fprintf(stderr, "failed to wait on thread %d, errno is %d\n", i, r);
                }
        }

        fprintf(stdout, "done\n");
        fflush(stdout);
        exit(0);
}
why.c (1,939 bytes)   
output (1,116 bytes)   
$ ./whyc
starting...
jobname.2019-12-27_17.23.33_01 (seq = 05, lseq = 01)
jobname.2019-12-27_17.23.33_05 (seq = 07, lseq = 05)
jobname.2019-12-27_17.23.33_03 (seq = 07, lseq = 03)
jobname.2019-12-27_17.23.33_08 (seq = 11, lseq = 08)
jobname.2019-12-27_17.23.33_06 (seq = 11, lseq = 06)
jobname.2019-12-27_17.23.33_02 (seq = 11, lseq = 02)
jobname.2019-12-27_17.23.33_07 (seq = 12, lseq = 07)
jobname.2019-12-27_17.23.33_04 (seq = 13, lseq = 04)
jobname.2019-12-27_17.23.33_09 (seq = 13, lseq = 09)
jobname.2019-12-27_17.23.33_10 (seq = 14, lseq = 10)
jobname.2019-12-27_17.23.33_14 (seq = 14, lseq = 14)
jobname.2019-12-27_17.23.33_12 (seq = 14, lseq = 12)
jobname.2019-12-27_17.23.33_13 (seq = 14, lseq = 13)
jobname.2019-12-27_17.23.33_15 (seq = 15, lseq = 15)
jobname.2019-12-27_17.23.33_11 (seq = 16, lseq = 11)
jobname.2019-12-27_17.23.33_17 (seq = 17, lseq = 17)
jobname.2019-12-27_17.23.33_16 (seq = 17, lseq = 16)
jobname.2019-12-27_17.23.33_18 (seq = 18, lseq = 18)
jobname.2019-12-27_17.23.33_19 (seq = 19, lseq = 19)
threads dispatched. waiting...
jobname.2019-12-27_17.23.33_20 (seq = 20, lseq = 20)
done
output (1,116 bytes)   

Related Changesets

bareos: master 472ceb9b

2020-01-03 10:46

franku


Committer: GitHub

Ported: N/A

Details Diff
Merge pull request 0000385 from WaryWolf/fix-job-creation-racecond

copy value of seq to the stack to avoid race condition when generating Job names
Affected Issues
0001165
mod - AUTHORS Diff File
mod - core/src/dird/job.cc Diff File

bareos: bareos-18.2 0ee566dc

2020-01-03 10:46

franku

Ported: N/A

Details Diff
Merge pull request 0000385 from WaryWolf/fix-job-creation-racecond

(manual merge)
copy value of seq to the stack to avoid race condition when generating Job names
Affected Issues
0001165
mod - AUTHORS Diff File
mod - core/src/dird/job.cc Diff File

bareos: bareos-19.2 36c230fa

2020-01-03 10:46

franku

Ported: N/A

Details Diff
Merge pull request 0000385 from WaryWolf/fix-job-creation-racecond

copy value of seq to the stack to avoid race condition when generating Job names
Affected Issues
0001165
mod - AUTHORS Diff File
mod - core/src/dird/job.cc Diff File

Issue History

Date Modified Username Field Change
2020-01-03 09:57 franku New Issue
2020-01-03 09:57 franku File Added: why.c
2020-01-03 09:57 franku File Added: output
2020-01-03 09:58 franku Assigned To => franku
2020-01-03 09:58 franku Status new => confirmed
2020-01-03 09:58 franku Steps to Reproduce Updated
2020-01-03 10:31 franku Changeset attached => bareos master 472ceb9b
2020-01-03 15:22 franku Changeset attached => bareos bareos-18.2 0ee566dc
2020-01-07 14:40 franku Changeset attached => bareos bareos-19.2 36c230fa
2020-01-07 15:07 franku Status confirmed => resolved
2020-01-07 15:07 franku Resolution open => fixed