View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000855 | bareos-core | director | public | 2017-10-02 04:23 | 2023-09-11 17:13 |
Reporter | Ryushin | Assigned To | bruno-at-bareos | ||
Priority | normal | Severity | minor | Reproducibility | always |
Status | closed | Resolution | unable to reproduce | ||
Platform | Linux | OS | Debian | OS Version | 8 |
Product Version | 17.2.3 | ||||
Summary | 0000855: Update Volume or Purge Volume Hangs on Using Catalog | ||||
Description | This just started happening after the last couple of nightly updates. Pretty much at the time when the new mysql schema was updated in the Debian packages. I've been using the same scripts that I've created the for the last couple of years without a problem. Now the scripts fail to run and hang when sending the command using echo and piping it to bconsole. Example commands: echo "update volume=vchanger_monthly_1.5tb_drives_0003_0001 ActionOnPurge=Truncate" | bconsole echo "purge volume=vchanger_monthly_1.5tb_drives_0003_0001 action=truncate storage=vchanger_monthly_1.5tb pool=Off-site_Pool" | bconsole echo "update volume=vchanger_monthly_1.5tb_drives_0003_0001 volstatus=Append storage=vchanger_monthly_1.5tb pool=Off-site_Pool" | bconsole Example Output: echo "update volume=vchanger_monthly_1.5tb_drives_0003_0001 ActionOnPurge=Truncate" | bconsoleConnecting to Director windwalker:9101 1000 OK: windwalker-dir Version: 17.2.3 (14 Aug 2017) Enter a period to cancel a command. update volume=vchanger_monthly_1.5tb_drives_0003_0001 ActionOnPurge=Truncate Automatically selected Catalog: MyCatalog Using Catalog "MyCatalog" <---- hangs here Running bconsole directly, and then running the commands from inside bconsole seems to work most of the time. But after testing a few times, even inside of bconsole it hanges on the command. I restart the dir and sd daemons to unlock the issue (not sure if the sd daemon needs to be restarted or not) It seems to be doing this with all of my scripts right now. | ||||
Additional Information | Script used: bareos_purge_volumes_on_1.5tb_monthly_drive.sh: #!/bin/bash if [ "$1" = "" ] then echo "Syntax: $0 <path where bareos volumes exist>" echo "I.e. $0 /mnt/vchanger/Monthly_1.5TB_01" exit 1 fi echo "update slots storage=vchanger_monthly_1.5tb" | bconsole for vol in $(ls $1/volumes/vchanger_monthly_1.5tb_drives_*) do echo "update volume=$(basename $vol) ActionOnPurge=Truncate" | bconsole echo "purge volume=$(basename $vol) action=truncate storage=vchanger_monthly_1.5tb pool=Off-site_Pool" | bconsole echo "update volume=$(basename $vol) volstatus=Append storage=vchanger_monthly_1.5tb pool=Off-site_Pool" | bconsole done | ||||
Tags | No tags attached. | ||||
BTW, this is occurring on two different installations. | |
I don't think, that this issue is related to the catalog. However, there have been changes to the prune/purge and ActionOnPurge commands. However, I'm not sure, if your script, even if it run, does what you expect. In my example, with 17.2.4-rc1, volumes don't get truncated. I would use: <your volume loop> purge storage=File pool=Full volume=$VOLUME # truncate all volumes with volstatus=Purged in pool Full truncate volstatus=Purged pool=Full yes Setting volstatus to append should not be required. Back to your main problem: no idea. Try to enable debug before the commands. Maybe the problem than gets obvious: setdebug director level=200 |
|
I'll update the script with your new commands. The script use to purge and truncate the volumes just fine. I had to wait until I could change out the backup drive to another one so I can test again. Ran the script and it hung again. I turned setdebug to 200 and I recorded a trace file. I think everything after line 49 could be deleted (I think the webui queried at that time), but I left it just in case. |
|
Was the attached trace file useful? Is there anything else I can do? | |
As far as I can see, the relevant part starts at line 686. windwalker-dir (10): ua_audit.c:143-0 : Console [default] from [192.168.9.1] cmdline update volume=vchanger_monthly_1.5tb_drives_0001_0001 ActionOnPurge=Truncate I can't see an error there. It did what it should: windwalker-dir (100): sql_query.c:124-0 called: bool B_DB::sql_query(const char*, int) with query UPDATE Media SET VolJobs=1,VolFiles=12,VolBlocks=832203,VolBytes=53687079247,VolMounts=1,VolErrors=0,VolWrites=832204,MaxVolBytes=53687091200,VolStatus='Append',Slot=1,InChanger=1,VolReadTime=0,VolWriteTime=583967158,LabelType=0,StorageId=3,PoolId=6,VolRetention=7344000,VolUseDuration=0,MaxVolJobs=0,MaxVolFiles=0,Enabled=1,LocationId=0,ScratchPoolId=0,RecyclePoolId=0,RecycleCount=0,Recycle=1,ActionOnPurge=1,MinBlocksize=0,MaxBlocksize=0 WHERE VolumeName='vchanger_monthly_1.5tb_drives_0001_0001' Also the follow up commands also got through: windwalker-dir (10): ua_audit.c:143-0 : Console [default] from [192.168.9.1] cmdline purge volume=vchanger_monthly_1.5tb_drives_0001_0001 action=truncate storage=vchanger_monthly_1.5tb pool=Off-site_Pool ... Did you run the commands again manually? Otherwise I don't see an error here. Adding the timestamp to the trace might help to figure out, when a command is issued. setdebug director level=200 timestamp=1 So, I can't reproduce the problem and the trace file looks as if everything run through. Without further information, I afraid, I can't help with this. |
|
Newish error, I've seen it a couple of times. Ran: ./bareos_purge_volumes_on_1.5tb_monthly_drive.sh /mnt/vchanger/Monthly_1.5TB_01 Output: Connecting to Director windwalker:9101 1000 OK: windwalker-dir Version: 17.2.3 (14 Aug 2017) Enter a period to cancel a command. update slots storage=vchanger_monthly_1.5tb Automatically selected Catalog: MyCatalog Using Catalog "MyCatalog" Connecting to Storage daemon vchanger_monthly_1.5tb at windwalker:9103 ... 3306 Issuing autochanger "list" command. <---hangs here I've attached another trace file. |
|
Restarting bareos-dir and bareos-sd daemons got it to get past the list command. Output: Connecting to Director windwalker:9101 1000 OK: windwalker-dir Version: 17.2.3 (14 Aug 2017) Enter a period to cancel a command. update volume=vchanger_monthly_1.5tb_drives_0001_0001 ActionOnPurge=Truncate Automatically selected Catalog: MyCatalog Using Catalog "MyCatalog" <--- hangs here I've attached another trace file. |
|
Was the latest trace file helpful? Is there anything else I can provide? | |
I'm thinking what might be the issue is the mysql operation is taking so long to complete the script appears to hang since bconsole never sends back a response. If the mysql completes its operation quickly, then the script appears to work. Only a theory at this point while watching mytop. MySQL on localhost (10.0.32) load 1.78 1.23 1.15 2/1055 5752 up 19+09:45:45 [16:08:59] Queries: 10.6k qps: 0 Slow: 0.0 Se/In/Up/De(%): 81471/00/00/00 Sorts: 0 qps now: 1 Slow qps: 0.0 Threads: 4 ( 2/ 6) 00/00/00/00 Key Efficiency: 0.1% Bps in/out: 0.2/ 25.3 Now in/out: 21.3/ 3.0k Id User Host/IP DB Time Cmd State Query -- ---- ------- -- ---- --- ----- ---------- 266 snort localhost snort 738 Sleep 468041 bareos localhost bareos 72 Query updating DELETE FROM File WHERE JobId IN (4990) 466542 bareos localhost bareos 9 Sleep 462612 root localhost 0 Query init show full processlist |
|
Actually, I don't think this a bug after all. It's a mysql problem. The script is just taking far, far longer to run. To the point that it seems hung. mytop is still showing a query after what must be over an hour at this point: MySQL on localhost (10.0.32) load 2.10 2.48 2.43 1/1106 30277 up 19+11:29:46 [17:53:00] Queries: 15.3k qps: 0 Slow: 2.0 Se/In/Up/De(%): 56385/00/00/00 Sorts: 0 qps now: 1 Slow qps: 0.0 Threads: 4 ( 2/ 8) 125/00/00/00 Key Efficiency: 0.1% Bps in/out: 0.2/ 36.7 Now in/out: 21.4/ 3.1k Id User Host/IP DB Time Cmd State Query -- ---- ------- -- ---- --- ----- ---------- 468894 bareos localhost bareos 3372 Query updating DELETE FROM File WHERE JobId IN (5034,5035,5036,5037,503 266 snort localhost snort 1342 Sleep 466542 bareos localhost bareos 10 Sleep 462612 root localhost 0 Query init show full processlist Is there some kind of mysql schema code update that fixes how long the updating DELETE File WHERE JobId IN <JOB> takes or some kind of optimization I can make? The query is just taking far longer since the schema update. |
|
After four hours I stopped waiting for it to finish and I went to bed. It had only gotten to 47 out of 140 volumes. I spent quite a bit of time trying to improve the performance of mysql with no noticeable improvement as I'm definitely I/O bound right now. The File.ibd database is over 26GB in size. I run a monthly mysql optimize. The mysql database resides on zfs raidz2 volume made up of four 300GB 15K drives. My mysql performance parameters are as follows: default_storage_engine=innodb innodb_file_per_table=1 innodb_buffer_pool_size=4096M innodb_log_file_size=128M innodb_flush_log_at_trx_commit=2 innodb_flush_method=O_DIRECT innodb_buffer_pool_instances=8 innodb_thread_concurrency=8 innodb_checksum_algorithm=none innodb_doublewrite = 0 Not sure what else I can try and do to solve this. Before the schema update, the longest it took was about 45 minutes. |
|
Closing oldies | |
Date Modified | Username | Field | Change |
---|---|---|---|
2017-10-02 04:23 | Ryushin | New Issue | |
2017-10-02 04:40 | Ryushin | Note Added: 0002768 | |
2017-10-02 16:47 | joergs | Note Added: 0002770 | |
2017-10-02 16:47 | joergs | Assigned To | => joergs |
2017-10-02 16:47 | joergs | Status | new => feedback |
2017-10-04 02:20 | Ryushin | File Added: windwalker-dir.trace | |
2017-10-04 02:21 | Ryushin | Note Added: 0002772 | |
2017-10-04 02:21 | Ryushin | Status | feedback => assigned |
2017-10-07 14:00 | Ryushin | Note Added: 0002776 | |
2017-10-08 11:43 | joergs | Note Added: 0002778 | |
2017-10-08 11:43 | joergs | Status | assigned => feedback |
2017-10-08 15:46 | Ryushin | Note Added: 0002779 | |
2017-10-08 15:46 | Ryushin | Status | feedback => assigned |
2017-10-08 15:50 | Ryushin | File Added: windwalker-dir.trace_20171008 | |
2017-10-08 15:59 | Ryushin | Note Added: 0002780 | |
2017-10-08 16:01 | Ryushin | File Added: windwalker-dir.trace_20171008_2 | |
2017-10-19 14:49 | Ryushin | Note Added: 0002804 | |
2017-12-28 00:11 | Ryushin | Note Added: 0002849 | |
2017-12-28 00:13 | Ryushin | Note Edited: 0002849 | |
2017-12-28 01:55 | Ryushin | Note Added: 0002850 | |
2017-12-28 01:56 | Ryushin | Note Edited: 0002850 | |
2017-12-28 01:57 | Ryushin | Note Edited: 0002850 | |
2017-12-28 13:42 | Ryushin | Note Added: 0002851 | |
2019-09-27 16:42 | joergs | Assigned To | joergs => |
2023-09-11 17:13 | bruno-at-bareos | Assigned To | => bruno-at-bareos |
2023-09-11 17:13 | bruno-at-bareos | Status | assigned => closed |
2023-09-11 17:13 | bruno-at-bareos | Resolution | open => unable to reproduce |
2023-09-11 17:13 | bruno-at-bareos | Note Added: 0005406 |