Bitmessage hangs when 'disk full' condition occurs #572
Reference: Bitmessage/PyBitmessage-2024-11-28#572
I accidentally hit a 'disk full' condition on the disk where Bitmessage stores its data. After I deleted some large files and cleared the condition, Bitmessage remained frozen at 0% CPU use. The GUI was repainting, but not clickable.
All other running programs continued fine, except Bitmessage.
The SQL writer thread probably crashes, and then SQL commands "stack up" and are no longer processed. I think this has already been reported once, but I do not know if a solution was found.
yurivict, did you restart Bitmessage?
I did.
I am just worried that it is not robust enough to survive a disk full condition. I also doubt this is an SQLite problem, since SQLite is among the best-tested packages around, with an extremely extensive QA suite.
Some SQL command must have failed in SQLite and Bitmessage ignored the failure. Or maybe this is a Python binding issue.
Some strategy for handling the disk full condition should be developed. The easiest way is to go offline, notify the user with a prominently displayed message (preferably in a different color), keep checking for the disk full condition, and come back online when it has cleared. On many systems the GUI fails to report a disk full condition, and the user only finds out when something silently malfunctions.
If Bitmessage is just sitting there doing something like syncing to the network and fills the disk, it is programmed to display an alert, wait for the user to hit "ok", and then immediately exit. If the instant the disk fills up happens to be when you are doing something in the UI that adds information to the database, like adding an address book entry, then the UI thread will block waiting for data from the SQL thread, but it will never get it because the SQL thread will have already exited. The UI will thus appear to freeze.
What is the best way to solve this? Is it really a common problem? Should the UI thread be programmed to check a status variable in each place where we make a query, to see whether the query was successful?
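One way to picture the "status variable" idea is the sketch below. It is only an illustration: `submit_queue`, `sql_worker` and `sql_query` are hypothetical names, not PyBitmessage's actual helper functions. The worker reports failures back to the caller instead of exiting, and the caller waits with a timeout so the UI thread can never block forever.

```python
import queue
import sqlite3
import threading

# Hypothetical names throughout; this is not PyBitmessage's real SQL helper API.
submit_queue = queue.Queue()


def sql_worker(db_path):
    """Run queries from submit_queue and always send back (ok, payload)."""
    conn = sqlite3.connect(db_path)
    while True:
        item = submit_queue.get()
        if item is None:  # shutdown sentinel
            break
        sql, params, reply = item
        try:
            with conn:  # commit on success, roll back on error
                rows = conn.execute(sql, params).fetchall()
            reply.put((True, rows))
        except sqlite3.Error as err:
            # Report the failure (e.g. a full disk) instead of dying silently,
            # so the caller is never left waiting for an answer.
            reply.put((False, err))
    conn.close()


def sql_query(sql, params=(), timeout=5.0):
    """Called from the UI thread; returns (ok, rows_or_error)."""
    reply = queue.Queue(maxsize=1)
    submit_queue.put((sql, params, reply))
    try:
        return reply.get(timeout=timeout)
    except queue.Empty:
        # The SQL thread is gone or stuck; tell the caller rather than hang.
        return (False, TimeoutError("SQL thread did not answer"))
```

With something like this, each call site would check `ok` and show an error or retry, rather than assuming the query succeeded.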
While working with large files, I see the disk full problem at least twice a month. It is practically unavoidable.
80+% of software around was never ever tested for this condition.
Solution: do any data entry in a transaction-like way. You either succeed in adding the new address or fail. In case of failure, the GUI goes back to the "dirty-GUI" state in which the user has typed the entry but it has not been saved yet.
Also, when you get a packet with inventory records, process it in a transaction; if that fails, reject the whole packet and go offline with a disk-full status. SQLite, by the way, supports transactions.
Attaching a screenshot of what the disk full failure looks like now.
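A minimal sketch of the transaction approach with Python's standard `sqlite3` module follows. The function names and the simplified two-column table layouts are assumptions made for illustration, not the real PyBitmessage schema. Using the connection as a context manager commits on success and rolls back if any statement raises, so a failed write leaves the database unchanged.

```python
import sqlite3


def add_address(conn, label, address):
    """Insert one address book entry atomically; True on success."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO addressbook (label, address) VALUES (?, ?)",
                (label, address),
            )
        return True
    except sqlite3.Error:
        # Nothing was saved; the GUI can keep the typed entry as unsaved.
        return False


def store_inventory(conn, records):
    """Store a whole packet of (hash, payload) records, or none of them."""
    try:
        with conn:
            conn.executemany(
                "INSERT INTO inventory (hash, payload) VALUES (?, ?)",
                records,
            )
        return True
    except sqlite3.Error:
        # The whole packet is rejected; the caller can go offline here.
        return False
```

A disk full error from SQLite surfaces as a subclass of `sqlite3.Error`, so the same `except` clause covers it.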
I'll see if I can reproduce and fix this.
Now I got a disk full condition with 0.6.0.
Problems:
Also, it doesn't need to stop permanently, because the disk full condition can later go away. The correct behavior is to check the available disk space every minute, and resume operation if there is significant space available.
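The periodic check could look roughly like the sketch below, using `shutil.disk_usage` from the standard library. The function names, the 100 MiB threshold and the 60 second interval are assumptions chosen for illustration; nothing here is taken from the PyBitmessage source.

```python
import shutil
import time

# Assumed threshold for "significant space available".
MIN_FREE_BYTES = 100 * 1024 * 1024


def has_free_space(path, min_free=MIN_FREE_BYTES):
    """True if the filesystem holding `path` has at least min_free bytes free."""
    return shutil.disk_usage(path).free >= min_free


def wait_until_space(path, min_free=MIN_FREE_BYTES, interval=60,
                     max_checks=None, sleep=time.sleep):
    """Poll until enough space is free; give up after max_checks if set."""
    checks = 0
    while not has_free_space(path, min_free):
        checks += 1
        if max_checks is not None and checks >= max_checks:
            return False
        sleep(interval)
    return True
```

A background thread could call `wait_until_space` on the data directory after going offline and reconnect once it returns `True`.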
A disk full condition is commonly encountered when VACUUMing the database because according to the VACUUM documentation:
I think going forward VACUUM should be performed sparingly and only when the user allows it. This would require a non-modal notification/prompt. In place of the current usage of VACUUM, auto_vacuum could be set to incremental and incremental_vacuum performed (after ensuring freelist_count is greater than zero) whenever the user invokes 'Delete all trashed messages' and whenever the cleaner thread purges the inventory of expired items. Unfortunately, enabling auto_vacuum for existing users will require the use of VACUUM.
All this does not resolve the issue of how PyBitmessage behaves when the disk is full, but it should reduce the occurrence in the short term so that a proper solution can be implemented.
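As a rough sketch of that proposal with Python's `sqlite3` module (the function names are hypothetical; the `auto_vacuum`, `freelist_count` and `incremental_vacuum` pragmas are standard SQLite):

```python
import sqlite3


def enable_incremental_vacuum(conn):
    """Switch an existing database to auto_vacuum = INCREMENTAL (mode 2)."""
    mode = conn.execute("PRAGMA auto_vacuum").fetchone()[0]
    if mode != 2:
        conn.commit()  # VACUUM cannot run inside an open transaction
        conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
        # For a database that already has tables, the new mode only takes
        # effect after a one-off full VACUUM.
        conn.execute("VACUUM")


def reclaim_free_pages(conn, max_pages=0):
    """Release free pages if there are any; returns the freelist size found.

    max_pages=0 asks SQLite to clear the entire freelist.
    """
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    if free > 0:
        # fetchall() steps the statement to completion.
        conn.execute("PRAGMA incremental_vacuum(%d)" % int(max_pages)).fetchall()
        conn.commit()
    return free
```

`reclaim_free_pages` would be called after 'Delete all trashed messages' and after the cleaner thread purges expired inventory, while `enable_incremental_vacuum` would run once per database.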
I'll look at this, but other than fixing the popup I probably won't do anything else (e.g. fiddling around with VACUUM). Have your computer notify you when you run low on disk space.
The user doesn't look at notifications all the time. I come back in a few hours, plenty of disk space is available, but BM is displaying messages that can't be closed and has to be killed.
For example, some runaway process will use all memory and get killed by the system, but BM will stop as a result.
Looking at the source, PyBitmessage is supposed to shut down if the disk is full, but I think it freezes instead because only the SQL thread ends while the others keep running. A clean shutdown cannot happen because it needs to write data.
If the SQL thread waited instead, the problem of being unable to shut down cleanly would remain. The disk being full can also cause problems with updating keys.dat and known addresses when triggered by means other than the shutdown. It also cannot log anything or even communicate correctly with other nodes, because it cannot update its own inventory.
So to sum it up, I think that it should immediately quit and not even try to display the message, just like it does when it runs without the GUI.
@yurivict use other sysadmin tools to prevent the disk from filling up; if notifications don't work, then you can use quotas.