Excessive memory usage #1598

Closed
opened 2020-01-31 01:49:41 +01:00 by PeterSurda · 11 comments
PeterSurda commented 2020-01-31 01:49:41 +01:00 (Migrated from github.com)

I got a report about excessive memory usage. At first I thought it was wrongly deployed, because I hadn't seen this in a long time, but after upgrading to the latest code I could trigger it as well. I'm now doing a poor man's bisect to see where it broke.
g1itch commented 2020-01-31 11:46:48 +01:00 (Migrated from github.com)

Perhaps [objgraph](https://mg.pov.lt/objgraph/) can help?
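
For readers without objgraph installed, the kind of per-type growth report it gives can be roughly approximated with the standard library. This is only a sketch, not objgraph's implementation: `type_counts`, `growth` and the stand-in `Peer` class are hypothetical names, and only objects tracked by the garbage collector are counted.

```python
import gc
from collections import Counter


def type_counts():
    """Count live, gc-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=10):
    """Return (type name, increase) pairs for the types that grew the most."""
    grown = [(name, after[name] - before[name])
             for name in after if after[name] > before[name]]
    grown.sort(key=lambda item: item[1], reverse=True)
    return grown[:limit]


class Peer(object):
    """Stand-in for an accumulating object; not the PyBitmessage class."""


before = type_counts()
held = [Peer() for _ in range(100)]  # simulate objects that are never released
after = type_counts()

for name, delta in growth(before, after):
    print('%-20s +%d' % (name, delta))
```

Taking a count once per cycle and comparing it with the previous one shows which types keep accumulating, which is usually the first hint about where references are being held.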
PeterSurda commented 2020-01-31 12:00:37 +01:00 (Migrated from github.com)

I first need to find at least approximately when it was introduced and how to trigger it. So far I was only able to trigger it once, and only after the original reporter insisted he's running the current code. Is this a new bug that was introduced last year (as I can't reproduce it with older code) or a continuation of the old leaks that I thought were fixed?

I did indeed use objgraph the last time I was tracing the leaks.
g1itch commented 2020-01-31 12:05:31 +01:00 (Migrated from github.com)

Did you check this: 6ca3460? It comes to my mind first when I think about memleak.

PeterSurda commented 2020-01-31 13:05:25 +01:00 (Migrated from github.com)

@g1itch that was a workaround for a weird bug in Python's threading: the automatic gc was occasionally triggering a `RecursionError`.
g1itch commented 2020-01-31 13:08:32 +01:00 (Migrated from github.com)

Yes, I remember. But it still may be a good starting point for bisect.

PeterSurda commented 2020-01-31 15:13:14 +01:00 (Migrated from github.com)

I can't reproduce it even with more recent code, so it's unlikely to be a good starting point.
PeterSurda commented 2020-02-02 10:34:26 +01:00 (Migrated from github.com)

I was able to narrow it down a bit, to between 7e1f1d2604c333ccc2eb548c07f9cf19051b6592 (works OK) and 03316496b7c3380c5ac408f86d049855dbcedac6 (can be triggered). That leaves only about 4 potential sources, as most of the commits in between are code-quality changes. Continuing testing.
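
The manual narrowing described here is a binary search over the commit history, which `git bisect` automates. A minimal sketch of the idea, assuming the commits are ordered oldest to newest, the first is known good, the last is known bad, and the problem persists once introduced; `first_bad`, `is_bad` and the commit names are hypothetical placeholders, not commits from this repository.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is true.

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]


# Placeholder history; in practice is_bad() would check out the commit,
# run the client for a while and report whether memory usage grew.
commits = ['c%d' % i for i in range(10)]
culprit = first_bad(commits, lambda c: int(c[1:]) >= 6)
print(culprit)
```

Each test halves the remaining range, which matters when a single check means running the client long enough for the leak to show up.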
PeterSurda commented 2020-02-02 14:57:05 +01:00 (Migrated from github.com)

Narrowed it further; it's most likely the first of these two:

- [x] a69732f060608428ddfebf608bdb3c8751e296ce (Addrthread finish)
- [x] 2a165380bb7214afdcfd95b74dce83660bad53ff (Restrict outbound connections on network groups)

I'll narrow it down and try to write a minimal patch that stops the memory leak (even if it disables the functionality), and then I'll see how to fix the leak properly.
PeterSurda commented 2020-02-03 04:09:41 +01:00 (Migrated from github.com)

Pretty sure now it's a69732f060

g1itch commented 2020-02-03 16:57:20 +01:00 (Migrated from github.com)

If there is a memory leak, it's barely visible. With 10 connections (2 inbound), `memory_profiler.memory_usage()` returns 105.36 and `objgraph.show_growth(limit=10)` shows 1695 Peer objects.

```diff
diff --git a/src/bitmessagemain.py b/src/bitmessagemain.py
index d6cb289b..42fcb305 100755
--- a/src/bitmessagemain.py
+++ b/src/bitmessagemain.py
@@ -238,7 +238,7 @@ class Main(object):
         if daemon:
             with shared.printLock:
                 print('Running as a daemon. Send TERM signal to end.')
-            self.daemonize()
+            # self.daemonize()
 
         self.setSignalHandler()
 
diff --git a/src/class_singleCleaner.py b/src/class_singleCleaner.py
index b9fe3d1c..14c99f9a 100644
--- a/src/class_singleCleaner.py
+++ b/src/class_singleCleaner.py
@@ -23,6 +23,9 @@ import gc
 import os
 import time
 
+import objgraph
+from memory_profiler import memory_usage
+
 import knownnodes
 import queues
 import shared
@@ -31,7 +34,7 @@ import tr
 from bmconfigparser import BMConfigParser
 from helper_sql import sqlExecute, sqlQuery
 from inventory import Inventory
-from network import BMConnectionPool, StoppableThread
+from network import BMConnectionPool, StoppableThread, stats
 
 
 class singleCleaner(StoppableThread):
@@ -150,6 +153,15 @@ class singleCleaner(StoppableThread):
 
             gc.collect()
 
+            objgraph.show_growth(limit=10)
+
+            self.logger.warning('Memory used: %sM', memory_usage().pop())
+
+            connections = stats.connectedHostsList()
+            self.logger.warning('Connections (%s):', len(connections))
+            for c in connections:
+                self.logger.warning('-> %s', c.destination)
+
             if state.shutdown == 0:
                 self.stop.wait(singleCleaner.cycleLength)
```

```
$ BITMESSAGE_HOME=/tmp/bmtest mprof run --include-children python2 src/bitmessagemain.py
$ mprof plot --output /tmp/bmtest/memory-profile.png
```

![memory-profile_7](https://user-images.githubusercontent.com/4012700/73667793-7b4aaf00-46ad-11ea-9d93-ea770b877608.png)
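
One way to put a number on "barely visible" is to fit a straight line to periodic memory readings and look at the slope. A minimal sketch, assuming one (seconds, MiB) reading is logged per cleaner cycle; `slope` is a hypothetical helper and the sample values are invented for illustration, not measurements from this issue.

```python
def slope(samples):
    """Least-squares slope of y over x for a list of (x, y) pairs."""
    n = float(len(samples))
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var = sum((x - mean_x) ** 2 for x, _ in samples)
    return cov / var


# Invented readings: (seconds since start, memory in MiB).
samples = [(0, 100.0), (300, 100.5), (600, 101.0), (900, 101.5)]
mib_per_hour = slope(samples) * 3600
print('growth: %.2f MiB/hour' % mib_per_hour)
```

A slope that stays near zero over a long run suggests a plateau rather than a leak, while a steady positive slope is worth chasing with per-type object counts.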
g1itch commented 2020-02-05 14:03:50 +01:00 (Migrated from github.com)

Not much difference even when run overnight:

![memory-profile_8](https://user-images.githubusercontent.com/4012700/73844132-9db50780-4828-11ea-8805-df0c3bc13267.png)
Reference: Bitmessage/PyBitmessage-2024-11-28#1598