Implement Dandelion message routing #1049
Labels
No Label
bug
build
dependencies
developers
documentation
duplicate
enhancement
formatting
invalid
legal
mobile
obsolete
packaging
performance
protocol
question
refactoring
regression
security
test
translation
usability
wontfix
No Milestone
No project
No Assignees
1 Participants
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: Bitmessage/PyBitmessage-2025-01-09#1049
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
When a BitMessage node receives a message, it will not instantly relay that message to its peers. Instead it will add a randomized delay. I'm not sure of the exact specifics (is this a delay for all peers? Or is the randomized delay unique for each peer?). It's a security measure to prevent statistical/timing attacks against adversaries.
However, there is a new approach on the block called Dandelion. Which was initially created for Bitcoin, but the same properties apply to the BitMessage network.
There's already a bitfield reserved for dandelion in the protocol specification.
Awesome checklist! Really eager to review this, but it may need to wait until after my imminent deadlines :)
Some things I'm interested in.....
@amiller Thank you for the reply.
With the 3 commits, I believe all the points are implemented, however I have done no testing and there probably are bugs, so I haven't made any ticks yet.
Regarding your questions:
there is no equivalent to consensus mechanism in Bitmessage and the objects are completely independent. After you decrypt the objects, there may be some relationships, but this is separate from the network subsystem, and at least in PyBitmessage it's handled asynchronously to the network subsystem.
Bitmessage has had much less security review than Bitcoin. There was some criticism in the past, much of it is similar to issues with Bitcoin, and some of it has been addressed since it was reported. Since there are no interdependecies between objects (indeed, almost all objects are encrypted so you can only verify if it hasn't expired, if it has a sufficient PoW, a correct checksum and stream), this makes certain attacks impossible which would have worked on Bitcoin.
When the intersection attack was brought to my attention last year in February, we discussed it internally (@mirrorwish, @Atheros1 and me) and I wrote code for mitigating it by stalling the attacker here: https://github.com/Bitmessage/PyBitmessage/blob/v0.6/src/network/tcp.py#L79 . In summary, if a node receives a getdata request and it doesn't have the object (and now also if it's a stem object that isn't on its route), it will skip processing getdata requests from that node until it estimates that a small object would propagate across the whole network on average. This skip is also triggered upon connection to prevent an attacker from quickly connecting, sending a getdata and disconnecting.
The older implementation in the network subsystem (until 0.6.2) didn't allow the mitigation to work cleanly, as the processing of each connection was synchronous, however with switching the network subsystem to asyncore, the processing is more asynchronous (it can be improved even more in the future) and it can ignore getdata requests while processing others.
I also made a minor change in one of the patches above, that a response to getdata is processed in randomised order, and the first missing/stem object will trigger the skipping, so if the attacker requests multiple objects, he'll only receive a random subset of the valid ones, so he can't be sure which you don't have or if there were network delays.
I am not sure if it's better to skip or delay the getdata requests after the trigger. I decided to skip, since if I was delaying them, my intuition tells me that there probably would be security issues depending on how it's implemented that I can't foresee at the moment, and I may end up making the attack worse. I'll reevaluate it in the future or if someone does a proper analysis.
I'll post here as I'm having troubles with sending to @illinois.edu email addresses.
I did find the time to review the additional information (Greg's posts, Giulia's updated announcement), and have a couple of questions:
it looks like the sample code uses two child stems. I thought I read an explanation for that, but I can't find it. Should I implement two or one child stems?
the sample code appears to ensure that after reshuffling the new path, it doesn't include any of the current ones, am I correct?
it looks like it only uses outbound (and whitelisted, which is a concept that currently doesn't exist in PyBitmessage) connections? I presume this is a defense. What to do with nodes that don't make outbound connections, or don't currently have one? Like during an early phase of startup. Turn dandelion off? Delay (if it's a temporary issue)?
regarding shifting the fluff mode to a non-dandelion node, how is that handled probability wise? Is there another variable, or a compile time (C) value?
Hey Peter,
Yes there should be two child stems. We call this the "4 regular graph" approach, since there's typically two incoming and two outgoing stem edges for each node... And our analysis says it helps privacy when the attacker can learn some of the edges. For each transaction, only one is chosen though.
The current shuffling does do this, but I don't think it's necessary
outbound only is important because Core nodes accept incoming connections from anywhere, a super node could easily make 100 connections to each node and get an advantage in being chosen as a stem edge. It is harder to get nodes to make outgoing connections to you. I would say, delay until having more outgoing edges. Or at least until having 2 and then let those be the stem edges.
I'm not sure I follow the last question, but we have a new command like flag and global parameter for the fluff probability
Ok, this is the point I missed, thanks. Have to fix it.
In his bitcoin-dev post, gmaxwell said this:
To which Giulia replied:
So to reformulate it, there are two types of stem randomly turning to fluff:
I fluff with a probability x.
With a probability x, I announce the transaction as a non-dandelion one to the edge, and assume he fluffs (which normally he would)
How does the code distinguish which one to take? Or did I misunderstand it, and the first type doesn't exist? That actually makes sense in retrospect, as that would automatically get rid of some of the bugs that exist in my implementation at the moment, and why I can't find how your code handles it. It would also explain some of Greg's comments, regarding something being unclear in the specification, and how attackers may be able to block propagation.
Partial fix in
b1442ecb0a
Todo:
Ready in
fd1a6c1fa1
.