Governance Discussion - Validator Node Uptime Protocols


#1

Hello everyone,

This topic has been created for POA Core Validators to discuss and propose protocols regarding validator node uptime. Feel free to share your ideas below!

Thanks!

Regards,
Rocco Mancini


Validators’ node down time and action protocol
#2

Dear Validator,

This post is a duplicate of the original posting made an hour earlier.

Please address your concerns there so that discussion of the same problem is not split across topics.

Sincerely,

Alex


#3

Hello Alex,

This is not a duplicate of the original posting. This post was created because Henry’s post is extremely biased and leading. As another validator pointed out to you, Alex:

“After a quick read, I’m concerned that your post looks like these five bullet points have already been discussed and your suggestions are already in place, with only time limits and thresholds to be defined. Would you consider editing the bullet points into suggestions for adoption before defining specific outcomes? I think each topic has merit for discussion. Once consensus is reached on one, then perhaps criteria could be applied. What do you think?”


Validators’ node down time and action protocol
#4

Again, please address your concerns regarding the context or the topic there, to keep all the discussion organized. We need to decide on the action protocol for the common issue.

Alex


#5

Uptime requirements need to be sensible and achievable. Also, they must be applied uniformly to all validators.

In my opinion, there has been some ‘goal post moving’ in the past. In my mind, whatever is agreed upon ought to be hard coded to not allow selective enforcement. This is something that I will be making a priority going forward.

With that said - the last time this topic (ensuring uptime) was talked about, the discussion went to how best to ensure it. The idea that seemed most acceptable to the group was the notion of ‘backup buddies’, i.e., one would pair up and have that person’s “back”, if you will. If you see your buddy’s node is down, you would reach out via some agreed-upon channel(s) to see what’s what.

I bring this up because I think that making sure that we can communicate and coordinate is the first critical step towards ensuring proper uptime.

It bears stating: The network is large enough such that having a validator down does not negatively impact block creation time. <- Allow me to unpack that statement. In industry (and government, for that matter) a 10% error bar is used as a rule of thumb. This would mean a block time of 5.5 seconds rather than 5 - I think we can all agree that this would require quite a number of validators to go dark, and would indicate some other (deeper) issue. I bring this up, as it is important to keep everything in perspective.
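For anyone who wants to put numbers on the 10% rule of thumb, here is a rough sketch. It rests on assumptions not stated in this thread: that validators take turns in a fixed round-robin, that an offline validator's turn simply produces no block, and that the target block time is 5 seconds. The function names are made up for illustration and the validator count is left as a parameter.

```python
def average_block_time(step_seconds, total_validators, offline_validators):
    """Average seconds between blocks when offline validators' turns produce no block."""
    online = total_validators - offline_validators
    if online <= 0:
        raise ValueError("at least one validator must be online")
    # Assumption: each validator gets one turn per round and a missed turn is skipped.
    return step_seconds * total_validators / online


def max_offline_within(total_validators, tolerance_percent=10):
    """Largest number of offline validators that keeps the average within the tolerance."""
    allowed = 0
    for offline in range(1, total_validators):
        online = total_validators - offline
        # Integer form of: total / online <= 1 + tolerance_percent / 100
        if 100 * total_validators <= (100 + tolerance_percent) * online:
            allowed = offline
        else:
            break
    return allowed
```

Calling `max_offline_within(n)` with the real validator count `n` would show how many nodes can be dark before the average drifts past 5.5 seconds under these assumptions. A different turn-taking rule would give different numbers.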

As for the question of time windows, I like to use past practice (or precedent). We had such an event happen on Sokol with a validator that went dark. Nobody was able to reach this individual, and it was suggested that we remove them. At the time, I felt that we ought to wait and see - and make a more concerted effort at outreach. This was a validator that I did not vote for, and yet I felt that removing them from consensus without a more exhaustive attempt to make contact wasn’t right. In this case, this person was not seen for (I believe) ~1 week.

One could argue that that was the testnet. However, I think that should a single node go dark for a week - the actual impact to the entire network is minimal (again, with the current number of validators: 5 / X seconds = time impact).

So my thinking is a week. The beauty of Bitcoin was achieving network coordination by SLOWING down the network and giving nodes ten minutes to catch up. It is my belief that we should take this lesson to heart and realize that human coordination is inherently slower than a system of interconnected computers communicating at the speed of light, so we ought to be realistic about time windows (particularly if a downed node has such minimal impact overall).

And again, before you suggest that this is wrong or whatever - ask yourself if you are willing to live with whatever limits you are about to suggest.

UPDATE: I made an error in my ‘formula’ to be fair - It was late for me, and I had spent the better part of the day talking Blockchain to the Dutch… many of whom were not pro-Blockchain until me that is ;-).


#6

Throughout the past two calendar years, POA Network has done an excellent job of establishing Validator independence (also called “Autonomy” as seen in recent forum posts) and Validator node availability (up-time). High availability, fast block times and extremely low network fees are benefits of the Proof of Authority Consensus model, and POA Network has established itself as a leader in the blockchain economy by combining best practices and cutting-edge code.

Now that 2019 is here, let’s focus on elevated performance and adoption. The POA Network DevTeam is driving advancement through, among other things, development and deployment of the emerging Honey Badger BFT Consensus.

Along with exceptional advancements such as HB-BFT, Independent Validators and the Community can help expand availability and increase network reach to the conventional economy. Everyone has access to tools such as this forum and public network statistics:
https://core-netstat.poa.network/

Validators monitor their own nodes, and hopefully all nodes in the network. Along with email contact information, Validators can communicate via internal channels, and many also choose to notify others via text, SMS and mobile phone in times of emergency, outage, and soft and hard fork network upgrades.

As Network adoption increases, it becomes easier to maintain network up-time in that higher load demands more and improved network nodes, typically in advance of demand. As community membership increases, many of the new products and services (typically Distributed Applications, or DApps) monitor their own systems to make sure their dedicated community has a solid experience. Gaming companies like Everdragons - https://everdragons.com - monitor the overall POA Network and can deploy dedicated network infrastructure for their own increased demand as they deem necessary.

Many of these DevTeams work with the POA Foundation developers, community members and Independent Validators to the benefit of all. As a Public Blockchain, POA Network is free for anyone to deploy to, with network fees (Gas) so small as to be fractions of the cost of conventional systems. Great DevTeams deploy their own independent support and monitoring systems, which by default support and improve POA Network. Everyone wins.

We as Independent POA Validators should do our best to encourage network adoption in the commercial, academic and industrial sectors; indeed, in all economic sectors. Doing so strengthens our blockchain, reduces costs and improves the system for everyone.

Jim O’Regan
January 22, 2019


#7

Well done. Thank you for sharing this.


#8

Jeff,

Love the idea of the buddy system personally. In order to push the concept forward, I think we would need to address one of the critical dissenting opinions to the concept. I believe concern was raised that pairing up validators could be seen as validators being affiliated with each other.

In my opinion, I don’t believe this is a significant issue, but it could be beneficial if the proposed system addresses this point. I have two ideas that may address the concern:

  1. Random selection. This would remove the risk that buddies choose each other for the purpose of being affiliated with one another.
  2. Rotations. Define a set period of time when two or more validators are buddies then rotate them out and randomly select again once the period ends.

These are just (slightly sleep deprived) ideas though, and I am completely open to thoughts and suggestions.
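To make the two ideas above a bit more concrete, here is a minimal sketch of random selection plus rotation. Everything in it is hypothetical: the function names, the shared seed string, and the rule for folding a lone leftover validator into another group are illustrative choices, not anything the validators have agreed on. A real scheme would also need a seed source that nobody can influence, which this sketch does not provide.

```python
import random


def assign_buddy_groups(validators, group_size=2, seed=None):
    """Shuffle validators and split them into buddy groups of about group_size."""
    if group_size < 2:
        raise ValueError("a buddy group needs at least two validators")
    if len(validators) < 2:
        raise ValueError("need at least two validators to form a group")
    rng = random.Random(seed)
    shuffled = list(validators)
    rng.shuffle(shuffled)
    groups = [shuffled[i:i + group_size]
              for i in range(0, len(shuffled), group_size)]
    # Nobody should be left without a buddy: fold a lone leftover
    # into the group before it.
    if len(groups) > 1 and len(groups[-1]) == 1:
        leftover = groups.pop()
        groups[-1].extend(leftover)
    return groups


def groups_for_period(validators, period_index, group_size=2,
                      shared_seed="hypothetical-shared-seed"):
    """Derive the groups for one rotation period from a seed everyone can reproduce."""
    # Sorting first makes the result independent of the input order.
    seed = f"{shared_seed}:{period_index}"
    return assign_buddy_groups(sorted(validators), group_size, seed)
```

Each validator could run `groups_for_period(names, period_index)` locally and arrive at the same groups, and bumping `period_index` when a period ends gives a fresh random draw.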


#9

I also like the buddy system. I like the random selection, and maybe each validator could have 3 or 4 buddies. As long as rotations are at least 1 year apart (even longer, hopefully), so it’s not a mess keeping track of who you are currently responsible for, then I’d support this.


#10

Sounds reasonable. This way, we are responsible to help ourselves and our peers, and by doing so, we will hopefully have support ourselves when it may be needed most.