Analysis of 11 factors in forming "reputation" in STEEM

Repository

https://github.com/steemit/steem


Analysis Goals & Conditions


① Goals

Have you ever wondered about the key factors that determine "reputation" in STEEM?

In this article, we examine in depth how "reputation" correlates with 11 factors that are likely to be related to its rise.

② Conditions

As of February 20, for each of the 5,779 accounts with a reputation of 61 or more, we calculated the values of 11 factors associated with the increase in reputation and then analyzed the correlation of each factor with reputation.

STEEM's "reputation" is based on a common (base-10) logarithm: each time the "reputation" goes up nine steps, the actual "reputation score" rises by a factor of ten.

Accordingly, for an account to go from reputation 34 to reputation 61 (27 steps, or three times nine), its reputation score must grow by a factor of 1,000. It is therefore fairly reasonable to use reputation 61 or more as the cutoff for a meaningful correlation analysis of each major factor.
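The arithmetic above can be checked with a short sketch. The nine-steps-per-tenfold rule comes from this article; the constants in `display_reputation` (subtract 9 from the log, add 25) are an assumption based on the formula commonly used by STEEM front ends, not something taken from this analysis, and both function names are hypothetical.

```python
import math


def display_reputation(raw_score: int) -> float:
    """Map a raw reputation_score to the displayed reputation (not rounded).

    Assumption: the commonly used front-end formula
    (log10(|score|) - 9) * 9 + 25, with the sign of the raw score restored.
    """
    if raw_score == 0:
        return 25.0
    sign = -1 if raw_score < 0 else 1
    magnitude = max(math.log10(abs(raw_score)) - 9, 0)
    return sign * magnitude * 9 + 25


def score_ratio(rep_from: float, rep_to: float) -> float:
    """Factor by which the raw score must grow to move between two reputations.

    Uses only the rule stated in the article: nine steps per factor of ten.
    """
    return 10 ** ((rep_to - rep_from) / 9)


# Going from reputation 34 to 61 is 27 steps, i.e. three factors of ten.
print(score_ratio(34, 61))
```

Under the assumed formula, multiplying a raw score by ten moves `display_reputation` up by nine, which matches the rule used in this section.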

In a recent article, I explained the calculation principle and major characteristics of "reputation" intuitively and simply, from the point of view of ordinary users. If you are not familiar with them, that article is worth reading, but it is not a prerequisite for this one.

Three rules to help intuitive understanding of "Reputation" in STEEM


Analysis Table


Ⅰ. Summary of correlation between reputation and 11 factors

Ⅱ. Reputation and key factor

Ⅲ. Limitations of reputation

※ In this analysis, I present sufficient content and opinions in the text, and omit separate conclusions.


Contents


Ⅰ. Summary of correlation between reputation and 11 factors


7000.png

The table above shows the correlation of "reputation_score" and "reputation" with the 11 factors that are likely to affect them. It is the core of this article, and perhaps nearly all of it.

Because "reputation" is actually calculated from "reputation_score", comparisons based on "reputation_score" are the most accurate. In practice, however, only the "vesting_payout" factor correlated noticeably more strongly with "reputation_score" than with "reputation" (86% vs. 66%); the other figures were similar.

Therefore, for the sake of convenience, we will analyze the correlation value with the "reputation" on the right side of the above table.
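To illustrate why the same factor can correlate differently with "reputation_score" and with "reputation", here is a minimal sketch of a Pearson correlation in plain Python. The five data points are made-up toy values, not figures from this analysis, and the conversion from score to reputation assumes the commonly used front-end formula rather than anything confirmed here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy accounts: payout grows roughly in proportion to the raw score.
reputation_score = [1e13, 3e13, 1e14, 4e14, 2e15]
vesting_payout = [120.0, 300.0, 900.0, 4200.0, 21000.0]

# Assumed display formula: nine steps per factor of ten in the raw score.
reputation = [(math.log10(s) - 9) * 9 + 25 for s in reputation_score]

r_score = pearson(vesting_payout, reputation_score)
r_rep = pearson(vesting_payout, reputation)
print(f"vs reputation_score: {r_score:.2f}, vs reputation: {r_rep:.2f}")
```

Because "reputation" is a logarithm of the score, a factor that scales almost linearly with the raw score lines up better with "reputation_score" than with "reputation", which is the pattern seen for "vesting_payout".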

7002.png

The above chart lists the correlation between reputation and eleven factors in order of strength.

The correlation with "vesting_payout of posts" was the highest at 66%, and "payout of posts" was also high at 53%. These two factors are underlying variables of the reputation calculation, so the result is probably a natural consequence.

Somewhat surprisingly, reputation showed a positive correlation of 54% with "followers' SP SUM" and 53% with "followers". Considering that "payout of posts" was also 53%, this looks quite significant.

Beyond that, there was a meaningful correlation with "posts" (28%) and "created day" (25%). The more posts an account has and the longer ago it was created, the more likely its reputation is to be higher, though not to a very high degree.

Unexpectedly, "effective SP", "owned SP", and "reward per post" showed only a low correlation with reputation, in the low teens (12% to 14%).

What really surprised me was that the correlation between reputation and "comments" (the total count of posts plus replies) was only 8%, which is virtually no relationship. The correlation with "replies" was even lower at 7%.

Given that the correlation with "posts" was a meaningful 28%, simply piling up "replies" does not seem to help much in reaching a reputation of 61 or higher. However, "replies" probably play a small role in the early days after joining.


Ⅱ. Reputation and key factor


That completes the basic summary.

Let's take a closer look at the distribution of eight representative factors and the average value per reputation segment.

① vesting_payout of posts

7005.png

The positive correlation between the "vesting_payout of posts" factor and reputation is very high at 66%. The pattern is clearly visible in the distribution chart.


② payout of posts

7006.png

The correlation between the "payout of posts" factor and reputation is also fairly high at 53%. This is also clear from the distribution chart.


③ followers' SP SUM

7007.png

The correlation between the "followers' SP SUM" factor and reputation is also quite high at 54%.


④ followers

7008.png

The correlation of the "followers" factor is also fairly high at 53%. Note that the "followers' SP SUM" and "followers" factors are as highly correlated with reputation as the "payout of posts" factor.


⑤ posts

7009.png

The correlation between the "posts" factor and reputation is a meaningful 28%. This suggests that steady writing contributes to raising reputation.


⑥ created day

7011.png

The "created day" factor also showed a meaningful correlation of 25%. This suggests that if you stay active on STEEM and keep posting steadily, your reputation will eventually increase.


⑦ comments

7010.png

There is little correlation between the "comments" factor and reputation. It is only 8%.

Many people seem to mistakenly believe that these two, "comments" and reputation, are highly correlated.

Many "comments" can be helpful when your reputation is very low, but the "comments" factor is almost irrelevant once you have a reputation of 61 or higher.

In the end, it is the "posts" factor, not "comments" or "replies", that relates to increasing reputation. Rather than accumulating a lot of "replies", quality posts should be the priority.


⑧ effective SP

7012.png

The correlation between the "effective SP" factor and reputation was weak, at only 14%. As with the "comments" factor, many people seem to assume the opposite.

For reference, the "owned SP" factor was 12% and the "reward per post" factor was 13%.

Among accounts with a reputation of 61 or more, it cannot be concluded that a high reputation implies high "effective SP" or "owned SP". Of course, the correlations are probably underestimated because of accounts that have since stopped their STEEM activity.

The table below shows the average of each of the 11 factors per reputation segment. (For reference purposes)
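As a rough illustration of how a per-segment average like this can be computed, here is a minimal sketch. The segment width of 3 reputation points, the dictionary field names, and the sample accounts are all hypothetical; the actual table was produced with SQL queries and Excel, not with this code.

```python
from collections import defaultdict


def segment_averages(accounts, factor, width=3, floor=61):
    """Average one factor per reputation segment.

    Accounts below `floor` are skipped. Each segment is keyed by its
    lowest reputation value (floor, floor + width, floor + 2 * width, ...).
    """
    buckets = defaultdict(list)
    for account in accounts:
        rep = account["reputation"]
        if rep < floor:
            continue
        segment_start = floor + ((rep - floor) // width) * width
        buckets[segment_start].append(account[factor])
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# Hypothetical toy accounts, not taken from the analysis.
sample = [
    {"reputation": 58, "posts": 50},
    {"reputation": 61, "posts": 100},
    {"reputation": 62, "posts": 200},
    {"reputation": 65, "posts": 400},
]
print(segment_averages(sample, "posts"))
```

Comparing an individual account with the average of its own segment is one way to spot accounts that sit well above or below what their reputation alone would suggest.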


Ⅲ. Limitations of reputation


① Current reputation is only half an indicator: lack of representativeness

6500.png

The image above shows what Dan Larimer once said about STEEM's "reputation", except that I added "NOT" in the middle of the sentence.

When Dan Larimer first created STEEM, "reputation" may have been everything in his mind. At present, "reputation" in STEEM is not everything.

Many subaccounts are used for other purposes (investment, games, etc.) that have nothing to do with "reputation".

However, for accounts that write, the "reputation" indicator should carry real weight.

Basically, STEEM is built as a system that distributes a kind of shared well, called the reward pool, according to the value of the content.

However, with the appearance of various types of bidbots, a reputation score can now be earned easily. This seems to be a system-level problem for STEEM. Past surveys have shown that bidbots take a significant portion of the overall reward pool.

Proof-of-Brain or Power-of-Bid(bot), that is the question.


② Increasing necessity of complementary indicators

This calls for complementary indicators.

First, the best complementary indicator is "followers' SP SUM".

Its correlation with reputation was on par with the "payout of posts" factor, and many accounts in fact have a "followers' SP SUM" value well below or above the average for their reputation segment.

If an account has both a high reputation and a high "followers' SP SUM" value, its actual impact is likely to be high.

Of course, the downside is that even a high "followers' SP SUM" value does not tell us whether people like the account. People follow some disliked accounts for observation or surveillance purposes.

The "followers" factor also had a high correlation with reputation, but the number of followers could be very different depending on when you signed up. If you were already active when the boom was happening, you might have a relatively larger number of followers than other accounts with similar reputations.

Therefore, it would be better to gauge the actual impact of an account by showing, alongside reputation, "followers' SP SUM" instead of "followers" and "posts" instead of "comments". Of course, displaying "comments" has the good intention of creating a forum for the exchange of opinions by treating posts and replies equally. In practice, however, reputation and "comments" were unrelated.

This can help us to see the real reputation / awareness / influence of each account and, at the same time, to encourage us to consider others more, which can contribute to the overall value of STEEM.


The Data and Queries


I did this analysis by connecting to the @steemsql database with an MSSQL client (Microsoft SQL Server Management Studio) and then working in Excel.

Refer to My Github


(My main analyses)

Analysis of Voting Pattern: From posting to payout
The current (actual) inflation rate of STEEM is quite different from the design.
Proof-of-Brain or Power-of-Bid(bot), that is the question.
The recent rise in STEEM prices may have been driven by Tanos' thumb.
Analysis of actual curation yield distribution
The SBD returns ?