Steem Statistics: Correlation between number of posts and number of followers

Greater user engagement correlates with greater follower counts!


Hi all, I have recently been playing around with the Steem blockchain data using @arcange's SteemSQL (which I highly recommend for the data-centric minds out there!), and I decided to look at how a user's level of engagement correlates with that user's number of followers. The general trend is fairly intuitive: a higher level of engagement on Steemit (as evidenced by more posts and comments) tends to bring more awareness of your blog and more followers.

It turns out that the correlation between total number of posts and total number of followers does show that behavior, and (slightly surprisingly to me) the best-fit trend is a power-law relationship. The correlation is not exceptionally strong, with an R^2 value of 0.40492, although R^2 values below 0.5 are fairly common when studying human behavior, especially on social media platforms where there are significant outliers in follower counts and/or post counts. Bots that auto-follow and/or auto-post add even more outliers to the data. Nonetheless, the correlation still seems reasonable despite the outliers, and I believe the conclusion is fairly obvious (and supported by the data!).

Conclusion: If you want more followers... increase your level of engagement on Steemit!

postsVsFollowers.png
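
For readers who want to try a similar fit, here is a minimal sketch of one common way to fit a power law: a straight-line least-squares fit in log-log space using NumPy. This is an assumption about the approach, not necessarily the tool or method used to produce the plot above, and the function name `fit_power_law` is made up for illustration. Note that the R^2 it returns is computed on the log-transformed values, which may differ from an R^2 computed on the raw counts.

```python
import numpy as np


def fit_power_law(posts, followers):
    """Fit followers = a * posts**b via linear regression in log-log space.

    Returns (a, b, r_squared), where r_squared is measured on the
    log10-transformed data. All inputs must be strictly positive.
    """
    x = np.log10(np.asarray(posts, dtype=float))
    y = np.log10(np.asarray(followers, dtype=float))

    # A power law is a straight line in log-log space:
    # log10(followers) = b * log10(posts) + log10(a)
    b, log_a = np.polyfit(x, y, 1)

    # Coefficient of determination for the fitted line.
    y_hat = b * x + log_a
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot

    return 10 ** log_a, b, r_squared
```

Calling `fit_power_law(post_counts, follower_counts)` on two equal-length sequences gives the prefactor, the exponent, and the goodness of fit. Zero values have to be removed first because the logarithm is undefined for them.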

Methodology:


If you are a new programmer, or just interested in the methodology for performing such a search using SteemSQL, this is the code used to grab the data from the SQL server:

#!/usr/bin/env python

import pymssql

# Connection details for the SteemSQL server (password redacted).
server = "vip.steemsql.com"
user = "Steemit-trogdor"
password = "*****************"
database = "DBSteem"

conn = pymssql.connect(server, user, password, database)
cursor = conn.cursor()

# One row per account: name, post count, and number of followers.
cursor.execute("SELECT Accounts.name,MAX(Accounts.post_count),COUNT(Followers.follower) FROM Accounts WITH (NOLOCK) INNER JOIN Followers ON Accounts.name=Followers.following GROUP BY Accounts.name")
row = cursor.fetchone()

# Print each row as "name, posts, followers" until the result set is exhausted.
while row:
    print(row[0] + ", " + str(row[1]) + ", " + str(row[2]))
    row = cursor.fetchone()

conn.close()

I then filtered out accounts with fewer than 10 posts or fewer than 10 followers to remove the new/unused accounts.
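
The filtering step could look something like the sketch below. It assumes the script's output was saved as lines of the form `name, posts, followers` and that both counts are plain integers. The function name `filter_active` and the file name in the usage note are hypothetical, and the original filtering may well have been done in a spreadsheet instead.

```python
import csv


def filter_active(lines, min_posts=10, min_followers=10):
    """Keep only accounts with at least min_posts posts and min_followers followers.

    `lines` is any iterable of strings such as "alice, 25, 40".
    Returns a list of (name, posts, followers) tuples.
    """
    kept = []
    # skipinitialspace drops the blank that follows each comma.
    for name, posts, followers in csv.reader(lines, skipinitialspace=True):
        posts, followers = int(posts), int(followers)
        if posts >= min_posts and followers >= min_followers:
            kept.append((name, posts, followers))
    return kept
```

For example, if the output was redirected to a file called `accounts.csv`, then `filter_active(open("accounts.csv"))` would return the rows that pass both thresholds.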

Let me know if you found this interesting. If you need any help with searches of your own, or if you are interested in any other data, just let me know!

Best,
Trogdor :)
