[STEEMSQL] A deep analysis of Steemians gratefulness

Using SteemSQL, I discovered someone created a post just to say “thanks” to me.

Remember in SteemSQL introduction, I wrote about full text search and said :

Let say I want to know if anyone mentioned me in a post or comment, the following simple query will do the trick
SELECT author, title, body, url FROM TxComments WHERE CONTAINS(body, '@arcange')

I decided to use my favorite analytic toy and put my own piece of advice into practice.

I executed the query and surprisingly discovered that a few weeks ago @lesliestarrohara created a post post fully dedicated to “thanks” me!

WOW … made me smile too, made me proud.

This is exactly why I came to Steemit after reading Steem whitepaper.
I loved the idea of upvote incentive vs flagging. Building something with a positive attitude.

As usual, my little thoughts mill started to produce new questions:

  • Are people on Steemit grateful towards each other?
  • Are we really building a friendly community?
  • Who is grateful?
    and many more…

Couldn’t sleep with all those ‘stupid ’ questions.

Anyway, maybe it’s useless, but I decided to transform it into a good tutorial on how to build a more complex analytic report.

Let’s go …

We first need to build the SQL query

I will start with the question “how many users post or reply with the word “thanks” in it?”
This translate into T-SQL as

SELECT 
    COUNT(*) AS [Thanks]
FROM
    TxComments
WHERE
    CONTAINS(body,'thanks')

This return an awesome number of 89936 “thanks” posted.

Magic predicate

But wait, people do not always say “thanks”. Some say “Thank” or “Thank you” or something similar.
The same query with the word “thank” (singular) return only 66517 posts.

As it won’t be fun to give a try to each word, I will use one of the coolest feature of SQL full text search: FREETEXT() predicate. This predicate searches for values that match the meaning and not just the exact wording of the words in the search condition.

The where clause in the query becomes

...
WHERE
    FREETEXT(body,'thanks')
...

We now get a greater number of 151869 posts where gratitude has been expressed!

Linking and combining

As I want to create a graph to see the evolution over time, I will link the Transactions and Blocks tables to get some date parts from the block’s timestamp. I will also ask the server to compile data using GROUP BY clause to avoid returning all those individuals records:

SELECT 
    MONTH(blocks.timestamp) as [Month],
    DAY(blocks.timestamp) as [Day],
    COUNT(*) AS [Thanks]
FROM
    TxComments
    inner join Transactions on TxComments.tx_id = Transactions.tx_id
    inner join Blocks on Transactions.block_num = Blocks.block_num
WHERE
    FREETEXT(body,'thanks')
GROUP BY
    MONTH(blocks.timestamp),
    DAY(blocks.timestamp)

Cool. Let’s have a look at our result:

Want to know how I created the chart, look here

Hmm, that’s cool but not really instructive.
It looks like if people became suddenly very thankful from middle July (1) up to end July (2) then got more and more bored posting kind words to their fellow Steemians (3) and finally colluded to stop it as of today (4).

What’s wrong and how can we improve our analysis

  • We have to get rid of today’s data because the day is not yet over and that’s why the graph is plunging down at the end.
  • We should compare our “thanks” data with the total number of post per day. Remember there was a big hype around 4th of July and lot of people joined and started to post. Then some moved away.

To get all of this, I will use a more complex query:

SELECT 
    MONTH(blocks.timestamp) as [Month],
    DAY(blocks.timestamp) as [Day],
    COUNT(*) AS [Post],
    0 AS [Thanks]
FROM
    TxComments
    inner join Transactions on TxComments.tx_id = Transactions.tx_id
    inner join Blocks on Transactions.block_num = Blocks.block_num
WHERE
    DATEPART(dy,timestamp) <> DATEPART(dy,GETDATE())
GROUP BY
    MONTH(blocks.timestamp),
    DAY(blocks.timestamp)

UNION SELECT 
    MONTH(blocks.timestamp) as [Month],
    DAY(blocks.timestamp) as [Day],
    0 as [Post],
    COUNT(*) AS [Thanks]
FROM
    TxComments
    inner join Transactions on TxComments.tx_id = Transactions.tx_id
    inner join Blocks on Transactions.block_num = Blocks.block_num
WHERE
    DATEPART(dy,timestamp) <> DATEPART(dy,GETDATE())
    AND FREETEXT(body,'thanks')
GROUP BY
    MONTH(blocks.timestamp),
    DAY(blocks.timestamp),
    txcomments.author

I won't explain all details of the query here and know it can be improved. Feel free to ask me on steemit.chat SteemSQL channel

Let’s look at our new result:

Well, that’s much better!

We now see that despite the moving average trend is going down for posts, the gratefulness trend is now quite steady for weeks, with some nice hiccups.

Here I am reassured. There are plenty of nice people on Steemit who recognize the good work of others and thank them. I'm definitely glad to be part of this community!

The Bonus

Want to know if you’re a pro of "pat in the back" and rank in top 20 chart? Check this:

Ooops, I 'm not in
I told you it was a stupid idea!

Last but not least

Thanks to @lesliestarrohara for inspiring me this post.
Thanks to all of you who upvoted my SteemSQL introduction
Thanks to all the nice people I met here

“Bring light and delight”, Arcange


UPDATE - TOP 20 correction

I check @msjennifer account to say congrats for being top 1 saying "Thanks" to others, but noticed this account is a bot always posting similar comments. Shame on me, I should have verify it before posting.

Congrats to @kaylinart for being the most grateful people on Steemit!

Image: Shutterstock

You like this post, do not forget to upvote or follow me

H2
H3
H4
3 columns
2 columns
1 column
Join the conversation now
Logo
Center