Steemverse - top communities 3D force-directed graph visualization

Steemverse is a 3D force-directed graph visualization of the top Steem communities, built with the ThreeJS JavaScript 3D drawing library and the 3D Force-Directed Graph layout engine.

https://steemverse.com

Introduction

There is more and more data around us. All of it can be collected, grouped and compared. The growing amount of analysable data gave birth to the term "big data". As computing power grows, visualizing connected information becomes easier. Only imagination limits us.
This is where 3D visualizations come into play. Comparing in two dimensions is not enough. Using three-dimensional geometry we can better position nodes according to their relations to each other.
Let's look at, e.g., the Lightning Network graph visualization or the WikiLeaks.org Security Graph.

Visualize Steem blockchain

The same approach can be used for spatial analysis of the Steem blockchain. What if we could compare relations between accounts or even tags based on their interactions? This can lead to interesting conclusions. Who has the biggest influence? Are there any strongly related groups in our community? Let's also compare the numbers and ratios to show the true sizes! All in three-dimensional space.

The Steemverse project wants to answer the above questions. To start, the top 100 biggest #categories are placed onto the board. Then we take the top 100 accounts of each drawn tag, calculate the strong relations between them and throw everything into the verse. The final look turns out to be very interesting and leads to curious outcomes ;)

Key outlines:

  1. The #steem category is positioned in the center of the verse. It has a custom grey monolith skin, rotating around the vertical axis.
  2. #tags are connected by hyperlinks if they share at least 20% of each other's post occurrences (each separately). It is worth mentioning that the link distance is halved for every adjacent line coming into or out of a node.
  3. The distance between #tags indicates their post coverage.
  4. Account-to-#tag distance is inversely proportional to posting activity. Only root posts are counted.
  5. Every account and #tag collides with the others at twice its sphere's radius.
  6. #tags are attracted to the center of the verse. Accounts are pushed towards their main #category.
  7. Each tag's size is proportional to the number of posts written in the last 3 months. An account's size is its effective Steem Power.

The above outlines are specified as "physical forces" in the engine. The variables are chosen to give the most appealing look.
Physical forces being the core of graph directions
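
To make outlines 4 and 5 more concrete, here is a minimal Python sketch of the two rules as plain functions. It is not the code that runs in Steemverse: the `Node` class, the function names and the `ACCOUNT_DISTANCE_SCALE` constant are all hypothetical, and the real variables were tuned by eye in the engine.

```python
from dataclasses import dataclass

# Hypothetical scaling constant; the real values were chosen for looks.
ACCOUNT_DISTANCE_SCALE = 1000.0


@dataclass
class Node:
    name: str
    radius: float  # sphere radius in scene units


def collision_radius(node: Node) -> float:
    """Outline 5: a node collides at twice its sphere's radius."""
    return 2.0 * node.radius


def account_to_tag_distance(root_posts: int) -> float:
    """Outline 4: distance is inversely proportional to posting activity."""
    if root_posts <= 0:
        raise ValueError("an account needs at least one root post in the tag")
    return ACCOUNT_DISTANCE_SCALE / root_posts


def min_separation(a: Node, b: Node) -> float:
    """Closest allowed centre-to-centre distance between two nodes."""
    return collision_radius(a) + collision_radius(b)
```

With these assumptions, an account that posts twice as often in a tag ends up at half the distance from it.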

Drawing engine


"The painter has the Universe in his mind and hands"

ThreeJS Javascript 3D library

https://threejs.org/

ThreeJS is the best known and most used 3D library for web browsers. It can render with anything from a simple canvas up to WebGL. I decided to trust it because of its many features, good documentation, plenty of examples and active open-source community.
The Steemverse project uses default objects from this library such as scenes, cameras, spheres, boxes, lights and animations.

3D Force-Directed physics engine

https://github.com/vasturiano/3d-force-graph

This physics engine library by @vasturiano is oriented towards three-dimensional geometry. A set of forces, simple on their own but complex together, helps create beautiful graphs. It is all based on math, so the forces are not only well defined but also realistic and good looking. While studying how all of this physics works over the last few weeks, I have even become one of the contributors to vasturiano's libraries.

Blockchain data feed

All the needed data is gathered through the SteemSQL service. There is simply too much data to collect it from the raw blockchain. Even using SteemSQL, my scripts take over 2 hours to gather all the data for a snapshot.

The SQL query to select top Steem categories:

SELECT TOP 100
    category,
    COUNT(DISTINCT author) as 'count'
FROM comments (NOLOCK)
WHERE created >= DATEADD(MONTH, -3, GETDATE())
AND depth = 0
GROUP BY category
ORDER BY 'count' DESC

All of the above tags are checked against each other to find out how strong the relations between them are. Only a share coverage > 20% counts towards "link strength".

foreach category:

SELECT 
    COUNT(DISTINCT author)
FROM comments (NOLOCK)
WHERE created >= DATEADD(MONTH, -3, GETDATE())
AND category = \'''' + category[0] + '''\' 
AND depth = 0
AND ISJSON(json_metadata) > 0
AND CONTAINS(json_metadata, \'''' + tag[0] + '''\') 
AND \'''' + tag[0] + '''\' IN (SELECT value FROM OPENJSON(json_metadata, '$.tags'))
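
Once the counts are in, the 20% filter could look roughly like the sketch below. The function names and the input shapes are hypothetical, and it assumes that coverage is the count returned by the query above divided by the category's total from the first query, and that the coverage itself is used as the link strength. Each direction is checked separately.

```python
from typing import Dict, List, Tuple

COVERAGE_THRESHOLD = 0.20  # from the text: only coverage above 20% counts


def coverage(shared: int, category_total: int) -> float:
    """Share of a category that also carries the other tag."""
    if category_total <= 0:
        return 0.0
    return shared / category_total


def build_links(
    totals: Dict[str, int],
    shared: Dict[Tuple[str, str], int],
) -> List[dict]:
    """totals: {category: count}; shared: {(category, tag): count}."""
    links = []
    for (category, tag), count in shared.items():
        if category == tag:
            continue  # a tag is never linked to itself
        share = coverage(count, totals.get(category, 0))
        if share > COVERAGE_THRESHOLD:
            # Using the coverage as "strength" is an assumption.
            links.append({"source": category, "target": tag, "strength": share})
    return links
```

For example, with 100 entries in #life of which 30 also carry #photography, the #life to #photography link is kept, while a 5 out of 50 overlap in the other direction is dropped.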

Finally, for each of the 100 categories, up to 100 of the biggest accounts (by Steem Power) are collected. There is an additional rule here. To avoid duplicates in the verse, an account is drawn only at its strongest main tag (the category it has written in most often in recent months).

SELECT distinct TOP 100 
    name,
    FLOOR((CAST(REPLACE(vesting_shares, ' VESTS','') AS float)
        + CAST(REPLACE(received_vesting_shares, ' VESTS','') AS float) 
        - CAST(REPLACE(delegated_vesting_shares, ' VESTS','') AS float)) 
        / 2036) as 'STEEM POWER',
    COUNT(Comments.created) as 'COUNT'
FROM Accounts (NOLOCK)
LEFT JOIN Comments (NOLOCK) ON Accounts.name = Comments.author
WHERE category = \'''' + category + '''\'
AND FLOOR((CAST(REPLACE(vesting_shares, ' VESTS','') AS float)
    + CAST(REPLACE(received_vesting_shares, ' VESTS','') AS float) 
    - CAST(REPLACE(delegated_vesting_shares, ' VESTS','') AS float)) 
    / 2036) > 25
AND Comments.created >= DATEADD(MONTH, -3, GETDATE())
AND depth = 0
GROUP BY name, vesting_shares, received_vesting_shares, delegated_vesting_shares
ORDER BY 'STEEM POWER' DESC
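
The same Steem Power arithmetic and the "strongest main tag" rule can be sketched in Python as below. This is an illustration only, not the actual gathering script: the function names and the row format are hypothetical, and the divisor 2036 is simply the one used in the query above.

```python
import math
from typing import Dict, Iterable, Tuple

VESTS_PER_STEEM = 2036  # divisor taken from the query above


def parse_vests(value: str) -> float:
    """Turn a string such as '1234.567890 VESTS' into a float."""
    return float(value.replace(" VESTS", ""))


def effective_steem_power(own: str, received: str, delegated: str) -> int:
    """Own plus received minus delegated vests, converted to Steem Power."""
    vests = parse_vests(own) + parse_vests(received) - parse_vests(delegated)
    return math.floor(vests / VESTS_PER_STEEM)


def strongest_tag(post_counts: Dict[str, int]) -> str:
    """Category with the most root posts; ties go to the first one seen."""
    return max(post_counts, key=post_counts.get)


def assign_accounts(rows: Iterable[Tuple[str, str, int]]) -> Dict[str, str]:
    """rows: (account, category, root_post_count), one per query result row."""
    per_account: Dict[str, Dict[str, int]] = {}
    for account, category, count in rows:
        per_account.setdefault(account, {})[category] = count
    return {name: strongest_tag(counts) for name, counts in per_account.items()}
```

An account that shows up in the results for several categories is then drawn once, next to the category returned by `assign_accounts`.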

Python script to gather all data

Numbers

Interested in numbers? In the worst-case scenario there could be 10000 nodes and a multiple of that in links. Using the distinct account approach, only 4100 nodes and 4300 links are drawn.
Even loading a few thousand nodes at once can be challenging for low-end PCs. That's why steps were taken to optimize performance, e.g.:

  • sharing geometries - with this technique there are far fewer "models" in computer memory to calculate on every frame. Geometries are shared across categories (sizes divided by 1000) and accounts (divided into 5 groups - plankton/minnow/dolphin/orca/whale).
  • sharing materials - the color palette is reduced to basic tints.
  • text labels turned out to be memory heavy, so their quality was lowered by a significant percentage.
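
The geometry sharing idea can be illustrated with a small cache, sketched here in Python rather than ThreeJS. Everything in it is an assumption made for illustration: the Steem Power thresholds for the five groups, reading "sizes divided by 1000" as buckets of 1000, and the dictionary that stands in for a real 3D geometry object.

```python
from typing import Dict, Tuple

# Hypothetical Steem Power upper bounds; the post only names the five groups.
TIERS = [(500, "plankton"), (5_000, "minnow"), (50_000, "dolphin"), (500_000, "orca")]


def account_tier(steem_power: float) -> str:
    """Map an account to one of the five shared size groups."""
    for upper_bound, name in TIERS:
        if steem_power < upper_bound:
            return name
    return "whale"


def category_bucket(size: int) -> int:
    """Assumed reading of 'sizes divided by 1000': one geometry per bucket."""
    return size // 1000


class GeometryCache:
    """Creates each geometry once and hands the same object to every node."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, object], dict] = {}
        self.created = 0

    def get(self, kind: str, key: object) -> dict:
        cache_key = (kind, key)
        if cache_key not in self._cache:
            self.created += 1
            # Stand-in for building a real sphere geometry.
            self._cache[cache_key] = {"kind": kind, "key": key}
        return self._cache[cache_key]
```

With this approach thousands of account nodes need at most five account geometries, however many accounts fall into each group.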

Example stellars

#football stellar. I believe it grew on the back of the latest World Cup tournament and many related contests.
#life as the most used tag in the blockchain. It has the most node links to/from other categories. Note also #photography on the right as the second.
Strong #funny - #meme - #dmania triangle
The biggest constellation consists of cryptocurrency-related tags

Plans for the future

https://github.com/mys/steemverse/issues


I plan to add some features to improve the usability of Steemverse in the future. These are the top priority before further expansion onto new graphs.

Website

https://steemverse.com

GitHub repository

https://github.com/mys/steemverse



"The thing’s hollow—it goes on forever—and—oh my God!— it’s full of stars!"
