4chan
/q/ - 4chan Discussion


I spent the day sorting through and re-reading a few thousand e-mails from 2010, and it was really a blast from the past. What a year it was.
Thanks for an awesome 9 years, and for some great e-mails along the way.

As always, I read all of my e-mail and can be reached with questions/comments/concerns/hate mail/and plain ol' hellos at moot@4chan.org (or on AIM at MOOTCHAT).
tl;dr version of 2010: "SPAM, SPAM, SPAM, and VIRUSES: The Movie"

File: 1361350207497.png-(267 KB, 1853x988, 4chan stats 2013-02-20.png)
Hey /q/, I like data porn. Do you like data porn? If you do, here's the thread for you.
>>
File: 1361350485050.png-(32 KB, 2071x632, anonymous.png)
I added a little thing to the bottom right of the chart to let you know when the data was gathered, in case you re-post the image.

I'm actually not too experienced in visualizing this data, but I'll do my best! If anyone has any tips/wisdom, I'd like to hear it. The only thing I can say for sure is that LibreOffice Calc is dicks for both importing text files and for making charts.

Here's a percent chart of posts by board by those with names/trips and by Anonymous. Note that /q/, /soc/, and /b/ are all Anonymous.
>>
File: 1361350844523.png-(182 KB, 870x796, Images.png)
Which board has the most data stored on it? It's /hr/! Followed closely by /gif/. /s/ takes third place, and /wsg/ a close fourth.
>>
Data porn gets me so hot
>>
File: 1361351166625.png-(18 KB, 1427x320, Images in number.png)
The pie chart was bytes of images, to get an idea of which board is using the most gigabytes of storage. It's actually pretty meaningless.

This is how many images there are per board.
>>
File: 1361351242968.png-(27 KB, 1427x320, Images in number.png)
>>434706
Whoops, /3/ and /a/ were not labelled.
>>
File: 1361351267815.jpg-(113 KB, 512x384, 1249371122684.jpg)
This is really fascinating stuff.

Thanks for compiling all this and posting it

Maybe stuff like this could be posted regularly on the 4chan status blog or something? Maybe Twitter?

I'd like to see a comparison of changes over time.
>>
>>434696 (OP)
Some of this data looks funny. How is /v/'s oldest thread only 6 hours old? There are threads on /v/ right now that well exceed that. And I think /a/'s longest thread is definitely longer than 4 days, since we bumped a thread for close to a week.
>>
File: 1361352424857.png-(58 KB, 746x486, Content distribution.png)
This is a scatterplot of content. A pattern emerges! Boards where users make a lot of comment-only posts tend to make fewer image-only posts. Boards that make a lot of image-only posts tend not to make many comment-only posts.
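
One way to put a number on that pattern is a correlation coefficient. Here's a minimal Python sketch, assuming you have per-board percentages of comment-only and image-only posts. The function name `pearson` and the sample figures are made up for illustration; they are not read off the chart.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical percentages per board, NOT real board data.
comment_only = [70.0, 55.0, 40.0, 20.0, 10.0]
image_only = [5.0, 10.0, 20.0, 45.0, 60.0]

r = pearson(comment_only, image_only)
print(f"r = {r:.2f}")  # a value near -1 would mean the inverse pattern holds
```

A value close to -1 would back up what the scatterplot shows by eye; a value near 0 would mean the pattern is weaker than it looks.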

>>434709
I've been meaning to do these more often (I've done it once before) but I need to into scripting to get this more automated.

>>434712
It's a snapshot of the data about three hours ago. You can check the oldest thread yourself by checking the catalog and ordering by creation date. The one at the bottom will be the oldest thread.
>>
File: 1361352762084.png-(18 KB, 1345x320, Sages.png)
It's no surprise that the Misunderstood Geniuses of /jp/ use Sage the most.
>>
File: 1361353030435.png-(23 KB, 1342x320, Comment length.png)
>>434723
Who's the most long-winded? It's /trv/! What are they talking about? I don't know, I don't visit that board.
>>
>>434724
Is this how long discussions usually last?
>>
>>434708
Some boards like /an/ or /p/ are surprisingly low.
>>
>>434726
Average comment length.

>>434724
Does that include commentless posts?
>>
>>434726
Sorry for never reading the file name. Fugg :DD
>>
File: 1361353757645.png-(47 KB, 1566x701, speed1.png)
My favorite statistic, because it's the one most people complain about. Posting speed!

Who's the fastest board? Well, no surprise.

>>434730
>Does that include commentless posts?
Not sure.

>>434726
That's actually not in any of this data! I would love to parse every thread and establish a "time from first post to last post" value, but I can't do it yet. It's my hope to eventually into scripting and be able to do that.
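
For what it's worth, once the post timestamps are available, the "time from first post to last post" value is simple to compute. A minimal Python sketch, assuming each thread has already been parsed into a list of Unix timestamps (one per post). `thread_lifetime` and the sample threads are hypothetical, and fetching the posts is left out entirely.

```python
from datetime import timedelta

def thread_lifetime(post_times):
    """Time between the earliest and latest post in one thread.

    post_times: iterable of Unix timestamps (seconds), in any order.
    """
    times = list(post_times)
    if not times:
        raise ValueError("thread has no posts")
    return timedelta(seconds=max(times) - min(times))

# Hypothetical threads keyed by thread number; values are illustrative only.
threads = {
    1001: [1361350207, 1361350485, 1361355344],
    1002: [1361352424],
}

lifetimes = {no: thread_lifetime(ts) for no, ts in threads.items()}
for no, life in lifetimes.items():
    print(no, life)
```

A thread with a single post gets a lifetime of zero, which is probably what you want for threads that never got a reply.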
>>
The first image looks kinda Windows, but since you said LibreOffice, what OS are you working on?
If you need scripting I suggest you go Linux all the way; shell scripts are a charm.
>>
File: 1361354244427.png-(68 KB, 1566x701, Speed2.png)
Let's zoom in.

>>434737
Forgot to say, I used browser.exe to get the data (>>>/rs/browser). I'm visualizing this stuff in Google Docs, actually. I asked /g/ about scripting and parsing websites, and they said Python/Requests/BeautifulSoup or JavaScript/manipulating the DOM.
>>
File: 1361354494563.png-(73 KB, 1566x701, Speed3.png)
>>434741
Let's zoom in some more.
>>
File: 1361354810153.png-(108 KB, 1566x701, Speed4.png)
>>434745
Still not close enough. Let's zoom in... and use our good friend, the logarithm.
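
The logarithm trick can be sketched in a few lines of Python. This assumes posting speed is measured in posts per hour and is strictly positive; the board names and speeds below are placeholders, not the measured values.

```python
import math

# Placeholder speeds in posts per hour, NOT the values from the charts.
speeds = {"fast_board": 10000.0, "mid_board": 100.0, "slow_board": 1.0}

def log_scale(values):
    """Map each positive value to its base-10 logarithm."""
    scaled = {}
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"log scale needs a positive value, got {value} for {name}")
        scaled[name] = math.log10(value)
    return scaled

scaled = log_scale(speeds)
for name, value in scaled.items():
    print(f"{name}: {value:.1f}")
```

Values spanning four orders of magnitude end up spread between 0 and 4, which is why the slow boards stop bunching up at the bottom of the chart.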

I don't know if there's anything else from the data to graph, but it's been fun!
>>
>>434741
does gdocs support log scale? could be useful to uncluster those datapoints.
also I have no idea how to pipe browser.exe to your script (sorry), so rewriting the whole parsing engine could be the right way to go, even though I'm not sure moot would appreciate you scraping the whole site because bandwidth.
anyhow good job, me love some dataporn!
>>
File: 1361355344380.png-(101 KB, 1566x701, Speed5.png)
>>434756
Here we go, all boards at once. Thank you based log scale.

I might end up parsing the archives because I want to know stuff about dead threads, not just instantaneous snapshots. Even the data here is low-quality, because it's from a single point in time. As we can tell from http://catalog.neet.tv/stats.html there are obvious periods of low and high activity on 4chan. A day-average would be better, but I'm not sure how to get that.
>>
>>434761
the archive is a good start. for the average you should split your data into parsing blocks of 24h or whatever, and then get the average of that time interval.
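
a rough Python sketch of the block idea, assuming the snapshots are (Unix timestamp, value) pairs. `daily_averages` and the sample numbers are hypothetical, and the blocks line up with UTC midnight only because Unix time does.

```python
from collections import defaultdict

SECONDS_PER_DAY = 24 * 60 * 60

def daily_averages(samples, block_seconds=SECONDS_PER_DAY):
    """Average the sampled values inside each fixed-size time block.

    samples: iterable of (unix_timestamp, value) pairs.
    Returns {block_start_timestamp: mean value in that block}.
    """
    blocks = defaultdict(list)
    for timestamp, value in samples:
        # Round the timestamp down to the start of its block.
        block_start = timestamp - (timestamp % block_seconds)
        blocks[block_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(blocks.items())}

# Hypothetical (timestamp, posts-per-hour) snapshots, invented for illustration.
samples = [
    (0, 100.0),
    (3600, 300.0),
    (SECONDS_PER_DAY + 60, 50.0),
]

averages = daily_averages(samples)
print(averages)
```

pass a different `block_seconds` if 24h turns out to be the wrong interval.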
>>
>>434697
Why are there so many tripfags on /o/? I mean...they talk about cars.
>>
>>434767
Might use a moving average, but I'd really have to automate the data-gathering aspect to get enough data points. Scripting here I come...
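
For reference, a simple moving average is only a few lines of Python. This is a sketch assuming evenly spaced snapshots; `moving_average` and the sample speeds are made up for illustration.

```python
from collections import deque

def moving_average(values, window):
    """Simple moving average; returns one mean per full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_sum = 0.0
    result = []
    for value in values:
        if len(recent) == window:
            running_sum -= recent[0]  # oldest value is about to be pushed out
        recent.append(value)
        running_sum += value
        if len(recent) == window:
            result.append(running_sum / window)
    return result

# Hypothetical posting speeds from five consecutive snapshots.
speeds = [10.0, 20.0, 30.0, 40.0, 50.0]
smoothed = moving_average(speeds, window=3)
print(smoothed)
```

With fewer data points than the window size the result is just an empty list, which is the "not enough data points" problem in code form.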
>>
>>434772
Because it's annoying to have to keep saying "anon with xyz car here".
>>
>>434761
Christ /e/. I remember you going about as fast as the other boards. Or at least faster than /c/.
>>
That's really interesting. Thanks for the porn, I really enjoyed it.
>>
Thanks, OP, I hope you do this regularly and for different times


