August 16, 2012

Big Data and Data Science Blogs Ordered by Google PageRank

Table from the article “Enriching a List of URLs with Google Page Rank” (each row: blog title, URL, PageRank):

News: Datablog | guardian.co.uk http://www.guardian.co.uk/news/datablog 7
Freakonomics http://www.freakonomics.com 7
Calculated Risk http://www.calculatedriskblog.com/ 7
Visual.ly http://visual.ly 7
O’Reilly Radar – Insight, analysis, and research about emerging technologies. http://radar.oreilly.com/ 7
Mendeley Blog http://blog.mendeley.com 7
GetGlue Blog http://blog.getglue.com 7
Echo http://aboutecho.com 7
The Filter Bubble http://www.thefilterbubble.com 6
Statistical Modeling, Causal Inference, and Social Science http://andrewgelman.com 6
Clay Shirky http://www.shirky.com/weblog 6
Peter Norvig http://www.norvig.com 6
Social Media Intelligence | NM Incite http://nmincite.com 6
…My heart’s in Accra http://www.ethanzuckerman.com/blog 6
asymco http://www.asymco.com/ 6
3scale http://www.3scale.net 6
Jeffrey Veen http://veen.com/jeff/ 6
Social Media Collective http://socialmediacollective.org 6
Quantified Self http://quantifiedself.com 6
Open http://open.blogs.nytimes.com/ 6
Michael Nielsen http://michaelnielsen.org/blog 6
The Official Klout Blog http://corp.klout.com/blog 6
Ivan’s private site http://ivan-herman.name 6
Information Optimized Blog http://informationoptimized.com 6
Information Arbitrage http://informationarbitrage.com/ 6
Infographics news http://infographicsnews.blogspot.com/ 6
Geeking with Greg http://glinden.blogspot.com/ 6
Palantir Technologies » Analysis Blog http://palantir.com/ 6
Vertica » Blog http://www.vertica.com 6
Sunlight Labs blog http://sunlightlabs.com/blog/ 6
Rawkes – Captain’s Blog http://rawkes.com/ 6
DataStax » Blog Post – Corporate http://www.datastax.com 6
curiosity counts http://curiositycounts.com/ 6
The Numbers Guy http://blogs.wsj.com/numbersguy 6
JMP® Blog http://blogs.sas.com/jmp/ 6
Recorded Future Blog https://www.recordedfuture.com 6
PostRank Blog http://blog.postrank.com 6
Open Knowledge Foundation Blog http://blog.okfn.org 6
OkTrends http://blog.okcupid.com 6
Social Studies Blog http://blog.getsatisfaction.com 6
(title unknown) http://blog.flurry.com/ 6
DBpedia http://blog.dbpedia.org 6
James Gleick http://around.com 6
Twitter / QuidLabs http://twitter.com/QuidLabs 5
Phil Windley’s Technometria http://www.windley.com/ 5
Trampoline Systems http://www.trampolinesystems.com 5
ThingWorx – The 1st Application Platform for the Connected World http://www.thingworx.com 5
PlaceIQ http://www.placeiq.com 5
OData Blog http://odata.org/blog 5
Neoformix http://neoformix.com 5
AI3:::Adaptive Information http://www.mkbergman.com 5
Knome http://www.knome.com 5
igvita.com http://www.igvita.com 5
FactSet’s Taking Risk blog http://www.factset.com/blogs/takingrisk 5
Zero Intelligence Agents http://www.drewconway.com/zia 5
dataists http://www.dataists.com 5
CommonCrawl http://commoncrawl.org 5
CB Insights – Blog http://www.cbinsights.com/blog 5
Bitcurrent: Networking, technology, and web operations http://www.bitcurrent.com 5
The Noisy Channel http://thenoisychannel.com 5
Surprise and Coincidence – musings from the long tail http://tdunning.blogspot.com/ 5
PeteSearch http://petewarden.typepad.com/searchbrowser/ 5
Personal Data Ecosystem Consortium http://pde.cc 5
ParAccel http://paraccel.com 5
Opera Solutions http://operasolutions.com/blog 5
Metamarkets Blog http://metamarkets.com 5
The Mechanical Turk Blog http://mechanicalturk.typepad.com/blog/ 5
Inductio Ex Machina http://mark.reid.name/iem/ 5
Blog | MapR Technologies http://mapr.com/blog 5
LOD2 http://lod2.eu/BlogPost 5
LingPipe Blog http://lingpipe-blog.com 5
No Free Hunch http://blog.kaggle.com 5
iPhylo http://iphylo.blogspot.com/ 5
Entrepreneurial Geekiness http://ianozsvald.com 5
GLUE Conference http://gluecon.com/2012 5
Simply Measured » Social Media Data http://simplymeasured.com 5
The CrowdFlower Blog http://blog.crowdflower.com 5
BackType http://blog.backtype.com 5
thoughts from the red planet http://nathanmarz.com/blog/ 5
Juice Analytics: Agile Analytics http://www.juiceanalytics.com 5
Apigee Blog http://blog.apigee.com/ 5
Latest http://fathom.info/latest 5
emergent by design http://emergentbydesign.com 5
eagereyes http://eagereyes.org 5
DBMS Musings http://dbmsmusings.blogspot.com/ 5
Caseorganic Blog http://caseorganic.com 5
Splunk Blogs http://blogs.splunk.com 5
Technology Leadership Tactics http://blogs.forbes.com/danwoods 5
SocialFlow Blog http://blog.socialflow.com 5
Adventures in Data Land http://blog.smola.org/ 5
Sematext Blog http://blog.sematext.com 5
ScraperWiki Data Blog http://blog.scraperwiki.com 5
Revolutions http://blog.revolutionanalytics.com/ 5
OpenCorporates news http://blog.opencorporates.com 5
NewsCred Blog – Thoughts on the Future of News http://blog.newscred.com 5
Kevin Hillstrom: MineThatData http://blog.minethatdata.com/ 5
kiwitobes.com http://blog.kiwitobes.com 5
blog.infochimps.org http://blog.infochimps.com 5
Hunch Blog http://blog.hunch.com 5
Gnip Blog http://blog.gnip.com 5
Factual Blog http://blog.factual.com 5
DataSift http://blog.datasift.net 5
DataPortability Blog http://blog.dataportability.org 5
DataMarket blog http://blog.datamarket.com 5
Assetmap http://blog.assetmap.com 5
writing | ben fry http://benfry.com/writing 5
arbesman.net http://arbesman.net 5
Alon Halevy’s Blog http://alonhalevy.blogspot.com/ 5
Machine Learning, etc http://yaroslavvb.blogspot.com/ 4
SnapLogic Blog http://blog.snaplogic.com 4
Skeptic Geek http://www.skepticgeek.com 4
riccomini – data visualization, computer programming, data mining, software engineering http://www.riccomini.name 4
Predictive Signals http://www.predictivesignals.com/ 4
Michael G. Noll http://www.michael-noll.com 4
manAmplified http://www.manamplified.org/ 4
Informaniac http://www.informaniac.net/ 4
GoodData http://www.gooddata.com/blog 4
Byte Mining http://www.bytemining.com 4
Analysis Intelligence http://analysisintelligence.com 4
My tech blog. http://tomazkovacic.com/blog 4
Super-Economy http://super-economy.blogspot.com/ 4
Sesame Data Browser http://sesamedatabrowser.wordpress.com 4
Seb’s Open Research http://openresearch.sebpaquet.net/ 4
Marko A. Rodriguez http://markorodriguez.com 4
Lifestream Blog http://lifestreamblog.com 4
The Adsideologist – Kevin Berardinelli http://kevinberardinelli.com 4
Hyperextended Metaphor http://innocuous.org 4
Saplo http://blog.saplo.com 4
ThingSpeak Community http://community.thingspeak.com 4
Directed Edge News http://blog.directededge.com 4
blog.spinn3r.com http://blog.spinn3r.com/ 4
Mashape’s Voice http://blog.mashape.com 4
Marginally Interesting by Mikio L. Braun http://blog.mikiobraun.de/ 4
Mainly Data http://jeffhammerbacher.com/ 4
Data Wrangling http://www.datawrangling.com 4
API Evangelist http://blog.apievangelist.com/ 4
Dumbotics http://dumbotics.com 4
Daytum http://daytum.wordpress.com 4
Dataspora Blog http://www.dataspora.com 4
(title unknown) http://contactcon.com/blog 4
Recollect Engineering http://code.recollect.com/ 4
Timetric Blog http://blog.timetric.com 4
SpatialKey blog http://blog.spatialkey.com 4
Personalization Blog http://blog.rapleaf.com 4
MongoLab http://blog.mongolab.com 4
MetaOptimize http://metaoptimize.com/blog 4
The Kontagent Blog http://kaleidoscope.kontagent.com 4
Formulists http://formulists.com/ 4
Edwin Chen’s Blog » Bayesian Confidence Intervals: Obama’s ‘That’-Addition and Informality http://blog.echen.me/ 4
Data Miners Blog http://blog.data-miners.com/ 4
Connect.Me Blog http://blog.connect.me 4
BuzzData | Blog http://blog.buzzdata.com/ 4
A blog about changing the way we remember http://1000memories.com/blog 4
Welcome to the Cetas blog! http://03240c9.netsolhost.com/blog 4
A Notebook http://www.lisazhang.ca/ 3
Kin lane http://www.kinlane.com/ 3
Humanoid http://tumblr.gethumanoid.com/ 3
Semantic Void http://semanticvoid.com/blog 3
rachelbinx http://rachelbinx.com 3
Proximal Labs http://proximallabs.com 3
Scribbled thoughts on my research topics http://mint.typepad.com/blog/ 3
Knight & Bishop http://knightbishop.com 3
b i t q u i l l » Sci & Tech http://www.bitquill.net/blog 3
Z-Blog http://www.zemanta.com/blog/this-week-in-the-blogosphere-bloggers-are-not-journalists-importance-of-seo-extraordinary-blogs/ 3
Nick Halstead http://nickhalstead.com 3
MailRank Home http://blog.mailrank.com 3
Extractiv http://extractiv.squarespace.com/blog/ 3
FluidDB http://blogs.fluidinfo.com/fluidinfo/2009/10/03/fluiddb-as-a-universal-metadata-engine/ 3
/#! http://blog.slashpoundbang.com/ 3
Keepcon http://blog.keepcon.com 3
The Next Edge http://thenextedge.org 2
brain of matpalm http://matpalm.com/blog 2
i314 – Research in Data Science http://i314.com.ar 2
All your data are belong to us http://blog.sitescraper.net/ 2
Data Big Bang Blog http://blog.databigbang.com 2
Human-based computation http://3form.org/blog 2
Yet Another Machine Learning Blog http://yamlb.wordpress.com 1
Michael Rihani http://www.michaelrihani.com 1
sudoscientist.com http://sudoscientist.com 1
Effectual Analysis http://blog.effectcheck.com 1
Big Data Craft http://bigdatacraft.com 1
7Puentes http://7puentes.com 1
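
The rows above all follow the same pattern: a title, a URL, and a PageRank value. As a minimal sketch (not the code from the original article), rows like these could be parsed, de-duplicated by URL, and ordered by rank roughly as follows. The function names `parse_rows` and `dedupe_and_sort` are hypothetical, and the row format is an assumption based on the list as it appears here; fetching the PageRank values themselves is not covered.

```python
import re

# Assumed row format: "<title> <url> <rank>", where the rank is a
# trailing integer and the URL is the http(s) token just before it.
ROW = re.compile(r"^(?P<title>.*?)\s+(?P<url>https?://\S+)\s+(?P<rank>\d+)$")


def parse_rows(lines):
    """Turn text rows into (title, url, rank) tuples, skipping non-matching lines."""
    rows = []
    for line in lines:
        m = ROW.match(line.strip())
        if m:
            rows.append((m["title"], m["url"], int(m["rank"])))
    return rows


def dedupe_and_sort(rows):
    """Keep the first row seen for each URL, then order by rank, highest first."""
    seen = set()
    unique = []
    for title, url, rank in rows:
        key = url.rstrip("/")  # treat "http://x.com/" and "http://x.com" as the same
        if key not in seen:
            seen.add(key)
            unique.append((title, url, rank))
    # sorted() is stable, so rows with equal rank keep their input order.
    return sorted(unique, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    sample = [
        "Peter Norvig http://www.norvig.com 6",
        "Freakonomics http://www.freakonomics.com 7",
        "Quantified Self http://quantifiedself.com 6",
        "Quantified Self http://quantifiedself.com 6",
    ]
    for title, url, rank in dedupe_and_sort(parse_rows(sample)):
        print(rank, title, url)
```

De-duplicating on the URL rather than the title is a design choice: several entries in the list share a URL under different scraped titles.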

  • http://twitter.com/kdnuggets Gregory Piatetsky

    Good list, but it is missing KDnuggets (www.kdnuggets.com), which has a PageRank of 6 and is a leading site and blog on Business Analytics, Big Data, Data Mining, and Data Science.

    • http://blog.databigbang.com/ Sebastian Wain

      I read your site directly in the browser while the others are read via Google Reader!

      • http://twitter.com/kdnuggets Gregory Piatetsky

        KDnuggets News also has an RSS feed (http://www.kdnuggets.com/rss.xml), but what will you do since Google Reader is going away?

        • http://blog.databigbang.com/ Sebastian Wain

          I know that KDnuggets has an RSS feed, but there are a few sites that I check “manually”.

          I am in denial about the fact that Google Reader is going away. I don’t know what I’ll do :-( There are some alternative readers, but Google Reader has two important features:

          - Virtually infinite feed history (beyond what the site’s current feed contains), so you can add a feed now and navigate to much older articles.

          - Automatic translation of feeds. For example, I have a feed in Russian and I read it in English. When I share a good article from that blog, my friends ask me if I know Russian, since it is obviously shared in the original language…

          Do you have any recommendations for a Google Reader replacement? I have 1522 feeds there.