Pankaj Mishra’s NYT book review of Ardor

The philosophers of India, T. S. Eliot once wrote, “make most of the great European philosophers look like schoolboys.” Roberto Calasso seems to concur.

Calasso anticipates his reader wondering, “What can be the relevance of all we read in the Veda?” He is right to answer that such “microphysics of the mind” can bring about an “abrupt and disorientating shift of perspective” and, perhaps, snap us out of both naïve reverence for and smug disenchantment with the modern world.

“Ardor” outlines, in its own quirky way, that long-overdue and genuine intellectual cosmopolitanism.

Large Scale Deep Learning (repost of presentation by Jeff Dean @ Google)

[Image: Screenshot 2014-12-27 10.46.53]

The above should not be taken out of context as hubris. There are brute-force improvements that cannot be denied. The paradigm called deep learning – uncountable data + many layers of uncountable simple computations – is moving us monotonically closer to AI. Domain-specific shortcuts (e.g. human-programmed computers) will always have some advantage, but this margin will shrink.

The presentation describes numerous experiments showing the efficacy of inference based on artificial neural networks. These include significant improvements in speech recognition, image labelling and translation, and some success in more challenging areas such as analogies.

What is an Exabyte?

It is 1 million terabytes or 1 billion gigabytes.
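As a quick check of those figures, here is a minimal sketch that assumes decimal (SI) prefixes, where each step up is a factor of 1,000. The constant names are ours, chosen for illustration only.

```python
# Decimal (SI) storage units expressed in bytes.
# Assumption: SI prefixes (powers of 1,000), not binary ones (powers of 1,024).
GIGABYTE = 10**9
TERABYTE = 10**12
EXABYTE = 10**18

terabytes_per_exabyte = EXABYTE // TERABYTE
gigabytes_per_exabyte = EXABYTE // GIGABYTE

print(f"{terabytes_per_exabyte:,} TB per EB")
print(f"{gigabytes_per_exabyte:,} GB per EB")
```

Under binary prefixes the ratios would still be "a million" and "a billion" only approximately, so the decimal reading is the one that matches the sentence above.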

Forecasts from Cisco

The current Visual Networking Index forecast projects global IP traffic to nearly triple from 2013 to 2018. Overall IP traffic is expected to grow to 132 exabytes per month by 2018, up from 51 exabytes per month in 2013, a CAGR of 21 percent (Figure 1).
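The growth figures in that forecast can be sanity-checked with the standard compound annual growth rate formula. The sketch below is our own arithmetic, not Cisco's stated method, and the function name `cagr` is hypothetical.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted from the forecast: exabytes per month in 2013 and 2018.
start_eb, end_eb = 51, 132
years = 2018 - 2013

growth_multiple = end_eb / start_eb
rate = cagr(start_eb, end_eb, years)

print(f"growth multiple: {growth_multiple:.2f}x")
print(f"CAGR: {rate:.1%}")
```

The multiple comes out at roughly 2.6, which is what "nearly triple" is rounding up, and the annual rate lands at about 21 percent, consistent with the quoted CAGR.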

Figure 1. Cisco VNI Forecasts 132 Exabytes per Month of IP Traffic by 2018

Figure 2. Global Devices and Connections Growth

htmlwidgets: JavaScript data visualization for R


RStudio Blog

Today we’re excited to announce htmlwidgets, a new framework that brings the best of JavaScript data visualization libraries to R. There are already several packages that take advantage of the framework (leaflet, dygraphs, networkD3, DataTables, and rthreejs) with hopefully many more to come.

An htmlwidget works just like an R plot except it produces an interactive web visualization. A line or two of R code is all it takes to produce a D3 graphic or Leaflet map. Widgets can be used at the R console as well as embedded in R Markdown reports and Shiny web applications. Here’s an example of using leaflet directly from the R console:


When printed at the console the leaflet widget displays in the RStudio Viewer pane. All of the tools typically available for plots are also available for widgets, including history, zooming, and export to file/clipboard (note that when not running…


Spark Cluster on Google Compute Engine

We were up and running with R on GCE in close to an hour. Looking forward to giving Spark a go soon as well. More compute power on tap creates possibilities.

Ido Green

What is Spark and Why?

Apache Spark is an open source cluster computing system that aims to make data analytics fast: both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop. In the past, I wrote an intro on how to install Spark on GCE, and since then I have wanted to do a follow-up on the topic with a more real-world example of installing a cluster. Luckily for me, a reader of the blog did the work! So after I got his approval, I wanted to share his script with you.


It’s Time to Play Moneyball: The Investment Readiness Level

Lots to process and even more to practice, but it is tough to debate this data-driven approach to developing a concept into a business.

Steve Blank

Investors sitting through Incubator or Accelerator demo days have three metrics to judge fledgling startups – 1) great looking product demos, 2) compelling PowerPoint slides, and 3) a world-class team.

We think we can do better.

We now have the tools, technology and data to take incubators and accelerators to the next level. Teams can prove their competence and validate their ideas by showing investors evidence that there’s a repeatable and scalable business model. And we can offer investors metrics to play Moneyball – with the Investment Readiness Level.

Here’s how.


We’ve spent the last 3 years building a methodology, classes, an accelerator and software tools, and we’ve tested them on ~500 startup teams.

  • A Lean Startup methodology offers entrepreneurs a framework to focus on what’s important: Business Model Discovery. Teams use the Lean Startup toolkit: the Business Model Canvas + Customer Development process + Agile Engineering. These…


Lost in translation

Over the last couple of months I have had the opportunity to speak with a number of researchers, technologists, customers and developers in and around the area of large-scale, data-based inference. Work is progressing at a frenetic pace, and the technical accomplishments are indeed marvelous (Watson, Spark, etc.). However, we are still reaching for a coherent way to think about this apparently powerful but unfamiliar tool.

I am still unclear as to whether the monolith from Kubrick’s mind-bender would be us or “Big Data”. Figuring out a language to get the two sides to communicate is challenging; of that I am clear.

McKinsey had some advice from last year that probably had many left scratching their heads. We definitely need more work on this one.

The answer, simply put, is to develop a plan. Literally. It may sound obvious, but in our experience, the missing step for most companies is spending the time required to create a simple plan for how data, analytics, frontline tools, and people come together to create business value.