Data Anxiety

Tempus fugit

#common sense

Data: Even if it isn't big, it should be clever »

BuzzData blog:

I’ve worked with data throughout my career (possibly for my sins).

I’m not a data scientist, but even before BuzzData, barely a day went by where an Excel spreadsheet or CSV file didn’t end up in my inbox. Sales forecasts, cost-benefit analyses, market research, product test results, benchmarks…

Is this intended as a universal ETL? I’m not sure. It would be nice though. We could certainly use something like that. 


Can you say “death spiral” or even “fraud”?

Grumpy Old Accountants certainly can, and with relish!

Overstock has attracted analyst criticism for years…we just couldn’t resist taking a look ourselves, and the current vitals do not look good.

Inspection of the Overstock 2011 10-K provided ample evidence.

There’s been far too much non-GAAP reporting lately (GAAP being generally accepted accounting principles, i.e. accepted standards with checks and balances), be it in pre-IPO S-1s released by certain other companies (you know who I mean…) or in similarly creative accounting by Overstock. At least, that is my opinion, and I am not a Certified Public Accountant.

The grumpy accountants continue,

…we’re traditionalists, so the first thing we did was to compute the Altman Z-score.  After all, if you can’t find a pulse, there isn’t much use in testing for other signs of financial health. 

Fear not!

There is no slavish devotion to models here. Remember, these accountants are academicians. They probably need to toss in a statistical aside now and then. The single table of model results, 8 rows x 2 columns, is backed up as follows:

Driving these results are negative earnings before interest and taxes, as well as negative retained earnings. Additionally… values are driven by the erosion of shareholders’ equity. The debt/equity ratio rests at a staggering 12.56!
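For anyone who hasn't met the Altman Z-score, here is a minimal sketch of how it is computed. It assumes the original 1968 formulation for publicly traded manufacturers; the quoted post doesn't say which variant the accountants used. The input figures below are made up for illustration and are not Overstock's numbers.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Inputs for the score, all in the same currency unit."""
    working_capital: float
    retained_earnings: float
    ebit: float  # earnings before interest and taxes
    market_value_equity: float
    total_liabilities: float
    sales: float
    total_assets: float


def altman_z(f: Financials) -> float:
    """Altman Z-score, original 1968 coefficients (an assumption here)."""
    x1 = f.working_capital / f.total_assets
    x2 = f.retained_earnings / f.total_assets
    x3 = f.ebit / f.total_assets
    x4 = f.market_value_equity / f.total_liabilities
    x5 = f.sales / f.total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def zone(z: float) -> str:
    """Conventional cut-offs for the original model."""
    if z < 1.81:
        return "distress"
    if z <= 2.99:
        return "grey"
    return "safe"


# Hypothetical company with negative EBIT and negative retained earnings.
example = Financials(
    working_capital=-10.0,
    retained_earnings=-50.0,
    ebit=-5.0,
    market_value_equity=40.0,
    total_liabilities=120.0,
    sales=200.0,
    total_assets=130.0,
)

if __name__ == "__main__":
    z = altman_z(example)
    print(f"Z = {z:.2f} ({zone(z)})")
```

Negative EBIT and negative retained earnings pull three of the five terms below zero, which is why those two items dominate the result in the passage quoted above.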

I’d keep my ears perked up for news from Overstock’s auditors. They should be blowing the whistle on this company, as it isn’t likely to continue as a going concern much longer.

The problem with Big Data


So today I’m unleashing my pent-up rage on the “Big Data” crew; devotees and neophytes alike.

You do not have a big data problem. You have a functional ignorance problem.

Go back and read that a few times if necessary. Or to put it another way:

“Before you turned to big data, did you first try ‘small data’(tm)”

Or to put it yet a third and more direct way

“What’s your question?”

Most people who are “turning to big data” in their time of need don’t even know the question that they are questing for. As a result, many of the current “big data” set (pun intended) are collecting exabytes of data to hide their collective ignorance…

Start-ups are cheapening the term by using it to prop up an endless series of questionable business models and generally bad ideas.

The companies that are at the forefront of “Big Data” are there because they are solving interesting problems - at scale. Most people mistake problems at scale for the scale of their problems. What do I mean by this? Consider: Google set out to build a system that could search the web and help you find the information you wanted when you wanted it. They did not set out to build a business on top of map reduce… Netflix started out with the mission of helping me find the movies that I want to watch and get them to me… They didn’t start out with the mission of becoming the leader in applied machine learning and recommendations. The product provides the reason for the technology.

What we now call machine learning was called descriptive statistics in the ’60s and ’70s…

    Models often have hidden flaws that only get exposed in the real world. Your model can be fine with test data, but break on production data. The sad part is that you won’t know that it’s broken until your auto suggestion algorithm starts recommending home Euthanasia kits to people searching for elder care books.

    This post isn’t to vilify every company that mentions data analysis as being core to their product… This post is an attempt to throw the wet blanket of reality onto the bonfire of investment that seems to be throwing perfectly good VC cash down the drain in the hopes that analyzing the data from your last game-mechanic-social-coupon-buying hot thing will finally have them making money instead of spending it….

    Three places to look before you hit big data

    Classical descriptive statistics.

    Every time you map reduce without drawing a box plot, God murders a marmoset. For most data sets, starting with box plots and histograms does no harm and provides valuable insight on how to proceed. Far too often, we have one tiny nail to drive and reach for a gigantic sledgehammer.
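To make the "small data first" point concrete, here is a minimal sketch of the numbers behind a box plot, using only Python's standard library. The sample values are hypothetical, and the 1.5 × IQR outlier rule is the usual box-plot convention rather than anything the quoted author specifies.

```python
import statistics


def five_number_summary(values):
    """Min, quartiles and max: the numbers a box plot draws."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


def outliers(values, k=1.5):
    """Points beyond k * IQR from the quartiles (conventional whisker rule)."""
    s = five_number_summary(values)
    iqr = s["q3"] - s["q1"]
    low, high = s["q1"] - k * iqr, s["q3"] + k * iqr
    return [v for v in values if v < low or v > high]


def histogram_counts(values, bins=4):
    """Counts per equal-width bin; the top edge falls in the last bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


# Hypothetical measurements with one suspicious value.
sample = [12, 15, 14, 10, 13, 11, 16, 14, 95]

if __name__ == "__main__":
    print(five_number_summary(sample))
    print("outliers:", outliers(sample))
    print("histogram:", histogram_counts(sample))
```

A summary like this takes seconds to run and shows immediately whether one odd value is distorting the rest, which is worth knowing before reaching for anything heavier.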


    Computers are awesome at simulating things… Instead of recording every possible piece of information - try recording a little bit of information and simulating the rest. This very technique lies at the heart of one of the more powerful techniques in the statistical toolbox: Monte Carlo Simulation.
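As a rough illustration of "record a little, simulate the rest", the sketch below assumes all that was recorded is the mean and standard deviation of daily demand, and simulates monthly totals from that. The normal distribution, the figures and the function names are assumptions made for the example; they do not come from the quoted post.

```python
import random


def simulate_monthly_totals(mean_daily, sd_daily, days=30, trials=10_000, seed=0):
    """Monte Carlo: draw each day's demand at random and sum over the month."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    totals = []
    for _ in range(trials):
        # Demand cannot be negative, so clamp each draw at zero.
        total = sum(max(0.0, rng.gauss(mean_daily, sd_daily)) for _ in range(days))
        totals.append(total)
    return totals


def prob_exceeds(totals, threshold):
    """Share of simulated months whose total is above the threshold."""
    return sum(t > threshold for t in totals) / len(totals)


if __name__ == "__main__":
    # Hypothetical summary statistics: 100 units a day, give or take 20.
    totals = simulate_monthly_totals(mean_daily=100, sd_daily=20)
    print("P(month > 3200 units):", prob_exceeds(totals, 3200))
```

Two recorded numbers are enough to answer a question such as "how often would a month exceed our stock?", provided the distributional assumption is reasonable for the data at hand.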

    Ignore it and it Might go Away / be Unimportant

    That’s right - my favorite technique is to go focus on something else: Like your business model… Some questions just aren’t that important or intriguing. Let me revise that: MOST questions just aren’t that important. Are you asking questions that are central to improving your product or service?

    We live in interesting times for data. If we are careful with the ideas that are now available to us, and the terms we use to describe them, data big and small will start driving decisions that can help everyone.

    If we continue down this road of using “big data” as a crutch for weak business models and terrible “products” … VCs, investors and industry will see our statistically driven ideas as mere tech-bubble snake oil.


Stop flipping me off!  The state TOLD me to merge all the way up here!

Click through and see what the State of Minnesota has to say about this. The official page DOES seem to say that it is okay to merge “all the way up” there, actually! Great for infuriating other drivers….



    LastPass Demonstrates Impeccable Crisis Handling »

    If an organization experiences a security breach, or any sort of business liability “event”, it is best to acknowledge it, and deal with the problem immediately.

    What’s done is done, a sunk cost. And that is what LastPass did a few weeks ago, when it experienced a liability event.