Data Anxiety

Tempus fugit

#startups

Interview with Amy Heineike - Mathematician

simplystatistics:


Amy Heineike is the Director of Mathematics at Quid, a startup that seeks to understand technology development and dissemination through data analysis….

I recall reading about the oddly named Quid via TechCrunch some months ago, and later about Amy Heineike’s achievement in snagging a much-desired job among the data, mathematics and statistics crowd! Both Amy and Quid seem to be doing well!

Background

I dug around and unearthed that TechCrunch post introducing Quid (September 2010), whose primary focus seemed to be a critique of the Quid site’s typeface. (There is no reason a quantitative analysis-based company is obliged to have an ugly website!)

Better coverage was available via a more recent New York Times article (October 2011):

Quid tracks job listings, customer wins and funding valuations at first, second, and third funding rounds to give venture investors a better yardstick — and better quantitative tools — in valuing startups…In fact, Quid is an offshoot of YouNoodle.com, which offered an intriguing calculator that it claimed could predict the exit or liquidity valuation of startups.

The following are excerpts from Simply Statistics’ interview of Ms. Heineike:

What skills do you think are most important for statisticians moving into the tech industry?

Technical statistical [expertise] is the foundation. You need to be able to take a dataset and discover and communicate what’s interesting about it for your users… A key part of that is being willing to engage with questions about where the data comes from (how it can be collected, stored, processed and QAed), how the analytics will be run (how will it be tested, distributed and scaled) and how people interact with it (through visualisations, UI features or static presentations?).

Generally speaking, the earlier stage the company that you join, the broader the range of skills you need, and the more scrappy you need to be about getting involved in whatever needs to be done. Later stage teams and big tech companies may have roles that are purer statistics.

There is a real opportunity for people who have good statistical and computational skills [at a graduate degree level] to get into the startup and tech scenes now. Getting involved in an open source project, working with version control in a team, or sharing your code on Github are all good ways to start.

It’s really important to be able to show that you want to build products though. Imagine the clients or users of the company and see if you get excited about building something that they will use.

Go ahead and read the full interview on Simply Statistics tumblr.

Leave enough time to have a look around while you are there: it is a fine tumblr for the mathematically and statistically inclined, or at least for those with practical inclinations!

30 Things I've Done at Startups »

via bufr:

  1. I wrote and pushed code to the production servers on my first day at work.
  2. To: team@company.com Message: “Just checked in 10,000 attendees on our new iPhone app I finished last night.” 
  3. I have had pixie sticks and Red Bull as a meal.
  4. I was in the room as…

I just found my way to a tumblr with a “coding title” that isn’t about programming, well, not the nitty gritty parts. It is The Bufr Overflow tumblr.

The problem with Big Data

evilmartini:

So today I’m unleashing my pent up rage on the “Big Data” crew; devotees and neophytes alike.

You do not have a big data problem. You have a functional ignorance problem.

Go back and read that a few times if necessary. Or to put it another way:

“Before you turned to big data, did you first try ‘small data’(tm)”

Or to put it yet a third and more direct way:

“What’s your question?”

Most people who are “turning to big data” in their time of need don’t even know the question that they are questing for. As a result, many of the current “big data” set (pun intended) are collecting exabytes of data to hide their collective ignorance…

Start-ups are cheapening the term by using it to prop up an endless series of questionable business models and generally bad ideas.

The companies that are at the forefront of “Big Data” are there because they are solving interesting problems - at scale. Most people mistake problems at scale for the scale of their problems. What do I mean by this? Consider, Google set out to build a system that could search the web and help you find the information you wanted when you wanted it. They did not set out to build a business on top of map reduce… Netflix started out with the mission of helping me find the movies that I want to watch and get them to me… They didn’t start out with the mission of becoming the leader in applied machine learning and recommendations. The product provides the reason for the technology. 

What we now call machine learning, was called descriptive statistics in the ’60s and ’70s…

    Models often have hidden flaws that only get exposed in the real world. Your model can be fine with test data, but break on production data. The sad part is that you won’t know that it’s broken until your auto suggestion algorithm starts recommending home Euthanasia kits to people searching for elder care books.

    This post isn’t to vilify every company that mentions data analysis as being core to their product… This post is an attempt to throw the wet blanket of reality onto the bonfire of investment that seems to be throwing perfectly good VC cash down the drain in the hopes that analyzing the data from your last game-mechanic-social-coupon-buying hot thing will finally have them making money instead of spending it….

    Three places to look before you hit big data

    Classical descriptive statistics.

    Every time you map reduce without drawing a box plot, God murders a marmoset. For most data sets, starting with box plots and histograms does no harm and provides valuable insight on how to proceed. Far too often, we have one tiny nail to drive and we reach for a gigantic sledge hammer.
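
To make the "box plots and histograms first" advice concrete, here is a minimal sketch in Python using only the standard library. It is not from the quoted post: the function names `five_number_summary` and `histogram_counts` are hypothetical, and the quartile convention (the "inclusive" method) is just one of several in common use.

```python
import statistics


def five_number_summary(values):
    """Return the five numbers a box plot draws: min, Q1, median, Q3, max."""
    ordered = sorted(values)
    # "inclusive" treats the data as the whole population; other quartile
    # conventions give slightly different Q1 and Q3 on small samples.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


def histogram_counts(values, bins=5):
    """Count how many values fall into each of `bins` equal-width buckets."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 when every value is identical.
    width = (hi - lo) / bins or 1
    counts = [0] * bins
    for v in values:
        # The maximum value lands exactly on the upper edge, so clamp it
        # into the last bucket.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up illustrative data
    print(five_number_summary(sample))
    print(histogram_counts(sample, bins=4))
```

A look at these few numbers is often enough to spot skew, outliers or an empty bucket before any heavier machinery is brought in.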

    Simulation

    Computers are awesome at simulating things… Instead of recording down every possible piece of information - try recording a little bit of information and simulating the rest. This very technique lies at the heart of one of the more powerful techniques in the statistical tool box: Monte Carlo Simulation.
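
As a rough illustration of "record a little, simulate the rest", the sketch below resamples a small set of recorded daily counts to build many plausible weeks. It is an assumption-laden toy, not the quoted author's method: the function names, the observed numbers and the choice of simple resampling with replacement (which treats days as independent) are all hypothetical.

```python
import random


def simulate_weekly_totals(observed_daily, trials=10_000, seed=0):
    """Simulate weekly totals by resampling observed daily counts.

    Each simulated week draws 7 days, with replacement, from the small
    sample that was actually recorded.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    return [sum(rng.choices(observed_daily, k=7)) for _ in range(trials)]


def probability_exceeds(totals, threshold):
    """Estimate the chance that a weekly total is above `threshold`."""
    return sum(t > threshold for t in totals) / len(totals)


if __name__ == "__main__":
    # Made-up example: ten recorded days of order counts.
    observed = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
    totals = simulate_weekly_totals(observed)
    print(probability_exceeds(totals, threshold=100))
```

The estimate is only as good as the small sample and the independence assumption behind it, but it answers a concrete question without storing every event.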

    Ignore it and it Might go Away/ be unimportant

    That’s right - my favorite technique is to go focus on something else: Like your business model… Some questions just aren’t that important or intriguing. Let me revise that: MOST questions just aren’t that important. Are you asking questions that are central to improving your product or service?

    We live in interesting times for data. If we are careful with the ideas that are now available to us, and the terms we use to describe them, data big and small will start driving decisions that can help everyone.

    If we continue down this road of using “big data” as a crutch for weak business models and terrible “products” … VCs, investors and industry will see our statistically driven ideas as mere tech-bubble snake oil.

    Fewer women in technology now than in 1991

    I was reminded of an article I read a few months ago about the “real reason women quit engineering.” Stemming The Tide: Why Women Leave Engineering summarizes the findings of a University of Wisconsin-Milwaukee study of 3,700 women with engineering degrees.

    They found that just one in four women who had left the field reported doing so to spend more time with family. And, unsurprisingly:

    Women engineers who were treated in a condescending, patronizing manner, and were belittled and undermined by their supervisors and co-workers were most likely to want to leave their organizations.

    News such as this can’t inspire young women to go into these fields…

    What percentage of women are participating in the more technical side of technology companies? Vastly fewer than men. According to U.S. government statistics, women accounted for 36 percent of IT professionals in 1991. They now account for only 25 percent of same.

    In an article last year in the Wall Street Journal [regarding] the lack of women in venture-backed startups:

    Only about 11% of U.S. firms with venture-capital backing in 2009 had current or former female CEOs or female founders… Start-up incubator Y Combinator has had just 14 female founders among the 208 firms it has funded.

    The “where-are-all-the-women” meme is familiar… But in start-up land, where the good idea is supposed to trump social status and everything else, the lack of women in positions of authority stands out.

    — Excerpt: Tech really is a man’s world 

    by Linda Forrest, Business Insider (August 2011)

    Admitting failure »

    The following is a quote from a clever sort named James Barnes (for full post see link above):

    The development community is failing to learn from failure. Instead of recognizing these experiences as learning opportunities, we hide them away out of fear and embarrassment.

    His post also mentions the website admittingfailure.com.

    It seems worth a glance for aspiring start-up types. Not as a scare tactic, more as a “this is the way of the world, but it can still turn out alright in the end” alert.