Downloading folders from Google Drive

I wanted to download some course material on RL, shared by the author via Google Drive, using the command line. I got a bunch of stuff using wget, but a folder in Google Drive was a challenge. I looked it up on SO, which gave me a hint but no solution. I installed gdown using pip and then used: gdown --folder --continue. If there are more than 50 files you need to use --remaining-ok, and you only get the first 50. In such a case it's best to download the folder using the UI and decompress it locally. Decompressing from the command line produced errors related to Unicode, but using the Mac UI I decompressed it without a glitch.
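The gdown call above can also be scripted. What follows is only a minimal sketch, assuming gdown is installed and on the PATH; the function names build_gdown_command and download_folder are made up for illustration, and the folder URL in the usage note is a placeholder, not a real link.

```python
import shutil
import subprocess


def build_gdown_command(folder_url, remaining_ok=False):
    """Assemble the gdown command line used in the post.

    remaining_ok adds --remaining-ok, which lets gdown proceed when the
    folder holds more than 50 files (only the first 50 are fetched).
    """
    cmd = ["gdown", "--folder", "--continue"]
    if remaining_ok:
        cmd.append("--remaining-ok")
    cmd.append(folder_url)
    return cmd


def download_folder(folder_url, remaining_ok=False):
    """Run gdown on a shared folder link (hypothetical helper)."""
    if shutil.which("gdown") is None:
        raise RuntimeError("gdown not found; install it with: pip install gdown")
    # check=True raises CalledProcessError if gdown exits with an error.
    return subprocess.run(build_gdown_command(folder_url, remaining_ok), check=True)
```

Usage would look like download_folder("https://drive.google.com/drive/folders/FOLDER_ID", remaining_ok=True), with FOLDER_ID replaced by the ID from the shared link.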

Organizational challenges for the cult(ure) of analytics

All things data 2015

Last month I had the good fortune to attend the "All Things Data 2015" conference in Tel Aviv. The turnout was impressive: despite a stiff entrance fee, almost a thousand people came to learn from a lineup of local and international "experts" on web analytics. The crowd was a mix of consultants, sophisticated digital media buyers, analytics instructors and numerous web analysts, from newly established startups and from more established companies, all unified by their vested interest in monitoring their web presence.

Organizational challenges

While structural variations are not uncommon,
the ability to derive decisions from data remains
a major challenge for firms.
One of the first talks was given by Mr. Zvika Jerbi, a consultant working for SWC. Coming from an academic background, he lectured the audience on the organizational challenges of implementing a culture of analytics in more traditional business settings. The beginning of the talk hinged on defining what a data-driven company is. I am familiar with a number of companies that have embraced the data-driven approach to decision making, but they are few and far between. The data-driven approach rarely takes off due to a conflict with the existing organizational culture.

The data-driven company

What is a data-driven company anyhow?
So what is a data-driven company? A number of factors determine whether a company should be considered one. First and foremost is the availability of access to the company's data. A second requirement is that it should have well-established goals, especially for its customer-facing presence. This by itself is a requirement for any modern business, but what really needs to be asked in this context is to what degree the goals are defined in a way that lets outcomes be quantified using well-defined KPIs. A data-driven company tracks usage reports for any significant changes that it launches, yet reporting is also a characteristic of most large companies. Again, in this case the focus is on fostering a spirit of experimentation. So while more ideas will get to see the light of day, the mediocre ones will be weeded out that much faster. There is greater focus on performance than on features. Curiosity and good storytelling are rewarded. Such companies routinely conduct market research and are able to reverse engineer their customers using demographics and more business-centric segments; they use behavioral techniques and tools to expand the voice of their customers.

And where will data drive it?
Smaller companies lack the resources to implement an analytics plan or to take action based on its results. Initially, there is too little data and its quality is too poor to use for driving decisions; neither the quality nor the quantity can justify the investment. But all too soon there is more data than can be handled, and they may be looking at big data, which brings great complexity to analyzing it. The decision makers who are most likely to ask for data are low to mid-level. Their questions beg timely answers. The big decision makers are higher up in the organizational hierarchy and accordingly further away from the data. They need, and usually get, highly processed summaries of real data or long-term information that has been vetted by middle and upper managers.

Enterprise entanglement

In enterprise settings, agility
is the primary challenge.
Next we learned that at the enterprise end of the spectrum data sets are big: they need ETL systems to process them, and private clusters or public cloud machines to store and process the data. But once the data is transformed into action or intelligence, it translates into a competitive advantage over smaller competitors. One advantage of smaller companies is agility. Enterprise settings tend to create a lot of "red tape", and while everyone has clearly defined responsibilities, these rarely include helping someone else get things done or looking at data they did not create or process. Even when backed by evidence, few decisions are final, since anyone may be overridden by the system of checks and balances. Here is a typical scenario.

"Mike, a mid-level manager, sees an opportunity in the marketplace. To nail it, he asks his market research department for the impact of creatives on different areas. Annie, an analyst working on product development, has done first-rate work before and is asked to figure out colors for the product and images for the creative. The data needed is warehoused in an insular unit called IT, which is run like Alcatraz by Isaac the 'IT guy'. IT and marketing operate on different time scales, and anyhow the raw data they produce is useless for Annie. She needs it to be cleaned up, serial numbers converted to labels and so on, a task of which Calvin from the data warehouse is quite capable. He transforms the raw data and combines it with other data sources.

Annie needs to explore it for a day or two before she can come up with a model that can actually do better than a coin toss. Then she needs another slice of data, and the model begins to make some interesting suggestions. When she meets with Mike, she finds that the marketing decisions have already been made, since a competitor has announced its entry into the market. Also, the suggestions made by the model are the exact opposite of what was decided by Mike's superior. So by the time the results arrive, weeks have passed and the fleeting opportunity has passed away."

In the majority of business cases, the gap between the data and the people who need it the most, or who could utilize it, is difficult to traverse, and getting new queries processed takes too long. Decision makers are clustered at the head of the hierarchy, while IT staff and data analysts are rarely capable of making business decisions. There is a long delay associated with getting information out of large data warehouses, cleaning it, processing it, using it to make models and then using these in the production of data-driven services. A data scientist at Facebook described a situation where a 4-month cycle was required to get at the information needed to design new user interface features. Larger clients are more frequently interested in implementing in-depth reporting than in an analytics service.

The main lesson for introducing analytics in enterprise settings is to start small, which is about reducing risks, costs, overhead and, most important, time to action.

Starting up small

At the other extreme are small and medium businesses, which are leaner in terms of both personnel and business processes.

"Alice is a web designer. She works closely with Will, a web developer living abroad. All the data is either in the cloud-based website (in the log files), in a cloud-hosted database or in the cloud-based CRM solution. There is no data warehouse yet, and the first point of attack is Google Analytics. Paul is the VP of sales, and PPC work falls under his domain. He has never had time to study AdWords or GA, and so their setup is like that of 99% of users: it is set up for the most basic reporting, and nothing is in the works to adapt it to their business. No custom segments or business-specific data are in use. So analytics is used to track two landing pages in a supporting role for PPC. While Alice, Will, and Paul can get the raw data, converting it into evidence for making a decision is quite demanding, and at the end of the day they are not paid to analyze data. Falling behind on their daily tasks has its consequences. So all decisions are made by Harris, their CEO. Harris is very hands-on, and when it comes to the website he constantly tries to 'borrow' ideas from the competition, from Google or from Yahoo. When presented with a choice he will listen to any opinions, and eventually end up taking a look at what the competition is doing and take that line."

Data, even if it makes an appearance, will be challenged, and when a HiPPO (the highest-paid person's opinion) weighs in, you don't want to get in his way. Collecting statistically significant amounts of data is not a simple undertaking. Implementing an analytics solution has a significant cost, primarily in acquiring qualified manpower. The event horizon for reaching break-even on such an undertaking is often further away than a small business is able to consider.

How to promote analytics in these settings?

The lessons are: Get into the habit of looking at your data daily. Use some insights to get more eyes looking at data periodically. Once reporting is in place, prioritize analytics tasks which can lead the company to take action, e.g. brand uplift, acquiring new users, reducing churn or optimizing resources. Next, go after any low-hanging fruit: hypotheses that are easy to test and metrics that are actionable. Once you learn to deliver analysis with an action plan on a steady basis, you not only foster trust in a data-driven approach but also create some hunger for it. To further expand the sphere of influence, automate the process of A/B tests for continual improvement, or conduct other randomized tests of new ideas. Deliver these results early in the morning as a morning coffee blurb. Finally, you should take presenting your results to a higher level by delivering them as "data stories", together with infographics and concise visualizations.
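To make the automated A/B test and the morning coffee blurb a little more concrete, here is a minimal sketch using only the Python standard library. The talk did not prescribe any method. The two-proportion z-test is my own choice of technique, the function names are invented, and the numbers in the usage note are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test.

    conv_* are conversion counts and n_* are visitor counts for
    variants A and B. Returns (uplift, z, p_value), where uplift is
    the difference in conversion rate, B minus A.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No variation at all (e.g. zero conversions): nothing to test.
        return p_b - p_a, 0.0, 1.0
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


def morning_blurb(name, conv_a, n_a, conv_b, n_b, alpha=0.05):
    """One-line summary suitable for a daily e-mail (hypothetical format)."""
    uplift, z, p_value = ab_test(conv_a, n_a, conv_b, n_b)
    verdict = "significant" if p_value < alpha else "not conclusive yet"
    return (f"{name}: B vs A uplift {uplift:+.2%} "
            f"(z={z:.2f}, p={p_value:.3f}), {verdict}")
```

Calling morning_blurb("Landing page headline", 200, 10000, 260, 10000) on these made-up counts reports the uplift together with a verdict. A scheduler could run it on the previous day's numbers every morning.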

