Downloading folders from Google Drive

I wanted to download some course material on RL that the author had shared via Google Drive, using the command line. I got a bunch of the files using wget, but downloading a whole folder from Google Drive was a challenge. I looked it up on SO, which gave me a hint but no solution. In the end I installed gdown using pip and then used: gdown --folder --continue followed by the folder's URL. If the folder holds more than 50 files you need to add --remaining-ok, and you only get the first 50. In such a case it is best to download the folder using the web UI and decompress it locally. Decompressing from the command line produced errors related to Unicode, but with the Mac UI it decompressed without a glitch.
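The command above can also be scripted. The following is a minimal sketch that only assembles the gdown command line from the flags mentioned in the text; the helper names `build_gdown_command` and `download_folder` and the `FOLDER_ID` placeholder are assumptions made for illustration, not part of gdown.

```python
import subprocess


def build_gdown_command(folder_url, resume=True, remaining_ok=False):
    """Build the argument list for a gdown folder download (hypothetical helper)."""
    cmd = ["gdown", "--folder"]
    if resume:
        # --continue lets a re-run pick up partially downloaded files.
        cmd.append("--continue")
    if remaining_ok:
        # --remaining-ok accepts that only the first 50 files are fetched.
        cmd.append("--remaining-ok")
    cmd.append(folder_url)
    return cmd


def download_folder(folder_url, **options):
    """Run gdown for the given folder; requires gdown to be installed via pip."""
    subprocess.run(build_gdown_command(folder_url, **options), check=True)


# Example (placeholder URL, replace FOLDER_ID with the real folder ID):
# download_folder("https://drive.google.com/drive/folders/FOLDER_ID")
```

Keeping the command construction separate from the call makes it easy to print and check the command before running it.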

Data Science Glossary - P

Page view
is an instance of a page being loaded (or reloaded) in a browser.
Panel data
is another name for longitudinal data, generally collected over time in the context of market research. (See also cohort analysis.)
Persona
is a market research technique used by web site or mobile app developers to model customer needs. Archetypes and segmentation are similar techniques, which allow market researchers and UI/UX designers to explore clients based on either a typical representation or an idiosyncratic one.
PII
is an acronym for Personally Identifying Information.
Web property
is either a web site or an application
PCA
is an acronym for principal component analysis. It is a dimensionality reduction method in multivariate statistics that extracts the principal components: linearly uncorrelated dimensions, ordered by how much of the variance they explain, from a dataset whose variables may be linearly correlated. PCA is often employed in exploratory data analysis and to create predictive models.
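As a rough illustration of the definition above, here is a minimal sketch of PCA using the singular value decomposition in NumPy. The function name `pca` and the synthetic two-column dataset are assumptions made for this example.

```python
import numpy as np


def pca(data, n_components):
    """Project data (rows = observations) onto its top principal components."""
    centered = data - data.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    explained_variance = singular_values[:n_components] ** 2 / (len(data) - 1)
    scores = centered @ components.T
    return scores, components, explained_variance


rng = np.random.default_rng(0)
x = rng.normal(size=200)
# Two strongly correlated columns plus a little noise.
data = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=200)])
scores, components, variance = pca(data, n_components=2)
```

The columns of `scores` are uncorrelated with each other even though the input columns are not, and nearly all of the variance lands in the first component.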
PPC
is an acronym for pay per click, an advertising model in which the advertiser pays each time an ad is clicked.
Power
also called statistical power, is, in statistical inference, the ability of an experiment to detect an effect when the effect exists, i.e. the probability of rejecting the null hypothesis when it is false.
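To make the idea concrete, the sketch below computes the power of a two-sided one-sample z-test. The choice of test, the known standard deviation and the function name `z_test_power` are assumptions made for illustration; other tests need other formulas.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true effect shifts the test statistic, in standard errors.
    shift = effect * sqrt(n) / sigma
    # Probability that the statistic lands in either rejection region.
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail
```

With no effect at all the function returns the significance level itself, and for a fixed effect the power grows with the sample size.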
Processing
is the component of the Google Analytics platform that organizes raw data into users and sessions, adds data from other sources, and applies configuration settings to transform the raw data into database tables for reporting.
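The "organizes raw data into users and sessions" step can be pictured with a simplified sketch. This is not Google Analytics' actual implementation: the hit format (user id, timestamp in seconds), the function name `sessionize` and the 30-minute inactivity timeout are assumptions made for illustration.

```python
from collections import defaultdict

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed inactivity timeout


def sessionize(hits):
    """Group (user_id, timestamp_seconds) hits into per-user sessions."""
    by_user = defaultdict(list)
    for user_id, timestamp in hits:
        by_user[user_id].append(timestamp)

    sessions = {}
    for user_id, timestamps in by_user.items():
        timestamps.sort()
        user_sessions = [[timestamps[0]]]
        for ts in timestamps[1:]:
            # A gap longer than the timeout starts a new session.
            if ts - user_sessions[-1][-1] > SESSION_TIMEOUT_SECONDS:
                user_sessions.append([ts])
            else:
                user_sessions[-1].append(ts)
        sessions[user_id] = user_sessions
    return sessions
```

Each user ends up with a list of sessions, and each session is the list of hit timestamps that fall within the timeout of one another.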
Property
is a sub-component of a Google Analytics account that determines which data is organized and stored together. Any resource tagged with the same Property ID is collected and stored together. A single property can be used to track one website or mobile app, or be a roll-up of the data from multiple sites or mobile apps.
Propensity score matching
also called PSM, is a matching technique developed by Paul Rosenbaum and Donald Rubin in 1983 and used in econometrics to analyse experiments in which some subjects are given a treatment whose effect is being measured. Ideally, the groups in these experiments are randomised. However, PSM can help to correct a bias that is often introduced when the groups are self-selected, as might happen in an observational study. A matching procedure (one of nearest neighbor, caliper matching, Mahalanobis metric matching, stratification matching, difference-in-differences matching or exact matching) is used to select a subsample of the data points on which to test the hypothesis. An important issue with PSM is that, since it makes the control group more similar to the treated group only on the observed variables, it can increase bias from latent (unobserved) variables, called confounders, which will adversely affect the results of the analysis.
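One of the matching procedures listed above, nearest neighbor matching with a caliper, can be sketched as follows. The sketch assumes the propensity scores have already been estimated (for example with a logistic regression); the function name `match_nearest_neighbor`, the dict-based input and the default caliper of 0.05 are assumptions made for illustration.

```python
def match_nearest_neighbor(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    treated, control: dicts mapping a unit id to its propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control unit by propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        # Keep the pair only if it falls within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs
```

Treated units with no control unit inside the caliper are dropped, which is how the procedure ends up testing the hypothesis on a subsample rather than on the full dataset.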

