This blog is where I share insights from any number of adventures in data analysis.
I will cover best practices from modeling and analysis projects, share tips on using new tools, outline new projects, recount war stories from Wikipedia and other FOSS projects I contribute to, and discuss information retrieval challenges, natural language processing tricks, game-theoretic insights, portfolio analysis, and social network analysis.
I wanted to download some course material on RL that the author shared via Google Drive, using only the command line. I got a bunch of the files using wget, but downloading a whole Google Drive folder was a challenge. I looked it up on SO, which gave me a hint but no solution. I installed gdown using pip and then ran: gdown --folder --continue https://drive.google.com/drive/folders/1V9jAShWpccLvByv5S1DuOzo6GVvzd4LV If the folder holds more than 50 files, you need to add --remaining-ok, and you only get the first 50. In that case it's best to download the folder through the web UI and decompress it locally. Decompressing from the command line produced Unicode-related errors, but using the Mac UI I decompressed it without a glitch.
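If you want to script the same download, here is a minimal sketch in Python that wraps the gdown command above. The helper names build_gdown_command and download_folder are my own (hypothetical, not part of gdown), and the sketch assumes gdown is already installed with pip and on your PATH. It only uses the flags mentioned in this post.

```python
import shutil
import subprocess


def build_gdown_command(folder_url, remaining_ok=False):
    """Build the gdown command line for downloading a Google Drive folder.

    Hypothetical helper: it only assembles the flags described in the post.
    """
    cmd = ["gdown", "--folder", "--continue", folder_url]
    if remaining_ok:
        # Accept a partial download when the folder holds more than 50 files.
        cmd.append("--remaining-ok")
    return cmd


def download_folder(folder_url, remaining_ok=False):
    """Run gdown and return its exit code (assumes `pip install gdown`)."""
    if shutil.which("gdown") is None:
        raise RuntimeError("gdown not found on PATH; install it with pip first")
    return subprocess.run(build_gdown_command(folder_url, remaining_ok)).returncode


FOLDER_URL = "https://drive.google.com/drive/folders/1V9jAShWpccLvByv5S1DuOzo6GVvzd4LV"

# Show the command that would be executed; call download_folder(FOLDER_URL)
# to actually run it.
print(" ".join(build_gdown_command(FOLDER_URL, remaining_ok=True)))
```

Keeping the command builder separate from the function that runs it makes it easy to check the flags before touching the network.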
Business Analytics for Managers - Review
Wolfgang Jank's text "Business Analytics for Managers" is part of the Use R! series published in 2011 by Springer.
It stands out as an easy introduction to using R for exploratory data analysis and data modeling at an introductory level. While R and statistics are skills not easily acquired, working with the book makes for an easy learning curve: it focuses on the business side of the work and omits much of the platform-specific implementation detail. This makes some sense if the manager will ask someone else to work with R and deliver the results.
The text explains how to work with different types of data and how a manager would analyse the different data-sets. It explains the benefits of using a large cross section of R's visualization techniques for assessing unfamiliar data for global or cross-sectional patterns.
It then goes on to explain the essentials of data handling techniques such as the creation of dummy variables and interaction variables, variable transformation, and both linear and non-linear regression models. The reader will quickly grasp what the output of R's regression models means, how to compare the quality of different models, and whether statistically significant results are also of practical use in business applications.
Wolfgang Jank kindly furnished me with the data-sets he uses in the book. In this and the following posts I'll be adding my notes and the R code I used to follow along with the text and reproduce the results and their graphical visualization.
SUL is Wikipedia's Single User Login system. The goal is to use it to authenticate Moodle users. There is one caveat: what happens if the Moodle user has no account? Moodle and MediaWiki account creation require different information. Since we'd like to use MediaWiki's standard, which is highly permissive, it is necessary to change Moodle's requirements.
Challenges
Moodle does not explicitly define an object for setting registration requirements. Admins have to accomplish this by editing a number of files manually. Errors will place users in a limbo/blocked state... This is implemented differently in different versions of Moodle. It will break if the changes are overwritten by a system upgrade.
Directions
Ideally Moodle should have a registration policy object which allows the admin to define which fields are required and whether they need to be unique. For example, Moodle complains if different users share an email adr
AWS CloudFormation Pros and Cons
So I'm building a PaaS product that does ML-based optimisations, and that means doing work in the cloud. The ML is a neat feature, but without the basic product nothing will happen, so to bootstrap this project on AWS I tried to make use of CloudFormation, a service that automates the creation and destruction of service stacks. Based on a week's worth of experimenting with CloudFormation, I will try to answer the question: "Is learning CloudFormation worth the effort?" Despite the rant, CloudFormation supports creating, updating, and deleting entire stacks of services. SAM is built on top of CloudFormation, and it has a visual editor. The way CloudFormation is described, you can copy and paste snippets to create resources and build a library of reusable components. This is a simplistic point of view. In reality you need to bring properties, specify dependencies, and introduce signalling mechanisms to ensure your template works. T