Posts

Showing posts from November, 2017

Downloading folders from Google Drive

I wanted to download some course material on RL, shared by the author via Google Drive, using the command line. I got a bunch of it using wget, but a folder in Google Drive was a challenge. I looked it up on SO, which gave me a hint but no solution. I installed gdown using pip and then used:

gdown --folder --continue https://drive.google.com/drive/folders/1V9jAShWpccLvByv5S1DuOzo6GVvzd4LV

If the folder holds more than 50 files, you need to add --remaining-ok, and you only get the first 50. In that case it's best to download the folder using the UI and decompress it locally. Decompressing from the command line produced errors related to Unicode, but using the Mac UI I decompressed it without a glitch.
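For scripting the same download, here is a minimal sketch in Python that assembles the gdown call described above. It assumes gdown was installed with pip and is on the PATH. The helper names build_gdown_folder_command and download_folder are hypothetical, made up for illustration; the only flags used are the ones mentioned in this post.

```python
import shlex
import subprocess


def build_gdown_folder_command(folder_url, remaining_ok=False):
    """Return the gdown argument list for a shared Drive folder.

    Hypothetical helper: it only assembles the flags named in the post.
    """
    command = ["gdown", "--folder", "--continue"]
    if remaining_ok:
        # Accept a partial result when the folder holds more than 50 files.
        command.append("--remaining-ok")
    command.append(folder_url)
    return command


def download_folder(folder_url, remaining_ok=False):
    """Run gdown; assumes gdown is on PATH and the network is reachable."""
    command = build_gdown_folder_command(folder_url, remaining_ok)
    subprocess.run(command, check=True)  # raises if gdown exits with an error


if __name__ == "__main__":
    url = "https://drive.google.com/drive/folders/1V9jAShWpccLvByv5S1DuOzo6GVvzd4LV"
    # Only print the command here, so it can be inspected before running it.
    print(shlex.join(build_gdown_folder_command(url, remaining_ok=True)))
```

Calling download_folder(url, remaining_ok=True) would then run the printed command. Keeping the command construction separate from the subprocess call makes it easy to check the flags without touching the network.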

The Happy winners #nlihack 2017

About the event The happy winners! Amir Aharoi, Oren Bochman and Chaim Cohen Last week (November 23-24, 2017) I had the pleasure of participating in the first National Library of Israel Hackathon. I've been to the NLI a few times with friends from the Wikimedia movement to instruct its staff and students about editing Wikipedia. But at the hackathon, the NLI opened its doors to the best and brightest minds to help out with tagging content and disseminating its extensive image database. The Team You can't win a hackathon without a great team. My team consisted of seven developers who have been part of the core community of Wikimedia developers in Israel and have been meeting irregularly since the International Wikimedia Hackathon organized by Wikimedia Israel last year in Israel. We had met about a week before the event at the local chapter's offices and discussed over pizza what we wanted to do and what the NLI had asked us to do. I realized that ...

My first BigQuery DWH

BigQuery is Google's Analytics Database Some notes about a project that has taken up lots of time recently. It was a media attribution dashboard for a client running several hundred campaigns. A POC version of the project had been created manually using spreadsheets, and we had to provide a drop-in replacement ASAP. I took up the task of migrating a spreadsheet-based BI to a more robust script- and SQL-based platform able to handle the rapidly accumulating data, which would soon overpower the spreadsheet's models. A secondary challenge was that the entire system we were analyzing was under development and would change daily. Despite its shortcomings as a classical database (missing triggers and in-schema protections), I chose BigQuery for its scalability and ease of integration. Despite its limitations, it soon felt like a perfect fit for this type of project. Data collection Data is currently acquired daily via API from various platforms: for example Google AdWords an...

How to take out the Trash from command line in Ubuntu 17.10

How to use the trash from the command line? The pain Setting up new projects is frequently time consuming, with many false starts until everything is set up right. In fact, once CI is set up, the version on the local machine is less important. I've been encountering this Ubuntu annoyance whenever starting a new project. I could create smart aliases for rm with a command line trash folder. But then there would be two trash folders. I just want to access the same trash folder from the command line that I can access through the desktop. It also turns out that this has been the subject of not one, not two, but at least three packages. The following option is quick, safe (as it is reversible) and lets us focus on the setup. Doing machine learning also creates big models and large downloaded data sets that can hog the limited fast storage. Still, I don't enjoy retraining a big model because I accidentally tossed out the last good model with all the previous runs. The reme...
