This blog is where I share insights from any number of adventures in data analysis.
I will cover best practices from modeling and analysis projects; share tips on using new tools; outline new projects; recount war stories from Wikipedia and other FOSS projects I contribute to; and discuss challenges in information retrieval, natural language processing tricks, game-theoretic insights, portfolio analysis, and social network analysis.
I wanted to use the command line to download some course material on RL that the author had shared via Google Drive. I got a bunch of stuff using wget, but a folder in Google Drive was a challenge. I looked it up on SO, which gave me a hint but no solution. I installed gdown using pip and then used: gdown --folder --continue https://drive.google.com/drive/folders/1V9jAShWpccLvByv5S1DuOzo6GVvzd4LV If there are more than 50 files, you need to add --remaining-ok, and you only get the first 50. In such a case it's best to download the folder using the UI and decompress locally. Decompressing from the command line produced Unicode-related errors, but using the Mac UI I decompressed without a glitch.
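To make the command above reusable from a script, here is a minimal sketch in Python. It only wraps the gdown command line with the flags mentioned in this post; the function names build_gdown_command and download_folder are my own, hypothetical helpers, not part of gdown, and the sketch assumes gdown is already installed with pip and available on the PATH.

```python
import shlex
import subprocess
from typing import List


def build_gdown_command(folder_url: str, remaining_ok: bool = False) -> List[str]:
    """Assemble the gdown invocation from this post as an argument list.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    cmd = ["gdown", "--folder", "--continue", folder_url]
    if remaining_ok:
        # Accept a partial result for folders with more than 50 files:
        # only the first 50 are downloaded.
        cmd.append("--remaining-ok")
    return cmd


def download_folder(folder_url: str, remaining_ok: bool = False) -> int:
    """Run gdown and return its exit code (assumes `pip install gdown`)."""
    cmd = build_gdown_command(folder_url, remaining_ok)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode
```

Calling download_folder(url) with the folder URL runs the same command as above; passing remaining_ok=True adds --remaining-ok for large folders, with the same 50-file limit.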
Thanks to the kind support of Wikimedia Israel, I attended Wikimania 2016 in the alpine village of Esino Lario, located in the Lombardy region of northern Italy. I was also able to attend the Hackathon event. Compared to other Wikimedia conferences I have attended, this was a challenging event.
The Wikimedia Foundation has undergone a management crisis, with the result that staff have been leaving in droves. For this Wikimania, the number of scholarships for staff was restricted, so attendance by staff developers was much lower, with many development teams not represented. This has had two outcomes: the downside is that it is unclear what is being developed by WMF these days; the upside is that community projects and developers got more emphasis.
Two such projects are Kiwix and the "community wishlist". Kiwix is the brainchild of Emmanuel Engelhart (User:Kelson), and this Wikimania was preceded by a week-long hackathon focused on Kiwix. The Kiwix software is an interesting project in many ways. Its main focus is an offline version of Wikipedia, which might at first seem unnecessary in a highly connected world. It is not a WMF project, though it has gotten some support from the foundation. The Kiwix hackathon included a couple of similar projects which distribute content like OpenStreetMap, Khan Academy and even TED talks for consumption at locations without internet connectivity, such as schools and colleges in China, India, and Africa. Other communities use offline Wikipedias too: one is prisoners, who do not have access to online materials; a second is people living under less enlightened regimes that persecute political activists and do not tolerate freedom of speech. Kiwix is translated by Wikipedia's translator community using translatewiki.net, created by Niklas Laxström (User:Nikerabbit).
Medical Wikipedia (Offline)
Yet another related project built on Kiwix is the "Medical Wikipedia" Android app. This is a version of Kiwix for Android which comes bundled with a cherry-picked collection of medical content from Wikipedia. Medical Wikipedia is a curated collection of high-quality materials which are being expanded and translated into more and more languages. The translations are done by Wikipedia volunteers and may often be the only available medical material in the target language. As such, they may have a very high impact on a population that has not been reachable due to a double threat: low connectivity with high communication costs, and a more troubling language barrier.
Growing focus on medicine
Over the last two years, I have only attended local events organised by my local chapter. During this time I have noticed on the social networks that more and more of my associates have been shifting their focus to medicine. On one or two occasions I even found myself volunteering at editing sessions on medical subjects coordinated and run by Shani Evenstein. This reinforces my belief that long-time Wikimedians will gravitate from their initial areas of interest to higher-impact areas. This Wikimania had a large number of medicine-related talks, and I take this opportunity to highlight these videos:
1. Wikipedia's coverage of medical topics by Lane Rasberry, Fred Trotter
This talk underscores a number of interesting points:
If "Attention is the new currency" then Wikipedia is the most requested, published, accessed and consulted source of health information in the world.
Success, and even the "best model of sharing" for medical information, are not yet understood.
Wikipedia is open for partnership with individuals and organizations, so long as they follow Wikipedia's guidelines.
The talk also looked at research by Heilman JM and West AG on Wikipedia and medicine (cited below).
2. Medical topics by James Heilman
3. Wikimania 2016 - Wikiversity Journal of Medicine by Mikael Haggestrom
4. Wikimania 2016 - Wikipedia Addiction and its Comorbidities by Kritzolina
This talk confirmed a number of things:
You got to be nuts to edit this Wikipedia thing :-) and it is definitely habit forming.
If people do not respond to policy, try appealing at an emotional level.
If that does not work, they may not be "sane" at the moment.
Heilman JM, West AG (2015). "Wikipedia and medicine: quantifying readership, editors, and the significance of natural language". J. Med. Internet Res. 17 (3): e62. doi:10.2196/jmir.4069. PMID 25739399.
SUL is Wikipedia's Single User Login system. The goal is to use it to authenticate Moodle users. There is one caveat: what happens if the Moodle user has no account? Moodle and MediaWiki account creation require different information. Since we'd like to use MediaWiki's standard, which is highly permissive, it is necessary to change Moodle's requirements.
Challenges
Moodle does not explicitly define an object for setting registration requirements. Admins have to accomplish this by editing a number of files manually.
Errors will place users in a limbo/blocked state...
This is implemented differently in different versions of Moodle.
It will break if the changes are overwritten by a system upgrade.
Directions
Ideally, Moodle should have a registration policy object which allows the admin to define which fields are required and whether they need to be unique. For example, Moodle complains if different users share an email address.
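To make the idea of a registration policy object concrete, here is a rough sketch in Python. Nothing in it is Moodle's or MediaWiki's actual API: FieldRule, RegistrationPolicy and the field names are hypothetical, chosen only to illustrate how an admin could declare which fields are required and which must be unique in one place instead of editing files by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldRule:
    """Hypothetical per-field rule: is the field required, must it be unique?"""
    required: bool = False
    unique: bool = False


@dataclass
class RegistrationPolicy:
    """Hypothetical policy object mapping field names to their rules."""
    rules: Dict[str, FieldRule] = field(default_factory=dict)

    def validate(self, candidate: Dict[str, str],
                 existing: List[Dict[str, str]]) -> List[str]:
        """Return a list of problems; an empty list means the signup is allowed."""
        errors = []
        for name, rule in self.rules.items():
            value = (candidate.get(name) or "").strip()
            if rule.required and not value:
                errors.append(f"{name} is required")
            elif rule.unique and value and any(
                user.get(name) == value for user in existing
            ):
                errors.append(f"{name} must be unique")
        return errors


# A permissive policy in the spirit of MediaWiki: only a unique username.
permissive = RegistrationPolicy({
    "username": FieldRule(required=True, unique=True),
    "email": FieldRule(),
})

# A stricter policy that also insists on a unique email address.
strict = RegistrationPolicy({
    "username": FieldRule(required=True, unique=True),
    "email": FieldRule(required=True, unique=True),
})
```

With a policy like this, switching between the permissive and the strict behaviour is a configuration change rather than a manual edit that a system upgrade can overwrite.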
AWS CloudFormation Pros and Cons
So I'm building a PaaS product that does ML-based optimisations, and that means doing work in the cloud. The ML is a neat feature, but without the basic product nothing will happen, and to bootstrap this project on AWS I tried to make use of CloudFormation, a service that automates creation and destruction of service stacks. Based on a week's worth of experimenting with CloudFormation, I will try to answer the question: "Is learning CloudFormation worth the effort?" Despite the rant, CloudFormation supports creation, updating and deletion of entire stacks of services. SAM is built on top of CloudFormation, and it has a visual editor. The way CloudFormation is described is that you can copy-paste snippets to create resources and build a library of reusable components. This is a simplistic point of view. In reality you need to bring properties, specify dependencies, and introduce signalling mechanisms to ensure your template works.
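To illustrate why snippets alone are not enough, here is a small sketch that builds a CloudFormation template as a Python dictionary. It shows the three pieces mentioned above: properties, an explicit dependency (DependsOn) and a signalling mechanism (CreationPolicy with ResourceSignal). The resource names DataBucket and Worker, the placeholder AMI id and the helper functions are my own assumptions for illustration, not taken from a real stack.

```python
import json
from typing import Dict, List


def build_template() -> Dict:
    """Return a minimal, illustrative CloudFormation template as a dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative sketch: properties, dependency, signalling",
        "Resources": {
            "DataBucket": {"Type": "AWS::S3::Bucket"},
            "Worker": {
                "Type": "AWS::EC2::Instance",
                # Explicit dependency: create the bucket before the instance.
                "DependsOn": "DataBucket",
                # Signalling: the stack waits up to 10 minutes for one success
                # signal, which the instance would have to send (for example
                # with cfn-signal) once its setup is finished.
                "CreationPolicy": {
                    "ResourceSignal": {"Count": 1, "Timeout": "PT10M"}
                },
                "Properties": {
                    "ImageId": "ami-00000000000000000",  # placeholder, not a real AMI
                    "InstanceType": "t3.micro",
                },
            },
        },
    }


def missing_dependencies(template: Dict) -> List[str]:
    """List DependsOn targets that are not defined in the template."""
    resources = template.get("Resources", {})
    missing = []
    for name, resource in resources.items():
        depends_on = resource.get("DependsOn", [])
        if isinstance(depends_on, str):
            depends_on = [depends_on]
        for target in depends_on:
            if target not in resources:
                missing.append(f"{name} -> {target}")
    return missing
```

Printing json.dumps(build_template(), indent=2) gives a JSON template, and missing_dependencies is a cheap local check for the kind of dangling dependency that a copy-pasted snippet tends to introduce.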