tag:blogger.com,1999:blog-73416132015590461682024-03-13T04:01:05.844+02:00Bayesian.NinjaThis blog is where I share insights from any number of adventures in data analysis.
I will cover best practices from modeling and analysis projects; share tips on using new tools; outline new projects; recount war stories from Wikipedia and other FOSS projects I contribute to; and discuss challenges in information retrieval, natural language processing tricks, game theoretic insights, portfolio analysis, and social network analysis.Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.comBlogger49125tag:blogger.com,1999:blog-7341613201559046168.post-72619487653826724252022-04-25T13:37:00.008+03:002022-04-25T13:37:57.301+03:00downloading folders from google drive.<p>I wanted to download some course material on RL shared by the author via Google Drive using the command line. </p><p>I got a bunch of stuff using wget, but a folder in Google Drive was a challenge. I looked it up on SO, which gave me a hint but no solution.</p><p>I installed gdown using pip and then used:</p>gdown --folder --continue https://drive.google.com/drive/folders/1V9jAShWpccLvByv5S1DuOzo6GVvzd4LV <p>If there are more than 50 files you need to use --remaining-ok, and you only get the first 50.</p><p>In such a case it's best to download the folder using the UI and decompress locally.</p><p>Decompressing from the command line produced errors related to Unicode, but using the Mac UI I decompressed without a glitch.</p><p><br /></p>Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-43844267363920116392020-02-24T13:51:00.001+02:002020-02-25T14:06:54.494+02:00Random Thoughts on Linear Regressions<h2>
Regression Analysis</h2>
<h3>
TLDR</h3>
Regression is one of the oldest and most powerful tools in a data scientist's toolbox. Under ideal conditions multiple linear regression would be the best and only tool a data scientist would want to use... In reality you would use a modern nonparametric variant or a different algorithm. Still, the main ideas I discuss here will pop up in many other models and algorithms. This post is my brain dump on regression - I'll update it as time allows to cover the many aspects of this technique.<br />
<br />
<a name='more'></a><br />
<h2>
Advantages</h2>
<ol>
<li>Regression models are easy to run and interpret.</li>
</ol>
<ul>
<ul>
<li>Can be used within a workflow to identify the most important positive and negative influences of <i>independent variables</i> on the <i>dependent variable</i>.</li>
<li>Easy to understand how well the data is used.</li>
<li>The dreaded <i>p-values</i> indicate the significance level of the <i>independent variables</i>.</li>
</ul>
<li>Provide not just a prediction but also an assessment of the errors</li>
<li>Benefits from visual exploration of the data and the regression model.</li>
<ul>
<li>Diagnostic plots can show whether the regression assumptions hold</li>
<li>Small multiples can expose dependence between variables</li>
</ul>
</ul>
<h2>
RealPolitik</h2>
Parametric regression easily ends up with too few or too many parameters for a good fit to a general function. When regression was invented by Carl Friedrich Gauss and Adrien-Marie Legendre it was called the <i>method of least squares</i>, and their data sets were a <b>tiny</b> set of astronomical observations, i.e. a small number of rows and a fixed number of independent variables. Fast forward to twenty years ago: we had millions of rows and usually a much smaller number of columns. Today we can also have millions of columns. The old parametric models provide a poor fit to this data.<br />
<br />
In many cases in data science as the number of independent variables grows dramatically all subtleties in the analysis vanish and the immediate choice becomes a logistic regression.<br />
<h3>
Assumptions</h3>
<br />
First let us recall the underlying assumptions of Linear regressions:<br />
<h4>
1. Linear relationship.</h4>
Can be visually inspected in the triangle above the diagonal of the SPLOM chart, that is, a scatter plot matrix showing each independent variable against the dependent variable, augmented by a fit line.<br />
Might be fixable using some transform of the variable or by dropping it.<br />
<h4>
2. Multivariate normality</h4>
Can be visually inspected via Q-Q plots in the diagonal or verified numerically using a goodness of fit test e.g. Kolmogorov-Smirnov test.<br />
Again a transform may help or we may drop this variable.<br />
<h4>
3. No or little multicollinearity</h4>
Can be visually inspected using half a Correlation matrix below the diagonal of the SPLOM<br />
<h4>
4. No autocorrelation</h4>
Time series present a problem, as do most types of structured data in which observations might not be independent. Autocorrelation may or may not be obvious on visual inspection.<br />
Sub-sampling and then using ensembles on different samples might help. Converting the variable to a moving average and a residual might also help, but it might just make the autocorrelation harder to see.<br />
<h4>
5. Homoscedasticity</h4>
Homoscedasticity means the variance stays constant. The assumption is violated (heteroscedasticity) when the variance of a variable grows as the variable grows, which can be inspected visually by looking at the scatterplot in the SPLOM. The Goldfeld-Quandt test can be used as a numeric test.<br />
<br />
<div>
Could be fixed by applying a nonlinear transform such as a log function or by dropping the offending variable.<br />
<h4>
6. Sample size</h4>
While a regression model can be fit to a small amount of data, it should be fitted to sufficient data to produce good results. A practitioner would prefer to end up with a model in which each independent variable has statistical significance, so even if the above requirements are met we should require the sample size (number of rows) to be some multiple of the number of independent variables (columns). </div>
<div>
<br /></div>
<div>
The rule of thumb "50 responses minimum and at least 10 responses per independent variable" may fail if your data set is unbalanced. Also if you are doing a prediction in some area with no sample point the model may not be valid even if you have tons of data.<br />
<div>
<h3>
Realities</h3>
</div>
<div>
So we might think we are OK after doing these additional visualizations and inspections, transforming variables, and rerunning the regression after dropping correlated variables and statistically insignificant ones. But in the real world, and despite assurances from the central limit theorem and friends, normality is rarely encountered. So one expects that most independent variables will not be able to pass item 2 on the list above.</div>
<div>
<br /></div>
It is hard to fit small samples on many different groups. One rule of thumb says 20 samples per random variable.<br />
<br />
<br />
<h3>
Under The Hood</h3>
I have talked with experienced data scientists who were surprised to hear that there exists a closed form formula for parametric linear regression. However this is no longer true when we use modern computer age algorithms for our regression, cf. Computer Age Statistical Inference by Bradley Efron and Trevor Hastie.<br />
Inference is based on the least squares estimate $$(A^TA)^{-1}A^Ty$$ where $A$ is the matrix of independent variables and $y$ is the vector of observed responses. In practice the computation is handled by algorithms.<br />
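To make the closed form concrete, here is a minimal sketch in plain Python for the simplest case: one independent variable plus an intercept, so that $A^TA$ is a 2x2 matrix that can be inverted by hand. The function name fit_line and the sample data are illustrative assumptions, not part of any library. In practice one would normally call a library least squares routine rather than invert $A^TA$ directly.

```python
def fit_line(xs, ys):
    """Least squares fit of y = b0 + b1 * x via the normal equations."""
    n = len(xs)
    # Entries of A^T A, where each row of A is [1, x]
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # Entries of A^T y
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Determinant of the 2x2 matrix [[n, sx], [sx, sxx]]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical, so A^T A is singular")
    # (A^T A)^{-1} A^T y written out for the 2x2 case
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

The singular case is the extreme form of the multicollinearity problem listed under the assumptions: when columns of $A$ are linearly dependent the inverse does not exist.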
<h3>
Common Techniques</h3>
<h4>
Dummy Variables for Categorical variables</h4>
Categorical variables are non-numeric features such as color, city, etc. We cannot use these directly in a regression analysis so we need to represent them numerically. The naive choice of numbering the categories introduces a <i>bias</i> and will make it much more <i>difficult to interpret the results</i>. For this reason categorical variables should be encoded using dummy variables. In the case of color, each color gets a new column such as is_red. We now have more columns and may need to reassess the significance levels of the data. If most samples have is_red as true we may have problems of significance with some of the other dummy variables.<br />
<br />
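A minimal sketch of dummy encoding in plain Python, assuming the data is a list of dicts. The function name dummy_encode, the drop_first parameter and the sample rows are illustrative assumptions. Dropping one level is a common precaution when the model has an intercept, since otherwise the dummy columns always sum to one and are perfectly collinear with it.

```python
def dummy_encode(rows, column, drop_first=False):
    """Replace a categorical column with one is_<level> column per level."""
    levels = sorted({row[column] for row in rows})
    if drop_first:
        # The dropped level becomes the baseline that the others are compared to
        levels = levels[1:]
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for level in levels:
            new_row[f"is_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample data
rows = [
    {"color": "red", "price": 3},
    {"color": "blue", "price": 5},
    {"color": "red", "price": 4},
]
for encoded_row in dummy_encode(rows, "color"):
    print(encoded_row)
```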
<h4>
Transforms</h4>
We could transform variables to correct for problems in the data.<br />
<h4>
Interactions</h4>
We can also include interactions between independent variables a and b by including new variables such as $a \times b$ , or $a^2 \times b$, $a \times b^2 $ etc. This is a form of <i>feature engineering</i> in <i>machine learning</i>.<br />
<br />
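Interaction terms are just extra columns computed from existing ones. A minimal sketch, again assuming rows stored as dicts; the function name add_interaction and the column naming scheme are illustrative assumptions.

```python
def add_interaction(rows, a, b, power_a=1, power_b=1):
    """Return copies of rows with an extra column a**power_a * b**power_b."""
    name = f"{a}^{power_a}*{b}^{power_b}"
    return [
        {**row, name: (row[a] ** power_a) * (row[b] ** power_b)}
        for row in rows
    ]


# Hypothetical sample data
rows = [{"a": 2, "b": 3}, {"a": 4, "b": 5}]
with_ab = add_interaction(rows, "a", "b")              # adds a * b
with_a2b = add_interaction(rows, "a", "b", power_a=2)  # adds a^2 * b
print(with_ab)
print(with_a2b)
```

Each added column is one more independent variable, so the sample size considerations above apply to interactions as well.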
<br />
<br /></div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-56669792753220166102020-02-24T13:50:00.000+02:002020-02-25T15:40:21.442+02:00SQL - Selection vs. ProjectionSelection and projection are two high level processes taking place when SQL queries are executed.<br /> <br />Selection is choosing some records (rows) from a table and leaving others out, e.g. rows having name='oren'.<br />Projection is choosing some columns from each record and leaving others out, e.g. only name. <br /><br />So the select keyword performs projection while the where keyword performs selection. <br /><br />Clearly the choice of the keyword <i>select</i> for projection (choosing columns) rather than for choosing rows is an unfortunate flaw in the design of SQL, but this oversight is too well established to be fixed.Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-69028094195033322282020-02-24T11:28:00.000+02:002020-02-25T15:43:01.657+02:00Polyglot Data Science - GraalVM installation<br class="Apple-interchange-newline" /><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAdw4rAzGM33l9MvJyIgqezgBeb7bD23amdyZDDt2-WvdgPTw1s-jBVEIiA2ViqSfEib44oG2D5-VSMa7y_L93OqaxgRNsVYlLkKE4EjQO5ZTTXqlYJ9sphG733L1GwKpd8RAv7FZsfFI/s1600/graalvm.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="294" data-original-width="655" height="89" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAdw4rAzGM33l9MvJyIgqezgBeb7bD23amdyZDDt2-WvdgPTw1s-jBVEIiA2ViqSfEib44oG2D5-VSMa7y_L93OqaxgRNsVYlLkKE4EjQO5ZTTXqlYJ9sphG733L1GwKpd8RAv7FZsfFI/s200/graalvm.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 16px; text-align: center;">GraalVM - polyglot data science</td></tr>
</tbody></table>
<div>
GraalVM allows polyglot data science, for example switching between R and Python within a single Jupyter kernel. </div>
<div>
<blockquote class="tr_bq">
"GraalVM is a universal virtual machine for running applications written in JavaScript, Python, Ruby, R, JVM-based languages like Java, Scala, Groovy, Kotlin, Clojure, and LLVM-based languages such as C and C++."</blockquote>
<div>
<div>
<br />
<a name='more'></a></div>
<div>
<h3>
How I installed GraalVM</h3>
<ol>
<li>I got GraalVM from: <a href="https://github.com/graalvm/graalvm-ce-builds/releases">https://github.com/graalvm/graalvm-ce-builds/releases</a></li>
<li>unpacked and moved it to:<br /><br /><span style="font-family: "courier new" , "courier" , monospace;"><b><span style="font-size: x-small;">"/Library/Java/JavaVirtualMachines"</span> </b></span>which is where JVMs expect to be located on MacOS.</li>
<li>As I code lots of Java and Android, I wanted to keep my Java intact, as this version of GraalVM is still experimental, so I set:<br /><b style="font-family: "Courier New", Courier, monospace; font-size: x-small;"><br />GRAALVM_HOME=$(/usr/libexec/java_home -v 11)<br /></b><b style="font-family: "Courier New", Courier, monospace; font-size: x-small;">path+=($GRAALVM_HOME/bin)<br /></b><b style="font-family: "Courier New", Courier, monospace; font-size: x-small;">...<br /></b><b style="font-family: "Courier New", Courier, monospace; font-size: x-small;">export GRAALVM_HOME<br /></b><b style="font-family: "Courier New", Courier, monospace; font-size: x-small;">export PATH</b><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><b><br /></b></span>in my .zshrc. </li>
<li><span style="font-family: inherit;">After installing GraalVM on macOS Catalina I could not run it as it had not been </span>notarized<span style="font-family: inherit;"> for this OS. The normal </span>intervention<span style="font-family: inherit;"> via the privacy settings did not work at all. But the following command added an exception in the OS Gatekeeper, which allowed me to get things up and running.<br /><h3>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">xattr -d com.apple.quarantine /Library/Java/JavaVirtualMachines/graalvm-ce-java11-19.3.1</span></h3>
</span></li>
</ol>
</div>
<div>
<div>
<h3>
<span style="font-family: inherit;">References</span></h3>
<ul>
<li><span style="font-family: inherit;">The following <a href="https://blog.softwaremill.com/graalvm-installation-and-setup-on-macos-294dd1d23ca2" rel="nofollow" target="_blank">post</a> by <a href="https://blog.softwaremill.com/@adamwarski?source=post_page-----294dd1d23ca2----------------------">Adam Warski</a> helped me get started.</span></li>
<li><span style="font-family: inherit;">The following </span><a href="https://github.com/graalvm/homebrew-tap/issues/6" rel="nofollow" target="_blank">github issue</a> had the xattr fix I needed.</li>
</ul>
</div>
</div>
</div>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-47069080081025905372020-02-20T15:54:00.001+02:002020-02-24T10:32:08.405+02:00Avoid cross site scripting errors with a Jupyter local runtime<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7GFoOxgs5Nn0NOPmvYR3rivdZByxg1ak1tTIu5BzaYPh9R36saMbek_cFMEXqvAD3-Mios9KcKGgWXcOA7n031zpPXP0U9JLRc8UVuScycv1YNyXbNGO0v-Hoq1kWriPXxrov7uKayMY/s1600/Jupyter_logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: center;"><img border="0" data-original-height="1023" data-original-width="883" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7GFoOxgs5Nn0NOPmvYR3rivdZByxg1ak1tTIu5BzaYPh9R36saMbek_cFMEXqvAD3-Mios9KcKGgWXcOA7n031zpPXP0U9JLRc8UVuScycv1YNyXbNGO0v-Hoq1kWriPXxrov7uKayMY/s200/Jupyter_logo.png" width="172" /></a><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
<br />
So the trick is to use<br />
<br />
--NotebookApp.allow_origin <br />
and<br />
--no-browser<br />
<br />
and get the token from the command line when connecting to Google Colab.<br />
<div>
<br /></div>
<div>
<pre style="background-color: #f6f8fa; border-radius: 3px; box-sizing: border-box; color: #24292e; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; font-size: 11.9px; line-height: 1.45; overflow-wrap: normal; overflow: auto; padding: 16px;"><code style="background: initial; border-radius: 3px; border: 0px; box-sizing: border-box; display: inline; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; line-height: inherit; margin: 0px; overflow-wrap: normal; overflow: visible; padding: 0px; word-break: normal;">jupyter notebook --NotebookApp.allow_origin='https://colab.research.google.com' \
--port=9090 --no-browser</code></pre>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjDkizeAxRBi0aEBr6sPzjyiB0rc82f9DOL8eXSqTAGLqoSIVtUygqCAicmHGbHPbNODwC1J1Dgc4rrLSPOdDClAmUUY2dTG2ifH3YmzzydhvEy9bFa-cQt6ecbLLOArVtHakZ2CeH7Tc/s1600/Google_Collab_logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: center;"><img border="0" data-original-height="343" data-original-width="776" height="140" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjDkizeAxRBi0aEBr6sPzjyiB0rc82f9DOL8eXSqTAGLqoSIVtUygqCAicmHGbHPbNODwC1J1Dgc4rrLSPOdDClAmUUY2dTG2ifH3YmzzydhvEy9bFa-cQt6ecbLLOArVtHakZ2CeH7Tc/s320/Google_Collab_logo.png" width="320" /></a><br />
<br />
<br />
References</div>
<div>
<ul>
<li><a href="https://research.google.com/colaboratory/local-runtimes.html" rel="nofollow" target="_blank">the docs</a></li>
</ul>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-77526670496620762722020-01-15T16:37:00.000+02:002020-02-20T16:38:10.954+02:00SQL DojoTLDR:<br />
Imagine just before your DS interview - you are Neo, your coach is Morpheus, and you will be practicing SQL in rapidly changing schemas.<br />
<br />
<iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="533" src="https://www.youtube.com/embed/Z8mO9BwVWu4" width="1280"></iframe>
<br />
Now here is a little project I thought up:<br />
<br />
Despite any number of excellent SQL based projects I have created I tend to get rusty in SQL as I don't use it on a regular basis. I decided it might be worthwhile to setup a virtual space to practice, hence the dojo.<br />
<br />
The dojo lets a student practice analytical SQL, primarily the queries analysts use.<br />
Ultimately I'd like to use it in an agile manner as an LMS with a minimal UI. This would require creating a story for each query and a test that the query returns a good answer. Also, to make things interesting, the tasks should be related, proceed from easy to more challenging, and cover a number of techniques like filtering, aggregation and subqueries.<br />
<br />
However, initially I want to have things up and running quickly and to collect questions and answers that reflect how to create views on a small number of databases from courses or books. This system can also be used to see how well things work on different DBMSs, with the goal of doing things in a portable fashion.<br />
<br />
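The idea of pairing each story with a test that the query returns a good answer could be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database instead of the dockerized MySQL server described here, and the orders table, its rows and the check_query helper are all hypothetical.

```python
import sqlite3


def check_query(conn, student_sql, expected_rows):
    """Return True if the query result matches the expected rows, ignoring order."""
    actual = conn.execute(student_sql).fetchall()
    return sorted(actual) == sorted(expected_rows)


# Hypothetical practice schema and data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ann", 10.0), (2, "bob", 25.0), (3, "ann", 5.0)],
)

# Story: "total spend per customer" (an aggregation exercise)
expected = [("ann", 15.0), ("bob", 25.0)]
student_sql = "SELECT customer, SUM(total) FROM orders GROUP BY customer"
print(check_query(conn, student_sql, expected))
```

Comparing sorted rows keeps the check independent of row order, which is not guaranteed without an ORDER BY clause. Exercises that are specifically about ordering would need a stricter comparison.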
I thought I might share some specifics. The POC features should be:<br />
<ol>
<li>Run server in docker - easy to install/restart/migrate (done)</li>
<li>Agile access - e.g. using Visual Studio Code + plugin. (done)</li>
<li>Rich clients - MySQLWorkBench (done)</li>
<li>Squirrel SQL - supports more RDBMSs. (done)</li>
<li>Access from Jupyter (done - but less agile)</li>
</ol>
<div>
Beyond the POC</div>
<div>
<ol>
<li>Migrate db to AWS (more & bigger databases).</li>
<li>create a web interface to</li>
<ol>
<li>switch RDBMS</li>
<li>log in</li>
<li>enter and run queries</li>
<li>show output log</li>
<li>show query output. </li>
<li>store queries history</li>
<li>keep score </li>
<li>indicate progress in units.</li>
<li>feedback and discussion.</li>
<li>allow users to add stories and queries.</li>
<li>support non-sql dbs as well like</li>
</ol>
<li>Develop small learning units to practice techniques.</li>
<ol>
<li>[OK] basics</li>
<li>[OK] filtering</li>
<li>[OK] aggregation</li>
<li>[OK] subqueries</li>
<li>[] cleaning data & SQL wrangling</li>
<li>[] OLAP</li>
<li>[] design and ddl</li>
<li>[OK] CRUD + stored procedures python.</li>
<li>[] CRUD + stored procedures R.</li>
<li>[] CRUD + stored procedures Java.</li>
<li>[] transaction</li>
<li>[] create queries for a bi dashboard.</li>
<li>[] create queries for a marketing automation project.</li>
</ol>
<li>Migrate queries to database</li>
<li>Show schema for the database.</li>
<li>Make things secure.</li>
<li>Isolated user.</li>
<li>reset DB.</li>
<li>Use serverless backends too</li>
<ol>
<li>aws athena.</li>
<li>google bigquery.</li>
</ol>
<li>Use noSQL dbs - mongo, neo4,</li>
<li>Connect to a dedicated environment like MySQLWorkBench </li>
<li>Connect to a BI environment or Tableau / Power BI.</li>
<li>Use a freemium hosted database like bigquery.</li>
</ol>
</div>
<div>
<h3>
first snag:</h3>
</div>
<div>
Accessing MySQL 8.0 and later requires a new authentication protocol. I had to either re-enable the old one using some obscure command to allow user + password connections, or switch to the mysql.connector.connect connector instead.<br /><br />TODO: find the fix for this snag and record it.<br />
TODO: add this hack to the mysql docker image.<br />
TODO: automate the docker image to run script to create and load data from a folder.<br />
TODO: add a docker image for postgres with equivalent capabilities.<br />
TODO: put the docker images @ AWS<br />
TODO: get a docker image with MySQL sample database as it is used in many tutorials.<br />
TODO: migrate project to trello.<br />
<br />
Updates:<br />
<ul>
<li>I installed Squirrel SQL to access multiple dbs via rich client.</li>
<li>I installed GraalVM to do polyglot data science in a notebook.</li>
<li>I created a jupyter notebook to access mysql database. </li>
<ul>
<li>This is good for accessing a local database.</li>
</ul>
<li>I plan to extend this to practice polyglot data wrangling, i.e. get data from the db into R and Python data frames and do some quick explorations.</li>
<li>Spidering & Indexing.</li>
</ul>
<div>
<br /></div>
<div>
<h3>
Spinoff DOJOs</h3>
</div>
<div>
<br /></div>
<div>
ETL </div>
<div>
ELK stack </div>
<div>
DB </div>
<div>
BIG DATA</div>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-63812839208418333782019-11-13T14:57:00.000+02:002020-02-20T16:38:56.732+02:00AWS CloudFormation Pros and Cons<h2>
AWS CloudFormation Pros and Cons</h2>
So I'm building a PaaS product that does ML based optimisations, and that means doing work in the cloud. The ML is a neat feature, but without the basic product nothing will happen, so to bootstrap this project on AWS I tried to make use of CloudFormation, a service that automates creation and destruction of service stacks. Based on a week's worth of experimenting with CloudFormation I will try to answer the question:<br />
<blockquote class="tr_bq">
"Is learning CloudFormation worth the effort?"</blockquote>
<div>
Despite the rant, CloudFormation supports creation, updating and deletion of entire stacks of services. SAM is built on top of CloudFormation and it has a visual editor.</div>
<div>
<br /></div>
The way CloudFormation is described is that you can copy and paste snippets to create resources and build a library of reusable components. This is a simplistic point of view. In reality you need to bring properties, specify dependencies, and introduce signalling mechanisms to ensure your template works.<br />
<div>
The samples typically lack security features, so when you secure your stack, assuming you know what you want, you'll need to dig deep into the documentation. That's when you may find that code from GitHub or even the documentation is often broken. To say that CloudFormation templates in YAML are poorly documented is an understatement. They frequently contain blocks of complex scripts and JSON specs. These are often encoded using string processing commands, defeating the notion of having a template.</div>
<div>
<h3>
The Pros:</h3>
<ul>
<li>Infrastructure as code is an automation win.</li>
<li>Needed to integrate provisioning into CI/CD.</li>
<li>Bridge between architect and implementer.</li>
<li>Formalises manual provisioning.</li>
<li>There is a tool to convert an existing setup to a template.</li>
<li>Lots of templates and snippets available.</li>
<li>Can integrate with and enhance work with other devops toolsets.</li>
<li>Working with CF may quickly enhance your knowledge of the command line.</li>
<li>CF templates are a part of most AWS tutorials and workshops. </li>
<ul>
<li>But there is a caveat - the CF templates are rarely even glossed over. They set up an architecture described using block diagrams. If you want to look under the hood, prepare to open a can of worms.</li>
</ul>
</ul>
<h3>
The Cons:</h3>
<ul>
<li>Steep learning curve.</li>
<ul>
<li>The Templates</li>
<ul>
<li>Using Template Intrinsic Functions</li>
<li>Referencing other templates</li>
<ul>
<li>Outputs</li>
<li>Resources/Params</li>
</ul>
<li>Pseudo Params</li>
</ul>
<li>The python Helper Scripts</li>
<ul>
<li>ConfigSets</li>
<li>Commands</li>
<li>signal init pattern</li>
<li>install test pattern</li>
</ul>
<li>Integrating with Git, CI/CD</li>
<li>Resilience - building resources in multiple availability zones etc. </li>
<li>Security - implementing AWS recommendations. </li>
</ul>
<li>Knowledge of provisioning AWS services is a prerequisite to use CloudFormation in a serious way. (Both general and specific knowledge is needed).</li>
<li>Stack Overflow may be of little help in Q2 of 2019.</li>
<li>As with Puppet, Ansible, Chef and Vagrant, a deep knowledge of Linux, its command line and configuration management is a prerequisite just to read the file.</li>
<li>AWS platform specific, so it locks you into the AWS platform.</li>
<li>The Visual Editor in CloudFormation is a waste of space as it only generates the top level placeholders.</li>
<li>Many CloudFormation samples are broken indicating that the AWS services and their dependencies have frequent breaking changes.</li>
<li>Lack of guidance on how to keep CF code clear and clean. (If there is I never found it).</li>
<li>Your CF code is up and running - great you have created dozens of security holes....</li>
</ul>
Learning CloudFormation is pretty much like Alice's rabbit hole. You need to jump in and go all the way through, and it is unclear where you will end up once you do. If you take a "crash course", gaping holes in your understanding will make reading and working with sample CF code nearly impossible.<br />
<br />
The many broken sample templates indicate that AWS does not dogfood its CF sample code using a CI/CD pipeline. AWS workshops frequently state that their sessions follow best practices via a CF template. Good luck trying to read it to see how this is done.<br />
<br />
Unzipping the resource specification for a single region results in 330 files. Some services have just one, others like ApiGateWay and Ec2 have as many as 50 spec files. Most services have just a few. You won't work from the spec unless you are a tool developer, but it presents a view of the scope and complexity of CloudFormation.<br />
<br />
The bottom line - learning CF is worth the effort for developers and DevOps if and only if you have serious experience with AWS and are committed to the AWS platform for multiple projects.<br />
<br />
<h2>
Further reading:</h2>
<br />
<ul>
<li><a href="https://aws.amazon.com/blogs/devops/aws-cloudformation-security-best-practices/" rel="nofollow" target="_blank">AWS CloudFormation security best practices</a></li>
</ul>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-65289712022421538422019-01-13T13:47:00.000+02:002019-03-21T13:47:54.099+02:00Android Coding Conundrums 1 Fragment Constructors<div dir="ltr">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhz4IWjf-WevL9sMAE_AX6_Ps8K-CImkupxda67tx68fXcx7ngUdPq2iM1JLD1INB056JBC5ZAGlrJfRxv59sHaNb9EeYQkVn5Z5dlLqqAxoB4c4qNPobjEAoFaI2KzCBMx6aiUWPIFYlQ/s1600/128px-Android_robot.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="150" data-original-width="128" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhz4IWjf-WevL9sMAE_AX6_Ps8K-CImkupxda67tx68fXcx7ngUdPq2iM1JLD1INB056JBC5ZAGlrJfRxv59sHaNb9EeYQkVn5Z5dlLqqAxoB4c4qNPobjEAoFaI2KzCBMx6aiUWPIFYlQ/s1600/128px-Android_robot.png" /></a></div>
While researching the factory design pattern for fragment creation I couldn't help but notice that fragment creation is a long term source of bugs. Why is fragment creation error prone?<br />
<br />
Perhaps because the API for fragments has changed so frequently that much of the advice is dated. In the real world Fragment is deprecated in favour of a descendant in the app support library, but that has been deprecated as well in favour of the AndroidX support libraries. </div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
<a href="https://commons.wikimedia.org/wiki/File:W3sDesign_Factory_Method_Design_Pattern_UML.jpg" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;" title="Vanderjoe [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons"><img alt="W3sDesign Factory Method Design Pattern UML" height="191" src="https://upload.wikimedia.org/wikipedia/commons/4/43/W3sDesign_Factory_Method_Design_Pattern_UML.jpg" title="Factory Method Design Pattern" width="400" /></a>Perhaps it is because many newcomers to Android are Java developers who follow the Java idiom of constructor overloading to pass parameters at creation for use in Activity.onCreate(). However, this is not a good idea; it is an example of a bug pattern. Using a constructor will usually appear to work fine until Android destroys the activity and looks for a default constructor. If there isn't one the app crashes with a <span style="font-family: "courier new" , "courier" , monospace;">runtime exception</span>. This is because behind the scenes the default constructor is called rather than the constructor provided. The issue is of course that the parameters you use to construct the fragment are not known to the Android framework.</div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
If there is a default constructor a more subtle bug will arise. When the fragment is recreated following a runtime or configuration change, Android uses the default constructor, without any of the parameters used previously. This is not going to work unless somehow the parameters are set before they are used. Once you try to get at them you will get a <span style="font-family: "courier new" , "courier" , monospace;">NullPointerException</span>. Using getters and setters may help, and default values may help a little more, but this is not really fixing the issue. </div>
<div dir="ltr">
The default Android mechanism for saving and restoring state is of little use. It only covers state stored in controls. Anything more sophisticated requires attention from the programmer.<br />
Even if all the state has been saved to and restored from a bundle in <span style="font-family: "courier new" , "courier" , monospace;">onPause()</span> and <span style="font-family: "courier new" , "courier" , monospace;">onRestore()</span> respectively, this is likely to still not be enough. </div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
When the constructor passes resource ids etc. that are required earlier in the lifecycle, before the bundle mechanism is run, say in <span style="font-family: "courier new" , "courier" , monospace;">onCreateView()</span> or in <span style="font-family: "courier new" , "courier" , monospace;">onCreate()</span>, these fragments may still crash. Also at this point reproducing the bug may require a rather sophisticated set of scenarios, as the bug is still around but harder and harder to reproduce.<br />
<div dir="ltr">
</div>
</div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
I also noticed some mentions of an Android developer setting called "<i>don't keep activities</i>". In <span style="font-family: "courier new" , "courier" , monospace;">adb</span> it is a global flag which tells Android to call finish on activities once they lose focus. This has the consequence of simulating a configuration change. Once set, activities are no longer kept in the task (the top level activity container for fragments). This ensures the fragment's default constructor will be called, as if the device were memory starved, the equivalent of a brand new device from 2012. Using this setting the app should crash faster and more consistently if the above bugs were introduced, which makes fixing them possible.</div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
I spent some time figuring out how to control this setting via <span style="font-family: "courier new" , "courier" , monospace;">adb</span>.</div>
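<div>
A minimal sketch of one way to toggle the option from a script. It assumes the developer option is backed by the global setting <span style="font-family: "courier new" , "courier" , monospace;">always_finish_activities</span>, and the helper names are hypothetical. On some Android versions the change may not take effect until the system re-reads the setting, so it is worth confirming in the Developer options screen.</div>

```python
import subprocess

# Assumption: "don't keep activities" maps to this global setting (1 = on, 0 = off).
SETTING_KEY = "always_finish_activities"


def adb_command(enabled, serial=None):
    """Build the adb argument list that switches the option on or off."""
    cmd = ["adb"]
    if serial is not None:
        cmd += ["-s", serial]  # target one specific device or emulator
    cmd += ["shell", "settings", "put", "global", SETTING_KEY, "1" if enabled else "0"]
    return cmd


def set_dont_keep_activities(enabled, serial=None):
    """Run the command; raises CalledProcessError if adb reports a failure."""
    subprocess.run(adb_command(enabled, serial), check=True)
```

<div>
Keeping the command builder separate from the call that runs it means a test harness can check or log the exact command without a device attached.</div>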
<div dir="ltr">
This raises another question: how to coordinate tests and <span style="font-family: "courier new" , "courier" , monospace;">adb</span> commands. </div>
<div dir="ltr">
Or better yet how to do the <span style="font-family: "courier new" , "courier" , monospace;">adb</span> voodoo using <span style="font-family: "courier new" , "courier" , monospace;">junit</span> rules.</div>
<div dir="ltr">
<br /></div>
<div dir="ltr">
I hope to cover these in the future.</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com1אלקטרה, יגאל אלון 98, תל אביב יפו, 6789141, Israel32.0700544 34.7941210999999845.1748803999999957 -6.5144699000000159 58.9652284 76.102712099999991tag:blogger.com,1999:blog-7341613201559046168.post-60575897409451736632018-08-14T21:01:00.004+03:002018-08-14T23:23:54.808+03:00Big Data Analytics Israel - New Year, New Data Scientist Job: 5 Things To Think About<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPUCrj7Vu4DDuryEnNxE04QFDEbePvkzcI-xX3IQfUgACzjaWxsEtc5IWHVS3BVX5Zn75DCnZ6CUsJRJvEkl2CuDduLSW-nFv_ugD0d5ks_7wCwbIB9zPHFPQ0pj4LJwPX-iHXlqpSOGY/s1600/view_of_mt_fuji.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="623" data-original-width="930" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPUCrj7Vu4DDuryEnNxE04QFDEbePvkzcI-xX3IQfUgACzjaWxsEtc5IWHVS3BVX5Zn75DCnZ6CUsJRJvEkl2CuDduLSW-nFv_ugD0d5ks_7wCwbIB9zPHFPQ0pj4LJwPX-iHXlqpSOGY/s320/view_of_mt_fuji.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Data science interviews can be overwhelming </td></tr>
</tbody></table>
<h1 class="pageHead-headline text--pageTitle" style="background-color: white; color: #2e3e48; fill: rgb(46, 62, 72); font-family: "Graphik Meetup", -apple-system, system-ui, Roboto, Helvetica, Arial, sans-serif; font-size: 32px; line-height: 1.1; margin: 0px 0px 8px; padding: 0px; stroke: transparent;">
New Year, New Data Scientist Job: 5 Things To Think About</h1>
<div>
<br /></div>
<div>
My notes: https://www.meetup.com/Big-Data-Analytics-Israel/events/253124286/</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
The first talk was by:</div>
<h3>
Raya Belinsky - <span style="font-size: 18.72px;">"New job - yes or no?"</span></h3>
<div>
The talk was about finding your next job or reinventing your current one. Ms Belinsky's humour and background as an executive life-coach made this talk both pleasant and worthwhile.</div>
<div>
<br /></div>
<div>
She covered her operational definition of job burnout</div>
<div>
LinkedIn profile - complete the profile (it tells you what to do)</div>
<div>
The CV - ask 2 people to prepare it</div>
<div>
The Interview - e.g. prepare 3 questions</div>
<div>
<br /></div>
<div>
Each had at least a couple of points worth taking care of in your next round of job search. Check out the talk and slides when they go online.</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<hr />
<div>
<br /></div>
<div>
The second talk was by:</div>
<h3>
Nathaniel Shimoni - "Life story"</h3>
<div>
Mr Shimoni is an experienced storyteller and had a compelling story to tell about his own twisting path to becoming a data scientist. </div>
<div>
<br /></div>
<div>
Some Highlights:</div>
<div>
<ul>
<li>Listened to 1 hour of lectures during his commute.</li>
<li>Later he decided to dedicate 1 hour a day to </li>
<li>Participated in many Kaggle competitions </li>
<li>Liked: Data Hack</li>
<li>Read papers - keep up to date</li>
<li>Recommends taking risks you can afford</li>
</ul>
<div>
<br /></div>
</div>
<div>
<hr />
<div>
<br /></div>
<div>
<br /></div>
Third talk:<br />
<div>
<div>
<h3>
Omri Allouche "The top mistakes you're making in your Data Science interview"</h3>
</div>
<div>
Mr Allouche also had an unorthodox track to DS. He used many metaphors from DS, which was refreshing in this sequence of talks, as was his use of compelling visuals.</div>
<div>
<br /></div>
<div>
Ask what you will be doing:</div>
<div>
<ul>
<li>Writing code that goes to production</li>
<li>Developing new algorithms</li>
<li>Being in charge of collecting data</li>
<li>Working alone / leading others</li>
</ul>
<div>
Don't run away from your super powers.</div>
</div>
<div>
Don't be the only/first data scientist</div>
<div>
<br /></div>
<div>
Running away from data in data science </div>
<div>
<ul>
<li>Don't skip - Exploratory analysis</li>
<li>Unsupervised is cool... don't rush to do supervised models</li>
<li>Learning to do proper error analysis - when is the model wrong...</li>
</ul>
</div>
<div>
<div>
Running away from science in data science </div>
<div>
Use your intuition but learn to say - "I don't know but I would try ... " (2 different solutions)</div>
</div>
<div>
Mr Allouche talked about the community and said that we could have a conversation about brainstorming strange new ideas.</div>
<div>
<br /></div>
<div>
define </div>
<div>
<ul>
<li>Data set</li>
<li>Input </li>
<li>Output</li>
<li>Your own loss function</li>
</ul>
</div>
<div>
<br /></div>
<div>
Overconfidence is a problem - it suggests that this person is not going to learn much.</div>
<div>
<br /></div>
<div>
But my two cents on this interesting lecture is that it does not seem to be grounded in having done lots of interviews or sat in on them. Some of his comments were contrarian and his pointers on CVs may be counterproductive.</div>
<div>
<br /></div>
<div>
<hr />
<div>
<br /></div>
Fourth talk: </div>
<h3>
Aharon Frazer "The Skills That Make a Great Data Scientist"</h3>
<div>
Aharon was the only American Rabbi 🐰 DS speaker.</div>
<div>
He studied in the US, then worked as a PHP coder. </div>
<div>
He suggests asking about jobs not being offered - in smaller companies. </div>
<div>
Did BI at "Seeking Alpha" when he was looking for work as a web developer.</div>
<div>
Went to JoyTunes and after 4 months was headhunted by Facebook</div>
<div>
<br /></div>
<div>
He talked about: </div>
<div>
<br /></div>
<div>
data engineering - </div>
<div>
<ul>
<li>Data accessibility</li>
<li>Data quality</li>
<li>Logging</li>
<li>ETL Pipelines</li>
<li>Dashboards</li>
<li>Alerts</li>
</ul>
</div>
<div>
data science - are analysts</div>
<div>
<ul>
<li>Identifying opportunities</li>
<li>Product vision, forecasting</li>
<li>Goal setting & tracking</li>
<li>Product updates</li>
</ul>
<div>
<br /></div>
</div>
<div>
Before and after analysis - </div>
<div>
<br /></div>
<div>
Reality is messy</div>
<div>
<br /></div>
<div>
Experiments @ Facebook</div>
<div>
<br /></div>
<div>
Exposures \</div>
<div>
======> Stats Engine ======> Metrics Change </div>
<div>
Metrics /</div>
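<div>
The talk did not say which statistics the engine uses. As a hedged illustration of the idea (exposure counts and metric counts in, a judgement about the metric change out), here is a sketch that assumes a plain two-proportion z-test; the function name is hypothetical.</div>

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (lift, z, two-sided p-value) comparing conversion rates of arms A and B.

    n_a and n_b are the exposures per arm, conv_a and conv_b the metric counts.
    Assumes the pooled rate is strictly between 0 and 1.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return p_b - p_a, z, p_value


lift, z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(lift, z, p_value)
```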
<div>
<br /></div>
<div>
Some interviewing notions:</div>
<div>
<br /></div>
<div>
I interviewed many more times than I got jobs</div>
<div>
<br /></div>
<div>
look at problems as 3-d</div>
<div>
<br /></div>
<div>
People treat the question as if it were a textbook problem, but he is more interested in someone who imagines the problem is really happening.</div>
<div>
<br /></div>
<div>
Show you have the template of the problem in your head.</div>
<div>
Model - Looking at errors </div>
<div>
<ul>
<li>Framing the problem first. - "Here is a metric of success."</li>
</ul>
Self awareness.</div>
<div>
Come to agreement with the interviewers.</div>
<div>
Some questions have no great answer but cover a fundamental issue. </div>
<div>
Pros and cons of real world situations.</div>
<div>
<br /></div>
<div>
One note: Mr Frazer had a slide-deck disaster but it only slowed him down a bit and he could talk well without his slides - kudos on that. If you are going to give a talk, practice giving it without your slides.</div>
<div>
<br /></div>
</div>
</div>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com2tag:blogger.com,1999:blog-7341613201559046168.post-70751693129174599922018-08-08T18:58:00.003+03:002020-02-24T10:10:35.026+02:00Paratroopers Puzzle<h2>
Puzzle:</h2>
Two paratroopers are dropped onto a practically infinite railway track. Both were given a note with the identical instructions... They both follow the instructions and eventually meet up.<br />
<br />
What did the note tell them to do?<br />
<h2>
Answer:</h2>
To drop their parachutes on the track. Then each should run 10 steps north, turn around and run three times as far south, turn again and triple the distance once more, and keep going like this without stopping until they meet or reach the other parachute...<br />
<h2>
The fun answer:</h2>
The position of a standard random walk after N steps is a sum of N independent ±1 (Bernoulli) steps, which approaches the normal distribution as N approaches infinity. The mean position of the random walker is his or her starting point. The variance, however, grows linearly with time, so the typical distance from the start grows with the square root of time. So pretty much any random walk would work as a rendezvous strategy - "whenever you run past a pub, pop in and do not leave until you are punch drunk" is probably as good a randomising strategy as the above answer.<br />
<br />
<br />
For more details you can look at the following entry on <a href="https://stats.stackexchange.com/questions/159650/why-does-the-variance-of-the-random-walk-increase" rel="nofollow" target="_blank">Cross Validated</a>.<br />
This is also called the two robots problem.<br />
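<div>
A small simulation makes the random walk claim concrete. This is only a sketch with hypothetical names: it runs many independent ±1 walks and prints the mean and variance of the end positions. In theory the mean stays at 0 and the variance after N steps is N, so the standard deviation grows with the square root of N.</div>

```python
import random
import statistics


def final_positions(n_steps, n_walkers, seed=0):
    """Simulate n_walkers independent +/-1 random walks and return their end positions."""
    rng = random.Random(seed)
    return [sum(rng.choice((-1, 1)) for _ in range(n_steps)) for _ in range(n_walkers)]


for n in (100, 400):
    ends = final_positions(n, 2000)
    # The mean should be close to 0 and the variance close to n.
    print(n, round(statistics.mean(ends), 2), round(statistics.pvariance(ends), 1))
```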
<br />
<br />Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-33501300691740986712018-04-30T18:43:00.002+03:002018-04-30T18:43:19.768+03:00PyData 13 <div class="separator" style="clear: both; text-align: center;">
<br /></div>
1st speaker JP Morgan<br />
<br />
<h2>
Continuous Delivery in Python on a Massive Scale, by Or Been-Zeev (JP Morgan)</h2>
<div>
<h3>
Abstract: </h3>
J.P. Morgan has one of the largest Python codebases in the world. We will discuss the challenges of working with millions of lines of Python and how one can deal with those. We will also show you how Python makes it easy to achieve continuous delivery and “push to production” approaches regardless of scale.</div>
<h3>
My notes:</h3>
<br />
<ul>
<li> CD = CI + Push to production</li>
<li>20 million lines of code - use a monolithic code base...</li>
<li>time to market is the KPI </li>
<li>but how to avoid breaking the code many times a day?</li>
<li>Python simplifies the typical CI pipeline as there is no compile or build</li>
<li>They have a single head, but it was not clear how they merge changes - they have shared staging layers to handle this issue.</li>
</ul>
<h2>
Speaker separation in the wild, and the industry's view - Raphael Cohen (Chorus.ai)</h2>
<h3>
Abstract:</h3>
Audio recordings are a data source of great value used for analyzing conversations and enabling digital assistants. An important aspect of analyzing single-channel audio conversations is identifying who said what, a task known as speaker diarization. The task is further complicated when the number of speakers is a priori unknown. In this talk we’ll dive into the deep learning research of speaker "embedding" (verification and diarization), connect it to a framework of “real” speaker separation needs (as in audio applications such as Chorus.ai’s Conversation Analytics platform), and present the pipeline required for integrating these solutions in a semi-supervised manner without requiring any effort by the end user.<div>
<br /></div>
<h3>
My Notes</h3>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<ul>
<li>conversation are 10 to minutes</li>
<li>task 1: identify consecutive speaking range by some speaker.</li>
<li>task 2: given a labeled sample, label a range.</li>
<li>Sounds like a simplification of the cocktail party problem </li>
</ul>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzHiVEhpcfGy7lQ5FzOFeDg7xRXjIcofPH5RvvogR9DxMyLKDCxAvraipG0-kAHSBmY6xs0w1NR4EWDf48PAgQpkKAXsUsr5qCQk600dwyuFgxK-EjHwucuNyy3xDjNeY2OCWHTs9PD00/s1600/Screen+Shot+2018-04-30+at+16.31.29.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="874" data-original-width="1600" height="108" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzHiVEhpcfGy7lQ5FzOFeDg7xRXjIcofPH5RvvogR9DxMyLKDCxAvraipG0-kAHSBmY6xs0w1NR4EWDf48PAgQpkKAXsUsr5qCQk600dwyuFgxK-EjHwucuNyy3xDjNeY2OCWHTs9PD00/s200/Screen+Shot+2018-04-30+at+16.31.29.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">you might remember this from Andrew Ng's course, lecture 1</td></tr>
</tbody></table>
<ul>
<li>Extra Tasks:</li>
<ul>
<li>find and share features</li>
<li>produces call summary</li>
<li>generates todo list (actionable analytics)</li>
<li>voice metrics - sentiment etc. (e.g. Watson)</li>
<li>Provide guidance</li>
</ul>
<li>DWH</li>
<ul>
<li>Store sales conversation as a database - for future query</li>
</ul>
<li>Proprietary tech:</li>
<ul>
<li>Speech recognition</li>
<ul>
<li>who said what?</li>
</ul>
</ul>
<li>Prior Art</li>
<ul>
<li>EigenVoices in scifi (predates Shazam by 5-6 years !?)</li>
<li>iVector - simple concept but complex paper & many implementation details. </li>
<li>replaced by Deep learning + Softmax classification architecture instead</li>
</ul>
<li>Large Softmax issue - handled based on LeCun's idea of a "Siamese network" </li>
<ul>
<li>instead of detecting who is talking </li>
<li>check if it is the same or different speaker then </li>
<li>we need the big SoftMax just once per speaker's utterance.</li>
</ul>
<li>Since different people sound different, a Siamese network quickly learns a fit and later does not generalise very well. (This is actually an issue of imbalance in the dataset, as the segments used are short and switches between speakers are rare...) </li>
<li>They used triplets ($speaker_1, speaker_1, speaker_2$) etc. to teach the network about speaker boundaries.</li>
</ul>
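<div>
The triplet idea in the notes above can be sketched with a toy loss. The talk did not give the exact loss, so this assumes the common margin-based triplet loss on Euclidean distances; the 2-d embeddings and all names are hypothetical.</div>

```python
import math


def distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the same-speaker pair is closer than the different-speaker pair by margin."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)


# Hypothetical embeddings of three short audio segments.
seg_a1 = [0.0, 0.0]  # speaker 1
seg_a2 = [0.1, 0.0]  # speaker 1 again
seg_b = [3.0, 4.0]   # speaker 2

print(triplet_loss(seg_a1, seg_a2, seg_b))  # speakers already well separated
print(triplet_loss(seg_a1, seg_b, seg_a2))  # roles swapped, so the loss is large
```

<div>
Training on such triplets pushes segments of the same speaker together and segments of different speakers apart, without needing one softmax output per known speaker.</div>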
<div>
Some slides:</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHbV2m5ynZZjhJq7XfbnK23K7B6lEhcCv7UxX5spuDAa8O1sFI5zfGJS6FEvpJbLGZPuuri9TQgnE837rjRdiG0UZkemsZ5stb0-JT80AMRuBalL5ZkT69APSEoH-XuR4G6Snu1NPPFZA/s1600/20180429_191811_HDR.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="993" data-original-width="1600" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHbV2m5ynZZjhJq7XfbnK23K7B6lEhcCv7UxX5spuDAa8O1sFI5zfGJS6FEvpJbLGZPuuri9TQgnE837rjRdiG0UZkemsZ5stb0-JT80AMRuBalL5ZkT69APSEoH-XuR4G6Snu1NPPFZA/s320/20180429_191811_HDR.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">initial segmentation - segment using a 2 second window</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitXMdb5jV_-naYzzdM5yRlh71hhUZTtN0kRq3iFDL9qQRMEpP1UJl5LUdpJUFyGFhVxXVJ3JIL808fWaadi-4j4GEVesm4w3uOkJYZJlHiyRM-rTje-2tnF-FZeuc_lbsEq9TMpDbiUI4/s1600/20180429_191817.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1005" data-original-width="1600" height="201" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitXMdb5jV_-naYzzdM5yRlh71hhUZTtN0kRq3iFDL9qQRMEpP1UJl5LUdpJUFyGFhVxXVJ3JIL808fWaadi-4j4GEVesm4w3uOkJYZJlHiyRM-rTje-2tnF-FZeuc_lbsEq9TMpDbiUI4/s320/20180429_191817.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">The overall architecture</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ6cptaATz_nW43MIAHTVF6SaSlwlBhoYiq8BDNo6VFccBzDQLGzKEHvWwglqXs_64D5eqdaujjv4U5qTVyPN4pXcvwRcDN8TSVJ7xqNrIOh1fSSNYtfYEwSulR9EyOjajlcl-uIVEzLU/s1600/20180429_192822.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAlDjkajzadaGbbJ74w3eaS7zbNGKOJYDg8DGgGUQje6YUDrGIyK_3txhHEVjipnM9yFSHVoVxyF3ZR22OO1IvkksVsDzPHZ8UP4VpCvqbfoj0upf13rbRU8GErvtyiFJ51Gw-N4EcwLI/s1600/20180429_191948.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="990" data-original-width="1600" height="197" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAlDjkajzadaGbbJ74w3eaS7zbNGKOJYDg8DGgGUQje6YUDrGIyK_3txhHEVjipnM9yFSHVoVxyF3ZR22OO1IvkksVsDzPHZ8UP4VpCvqbfoj0upf13rbRU8GErvtyiFJ51Gw-N4EcwLI/s320/20180429_191948.jpg" width="320" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ6cptaATz_nW43MIAHTVF6SaSlwlBhoYiq8BDNo6VFccBzDQLGzKEHvWwglqXs_64D5eqdaujjv4U5qTVyPN4pXcvwRcDN8TSVJ7xqNrIOh1fSSNYtfYEwSulR9EyOjajlcl-uIVEzLU/s1600/20180429_192822.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJ6cptaATz_nW43MIAHTVF6SaSlwlBhoYiq8BDNo6VFccBzDQLGzKEHvWwglqXs_64D5eqdaujjv4U5qTVyPN4pXcvwRcDN8TSVJ7xqNrIOh1fSSNYtfYEwSulR9EyOjajlcl-uIVEzLU/s1600/20180429_192822.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"></a></div>
<div class="separator" style="clear: both; text-align: center;">
i-vector is based on Dehak et al. 2011 - complicated</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK6kiibY9u3vxTvmMWT6W5XGRjtkkaC7CfO1A-mZw_xIPVWeMSqL_nD4ZykfWWBbBEUy1z5IPbYwEjOsjdP0oYWeqmmlIUQeBGfuMrpYjMOiglomsUmefW2Awu_U5lXZoiyezumzIYFEQ/s1600/20180429_192206.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1166" data-original-width="1600" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK6kiibY9u3vxTvmMWT6W5XGRjtkkaC7CfO1A-mZw_xIPVWeMSqL_nD4ZykfWWBbBEUy1z5IPbYwEjOsjdP0oYWeqmmlIUQeBGfuMrpYjMOiglomsUmefW2Awu_U5lXZoiyezumzIYFEQ/s320/20180429_192206.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">DNN to the rescue!</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4Z2XafPr0dQeeY9rs1xaRdHvYgrfmazd5IAo66KVOyTnWA4Cfa9rr1s4rQzmckRfgQ5QbKU8K8_U47WYeK9Qrn2nM8IfBik_C2VIuc4tRT03hM7roOZXT5-Rmpxzu4UaqkIix7foWW9w/s1600/20180429_192352.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="982" data-original-width="1600" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4Z2XafPr0dQeeY9rs1xaRdHvYgrfmazd5IAo66KVOyTnWA4Cfa9rr1s4rQzmckRfgQ5QbKU8K8_U47WYeK9Qrn2nM8IfBik_C2VIuc4tRT03hM7roOZXT5-Rmpxzu4UaqkIix7foWW9w/s320/20180429_192352.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Big SoftMax !?! - so they use Siamese architecture </td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihLTbI-qwH1EPa5JEvOl56cAneRqkUe5IZuwGpmLaXkqvds2pajK8dIMwps4eeCSq1k-IYa4kCO5ZIGra1muF-EMsEWbykExJrP4-1t4JjhBLFeM_5MeFvK8prdhPRlS9yigqEzjXc8aM/s1600/20180429_192549.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="983" data-original-width="1600" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihLTbI-qwH1EPa5JEvOl56cAneRqkUe5IZuwGpmLaXkqvds2pajK8dIMwps4eeCSq1k-IYa4kCO5ZIGra1muF-EMsEWbykExJrP4-1t4JjhBLFeM_5MeFvK8prdhPRlS9yigqEzjXc8aM/s320/20180429_192549.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">Which won't generalise too well (highly unbalanced DS) <br />sample triplets with 2 speakers to the rescue (Li et al. 2017, Baidu)</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhL2n_bZ_4DcbLbC7nYLwkQJK45a5Cv87woAoqUaKaVS0ISfNwduK36PMuc2Ll3Qr93G_DqQFQ5FLGJwfZpCmc741BSBz_cqgUhBctPH84wkMNzn_RtaDczL5jYEyfkwSkUF8iPsUCWl7k/s1600/20180429_192551.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="871" data-original-width="1600" height="173" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhL2n_bZ_4DcbLbC7nYLwkQJK45a5Cv87woAoqUaKaVS0ISfNwduK36PMuc2Ll3Qr93G_DqQFQ5FLGJwfZpCmc741BSBz_cqgUhBctPH84wkMNzn_RtaDczL5jYEyfkwSkUF8iPsUCWl7k/s320/20180429_192551.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Can we do better (Google paper -? which)</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitXMdb5jV_-naYzzdM5yRlh71hhUZTtN0kRq3iFDL9qQRMEpP1UJl5LUdpJUFyGFhVxXVJ3JIL808fWaadi-4j4GEVesm4w3uOkJYZJlHiyRM-rTje-2tnF-FZeuc_lbsEq9TMpDbiUI4/s1600/20180429_191817.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto; text-align: center;"><img border="0" data-original-height="1005" data-original-width="1600" height="201" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEitXMdb5jV_-naYzzdM5yRlh71hhUZTtN0kRq3iFDL9qQRMEpP1UJl5LUdpJUFyGFhVxXVJ3JIL808fWaadi-4j4GEVesm4w3uOkJYZJlHiyRM-rTje-2tnF-FZeuc_lbsEq9TMpDbiUI4/s320/20180429_191817.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The overall architecture</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAK1AIMZkY0JlvPY8HblszWGiDA7GGVzdAKvD4HOHc2zQ4_4oHzKMfO2bZqSDHQIOhhouq4Nsa-3alQUHj2u8DZbGFQ-lQlkzwa0HmcnQTK0hxtHP-jLZePsqdls__Bm7uOT4ancZLfmo/s1600/20180429_192733.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="984" data-original-width="1600" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAK1AIMZkY0JlvPY8HblszWGiDA7GGVzdAKvD4HOHc2zQ4_4oHzKMfO2bZqSDHQIOhhouq4Nsa-3alQUHj2u8DZbGFQ-lQlkzwa0HmcnQTK0hxtHP-jLZePsqdls__Bm7uOT4ancZLfmo/s320/20180429_192733.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">smarter distance metric via PLDA!</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<h2>
Automated Extractions for Machine Generated Mail, by Irena Grabovitch-Zuyev (Yahoo Research)</h2>
<h3>
Abstract:</h3>
A few months ago I presented <b>Xcluster</b> - a technique for clustering of machine generated emails and we focused on the classification use case.</div>
<div>
<iframe width="320" height="266" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/O-B3aH_1Bc0/0.jpg" src="https://www.youtube.com/embed/O-B3aH_1Bc0?feature=player_embedded" frameborder="0" allowfullscreen></iframe></div>
<div>
Well, now that we have those classified clusters, what else can we gain from it?<br />In this follow-up talk I will present our solution to the Mail extraction task, whose objective is to extract valuable data from the content of mail messages.<br />This task is key for many types of applications including re-targeting, mail search, and mail summarisation, which utilises the important personal data pieces in mail messages to achieve their objectives. The heart of our solution is an offline process that leverages the structural mail-specific characteristics of the clustering, and automatically creates extraction rules that are later applied online for each new arriving message. This process has been productised in Yahoo mail backend and has been tested in large-scale experiments carried over real Yahoo mail traffic.</div>
<div>
<h3>
My Notes:</h3>
<div>
<ul>
<li>This talk covered how systems like Google Inbox and, in particular, Yahoo Mail handle grouping and smart processing of emails. Inbox does smart clustering that seems to go beyond a simple bag of words. These systems are also able to extract the most salient facts and present them. While parsing is the traditional approach, the paper below explains how this type of work is scaled up. </li>
<li>Talk covers the paper: </li>
<ul>
<li><a href="https://dblp.uni-trier.de/pers/hd/c/Castro:Dotan_Di" itemprop="url" style="color: #1889c4;"><span itemprop="name">Dotan Di Castro</span></a><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">, </span><span itemprop="author" itemscope="" itemtype="http://schema.org/Person" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"><a href="https://dblp.uni-trier.de/pers/hd/g/Gamzu:Iftah" itemprop="url" style="color: #7d848a; text-decoration-line: none;"><span itemprop="name">Iftah Gamzu</span></a></span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">, </span><span itemprop="author" itemscope="" itemtype="http://schema.org/Person" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"><span class="this-person" itemprop="name">Irena Grabovitch-Zuyev</span></span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">, </span><span itemprop="author" itemscope="" itemtype="http://schema.org/Person" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"><a href="https://dblp.uni-trier.de/pers/hd/l/Lewin=Eytan:Liane" itemprop="url" style="color: #7d848a; text-decoration-line: none;"><span itemprop="name">Liane Lewin-Eytan</span></a></span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">, </span><span itemprop="author" itemscope="" itemtype="http://schema.org/Person" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"><a href="https://dblp.uni-trier.de/pers/hd/p/Pundir:Abhinav" itemprop="url" style="color: #7d848a; text-decoration-line: none;"><span itemprop="name">Abhinav 
Pundir</span></a></span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">, </span><span itemprop="author" itemscope="" itemtype="http://schema.org/Person" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"><a href="https://dblp.uni-trier.de/pers/hd/s/Sahoo:Nil_Ratan" itemprop="url" style="color: #7d848a; text-decoration-line: none;"><span itemprop="name">Nil Ratan Sahoo</span></a></span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">, </span><span itemprop="author" itemscope="" itemtype="http://schema.org/Person" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"><a href="https://dblp.uni-trier.de/pers/hd/v/Viderman:Michael" itemprop="url" style="color: #7d848a; text-decoration-line: none;"><span itemprop="name">Michael Viderman</span></a></span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">:</span><br style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;" /><span class="title" itemprop="name" style="background-color: white; color: #666666; font-family: "Open Sans", sans-serif; font-size: 14.6667px; font-weight: 700;">Automated Extractions for Machine Generated Mail.</span><span style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;"> </span><a href="https://dblp.uni-trier.de/db/conf/www/www2018c.html#CastroGGLPSV18" style="background-color: white; color: #7d848a; font-family: "Open Sans", sans-serif; font-size: 14.6667px; text-decoration-line: none;"><span itemprop="isPartOf" itemscope="" itemtype="http://schema.org/Series"><span itemprop="name">WWW (Companion Volume)</span></span> <span itemprop="datePublished">2018</span></a><span 
style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">: </span><span itemprop="pagination" style="background-color: white; color: #505b62; font-family: "Open Sans", sans-serif; font-size: 14.6667px;">655-662</span></li>
</ul>
<li>Look at the structure and hash it, AKA X-Cluster. Within each X-Cluster:</li>
<ul>
<li>extract text as x-path - creates tables </li>
<li>some paths will be constants</li>
<li>others will be different</li>
</ul>
<li>Use rule extraction</li>
<ul>
<li>dictionary based (names, places, ... ) - values need only a 70% hit rate against a dictionary to be annotated</li>
<li>output is a regex</li>
</ul>
<li>Rule refinement</li>
<ul>
<li>use classification</li>
<li>use xpath previously ignored...</li>
<li>Features (light annotations)</li>
<ul>
<li>relative xpath position</li>
<li>annotation before/after</li>
<li>constant values before/after</li>
<li>HTML headers</li>
</ul>
<li>Then we have a contextual ...</li>
</ul>
<li>Evaluation</li>
</ul>
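<div>
The clustering and extraction steps in the notes above can be sketched roughly as follows. This is not the Yahoo implementation: it assumes well-formed markup that <span style="font-family: "courier new" , "courier" , monospace;">xml.etree.ElementTree</span> can parse, ignores tail text, and uses hypothetical function names and sample mails. Mails whose text-bearing paths hash to the same value fall into one cluster, and within a cluster each path is either constant (template text) or variable (a candidate for extraction).</div>

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def text_paths(element, parent="", index=1):
    """Yield (xpath-like path, text) for every element that carries text."""
    path = f"{parent}/{element.tag}[{index}]"
    text = (element.text or "").strip()
    if text:
        yield path, text
    seen = defaultdict(int)  # position of each child among siblings with the same tag
    for child in element:
        seen[child.tag] += 1
        yield from text_paths(child, path, seen[child.tag])


def structure_hash(root):
    """Hash the ordered paths only, so mails built from one template collide."""
    paths = "|".join(path for path, _ in text_paths(root))
    return hashlib.sha1(paths.encode("utf-8")).hexdigest()


def split_constant_variable(mails):
    """For mails of one cluster, return (constant, variable) maps keyed by path."""
    table = defaultdict(list)
    for mail in mails:
        for path, text in text_paths(ET.fromstring(mail)):
            table[path].append(text)
    constant = {p: v[0] for p, v in table.items() if len(set(v)) == 1}
    variable = {p: v for p, v in table.items() if len(set(v)) > 1}
    return constant, variable


mails = [
    "<html><body><p>Your order has shipped</p><p>Alice</p></body></html>",
    "<html><body><p>Your order has shipped</p><p>Bob</p></body></html>",
]
constant, variable = split_constant_variable(mails)
print(constant)
print(variable)
```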
</div>
<h2>
Beyond A-B testing in the AdTech industry - Uri Goren (Bigabid)</h2>
<h3>
Abstract:</h3>
A-B Testing is the default evaluation method used all across the advertising industry.<br />However, despite the simplicity of A-B testing, it is not a silver bullet suitable for all scenarios.<br />We would cover several flavours of A-B testing and their applications, and their upsides and downsides. We would introduce a Bayesian model to tackle some of the issues, and cover the "conjugate_prior" pypi module that is authored by the speaker.</div>
<div>
<br /></div>
<div>
A/B testing is a great subject for speakers to demonstrate their level of sophistication. Uri Goren did about as well as I've heard - Kudos! He misses some of the big issues but avoids most of the sand pits :-) while covering the terrain.</div>
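<div>
To make the Bayesian angle concrete, here is a minimal Beta-Binomial sketch for comparing two click-through rates. It uses only the Python standard library and does not use the "conjugate_prior" module mentioned in the abstract, whose API I am not assuming here. The uniform Beta(1, 1) prior, the click counts and the function names are all hypothetical.</div>

```python
import random


def posterior(clicks, impressions, prior_alpha=1.0, prior_beta=1.0):
    """Beta posterior parameters for a click-through rate under a Beta prior."""
    return prior_alpha + clicks, prior_beta + impressions - clicks


def prob_b_beats_a(a, b, draws=20000, seed=0):
    """Monte Carlo estimate of P(CTR_B > CTR_A) from two Beta posteriors."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(*b) > rng.betavariate(*a) for _ in range(draws))
    return wins / draws


post_a = posterior(clicks=40, impressions=1000)
post_b = posterior(clicks=60, impressions=1000)
print(prob_b_beats_a(post_a, post_b))
```

<div>
Because the Beta prior is conjugate to the binomial likelihood, updating is just adding clicks and non-clicks to the prior parameters, and the output reads directly as the probability that variant B is better.</div>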
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWLscIN2D3-uE8BZjs25Zw1mkTM8PGXs8LgZwy1P5qlHmcts4IgzVwAAeaGl7rVZli7ABwEuswEL_P0AthlbAqzmyP8VE6tttEumkgDARkEvDUgQWN8SDs8A2bCA3jBJLFXq91OQM_jWY/s1600/20180429_201657.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="827" data-original-width="1600" height="165" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWLscIN2D3-uE8BZjs25Zw1mkTM8PGXs8LgZwy1P5qlHmcts4IgzVwAAeaGl7rVZli7ABwEuswEL_P0AthlbAqzmyP8VE6tttEumkgDARkEvDUgQWN8SDs8A2bCA3jBJLFXq91OQM_jWY/s320/20180429_201657.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">bio - impressive</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZoaZAieNEAb1zaLWkE2srTDVjnHYHusXQ04QQbMzQi4UEYIbjqiOWowJtBa00HjnXNXVPwk00HVMfJMx1Y-SOmSFoiSlL5QIX3inefPnawzmAbCCAEM2YhQef0avd7A9npFw7LBidMkc/s1600/20180429_202546.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1096" data-original-width="1600" height="219" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZoaZAieNEAb1zaLWkE2srTDVjnHYHusXQ04QQbMzQi4UEYIbjqiOWowJtBa00HjnXNXVPwk00HVMfJMx1Y-SOmSFoiSlL5QIX3inefPnawzmAbCCAEM2YhQef0avd7A9npFw7LBidMkc/s320/20180429_202546.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">All data scientists end up working on CTR!?</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEu2M9v4F86pswBdL77Nq3ZMj0qF6xys1agOWGc0ReSfqPgeF7VfmoXkYckpoz3Ji2rPYr9kdt-SmV-mUCJ-_V0aDigHxR2OXA6HWXF3v9Rb81J9-HBjT_UMzS-lNXCyCUJAEhfBv-1k0/s1600/20180429_201852.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1021" data-original-width="1600" height="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEu2M9v4F86pswBdL77Nq3ZMj0qF6xys1agOWGc0ReSfqPgeF7VfmoXkYckpoz3Ji2rPYr9kdt-SmV-mUCJ-_V0aDigHxR2OXA6HWXF3v9Rb81J9-HBjT_UMzS-lNXCyCUJAEhfBv-1k0/s320/20180429_201852.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">mentioned avoiding confounding factors by limiting test scope.</td></tr>
</tbody></table>
<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZx1g09uwTufXYXsONlbOe-PvBmM-VmR7465lsmkM_BFEbNNUS8epkEEhrDQnSxLvrGtqBFAA8x0ndneLUfNiGOMvG_CG6i2NBvrfHmYQl2Iqy-MoEcFc3foF4T5OxPQsonV72rRN_GcE/s1600/20180429_202032.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="970" data-original-width="1600" height="193" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZx1g09uwTufXYXsONlbOe-PvBmM-VmR7465lsmkM_BFEbNNUS8epkEEhrDQnSxLvrGtqBFAA8x0ndneLUfNiGOMvG_CG6i2NBvrfHmYQl2Iqy-MoEcFc3foF4T5OxPQsonV72rRN_GcE/s320/20180429_202032.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">the reason why we can assume a normal distribution<br />however we soon see the distribution is highly skewed</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRCskg55qMFc3mofbfDD2d00VyAds1e2yCds3rkwmTLogMD9U86WOWpG1eMmp59usZiS2bHqDAdo_00cruZmixBYBHxetLXlVvRYRYjwLUDQHp0WhbK3k47q-L1LlFhRuEY8YDA0eM5Pg/s1600/20180429_202110.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1043" data-original-width="1600" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRCskg55qMFc3mofbfDD2d00VyAds1e2yCds3rkwmTLogMD9U86WOWpG1eMmp59usZiS2bHqDAdo_00cruZmixBYBHxetLXlVvRYRYjwLUDQHp0WhbK3k47q-L1LlFhRuEY8YDA0eM5Pg/s320/20180429_202110.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">how long before stopping. explained p-value<br /></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP7AGO9MwyQ5QS0J9YlzwbjZ9eoCYN51Q0zjDP8yDPl7APrraARe8w627sAoOvWZqdtpLFxZwgbti0zCUH81ko_bhYlsOE_LzV119ozOLv-UfZ4d3zmA9QznTDUqXL5cdHOZ9U3PxbbWE/s1600/20180429_202714.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1042" data-original-width="1600" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP7AGO9MwyQ5QS0J9YlzwbjZ9eoCYN51Q0zjDP8yDPl7APrraARe8w627sAoOvWZqdtpLFxZwgbti0zCUH81ko_bhYlsOE_LzV119ozOLv-UfZ4d3zmA9QznTDUqXL5cdHOZ9U3PxbbWE/s320/20180429_202714.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">so the data is far from normal.</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgze9G0uev1CqvnlOfY5QOWE2Nose8X1gXSNeBWClk3Wqu-O66D4MOqkPzzC10Wi3Ja-snIfTAIv7QAA4FbMbdOwxKpztANf2opaoRUWrvjg8d0Uj6nf6nVYeaBLWXq3lxy0p4irqOiEJA/s1600/20180429_202941.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="963" data-original-width="1600" height="192" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgze9G0uev1CqvnlOfY5QOWE2Nose8X1gXSNeBWClk3Wqu-O66D4MOqkPzzC10Wi3Ja-snIfTAIv7QAA4FbMbdOwxKpztANf2opaoRUWrvjg8d0Uj6nf6nVYeaBLWXq3lxy0p4irqOiEJA/s320/20180429_202941.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">we want to estimate CTR probability using Bernoulli Distribution</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS3ta4b5xJ80vayqSbFpzw03Uj7P51mwdSYOY9QS7Libar3trVju4JUrYNq72G-iAt6O9WS_t_ROuciIyvDXxugK28fGqDAljzIH2ew-ul5eknZYwYb-n4En7TJP6jJx5AtYn3-j8mMRQ/s1600/20180429_203032.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="952" data-original-width="1600" height="190" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS3ta4b5xJ80vayqSbFpzw03Uj7P51mwdSYOY9QS7Libar3trVju4JUrYNq72G-iAt6O9WS_t_ROuciIyvDXxugK28fGqDAljzIH2ew-ul5eknZYwYb-n4En7TJP6jJx5AtYn3-j8mMRQ/s320/20180429_203032.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
This auction pricing slide shows that CPC <br />is based on CTR ... (aren't we missing the next bid ...)</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiac6jiaioJ4cCcLhyu0ANG4eCgC6MeEuJ49gZIZv3X1qE7VKFsMZ7mzfW4mdsBpqxHqUoxozBC4m0aKGJ39dhWx3BXK6rgWl2vYtESYcjFQ7dL7dvAIx-N0tzqqG_xXh_yzyc2FX6haI/s1600/20180429_203229.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1048" data-original-width="1600" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiac6jiaioJ4cCcLhyu0ANG4eCgC6MeEuJ49gZIZv3X1qE7VKFsMZ7mzfW4mdsBpqxHqUoxozBC4m0aKGJ39dhWx3BXK6rgWl2vYtESYcjFQ7dL7dvAIx-N0tzqqG_xXh_yzyc2FX6haI/s320/20180429_203229.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Any prior might converge eventually ... but if there is a conjugate <br />prior, it is the best choice. Also introducing a new Python package with <br />Bayesian Monte Carlo simulation for A/B tests (which lets us guesstimate <br />the remaining probability of a win for A or B)</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTqkMETaHJTOKKL_zpt6_DCTbUmT0FNsrPLMgQrwiO1KFk-_D20_CRkamS-aMGEf9SrTzRCTba_rj87kWhWnAGx4elxD_jPAcmD4Aig6Ytn7fGLHjTzr5XASQ2y0VR0C5t6BfU80siZNU/s1600/20180429_203331.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1002" data-original-width="1600" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjTqkMETaHJTOKKL_zpt6_DCTbUmT0FNsrPLMgQrwiO1KFk-_D20_CRkamS-aMGEf9SrTzRCTba_rj87kWhWnAGx4elxD_jPAcmD4Aig6Ytn7fGLHjTzr5XASQ2y0VR0C5t6BfU80siZNU/s320/20180429_203331.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption">using a conjugate prior (as it fits the posterior) <br />the package matches posteriors with priors :-)</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcSxQzhq4KSxQU_PsOGL5aW7OuvDQFUmK5_XvPsYCVylkht16ZUcabOFzPGXq5T6SjtPY1Pxo7i5A9ABltEheENaYI21S8nLiO_PqgY3QBP4yCqgYNX6h97kMoNCe0hyphenhyphenwXro5_HC68Ebc/s1600/20180429_203542.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1308" data-original-width="1600" height="261" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcSxQzhq4KSxQU_PsOGL5aW7OuvDQFUmK5_XvPsYCVylkht16ZUcabOFzPGXq5T6SjtPY1Pxo7i5A9ABltEheENaYI21S8nLiO_PqgY3QBP4yCqgYNX6h97kMoNCe0hyphenhyphenwXro5_HC68Ebc/s320/20180429_203542.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfZalmbauydQQQ-urxYKyOepMl5beZefOZsVbF72aE8fbE2TCMnXXcL8HBcqO1tuilTow2XRHsspB9_I9_B72U0VFVoKXODXwIbLdKvLvSUtTjPF8Dzc9yfm5JS2WLWXDX0lyOnGaNH5c/s1600/20180429_203623.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1174" data-original-width="1600" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfZalmbauydQQQ-urxYKyOepMl5beZefOZsVbF72aE8fbE2TCMnXXcL8HBcqO1tuilTow2XRHsspB9_I9_B72U0VFVoKXODXwIbLdKvLvSUtTjPF8Dzc9yfm5JS2WLWXDX0lyOnGaNH5c/s320/20180429_203623.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjShpENp7HZJqRmHkMESurPBzMO1u2YTbQ0yxRIaukQQxd9LvM0Lvs_3qx_A73DpwrY6h1USpADB-zuDVxzQ2NbXYgl6drE0Y0x0uAkdmuhfUsKRxVFkg7s6wA4tw2yGgZOnlI_KJ8_u4/s1600/20180429_203705.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1005" data-original-width="1600" height="201" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjShpENp7HZJqRmHkMESurPBzMO1u2YTbQ0yxRIaukQQxd9LvM0Lvs_3qx_A73DpwrY6h1USpADB-zuDVxzQ2NbXYgl6drE0Y0x0uAkdmuhfUsKRxVFkg7s6wA4tw2yGgZOnlI_KJ8_u4/s320/20180429_203705.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<br /></div>
<h3>
My Notes</h3>
<div>
<ul>
<li>combining features requires a factorial design</li>
<li>p-value - the chance of getting a result at least this extreme in an A/A test</li>
<li>like others before him, talks about N and shows that 30 is good for a uniform distribution</li>
<li>when do we stop - in ad-tech cross validation?</li>
<li>Stratified cross validation - did not talk about it</li>
<li>Bernoulli is better but ...</li>
<li>Bayesian multi-armed bandits save you money that would be lost on the worst branch of the test while running the test.</li>
<li>Asked how they know the test has run its course and/or validate the results....</li>
</ul>
<div>
some answers:</div>
</div>
<div>
recommended tutorial on Bayesian methods for data science - Allen Downey's Think Bayes!</div>
<div>
<br /></div>
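<div>
The conjugate prior idea from the slides can be written out for CTR. This is the textbook Beta-Bernoulli update, not a description of how the conjugate_prior package works internally; the symbols (alpha, beta for the prior, k clicks in n impressions) are my own notation.</div>

```latex
\begin{aligned}
\theta &\sim \mathrm{Beta}(\alpha, \beta) && \text{prior on the CTR} \\
k \mid \theta &\sim \mathrm{Binomial}(n, \theta) && \text{$k$ clicks in $n$ impressions} \\
\theta \mid k &\sim \mathrm{Beta}(\alpha + k,\; \beta + n - k) && \text{posterior} \\
\mathbb{E}[\theta \mid k] &= \frac{\alpha + k}{\alpha + \beta + n}
\end{aligned}
```

<div>
For example, with a uniform Beta(1, 1) prior, 3 clicks in 100 impressions gives a Beta(4, 98) posterior with mean 4/102, roughly 0.039.</div>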
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-66605834830887311072018-02-19T14:05:00.000+02:002018-02-19T14:05:00.002+02:00Insight into progressive web appsSome notes from a Meetup on PWAs in January 2016. I feel quite knowledgeable about PWAs but I wanted to learn more about implementing service workers. I ended up adding some research and collecting some great resources.<br />
<br />
However, I ended up getting more detailed material on service workers from Google's developer docs. The resources have also been expanded.<br />
<br />
<!--StartFragment-->
<br />
<div style="border-width: 100%; direction: ltr;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 7.002in;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 7.002in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<a href="https://www.meetup.com/The-Future-is-Javascript/events/246346861/">https://www.meetup.com/The-Future-is-Javascript/events/246346861/</a></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h1 style="color: #1e4e79; font-family: Calibri; font-size: 16.0pt; margin: 0in;">
Service
worker</h1>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
A service worker is
just a simple JavaScript file that sits between you and the network </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
– It runs in another
thread </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
– It has no access
to DOM </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
– It intercepts
every network request (including cross domain)</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-size: 11.0pt; margin: 0in;">
<span style="font-family: Calibri;">Entry
point: </span><span style="font-family: Consolas;">self.caches</span><span style="font-family: Calibri;"> (in service worker) or </span><span style="font-family: Consolas;">window.caches</span><span style="font-family: Calibri;">
(on page)</span></div>
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
</h2>
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Registering
a Service Worker </h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
• Works with
promises </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
• Re-registration
works fine</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
In main.js</div>
<div style="direction: ltr;">
<table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border-color: #A3A3A3; border-style: solid; border-width: 0pt; direction: ltr;" summary="" title="" valign="top">
<tbody>
<tr>
<td style="background-color: white; border-width: 0pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 5.4673in;">
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">navigator.serviceWorker.register('/sw.js').then(function(reg){</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin-left: .375in; margin: 0in;">console.log('registered');</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">}).catch(function(err){</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin-left: .375in; margin: 0in;">console.log('Boo!');</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<div style="font-family: Calibri; font-size: 11.0pt; line-height: 15pt; margin: 0in;">
<br /></div>
</td>
</tr>
</tbody></table>
</div>
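<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Not every browser has service workers, so it is common to feature-detect before registering. A minimal sketch, assuming a helper name of my own (registerServiceWorker) and passing the navigator object in as a parameter so it can be exercised outside a browser:</div>

```javascript
// Sketch: register a service worker only when the browser supports it.
// `nav` is normally window.navigator; the function name is hypothetical.
function registerServiceWorker(nav, url, options) {
  if (!nav || !('serviceWorker' in nav)) {
    // Unsupported browser: skip registration, the page still works without it.
    return Promise.resolve(null);
  }
  return nav.serviceWorker.register(url, options)
    .then(function (reg) {
      console.log('registered with scope', reg.scope);
      return reg;
    })
    .catch(function (err) {
      console.log('registration failed', err);
      return null;
    });
}

// In a page: registerServiceWorker(navigator, '/sw.js');
```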
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Setting
an interception scope </h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
The default scope is
where the sw file is, but you can control that</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">navigator.serviceWorker.register('/sw.js',{scope:
'/my-app/'});</code><br />
<br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
It will then control
/my-app/ and its subdirectories</div>
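<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
As far as I understand it, scope matching is a plain prefix test on the resolved URL. The sketch below mimics that rule; isInScope and the example.com origin are made up for illustration and are not part of the service worker API.</div>

```javascript
// Sketch of the prefix rule: a URL is controlled when it starts with
// the registration's scope URL. Names and origin are illustrative only.
function isInScope(scope, requestUrl, origin) {
  var scopeUrl = new URL(scope, origin).href;   // e.g. https://example.com/my-app/
  var target = new URL(requestUrl, origin).href;
  return target.startsWith(scopeUrl);
}

var origin = 'https://example.com';
console.log(isInScope('/my-app/', '/my-app/page.html', origin));
console.log(isInScope('/my-app/', '/other/page.html', origin));
```

<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Note that with a scope of /my-app/ the bare path /my-app (no trailing slash) is not matched by this rule.</div>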
<br />
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
On
install</h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<ul style="direction: ltr; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="disc">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 11.0pt;">Add initial resources (the
application shell) to the cache</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 11.0pt;">Cache has a name</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 11.0pt;">Array of resources to cache</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 11.0pt;">Mechanism to get a resource
by path (a map)</span></li>
</ul>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">console.log('Service
Worker Registered!');</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">// The
install handler below takes an array of URLs,<br />
        // fetches them, and stores the responses in the cache,<br />
        // example: key: 'main.js' value: 'alert(3)'</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">var cacheName
= 'app-shell-cache-v1';</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">var
filesToCache = ['/', '/index.html', ...];</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">self.addEventListener('install',
event => {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> event.waitUntil(</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> caches.open(cacheName).then(cache
=> {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
cache.addAll(filesToCache);<span style="mso-spacerun: yes;"> </span>//load app
shell into the cache.</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> }).then(()
=> {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
self.skipWaiting();</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> })</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> );</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
The install should
happen in the background in case there is a previous version of the service
worker running. If the install fails the old service worker will be left
intact.</div>
<br />
<br />
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
On
activate</h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Update cache -
remove outdated resources. Caches should be versioned. If the total size of all
caches for an origin is too big, they may be reclaimed. So we should
make sure to remove old data. This is done more easily if we use versioned caches.</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">self.addEventListener('activate',e
=> {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> e.waitUntil(</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> caches.keys().then(keyList
=> {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
Promise.all(keyList.map(key => {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> if
(key !== cacheName) return caches.delete(key);</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> }));</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> }));</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
self.clients.claim();</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
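<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
The cleanup decision in the activate handler is easy to pull out into a small pure function, which makes it testable without a browser. A sketch, with staleCacheNames being a name I made up:</div>

```javascript
// Hypothetical helper: given every cache name for the origin and the
// current versioned name, list the caches the activate handler should delete.
function staleCacheNames(keyList, currentName) {
  return keyList.filter(function (key) {
    return key !== currentName;
  });
}

// Inside the activate handler it could be used like this:
//   caches.keys().then(keys =>
//     Promise.all(staleCacheNames(keys, cacheName).map(k => caches.delete(k))));
```

<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Like the handler above, this deletes every cache except the current one, so an app that keeps several caches per origin would need a narrower filter (for example on a name prefix).</div>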
<br />
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
On
fetch</h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Retrieve from cache
with network fallback</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Allows us to intercept
page loading</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Can get the page from
the cache or from the network</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Handle offline and
404 with an exception</div>
<br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">self.addEventListener('fetch',
event => {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> event.respondWith(
</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> caches.match(event.request)</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> .then(response
=> {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
response || fetch(event.request); //return cached else fetch </code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> })</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> );</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
How this is
handled in practice depends on the resources and their rate of change. So the shell
might be fetched from the cache first, while news might be
fetched from the network and fall back to the cache if offline.</div>
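<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
One way to express that choice is a small routing function the fetch handler consults. This is only a sketch: the file list, the /news/ prefix and the strategy names are assumptions of mine, not something from the talk.</div>

```javascript
// Sketch: choose a caching strategy from the request path.
// SHELL_FILES, the '/news/' prefix and the strategy names are illustrative.
var SHELL_FILES = ['/', '/index.html', '/main.js'];

function chooseStrategy(requestUrl, origin) {
  var path = new URL(requestUrl, origin).pathname;
  if (SHELL_FILES.indexOf(path) !== -1) return 'cache-first';   // app shell changes rarely
  if (path.startsWith('/news/')) return 'network-first';        // fresh data, cache when offline
  return 'network-only';                                        // everything else is not cached
}
```

<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
The fetch handler would then branch on the returned name and call the matching cache or network code from the sections below.</div>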
<br />
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Serving
files from the cache</h2>
<br />
<h3 style="color: #377bac; font-family: Calibri; font-size: 12.0pt; margin: 0in;">
Cache
falling back to network</h3>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
As above</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">self.addEventListener('fetch',
function(event) {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> event.respondWith(
</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> caches.match(event.request)</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> .then(function(response)
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
response || fetch(event.request);</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> })</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> );</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h3 style="color: #377bac; font-family: Calibri; font-size: 12.0pt; margin: 0in;">
Network
falling back to cache</h3>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Frequently updated
data with fallback to cache - say for news where we have an older feed.</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">self.addEventListener('fetch',
function(event) {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> event.respondWith(</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> fetch(event.request).catch(function()
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
caches.match(event.request);</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> })</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> );</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h3 style="color: #377bac; font-family: Calibri; font-size: 12.0pt; margin: 0in;">
Cache
then network </h3>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
For resources that
update frequently and are not versioned in the shell </div>
<div style="font-size: 11.0pt; margin: 0in;">
<span style="font-family: Calibri;">E.g. (</span><span style="font-family: Arial;">articles, avatars, social media timelines, game
leader boards)</span></div>
<div style="color: black; font-family: Arial; font-size: 11.0pt; margin: 0in;">
Requires 2
requests - one to cache and one to the network.</div>
<div style="color: black; font-family: Arial; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="color: black; font-family: Arial; font-size: 11.0pt; margin: 0in;">
Note this
code goes in the main script not the SW as it is … reactive</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">var
networkDataReceived = false;</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">var
networkUpdate = fetch('/data.json')</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">.then(function(response)
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
response.json();</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">}).then(function(data)
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> networkDataReceived
= true;</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> updatePage(data);</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Next we look for the
resource in the cache. This will usually respond faster than the network
request. We use the cached data to provide a quick response. If the network
provides newer data we update again. If the cache lookup fails we fall back to the network.</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">caches.match('/data.json').then(function(response)
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
response.json();</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">}).then(function(data)
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> if
(!networkDataReceived) {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> updatePage(data);</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> }</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">}).catch(function()
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
networkUpdate;</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">})</code><br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h3 style="color: #377bac; font-family: Calibri; font-size: 12.0pt; margin: 0in;">
Generic
fallback</h3>
<br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Here is a
version with a generic fallback to an offline mode if network fails</div>
<br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">self.addEventListener('fetch',
function(event) {</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> event.respondWith(</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> caches.match(event.request).then(function(response)
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
response || fetch(event.request);</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> }).catch(function()
{</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> return
caches.match('/offline.html');</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> })</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"> );</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">});</code><br />
<br />
<br />
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Progressive
web apps use a manifest to set up an icon on mobile</h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h3 style="color: #377bac; font-family: Calibri; font-size: 12.0pt; margin: 0in;">
In
html:</h3>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">&lt;link rel="manifest" href="/manifest.json"&gt;</code></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h3 style="color: #377bac; font-family: Calibri; font-size: 12.0pt; margin: 0in;">
Sample
WebApp Manifest: </h3>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
{ </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"name": "Tovli",</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"short_name": "TovliWeb",</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"start_url": ".", </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"display": "standalone",</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"background_color":
"#fff",</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"description": "Feel better
today",</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>"icons": [{ </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin-left: .375in; margin: 0in;">
"src":
"images/homescreen48.png", </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin-left: .375in; margin: 0in;">
"sizes":
"48x48", </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin-left: .375in; margin: 0in;">
"type":
"image/png"</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>}]</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
}</div>
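<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
A quick way to sanity-check a manifest is to parse it and look for the members the home screen metadata relies on. The sketch below is an assumption-laden illustration: missingManifestFields is a hypothetical helper, and the list of members it checks (a name, start_url, display, at least one icon) is my own minimal choice. Browsers may impose further requirements, such as larger icon sizes. The sample is typed with straight quotes, since JSON.parse rejects curly ones.</div>

```javascript
// Hypothetical helper: list the members a manifest object is missing.
// The set of checked members is an assumption, not a browser's exact rules.
function missingManifestFields(manifest) {
  const missing = [];
  if (!(manifest.name || manifest.short_name)) missing.push('name or short_name');
  if (!manifest.start_url) missing.push('start_url');
  if (!manifest.display) missing.push('display');
  if (!Array.isArray(manifest.icons) || manifest.icons.length === 0) missing.push('icons');
  return missing;
}

// The sample manifest from these notes.
const sampleManifest = JSON.parse(`{
  "name": "Tovli",
  "short_name": "TovliWeb",
  "start_url": ".",
  "display": "standalone",
  "background_color": "#fff",
  "description": "Feel better today",
  "icons": [{
    "src": "images/homescreen48.png",
    "sizes": "48x48",
    "type": "image/png"
  }]
}`);

console.log(missingManifestFields(sampleManifest));
```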
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Cache
storage limits</h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<div style="direction: ltr;">
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border-color: #A3A3A3; border-style: solid; border-width: 1pt; direction: ltr;" summary="" title="" valign="top">
<tbody>
<tr>
<td style="background-color: #b7b7b7; border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.6208in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in; text-align: center;">
Browser
</div>
</td>
<td style="background-color: #b7b7b7; border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: .8493in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in; text-align: center;">
Limitation</div>
</td>
<td style="background-color: #b7b7b7; border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.0402in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in; text-align: center;">
Notes</div>
</td>
</tr>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.6208in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Chrome and Opera</div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: .8298in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
No limit.</div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.1437in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Storage is per
origin not per API </div>
</td>
</tr>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.6208in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Firefox</div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: .8298in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
No limit. </div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.0118in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Prompts after 50
MB </div>
</td>
</tr>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.6208in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Mobile Safari</div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: .8298in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
50MB. </div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.0118in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
</td>
</tr>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.6208in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Desktop Safari</div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: .8298in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
No limit. </div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.0118in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Prompts after 5MB </div>
</td>
</tr>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.6402in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Internet Explorer
(10+)</div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: .8298in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
250MB. </div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 1.9923in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Prompts after 10MB
</div>
</td>
</tr>
</tbody></table>
</div>
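<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
Figures like the ones in the table change between browser versions, so it can be safer to ask the browser at run time. The sketch below assumes the browser exposes navigator.storage.estimate(), which resolves to usage and quota in bytes. The describeStorage helper is hypothetical and takes the storage object as a parameter so it can also be exercised with a stub.</div>

```javascript
// Summarize usage against quota. `storage` is expected to look like
// navigator.storage, i.e. to expose an async estimate() method that
// resolves to { usage, quota } in bytes.
async function describeStorage(storage) {
  const { usage, quota } = await storage.estimate();
  // Guard against a zero quota so we never divide by zero.
  const percentUsed = quota > 0 ? (usage * 100) / quota : 0;
  return { usage, quota, percentUsed };
}

// In a browser: describeStorage(navigator.storage).then(console.log);
```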
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
The
PWA Checklist </h2>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
• Site is served
over HTTPS (localhost permitted) </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
• Pages are
responsive on tablets &amp; mobile devices </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
• Site works
cross-browser</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>• Each page has a URL </div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
• Page transitions
don't feel like they block on the network</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>• The start URL (at least) loads while offline</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>• Metadata provided for Add to Home screen</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>• First load fast even on 3G</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<span style="mso-spacerun: yes;"> </span>• See the full checklist</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
PWA
with Vue</h2>
<div style="direction: ltr;">
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border-color: #A3A3A3; border-style: solid; border-width: 1pt; direction: ltr;" summary="" title="" valign="top">
<tbody>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.6548in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
</td><td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 4.0548in;">
</td>
</tr>
<tr>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 2.6756in;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTpoGAH8SP5tp-h4RreJi6REyqXGyea_B5Y2XVSYbfZH25Gnk8WDg93Sk6Gg0jeoCCMD6jDBm-tgSAL2LaMYICeQnixEFeUWc3PhplyYWzYNjqxjnA8wZtWpp0cURcrb7e0WsCt9-9Mrg/s1600/vue-pwa.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="653" data-original-width="410" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTpoGAH8SP5tp-h4RreJi6REyqXGyea_B5Y2XVSYbfZH25Gnk8WDg93Sk6Gg0jeoCCMD6jDBm-tgSAL2LaMYICeQnixEFeUWc3PhplyYWzYNjqxjnA8wZtWpp0cURcrb7e0WsCt9-9Mrg/s320/vue-pwa.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">vue pwa app</td></tr>
</tbody></table>
<div style="margin: 0in;">
<br /></div>
</td>
<td style="border-color: #A3A3A3; border-style: solid; border-width: 1pt; padding: 2.0pt 3.0pt 2.0pt 3.0pt; vertical-align: top; width: 4.1034in;">
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">npm install
-g vue-cli</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">vue init
pwa my-project</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">cd
my-project</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">npm install</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">npm run dev</code><br />
<br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">Once done, use npm to:</code><br />
<span style="font-family: Consolas; font-size: 11pt;"><br /></span>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">Run the app in development mode:</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">npm run dev</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"><br /></code>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">Build for
production (uglify, minify etc):</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">npm run
build</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"><br /></code>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">Run unit
tests with Karma + Mocha + karma-webpack:</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">npm run
unit</code><br />
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;"><br /></code>
<code style="font-family: Consolas; font-size: 11.0pt; margin: 0in;">Run end-to-end
tests with Nightwatch:</code><br />
<code style="margin: 0in;"><span style="color: #24292e; font-family: SFMono-Regular; font-size: 10.2pt;">npm run e2e</span></code><br />
</td>
</tr>
</tbody></table>
</div>
<br />
<h2 style="color: #2e75b5; font-family: Calibri; font-size: 14.0pt; margin: 0in;">
Resources</h2>
<ul style="direction: ltr; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="disc">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 11.0pt;">The slides are here: </span><a href="http://bit.ly/misterBIT-PWA-slides" rel="nofollow" target="_blank"><span style="background: white; font-family: "Graphik Meetup"; font-size: 12.0pt;">http://bit.ly/misterBIT-PWA-slides</span></a></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri; font-size: 11.0pt;">Code sample is here: </span><a href="http://bit.ly/misterBIT-PWA-Code" rel="nofollow" target="_blank"><span style="background: white; font-family: "Graphik Meetup"; font-size: 12.0pt;">http://bit.ly/misterBIT-PWA-Code</span></a></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://docs.google.com/presentation/d/1yKsDW9wu5xbx2pXDKlSBTg7h1i48CoQd2A8QvHw-lag/edit#slide=id.p4"><span style="font-family: Calibri; font-size: 11.0pt;">Slides</span></a><span style="font-family: Calibri; font-size: 11.0pt;"> on service workers and the
Cache API, from Google</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://jakearchibald.com/2014/offline-cookbook/" rel="nofollow" target="_blank"><span style="font-family: Calibri; font-size: 11.0pt;">The offline cookbook</span></a><span style="font-family: Calibri; font-size: 11.0pt;"> by Jake Archibald (caching patterns for most situations)</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://serviceworke.rs/" rel="nofollow" style="font-family: Calibri; font-size: 11pt;" target="_blank">Serviceworke.rs</a><span style="font-family: Calibri; font-size: 11pt;"> from Mozilla (more caching patterns)</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://www.udacity.com/course/offline-web-applications--ud899" rel="nofollow" target="_blank"><span style="font-family: Calibri; font-size: 11.0pt;">Course on offline first</span></a><span style="font-family: Calibri; font-size: 11.0pt;"> on Udacity by Jake Archibald.</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://medium.com/dev-channel/offline-storage-for-progressive-web-apps-70d52695513c"><span style="font-family: Calibri; font-size: 11.0pt;">Offline Storage for
Progressive Web Apps</span></a><span style="font-family: Calibri; font-size: 11.0pt;"> by Addy Osmani </span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://developers.google.com/web/updates/2016/06/persistent-storage?hl=en"><span style="font-family: Calibri; font-size: 11.0pt;">Persistent storage</span></a><span style="font-family: Calibri; font-size: 11.0pt;"> by Chris Wilson</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://developers.google.com/web/ilt/pwa/lab-caching-files-with-service-worker"><span style="font-family: Calibri; font-size: 11.0pt;">Lab</span></a><span style="font-family: Calibri;"><span style="font-size: 11pt;"> by Google on Caching Files
with Service Worker</span></span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><a href="https://infrequently.org/2016/09/what-exactly-makes-something-a-progressive-web-app/" rel="nofollow" target="_blank">What, Exactly, Makes Something A Progressive Web App?</a></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri;"><span style="font-size: 14.6667px;"><a href="https://developers.google.com/web/tools/lighthouse/" rel="nofollow" target="_blank">Lighthouse</a> - tool for auditing PWA.</span></span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: Calibri;"><span style="font-size: 14.6667px;"><a href="http://hnpwa.com/" rel="nofollow" target="_blank">Hacker news as PWA</a> in a framework of your choosing (inspired by <a href="https://github.com/tastejs/todomvc" rel="nofollow" target="_blank">TodoMVC</a> )</span></span></li>
</ul>
</div>
</div>
</div>
<div style="border-width: 100%; direction: ltr;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 7.002in;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 7.002in;">
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
<br /></div>
</div>
</div>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0Electra Tower, יגאל אלון 98, תל אביב יפו, 67726, ישראל32.0700804 34.79414459999998131.962456900000003 34.632783099999983 32.177703900000004 34.95550609999998tag:blogger.com,1999:blog-7341613201559046168.post-21414263186827357252018-02-19T11:39:00.000+02:002019-03-21T10:29:15.742+02:00GraphQL with Apollo<div style="direction: ltr;">
<div style="direction: ltr; margin-left: 0in; margin-top: 0in; width: 6.6305in;">
<div style="direction: ltr; font-family: verdana; font-size: 14.6667px; margin-left: 0.075in; margin-top: 0in; width: 3.4541in;">
<div style="font-family: Consolas; font-size: 20pt; margin: 0in;">
<span style="font-family: "calibri"; font-size: 11pt;">My notes from </span><span style="font-family: "calibri"; font-size: 14.6667px;">Alexey Kureev's </span><span style="font-family: "calibri"; font-size: 11pt;">talk titled "Apollo Client: the stuff no-one ever told ya"</span><span class="Apple-converted-space" style="font-family: "calibri"; font-size: 11pt;"> </span><span style="font-family: "calibri"; font-size: 11pt;">by @klarna </span></div>
<div style="font-family: Consolas; font-size: 20pt; margin: 0in;">
<span style="font-family: "calibri"; font-size: 11pt;">in the React & React Native Meetup.</span><span style="font-family: "calibri"; font-size: 11pt;"><br /></span><br />
<span style="font-family: "calibri"; font-size: 11pt;">Meetup</span><span class="Apple-converted-space" style="font-family: "calibri"; font-size: 11pt;"> </span><a href="https://www.meetup.com/React-IL/events/247277840/?_xtd=gqFyqTE0MzM0NzM5MqFwpmlwaG9uZQ&from=ref" style="font-family: calibri; font-size: 11pt;">link</a><span style="font-family: "calibri"; font-size: 11pt;">:</span></div>
</div>
<div style="direction: ltr; margin-left: 0in; margin-top: 0.4444in; width: 6.6305in;">
<span style="font-family: calibri; font-size: 11pt;">REST is very widely used, but as web applications have evolved and most of the processing has moved to the client, some of its features have come to be seen as performance bottlenecks. For example, endpoints are separated, and so are the entities they expose, while we typically want to query for data that spans several endpoints and needs only slices of the entities. With REST this requires multiple requests, each returning full entities. GraphQL lets us do this in a single request and provides a more sophisticated way to make queries.</span><br />
<span style="font-family: "calibri"; font-size: 11pt;"><br /></span>
<span style="font-family: "calibri"; font-size: 11pt;">GraphQL is the evolution... Benchmarks published by Facebook claim a </span></div>
<div style="direction: ltr; margin-left: 0in; margin-top: 0.4444in; width: 6.6305in;">
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
The slides show how to consume a GraphQL data source using React.</div>
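<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
Before looking at a client library, it can help to see what a single GraphQL request looks like on the wire. The sketch below is not Apollo Client. It uses plain fetch options, the common convention of a POST with a JSON body holding query and variables, a hypothetical /graphql endpoint, and a hypothetical buildGraphQLRequest helper.</div>

```javascript
// Build the fetch() options for a GraphQL request: one POST whose JSON body
// carries the query text and its variables.
function buildGraphQLRequest(query, variables) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  };
}

// One request asks for exactly the fields we need from a user.
const USER_QUERY = `
  query UserQuery($id: Int!) {
    user(id: $id) {
      name
      email
    }
  }
`;

const options = buildGraphQLRequest(USER_QUERY, { id: 1 });
// In an app (the endpoint path is an assumption):
// fetch('/graphql', options).then((res) => res.json());
```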
<div style="direction: ltr; font-family: verdana; font-size: 14.6667px;">
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border: 1pt solid rgb(163, 163, 163); direction: ltr;" summary="" title="" valign="top"><tbody>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirSI_fTLGHXND-rjVW9x9omWiVzuiLmHhloRmiS2znkxq3rzKZf2oJXmzfTCk8qNjyWodTlW4eYitWfruYcQCeVMM6rDmy6IFJj_TsGZeSkv258hcNiIoh7sl3rf0VCPxYo2qy9YnO58c/s1600/gql1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="793" data-original-width="1600" height="156" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirSI_fTLGHXND-rjVW9x9omWiVzuiLmHhloRmiS2znkxq3rzKZf2oJXmzfTCk8qNjyWodTlW4eYitWfruYcQCeVMM6rDmy6IFJj_TsGZeSkv258hcNiIoh7sl3rf0VCPxYo2qy9YnO58c/s320/gql1.png" width="320" /></a></td> <td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1548in;"><!--StartFragment-->
<br />
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
REST is now vintage:
separated endpoints with separated entities.</div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
REST pros:</div>
<ul style="direction: ltr; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="disc">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: "calibri"; font-size: 11.0pt;">it provides deterministic
URIs,</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: "calibri"; font-size: 11.0pt;">caching on the HTTP level</span></li>
</ul>
<div style="font-family: Calibri; font-size: 11.0pt; margin-left: .375in; margin: 0in;">
<br /></div>
<div style="font-family: Calibri; font-size: 11.0pt; margin: 0in;">
GraphQL supports
reactive subscriptions and data aggregation, which don't fit the REST
architecture well. Where REST needs several requests, with GraphQL we
need only one. As the name implies, GraphQL is a graph query language that
changes the way you think about data. Instead of operating on separate entities, you
start to operate on data graphs. Let's take a closer look:</div>
<!--EndFragment--><br />
<div style="margin: 0in;">
</div>
</td> </tr>
<tr> <td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjh8QL0zYNK-t7x3XLeN0UiB32cDWseDu5bNc8Z-Va_OfSQRqLCRMRaThUHDBxVRzk2wSxlhtvJ-ea6QIaGNFjERjddfw1IufJFfKdMBNYAgvHXLnhx3zsQ4WfhWiOR1WMdPEZCdAb-bD8/s1600/gql2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="925" data-original-width="1600" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjh8QL0zYNK-t7x3XLeN0UiB32cDWseDu5bNc8Z-Va_OfSQRqLCRMRaThUHDBxVRzk2wSxlhtvJ-ea6QIaGNFjERjddfw1IufJFfKdMBNYAgvHXLnhx3zsQ4WfhWiOR1WMdPEZCdAb-bD8/s320/gql2.png" width="320" /></a></div>
<div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1305in;"><div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-size: 8pt; font-weight: bold;">Operation pattern</span><br />
<div style="color: #c48ec4; font-family: Consolas; font-size: 8.0pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8.0pt; margin: 0in;">
GraphQL provides
three operation types: </div>
<span style="font-size: 8pt; font-weight: bold;">
<!--StartFragment-->
<!--EndFragment--></span><br />
<ol style="direction: ltr; font-family: Consolas; font-size: 8.0pt; font-style: normal; font-weight: normal; margin-bottom: 0in; margin-left: .375in; margin-top: 0in; unicode-bidi: embed;" type="a">
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;" value="1"><span style="font-family: "consolas"; font-size: 8.0pt; font-style: normal; font-weight: normal;">Query</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: "consolas"; font-size: 8.0pt;">Mutation</span></li>
<li style="margin-bottom: 0; margin-top: 0; vertical-align: middle;"><span style="font-family: "consolas"; font-size: 8.0pt;">Subscription</span></li>
</ol>
</div>
<div style="color: #c48ec4; font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="color: #c48ec4;">op_type</span><span style="color: #979797;"><span class="Apple-converted-space"> </span>Op_Name<span class="Apple-converted-space"> </span></span><span style="color: #e29200;">(op_params)<span class="Apple-converted-space"> </span></span><span style="color: green;">{</span><br />
<span style="color: green;"> <span class="Apple-converted-space"> </span>field_1</span><br />
<span style="color: green;"> <span class="Apple-converted-space"> </span>field_2<span class="Apple-converted-space"> </span></span><span style="color: teal;">(field_params) {</span><br />
<span style="color: teal;"> <span class="Apple-converted-space"> </span>subfields</span><br />
<span style="color: teal;"> <span class="Apple-converted-space"> </span>}</span><br />
<span style="color: green;">}</span></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiKlpNda742avrUr_dLa36GEEkEEI4u5s286A2n5ZVRPqW5P1FW9kI7U7cGpU3vrG5_2SWbBg35P4q0AkLr8CkdtdQiL1U9sOeV0JlScWiAmiVvqDvk2oJEfRlFG-PT91GZGqdQp-12eM/s1600/gql3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="799" data-original-width="1600" height="159" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiKlpNda742avrUr_dLa36GEEkEEI4u5s286A2n5ZVRPqW5P1FW9kI7U7cGpU3vrG5_2SWbBg35P4q0AkLr8CkdtdQiL1U9sOeV0JlScWiAmiVvqDvk2oJEfRlFG-PT91GZGqdQp-12eM/s320/gql3.png" width="320" /></a></div>
<div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1111in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Query</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query UserQuery ($id: Int!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>user (id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>name</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>email</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5CsfrfeYmMjsgtFnoKQBKb986UR0v4whjWzID_ciEi_SwcidWpPhX50ZdNyKH0DiO1bw5GT6PHaABWKfVwFZKKiVimuVAF58rTgWHD6jJkFjw86VyZZQBEjQslvYZDFkB5IS0pPCcg8g/s1600/gql4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="795" data-original-width="1600" height="159" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5CsfrfeYmMjsgtFnoKQBKb986UR0v4whjWzID_ciEi_SwcidWpPhX50ZdNyKH0DiO1bw5GT6PHaABWKfVwFZKKiVimuVAF58rTgWHD6jJkFjw86VyZZQBEjQslvYZDFkB5IS0pPCcg8g/s320/gql4.png" width="320" /></a></div>
<div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1305in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Mutation</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
mutation UserNameMutation ($id: Int!, $name: String!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>editUserName (id: $id, name: $name) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>name</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiwTq-P09_aKs9ejANcWoKmyWvFB1LVxT6WTDrfgLM5KmtWGGOmA68tVHbedTsVtNyx2CMBDXZmgdVW2rYXHHRATIcbsRElgfJKZlM_VXapt4GedcFDr2MI5-R2Rse2xuXJaactdI9fEE/s1600/gql5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="793" data-original-width="1600" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiwTq-P09_aKs9ejANcWoKmyWvFB1LVxT6WTDrfgLM5KmtWGGOmA68tVHbedTsVtNyx2CMBDXZmgdVW2rYXHHRATIcbsRElgfJKZlM_VXapt4GedcFDr2MI5-R2Rse2xuXJaactdI9fEE/s320/gql5.png" width="320" /></a></div>
<div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1111in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Subscription</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
subscription UserQuery ($id: Int!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>user (id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>name</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>email</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEist4cJK6tEu9IT2bGvmeTXLtGMkNObVzq1VJXB9-LlqpFBjH0nyxi9FffmjpD0UVwPbTqOIoywmNB7ZHWetxmb2oTOHJmBTeIpPDiDShch63FAgnxiN8H2uZKshI6hqP0kt4IlVCc9tp8/s1600/gql6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="GraphQL shopping list example" border="0" data-original-height="791" data-original-width="1600" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEist4cJK6tEu9IT2bGvmeTXLtGMkNObVzq1VJXB9-LlqpFBjH0nyxi9FffmjpD0UVwPbTqOIoywmNB7ZHWetxmb2oTOHJmBTeIpPDiDShch63FAgnxiN8H2uZKshI6hqP0kt4IlVCc9tp8/s320/gql6.png" title="GraphQL shopping list example" width="320" /></a></div>
<div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1111in;"><div style="margin: 0in;">
</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvOL9p6OBIKlTKhNDxztS1uAGLZXzA1H07LfvczcqRKDk7fVuDAqYF0hs653xT8szxfIZmKThHc_M_Amsc4Fpf7LIarNAvU3A_afX_HPGIlI3Qdob0rwFkXT7Udeiew1iwBKTYdTsRD1Y/s1600/gql7.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="GraphQL shopping list example - mockup" border="0" data-original-height="805" data-original-width="1600" height="161" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvOL9p6OBIKlTKhNDxztS1uAGLZXzA1H07LfvczcqRKDk7fVuDAqYF0hs653xT8szxfIZmKThHc_M_Amsc4Fpf7LIarNAvU3A_afX_HPGIlI3Qdob0rwFkXT7Udeiew1iwBKTYdTsRD1Y/s320/gql7.png" title="GraphQL shopping list example - reactive layout" width="320" /></a></div>
</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h1 style="color: #1e4e79; font-family: Calibri; font-size: 16pt; margin: 0in;">
Shopping cart example</h1>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-size: 11pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Shopping Cart Query</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query ShoppingCartList {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>products {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>title</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>price</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Details query</h2>
<h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query ProductInfo($id: Int!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>product(id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>title</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>price</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;"> <span class="Apple-converted-space"> </span>isAvailable</span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;"> <span class="Apple-converted-space"> </span>discountValue</span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span><span style="font-weight: bold;">description</span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in 0in 0in 0.375in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
We can now utilise reusable fragments</h2>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Define</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
fragment preview on Product {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>title</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>price</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Define</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
fragment details on Product {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>isAvailable</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>discountValue</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>description</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Use</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query ShoppingCartList {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>products {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>...preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Use</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query ProductInfo($id: Int!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>product(id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>...preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>...details</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1111in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
But this query is still going to fetch its data from the network.</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
Fragments can be cached separately,</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
so they can save transfer bandwidth.</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;">1. readQuery</span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
const { product } = client.readQuery({</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>query: gql`</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>query Product($id: Int!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>product(id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>title</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>`,</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>variables: { id: 5 }</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
});</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;">2. readFragment</span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
const product = client.readFragment({</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id: 'Product:5', // cache ID, assuming the default dataIdFromObject</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>fragment: gql`</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>fragment productFragment on Product {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>title</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>`,</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
});</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
Memory cache</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2083in;"><div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
Reads only from the local cache, and fails if any part of the requested data is missing.</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="color: #979797; font-weight: bold;">3. &lt;</span><span style="color: firebrick; font-weight: bold;">Query</span><span style="color: #979797; font-weight: bold;"> </span><span style="color: #e29200; font-weight: bold;">fetchPolicy</span><span style="color: #979797; font-weight: bold;">=</span><span style="color: brown; font-weight: bold;">"cache-only"</span><span style="color: #979797; font-weight: bold;"> /&gt;</span></div>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
Reads only from the local cache, and fails if any part of the query's data is missing.</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&lt;Query<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>query={cacheQuery}<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>variables={{ id }}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>fetchPolicy="cache-only"</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&gt;</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>{({ data = {} }) => &lt;SomeComponent /&gt;}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&lt;/Query&gt;</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
For example, the Shopping List screen:</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&lt;Query query={shoppingCartList}&gt;</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>{({ loading, data }) => <span class="Apple-converted-space"> </span>(</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>&lt;View&gt;</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>{loading &amp;&amp; &lt;Loading /&gt;}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>{!loading &amp;&amp; &lt;Products data={data.products} /&gt;}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>&lt;/View&gt;</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>)}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&lt;/Query&gt;</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
But we want to get some data from the local cache</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
and some from the server</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_jigNOqY_KpS77-TNUYvFc97Wpx9VV495N4zUhXoBgME5vRwu-TsCs3Obg5cn8MmLE5zenAN2GcMw0QoJm8BgeeUfmJJkCsfh0a8PL-Wvi4RI1bplbr7yQvYgJeyvCOVpp1n1vKOp6HA/s1600/gql8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="794" data-original-width="1600" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_jigNOqY_KpS77-TNUYvFc97Wpx9VV495N4zUhXoBgME5vRwu-TsCs3Obg5cn8MmLE5zenAN2GcMw0QoJm8BgeeUfmJJkCsfh0a8PL-Wvi4RI1bplbr7yQvYgJeyvCOVpp1n1vKOp6HA/s320/gql8.png" width="320" /></a></div>
<div style="margin: 0in;">
</div>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrbm-FhHsyUwktyUXZNuZh3YL8a-0XisjSTVrRYPxu0hy4ChuRCu-bQcoNhoYULltFJvUiWxs7etLbSuNOciDUuIbO4TZqT1_X5VMBpoCN6YQAwcrmHbHVyKYIiyXEsoqetIWWTi7EpEo/s1600/gql9.png" imageanchor="1" style="font-family: verdana; font-size: medium; margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="790" data-original-width="1600" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrbm-FhHsyUwktyUXZNuZh3YL8a-0XisjSTVrRYPxu0hy4ChuRCu-bQcoNhoYULltFJvUiWxs7etLbSuNOciDUuIbO4TZqT1_X5VMBpoCN6YQAwcrmHbHVyKYIiyXEsoqetIWWTi7EpEo/s320/gql9.png" width="320" /></a></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
1. Cached data query</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query ProductInfoCache($id: Int!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>product(id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>...preview</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
And display via</h2>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&lt;Query<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>query={productInfoCache}<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>variables={{ id }}<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>fetchPolicy="cache-only"</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&gt;</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>{({ data = {} }) => // ...</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
&lt;/Query&gt;</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="font-size: 11pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="font-size: 8pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
2. Missing (details) data query</h2>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
query ProductInfo($id: Int!, $full: Boolean!) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>product(id: $id) {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>id</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>...preview @include(if: $full)</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>...details</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
}</div>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><h2 style="color: #2e75b5; font-family: Consolas; font-size: 8pt; margin: 0in;">
Or Declaratively</h2>
<div style="font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-size: 8pt; margin: 0in;">
&lt;Query<span class="Apple-converted-space"> </span></div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>query={productInfo}</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>variables={{ id, full: !data.product }}</div>
<div style="font-size: 8pt; margin: 0in;">
&gt;</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>{({ loading, data }) => // ...</div>
<div style="font-size: 8pt; margin: 0in;">
&lt;/Query&gt;</div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2812in;"><div style="color: #979797; font-family: "Open Sans"; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;">Add cache redirect</span></div>
<div style="font-size: 8pt; margin: 0in;">
const cache = new InMemoryCache({</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>cacheRedirects: {</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>Query: {</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>product: (_, { id }) =></div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>toIdValue(cache.config.dataIdFromObject({<span class="Apple-converted-space"> </span></div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>__typename: "Product", id</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>})</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>)</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>}</div>
<div style="font-size: 8pt; margin: 0in;">
});</div>
<div style="color: #979797; font-family: "Open Sans"; font-size: 8pt; margin: 0in;">
<br /></div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.1861in;"><div style="color: #979797; font-family: "Open Sans"; font-size: 10pt; margin: 0in;">
<span style="font-weight: bold;">TRY IT OUT</span></div>
<div style="color: #979797; font-family: "Open Sans"; font-size: 10pt; margin: 0in;">
<br /></div>
<div style="color: #979797; font-family: "Open Sans"; font-size: 10pt; margin: 0in;">
<a href="https://codesandbox.io/s/mmpm32j7ly"><span style="font-weight: bold;">https://codesandbox.io/s/mmpm32j7ly</span></a></div>
</td></tr>
<tr><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTHSHGJsKVCHRV82FnltZ4GYiAxkT7ksxP-sfxfnrq9gqO6Cc6SzXqAQ5FU2J4FxeylpcXj4IUku2hxRaqKGZzPFc3CCti6XO0ooeUcOzt_oDu14Nq_J48L2-QzDgKb2BWa5fQFE5T6Rc/s1600/gql10.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="799" data-original-width="1600" height="159" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTHSHGJsKVCHRV82FnltZ4GYiAxkT7ksxP-sfxfnrq9gqO6Cc6SzXqAQ5FU2J4FxeylpcXj4IUku2hxRaqKGZzPFc3CCti6XO0ooeUcOzt_oDu14Nq_J48L2-QzDgKb2BWa5fQFE5T6Rc/s320/gql10.png" width="320" /></a><br />
<div style="margin: 0in;">
</div>
</td><td style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.2347in;"><div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
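Optimistic response: on every mutation the update function is called. With an optimisticResponse it runs twice, first with the optimistic result and again with the real server response.</div>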
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;">Optimistic UIs don’t wait for an operation to finish to update to the final state.</span></div>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="color: #979797; font-family: Consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;">They immediately switch to the final state, showing fake data for the time while the real operation is still in-progress.</span></div>
<div style="font-size: 11pt; margin: 0in;">
<br /></div>
</td></tr>
<tr><td colspan="2" style="background-color: #fadbd2; border: 1pt solid rgb(163, 163, 163); padding: 2pt 3pt; vertical-align: top; width: 3.302in;"><div style="color: #979797; font-family: consolas; font-size: 8pt; margin: 0in;">
<span style="font-weight: bold;">Remove item from the cached list</span></div>
<div style="color: #979797; font-family: consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
mutate({</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>variables: { id },</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>optimisticResponse: {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>__typename: 'Mutation',</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>removeCartItem: {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>__typename: 'Product',</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>},</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>},</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>update: (proxy) => {</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>const data = proxy.readQuery({ query: ShoppingCartList });</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>const filteredProducts = data.products.filter(product => product.id !== id);</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<br /></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>proxy.writeQuery({<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>query: ShoppingCartList,<span class="Apple-converted-space"> </span></div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>data: {...data, products: filteredProducts},</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>});</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
<span class="Apple-converted-space"> </span>},</div>
<div style="font-family: Consolas; font-size: 8pt; margin: 0in;">
});</div>
</td></tr>
</tbody></table>
</div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in 0in 0in 2.625in;">
<br /></div>
<div style="color: #979797; font-family: consolas; font-size: 8pt; margin: 0in;">
<span style="font-family: calibri; font-size: 11pt;">P.S. graphics are by shadow.x.q84@gmail.com</span></div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
<br /></div>
<div style="font-family: verdana; font-size: 14.6667px; margin: 0in;">
</div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
<span style="font-size: 11pt;">Summing up: Alexey Kureev is an amazing storyteller, able to take this complex technical topic</span><span class="Apple-converted-space" style="font-size: 11pt;"> </span></div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
and make it both interesting and exciting. I definitely will be looking into using GraphQL and</div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
Apollo in future projects.</div>
<div style="font-family: Calibri; font-size: 11pt; margin: 0in;">
<br /></div>
<h2 style="color: #2e75b5; font-family: calibri; font-size: 14pt; margin: 0in;">
See also:</h2>
<ul style="direction: ltr; font-family: verdana; font-size: 14.6667px; margin-bottom: 0in; margin-left: 0.375in; margin-top: 0in; unicode-bidi: embed;" type="disc">
<li><span style="font-family: "verdana"; font-size: 14.6667px;">video: </span><a href="https://youtu.be/REwGyGc7nMg?t=153" style="font-family: verdana; font-size: 14.6667px;"><span style="font-family: "verdana"; font-size: 9.0pt;">Apollo Client: the stuff no-one ever told ya @ React
London February 2018</span></a></li>
<li><span style="font-family: "verdana"; font-size: 14.6667px;">slide deck: </span><a href="https://slides.com/alexeykureev/apollo-stuff-no-one-told-ya" style="font-family: verdana; font-size: 14.6667px;"><span style="font-family: "verdana"; font-size: 9.0pt;">https://slides.com/alexeykureev/apollo-stuff-no-one-told-ya</span></a></li>
<li><a href="https://blog.apollographql.com/query-components-with-apollo-ec603188c157" style="font-family: verdana; font-size: 14.6667px;"><span style="font-family: "verdana"; font-size: 9.0pt;">Query Components with Apollo</span></a></li>
<li><a href="https://blog.apollographql.com/graphql-schema-design-building-evolvable-schemas-1501f3c59ed5" style="font-family: verdana; font-size: 14.6667px;"><span style="font-family: "verdana"; font-size: 9.0pt;">GraphQL Schema Design:
Building Evolvable Schemas</span></a></li>
<li><a href="http://engineering.khanacademy.org/posts/creating-query-components-with-apollo.htm" style="font-family: verdana; font-size: 14.6667px;"><span style="font-family: "verdana"; font-size: 9.0pt;">Creating Query Components with
Apollo</span></a></li>
<li><a href="https://blog.apollographql.com/batching-client-graphql-queries-a685f5bcd41b" style="font-family: verdana; font-size: 14.6667px;"><span style="font-family: "verdana"; font-size: 9.0pt;">Batching Client GraphQL
Queries</span></a></li>
<li><span style="font-family: "verdana"; font-size: 9pt;"><a href="https://blog.apollographql.com/deploy-a-fullstack-apollo-app-with-netlify-45a7dfd51b0b" style="font-family: verdana; font-size: 14.6667px;">Deploy a fullstack Apollo app
with Netlify</a></span></li>
</ul>
</div>
</div>
</div>
<div style="caret-color: rgb(0, 0, 0); font-family: Verdana; font-size: 14.6667px; text-size-adjust: auto;">
<div style="margin: 0in;">
<br /></div>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0Yigal Alon St 98, Tel Aviv-Yafo, Israel32.0700635 34.7939280000000515.1748895000000026 -6.514662999999949 58.9652375 76.102519000000058tag:blogger.com,1999:blog-7341613201559046168.post-42968681489775038212018-01-17T20:00:00.000+02:002020-02-24T09:41:43.813+02:00Streaming events to BigQuery<br />
<br />
<br />
Here are my notes from the "Streaming events with Kafka to BigQuery and Logging" meeting of the <i>Big Things</i> meetup, which took place at the Poalim Tech offices. The workspace is quite amazing, and many people were still working as late as 10 PM.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJaPFbMcZhXdCynvQTXqegB74HwHaGn02EcUARU1OcZ9hAztGgmKRIo_RovdtthKP7TjKVPOwvKlvL3-sX3-ZhHzuYsr04ADc3qoqrFuG80bhrnaViSTf-BPwui3baMMhybgO_mOw1OUg/s1600/20180117_195624.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="923" data-original-width="1600" height="368" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJaPFbMcZhXdCynvQTXqegB74HwHaGn02EcUARU1OcZ9hAztGgmKRIo_RovdtthKP7TjKVPOwvKlvL3-sX3-ZhHzuYsr04ADc3qoqrFuG80bhrnaViSTf-BPwui3baMMhybgO_mOw1OUg/s640/20180117_195624.jpg" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Poalim Tech is a great host for medium-sized meetups. On the left, the obligatory pizza overdose.</td></tr>
</tbody></table>
<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><img border="0" data-original-height="151" data-original-width="151" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWeRpNeW8LiQHSsMqxpGoTEUrp_2Bofih0aWBH4wHExiUiCgKeXy6KljtN86Iory9nr_kZ8QuNSnKrhK28MAZiISKcx-RO73Rl8VL_rU83nLQ51nepW5S5bCWDdT0GUbWwLa_O2plCJk0/s200/bigquery.png" style="margin-left: auto; margin-right: auto;" width="200" /></td></tr>
<tr><td class="tr-caption" style="text-align: center;">BigQuery, a serverless analytics <br />
warehouse, is the destination for the data.</td></tr>
</tbody></table>
<h3>
Google BigQuery in brief</h3>
BigQuery is Google's serverless analytics data warehouse. It is built on Colossus, Google's distributed file system, and provides as yet unmatched scaling capabilities. Usage costs are typically 5 USD per TB processed.<br />
<br />
<h4>
Pros are: </h4>
<br />
<ul>
<li>a serverless data warehouse solution.</li>
<li>a powerful command line interface.</li>
<li>an SQL-based interface with NoSQL performance.</li>
<li>good code examples.</li>
</ul>
<br />
<h4>
Cons are: </h4>
<br />
<ul>
<li>queries can eat up many thousands of USD of compute time.</li>
<li>a clunky web interface.</li>
</ul>
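To get a feel for how quickly the per-TB rate adds up, here is a minimal sketch of the arithmetic. The 5 USD rate is the one quoted above and may be out of date; the function name and the choice of 2^40 bytes per TB are my own assumptions, not part of any BigQuery API.<br />

```javascript
// Rate quoted in this post; an assumption, check current pricing before relying on it.
const USD_PER_TB = 5;
// Assumption: a "TB" is counted as 2^40 bytes.
const BYTES_PER_TB = 2 ** 40;

// Hypothetical helper: estimate the on-demand cost of a query
// from the number of bytes it scans.
function estimateQueryCostUsd(bytesProcessed, usdPerTb = USD_PER_TB) {
  if (bytesProcessed < 0) {
    throw new RangeError('bytesProcessed must be non-negative');
  }
  return (bytesProcessed / BYTES_PER_TB) * usdPerTb;
}

// A job that scans 200 TB costs about 1000 USD per run at this rate.
console.log(estimateQueryCostUsd(200 * BYTES_PER_TB));
```

Run daily, a query like that reaches the "many thousands of USD" range within a week, which is why it pays to check how many bytes a query will scan (BigQuery can report this in a dry run) before scheduling it.<br />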
<br />
<h3>
Apache Kafka in brief</h3>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhf7M3oYE6djRo4xAtZFVwaIw1qKHa6kza33EFHc1kRAYP5Sd00H__tqyHUu2UBu4wYA9onWicaO7Ouwj2lXnq2k81ZN4_tcu-01VAYaH6reTllWHrs9zA3tShnn5waPIfj6EtwSTc72yQ/s1600/apache-kafka.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="291" data-original-width="975" height="95" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhf7M3oYE6djRo4xAtZFVwaIw1qKHa6kza33EFHc1kRAYP5Sd00H__tqyHUu2UBu4wYA9onWicaO7Ouwj2lXnq2k81ZN4_tcu-01VAYaH6reTllWHrs9zA3tShnn5waPIfj6EtwSTc72yQ/s320/apache-kafka.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Kafka is used to build <span style="font-size: 12.8px;">streaming data pipelines</span></td></tr>
</tbody></table>
Apache Kafka is a highly performant, free and open source message broker which allows asynchronous communication between producers and consumers of messages (messages in this case are web service based function calls). The transition to microservice architectures, as well as speed and scaling concerns, has made Kafka a key component in the modern enterprise's real-time and streaming pipelines.<br />
<h2>
Streaming events</h2>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrDOTk9XsQ8yvyOF0e5uZB7d0_pgpqUfmqzM_AWsP31ZecE0I62EzWoLGVFhujmbdfe7jJY-b0oT8M9ygMleMF7P092Dspoj2gyOjFyGurW-7_xOcUZcS0tCFKAy8tvEkMFK2VpGTWu5M/s1600/MH_logo_RGB.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="448" data-original-width="1600" height="89" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrDOTk9XsQ8yvyOF0e5uZB7d0_pgpqUfmqzM_AWsP31ZecE0I62EzWoLGVFhujmbdfe7jJY-b0oT8M9ygMleMF7P092Dspoj2gyOjFyGurW-7_xOcUZcS0tCFKAy8tvEkMFK2VpGTWu5M/s320/MH_logo_RGB.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">MyHeritage - stream change data from<br />
Kafka into Google BigQuery</td></tr>
</tbody></table>
The first talk, titled "From Kafka to BigQuery - A Guide for Streaming Billions of Daily events", was given by <a href="https://www.linkedin.com/in/ofir-sharony-680b4761/" rel="nofollow" target="_blank">Ofir Sharony</a> (a back-end tech lead at MyHeritage). It covered how MyHeritage uses Kafka to get its data into BigQuery. At a previous talk, MyHeritage engineers covered how they converted their monolith into microservices. Mr. Sharony points out that two types of data are placed into BigQuery. The first is database data (the family trees created by the clients). The second is called "Change Data": microservice event logs tied together in the context of the web analytics of a client session. This is becoming the de facto way to debug microservice architectures.<br />
<br />
Mr. Sharony outlined four iterations of the Kafka to BigQuery integration, each with a progressively simpler architecture. They are as follows:<br />
<br />
<br />
<br />
<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhl0Gz78RpY1jQBZDD7Y63PAnogt__Fx4MC4kWFZIrshyjVmOzZTvTqXGQlm_8mu-g3iy1NAkgOCxotXX6L1WcTqaEKGtrU6Af5EOvO4RGkdN1lOgJwjAOEx_aUeeQoYMxHKxU4HqaRVfk/s1600/0_Rmk9027XxTfymgBe.png"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhl0Gz78RpY1jQBZDD7Y63PAnogt__Fx4MC4kWFZIrshyjVmOzZTvTqXGQlm_8mu-g3iy1NAkgOCxotXX6L1WcTqaEKGtrU6Af5EOvO4RGkdN1lOgJwjAOEx_aUeeQoYMxHKxU4HqaRVfk/s320/0_Rmk9027XxTfymgBe.png" /></a><br />
<br />
<br />
<h4>
Take 1: Batching data to GCS </h4>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEZlSv1yVJ3FAg2OPn8392XJfAIjXeJSAMKWu2Y8zgIejthhkUW-Y5vbSzurgSOzOxamqloaqAqUZa8b6K4KI9ExSeGnfxf37Ip6JIYPjbqGK8K0yNGuhPcE2KSbt6Tg49XzsalDp9tIM/s1600/0_9-7ghOVhGDwcBTDK.jpeg"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEZlSv1yVJ3FAg2OPn8392XJfAIjXeJSAMKWu2Y8zgIejthhkUW-Y5vbSzurgSOzOxamqloaqAqUZa8b6K4KI9ExSeGnfxf37Ip6JIYPjbqGK8K0yNGuhPcE2KSbt6Tg49XzsalDp9tIM/s320/0_9-7ghOVhGDwcBTDK.jpeg" /></a><br />
<div>
<br /></div>
<div>
This iteration was based on Secor and Google Cloud Storage.<br />
<h4>
Take 2: Streaming with BigQuery API</h4>
</div>
<div>
<br /></div>
<div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfRWIYbae2SyfqrwH3F5iB2kdcDj89gdScNrUSBwjPRqTMT3dzNKxm7_AQGyUcBhLFsvwy0irseMNle9hRk6c5wcBTHEuhHwfiEtHGepAR3xBVKLYB2C6BdSFGKESg2QccFWUPVMIO9ZQ/s1600/0_g_qC-Xq2upjQcG_p.png" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfRWIYbae2SyfqrwH3F5iB2kdcDj89gdScNrUSBwjPRqTMT3dzNKxm7_AQGyUcBhLFsvwy0irseMNle9hRk6c5wcBTHEuhHwfiEtHGepAR3xBVKLYB2C6BdSFGKESg2QccFWUPVMIO9ZQ/s320/0_g_qC-Xq2upjQcG_p.png" /></a><br />
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
This iteration used the BigQuery API. It was dropped since it required extensive error and exception handling, which could be avoided by using a Kafka connector.
<h4>
Take 3: Streaming with Kafka Connect</h4>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJotN2VJK4J-qmiazQlym_Ox00V07_p_8prIuJsN7mvc3rjvPzOp2stevbi7KVps-vgF5dOvj1AwY3qYSoSTMoUWKM79-sZyUnmCFbibVeGoNgyZCPMF0EZcYKIilcLLKoPwnJv2mBtlM/s1600/1_3wS-g4G02OfCqX31bie3Jg.png"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJotN2VJK4J-qmiazQlym_Ox00V07_p_8prIuJsN7mvc3rjvPzOp2stevbi7KVps-vgF5dOvj1AwY3qYSoSTMoUWKM79-sZyUnmCFbibVeGoNgyZCPMF0EZcYKIilcLLKoPwnJv2mBtlM/s320/1_3wS-g4G02OfCqX31bie3Jg.png" /></a><br />
<span style="font-weight: normal;">He used an open-source connector implemented by WePay, but there were a number of issues.</span><span style="font-weight: 400;"> The BigQuery connector could only ingest data by its processing time, which led to some data landing in the wrong partitions, and it could not </span>split events from a single stream into specific BigQuery tables, so they tried another solution:<br />
<h4>
Take 4: Streaming with Apache Beam</h4>
<br />
<div class="separator" style="clear: both;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8147LL9294QJ5lgCdj-mHuBRoqz7UAWzHIfv5WdtZ8q7UvCLzZI2LkhDQ7Bt-CBuz3DNof8rYZiySVljqC7kN3lCup9epMWiRnCGNqptrffatoGABYoYTFxBthlPqFCpSi3iykpiI8xw/s1600/0_bPtWjAMpc0Ox4Uil.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="138" data-original-width="800" height="55" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8147LL9294QJ5lgCdj-mHuBRoqz7UAWzHIfv5WdtZ8q7UvCLzZI2LkhDQ7Bt-CBuz3DNof8rYZiySVljqC7kN3lCup9epMWiRnCGNqptrffatoGABYoYTFxBthlPqFCpSi3iykpiI8xw/s320/0_bPtWjAMpc0Ox4Uil.png" width="320" /></a></div>
<br />
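The two things the connector in Take 3 could not do, partitioning by event time and splitting one stream across several tables, come down to choosing a destination per event. The following is a minimal sketch of that idea in plain JavaScript. It is not the Beam pipeline from the talk. The event shape, the dataset name and the function name are hypothetical, and the only BigQuery convention it relies on is the <code>table$YYYYMMDD</code> partition decorator.<br />

```javascript
// Hypothetical event shape: { type: 'page.view', timestamp: '2018-01-17T23:59:30Z', ... }
// Returns "dataset.table$YYYYMMDD", choosing the table from the event type
// and the daily partition from the event's own timestamp.
function destinationFor(event, dataset = 'events') {
  const eventTime = new Date(event.timestamp); // event time, not the time we process it
  if (Number.isNaN(eventTime.getTime())) {
    throw new Error(`invalid timestamp: ${event.timestamp}`);
  }
  // toISOString() is always UTC, so the partition day does not depend on the host time zone.
  const day = eventTime.toISOString().slice(0, 10).replace(/-/g, '');
  // Table names may only contain letters, digits and underscores.
  const table = event.type.replace(/[^A-Za-z0-9_]/g, '_');
  return `${dataset}.${table}$${day}`;
}

// An event created just before midnight still lands in its own day's partition,
// even if it is processed the next day.
console.log(destinationFor({ type: 'page.view', timestamp: '2018-01-17T23:59:30Z' }));
// -> events.page_view$20180117
```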
<br />
<br />
Talk video (Hebrew):<br />
<div>
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/MoFqATG5AJA/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/MoFqATG5AJA?feature=player_embedded" width="320"></iframe></div>
<br />
<br />
<h2>
Logging</h2>
<div>
The Data Science team at Bank Hapoalim tech has already mostly migrated from SAS to R and is also migrating to Python. But the team has a number of unusual requirements and challenges.<br />
First, they usually do not have access to production-level data. Any data they can use must be sanitized of any sensitive PII (Personally Identifiable Information).</div>
<div>
<br /></div>
<div>
</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgIRyDboxh9yfJ2VsysKif4ufM8JCvHCkQtpXN2FBGlxrRO-pnXckumxVjwzqgGyESTuc6FZMw8zDGJ2M3lY2UFM-4XeAbIjEJriG5I5MfqHura8MapTeKwrivwRVRcetp3_KvLcWDZ0A/s1600/hortonworks_logo.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="143" data-original-width="353" height="129" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgIRyDboxh9yfJ2VsysKif4ufM8JCvHCkQtpXN2FBGlxrRO-pnXckumxVjwzqgGyESTuc6FZMw8zDGJ2M3lY2UFM-4XeAbIjEJriG5I5MfqHura8MapTeKwrivwRVRcetp3_KvLcWDZ0A/s320/hortonworks_logo.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Bank Hapoalim data scientists use a Hortonworks <br />
suite of 30+ open source tools </td></tr>
</tbody></table>
Secondly, they use a suite of 30 open source software tools from <a href="https://hortonworks.com/" rel="nofollow" target="_blank">Hortonworks</a>. But the software in use is based on a support contract from a vendor who changes which software it supports every year. This means that some software choices are unpopular forks of mainstream projects, a decision made by the vendor rather than by the data science team. Also, looking at some of the choices, there is a significant effort to lock clients into these choices and away from competing stacks... Also, since Hortonworks just provides support, many of the tools in the stack are badly co-integrated (multiple SOLR instances with irregular levels of support for orchestration). Of course, to fully integrate the tools, Hortonworks would need to employ FOSS developers of sufficient standing in each project to enable the architectural changes these integrations require.</div>
<br />
<div>
Thirdly, their logs are stored for different time frames, and in some cases these durations are regulated:</div>
<div>
<ul>
<li>Application logs are stored for some weeks so that analysts can review data flows (unregulated).</li>
<li>Analytics logs are also stored for a few weeks (time, quality, descriptive statistics, confusion matrices, etc.).</li>
<li>Audit logs of decisions and why they were made are regulated and need to be stored for 7 years.</li>
</ul>
</div>
<div>
I find this vaguely amusing considering the Kafkaesque nature of almost all decisions made at the bank: undisclosed, non-transparent, and with even the decision makers impossible to discover.</div>
<br />
Talk video (Hebrew):<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Nt0KJEOizPY/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/Nt0KJEOizPY?feature=player_embedded" width="320"></iframe></div>
<br />
<br />
<h3>
References:</h3>
<ul>
<li>It looks like the material in the first talk is based on <a href="https://medium.com/myheritage-engineering/kafka-to-bigquery-load-a-guide-for-streaming-billions-of-daily-events-cbbf31f4b737" rel="nofollow" target="_blank">this</a> blog post. However, the talk has been expanded and updated since the post.</li>
<li>A slide deck for the first talk is <a href="https://www.slideshare.net/OfirSharony/from-kafka-to-bigquery-strata-singapore-83467341" rel="nofollow" target="_blank">here</a>.</li>
</ul>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0PoalimTech Offices32.061726 34.7794771000000085.1674420000000012 -6.5291168999999911 58.95601 76.088071100000008tag:blogger.com,1999:blog-7341613201559046168.post-78618690104821225862018-01-15T16:49:00.004+02:002020-02-24T09:46:04.137+02:00How to search your youtube history from the command line in ?<br />
I eventually found that the personal YouTube history search URL is:<br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">https://myactivity.google.com/myactivity?q=query&restrict=ytw</span><br />
<br />
A nice UI, no doubt, but I need to get at it from the command line...<br />
<br />
<pre></pre>
<pre>To avoid breaking the URI, we must ensure the query text is URL-encoded:</pre>
<pre></pre>
<pre></pre>
<pre>urlencode() {
  # usage: urlencode <string>
  # Percent-encode every character outside the unreserved set (ASCII input assumed).
  local length="${#1}" i c
  for (( i = 0; i < length; i++ )); do
    c="${1:i:1}"
    case $c in
      [a-zA-Z0-9.~_-]) printf '%s' "$c" ;;
      *) printf '%%%02X' "'$c" ;;
    esac
  done
}</pre>
<pre></pre>
<pre></pre>
<pre>With that in place:</pre>
<pre><span style="color: lime;">#youtube history command</span></pre>
<pre><span style="font-family: "courier new" , "courier" , monospace;">yth(){
  # join all arguments into one query string, URL-encode it, and open the search page
  google-chrome "https://myactivity.google.com/myactivity?q=$(urlencode "$*")&restrict=ytw"
}</span>
</pre>
<div>
<br /></div>
<div>
P.S. All of these go in a dotfile, say at <span style="font-family: "courier new" , "courier" , monospace;">~/Dotfiles/.functions</span>, which is then sourced via:</div>
<div>
<br /></div>
<div>
<pre><span style="font-family: "courier new" , "courier" , monospace;">$ source ~/Dotfiles/.functions</span></pre>
</div>
<div>
<br /></div>
<div>
<pre>So to look up the legendary session "<a href="https://www.youtube.com/watch?v=GsLZz8cZCzc" rel="nofollow" target="_blank">Willy Wonka of Containers - Jessie Frazelle</a>"
</pre>
<pre></pre>
<pre>I need only type:</pre>
<pre></pre>
<pre><span style="font-family: "courier new" , "courier" , monospace;">$ yth Willy Wonka</span></pre>
<pre></pre>
<pre>And I instantly achieve container nirvana at #ContainerCamp.</pre>
<pre></pre>
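For anyone who wants the same lookup from a script rather than a shell function, here is a minimal sketch in JavaScript. The function name is hypothetical. The URL and the restrict=ytw parameter are the ones shown earlier in this post, and the encoding is done by the built-in encodeURIComponent instead of a hand-written loop.

```javascript
// Hypothetical helper: build the YouTube history search URL from the search words.
function youtubeHistoryUrl(...words) {
  const query = words.join(' ').trim();
  if (query === '') {
    throw new Error('expected at least one search word');
  }
  // encodeURIComponent percent-encodes spaces, '&', '?' and non-ASCII characters.
  return 'https://myactivity.google.com/myactivity?q=' +
    encodeURIComponent(query) + '&restrict=ytw';
}

console.log(youtubeHistoryUrl('Willy', 'Wonka'));
// -> https://myactivity.google.com/myactivity?q=Willy%20Wonka&restrict=ytw
```

Unlike the shell loop, this also handles multi-byte characters, since encodeURIComponent encodes the UTF-8 bytes of each character.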
<h2>
References:</h2>
</div>
<div>
<ol>
<li><a href="https://www.quora.com/How-do-I-search-my-YouTube-history-with-a-keyword-search-instead-of-just-plowing-through-them-page-by-page" rel="nofollow" target="_blank">Quora on: How do I search my YouTube history with a keyword search instead of just plowing through them page by page?</a></li>
<li><a href="https://www.youtube.com/watch?v=GsLZz8cZCzc" rel="nofollow" target="_blank">Willy Wonka of Containers by Jessie Frazelle</a></li>
<li><a href="https://dotfiles.github.io/" rel="nofollow" target="_blank">github does dotfiles</a></li>
</ol>
</div>
<div>
<br /></div>
<pre></pre>
<pre></pre>
<pre></pre>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-16496137624851242912017-12-30T15:23:00.003+02:002017-12-30T15:23:35.583+02:00Automating web app development with Polymer and Yeoman<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://i.imgur.com/dsFChIk.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://i.imgur.com/dsFChIk.png" height="111" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Yeoman lets you configure and<br />
stamp out sophisticated boilerplate<br />
projects from the command line.</td></tr>
</tbody></table>
In a digital marketing agency, the data science team may be asked to provide, for each client's campaign, fairly similar media reports and explanatory analytics dashboards for both external and internal clients. For longer-term projects we may also be asked to provide predictive analytics. The data comes from advertising platforms (Google, Facebook, Taboola, Outbrain) and from phone tracking metrics APIs. Outcomes are channelled via Segment into, say, Google Analytics, which has both an API and Polymer components. Usually there will be additional data science products (predictions, special segment data, funnels, market research, attribution charts) and long-term data in BigQuery, which has an API as well. Some vendors don't have an API, so to access their data it is exported into a CSV and placed into Google Sheets, which has an API and a Polymer component. It takes too much work and time to code all these dashboards by hand unless the campaign is long term. But when one uses a generator like Yeoman and incrementally adds each part, the creation of many of these dashboards can be automated, reducing the time and complexity of creating these solutions.<br />
<br />
A second use case for this combo is creating HTML5 banners. Campaigns often require creating many similar banners, especially if you want to scientifically optimise your creative using a fractional factorial experimental design. Here is the <a href="https://github.com/johnfmorton/generator-buildabanner#readme" rel="nofollow" target="_blank">Buildabanner Yeoman generator</a>.
<h3>
Automation</h3>
Polymer together with Yeoman can help kick-start a new web project with an opinionated, fully baked tooling infrastructure. Each new edition of Polymer brings many changes and different tooling. Yeoman is not very well documented and is challenging to integrate into the increasingly automated build formats most CLIs use today. Both have fairly steep learning curves, which may make it difficult to justify the return on the time invested in tooling. But if you have many similar projects planned, or are building a self-serve system, Yeoman and Polymer together may be just the right fit.<br />
<br />
Yeoman is an automation tool for creating web projects. The more structured your projects are, the more time Yeoman is going to save you. It also shares with web components the notion of composability, which can help manage complexity. However, making a Yeoman generator bullet-proof may require long-term support and fixing bugs which occur on other people's systems.<br />
<br />
The two have been combined in the Polymer CLI, though you may still be interested in the following resources if this is a project you wish to automate. A generator also allows teams to concentrate knowledge in one place, which more readily supports additional automation via scripts, build tools, and a split production and development pipeline, as well as CI down the line.<br />
<br />
Another issue common to working with large boilerplate projects is that few people know what all the boilerplate is doing, or how it can be tested or changed. So you should document the project thoroughly.<br />
<br />
<br />
<h3>
Resources on accelerating Polymer projects with Yeoman</h3>
I cover these because Yeoman is easy to get started with, but you soon end up interacting with Bower, NPM, and other tools, which adds up to a steep learning curve.<br />
<br />
<a href="https://www.polymer-project.org/1.0/" rel="nofollow" target="_blank">Polymer</a> is the JavaScript library that teaches the browser new tricks. Polymer has had strong ties to the <a href="http://yeoman.io/" rel="nofollow" target="_blank">Yeoman project</a>, perhaps because the polymath Addy Osmani is on both development teams. Yeoman is described as "The web's scaffolding tool for modern web apps".<br />
<br />
Here is <a href="http://www.html5rocks.com/en/tutorials/webcomponents/yeoman/" rel="nofollow" target="_blank">an article</a> from 2013 on creating Polymer projects using Yeoman. A year later, Addy Osmani released the following video:<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Yd6Q4Wwvpd0/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/Yd6Q4Wwvpd0?feature=player_embedded" width="320"></iframe></div>
<div style="text-align: center;">
+Addy Osmani - Building a Polymer app with Yeoman 2014</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/INH_OW4lFSs/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/INH_OW4lFSs?feature=player_embedded" width="320"></iframe></div>
<div style="text-align: center;">
YOLOmer! Polymer and Yeoman for lighting fast dev</div>
<div style="text-align: center;">
<br /></div>
<div class="separator" style="clear: both;">
When Polymer 1.0 was released, it was introduced with a couple of starter kit projects, one for new users and another for power users. The starter kit included some fairly sophisticated use of tooling to provide a plethora of configurable features, such as offline support using service workers. At the Polymer Summit in 2015, Rob Dobson introduced a Polymer Yeoman generator that stamps out a PSK (Polymer Starter Kit) project.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div style="text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/1f_Tj_JnStA/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/1f_Tj_JnStA?feature=player_embedded" width="320"></iframe></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
+Rob Dobson - End to End with Polymer from The Polymer Summit 2015</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
In 2016 Polymer introduced routing and layout components and behaviors. These were released with the Polymer CLI, a command line tool based on Yeoman that provides unified installation of and access to many tools that are used with Polymer. However, the PSK2 project was not updated to work with these, and eventually a much simpler starter project was recommended.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
These were introduced by Rob Dobson in 2016 in a couple of Polycasts episodes, 52 and 53, focused on PSK2.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/A_OEdyhgnKc/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/A_OEdyhgnKc?feature=player_embedded" width="320"></iframe></div>
<div style="text-align: center;">
Rob Dobson - How to build a CLI generator -- Polycasts #53</div>
<a href="https://www.youtube.com/user/ChromeDevelopers"></a><br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div style="text-align: center;">
<span style="text-align: left;">Polymer CLI Generators 101 </span><br />
<span style="text-align: left;"><br /></span></div>
<br />
Polymer CLI allows us to generate components. There are many types of components, and creating a working environment with a demo and tests takes lots of work and research. We are also at the cusp between Polymer 1 and Polymer 2, where ES2015 reigns supreme.<br />
<br />
So let's build some generators for custom components in Polymer. This is one use case where working with generators can have a significant payoff in time savings. Some candidates:<br />
<br />
<ul>
<li>Polymer 2 preview elements</li>
<li>es6 polymer element element </li>
<li>psk2 with es6 support</li>
<li>an element in psk2 with es6 support</li>
<li>a style element</li>
<li>a behaviour</li>
<li>d3 element etc. (integration with libraries that can play with Polymer). What are the requirements for a library so that it can be integrated with Polymer's data binding model?</li>
<li>a PWS project with firebase support</li>
</ul>
<div>
<h2>
Additional References</h2>
</div>
<div>
<ul>
<li><a href="https://www.packtpub.com/mapt/book/web_development/9781783981380" rel="nofollow" target="_blank">Learning Yeoman - Jonathan Spratley</a></li>
<li><a href="https://www.manning.com/books/front-end-tooling-with-gulp-bower-and-yeoman" rel="nofollow" target="_blank">Front-End Tooling with Gulp, Bower, and Yeoman</a></li>
<li><a href="http://yeoman.io/codelab/">Yeoman codelab</a></li>
</ul>
</div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-68093574587390013202017-12-27T15:25:00.000+02:002017-12-27T15:25:36.339+02:00Serverless Big Data<div>
<br style="background-color: #f6f7f8; color: #2e3e48; font-family: "Graphik Meetup", -apple-system, BlinkMacSystemFont, Roboto, Helvetica, Arial, sans-serif; font-size: 16px;" /><div>
I was at the first meeting of the Big Data Analytics meetup. The first speaker was Avi Zloof, CEO of EvaluateX, who gave a talk titled "Serverless Big Data: The Good, and the Great".</div>
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSuJJV_hLwpb2oW8bPMA_LBLEgv_ESdSh23_hk8On-9e-3FwZK5JE9PtOClWidtNe2dStx_vPvCMT_lPWCoN_l0sjcm0ZJ3u-gVRIj36b3QFWcUQkfYfzfl-zV-S3lzBMk1FGAuhMgh28/s640/blogger-image-1903939847.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="239" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjSuJJV_hLwpb2oW8bPMA_LBLEgv_ESdSh23_hk8On-9e-3FwZK5JE9PtOClWidtNe2dStx_vPvCMT_lPWCoN_l0sjcm0ZJ3u-gVRIj36b3QFWcUQkfYfzfl-zV-S3lzBMk1FGAuhMgh28/s320/blogger-image-1903939847.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The Speaker Avi Zloof</td></tr>
</tbody></table>
<div>
EvaluateX, which is located at "<a href="http://thejunction.co.il/" rel="nofollow" target="_blank">The Junction</a>" (Rothschild 9, Tel Aviv), is an outfit that has a Chrome plugin which can optimize Google BigQuery SQL queries in the web interface. My last BigQuery project, however, had abandoned the web interface and switched to 100% automation via the API. Also, despite having massive queries there was little need to optimize them. I had been more concerned with comparing different editions of the projects to detect data discrepancies. The connection between Big Data and the GUI is often the primary challenge; however, this was not the subject of the talk.</div>
<div class="separator" style="clear: both;">
The talk introduced me to EvaluateX and their activity. Mr. Zloof shared many interesting professional insights as well as his point of view regarding serverless database platforms. Mr Zloof briefly outlined the history of serverless databases, mentioning </div>
<div>
<br />
<ul>
<li>Google BigQuery</li>
<li>AWS Athena</li>
<li>Azure Functions</li>
<li>Google Cloud Functions</li>
<li>IBM OpenWhisk</li>
</ul>
</div>
</div>
<div>
<br /></div>
<div>
Mr Zloof's primary takeaway message was that the <i>pricing model</i> is the key to correctly evaluating a platform's suitability for a company's business model.</div>
<div>
<br /></div>
<div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5t2ucPc1ssPawnhbsTYAtSEwotqxWzGQgiKSddzdUBcNT8So0O-7XHH3Sfemil6v_PLmx8vxf72zmdevdNev4NnzSynRUn_G-_5mWMuB3BxuxmNEoujpWx5iQ2j0x-YGdCb6n0uHhTlI/s640/blogger-image--970232761.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="239" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5t2ucPc1ssPawnhbsTYAtSEwotqxWzGQgiKSddzdUBcNT8So0O-7XHH3Sfemil6v_PLmx8vxf72zmdevdNev4NnzSynRUn_G-_5mWMuB3BxuxmNEoujpWx5iQ2j0x-YGdCb6n0uHhTlI/s320/blogger-image--970232761.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The Serverless Database Platforms for 2018</td></tr>
</tbody></table>
Next came a back-of-the-napkin calculation which p<span style="font-family: "helvetica neue light" , , "helvetica" , "arial" , sans-serif;">osited that if processing a 1 TB query costs 5 USD on BigQuery and creates 6 USD of value, you have a viable business model for working with big data. I felt that ignoring storage and networking costs might be a flaw in this rough model. However, I cannot deny that reducing the complexities of pricing cloud services is certainly an easier sell to middle and upper management than the labyrinthine calculations for pricing real-world cloud services to produce BI systems, and that this approach distills the costs of processing to their essence.</span></div>
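<div>
To make the napkin arithmetic concrete, here is a minimal Python sketch. The function name and the optional overhead parameter are assumptions added for illustration; only the 5 USD per TB price and the 6 USD of value come from the talk.</div>

```python
def query_margin(tb_scanned, price_per_tb_usd, value_usd, overhead_usd=0.0):
    """Return value minus cost for a single query (positive means viable)."""
    # Processing cost grows linearly with the amount of data scanned.
    processing_cost = tb_scanned * price_per_tb_usd
    # overhead_usd is a hypothetical stand-in for storage and networking costs.
    return value_usd - (processing_cost + overhead_usd)


# Numbers from the talk: 1 TB scanned at 5 USD per TB, producing 6 USD of value.
print(query_margin(1, 5.0, 6.0))
# The same query with 2 USD of assumed overhead no longer pays for itself.
print(query_margin(1, 5.0, 6.0, overhead_usd=2.0))
```

<div>
The second call shows why I was uneasy about leaving storage and networking out: even a small overhead can flip the sign of the margin.</div>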
<div>
<br /></div>
<div>
Mr Zloof noted that<span style="font-family: "helvetica neue light" , , "helvetica" , "arial" , sans-serif;"> when Oracle CTO Larry Ellison <a href="https://video.oracle.com/detail/videos/most-viewed/video/5597221399001/larry-ellison-orac" rel="nofollow" target="_blank">announced</a> in October 2017 </span><span style="font-family: "helvetica neue light" , , "helvetica" , "arial" , sans-serif;"> his company's </span><span style="font-family: "helvetica neue light" , , "helvetica" , "arial" , sans-serif;">entry into the serverless database space with Fn, such minutiae as the pricing plan had been glossed over. Once the service's pricing is finalised, it will become the deciding factor for evaluating to what extent Oracle's new platform will be competitive in this crowded space.</span></div>
<div>
<br /></div>
<div>
Price tag per 2013 - <span style="font-family: "helvetica neue light" , , "helvetica" , "arial" , sans-serif;">Google BigQuery costs 5 USD per TB. </span>AWS Athena is priced at 5 USD per compressed TB, which can cost a third less than BigQuery. Some other insights concerned scaling and performance: although Athena is cheaper than BigQuery, it is far less powerful. DynamoDB is a key-value store, which is not as suitable for analytics work or for general purpose work as, say, a SQL backend.</div>
<div>
<br /></div>
<div>
Mr Zloof stated that NoSQL solutions are adding SQL interfaces and that after many years he now feels that SQL is the <i>lingua franca</i> for Big Data systems. I haven't seen this in, say, Firebase; however, superlatives aside, this is definitely a trend in the evolution of NoSQL systems. Google BigQuery, for example, started out with a proprietary SQL dialect and now supports a more compliant SQL format. But when a NoSQL database adds a SQL front end it is highly unlikely that it will be as performant as a SQL backend, which is where query optimization becomes important.</div>
<div>
<br />
Another important feature not available from most of the serverless databases is a "stop loss" ability. The term, which comes from the finance industry, refers to a command to stop an operation that will incur very large charges. Most queries need a single scan of the data warehouse, but if your operation is polynomial it could run for days on your data, and you might not be aware of it or be able to halt it without intervention from support, which could take hours to respond. So a stop loss capability can be a game changer.<br />
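<br />
Where a platform offers no stop loss, a weaker substitute is a pre-flight budget check in your own code. The Python sketch below is only an illustration under assumptions: the names are hypothetical, and it presumes you can obtain an estimate of the data scanned before running (for instance from a dry run, where the platform offers one). It refuses to start an expensive query; it cannot halt one that is already running.<br />

```python
class BudgetExceeded(Exception):
    """Raised when a query's estimated cost is above the allowed budget."""


def guard_query(estimated_tb, price_per_tb_usd, max_cost_usd):
    """Return the estimated cost, or raise before anything is run."""
    estimated_cost = estimated_tb * price_per_tb_usd
    if estimated_cost > max_cost_usd:
        raise BudgetExceeded(
            f"estimated {estimated_cost:.2f} USD exceeds cap of {max_cost_usd:.2f} USD"
        )
    return estimated_cost


# A 2 TB scan at 5 USD per TB fits within a 20 USD cap.
print(guard_query(2, 5.0, 20.0))
```

Calling guard_query(10, 5.0, 20.0) would raise BudgetExceeded instead of letting a 50 USD query through.<br />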
<br />
All in all this was a great talk and I hope to be hearing more from this speaker in the future.<br />
<br />
<h2>
References:</h2>
<div>
<ul>
<li>https://evaluex.io/</li>
</ul>
</div>
<br /></div>
<div>
<br /></div>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-10532206356552332552017-12-27T10:38:00.001+02:002017-12-27T10:38:09.739+02:00How to kill by name from the command line - ubuntu 17.10<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMJtxaV3_eYrqGgyEFZJQBX6_x8ueiwb4b3C8RwjcHUtMDFhDr6JLlMzagJi6_avdH1vw3gXjo7n0yyHXenDaQXOUgZT1GEFrjMkCF4lO_xAFU9jsyn33vxkEtA66WLnSazMFjo5HoPOM/s1600/ubuntu-linux-logo-A8280F4D05-seeklogo.com.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="300" data-original-width="292" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMJtxaV3_eYrqGgyEFZJQBX6_x8ueiwb4b3C8RwjcHUtMDFhDr6JLlMzagJi6_avdH1vw3gXjo7n0yyHXenDaQXOUgZT1GEFrjMkCF4lO_xAFU9jsyn33vxkEtA66WLnSazMFjo5HoPOM/s200/ubuntu-linux-logo-A8280F4D05-seeklogo.com.png" width="194" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Ubuntu Tip</td></tr>
</tbody></table>
<h2>
The pain</h2>
<br />
So I am working on coding a React and Redux component and I have a tight loop spinning in Chrome. Chrome becomes unresponsive and won't stop. Soon it will eat up all the system memory and cause my machine to grind to a halt. For some reason Chrome rarely detects the rapid resource growth.<br />
<br />
I used to open a terminal and run<br />
<br />
$ ps -A<br />
<br />
to look up Chrome's pid, but Chrome has many pids, one for each window and one per extension. My machine is slowing. I next try:<br />
<br />
$ ps -A | grep chrome<br />
<br />
This is better. I choose the first pid (I might have to scroll) and run<br />
<br />
$ kill -9 &lt;pid&gt;<br />
<br />
And things go back to normal. But I still haven't fixed the bug and there has to be a better way...<br />
<br />
<h2>
The remedy</h2>
<div>
$ killall -9 chrome</div>
<div>
<br /></div>
<div>
and this kills all Chrome processes: one command, with no lookups, copy-pasting, etc.</div>
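<div>
For the curious, the manual routine above (list processes, filter by name, signal each pid) can be sketched in a few lines of Python. This is a rough illustration of the idea, not how killall is actually implemented; the function names are my own, and the sketch defaults to SIGTERM rather than the -9 used above.</div>

```python
import os
import signal
import subprocess


def pids_by_name(ps_output, name):
    """Return the pids whose command name equals `name`."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # "  1234 chrome" becomes ["1234", "chrome"]
        if len(parts) == 2 and parts[1].strip() == name:
            pids.append(int(parts[0]))
    return pids


def kill_by_name(name, sig=signal.SIGTERM):
    """Send `sig` to every process called `name`; return the pids signalled."""
    # pid= and comm= select the pid and command-name columns with empty
    # header text, so ps prints no header line.
    listing = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = pids_by_name(listing, name)
    for pid in pids:
        os.kill(pid, sig)
    return pids
```

<div>
Calling kill_by_name("chrome", signal.SIGKILL) would be the counterpart of the killall -9 command. SIGTERM is the gentler default because it gives the process a chance to clean up.</div>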
<br />
<h3>
Note </h3>
Probably nothing here is specific to Ubuntu 17.10.Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-77144368634832857152017-12-06T22:05:00.004+02:002017-12-17T12:43:15.373+02:00Simpla goes open sourceSimpla, the headless content management system, has recently announced that it is closing down and making its project open source. This project allows a developer to rapidly prototype a website and editors to manage the content from the page's UI itself. The big change is that you will now be able to move your content from the Simpla database and host it on GitHub.<br />
<br />
Headless means that the CMS doesn't have a huge front end like WordPress to manage the code. Instead its backend is exposed as a simple API, allowing developers to use whatever integration is best suited for each user story. Headless CMSs are more suitable for working with multiple channels, such as Android and iOS apps alongside a website.<br />
<br />
Trying to set up a new project using Simpla is easier said than done. Once I'm up and running I'll add some more updates in this space.Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-40870328428178878182017-12-03T13:44:00.000+02:002017-12-16T13:45:38.868+02:00Building an OCR using NLI Ephemera and manuscripts.<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9at9Dnekss-C2Jf3AVNJwpNlkwewJrqVkNyMlhiKJfKziQz3Eqad_eXr9Djzt55Ar_lu1BKth3aaxliQ2aSx05Gv_VS9OkBCAqImUuYdVUoOKrHVjuMLGqBsC_ZG9JBSSC7-KySL4iNI/s1600/logo-NLI-1.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="611" data-original-width="1600" height="76" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9at9Dnekss-C2Jf3AVNJwpNlkwewJrqVkNyMlhiKJfKziQz3Eqad_eXr9Djzt55Ar_lu1BKth3aaxliQ2aSx05Gv_VS9OkBCAqImUuYdVUoOKrHVjuMLGqBsC_ZG9JBSSC7-KySL4iNI/s200/logo-NLI-1.png" width="200" /></a></div>
<br />
The OCR task can be broken down as follows.<br />
<ol>
<li>Acquire the image.</li>
<li>Segment it into regions according to the following labels: </li>
<ol>
<li>Image, </li>
<li>Text Areas with optional rotation</li>
<li>Tabular Data with optional rotation</li>
</ol>
<li>Scale down very large text to suitable size glyphs</li>
<li>Improve results by adding terms to better model the noise present on the page.</li>
<li>Improve results by incorporating lexical and grammatical knowledge into the classifier.</li>
</ol>
<div>
Ideally all this should be done by an end to end system.</div>
<div>
<br /></div>
<div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6O9X6qaDlhcb1DVYLSEiWJ1tV8JA4L9kECL7ANYam-oR_znOT_jmGR-cN3kf6OMhTAF5telKpjVdQFIyYh1iIwaeDIkMdMkAvnunqtX54qRtO02u6PJDqhk4UyJeZf_eve8vY9fHNYNY/s1600/complex+layouts.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="861" data-original-width="620" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6O9X6qaDlhcb1DVYLSEiWJ1tV8JA4L9kECL7ANYam-oR_znOT_jmGR-cN3kf6OMhTAF5telKpjVdQFIyYh1iIwaeDIkMdMkAvnunqtX54qRtO02u6PJDqhk4UyJeZf_eve8vY9fHNYNY/s320/complex+layouts.jpg" width="230" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A common complex layout of Hebrew <br />
sacred texts with non rectangular <br />
columns with related but independent<br />
sequences</td></tr>
</tbody></table>
Once text areas are detected, a page segmentation algorithm is required to break them down into lines and glyphs.</div>
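<div>
As a toy illustration of the line-splitting part, here is a Python sketch of a thresholded horizontal projection: rows that contain ink belong to a text line, and blank rows separate lines. This is a classic baseline that I am assuming for illustration, not the algorithm this project settled on. It expects a deskewed, binarized text area and would not cope with the non-rectangular layouts shown in the figure.</div>

```python
def split_into_lines(bitmap):
    """Return (top, bottom) row ranges of text lines in a binary bitmap.

    `bitmap` is a list of rows; each row is a list of 0 (paper) or 1 (ink).
    `bottom` is exclusive.
    """
    lines = []
    start = None
    for y, row in enumerate(bitmap):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                   # first inked row of a new line
        elif not has_ink and start is not None:
            lines.append((start, y))    # blank row closes the current line
            start = None
    if start is not None:               # text runs to the bottom edge
        lines.append((start, len(bitmap)))
    return lines
```

<div>
Running the same idea along the columns of each detected line would give a first cut at glyph boundaries.</div>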
<br />
Looking at some samples from the NLI ephemera database one would wish to add steps to clean up and rescale some elements whose fonts are too small. Also if one had a suitable model, perhaps add details to text that is too small.<br />
<br />
The challenges are:<br />
<ol>
<li>Training on a large data set of glyphs (the characters used in the fonts one needs to recognize). DNNs and OCR engines in general seem to be inflexible in recognizing similar data, as they cannot generalize beyond what they are trained on.</li>
<li>Flexible segmentation and figuring out the most correct sequence of text blocks so that the page is logical (Hebrew is RTL, English is LTR, and they could also be mixed). Using images + hand converted results from <a href="https://www.gutenberg.org/" rel="nofollow" target="_blank">Project Gutenberg</a>, <a href="http://benyehuda.org/" rel="nofollow" target="_blank">Ben Yehuda</a> and other projects could be useful, as would processing of manuscripts, which would introduce greater recognition ability. This needs to be formulated into a loss function.</li>
<li>Improving results by combining lexical and grammatical data into the loss function to select the best subsequences.</li>
<li>Learning to model real noise from different types of documents. Noise (digitization artifacts, aging, wear and tear, gutters, dirt) can be modeled and separated from the signal adaptively.</li>
</ol>
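<div>
To illustrate the block sequencing challenge, here is a deliberately naive Python sketch that orders text blocks top to bottom and then right to left (or left to right). The block representation and the row tolerance are assumptions made for illustration. It handles only a single reading direction per page and simple grid-like layouts, which is exactly why the real problem needs something learned.</div>

```python
def reading_order(blocks, rtl=True, row_tolerance=10):
    """Order text blocks top to bottom, then right to left (or left to right).

    Each block is a dict with "x" and "y" (top-left corner, in pixels).
    Blocks whose tops differ by at most `row_tolerance` share a row.
    """
    rows = []
    for block in sorted(blocks, key=lambda b: b["y"]):
        if rows and block["y"] - rows[-1][0]["y"] <= row_tolerance:
            rows[-1].append(block)      # close enough to the row's first block
        else:
            rows.append([block])        # start a new row
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda b: b["x"], reverse=rtl))
    return ordered
```

<div>
For a Hebrew page one would call reading_order(blocks), and for an English page reading_order(blocks, rtl=False).</div>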
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaHZmJhJ_dNnHkddwOnJtw466_hKVqpybDhOTlLAcDcWl8JNv1J7jNahM_ZbdPpIvEJZqo6plRNcDldEdBVvazJCzpvNAolL3WZlC9i6cJzA489opdzFw8rrpd6vG1G_ctIdv3XQV6gbA/s1600/375px-Pashkavil_Mamila.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img alt="" border="0" data-original-height="300" data-original-width="375" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiaHZmJhJ_dNnHkddwOnJtw466_hKVqpybDhOTlLAcDcWl8JNv1J7jNahM_ZbdPpIvEJZqo6plRNcDldEdBVvazJCzpvNAolL3WZlC9i6cJzA489opdzFw8rrpd6vG1G_ctIdv3XQV6gbA/s320/375px-Pashkavil_Mamila.jpg" title="OCR challenges: Scaling and layout " width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Ephemera also have both non-standard layouts,<br />
unusual letter size variation as well as unexpected content.</td></tr>
</tbody></table>
<br />
Ideally one would capture vector versions of glyphs in the most common fonts, in all their weights and variants (including ligatures etc.), and use these together with frequencies to model the corpus.<br />
<br />
But how to train it as a GAN (Generative Adversarial Network)?<br />
<br />
<ul>
<li>A piece of software to generate glyphs. (N-category classifier)</li>
<li>A piece of software to generate suitable textual sequences using these glyphs. (RNN or LSTM)</li>
<li>A piece of software to generate text in different layouts and composite pages. (GAN)</li>
<li>A function that checks how close the above are to real pages scanned in a database. (The requirement is a similarity threshold.)</li>
</ul>
<br />
Train the different elements together.<br />
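<br />
Before any of this is learned, a crude hand-written noise model is enough to start producing synthetic training pairs of clean and degraded glyphs. The Python sketch below flips pixels at random (a simple salt-and-pepper style noise); the function name and the tiny bitmap are assumptions for illustration, and real aging, dirt and gutter noise is far more structured than this.<br />

```python
import random


def add_noise(glyph, flip_probability, seed=None):
    """Return a copy of a binary glyph bitmap with pixels flipped at random."""
    rng = random.Random(seed)
    noisy = []
    for row in glyph:
        noisy.append([
            1 - pixel if rng.random() < flip_probability else pixel
            for pixel in row
        ])
    return noisy


# A hypothetical 3x3 "plus" glyph and a degraded copy of it.
clean = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
degraded = add_noise(clean, flip_probability=0.2, seed=42)
```

Fixing the seed makes each degraded sample reproducible, which helps when comparing classifiers on the same synthetic data.<br />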
<h3>
References</h3>
<ul>
<li><a href="https://datascience.stackexchange.com/questions/15396/how-to-generate-training-data-for-ocr" rel="nofollow" target="_blank">Generating data for OCR</a></li>
<li><a href="https://github.com/zafartahirov/not_notMNIST" rel="nofollow" target="_blank">Code to generate a notMNIST-style dataset from fonts</a></li>
<li><a href="http://yaroslavvb.blogspot.co.il/2011/09/notmnist-dataset.html">notMNIST dataset - has greater variation in its categories</a></li>
<li><a href="https://github.com/zalandoresearch/fashion-mnist">Fashion-MNIST, a non-textual MNIST alternative</a></li>
<li><a href="https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html">Training image classification with very little data</a></li>
<li><a href="http://hebrewbooks.org/" rel="nofollow" target="_blank">Hebrew Books</a></li>
<li><a href="https://safranim.com/2010/12/01/%D7%90%D7%A8%D7%95%D7%9F-%D7%94%D7%A1%D7%A4%D7%A8%D7%99%D7%9D-%D7%94%D7%99%D7%94%D7%95%D7%93%D7%99-%D7%94%D7%9E%D7%9E%D7%95%D7%97%D7%A9%D7%91-%D7%9E%D7%90%D7%92%D7%A8%D7%99-%D7%9E%D7%99%D7%93%D7%A2/" rel="nofollow" target="_blank">Hebrew resources from Safranim's Blog </a></li>
<li><a href="https://he.wikipedia.org/wiki/%D7%A7%D7%95%D7%91%D7%A5:Pashkavil_Mamila.jpg" rel="nofollow" target="_blank">Source for the ephemera image</a></li>
<li><a href="https://archive.org/details/nationalyiddishbookcenter" rel="nofollow" target="_blank">Yiddish Book Center's</a></li>
</ul>
<br />
<br />
<br />
<br />
<br />Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-55403193813496614112017-11-26T16:33:00.002+02:002023-03-16T02:50:26.065+02:00The Happy winners #nlihack 2017<h2>
About the event </h2>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia9GHTs0G7rwj0JpHIldMo7WgiZKemH7AFUkhwXctz0WLonUsGs0RO0pzpUbd46K1CeYkJ9XzzFG-GRqdDUdFZuqfr64svIVDSsQMg7CTaJJc3AnFGXu-TXXR4BidqK8IFlnTt3oy2_yY/s1600/wiki500.jpg" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="333" data-original-width="500" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia9GHTs0G7rwj0JpHIldMo7WgiZKemH7AFUkhwXctz0WLonUsGs0RO0pzpUbd46K1CeYkJ9XzzFG-GRqdDUdFZuqfr64svIVDSsQMg7CTaJJc3AnFGXu-TXXR4BidqK8IFlnTt3oy2_yY/s320/wiki500.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The happy winners! <br />
Amir Aharoni, Oren Bochman and Chaim Cohen</td></tr>
</tbody></table>
Last week (November 23-24 2017) I had the pleasure of participating in the first <a href="http://blog.nli.org.il/hackathon/" rel="nofollow" target="_blank">National Library of Israel's Hackathon</a>. I've been to the NLI a few times with friends from the Wikimedia movement to instruct its staff and students about editing Wikipedia. But at the hackathon, the NLI opened its doors to the best and brightest minds to help out with tagging content and dissemination of its extensive image database.<span id="goog_1818226550"></span><br />
<br />
<h2>
The Team</h2>
You can't win a hackathon without a great team. My team consisted of seven developers who have been part of the core community of Wikimedia developers in Israel and have been meeting irregularly since the international Wikimedia hackathon organized by Wikimedia Israel last year. We had met about a week before the event at the local chapter's offices and discussed over pizza what we wanted to do and what the NLI had asked us to do. I realized that the most wanted task required a longer-term commitment and possibly discussion with NLI staff on a suitable upgrade of the Commons mass upload solutions.<br />
<br />
p.s. I'll be adding a note on the team members ASAP.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7xtcVJxakeeQQei2h3kn90MdqDNcAEFdsL0U2xUuxZIoRjL9PVfQ-teSA6JSXY3ONC-R1DZBPMqHiWi239QUdG2XqgISB1bzUjCZdDKpzk_NdWNniICZFBFHSdDUM5Yr8lUUxDrbK39Y/s1600/presenting.jpg" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1067" data-original-width="1600" height="425" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7xtcVJxakeeQQei2h3kn90MdqDNcAEFdsL0U2xUuxZIoRjL9PVfQ-teSA6JSXY3ONC-R1DZBPMqHiWi239QUdG2XqgISB1bzUjCZdDKpzk_NdWNniICZFBFHSdDUM5Yr8lUUxDrbK39Y/s640/presenting.jpg" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Oren Bochman presenting the deep learning model</td></tr>
</tbody></table>
<br />
<h2>
The Ideas </h2>
<br />
Another suggestion had been to learn categories from the text descriptions in the image metadata. I had a related yet more ambitious idea, which was to retrain some state-of-the-art models such as <a href="http://arxiv.org/abs/1409.4842" rel="nofollow" target="_blank">Inception</a> and <a href="https://arxiv.org/abs/1503.03832" rel="nofollow" target="_blank">FaceNet</a> (also based on Inception) to learn the contents of the images in the database. I thought that this was a viable idea since it was based on deep learning tasks using <a href="https://www.tensorflow.org/" rel="nofollow" target="_blank">TensorFlow</a> and its higher level interface <a href="https://keras.io/" rel="nofollow" target="_blank">Keras</a>, with which I as well as other team members had had considerable previous experience. The only caveat was that training would require considerably more computing power than my laptop could provide.<br />
<br />
The second idea, which was due to the esteemed Amir Aharoni, was to split up manuscripts using a heuristic and then use a chat bot to crowd-source their conversion to text. Half the team tried this challenge, which complemented my idea since any ML task benefits from crowd sourcing a suitable data set!<br />
<h3>
The Story</h3>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: right;"><tbody>
<tr><td style="text-align: center;"><img border="0" data-original-height="1067" data-original-width="1600" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdMnqxgz7N3QnKn5LW6JdMq_zsyzsOzE_tcmCF7WtHBpXocweQjV6aopQlXxuMmkS7P_eUAAZn6-WMsuy-MWZN2qUWMRXeAFt4X_qPlYY1Qd5kTK-34QutqGIomODs-TeUq3K0MOGVZHk/s640/chaim.jpg" style="margin-left: auto; margin-right: auto;" width="640" /></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The incredible Chaim Cohen presenting the overall concept</td></tr>
</tbody></table>
The actual hackathon started out as a bust. We had practically no internet access for a number of hours. After about six hours and a number of complaints we were relocated, but there were additional setbacks. The scripts I had located for downloading categories from Wikimedia Commons using the <a href="https://www.mediawiki.org/wiki/Manual:Pywikibot" target="_blank">Pywikibot</a> framework proved incompatible with the more recent version and needed to be rewritten from scratch. Also, interfacing with the NLI's database search and high-quality images proved trickier than we had initially expected. Last but not least was the fact that we had no idea what categories of images to train on. It turned out that pre-trained Inception was no good with people. FaceNet needs alignment using a face detector like OpenCV or its own model. Finally, chatting with the other team members showed they had been stalled and could not help with our challenges. At this point, I was ready to give up and go to sleep, but Chaim was inspired, and after talking to a librarian during a night shift we decided to give everything a second chance, both data collection and improving the models. I rebuilt TensorFlow from source and then retrained the models, and we started seeing much better outcomes as we got more images to work with.<br />
<br />
<h2>
The Presentation</h2>
You can have a perfect project but the presentation is what wins the hackathon. I took a nap and when I got up there were three hours on the clock, so we got to work on the presentation, where again Mr. Cohen played a vital role. He told the story, created a logo and infused the whole crowd with his enthusiasm for our idea. But to my surprise, Chaim insisted that two more team members stand on the stage and present, and we did!<br />
<br />
<a href="https://drive.google.com/open?id=1CTTigyPh23302rg-18i0-nEjmarBEoQl" rel="nofollow" target="_blank">Our presentation</a><br />
<h2>
References </h2>
<br />
<ul>
<li><a href="https://github.com/davidsandberg/facenet/wiki/Train-a-classifier-on-own-images" rel="nofollow" target="_blank">Train a classifier on its own images</a></li>
<li><a href="https://www.tensorflow.org/tutorials/image_retraining" rel="nofollow" target="_blank">How to Retrain Inception's Final Layer for New Categories</a></li>
<li><a href="https://www.facebook.com/pg/NationalLibraryIsrael/posts/" rel="nofollow" target="_blank">NLI Facebook page</a></li>
<li><a href="http://blog.nli.org.il/hackathon/" rel="nofollow" target="_blank">NLI coverage of the hackathon</a> (in Hebrew)</li>
</ul>
<br />
<br />
<span id="goog_1818226549"></span><br />
<br />
<span id="goog_1818226545"></span><span id="goog_1818226546"></span><br />
<div>
<br /></div>
<br />Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0קריית אדמונד ספרא, ירושלים, 9139002, ישראל31.7759008 35.1968096000000514.8807048 -6.1117843999999479 58.6710968 76.505403600000051tag:blogger.com,1999:blog-7341613201559046168.post-10948456333067759392017-11-17T17:03:00.001+02:002017-11-28T12:44:21.381+02:00My first BigQuery DWH<div dir="ltr">
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmWu1cubxxSDaSr8qrXM86W8csN81hsMOj3xs9X6m4-3MzJfyj-uWoz86qDLkid2XbfQRDiQTn8TySoU71v13sWAoP4uBJVLR-M3wBNdbZdCAKtTxhpdpKgZzvLL0aCljBrGdzFrrPQVk/s1600/BQ.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="102" data-original-width="410" height="98" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmWu1cubxxSDaSr8qrXM86W8csN81hsMOj3xs9X6m4-3MzJfyj-uWoz86qDLkid2XbfQRDiQTn8TySoU71v13sWAoP4uBJVLR-M3wBNdbZdCAKtTxhpdpKgZzvLL0aCljBrGdzFrrPQVk/s400/BQ.jpg" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">BigQuery is Google's Analytics Database</td></tr>
</tbody></table>
</div>
<div dir="ltr">
Some notes about a project that has taken up lots of time recently. It was a media attribution dashboard for a client running several hundred campaigns. A POC version of the project had been created manually using spreadsheets and we had to provide a drop-in replacement ASAP. I took up the task of migrating a spreadsheet-based BI to a more robust script- and SQL-based platform able to handle the rapidly accumulating data, which would soon overpower the spreadsheet's models. A secondary challenge was that the entire system we were analyzing was under development and would change daily. Despite its shortcomings as a classical database (no triggers and no in-schema protections) I chose BigQuery for its scalability and ease of integration. Despite its limitations it soon felt like a perfect fit for this type of project.</div>
<h3>
<b>Data collection</b></h3>
<div dir="ltr">
</div>
<ul>
<li>Data is currently acquired daily via API from various platforms: for example Google AdWords and Google Analytics.</li>
<li>I got an initial jump start by modifying a sample AdWords script from Google's developer pages which demonstrated BigQuery writing capabilities. Although this system is fully hosted by Google, it was a decision I would ultimately regret, but more on that later.</li>
<li>I added support for additional Google API queries. </li>
<li>Next I had to handle challenges arising from a mismatch between field names and foreign key formats between data sets. So the initial script became an ETL with new capabilities in each iteration: fixing dates, converting strings to numbers and handling different formats of null values.</li>
<li>Finally I added data clean-up to the script to fix these.</li>
</ul>
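<div dir="ltr">
The kind of clean-up steps listed above can be sketched as follows. The original ETL lived in an AdWords script (JavaScript); this Python version is only an illustration, and the null markers, date formats and function names are assumptions rather than the project's actual rules.</div>

```python
from datetime import datetime

NULL_MARKERS = {"", "--", "null", "none", "n/a"}   # assumed spellings of "no value"
DATE_FORMATS = ("%Y-%m-%d", "%Y%m%d", "%d/%m/%Y")  # assumed source formats


def clean_null(value):
    """Map the different spellings of a missing value to None."""
    if value is None or str(value).strip().lower() in NULL_MARKERS:
        return None
    return value


def clean_number(value):
    """Convert strings such as "1,234.5" to float, keeping nulls as None."""
    value = clean_null(value)
    if value is None:
        return None
    return float(str(value).replace(",", ""))


def clean_date(value):
    """Normalise a date string to ISO format (YYYY-MM-DD)."""
    value = clean_null(value)
    if value is None:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(str(value).strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {value!r}")
```

<div dir="ltr">
Raising on an unrecognised date, rather than passing it through, is a design choice that surfaces a new source format on the day it appears instead of weeks later in a dashboard.</div>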
<h3>
Stored Procedures</h3>
<div dir="ltr">
The DWH, based on the classic star schema, retained pretty much the same schema and physical layout of tables and views. However, there were a number of revisions to the stored procedures controlling the views, as I found that the incoming data lost key data or changed unexpectedly. The number of abstract entities multiplied due to a new requirement to be backwards compatible with the media campaigns going back many months. Handling changing data as the ETL evolved and rebuilding the BigQuery data set became unwieldy.</div>
<div dir="ltr">
</div>
<ul>
<li>To increase the project's agility I added a key capability to my script: recreating all the stored procedures programmatically, if missing, at every run of the project. This allowed me to drop a data set and rebuild everything from scratch as the ETL evolved, despite BigQuery's clunky web interface, which had made progress unwieldy. </li>
<li>To allow creation of the BI front end in parallel to development I bifurcated the project to run production, testing and development versions concurrently, each on an independent BigQuery data set.</li>
<li>BigQuery's standard SQL requires stored procedures to access tables by names fully qualified up to the data set level, so to support multiple versions I modified the ETL to rewrite the queries to conform with the specific data set each version should work on.</li>
<li>As mentioned, working with BigQuery's legacy SQL queries was inferior to working with standard SQL queries. </li>
</ul>
While adding these capabilities took a couple of days it significantly increased agility and over the project lifetime was a very significant time saver as it became possible to store very many queries in code and only run the latest versions.<br />
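<br />
A minimal Python sketch of the idea follows: the queries are kept in code as templates, filled in with the data set of the current environment, and only the ones missing from that data set are rebuilt. The view name, table name and project name are hypothetical, and the call that would actually create the views through the BigQuery API is deliberately left out.<br />

```python
VIEW_TEMPLATES = {
    # Hypothetical view; {project} and {dataset} are filled in per environment.
    "daily_spend": (
        "SELECT campaign_id, date, SUM(cost) AS cost "
        "FROM {project}.{dataset}.ad_costs "
        "GROUP BY campaign_id, date"
    ),
}


def qualify(template, project, dataset):
    """Fill in the project and data set so every table is fully qualified."""
    return template.format(project=project, dataset=dataset)


def missing_views(existing_views, templates=VIEW_TEMPLATES):
    """Return the names of views that have to be (re)created on this run."""
    return sorted(set(templates) - set(existing_views))


def build_statements(existing_views, project, dataset, templates=VIEW_TEMPLATES):
    """Map each missing view to the SQL that would define it."""
    return {
        name: qualify(templates[name], project, dataset)
        for name in missing_views(existing_views, templates)
    }


# After dropping the development data set, everything is rebuilt from code.
print(build_statements([], "my_project", "dev"))
```

With this shape, switching between production, testing and development is a matter of passing a different data set name.<br />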
<div>
<br /></div>
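<div>
A minimal sketch of the query-rewriting idea, not the project's actual code: each query is stored once with a placeholder and rewritten for the data set the current run targets. The <span style="font-family: Courier New, Courier, monospace;">{{dataset}}</span> placeholder, the query, and the table, column, and data set names are all hypothetical.</div>

```javascript
// Hypothetical sketch: the {{dataset}} placeholder and every name below are
// illustrative assumptions, not taken from the original project.
const QUERIES = {
  daily_spend: [
    'SELECT campaign_id, SUM(cost) AS cost',
    'FROM `{{dataset}}.campaign_stats`',
    'GROUP BY campaign_id',
  ].join('\n'),
};

// Replace every placeholder with the data set this run should work on.
function qualifyQuery(sql, datasetId) {
  return sql.split('{{dataset}}').join(datasetId);
}

// Produce a full set of queries bound to one data set (e.g. dev, test, prod).
function buildQueriesFor(datasetId) {
  const out = {};
  Object.keys(QUERIES).forEach((name) => {
    out[name] = qualifyQuery(QUERIES[name], datasetId);
  });
  return out;
}

const devQueries = buildQueriesFor('dwh_dev');
console.log(devQueries.daily_spend);
```

<div>
Joining an array of lines is one way to keep multi-line SQL readable in an environment without multi-line string literals.</div>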
<div>
However, keeping all the SQL in one large file increased the size of the code base, and the AdWords scripts editor lacks even basic editing abilities. I regretted not placing all the SQL queries in a project that handled multi-line strings, which would have reduced the adjustments needed to the queries. Due to the project's time frame I had to keep plugging away based on the initial decisions.</div>
<div>
<br /></div>
<h3>
Testing and QA</h3>
<div>
Testing and QA became increasingly important as I strove to improve the quality of the project's final output table, which detailed the attribution of aggregated campaign results to each campaign and product's spending and could be consumed by any one of several BI platforms (I used Google's Data Studio but tested Tableau and Qlik Sense as front ends as well). By creating QA dashboards outlining the data in each table and view of the DWH, I was able to quickly spot leaks and duplicated or missing data.</div>
<div>
Data Studio is still in beta and undergoing significant development, and several times I saw the QA environment collapse due to changes I could not control. Ultimately I decided to split QA into a set of sanity tests run in code and a simpler set of dashboards for inspecting issues. These became less important over time, as very high levels of accuracy had already been reached and the concern had shifted to checking the product's consistency. </div>
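<div>
As an illustration of what a sanity test run in code might look like, here is a small sketch that checks rows already fetched from a table or view for duplicated keys and missing values. The field names and sample rows are hypothetical, and fetching the rows from BigQuery is left out.</div>

```javascript
// Hypothetical sanity checks; field names and rows are illustrative only.

// Return the composite keys that occur more than once in the rows.
function findDuplicateKeys(rows, keyFields) {
  const seen = new Set();
  const duplicates = new Set();
  rows.forEach((row) => {
    const key = keyFields.map((field) => String(row[field])).join('|');
    if (seen.has(key)) duplicates.add(key);
    seen.add(key);
  });
  return Array.from(duplicates);
}

// Count rows where a field is null, undefined, or an empty string.
function countMissing(rows, field) {
  return rows.filter(
    (row) => row[field] === null || row[field] === undefined || row[field] === ''
  ).length;
}

const sampleRows = [
  { campaign_id: 'c1', date: '2017-11-01', cost: 10 },
  { campaign_id: 'c1', date: '2017-11-01', cost: 10 }, // duplicated row
  { campaign_id: 'c2', date: '2017-11-01', cost: null }, // missing value
];

console.log(findDuplicateKeys(sampleRows, ['campaign_id', 'date']));
console.log(countMissing(sampleRows, 'cost'));
```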
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-19206268359882245332017-11-17T10:39:00.000+02:002017-12-16T11:02:23.207+02:00How to take out the Trash from command line in Ubuntu 17.10<h3>
>How to use the trash from the command line?</h3>
<h3>
The pain</h3>
Setting up new projects is frequently time consuming, with many false starts until everything is set up right. In fact, once CI is set up, the version on the local machine is less important. I've been encountering this Ubuntu annoyance whenever starting a new project.<br />
<br />
I could create smart aliases for rm with a command line trash folder, but then there would be two trash folders. I just want to access the same trash folder from the command line that I can access through the desktop. It also turns out that this has been the subject of not one, not two, but at least three packages. The following option is quick, safe (as it is reversible), and lets us focus on the setup.<br />
<br />
Doing machine learning also creates big models and large downloaded data sets that can hog the limited fast storage. Still, I don't enjoy retraining a big model because I accidentally tossed out the last good model with all the previous runs.<br />
<h3>
The remedy </h3>
<br />
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
Install trash-cli </div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="font-family: Courier New, Courier, monospace;">>sudo apt-get install trash-cli</span></div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
then simply</div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="font-family: Courier New, Courier, monospace;">></span><span style="background-color: transparent;"><span style="font-family: Courier New, Courier, monospace;">trash bad-project-dir</span></span></div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="font-family: Courier New, Courier, monospace;">></span><span style="background-color: transparent;"><span style="font-family: Courier New, Courier, monospace;">trash old-config-file</span></span></div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="background-color: transparent;"><span style="font-family: inherit;">list the trash</span></span></div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="background-color: transparent;"><span style="font-family: Courier New, Courier, monospace;">>trash-list</span></span></div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="background-color: transparent;"><span style="font-family: inherit;">and even empty it once things have stabilized</span></span></div>
<div style="background-color: white; border: 0px; clear: both; font-stretch: inherit; font-variant-numeric: inherit; line-height: inherit; margin-bottom: 1em; padding: 0px; vertical-align: baseline;">
<span style="background-color: transparent;"><span style="font-family: Courier New, Courier, monospace;">>trash-empty</span></span></div>
<h3>
Reference:</h3>
<br />
<br />
<ul>
<li><a href="https://askubuntu.com/questions/213533/command-to-move-a-file-to-trash-via-terminal/213549" rel="nofollow" target="_blank">Question</a> @ #askubuntu</li>
</ul>
<br />
<br />
<br />Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-28419905541624812522017-07-24T05:55:00.003+03:002020-02-24T09:43:31.088+02:00Lay the foundation faster<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEju-9hyL_V5cnVUvlgBzLQVkrXUbg3f5sGMs0yo7iJvhCkZBhN-mSfQCeWX_czrJAVp_M8mR1iNbukXwH-6Q45vnwNOdoporNYaN4jNDwSFqoLbOXjrIk3WD2LBiDBPmOCRLmxh7vpiMWk/s1600/yeti-f5.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="800" data-original-width="800" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEju-9hyL_V5cnVUvlgBzLQVkrXUbg3f5sGMs0yo7iJvhCkZBhN-mSfQCeWX_czrJAVp_M8mR1iNbukXwH-6Q45vnwNOdoporNYaN4jNDwSFqoLbOXjrIk3WD2LBiDBPmOCRLmxh7vpiMWk/s320/yeti-f5.png" width="320" /></a></div>
I've recently started to work with foundation.js. My primary goal in working with Foundation is to generate mockups very fast. After completing the first project I've taken a bit of time to learn a bit more.<br />
<br />
Here are some insights which were difficult to discover within the documentation:<br />
<br />
<br />
<ul>
<li>#slack is available but not active enough to get answers</li>
</ul>
<ul>
<li>Foundation is primarily a Sass or CSS framework. If you need to do anything more than build a mockup, it is probably not going to have the code you need.</li>
</ul>
<ul>
<li>Using multiple document layouts: </li>
</ul>
<ol><ol>
<li>managed by panini which is really handlebars.js</li>
<li>can speed up static prototyping. </li>
<li>may help with PWA using http 2.0 </li>
<li>you need to have a <span style="font-family: "courier new" , "courier" , monospace;"><b>{{> body}}</b></span> handlebar in the template.</li>
<li>you have to add YAML <span style="font-family: "courier new" , "courier" , monospace;">FrontMatter</span> to the page (is this mr Jekyll ?)</li>
<li>the YAML needs to reference the layout, i.e. <span style="font-family: "courier new" , "courier" , monospace;">layout: file-name-without-ext</span></li>
<li>if you reference a missing layout you will get errors.</li>
</ol>
</ol>
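<div>
The layout mechanism described in the list above can be sketched in a few lines of plain JavaScript. This is not Panini's code, only a stand-in that assumes front matter delimited by <span style="font-family: "courier new" , "courier" , monospace;">---</span> lines, reads only the <span style="font-family: "courier new" , "courier" , monospace;">layout</span> key, and treats a layout named "default" as the fallback. The layout names and markup are made up for the example.</div>

```javascript
// Not Panini itself: a minimal stand-in showing how a page's front matter
// selects a layout and how the page is injected at the {{> body}} marker.
// Layout names and markup are hypothetical.
const layouts = {
  default: '<main>{{> body}}</main>',
  landing: '<section class="hero">{{> body}}</section>',
};

function renderPage(source) {
  // Split optional front matter (between --- lines) from the page body.
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(source);
  const frontMatter = match ? match[1] : '';
  const body = match ? match[2] : source;

  // Read only the `layout: name` key; real front matter is full YAML.
  const layoutLine = /^layout:\s*(\S+)\s*$/m.exec(frontMatter);
  const layoutName = layoutLine ? layoutLine[1] : 'default';

  // Referencing a layout that does not exist is an error.
  if (!Object.prototype.hasOwnProperty.call(layouts, layoutName)) {
    throw new Error('Unknown layout: ' + layoutName);
  }
  // A replacer function avoids special handling of "$" in the body.
  return layouts[layoutName].replace('{{> body}}', () => body.trim());
}

const page = '---\nlayout: landing\n---\n<h1>Hello</h1>\n';
console.log(renderPage(page));
```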
<div>
<ul>
<li>Panini is based on handlebars.js, and Handlebars may have you covered if you need a little more than the Panini documentation mentions, as that documentation is not very comprehensive. Getting familiar with http://handlebarsjs.com/builtin_helpers.html can accelerate your work if you want to go to proof of concept or minimum viable product without investing in a framework.</li>
</ul>
<div>
<ul>
<li>Lorem Ipsum ... and http://placehold.it/ are your friends - avoid real data as long as possible, as it will compound the complexity of your project and slow you down.</li>
</ul>
</div>
<ul>
<li>Front matter can help you understand and prototype the <i>data</i> that must be merged with the static markup in order to make the site dynamic, i.e.</li>
<ul>
<li>FrontMatter dataLayer is faster than</li>
<li>Mocking data say using faker.js which is faster than</li>
<li>Building a backend with fake data which is faster than</li>
<li>Building the backend with real data</li>
</ul>
</ul>
</div>
<ul>
<li>Try to leverage partials</li>
<ol>
<li> by splitting the html into its smallest components. This makes sense if your next step is to convert to components.</li>
<li>If you are going to write tests you probably want to work on small isolated units first.</li>
</ol>
</ul>
<ul>
<li>You will probably want to use the CLI based setup as it provides a flexible project to work with and allows rapid introduction of "components". Using the framework is covered in some detail at <a href="http://foundation.zurb.com/sites/docs/starter-projects.html" rel="nofollow" target="_blank">this</a> page.</li>
</ul>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0tag:blogger.com,1999:blog-7341613201559046168.post-59526981388016068032016-08-08T22:00:00.000+03:002018-01-02T16:32:41.696+02:00When one size can't fit all - UI/UX breakpoints<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js?config=TeX-MML-AM_CHTML"></script>
<h3>
What are UI/UX breakpoints?</h3>
Breakpoints in user experience or user interface design are the widths at which a layout changes to accommodate different screen sizes, devices, and orientations. Breakpoints are used with column-based designs drawn from a grid system. In the real world, for example, the highly fragmented screen sizes of Android devices mean that a single design will struggle to conform to so many different sizes.<br />
<br />
Breakpoints essentially simplify the work by grouping many devices together. Within each bucket, a design will have to scale to adjust. But between breakpoints, there may be more radical changes in the interface's functionality.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://material-design.storage.googleapis.com/publish/material_v_8/material_ext_publish/0B8olV15J7abPSGFxemFiQVRtb1k/layout_adaptive_breakpoints_01.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="281" src="https://material-design.storage.googleapis.com/publish/material_v_8/material_ext_publish/0B8olV15J7abPSGFxemFiQVRtb1k/layout_adaptive_breakpoints_01.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Infographic for breakpoint from Google's <a href="https://material.google.com/layout/responsive-ui.html#" rel="nofollow" target="_blank">material design site</a></td></tr>
</tbody></table>
<br />
One best practice of design involves the use of grids. A notable reference is <a href="https://en.wikipedia.org/wiki/Josef_M%C3%BCller-Brockmann" rel="nofollow" target="_blank">Josef Müller-Brockmann's</a> book "<a href="https://www.amazon.com/Grid-Systems-Graphic-Design-Communication/dp/3721201450" rel="nofollow" target="_blank">Grid Systems in Graphic Design</a>". This has found its way into web design through libraries such as Bootstrap. However, grids are also somewhat implicit in the design of CSS, whose paddings and margins stack into grids.<br />
<br />
Here are some examples from the layout section of Google's "material design" guide, which is the best reference for breakpoints from a practical point of view.<br />
<ol>
<li>Adjust margins and gutters in multiples of 8dp to maintain a grid as the layout changes.</li>
<li>Reveal - show extra elements when space increases</li>
<ul>
<li>A side navigation is kept off screen for breakpoints that are too narrow for the content. But for wider layouts, the side nav is placed on screen permanently. This also works in master-detail views.</li>
<li>Cards that expand their content on request.</li>
</ul>
<li>Transform a simple element to a more complex one</li>
<ul>
<li>side navigation to page tabs </li>
<li>one-dimensional column list to a two-dimensional grid layout</li>
<li>side menus into icons in a toolbar</li>
</ul>
<li>Divide - a UI layered along the z axis can be divided into new space</li>
<ul>
<li>a side panel may split into left and right panels</li>
</ul>
<li>Reflow</li>
<ul>
<li>Centered grids can be reflowed to address changes in screen size.</li>
</ul>
<li>Expand - elements can expand</li>
<ul>
<li>A full-width grid-based design can expand to take more space</li>
<li>dialogs can expand proportionally or in specific increments.</li>
</ul>
<li>Position - elements may move to better positions as space expands</li>
<ul>
<li>The FAB (floating action button) may move to a more visible location.</li>
</ul>
</ol>
<div>
<a href="https://material.google.com/layout/responsive-ui.html#" rel="nofollow" target="_blank">Google's guide</a> recommends seven breakpoints at 480, 600, 840, 960, 1280, 1440, and 1600dp, which amounts to eight designs. However, it lists as many as twelve breakpoints. </div>
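<div>
To make the "seven breakpoints, eight designs" arithmetic concrete, here is a small sketch that maps a width in dp to one of the eight buckets. The function names are made up, and treating each breakpoint as the inclusive lower bound of its bucket is an assumption.</div>

```javascript
// The seven recommended breakpoints, in dp.
const BREAKPOINTS_DP = [480, 600, 840, 960, 1280, 1440, 1600];

// Index of the design bucket for a width: 0 (narrowest) to 7 (widest).
// Assumption: a width equal to a breakpoint belongs to the wider bucket.
function breakpointBucket(widthDp) {
  return BREAKPOINTS_DP.filter((bp) => widthDp >= bp).length;
}

// Human-readable range for the bucket, e.g. "600-839dp".
function bucketLabel(widthDp) {
  const i = breakpointBucket(widthDp);
  const lower = i === 0 ? 0 : BREAKPOINTS_DP[i - 1];
  if (i === BREAKPOINTS_DP.length) return lower + 'dp+';
  return lower + '-' + (BREAKPOINTS_DP[i] - 1) + 'dp';
}

console.log(bucketLabel(360));  // 0-479dp
console.log(bucketLabel(600));  // 600-839dp
console.log(bucketLabel(1920)); // 1600dp+
```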
<h3>
The analytics perspective - and media queries</h3>
<div>
When measuring how well a design converts visitors into customers, one best practice is to collect data using media queries at, say, the page level and report the prevalence of the different breakpoints, screen types, and orientations used to interact with the website or app.<br />
<br />
If a web property (website or app) has been designed professionally, the responsive layout will adapt to different screen sizes. The breakpoint, aspect ratio, and orientation provide more meaningful segments for capturing online behavior than screen resolution or device model, since they cluster users into the most meaningful groups. (It is still possible to drill down from a breakpoint segment and examine specific devices for issues.)</div>
<div>
<br /></div>
<div>
Breakpoint, resolution, and orientation can be captured via auto tracking and three custom dimensions.</div>
<div>
<br /></div>
<div>
These segments can be used to better understand the success and failure of processes that are inherently caused by poor or serendipitous design choices, and adjustments can be made to fix defects, to add missing breakpoints, or to improve the design incrementally.</div>
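<div>
A hedged sketch of how the breakpoint and orientation segments could be derived with media queries before being reported as custom dimensions. The function names are made up, CSS px is assumed to stand in for dp, and the <span style="font-family: Courier New, Courier, monospace;">matchMedia</span> function is passed in so the example also runs outside a browser; in a page you would pass <span style="font-family: Courier New, Courier, monospace;">window.matchMedia.bind(window)</span>. Sending the values to an analytics tool is left out.</div>

```javascript
// Breakpoints from the material design guide; CSS px used in place of dp
// (an assumption made for this sketch).
const WIDTH_BREAKPOINTS = [480, 600, 840, 960, 1280, 1440, 1600];

// Classify the current viewport using media queries.
function viewportSegments(matchMedia) {
  // Largest breakpoint whose min-width query matches, or 0 below the first.
  let breakpoint = 0;
  WIDTH_BREAKPOINTS.forEach((bp) => {
    if (matchMedia('(min-width: ' + bp + 'px)').matches) breakpoint = bp;
  });
  const orientation = matchMedia('(orientation: landscape)').matches
    ? 'landscape'
    : 'portrait';
  return { breakpoint: breakpoint, orientation: orientation };
}

// Stand-in for window.matchMedia simulating an 800px wide portrait viewport.
function fakeMatchMedia(query) {
  const minWidth = /min-width:\s*(\d+)px/.exec(query);
  if (minWidth) return { matches: 800 >= Number(minWidth[1]) };
  return { matches: false }; // the landscape query does not match
}

console.log(viewportSegments(fakeMatchMedia));
```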
<h3>
Units</h3>
<blockquote class="tr_bq" style="text-align: center;">
$$ px = \frac{dp \times dpi}{160} $$
or
$$ dp = \frac{px \times 160}{dpi} $$ </blockquote>
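<div>
The two conversions can be checked with a couple of one-line functions. The function names are made up for this example; dpi is the screen density of the device.</div>

```javascript
// px = dp * dpi / 160
function dpToPx(dp, dpi) {
  return (dp * dpi) / 160;
}

// dp = px * 160 / dpi
function pxToDp(px, dpi) {
  return (px * 160) / dpi;
}

console.log(dpToPx(8, 320));    // 16: an 8dp gutter on a 320 dpi screen
console.log(pxToDp(1080, 480)); // 360: a 1080px wide screen at 480 dpi
```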
<div>
<ul>
<li><b>%</b> - relative to the enclosing element (a CSS unit).</li>
<li><b>ch</b> - length relative to the width of the "0" (a CSS unit).</li>
<li><b>em</b> - length relative to the current font size (a CSS unit).</li>
<li><b>ex</b> - length relative to the x-height of the current font, i.e. the height of the lower case letter x (a CSS unit).</li>
<li><b>px</b> - length in pixels, which are 1/96 of an inch (a CSS unit).</li>
<li><b>in</b> - inches (a CSS and Android unit).</li>
<li><b>mm</b> - millimeters (a CSS and android unit).</li>
<li><b>pc</b> - pica is 12 points (a CSS unit) used in print.</li>
<li><b>pt</b> - points, 1/72 of an inch (a CSS and Android unit).</li>
<li><b>dpi</b> - dots per inch - the screen density of the device.</li>
<li><b>dp</b> or <b>dip</b> - density independent pixels (relative to a 160 dpi screen), bucketed into 120 (ldpi), 160 (mdpi), 240 (hdpi), 320 (xhdpi), 480 (xxhdpi), and 640 (xxxhdpi) (an Android unit).</li>
<li><b>sp</b> - scale independent pixels - like dp but also scaled according to the user's font size preference (an Android unit).</li>
<li><b>rem</b> - relative to the font size of the root element which ignores scalings of intermediate styles (a CSS unit).</li>
<li><b>vh</b> - 1% of the viewport height (a CSS unit).</li>
<li><b>vw</b> - 1% of the viewport width (a CSS unit).</li>
<li><b class="">vmin</b> - the smaller of 1 vh and 1 vw (a CSS unit).</li>
<li><b class="">vmax</b> - the larger of 1 vh and 1 vw (a CSS unit).</li>
</ul>
</div>
<div>
Of course, these were all initially supported by different versions of each brand of browser.</div>
<br />
<h3>
References</h3>
<ul>
<li><a href="http://www.creativebloq.com/web-design/grid-theory-41411345" rel="nofollow">The designer's guide to grid theory</a></li>
<li><a href="http://stackoverflow.com/questions/2025282/what-is-the-difference-between-px-dp-dip-and-sp-on-android" rel="nofollow">What is the difference between “px”, “dp”, “dip” and “sp” on Android?</a></li>
<li><a href="https://developer.android.com/guide/practices/screens_support.html" rel="nofollow">Supporting multiple screens</a> from Google developers.</li>
<li><a href="https://material.google.com/layout/responsive-ui.html" rel="nofollow" target="_blank">Responsive UI</a> on Google's material design site</li>
<li><a href="https://developer.android.com/guide/topics/resources/more-resources.html#Dimension" rel="nofollow" target="_blank">Dimensions in android</a> from Google developers.</li>
<li><a href="https://developers.google.com/web/fundamentals/design-and-ui/responsive/patterns/?hl=en" rel="nofollow" target="_blank">Patterns</a> for responsive layouts with CSS </li>
</ul>
Oren Bochmanhttp://www.blogger.com/profile/15491175578474499548noreply@blogger.com0