problib ui

Frontend component to problib app pipeline

Description

A lightweight, flexible web app and API built on top of problib to provide a user interface for exploring interesting concepts visually. This project includes both the client-side web components (e.g. D3 charts, WebSocket infrastructure, etc.) and the Flask API backend with its related functionality.

Philosophy

notes on the project philosophy, high level goals

  • Motivation dump: Speaks to the connectedness and exponential growth of systems; creating collections and repositories allows for this exponential growth over time. Not unlike the original ML track vision of writing about and implementing algorithms from the field; it’s perhaps just a little larger in scope, and more general. Consider prospects as an open source project. Also ties in well with recent visualization and video ideas (think Primer, Grant Sanderson, distill.pub, etc.). Think about the marriage between a tool like this and intuitive, clean, simple writings on the subjects. Also consider how Michael Nielsen’s recent work on spaced repetition systems could be incorporated; it might be worth thinking about all of these aspects and how they tie into each other as one.

Structure

  • Completely asynchronous with Eventlet and Flask SocketIO. Can serve multiple clients requesting the same/different endpoints at the same time.

details on the project’s structure, hard documentation, etc

  • Dynamic DOM elements
    • Base Chart
      • Histogram
      • Line chart
    • Text object :: stores all text data sent to it over time, shows the most recent string on the DOM, responsible for text updates as it receives new data (i.e. core data management and update features from chart)
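The element hierarchy above can be sketched as follows. This is only an illustration of the described behaviour, written in Python for consistency with the backend even though the real elements live in client-side JS. The names `DynamicElement`, `TextObject`, `update`, `render` and `displayed` are hypothetical and are not taken from the project.

```python
class DynamicElement:
    """Hypothetical parent for any element that accumulates streamed data."""

    def __init__(self):
        self.data = []  # every item received so far, oldest first

    def update(self, items):
        """Append a batch of new items, then refresh the display once."""
        self.data.extend(items)
        self.render()

    def render(self):
        raise NotImplementedError


class TextObject(DynamicElement):
    """Keeps the full text history but displays only the latest string."""

    def __init__(self):
        super().__init__()
        self.displayed = ""  # stands in for the DOM node's text content

    def render(self):
        if self.data:
            self.displayed = self.data[-1]


text = TextObject()
text.update(["gen 1: abc"])
text.update(["gen 2: abd", "gen 3: abe"])
print(text.displayed)  # most recent string
print(len(text.data))  # full history is retained
```

A chart would presumably subclass the same parent and redraw from `self.data` in its own `render`.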

Notes

  • Find a way to send out the same size packet for streaming; different distributions could otherwise send out different numbers of items
  • Might want to consider separating the API from the web interface itself; both could be their own Flask app, but would be nice to be separate
  • Decided that rehashing and generalizing the Typeload charts would be a nice way to get started on the visualization approach for the library. This can be developed and generalized alongside the development of this application for use as its own long-term, reusable, generalized library (or rather collection) of charts.
  • Going to need to reconsider the frequency at which the chart is drawn. My worries have been confirmed; it’s going to be just too computationally expensive to draw for every single sample and draw fast. We could do one of two things:
    1. Limit the frequency of redrawing. Implement this by either redrawing only after x new data points have been added (since last draw), or literally not allowing the function to be called before an x second “cooldown”.
    2. Pass large chunks of the data directly into the chart’s update system. This means that we update the chart’s data in bulk, as opposed to strictly one by one. This would mean we get more bang for our buck when we redraw the figure.
  • We also need to address how we’re streaming floats, or multi-digit integers. Right now our single-digit values are just being encoded as an 8-bit character and sent over that way.
  • Realized that some of the goals of this project mirror those of designing a game; there is a process running a simulation on the backend that is used to drive the graphics system. We need to figure out how to properly handle client connections to an indifferent backend system. Updates could then be pushed to the client as often as the client can handle them. This is somewhat like the streaming approach, where we have an open connection to send data to the client.
  • Decided to implement WebSockets fully. Turns out it’s pretty slow relative to just streaming bytes, at least for the basic distribution example. However it is delivering a much larger payload i.e. full JSON wrapper vs individual bytes (2.5s vs 0.5s for 50000 samples). Should work just fine for now. Also note that streaming sends multiple bytes in one chunk at times. We could allow this for WebSockets by sending multiple samples to gain more marginal value for the price of the JSON header. Spoiler: WS chunking is indeed quite a bit faster. Rivals HTTP streaming when timed from the server side. Looks like we are good to stick with this approach.
  • Thinking about how we’re going to deal with reading out state from an evolution simulation. Was thinking that it might make sense to run the simulation in another thread, and have workers peer in. However this doesn’t make much sense, but we could possibly write temp information to a database (e.g. something like Redis or Mongo since we’re dealing with JSON data). This sounds slightly annoying and complicated unfortunately. I’m also considering packing in a generator definition so I can sync perfectly with the client; I can pause and wait for the client to catch up before running the next step of the simulation. However I can just stream things continuously; I don’t need to wait for the client’s approval to continue.
    • Spent more time hashing out the WS approach. Turns out long-polling had been in use this whole time since I didn’t have eventlet installed. Installing it caused my streaming to fail; I had to wait until the generator was exhausted before Flask SocketIO would send any messages to the client. This is because the server was given no time at all to react and actually send the information. To fix this you must give it a chance to send by calling an async sleep method between emits.
  • Added ability for client to send simulation specific details to the alphanumeric evolutionary algorithm. This has obviously been a goal for some time, and adds the desired flexibility in allowing the client to build up the simulation as they please. Websockets appear to be working well at this point. It appears good performance can be obtained from properly chunking sent data, and applying bulk operations to visual updates where possible.
  • Next up: build up a structured template/style to at least start seeing results on the screen. A nice target might be showing the 5 most recent samples from the evolution example and a bar or percentage number indicating current progress.
  • Consider use of RequireJS or LABjs for further modularization of the JS code, with classes spread across different files
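The chunking and async-sleep notes above can be sketched as follows. This is a minimal sketch, not the project's actual code: the `emit` and `sleep` callables are injected so the example runs without Flask. With Flask SocketIO one would presumably pass `socketio.emit` and `socketio.sleep`. The event name `"samples"`, the payload shape and the chunk size are assumptions.

```python
import json
from itertools import islice


def chunked(samples, size):
    """Yield lists of up to `size` items from any iterable or generator."""
    iterator = iter(samples)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def stream(samples, emit, sleep, chunk_size=500):
    """Send samples in chunks, yielding control after every send.

    `emit(event, payload)` and `sleep(seconds)` are supplied by the caller
    (hypothetical wiring; e.g. socketio.emit / socketio.sleep).
    """
    sent = 0
    for chunk in chunked(samples, chunk_size):
        emit("samples", {"data": chunk})  # one JSON wrapper per chunk
        sent += len(chunk)
        sleep(0)  # cooperative yield so queued messages can actually go out
    return sent


# Stand-ins for the real transport, used only for this demonstration.
messages = []
sent = stream(
    range(1200),
    emit=lambda event, payload: messages.append(json.dumps(payload)),
    sleep=lambda seconds: None,
    chunk_size=500,
)
chunk_sizes = [len(json.loads(m)["data"]) for m in messages]
print(sent, chunk_sizes)
```

Sending one message per chunk spreads the cost of the JSON wrapper over many samples, and the `sleep(0)` after each emit is the point where a cooperative server gets its chance to flush.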

More:

  • If a local success, consider the idea of making the app public on one of our domains
  • small notes: consider generalized dynamic update object (ie parent of basechart, text)
  • dynamic refresh rate considerations
  • flexible in-page custom query
  • just making things more general, addressing current listener and event registration system
  • evolution line chart for fitness
  • Extend nn JS example from existing site to more vibrant, adaptive form that simply receives weights from py
  • can also create image visualization platform I had once dreamed about here, ie show MNIST during training or something
  • improve dynamic line chart, make sure conformant with api and how multiple lines are handled
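For the refresh-rate item above and the earlier redraw-frequency note, a throttle might look like the sketch below. It is written in Python for consistency with the backend, although the real check would run in the client-side chart code. The class `RedrawThrottle`, its parameters and the default values are hypothetical. It combines both proposed rules (a minimum number of new points and a cooldown); setting `min_points=1` or `cooldown=0` recovers either rule alone.

```python
import time


class RedrawThrottle:
    """Decide when a chart should redraw, given bulk data arrivals."""

    def __init__(self, min_points=100, cooldown=0.1, clock=time.monotonic):
        self.min_points = min_points  # new points required before a redraw
        self.cooldown = cooldown      # minimum seconds between redraws
        self.clock = clock            # injectable for testing
        self.pending = 0              # points added since the last redraw
        self.last_draw = None         # time of the last redraw, if any

    def add(self, n_points):
        """Record n new points; return True if the chart should redraw now."""
        self.pending += n_points
        now = self.clock()
        cooled = self.last_draw is None or now - self.last_draw >= self.cooldown
        if self.pending >= self.min_points and cooled:
            self.pending = 0
            self.last_draw = now
            return True
        return False


# Demonstration with a fake clock so the outcome does not depend on timing.
ticks = iter([0.0, 0.01, 0.02, 0.2])
throttle = RedrawThrottle(min_points=100, cooldown=0.1, clock=lambda: next(ticks))
results = [throttle.add(60) for _ in range(4)]
print(results)
```

Points that arrive while the throttle says no are not lost; they stay counted in `pending` and are drawn together on the next permitted redraw, which matches the bulk-update idea.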