06 May 2018

This is the next level of HTML and JavaScript, here to overthrow - I mean, complement - Excel.

Check out

with a few lines of javascript, turn the html output of a pandas dataframe (DataFrame.to_html()) into a pivot-table-like interactive table.
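here is a rough sketch of the pandas side of that, i.e. getting a dataframe out as an html table that a javascript table library could then pick up. the sample data, the "sales-table" id and the "display" class are all made up for illustration, and the javascript part is left as a comment since it depends on whichever library you use.

```python
import pandas as pd

# hypothetical sample data, only here for illustration
df = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "product": ["a", "b", "a", "b"],
    "sales": [10, 20, 30, 40],
})

# render the dataframe as an html <table>;
# the id and class are assumed names a javascript library could hook onto
html_table = df.to_html(index=False, table_id="sales-table", classes="display")

# wrap the table in a bare-bones page
page = f"""<!DOCTYPE html>
<html>
<body>
{html_table}
<!-- load your javascript table library here and point it at #sales-table -->
</body>
</html>"""

with open("table.html", "w") as f:
    f.write(page)
```

open table.html in a browser and you get a plain static table. the interactivity only shows up once the javascript library is attached.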

insane or what???

check out their blog for more interesting stuff. they are insane!

in other news, i've begun thinking back to the fundamentals of data.

it ain’t all about fancy algorithms and stuff. you’ve got to control your data: know who it is going to and the repercussions of delivering less-than-perfect data, and from there, think back to where the data is coming from.

how will my end users know that the data is right? how do i know my data is right? how do my data providers know that their data is right?

especially so in a large organisation, things get harder to change. is that right? because of the number of stakeholders to deal with? that means more eyes to check though. and each person only has to do a little…?

it will take time. but we need to do it.

data scientists need to think horizontally, and have full ownership of the data.

from data input (databases, APIs, or scraping it yourself?) to data storage (warehousing - Kimball? cloud on AWS?), data security (firewalls, permissions), ETL (using Python or R or Scala?), and finally data analysis/predictions (machine learning?) and visualisation (Python, or, gasp, Tableau???)

there is so much to learn!!

did i mention that the whole process still applies to image or text data, but requires very different methods (as opposed to numerical data)?

at the end of the day, users just want clean, timely and accessible data.

clean: complete and correct data

timely: data still relevant and appropriate for decision making

accessible: easy to obtain, read and interpret.
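to make "clean" and "timely" a bit more concrete, here is a minimal sketch of the kind of check you could run before handing data over. the function name quality_report, the column names and the 7-day freshness threshold are all assumptions for illustration, and it only covers completeness, duplicates and freshness. whether the values are actually correct still needs someone who knows the data.

```python
import pandas as pd

def quality_report(df, timestamp_col, max_age, now):
    """Summarise a few basic 'clean' and 'timely' checks for a dataframe."""
    # clean (complete): count missing cells across the whole table
    missing = int(df.isna().sum().sum())
    # clean (correct): fully duplicated rows are a common warning sign
    duplicates = int(df.duplicated().sum())
    # timely: how old is the most recent record?
    age = now - df[timestamp_col].max()
    return {
        "missing_values": missing,
        "duplicate_rows": duplicates,
        "is_timely": bool(age <= max_age),
    }

# hypothetical sample data
df = pd.DataFrame({
    "customer": ["a", "b", "b", "c"],
    "amount": [10.0, 20.0, 20.0, None],
    "updated_at": pd.to_datetime(
        ["2018-05-01", "2018-05-02", "2018-05-02", "2018-05-03"]
    ),
})

report = quality_report(
    df,
    timestamp_col="updated_at",
    max_age=pd.Timedelta(days=7),   # assumed freshness threshold
    now=pd.Timestamp("2018-05-06"),
)
print(report)
```

a report like this can be shared with both the data providers and the end users, so everyone is looking at the same numbers when asking "is the data right?".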