Data and Data Visualization


Data model options have evolved far beyond a world where, to paraphrase Henry Ford, “A customer can have their data painted any color they want, so long as it is Relational.” How to persist data is one of the major challenges in developing fault-tolerant, geographically distributed, secure, regulatory-compliant, evolutionary, web-scale applications. At ioet, our methodology for data is to work toward a solution from two perspectives. First, using domain-driven design, we discover the shape of the data, the number of distinct pools of data, and their important characteristics in terms of privacy and security engineering, data types, scale, and velocity, along with architectural patterns such as events, streams, immutability, transactions, and eventual consistency; all these factors and more inform the design space. Second, we consider DevOps factors such as the use of the many cloud-based and serverless solutions, replication for availability as well as scale, security measures such as encryption at rest and in transit, authorization, observability, regulatory compliance, and extraction of anonymized data to drive automated testing; all these factors and more inform the long-term operational plan and affect the evolvability of the entire application.

We have experience with numerous serverless data options such as AWS S3, Aurora, DynamoDB, QLDB, Timestream, MSK (Kafka), Redis, Redshift, and Glacier, as well as other NoSQL platforms such as Neo4j, MongoDB/Atlas, Snowflake, Azure Cosmos, and Google Firestore, Bigtable, and BigQuery.

Serverless cloud-scale persistence solutions are usually the best and least expensive option, but there are applications where an embedded database is suitable. We have experience with all the usual options such as Postgres, MariaDB, and MySQL, as well as conventional enterprise systems such as Oracle and SQL Server.

Data Visualization

Visualization is used to communicate information clearly and efficiently, and to abstract and reduce large and complex data into graphs and synthetic images that make it easier for users to spot patterns and anomalies. High-quality visualizations are common in "dashboards," which are replacing "reports" as a tool for observing complex systems. We have experience developing visualizations based on very large data sets using common tools and frameworks such as Apollo, GraphQL, Cytoscape, D3, C3, vis.js, Jupyter, MATLAB, matplotlib, NumPy, Pandas, OpenGL, Plotly, R, Three.js, DHTMLX, Highcharts, FusionCharts, echarts, and others.