Kondal Ajjarapu
3 min read · Feb 28, 2021


Be in the driver’s seat to drive your organization’s Data Architecture Plan!

Time to shift your org into top gear? Now is the time!

If there is one thing that the pandemic has taught us, it is to stay ahead of the curve. The following is a simple manual for driving your organization to do the same in this post-pandemic era:

1st Gear — Move your remaining on-premises systems to cloud-based platforms: Not to restate the obvious, but the cloud provides a secure and cost-effective platform to design, develop, deploy, and run your data infrastructure, platforms, and applications at scale.

Though the task is monumental, this is the year to educate, influence, and develop an actionable, time-bound Data Migration Plan for moving the remaining legacy systems into the cloud. It is now or never!

Enabler 1: Serverless data platforms like Snowflake, Google BigQuery, and now even Amazon S3 let you store, build, and operate data-centric applications at elastic scale, within time and budget.

Enabler 2: Containerized data solutions, such as Docker containers orchestrated by Kubernetes, enable your org to package functionality and automate deployment while still retaining data.

2nd Gear — From batch to real-time data processing: As an obvious follow-up to the previous gear, in the age of real-time data messaging and streaming capabilities, it makes no sense to carry on with batch processing any longer.

Enabler 1: Messaging platforms like Apache Kafka provide scalable, durable, and fault-tolerant pub/sub mechanisms that process millions of messages and support real-time use cases with a much lighter footprint.
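
To make the pub/sub idea concrete, here is a minimal in-memory sketch in Python. It is not Kafka's API: the `InMemoryBroker` class, the topic name, and the group names are all hypothetical, and the sketch leaves out durability, partitioning, and fault tolerance. It only illustrates the core pattern of an append-only log per topic, where each consumer group reads at its own pace by tracking its own offset.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy append-only log per topic; each consumer group tracks its own offset.

    Hypothetical illustration only; this is not how Kafka is implemented.
    """

    def __init__(self):
        self._logs = defaultdict(list)    # topic -> ordered list of messages
        self._offsets = defaultdict(int)  # (topic, group) -> next offset to read

    def publish(self, topic, message):
        """Append a message to the topic log and return its offset."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, topic, group, max_messages=10):
        """Return unread messages for this group and advance its offset."""
        start = self._offsets[(topic, group)]
        batch = self._logs[topic][start:start + max_messages]
        self._offsets[(topic, group)] = start + len(batch)
        return batch


broker = InMemoryBroker()
broker.publish("orders", {"order_id": 1, "amount": 40})
broker.publish("orders", {"order_id": 2, "amount": 15})

# Two independent consumer groups each receive every message.
billing = broker.poll("orders", group="billing")
analytics = broker.poll("orders", group="analytics")
```

Because the publisher never needs to know who is reading, a new consumer group can be added later without touching the producing application.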

Enabler 2: Stream processing and analytics solutions like Kafka Streams, Apache Flume, Apache Storm, and Apache Spark Streaming allow direct analysis of messages in real time to extract events or signals.
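
As a rough illustration of what "direct analysis of messages" can look like, the sketch below counts event types in fixed (tumbling) time windows as events arrive, instead of waiting for a nightly batch. It is plain Python rather than any of the engines named above; the function name, the event tuples, and the 10-second window are assumptions made for the example, and it assumes events arrive in timestamp order.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, Counter of event types) as each window closes.

    Assumes events are (timestamp_seconds, event_type) tuples ordered by time.
    """
    current_start = None
    counts = Counter()
    for timestamp, event_type in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The event belongs to a later window, so emit the finished one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[event_type] += 1
    if current_start is not None:
        yield current_start, counts  # flush the final, still-open window


events = [(0, "click"), (3, "click"), (7, "purchase"), (12, "click"), (25, "purchase")]
results = list(tumbling_window_counts(events, window_seconds=10))
```

A real stream processor adds what this sketch omits, such as handling late or out-of-order events and recovering state after a failure.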

3rd Gear — From pre-integrated commercial solutions to modular apps: It sounds like a cliché, but take the time to break down the monoliths into build-as-you-go microservices, each hosting a specific piece of business logic for your end user, for example GetCustomer, UpdateOrder, or PostPayment.

Enabler 1: Snowflake’s Data Pipelines or MuleSoft’s API platform provide simple, extensible API-based interfaces that not only extract but also transform and enrich your data, while reducing complexity when you re-engineer.

Enabler 2: Analytics workbenches such as Amazon SageMaker and Workday Adaptive simplify building end-to-end solutions in a highly modular architecture and also provide tools to connect with a variety of underlying databases.

4th Gear — From point-to-point (P2P) to decoupled data access: One burgeoning concept hitting the PoC dataways all across town is the Data Marketplace. Though not a new concept, it has become relatively easy to implement: you grant secure, real-time, read-only access to your data assets rather than relying on expensive third-party proprietary interfaces.

Enabler 1: API gateways like Google’s Apigee API Management or Amazon API Gateway allow your developers to search for existing data interfaces and reuse them rather than build new ones.

Enabler 2: Azure’s Data Catalog and Snowflake’s Data Marketplace serve as a system of registration and a system of discovery that enables your users to discover, understand, and consume data sources securely and in real time.

5th Gear — From rigid data models to flexible, extensible data schemas: Get on the DevOps bandwagon of schema-light approaches, such as denormalized data models (which have fewer physical tables) and schema-on-read, to allow greater flexibility in storing structured and unstructured data.

Enabler 1: Technology services like Azure Synapse Analytics and Snowflake’s external tables allow querying file-based data much like a relational database by dynamically applying table structures onto the files. The resulting views give your users the flexibility to query data stored in files or in cloud storage such as Azure Blob Storage or Amazon S3.
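
The schema-on-read idea can be sketched in a few lines of Python. This is not the Synapse or Snowflake mechanism, just an illustration of the principle: the file stays as raw text, and the reader applies a table structure only at query time. The file contents, `ORDER_SCHEMA`, and `read_with_schema` are all hypothetical names invented for the example.

```python
import csv
import io
from datetime import date

# Raw file content is stored untyped; in practice it would sit in object storage.
RAW_FILE = """order_id,customer,amount,order_date
1,acme,40.50,2021-02-01
2,globex,15.00,2021-02-03
"""

# The schema lives with the reader, not the file: column name -> converter.
ORDER_SCHEMA = {
    "order_id": int,
    "customer": str,
    "amount": float,
    "order_date": date.fromisoformat,
}


def read_with_schema(raw_text, schema):
    """Parse raw CSV text and apply the schema to each row at read time."""
    reader = csv.DictReader(io.StringIO(raw_text))
    for row in reader:
        yield {column: convert(row[column]) for column, convert in schema.items()}


orders = list(read_with_schema(RAW_FILE, ORDER_SCHEMA))
total = sum(order["amount"] for order in orders)

# A different consumer can project the same file through a narrower schema.
customers = list(read_with_schema(RAW_FILE, {"customer": str}))
```

Since the stored file never changes, adding a column or a new view of the data means writing a new schema for readers rather than migrating a table.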

Enabler 2: Okta uses JWTs (JSON Web Tokens) to represent user data in JSON (JavaScript Object Notation) format in a secure manner. Similarly, leverage JSON to store your org’s information, which in turn will enable you to change database structures without having to change the business information models.

So starting tomorrow, establish a data tribe within your organization, with squads of data stewards, data engineers, and data modelers, to put in place a standard, repeatable, ready-to-use data-centric framework using the Enablers mentioned above and drive your organization forward into the data world.


Kondal Ajjarapu

Snowflake Advanced Architect and Snowflake Data Champion with proven experience in data migration, data governance, and data modernization. Ready to roll!