We have all made some of the statements below about our own decisions or someone else’s, after the outcomes were known:
“I told you so”
“I knew it”
“I should have known this”
“How dumb could I be to not know this”
“I didn’t see this coming”
An important aspect of…
Good decisions will always result in good outcomes, and I was wrong.
First, I long believed that good decisions will result in good outcomes, and I was wrong.
Second, I believed it was simple to decide, provided we had the necessary facts at a decent confidence level. I didn’t…
Once in a while, database systems go through a phase of bundling and unbundling, enabling a new set of use cases and value addition. …
1) Separation of storage and compute: there is no going back. The tight coupling between storage and compute for analytics data warehouses is gone for good. Bring your own storage (BYOS) and bring your own compute (BYOC) are the norm.
2) Distributed programming in the programmer’s hands, using the simple map -> shuffle -> reduce pattern.
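The map -> shuffle -> reduce pattern from point 2 can be sketched on a single machine in plain Python. This is only an illustration of the three phases, not how any particular engine implements them; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for this sketch.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share a key.

    On a cluster this is the step that moves data across the network so
    that every value for a given key ends up at the same reducer.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases: map -> shuffle -> reduce."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


# Hypothetical sample input, for illustration only.
counts = word_count(["lorem ipsum dolor", "ipsum lorem ipsum"])
print(counts)
```

In a distributed engine the map and reduce functions run in parallel on many machines, and the shuffle is typically the expensive part because it is where data has to move between them.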
Top 15 learnings from interviewing hundreds of big data professionals for service and consulting organizations for 10 years and having attended a few.
There are more than 100 big data open-source projects; unfortunately, you can’t avoid adding as many projects as possible…
Digitizing and collecting more data points at every interaction point your firm has with external entities is the crucial first step in providing amazing digital experiences and achieving a successful digital transformation.
There is no doubt that poor-quality data will have an impact on business outcomes. “Getting in front on data quality presents a terrific opportunity to improve business performance”, writes Thomas C. Redman in the article “Seizing Opportunity in Data Quality”, published in MIT Sloan Management Review.
The cost of…
The word count example below illustrates the importance of caching the RDD when the RDD lineage breaks/branches out.
Case 1: Reads the input file twice
The loading of the file for loremCountCase1 and ipsumCountCase1 operations can be verified in the log. …
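Why a branching lineage triggers a second read can be illustrated with a toy model in plain Python. This is not Spark's API: `ToyDataset`, `load_words`, and the sample words are hypothetical stand-ins invented for this sketch, and the variable names only echo `loremCountCase1` and `ipsumCountCase1` from the text. The sketch assumes the simplest possible behaviour: without caching, every action recomputes the dataset from its source.

```python
class ToyDataset:
    """A toy stand-in for a lazily evaluated dataset (NOT Spark's API)."""

    def __init__(self, compute):
        self._compute = compute    # function that produces the rows on demand
        self._use_cache = False    # set by cache()
        self._cached = None        # filled in by the first action after cache()

    def cache(self):
        """Mark the dataset so its rows are kept after the first computation."""
        self._use_cache = True
        return self

    def _rows(self):
        if not self._use_cache:
            return list(self._compute())       # recompute from the source
        if self._cached is None:
            self._cached = list(self._compute())
        return self._cached

    def filter(self, predicate):
        """Lazy transformation: nothing runs until an action is called."""
        return ToyDataset(lambda: (r for r in self._rows() if predicate(r)))

    def count(self):
        """Action: forces the lineage to be evaluated."""
        return len(self._rows())


reads = {"n": 0}  # how many times the "file" was loaded


def load_words():
    """Hypothetical input source; increments the read counter on each load."""
    reads["n"] += 1
    return ["lorem", "ipsum", "lorem", "dolor"]


# Case 1: two actions branch off the same uncached dataset.
words = ToyDataset(load_words)
lorem_count_case1 = words.filter(lambda w: w == "lorem").count()
ipsum_count_case1 = words.filter(lambda w: w == "ipsum").count()
reads_case1 = reads["n"]

# Cached variant: the same two actions after cache().
reads["n"] = 0
cached_words = ToyDataset(load_words).cache()
lorem_count_cached = cached_words.filter(lambda w: w == "lorem").count()
ipsum_count_cached = cached_words.filter(lambda w: w == "ipsum").count()
reads_cached = reads["n"]

print(reads_case1, reads_cached)
```

In this toy model the uncached branch loads the input once per action, while the cached variant loads it once in total. Real engines add storage levels, partitions, and eviction on top of this idea.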
How do you explain Spark distributed computing to a 7-year-old kid, a 9th-grade student, a software engineer (Java), an ETL engineer, a machine learning engineer, and an executive?
Me: Do you have domino blocks?
7 Year Old: Yes, many
Me: Do you have different colors?
7 Year Old: Yes, Red, Blue…