Hello Folks!
Recently I got the chance to attend an online talk titled “Understanding & Achieving AI Readiness”. It was an intriguing session, and in today’s My Notes 📝 I will share my learnings from it.
& If it’s your first time here, TheWeekendFreelancer currently has 5 ongoing series - Tools 🛠️, Maths 📈, Domain 🌐, Trends 📻 & My Notes 📝. Have fun reading!
Sharad Varshney (CEO, OvalEdge) and Saurabh Agarwal (an ex-McKinsey consultant) commence the session. Saurabh starts by saying that lately he’s been getting a lot of queries from his clients about AI, and not only from the tech folks but also from the management teams.
Saurabh starts by asking Sharad - what do we really mean by AI today? There are so many buzzwords like GenAI, ChatGPT, Robotics, ML, etc.
Sharad says he remembers studying Neural Networks almost 25 years ago; the difference today is the availability of processing power. With that, he thinks AI will strongly impact certain industries, such as the legal industry, where the core knowledge - the law - is written down in books. You might still need lawyers, but a lot of back-room work, like contract reading, can be done by AI. On the other hand, an industry like Utilities might not see quick disruption from AI.
Saurabh asks whether AI will have an important role to play in Strategic areas of businesses?
Sharad thinks that where the cost of decision making is low but the impact of those decisions is high, corporations might not take chances with AI and will still prefer humans. A major chunk of the impact will be seen in tasks that are very repetitive, like driving - e.g. Tesla going driverless.
The gentlemen move towards the topic of discussion by citing cases where AI discriminated while giving loans to customers. This is where compliance and AI readiness will play a crucial role.
Moving to the crux of the discussion Saurabh starts by asking How should companies hire resources for AI readiness?
Sharad answers by saying that people can be hired and data can be purchased, but data is what you will build your AI on! The order of importance should always be data first, followed by people, and finally the infrastructure.
Saurabh asks Sharad to be precise about what he means by the high-quality data that is a prerequisite for AI readiness.
Sharad illustrates with the scenario of retail companies, which often face this issue.
Retail companies often struggle with maintaining high-quality data about their customers. This can severely impact their marketing activities, predictive analysis, and overall business strategies. Here’s how:
Customer Data Issues
Incomplete Data: Retail companies might have incomplete customer profiles, missing critical information like contact details, preferences, or purchase history. This hampers personalized marketing efforts.
Duplicate Records: Multiple entries for the same customer can lead to redundant communications and incorrect customer insights, affecting the accuracy of marketing campaigns.
Outdated Information: Customer data that is not regularly updated becomes stale, leading to ineffective marketing strategies. For example, promotions sent to outdated addresses or targeting preferences that are no longer relevant.
Inconsistent Data: Variations in data formats and standards across different systems within the company can result in inconsistent customer information, complicating data analysis and integration.
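The four issues above lend themselves to simple programmatic checks. Here is a minimal sketch in Python - the record fields (`email`, `phone`, `last_update`) and thresholds are my own assumptions for illustration, not anything from the talk:

```python
from datetime import date

# Hypothetical customer records; field names are assumptions for illustration.
customers = [
    {"id": 1, "email": "a@example.com", "phone": "555-0101", "last_update": date(2024, 3, 1)},
    {"id": 2, "email": "a@example.com", "phone": "555-0101", "last_update": date(2024, 3, 1)},  # duplicate of 1
    {"id": 3, "email": None, "phone": "555-0102", "last_update": date(2020, 1, 15)},  # incomplete + stale
]

REQUIRED = ("email", "phone")
STALE_AFTER_DAYS = 365

def incomplete(recs):
    """Records missing any required field."""
    return [r for r in recs if any(not r.get(f) for f in REQUIRED)]

def duplicates(recs):
    """Records sharing the same (email, phone) pair with an earlier record."""
    seen, dupes = set(), []
    for r in recs:
        key = (r.get("email"), r.get("phone"))
        if key in seen:
            dupes.append(r)
        else:
            seen.add(key)
    return dupes

def outdated(recs, today=date(2025, 1, 1)):
    """Records not updated within the staleness window."""
    return [r for r in recs if (today - r["last_update"]).days > STALE_AFTER_DAYS]

print([r["id"] for r in incomplete(customers)])  # [3]
print([r["id"] for r in duplicates(customers)])  # [2]
print([r["id"] for r in outdated(customers)])    # [3]
```

Real deduplication is fuzzier than an exact-match key (typos, name variants), but even checks this simple surface the categories of problems described above.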
Impact on Marketing Activities
Targeted Campaigns: Poor data quality means retail companies cannot accurately segment their customer base. This leads to ineffective targeting in marketing campaigns, resulting in lower engagement and conversion rates.
Personalization: Personalized marketing relies on accurate customer data. Without high-quality data, personalization efforts may fall flat, leading to generic and less compelling marketing messages.
Customer Retention: Understanding customer behavior and preferences is crucial for retention strategies. Inaccurate data can lead to misunderstandings of customer needs and potential churn.
Predictive Analysis: Retailers use AI for predictive analytics to forecast sales trends, customer lifetime value, and inventory requirements. Poor data quality leads to flawed predictions, resulting in overstocking or stockouts, and misguided strategic decisions.
Example Scenarios
Email Marketing Campaigns: A retail company sends out a promotional email based on an outdated customer list. Many emails bounce back, others reach customers who no longer find the products relevant, leading to a poor return on investment for the campaign.
Customer Segmentation: Due to incomplete data, a retailer misclassifies customers, placing high-value customers in a low-value segment. This results in inadequate attention to high-value customers and lost sales opportunities.
Predictive Inventory Management: AI models predicting inventory needs based on incorrect sales data might suggest overstocking products that are not in demand, leading to increased holding costs and potential losses from unsold goods.
Next question - what will really change with AI governance? What’s the delta in comparison to before?
Implementing strong data governance policies and platforms can bring about significant changes, ensuring that data is high quality, consistent, and compliant with regulations. Here are the key changes that effective data governance can bring:
Improved Data Quality: Data governance ensures that data is accurate, complete, and reliable. This leads to better decision-making and more effective AI and analytics initiatives.
Data Consistency: Centralized data governance ensures that data standards and definitions are consistent across the organization, reducing discrepancies and enhancing data integration efforts.
Enhanced Data Security and Compliance: Proper data governance ensures that data is handled in compliance with legal and regulatory requirements, protecting against data breaches and legal penalties.
Increased Efficiency: By standardizing processes and reducing data redundancy, data governance can streamline data management, reducing costs and increasing operational efficiency.
Better Decision-Making: High-quality, well-governed data provides a solid foundation for business intelligence and analytics, leading to more informed and effective decision-making.
Examples of Central Governance Platforms
Example 1: Master Data Management (MDM)
Master Data Management platforms ensure that an organization's critical data, such as customer information, is consistent, accurate, and controlled. With MDM, a central repository stores master data, and data governance policies enforce rules for data entry, updates, and usage. This ensures that customer data across various departments (sales, marketing, customer service) is synchronized and up-to-date.
Example 2: Data Quality Tools
Data quality tools integrate with data governance frameworks to continuously monitor and cleanse data. These tools use algorithms to detect and correct inaccuracies, fill in missing information, and eliminate duplicates.
Example 3: Data Catalogs
Data catalogs act as centralized repositories for metadata management, helping organizations understand and manage their data assets. They provide information on data lineage, data owners, and data quality metrics.
Example 4: Centralized Data Governance Platforms
Platforms like Collibra or Informatica provide a comprehensive suite of tools for data governance, including policy management, data quality assessment, and regulatory compliance. These platforms enable organizations to create, enforce, and monitor data governance policies across all data sources.
Up next, Saurabh asks about the AI readiness of unstructured data - how can it be achieved? A summary of Sharad’s reply would be:
1. Data Collection and Preparation
What should go into AI:
Relevant Data: Only data pertinent to the problem or task at hand should be included. This ensures the AI model receives the necessary information without being overwhelmed by irrelevant noise.
Diverse Data: To avoid biases, the data should be diverse and representative of the various scenarios the AI model will encounter.
Annotated Data: Properly labeled and annotated data helps in supervised learning tasks where the AI learns from examples.
What shouldn't go into AI:
Irrelevant Data: Data that doesn’t contribute to solving the problem should be excluded.
Redundant Data: Duplicate or overly similar data points that do not add value should be filtered out to avoid overfitting.
Sensitive Information: Personally identifiable information (PII) and other sensitive data should be excluded or anonymized to maintain privacy and compliance with regulations.
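To make the PII point concrete, here is a minimal anonymization sketch. The field names and the salted-hash approach are my own assumptions; real pipelines need proper key management and may use tokenization or redaction instead:

```python
import hashlib

# Which fields count as PII is an assumption for this sketch.
PII_FIELDS = {"name", "email", "ssn"}

def anonymize(record):
    """Replace PII values with a truncated salted one-way hash; keep other fields intact."""
    salt = "fixed-demo-salt"  # in practice, manage salts/keys securely
    out = {}
    for k, v in record.items():
        if k in PII_FIELDS and v is not None:
            out[k] = hashlib.sha256((salt + str(v)).encode()).hexdigest()[:12]
        else:
            out[k] = v
    return out

row = {"name": "Jane Doe", "email": "jane@example.com", "age": 34}
clean = anonymize(row)
# PII fields are now opaque tokens, while non-PII fields (age) pass through unchanged.
```

A consistent hash (rather than random redaction) keeps the same person mapping to the same token, so joins across tables still work without exposing the raw identity.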
2. Data Cleaning and Preprocessing
Proper Framework for Data Quality:
Standardization: Ensure that data formats are consistent. This includes date formats, numerical scales, and categorical labels.
Missing Values Handling: Address missing data points appropriately, either by imputation, removal, or flagging them as unknown.
Noise Reduction: Remove or correct errors, outliers, and irrelevant information from the dataset.
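The three preprocessing steps above can be sketched in a few lines of Python. The accepted date formats and the clipping range are illustrative assumptions, not part of Sharad’s framework:

```python
from datetime import datetime
from statistics import median

def standardize_date(s):
    """Try a few assumed input formats and normalize to ISO 8601."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y"):
        try:
            return datetime.strptime(s, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # flag as unknown rather than guess

def impute_missing(values):
    """Fill None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, lo, hi):
    """Simple noise reduction: clamp values to a plausible range."""
    return [min(max(v, lo), hi) for v in values]

print(standardize_date("31/12/2023"))        # 2023-12-31
print(impute_missing([10, None, 30]))        # [10, 20.0, 30]
print(clip_outliers([5, 999, -3], 0, 100))   # [5, 100, 0]
```

Note the design choice in `standardize_date`: a value that matches no known format returns `None` (flagged as unknown) instead of being silently coerced, which matches the "flagging them as unknown" option above.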
3. Addressing Biases
Training Data Shouldn’t Have Biases:
Bias Identification: Regularly audit the data to identify and understand potential biases.
Bias Mitigation: Implement strategies to minimize biases, such as re-sampling, re-weighting, or using bias correction algorithms.
Diverse Training Set: Ensure that the training data is balanced and represents all groups fairly to avoid skewed outcomes.
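Of the mitigation strategies listed, re-weighting is easy to show in miniature. This sketch uses inverse-frequency weights (one common scheme among several) on a hypothetical skewed label set:

```python
from collections import Counter

def balance_weights(labels):
    """Inverse-frequency re-weighting: rarer groups get larger weights,
    so each group contributes equally in aggregate."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {g: n / (k * c) for g, c in counts.items()}

# Hypothetical skewed training set: group A dominates 8-to-2.
labels = ["A"] * 8 + ["B"] * 2
w = balance_weights(labels)
print(w)  # {'A': 0.625, 'B': 2.5}
# Weighted totals per group are now equal: 8 * 0.625 == 2 * 2.5 == 5.0
```

Re-sampling achieves the same balance by changing which rows the model sees; re-weighting leaves the data alone and changes how much each row counts in the loss.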
4. Data Classification and Annotation
Proper Classification and Annotation:
Consistent Labeling: Use a standardized process for labeling data to maintain consistency across the dataset.
Quality Control: Implement checks to ensure that annotations are accurate and reliable. This might involve multiple annotators and consensus mechanisms.
Contextual Information: Provide sufficient context for annotations to avoid misclassification and improve the AI’s understanding of the data.
5. Validation and Verification
Ensuring Data Quality:
Validation Sets: Use separate validation sets to test the model’s performance and ensure it generalizes well to new, unseen data.
Cross-Validation: Implement cross-validation techniques to evaluate the model’s robustness and reduce overfitting.
Human Review: Periodically review the AI’s outputs to catch errors and refine the model with human insights.
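The cross-validation idea above can be sketched without any ML library - just the index bookkeeping that splits a dataset into k folds, each serving once as the held-out validation set:

```python
def kfold_indices(n, k):
    """Yield (train, validation) index lists for k-fold cross-validation.
    The last fold absorbs any remainder when n is not divisible by k."""
    fold = n // k
    for i in range(k):
        val = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        val_set = set(val)
        train = [j for j in range(n) if j not in val_set]
        yield train, val

folds = list(kfold_indices(10, 5))
# Every sample appears in exactly one validation fold, and never in its own training set.
assert sorted(i for _, val in folds for i in val) == list(range(10))
```

In practice you would evaluate the model once per fold and average the k scores; a large gap between training and validation scores is the overfitting signal this technique exists to catch.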
This is where the session ends.
I personally gained a lot from this webinar/podcast. What about you?
See you next time,
Raghunandan 🎯
P.S. - “The Weekend Freelancer” is a reader-backed publication. Share this newsletter with your friends and relatives, & consider becoming a free or paid member. Every subscription gives me an extra ounce of motivation to keep going 💪!
You can also support my work here. Cheers!