Prasad Khode – Medium

Understanding SQL Self Joins with Scenarios

Self joins are an often-overlooked aspect of SQL, yet they are incredibly useful for querying hierarchical or related data within a…

Oct 10, 2024

Grouping a Spark DataFrame and Creating JSON Lists in Scala

In this post, we’ll explore how to group a Spark DataFrame by a specific column and create a list of JSON objects from other columns. This…

Oct 9, 2024

Simplifying Dynamic Partition Overwrite in Spark: A Guide to PartitionOverwriteMode

When you’re dealing with large amounts of data in Apache Spark, managing your data efficiently becomes important. One way to do this is by…

Oct 9, 2024

How to Run Ollama Locally Using Docker

Running AI models locally can be a great way to leverage the power of machine learning without relying on cloud services. In this guide, I…

Sep 2, 2024

Setting Up Apache Airflow for Local Development

In this guide, we’ll walk through setting up Apache Airflow on a local machine using Conda to manage the Python environment.

Aug 29, 2024

Handling Dynamic JSON Schemas in Apache Spark: A Step-by-Step Guide Using Scala

In the world of big data, working with JSON data is a common task. However, handling JSON schemas that may vary or are not predefined can…

Aug 21, 2024

How to Retrieve the Input File Name as a Column Value in Apache Spark

When working with large datasets in Apache Spark, there are scenarios where you might need to identify the origin of each row of data —…

Aug 16, 2024

Adding External and Maven JARs to Spark Shell for Ad-Hoc Analysis

When performing ad-hoc data analysis using Apache Spark, you may encounter situations where you need additional libraries to process your…

Aug 13, 2024

Handling Invalid Column Names in Spark: A Step-by-Step Guide

In data processing, it’s common to encounter files where the first line contains invalid or dummy column names, which can disrupt the…

Aug 12, 2024
