Skip to main content

Finding K-Sum Subarrays: A Journey from Confusion to Clarity

· 17 min read
Mahmut Salman
Software Developer

A conversation about solving the Subarray Sum Equals K problem that reveals how cumulative sums and hashmaps create an elegant O(n) solution—and why we need that mysterious map.put(0, 1) initialization.


The Problem

Teacher: Let's tackle this classic problem together. Given an array of integers and a target sum k, find the number of contiguous subarrays whose sum equals k.

Example:

Array: [1, 2, -1, 1, 2]
k = 3

Answer: 3

The three subarrays:
- [1, 2] (indices 0-1)
- [2, -1, 1, 2] (indices 1-4)
- [-1, 1, 2] (indices 2-4)

Question 1: The Cumulative Sum Insight

Teacher: Imagine you're at index 4 in the array, and you've been keeping track of cumulative sums as you go from left to right. If the cumulative sum up to index 4 is 5, and you want to find if there's a subarray ending at index 4 that sums to k=3, what would the cumulative sum need to be at some earlier index?

Think about this:

If sum[0 to 4] = 5
And sum[0 to j] = ?
Then sum[j+1 to 4] = 3

What should that "?" be?

Me: Hmm... if we're looking for a subarray ending at index 4, and since our cumulative sum is 5 at index 4, then at an earlier index (which will be our start index in our subarray) should have cumulative sum of 2.

Teacher: Excellent! You got it exactly right!

If cumulative sum at index 4 is 5, and we want a subarray ending at 4 with sum = 3, then we need an earlier index with cumulative sum = 2.

Because: 5 - 2 = 3


The General Formula

Teacher: So the general formula is:

cumulative_sum[current] - cumulative_sum[earlier] = k

Or rearranged:

cumulative_sum[earlier] = cumulative_sum[current] - k

Question 2: What Data Structure?

Teacher: As you iterate through the array from left to right, calculating cumulative sums at each position, what information would be useful to store about the cumulative sums you've seen so far?

Think about: When you're at the current position and you calculate that you need to find an earlier position with cumulative sum = (current - k), what would help you quickly know:

  1. Does such an earlier position exist?
  2. How many times have you seen that cumulative sum before?

What kind of data structure might help you track this information efficiently?

Me: It would be great if I can check in O(1) time if there is a cumulative sum equals to (current - k). HashMap data structure might help us here, so that I can learn if it exists or not at a glance.

Teacher: Perfect! You're thinking like an algorithm designer now!

Yes, a hashmap is exactly what we need! And you identified the key benefit - O(1) lookup time.


Question 3: What to Store in the HashMap?

Teacher: You said the hashmap can tell us "if it exists or not." But look at the problem again - we need to find the number of subarrays, not just whether they exist.

So here's a scenario: What if you're at index 4 with cumulative sum = 5, and you're looking for (current - k) = 2. But what if the cumulative sum of 2 appeared multiple times earlier in the array (say at index 1 and index 3)?

How many valid subarrays ending at index 4 would that give you? And what should you store in your hashmap as the value (not just the key)?

Think about:

  • Key in hashmap: cumulative sum
  • Value in hashmap: ??? (what information do we need?)

Me: As keys we can store cumulative sums in our hashmap and as value part we can keep its occurrences. So in this way we can have O(1) lookup for (current - k) values.

Teacher: Brilliant! You've nailed the core idea! 🎯

So to summarize what you've discovered:

Key: cumulative sum value
Value: count of how many times we've seen this cumulative sum

When you're at a position with cumulative sum = 5, and you look up (5 - 3) = 2 in the hashmap, if it appears 2 times, that means there are 2 subarrays ending at the current position that sum to k!


Question 4: The Edge Case Mystery

Teacher: Before you start iterating through the array, your hashmap is empty. But think about this scenario:

What if the very first element (or first few elements) of the array sum exactly to k? For example, if the array is [3, 1, 2] and k = 3, the subarray [3] itself sums to k.

When you're at index 0 with cumulative sum = 3, you'll look for (3 - 3) = 0 in your hashmap. But you haven't seen any cumulative sum of 0 yet because you just started!

What should you initialize in your hashmap before starting the iteration to handle this edge case?

Hint: Think about what the cumulative sum is "before" you start processing any elements.


The Confusion: Wait, Something's Wrong!

Me: It is a good question but before that I guess I didn't understand our formula. k=3 is the thing we are looking for a subarray to have total sum. For example [1, 2, -1, 1, 2], cumulative sum array is [1, 3, 2, 3, 5].

If we subtract index 2 cumulative sum from index 4 cumulative sum (and index 4 let's say is our current index) we get 3. So as far as I understand you are saying that since 5 - 2 = 3 then our subarray sum is 3 and this is a subarray that we are looking for.

But when we look at the real sum of starting index 2 to ending index 4 we get 2, [-1, 1, 2]. Am I doing something wrong?

Teacher: Ah! Great catch! You found an important detail. Let me clarify this carefully.

You're absolutely right to question this. Let's trace through it precisely:

Array:     [1,  2, -1,  1,  2]
Indices: 0 1 2 3 4

Cumulative sums:
Index 0: sum = 1
Index 1: sum = 1 + 2 = 3
Index 2: sum = 1 + 2 + (-1) = 2
Index 3: sum = 1 + 2 + (-1) + 1 = 3
Index 4: sum = 1 + 2 + (-1) + 1 + 2 = 5

Now, when you're at index 4 with cumulative sum = 5:

  1. You look for (5 - 3) = 2 in your hashmap
  2. You find that cumulative sum = 2 occurred at index 2

Here's the key: When cumulative sum at index 2 = 2, that means the sum of elements from index 0 to index 2 is 2.

So if:

  • Sum[0 to 4] = 5
  • Sum[0 to 2] = 2

Then: Sum[3 to 4] = 5 - 2 = 3

The subarray is [1, 2] (indices 3 to 4), NOT [-1, 1, 2]!


The Aha Moment: The j+1 Detail

Me: Alright! In the beginning you already said:

"If sum[0 to 4] = 5, and sum[0 to j] = ?, then sum[j+1 to 4] = 3"

If I pay attention to detail you are saying here sum[j+1 to 4], the plus 1 is important. I didn't care about it back then. But it is quite essential to understand our logic for finding subarrays sums equal to k.

Teacher: Exactly! That "+1" is crucial! I'm glad you caught that detail now.

The subarray starts at (j+1), not at j itself, because we're "removing" everything up to index j.

Visual representation:

[1, 2, -1, 1, 2]
↑ ↑ ↑
0 3 4

cumSum[4] = 5
cumSum[2] = 2
Subarray: indices 3-4 (starts at 2+1)

Understanding the Edge Case

Me: Now back to your edge case thinking. That is brilliant. But how could you see it?

Teacher: Great question! Here's the thought process:

When solving algorithm problems, I always ask myself: "What are the boundary conditions?"

For this problem, think about:

  1. What happens at the very start? Before processing any elements, what's our state?
  2. Can a valid subarray start from index 0? Yes! For example, if array = [3] and k = 3, or array = [1, 2] and k = 3.
  3. How does our formula handle this?

When you're at index i with cumulative sum = k (meaning sum[0 to i] = k), you'll look for (k - k) = 0 in the hashmap. But there's no index that produced cumulative sum = 0... or is there?

The key insight: What's the sum of elements before index 0 (before we've processed anything)? It's 0!

Think of it conceptually: cumulative sum at "index -1" (before the array starts) = 0.

Visualizing the Edge Case

Array: [1, 2, -1, 1, 2], k = 3

Index 1: cumSum = 3
Look for (3 - 3) = 0
If 0 exists → subarray [1, 2] sums to k!

Where does cumSum = 0 exist?
At the "imaginary" position before index 0

Me: You are a great teacher but I couldn't understand. Why are we thinking of start condition of our formula?

Teacher: Let me show you with a concrete trace:

Example: array = [1, 2, -1, 1, 2], k = 3

When you're at index 1:

  • Cumulative sum = 1 + 2 = 3
  • You look for (3 - 3) = 0 in your hashmap
  • If 0 exists in the hashmap, it means there's a subarray ending at index 1 that sums to k

Now think: What does finding cumulative sum = 0 mean?

It means: sum[0 to 1] - sum[0 to j] = 3

If we find cumulative sum = 0, where was it? It was at the "imaginary" position before index 0 (before we started).

So: sum[0 to 1] - 0 = 3, which gives us the subarray [1, 2] starting at index 0!


Why Not Handle It Separately?

Me: Why not use something simpler for this edge case?

Teacher: You asked a great question! You could write:

if (cumulativeSum == k) {
count += 1;
}

But notice this is just a special case of our general formula! When cumulativeSum = k, we're looking for (k - k) = 0.

Do you see now why initializing hashmap[0] = 1 makes our formula work uniformly for ALL cases, including subarrays starting at index 0?

// Without initialization: special case needed
if (cumulativeSum == k) {
count++; // Handle subarrays starting at 0
}
if (map.containsKey(cumulativeSum - k)) {
count += map.get(cumulativeSum - k);
}

// With initialization: ONE formula handles everything!
map.put(0, 1); // Initialize
if (map.containsKey(cumulativeSum - k)) {
count += map.get(cumulativeSum - k); // Works for all cases!
}

The Solution

Me: Now give me the full Java solution.

Teacher: Great! Let's write the complete solution together.

Based on everything we discussed:

  1. Use a hashmap to store cumulative sums and their counts
  2. Initialize with {0: 1} to handle subarrays starting at index 0
  3. For each position, check if (currentSum - k) exists in the map
  4. Update the hashmap with the current cumulative sum
import java.util.HashMap;

public class KSumSubarrays {
public static int subarraySum(int[] nums, int k) {
// HashMap to store cumulative sum and its frequency
HashMap<Integer, Integer> map = new HashMap<>();

// Initialize: cumulative sum of 0 appears once (before we start)
map.put(0, 1);

int count = 0; // Number of subarrays with sum = k
int cumulativeSum = 0; // Running cumulative sum

// Iterate through the array
for (int num : nums) {
// Update cumulative sum
cumulativeSum += num;

// Check if (cumulativeSum - k) exists in map
// If yes, there are subarrays ending at current position with sum = k
if (map.containsKey(cumulativeSum - k)) {
count += map.get(cumulativeSum - k);
}

// Add current cumulative sum to map (or increment its count)
map.put(cumulativeSum, map.getOrDefault(cumulativeSum, 0) + 1);
}

return count;
}

public static void main(String[] args) {
int[] nums = {1, 2, -1, 1, 2};
int k = 3;
System.out.println("Number of subarrays: " + subarraySum(nums, k));
// Output: 3
}
}

Complexity:

  • Time: O(n) - single pass through the array
  • Space: O(n) - hashmap can store at most n different cumulative sums

Wait, Why Only One Loop?

Me: But I thought that in first iteration I mean in first for loop we need to fill the hashmap and inside another for loop we apply our formulation. But in your solution I see only one for loop. How could we manage filling the hashmap?

Teacher: Excellent observation! This is a very important distinction to understand.

You're thinking we need two separate phases:

  1. First loop: fill the hashmap with all cumulative sums
  2. Second loop: apply the formula

But actually, we can do it in one loop! Here's why:

The Key Insight

When we're at index i, we only care about cumulative sums that appeared before index i (at indices 0, 1, 2, ..., i-1).

We don't need to know about cumulative sums that come after index i because subarrays must be contiguous and we're looking for subarrays ending at the current position.


Complete Step-by-Step Trace

Let me trace through the example:

Array: [1, 2, -1, 1, 2], k = 3

Initially:

map = {0: 1}
count = 0

Index 0 (num = 1):

cumulativeSum = 0 + 1 = 1
Look for (1 - 3) = -2 in map → not found
Add to map: map = {0: 1, 1: 1}
count = 0

Index 1 (num = 2):

cumulativeSum = 1 + 2 = 3
Look for (3 - 3) = 0 in map → found! map[0] = 1
count += 1 → count = 1 (found subarray [1, 2])
Add to map: map = {0: 1, 1: 1, 3: 1}

Index 2 (num = -1):

cumulativeSum = 3 + (-1) = 2
Look for (2 - 3) = -1 in map → not found
Add to map: map = {0: 1, 1: 1, 3: 1, 2: 1}
count = 1

Index 3 (num = 1):

cumulativeSum = 2 + 1 = 3
Look for (3 - 3) = 0 in map → found! map[0] = 1
count += 1 → count = 2 (found subarray [1, 2, -1, 1])
Add to map: map = {0: 1, 1: 1, 3: 2, 2: 1}

Index 4 (num = 2):

cumulativeSum = 3 + 2 = 5
Look for (5 - 3) = 2 in map → found! map[2] = 1
count += 1 → count = 3 (found subarray [1, 2] from indices 3-4)
Add to map: map = {0: 1, 1: 1, 3: 2, 2: 1, 5: 1}

Final answer: 3


Why One Loop Works

Notice that at each position, we:

  1. First check if the required previous cumulative sum exists
  2. Then add the current cumulative sum to the map

This way, when we check, we're only looking at cumulative sums from previous positions, which is exactly what we need!

for (int num : nums) {
cumulativeSum += num;

// Step 1: Check against PREVIOUS cumulative sums
if (map.containsKey(cumulativeSum - k)) {
count += map.get(cumulativeSum - k);
}

// Step 2: Add CURRENT cumulative sum for future iterations
map.put(cumulativeSum, map.getOrDefault(cumulativeSum, 0) + 1);
}

Key Point: By the time we process index i, the hashmap contains all cumulative sums from indices 0 to i-1, which is exactly what we need!


Visual Understanding

Wrong Thinking (Two Loops):

Loop 1: Build complete hashmap
[1, 2, -1, 1, 2] → {0:1, 1:1, 3:2, 2:1, 5:1}

Loop 2: Check each position against complete hashmap
But we don't need future cumulative sums!

Correct Thinking (One Loop):

At index 0: map = {0:1}          → Check against this
Add 1 → map = {0:1, 1:1}

At index 1: map = {0:1, 1:1} → Check against this
Add 3 → map = {0:1, 1:1, 3:1}

At index 2: map = {0:1, 1:1, 3:1} → Check against this
Add 2 → map = {0:1, 1:1, 3:1, 2:1}
...

Each position only needs to see previous cumulative sums!


Common Mistakes and Edge Cases

Mistake 1: Not Initializing map[0] = 1

// ❌ Wrong:
HashMap<Integer, Integer> map = new HashMap<>();
// Missing: map.put(0, 1);

// Example that breaks:
array = [3], k = 3
At index 0: cumSum = 3
Look for (3-3) = 0 → not found!
count = 0 (wrong! should be 1)
// ✅ Correct:
HashMap<Integer, Integer> map = new HashMap<>();
map.put(0, 1); // Initialize!

array = [3], k = 3
At index 0: cumSum = 3
Look for (3-3) = 0 → found! map[0] = 1
count = 1

Mistake 2: Updating Map Before Checking

// ❌ Wrong order:
cumulativeSum += num;
map.put(cumulativeSum, map.getOrDefault(cumulativeSum, 0) + 1);
if (map.containsKey(cumulativeSum - k)) {
count += map.get(cumulativeSum - k);
}

// Why wrong? You might count the current position itself!
// ✅ Correct order:
cumulativeSum += num;
if (map.containsKey(cumulativeSum - k)) { // Check first
count += map.get(cumulativeSum - k);
}
map.put(cumulativeSum, map.getOrDefault(cumulativeSum, 0) + 1); // Update after

Mistake 3: Forgetting getOrDefault

// ❌ Wrong:
map.put(cumulativeSum, map.get(cumulativeSum) + 1);
// NullPointerException if key doesn't exist!

// ✅ Correct:
map.put(cumulativeSum, map.getOrDefault(cumulativeSum, 0) + 1);

Testing the Solution

public class TestKSumSubarrays {
public static void main(String[] args) {
// Test case 1: Example from problem
int[] nums1 = {1, 2, -1, 1, 2};
int k1 = 3;
System.out.println(subarraySum(nums1, k1)); // Expected: 3

// Test case 2: Single element equals k
int[] nums2 = {3};
int k2 = 3;
System.out.println(subarraySum(nums2, k2)); // Expected: 1

// Test case 3: Entire array equals k
int[] nums3 = {1, 2, 3};
int k3 = 6;
System.out.println(subarraySum(nums3, k3)); // Expected: 1

// Test case 4: Multiple occurrences
int[] nums4 = {1, 1, 1};
int k4 = 2;
System.out.println(subarraySum(nums4, k4)); // Expected: 2
// [1,1] at indices 0-1 and [1,1] at indices 1-2

// Test case 5: Negative numbers
int[] nums5 = {1, -1, 1, -1};
int k5 = 0;
System.out.println(subarraySum(nums5, k5)); // Expected: 4

// Test case 6: No subarrays
int[] nums6 = {1, 2, 3};
int k6 = 10;
System.out.println(subarraySum(nums6, k6)); // Expected: 0
}
}

The Broader Pattern

This technique of using cumulative sums + hashmap appears in many problems:

Similar Problems

  1. Subarray Sum Equals K (our problem)

    • Store cumulative sum → frequency
  2. Contiguous Array (equal 0s and 1s)

    • Store cumulative difference → first occurrence index
  3. Maximum Size Subarray Sum Equals k

    • Store cumulative sum → earliest index
  4. Shortest Subarray with Sum at Least K

    • Use monotonic deque with cumulative sums

The Core Pattern

// Template for cumulative sum + hashmap problems
HashMap<CumulativeValue, Info> map = new HashMap<>();
map.put(initialValue, initialInfo); // Edge case handling

CumulativeValue cumulative = initial;
for (element : array) {
cumulative = update(cumulative, element);

TargetValue target = calculate(cumulative, k);
if (map.containsKey(target)) {
// Found something! Use map.get(target)
}

map.put(cumulative, updateInfo(cumulative));
}

Key Takeaways

What We Learned
  1. Cumulative Sum Formula:

    sum[j+1 to i] = cumSum[i] - cumSum[j]

    Note the j+1! Subarray starts AFTER index j.

  2. HashMap Usage:

    Key: cumulative sum value
    Value: frequency (how many times seen)
  3. The Mysterious Initialization:

    map.put(0, 1);  // Handles subarrays starting at index 0

    Represents the "imaginary" cumulative sum before the array starts.

  4. One Loop Is Enough:

    • Check against PREVIOUS cumulative sums
    • Add CURRENT cumulative sum for future iterations
    • Order matters: check first, update after
  5. Time Complexity: O(n)

    • Single pass through array
    • HashMap operations are O(1)
  6. Space Complexity: O(n)

    • HashMap stores at most n different cumulative sums
  7. This Pattern Extends:

    • Many subarray problems use cumulative sum + hashmap
    • The key is figuring out what to store in the hashmap

Conclusion

What started as a confusing formula became clear through careful exploration:

  • Why cumulative sums help us find subarrays
  • Why we need j+1 (subarray starts after the reference point)
  • Why hashmaps give us O(1) lookup
  • Why we initialize with {0: 1} (handle subarrays from start)
  • Why one loop is sufficient (only need previous cumulative sums)

The beauty of this solution: It transforms a seemingly O(n²) problem (check all subarrays) into an elegant O(n) solution using cumulative sums and a hashmap.

And that map.put(0, 1) initialization? It's not magic—it's just representing the fact that "before we start, the sum is 0, and we've seen that once!"


Practice Problems

Try these to solidify your understanding:

  1. Contiguous Array - Find longest subarray with equal 0s and 1s
  2. Maximum Size Subarray Sum Equals k - Find longest subarray with sum k
  3. Subarray Sum Divisible by K - Count subarrays where sum is divisible by k
  4. Continuous Subarray Sum - Multiple of k with length ≥ 2

Tags: #algorithms #arrays #hashmap #prefix-sum #learning-journey #interview-prep


This conversation happened during a real learning session. Sometimes the best understanding comes from questioning every detail—especially those "+1" offsets and mysterious initializations!