
Streaming JSON: How to Process Files Too Large for Memory

March 17, 2026 3 min read By CodeTidy Team

The JSON Conundrum: How to Process Files Too Large for Memory

We've all been there: stuck with a massive JSON file that won't fit into memory. You try to parse it, and your program crashes or freezes. It's a frustrating problem, but one that the right techniques can solve.

Table of Contents

  • Understanding the Problem: Large JSON Files
  • SAX-Style Parsers: The Event-Driven Approach
  • NDJSON: Newline-Delimited JSON to the Rescue
  • Chunked Processing: Breaking Down the File
  • Putting it all Together: A Real-World Example
  • Key Takeaways
  • FAQ

Understanding the Problem: Large JSON Files

When working with large JSON files, the traditional approach of loading the entire file into memory breaks down. A standard parser has to read the whole document and build every object, array, and string before it returns anything, and that in-memory representation is often larger than the file itself. We've seen files that exceed 10 GB in size, and attempting to load them this way leads to out-of-memory errors or crashes.
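To get a feel for that overhead, here is a minimal sketch that builds a small JSON document in memory and uses Python's tracemalloc module to compare the size of the raw text with the peak memory allocated while parsing it. The record shape and the helper name parsed_overhead are made up for illustration, and the exact ratio will vary with the interpreter and the data.

```python
import json
import tracemalloc


def parsed_overhead(record_count=10_000):
    """Return (raw_bytes, peak_bytes) for parsing a synthetic JSON array."""
    # Hypothetical sample data: a list of small user-like objects.
    records = [{"id": i, "name": f"user{i}"} for i in range(record_count)]
    raw = json.dumps(records)
    del records  # keep only the serialized text around

    tracemalloc.start()
    parsed = json.loads(raw)  # the whole document becomes Python objects
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    assert len(parsed) == record_count
    return len(raw.encode("utf-8")), peak


if __name__ == "__main__":
    raw_bytes, peak_bytes = parsed_overhead()
    print(f"raw text: {raw_bytes} bytes, peak while parsing: {peak_bytes} bytes")
```

Running it on your own data shape is a quick way to estimate whether a full load is realistic before reaching for a streaming parser.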

SAX-Style Parsers: The Event-Driven Approach

One solution to this problem is to use a SAX-style parser. These parsers work by emitting events as they parse the JSON file, rather than loading the entire file into memory. This event-driven approach allows us to process large JSON files without running into memory issues.

Let's take a look at an example using the ijson library in Python:

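import ijson

# Events that carry a scalar value rather than structure.
SCALAR_EVENTS = {'null', 'boolean', 'integer', 'double', 'number', 'string'}

# Binary mode lets ijson handle the decoding itself.
with open('large_file.json', 'rb') as f:
    # ijson.parse yields (prefix, event, value) tuples one at a time.
    for prefix, event, value in ijson.parse(f):
        # 'items.item' matches elements of the top-level "items" array.
        if prefix == 'items.item' and event in SCALAR_EVENTS:
            print(value)

In this example, ijson emits one event at a time and we print each scalar element of the top-level items array. The event check matters because structural events such as start_map also arrive under the items.item prefix, with a value of None. If the elements are objects, ijson.items(f, 'items.item') yields each one as a complete Python object instead.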

Other popular SAX-style parsers include Jackson Streaming in Java and oboe.js in JavaScript.

NDJSON: Newline-Delimited JSON to the Rescue

Another approach to processing large JSON files is to use NDJSON (Newline-Delimited JSON). In NDJSON, every line is a complete JSON value (usually an object), so a reader can parse one line, handle it, and discard it before moving on to the next.

Here's an example of how to use NDJSON in Node.js:

const fs = require('fs');
const readline = require('readline');

const fileStream = fs.createReadStream('large_file.ndjson');
const rl = readline.createInterface({
  input: fileStream,
  crlfDelay: Infinity
});

rl.on('line', (line) => {
  const data = JSON.parse(line);
  console.log(data);
});

In this example, we're using the readline module to read a large NDJSON file line by line, and then parsing each line as a separate JSON object.
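The same line-by-line pattern can be sketched in Python with only the standard library. This is an illustrative sketch, not a full NDJSON implementation: the helper name iter_ndjson is hypothetical, and the in-memory sample stands in for a real file opened with open('large_file.ndjson', encoding='utf-8').

```python
import io
import json


def iter_ndjson(lines):
    """Yield one parsed value per non-empty line of an NDJSON source."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        yield json.loads(line)


if __name__ == "__main__":
    # In-memory stand-in for a real NDJSON file object.
    sample = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')
    for record in iter_ndjson(sample):
        print(record["id"])
```

Because a file object yields one line at a time when iterated, only the current line and its parsed value are held in memory.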

Chunked Processing: Breaking Down the File

A third approach is to break the work into smaller chunks. You can split the file into several smaller files ahead of time, or feed it through a streaming parser that hands you one piece at a time.

Here's an example of how to process a large JSON file in chunks using the JSONStream library in Node.js:

const JSONStream = require('JSONStream');
const fs = require('fs');

const fileStream = fs.createReadStream('large_file.json');
// '*' matches each element of a top-level array.
const parser = JSONStream.parse('*');

fileStream.pipe(parser);

parser.on('data', (chunk) => {
  console.log(chunk);
});

In this example, we're using the JSONStream library to parse a large JSON file whose root is an array, logging each element as soon as it has been parsed.
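Chunked reading can also be sketched in Python with only the standard library, using json.JSONDecoder.raw_decode on a growing buffer. The sketch below assumes the input is a sequence of top-level JSON objects or arrays separated by whitespace, not one giant array. It would not be safe for bare top-level numbers, which could be cut in half at a chunk boundary. The helper name iter_json_values and the chunk size are illustrative choices.

```python
import io
import json


def iter_json_values(fp, chunk_size=65536):
    """Yield top-level JSON objects/arrays from a text stream, read in chunks."""
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break  # end of input
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                value, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # value is incomplete so far; read another chunk
            yield value
            buffer = buffer[end:]  # drop the text that was just consumed
    if buffer.strip():
        raise ValueError("input ended with an incomplete or invalid JSON value")


if __name__ == "__main__":
    # Tiny chunk size to show that values spanning chunks are still recovered.
    stream = io.StringIO('{"a": 1}\n{"b": [1, 2]} {"c": "x"}')
    for value in iter_json_values(stream, chunk_size=5):
        print(value)
```

A dedicated streaming parser will be faster, since this sketch re-scans the buffer whenever a value is still incomplete, but it shows the basic idea of never holding more than one value plus one chunk in memory.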

Putting it all Together: A Real-World Example

Let's say we have a large JSON file containing a list of user data, and we want to process each user object separately. We can use a combination of the techniques above to achieve this.

Here's an example of how we might do this using the ijson library in Python:

import ijson

def process_user(user):
    # Process the user object here
    print(user)

# Binary mode lets ijson handle the decoding itself.
with open('large_file.json', 'rb') as f:
    # ijson.items builds one complete object per element of "users".
    for user in ijson.items(f, 'users.item'):
        process_user(user)

In this example, ijson builds one user object at a time from the top-level users array, so only the current user is held in memory while process_user runs.

Key Takeaways

  • Use SAX-style parsers to process large JSON files without running into memory issues.
  • Use NDJSON to process large JSON files line by line.
  • Break down large JSON files into smaller chunks to process them more efficiently.
  • Use a combination of techniques to achieve the best results.

FAQ

Q: What is the best way to process large JSON files?

A: It depends on the use case. A SAX-style parser suits a single large JSON document whose format you can't change, while NDJSON is the simpler choice when you control how the data is written.

Q: Can I use a streaming parser to process large JSON files?

A: Yes, streaming parsers are a great way to process large JSON files without running into memory issues.

Q: How do I handle errors when processing large JSON files?

A: Handle errors per record rather than per file. Wrap the parsing of each record in a try-catch block (try/except in Python) so that one malformed record can be logged and skipped, and attach an error handler to the stream itself so that a truncated or unreadable file doesn't crash the program.
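As a minimal sketch of per-record error handling, the Python generator below parses NDJSON lines, yields the valid records, and notes the line number of each malformed one instead of aborting. The helper name iter_valid_records and the choice to collect errors in a caller-supplied list are assumptions for illustration. A real pipeline might log them or write them to a separate file.

```python
import json


def iter_valid_records(lines, errors):
    """Yield parsed records; append (line_number, message) to errors for bad lines."""
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Remember where the problem was and keep going.
            errors.append((line_number, exc.msg))
            continue
        yield record


if __name__ == "__main__":
    sample = ['{"a": 1}\n', '{bad\n', '\n', '{"b": 2}\n']
    problems = []
    for record in iter_valid_records(sample, problems):
        print(record)
    print(f"skipped {len(problems)} malformed line(s): {problems}")
```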
