AWS Neptune to FalkorDB (Bulk Loader) CSV Converter

This script converts Amazon Neptune Export Service CSV files into FalkorDB bulk-loader compatible CSVs for easy data migration.

It produces separate CSV files per unique node label-set and per edge type (plus a manifest file), so you can load the output directly with falkordb-bulk-loader.

Features

  • Automatic file detection: Intelligently finds Neptune export files (vertices.csv, edges.csv, etc.)
  • Label-based file organization: Creates separate files per node label and edge type for optimized schemas
  • Schema preservation: Maintains all node labels, edge types, and properties
  • Property handling: Correctly parses JSON-encoded properties and complex data types
  • Flexible input formats: Handles various Neptune export CSV formats including pipe-delimited and line-numbered formats
  • Smart delimiter detection: Automatically detects CSV delimiters and line number prefixes
  • Schema documentation: Generates detailed schema information about the converted data
  • Flexible loading helper: bulk_load_to_falkordb.py supports both insert mode (new graph) and update mode (existing graph)

Requirements

  • Python 3.7+
  • Converter script (neptune_to_falkordb_converter.py): standard library modules only (no external dependencies)
  • Loader helper (bulk_load_to_falkordb.py): standard library only unless you enable index creation
    • Optional (only if using --create-id-indexes): pip install falkordb redis

Installation

No installation required. Just download the script:

# Make the script executable
chmod +x neptune_to_falkordb_converter.py

Usage

Basic Usage

python3 neptune_to_falkordb_converter.py --input-dir /path/to/neptune/export --output-dir /path/to/falkordb/output

Enforced Schema Output (optional)

If you want the converter to write Neo4j-style typed headers (e.g. id:ID, name:STRING, :START_ID, :END_ID) for use with the bulk loader's --enforce-schema flag:

python3 neptune_to_falkordb_converter.py -i /path/to/neptune/export -o /path/to/falkordb/output --enforce-schema

With Verbose Logging

python3 neptune_to_falkordb_converter.py -i ./twitter_neptune_data -o ./twitter_falkordb_data --verbose

Command Line Options

usage: neptune_to_falkordb_converter.py [-h] --input-dir INPUT_DIR --output-dir OUTPUT_DIR [--verbose] [--enforce-schema]

Convert Neptune Export Service CSV to FalkorDB bulk-loader CSV format

optional arguments:
  -h, --help            show this help message and exit
  --input-dir INPUT_DIR, -i INPUT_DIR
                        Directory containing Neptune export CSV files
  --output-dir OUTPUT_DIR, -o OUTPUT_DIR
                        Output directory for FalkorDB CSV files
  --verbose, -v         Enable verbose logging for debugging
  --enforce-schema      Emit typed CSV headers compatible with falkordb-bulk-loader --enforce-schema

Input Format (Neptune Export Service)

The script expects Neptune Export Service CSV files with these typical structures:

Vertices/Nodes File

~id,~label,username,followers_count,verified
1,User,@elonmusk,50000000,true
2,User,@twitter,60000000,true

Edges/Relationships File

~id,~label,~from,~to,created_at,weight
e1,FOLLOWS,1,2,2023-01-15,1.0
e2,MENTIONS,2,1,2023-02-20,0.8

Output Format (FalkorDB Bulk Loader)

The script generates CSVs in the falkordb-bulk-loader schemaless format by default.

If you run the converter with --enforce-schema, the CSV headers will include explicit types and ID markers compatible with falkordb-bulk-loader --enforce-schema.
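For illustration, a typed header for the nodes_User.csv example above might begin like this (the exact type names are inferred by the converter; STRING/INT/BOOL/FLOAT here are illustrative, not guaranteed output):

```csv
id:ID,username:STRING,followers_count:INT,verified:BOOL
1,@elonmusk,50000000,true
```

And a typed edge file:

```csv
:START_ID,:END_ID,created_at:STRING,weight:FLOAT
1,2,2023-01-15,1.0
```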

Node Files (nodes_*.csv)

Each output node file represents a label-set. The first column is the node identifier, and the remaining columns are node properties.

nodes_User.csv:

id,username,followers_count,verified
1,@elonmusk,50000000,true
2,@twitter,60000000,true

If Neptune vertices contain multiple labels, nodes are grouped by their full label set and written once to a combined file. Example:

nodes_User__Verified.csv (labels User:Verified at import time):

id,username,verified
42,@example,true

Edge Files (edges_*.csv)

Each output edge file represents a relationship type. The first two columns are the start and end node identifiers, and the remaining columns are relationship properties.

edges_FOLLOWS.csv:

source,target,created_at,weight
1,2,2023-01-15,1.0

edges_MENTIONS.csv:

source,target,created_at,weight
2,1,2023-02-20,0.8

File Discovery

The script automatically detects Neptune export files using these patterns:

  • Node files: any CSV filename containing vertices, nodes, or vertex
  • Edge files: any CSV filename containing edges or relationships

For remaining unmatched CSV files, the script analyzes CSV headers to identify file types.
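The name-based part of the detection can be sketched as follows (a simplified illustration; the function name and structure are hypothetical, not taken from the actual script):

```python
import os

NODE_HINTS = ("vertices", "nodes", "vertex")
EDGE_HINTS = ("edges", "relationships")

def classify_csv(filename: str) -> str:
    """Classify a CSV file as 'node', 'edge', or 'unknown' by its name."""
    name = os.path.basename(filename).lower()
    if any(hint in name for hint in NODE_HINTS):
        return "node"
    if any(hint in name for hint in EDGE_HINTS):
        return "edge"
    return "unknown"  # falls through to header-based analysis
```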

Neptune Column Mapping

The converter handles various Neptune export formats:

Node Columns

  • ID: ~id, id, vertex_id
  • Labels: ~label, label, labels, ~labels
  • Properties: Any other non-system columns

Edge Columns

  • Source: ~from, source, from
  • Target: ~to, target, to
  • Type: ~label, label, type, relationship_type
  • Properties: Any other non-system columns
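A minimal sketch of how a node header could be split into system columns and property columns under this mapping (helper name and return shape are hypothetical):

```python
NODE_ID_COLS = {"~id", "id", "vertex_id"}
LABEL_COLS = {"~label", "label", "labels", "~labels"}

def split_node_header(header):
    """Separate the ID and label columns from property columns."""
    id_col = next((c for c in header if c.lower() in NODE_ID_COLS), None)
    label_col = next((c for c in header if c.lower() in LABEL_COLS), None)
    props = [c for c in header if c not in (id_col, label_col)]
    return id_col, label_col, props
```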

Data Type Handling

The script intelligently converts Neptune data types:

  • JSON objects/arrays: Parsed and re-serialized
  • Numbers: Converted to int/float as appropriate
  • Booleans: Converted from string representation
  • Strings: Preserved as-is
  • Empty values: Converted to empty strings
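The conversion rules above can be sketched roughly like this (a simplified, best-effort illustration, not the script's actual implementation):

```python
import json

def convert_value(raw: str):
    """Best-effort conversion of a Neptune CSV cell value (sketch)."""
    if raw == "":
        return ""                          # empty values stay empty strings
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"       # booleans from string form
    try:
        return int(raw)                    # whole numbers -> int
    except ValueError:
        pass
    try:
        return float(raw)                  # decimals -> float
    except ValueError:
        pass
    if raw[0] in "[{":
        try:
            return json.dumps(json.loads(raw))  # parse and re-serialize JSON
        except json.JSONDecodeError:
            pass
    return raw                             # everything else stays a string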

Output Files

The converter creates multiple optimized files:

Node Files

  • nodes_*.csv: One file per unique label-set (a node appears in exactly one file)
  • Example: nodes_User.csv, nodes_Tweet.csv, nodes_User__Verified.csv

Edge Files

  • edges_*.csv: One file per edge type with only relevant properties
  • Example: edges_FOLLOWS.csv, edges_MENTIONS.csv, edges_RETWEETS.csv

Metadata

  • bulk_loader_manifest.json: Manifest describing the generated CSVs, including:
    • Node files and the label-set that should be applied to each file
    • Relationship files and their relationship type
    • Basic summary information

Example Workflow

  1. Export from Neptune using Neptune Export Service
  2. Convert to FalkorDB format:
    python3 neptune_to_falkordb_converter.py -i ./twitter_neptune_export -o ./twitter_falkordb_import
  3. Review the output:
    ls twitter_falkordb_import/
    # nodes_*.csv  edges_*.csv  bulk_loader_manifest.json
    
    # Check node files
    head twitter_falkordb_import/nodes_User.csv
    
    # Check edge files
    head twitter_falkordb_import/edges_FOLLOWS.csv
    
    # Review manifest
    cat twitter_falkordb_import/bulk_loader_manifest.json
  4. Import into FalkorDB using falkordb-bulk-loader (see Loading Data into FalkorDB below)

Real Example: Twitter Dataset

Converting a Twitter social network dataset:

# Convert Twitter Neptune export to FalkorDB format
python3 neptune_to_falkordb_converter.py -i ./twitter_neptune_export -o ./twitter_falkordb --verbose

# Example output:
# Converting nodes from 1 files: ['users.csv']
# Converting edges from 1 files: ['follows.csv']
# 
# Created files:
# nodes_User.csv                - Twitter user profiles with properties
# edges_FOLLOWS.csv             - Follow relationships with timestamps
# bulk_loader_manifest.json     - Bulk loader manifest

Sample Output Structure:

  • nodes_User.csv: id,username,followers_count,verified
  • edges_FOLLOWS.csv: source,target,created_at

Troubleshooting

Common Issues

  1. No files found

    • Check that Neptune export files are in the input directory
    • Verify file naming conventions match expected patterns
  2. Missing node/edge properties

    • Check the verbose output to see what properties were detected
    • Verify Neptune export includes all required data
  3. Encoding issues

    • The script uses UTF-8 encoding by default
    • For other encodings, modify the script's file opening parameters

Debug Mode

Use --verbose flag for detailed logging:

python3 neptune_to_falkordb_converter.py -i input -o output --verbose

Loading Data into FalkorDB

After converting your Neptune data, you can load it into FalkorDB using the FalkorDB bulk loader.

Prerequisite: falkordb-bulk-loader

Clone the bulk loader next to this repository (or point to it with --bulk-loader-dir):

git clone https://github.com/falkordb/falkordb-bulk-loader.git ../falkordb-bulk-loader

Option A (recommended): use the helper script in this repo

# Convert
python3 neptune_to_falkordb_converter.py -i ./neptune_export -o ./falkordb_csv

# (Optional) generate typed headers for strict loading
# python3 neptune_to_falkordb_converter.py -i ./neptune_export -o ./falkordb_csv --enforce-schema

# Load (invokes ../falkordb-bulk-loader/falkordb_bulk_loader/bulk_insert.py)
# If the manifest indicates enforce_schema=true, the helper will automatically pass --enforce-schema.
python3 bulk_load_to_falkordb.py my_graph_name --csv-dir ./falkordb_csv --server-url redis://127.0.0.1:6379

# Update mode (invokes bulk_update.py with auto-generated Cypher per CSV file)
# Useful when updating an existing graph.
python3 bulk_load_to_falkordb.py my_graph_name --csv-dir ./falkordb_csv --mode update --server-url redis://127.0.0.1:6379

# Optional: create :<Label>(id) range indexes after load (requires: pip install falkordb redis)
#   --create-id-indexes
# Optional: if your ID property is not named 'id'
#   --id-property <property_name>   (also used by --mode update to match source/target nodes)

bulk_load_to_falkordb.py key options

  • --mode insert|update (default: insert)
    • insert: builds a new graph via bulk_insert.py and -N/-R manifest mappings
    • update: runs bulk_update.py per generated CSV using auto-generated Cypher upserts
  • --enforce-schema / --no-enforce-schema
    • Applies to insert mode only (passed through to bulk_insert.py)
  • --id-property <name>
    • Property used for post-load index creation, and in update mode for endpoint matching
  • --dry-run
    • Prints the command(s) that would run

In update mode, the wrapper auto-generates the --csv and --query arguments for each generated CSV file,
so do not pass --csv, --query, or --variable-name as passthrough arguments yourself.
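For a sense of what the auto-generated Cypher might look like, an upsert for edges_FOLLOWS.csv could resemble the following (a hypothetical sketch only — the exact query text, variable name, and property matching the helper generates may differ):

```python
def follows_upsert_query(id_prop: str = "id", var: str = "row") -> str:
    """Hypothetical upsert for edges_FOLLOWS.csv (source,target,created_at,weight)."""
    return (
        f"MATCH (s {{{id_prop}: {var}[0]}}), (t {{{id_prop}: {var}[1]}}) "
        f"MERGE (s)-[r:FOLLOWS]->(t) "
        f"SET r.created_at = {var}[2], r.weight = {var}[3]"
    )
```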

Option B: call bulk_insert.py directly

The converter writes bulk_loader_manifest.json which tells you which -N (nodes-with-label) and -R (relations-with-type) arguments to pass.

python3 ../falkordb-bulk-loader/falkordb_bulk_loader/bulk_insert.py my_graph_name \
  -u redis://127.0.0.1:6379 \
  -N User ./falkordb_csv/nodes_User.csv \
  -R FOLLOWS ./falkordb_csv/edges_FOLLOWS.csv

# If you converted with --enforce-schema, add:
#   --enforce-schema

Advanced Features

Delimiter Detection

The converter automatically handles multiple CSV formats:

  • Standard CSV: Comma-delimited files
  • Neptune pipe format: Pipe-delimited files (|)
  • Line-numbered format: Files with line number prefixes (e.g., 1|data,data,data)
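The detection logic can be sketched as follows (an illustrative simplification; for example, a pipe-delimited line whose first field happens to be a number would fool this version, and the real script may handle such cases differently):

```python
import re

def detect_format(sample_line: str):
    """Return (delimiter, has_line_numbers) for one sample line (sketch)."""
    # Strip a leading line-number prefix such as "1|" if present
    match = re.match(r"^\d+\|", sample_line)
    has_line_numbers = match is not None
    if has_line_numbers:
        sample_line = sample_line[match.end():]
    # Pick whichever candidate delimiter occurs more often
    delimiter = "|" if sample_line.count("|") > sample_line.count(",") else ","
    return delimiter, has_line_numbers
```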

File Organization

  • Label-set-based optimization: Each unique node label-set gets a file with only its relevant properties
  • Type-based optimization: Each edge type gets a file with only its relevant properties
  • Safe filename generation: Special characters in labels/types are safely converted
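The filename sanitization could look roughly like this for node files (an illustrative sketch; the actual script's character rules and label ordering may differ):

```python
import re

def safe_node_filename(label_set) -> str:
    """Build a nodes_*.csv filename from a set of node labels (sketch)."""
    # Replace characters that are unsafe in filenames, join labels with "__"
    joined = "__".join(
        re.sub(r"[^A-Za-z0-9_]", "_", label) for label in sorted(label_set)
    )
    return f"nodes_{joined}.csv"
```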

Multi-label Support

  • Nodes with multiple labels are grouped into a combined node file (e.g., nodes_User__Verified.csv)
  • At load time, the bulk loader is invoked with -N User:Verified nodes_User__Verified.csv to apply both labels

License

This script is provided as-is for Neptune to FalkorDB migration purposes.
