This script converts Amazon Neptune Export Service CSV files into FalkorDB bulk-loader compatible CSVs for easy data migration.
It produces separate CSV files per unique node label-set and per edge type (plus a manifest file), so you can load the output directly with falkordb-bulk-loader.
- Automatic file detection: Intelligently finds Neptune export files (vertices.csv, edges.csv, etc.)
- Label-based file organization: Creates separate files per node label-set and per edge type for optimized schemas
- Schema preservation: Maintains all node labels, edge types, and properties
- Property handling: Correctly parses JSON-encoded properties and complex data types
- Flexible input formats: Handles various Neptune export CSV formats including pipe-delimited and line-numbered formats
- Smart delimiter detection: Automatically detects CSV delimiters and line number prefixes
- Schema documentation: Generates detailed schema information about the converted data
- Flexible loading helper: bulk_load_to_falkordb.py supports both insert mode (new graph) and update mode (existing graph)
- Python 3.7+
- Converter script (neptune_to_falkordb_converter.py): standard library modules only (no external dependencies)
- Loader helper (bulk_load_to_falkordb.py): standard library only unless you enable index creation
  - Optional (only if using --create-id-indexes): pip install falkordb redis
No installation required. Just download the script:
# Make the script executable
chmod +x neptune_to_falkordb_converter.py
python3 neptune_to_falkordb_converter.py --input-dir /path/to/neptune/export --output-dir /path/to/falkordb/output
If you want the converter to write Neo4j-style typed headers (e.g. id:ID, name:STRING, :START_ID, :END_ID) for use with the bulk loader's --enforce-schema flag:
python3 neptune_to_falkordb_converter.py -i /path/to/neptune/export -o /path/to/falkordb/output --enforce-schema
python3 neptune_to_falkordb_converter.py -i ./twitter_neptune_data -o ./twitter_falkordb_data --verbose
usage: neptune_to_falkordb_converter.py [-h] --input-dir INPUT_DIR --output-dir OUTPUT_DIR [--verbose] [--enforce-schema]
Convert Neptune Export Service CSV to FalkorDB bulk-loader CSV format
optional arguments:
  -h, --help            show this help message and exit
  --input-dir INPUT_DIR, -i INPUT_DIR
                        Directory containing Neptune export CSV files
  --output-dir OUTPUT_DIR, -o OUTPUT_DIR
                        Output directory for FalkorDB CSV files
  --verbose, -v         Enable verbose logging for debugging
  --enforce-schema      Emit typed CSV headers compatible with falkordb-bulk-loader --enforce-schema
The script expects Neptune Export Service CSV files with these typical structures:
~id,~label,username,followers_count,verified
1,User,@elonmusk,50000000,true
2,User,@twitter,60000000,true
~id,~label,~from,~to,created_at,weight
e1,FOLLOWS,1,2,2023-01-15,1.0
e2,MENTIONS,2,1,2023-02-20,0.8
The script generates CSVs in the falkordb-bulk-loader schemaless format by default.
If you run the converter with --enforce-schema, the CSV headers will include explicit types and ID markers compatible with falkordb-bulk-loader --enforce-schema.
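As a rough illustration of what typed headers involve, the sketch below infers a type per column from sample values and builds node and edge headers. The function names and the inference rules are hypothetical, and the type names INT, DOUBLE, and BOOLEAN are assumed to be accepted by the bulk loader alongside the ID, STRING, START_ID, and END_ID markers mentioned above; the converter's actual rules may differ.

```python
def _all_parse(values, caster):
    """Return True if every value can be converted by `caster`."""
    try:
        for value in values:
            caster(value)
    except ValueError:
        return False
    return True


def infer_type(values):
    """Pick a column type from string samples (hypothetical rules)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "STRING"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "BOOLEAN"
    if _all_parse(non_empty, int):
        return "INT"
    if _all_parse(non_empty, float):
        return "DOUBLE"
    return "STRING"


def typed_node_header(columns, rows):
    """First column is the node identifier; the rest are typed properties."""
    header = [f"{columns[0]}:ID"]
    for index, name in enumerate(columns[1:], start=1):
        header.append(f"{name}:{infer_type([row[index] for row in rows])}")
    return header


def typed_edge_header(columns, rows):
    """First two columns are the endpoints; the rest are typed properties."""
    header = [":START_ID", ":END_ID"]
    for index, name in enumerate(columns[2:], start=2):
        header.append(f"{name}:{infer_type([row[index] for row in rows])}")
    return header
```

Inferring from values keeps the sketch short; a converter that already knows the Neptune column types could map those directly instead.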
Each output node file represents a label-set. The first column is the node identifier, and the remaining columns are node properties.
nodes_User.csv:
id,username,followers_count,verified
1,@elonmusk,50000000,true
2,@twitter,60000000,true
If Neptune vertices contain multiple labels, nodes are grouped by their full label set and written once to a combined file. Example:
nodes_User__Verified.csv (labels User:Verified at import time):
id,username,verified
42,@example,true
Each output edge file represents a relationship type. The first two columns are the start and end node identifiers, and the remaining columns are relationship properties.
edges_FOLLOWS.csv:
source,target,created_at,weight
1,2,2023-01-15,1.0
edges_MENTIONS.csv:
source,target,created_at,weight
2,1,2023-02-20,0.8
The script automatically detects Neptune export files using these patterns:
Node files: any CSV filename containing vertices, nodes, or vertex
Edge files: any CSV filename containing edges or relationships
For remaining unmatched CSV files, the script analyzes CSV headers to identify file types.
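The two detection passes described above could look roughly like the following. This is a minimal sketch, not the converter's code: the function names are hypothetical, and the case-insensitive matching and the choice to test node patterns before edge patterns are assumptions.

```python
from pathlib import Path

# Filename fragments taken from the patterns listed above.
NODE_HINTS = ("vertices", "nodes", "vertex")
EDGE_HINTS = ("edges", "relationships")


def classify_by_name(filename):
    """Return 'node', 'edge', or None based on the CSV filename alone."""
    name = Path(filename).name.lower()
    if not name.endswith(".csv"):
        return None
    if any(hint in name for hint in NODE_HINTS):
        return "node"
    if any(hint in name for hint in EDGE_HINTS):
        return "edge"
    return None


def classify_by_header(header):
    """Fallback for unmatched files: look for endpoint or ID columns."""
    columns = {column.strip().lower() for column in header}
    has_source = columns & {"~from", "from", "source"}
    has_target = columns & {"~to", "to", "target"}
    if has_source and has_target:
        return "edge"
    if columns & {"~id", "id", "vertex_id"}:
        return "node"
    return None
```

Edges are checked first in the header pass because an edge file can also carry an ID column, whereas a node file has no source or target columns.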
The converter handles various Neptune export formats:
Node columns:
- ID: ~id, id, vertex_id
- Labels: ~label, label, labels, ~labels
- Properties: Any other non-system columns
Edge columns:
- Source: ~from, source, from
- Target: ~to, target, to
- Type: ~label, label, type, relationship_type
- Properties: Any other non-system columns
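One way to resolve these aliases is to take the first candidate name that appears in the header and treat everything else as a property. The sketch below assumes that priority order follows the lists above and that any remaining column starting with ~ is a system column; both are assumptions, and the function names are hypothetical.

```python
NODE_ID = ("~id", "id", "vertex_id")
NODE_LABEL = ("~label", "label", "labels", "~labels")
EDGE_SOURCE = ("~from", "source", "from")
EDGE_TARGET = ("~to", "target", "to")
EDGE_TYPE = ("~label", "label", "type", "relationship_type")


def first_match(header, candidates):
    """Return the first candidate column name present in the header."""
    for name in candidates:
        if name in header:
            return name
    return None


def _properties(header, system):
    # Assumption: leftover columns starting with "~" are system columns.
    return [c for c in header if c not in system and not c.startswith("~")]


def map_node_columns(header):
    id_col = first_match(header, NODE_ID)
    label_col = first_match(header, NODE_LABEL)
    return {
        "id": id_col,
        "label": label_col,
        "properties": _properties(header, {id_col, label_col}),
    }


def map_edge_columns(header):
    source = first_match(header, EDGE_SOURCE)
    target = first_match(header, EDGE_TARGET)
    edge_type = first_match(header, EDGE_TYPE)
    return {
        "source": source,
        "target": target,
        "type": edge_type,
        "properties": _properties(header, {source, target, edge_type}),
    }
```

With the sample edge header shown earlier, the edge's own ~id column is dropped, which matches the source,target,created_at,weight layout of the example output.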
The script intelligently converts Neptune data types:
- JSON objects/arrays: Parsed and re-serialized
- Numbers: Converted to int/float as appropriate
- Booleans: Converted from string representation
- Strings: Preserved as-is
- Empty values: Converted to empty strings
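The rules above can be sketched as a single conversion function. This is an illustration under assumptions, not the converter's implementation: the name convert_value is hypothetical, and the order of the checks (JSON, then booleans, then numbers) is a guess.

```python
import json


def convert_value(raw):
    """Convert one CSV cell following the rules listed above."""
    if raw is None:
        return ""
    text = raw.strip()
    if text == "":
        return ""
    # JSON objects/arrays: parse, then re-serialize in a normalized form.
    if text[0] in "[{":
        try:
            return json.dumps(json.loads(text))
        except json.JSONDecodeError:
            return raw  # not valid JSON, keep the string as-is
    # Booleans: accepted case-insensitively (an assumption).
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Numbers: try int first so "42" does not become 42.0.
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return raw  # plain string, preserved unchanged
```

Note that float() also accepts strings such as nan and inf, so a production version would probably exclude those before treating a value as numeric.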
The converter creates multiple optimized files:
- nodes_*.csv: One file per unique label-set (a node appears in exactly one file)
  - Example: nodes_User.csv, nodes_Tweet.csv, nodes_User__Verified.csv
- edges_*.csv: One file per edge type with only relevant properties
  - Example: edges_FOLLOWS.csv, edges_MENTIONS.csv, edges_RETWEETS.csv
- bulk_loader_manifest.json: Manifest describing the generated CSVs, including:
  - Node files and the label-set that should be applied to each file
  - Relationship files and their relationship type
  - Basic summary information
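To show how a manifest like this can drive the loader, the sketch below turns a manifest dictionary into -N and -R arguments. The key names nodes, relationships, file, labels, and type are hypothetical, since the manifest layout is not documented here; inspect your own bulk_loader_manifest.json and adjust the keys before relying on this.

```python
def bulk_insert_args(manifest, csv_dir):
    """Build -N/-R arguments from a manifest dict (hypothetical key names)."""
    args = []
    for node in manifest.get("nodes", []):
        # Multiple labels are joined with ":" (e.g. User:Verified).
        label_spec = ":".join(node["labels"])
        args += ["-N", label_spec, f"{csv_dir}/{node['file']}"]
    for rel in manifest.get("relationships", []):
        args += ["-R", rel["type"], f"{csv_dir}/{rel['file']}"]
    if manifest.get("enforce_schema"):
        args.append("--enforce-schema")
    return args


# Example manifest, written by hand for illustration only.
example_manifest = {
    "enforce_schema": False,
    "nodes": [
        {"file": "nodes_User.csv", "labels": ["User"]},
        {"file": "nodes_User__Verified.csv", "labels": ["User", "Verified"]},
    ],
    "relationships": [{"file": "edges_FOLLOWS.csv", "type": "FOLLOWS"}],
}
example_args = bulk_insert_args(example_manifest, "./falkordb_csv")
```

Node arguments come first in the list so that all nodes are declared before the relationships that reference them.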
- Export from Neptune using Neptune Export Service
- Convert to FalkorDB format:
python3 neptune_to_falkordb_converter.py -i ./twitter_neptune_export -o ./twitter_falkordb_import
- Review the output:
ls twitter_falkordb_import/
# nodes_*.csv edges_*.csv bulk_loader_manifest.json
# Check node files
head twitter_falkordb_import/nodes_User.csv
# Check edge files
head twitter_falkordb_import/edges_FOLLOWS.csv
# Review manifest
cat twitter_falkordb_import/bulk_loader_manifest.json
- Import into FalkorDB using falkordb-bulk-loader (see Loading Data into FalkorDB below)
Converting a Twitter social network dataset:
# Convert Twitter Neptune export to FalkorDB format
python3 neptune_to_falkordb_converter.py -i ./twitter_neptune_export -o ./twitter_falkordb --verbose
# Example output:
# Converting nodes from 1 files: ['users.csv']
# Converting edges from 1 files: ['follows.csv']
#
# Created files:
# nodes_User.csv - Twitter user profiles with properties
# edges_FOLLOWS.csv - Follow relationships with timestamps
# bulk_loader_manifest.json - Bulk loader manifest
Sample Output Structure:
- nodes_User.csv: id,username,followers_count,verified
- edges_FOLLOWS.csv: source,target,created_at
- No files found
  - Check that Neptune export files are in the input directory
  - Verify file naming conventions match expected patterns
- Missing node/edge properties
  - Check the verbose output to see what properties were detected
  - Verify Neptune export includes all required data
- Encoding issues
  - The script uses UTF-8 encoding by default
  - For other encodings, modify the script's file opening parameters
Use the --verbose flag for detailed logging:
python3 neptune_to_falkordb_converter.py -i input -o output --verbose
After converting your Neptune data, you can load it into FalkorDB using the FalkorDB bulk loader.
Clone the bulk loader next to this repository (or point to it with --bulk-loader-dir):
git clone https://github.com/falkordb/falkordb-bulk-loader.git ../falkordb-bulk-loader
# Convert
python3 neptune_to_falkordb_converter.py -i ./neptune_export -o ./falkordb_csv
# (Optional) generate typed headers for strict loading
# python3 neptune_to_falkordb_converter.py -i ./neptune_export -o ./falkordb_csv --enforce-schema
# Load (invokes ../falkordb-bulk-loader/falkordb_bulk_loader/bulk_insert.py)
# If the manifest indicates enforce_schema=true, the helper will automatically pass --enforce-schema.
python3 bulk_load_to_falkordb.py my_graph_name --csv-dir ./falkordb_csv --server-url redis://127.0.0.1:6379
# Update mode (invokes bulk_update.py with auto-generated Cypher per CSV file)
# Useful when updating an existing graph.
python3 bulk_load_to_falkordb.py my_graph_name --csv-dir ./falkordb_csv --mode update --server-url redis://127.0.0.1:6379
# Optional: create :<Label>(id) range indexes after load (requires: pip install falkordb redis)
# --create-id-indexes
# Optional: if your ID property is not named 'id'
# --id-property <property_name> (also used by --mode update to match source/target nodes)
- --mode insert|update (default: insert)
  - insert: builds a new graph via bulk_insert.py and -N/-R manifest mappings
  - update: runs bulk_update.py per generated CSV using auto-generated Cypher upserts
- --enforce-schema / --no-enforce-schema
  - Applies to insert mode only (passed through to bulk_insert.py)
- --id-property <name>
  - Property used for post-load index creation, and in update mode for endpoint matching
- --dry-run
  - Prints the command(s) that would run
In update mode, this wrapper auto-generates --csv / --query for each file.
Do not pass --csv, --query, or --variable-name through passthrough args.
The converter writes bulk_loader_manifest.json which tells you which -N (nodes-with-label) and -R (relations-with-type) arguments to pass.
python3 ../falkordb-bulk-loader/falkordb_bulk_loader/bulk_insert.py my_graph_name \
-u redis://127.0.0.1:6379 \
-N User ./falkordb_csv/nodes_User.csv \
-R FOLLOWS ./falkordb_csv/edges_FOLLOWS.csv
# If you converted with --enforce-schema, add:
# --enforce-schema
The converter automatically handles multiple CSV formats:
- Standard CSV: Comma-delimited files
- Neptune pipe format: Pipe-delimited files (|)
- Line-numbered format: Files with line number prefixes (e.g., 1|data,data,data)
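Handling these three formats comes down to two checks: whether every line starts with a number and a pipe, and which delimiter the header uses. The sketch below is one possible heuristic, not the converter's own logic; the function names, the consecutive-number test, and the candidate delimiters are all assumptions.

```python
import csv
import re

LINE_PREFIX = re.compile(r"^(\d+)\|")


def strip_line_numbers(lines):
    """Drop `N|` prefixes only if every line carries consecutive numbers."""
    if not lines:
        return []
    numbers = []
    for line in lines:
        match = LINE_PREFIX.match(line)
        if match is None:
            return list(lines)  # at least one line has no prefix
        numbers.append(int(match.group(1)))
    expected = list(range(numbers[0], numbers[0] + len(numbers)))
    if numbers != expected:
        return list(lines)  # numeric first column, not line numbers
    return [LINE_PREFIX.sub("", line, count=1) for line in lines]


def detect_delimiter(header_line):
    """Pick the candidate that occurs most often; default to a comma."""
    counts = {d: header_line.count(d) for d in (",", "|", "\t")}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else ","


def read_rows(text):
    """Parse CSV text in any of the three formats into lists of strings."""
    lines = strip_line_numbers([line for line in text.splitlines() if line])
    if not lines:
        return []
    return list(csv.reader(lines, delimiter=detect_delimiter(lines[0])))
```

Requiring consecutive numbers on every line, including the header, keeps a pipe-delimited file with a numeric ID column from being mistaken for a line-numbered one.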
- Label-set-based optimization: Each unique node label-set gets a file with only its relevant properties
- Type-based optimization: Each edge type gets a file with only its relevant properties
- Safe filename generation: Special characters in labels/types are safely converted
- Nodes with multiple labels are grouped into a combined node file (e.g., nodes_User__Verified.csv)
- At load time, the bulk loader is invoked with -N User:Verified nodes_User__Verified.csv to apply both labels
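The grouping and naming described above can be sketched as follows. The sketch assumes that the export separates multiple labels with semicolons in the label column, that labels are sorted to give each set one canonical name, and that unsafe characters are replaced with underscores; the converter may make different choices, and all function names here are hypothetical.

```python
import re
from collections import defaultdict


def split_labels(raw):
    """Turn 'User;Verified' into a sorted, de-duplicated label tuple."""
    parts = {part.strip() for part in raw.split(";") if part.strip()}
    return tuple(sorted(parts))


def safe_name(label):
    """Replace characters that are risky in filenames with underscores."""
    return re.sub(r"[^A-Za-z0-9_]", "_", label)


def node_filename(labels):
    """Join the label set with a double underscore, as in the examples."""
    return "nodes_" + "__".join(safe_name(label) for label in labels) + ".csv"


def group_by_label_set(rows):
    """Group (node_id, raw_label) pairs so each node lands in one file."""
    groups = defaultdict(list)
    for node_id, raw_label in rows:
        groups[split_labels(raw_label)].append(node_id)
    return dict(groups)
```

Using the full label tuple as the grouping key is what guarantees that a node with labels User and Verified is written once to the combined file rather than to each single-label file.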
This script is provided as-is for Neptune to FalkorDB migration purposes.