Introduction
DataFlowMapper is a lightweight yet powerful data transformation tool designed to bridge the gap between expensive enterprise ETL solutions and simple data cleaners. It's built for implementation teams and data migration specialists working with CSV, Excel, and JSON files.
Our platform enables you to create repeatable transformations with an intuitive interface, powerful logic builder, and AI-assisted mapping capabilities.
Getting Started
Get up and running with DataFlowMapper quickly. Learn how to run your first data transformation.
Account Setup
Getting started with DataFlowMapper is quick and straightforward. Simply create an account to access all features immediately.
Note: DataFlowMapper is currently in early access with a 90-day pilot available for all new users.
- Visit the DataFlowMapper sign-up page
- Enter your email address and create a password
- Verify your email address
- Log in to access your dashboard
Quick Start Guide
This guide will help you complete your first data transformation in minutes:
Select your file format and upload your CSV, Excel, or JSON file to begin the process.
Define how your source data maps to your desired output format.
Execute the transformation and download your transformed data.
For a more detailed walkthrough, continue to the Basic Workflow section below.
Basic Workflow
The standard DataFlowMapper workflow consists of three main steps:
1. Upload Source File
Start by uploading your source data file. DataFlowMapper supports:
- CSV files (with custom delimiters)
- Excel spreadsheets
- JSON files
You can specify header rows, concatenate multiple files, and configure other import settings.
2. Create or Upload Mapping
You have two options:
- Upload an existing mapping file - If you've previously created a mapping for similar data, you can reuse it.
- Create a new mapping - Use the Mapping Editor to define how your source data maps to your destination format.
- Import headers from a template file or define the header fields manually. The system automatically detects the file format and pulls the first row (CSV/Excel) or the first JSON record
- Create direct field mappings by selecting from the source field dropdown or by dragging and dropping from the sidebar
- Add custom transformation logic as needed by clicking Add Custom Logic
- Add validation logic as needed by clicking Add Validation
- Specify output format and save mapping
3. Transform and Download
Once your mapping is complete:
- Execute the transformation
- Transformed data is automatically downloaded in your chosen format
- Review the results and any validations
Pro Tip: Accelerate with AI
Use the AI Co-Pilot features to speed up your mapping process. Try "Suggest Mappings" to automatically map fields or "Map All" to generate a complete mapping from a plain English description.
Transformation Engine
The core of DataFlowMapper is its powerful transformation engine that processes your data according to defined mappings and logic rules.
Supported File Formats
DataFlowMapper supports a wide range of file formats including CSV (with any delimiter), Excel spreadsheets, and JSON. You can specify header rows, concatenate files, and work with nested data structures. After uploading files, you can remove them by clicking the X next to the file name in the Selected Files section.
JSON Path Syntax
When working with JSON files, DataFlowMapper uses a consistent path syntax to reference fields within nested structures. Understanding this syntax is essential for accurate mapping and transformation.
Key Concepts
- Consistent Field Names - Field names in the mapping interface directly correspond to their paths in the JSON structure, ensuring clarity and predictability.
- Specific Array Indices - For non-record path arrays, specific indices (like [0], [1], [2]) are used rather than wildcards, giving precise control over which array elements to map.
- Record Path Handling - The record path (selected in the UI) is the only place where wildcards ([*]) are used, indicating "for each item in this array, create a row in the output."
- Deep Nesting Support - The system handles JSON structures of any depth, allowing access to deeply nested fields like employees[0].projects[0].tasks[0].subtasks[0].description.
- Array Expansion - Arrays are automatically expanded into individual fields (e.g., skills[0], skills[1], skills[2]), allowing mapping from specific array elements.
- Nested Object Handling - Objects inside arrays are properly flattened, with fields like projects[0].manager.name and projects[0].manager.email made available for mapping.
Example: JSON Structure and Field References
For a JSON file with this structure:
{
"company": "Acme Inc.",
"employees": [
{
"name": "John",
"skills": ["JavaScript", "Python"],
"projects": [
{ "name": "Website", "priority": "High" },
{ "name": "Mobile App", "priority": "Medium" }
]
},
{
"name": "Jane",
"skills": ["Java", "C++", "SQL"],
"projects": [
{ "name": "Database", "priority": "Critical" }
]
}
]
}
When employees is selected as the record path, you'll see these fields available for mapping:
company
employees[*].name
employees[*].skills[0]
employees[*].skills[1]
employees[*].skills[2]
employees[*].projects[0].name
employees[*].projects[0].priority
employees[*].projects[1].name
employees[*].projects[1].priority
The system will extract the correct values for each employee record, handling missing values appropriately (e.g., if an employee doesn't have a second project).
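To make the record-path behavior concrete, the sketch below walks through the example structure in plain Python, producing one output row per item on the employees[*] record path. It is illustrative only; the get_index helper and the simplified loop are not the product's internal implementation.
# Illustrative only: flattening the example JSON into one row per employee
data = {
    "company": "Acme Inc.",
    "employees": [
        {"name": "John", "skills": ["JavaScript", "Python"],
         "projects": [{"name": "Website", "priority": "High"},
                      {"name": "Mobile App", "priority": "Medium"}]},
        {"name": "Jane", "skills": ["Java", "C++", "SQL"],
         "projects": [{"name": "Database", "priority": "Critical"}]},
    ],
}

def get_index(items, i, key=None):
    # Return items[i] (or items[i][key]), or None when the element is missing,
    # e.g. an employee without a second project.
    if i >= len(items):
        return None
    return items[i][key] if key else items[i]

rows = []
for emp in data["employees"]:  # record path: employees[*]
    rows.append({
        "company": data["company"],
        "employees[*].name": emp["name"],
        "employees[*].skills[0]": get_index(emp["skills"], 0),
        "employees[*].skills[1]": get_index(emp["skills"], 1),
        "employees[*].projects[0].name": get_index(emp["projects"], 0, "name"),
        "employees[*].projects[1].name": get_index(emp["projects"], 1, "name"),
    })

print(rows)  # two rows; Jane's projects[1].name is None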
JSON Header Import
When uploading a JSON template file for headers, DataFlowMapper automatically:
- Flattens the nested structure
- Extracts all possible field paths
- Preserves array indices for precise mapping
- Makes these fields available in the mapping interface
This allows you to map from exact fields in your source data to your desired output structure, maintaining the integrity of complex nested objects and arrays.
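As a rough mental model of this flattening step (not the product's actual implementation), a nested template record can be reduced to dotted and bracketed field paths like so; the flatten_paths helper and the sample record are assumptions for illustration.
# Sketch of header flattening for simple dicts and lists; illustrative only
def flatten_paths(obj, prefix=""):
    paths = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(flatten_paths(value, child))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            paths.extend(flatten_paths(value, f"{prefix}[{i}]"))
    else:
        paths.append(prefix)  # leaf value: record its full path
    return paths

record = {"name": "John", "skills": ["JS", "Py"], "manager": {"email": "a@b.c"}}
print(flatten_paths(record))
# ['name', 'skills[0]', 'skills[1]', 'manager.email']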
Read & Write
This JSON path syntax applies to both source and destination data. For destination fields, the nested JSON is constructed according to the same syntax you provide.
File Settings Panel
The file settings panel can be opened by clicking the gear below the source file input. Here you can specify header rows, whether you will concatenate files, the delimiter, and the record level for JSON transformations.
Data Sources
Users can choose from multiple data sources, including local files, APIs, and databases. Databases and APIs can be configured for direct data import/export. To configure sources and destinations, click the desired data source, click the gear icon, and fill in the relevant fields.
Transformation Workflow
The typical workflow involves uploading a source file, creating or uploading a mapping file, transforming the data, and then viewing the output and any validations. You can save mappings for future use to create repeatable transformations.
Performance Optimization
DataFlowMapper's transformation engine is optimized for performance, using careful state management and parallel processing to minimize memory usage and maximize throughput.
Data Viewer
After transforming data, you will see a stat summary of the transformation. You can download the log file to view any warnings or errors from the transformation. From here you can also push data to an API or database destination. Clicking View Transformed Data opens the Data Viewer, which presents the output data in a flattened tabular format. You can filter, sort, and search through the data to validate the transformation results. The Data Viewer also allows you to download the transformed data and validations to Excel for further analysis.
Mapping Editor
The Mapping Editor is where you define how source data fields map to destination fields. You can create one-to-one mappings or add custom logic for complex transformations.
Field Mapping
- Create direct mappings between source and destination fields by selecting the source field from the dropdown or by dragging and dropping from the sidebar.
- For complex scenarios, you can add custom logic using our Logic Builder by clicking Add Custom Logic.
- You can also choose to add validations by clicking Add Validation.
- Filters can be created by clicking Add Pre-Processing Filter or Add Post-Processing Filter. This will open the Logic Builder where you can define the filter logic. The filter logic is applied to the source data before or after the transformation, respectively.
Header Management
Import headers from template files, create custom headers, or modify existing ones. DataFlowMapper makes it easy to align your data with the required output format. When creating a mapping file, you can either define headers manually by adding each field or import headers from a template. Importing takes the first row from CSV and Excel files; for JSON, it takes the first record and flattens the data to extract field names.
Import Validations
Validations can be imported from an existing mapping file. Click the Import Validations button and upload a mapping file that has destination fields that match the current mapping. The system will automatically import the validations from the uploaded mapping file.
Logic Builder
The Logic Builder provides a visual interface for creating complex transformation logic without writing code. Drag and drop fields, variables, and functions to build your transformation rules.
Each destination field has its own separate logic that processes data row by row. The Logic Builder maintains isolated state between fields, ensuring clean and predictable transformations.
Key Concept: Row-Based Processing
Logic Builder processes each row of your data independently, applying the same transformation rules to each row. This makes it perfect for consistent data transformations across your entire dataset.
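Conceptually, this is equivalent to applying one function per destination field to every row in turn. The sketch below is a simplified illustration, not the engine itself; the field names and logic are examples.
# Simplified sketch of row-based processing; field names and logic are examples
source_rows = [
    {"FirstName": "Ada", "LastName": "Lovelace"},
    {"FirstName": "Alan", "LastName": "Turing"},
]

def full_name_logic(row):
    # The logic for one destination field sees only the current row.
    return f"{row['LastName']}, {row['FirstName']}"

output = [{"full_name": full_name_logic(row)} for row in source_rows]
print(output)  # each row transformed independently with the same rule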
Visual Logic Interface
Our intuitive no-code interface lets you build complex transformations by dragging and dropping components. Connect fields, functions, and operations to create your transformation logic.
To define what value should be returned from the logic, set result equal to the value you want. A typical Logic Builder workflow (sketched after this list) is:
- Declare an empty variable to hold the value you want to set the field to.
- Create any functions needed to manipulate the data.
- Create an if/then statement and set the variable's value based on your business rules.
- Set your variable in the Return Result tab.
- Test and confirm the output is correct by inputting a row number to sample from.
- Save your logic.
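Put together, logic assembled this way behaves roughly like the sketch below: declare a variable, set it in an if/then based on a business rule, and return it via result. The order_total field and the priority/standard values are invented for illustration; within DataFlowMapper, row is supplied for each record.
# Illustrative sketch of logic assembled in the Logic Builder; field and values are examples
row = {'order_total': '1250'}        # sample row; DataFlowMapper supplies row per record
status = ''                          # declare an empty variable to hold the field value
amount = float(row['order_total'])   # manipulate the source data as needed

if amount >= 1000:                   # if/then sets the variable per your business rule
    status = 'priority'
else:
    status = 'standard'

result = status                      # the Return Result tab sets result to your variable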
The Logic Builder is organized into multiple tabs, each serving a specific purpose:
Variables Tab
Define and manage variables used in your transformation logic. You can create variables from:
- Source fields (e.g., x = row['field1'])
- Static values (e.g., y = 'static')
- Numeric values (e.g., z = 42)
- Function calls (e.g., a = my_function(row))
If/Then Tab
Build conditional logic with a visual interface. The If/Then builder supports:
- Multiple conditions with AND/OR operators
- Grouped conditions using parentheses
- Nested conditions (one level deep)
- Multiple Else If branches
- Else blocks for default handling
- Multiple actions within each condition block
Functions Tab
Apply pre-built functions to your data with a searchable, categorized function library. When you select a function, the interface dynamically renders the appropriate input fields.
Return Result Tab
Define the final output of your transformation logic. You can return:
- A variable value
- Static text
- A function result
You can also test your logic with sample data to verify the results before saving.
Python Editor
For advanced users, we provide a full IDE experience with a Python editor where you can write custom transformation code directly. This gives you full control over complex transformations.
# Example Python transformation
def transform_row(row):
    # Access source fields
    first_name = row['FirstName']
    last_name = row['LastName']
    # Return transformed value
    return f"{last_name}, {first_name}"

# Return the result
result = transform_row(row)
Filter Logic
To filter data, use the If/Then tab in the Logic Builder to create conditions that determine which rows are included in or excluded from the output. Setting result to False filters the record out. Any row for which result is never set defaults to True and passes through unfiltered.
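For example, a filter that drops rows from a test region might look like this when expressed as code; the region field and the 'TEST' value are hypothetical.
# Hedged sketch of filter logic; 'region' and 'TEST' are example values
row = {'region': 'TEST'}   # sample row; DataFlowMapper supplies row per record

if str(row['region']).upper() == 'TEST':
    result = False         # False filters this record out of the output
# rows where result is never set default to True and pass through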
Validation Logic
To set validations, use the If/Then tab in the Logic Builder to create conditions that determine whether a record passes or fails in the output data. Setting result to False denotes failure. Upon selecting False, a button appears to add a reason; this reason is shown when reviewing the output data for the failed record. Any record for which result is never set defaults to True and passes validation.
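For example, a validation that fails records with a missing email might look like this; the email field and the reason text are illustrative.
# Hedged sketch of validation logic; the email field is an example
row = {'email': ''}        # sample row; DataFlowMapper supplies row per record

if str(row['email']).strip() == '':
    # Setting result to False marks the record as failed; in the Logic Builder,
    # the failure reason (e.g., "Email address is missing") is entered via the UI
    # and shown alongside the failed record in the output.
    result = False
# records where result is never set default to True and pass validation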
Functions
DataFlowMapper includes a comprehensive library of built-in functions for string manipulation, mathematical operations, date handling, type conversion, and utility operations.
All functions can be dynamically constructed using the Logic Builder with zero coding. Transformation logic built with the Logic Builder will parse back into the UI state and allow you to edit the logic visually.
Functions are given a user-defined name and take 'row' as a parameter. User-defined names can't be duplicates of, or overlap with, reserved words or the names of the function definitions below. To parse Python code back into the UI state, it must match the expected definition. See below for examples of function generation.
# Example function structure with generated Python code
# UPPER - Converts text to uppercase
def myUpperFunction(row):
    return str(row['field1']).upper()

# CONCAT - Joins text values together
def myConcatFunction(row):
    return ''.join([str(x) for x in [row['first_name'], row['last_name']]])
String Functions
Manipulate text data with functions like UPPER, LOWER, CONCAT, SUBSTRING, REPLACE, and more. These functions help you format and transform text fields to meet your requirements.
CONCAT
Joins text values together
Parameters:
- text (required) - Text values to join
Returns:
string
# CONCAT - Joins text values together
''.join([str(x) for x in [row['text_field']]])
LEFT
Extracts the leftmost characters
Parameters:
- text (required) - Text to extract from
- length (required) - Number of characters
Returns:
string
# LEFT - Extracts the leftmost characters
str(row['text_field'])[:int(row['number_field'])]
LENGTH
Returns the length of the text
Parameters:
- text (required) - Text to measure
Returns:
number
# LENGTH - Returns the length of the text
len(str(row['text_field']))
LOWER
Converts text to lowercase
Parameters:
- text (required) - Text to convert to lowercase
Returns:
string
# LOWER - Converts text to lowercase
str(row['text_field']).lower()
REPLACE
Replaces text occurrences
Parameters:
- text (required) - Text to search in
- old_text (required) - Text to replace
- new_text (required) - Replacement text
Returns:
string
# REPLACE - Replaces text occurrences
str(row['text_field']).replace(str(row['text_field']), str(row['text_field']))
RIGHT
Extracts the rightmost characters
Parameters:
- text (required) - Text to extract from
- length (required) - Number of characters
Returns:
string
# RIGHT - Extracts the rightmost characters
str(row['text_field'])[-int(row['number_field']):]
SPLIT
Splits text into list
Parameters:
- text (required) - Text to split
- separator (required) - Separator to split on
Returns:
string
# SPLIT - Splits text into list
str(row['text_field']).split(str(row['text_field']))
SUBSTRING
Extracts a substring
Parameters:
- text (required) - Text to extract from
- start (required) - Starting position (0-based)
- length (required) - Number of characters
Returns:
string
# SUBSTRING - Extracts a substring
str(row['text_field'])[int(row['number_field']):int(row['number_field'])+int(row['number_field'])]
TITLE
Converts first character of each word to uppercase
Parameters:
- text (required) - Text to convert to title case
Returns:
string
# TITLE - Converts first character of each word to uppercase
str(row['text_field']).title()
TRIM
Removes leading/trailing whitespace
Parameters:
- text (required) - Text to trim
Returns:
string
# TRIM - Removes leading/trailing whitespace
str(row['text_field']).strip()
UPPER
Converts text to uppercase
Parameters:
- text (required) - Text to convert to uppercase
Returns:
string
# UPPER - Converts text to uppercase
str(row['text_field']).upper()
Math Functions
Perform calculations with functions like SUM, MULTIPLY, DIVIDE, ROUND, MAX, MIN, and AVERAGE. These functions enable complex numerical transformations.
AVERAGE
Calculates average of numbers
Parameters:
- numbers (required) - Numbers to average
Returns:
number
# AVERAGE - Calculates average of numbers
sum([float(x) for x in [row['number_field']]]) / len([row['number_field']])
DIVIDE
Divides two numbers (handles division by zero)
Parameters:
- dividend (required) - Number to divide
- divisor (required) - Number to divide by
Returns:
number
# DIVIDE - Divides two numbers (handles division by zero)
float(row['number_field']) / float(row['number_field']) if float(row['number_field']) != 0 else 0
MAX
Returns the maximum value
Parameters:
- numbers (required) - Numbers to compare
Returns:
number
# MAX - Returns the maximum value
max([float(x) for x in [row['number_field']]])
MIN
Returns the minimum value
Parameters:
- numbers (required) - Numbers to compare
Returns:
number
# MIN - Returns the minimum value
min([float(x) for x in [row['number_field']]])
MULTIPLY
Multiplies numbers
Parameters:
- numbers (required) - Numbers to multiply
Returns:
number
# MULTIPLY - Multiplies numbers
math.prod([float(x) for x in [row['number_field']]])
ROUND
Rounds a number to specified decimals
Parameters:
- number (required) - Number to round
- decimals (required) - Decimal places
Returns:
number
# ROUND - Rounds a number to specified decimals
round(float(row['number_field']), int(row['number_field']))
SUBTRACT
Subtracts numbers
Parameters:
- numbers (required) - Numbers to subtract
Returns:
number
# SUBTRACT - Subtracts numbers
float(row['number_field']) - float(row['number_field'])
SUM
Calculates the sum of numbers
Parameters:
- numbers (required) - Numbers to sum
Returns:
number
# SUM - Calculates the sum of numbers
sum([float(x) for x in [row['number_field']]])
Date Functions
Work with dates using functions like FORMAT_DATE, DATE_NOW, TODAY, and ADD_TO_DATE. These functions help you manipulate and format date values.
ADD_TO_DATE
Add a specified amount of time to a date
Parameters:
- date (required) - Starting date
- amount (required) - Amount to add
- unit (required) - Time unit
Returns:
string
# ADD_TO_DATE - Add a specified amount of time to a date
pd.to_datetime(row['text_field']) + timedelta(days=row['number_field'])
DATE_NOW
Returns the current local date and time
Returns:
string
# DATE_NOW - Returns the current local date and time
pd.Timestamp.now()
FORMAT_DATE
Format a date using a specific pattern
Parameters:
- date (required) - Date to format
- format (required) - Date format pattern
Returns:
string
# FORMAT_DATE - Format a date using a specific pattern
pd.to_datetime(row['text_field']).strftime(row['text_field'])
TODAY
Returns the current local date
Returns:
string
# TODAY - Returns the current local date
pd.Timestamp.now().date()
Conversion Functions
Convert between data types with functions like TO_TEXT, TO_DATE, TO_BOOLEAN, TO_INT, and TO_FLOAT. These functions ensure your data is in the correct format for processing.
CLEAN_NUMBER
Cleans and removes non-numeric characters and converts to number
Parameters:
- text (required) - Text to clean and convert
Returns:
number
# CLEAN_NUMBER - Cleans and removes non-numeric characters and converts to number
float(''.join(filter(lambda x: x.isdigit() or x == '.', str(row['text_field']))))
IS_DATE
Validates if a string matches a date format
Parameters:
- text (required) - Text to validate
- format (required) - Date format pattern
Returns:
boolean
# IS_DATE - Validates if a string matches a date format
pd.to_datetime(row['text_field'], format=row['text_field'], errors='coerce').notna()
IS_EMPTY
Returns True if a field is empty
Parameters:
- value (required) - Value to check
Returns:
boolean
# IS_EMPTY - Returns True if a field is empty
pd.isna(row['text_field']) or str(row['text_field']).strip() == ''
IS_NUMERIC
Returns True if a value is a number
Parameters:
- value (required) - Value to check
Returns:
boolean
# IS_NUMERIC - Returns True if a value is a number
pd.to_numeric(row['text_field'], errors='coerce').notna()
TO_BOOLEAN
Converts a value to boolean
Parameters:
- value (required) - Value to convert
Returns:
boolean
# TO_BOOLEAN - Converts a value to boolean
bool(row['text_field'])
TO_DATE
Converts a value to a date
Parameters:
- value (required) - Value to convert
- format (optional) - Date format pattern
Returns:
string
# TO_DATE - Converts a value to a date
pd.to_datetime(row['text_field'], format=row['text_field'], errors='coerce')
TO_FLOAT
Convert value to float
Parameters:
- value (required) - Value to convert to float
Returns:
number
# TO_FLOAT - Convert value to float
float(row['text_field'])
TO_INT
Convert value to integer
Parameters:
- value (required) - Value to convert
Returns:
number
# TO_INT - Convert value to integer
int(float(row['text_field']))
TO_NUMERIC
Convert to numeric, coercing errors to NaN
Parameters:
- value (required) - Value to convert to numeric
Returns:
number
# TO_NUMERIC - Convert to numeric, coercing errors to NaN
pd.to_numeric(row['text_field'], errors='coerce')
TO_TEXT
Converts a value to text/string
Parameters:
- value (required) - Value to convert
Returns:
string
# TO_TEXT - Converts a value to text/string
str(row['text_field'])
Utility Functions
Additional utility functions like LOOKUP, VALUECOUNT, and IS_EMPTY help with common data transformation tasks and validations.
APPLY
Apply transformation from another destination field
Parameters:
- field (required) - Field to apply transformation from
Returns:
any
# APPLY - Apply transformation from another destination field
Apply(row['field'])
JOIN
Joins a list into a string
Parameters:
- list (required) - List to join
- separator (required) - Separator to join with
Returns:
string
# JOIN - Joins a list into a string
str(row['text_field']).join([str(x) for x in row['text_field']])
LIST_INDEX
Gets list element at index
Parameters:
- list (required) - List to index
- index (required) - Index to get
Returns:
string
# LIST_INDEX - Gets list element at index
row['text_field'][int(row['number_field'])] if len(row['text_field']) > int(row['number_field']) else None
LIST_LENGTH
Returns list length
Parameters:
- list (required) - List to measure
Returns:
number
# LIST_LENGTH - Returns list length
len(row['text_field'])
LOOKUP
Look up values from dataset by matching fields (returns first match)
Parameters:
- return_field (required) - Field to return
- condition (optional) - Optional filter condition
- condition_value (optional) - Value to match condition
- match_field (required) - Field to match on
Returns:
any
# LOOKUP - Look up values from dataset by matching fields (returns first match)
Lookup("row['text_field'] == row['field']", [row['field']], row['field'])
VALUECOUNT
Returns a count of value occurrences in a field for the given row
Parameters:
- field (required) - Field to count values in
Returns:
number
# VALUECOUNT - Returns a count of value occurrences in a field for the given row
ValueCount(row['field'])
AI Co-Pilot
DataFlowMapper's AI Co-Pilot features help automate and simplify the mapping and transformation process with intelligent suggestions and automation.
Mapping Suggestions
The AI can analyze your source and destination fields to suggest logical mappings. You can review and approve these suggestions to speed up the mapping process. The AI will provide rationale and confidence rating for each mapping.
AI Logic Assist
Describe your transformation requirements in plain English, and the AI will generate the appropriate logic in our visual builder. This feature makes complex transformations accessible to non-technical users. This can be used for transformations and validations.
Map All Feature
The Map All feature allows you to describe your entire transformation requirement in plain language. The AI will analyze your needs and create a complete mapping file, determining which fields need direct mapping and which require custom logic as well as validations. The AI will provide details, rationale, and confidence rating for each mapping.
// Example AI prompt for Map All
"Map customer data from our CRM export to the billing system format.
Combine first and last name into a full_name field.
Convert dates to YYYY-MM-DD format.
Calculate total_spent as the sum of all transaction amounts."