Introduction
DataFlowMapper is a lightweight yet powerful data transformation tool designed to bridge the gap between expensive enterprise ETL solutions and simple data cleaners. It's built for implementation teams and data migration specialists working with CSV, Excel, and JSON files.
Our platform enables you to create repeatable transformations with an intuitive interface, powerful logic builder, and AI-assisted mapping capabilities.
Getting Started
Get up and running with DataFlowMapper quickly. Learn how to run your first data transformation.
Account Setup
Schedule a demo to get your account set up.
Note: DataFlowMapper currently offers a 30-day trial for all new teams and users.
Quick Start Guide
This guide will help you complete your first data transformation in minutes:
Select your file format and upload your CSV, Excel, or JSON file to begin the process.
Define how your source data maps to your desired output format.
Execute the transformation and download your transformed data.
For a more detailed walkthrough, continue to the Basic Workflow section below.
Basic Workflow
The standard DataFlowMapper workflow consists of three main steps:
1. Upload Source File
Start by uploading your source data file. DataFlowMapper supports:
- CSV files (with custom delimiters)
- Excel spreadsheets
- JSON files
You can specify header rows, concatenate multiple files, and configure other import settings.
2. Create or Upload Mapping
You have two options:
- Upload an existing mapping file - If you've previously created a mapping for similar data, you can reuse it.
- Create a new mapping - Use the Mapping Editor to define how your source data maps to your destination format.
- Import headers from a template file or manually define the header fields. The system automatically detects the file format and pulls the first row (or first JSON record).
- Create direct field mappings by selecting from source field dropdown or dragging and dropping from the sidebar
- Add custom transformation logic as needed by clicking Add Custom Logic
- Add validation logic as needed by clicking Add Validation
- Specify output format and save mapping
3. Transform and Download
Once your mapping is complete:
- Execute the transformation
- Transformed data is automatically downloaded in your chosen format
- Review the results and any validations
Pro Tip: Accelerate with AI
Use the AI-Copilot features to speed up your mapping process. Try "Suggest Mappings" to automatically map fields or "Map All" to generate a complete mapping from a plain English description.
Transformation Engine
Overview
The Transformation Engine is a high-performance, Python-based processing pipeline designed for reliability and repeatability. It operates on a row-by-row streaming architecture, ensuring memory efficiency even for large datasets. Upon signing in, you will be on the Transformation Engine screen. Key principles include:
- Stateless Processing: Each row is transformed independently (unless stateful functions like GroupCounter are used), ensuring consistent results regardless of chunk size or execution order.
- Repeatability: The engine is deterministic; the same input + mapping always yields the same output.
- Unified Pipeline: All data, whether from CSV, Excel, JSON, API, or DB, is normalized and read as strings into a standard DataFrame structure before processing.
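A rough sketch of what those principles imply in practice (illustrative pseudocode only, not the engine's actual source; transform_row and field_logic are hypothetical names):

# Illustrative sketch of stateless, deterministic row-by-row processing (not engine source).
def transform_row(row, field_logic):
    # row: dict of source values (all read as strings)
    # field_logic: dict of destination field -> function(row) returning the output value
    return {field: logic(row) for field, logic in field_logic.items()}

def run_pipeline(rows, field_logic):
    # Rows are independent, so the same input + mapping always yields the same output,
    # regardless of chunk size or execution order.
    return [transform_row(row, field_logic) for row in rows]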
Source Configuration
Source Upload
The file upload system is built on a file upload dropzone and handles the initial ingestion of data. Users can drop files in the area or manually click to browse.
- Supported File Types:
- CSV/Text: Flexible support for standard CSV, TSV, pipe-delimited files, and custom delimited text files.
- JSON: Supports both flat arrays of objects and nested structures (auto-detects record paths). When uploading JSON files, the system will automatically detect potential record paths and prompt selection. The data will then be recursively expanded and flattened to effectively be mapped in a tabular fashion.
- Excel: Supports .xlsx and .xls formats.
JSON Path Syntax
When working with JSON files, DataFlowMapper uses a consistent path syntax to reference fields within nested structures. Understanding this syntax is essential for accurate mapping and transformation.
- Record Path: When parsing files in the UI (source files, lookup tables, or header-field templates), the system automatically generates this syntax for you.
- Consistent Field Names: Field names in the mapping interface directly correspond to their paths in the JSON structure.
- Specific Array Indices: For arrays outside the record path, specific indices (like [0], [1]) are used rather than wildcards.
- Record Path Handling: The record path (selected in the UI) is the only place where wildcards ([*]) are used.
- Deep Nesting: The system handles JSON structures of any depth (e.g., employees[0].projects[0].description).
- Array Expansion: Arrays are automatically expanded into individual fields (e.g., skills[0], skills[1]).
Example Structure:
{ "employees": [ { "name": "John", "skills": ["JS", "Python"] } ] }
If employees is the record path, you can map:
- employees[*].name
- employees[*].skills[0]
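A minimal sketch of how that expansion behaves for the example above (illustrative only; in the UI, each flattened key is shown prefixed with the record path, e.g. employees[*].name):

data = {"employees": [{"name": "John", "skills": ["JS", "Python"]}]}

def flatten(record, prefix=""):
    # Expand nested dicts/lists into path-style keys like skills[0], skills[1]
    flat = {}
    if isinstance(record, dict):
        for key, value in record.items():
            flat.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(record, list):
        for index, value in enumerate(record):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        flat[prefix] = record
    return flat

rows = [flatten(record) for record in data["employees"]]
# rows == [{'name': 'John', 'skills[0]': 'JS', 'skills[1]': 'Python'}]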
Constraints:
- Max File Size: Supports any file size.
- Encoding: Automatic detection and standardization to UTF-8 before processing.
- Drag-and-Drop: Visual indicators for valid/invalid file types based on the selected mode (CSV vs. JSON vs. Excel).
- Resetting State: Users can remove uploaded source files or the mapping file directly from the Transformation Form by clicking the "X" button next to the file name in the bottom portion of the screen. This resets the respective configuration, allowing you to upload a new file or start fresh.
Input Types
Near the upload area for source data, there is a selector to change the source mode between File, API, DB, and Cloud. API, DB, and Cloud connections can be configured by clicking on the source in the selector, then clicking the gear icon on the respective source area.
- File: Supported formats: CSV, Excel, JSON. Upload files directly.
- API:
- Connection Manager: A centralized hub accessible via the settings (gear) icon. It organizes your API landscape into three tabs: Connections (Authentication), Sources (Read endpoints), and Destinations (Write endpoints).
- Authentication: Supports Basic Auth, API Key (Header or Query Parameter), Bearer Token, and OAuth2 Client Credentials.
- Security: All credentials are encrypted at rest using Fernet encryption.
- Source Configuration:
- Methods: Supports GET and POST. For POST requests, you can define payloads using JSON or Form Data templates.
- Response Parsing (Record Path): The system automatically scans JSON responses to detect arrays (e.g., data.items). The Record Path Selector allows you to choose exactly which nested array to extract rows from, or use the root array.
- Pagination: Implements multiple strategies configurable in the source settings:
- Offset-based: Standard limit and offset parameters.
- Page-based: Uses page and per_page parameters.
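The two strategies behave roughly like the following sketch using the requests library (the endpoint URL and JSON response shape are assumptions; the real behavior is driven by the source settings):

import requests

def fetch_offset_based(url, limit=100):
    # Offset-based: request pages using limit/offset until a short page is returned.
    rows, offset = [], 0
    while True:
        batch = requests.get(url, params={"limit": limit, "offset": offset}).json()
        rows.extend(batch)
        if len(batch) < limit:
            return rows
        offset += limit

def fetch_page_based(url, per_page=100):
    # Page-based: request pages using page/per_page until an empty page is returned.
    rows, page = [], 1
    while True:
        batch = requests.get(url, params={"page": page, "per_page": per_page}).json()
        if not batch:
            return rows
        rows.extend(batch)
        page += 1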
Destination Configuration (Push Data)
The system supports pushing transformed data to external systems. Destinations are configured within the same Connection Manager used for sources.
- API Destinations:
- Methods: Supports POST and PUT operations.
- Body Format: Configurable as JSON or Form Data. The transformed data becomes the request payload.
- Batching: Users can define a Batch Size to chunk large datasets into multiple requests (see the sketch after this list).
- Validation: You can define a list of "Expected Response Codes" (default: 200, 201, 202). If the API returns a code outside this list, the batch is marked as failed.
- Response Viewer: Upon completion, a modal displays the full API response payload, separated into "Our Response" (internal status) and "Destination Response" (external API body) for debugging.
- Database Destinations:
- Operations:
- Insert: Adds new rows to a table. Identity/auto-increment columns are automatically detected and excluded from the insert payload to prevent errors.
- Update: Modifies existing rows. Requires explicit selection of Key Columns (Primary Keys) to construct the WHERE clause.
- Stored Procedure: Passes row data as parameters to a procedure.
- Mapping:
- By Name: Auto-matches source fields to destination columns by name.
- By Position: Maps fields based on their order index.
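The batching and response-code validation described for API Destinations can be pictured roughly as follows (a sketch using the requests library; the endpoint and payload shape are assumptions):

import requests

def push_in_batches(url, rows, batch_size=500, expected_codes=(200, 201, 202)):
    # Chunk transformed rows into batches and POST each batch as a JSON payload.
    results = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        response = requests.post(url, json=batch)
        # A batch is marked as failed if the status code is not in the expected list.
        results.append({
            "batch_start": start,
            "status": response.status_code,
            "ok": response.status_code in expected_codes,
        })
    return results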
File Settings Panel
Located next to the source selection toggle, accessed via the Gear Icon.
- Header Management:
- Header Row Input: Allows specifying the row number containing headers. Note: Uses 0-based index logic in UI tooltips ("Row 0 = No header"), but defaults to 1 (standard 1-based count) for user intuition.
- Refresh Headers: A Refresh button may appear upon loading a mapping file. It triggers a re-parse of the already uploaded file with new settings (delimiter, header row), so you don't need to re-upload the file just to change parsing options.
- Drift Warning: If settings are changed manually (e.g., changing the delimiter from comma to pipe), a warning appears indicating that the current preview data may be out of sync with the new settings. Users are given the option to refresh headers using the same Refresh button described above.
- Multi-File:
- Concatenation: Enabling "Multiple files" allows uploading multiple source files. The backend concatenates them into a single dataset before processing. Headers are taken from the first file. Column layout should be the same between each file.
- Source Filename Injection: An optional checkbox ("Add file name column") injects a new column source_filename into the dataset, allowing you to track the origin of each row.
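Conceptually, multi-file concatenation with filename injection is equivalent to something like this pandas sketch (file paths and reader options are illustrative):

import pandas as pd

def load_multiple_files(paths, add_filename_column=False):
    # Read every file as strings, optionally tag rows with their source file,
    # then concatenate into a single dataset (all files share the same column layout).
    frames = []
    for path in paths:
        frame = pd.read_csv(path, dtype=str)
        if add_filename_column:
            frame["source_filename"] = path
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)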
Mapping Configuration
Loading Mappings
- From Library: Users can load mapping templates directly from the Template Library.
- Access: Click "Load from Library" in the Transformation Form.
- Browsing: The library displays both Personal templates (created by you) and Shared templates (created by team members).
- Process: Selecting a template fetches the encrypted content, decrypts it, and parses it into a virtual mapping file. This triggers the standard reconciliation process for any associated lookup tables.
- From File: Users can upload a previously saved .dfm or .csv mapping file.
- File Structure: The .dfm file is technically a CSV file containing metadata headers (source config), destination fields, and transformation logic.
- Upload: Drag and drop the file into the "Mapping File" zone.
- Validation:
- Structure Check: The system verifies the file follows the expected CSV structure.
- Reference Reconciliation: (See "Reference Checks" below) The system verifies that all Lookups/Reference Tables used in the mapping exist in your library.
- Source Field Validation: When entering the Mapping Editor or running a transformation, the system validates that the source fields defined in the mapping actually exist in your currently uploaded Source File. If fields are missing, you will be warned but allowed to proceed to fix the mapping.
Reference Checks (Technical Deep Dive)
When loading a mapping file (.dfm or .csv), the system performs a rigorous check to reconcile any included lookup tables with the user's personal Library. This ensures data integrity while preventing accidental overwrites. The logic categorizes incoming tables into four "buckets":
- Bucket A: Exact Match (Hash Exists)
- Scenario: The incoming table's content hash matches a table already in the user's library.
- Action: The system automatically links to the existing library table. No user action is required.
- Bucket B: Name Reconciliation
- Scenario: The table exists in the library (same content hash), but the name in the mapping file differs from the name in the library (e.g., "CustData" vs "Customer_Data").
- Action: The system auto-updates the mapping configuration to use the Library's name. It effectively "heals" the mapping to match the local environment.
- Bucket C: Hash Conflict (Critical)
- Scenario: A table with the same name exists in the library, but the content hash is different. This indicates a drift between the mapping's expected data and the user's local copy.
- Action: A Hash Reconciliation Modal appears, forcing the user to resolve the conflict via one of three paths:
- Override: Replace the local library table with the version from the mapping file.
- Rename: Import the mapping's version as a new, separate table with a different name.
- Use Library Version: Ignore the mapping's version and use the existing local table (useful if the local version is newer).
- Bucket D: New Table
- Scenario: The table does not exist in the library (neither name nor hash matches).
- Action: The system prompts the user to import the new table into their library via the New Tables Found Modal.
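The bucketing logic can be summarized in pseudocode (a simplified sketch; the real fingerprint and library structures are internal):

import hashlib
import json

def content_hash(table_rows):
    # Fingerprint the table by content (order-insensitive), not by filename.
    canonical = json.dumps(sorted(json.dumps(row, sort_keys=True) for row in table_rows))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def classify_incoming_table(name, rows, library):
    # library: dict of {table_name: content_hash} representing the user's Library
    incoming_hash = content_hash(rows)
    if library.get(name) == incoming_hash:
        return "Bucket A: exact match"          # link automatically
    if incoming_hash in library.values():
        return "Bucket B: name reconciliation"  # same content, different name
    if name in library:
        return "Bucket C: hash conflict"        # same name, different content -> modal
    return "Bucket D: new table"                # prompt to import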
Execution
Job Execution & Configuration
Located in the main Transformation Form, the execution controls allow you to fine-tune how your data is processed.
- Transform Data Button:
- Pre-Flight Check: Initiates a secure WebSocket connection.
- Handshake Protocol: Enforces a strict "ready_check" -> "ready_ack" handshake to ensure channel stability before processing begins.
- Mapping Validation: The system scans the source file and mapping file to confirm all fields that are referenced exist in the source file.
- Run Configuration (Settings):
Accessed via the Sliders Icon next to the history button, this popover allows you to override default behaviors for specific runs.
- Run Validations: (Default: Checked) Toggles whether the system performs validation checks against your rules. Unchecking this skips validation logic for faster processing during initial testing.
- Row Limit: Allows you to process a specific number of rows (e.g., 50) from the start of your file. This is ideal for rapidly testing logic changes without waiting for large datasets to process.
- Override Indicator: An amber indicator appears on the settings icon if any defaults are changed (e.g., Row Limit is active), providing a visual cue that the job will run with custom parameters.
Transformation History
Accessed via the History Button (Clock icon), the Run History panel provides a secure audit trail of your recent transformation jobs.
- Session Continuity: View the real-time status of all jobs (Queued, Processing, Completed, Failed) from your recent sessions.
- Ephemeral Retention: For enterprise security, job data and download links are strictly retained for 24 hours. After this period, data is automatically purged from the system to ensure compliance.
- Actions:
- Restore: Click any job to load its results, logs, and download links into the main view.
- Reprocess: Quickly re-run a past job with the exact same settings or modified parameters. Key Benefit: This allows you to iterate on your mapping logic without needing to re-upload the source file each time, significantly speeding up the development cycle.
- Status Indicators: Visual cues provide instant feedback on job health, including breakdown counts for validation errors vs. system errors.
Error Handling & Reliability
DataFlowMapper is engineered to be resilient. Data transformation is messy, and source data is often imperfect. Our engine ensures that a single bad character, unexpected null value, or divide-by-zero error does not crash your entire job.
Isolation Architecture
Cell-Level Isolation
The engine is granular. If a specific cell triggers a transformation error (e.g., trying to divide by zero in the "Price" field):
- Catches the error for that specific cell.
- Logs the issue with a user-friendly explanation.
- Leaves that cell empty in the output.
- Continues processing other fields in the same row.
- Continues processing subsequent rows.
This means a single calculation error doesn't discard the entire row's data. You keep the valid data (Name, Date, etc.) while only losing the specific value that failed. The row will be flagged as "Failed" in the summary statistics to alert you to the partial data loss.
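In effect, the engine wraps every cell-level operation in its own guard, along the lines of this sketch (illustrative only, not the actual implementation):

def transform_row_isolated(row, field_logic, log):
    # Each field's logic runs in its own try/except, so one bad cell never
    # discards the rest of the row or stops the job.
    output, row_failed = {}, False
    for field, logic in field_logic.items():
        try:
            output[field] = logic(row)
        except Exception as exc:
            log.append(f"{field}: {exc}")  # mapped to a friendly message in the real log
            output[field] = ""             # leave the failing cell empty
            row_failed = True
    return output, row_failed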
Field-Level Compilation
Logic for each field is compiled independently. If you make a syntax error in the custom logic for one field (e.g., missing a closing parenthesis):
- That specific field will be flagged and skipped (resulting in empty values).
- All other fields will continue to process normally.
- You will receive a warning log identifying exactly which field has the syntax error.
Intelligent Error Reporting
Instead of cryptic Python tracebacks, the system maps technical errors to helpful, actionable feedback via our Error Catalog. This feedback can be found in the log post-transformation.
System Reliability
- Proactive Type Safety: Before saving results, the system scans for mixed data types (e.g., a column with both Numbers and Text) and automatically standardizes them to prevent file corruption or schema errors. This is only done when required, so as to prevent unexpected data corruption or interference.
- Timeouts: To protect system resources, infinite loops or extremely complex logic will trigger a safety timeout. This is one of the few "Fatal" errors that will stop a job to prevent hanging.
Transformation Log Summary
A detailed report card displayed after execution in the transformation summary card.
- Metrics: Visual bars showing:
- Processed: Total rows handled.
- Pre-Filtered: Rows removed before transformation logic.
- Post-Filtered: Rows removed after transformation logic.
- Errors: Rows that failed transformation.
- Warnings: Non-fatal issues.
- View Log: Opens a modal with the raw text log for deep debugging.
- Push Data: Dropdown to send results directly to API, Database, or Cloud Storage. Hovering over the Push Data button lets you choose which destination type to upload to. An expandable section then appears with a destination selector for that type, followed by modals to configure where and how to push the data.
- Download: A secure download link for the transformed file. Security Note: This link has a strict 24-hour expiration timer for data security.
Data Viewer
An interactive preview of the transformation results and data validation.
- Error Navigation: A "Find error" button jumps the view directly to the next row containing a validation error.
- Filtering:
- Show Errors Only: Toggle to hide all successful rows and focus on fixes.
- Column Filtering: Searchable multi-select checkboxes for filtering specific column values or specific errors.
- Export: "Export Report" button allows downloading the current view (respecting active filters) for offline analysis with validations and messages highlighted in the cells.
- Row Deletion: Users can select and "Delete Selected" rows from the dataset before final export.
AI-Powered Data Onboarding (Agents)
Auto-Generate
The AI-Powered Data Onboarding system (Async AI) automates the creation of complex mapping configurations by using an iterative, agentic approach. It runs as a background job, allowing users to continue working while the AI analyzes, maps, and refines the data transformation logic. (The Async AI agent doesn't create validation rules)
- Entry Point: Accessed via the AI button on the main Transformation Form, inside the Mapping Configuration card. Options include "Generate Mapping", "View Jobs", and "Configure Instructions".
Instruction Profile
Instruction Profiles are reusable blueprints that guide the AI agent. They define what the output should look like and how to handle the data.
- Creation Flow:
- Basic Info: Name and optional description.
- Template Selection: Choose a mapping file from the library. This defines the target schema (fields) and validation rules by taking them from an existing template. The validation rules from this template serve as a guardrail for the AI's logic.
- Instructions: A free-text system prompt (e.g., "Map price data from source to destination. Ensure all prices are USD. Validate positive numbers.").
- Review: Summary of settings and advanced configuration (Max Attempts, Sample Size).
- Advanced Settings:
- Max Refinement Attempts: (Default: 3, Max: 5) How many times the AI should try to fix errors by analyzing logs and adjusting logic.
- Sample Size: (Default: 1000) Number of rows to use for validation sampling during the refinement process.
Job Kickoff
The process of starting a new AI generation task.
- Prerequisites: A source file must be uploaded in the Transformation Form.
- Steps:
- Source Info: Confirms the file name, type (CSV/JSON/Excel), and parsing settings.
- Reference Tables (Optional): Users can upload new reference files or select existing lookup tables from their library. These tables are made available to the AI context for lookups and enrichment.
- Review: Displays a summary of the profile, source file, and credit estimation before starting.
- Credit Check: The system estimates credit usage based on complexity and blocks execution if the user has insufficient credits.
Job List (Monitoring)
A centralized dashboard to track the status of all AI jobs.
- Statuses:
- QUEUED: Waiting to be processed.
- GENERATING_MAPPING: AI is creating the initial field-to-field logic.
- TRANSFORMING: Running the generated logic against the sample data.
- VALIDATING: Checking results against template validation rules.
- ANALYZING: AI is reviewing error logs to understand root causes.
- REFINING: AI is rewriting logic to fix identified errors (looping back to transform).
- COMPLETED: Successfully finished.
- COMPLETED_WITH_WARNINGS/ERRORS: Finished, but some rows failed or raised warnings.
- FAILED: System error or max attempts reached without success.
- Actions:
- View Details: Opens the detailed Job Report.
- Cancel: Stops a running job (revokes the Celery task).
- Delete: Removes the job record and associated temporary files.
Job Detail & Results
Clicking a completed job opens the Job Detail Modal, which serves as the final report and integration point.
- Metrics: Displays a summary of rows processed, success rate, and error counts.
- Root Cause Analysis: If errors occurred, the system categorizes them into:
- BAD_LOGIC: The AI generated incorrect code (e.g., syntax error, wrong logic). The agent attempts to fix these automatically.
- BAD_DATA: The source data itself is flawed (e.g., "N/A" in a numeric field). The agent flags these and does not retry, as code cannot fix bad data.
- Apply Mapping: A primary action that loads the generated mapping configuration directly into the Mapping Editor, allowing the user to review and finalize the setup manually.
- Download Output: Provides a link to download the transformed sample data.
Technical Architecture (Invisible Logic)
- Celery Pipeline: Jobs are executed as asynchronous tasks.
- Iterative Refinement:
- Attempt 1: The AI generates an initial mapping for all fields based on the profile instructions and source statistics.
- Error Analysis: If validation or transformation errors occur, the system parses the logs and categorizes errors.
- Targeted Refinement: For subsequent attempts (2 to N), the AI only re-generates logic for fields identified as having "BAD_LOGIC". It preserves the working logic for other fields.
- Fresh Template Data: At runtime, the job fetches the latest version of the mapping file to ensure it uses the most up-to-date fields and validation rules, even if the profile was created weeks ago.
- Cleanup: A scheduled job runs to remove expired temporary files (transformed outputs, source files) from storage to manage security.
Mapping Editor
The Mapping Editor is the central workspace where users define how source data is transformed into the desired output format by creating or editing mapping files.
New Mapping Field Configuration
The Mapping Editor requires you to define destination fields which will determine the fields your transformed data has. Upon opening the Mapping Editor to create a new mapping, you'll be prompted to choose how you want to start the mapping.
- Start from Scratch: Each field is created manually. After selecting this option, the overlay closes and you're left with an empty Mapping Editor. Clicking Add Field in the top-left corner of the Mapping Editor creates additional fields (represented by field mapping cards).
- Get Fields from File: If you are starting a new mapping from scratch you can use the "Get Fields from File" option in the initial overlay. This allows you to upload a sample file (CSV/Excel/JSON) solely to extract the header names (first row only), populating the field mapping cards for each field so you can begin mapping immediately. When uploading a JSON file, the user will be prompted to select the record path and the system will automatically determine the proper syntax to create the transformed JSON in that format.
AI Copilot Tools
Suggest Mappings
This feature uses a backend algorithm to analyze field names and sample data to propose 1-to-1 connections between source and destination fields.
- Confidence Scores: Suggestions are color-coded based on confidence:
- Green (90-100%): Exact or near-exact matches (e.g., "Email" -> "email_address").
- Yellow (70-89%): Strong semantic matches (e.g., "DOB" -> "Birth Date").
- Red (<50%): Low confidence or speculative matches.
- Behavior: Users can "Select All" high-confidence matches or manually toggle individual proposals before applying them. The AI provides a rationale for each suggested field, explaining why it made the match.
Map All (Orchestrator)
A powerful, multi-step wizard that converts plain English instructions into a complete mapping configuration.
- Analyze: The AI scans user requirements and categorizes them into Direct Mappings, Transformations, Validations, and Filters.
- Prime: A shared context is established, including sample data and available functions.
- Generate: The system executes parallel logic generation tasks for every field.
- Review: A detailed review table appears, allowing users to inspect the generated code and rationale for each field before final approval. Users can selectively approve or decline each piece of the configuration.
Field Mappings
Each card in the editor represents a single destination field and its transformation logic.
- Structure:
- Destination Field: The target column name (e.g., Full Name).
- Source Field: The input column (e.g., first_name) for direct 1:1 mapping.
- Custom Logic: Python code for complex transformations (e.g., row['first'] + ' ' + row['last']); see the sketch after this list.
- Validation: Logic to check data integrity (e.g., len(value) > 0).
- Drag-and-Drop: Fields can be reordered via drag-and-drop by clicking and dragging the grip in the top-left corner of each card.
- Layouts: Users can toggle between a "Card" view (Horizontal) and a "List" view (Vertical) depending on their screen real estate. This can be done by clicking the Switch to Vertical/Horizontal button in the bottom left corner of the Mapping Editor.
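For example, the Custom Logic for a Full Name card might look like this sketch, where result is the reserved output variable described under Code Generation later in this document (field names are illustrative):

# Custom Logic sketch for a "Full Name" destination field
first = str(row['first']).strip()
last = str(row['last']).strip()
result = (first + ' ' + last).strip()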
Rules & Constraints
To ensure data integrity, the system enforces the following rules for field mappings:
- Mandatory Naming: Every destination field must have a non-empty name.
- Unique Names: Destination field names must be unique across the entire mapping. Duplicate names will be flagged as errors.
- Mutually Exclusive Logic: A field can be mapped via Direct Source Mapping (selecting a source column) OR Custom Logic (using Python code), but not both.
- Priority: If Custom Logic is present, it takes precedence.
- UI Behavior: The system prevents simultaneous selection to avoid ambiguity.
Saving Mappings
- Save to Library:
- Ownership Check:
- Owned Template: If you own a template with the same name, the system updates it to the next version upon saving, effectively overwriting.
- Shared Template: If the template is owned by someone else, the system forces a "Save as Copy" to prevent overwriting shared resources.
- Metadata: Users can tag templates and add metadata (Customer, Source Vendor) to organize their library.
- Ownership Check:
- Save Local:
- File System Access: If supported by the browser, the system uses the native "Save As" dialog to write directly to the user's disk. Otherwise, it triggers a standard browser download of the .dfm (CSV) file.
- Outbound File Type
- Configure Outbound File Type: Above the save buttons is a selector with options of CSV, Excel, or JSON. Whichever is selected determines what the output of the transformation will be.
Advanced Features
- Pre-Filters:
- Logic that runs before any row transformations.
- Use Case: Skipping bad data, ignoring empty rows, or filtering out data based on raw source values.
- Post-Filters:
- Logic that runs after all transformations are complete.
- Use Case: Validating the final output structure or removing rows where calculations resulted in specific values (e.g., "Remove rows where Total < 0").
- Importing Validations:
- Users can import validation rules from other mapping files (.dfm or .csv). The system matches rules to destination fields by name. Validations can be pulled from templates in the library or from .dfm mapping files via upload.
- Conflict Resolution: If a field already has validation logic, the system warns the user that it will be overwritten.
Logic Builder
The Logic Builder is a dual-mode interface (Visual + Code) that allows users to define custom transformation logic for a specific field. It bridges the gap between drag-and-drop simplicity and full Python flexibility.
AI Logic Assist
The AI Logic Assist modal allows users to describe their desired transformation in plain English, and the system generates the corresponding Python code.
- Access: Click the "AI Logic Assist" button (Wand icon) in the bottom right corner next to the Save button.
- Functionality:
- Prompting: Users type a description (e.g., "If the status is 'active', return 'Yes', otherwise 'No'"). The existing code is passed as context to the AI so it can be used for troubleshooting and refinement.
- Generation: The system generates Python code using standard library functions where possible, falling back to arbitrary Python for complex scenarios.
- Preview: The generated code is displayed with an explanation.
- Apply: Clicking "Apply" inserts the code into the Manual Tab, replacing any existing code. Click Save or Parse to UI to retain the changes.
- Modes: It supports generation for Transformation Logic, Validation Logic, and Filter Logic depending on the context.
Progress Checklist & Execution Order
The right-hand panel of the Logic Builder displays the Progress Checklist, which serves a dual purpose as both a visual history and an execution controller.
- Execution Order: The checklist strictly defines the Python execution order (top-to-bottom). The generated code will execute operations exactly as they appear in this list.
- Visual History: It lists all variables, logic blocks, and function calls created for the current field.
- Drag-and-Drop Reordering: Users can drag items in this list to reorder them. Reordering items in the checklist directly rewrites the underlying code sequence to match the new order.
- Smart Grouping: Complex logic structures, such as If/Else If/Else chains, move as a single unit to maintain logic integrity. You cannot drag an Else block away from its parent If block.
- Editing: Clicking the pencil icon on any item in the checklist (e.g., a Variable or Function) instantly switches the main view to the relevant tab and populates the form with that item's details for editing.
- Deletion: Clicking the trash icon on any item in the checklist (e.g., a Variable or Function) deletes it from the generated code and the Logic Builder state.
- Unparsable Blocks: Code sections that cannot be represented visually (e.g., complex loops) appear as "Unparsable" items. Clicking the "Edit" pencil on these items automatically switches you to the Manual Tab, highlighting the specific lines of code for manual review.
Core Interface
The Logic Builder interface is divided into 5 main tabs, each representing a stage in the logic construction process. It uses a state synchronization engine to ensure that changes in visual blocks are instantly reflected in the underlying Python code, and vice versa.
- Initialization (Auto-Parsing): When you open the Logic Builder for a field that already has logic, the system automatically sends the existing Python code to the backend parser. It attempts to reconstruct the visual state (Variables, Logic Blocks, Functions). If parts of the code are too complex to be represented visually (e.g., custom loops), they are flagged as "Unparsable" but preserved safely in the execution flow.
1. Variables Tab
The Variables Tab is the starting point for defining intermediate values, calculations, or clean references to source data. Variables created here are local to the transformation function for the specific row being processed.
- UI Components:
- "Add Variable" Button: Located at the bottom of the variables list. Clicking it adds a new row to the definition table.
- Name Input: A text field to define the variable name (e.g., clean_price). The system validates that names are valid Python identifiers.
- Value Input: A Switchable Input field (see below) that allows you to assign a value.
- Delete Button (Trash Icon): Located on the right of each variable row.
- Workflow:
- Click Add Variable.
- Define a Name and a Value.
- Click Apply Variables to save the variable to the session state and add it to the Progress Checklist.
- Scope: Variables declared here are local to the transformation function for this specific row.
- Inputs: Users define a Name and a Value.
- Value Types:
- Static: Hardcoded values (strings, numbers).
- Source: References to source fields (e.g., row['Field A']).
- Function: Result of a function call.
- Switchable Inputs: Most value inputs allow toggling between Static Value (hardcoded text/number) and Source/Variable/Function (dynamic reference) modes. This is indicated by a toggle icon inside the input field.
2. If/Then Tab
The If/Then Tab is a visual block builder for creating conditional logic trees. It allows you to define complex branching execution paths without writing code.
- UI Components:
- "Add If Block" Button: Located at the top of the canvas. Creates a new root-level conditional block.
- Condition Builder: Inside each block, you can add conditions.
- + Condition: Adds a single comparison (e.g., Field A == 'Value').
- + Group: Adds a nested group of conditions (e.g., (A==1 AND B==2)).
- Action Definition: Each block has an "Actions" section defining what happens if the condition is TRUE.
- Set Variable: Assign a value to a pre-defined variable.
- Set Result: Immediately return a value and exit the function.
- Run Function: Execute a side-effect function (like ADD_TO_LIST).
- Nesting Controls: Buttons to add Else If or Else branches to an existing If block, at root level or on nested blocks.
- Workflow:
- Click Add If Block.
- Build your conditions (e.g., Field A == 'Value').
- Define actions to take if the condition is true.
- Click Apply Logic to commit the block to the Progress Checklist.
- Structure: Supports If, Else If, and Else blocks.
3. Functions Tab
The Functions Tab provides access to a comprehensive library of pre-built Python functions for data manipulation (String, Math, Date, Utility, Conversion).
- UI Components:
- Search Bar: Real-time filter to find functions by name or description.
- Category Filter: Chips to filter by type (String, Math, Date, etc.).
- Function List: Clickable cards for each available function.
- Configuration Form: Appears when a function is selected.
- Parameter Inputs: Dynamic inputs based on the function's requirements.
- Variable Name (Output): If the function returns a value, you must assign it to a variable name.
- Workflow:
- Give the function a descriptive name. This will be used to reference the function in other inputs and to track it in the Progress Checklist.
- Select a function from the library (Search or Browse).
- Configure the required Parameters.
- Click Apply Function to save the function call.
- Categories: String, Math, Date, Utility, Conversion.
- Configuration: Selecting a function opens a configuration form. Users must give each function they define a name that doesn't conflict with reserved words.
- Parameters: Inputs can be static values, source fields, or variables.
- Switchable Inputs: Some inputs allow toggling between "Static Value" and "Source" modes. "Source" mode allows for an input with dropdowns of either variables, source fields, functions, or a mix of one or more.
- Output: The result of a function is typically assigned to a variable or used as the final result.
4. Return Result
The Return Result Tab defines the final output value of the transformation logic. This is mandatory for "Transform Mode".
- UI Components:
- Result Type Selector: Dropdown to choose between Static Value, Variable, or Function Result.
- Value Selector: Context-aware input to select the specific value or variable to return.
- Test Console: A collapsible panel at the bottom of the screen.
- "Run Test" Button: Executes the current logic against the first 500 rows of sample data.
- Output Preview: Shows the Input (Source) and Output (Result) side-by-side for verification.
- Workflow:
- Select the return type (Static, Variable, or Function).
- Choose the specific value/item.
- Click Apply Result.
- Types:
- Static: A fixed value.
- Variable: The value of a variable (potentially modified by If/Then logic).
- Function: The direct result of a function call.
- Test Console: A built-in tester that allows running the current logic against sample rows (1-500) to verify the output without saving.
5. Manual / Python Code
The Manual Tab features a full-featured Monaco code editor, providing raw access to the underlying Python logic. This tab acts as the ultimate source of truth.
- UI Components:
- Code Editor: Syntax-highlighted Python editor with auto-completion.
- "Parse to UI" Button (Checkmark): Located in the top-right of the editor toolbar. This attempts to convert your manual code back into visual blocks.
- Syncing Direction:
- Visual → Code: As you build logic in the other 4 tabs, this code updates in real-time.
- Code → Visual: If you write code manually, you must click the "Parse to UI" button (Checkmark icon) to sync it back to the visual blocks.
- Unparsable Blocks:
- Definition: Valid Python code that doesn't map to a specific UI component (e.g., for loops, try/except blocks, or complex class definitions).
- Behavior: These blocks are preserved exactly as written. In the visual tabs and checklist, they appear as "Unparsable Elements". You can still edit them here in the Manual Tab.
- Execution: Crucially, unparsable code is still valid. It will execute correctly during transformation. The "Unparsable" label refers only to the UI's inability to render a drag-and-drop block for it.
- Deletion: To remove unparsable logic, you can either:
- Delete the code directly in this editor and click Save or Parse to UI
- Click the trash icon on the corresponding "Unparsable" item in the Progress Checklist in most scenarios.
- Environment: The code runs in a sandboxed Python environment where row is a dictionary of the current row's data. Import statements are prohibited in the sandbox, but the following are available as injected context: pandas as pd, re, math, and Decimal.
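For example, the following manual code stays within those sandbox rules, using only the injected context (the field name and the 10% uplift are illustrative):

# Clean a price string and apply a 10% uplift using only the injected context (no imports).
raw = str(row['unit_price'])
digits = re.sub(r'[^0-9.\-]', '', raw)
if digits:
    result = Decimal(digits) * Decimal('1.1')
else:
    result = None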
Global Actions
- Save Logic: Validates the code and saves it to the mapping configuration. If errors exist (syntax errors, missing variables), a warning is shown before saving. Code from the Manual tab is always saved over the read-only Python Preview.
- X Button: The Logic Builder can be closed by clicking the X in the top-right corner. The user will be prompted before leaving with unsaved changes.
Technical Behaviors (Reference)
Input Handling (Auto-Casting)
To ensure a smooth user experience ("Pit of Success"), the system employs smart input handling for Variable and If/Then inputs to interpret user intent without explicit type dropdowns. Function inputs depend on parameter types and are formatted accordingly; for text parameters, the smart quoting logic described below still applies.
- Quotes = String: If a value is wrapped in matching quotes ("123" or '123'), it is treated strictly as a string.
- Number = Number: If a value is a valid raw number (123, 10.5), it is treated as a number (Integer or Float/Decimal).
- Boolean = Boolean: True and False are treated as Python booleans.
- List = List: If a value is wrapped in matching brackets ([ and ]), it is treated as a list literal.
- Fallback = String: Any other value (e.g., 0123, ABC) is automatically wrapped in quotes and treated as a string.
- Safety: This prevents 0123 from being interpreted as an invalid octal number in Python, or ABC from being seen as an undefined variable.
- None: None is treated as Python None and is not wrapped in quotes.
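A simplified sketch of that casting heuristic (illustrative only, not the exact implementation):

def auto_cast(raw):
    # Return the literal text that will be embedded in the generated Python code.
    text = raw.strip()
    if (text.startswith('"') and text.endswith('"')) or \
       (text.startswith("'") and text.endswith("'")):
        return text                                  # quoted: treat strictly as a string
    if text in ("True", "False", "None"):
        return text                                  # Python boolean / None literals
    if text.startswith("[") and text.endswith("]"):
        return text                                  # list literal
    try:
        float(text)
        leading_zero = text.startswith("0") and len(text) > 1 and "." not in text
        if not leading_zero:
            return text                              # valid raw number
    except ValueError:
        pass
    return repr(text)                                # fallback: wrap in quotes as a string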
Code Generation
The visual blocks are transpiled into a Python function body:
- Variables: Generated as var_name = value.
- Logic: Visual blocks become standard Python if/elif/else structures.
- Functions: Functions are given a user-defined name and are formatted in the code as userDefinedFunctionName(row), where row is passed as a parameter. See the function definitions below for examples.
- Result: The final output is assigned to a reserved variable, result.
- Row Context: All source field references are compiled to row['Field Name'].
Logic Builder Modes
The Logic Builder adapts its interface and expected output based on the context in which it is invoked.
- Transform Mode (Default):
- Goal: Produce a final value for a destination field.
- Context: Accessed via the "Custom Logic" code icon on a field card in the Mapping Editor.
- Interface: All 5 tabs are available. The "Return Result" tab is active by default. Result can be set in "Return Result" or dynamically in "If/Then" tab.
- Expected Output: Any output from static (String, Number, Boolean, List, etc.), variable, or function.
- Result Handling: The value returned by the logic is written directly to the destination field for that row.
- Validation Mode:
- Goal: Check if a row's data conforms to specific rules.
- Context: Accessed via the "Validation Logic" shield icon on a field card.
- Interface: Similar to Transform mode, but focused on boolean logic.
- Expected Output (see the validation sketch after this list of modes):
- True: The row is valid.
- [False, "Error Message"]: The row is invalid. The string provided is logged as the error reason.
- Result Handling: If invalid, the row is flagged in the transformation report and can be reviewed in the Data Viewer after running the transformation. The "Return Result" tab is not available; the result is set dynamically based on the validation conditions in the "If/Then" tab. By default, or if not specified, all rows default to True.
- Filter Mode (Pre/Post):
- Goal: Decide whether a row should be included in the processing pipeline.
- Context: Accessed via the "Add Pre-Processing Filter" or "Add Post-Processing Filter" buttons in the Mapping Editor. Note that unlike validation logic and custom logic, the filters are not attached to field mappings; they are two distinct buttons below the field mapping scroll area.
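As referenced above, a Validation Mode result built in the If/Then tab ends up behaving like this sketch (the field name and error message are illustrative):

# Validation sketch: flag rows whose email is missing the "@" symbol
if '@' in str(row['email']):
    result = True
else:
    result = [False, "Email is missing the @ symbol"]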
Lookup Panel (Lookup/Reference Tables)
The Lookup Panel is located in the Left Sidebar, accessible by clicking the Lookup tab. It serves as the central hub for managing reference data used in your mappings. Lookup tables are used with the LocalLookup and Fuzzy_LocalLookup functions to pull data in from tables.
Management
- Library vs. Workspace:
- Library (Global): Your permanent, private collection of lookup tables. This is your "source of truth".
- Access: Click the Library button (Database icon) in the sidebar.
- Workspace (Local): The specific tables active in your current mapping project.
- Access: The main view of the Lookup Panel shows "filled sockets" for each active table in the mapping.
- Behavior: Removing a table from the Workspace (via the Trash icon on the socket) does not delete it from your Library.
- Adding New Tables:
- Click the New button in the sidebar to open the Add Lookup Table Modal.
- Upload: Drag and drop your CSV, Excel, or JSON file into the upload zone.
- Safety: The system checks your library for duplicate names to prevent confusion. It will also check the table's data to confirm another table with the exact same data doesn't already exist.
- Replacing Data (Atomic Update):
- Navigation: Open Library -> Find Table -> Click Edit (Pencil icon) -> Click Manage Data & Keys (Zap icon).
- Atomic Safety: Replacing a table is an "all-or-nothing" operation. If the upload is interrupted, your original data remains untouched.
- Schema Intelligence: The system automatically compares the new file against the old one. It displays a Schema Comparison summary (Added/Removed columns) and warns you if you have removed columns that are critical for your existing mappings.
Key Editor (Computed Keys)
- Concept: Computed Keys allow you to create new, virtual columns (e.g., combining First Name + Last Name -> Full Name) without manually editing your source Excel/CSV files. This process uses the same Logic Builder you're familiar with to generate these computed keys.
- Navigation:
- In the Add/Replace Modal, after uploading a file, click the Create Keys (or Edit Keys) button.
- This opens the Key Manager, where you can add new keys or edit existing ones using the Logic Builder.
- Materialization:
- Once defined, click Generate Keys (Zap icon).
- The system runs a transformation to "materialize" these keys—saving them as actual, searchable columns in the lookup table.
- Automatic Regeneration (Smart Refresh):
- When you replace a table's data with a new file (e.g., next month's price list), you do not need to rebuild your keys.
- The system detects your existing Computed Key logic.
- Upon uploading the new file, it automatically runs the key generation process against the new data, ensuring your Full Name or Price Code keys are instantly available and up-to-date.
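Materialization amounts to running the key logic over every row of the lookup table and saving it as a real, searchable column, roughly like this pandas sketch (column names are illustrative):

import pandas as pd

def materialize_full_name_key(lookup_df: pd.DataFrame) -> pd.DataFrame:
    # Run the computed-key logic for each row and persist it as a searchable column.
    materialized = lookup_df.copy()
    materialized['Full Name'] = (
        materialized['First Name'].astype(str).str.strip()
        + ' '
        + materialized['Last Name'].astype(str).str.strip()
    )
    return materialized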
Conflict Resolution
- Smart Deduplication:
- The system calculates a unique "fingerprint" (hash) for every table based on its content, not just its filename.
- It can detect if you upload a file that is identical to one already in your library, even if the rows are sorted differently or the filename has changed.
- Import Handling:
- When you load a mapping file (.dfm) from an export or shared by a colleague, it may contain references to tables you don't have.
- Match Found: If you already have the exact same table (same content fingerprint), the system automatically links to your local copy.
- New Table: If the table is missing, you are prompted with a confirmation to import it into your library from the mapping file.
- Name Conflict: If a table has the same name but different data, you will be prompted to resolve the conflict (see the Reference Checks section for more information):
- Override: Replace your local version with the one from the mapping file.
- Keep Existing: Ignore the import and use your version (useful if you have a newer version locally).
- Rename Incoming: Rename the incoming table and save it as something else.
Fields Panel
The Fields Panel (first tab in the sidebar) is context-aware, displaying the relevant fields available for mapping or logic construction based on the current mode.
- Context Switching:
- Source Fields (Default): Displays fields from your uploaded source file (CSV/Excel/JSON). This is the standard view when mapping fields or writing Pre-Filter logic.
- Destination Fields: Automatically switches to show Target fields when you are writing Validation Logic or Post-Filter Logic. This allows you to validate or filter based on the final transformed values rather than the raw input.
- Lookup Table Fields: When using the Computed Key builder for a lookup table, this panel displays columns from that specific lookup table, ensuring you build keys using the correct data.
- Interactions:
- Drag-and-Drop: Fields can be dragged from the sidebar directly into the Logic Builder inputs or onto the Mapping Editor canvas.
- Search: A real-time search bar filters the list, useful for datasets with hundreds of columns.
- Preview: An "Eye" icon next to the header allows you to preview the raw data for the currently selected context (e.g., viewing the first 500 rows of the source file).
Variables & Functions Panels
These panels appear dynamically in the sidebar only when the Logic Builder is open. They provide quick access to local resources created within the current session.
- Variables Panel:
- Scope: Displays variables defined in the current Logic Builder session. These are local variables specific to the field transformation logic being edited; they do not persist globally across different fields.
- Drag-and-Drop: Variables can be dragged from this panel directly into function parameters or logic block inputs.
- Functions Panel:
- Scope: Displays functions defined in the current Logic Builder session. Like variables, these are local to the specific transformation logic being edited.
- Drag-and-Drop: Functions can be dragged from this panel into other inputs to chain operations (e.g., using the result of getCleanString() as an input for getUpperCase()).
User Interface Workflows
The application follows a "Hub-and-Spoke" navigation model, with the Transformation Engine serving as the central hub.
Transformation Form ↔ Mapping Editor
- Entry: Users enter the Mapping Editor by clicking "Edit Mapping" (if a mapping file is loaded) or "Create Mapping" (to start fresh) on the Transformation Form.
- State Preservation: When transitioning to the editor, the current state (Source File, Delimiter Settings, Header Row) is passed forward. This ensures the editor always works with the live context of your source data. The Mapping Editor's own state, however, is not preserved: navigating away without saving will result in the loss of unsaved work or changes.
- Return: Clicking "Back to Transform" in the top-left corner of the Mapping Editor returns you to the hub.
- Unsaved Changes: If you have unsaved changes in the editor, a confirmation modal ("Unsaved Changes") appears to prevent accidental data loss.
Mapping Editor ↔ Logic Builder
The Logic Builder operates as a transient overlay on top of the Mapping Editor, keeping you in context.
- Entry: Click the Add Custom Logic/Code icon or Add/Edit Validation on any field mapping card. For filtering, click the respective Filter Buttons (Pre/Post) found underneath the field mapping scroll area. The only exception is Key Logic Builder used for computed keys in Lookup tables. This is launched from the Lookup table library as outlined in the Computed Key Mode section.
- Context Awareness: The overlay opens specifically for that field. The background (Mapping Editor) is dimmed but visible, maintaining context.
- Exit:
- Save Logic: Commits the code to the specific field and closes the overlay.
- Close (X): Discards changes and returns to the Mapping Editor. A warning appears if changes were made but not saved.
Mapping Editor ↔ Lookup Table Workspace
Lookup Table management is integrated directly into the Mapping Editor's sidebar to allow "flow-of-thought" updates without leaving your work.
- Workspace Sidebar: The "Lookups" tab in the sidebar shows tables currently used in your mapping.
- Library Modal: Clicking "Library" opens a modal stack to browse your global collection.
- Add/Edit Modal: Adding a new table or editing an existing one opens a secondary modal on top of the Library modal.
- Atomic Updates: Changes made to lookup tables (like adding a key) are immediately reflected in the Mapping Editor's "Lookup Table Fields" panel once the modal closes.
Account Settings
Account settings are accessed by clicking the icon in the top right corner of the header.
Usage & Billing
AI Credits
The AI-Powered features (Logic Assist, Suggestions, Map All, Async AI) are powered by a consumption-based credit system.
- Monthly Pool: Each subscription plan comes with a monthly allowance of AI Credits (e.g., 2,000 for Professional, 12,500 for Team). This pool resets automatically on the 1st of every month.
- Usage Tracking: Users can monitor their credit usage in real-time via the Usage & Billing tab. The system provides a visual progress bar showing the percentage of the monthly limit used.
- Purchased Credits: If you run out of monthly credits, you can purchase one-time "Credit Packs" (e.g., 1,000 credits) that do not expire. The system intelligently deducts from your monthly pool first, and only dips into your purchased balance once the monthly limit is exhausted.
Subscription Plans
- Professional Plan:
- Designed for power users and solo consultants.
- Seats: Includes 1 User License.
- AI Credits: 2,000 credits/month.
- Support: Standard Email Support.
- Team Plan:
- Designed for growing organizations requiring collaboration.
- Seats: Includes 5 User Licenses (expandable). Additional seats can be added for a monthly fee.
- AI Credits: 12,500 credits/month.
- Collaboration: Includes Shared Template Library.
- Support: Priority Support.
- Billing Only Users:
- Team Owners can designate members as "Billing Only." These users do not consume a paid seat license.
- Capabilities: They can access Account Settings, manage subscriptions, view invoices, and manage team members.
- Restrictions: They cannot access the Transformation Engine, Mapping Editor, or execute jobs.
Team Management
Roles & Permissions
- Owner:
- Full control over the team, billing, and subscription.
- Can assign/revoke Admin roles.
- Can designate "Billing Only" status.
- Cannot be removed from the team.
- Admin:
- Can invite new members and remove existing members.
- Can view pending invitations.
- Cannot manage billing or change the team name.
- Member:
- Standard access to all features (Transform, Map, Logic).
- Can view team members but cannot invite/remove.
Invitations
- Invite Flow: Admins/Owners send invitations via email from the Team Management tab.
- New Users: If the email is not associated with an existing account, the user receives an email to sign up. Upon registration, they are automatically added to the team.
- Existing Users: If the user already has an account, they receive a notification and can accept the invite to join the team. They will be prompted to visit the account settings panel to accept the invite.
- Zombie Seat Prevention: To ensure data security and clean license management, if a user accepts an invitation to a new team, they are automatically removed from their previous team. This prevents "zombie" accounts lingering in old workspaces.
Security & Sessions
- Single Session Enforcement: To prevent account sharing and ensure security, a user can only have one active session at a time. Signing in on a new device or browser automatically invalidates the previous session, logging the user out of the old device.
- Session Management: Users can view a list of their active sessions (including device type, IP address, and last active time) in the Sessions tab and manually terminate suspicious sessions.
Function Definitions (Reference)
DataFlowMapper includes a comprehensive library of built-in functions. All functions can be dynamically constructed using the Logic Builder.
# Example function structure with generated Python code showing the user-defined function name and a body that follows the function definition's pythonGen patterns.

# UPPER - Converts text to uppercase
def myUpperFunction(row):
    return str(row['field1']).upper()

# CONCAT - Joins text values together
def myConcatFunction(row):
    return ''.join([str(x) for x in [row['first_name'], row['last_name']]])
String Functions
Manipulate text data with functions like UPPER, LOWER, CONCAT, SUBSTRING, REPLACE, and more.
CONCAT
Combines multiple text values into one string.
Parameters:
- Text (required) - Values to join
Returns:
string
# CONCAT - Combines multiple text values into one string.
''.join([str(x) for x in [row['first_name'], " ", row['last_name']]])
Math Functions
Perform calculations with functions like SUM, MULTIPLY, DIVIDE, ROUND, MAX, MIN, and AVERAGE.
ABS
Returns the absolute (positive) value of a number.
Parameters:
- Value (required) - Number to convert
Returns:
number
# ABS - Returns the absolute (positive) value of a number.
(abs(Decimal(row['amount'])) if row['amount'] is not None else None)
Date Functions
Work with dates using functions like FORMAT_DATE, DATE_NOW, TODAY, and ADD_TO_DATE.
ADD_TO_DATE
Add a specified amount of time to a date. Returns a datetime object.
Parameters:
- Date (required) - Starting date
- Amount (required) - Amount to add
- Unit (required) - Time unit
Returns:
string
# ADD_TO_DATE - Add a specified amount of time to a date. Returns a datetime object.
pd.to_datetime(row['start_date']) + pd.Timedelta(weeks=float(row['amount']))
Conversion Functions
Convert between data types with functions like TO_TEXT, TO_DATE, TO_BOOLEAN, TO_INT, and TO_FLOAT.
CLEAN_NUMBER
Removes non-numeric characters and converts to number.
Parameters:
- Text (required) - Text to clean and convert
Returns:
number
# CLEAN_NUMBER - Removes non-numeric characters and converts to number.
(lambda x: Decimal(x) if x else None)(re.sub(r'[^0-9.-]', '', str(row['text'])))
Utility Functions
Additional utility functions like LOOKUP, VALUECOUNT, and IS_EMPTY help with common data transformation tasks and validations.
ADD_TO_LIST
Appends an item to the end of a list.
Parameters:
- List (required) - Target list
- Item (required) - Value to add to the list
Returns:
any
# ADD_TO_LIST - Appends an item to the end of a list.
my_list.append(row['my_item'])
APPLY
Applies the transformation logic defined in another destination field. This will return the transformed value of the referenced destination field for the given row.
Parameters:
- Field (required) - Field to apply transformation from
Returns:
any