Data Flow
Last Updated: 2021-06-14The Grouparoo Application's main job is to collect data from Sources
, store that data to build a robust Records
, build Groups
of Records
, and finally syndicate this data with Destination
s, your other marketing tools.
This document is place to centrally describe how all of this works!
Core Data Models
Records and Record Properties
The main object in Grouparoo is a Record
. A Record by itself is simply an id
and the time it was created and last updated. It is a "join table" to collect many Record Properties
, which act as a key-value store for the data about Records
. For example, a common Record Property
would be key='email', value='evan@grouparoo.com', type='string'
. We also store when each Record Property
was created and updated.
Groups
Groups
are a collection of Records
.
There are 2 types of Groups
: "Manual" and "Calculated".
- Manual
Groups
contain records that a Grouparoo user has added to the Group. The membership of ManualGroup
s does not change. - Calculated
Groups
are constantly recalculated againstRecord Properties
. These dynamic lists ofRecords
are how you build up cohorts that change over time. You may have a Group ofAbandoned Carts Last Week
that checks aRecord Property
of "last_abandoned_card_date" and makes sure it was within the last 7 days. You might also choose to segment your customers by the language they speak with a "Locale"Record Property
.
Records can be in many groups at once.
Apps
Apps are how you define connections to your tools like your databases, email vendors, CRMs, etc. With each App
you tell Grouparoo how to connect so we can import and export your data.
Properties
Properties
describe each Record Property
and how they work. It's a bit like a "Schema" for the Records
:
- What
Record Properties
exist? You cannot just add anyRecord Property
to aRecord
. You need to define it first. - Is this
Record Property
unique? If it is, Grouparoo will make sure that no other Record can have the same value for thatRecord Property
. Common examples would beemail
oruser_id
. - What is the type of this
Record Property
? A number, a string? We actually stringify all of our Record Property data for searching, but wen we render it back out again, we convert it. This also helps us know how to build search queries for CalculatedGroups
. - How is this
Record Property
to be retrieved? By which App? We talk more about this below.
Every Record Property
also includes a reference to an App (perhaps "MySQL-Datawarehouse") and a statement/query/option of how to get it for each Record. Grouparoo will always be keeping your data in sync so we always need to know how to retrieve a Record
's Record Properties
.
- For an
App
"MySQL-Datawarehouse" which is linked to theProperty
for "First Name (string)" you might provide a SQL statement likeselect first_name from users where users.id={{ user_id }}
. - You can also have more complex queries, like you could connect the
App
"MySQL-Datawarehouse" to theProperty
for "ltv (number)" which might doselect (sum(purchases.total) - sum(refunds.total)) as ltv from users join purchases on purchases.user_id = users.id join refunds.user_id = users.id where users.id={{ user_id }}
- For a an App
CSV
responsible for "VIP (boolean)" you might choose "column" from a dropdown indicating that any column labeled "VIP" should be allowed to set thisRecord Property
Data into Grouparoo
Sources, Schedules, and Runs
When you add an App
to Grouparoo, you can also configure if you want Grouparoo to periodically check that App for new data. This might be accomplished by checking a database table for new/updated rows or asking an API for recent changes. This is called a Schedule
. You can create many Schedules
from a single App
, against each Source
. An App
can have many Sources
, often correlating to a logical collection in the App
, e.g. Tables in a Database.
Perhaps the App
"MySQL-Datawarehouse" has 4 Schedules
:
- "MySQL-Datawarehouse:users". This Schedule checks the "users" table every 10 minutes for new or updated records
- "MySQL-Datawarehouse:carts". This Schedule checks the "carts" table every hour for only new carts
- "MySQL-Datawarehouse:purchase". This Schedule checks the "purchase" table every hour for only new purchases.
- "MySQL-Datawarehouse:refunds". This Schedule checks the "refunds" table every hour for only new refunds
Each App
works a little differently, but you will be asked what to check for and how often. Each instance of a periodic check is called a Run
. When new data is found, the Run
will produce Import
s.
Imports
Imports are created when a Schedules'
Run
finds new or updated data. Either way, an Import
tells Grouparoo that something has changed that we should take note of.
Most Imports
trigger the update of a Record. An Import
can be as simple as {user_id: 3, firstName: 'Evan'}
. This tells us that we should update the Record
we have for user #3, and if we don't have a Record
for User #3, we should create one.
We know that the Import
's data we get should be saved to the related Record if:
- The
Import
's payload contains a uniqueRecord Property
as defined by theProperties
- The other data key-value pairs are also defined by the
Properties
.
Imports
are stored in Grouparoo for later use. You can choose to delete old Imports
after a period of time.
Record Hydration
Any time an Import
is recorded by Grouparoo, and that Import
can be linked to a Record
, Grouparoo will fully update that Record, checking all the Apps
that are connected to it via Record Properties
and asking for new data. You can see which Record Properties
should be updated by checking if they are in the pending state or not. In this way, you can be sure that your Records
are always up to date, and we can recover quickly from any new or missed data. We call this step "Record Hydration".
This means that if an Import
comes in from the "MySQL-Datawarehouse:purchase" Import Schedules
, Grouparoo will automatically check all Record Properties
, including "ltv", "first_name" and everything else.
Data out of Grouparoo
Destinations
When configuring your Destinations
, you'll be asked which Groups
and Record Properties
you want to send (or all of them). You may want to sync "email" and "first_name" to your email tool for everyone in the Group
"lapsed customers" and "USA Customers" so you can send them a coupon as part of your re-engagement campaign. Oh, and the Group
"lapsed customers" is Dynamic, based on LTV>0, and last_purchase at least 2 months ago, while "USA Customers" is Dynamic where locale=en-us... all of which Grouparoo is keeping up-to-date for you.
To accomplish this you, would create an Destination
for the App
"Mailchimp", toggle on the "lapsed-customers" and "USA Customers" groups, and finally the requested Record Properties
of "email" and "first_name".
When Grouparoo sends data via a Destination
, Exports
are created.
Exports
The Grouparoo App
for Mailchimp will handle using the Mailchimp API to send the Export
for you. Exports keep track of the exact data which was sent to the Destination
. Grouparoo will try to send data in bulk when possible, handle retries, and skip any updates that won't result in any new information being sent to the Destination.
You can choose to delete old Exports
after a period of time.
Plugins and the Grouparoo Marketplace
Plugins are Grouparoo's way of providing more Apps to Grouparoo, allowing you to import and export data from more and more sources and destinations.
Having Problems?
If you are having trouble, visit the list of common issues or open a Github issue to get support.