event_time
models:
  resource-path:
    +event_time: my_time_field

models:
  - name: model_name
    config:
      event_time: my_time_field

{{ config(
    event_time='my_time_field'
) }}

seeds:
  resource-path:
    +event_time: my_time_field

seeds:
  - name: seed_name
    config:
      event_time: my_time_field

snapshots:
  resource-path:
    +event_time: my_time_field

sources:
  resource-path:
    +event_time: my_time_field

sources:
  - name: source_name
    config:
      event_time: my_time_field
Definition
You can configure event_time for a model, seed, snapshot, or source in your dbt_project.yml file, property YAML file, or config block.
event_time is required for the incremental microbatch strategy and highly recommended for Advanced CI's compare changes in CI/CD workflows, where it ensures that the same time slice of data is compared between your CI and production environments.
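As a minimal sketch of the microbatch requirement, the properties YAML below pairs event_time with the other settings the microbatch strategy expects. The model name user_sessions and the column session_start_time are borrowed from the examples on this page; the begin date and batch_size values are illustrative assumptions, not recommendations.

```yaml
models:
  - name: user_sessions            # model name reused from this page's examples
    config:
      materialized: incremental
      incremental_strategy: microbatch
      event_time: session_start_time   # column holding when each row's event occurred
      begin: "2024-01-01"              # assumed start date for the earliest batch
      batch_size: day                  # assumed granularity; one batch per day of event_time
```

With a setup like this, each batch covers one batch_size-wide window of event_time values, which is why the column needs to reflect when the event actually happened.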
Best practices
Set the event_time to the name of the field that represents the actual timestamp of the event (like account_created_at). The timestamp of the event should represent "at what time did the row occur" rather than an event ingestion date. Marking a column as the event_time when it isn't the true event timestamp diverges from the column's semantic meaning, which may confuse users when other tools make use of the metadata.
However, if ingestion dates (like loaded_at, ingested_at, or last_updated_at) are the only timestamps you have, you can set event_time to one of these fields. Here are some considerations to keep in mind if you do this:
- Using last_updated_at or loaded_at: May result in duplicate entries in the resulting table in the data warehouse over multiple runs. Setting an appropriate lookback value can reduce duplicates, but it can't fully eliminate them because updates that fall outside the lookback window won't be processed.
- Using ingested_at: Since this column is created by your ingestion/EL tool instead of coming from the original source, it will change if and when you need to resync your connector. This means that data will be reprocessed and loaded into your warehouse a second time against a second date. As long as this never happens (or you run a full refresh when it does), microbatches will be processed correctly when using ingested_at.
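The lookback consideration above could be configured along these lines. This is a sketch only: it assumes a microbatch model named user_sessions whose only available timestamp is last_updated_at, and the lookback value of 3 is arbitrary.

```yaml
models:
  - name: user_sessions            # hypothetical model name for illustration
    config:
      materialized: incremental
      incremental_strategy: microbatch
      event_time: last_updated_at  # update timestamp used because no true event timestamp exists
      begin: "2024-01-01"          # assumed start date
      batch_size: day              # assumed granularity
      lookback: 3                  # reprocess the 3 batches before the latest one on each run
```

A wider lookback catches more late updates at the cost of reprocessing more batches per run; rows updated further back than the window still go unprocessed until a full refresh.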
Here are some examples of recommended and not recommended event_time columns:
Status | Column name | Description |
---|---|---|
Recommended | account_created_at | Represents the specific time when an account was created, making it a fixed event in time. |
Recommended | session_began_at | Captures the exact timestamp when a user session started, which won't change and directly ties to the event. |
Not recommended | _fivetran_synced | Represents the time the event was ingested, not when it happened. |
Not recommended | last_updated_at | Changes over time and isn't tied to the event itself. If used, note the considerations mentioned earlier in best practices. |
Examples
Here's an example for a model in the dbt_project.yml file:
models:
  my_project:
    user_sessions:
      +event_time: session_start_time
Example in a properties YAML file:
models:
  - name: user_sessions
    config:
      event_time: session_start_time
Example in a SQL model config block:
{{ config(
    event_time='session_start_time'
) }}
This setup sets session_start_time as the event_time for the user_sessions model.
Here's an example for a seed in the dbt_project.yml file:
seeds:
  my_project:
    my_seed:
      +event_time: record_timestamp
Example in a seed properties YAML:
seeds:
  - name: my_seed
    config:
      event_time: record_timestamp
This setup sets record_timestamp as the event_time for my_seed.
Here's an example for a snapshot in the dbt_project.yml file:
snapshots:
  my_project:
    my_snapshot:
      +event_time: record_timestamp
Example in a snapshot properties YAML:
snapshots:
  - name: my_snapshot
    config:
      event_time: record_timestamp
This setup sets record_timestamp as the event_time for my_snapshot.
Here's an example in a source properties YAML file:
sources:
  - name: source_name
    tables:
      - name: table_name
        config:
          event_time: event_timestamp
This setup sets event_timestamp as the event_time for the specified source table.