Restored the correct version mismatch warnings between dagster core and dagster integration libraries.
Field.__init__ has been typed, which resolves an error that pylance would raise about default_value
Previously, dagster_type_materializer and dagster_type_loader expected functions to take a context argument from an internal dagster import. We’ve added DagsterTypeMaterializerContext and DagsterTypeLoaderContext so that functions annotated with these decorators can annotate their arguments properly.
Previously, a single-output op with a return description would not pick up the description of the return. This has been rectified.
A docs site overhaul! Along with tons of additional content, the existing pages have been significantly edited and reorganized to improve readability.
All Dagster examples are revamped with a consistent project layout, descriptive names, and more helpful README files.
A new dagster project CLI contains commands for bootstrapping new Dagster projects and repositories:
dagster project scaffold creates a folder structure with a single Dagster repository and other files such as workspace.yaml. This CLI enables you to quickly start building a new Dagster project with everything set up.
dagster project from-example downloads one of the Dagster examples. This CLI helps you to quickly bootstrap your project with an officially maintained example. You can find the available examples via dagster project list-examples.
A default_executor_def argument has been added to the @repository decorator. If specified, this will be used for any jobs (asset or op) which do not explicitly set an executor_def.
A default_logger_defs argument has been added to the @repository decorator, which works in the same way as default_executor_def.
A new execute_job function presents a Python API for kicking off runs of your jobs.
Run status sensors may now yield RunRequests, allowing you to kick off a job in response to the status of another job.
When loading an upstream asset or op output as an input, you can now set custom loading behavior using the input_manager_key argument to AssetIn and In.
In the UI, the global lineage graph has been brought back and reworked! The graph keeps assets in the same group visually clustered together, and the query bar allows you to visualize a custom slice of your asset graph.
In 1.0.0, a large number of previously-deprecated APIs have been fully removed. A full list of breaking changes and deprecations, alongside instructions on how to migrate older code, can be found in MIGRATION.md. At a high level:
The solid and pipeline APIs have been removed, along with references to them in extension libraries, arguments, and the CLI (deprecated in 0.13.0).
The AssetGroup and build_asset_job APIs, and a host of deprecated arguments to asset-related functions, have been removed (deprecated in 0.15.0).
The EventMetadata and EventMetadataEntryData APIs have been removed (deprecated in 0.15.0).
dagster_type_materializer and DagsterTypeMaterializer have been marked experimental and will likely be removed within a 1.x release. Instead, use an IOManager.
FileManager and FileHandle have been marked experimental and will likely be removed within a 1.x release.
As of 1.0.0, Dagster no longer guarantees support for Python 3.6. This is in line with PEP 494, which outlines that 3.6 has reached end of life.
[planned] In an upcoming 1.x release, we plan to make a change that renders values supplied to configured in Dagit. Up through this point, values provided to configured have not been sent anywhere outside the process where they were used. This change will mean that, like other places you can supply configuration, configured is not a good place to put secrets: You should not include any values in configuration that you don't want to be stored in the Dagster database and displayed inside Dagit.
fs_io_manager, s3_pickle_io_manager, gcs_pickle_io_manager, and adls_pickle_io_manager no longer write out a file or object when handling an output with the None or Nothing type.
The custom_path_fs_io_manager has been removed, as its functionality is entirely subsumed by the fs_io_manager, where a custom path can be specified via config.
The default typing_type of a DagsterType is now typing.Any instead of None.
Dagster’s integration libraries haven’t yet achieved the same API maturity as Dagster core. For this reason, all integration libraries will remain on a pre-1.0 (0.16.x) versioning track for the time being. However, 0.16.x library releases remain fully compatible with Dagster 1.x. In the coming months, we will graduate integration libraries one-by-one to the 1.x versioning track as they achieve API maturity. If you have installs of the form:
[dagster-databricks] When using the databricks_pyspark_step_launcher the events sent back to the host process are now compressed before sending, resulting in significantly better performance for steps which produce a large number of events.
[dagster-dbt] If an error occurs in load_assets_from_dbt_project while loading your repository, the error message in Dagit will now display additional context from the dbt logs, instead of just DagsterDbtCliFatalRuntimeError.
Fixed a bug that caused Dagster to ignore the group_name argument to AssetsDefinition.from_graph when a key_prefix argument was also present.
Fixed a bug which could cause GraphQL errors in Dagit when loading repositories that contained multiple assets created from the same graph.
Ops and software-defined assets with the None return type annotation are now given the Nothing type instead of the Any type.
Fixed a bug that caused AssetsDefinition.from_graph and from_op to fail when invoked on a configured op.
The materialize function, which is not experimental, no longer emits an experimental warning.
Fixed a bug where runs from different repositories would be intermingled when viewing the runs for a specific repository-scoped job/schedule/sensor.
[dagster-dbt] A regression introduced in 0.15.8 caused dbt logs to show up in JSON format in the UI. This has been fixed.
[dagster-databricks] Previously, if you were using the databricks_pyspark_step_launcher, and the external step failed to start, a RESOURCE_DOES_NOT_EXIST error would be surfaced, without helpful context. Now, in most cases, the root error causing the step to fail will be surfaced instead.
Fixed a bug where default configuration was not applied when assets were selected for materialization in Dagit.
Fixed a bug where RunRequests returned from run_status_sensors caused the sensor to error.
When supplying config to define_asset_job, an error would occur when selecting most asset subsets. This has been fixed.
Fixed an error introduced in 0.15.7 that would prevent viewing the execution plan for a job re-execution from 0.15.0 → 0.15.6.
[dagit] The Dagit server now returns HTTP 500 status codes for GraphQL requests that encountered an unexpected server error.
[dagit] Fixed a bug that made it impossible to kick off materializations of partitioned assets if the day_offset, hour_offset, or minute_offset parameters were set on the asset’s partitions definition.
[dagster-k8s] Fixed a bug where setting dagster-k8s/config to override the Kubernetes command used to run a Dagster job didn’t actually override the command.
[dagster-datahub] Pinned version of acryl-datahub to avoid build error.
DagsterRun now has a job_name property, which should be used instead of pipeline_name.
TimeWindowPartitionsDefinition now has a get_partition_keys_in_range method which returns a sequence of all the partition keys between two partition keys.
OpExecutionContext now has asset_partitions_def_for_output and asset_partitions_def_for_input methods.
Dagster now errors immediately with an informative message when two AssetsDefinition objects with the same key are provided to the same repository.
build_output_context now accepts a partition_key argument that can be used when testing the handle_output method of an IO manager.
Fixed a bug that made it impossible to load inputs using a DagsterTypeLoader if the InputDefinition had an asset_key set.
Ops created with the @asset and @multi_asset decorators no longer have a top-level “assets” entry in their config schema. This entry was unused.
In 0.15.6, a bug was introduced that made it impossible to load repositories if assets that had non-standard metadata attached to them were present. This has been fixed.
[dagster-dbt] In some cases, using load_assets_from_dbt_manifest with a select parameter that included sources would result in an error. This has been fixed.
[dagit] Fixed an error where a race condition of a sensor/schedule page load and the sensor/schedule removal caused a GraphQL exception to be raised.
[dagit] The “Materialize” button no longer changes to “Rematerialize” in some scenarios.
[dagit] The live overlays on asset views, showing latest materialization and run info, now load faster.
[dagit] Typing whitespace into the launchpad YAML editor no longer causes execution to fail to start.
[dagit] The explorer sidebar no longer displays “mode” label and description for jobs, since modes are deprecated.
The non-asset version of the Hacker News example, which lived inside examples/hacker_news/, has been removed, because it hadn’t received updates in a long time and had drifted from best practices. The asset version is still there and has an updated README.