Schemas and constants used by the sync code. | (ns metabase.sync.interface (:require [malli.util :as mut] [metabase.lib.schema.common :as lib.schema.common] [metabase.util.malli.registry :as mr] [metabase.util.malli.schema :as ms])) |
(mr/def ::DatabaseMetadataTable
  [:map {:closed true}
   [:name ::lib.schema.common/non-blank-string]
   [:schema [:maybe ::lib.schema.common/non-blank-string]]
   ;; for databases that store an estimated row count in system tables (e.g. Postgres)
   [:estimated_row_count {:optional true} [:maybe :int]]
   ;; for databases that support forcing queries to include a filter (e.g. a partitioned table on BigQuery)
   [:database_require_filter {:optional true} [:maybe :boolean]]
   ;; `:description` in this case should be a column/remark on the Table, if there is one.
   [:description {:optional true} [:maybe :string]]])
Schema for the expected output of | (def DatabaseMetadataTable [:ref ::DatabaseMetadataTable]) |
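As a rough illustration of what the closed map schema above accepts and rejects, here is a minimal sketch using plain Malli. The names `non-blank-string` and `database-metadata-table` are hypothetical stand-ins defined locally; the real `::lib.schema.common/non-blank-string` lives in `metabase.lib.schema.common` and may differ in detail.

```clojure
(require '[clojure.string :as str]
         '[malli.core :as m])

;; Hypothetical stand-in for ::lib.schema.common/non-blank-string.
(def non-blank-string
  [:fn (fn [s] (and (string? s) (not (str/blank? s))))])

;; Local copy of the shape of ::DatabaseMetadataTable, for illustration only.
(def database-metadata-table
  [:map {:closed true}
   [:name non-blank-string]
   [:schema [:maybe non-blank-string]]
   [:estimated_row_count {:optional true} [:maybe :int]]
   [:database_require_filter {:optional true} [:maybe :boolean]]
   [:description {:optional true} [:maybe :string]]])

;; :schema may be nil, but the key itself is required.
(def ok?          (m/validate database-metadata-table {:name "orders" :schema nil}))
(def missing-key? (m/validate database-metadata-table {:name "orders"}))
;; {:closed true} rejects keys that the schema does not list.
(def extra-key?   (m/validate database-metadata-table
                              {:name "orders" :schema "public" :owner "admin"}))
```

Under these assumptions `ok?` is true, while `missing-key?` and `extra-key?` are false: `[:maybe ...]` makes the value nullable without making the key optional.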
(mr/def ::DatabaseMetadata [:map [:tables [:set DatabaseMetadataTable]] [:version {:optional true} [:maybe ::lib.schema.common/non-blank-string]]]) | |
Schema for the expected output of | (def DatabaseMetadata [:ref ::DatabaseMetadata]) |
(mr/def ::TableMetadataField
  [:map
   [:name ::lib.schema.common/non-blank-string]
   [:database-type [:maybe ::lib.schema.common/non-blank-string]] ; blank if the Field is all NULL & untyped, i.e. in Mongo
   [:base-type ::lib.schema.common/base-type]
   [:database-position ::lib.schema.common/int-greater-than-or-equal-to-zero]
   [:position {:optional true} ::lib.schema.common/int-greater-than-or-equal-to-zero]
   [:semantic-type {:optional true} [:maybe ::lib.schema.common/semantic-or-relation-type]]
   [:effective-type {:optional true} [:maybe ::lib.schema.common/base-type]]
   [:coercion-strategy {:optional true} [:maybe ms/CoercionStrategy]]
   [:field-comment {:optional true} [:maybe ::lib.schema.common/non-blank-string]]
   [:pk? {:optional true} :boolean] ; optional for databases that don't support PKs
   [:nested-fields {:optional true} [:set [:ref ::TableMetadataField]]]
   [:json-unfolding {:optional true} :boolean]
   [:nfc-path {:optional true} [:any]]
   [:custom {:optional true} :map]
   [:database-is-auto-increment {:optional true} :boolean]
   ;; nullable for databases that don't support field partition
   [:database-partitioned {:optional true} [:maybe :boolean]]
   [:database-required {:optional true} :boolean]])
Schema for a given Field as provided in [[metabase.driver/describe-table]]. | (def TableMetadataField [:ref ::TableMetadataField]) |
(mr/def ::TableIndexMetadata [:set [:and [:map [:type [:enum :normal-column-index :nested-column-index]]] [:multi {:dispatch :type} [:normal-column-index [:map [:value ::lib.schema.common/non-blank-string]]] [:nested-column-index [:map [:value [:sequential ::lib.schema.common/non-blank-string]]]]]]]) | |
Schema for a given Table as provided in [[metabase.driver/describe-table-indexes]]. | (def TableIndexMetadata [:ref ::TableIndexMetadata]) |
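The `:multi` schema above dispatches on `:type`, so the shape of `:value` depends on the kind of index. A minimal sketch of that behavior with plain Malli follows; `non-blank-string` and `table-index-metadata` are hypothetical local stand-ins, not the registered Metabase schemas.

```clojure
(require '[clojure.string :as str]
         '[malli.core :as m])

;; Hypothetical stand-in for ::lib.schema.common/non-blank-string.
(def non-blank-string
  [:fn (fn [s] (and (string? s) (not (str/blank? s))))])

;; Local copy of the shape of ::TableIndexMetadata, for illustration only.
(def table-index-metadata
  [:set
   [:and
    [:map [:type [:enum :normal-column-index :nested-column-index]]]
    [:multi {:dispatch :type}
     [:normal-column-index [:map [:value non-blank-string]]]
     [:nested-column-index [:map [:value [:sequential non-blank-string]]]]]]])

;; A normal index names one column; a nested index carries a path of names.
(def valid-indexes
  #{{:type :normal-column-index :value "id"}
    {:type :nested-column-index :value ["address" "zip"]}})

;; A nested index whose :value is a bare string does not match its branch.
(def invalid-indexes
  #{{:type :nested-column-index :value "zip"}})

(def valid?   (m/validate table-index-metadata valid-indexes))
(def invalid? (m/validate table-index-metadata invalid-indexes))
```

Here `valid?` is true and `invalid?` is false, since `[:sequential ...]` does not accept a string.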
(mr/def ::FieldIndexMetadata [:map [:table-schema [:maybe ::lib.schema.common/non-blank-string]] [:table-name ::lib.schema.common/non-blank-string] [:field-name ::lib.schema.common/non-blank-string]]) | |
Schema for a given result provided by [[metabase.driver/describe-indexes]]. | (def FieldIndexMetadata [:ref ::FieldIndexMetadata]) |
(mr/def ::FieldMetadataEntry (-> (mr/schema ::TableMetadataField) (mut/assoc :table-schema [:maybe ::lib.schema.common/non-blank-string]) (mut/assoc :table-name ::lib.schema.common/non-blank-string))) | |
Schema for an item in the expected output of [[metabase.driver/describe-fields]]. | (def FieldMetadataEntry [:ref ::FieldMetadataEntry]) |
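`::FieldMetadataEntry` is built by taking the Field schema and adding two table-level keys with `malli.util/assoc`. The sketch below shows the same technique on a deliberately tiny map schema; `field-schema` and `field-entry-schema` are hypothetical names, and the string and integer constraints are simplified stand-ins for the `lib.schema.common` schemas.

```clojure
(require '[malli.core :as m]
         '[malli.util :as mut])

;; A cut-down stand-in for ::TableMetadataField.
(def field-schema
  [:map
   [:name [:string {:min 1}]]
   [:database-position [:int {:min 0}]]])

;; Add the two table-identifying keys, as ::FieldMetadataEntry does.
(def field-entry-schema
  (-> field-schema
      (mut/assoc :table-schema [:maybe [:string {:min 1}]])
      (mut/assoc :table-name [:string {:min 1}])))

(def entry-ok?
  (m/validate field-entry-schema
              {:name "id" :database-position 0 :table-schema nil :table-name "orders"}))

;; The added keys are required, so a bare Field map no longer validates.
(def bare-field-ok?
  (m/validate field-entry-schema {:name "id" :database-position 0}))
```

With these definitions `entry-ok?` is true and `bare-field-ok?` is false; the original `field-schema` is unchanged and still accepts the bare Field map.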
Schema for the expected output of [[metabase.driver.sql-jdbc.sync/describe-nested-field-columns]]. Not actually used; left here for now because it serves as documentation. | (comment (def NestedFCMetadata [:maybe [:set TableMetadataField]])) |
(mr/def ::TableFKMetadataEntry [:map [:fk-column-name ::lib.schema.common/non-blank-string] [:dest-table [:map [:name ::lib.schema.common/non-blank-string] [:schema [:maybe ::lib.schema.common/non-blank-string]]]] [:dest-column-name ::lib.schema.common/non-blank-string]]) | |
Schema for an individual entry in | (def TableFKMetadataEntry [:ref ::TableFKMetadataEntry]) |
(mr/def ::FKMetadataEntry [:map [:fk-table-name ::lib.schema.common/non-blank-string] [:fk-table-schema [:maybe ::lib.schema.common/non-blank-string]] [:fk-column-name ::lib.schema.common/non-blank-string] [:pk-table-name ::lib.schema.common/non-blank-string] [:pk-table-schema [:maybe ::lib.schema.common/non-blank-string]] [:pk-column-name ::lib.schema.common/non-blank-string]]) | |
Schema for an entry in the expected output of [[metabase.driver/describe-fks]]. | (def FKMetadataEntry [:ref ::FKMetadataEntry]) |
These schemas are provided purely as conveniences since adding | |
(mr/def ::no-kebab-case-keys (ms/MapWithNoKebabKeys)) | |
(mr/def ::DatabaseInstance [:and (ms/InstanceOf :model/Database) ::no-kebab-case-keys]) | |
Schema for a valid instance of a Metabase Database. | (def DatabaseInstance [:ref ::DatabaseInstance]) |
(mr/def ::TableInstance [:and (ms/InstanceOf :model/Table) ::no-kebab-case-keys]) | |
Schema for a valid instance of a Metabase Table. | (def TableInstance [:ref ::TableInstance]) |
(mr/def ::FieldInstance [:and (ms/InstanceOf :model/Field) ::no-kebab-case-keys]) | |
Schema for a valid instance of a Metabase Field. | (def FieldInstance [:ref ::FieldInstance]) |
+----------------------------------------------------------------------------------------------------------------+
|                                             FINGERPRINT VERSIONING                                             |
+----------------------------------------------------------------------------------------------------------------+
Occasionally we want to update the schema of our Field fingerprints and add new logic to populate the additional keys. However, by default, analysis (which includes fingerprinting) only runs on NEW Fields, meaning EXISTING Fields won't get new fingerprints with the updated info. To work around this, we can use a versioning system. Fields whose Fingerprint's version is lower than the current
version should get updated during the next sync/analysis, whether or not they are new Fields.
However, this could be quite inefficient: if we add a new fingerprint field for

Thus, our implementation below. Each new fingerprint version lists a set of types that should be upgraded to it. Our fingerprinting logic will calculate whether a fingerprint needs to be recalculated based on its version and the changes that have been made in subsequent versions. Only the Fields that would benefit from the new Fingerprint info need be re-fingerprinted.

Thus, if Fingerprint v2 contains some new info for numeric Fields, only Fields that derive from
Map of fingerprint version to the set of Field base types that need to be upgraded to this version the next time we do analysis. The highest-numbered entry is considered the latest version of fingerprints. | (def ^:dynamic *fingerprint-version->types-that-should-be-re-fingerprinted* {1 #{:type/*} 2 #{:type/Number} 3 #{:type/DateTime} 4 #{:type/*} 5 #{:type/Text}}) |
The newest (highest-numbered) version of our Field fingerprints. | (def ^:dynamic ^Long *latest-fingerprint-version* (apply max (keys *fingerprint-version->types-that-should-be-re-fingerprinted*))) |
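To make the versioning rule concrete, here is a minimal sketch of the check described above, written with plain `clojure.core`. It is not the implementation used by the sync code: `needs-re-fingerprint?`, `version->types`, and `type-hierarchy` are hypothetical names, and the hierarchy is a toy that only approximates Metabase's real type hierarchy.

```clojure
;; Same shape as *fingerprint-version->types-that-should-be-re-fingerprinted*.
(def version->types
  {1 #{:type/*}
   2 #{:type/Number}
   3 #{:type/DateTime}
   4 #{:type/*}
   5 #{:type/Text}})

(def latest-version (apply max (keys version->types)))

;; Toy hierarchy (an assumption); the real one is defined elsewhere in Metabase.
(def type-hierarchy
  (-> (make-hierarchy)
      (derive :type/Number :type/*)
      (derive :type/Integer :type/Number)
      (derive :type/DateTime :type/*)
      (derive :type/Text :type/*)))

(defn needs-re-fingerprint?
  "True if some version newer than `fingerprint-version` targets a type that
  `base-type` is, or derives from."
  [base-type fingerprint-version]
  (boolean
   (some (fn [[version types]]
           (and (> version fingerprint-version)
                (some #(isa? type-hierarchy base-type %) types)))
         version->types)))

(needs-re-fingerprint? :type/Integer 4) ; => false, v5 only targets :type/Text
(needs-re-fingerprint? :type/Text 4)    ; => true
(needs-re-fingerprint? :type/Integer 1) ; => true, v2 targets :type/Number
(needs-re-fingerprint? :type/Text 5)    ; => false, already at the latest version
```

The point of the per-version type sets is visible in the first call: an integer Field already at v4 is skipped, because the only newer version affects text Fields.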