Schema for validating a normalized MBQL query. This is also the definitive grammar for MBQL, wow!

(ns metabase.legacy-mbql.schema
  (:refer-clojure :exclude [count distinct min max + - / * and or not not-empty = < > <= >= time case concat replace abs])
  (:require
   [clojure.core :as core]
   [clojure.set :as set]
   [malli.core :as mc]
   [malli.error :as me]
   [metabase.legacy-mbql.schema.helpers :as helpers :refer [is-clause?]]
   [metabase.legacy-mbql.schema.macros :refer [defclause one-of]]
   [metabase.lib.schema.actions :as lib.schema.actions]
   [metabase.lib.schema.binning :as lib.schema.binning]
   [metabase.lib.schema.common :as lib.schema.common]
   [metabase.lib.schema.expression.temporal :as lib.schema.expression.temporal]
   [metabase.lib.schema.expression.window :as lib.schema.expression.window]
   [metabase.lib.schema.id :as lib.schema.id]
   [metabase.lib.schema.info :as lib.schema.info]
   [metabase.lib.schema.literal :as lib.schema.literal]
   [metabase.lib.schema.template-tag :as lib.schema.template-tag]
   [metabase.util.i18n :as i18n]
   [metabase.util.malli.registry :as mr]))

A NOTE ABOUT METADATA:

Clauses below are marked with the following tags for documentation purposes:

  • Clauses marked ^:sugar are syntactic sugar primarily intended to make generating queries easier on the frontend. These clauses are automatically rewritten as simpler clauses by the desugar or expand-macros middleware. Thus driver implementations do not need to handle these clauses.

  • Clauses marked ^:internal are automatically generated by wrap-value-literals or other middleware from values passed in. They are not intended to be used by the frontend when generating a query. They add information that simplifies driver implementations. When writing MBQL queries yourself you should pretend these clauses don't exist.

  • Clauses marked ^{:requires-features #{feature+}} require a certain set of features to be used. At some date in the future we will likely add middleware that uses this metadata to automatically validate that a driver has the features needed to run the query in question.

(def ^:private PositiveInt
  [:schema
   {:description "Must be a positive integer."}
   pos-int?])

Set of valid units for bucketing or comparing against a date Field.

:day-of-week depends on the [[metabase.public-settings/start-of-week]] Setting, which defaults to Sunday. 1 = first day of the week (e.g. Sunday); 7 = last day of the week (e.g. Saturday).

(def ^:private date-bucketing-units
  #{:default :day :day-of-week :day-of-month :day-of-year :week :week-of-year
    :month :month-of-year :quarter :quarter-of-year :year})

Set of valid units for bucketing or comparing against a time Field.

(def ^:private time-bucketing-units
  #{:default :millisecond :second :minute :minute-of-hour :hour :hour-of-day})

Set of valid units for bucketing or comparing against a datetime Field.

(def datetime-bucketing-units
  (set/union date-bucketing-units time-bucketing-units))
(mr/def ::DateUnit
  "Valid unit for date bucketing."
  (into [:enum {:error/message "date bucketing unit"}] date-bucketing-units))

It could make sense to say hour-of-day(field) = hour-of-day("2018-10-10T12:00"), but it does not make sense to say month-of-year(field) = month-of-year("08:00:00"), does it? So we restrict the set of units a TimeValue can have to ones that have no notion of day/date.

(mr/def ::TimeUnit
  "Valid unit for time bucketing."
  (into [:enum {:error/message "time bucketing unit"}] time-bucketing-units))
(mr/def ::DateTimeUnit
  "Valid unit for *datetime* bucketing."
  (into [:enum {:error/message "datetime bucketing unit"}] datetime-bucketing-units))
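The relationship between the three unit sets can be checked at a REPL. The following is a minimal standalone sketch: the set names `date-units`, `time-units`, and `datetime-units` are local stand-ins (assumptions for illustration) that copy the keywords from the private sets defined above, so the snippet does not depend on this namespace.

```clojure
(require '[clojure.set :as set])

;; Local copies of the unit sets above; the names are illustrative only.
(def date-units
  #{:default :day :day-of-week :day-of-month :day-of-year :week :week-of-year
    :month :month-of-year :quarter :quarter-of-year :year})

(def time-units
  #{:default :millisecond :second :minute :minute-of-hour :hour :hour-of-day})

;; Datetime units are simply the union of the two.
(def datetime-units (set/union date-units time-units))

;; A time value has an hour of day, but no month of year.
(def hour-ok?          (contains? time-units :hour-of-day))
(def month-ok-on-time? (contains? time-units :month-of-year))
(def month-ok-on-dt?   (contains? datetime-units :month-of-year))
```

Only `:default` is shared between the date and time sets, which is why the union is one element smaller than the sum of the two.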

Valid timezone id.

(def ^:private TimezoneId
  [:ref ::lib.schema.expression.temporal/timezone-id])
(mr/def ::TemporalExtractUnit
  "Valid units to extract from a temporal."
  [:enum
   {:error/message "temporal extract unit"}
   :year-of-era
   :quarter-of-year
   :month-of-year
   :week-of-year-iso
   :week-of-year-us
   :week-of-year-instance
   :day-of-month
   :day-of-week
   :day-of-week-iso
   :hour-of-day
   :minute-of-hour
   :second-of-minute])
(mr/def ::DatetimeDiffUnit
  "Valid units for a datetime-diff clause."
  [:enum {:error/message "datetime-diff unit"} :second :minute :hour :day :week :month :quarter :year])
(mr/def ::ExtractWeekMode
  "Valid modes to extract weeks."
  [:enum {:error/message "temporal-extract week extraction mode"} :iso :us :instance])
(mr/def ::RelativeDatetimeUnit
  [:enum {:error/message "relative-datetime unit"} :default :minute :hour :day :week :month :quarter :year])

TODO - unit is not allowed if n is current

(defclause relative-datetime
  n    [:or [:= :current] :int]
  unit (optional [:ref ::RelativeDatetimeUnit]))
(defclause interval
  n    :int
  unit [:ref ::RelativeDatetimeUnit])

This clause is automatically generated by middleware when datetime literals (literal strings or one of the Java types) are encountered. The unit is inferred by looking at the Field the timestamp is compared against. It exists mostly as a convenience for driver implementations. You don't need to use this form directly when writing MBQL; datetime literal strings are preferred instead.

example: [:= [:field 10 {:temporal-unit :day}] "2018-10-02"]

becomes: [:= [:field 10 {:temporal-unit :day}] [:absolute-datetime #inst "2018-10-02" :day]]

(mr/def ::absolute-datetime
  [:multi {:error/message "valid :absolute-datetime clause"
           :doc/title     [:span [:code ":absolute-datetime"] " clause"]
           :dispatch      (fn [x]
                            (cond
                              (core/not (is-clause? :absolute-datetime x)) :invalid
                              (mr/validate ::lib.schema.literal/date (second x))      :date
                              :else                                        :datetime))}
   [:invalid [:fn
              {:error/message "not an :absolute-datetime clause"}
              (constantly false)]]
   [:date (helpers/clause
           :absolute-datetime
           "date" ::lib.schema.literal/date
           "unit" ::DateUnit)]
   [:datetime (helpers/clause
               :absolute-datetime
               "datetime" ::lib.schema.literal/datetime
               "unit"     ::DateTimeUnit)]])

Schema for an :absolute-datetime clause.

(def ^:internal ^{:clause-name :absolute-datetime} absolute-datetime
  [:ref ::absolute-datetime])

Almost exactly the same as absolute-datetime, but generated in some situations where the literal in question was clearly a time (e.g. "08:00:00.000") and/or the Field derived from :type/Time and/or the unit was a time-bucketing unit.

(defclause ^:internal time
  time ::lib.schema.literal/time
  unit ::TimeUnit)
(mr/def ::DateOrDatetimeLiteral
  "Schema for a valid date or datetime literal."
  [:or
   {:error/message "date or datetime literal"}
   relative-datetime
   absolute-datetime
   ;; literal datetime strings and Java types will get transformed to [[absolute-datetime]] clauses automatically by
   ;; middleware so drivers don't need to deal with these directly. You only need to worry about handling
   ;; `absolute-datetime` clauses.
   ::lib.schema.literal/datetime
   ::lib.schema.literal/date])
(mr/def ::TimeLiteral
  "Schema for valid time literals."
  [:or
   {:error/message "time literal"}
   time
   ::lib.schema.literal/time])
(mr/def ::TemporalLiteral
  "Schema for valid temporal literals."
  [:or
   {:error/message "temporal literal"}
   [:ref ::DateOrDatetimeLiteral]
   [:ref ::TimeLiteral]])
(mr/def ::DateTimeValue
  "Schema for a datetime value drivers will personally have to handle: an `absolute-datetime` form, a
  `relative-datetime` form, or a `time` form."
  (one-of absolute-datetime relative-datetime time))

-------------------------------------------------- Other Values --------------------------------------------------

(mr/def ::ValueTypeInfo
  [:map
   {:description (str "Type info about a value in a `:value` clause. Added automatically by `wrap-value-literals`"
                      " middleware to values in filter clauses based on the Field in the clause.")}
   [:database_type {:optional true} [:maybe ::lib.schema.common/non-blank-string]]
   [:base_type     {:optional true} [:maybe ::lib.schema.common/base-type]]
   [:semantic_type {:optional true} [:maybe ::lib.schema.common/semantic-or-relation-type]]
   [:unit          {:optional true} [:maybe ::DateTimeUnit]]
   [:name          {:optional true} [:maybe ::lib.schema.common/non-blank-string]]])

Arguments to filter clauses are automatically replaced with [:value <value> <type-info>] clauses by the wrap-value-literals middleware. This is done to make it easier to implement query processors, because most driver implementations dispatch off of Object type, which is often not enough to make informed decisions about how to treat certain objects. For example, a string compared against a Postgres UUID Field needs to be parsed into a UUID object, since text <-> UUID comparison doesn't work in Postgres. For this reason, raw literals in :filter clauses are wrapped in :value clauses and given information about the type of the Field they will be compared to.

(defclause ^:internal value
  value    :any
  type-info [:maybe ::ValueTypeInfo])
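To make the shape concrete, here is a hedged before/after sketch of a UUID comparison. The field ID, the UUID string, and the specific type-info values are made up for illustration; only the keys come from `::ValueTypeInfo` above, and the real middleware may attach different or additional information.

```clojure
;; Filter as it might be written by hand or by the frontend.
(def filter-before
  [:= [:field 1 {:base-type :type/UUID}] "e0a5a6b4-3f5e-4d57-9e0c-2f3b8d1c7a11"])

;; Hypothetical result after the raw literal is wrapped in a :value clause.
;; The type-info map tells the driver the string should be treated as a UUID.
(def filter-after
  [:= [:field 1 {:base-type :type/UUID}]
   [:value "e0a5a6b4-3f5e-4d57-9e0c-2f3b8d1c7a11"
    {:base_type     :type/UUID
     :database_type "uuid"
     :name          "id"}]])

;; The wrapped clause still carries the original literal in second position.
(def wrapped-literal (second (nth filter-after 2)))
```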

----------------------------------------------------- Fields -----------------------------------------------------

Expression references refer to something in the :expressions clause, e.g. something like

[:+ [:field 1 nil] [:field 2 nil]]

As of 0.42.0, :expression references can have an optional options map.

(defclause ^{:requires-features #{:expressions}} expression
  expression-name ::lib.schema.common/non-blank-string
  options         (optional :map))

Whether temporal-unit (e.g. :day) is valid for the given base-type (e.g. :type/Date). If either is nil, or if base-type is not a date, time, or datetime type, this returns true. Accepts either a map of field options, or base-type and temporal-unit passed separately.

(defn valid-temporal-unit-for-base-type?
  ([{:keys [base-type temporal-unit] :as _field-options}]
   (valid-temporal-unit-for-base-type? base-type temporal-unit))
  ([base-type temporal-unit]
   (if-let [units (when (core/and temporal-unit base-type)
                    (condp #(isa? %2 %1) base-type
                      :type/Date     date-bucketing-units
                      :type/Time     time-bucketing-units
                      :type/DateTime datetime-bucketing-units
                      nil))]
     (contains? units temporal-unit)
     true)))
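The permissive cases of this check are easy to miss, so here is a small standalone sketch of the same idea. Everything in it is an assumption for illustration: `valid-unit?` is a simplified re-implementation, the unit sets are abbreviated, and the `derive` calls set up a toy hierarchy rather than Metabase's real type hierarchy.

```clojure
;; Toy type hierarchy (assumed for this sketch only).
(derive :type/Date :type/Temporal)
(derive :type/Time :type/Temporal)
(derive :type/DateTime :type/Temporal)
(derive :type/DateTimeWithTZ :type/DateTime)

;; Abbreviated unit sets, checked in order like the condp above.
(def units-by-base-type
  [[:type/Date     #{:default :day :month :year}]
   [:type/Time     #{:default :minute :hour}]
   [:type/DateTime #{:default :day :month :year :minute :hour}]])

(defn valid-unit?
  "True unless `base-type` is a known temporal type that does not allow `temporal-unit`."
  [base-type temporal-unit]
  (if-let [units (when (and temporal-unit base-type)
                   (some (fn [[t units]] (when (isa? base-type t) units))
                         units-by-base-type))]
    (contains? units temporal-unit)
    true))

(def year-on-time     (valid-unit? :type/Time :year))           ; rejected
(def hour-on-subtype  (valid-unit? :type/DateTimeWithTZ :hour)) ; matched via isa?
(def missing-type     (valid-unit? nil :year))                  ; nothing to check
(def non-temporal     (valid-unit? :type/Text :year))           ; no unit set applies
```

Note that the check only ever rejects a unit when the base type is recognized; unknown or missing types pass through.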
(mr/def ::validate-temporal-unit
  ;; TODO - consider breaking this out into separate constraints for the three different types so we can generate more
  ;; specific error messages
  [:fn
   {:error/message "Invalid :temporal-unit for the specified :base-type."}
   valid-temporal-unit-for-base-type?])
(mr/def ::no-binning-options-at-top-level
  [:fn
   {:error/message ":binning keys like :strategy are not allowed at the top level of :field options."}
   (complement :strategy)])
(mr/def ::FieldOptions
  [:and
   [:map
    {:error/message "field options"}
    [:base-type {:optional true} [:maybe ::lib.schema.common/base-type]]

    ;; Following option conveys temporal unit that was set on a ref in previous stages. For details refer to
    ;; [:metabase.lib.schema.ref/field.options] schema.
    [:inherited-temporal-unit {:optional true} [:maybe ::DateTimeUnit]]

    [:source-field
     {:optional true
      :description
      "Replaces `fk->`.

  `:source-field` is used to refer to a FieldOrExpression from a different Table you would like IMPLICITLY JOINED to
     the source table.

  If both `:source-field` and `:join-alias` are supplied, `:join-alias` should be used to perform the join;
  `:source-field` should be for information purposes only."} ::lib.schema.id/field]

    [:temporal-unit
     {:optional true
      :description
      "`:temporal-unit` is used to specify DATE BUCKETING for a FieldOrExpression that represents a moment in time of
  some sort.

  There is no requirement that all `:type/Temporal` derived FieldOrExpressions specify a `:temporal-unit`, but for
  legacy reasons `:field` clauses that refer to `:type/DateTime` FieldOrExpressions will be automatically \"bucketed\"
  in the `:breakout` and `:filter` clauses, but nowhere else. Auto-bucketing only applies to `:filter` clauses when
  values for comparison are `yyyy-MM-dd` date strings. See the `auto-bucket-datetimes` middleware for more details.
  `:field` clauses elsewhere will not be automatically bucketed, so drivers still need to make sure they do any
  special datetime handling for plain `:field` clauses when their FieldOrExpression derives from `:type/DateTime`."}
     [:maybe ::DateTimeUnit]]

    [:join-alias
     {:optional true
      :description
      "Replaces `joined-field`.

  `:join-alias` is used to refer to a FieldOrExpression from a different Table/nested query that you are EXPLICITLY
  JOINING against."}
     [:maybe ::lib.schema.common/non-blank-string]]

    [:binning
     {:optional true
      :description
      "Replaces `binning-strategy`.

  Using binning requires the driver to support the `:binning` feature."}
     [:maybe [:ref ::lib.schema.binning/binning]]]]

   ;; additional validation
   [:ref
    {:description "If `:base-type` is specified, the `:temporal-unit` must make sense, e.g. no bucketing by `:year` for
  a `:type/Time` column."}
    ::validate-temporal-unit]

   [:ref
    {:description "You cannot use `:binning` keys like `:strategy` in the top level."}
    ::no-binning-options-at-top-level]])
(mr/def ::require-base-type-for-field-name
  [:fn
   {:error/message ":field clauses using a string field name must specify :base-type."}
   (fn [[_ id-or-name {:keys [base-type]}]]
     (if (string? id-or-name)
       base-type
       true))])
(mr/def ::field
  [:and
   {:doc/title [:span [:code ":field"] " clause"]}
   (helpers/clause
    :field
    "id-or-name" [:or ::lib.schema.id/field ::lib.schema.common/non-blank-string]
    "options"    [:maybe [:ref ::FieldOptions]])
   [:ref
    {:description "Fields using names rather than integer IDs are required to specify `:base-type`."}
    ::require-base-type-for-field-name]])

Schema for a :field clause.

(def ^{:clause-name :field, :added "0.39.0"} field
  [:ref ::field])
(def ^{:clause-name :field, :added "0.39.0"} field:id
  "Schema for a `:field` clause, with the added constraint that it must use an integer Field ID."
  [:and
   field
   [:fn
    {:error/message "Must be a :field with an integer Field ID."}
    (fn [[_ id-or-name]]
      (integer? id-or-name))]])
(mr/def ::Field
  [:schema
   {:doc/title "`:field` or `:expression` ref"}
   (one-of expression field)])

Schema for either a :field clause (reference to a Field) or an :expression clause (reference to an expression).

(def Field
  [:ref ::Field])

An aggregate field reference refers to an aggregation, e.g.

{:aggregation [[:count]] :order-by [[:asc [:aggregation 0]]]} ;; refers to the 0th aggregation, :count

Currently aggregate Field references can only be used inside order-by clauses. In the future, once we support SQL HAVING, we can allow them in filter clauses too.

TODO - shouldn't we allow composing aggregations in expressions? e.g.

{:order-by [[:asc [:+ [:aggregation 0] [:aggregation 1]]]]}

TODO - it would be nice if we could check that there's actually an aggregation with the corresponding index, wouldn't it?

As of 0.42.0 :aggregation references can have an optional options map.

(defclause aggregation
  aggregation-clause-index :int
  options                  (optional :map))
(mr/def ::Reference
  (one-of aggregation expression field))

Schema for any type of valid Field clause, or for an indexed reference to an aggregation clause.

(def Reference
  [:ref ::Reference])
(defclause ^{:added "0.50.0"} offset
  opts [:ref ::lib.schema.common/options]
  expr [:or [:ref ::FieldOrExpressionDef] [:ref ::Aggregation]]
  n    ::lib.schema.expression.window/offset.n)

-------------------------------------------------- Expressions ---------------------------------------------------

Expressions are "calculated column" definitions, defined once and then used elsewhere in the MBQL query.

Functions that return string values. Should match [[StringExpression]].

(def string-functions
  #{:substring :trim :rtrim :ltrim :upper :lower :replace :concat :regex-match-first :coalesce :case :if
    :host :domain :subdomain :month-name :quarter-name :day-name})

Schema for the definition of a string expression.

(def ^:private StringExpression
  [:ref ::StringExpression])
(mr/def ::StringExpressionArg
  [:multi
   {:dispatch (fn [x]
                (cond
                  (string? x)                     :string
                  (is-clause? string-functions x) :string-expression
                  (is-clause? :value x)           :value
                  :else                           :else))}
   [:string            :string]
   [:string-expression StringExpression]
   [:value             value]
   [:else              Field]])
(def ^:private StringExpressionArg
  [:ref ::StringExpressionArg])

Functions that return numeric values. Should match [[NumericExpression]].

(def numeric-functions
  #{:+ :- :/ :* :coalesce :length :round :ceil :floor :abs :power :sqrt :log :exp :case :if :datetime-diff
    ;; extraction functions (get some component of a given temporal value/column)
    :temporal-extract
    ;; SUGAR drivers do not need to implement
    :get-year :get-quarter :get-month :get-week :get-day :get-day-of-week :get-hour :get-minute :get-second})

Functions that return boolean values. Should match [[BooleanExpression]].

(def boolean-functions
  #{:and :or :not :< :<= :> :>= := :!= :in :not-in :between :starts-with :ends-with :contains
    :does-not-contain :inside :is-empty :not-empty :is-null :not-null :relative-time-interval :time-interval :during})
(def ^:private aggregations
  #{:sum :avg :stddev :var :median :percentile :min :max :cum-count :cum-sum :count-where :sum-where :share :distinct
    :metric :aggregation-options :count :offset})

Functions that return Date or DateTime values. Should match [[DatetimeExpression]].

(def ^:private datetime-functions
  #{:+ :datetime-add :datetime-subtract :convert-timezone :now})

Schema for the definition of a numeric expression. All numeric expressions evaluate to numeric values.

(def ^:private NumericExpression
  [:ref ::NumericExpression])

Schema for the definition of a boolean expression.

(def ^:private BooleanExpression
  [:ref ::BooleanExpression])

Schema for the definition of a date function expression.

(def DatetimeExpression
  [:ref ::DatetimeExpression])

Schema for anything that is a valid :aggregation clause.

(def Aggregation
  [:ref ::Aggregation])
(mr/def ::NumericExpressionArg
  [:multi
   {:error/message "numeric expression argument"
    :dispatch      (fn [x]
                     (cond
                       (number? x)                      :number
                       (is-clause? numeric-functions x) :numeric-expression
                       (is-clause? aggregations x)      :aggregation
                       (is-clause? :value x)            :value
                       :else                            :field))}
   [:number             number?]
   [:numeric-expression NumericExpression]
   [:aggregation        Aggregation]
   [:value              value]
   [:field              Field]])
(def ^:private NumericExpressionArg
  [:ref ::NumericExpressionArg])
(mr/def ::DateTimeExpressionArg
  [:multi
   {:error/message "datetime expression argument"
    :dispatch      (fn [x]
                     (cond
                       (is-clause? aggregations x)       :aggregation
                       (is-clause? :value x)             :value
                       (is-clause? datetime-functions x) :datetime-expression
                       :else                             :else))}
   [:aggregation         Aggregation]
   [:value               value]
   [:datetime-expression DatetimeExpression]
   [:else                [:or [:ref ::DateOrDatetimeLiteral] Field]]])
(def ^:private DateTimeExpressionArg
  [:ref ::DateTimeExpressionArg])
(mr/def ::ExpressionArg
  [:multi
   {:error/message "expression argument"
    :dispatch      (fn [x]
                     (cond
                       (number? x)                       :number
                       (boolean? x)                      :boolean
                       (is-clause? boolean-functions x)  :boolean-expression
                       (is-clause? numeric-functions x)  :numeric-expression
                       (is-clause? datetime-functions x) :datetime-expression
                       (string? x)                       :string
                       (is-clause? string-functions x)   :string-expression
                       (is-clause? :value x)             :value
                       :else                             :else))}
   [:number              number?]
   [:boolean             :boolean]
   [:boolean-expression  BooleanExpression]
   [:numeric-expression  NumericExpression]
   [:datetime-expression DatetimeExpression]
   [:string              :string]
   [:string-expression   StringExpression]
   [:value               value]
   [:else                Field]])
(def ^:private ExpressionArg
  [:ref ::ExpressionArg])
(mr/def ::Addable
  [:or
   {:error/message "numeric expression arg or interval"}
   DateTimeExpressionArg
   interval
   NumericExpressionArg])
(def ^:private Addable
  [:ref ::Addable])
(mr/def ::IntGreaterThanZeroOrNumericExpression
  [:multi
   {:error/message "int greater than zero or numeric expression"
    :dispatch      (fn [x]
                     (if (number? x)
                       :number
                       :else))}
   [:number PositiveInt]
   [:else   NumericExpression]])
(def ^:private IntGreaterThanZeroOrNumericExpression
  [:ref ::IntGreaterThanZeroOrNumericExpression])
(defclause ^{:requires-features #{:expressions}} coalesce
  a ExpressionArg, b ExpressionArg, more (rest ExpressionArg))
(defclause ^{:requires-features #{:expressions}} substring
  s StringExpressionArg, start IntGreaterThanZeroOrNumericExpression, length (optional NumericExpressionArg))
(defclause ^{:requires-features #{:expressions}} length
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} trim
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} rtrim
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} ltrim
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} upper
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} lower
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} replace
  s StringExpressionArg, match :string, replacement :string)

Relax the arg types to ExpressionArg for concat, since many DBs allow concatenating non-string types. This also aligns with the corresponding MLv2 schema and with the reference docs we publish.

(defclause ^{:requires-features #{:expressions}} concat
  a ExpressionArg, b ExpressionArg, more (rest ExpressionArg))
(defclause ^{:requires-features #{:expressions :regex}} regex-match-first
  s StringExpressionArg, pattern :string)
(defclause ^{:requires-features #{:expressions :regex}} host
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions :regex}} domain
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions :regex}} subdomain
  s StringExpressionArg)
(defclause ^{:requires-features #{:expressions}} month-name
  n NumericExpressionArg)
(defclause ^{:requires-features #{:expressions}} quarter-name
  n NumericExpressionArg)
(defclause ^{:requires-features #{:expressions}} day-name
  n NumericExpressionArg)
(defclause ^{:requires-features #{:expressions}} +
  x Addable, y Addable, more (rest Addable))
(defclause ^{:requires-features #{:expressions}} -
  x NumericExpressionArg, y Addable, more (rest Addable))
(defclause ^{:requires-features #{:expressions}} /, x NumericExpressionArg, y NumericExpressionArg, more (rest NumericExpressionArg))
(defclause ^{:requires-features #{:expressions}} *, x NumericExpressionArg, y NumericExpressionArg, more (rest NumericExpressionArg))
(defclause ^{:requires-features #{:expressions}} floor
  x NumericExpressionArg)
(defclause ^{:requires-features #{:expressions}} ceil
  x NumericExpressionArg)
(defclause ^{:requires-features #{:expressions}} round
  x NumericExpressionArg)
(defclause ^{:requires-features #{:expressions}} abs
  x NumericExpressionArg)
(defclause ^{:requires-features #{:advanced-math-expressions}} power
  x NumericExpressionArg,  y NumericExpressionArg)
(defclause ^{:requires-features #{:advanced-math-expressions}} sqrt
  x NumericExpressionArg)
(defclause ^{:requires-features #{:advanced-math-expressions}} exp
  x NumericExpressionArg)
(defclause ^{:requires-features #{:advanced-math-expressions}} log
  x NumericExpressionArg)

The result is non-negative if x <= y, and negative otherwise.

Days, weeks, months, and years are only counted if they are whole to the "day". For example, datetimeDiff("2022-01-30", "2022-02-28", "month") returns 0 months.

If the values are datetimes, the time doesn't matter for these units. For example, datetimeDiff("2022-01-01T09:00:00", "2022-01-02T08:00:00", "day") returns 1 day even though it is less than 24 hours.

Hours, minutes, and seconds are only counted if they are whole. For example, datetimeDiff("2022-01-01T01:00:30", "2022-01-01T02:00:29", "hour") returns 0 hours.
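The three examples above can be reproduced with plain `java.time` interop. This is only a sketch of the counting rules, not how any driver implements `datetime-diff`; the helper names `whole-units-between` and `date-of` are hypothetical. `ChronoUnit/between` counts complete units, and truncating datetimes to dates first mirrors the rule that the time of day is ignored for day-based units.

```clojure
(import '(java.time LocalDate LocalDateTime)
        '(java.time.temporal ChronoUnit Temporal))

(defn whole-units-between
  "Number of complete `unit`s from `x` to `y`; negative when `y` is before `x`."
  [^ChronoUnit unit ^Temporal x ^Temporal y]
  (.between unit x y))

(defn date-of
  "Drop the time portion of an ISO-8601 local datetime string."
  [s]
  (.toLocalDate (LocalDateTime/parse s)))

;; Jan 30 -> Feb 28 is not a whole month.
(def month-diff
  (whole-units-between ChronoUnit/MONTHS
                       (LocalDate/parse "2022-01-30")
                       (LocalDate/parse "2022-02-28")))

;; Less than 24 hours apart, but on consecutive dates.
(def day-diff
  (whole-units-between ChronoUnit/DAYS
                       (date-of "2022-01-01T09:00:00")
                       (date-of "2022-01-02T08:00:00")))

;; One second short of a whole hour.
(def hour-diff
  (whole-units-between ChronoUnit/HOURS
                       (LocalDateTime/parse "2022-01-01T01:00:30")
                       (LocalDateTime/parse "2022-01-01T02:00:29")))
```

Swapping the two arguments of the day example yields a negative result, matching the sign rule stated at the top.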

(defclause ^{:requires-features #{:datetime-diff}} datetime-diff
  datetime-x DateTimeExpressionArg
  datetime-y DateTimeExpressionArg
  unit       [:ref ::DatetimeDiffUnit])
(defclause ^{:requires-features #{:temporal-extract}} temporal-extract
  datetime DateTimeExpressionArg
  unit     [:ref ::TemporalExtractUnit]
  mode     (optional [:ref ::ExtractWeekMode])) ;; only for get-week and get-day-of-week

SUGAR CLAUSES: get-year, get-month, etc. are all sugar clauses that are rewritten as :temporal-extract clauses, e.g. [:temporal-extract column :year-of-era]

(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-year
  date DateTimeExpressionArg)
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-quarter
  date DateTimeExpressionArg)
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-month
  date DateTimeExpressionArg)
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-week
  date DateTimeExpressionArg
  mode (optional [:ref ::ExtractWeekMode]))
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-day
  date DateTimeExpressionArg)
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-day-of-week
  date DateTimeExpressionArg
  mode (optional [:ref ::ExtractWeekMode]))
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-hour
  datetime DateTimeExpressionArg)
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-minute
  datetime DateTimeExpressionArg)
(defclause ^{:requires-features #{:temporal-extract}} ^:sugar get-second
  datetime DateTimeExpressionArg)
(defclause ^{:requires-features #{:convert-timezone}} convert-timezone
  datetime DateTimeExpressionArg
  to       TimezoneId
  from     (optional TimezoneId))
(def ^:private ArithmeticDateTimeUnit
  [:enum {:error/message "datetime arithmetic unit"} :millisecond :second :minute :hour :day :week :month :quarter :year])
(defclause ^{:requires-features #{:date-arithmetics}} datetime-add
  datetime DateTimeExpressionArg
  amount   NumericExpressionArg
  unit     ArithmeticDateTimeUnit)
(defclause ^{:requires-features #{:now}} now)
(defclause ^{:requires-features #{:date-arithmetics}} datetime-subtract
  datetime DateTimeExpressionArg
  amount   NumericExpressionArg
  unit     ArithmeticDateTimeUnit)
(mr/def ::DatetimeExpression
  (one-of + datetime-add datetime-subtract convert-timezone now))

----------------------------------------------------- Filter -----------------------------------------------------

Schema for a valid MBQL :filter clause.

(def Filter
  [:ref ::Filter])
(defclause and
  first-clause  Filter
  second-clause Filter
  other-clauses (rest Filter))
(defclause or
  first-clause  Filter
  second-clause Filter
  other-clauses (rest Filter))
(defclause not, clause Filter)
(def ^:private FieldOrExpressionRefOrRelativeDatetime
  [:multi
   {:error/message ":field or :expression reference or :relative-datetime"
    :error/fn      (constantly ":field or :expression reference or :relative-datetime")
    :dispatch      (fn [x]
                     (if (is-clause? :relative-datetime x)
                       :relative-datetime
                       :else))}
   [:relative-datetime relative-datetime]
   [:else              Field]])
(mr/def ::EqualityComparable
  [:maybe
   {:error/message "equality comparable"}
   [:or
    :boolean
    number?
    :string
    [:ref ::TemporalLiteral]
    FieldOrExpressionRefOrRelativeDatetime
    ExpressionArg
    value]])

Schema for things that make sense in a = or != filter, i.e. things that can be compared for equality.

(def ^:private EqualityComparable
  [:ref ::EqualityComparable])
(mr/def ::OrderComparable
  [:multi
   {:error/message "order comparable"
    :dispatch      (fn [x]
                     (if (is-clause? :value x)
                       :value
                       :else))}
   [:value value]
   [:else [:or
           number?
           :string
           [:ref ::TemporalLiteral]
           ExpressionArg
           FieldOrExpressionRefOrRelativeDatetime]]])

Schema for things that make sense in a filter like > or <, i.e. things that can be sorted.

(def ^:private OrderComparable
  [:ref ::OrderComparable])

For all of the non-compound Filter clauses below, the first arg is an implicit Field ID.

These are SORT OF SUGARY, because extra values will automatically be converted to compound clauses. Driver implementations only need to handle the 2-arg forms.

= works like SQL IN with more than 2 args

[:= [:field 1 nil] 2 3] --[DESUGAR]--> [:or [:= [:field 1 nil] 2] [:= [:field 1 nil] 3]]

!= works like SQL NOT IN with more than 2 args

[:!= [:field 1 nil] 2 3] --[DESUGAR]--> [:and [:!= [:field 1 nil] 2] [:!= [:field 1 nil] 3]]
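The two rewrites above follow one pattern, which the sketch below captures. `desugar-multi-arg-equality` is a hypothetical helper written for illustration; it is not the actual desugar middleware, and it ignores nested clauses.

```clojure
(defn desugar-multi-arg-equality
  "Expand a multi-value := into an :or of 2-arg := clauses, and a multi-value
  :!= into an :and of 2-arg :!= clauses. Other clauses are returned unchanged."
  [[op field & values :as clause]]
  (if (and (#{:= :!=} op) (> (count values) 1))
    (into [(if (= op :=) :or :and)]
          (map (fn [v] [op field v]))
          values)
    clause))

(def eq-result  (desugar-multi-arg-equality [:=  [:field 1 nil] 2 3]))
(def neq-result (desugar-multi-arg-equality [:!= [:field 1 nil] 2 3]))
(def unchanged  (desugar-multi-arg-equality [:=  [:field 1 nil] 2]))
```

The combinator differs because "equal to any of" is a disjunction while "different from all of" is a conjunction.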

(defclause =,  field EqualityComparable, value-or-field EqualityComparable, more-values-or-fields (rest EqualityComparable))
(defclause !=, field EqualityComparable, value-or-field EqualityComparable, more-values-or-fields (rest EqualityComparable))

aliases for := and :!=

(defclause ^:sugar in,  field EqualityComparable, value-or-field EqualityComparable, more-values-or-fields (rest EqualityComparable))
(defclause ^:sugar not-in,  field EqualityComparable, value-or-field EqualityComparable, more-values-or-fields (rest EqualityComparable))
(defclause <,  field OrderComparable, value-or-field OrderComparable)
(defclause >,  field OrderComparable, value-or-field OrderComparable)
(defclause <=, field OrderComparable, value-or-field OrderComparable)
(defclause >=, field OrderComparable, value-or-field OrderComparable)

:between is INCLUSIVE just like SQL !!!

(defclause between field OrderComparable, min OrderComparable, max OrderComparable)

SUGAR CLAUSE: This is automatically rewritten as a pair of :between clauses by the :desugar middleware.

(defclause ^:sugar inside
  lat-field OrderComparable
  lon-field OrderComparable
  lat-max   OrderComparable
  lon-min   OrderComparable
  lat-min   OrderComparable
  lon-max   OrderComparable)
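Because the bounds are passed in the order lat-max, lon-min, lat-min, lon-max, it helps to see how they pair up. The sketch below assumes the two :between clauses are combined with :and, one per axis; `desugar-inside` is a hypothetical helper for illustration, not the middleware itself.

```clojure
(defn desugar-inside
  "Rewrite an :inside clause as one inclusive :between per axis."
  [[_ lat-field lon-field lat-max lon-min lat-min lon-max]]
  [:and
   [:between lat-field lat-min lat-max]
   [:between lon-field lon-min lon-max]])

;; Bounding box with latitude in [39.0, 40.0] and longitude in [-75.0, -74.0].
(def inside-result
  (desugar-inside [:inside [:field 1 nil] [:field 2 nil] 40.0 -75.0 39.0 -74.0]))
```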

SUGAR CLAUSES: These are rewritten as [:= <field> nil] and [:!= <field> nil] respectively.

(defclause ^:sugar is-null,  field Field)
(defclause ^:sugar not-null, field Field)

These are rewritten as [:or [:= <field> nil] [:= <field> ""]] and [:and [:!= <field> nil] [:!= <field> ""]] respectively.

(defclause ^:sugar is-empty,  field Field)
(defclause ^:sugar not-empty, field Field)
(def ^:private StringFilterOptions
  [:map
   ;; default true
   [:case-sensitive {:optional true} :boolean]])
(doseq [clause-keyword [::starts-with ::ends-with ::contains ::does-not-contain]]
  (mr/def clause-keyword
    [:or
     ;; Binary form
     (helpers/clause (keyword (name clause-keyword))
                     "field" StringExpressionArg
                     "string-or-field" StringExpressionArg
                     "options" [:optional StringFilterOptions])
     ;; Multi-arg form
     (helpers/clause (keyword (name clause-keyword))
                     "options" StringFilterOptions
                     "field" StringExpressionArg
                     "string-or-field" StringExpressionArg
                     "second-string-or-field" StringExpressionArg
                     "more-strings-or-fields" [:rest StringExpressionArg])]))

Schema for a valid :starts-with clause.

(def ^{:clause-name :starts-with} starts-with
  [:ref ::starts-with])

Schema for a valid :ends-with clause.

(def ^{:clause-name :ends-with} ends-with
  [:ref ::ends-with])

Schema for a valid :contains clause.

(def ^{:clause-name :contains} contains
  [:ref ::contains])

Schema for a valid :does-not-contain clause.

SUGAR: this is rewritten as [:not [:contains ...]]

(def ^{:sugar       true
       :clause-name :does-not-contain}
  does-not-contain
  [:ref ::does-not-contain])
(def ^:private TimeIntervalOptions
  ;; Should we include partial results for the current day/month/etc? Defaults to `false`; set this to `true` to
  ;; include them.
  [:map
   ;; default false
   [:include-current {:optional true} :boolean]])

Filter subclause. Syntactic sugar for specifying a specific time interval.

Return rows where datetime Field 100's value is in the current month

[:time-interval [:field 100 nil] :current :month]

Return rows where datetime Field 100's value is in the current month, including partial results for the current day

[:time-interval [:field 100 nil] :current :month {:include-current true}]

SUGAR: This is automatically rewritten as a filter clause with a relative-datetime value

(defclause ^:sugar time-interval
  field   Field
  n       [:or
           :int
           [:enum :current :last :next]]
  unit    [:ref ::RelativeDatetimeUnit]
  options (optional TimeIntervalOptions))
(defclause ^:sugar during
  field   Field
  value   [:or ::lib.schema.literal/date ::lib.schema.literal/datetime]
  unit    ::DateTimeUnit)
(defclause ^:sugar relative-time-interval
  col           Field
  value         :int
  bucket        [:ref ::RelativeDatetimeUnit]
  offset-value  :int
  offset-bucket [:ref ::RelativeDatetimeUnit])

A segment is a special macro that saves some predefined filter clause, e.g. [:segment 1]. It gets replaced by a normal Filter clause during MBQL macroexpansion.

It can also be used for GA, which looks something like [:segment "gaid::-11"]. GA segments aren't actually MBQL segments and are passed through to GA.

(def ^:private SegmentID
  [:ref ::lib.schema.id/segment])
(defclause ^:sugar segment
  segment-id [:or SegmentID ::lib.schema.common/non-blank-string])
(mr/def ::BooleanExpression
  (one-of
   ;; filters drivers must implement
   and or not = != < > <= >= between starts-with ends-with contains
   ;; SUGAR filters drivers do not need to implement
   in not-in does-not-contain inside is-empty not-empty is-null not-null relative-time-interval time-interval during))
(mr/def ::Filter
  [:multi
   {:error/message "valid filter expression"
    :dispatch      (fn [x]
                     (cond
                       (is-clause? datetime-functions x) :datetime
                       (is-clause? numeric-functions x)  :numeric
                       (is-clause? string-functions x)   :string
                       (is-clause? boolean-functions x)  :boolean
                       :else                             :else))}
   [:datetime DatetimeExpression]
   [:numeric  NumericExpression]
   [:string   StringExpression]
   [:boolean  BooleanExpression]
   [:else     (one-of segment)]])
(def ^:private CaseClause
  [:tuple {:error/message ":case subclause"} Filter ExpressionArg])
(def ^:private CaseClauses
  [:maybe [:sequential CaseClause]])
(def ^:private CaseOptions
  [:map
   {:error/message ":case options"}
   [:default {:optional true} ExpressionArg]])
(defclause ^{:requires-features #{:basic-aggregations}} case
  clauses CaseClauses, options (optional CaseOptions))
(defclause ^:sugar ^{:requires-features #{:basic-aggregations}} [case:if if]
  clauses CaseClauses, options (optional CaseOptions))
(mr/def ::NumericExpression
  (one-of + - / * coalesce length floor ceil round abs power sqrt exp log case case:if datetime-diff
          temporal-extract get-year get-quarter get-month get-week get-day get-day-of-week
          get-hour get-minute get-second))
(mr/def ::StringExpression
  (one-of substring trim ltrim rtrim replace lower upper concat regex-match-first coalesce case case:if host domain
          subdomain month-name quarter-name day-name))
(mr/def ::FieldOrExpressionDef
  "Schema for anything that is accepted as a top-level expression definition, either an arithmetic expression such as a
  `:+` clause or a `:field` clause."
  [:multi
   {:error/message ":field or :expression reference or expression"
    :doc/title     "expression definition"
    :dispatch      (fn [x]
                     (cond
                       (is-clause? numeric-functions x)  :numeric
                       (is-clause? string-functions x)   :string
                       (is-clause? boolean-functions x)  :boolean
                       (is-clause? datetime-functions x) :datetime
                       (is-clause? :case x)              :case
                       (is-clause? :if   x)              :if
                       (is-clause? :offset x)            :offset
                       :else                             :else))}
   [:numeric  NumericExpression]
   [:string   StringExpression]
   [:boolean  BooleanExpression]
   [:datetime DatetimeExpression]
   [:case     case]
   [:if       case:if]
   [:offset   offset]
   [:else     Field]])

-------------------------------------------------- Aggregations --------------------------------------------------

For all of the 'normal' Aggregations below (excluding Metrics) fields are implicit Field IDs

cum-sum and cum-count are SUGAR because they're implemented in middleware. The clauses are swapped out with sum and count aggregations respectively and summation is done in Clojure-land

(defclause ^{:requires-features #{:basic-aggregations}} ^:sugar count,     field (optional Field))
(defclause ^{:requires-features #{:basic-aggregations}} ^:sugar cum-count, field (optional Field))

Technically, aggregations other than count can also accept expressions as args, e.g.

[[:sum [:+ [:field 1 nil] [:field 2 nil]]]]

Which is equivalent to SQL:

SUM(field1 + field2)

(defclause ^{:requires-features #{:basic-aggregations}} avg,      field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:basic-aggregations}} cum-sum,  field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:basic-aggregations}} distinct, field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:basic-aggregations}} sum,      field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:basic-aggregations}} min,      field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:basic-aggregations}} max,      field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:basic-aggregations}} sum-where
  field-or-expression [:ref ::FieldOrExpressionDef], pred Filter)
(defclause ^{:requires-features #{:basic-aggregations}} count-where
  pred Filter)
(defclause ^{:requires-features #{:basic-aggregations}} share
  pred Filter)
(defclause ^{:requires-features #{:standard-deviation-aggregations}} stddev
  field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:standard-deviation-aggregations}} [ag:var var]
  field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:percentile-aggregations}} median
  field-or-expression [:ref ::FieldOrExpressionDef])
(defclause ^{:requires-features #{:percentile-aggregations}} percentile
  field-or-expression [:ref ::FieldOrExpressionDef], percentile NumericExpressionArg)

Metrics are just 'macros' (placeholders for other aggregations with optional filter and breakout clauses) that get expanded to other aggregations/etc. in the expand-macros middleware

(defclause metric
  metric-id ::lib.schema.id/metric)

the following are definitions for expression aggregations, e.g.

[:+ [:sum [:field 10 nil]] [:sum [:field 20 nil]]]

(mr/def ::UnnamedAggregation
  [:multi
   {:error/message "unnamed aggregation clause or numeric expression"
    :dispatch      (fn [x]
                     (if (is-clause? numeric-functions x)
                       :numeric-expression
                       :else))}
   [:numeric-expression NumericExpression]
   [:else (one-of avg cum-sum distinct stddev sum min max metric share count-where
                  sum-where case case:if median percentile ag:var cum-count count offset)]])
(def ^:private UnnamedAggregation
  ::UnnamedAggregation)

Additional options for any aggregation clause when wrapping it in :aggregation-options.

(def ^:private AggregationOptions
  [:map
   {:error/message ":aggregation-options options"}
   ;; name to use for this aggregation in the native query instead of the default name (e.g. `count`)
   [:name         {:optional true} ::lib.schema.common/non-blank-string]
   ;; user-facing display name for this aggregation instead of the default one
   [:display-name {:optional true} ::lib.schema.common/non-blank-string]])
(defclause aggregation-options
  aggregation UnnamedAggregation
  options     AggregationOptions)
(mr/def ::Aggregation
  [:multi
   {:error/message "aggregation clause or numeric expression"
    :dispatch      (fn [x]
                     (if (is-clause? :aggregation-options x)
                       :aggregation-options
                       :unnamed-aggregation))}
   [:aggregation-options aggregation-options]
   [:unnamed-aggregation UnnamedAggregation]])
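
A small standalone sketch of the dispatch above, for reading an aggregation that may or may not be wrapped in :aggregation-options. The names `clause?`, `aggregation-kind` and `aggregation-name` are hypothetical, and `clause?` is a simplified stand-in for `is-clause?` that assumes clauses are vectors.

```clojure
;; Simplified stand-in for is-clause?; assumes clauses are vectors tagged by a keyword.
(defn clause?
  [tag x]
  (and (vector? x) (= tag (first x))))

;; Mirrors the :dispatch function of the Aggregation schema.
(defn aggregation-kind
  [x]
  (if (clause? :aggregation-options x)
    :aggregation-options
    :unnamed-aggregation))

;; Returns the custom :name of a wrapped aggregation, or nil if there is none.
(defn aggregation-name
  [ag]
  (when (clause? :aggregation-options ag)
    (let [[_tag _inner options] ag]
      (:name options))))

(aggregation-kind [:aggregation-options [:sum [:field 10 nil]] {:name "total"}])
;; => :aggregation-options
```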

---------------------------------------------------- Order-By ----------------------------------------------------

order-by is just a series of [<direction> <field>] clauses like

{:order-by [[:asc [:field 1 nil]], [:desc [:field 2 nil]]]}

Field ID is implicit in these clauses

(defclause asc,  field Reference)
(defclause desc, field Reference)
(mr/def ::OrderBy
  "Schema for an `order-by` clause subclause."
  (one-of asc desc))

+----------------------------------------------------------------------------------------------------------------+
|                                                    Queries                                                     |
+----------------------------------------------------------------------------------------------------------------+

---------------------------------------------- Native [Inner] Query ----------------------------------------------

Schema for valid values of template tag :type.

(def ^:private TemplateTagType
  [:enum :snippet :card :dimension :number :text :date])
(def ^:private TemplateTag:Common
  "Things required by all template tag types."
  [:map
   [:type         TemplateTagType]
   [:name         ::lib.schema.common/non-blank-string]
   [:display-name ::lib.schema.common/non-blank-string]
   ;; TODO -- `:id` is actually 100% required but we have a lot of tests that don't specify it because this constraint
   ;; wasn't previously enforced; we need to go in and fix those tests and make this non-optional
   [:id {:optional true} ::lib.schema.common/non-blank-string]])

Example:

{:id "c2fc7310-44eb-4f21-c3a0-63806ffb7ddd" :name "snippet: select" :display-name "Snippet: select" :type :snippet :snippet-name "select" :snippet-id 1}

(mr/def ::TemplateTag:Snippet
  "Schema for a native query snippet template tag."
  [:merge
   TemplateTag:Common
   [:map
    [:type         [:= :snippet]]
    [:snippet-name ::lib.schema.common/non-blank-string]
    [:snippet-id   PositiveInt]
    ;; database to which this Snippet belongs. Doesn't always seem to be specified.
    [:database {:optional true} PositiveInt]]])

Example:

{:id "fc5e14d9-7d14-67af-66b2-b2a6e25afeaf" :name "#1635" :display-name "#1635" :type :card :card-id 1635}

(mr/def ::TemplateTag:SourceQuery
  "Schema for a source query template tag."
  [:merge
   TemplateTag:Common
   [:map
    [:type    [:= :card]]
    [:card-id PositiveInt]]])
(def ^:private TemplateTag:Value:Common
  "Stuff shared between the Field filter and raw value template tag schemas."
  [:merge
   TemplateTag:Common
   [:map
    ;; default value for this parameter
    [:default  {:optional true} :any]
    ;; whether or not a value for this parameter is required in order to run the query
    [:required {:optional true} :boolean]]])

Example:

{:id "c20851c7-8a80-0ffa-8a99-ae636f0e9539" :name "date" :display-name "Date" :type :dimension, :dimension [:field 4 nil] :widget-type :date/all-options}

(mr/def ::TemplateTag:FieldFilter
  "Schema for a field filter template tag."
  [:merge
   TemplateTag:Value:Common
   [:map
    [:type        [:= :dimension]]
    [:dimension   field]

    [:widget-type
     [:ref
      {:description
       "which type of widget the frontend should show for this Field Filter; this also affects which parameter types
  are allowed to be specified for it."}
      ::WidgetType]]

    [:options
     {:optional    true
      :description "optional map to be appended to filter clause"}
     [:maybe [:map-of :keyword :any]]]]])

Example:

{:id "35f1ecd4-d622-6d14-54be-750c498043cb" :name "id" :display-name "Id" :type :number :required true :default "1"}

(mr/def ::TemplateTag:RawValue
  "Schema for a raw value template tag."
  [:merge
   TemplateTag:Value:Common
   [:map
    [:type
     [:ref
      {:description
       "`:type` is used by the FE to determine which type of widget to display for the template tag, and to determine
  which types of parameters are allowed to be passed in for this template tag."}
     ::lib.schema.template-tag/raw-value.type]]])
(mr/def ::TemplateTag
  "Schema for a template tag as specified in a native query. There are four types of template tags, differentiated by
  `:type`.

  Template tags are used to specify {{placeholders}} in native queries that are replaced with some sort of value when
  the query itself runs. There are four basic types of template tag for native queries:

  1. Field filters, which are used like

         SELECT * FROM table WHERE {{field_filter}}

     These reference specific Fields and are replaced with entire conditions, e.g. `some_field > 1000`

  2. Raw values, which are used like

         SELECT * FROM table WHERE my_field = {{x}}

     These are replaced with raw values.

  3. Native query snippets, which might be used like

         SELECT * FROM ({{snippet: orders}}) source

     These are replaced with `NativeQuerySnippet`s from the application database.

  4. Source query Card IDs, which are used like

         SELECT * FROM ({{#123}}) source

     These are replaced with the query from the Card with that ID.

  Field filters and raw values usually have their value specified by `:parameters`."
  [:multi
   {:dispatch :type}
   [:dimension   [:ref ::TemplateTag:FieldFilter]]
   [:snippet     [:ref ::TemplateTag:Snippet]]
   [:card        [:ref ::TemplateTag:SourceQuery]]
   [::mc/default [:ref ::TemplateTag:RawValue]]])
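
The :type dispatch above can be read as the following standalone sketch. `template-tag-kind` and the keywords it returns are hypothetical labels chosen for illustration; evaluate it in a scratch namespace, since this namespace excludes `case` from `clojure.core`.

```clojure
;; Hypothetical helper mirroring the :multi dispatch: three :type values pick a
;; dedicated schema, and every other :type falls through to the raw value schema.
(defn template-tag-kind
  [tag]
  (case (:type tag)
    :dimension :field-filter
    :snippet   :snippet
    :card      :source-query
    :raw-value))

(template-tag-kind {:name "id" :display-name "Id" :type :number})
;; => :raw-value
```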

Alias for ::TemplateTag; prefer that going forward.

(def TemplateTag
  [:ref ::TemplateTag])
(mr/def ::TemplateTagMap
  "Schema for the `:template-tags` map passed in as part of a native query.

  Map of template tag name -> template tag definition"
  [:and
   [:map-of ::lib.schema.common/non-blank-string TemplateTag]
   ;; make sure people don't try to pass in a `:name` that's different from the actual key in the map.
   [:fn
    {:error/message "keys in template tag map must match the :name of their values"}
    (fn [m]
      (every? (fn [[tag-name tag-definition]]
                (core/= tag-name (:name tag-definition)))
              m))]])
(def ^:private NativeQuery:Common
  [:map
   [:template-tags {:optional true} [:ref ::TemplateTagMap]]
   ;; collection (table) this query should run against. Needed for MongoDB
   [:collection    {:optional true} [:maybe ::lib.schema.common/non-blank-string]]])

Schema for a valid, normalized native [inner] query.

(def NativeQuery
  [:merge
   NativeQuery:Common
   [:map
    [:query :any]]])
(mr/def ::NativeSourceQuery
  [:merge
   NativeQuery:Common
   [:map
    [:native :any]]])

----------------------------------------------- MBQL [Inner] Query -----------------------------------------------

Schema for a valid, normalized MBQL [inner] query.

(def MBQLQuery
  [:ref ::MBQLQuery])

Schema for a valid value for a :source-query clause.

(def SourceQuery
  [:multi
   {:dispatch (fn [x]
                (if ((every-pred map? :native) x)
                  :native
                  :mbql))}
   ;; when using native queries as source queries the schema is exactly the same except use `:native` in place of
   ;; `:query` for reasons I do not fully remember (perhaps to make it easier to differentiate them from MBQL source
   ;; queries).
   [:native [:ref ::NativeSourceQuery]]
   [:mbql   MBQLQuery]])
(mr/def ::SourceQueryMetadata
  "Schema for the expected keys for a single column in `:source-metadata` (`:source-metadata` is a sequence of these
  entries), if it is passed in to the query.

  This metadata automatically gets added for all source queries that are referenced via the `card__id` `:source-table`
  form; for explicit `:source-query`s you should usually include this information yourself."
  [:map
   [:name         ::lib.schema.common/non-blank-string]
   [:base_type    ::lib.schema.common/base-type]
   ;; this is only used by the annotate post-processing stage, not really needed at all for pre-processing, might be
   ;; able to remove this as a requirement
   [:display_name ::lib.schema.common/non-blank-string]
   [:semantic_type {:optional true} [:maybe ::lib.schema.common/semantic-or-relation-type]]
   ;; you'll need to provide this in order to use BINNING
   [:fingerprint   {:optional true} [:maybe :map]]])

Alias for ::SourceQueryMetadata -- prefer that instead.

(def SourceQueryMetadata
  ;; TODO - there is a very similar schema in `metabase.analyze.query-results`; see if we can merge them
  [:ref ::SourceQueryMetadata])

Pattern that matches card__id strings that can be used as the :source-table of MBQL queries.

(def source-table-card-id-regex
  #"^card__[1-9]\d*$")

Schema for a valid value for the :source-table clause of an MBQL query.

(def ^:private SourceTable
  [:or
   ::lib.schema.id/table
   [:re
    {:error/message "'card__<id>' string Table ID"
     :description   "`card__<id>` string Table ID"}
    source-table-card-id-regex]])
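
As a rough illustration of the `card__<id>` form, the sketch below extracts the numeric Card ID from a :source-table value. `card-source-table->id` is a hypothetical helper and the pattern is copied locally so the snippet stands alone in a scratch namespace.

```clojure
;; Local copy of the card__<id> pattern so this sketch is self-contained.
(def card-source-table-pattern
  #"^card__[1-9]\d*$")

;; Hypothetical helper: returns the Card ID as a long, or nil when the value
;; is not a card__<id> string (for example a plain integer Table ID).
(defn card-source-table->id
  [source-table]
  (when (and (string? source-table)
             (re-matches card-source-table-pattern source-table))
    (Long/parseLong (subs source-table (count "card__")))))

(card-source-table->id "card__123")
;; => 123
```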

Valid values of the :strategy key in a join map.

(def join-strategies
  #{:left-join :right-join :inner-join :full-join})

Strategy that should be used to perform the equivalent of a SQL JOIN against another table or a nested query. These correspond 1:1 to features of the same name in driver features lists; e.g. you should check that the current driver supports :full-join before generating a Join clause using that strategy.

(def JoinStrategy
  (into [:enum] join-strategies))

Schema for valid values of the MBQL :fields clause.

(def Fields
  [:ref ::Fields])
(def ^:private JoinFields
  [:or
   {:error/message "Valid join `:fields`: `:all`, `:none`, or a sequence of `:field` clauses that have `:join-alias`."}
   [:enum :all :none]
   Fields])
(mr/def ::Join
  "Perform the equivalent of a SQL `JOIN` with another Table or nested `:source-query`. JOINs are either explicitly
  specified in the incoming query, or implicitly generated when one uses a `:field` clause with `:source-field`.

  In the top-level query, you can reference Fields from the joined table or nested query by including `:source-field`
  in the `:field` options (known as implicit joins); for explicit joins, you *must* specify `:join-alias` yourself in
  the `:field` options, e.g.

    ;; for joins against other Tables/MBQL source queries
    [:field 1 {:join-alias \"my_join_alias\"}]

    ;; for joins against native queries
    [:field \"my_field\" {:base-type :type/Integer, :join-alias \"my_join_alias\"}]"
  [:and
   [:map
    [:source-table
     {:optional true
      :description "*What* to JOIN. Self-joins can be done by using the same `:source-table` as in the query where
  this is specified. YOU MUST SUPPLY EITHER `:source-table` OR `:source-query`, BUT NOT BOTH!"}
     SourceTable]

    [:source-query {:optional true} SourceQuery]

    [:condition
     {:description
      "The condition on which to JOIN. Can be anything that is a valid `:filter` clause. For automatically-generated
  JOINs this is usually something like

    [:= <source-table-fk-field> [:field <dest-table-pk-field> {:join-alias <join-table-alias>}]]"}
     Filter]

    [:strategy
     {:optional true
      :description "Defaults to `:left-join`; used for all automatically-generated JOINs

  Driver implementations: this is guaranteed to be present after pre-processing."}
     JoinStrategy]

    [:fields
     {:optional true
      :description
      "The Fields from this join to include in parent-level results. This can be either `:none`, `:all`, or a sequence
  of `:field` clauses.

  * `:none`: no Fields from the joined table or nested query are included (unless indirectly included by breakouts or
     other clauses). This is the default, and what is used for automatically-generated joins.

  * `:all`: will include all of the Fields from the joined table or query

  * a sequence of Field clauses: include only the Fields specified. Valid clauses are the same as the top-level
    `:fields` clause. This should be non-empty and all elements should be distinct. The normalizer will automatically
    remove duplicate fields for you, and replace empty clauses with `:none`.

  Driver implementations: you can ignore this clause. Relevant fields will be added to top-level `:fields` clause with
  appropriate aliases."}
     JoinFields]

    [:alias
     {:optional true
      :description
      "The name used to alias the joined table or query. This is usually generated automatically and generally looks
  like `table__via__field`. You can specify this yourself if you need to reference a joined field with a `:join-alias`
  in the options.

  Driver implementations: This is guaranteed to be present after pre-processing."}
     ::lib.schema.common/non-blank-string]

    [:ident
     {:optional true
      :description
      "An opaque string used as a unique identifier for this join clause, even if it evolves. This string is randomly
      generated when a join clause is created, so it can never be confused with another join of the same table."}
     ::Ident]

    [:fk-field-id
     {:optional true
      :description "Mostly used only internally. When a join is implicitly generated via a `:field` clause with
  `:source-field`, the ID of the foreign key field in the source Table will be recorded here. This information is used
  to add `fk_field_id` information to the `:cols` in the query results, and also for drill-thru. When generating
  explicit joins by hand you can usually omit this information, although it doesn't hurt to include it if you know it.

  Don't set this information yourself. It will have no effect."}
     [:maybe ::lib.schema.id/field]]

    [:source-metadata
     {:optional true
      :description "Metadata about the source query being used, if pulled in from a Card via the
  `:source-table \"card__id\"` syntax. Added automatically by the `resolve-card-id-source-tables` middleware."}
     [:maybe [:sequential SourceQueryMetadata]]]]
   ;; additional constraints
   [:fn
    {:error/message "Joins must have either a `source-table` or `source-query`, but not both."}
    (every-pred
     (some-fn :source-table :source-query)
     (complement (every-pred :source-table :source-query)))]])
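
The "either, but not both" constraint at the end of the Join schema can be tried out on its own. The sketch below copies the predicate into a hypothetical var named `join-has-exactly-one-source?`; note that it relies on truthiness, so a key mapped to nil counts as absent.

```clojure
;; Same predicate shape as the :fn constraint in the Join schema.
(def join-has-exactly-one-source?
  (every-pred
   ;; at least one of the two keys has a truthy value
   (some-fn :source-table :source-query)
   ;; but not both at once
   (complement (every-pred :source-table :source-query))))

(join-has-exactly-one-source? {:source-table 1})
;; => true
```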

Alias for ::Join. Prefer that going forward.

(def Join
  [:ref ::Join])
(mr/def ::Joins
  "Schema for a valid sequence of `Join`s. Must be a non-empty sequence, and `:alias`, if specified, must be unique."
  [:and
   (helpers/non-empty [:sequential Join])
   [:fn
    {:error/message "All join aliases must be unique."}
    #(helpers/empty-or-distinct? (filter some? (map :alias %)))]])
(mr/def ::Fields
  [:schema
   {:error/message "Distinct, non-empty sequence of Field clauses"}
   (helpers/distinct [:sequential {:min 1} Field])])
(mr/def ::Page
  "`page` = page num, starting with 1. `items` = number of items per page.
  e.g.

    {:page 1, :items 10} = items 1-10
    {:page 2, :items 10} = items 11-20"
  [:map
   [:page  PositiveInt]
   [:items PositiveInt]])
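
The page arithmetic described in the docstring can be sketched as follows. `page->offset-limit` is a hypothetical helper, and the `:offset`/`:limit` keys are illustrative names rather than part of MBQL.

```clojure
;; Hypothetical helper: converts a 1-based :page map into a zero-based row
;; offset plus a row limit.
(defn page->offset-limit
  [{:keys [page items]}]
  {:offset (* items (dec page))
   :limit  items})

(page->offset-limit {:page 2, :items 10})
;; => {:offset 10, :limit 10}, i.e. items 11-20
```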
(mr/def ::Ident
  "Unique identifier string for new `:column` refs. The new refs aren't used in legacy MBQL (currently) but the
  idents for column-introducing new clauses (joins, aggregations, breakouts, expressions) are randomly generated when
  the clauses are created, so the idents must be preserved in legacy MBQL.

  These are opaque strings under the initial design; I've made them a separate schema for documentation and
  future-proofing."
  [:or ::lib.schema.common/non-blank-string :keyword])
(mr/def ::IndexedIdents
  "Aggregations and breakouts get their `:ident` in legacy MBQL from a separate map, which maps the index of the
  aggregation or breakout to its ident.

  (That's super unstable, but legacy MBQL is never manipulated anymore. We just need a clean round trip through
  legacy, so indexes work fine. Idents are stored directly on the clauses in pMBQL.)"
  ;; TODO: Make the ::Ident values strict once idents are always-populated? That only works for post-normalization
  ;; queries, but I think we don't apply this schema until normalization.
  [:map-of ::lib.schema.common/int-greater-than-or-equal-to-zero [:maybe ::Ident]])
(mr/def ::ExpressionIdents
  "Expressions get their `:ident` in legacy MBQL from a separate map, which maps expression names to idents."
  ;; TODO: Make the ::Ident values strict once idents are always-populated? That only works for post-normalization
  ;; queries, but I think we don't apply this schema until normalization.
  [:map-of ::lib.schema.common/non-blank-string [:maybe ::Ident]])
(mr/def ::MBQLQuery
  [:and
   [:map
    [:source-query       {:optional true} SourceQuery]
    [:source-table       {:optional true} SourceTable]
    [:aggregation        {:optional true} [:sequential {:min 1} Aggregation]]
    [:aggregation-idents {:optional true} [:ref ::IndexedIdents]]
    [:breakout           {:optional true} [:sequential {:min 1} Field]]
    [:breakout-idents    {:optional true} [:ref ::IndexedIdents]]
    [:expressions        {:optional true} [:map-of ::lib.schema.common/non-blank-string [:ref ::FieldOrExpressionDef]]]
    [:expression-idents  {:optional true} [:ref ::ExpressionIdents]]
    [:fields             {:optional true} Fields]
    [:filter             {:optional true} Filter]
    [:limit              {:optional true} ::lib.schema.common/int-greater-than-or-equal-to-zero]
    [:order-by           {:optional true} (helpers/distinct [:sequential {:min 1} [:ref ::OrderBy]])]
    [:page               {:optional true} [:ref ::Page]]
    [:joins              {:optional true} [:ref ::Joins]]

    [:source-metadata
     {:optional true
      :description "Info about the columns of the source query. Added in automatically by middleware. This metadata is
  primarily used to power things like binning when used with Field Literals instead of normal Fields."}
     [:maybe [:sequential SourceQueryMetadata]]]]
   ;;
   ;; CONSTRAINTS
   ;;
   [:fn
    {:error/message "Query must specify either `:source-table` or `:source-query`, but not both."}
    (fn [query]
      (core/= 1 (core/count (select-keys query [:source-query :source-table]))))]
   [:fn
    {:error/message "Fields specified in `:breakout` should not be specified in `:fields`; this is implied."}
    (fn [{:keys [breakout fields]}]
      (empty? (set/intersection (set breakout) (set fields))))]
   ;; TODO: Re-enable this - it's a useful check but it currently breaks a pile of too-literal legacy tests.
   #_[:fn
      {:error/message ":expressions must have the same keys as :expression-idents"}
      (fn [{:keys [expressions expression-idents]}]
        (core/or (core/= nil expressions expression-idents)
                 (core/and (map? expressions)
                           (map? expression-idents)
                           (core/= (set (keys expressions))
                                   (set (keys expression-idents))))))]])
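
The two active constraints on an MBQL inner query can be exercised in isolation with plain functions. This is a standalone sketch for a scratch namespace; `exactly-one-source?` and `breakout-and-fields-disjoint?` are hypothetical names for the two anonymous `:fn` checks.

```clojure
(require '[clojure.set :as set])

;; A query must contain exactly one of :source-query or :source-table.
(defn exactly-one-source?
  [query]
  (= 1 (count (select-keys query [:source-query :source-table]))))

;; Fields listed in :breakout must not be repeated in :fields.
(defn breakout-and-fields-disjoint?
  [{:keys [breakout fields]}]
  (empty? (set/intersection (set breakout) (set fields))))

(breakout-and-fields-disjoint?
 {:source-table 1
  :breakout     [[:field 1 nil]]
  :fields       [[:field 1 nil] [:field 2 nil]]})
;; => false
```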

----------------------------------------------------- Params -----------------------------------------------------

(mr/def ::WidgetType
  "Schema for valid values of `:widget-type` for a [[::TemplateTag:FieldFilter]]."
  [:ref :metabase.lib.schema.parameter/widget-type])

This is the reference like [:template-tag <tag-name>], not the [[TemplateTag]] schema for when it's declared in :template-tags

(defclause template-tag
  tag-name [:or
            ::lib.schema.common/non-blank-string
            [:map
             [:id ::lib.schema.common/non-blank-string]]])
(mr/def ::dimension
  [:and
   {:doc/title [:span [:code ":dimension"] " clause"]}
   [:fn {:error/message "must be a `:dimension` clause"} (partial helpers/is-clause? :dimension)]
   [:catn
    [:tag [:= :dimension]]
    [:target [:schema [:or [:ref ::Field] [:ref ::template-tag]]]]
    [:options [:? [:maybe [:map {:error/message "dimension options"} [:stage-number {:optional true} :int]]]]]]])

Schema for a valid dimension clause.

(def ^{:clause-name :dimension} dimension
  [:ref ::dimension])
(defclause variable
  target template-tag)

Schema for the value of :target in a [[Parameter]].

(def ^:private ParameterTarget
  ;; not 100% sure about this but `field` on its own comes from a Dashboard parameter and when it's wrapped in
  ;; `dimension` it comes from a Field filter template tag parameter (don't quote me on this -- working theory)
  [:or
   Field
   (one-of dimension variable)])
(mr/def ::Parameter
  "Schema for the *value* of a parameter (e.g. a Dashboard parameter or a native query template tag) as passed in as
  part of the `:parameters` list in a query."
  [:merge
   [:ref :metabase.lib.schema.parameter/parameter]
   [:map
    [:target {:optional true} ParameterTarget]]])

Alias for ::Parameter. Prefer using that directly going forward.

(def Parameter
  [:ref ::Parameter])
(mr/def ::ParameterList
  [:maybe [:sequential Parameter]])

Schema for a list of :parameters as passed in to a query.

(def ParameterList
  [:ref ::ParameterList])

---------------------------------------------------- Options -----------------------------------------------------

(mr/def ::Settings
  "Options that tweak the behavior of the query processor."
  [:map
   [:report-timezone
    {:optional    true
     :description "The timezone the query should be run in, overriding the default report timezone for the instance."}
    TimezoneId]])
(mr/def ::Constraints
  "Additional constraints added to a query limiting the maximum number of rows that can be returned. Mostly useful
  because native queries don't support the MBQL `:limit` clause. For MBQL queries, if `:limit` is set, it will
  override these values."
  [:and
   [:map
    [:max-results
     {:optional true
      :description
      "Maximum number of results to allow for a query with aggregations. If `max-results-bare-rows` is unset, this
  applies to all queries"}
     ::lib.schema.common/int-greater-than-or-equal-to-zero]

    [:max-results-bare-rows
     {:optional true
      :description
      "Maximum number of results to allow for a query with no aggregations. If set, this should be less than or equal
  to `:max-results`."}
     ::lib.schema.common/int-greater-than-or-equal-to-zero]]

   [:fn
    {:error/message "max-results-bare-rows must be less than or equal to max-results"}
    (fn [{:keys [max-results max-results-bare-rows]}]
      (if-not (core/and max-results max-results-bare-rows)
        true
        (core/>= max-results max-results-bare-rows)))]])
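
The :fn check in ::Constraints only compares the two limits when both are present. As a minimal sketch, the same logic can be tried out with plain clojure.core in a scratch REPL namespace (not in this namespace, which excludes several clojure.core names such as and and >=). The name constraints-consistent? is hypothetical and exists only for illustration.

```clojure
;; Standalone sketch mirroring the `:fn` predicate in `::Constraints`.
;; `constraints-consistent?` is a hypothetical name, not part of this namespace.
(defn constraints-consistent?
  [{:keys [max-results max-results-bare-rows]}]
  (if-not (and max-results max-results-bare-rows)
    true ; nothing to compare unless both keys are present
    (>= max-results max-results-bare-rows)))

;; Passes: the bare-rows limit is below max-results.
(constraints-consistent? {:max-results 10000, :max-results-bare-rows 2000})

;; Passes: equal values are allowed.
(constraints-consistent? {:max-results 2000, :max-results-bare-rows 2000})

;; Passes: only one of the two keys is set.
(constraints-consistent? {:max-results-bare-rows 2000})

;; Fails: the bare-rows limit exceeds max-results.
(constraints-consistent? {:max-results 100, :max-results-bare-rows 2000})
```
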
(mr/def ::MiddlewareOptions
  "Additional options that can be used to toggle middleware on or off."
  [:map
   [:skip-results-metadata?
    {:optional true
     :description
     "Should we skip adding `results_metadata` to query results after running the query? Used by
     `metabase.query-processor.middleware.results-metadata`; default `false`. (Note: we may change the name of this
     column in the near future, to `result_metadata`, to fix inconsistencies in how we name things.)"}
    :boolean]

   [:format-rows?
    {:optional true
     :description
     "Should we skip converting datetime types to ISO-8601 strings with appropriate timezone when post-processing
     results? Used by `metabase.query-processor.middleware.format-rows`; default `false`."}
    :boolean]

   [:disable-mbql->native?
    {:optional true
     :description
     "Disable the MBQL->native middleware. If you do this, the query will not work at all, so there are no cases where
  you should set this yourself. This is only used by the `metabase.query-processor.preprocess/preprocess` function to
  get the fully pre-processed query without attempting to convert it to native."}
    :boolean]

   [:disable-max-results?
    {:optional true
     :description
     "Disable applying a default limit on the query results. Handled in the `add-default-limit` middleware. If true,
  this will override the `:max-results` and `:max-results-bare-rows` values in `Constraints`."}
    :boolean]

   [:userland-query?
    {:optional true
     :description
     "Userland queries are ones run as a result of an API call, Pulse, or the like. Special handling is done in
  certain userland-only middleware for such queries -- results are returned in a slightly different format, and
  QueryExecution entries are normally saved, unless you pass `:no-save` as the option."}
    [:maybe :boolean]]

   [:add-default-userland-constraints?
    {:optional true
     :description
     "Whether to add some default `max-results` and `max-results-bare-rows` constraints. By default, none are added,
  although the functions that ultimately power most API endpoints tend to set this to `true`. See
  `add-constraints` middleware for more details."}
    [:maybe :boolean]]

   [:process-viz-settings?
    {:optional true
     :description
     "Whether to process a question's visualization settings and include them in the result metadata so that they can
  be incorporated into an export. Used by `metabase.query-processor.middleware.visualization-settings`; default
  `false`."}
    [:maybe :boolean]]])

--------------------------------------------- Metabase [Outer] Query ---------------------------------------------

To the reader: yes, this seems sort of hacky, but one of the goals of the Nested Query Initiative™ was to minimize if not completely eliminate any changes to the frontend. After experimenting with several possible ways to do this, this implementation seemed simplest and best met the goal. Luckily this is the only place this "magic number" is defined and the entire frontend can remain blissfully unaware of its value.

(mr/def ::DatabaseID
  "Schema for a valid `:database` ID, in the top-level 'outer' query. Either a positive integer (referring to an
  actual Database), or the saved questions virtual ID, which is a placeholder used for queries using the
  `:source-table \"card__id\"` shorthand for a source query resolved by middleware (since clients might not know the
  actual DB for that source query.)"
  [:or
   {:error/message "valid Database ID"}
   [:ref ::lib.schema.id/saved-questions-virtual-database]
   [:ref ::lib.schema.id/database]])

Make sure we have a valid combo of query :type and :native/:query.

(mr/def ::check-keys-for-query-type
  [:and
   [:fn
    {:error/message "Query must specify at most one of `:native` or `:query`, but not both."}
    (complement (every-pred :native :query))]
   [:fn
    {:error/message "Native queries must not specify `:query`; MBQL queries must not specify `:native`."}
    (fn [{native :native, mbql :query, query-type :type}]
      (core/case query-type
        :native (core/not mbql)
        :query  (core/not native)
        false))]])
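
A rough sketch of how the two predicates in ::check-keys-for-query-type behave, written with plain clojure.core for a scratch REPL namespace (this namespace excludes and, not, and case from clojure.core). The function names below are hypothetical, and the inner maps are placeholder data.

```clojure
;; Standalone sketch mirroring the two `:fn` predicates in `::check-keys-for-query-type`.
;; All names here are hypothetical and exist only for illustration.
(def at-most-one-of-native-or-query?
  (complement (every-pred :native :query)))

(defn keys-match-type?
  [{native :native, mbql :query, query-type :type}]
  (case query-type
    :native (not mbql)
    :query  (not native)
    false)) ; any other :type fails

(defn keys-ok?
  "True when both predicates accept the query map."
  [query]
  (and (at-most-one-of-native-or-query? query)
       (keys-match-type? query)))

;; Accepted: :type agrees with the key that is present.
(keys-ok? {:type :query, :query {:source-table 1}})
(keys-ok? {:type :native, :native {:query "SELECT 1"}})

;; Rejected: :type is :native but an MBQL :query is present.
(keys-ok? {:type :native, :query {:source-table 1}})

;; Rejected: both :native and :query are present.
(keys-ok? {:type :query, :query {:source-table 1}, :native {:query "SELECT 1"}})
```

Note that these predicates only rule out the wrong key; on their own they accept a map such as {:type :query} that has neither :native nor :query.
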
(mr/def ::check-query-does-not-have-source-metadata
  "`:source-metadata` is added to queries when `card__id` source queries are resolved. It contains info about the
  columns in the source query.

  Where this is added was changed in Metabase 0.33.0 -- previously, when `card__id` source queries were resolved, the
  middleware would add `:source-metadata` to the top-level; to support joins against source queries, this has been
  changed so it is always added at the same level the resolved `:source-query` is added.

  This should automatically be fixed by `normalize`; if we encounter it, it means some middleware is not functioning
  properly."
  [:fn
   {:error/message "`:source-metadata` should be added in the same level as `:source-query` (i.e., the 'inner' MBQL query.)"}
   (complement :source-metadata)])

Schema for an [outer] query, e.g. the sort of thing you'd pass to the query processor or save in Card.dataset_query.

(def Query
  [:ref ::Query])
(mr/def ::Query
  [:and
   [:map
    [:database   {:optional true} ::DatabaseID]

    [:type
     [:enum
      {:description "Type of query. `:query` = MBQL; `:native` = native."}
      :query :native]]

    [:native     {:optional true} NativeQuery]
    [:query      {:optional true} MBQLQuery]
    [:parameters {:optional true} ParameterList]
    ;;
    ;; OPTIONS
    ;;
    ;; These keys are used to tweak behavior of the Query Processor.
    ;;
    [:settings    {:optional true} [:maybe [:ref ::Settings]]]
    [:constraints {:optional true} [:maybe [:ref ::Constraints]]]
    [:middleware  {:optional true} [:maybe [:ref ::MiddlewareOptions]]]
    ;;
    ;; INFO
    ;;
    [:info
     {:optional true
      :description "Used when recording info about this run in the QueryExecution log; things like the context the
  query was run in and the User who ran it."}
     [:maybe [:ref ::lib.schema.info/info]]]
    ;;
    ;; ACTIONS
    ;;
    ;; This stuff is only used for Actions.
    [:create-row {:optional true} [:maybe [:ref ::lib.schema.actions/row]]]
    [:update-row {:optional true} [:maybe [:ref ::lib.schema.actions/row]]]]
   ;;
   ;; CONSTRAINTS
   [:ref ::check-keys-for-query-type]
   [:ref ::check-query-does-not-have-source-metadata]])
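
For orientation, here are a few outer query maps of the shape ::Query describes. This is illustrative data only: the IDs are made up, and the inner maps under :query and :native are assumptions about the MBQLQuery and NativeQuery schemas, which are defined elsewhere in this file.

```clojure
;; Illustrative data only; IDs and inner query contents are assumed for the example.
(def example-mbql-query
  {:database    1
   :type        :query
   :query       {:source-table 2}
   :constraints {:max-results 10000, :max-results-bare-rows 2000}
   :middleware  {:userland-query? true}})

(def example-native-query
  {:database 1
   :type     :native
   :native   {:query "SELECT 1"}
   :settings {:report-timezone "America/Los_Angeles"}})

;; Expected to be rejected by ::check-keys-for-query-type:
;; :type is :native but an MBQL :query is present.
(def example-mismatched-query
  {:database 1
   :type     :native
   :query    {:source-table 2}})
```
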

Is this a valid outer query? (Pre-compiling a validator is more efficient.)

(def ^{:arglists '([query])} valid-query?
  (mr/validator Query))

Validator for an outer query; throws an Exception explaining why the query is invalid. Returns query if valid.

(defn validate-query
  [query]
  (if (valid-query? query)
    query
    (let [error     (mr/explain Query query)
          humanized (me/humanize error)]
      (throw (ex-info (i18n/tru "Invalid query: {0}" (pr-str humanized))
                      {:error    humanized
                       :original error})))))
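
Callers that prefer a value over an exception can catch the ExceptionInfo and read its ex-data. The sketch below shows that pattern with a hypothetical stand-in for validate-query, so that it runs without Metabase or Malli on the classpath. The toy validator and explainer only look at :type and are not the real schema.

```clojure
;; Hypothetical stand-in for `validate-query`: same control flow, but the validator and
;; explainer are passed in as arguments so the sketch is self-contained.
(defn validate-or-throw
  [valid? explain x]
  (if (valid? x)
    x
    (let [error (explain x)]
      (throw (ex-info (str "Invalid query: " (pr-str error))
                      {:error error})))))

;; Toy validator and explainer that only check the :type key (assumed for illustration).
(defn toy-valid? [query]
  (contains? #{:query :native} (:type query)))

(defn toy-explain [query]
  {:type [(str "should be :query or :native, got " (pr-str (:type query)))]})

(defn validation-result
  "Return [:ok query] or [:invalid ex-data] instead of letting the exception propagate."
  [query]
  (try
    [:ok (validate-or-throw toy-valid? toy-explain query)]
    (catch clojure.lang.ExceptionInfo e
      [:invalid (ex-data e)])))

(validation-result {:type :native}) ; tagged :ok, carrying the original map
(validation-result {:type :sql})    ; tagged :invalid, carrying the ex-data map
```
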