Functions for sending Snowplow analytics events | (ns metabase.analytics.snowplow
(:require
[clojure.string :as str]
[java-time.api :as t]
[medley.core :as m]
[metabase.analytics.settings :as analytics.settings]
[metabase.api.common :as api]
[metabase.models.setting :as setting :refer [defsetting]]
[metabase.public-settings :as public-settings]
[metabase.util.date-2 :as u.date]
[metabase.util.i18n :refer [deferred-tru]]
[metabase.util.log :as log]
[metabase.util.malli :as mu]
[toucan2.core :as t2])
(:import
(com.snowplowanalytics.snowplow.tracker Snowplow Subject Tracker)
(com.snowplowanalytics.snowplow.tracker.configuration
EmitterConfiguration
NetworkConfiguration
SubjectConfiguration
TrackerConfiguration)
(com.snowplowanalytics.snowplow.tracker.events SelfDescribing SelfDescribing$Builder2)
(com.snowplowanalytics.snowplow.tracker.http ApacheHttpClientAdapter)
(com.snowplowanalytics.snowplow.tracker.payload SelfDescribingJson)
(org.apache.http.client.config CookieSpecs RequestConfig)
(org.apache.http.impl.client HttpClients)
(org.apache.http.impl.conn PoolingHttpClientConnectionManager))) |
(set! *warn-on-reflection* true) | |
Adding or updating a Snowplow schema? Here are some things to keep in mind:
- Snowplow schemata are versioned and immutable, so if you need to make changes to a schema, you should create a new
version of it. The version number should be updated in the | |
The most recent version for each event schema. This should be updated whenever a new version of a schema is added to SnowcatCloud, at the same time that the data sent to the collector is updated. | (def ^:private schema->version
{:snowplow/account "1-0-1"
:snowplow/browse_data "1-0-0"
:snowplow/invite "1-0-1"
:snowplow/instance_stats "2-0-0"
:snowplow/csvupload "1-0-3"
:snowplow/dashboard "1-1-4"
:snowplow/database "1-0-1"
:snowplow/instance "1-1-2"
:snowplow/metabot "1-0-1"
:snowplow/search "1-0-1"
:snowplow/model "1-0-0"
:snowplow/timeline "1-0-0"
:snowplow/task "1-0-0"
:snowplow/upsell "1-0-0"
:snowplow/action "1-0-0"
:snowplow/embed_share "1-0-0"
:snowplow/llm_usage "1-0-0"
:snowplow/serialization "1-0-1"
:snowplow/cleanup "1-0-0"}) |
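The version strings in the map above follow Snowplow's SchemaVer convention, MODEL-REVISION-ADDITION. As a minimal sketch, such a string could be split into its parts and bumped as shown here. `parse-schema-ver` and `bump-addition` are hypothetical helpers for illustration only and are not part of this namespace.

```clojure
(require '[clojure.string :as str])

(defn parse-schema-ver
  "Hypothetical helper: split a SchemaVer string such as \"1-1-4\" into numbers."
  [version]
  (let [[model revision addition] (map #(Long/parseLong %) (str/split version #"-"))]
    {:model model, :revision revision, :addition addition}))

(defn bump-addition
  "Hypothetical helper: return the version string with ADDITION incremented."
  [version]
  (let [{:keys [model revision addition]} (parse-schema-ver version)]
    (format "%d-%d-%d" model revision (inc addition))))

(parse-schema-ver "1-1-4") ;; => {:model 1, :revision 1, :addition 4}
(bump-addition "1-0-3")    ;; => "1-0-4"
```

Because published schemas are immutable, a bumped string like this would name a new schema version rather than modify the old one.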
Malli enum for valid Snowplow schemas | (def ^:private SnowplowSchema (into [:enum] (keys schema->version))) |
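Deriving the enum from the keys of the version map means a schema only needs to be registered in one place. A small sketch of the idea, using plain Clojure data and no Malli dependency; `example-schema->version`, `ExampleSchema`, and `known-schema?` are illustrative names, not part of this namespace.

```clojure
;; Illustrative subset of a version map (names are assumptions for this sketch).
(def example-schema->version
  {:snowplow/account "1-0-1"
   :snowplow/invite  "1-0-1"})

;; Same construction as above: a Malli-style [:enum ...] vector built from the keys.
(def ExampleSchema (into [:enum] (keys example-schema->version)))

(defn known-schema?
  "True if `schema` is one of the enum members."
  [schema]
  (contains? (set (rest ExampleSchema)) schema))

(known-schema? :snowplow/invite)  ;; => true
(known-schema? :snowplow/unknown) ;; => false
```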
We need to declare | (declare track-event!) |
Returns the earliest user creation timestamp in the database | (defn- first-user-creation [] (:min (t2/select-one [:model/User [:%min.date_joined :min]]))) |
[[instance-creation]] should live in analytics.settings, but it would cause a circular dep with [[track-event!]] | (defsetting instance-creation
(deferred-tru "The approximate timestamp at which this instance of Metabase was created, for inclusion in analytics.")
:visibility :public
:setter :none
:getter (fn []
(when-not (t2/exists? :model/Setting :key "instance-creation")
;; For instances that were started before this setting was added (in 0.41.3), use the creation
;; timestamp of the first user. For all new instances, use the timestamp at which this setting
;; is first read.
(let [value (or (first-user-creation) (t/offset-date-time))]
(setting/set-value-of-type! :timestamp :instance-creation value)
(track-event! :snowplow/account {:event :new_instance_created} nil)))
(u.date/format-rfc3339 (setting/get-value-of-type :timestamp :instance-creation)))
:doc false) |
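The getter above follows an "initialize on first read, then always read back the stored value" pattern. A minimal sketch of that pattern, assuming an atom as a stand-in for the settings table; `read-instance-creation!` and its arguments are hypothetical and only illustrate the control flow.

```clojure
(defn read-instance-creation!
  "Hypothetical sketch: `store` is an atom standing in for persisted settings,
  `first-user-ts` is the earliest user timestamp (or nil), and `now-fn` returns
  the current time. The value is written only on the first read."
  [store first-user-ts now-fn]
  (when-not (contains? @store :instance-creation)
    ;; Prefer the first user's timestamp; fall back to "now" for a fresh instance.
    (swap! store assoc :instance-creation (or first-user-ts (now-fn))))
  (:instance-creation @store))

(let [store (atom {})]
  [(read-instance-creation! store nil (constantly "t1"))
   (read-instance-creation! store "t0" (constantly "t2"))])
;; => ["t1" "t1"]
```

The second call returns the value stored by the first one, which is why later reads are stable even if the inputs change. The check-then-write here is not atomic; this sketch ignores concurrent first reads.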
(defn- tracker-config [] (TrackerConfiguration. "sp" "metabase")) | |
(defn- network-config
[]
(let [request-config (-> (RequestConfig/custom)
;; Set cookie spec to `STANDARD` to avoid warnings about an invalid cookie
;; header in the response (PR #24579)
(.setCookieSpec CookieSpecs/STANDARD)
(.build))
client (-> (HttpClients/custom)
(.setConnectionManager (PoolingHttpClientConnectionManager.))
(.setDefaultRequestConfig request-config)
(.build))
http-client-adapter (ApacheHttpClientAdapter. (analytics.settings/snowplow-url) client)]
(NetworkConfiguration. http-client-adapter))) | |
(defn- emitter-config
[]
(-> (EmitterConfiguration.)
(.batchSize 1))) | |
(defonce ^:private tracker (Snowplow/createTracker ^TrackerConfiguration (tracker-config) ^NetworkConfiguration (network-config) ^EmitterConfiguration (emitter-config))) | |
Create a Subject object for a given user ID, to be included in analytics events | (defn- subject
[user-id]
(Subject.
(-> (SubjectConfiguration.)
(.userId (str user-id))
;; Override with localhost IP to avoid logging actual user IP addresses
(.ipAddress "127.0.0.1")))) |
Returns the type of the Metabase application database as a string (e.g. PostgreSQL, MySQL) | (defn- app-db-type
[]
(t2/with-connection [^java.sql.Connection conn]
(.. conn getMetaData getDatabaseProductName))) |
Returns the version of the Metabase application database as a string | (defn- app-db-version
[]
(t2/with-connection [^java.sql.Connection conn]
(let [metadata (.getMetaData conn)]
(format "%d.%d" (.getDatabaseMajorVersion metadata) (.getDatabaseMinorVersion metadata))))) |
Common context included in every analytics event | (defn- context
[]
(SelfDescribingJson.
(str "iglu:com.metabase/instance/jsonschema/" (schema->version :snowplow/instance))
{"id" (analytics.settings/analytics-uuid)
"version" {"tag" (:tag (public-settings/version))}
"token_features" (m/map-keys name (public-settings/token-features))
"created_at" (instance-creation)
"application_database" (app-db-type)
"application_database_version" (app-db-version)})) |
(defn- normalize-kw [kw] (-> kw name (str/replace #"-" "_"))) | |
A SelfDescribingJson object containing the provided event data, which can be included as the payload for an analytics event | (defn- payload
[schema version data]
(SelfDescribingJson.
(format "iglu:com.metabase/%s/jsonschema/%s" (normalize-kw schema) version)
;; Make sure keywords in payload are converted to strings in snake-case
(m/map-kv
(fn [k v] [(normalize-kw k) (if (keyword? v) (normalize-kw v) v)])
data))) |
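To make the normalization concrete, here is a self-contained sketch of the keyword handling and the Iglu URI construction. It uses `into` with a transducer in place of `m/map-kv` so that it runs without Medley; `normalize-kw-example`, `normalize-data`, and `iglu-uri` are illustrative names, not functions from this namespace.

```clojure
(require '[clojure.string :as str])

(defn normalize-kw-example
  "Keyword -> snake_case string. `name` drops any namespace part."
  [kw]
  (-> kw name (str/replace #"-" "_")))

(defn normalize-data
  "Convert top-level keys, and keyword values, to snake_case strings."
  [data]
  (into {}
        (map (fn [[k v]]
               [(normalize-kw-example k)
                (if (keyword? v) (normalize-kw-example v) v)]))
        data))

(defn iglu-uri
  "Build the schema URI in the same shape as the `format` call above."
  [schema version]
  (format "iglu:com.metabase/%s/jsonschema/%s" (normalize-kw-example schema) version))

(normalize-data {:event :new-instance-created, :user-count 3})
;; => {"event" "new_instance_created", "user_count" 3}
(iglu-uri :snowplow/embed_share "1-0-0")
;; => "iglu:com.metabase/embed_share/jsonschema/1-0-0"
```

Only the top level of the map is touched, so keywords nested inside collections would pass through unchanged in this sketch.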
Wrapper function around the | (defn- track-event-impl! [tracker event] (.track ^Tracker tracker ^SelfDescribing event)) |
Send a single analytics event to the Snowplow collector, if tracking is enabled for this Metabase instance and a collector is available. | (mu/defn track-event!
([schema :- SnowplowSchema data]
(track-event! schema data api/*current-user-id*))
([schema :- SnowplowSchema data user-id]
(when (analytics.settings/snowplow-enabled)
(try
(let [^SelfDescribing$Builder2 builder (-> (SelfDescribing/builder)
(.eventData (payload schema (schema->version schema) data))
(.customContext [(context)])
(cond-> user-id (.subject (subject user-id))))
^SelfDescribing event (.build builder)]
(track-event-impl! tracker event))
(catch Throwable e
(log/errorf e "Error sending Snowplow analytics event for schema %s" schema)))))) |
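The function above guards on the enabled setting and wraps the send in `try`/`catch`, so a failing collector cannot break the calling code. A minimal sketch of that shape with the tracker stubbed out; `safe-track!`, `enabled?`, and `send-fn` are hypothetical names, and printing to `*err*` stands in for the real logging call.

```clojure
(defn safe-track!
  "Hypothetical sketch: call `send-fn` on `event` only when `enabled?` is truthy,
  and swallow any error so that analytics never propagate failures to the caller."
  [enabled? send-fn event]
  (when enabled?
    (try
      (send-fn event)
      (catch Throwable e
        ;; Stand-in for log/errorf in the real namespace.
        (binding [*out* *err*]
          (println "Error sending analytics event:" (.getMessage e)))
        nil))))

(safe-track! false (fn [_] :sent) {:event "x"})                      ;; => nil, nothing sent
(safe-track! true  (fn [_] :sent) {:event "x"})                      ;; => :sent
(safe-track! true  (fn [_] (throw (ex-info "boom" {}))) {:event "x"}) ;; => nil, error reported
```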