I’d recommend the event producer send a UUID that then serves as the primary key on the events table. The producer should also send the timestamp at which the event occurred.

I could be missing something, but that seems to solve both the duplicate event firing (an upsert command based on the UUID makes duplicate event writing a non-issue) and the timing issues.
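To make the upsert idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the names `make_store` and `record_event` are made up for illustration, and the `ON CONFLICT ... DO NOTHING` clause assumes SQLite 3.24 or newer; other databases have their own equivalent syntax.

```python
import sqlite3
import uuid


def make_store():
    """Create an in-memory events table keyed on the producer-supplied UUID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " event_id TEXT PRIMARY KEY,"   # UUID generated by the producer
        " occurred_at TEXT NOT NULL,"   # timestamp set by the producer
        " payload TEXT NOT NULL)"
    )
    return conn


def record_event(conn, event_id, occurred_at, payload):
    """Insert the event; a repeated event_id is silently skipped.

    Returns True if a new row was written, False if it was a duplicate.
    """
    cur = conn.execute(
        "INSERT INTO events (event_id, occurred_at, payload) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(event_id) DO NOTHING",
        (event_id, occurred_at, payload),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = make_store()
    event_id = str(uuid.uuid4())
    # The same event delivered twice only lands once.
    record_event(conn, event_id, "2017-01-01T00:00:00Z", "page_view")
    record_event(conn, event_id, "2017-01-01T00:00:00Z", "page_view")
    print(conn.execute("SELECT COUNT(*) FROM events").fetchone()[0])
```

Because the write is idempotent, the producer can retry as often as it likes without inflating counts, and ordering can be reconstructed from `occurred_at` rather than from arrival time.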

Though I’m still incredibly skeptical of “real-time analytics.” The number of business cases that require actual real-time analysis is pretty limited. High frequency trading and...?

The event producer in this case is one of our Snowplow trackers (https://github.com/snowplow/snowplow-javascript-tracker) - and these tracking SDKs do indeed attach a UUID as a unique event ID at event creation time.

However, this event ID is not enough to identify and then dedupe all types of duplicate events. This blog post provides more information:

https://snowplowanalytics.com/blog/2015/08/19/dealing-with-d...
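As a rough illustration of the problem (a sketch only, not Snowplow's actual implementation): suppose two genuinely different events end up sharing an event ID. Keying on the ID alone would drop one of them. One possible approach is to compare a hash of the event's content as well. The `fingerprint` and `dedupe` helpers and the field names below are hypothetical.

```python
import hashlib
import json


def fingerprint(event, ignore=("event_id",)):
    """Hash the event's content, skipping fields listed in `ignore`.

    Hypothetical helper: which fields to skip is an assumption here.
    """
    content = {k: v for k, v in event.items() if k not in ignore}
    encoded = json.dumps(content, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def dedupe(events):
    """Keep the first event seen for each (event_id, fingerprint) pair."""
    seen = set()
    unique = []
    for event in events:
        key = (event["event_id"], fingerprint(event))
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


if __name__ == "__main__":
    first = {"event_id": "abc", "page": "/home"}
    resent = dict(first)                             # true duplicate: same ID, same content
    collision = {"event_id": "abc", "page": "/buy"}  # same ID, different content
    print(len(dedupe([first, resent, collision])))
```

Here the resent copy is dropped, while the event that merely shares the ID survives, which a primary key on the ID alone would not allow.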

Big thanks to pragmacoders for putting this tutorial together! It's awesome seeing what you are doing with the Snowplow platform :-)