Productize Gazelle company dataset in Snowflake
Created Date | Jun 2, 2023 |
---|---|
Target PI | PI4, PI5, PI6 |
Target Release | October |
Jira Epic | |
Document Status | Draft |
Epic Owner | Viktor Kotusenko |
Stakeholder | Simon Leroux, Oree Wyatt, Nick Studt |
Engineering Team(s) Involved | Micro, Gazelle-ETL, Gazelle-Micro/Devops |
PART 1
Customer/User Job-to-be-Done or Problem
When consuming the Gazelle company dataset, I need access to it in one location where I can run specialized, performant queries, so that I can build large-scale analyses of company insights for multiple use cases.
Value to Customers & Users
Regular data updates provide the freshest company information at higher quality (deduplication, removal of invalid information, better company-tree matching, etc.) as a consistent output of the NeoETL data processing pipelines. Delivered to Snowflake, these updates let customers work with a large company intelligence database in a flexible environment.
Value to Lightcast
Opens up early feedback loops for customer input on our product roadmap and makes the data available internally quickly.
Target User Role/Client/Client Category
Gazelle clients that value access to the complete large company dataset
Delivery Mechanism
Snowflake
Success Criteria & Metrics
Gazelle Beta Golden Records for companies are accessible to clients through Snowflake.
The Data Dictionary available to clients fully describes the available data.
Subsequent updates to Gazelle Beta Golden Records for companies are automatically made available in Snowflake through a defined release process.
Aspects that are out of scope (of this phase)
Contact information is excluded for now to avoid exposing PII.
PART 2
Solution Description
Early UX (wireframes or mockups)
<FigmaLink>
Non-Functional Attributes & Usage Projections
Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements
Dependencies
Is there any work that must precede this? Feature work? Ops work?
Legal and Ethical Considerations
Just answer yes or no.
High-Level Rollout Strategies
Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]
If specific beta customers, will it be for a specific survey launch date or report availability date?
How will this guide the rollout of individual stories in the epic?
The rollout strategy should be discussed with CS, Marketing, and Sales.
How long would we tolerate a “partial rollout” (rolled out to some customers but not all)?
Risks
Focus on risks unique to this feature, not overall delivery/execution risks.
Open Questions
A grooming session is needed to understand the steps
Clear definition of MVP
Identify the data fields we can’t surface
Finish Golden Records
Complete with Engineering Teams
Effort Size Estimate | 3 |
---|---|
Estimated Costs
Direct Financial Costs
Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?
Team Effort
Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.
Team | Effort Estimate (T-shirt sizes) | Jira Link |
---|---|---|
Gazelle-ETL | S | |
Gazelle-Micro/Devops | S | |
Micro | S | |