Created Date	Jun 20, 2023
Target PI	PI 6
Jira Epic
Document Status	Draft
Epic Owner	@Ben Bradley
Stakeholder	@Christian Asivido @Everett Bloch @Kaleb Trotter @Nick Studt (Deactivated) @Derek Nevins (Deactivated) @Sandra Poisson
Engineering Team(s) Involved	Micro Analyst Taxonomy LMI Data Delivery Career Coach
Initiative	Initiative 0: Keep data up to date and on the same taxonomies

PART 1

Customer/User Job-to-be-Done or Problem

External:

As a user of Lightcast data, I want to see the most up-to-date data as soon as possible. I want data I have previously worked with (i.e. saved reports) to continue to function even when taxonomies change, without having to put significant effort into understanding the new mappings.

Internal:

As an Analyst engineer, I want to focus my work on new features that bring new value to customers. I want standard, repeated datarun processes to be automated to the greatest extent possible so that I can focus my efforts on front-end features that excite our teams and our customers.

As a Lightcast Product Manager, I do not want to have to think about dataruns or about the delivery of new taxonomies to the product. As we have more taxonomies, more products, and support in more parts of the world, this needs to be outside of my scope so I can focus on value-add features rather than maintaining table stakes.

We are running the same processes 8-12 times per year simply to maintain our base value, not to add new features or bring additional value to customers. This is an inefficient use of time, resources, and brainspace. We need to automate as much of this as possible.

Value to Customers & Users

Consistency across our suite of products.
- Today, every product stores its own taxonomies and metadata, sometimes inconsistently. For customers with multiple products, this can lead to inconsistent results.
Faster adoption of new data
- Today, dataruns can take several weeks, largely because of the time spent getting the maps to align and rolling taxonomic elements forward.

Value to Lightcast

Decreased cost of getting new data into products. (For example, the Canada 2023.1 datarun cost >21 LOE)
Increased engineering time dedicated to new product features
Pipeline efficiency without backflow (current flow for dataruns is Micro -> Analyst -> Micro -> Analyst)

There are several Epics that would likely become much less expensive to integrate if Analyst were connected to the Classification API.

Target User Role/Client/Client Category

Who are we building this for?

Delivery Mechanism

How will users receive the value?

Success Criteria & Metrics

How will you know you’ve completed the epic? How will you know if you’ve successfully addressed this problem? What usage goals do you have for these new features? How will you measure them?

Aspects that are out of scope (of this phase)

What is explicitly not a part of this epic? List things that have been discussed but will not be included. Things you imagine in a phase 2, etc.

PART 2 Solution Description

Technical Steps

Stop GIS from being a bottleneck: GIS work is where the greatest inefficiencies lie. Enabling GIS work to be updated without Analyst first needing to run work would increase speed to market and decrease the complexity of the systems.
Connect to the Classification API: The Classification API can provide all taxonomies, labels, levels, and other taxonomic metadata to all products. This would include crosswalks between versions. Rather than individual products (especially Analyst) maintaining these, the Classification API would be the single source of truth and would enable Analyst to take this work out of its systems and its codebase.
1. LMI Taxonomies are priority 1, as this would bring the greatest bang for the buck (and is impacted by dataruns)
2. Additional taxonomies and associated metadata would come next.
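
To make the crosswalk idea concrete, here is a minimal sketch of how a saved report's codes might be rolled forward from one taxonomy version to the next. It is not the Classification API's actual interface: the `Crosswalk` shape, the `roll_forward` helper, and the codes in the usage note are all hypothetical and exist only for illustration. It assumes that an old code may map to one or several new codes, and that codes with no mapping should be surfaced for review rather than dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crosswalk:
    """Hypothetical crosswalk between two versions of one taxonomy."""
    source_version: str
    target_version: str
    # One old code may map to several new codes (for example, after a split).
    mapping: dict[str, list[str]]


def roll_forward(codes: list[str], crosswalk: Crosswalk) -> tuple[list[str], list[str]]:
    """Translate saved-report codes to the crosswalk's target version.

    Returns (translated, unmapped). Unmapped codes are returned separately
    so they can be flagged instead of silently disappearing from a report.
    """
    translated: list[str] = []
    unmapped: list[str] = []
    for code in codes:
        targets = crosswalk.mapping.get(code)
        if targets is None:
            unmapped.append(code)
            continue
        for target in targets:
            # Keep first-seen order and avoid duplicates when codes merge.
            if target not in translated:
                translated.append(target)
    return translated, unmapped
```

As a usage example with made-up codes: given a crosswalk from `"v1"` to `"v2"` with the mapping `{"A1": ["A1"], "A2": ["B1", "B2"], "A3": ["B2"]}`, calling `roll_forward(["A1", "A2", "A3", "A9"], crosswalk)` returns `["A1", "B1", "B2"]` as the translated codes and `["A9"]` as unmapped. If every product pulled crosswalks like this from a single source, a saved report could be updated the same way in each product.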

Early UX (wireframes or mockups)

Non-Functional Attributes & Usage Projections

Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements

Dependencies

Is there any work that must precede this? Feature work? Ops work?

Legal and Ethical Considerations

Just answer yes or no.

Have you thought through these considerations (e.g. data privacy) and raised any potential concerns with the Legal team?

High-Level Rollout Strategies

Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]
- If specific beta customers, will it be for a specific survey launch date or report availability date?
How will this guide the rollout of individual stories in the epic?
The rollout strategy should be discussed with CS, Marketing, and Sales.
How long would we tolerate having a “partial rollout” (rolled out to some customers but not all)?

Risks

Focus on risks unique to this feature, not overall delivery/execution risks.

Open Questions

What are you still looking to resolve?

Complete with Engineering Teams

Effort Size Estimate

Estimated Costs

Direct Financial Costs

Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?

Team Effort

Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.

Team	Effort Estimate (T-shirt sizes)	Jira Link


Epic: Move cost of dataruns toward zero