Global Skills Taxonomy Version Parity
Created Date | Feb 1, 2023 |
---|---|
Target PI | PI 2 |
Target Release | |
Jira Epic | |
Document Status | Draft |
Epic Owner | @Abby Santos |
Stakeholder | @Duncan Brown (Unlicensed) @Alexandra Malfant @Bram Velthuis @Hal Bonella @Mark Taylor |
Engineering Team(s) Involved | C&E Taxonomy Linguistics Documents |
PART 1
Customer/User Job-to-be-Done or Problem
The Scope of the user problem should be narrowed to the scope you are planning to solve in this phase of work. There may be other aspects you are aware of and plan to solve in the future. For now, put those in the Out of Scope section.
When I look at data across countries, I want to see the same skills taxonomy version tagged in every country, so I can compare skill data across geographies.
When I, as a taxonomist, make a change to the English skill taxonomy, I want that change to carry across geographies and languages, so I can keep taxonomy versions in parity.
Value to Customers & Users
In the JTBD framework, these are the “pains” and “gains” your solution will address. Other ways to think about it: What’s the rationale for doing this work? Why is it a high priority problem for your customers and how will our solution add value?
Customers using Spotlight can also access new taxonomy features in 8.0 and beyond such as Subcategory and Category, as well as Software skills and other metadata.
Users of job posting and profile data from any geography can benefit from incremental taxonomy changes and improvements.
Customers can analyze data across multiple geographies using the same skills taxonomy, and we can keep that taxonomy up to date rather than being years behind our current English taxonomy.
Customers can use tools such as the global job postings APIs and the classification API side by side, with the latest versions of the skills taxonomy and classifier in both, whether they are parsing English, Spanish, or any other language we support.
Value to Lightcast
Sometimes we do things for our own benefit. List those reasons here.
Simplify the process for updating our non-English skill models
Align taxonomy, linguistics, and C&E work to provide more support for the Linguistics team and their updates
Build a framework that makes it easier to add new countries and languages to our skills model and taxonomy in the future.
Target User Role/Client/Client Category
Who are we building this for?
Customers who use both English and non-English job postings and profiles data.
Delivery Mechanism
How will users receive the value?
Analyst, APIs, Snowflake
Success Criteria & Metrics
How will you know you’ve completed the epic? How will you know if you’ve successfully addressed this problem? What usage goals do you have for these new features? How will you measure them?
Surface forms translated for key languages from version 7.40 to 8.24
Translation Quote: $38,854.88 for 12 languages to translate surface forms from 7.40 to 8.24
QA Complete for the translations
New sifter files built for each core language in v8.24: FR, DE, IT, PT, NL, ES
Tagging new version 8.24 in bonus week
Begin work to get all skill names translated and ready to input into the taxonomy
Design database architecture to store all language models in one database
Stretch goal: Deployment scripts
Scope and start building a system that lets us:
Update blacklists for each language model every 2 weeks with new skill additions
Update non-English models with skill removals, consolidations, and renames
Update the English portion of non-English models with new skills and surface form additions
Have a review cycle in place between the Taxonomy and Linguistics teams for English skill additions, so that the Linguistics team can add any needed blacklist entries (e.g., when the new skill “Aqui (Software)” is added, the Linguistics team blacklists the surface form “Aqui” in all Spanish models)
Ship new versions of all models for the latest taxonomy version every two weeks
Have a system to track skill additions, so that the Linguistics team can send batches for translation every 6 months
Smooth pipeline/delivery system to docs team
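The update steps listed above (removals, consolidations, renames, English additions, and blacklist review) could be sketched roughly as follows. This is a minimal illustration only, not the actual pipeline: the `LanguageModel` class, the change-record shape, the skill IDs, and the `common_words` list are all hypothetical names introduced for this example.

```python
import re
import unicodedata
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Lowercase and strip accents so 'Aquí' and 'Aqui' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


@dataclass
class LanguageModel:
    """Hypothetical stand-in for one non-English skill model."""
    language: str
    names: dict = field(default_factory=dict)          # skill_id -> English skill name
    surface_forms: dict = field(default_factory=dict)  # skill_id -> set of surface forms
    blacklist: set = field(default_factory=set)        # forms never tagged in this language


def apply_change(model: LanguageModel, change: dict) -> None:
    """Apply one English taxonomy change (assumed record shape) to a model."""
    kind = change["type"]
    if kind == "remove":
        model.names.pop(change["skill_id"], None)
        model.surface_forms.pop(change["skill_id"], None)
    elif kind == "consolidate":
        # Move the retired skill's surface forms onto the surviving skill.
        model.names.pop(change["from_id"], None)
        moved = model.surface_forms.pop(change["from_id"], set())
        model.surface_forms.setdefault(change["into_id"], set()).update(moved)
    elif kind == "rename":
        model.names[change["skill_id"]] = change["new_name"]
    elif kind == "add":
        # New English skill: its name doubles as an English surface form.
        model.names[change["skill_id"]] = change["name"]
        model.surface_forms.setdefault(change["skill_id"], set()).add(change["name"])
    else:
        raise ValueError(f"unknown change type: {kind}")


def blacklist_candidates(skill_name: str, common_words: set) -> set:
    """Flag the bare form of a new skill name if it is a common word.

    `common_words` is assumed to hold already-normalized words for the
    target language; the result is a suggestion for Linguistics review.
    """
    bare = re.sub(r"\s*\([^)]*\)\s*$", "", skill_name)  # drop "(Software)" etc.
    return {bare} if normalize(bare) in common_words else set()
```

For the example in the review-cycle item, `blacklist_candidates("Aqui (Software)", {normalize("aquí")})` would flag the surface form "Aqui" for the Spanish models, which the Linguistics team could then confirm and add to each model's blacklist.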
Aspects that are out of scope (of this phase)
What is explicitly not a part of this epic? List things that have been discussed but will not be included. Things you imagine in a phase 2, etc.
Adding translated skill names to the taxonomy
Being on a two-week update cadence for linguistics models
Localization/adapting taxonomy per country/language
Keeping model translations up to date on a two-week cadence
PART 2
Solution Description
Early UX (wireframes or mockups)
<FigmaLink>
Non-Functional Attributes & Usage Projections
Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements
Dependencies
Is there any work that must precede this? Feature work? Ops work?
Legal and Ethical Considerations
Just answer yes or no.
High-Level Rollout Strategies
Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]
If specific beta customers, will it be for a specific survey launch date or report availability date?
How will this guide the rollout of individual stories in the epic?
The rollout strategy should be discussed with CS, Marketing, and Sales.
How long would we tolerate having a “partial rollout” (rolled out to some customers but not all)?
Risks
Focus on risks unique to this feature, not overall delivery/execution risks.
Open Questions
What are you still looking to resolve?
Complete with Engineering Teams
Effort Size Estimate |
---|
Estimated Costs
Direct Financial Costs
Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?
Team Effort
Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.
Team | Effort Estimate (T-shirt sizes) | Jira Link |
---|---|---|
| | |
| | |
| | |