Global Skills Taxonomy Version Parity

 

Created Date

Feb 1, 2023

Target PI

PI 2

Target Release

Jira Epic

Document Status

Draft

Epic Owner

@Abby Santos

Stakeholder

@Duncan Brown @Alexandra Malfant @Bram Velthuis @Hal Bonella @Mark Taylor

Engineering Team(s) Involved

C&E Taxonomy Linguistics Documents

PART 1

Customer/User Job-to-be-Done or Problem

The scope of the user problem should be narrowed to what you plan to solve in this phase of work. There may be other aspects you are aware of and plan to solve in the future. For now, put those in the Out of Scope section.

When I look at data for any country, I want to see the same skills taxonomy version tagged in every country, so I can compare skill data across geographies.

When a taxonomist makes a change to the English skills taxonomy, I want that change to carry across geographies and languages, so I can keep taxonomy versions in parity.

Value to Customers & Users

In the JTBD framework, these are the “pains” and “gains” your solution will address. Other ways to think about it: What’s the rationale for doing this work? Why is it a high priority problem for your customers and how will our solution add value?

  • Customers using Spotlight can also access taxonomy features introduced in 8.0 and later, such as Subcategory and Category, as well as Software skills and other metadata.

  • Users of job posting and profile data from any geography can benefit from incremental taxonomy changes and improvements.

  • Customers can analyze data across multiple geographies using the same skills taxonomy, and we can keep that taxonomy up to date rather than being years behind our current English taxonomy.

  • Customers can use tools such as the global job postings APIs and the classification API side by side, with the latest versions of the skills taxonomy and classifier in both, whether they are parsing English, Spanish, or any other language we support.

Value to Lightcast

Sometimes we do things for our own benefit. List those reasons here. 

  • Simplify the process for updating our non-English skill models

  • Align taxonomy, linguistics, and C&E work to provide more support for the Linguistics team and their updates

  • Build a framework that makes it easier to add new countries and languages to our skills model and taxonomy in the future.

Target User Role/Client/Client Category

Who are we building this for?

  • Customers who use both English and non-English job postings and profiles data.

Delivery Mechanism

How will users receive the value?

  • Analyst, APIs, Snowflake

Success Criteria & Metrics

How will you know you’ve completed the epic? How will you know if you’ve successfully addressed this problem? What usage goals do you have for these new features? How will you measure them?

  • Surface forms translated for key languages from version 7.40 to 8.24

    • Translation Quote: $38,854.88 for 12 languages to translate surface forms from 7.40 to 8.24

  • QA Complete for the translations

  • New sifter files built for each core language in v8.24

    • FR, DE, IT, PT, NL, ES

    • Tagging new version 8.24 in bonus week

  • Begin work to get all skill names translated and ready to input into the taxonomy

  • Design database architecture to store all language models in one database

  • Stretch goal: Deployment scripts

  • Scope and start building a system that lets us:

    • Update blacklists for each language model every 2 weeks with new skill additions

    • Update non-English models with skill removals, consolidations, and renames

    • Update the English portion of non-English models with new skills and surface form additions

      • Have a review cycle in place between the Taxonomy and Linguistics teams for English skill additions, so that the Linguistics team can add any blacklist entries needed (e.g., when the new skill “Aqui (Software)” is added, the Linguistics team blacklists the surface form “Aqui” in all Spanish models)

    • Ship new versions of all models for the latest taxonomy version every two weeks

    • Track skill additions so that the Linguistics team can send batches for translation every 6 months

    • Smooth pipeline/delivery system to docs team
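As a rough illustration of the update system scoped above, the following Python sketch shows one way a batch of English taxonomy changes (removals, consolidations, renames, additions) and a set of blacklist entries might be applied to a single language model. Everything here is an assumption made for illustration: the class names, field names, and skill IDs are hypothetical, and the in-memory structures stand in for whatever storage the teams eventually design. It is not a description of the existing Linguistics pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LanguageModel:
    """Hypothetical in-memory stand-in for one language's skill model."""
    language: str
    names: dict[str, str] = field(default_factory=dict)               # skill_id -> skill name
    surface_forms: dict[str, set[str]] = field(default_factory=dict)  # skill_id -> surface forms
    blacklist: set[str] = field(default_factory=set)                  # lowercased forms never to tag


@dataclass
class TaxonomyDelta:
    """Hypothetical record of changes between two English taxonomy versions."""
    removed: list[str] = field(default_factory=list)            # skill_ids dropped
    consolidated: dict[str, str] = field(default_factory=dict)  # old skill_id -> surviving skill_id
    renamed: dict[str, str] = field(default_factory=dict)       # skill_id -> new name
    # skill_id -> (name, English surface forms) for newly added skills
    added: dict[str, tuple[str, set[str]]] = field(default_factory=dict)


def apply_delta(model: LanguageModel, delta: TaxonomyDelta,
                blacklist_additions: tuple[str, ...] | list[str] = ()) -> LanguageModel:
    """Apply one batch of taxonomy changes to a language model, in place."""
    # Removals: drop the skill and its surface forms.
    for skill_id in delta.removed:
        model.names.pop(skill_id, None)
        model.surface_forms.pop(skill_id, None)

    # Consolidations: fold the old skill's surface forms into the surviving skill.
    for old_id, new_id in delta.consolidated.items():
        forms = model.surface_forms.pop(old_id, set())
        model.names.pop(old_id, None)
        model.surface_forms.setdefault(new_id, set()).update(forms)

    # Renames: only touch skills the model already knows about.
    for skill_id, new_name in delta.renamed.items():
        if skill_id in model.names:
            model.names[skill_id] = new_name

    # Additions: new English skills and their English surface forms.
    for skill_id, (name, forms) in delta.added.items():
        model.names[skill_id] = name
        model.surface_forms.setdefault(skill_id, set()).update(forms)

    # Blacklist entries supplied after the Taxonomy/Linguistics review cycle.
    model.blacklist.update(form.lower() for form in blacklist_additions)
    return model


def taggable_forms(model: LanguageModel, skill_id: str) -> set[str]:
    """Surface forms for a skill that are not suppressed by the blacklist."""
    forms = model.surface_forms.get(skill_id, set())
    return {form for form in forms if form.lower() not in model.blacklist}
```

Under these assumptions, the “Aqui (Software)” case from the list would be handled by adding the skill through `TaxonomyDelta.added` and passing `"Aqui"` in `blacklist_additions` for each Spanish model, so the skill exists in the model but that surface form is never tagged.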

 

Aspects that are out of scope (of this phase)

What is explicitly not a part of this epic? List things that have been discussed but will not be included. Things you imagine in a phase 2, etc.

  • Adding in translated names for skills to the taxonomy

  • Being on a two-week update cadence for linguistics models

  • Localization/adapting taxonomy per country/language

  • Keeping model translations up to date on a two-week cadence

PART 2

Solution Description

Early UX (wireframes or mockups)

<FigmaLink>

 

Non-Functional Attributes & Usage Projections

Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements

 

Dependencies

Is there any work that must precede this? Feature work? Ops work? 

 

Legal and Ethical Considerations

Just answer yes or no.

Have you thought through these considerations (e.g. data privacy) and raised any potential concerns with the Legal team?

High-Level Rollout Strategies

  • Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]

    • If specific beta customers, will it be for a specific survey launch date or report availability date?

  • How will this guide the rollout of individual stories in the epic?

  • The rollout strategy should be discussed with CS, Marketing, and Sales.

  • How long would we tolerate a “partial rollout” (rolled out to some customers but not all)?

 

Risks

Focus on risks unique to this feature, not overall delivery/execution risks. 

 

Open Questions

What are you still looking to resolve?

 


Complete with Engineering Teams

 

Effort Size Estimate

Estimated Costs

Direct Financial Costs

Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?

 

Team Effort

Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.

Team | Effort Estimate (T-shirt sizes) | Jira Link