Skills Classification across Languages

 

Created Date

May 19, 2023

Target PI

4

Target Release

Jul 28, 2023

Jira Epic

Document Status

Draft

Epic Owner

@Duncan Brown (Unlicensed)

Stakeholder

@Nick Studt (Deactivated) @Derek Nevins (Deactivated) @Mark Taylor

Engineering Team(s) Involved

C&E Data Delivery

PART 1

Customer/User Job-to-be-Done or Problem

The Scope of the user problem should be narrowed to the scope you are planning to solve in this phase of work. There may be other aspects you are aware of and plan to solve in the future. For now, put those in the Out of Scope section.

When working with documents in German (or another non-English language), I want to classify them against Lightcast Skills, so that I can integrate Lightcast Skills into my applications.

At present the Skills API works in English and (with a parameter) in French or Spanish. We now classify data in a range of other languages, and making these available to customers will make Lightcast Skills significantly more accessible to customers working in languages other than English.

This work will take place in parallel with work to move skill tagging services to the Classifications API, and will also align with a future where we add multi-language labels to Lightcast Skills, available within Classifications API.

Value to Customers & Users

In the JTBD framework, these are the “pains” and “gains” your solution will address. Other ways to think about it: What’s the rationale for doing this work? Why is it a high priority problem for your customers and how will our solution add value?

We created Lightcast Skills as an open taxonomy, with the ambition of making it the main currency for skill classification of data. At present, classification by Lightcast is mainly used in English and is available externally in only two other languages (French and Spanish), while competitors offer classification against Lightcast Skills in other languages. We wish to make Lightcast Skills available to as many users as possible.

Value to Lightcast

This work is an important complement to our wider push to be a Global Labor Market Data provider. If we want to establish Lightcast Skills as the main currency for skill classification, we cannot be confined to English language data only.

Target User Role/Client/Client Category

API users looking to add skills to document-based data. Examples include:

  • Businesses with job postings, job descriptions

  • Staffing companies with job postings, resumes

  • Education providers with course descriptions

The use case is well defined within our English language business.

Delivery Mechanism

Additional languages will be made available within the Classifications API for skill classification, as part of other work to move skill classification to this API. In addition, we will look to establish a path for all languages to be added in future, as they become ready.
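As a rough illustration of this delivery mechanism, the sketch below shows how a client might attach a language to a skill classification request before sending it. The function name, payload fields and language codes are assumptions made for illustration only; they do not describe the actual Classifications API contract, and the list of supported languages is not defined in this document.

```python
# Hypothetical language codes; the real set of supported languages
# is not specified in this document.
SUPPORTED_LANGUAGES = {"en", "fr", "es", "de"}


def build_classification_request(text: str, language: str = "en") -> dict:
    """Return a JSON-serializable payload for a (hypothetical) skill
    classification call, with the language set explicitly by the caller."""
    code = language.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    if not text.strip():
        raise ValueError("Document text must not be empty")
    return {"text": text, "language": code}


# Example: a German job posting snippet classified with language "de".
payload = build_classification_request(
    "Erfahrung mit Python und SQL erforderlich.", language="DE"
)
```

Keeping the supported languages in one place reflects the intent to add further languages as they become ready: in this sketch, enabling a new language would only mean extending that set.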

Success Criteria & Metrics

How will you know you’ve completed the epic? How will you know if you’ve successfully addressed this problem? What usage goals do you have for these new features? How will you measure them?

We will be able to offer skill classification API services to customers whose documents are in languages other than English, French or Spanish, within the range of languages available for skill classification.

Aspects that are out of scope (of this phase)

What is explicitly not a part of this epic? List things that have been discussed but will not be included. Things you imagine in a phase 2, etc.

One ambition is to add language detection to the Classifications API to automate setting the language parameter during classification, so that a customer can send documents in any available language and receive meaningful results. This is an achievable goal, but for a future PI.

PART 2

Solution Description

Early UX (wireframes or mockups)

<FigmaLink>

 

Non-Functional Attributes & Usage Projections

Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements

 

Dependencies

Is there any work that must precede this? Feature work? Ops work? 

 

Legal and Ethical Considerations

Just answer yes or no.

Have you thought through these considerations (e.g. data privacy) and raised any potential concerns with the Legal team?

High-Level Rollout Strategies

  • Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]

    • If specific beta customers, will it be for a specific survey launch date or report availability date?

  • How will this guide the rollout of individual stories in the epic?

  • The rollout strategy should be discussed with CS, Marketing, and Sales.

  • How long would we tolerate having a “partial rollout” (rolled out to some customers but not all)?

 

Risks

Focus on risks unique to this feature, not overall delivery/execution risks. 

 

Open Questions

What are you still looking to resolve?

 


Complete with Engineering Teams

 

Effort Size Estimate

Estimated Costs

Direct Financial Costs

Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?

 

Team Effort

Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.

Team

Effort Estimate (T-shirt sizes)

Jira Link

C&E

XL

https://economicmodeling.atlassian.net/browse/CE-1457