LOT classification of non-English Postings
Target PI | PI#6/7 |
---|---|
Created Date | Sep 15, 2023 |
Target Release | End of 2023 |
Jira Epic | |
Document Status | |
Epic Owner | @Hal Bonella @Abby Santos |
Stakeholder | @Alexandra Malfant @Tatiana Harrison |
Engineering Team(s) Involved | Documents Micro C&E Taxonomy Data Solutions NLP ML |
Customer/User Job-to-be-Done or Problem
The Scope of the user problem should be narrowed to the scope you are planning to solve in this phase of work. There may be other aspects you are aware of and plan to solve in the future. For now, put those in the Out of Scope section.
When users use our tools and products to access our global data, they want non-English postings to appear when they apply LOT-based filters, so they can expect results similar to those for English postings.
The following languages are to be tested:
Spanish
French
German
Dutch
Italian
Portuguese
Polish
Danish
Czech
Swedish
Romanian
Value to Customers & Users
In the JTBD framework, these are the “pains” and “gains” your solution will address. Other ways to think about it: What’s the rationale for doing this work? Why is it a high priority problem for your customers and how will our solution add value?
Value to Lightcast
Sometimes we do things for our own benefit. List those reasons here.
From Lightcast's perspective, tagging global non-English postings with LOT’s Specialized Occupations would provide a greater breadth of information on the global labor force, giving granular demand data for our tools and analysis.
Monetary value for Lightcast
Adding LOT to non-English postings may bring in additional revenue, though how much is yet to be determined. In general, we cannot be a global authority while focusing only on the English language.
There is no NOVA dependency for this dataset.
Target User Role/Client/Client Category
Who are we building this for?
Analyst projects, and any JPA report that shows global postings data
Delivery Mechanism
How will users receive the value?
Scope of PI#6
Must have: | |
---|---|
Nice to have: | |
Not in scope: | |
Success Criteria & Metrics
How will you know you’ve completed the epic? How will you know if you’ve successfully addressed this problem? What usage goals do you have for these new features? How will you measure them?
LOT classifier on postings is relatively accurate
It may be less accurate than on English postings, but it still needs to be at least 70% accurate
Should not have any obvious errors in the top 100 titles
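One way these criteria could be checked is sketched below. It assumes a hand-labelled sample of postings per language exists, which this document does not confirm; the function names, the language codes, and the tuple layout are all hypothetical and are not part of any existing Lightcast tooling. The sketch computes accuracy per language, flags languages under the 70% bar, and lists the most frequent titles for manual review.

```python
from collections import Counter, defaultdict

# Threshold taken from the success criteria above.
ACCURACY_THRESHOLD = 0.70


def accuracy_by_language(samples):
    """Return {language: accuracy} for hand-labelled samples.

    `samples` is an iterable of (language, predicted_lot, expected_lot)
    tuples. This layout is an assumption made for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, predicted, expected in samples:
        total[language] += 1
        if predicted == expected:
            correct[language] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


def languages_below_threshold(samples, threshold=ACCURACY_THRESHOLD):
    """Return the sorted list of languages whose accuracy is under the threshold."""
    scores = accuracy_by_language(samples)
    return sorted(lang for lang, acc in scores.items() if acc < threshold)


def top_titles(titles, n=100):
    """Return the n most frequent raw titles, for manual error review."""
    return [title for title, _ in Counter(titles).most_common(n)]


# Hypothetical labelled sample: (language, predicted LOT, expected LOT).
samples = [
    ("es", "A", "A"), ("es", "A", "A"), ("es", "B", "A"), ("es", "C", "C"),
    ("fr", "A", "B"), ("fr", "B", "B"),
]

if __name__ == "__main__":
    print(accuracy_by_language(samples))
    print(languages_below_threshold(samples))
```

The top-100 check remains a manual review step; `top_titles` only produces the list to review.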
Aspects that are out of scope (of this phase)
What is explicitly not a part of this epic? List things that have been discussed but will not be included. Things you imagine in a phase 2, etc.
Solution Description
Early UX (wireframes or mockups)
<FigmaLink>
Non-Functional Attributes & Usage Projections
Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements
Dependencies
Is there any work that must precede this? Feature work? Ops work?
Classifier working on US profiles
Legal and Ethical Considerations
Just answer yes or no.
High-Level Rollout Strategies
Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]
If specific beta customers, will it be for a specific survey launch date or report availability date?
How will this guide the rollout of individual stories in the epic?
The rollout strategy should be discussed with CS, Marketing, and Sales.
How long would we tolerate having a “partial rollout” -- rolled out to some customers but not all?
Risks
Using translated titles and translated skills to classify postings may reduce recall and accuracy. Whether this happens, and by how much, still needs to be established.
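A minimal sketch of how this risk could be quantified follows, assuming two hand-labelled samples are available: an English baseline and a set of translated non-English postings. Neither sample is confirmed by this document, and the function name and data layout are hypothetical. It compares macro-averaged recall (recall per LOT occupation, averaged across occupations) between the two samples.

```python
from collections import defaultdict


def macro_recall(pairs):
    """Macro-averaged recall over (predicted, expected) label pairs.

    `predicted` may be None when the classifier returned no LOT label.
    The pair layout is an assumption made for illustration.
    """
    hits = defaultdict(int)
    support = defaultdict(int)
    for predicted, expected in pairs:
        support[expected] += 1
        if predicted == expected:
            hits[expected] += 1
    if not support:
        raise ValueError("no labelled pairs supplied")
    return sum(hits[label] / support[label] for label in support) / len(support)


# Hypothetical labelled samples: (predicted LOT, expected LOT).
english = [("A", "A"), ("B", "B"), ("A", "A"), ("B", "A")]
translated = [("A", "A"), (None, "B"), ("A", "A"), ("B", "A")]

# Positive value means recall dropped on the translated sample.
recall_drop = macro_recall(english) - macro_recall(translated)

if __name__ == "__main__":
    print(f"Recall drop on translated sample: {recall_drop:.2f}")
```

Macro averaging is used here so that rare occupations count as much as common ones; a posting-weighted average would be an equally reasonable choice depending on what the stakeholders care about.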
Open Questions
What are you still looking to resolve?
Complete with Engineering Teams
Effort Size Estimate | L |
---|---|
Estimated Costs
Direct Financial Costs
Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?
Team Effort
Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.
Team | Effort Estimate (T-shirt sizes) | Jira Link | Work to be done |
---|---|---|---|
C&E | L | | |
NLP | M | | |
WF Taxonomies | M | | |