LOT classification of Global Profiles
https://economicmodeling.atlassian.net/browse/TX-1406
Target PI | PI#6/7 |
---|---|
Created Date | Sep 15, 2023 |
Target Release | End of 2023 |
Jira Epic | https://economicmodeling.atlassian.net/browse/DATA-1881 |
Document Status | |
Epic Owner | @Hal Bonella @john.miner (Deactivated) |
Stakeholder | @Ben Bradley @Dave Wallace (Deactivated) @Lendl Meyer (Deactivated) @Lottes Salter @Tatiana Harrison @Rachael Larsen @Caleb Paul @Gavin Esser |
Engineering Team(s) Involved | Documents Micro C&E Analyst Taxonomy Data Solutions NLP ML |
Customer/User Job-to-be-Done or Problem
The Scope of the user problem should be narrowed to the scope you are planning to solve in this phase of work. There may be other aspects you are aware of and plan to solve in the future. For now, put those in the Out of Scope section.
As a client accessing Lightcast’s global data, I want profile data with LOT-based filters for greater granularity and specificity than (for example) ONET or SOC provide. As a recruiter, corporate HR user, or similar, I want to understand and compare the supply of talent in markets across the globe at a level that describes the job responsibilities of individuals.
JTBD 1: As a user of Pathways data, I want to report my program outcomes in user-friendly job title terminology but at a meaningful level of aggregation, so that I can produce insightful reports without significantly customizing them before distribution. This is something that is currently possible in Lightcast, but at an extra cost to the company.
JTBD 2: As a user of Pathways data, I want to be able to connect my program outcomes to the specialized occupations-based reporting available in other Lightcast products, so that I can gain insights into how my programs are interacting with the regional labor market.
Value to Customers & Users
In the JTBD framework, these are the “pains” and “gains” your solution will address. Other ways to think about it: What’s the rationale for doing this work? Why is it a high priority problem for your customers and how will our solution add value?
Value to Lightcast
Sometimes we do things for our own benefit. List those reasons here.
From Lightcast’s perspective, tagging global profiles with LOT’s Specialized Occupations would provide granular supply data for our tools and analysis.
Monetary value for Lightcast
Adding LOT to profiles will bring in additional revenue, though the amount is yet to be determined (Chris K estimated $4-6m). Part of the revenue will come from SkillScape.
The cost reduction for LOT itself is minimal. The main cost reduction will come from:
removing work/time dedicated to updating title roles for Title releases, Title-SOC mapping, etc.
(Out of scope for this phase) removing work/time dedicated to updating the ONET TensorFlow mapping once we are on the new ONET 2019 tagger (which depends on having LOT on profiles)
There is no NOVA dependency for this dataset.
Target User Role/Client/Client Category
Who are we building this for?
Global BU
Analyst projects, particularly Profile Analytics and any JPA report that also shows Profile data
SkillScape
Alumni Pathways clients who use title roles.
Delivery Mechanism
How will users receive the value?
Scope of PI#6
Must have: | |
---|---|
Nice to have: | |
Not in scope: | |
Success Criteria & Metrics
How will you know you’ve completed the epic? How will you know if you’ve successfully addressed this problem? What usage goals do you have for these new features? How will you measure them?
LOT classifier on profiles is relatively accurate
It may be at a lower rate than postings, but it still needs to be at least 70% accurate
Should not have any obvious errors in top 100 titles
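The two criteria above could be checked against a hand-labeled sample along the following lines. This is only a sketch under stated assumptions: the sample rows, the occupation labels, and the function names are hypothetical, and "obvious error" is approximated here as any mismatch between the predicted and the expected occupation for one of the most frequent titles. In practice that judgment would likely need human review.

```python
from collections import Counter

def accuracy(labeled):
    """Share of rows where the predicted occupation equals the expected one.

    `labeled` is a list of (title, predicted_occ, expected_occ) tuples.
    """
    if not labeled:
        return 0.0
    correct = sum(1 for _, pred, exp in labeled if pred == exp)
    return correct / len(labeled)

def top_title_errors(labeled, n=100):
    """Mismatches among the n most frequent titles in the sample."""
    counts = Counter(title for title, _, _ in labeled)
    top = {title for title, _ in counts.most_common(n)}
    return sorted({(t, p, e) for t, p, e in labeled if t in top and p != e})

def meets_criteria(labeled, threshold=0.70, n=100):
    """True when accuracy reaches the threshold and the top-n titles are clean."""
    return accuracy(labeled) >= threshold and not top_title_errors(labeled, n)

# Hypothetical hand-labeled sample; occupation names are placeholders, not real LOT entries.
SAMPLE = [
    ("software engineer", "Software Developer", "Software Developer"),
    ("software engineer", "Software Developer", "Software Developer"),
    ("registered nurse", "Registered Nurse", "Registered Nurse"),
    ("frogman", "Retail Sales Associate", "Military Diver"),
]
```

With this sample, overall accuracy clears the 70% threshold, but the check still fails because one of the listed titles is mislabeled, which is the situation the top 100 criterion is meant to catch.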
Aspects that are out of scope (of this phase)
What is explicitly not a part of this epic? List things that have been discussed but will not be included. Things you imagine in a phase 2, etc.
Hard launch of global profiles LOT to clients.
Solution Description
Early UX (wireframes or mockups)
<FigmaLink>
Non-Functional Attributes & Usage Projections
Consider performance characteristics, privacy/security implications, localization requirements, mobile requirements, accessibility requirements
Dependencies
Is there any work that must precede this? Feature work? Ops work?
Classifier working on US profiles
Legal and Ethical Considerations
Just answer yes or no.
High-Level Rollout Strategies
Initial rollout to [internal employees|sales demos|1-2 specific beta customers|all customers]
If specific beta customers, will it be for a specific survey launch date or report availability date
How will this guide the rollout of individual stories in the epic?
The rollout strategy should be discussed with CS, Marketing, and Sales.
How long we would tolerate having a “partial rollout” -- rolled out to some customers but not all
Risks
Using the “Postings” classifier for Profiles
Classifier Regex / maintenance:
How do we write a rule for a SpecOcc to tag on profiles and not postings (within the same rule_set)?
Build additional field for specifying whether the classifier rule should be used on postings data vs profiles data
Call it is_postings?
What model do we use?
Postings or Profiles?
I think the profiles model could work well since it’s geared for postings data.
Classifier Code / classifier:
If input requirements remain the same, then no significant change will be needed.
“experience title” put into the raw_title field
“experience description” and “company” info put into the body field
The postings classifier does not currently have functionality to specify if a regex rule should be applied to postings data or profiles data
This is needed because not all occupations exist in postings data (Founder/Owner, Students, Housewives/husbands)
This is also needed because not all titles for roles are applicable to both postings and profiles
“Frogman” may be a military title in profiles data, but a sales associate title for a pet store in postings data.
We need the ability for the classifier to create two separate artifacts, one for postings use and one for profiles use.
This way downstream teams do not need to customize extra inputs in any way to use the appropriate classifier.
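One way the ideas above might fit together is sketched below: each rule carries a data-source scope, one rule set is split into a postings artifact and a profiles artifact, and a profile experience is mapped onto the existing raw_title and body inputs. Everything here is an assumption for illustration only: the Rule class, the applies_to field, the function names, the profile keys (title, description, company), and the occupation labels are hypothetical and do not describe the current classifier code. The sketch uses a set of sources rather than the boolean is_postings floated above, so that a single rule can apply to both datasets.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    pattern: str    # regex matched against raw_title
    spec_occ: str   # hypothetical Specialized Occupation label
    # Hypothetical scope field; by default a rule applies to both datasets.
    applies_to: frozenset = frozenset({"postings", "profiles"})

RULE_SET = [
    Rule(r"\bfrogman\b", "Military Diver", frozenset({"profiles"})),
    Rule(r"\bfrogman\b", "Pet Store Sales Associate", frozenset({"postings"})),
    Rule(r"\b(founder|owner)\b", "Founder / Owner", frozenset({"profiles"})),
    Rule(r"\bsoftware engineer\b", "Software Developer"),
]

def build_artifact(rule_set, source):
    """Compile only the rules whose scope includes `source`."""
    return [
        (re.compile(rule.pattern, re.IGNORECASE), rule.spec_occ)
        for rule in rule_set
        if source in rule.applies_to
    ]

def profile_to_input(experience):
    """Map a profile experience onto the raw_title and body input fields."""
    parts = (experience.get("description", ""), experience.get("company", ""))
    return {
        "raw_title": experience.get("title", ""),
        "body": " ".join(p for p in parts if p),
    }

def classify(artifact, record):
    """Return the label of the first rule matching raw_title, else None."""
    for pattern, spec_occ in artifact:
        if pattern.search(record["raw_title"]):
            return spec_occ
    return None

# Two artifacts built from one rule set, one per data source.
POSTINGS_ARTIFACT = build_artifact(RULE_SET, "postings")
PROFILES_ARTIFACT = build_artifact(RULE_SET, "profiles")
```

Under these assumptions, a downstream team would load whichever artifact matches its data source and pass records in the same shape as today, without supplying any extra flag at classification time.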
Open Questions
What are you still looking to resolve?
Complete with Engineering Teams
Effort Size Estimate | L |
---|
Estimated Costs
Direct Financial Costs
Are there direct costs that this feature entails? Dataset acquisition, server purchasing, software licenses, etc.?
Team Effort
Each team involved should give a general t-shirt size estimate of their work involved. As the epic proceeds, they can add a link to the Jira epic/issue associated with their portion of this work.
Team | Effort Estimate (T-shirt sizes) | Jira Link | Work to be done |
---|---|---|---|
C&E | M? | No specific work anticipated | |
DS | M | | |
WF Taxonomies | S | | |