Methodology
How we scored AI exposure for 420 Indonesian occupations
Overview
This project estimates how exposed each Indonesian occupation is to AI. Each occupation is scored by an LLM against a detailed rubric, using Indonesia-specific context and employment data from BPS (Badan Pusat Statistik, Statistics Indonesia).
Data Sources
| Source | Description |
| BPS Sakernas Aug 2025 | National Labor Force Survey — employment numbers by occupation |
| KBJI 2014 | Indonesian Standard Classification of Occupations (based on ISCO-08) |
| BPS Wage Statistics | Median monthly pay by occupation category |
We analyze 420 occupations at the KBJI 4-digit level, covering 146.5 million employed persons, which is total employment in Indonesia as of August 2025.
Scoring Model
Each occupation was scored by Google's Gemini 2.5 Flash using a detailed scoring prompt with calibration anchors, adapted to the Indonesian context.
What the model receives
For each occupation, the model gets:
- Occupation title (Indonesian + English)
- KBJI classification code
- Category (managers, professionals, technicians, etc.)
- Number of jobs in Indonesia
- Median monthly pay
- Education requirement
- Detailed job description (duties, work environment, tools)
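The per-occupation input could be assembled roughly as follows. This is a minimal sketch, not the project's actual code: the `Occupation` dataclass, its field names, `build_occupation_details`, and the sample values are all hypothetical, chosen only to mirror the list above.

```python
from dataclasses import dataclass


@dataclass
class Occupation:
    # Hypothetical record; fields mirror the list of model inputs above.
    title_id: str
    title_en: str
    kbji_code: str
    category: str
    jobs: int
    median_monthly_pay_idr: int
    education: str
    description: str


def build_occupation_details(occ: Occupation) -> str:
    """Format one occupation as the text block appended to the rubric."""
    lines = [
        f"Occupation: {occ.title_id} / {occ.title_en}",
        f"KBJI code: {occ.kbji_code}",
        f"Category: {occ.category}",
        f"Jobs in Indonesia: {occ.jobs:,}",
        f"Median monthly pay (IDR): {occ.median_monthly_pay_idr:,}",
        f"Education requirement: {occ.education}",
        f"Job description: {occ.description}",
    ]
    return "\n".join(lines)


# Illustrative values only, not taken from the dataset.
example = Occupation(
    title_id="Contoh Pekerjaan",
    title_en="Example Occupation",
    kbji_code="0000",
    category="professionals",
    jobs=123456,
    median_monthly_pay_idr=5000000,
    education="Bachelor's degree",
    description="Placeholder duties, work environment, and tools.",
)
details = build_occupation_details(example)
print(details)
```

Keeping the formatting in one function means every occupation reaches the model in the same layout, which makes scores easier to compare across occupations.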
Scoring rubric
Key signal: If the job can be done entirely from a home office on a computer — writing, coding, analyzing, communicating — then AI exposure is inherently high (7+), because AI capabilities in digital domains are advancing rapidly. Conversely, jobs requiring physical presence, manual skill, or real-time human interaction have a natural barrier.
| Score | Level | Description | Examples |
| 0–1 | Minimal | Almost entirely physical, hands-on work in unpredictable environments | Roofer, construction laborer, street sweeper |
| 2–3 | Low | Mostly physical or interpersonal. AI helps with minor peripheral tasks | Electrician, plumber, farmer, fisherman, midwife |
| 4–5 | Moderate | Mix of physical and knowledge work. AI assists info-processing parts | Nurse, police officer, veterinarian, factory operator |
| 6–7 | High | Predominantly knowledge work with some need for human judgment | Teacher, manager, accountant, journalist, civil engineer |
| 8–9 | Very High | Almost entirely computer-based. Core tasks in AI's sweet spot | Software developer, graphic designer, data analyst |
| 10 | Maximum | Routine digital information processing. AI can do most of it today | Data entry clerk, telemarketer |
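For downstream grouping, the rubric bands above can be turned into a small lookup. The sketch below assumes scores are integers from 0 to 10; the function name `exposure_level` is hypothetical and not part of the project's code.

```python
def exposure_level(score: int) -> str:
    """Map a 0-10 exposure score to its rubric level name."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score <= 1:
        return "Minimal"
    if score <= 3:
        return "Low"
    if score <= 5:
        return "Moderate"
    if score <= 7:
        return "High"
    if score <= 9:
        return "Very High"
    return "Maximum"


for s in (0, 3, 5, 7, 9, 10):
    print(s, exposure_level(s))
```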
Indonesia-specific adjustments
The scoring prompt includes Indonesia-specific context:
- Agriculture — Largely smallholder/manual farming, not US-style mechanized operations
- Informal sector — ~60% of employment is informal (warung, pedagang pasar, ojol)
- Construction — Labor-intensive with minimal automation compared to developed countries
- Service sector — More personal/physical services vs knowledge-based
Scoring Process
- Compiled 420 occupation records from BPS Sakernas + KBJI 2014
- Generated detailed job descriptions for each occupation
- Fed each description + metadata to Gemini 2.5 Flash with the scoring prompt
- Used 8 concurrent threads for parallel processing (~5 minutes total)
- Collected, for each occupation, the model's score (0-10) and a rationale specific to that occupation
- Merged the results with employment data for visualization
# Simplified scoring flow (score_fast.py)
scores = {}
for occupation in all_420_occupations:
    # Rubric text followed by this occupation's metadata and description
    prompt = SCORING_RUBRIC + occupation_details
    result = gemini_flash(prompt)  # {"exposure": 7, "rationale": "..."}
    scores[occupation.slug] = result
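The 8-thread parallel step from the process list could look something like the sketch below. It is an illustration under stated assumptions, not the contents of `score_fast.py`: `fake_model_call` is a stub standing in for the real Gemini request, and `parse_result` and `score_all` are hypothetical helper names. The response shape (`exposure`, `rationale`) follows the comment in the simplified flow above.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def fake_model_call(prompt: str) -> str:
    # Stand-in for the real Gemini request; always returns the same score.
    return json.dumps({"exposure": 7, "rationale": f"stub for: {prompt[:20]}"})


def parse_result(raw: str) -> dict:
    """Parse the model's JSON reply and check the score is an integer in 0-10."""
    data = json.loads(raw)
    score = data["exposure"]
    if not isinstance(score, int) or not 0 <= score <= 10:
        raise ValueError(f"exposure out of range: {score!r}")
    return {"exposure": score, "rationale": str(data["rationale"])}


def score_all(prompts, call_model=fake_model_call, workers=8):
    """Score every prompt using a pool of worker threads."""
    def score_one(item):
        slug, prompt = item
        return slug, parse_result(call_model(prompt))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order and re-raises worker exceptions.
        return dict(pool.map(score_one, prompts.items()))


# Hypothetical slugs and prompts, for illustration only.
prompts = {f"occupation-{i}": f"RUBRIC + details for occupation {i}" for i in range(20)}
scores = score_all(prompts)
print(len(scores), scores["occupation-0"]["exposure"])
```

Threads suit this workload because each call spends most of its time waiting on the network. Validating every reply before storing it catches malformed or out-of-range scores early, rather than at the visualization stage.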
Indonesia vs US Comparison
| Metric | 🇮🇩 Indonesia | 🇺🇸 United States |
| Weighted avg exposure | 3.2 | 4.9 |
| Occupations | 420 | 342 |
| Jobs | 146.5M | 143M |
| Exposure distribution | 67% low exposure (2-3) | ~35% high exposure (7+) |
| Agriculture share of jobs | 27.2% | 1.5% |
Indonesia's weighted average AI exposure is about 35% lower than that of the US (3.2 vs 4.9), primarily because:
- Larger agricultural sector (27% vs 1.5%) — physical work resistant to AI
- More manual labor in construction and manufacturing
- Smaller knowledge/digital economy
- Larger informal sector with face-to-face commerce
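The headline figures rest on two pieces of arithmetic: an employment-weighted average of occupation scores, and the relative gap between the two national averages. The sketch below shows both under the assumption that "weighted avg" means weighting each occupation's score by its number of jobs. The function names and the toy rows are hypothetical; only the 3.2 and 4.9 averages come from the comparison above.

```python
def weighted_average_exposure(rows):
    """Employment-weighted mean of exposure scores.

    rows: iterable of (jobs, exposure) pairs.
    """
    rows = list(rows)
    total_jobs = sum(jobs for jobs, _ in rows)
    if total_jobs == 0:
        raise ValueError("total employment is zero")
    return sum(jobs * exposure for jobs, exposure in rows) / total_jobs


def relative_gap(lower: float, higher: float) -> float:
    """Fraction by which `lower` falls below `higher`."""
    return (higher - lower) / higher


# Toy data for illustration only, not the project's dataset.
toy = [(600, 2), (300, 4), (100, 8)]
toy_avg = weighted_average_exposure(toy)  # (1200 + 1200 + 800) / 1000

# Averages reported in the comparison: 3.2 (Indonesia) vs 4.9 (US).
gap = relative_gap(3.2, 4.9)
print(f"{toy_avg:.1f} {gap:.0%}")
```

With the reported averages, (4.9 - 3.2) / 4.9 is roughly 0.35, which matches the 35% figure. Weighting by jobs means that a few large low-exposure occupations, such as smallholder farming, pull the national average down more than many small high-exposure ones pull it up.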
Limitations
- LLM-based scoring — Scores are model estimates, not empirical measurements. They reflect the model's understanding of each occupation's tasks and AI capabilities.
- Single model — We used Gemini 2.5 Flash. Different models may produce slightly different scores. Karpathy's original also used a single model (Gemini Flash).
- Static snapshot — AI capabilities evolve rapidly. These scores reflect early 2026 AI capabilities.
- Occupation-level granularity — Individual roles within an occupation may vary significantly in AI exposure.
- Pay data — Median pay estimates are approximate, based on BPS aggregate data and industry reports.