Building a Lead Enrichment Pipeline from the CLI
CSV of prospects in, enriched data out. All from the terminal with curl, jq, and one API.
You have a list of prospects. Names, companies, maybe LinkedIn URLs. You need full profiles, verified emails, and structured data, all without leaving the terminal. Here is how to build that pipeline with curl, jq, and the ScrapeLinkedIn API.
No Python, no SDKs, no dependencies beyond what is already on your machine. Just shell commands you can copy, paste, and pipe together.
What we're building
The pipeline takes a CSV of raw leads and produces a fully enriched CSV ready for your CRM or outreach tool. Five stages, each one a single command:
CSV (names + companies)
|
v
ScrapeLinkedIn API (batch scrape)
|
v
Structured JSON (profiles)
|
v
jq (extract + transform)
|
v
Enriched CSV (ready for CRM/outreach)
By the end of this post, you will have a single shell script that does the whole thing end-to-end.
Prerequisites
You need three things:
- curl for HTTP requests (installed by default on macOS and most Linux distros)
- jq for JSON processing (brew install jq or apt install jq)
- A ScrapeLinkedIn API key (free, takes 30 seconds)
To get your API key, run these three commands:
# Register
curl -X POST https://scrapelinkedin.com/api/v1/auth/register \
-H "Content-Type: application/json" \
-d '{"email": "you@company.com"}'
# Verify (check your email for code)
curl -X POST https://scrapelinkedin.com/api/v1/auth/verify \
-H "Content-Type: application/json" \
-d '{"email": "you@company.com", "code": "123456"}'
# Get API key
curl -X POST https://scrapelinkedin.com/api/v1/auth/api-key \
-H "Content-Type: application/json" \
-d '{"email": "you@company.com"}'
Save the API key. You will use it in every step below.
Step 1: Prepare your input
Start with a CSV file containing your prospects. At minimum, you need LinkedIn URLs. If you only have names and companies, the API can look those up too, but URLs are faster and cheaper.
# input.csv:
# name,company,linkedin_url
# Satya Nadella,Microsoft,https://linkedin.com/in/satyanadella
# Jensen Huang,NVIDIA,https://linkedin.com/in/jenhsunhuang
# ...
# Extract URLs into a JSON array for the batch endpoint
URLS=$(tail -n +2 input.csv | cut -d',' -f3 | jq -R . | jq -s .)
echo "$URLS"
This pipes your CSV through three stages: tail skips the header row, cut grabs the third column (LinkedIn URLs), and jq wraps them into a JSON array. The output looks like this:
["https://linkedin.com/in/satyanadella","https://linkedin.com/in/jenhsunhuang"]
Step 2: Batch scrape
The batch endpoint accepts up to 1,000 URLs in a single request. Submit your array and capture the batch ID:
API_KEY="sk_your_key"
BASE="https://scrapelinkedin.com/api/v1"
# Submit batch (up to 1,000 URLs)
BATCH=$(curl -s -X POST "$BASE/scrape/batch" \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{\"linkedin_urls\": $URLS}")
BATCH_ID=$(echo "$BATCH" | jq -r '.batch_id')
echo "Batch submitted: $BATCH_ID"
The API returns immediately with a batch ID. Scraping happens asynchronously on the server side, so you do not block while profiles are being fetched.
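It is worth guarding against a failed submission before you start polling. jq -r prints the literal string null when the key is missing, so you can catch both an empty response and an error payload (the error body below is a hypothetical shape for illustration):

```shell
#!/bin/sh
# Guard: verify the batch was accepted before polling.
# jq -r emits the string "null" when .batch_id is absent.
BATCH='{"error": "invalid api key"}'   # hypothetical failure payload
BATCH_ID=$(echo "$BATCH" | jq -r '.batch_id')
if [ -z "$BATCH_ID" ] || [ "$BATCH_ID" = "null" ]; then
  echo "Batch submission failed: $BATCH" >&2
  # exit 1 in the real script
fi
```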
Step 3: Poll for results
Check the batch status every 15 seconds. When all profiles are scraped (or the batch times out), pull down the results:
# Poll every 15 seconds until done
while true; do
STATUS=$(curl -s "$BASE/scrape/batch/$BATCH_ID" \
-H "X-API-Key: $API_KEY")
STATE=$(echo "$STATUS" | jq -r '.status')
DONE=$(echo "$STATUS" | jq -r '.completed')
TOTAL=$(echo "$STATUS" | jq -r '.total')
echo "Status: $STATE ($DONE/$TOTAL)"
if [ "$STATE" != "pending" ]; then
break
fi
sleep 15
done
# Save results
echo "$STATUS" | jq '.results' > profiles.json
The batch status will be pending while scrapes are in progress, then transition to completed, partial (some profiles failed), or timed_out. The results array is included in the response once the batch finishes.
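When a batch comes back partial, you will want the failed URLs for a retry pass. Since each result object carries the same .status and .linkedin_url fields used in Step 4, a jq filter pulls them out directly (a stand-in results file is used here for illustration):

```shell
#!/bin/sh
# Collect URLs that did not complete, for a retry pass.
# Assumes each result object has .status and .linkedin_url (as in Step 4).
cat > sample_results.json <<'EOF'
[
  {"linkedin_url": "https://linkedin.com/in/satyanadella", "status": "completed"},
  {"linkedin_url": "https://linkedin.com/in/jenhsunhuang", "status": "failed"}
]
EOF
jq -r '.[] | select(.status != "completed") | .linkedin_url' sample_results.json > retry.txt
cat retry.txt
# → https://linkedin.com/in/jenhsunhuang
```

Feed retry.txt back into Step 2 as a second, smaller batch.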
Step 4: Transform with jq
Now you have a JSON file full of structured profile data. Use jq to extract the fields you care about and flatten them into a CSV:
# Extract key fields into a flat CSV
echo "name,headline,company,title,location,linkedin_url" > enriched.csv
cat profiles.json | jq -r '.[] | select(.status == "completed") |
[.profile_data.full_name, .profile_data.headline, .profile_data.location, .linkedin_url] |
@csv' >> enriched.csv
echo "Enriched $(wc -l < enriched.csv | tr -d ' ') leads"
The select(.status == "completed") filter skips any profiles that failed to scrape. The @csv formatter handles quoting and escaping, so fields with commas or special characters come out clean.
You can customize the field list to match whatever your CRM or outreach tool expects. The full profile response includes headline, location, summary, education, experience, and honors and awards. Check the API docs for the complete schema.
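For example, if your CRM wants current company and title, you could pull them from the experience array. The exact shape here is an assumption (an array of role objects with .company and .title, most recent first); check the API docs for the real schema before relying on it:

```shell
#!/bin/sh
# Sketch: add current company and title, ASSUMING experience is an
# array of {company, title} objects with the most recent role first.
# Verify the actual schema in the API docs.
cat > sample_profiles.json <<'EOF'
[{"status": "completed",
  "linkedin_url": "https://linkedin.com/in/satyanadella",
  "profile_data": {"full_name": "Satya Nadella",
    "headline": "Chairman and CEO at Microsoft",
    "location": "Redmond",
    "experience": [{"company": "Microsoft", "title": "Chairman and CEO"}]}}]
EOF
echo "name,company,title,linkedin_url" > enriched_plus.csv
jq -r '.[] | select(.status == "completed") |
  [.profile_data.full_name,
   (.profile_data.experience[0].company // ""),
   (.profile_data.experience[0].title // ""),
   .linkedin_url] | @csv' sample_profiles.json >> enriched_plus.csv
cat enriched_plus.csv
```

The // "" fallback keeps the CSV well-formed when a profile has no experience entries.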
Step 5: Putting it all together
Here is the whole pipeline as a single, self-contained shell script. Save it as enrich.sh, make it executable, and run it against any CSV:
#!/bin/bash
# enrich.sh - LinkedIn lead enrichment pipeline
set -e
API_KEY="${SCRAPELINKEDIN_API_KEY:?Set SCRAPELINKEDIN_API_KEY}"
BASE="https://scrapelinkedin.com/api/v1"
INPUT="${1:?Usage: ./enrich.sh input.csv}"
echo "=== Extracting URLs from $INPUT ==="
URLS=$(tail -n +2 "$INPUT" | cut -d',' -f3 | jq -R . | jq -s .)
COUNT=$(echo "$URLS" | jq length)
echo "Found $COUNT LinkedIn URLs"
echo "=== Submitting batch ==="
BATCH_ID=$(curl -s -X POST "$BASE/scrape/batch" \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{\"linkedin_urls\": $URLS}" | jq -r '.batch_id')
echo "Batch: $BATCH_ID"
echo "=== Waiting for results ==="
while true; do
RESULT=$(curl -s "$BASE/scrape/batch/$BATCH_ID" -H "X-API-Key: $API_KEY")
STATE=$(echo "$RESULT" | jq -r '.status')
DONE=$(echo "$RESULT" | jq -r '.completed')
echo " $STATE ($DONE/$COUNT)"
[ "$STATE" != "pending" ] && break
sleep 15
done
echo "=== Extracting enriched data ==="
OUTPUT="${INPUT%.csv}_enriched.csv"
echo "name,headline,location,linkedin_url" > "$OUTPUT"
echo "$RESULT" | jq -r '.results[] | select(.status == "completed") |
[.profile_data.full_name, .profile_data.headline, .profile_data.location, .linkedin_url] |
@csv' >> "$OUTPUT"
ENRICHED=$(tail -n +2 "$OUTPUT" | wc -l | tr -d ' ')
echo "=== Done: $ENRICHED enriched leads saved to $OUTPUT ==="
Usage:
chmod +x enrich.sh
export SCRAPELINKEDIN_API_KEY="sk_your_key"
./enrich.sh input.csv
The script reads your API key from the environment (never hardcode it), validates inputs with bash parameter expansion, and outputs a new file named input_enriched.csv alongside your original.
Cost breakdown
ScrapeLinkedIn charges $0.01 per profile. No monthly subscription, no minimum spend, no expiring credits. Here is what a typical enrichment run costs:
| Leads | Cost | Time (approx.) |
|---|---|---|
| 100 | $1 | 2-3 minutes |
| 500 | $5 | 8-10 minutes |
| 1,000 | $10 | 15-20 minutes |
Compare that to manual research (5+ minutes per lead) or browser extension tools ($0.14+ per profile with monthly minimums). The API approach is more than 10x cheaper and fully automated.
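Since pricing is flat per profile, you can estimate a run's cost from the URL count before submitting:

```shell
#!/bin/sh
# Quick cost estimate at $0.01 per profile, before submitting a batch.
COUNT=250   # number of URLs about to be scraped
COST=$(awk -v n="$COUNT" 'BEGIN { printf "%.2f", n * 0.01 }')
echo "Estimated cost: \$$COST"
# → Estimated cost: $2.50
```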
Tips for production use
- Store your API key in the environment. Use a .env file or your shell profile. Never put it in the script itself.
- Handle partial results. If some profiles fail, the batch returns partial status. Filter on .status == "completed" in your jq query and log the failures for retry.
- Cache aggressively. The API caches profile data, but a repeat scrape of the same URL within the cache window still costs a credit (it just returns instantly). Deduplicate your input list before submitting.
- Chain with other tools. Pipe the enriched CSV into your CRM import, email verification service, or outreach tool. The whole point of a CLI pipeline is composability.
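Deduplicating the input is a one-liner: keep the header plus the first occurrence of each URL (the last column), dropping repeats before they cost you credits. The sample file below is illustrative:

```shell
#!/bin/sh
# Deduplicate input.csv on the URL (last) column, keeping the header
# row and the first occurrence of each URL.
cat > input.csv <<'EOF'
name,company,linkedin_url
Satya Nadella,Microsoft,https://linkedin.com/in/satyanadella
Satya Nadella,Microsoft,https://linkedin.com/in/satyanadella
Jensen Huang,NVIDIA,https://linkedin.com/in/jenhsunhuang
EOF
awk -F',' 'NR == 1 || !seen[$NF]++' input.csv > deduped.csv
wc -l < deduped.csv
# → 3 (header + 2 unique URLs)
```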
Get your API key and enrich your first list in under a minute.
5 free credits on signup. No credit card required.
Get Your API Key