Tutorial · March 2026 · 10 min read

Building a Lead Enrichment Pipeline from the CLI

CSV of prospects in, enriched data out. All from the terminal with curl, jq, and one API.

You have a list of prospects. Names, companies, maybe LinkedIn URLs. You need full profiles, verified emails, and structured data, all without leaving the terminal. Here is how to build that pipeline with curl, jq, and the ScrapeLinkedIn API.

No Python, no SDKs, no dependencies beyond what is already on your machine. Just shell commands you can copy, paste, and pipe together.

What we're building

The pipeline takes a CSV of raw leads and produces a fully enriched CSV ready for your CRM or outreach tool. Five stages, each one a single command:

CSV (names + companies)
    |
    v
ScrapeLinkedIn API (batch scrape)
    |
    v
Structured JSON (profiles)
    |
    v
jq (extract + transform)
    |
    v
Enriched CSV (ready for CRM/outreach)

By the end of this post, you will have a single shell script that does the whole thing end-to-end.

Prerequisites

You need three things:

- curl (preinstalled on macOS and most Linux distros)
- jq (brew install jq or apt-get install jq)
- a ScrapeLinkedIn API key

To get your API key, run these three commands:

# Register
curl -X POST https://scrapelinkedin.com/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com"}'

# Verify (check your email for code)
curl -X POST https://scrapelinkedin.com/api/v1/auth/verify \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com", "code": "123456"}'

# Get API key
curl -X POST https://scrapelinkedin.com/api/v1/auth/api-key \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com"}'

Save the API key. You will use it in every step below.

Step 1: Prepare your input

Start with a CSV file containing your prospects. At minimum, you need LinkedIn URLs. If you only have names and companies, the API can look those up too, but URLs are faster and cheaper.

# input.csv:
# name,company,linkedin_url
# Satya Nadella,Microsoft,https://linkedin.com/in/satyanadella
# Jensen Huang,NVIDIA,https://linkedin.com/in/jenhsunhuang
# ...

# Extract URLs into a JSON array for the batch endpoint
URLS=$(tail -n +2 input.csv | cut -d',' -f3 | jq -R . | jq -s .)
echo "$URLS"

This pipes your CSV through three stages: tail skips the header row, cut grabs the third column (LinkedIn URLs), and jq wraps them into a JSON array. The output looks like this:

["https://linkedin.com/in/satyanadella","https://linkedin.com/in/jenhsunhuang"]
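One gotcha worth guarding against: CSVs exported from Excel or Google Sheets often use CRLF line endings, which leaves a stray carriage return glued to the last field and quietly corrupts every URL. A `tr -d '\r'` between `tail` and `cut` fixes it. A standalone sketch (the sample file is created inline for illustration):

```shell
# Sample CSV with Windows-style CRLF line endings, as Excel exports them
printf 'name,company,linkedin_url\r\nSatya Nadella,Microsoft,https://linkedin.com/in/satyanadella\r\n' > input.csv

# Strip the carriage returns before cutting the URL column
URLS=$(tail -n +2 input.csv | tr -d '\r' | cut -d',' -f3 | jq -R . | jq -s .)
echo "$URLS"
```

Without the `tr`, each URL would end in an invisible `\r` and the API would be asked to scrape URLs that do not exist.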

Step 2: Batch scrape

The batch endpoint accepts up to 1,000 URLs in a single request. Submit your array and capture the batch ID:

API_KEY="sk_your_key"
BASE="https://scrapelinkedin.com/api/v1"

# Submit batch (up to 1,000 URLs)
BATCH=$(curl -s -X POST "$BASE/scrape/batch" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"linkedin_urls\": $URLS}")

BATCH_ID=$(echo "$BATCH" | jq -r '.batch_id')
echo "Batch submitted: $BATCH_ID"

The API returns immediately with a batch ID. Scraping happens asynchronously on the server side, so you do not block while profiles are being fetched.
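The response is worth sanity-checking before you start polling: if the key is wrong or the payload is malformed, there will be no batch_id and you would end up polling a null batch. A minimal guard, using jq's `//` fallback operator (a canned response stands in here for the curl output so the sketch runs standalone):

```shell
# Canned response standing in for the curl call's output
BATCH='{"batch_id": "b_123"}'

# jq's // operator falls back to empty when .batch_id is missing or null
BATCH_ID=$(echo "$BATCH" | jq -r '.batch_id // empty')
if [ -z "$BATCH_ID" ]; then
  echo "Batch submission failed: $BATCH" >&2
  exit 1
fi
echo "Batch submitted: $BATCH_ID"
```

Printing the raw response on failure matters: the API's error message is the only clue you get about what went wrong.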

Step 3: Poll for results

Check the batch status every 15 seconds. When all profiles are scraped (or the batch times out), pull down the results:

# Poll every 15 seconds until done
while true; do
  STATUS=$(curl -s "$BASE/scrape/batch/$BATCH_ID" \
    -H "X-API-Key: $API_KEY")

  STATE=$(echo "$STATUS" | jq -r '.status')
  DONE=$(echo "$STATUS" | jq -r '.completed')
  TOTAL=$(echo "$STATUS" | jq -r '.total')

  echo "Status: $STATE ($DONE/$TOTAL)"

  if [ "$STATE" != "pending" ]; then
    break
  fi
  sleep 15
done

# Save results
echo "$STATUS" | jq '.results' > profiles.json

The batch status will be pending while scrapes are in progress, then transition to completed, partial (some profiles failed), or timed_out. The results array is included in the response once the batch finishes.
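An open-ended `while true` will hang forever if anything upstream goes wrong, so it is worth bounding the loop with a hard deadline. A sketch of the pattern (a canned "pending" response and a 3-second deadline stand in for the real curl call and a sensible 600-second limit, so the snippet runs standalone):

```shell
# In the real loop: DEADLINE=$(( $(date +%s) + 600 )) and sleep 15
DEADLINE=$(( $(date +%s) + 3 ))

while true; do
  # Stand-in for the curl + jq status check above
  STATE=$(echo '{"status": "pending"}' | jq -r '.status')

  if [ "$STATE" != "pending" ]; then
    break
  fi
  # Give up once the deadline passes, whatever the batch says
  if [ "$(date +%s)" -ge "$DEADLINE" ]; then
    echo "Gave up waiting for the batch" >&2
    break
  fi
  sleep 1
done
```

The same two-exit structure drops straight into the polling loop above: one `break` for a finished batch, one for a blown deadline.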

Step 4: Transform with jq

Now you have a JSON file full of structured profile data. Use jq to extract the fields you care about and flatten them into a CSV:

# Extract key fields into a flat CSV
echo "name,headline,location,linkedin_url" > enriched.csv

jq -r '.[] | select(.status == "completed") |
  [.profile_data.full_name, .profile_data.headline, .profile_data.location, .linkedin_url] |
  @csv' profiles.json >> enriched.csv

echo "Enriched $(tail -n +2 enriched.csv | wc -l | tr -d ' ') leads"

The select(.status == "completed") filter skips any profiles that failed to scrape. The @csv formatter handles quoting and escaping, so fields with commas or special characters come out clean.

You can customize the field list to match whatever your CRM or outreach tool expects. The full profile response includes headline, location, summary, education, experience, and honors and awards. Check the API docs for the complete schema.
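The profiles that the select filter skips are easy to collect for a second pass. Assuming the profiles.json from Step 3, anything whose status is not completed can go into a retry list (a small sample file is created inline so the snippet runs standalone; the "failed" status value in it is illustrative):

```shell
# Sample results file; in the pipeline this is the profiles.json from Step 3
cat > profiles.json <<'EOF'
[
  {"status": "completed", "linkedin_url": "https://linkedin.com/in/satyanadella",
   "profile_data": {"full_name": "Satya Nadella"}},
  {"status": "failed", "linkedin_url": "https://linkedin.com/in/jenhsunhuang"}
]
EOF

# Anything not completed goes into a retry list
jq -r '.[] | select(.status != "completed") | .linkedin_url' profiles.json > failed_urls.txt
cat failed_urls.txt
```

Feeding failed_urls.txt back through `jq -R . | jq -s .` gives you a fresh JSON array for a second batch submission, exactly as in Step 2.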

Step 5: Putting it all together

Here is the whole pipeline as a single, self-contained shell script. Save it as enrich.sh, make it executable, and run it against any CSV:

#!/bin/bash
# enrich.sh - LinkedIn lead enrichment pipeline
set -e

API_KEY="${SCRAPELINKEDIN_API_KEY:?Set SCRAPELINKEDIN_API_KEY}"
BASE="https://scrapelinkedin.com/api/v1"
INPUT="${1:?Usage: ./enrich.sh input.csv}"

echo "=== Extracting URLs from $INPUT ==="
URLS=$(tail -n +2 "$INPUT" | cut -d',' -f3 | jq -R . | jq -s .)
COUNT=$(echo "$URLS" | jq length)
echo "Found $COUNT LinkedIn URLs"

echo "=== Submitting batch ==="
BATCH_ID=$(curl -s -X POST "$BASE/scrape/batch" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"linkedin_urls\": $URLS}" | jq -r '.batch_id')
echo "Batch: $BATCH_ID"

echo "=== Waiting for results ==="
while true; do
  RESULT=$(curl -s "$BASE/scrape/batch/$BATCH_ID" -H "X-API-Key: $API_KEY")
  STATE=$(echo "$RESULT" | jq -r '.status')
  DONE=$(echo "$RESULT" | jq -r '.completed')
  echo "  $STATE ($DONE/$COUNT)"
  [ "$STATE" != "pending" ] && break
  sleep 15
done

echo "=== Extracting enriched data ==="
OUTPUT="${INPUT%.csv}_enriched.csv"
echo "name,headline,location,linkedin_url" > "$OUTPUT"
echo "$RESULT" | jq -r '.results[] | select(.status == "completed") |
  [.profile_data.full_name, .profile_data.headline, .profile_data.location, .linkedin_url] |
  @csv' >> "$OUTPUT"

ENRICHED=$(tail -n +2 "$OUTPUT" | wc -l | tr -d ' ')
echo "=== Done: $ENRICHED enriched leads saved to $OUTPUT ==="

Usage:

chmod +x enrich.sh
export SCRAPELINKEDIN_API_KEY="sk_your_key"
./enrich.sh input.csv

The script reads your API key from the environment (never hardcode it), validates inputs with bash parameter expansion, and outputs a new file named input_enriched.csv alongside your original.
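Before importing the result into a CRM, it is worth eyeballing it. `column` (part of util-linux on Linux and the BSD tools on macOS) renders the CSV as an aligned table; note it does a naive comma split, so @csv-quoted fields containing commas will mis-align, which is fine for a quick look. A sample output file is created inline for illustration:

```shell
# Sample enriched output standing in for input_enriched.csv
printf '%s\n' \
  'name,headline,location,linkedin_url' \
  'Satya Nadella,Chairman and CEO,Redmond,https://linkedin.com/in/satyanadella' \
  > input_enriched.csv

# Align columns on commas for a quick visual check
column -s, -t < input_enriched.csv
```

A glance at the aligned table catches the usual failure modes (empty columns, shifted fields) before they reach your outreach tool.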

Cost breakdown

ScrapeLinkedIn charges $0.01 per profile. No monthly subscription, no minimum spend, no expiring credits. Here is what a typical enrichment run costs:

Leads    Cost    Time (approx.)
100      $1      2-3 minutes
500      $5      8-10 minutes
1,000    $10     15-20 minutes

Compare that to manual research (5+ minutes per lead) or browser extension tools ($0.14+ per profile with monthly minimums). The API approach is 10x cheaper and fully automated.
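Since pricing is a flat $0.01 per profile, you can estimate a run's cost straight from the input file before submitting anything (the sample CSV here is created inline for illustration):

```shell
# Sample input standing in for your real prospect list
printf 'name,company,linkedin_url\nA,B,https://linkedin.com/in/a\nC,D,https://linkedin.com/in/c\n' > input.csv

# Data rows (header excluded) x $0.01 per profile
COUNT=$(tail -n +2 input.csv | wc -l | tr -d ' ')
awk -v n="$COUNT" 'BEGIN { printf "%d leads, estimated cost: $%.2f\n", n, n * 0.01 }'
```

Dropping this at the top of enrich.sh gives you a cost preview before the batch is submitted.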

Tips for production use

A few things worth keeping in mind once you move past one-off runs:

- Read the API key from the environment, as enrich.sh does, and never hardcode it in a script you might commit.
- Batches cap at 1,000 URLs, so split larger lists across multiple input files and runs.
- Handle partial and timed_out batches: collect the URLs whose profiles did not come back and resubmit them as a new batch.
- Prefer LinkedIn URLs over name + company lookups when you have them; they are faster and cheaper.

Get your API key and enrich your first list in under a minute.

5 free credits on signup. No credit card required.

Get Your API Key
